- 02-11-2012 #1
- Join Date
- Feb 2012
Advanced storage for home use
Hi, I'm looking for a new storage stack.
Here is some background info, you may skip:
I am currently running ZFS on Linux on two mirrored, encrypted (LUKS) 2TB drives and one encrypted 60GB SSD which acts as a read cache. I would like to know if there are any alternatives, because 1) I don't like recompiling kernel modules every time there is an update, using the development trunk, and hacking dracut scripts to make it boot. It's just a disaster waiting to happen. And 2) It would be nice to do encryption "above" the mirroring, so I don't have to encrypt the same data twice when writing.
Is there any filesystem, volume manager, etc. for Linux which has the following features:
1) Checksums for integrity protection of data and automatic correction using mirrored drives. Two times when I had messed around in the computer case, one of the drives started writing corrupted data, and if I didn't have checksums I wouldn't even know (maybe I'd get crashes, or artifacts in my videos).
2) Encryption. I'm not doing anything secret, but I've gotten used to having everything encrypted.
3) Caching. I can't afford to put all my data on SSD, so it would be nice to have a read cache. Maybe it could interoperate with bcache or Facebook flashcache. Memory caching is also great, but I can't rely on it -- suddenly some program will use 12 GB of RAM, and all my cache is cleared. ZFS has a good algorithm for RAM caching, which doesn't completely wipe the cache when doing backups or synchronizations; anything like that would be great.
4) Performance. Shouldn't suck completely, should be able to feed a 1Gbit connection when transferring large files.
I hope someone can help me out with some ideas. I think btrfs doesn't have any advanced caching, and no built-in encryption, so I don't know if it's worth it. I'd lose the constant agony of wondering if my machine comes back after a reboot though. And I can't use FreeBSD, it doesn't have the graphics support so videos look incredibly ugly (not that anyone would suggest BSD on a Linux forum anyway).
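To make requirement 1 concrete, here is a toy sketch of what "checksums plus automatic correction from a mirror" means. It is not how ZFS or any real filesystem is implemented: the `MirroredStore` class, the in-memory "disks", and the choice of SHA-256 are all assumptions made up for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 is an arbitrary choice for this sketch.
    return hashlib.sha256(data).hexdigest()


class MirroredStore:
    """Hypothetical two-way mirror that keeps a checksum for every block."""

    def __init__(self, copies: int = 2):
        self.disks = [dict() for _ in range(copies)]  # block id -> bytes
        self.sums = {}                                # block id -> checksum

    def write(self, block_id: int, data: bytes) -> None:
        self.sums[block_id] = checksum(data)
        for disk in self.disks:
            disk[block_id] = data

    def read(self, block_id: int) -> bytes:
        expected = self.sums[block_id]
        good = None
        bad = []
        for disk in self.disks:
            data = disk[block_id]
            if checksum(data) == expected:
                good = data
            else:
                bad.append(disk)  # the drive returned data, but it is wrong
        if good is None:
            raise IOError(f"block {block_id}: every copy fails its checksum")
        for disk in bad:
            disk[block_id] = good  # repair the bad copy from the good one
        return good


store = MirroredStore()
store.write(0, b"holiday video frame")
store.disks[0][0] = b"holiday vide0 frame"  # simulate silent corruption
recovered = store.read(0)
```

The point of the sketch is that the checksum is what tells the reader which of the two copies is the correct one. A mirror without checksums only knows that the copies differ, not which one to trust.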
- 02-12-2012 #2
- Join Date
- Apr 2009
- I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
Well, I thought that ZFS uses a fuse (user-space) driver. If so, why do you need to configure/rebuild your kernel when you update it (the kernel)?
1. If you are using hardware or software raid, drive integrity should be built in.
2. If you are using full-disc encryption (such as TrueCrypt), this should work ok for you.
3. Caching - the Linux OS uses available RAM for caching. More RAM == more cache. Don't bother with SSD - just get a lot of RAM!
4. Disc I/O is dependent upon the controller interface and raw disc speed. Current systems are good when you use a Sata-3 connection (6gbps), although that will saturate any drive and/or array PDQ. I have 1gbps network links from my workstation to my network storage array, but because of disc limitations, I only get about 100mbps throughput. On my sata drives, if I am reading from one drive to another (especially on another controller), I can get up to 1gbps in throughput, at least for some period of time. Just remember that this sort of performance is due to the OS caching data as it is read, and using write-behind cache to deliver it to the target device. I.e., the system may say that the file has been written, but it has not yet been physically written to the target disc - it is still in cache. Executing the "sync" command as root will show that, as it will not return to the system prompt until all of the data has been committed to disc. Sometimes, real fast is almost as good as real time.
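As a rough illustration of the write-behind point: a program that wants to know its data has really reached the device, rather than just the cache, has to ask for that explicitly. The sketch below uses Python's standard `os.fsync`; the `write_durably` helper and the file name are made up for the example.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Write data and block until the kernel reports the file is committed."""
    with open(path, "wb") as f:
        f.write(data)         # goes to Python's buffer, then the OS page cache
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to flush this file to the device
    return len(data)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.bin")
    written = write_durably(target, b"x" * 4096)
    with open(target, "rb") as f:
        read_back = f.read()
```

Without the `fsync` call, `write` can return as soon as the data is in cache, which is why a copy can look faster than the target disc could physically manage.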
Just remember, Semper Gumbi - always be flexible!
- 02-12-2012 #3
- Join Date
- Feb 2012
Thanks for the insightful reply! I'm using the ZFS kernel modules (zfs#on#linux.#com, remove the hashes). I thought these would give better performance than FUSE, because it's in the kernel. Here's a post saying it's good to disable the ZFS cache for FUSE http#:#//groups#.google.#com/#group/zfs-fuse/browse_thread/thread/d4c5a30f28059317 . Of course, that means that the OS takes care of the caching, so it's maybe not that bad.
If LVM software raid really does integrity protection, that would be absolutely fantastic. I fear that it doesn't protect against silent corruption though, only failed reads. Silent corruption is when the drive returns some data without error, but the data is wrong. I have yet to see silent corruption due to bit rot, but like I said, it can happen due to hardware failure. Anyway, I will read up on Linux software raid and see how the protections are. Maybe there is something like that in the encryption layer as well, there was on FreeBSD.
RAM caching is great, I have 16 GB and only reboot when I have to. Don't think the SSD does a lot.
Interesting points about performance. I can't believe you are only getting 100 megabits/s of data from your storage array because of storage limitations though. Any crappy USB drive will give you more than 20 megabytes/s, which is more than 160 megabits/s. Anyway, I only get 19 megabytes/s between my router/server and my desktop, but I think that's a problem with Samba. I do use Sata-3, and I actually had to buy new sata cables twice due to the data corruption thing. I do indeed care most about cached/buffered IO, and it doesn't matter if long reads/writes are at 60 megabytes/s instead of 120.
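Since the thread mixes megabits and megabytes, here is a small conversion sketch for the figures mentioned. It only applies the factor of 8 bits per byte and ignores protocol overhead, so real transfers will come in somewhat lower; the function names are made up for the example.

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert megabytes/s to megabits/s (8 bits per byte, no overhead)."""
    return mbytes_per_s * 8


def mbits_to_mbytes(mbits_per_s: float) -> float:
    """Convert megabits/s to megabytes/s (8 bits per byte, no overhead)."""
    return mbits_per_s / 8


usb_drive = mbytes_to_mbits(20)       # 160 megabits/s
samba_link = mbytes_to_mbits(19)      # 152 megabits/s
gigabit_max = mbits_to_mbytes(1000)   # 125 megabytes/s
array_speed = mbits_to_mbytes(100)    # 12.5 megabytes/s
```

So a 1 Gbit link tops out around 125 megabytes/s, and 100 megabits/s from a storage array is only about 12.5 megabytes/s.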
Edit: I just had a program I wrote blow up and use 8 GB. SSD cache is good to keep the system speedy then...
Last edited by fa2k; 02-12-2012 at 11:53 AM.