So I've been doing a lot of research into RAID and filesystems lately, mostly ZFS and BTRFS. I have been using the kernel's software MD RAID for several years now, but I never really understood it much. I would like to think that after spending the last month or so doing some in-depth research and testing, I understand it a little better.

#######################################################
Data Scrubbing.

Let's say you have a RAID 1 (mirror). If you read a file and the copy on hard drive 1 is borked, your system will read it off of disk 2 and attempt to fix drive 1's copy. If it can't, the drive is marked faulty and kicked from the array. You then hot-swap in a replacement, add it to the array, and rebuild it from the data on disk 2.
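
With Linux MD, that manual replacement looks roughly like this (a sketch only; /dev/md0, /dev/sda1, and /dev/sdc1 are placeholder names for the array, the failed member, and the replacement):

    # Mark the bad member failed and pull it out of the array
    mdadm /dev/md0 --fail /dev/sda1
    mdadm /dev/md0 --remove /dev/sda1

    # After physically swapping the disk, add the replacement;
    # the rebuild onto it starts automatically
    mdadm /dev/md0 --add /dev/sdc1

    # Watch the rebuild progress
    cat /proc/mdstat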

What isn't accounted for is passive bit rot. Every so often you will lose bits off your drive for various reasons; that could be how HD1 in the above example went bad: not a full-on hardware failure, just some bad sectors. Let's say for a moment, though, that you have a file you haven't accessed in years (2007's tax return?) That file hasn't been read, so it hasn't been checked. Returning to the above, you lose HD1, replace it, and a rebuild is in progress. The rebuild then gets to the section of the surviving disk holding your old file, and it has bit rot that went undetected. BOOM, you just lost that file, and your rebuild process is probably going to get jacked up, possibly causing more data loss.

I don't know about other people, but I had assumed that the drives were read periodically to make sure everything was peachy keen. I'm by no means a Linux noob, either; I passed the RHCE certification exam with a perfect score on my first try, and nowhere along the way had I learned about data scrubbing until I had a real problem to deal with. No amount of synthetic lab work for school, or controlled tests at home, can prepare you for real-world failure.

On to the definition. Data scrubbing is where you force the system to read every single sector of your array and make sure your data is sane. For consumer SATA drives, I've read that once a week is a pretty common suggestion; for enterprise drives, no less than once a month. The problem is that while you're scrubbing, your disk I/O goes through the roof and your performance drops like a rock (I love metaphors.)

I suggest you Google around more on data scrubbing, but the quick and dirty version can be had here:

RAID/Software - Gentoo Linux Wiki
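
For Linux MD specifically, here's a minimal sketch of kicking off a scrub by hand (assuming the array is /dev/md0; the md name and the cron schedule are placeholders for whatever fits your setup):

    # Tell md to read and verify every sector of the array
    echo check > /sys/block/md0/md/sync_action

    # Watch progress
    cat /proc/mdstat

    # Afterward, see how much inconsistency was found
    cat /sys/block/md0/md/mismatch_cnt

    # Example weekly scrub, e.g. as a line in /etc/crontab (Sunday, 1 AM):
    # 0 1 * * 0  root  echo check > /sys/block/md0/md/sync_action

(ZFS and BTRFS have their own equivalents: `zpool scrub <pool>` and `btrfs scrub start <mountpoint>`.)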

#######################################################
RAID 5 is pretty much useless.

I have been using RAID 5 for a long time and, thankfully, I have never lost any important data. I've been REALLY lucky.

Hard drive technology really hasn't changed much in the last decade. Open up a hard drive from 10 years ago next to a brand new one and it will be really hard (if not impossible) to tell the difference. Platters haven't changed: they're still just a hunk of non-magnetic metal (or ceramic) pounded out into a uniform shape, coated with a magnetic layer that we magnetize. We have increased capacities by shrinking each bit and the distance between bits, jamming more and more onto the same platter.

All the while, the error rate per bit hasn't really changed; consumer drives are still typically spec'd at around one unrecoverable read error per 10^14 bits read. So the chance of an error sitting somewhere on a disk has shot up greatly. Even with regular data scrubbing, your odds of running into errors are pretty good. Even if you have a hot spare set up so that the moment a drive is kicked out of a RAID 5 array the rebuild starts, AND you have been scrubbing every week, you have a very real chance of losing one drive and then hitting errors on another during the rebuild. The higher drive densities get, the worse your odds of a full recovery.
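
To put rough numbers on that, here is a back-of-the-envelope sketch (purely illustrative; the 6 TB of data to re-read and the 1-in-10^14 URE spec are assumptions, not measurements from my arrays):

    awk 'BEGIN {
      tb   = 6                      # data that must be read to rebuild, in TB (example)
      bits = tb * 1e12 * 8          # total bits read during the rebuild
      ure  = 1e-14                  # unrecoverable read errors per bit (consumer spec)
      p    = 1 - exp(-bits * ure)   # Poisson estimate of hitting at least one URE
      printf "P(>=1 unrecoverable read error during rebuild) ~ %.0f%%\n", p * 100
    }'

For those example numbers it works out to roughly a 38% chance of tripping over at least one unrecoverable error somewhere during the rebuild, which is why single parity on big drives makes me nervous.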

RAID 6 can lose 2 drives and still be OK. Even if you lose one hard drive completely, you can still rebuild as long as no stripe has more than one additional bad block across the surviving drives. But now you give up two drives' worth of capacity to parity, and there are some admins who don't think even double parity is safe enough.
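
If you want to go that route with Linux MD, creating a RAID 6 array is a one-liner (a sketch; the device names and chunk size are placeholders, not a recommendation for your hardware):

    # 5-disk RAID 6: capacity of 3 disks, survives any 2 drive failures
    mdadm --create /dev/md0 --level=6 --raid-devices=5 \
          --chunk=512 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

    # Check the layout and state
    mdadm --detail /dev/md0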

#######################################################
RAID IS NOT BACKUP!

Repeat after me.
RAID IS NOT BACKUP!
RAID IS NOT BACKUP!
RAID IS NOT BACKUP!

There are countless things that can go wrong with a file that have nothing to do with your disks or RAID array. Have you ever deleted your entire home directory? I have, several times. Not because of a typo, or forgetting where I was; I've lost it because of shell expansion I wasn't considering.
(If you ever end up with a directory literally named `~`, say from running `mkdir '~'`, don't try to remove it with a bare `rm -rf ~`: the shell expands the unquoted ~ to your home directory and hands that to `rm -rf`, nuking your home. Quote it instead: `rm -rf './~'`.)

All the while, my RAID 1 dutifully mirrors the horror to all disks.

I've lost three RAID 5s and a RAID 1 without ever losing important data, because I was smart enough to listen to the countless horror stories of people who lost everything because they "didn't need backup, they've got RAID."

RAID IS NOT BACKUP!

I have started putting encrypted copies of my data into the cloud, as well as taking the occasional burned DVD over to a sibling's house. I keep a copy of files on my desktop and on my fileserver. REALLY important stuff is on a flash drive I take everywhere. You can never have too many backups.
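
For the encrypted cloud copies, something along these lines does the job (a sketch; the directory, archive name, and where you upload it are placeholders for whatever you actually use):

    # Bundle a directory and encrypt it symmetrically with GnuPG
    tar czf - ~/important \
        | gpg --symmetric --cipher-algo AES256 -o important.tar.gz.gpg

    # Test the restore path before trusting the backup
    gpg --decrypt important.tar.gz.gpg | tar tzf - | head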

#######################################################
RAID 1 performance

I thought that a 2-disk RAID 1 array would have write performance almost as good as a single disk, and read performance approaching double that (you can read half the file from drive 1 and half from drive 2).

I've done some experimenting on a default setup of RAID 1, and that is not correct.

From iostat (attached screenshot: iostat-out.jpg):
The important part is that even though I had significant reads from md127, sda was barely read from at all. Even when reading through an iSCSI device backed by a logical volume on top of md127 (which is an LVM physical volume), I only see one drive light up in the hot-swap bay. I am stuck at about 130 MB/s for the array, even though each drive on its own can easily read at that rate.
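
For what it's worth, here is a rough way to check whether concurrent readers at least get spread across the mirror (a sketch; md127, sda, and sdb are the names on my box, adjust to yours, and only ever read from the raw md device like this, never write):

    # Two sequential readers hitting different regions of the array at once
    dd if=/dev/md127 of=/dev/null bs=1M count=4096 skip=0    &
    dd if=/dev/md127 of=/dev/null bs=1M count=4096 skip=8192 &
    wait

    # In another terminal, watch which member disks are actually busy
    iostat -x sda sdb md127 1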

If anyone knows of some command or tuning I have to run to enable striped reading, please let me know.

#######################################################

I write these things so that someone may benefit from my experiences. I have had to rebuild several arrays from backups, which is a pain. Luckily, I have never had to restore files using vi and memory.