- 03-27-2011 #1
Multiple RAID 10 + LVM Failure Risk
What happens when an LVM that spans multiple RAID arrays has one (or more) of the RAID arrays fail? (Note that it would be the same if you were using single drives in the LVM too.) Do you lose everything, or can you recover?
I am upgrading my existing storage solution soon from a single RAID 5 array (4x1TB) with a single ext3 partition to a RAID 10 array (4x2TB). I then plan to move all the data from the old RAID 5 array to the new RAID 10 array, break the RAID 5 array, and rebuild it as a second RAID 10 array. I was initially thinking of using LVM on top of the new RAID 10 array and then expanding the volume group with the second RAID 10 array, giving me a single 6TB logical "drive" on which I would place a single ext4 file system.
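For concreteness, the migration I have in mind would look roughly like this (a sketch only; the device names /dev/sd[b-e], /dev/sd[f-i], the md numbers, and the mount points are hypothetical placeholders, and the old RAID 5 is assumed to be /dev/md0):

```shell
# 1. Build the new RAID 10 from the 2TB drives and put LVM on top.
mdadm --create /dev/md1 --level=10 --raid-devices=4 /dev/sd[b-e]
pvcreate /dev/md1
vgcreate vg0 /dev/md1
lvcreate -l 100%FREE -n vol0 vg0
mkfs.ext4 /dev/vg0/vol0

# 2. Copy the data over, then retire the old RAID 5 and rebuild it as RAID 10.
mount /dev/vg0/vol0 /mnt/new
rsync -aH /mnt/old/ /mnt/new/
mdadm --stop /dev/md0
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[f-i]

# 3. Grow the volume group and the file system onto the rebuilt array.
pvcreate /dev/md0
vgextend vg0 /dev/md0
lvextend -l +100%FREE /dev/vg0/vol0
resize2fs /dev/vg0/vol0
```

The last step is exactly the spanning that worries me: vol0 now has extents on both arrays.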
The problem is I got thinking about what happens when a RAID array fails (note the whole array, not just a drive). I know I would lose all the data on the failed array (that is a given), but what happens to the remaining data? So suppose I build this as mentioned, the initial VG is the full 4TB on the new array, I then expand the VG by adding in the old 2TB array and then expand the file system, resulting in the single 6TB file system. This is in effect RAID Linear over the 2 RAID arrays. Suppose the first array fails (the 4TB one), all data on it would be lost, but would I be able to recover any data on the 2TB array?
I don't fully know how the ext file systems are implemented, but if it's like a FAT system (I say this very loosely), if the first sector of the file system goes down, how does the second array even know it has a file system on it? Basically, if I lose the data that tells the system there is a file system on this device, will I lose everything on that file system? Or is ext4 smart enough to know that if the first part of the file system gets corrupted, the later part can still be recovered?
There seem to be mixed opinions on this. Some say that LVM (even in spanning mode) introduces a RAID 0-like failure mode, while others claim that you can still recover data off the remaining devices (although it will take work).
I have no immediate need to migrate, and I have extra hardware lying around, so I could set up a test machine and simulate a failure, but I don't know how I would do this (I don't think just unplugging the drive would work, would it?).
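One way to simulate a whole-array failure without touching cables, I believe, is to fail enough members of the md array through mdadm (device names here are hypothetical; this is destructive to the array, so only on a test machine):

```shell
# Assume /dev/md1 is a 4-drive RAID 10 built from sdc1..sdf1.
# Failing both members of one mirror pair kills the whole array.
mdadm /dev/md1 --fail /dev/sdc1
mdadm /dev/md1 --fail /dev/sdd1

# Check that the array is now failed/degraded beyond recovery:
cat /proc/mdstat
mdadm --detail /dev/md1
```

Unlike pulling a cable, this also survives a reboot in a predictable state, since md records the faulty members in the superblocks.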
Finally, is this even something to be worried about? If I were using single disks as the base hardware then this would certainly be something to consider, but I intend to use RAID 10 underneath the LVM. I also have a policy of replacing drives at 6-month intervals until all the drives in the array are offset by 6 months (since all the drives see the same wear, this should minimize multiple failures during a rebuild, as none of the drives will be from the same assembly line or at the same wear level). I also do not intend to leave the array running in the event of a failure; I will shut the server down until I have the replacement hardware.
openSuse 11.3 x64
All RAID arrays are linux kernel RAID.
The OS/boot/home are on a separate RAID 1 array, the large file system is purely data storage.
I do have backups, but the best backup is the backup that is never used.
I decided to try to simulate a failure by removing a drive; this is what happened:
/dev/sdb => 4GB
/dev/sdc => 2GB
/dev/sdd => 1GB
1) /dev/vg0 => +4GB
2) /dev/vg0/vol0 => 4GB
3) mounted vol0 to /data and added to fstab (so it remounts on boot)
4) Filled vol0 with ~3.9GB of data
5) vg0 => +2GB + 1GB
6) vol0 => resize to max (vol0 is now 7GB)
7) Filled vol0 with ~6.9GB of data
8) Shut down and unplugged the SATA cable for /dev/sdc (the 2GB partition in the "middle" of the volume group; it should contain part of the second data set and none of the first)
9) Booted the system; the OS fails to mount /dev/vg0/vol0 to /data:
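For anyone wanting to reproduce steps 1-7, the setup was roughly these commands (a sketch; the fill steps were just copying files until the stated sizes were reached):

```shell
pvcreate /dev/sdb
vgcreate vg0 /dev/sdb                 # step 1: VG is the 4GB device
lvcreate -l 100%FREE -n vol0 vg0      # step 2: vol0 = 4GB
mkfs.ext4 /dev/vg0/vol0
mount /dev/vg0/vol0 /data             # step 3 (plus an fstab entry)
# ...fill /data with ~3.9GB (step 4)...
vgextend vg0 /dev/sdc /dev/sdd        # step 5: +2GB +1GB
lvextend -l +100%FREE /dev/vg0/vol0   # step 6: vol0 = 7GB
resize2fs /dev/vg0/vol0
# ...fill /data to ~6.9GB (step 7)...
```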
Activating LVM volume groups
Couldn't find device with uuid [ID]
Refusing activation of partial LV vol0, Use --partial to override
/dev/vg0/vol0: The superblock could not be read or does not describe a correct ext2 file system...
fsck failed for at least one file system (not /), repair and reboot
10) System is dropped to recovery console, asking for root password
11) Activated vg0 using "vgchange -a y vg0 --partial"; vg0 was activated in a read-only mode
The problem now is that I cannot mount vol0 to /data; mount keeps demanding a file system type, and when I add "-t ext4" it claims wrong fs type.
So far it appears that LVM does indeed introduce RAID 0 like failure when spanned over multiple devices. I'll keep digging around and see if I can find a way to do a partial mount of the LVM, but it doesn't look good for the data.
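One avenue I have not exhausted: ext4 keeps backup copies of the superblock spread across the file system, and some of them may land on the surviving PVs. Something like the following might get further (a sketch; the block numbers are only the typical defaults for a 4k-block file system, not values I have confirmed on this volume, and mount's sb= option is given in 1024-byte units, so fs block 32768 becomes sb=131072):

```shell
vgchange -a y vg0 --partial
dumpe2fs -h /dev/vg0/vol0          # fails if the primary superblock is gone
mke2fs -n /dev/vg0/vol0            # -n: print where backup superblocks would be, write nothing

# Try a read-only mount via a backup superblock, or a no-change fsck:
mount -o ro,sb=131072 -t ext4 /dev/vg0/vol0 /data
e2fsck -n -b 32768 /dev/vg0/vol0   # -n: check only, modify nothing
```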
Last edited by Superfuzz; 03-28-2011 at 01:28 AM.
- 03-29-2011 #2
After giving the recovery another try, I noticed that I somehow managed to unplug /dev/sdb, not /dev/sdc as I meant to. Which is fine; I now have another test case.
What I have observed:
1) If the device with the file system's superblock goes missing, you cannot mount the file system. (This makes sense now given the errors I was seeing, and is understandable since I removed the first device, not the second as I planned.) In this case the whole logical volume is trashed; I could not access it no matter what. This is very much like a RAID 0 failure.
2) If the device with the superblock is still functioning, you can mount the logical volume. Due to the use of the partial option, the volume group is activated read-only (not a big deal, probably even preferable). What I find interesting is that the mounted file system appears exactly the same as the fully functioning file system: it reports all the same sizes and shows all the files that were on the file system before it broke (this makes sense if you think about it for a moment). The problem is that you cannot easily tell which files are corrupt/missing and which are perfectly fine. If you had a specific file you wanted back, you could check it, and if you are lucky it may be one of the files that are fine, but for a general recovery this is not practical, even less so when you're dealing with a large number of files occupying a large amount of space. This failure, as troublesome as it may be, is technically better than RAID 0; with luck you can recover something.
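To at least automate the triage in case 2, a script like this could walk the read-only mount and flag every file that cannot be read end to end (extents that fell on the missing PV should produce read errors). This is a sketch; the /data path is just the mount point from my test:

```shell
# check_files DIR: print "BAD: path" for every file under DIR that
# cannot be read in full; silent for files that read back cleanly.
check_files() {
    find "$1" -type f | while read -r f; do
        # Read the whole file, discard the data, keep only the status.
        if ! cat -- "$f" > /dev/null 2>&1; then
            printf 'BAD: %s\n' "$f"
        fi
    done
}

check_files /data
```

Redirect the output to a file on another disk and you have a list of what needs to come back from backup.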
An LVM spanning multiple devices is playing Russian Roulette with your data. (For comparison RAID 0 would be Russian Roulette with all 6 shots loaded in the revolver).
While this does answer the main question I had (what happens when the LVM fails), the second question still remains. This experiment assumed the worst-case failure: the entire RAID array failed. One of the reasons for using RAID is to prevent a total failure like this. I know this will be an opinion-based answer, as it is really a question of how much risk you are willing to take, but I ask anyway.
Is it safe (enough) to have LVM span multiple RAID 10 arrays? If the RAID 10 arrays never fully fail then the LVM would be perfectly fine.
I often see recommendations when building RAID arrays (5 and 10 usually) that suggest, if you're building a large array, placing LVM on top of the array. Given that spanning arrays appears to be just as risky as RAID 0, why give this advice? From what I have seen, placing an LVM on top of RAID 10 is in effect a RAID 10L (or RAID 100 if you use striping instead of spanning for LVM), so what benefit is there to essentially adding a striped layer on top of an already striped layer?