- 11-28-2006 #1 · Just Joined! · Join Date: Nov 2006 · Posts: 4
Eeek! Can't assemble degraded/dirty RAID6 array!
Ok, I'm a Linux software raid veteran and I have the scars to prove it (google for mddump if you're bored), but that's not doing me much good now. I'm at the end of my rope... er... SATA cable. Help? Please??

The subject platform is a PC running FC5 (Fedora Core 5, patched to latest) with eight 400 GB SATA drives (/dev/sd[b-i]1) assembled into a RAID6 md0 device. Originally built with mdadm. No LVM or other exotics. /dev/md0 is a /data filesystem, nothing there needed at boot time. It's been humming along nicely for months.
Then... This morning I found that /dev/sdb1 had been kicked out of the array and there was the requisite screaming in /var/log/messages about failed read/writes, SMART errors, highly miffed SATA controllers, etc., all associated with /dev/sdb1. (It appears to have been a temporary failure -- badblocks found no problems.) Tried shutting the system down cleanly, which didn't seem to be working, so finally crossed my fingers and hit the reset button.
No surprise, it booted back up refusing to assemble the array. More specifically:
Code:
Nov 27 19:03:52 ornery kernel: md: bind<sdb1>
Nov 27 19:03:52 ornery kernel: md: bind<sdd1>
Nov 27 19:03:52 ornery kernel: md: bind<sde1>
Nov 27 19:03:52 ornery kernel: md: bind<sdf1>
Nov 27 19:03:52 ornery kernel: md: bind<sdg1>
Nov 27 19:03:52 ornery kernel: md: bind<sdh1>
Nov 27 19:03:52 ornery kernel: md: bind<sdi1>
Nov 27 19:03:52 ornery kernel: md: bind<sdc1>
Nov 27 19:03:52 ornery kernel: md: kicking non-fresh sdb1 from array!
Nov 27 19:03:52 ornery kernel: md: unbind<sdb1>
Nov 27 19:03:52 ornery kernel: md: export_rdev(sdb1)
Nov 27 19:03:52 ornery kernel: md: md0: raid array is not clean -- starting back ground reconstruction
Nov 27 19:03:52 ornery kernel: raid5: device sdc1 operational as raid disk 1
Nov 27 19:03:52 ornery kernel: raid5: device sdi1 operational as raid disk 7
Nov 27 19:03:52 ornery kernel: raid5: device sdh1 operational as raid disk 6
Nov 27 19:03:52 ornery kernel: raid5: device sdg1 operational as raid disk 5
Nov 27 19:03:52 ornery kernel: raid5: device sdf1 operational as raid disk 4
Nov 27 19:03:52 ornery kernel: raid5: device sde1 operational as raid disk 3
Nov 27 19:03:52 ornery kernel: raid5: device sdd1 operational as raid disk 2
Nov 27 19:03:52 ornery kernel: raid5: cannot start dirty degraded array for md0
Nov 27 19:03:52 ornery kernel: RAID5 conf printout:
Nov 27 19:03:52 ornery kernel:  --- rd:8 wd:7 fd:1
Nov 27 19:03:52 ornery kernel:  disk 1, o:1, dev:sdc1
Nov 27 19:03:52 ornery kernel:  disk 2, o:1, dev:sdd1
Nov 27 19:03:52 ornery kernel:  disk 3, o:1, dev:sde1
Nov 27 19:03:52 ornery kernel:  disk 4, o:1, dev:sdf1
Nov 27 19:03:52 ornery kernel:  disk 5, o:1, dev:sdg1
Nov 27 19:03:52 ornery kernel:  disk 6, o:1, dev:sdh1
Nov 27 19:03:52 ornery kernel:  disk 7, o:1, dev:sdi1
Nov 27 19:03:52 ornery kernel: raid5: failed to run raid set md0
Nov 27 19:03:52 ornery kernel: md: pers->run() failed ...
Code:
[root@ornery ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive sdc1[1] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2]
      2734961152 blocks
unused devices: <none>
Attempts to force assembly fail:
Code:
[root@ornery ~]# mdadm -S /dev/md0
[root@ornery ~]# mdadm --assemble --force --scan /dev/md0
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
Leaving out the bad drive:
Code:
[root@ornery ~]# mdadm -S /dev/md0
[root@ornery ~]# mdadm --assemble --force /dev/md0 /dev/sd[c-i]1
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
[root@ornery ~]# mdadm -S /dev/md0
[root@ornery ~]# mdadm --assemble --force --run /dev/md0 /dev/sd[c-i]1
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
Trying to fail or remove the bad drive doesn't work either:
Code:
[root@ornery ~]# mdadm -f /dev/md0 /dev/sdb1
mdadm: set device faulty failed for /dev/sdb1: No such device
[root@ornery ~]# mdadm -r /dev/md0 /dev/sdb1
mdadm: hot remove failed for /dev/sdb1: No such device
A quick check of the event counters shows that only /dev/sdb1 is stale:
Code:
[root@ornery ~]# mdadm -E /dev/sd[b-i]1 | grep Event
         Events : 0.851758
         Events : 0.854919
         Events : 0.854919
         Events : 0.854919
         Events : 0.854919
         Events : 0.854919
         Events : 0.854919
         Events : 0.854919
Here's a full examine from one of the good drives:
Code:
[root@ornery ~]# mdadm -E /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.03
           UUID : d57cea81:3be21b7d:183a67d9:782c3329
  Creation Time : Tue Mar 21 11:14:56 2006
     Raid Level : raid6
    Device Size : 390708736 (372.61 GiB 400.09 GB)
     Array Size : 2344252416 (2235.65 GiB 2400.51 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0
    Update Time : Mon Nov 27 10:10:36 2006
          State : active
 Active Devices : 7
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 0
       Checksum : ebd6e3a8 - correct
         Events : 0.854919
      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1
   0     0       0        0        0      removed
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       97        5      active sync   /dev/sdg1
   6     6       8      113        6      active sync   /dev/sdh1
   7     7       8      129        7      active sync   /dev/sdi1
And detail for the array:
Code:
[root@ornery ~]# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Mar 21 11:14:56 2006
     Raid Level : raid6
    Device Size : 390708736 (372.61 GiB 400.09 GB)
   Raid Devices : 8
  Total Devices : 7
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Mon Nov 27 10:10:36 2006
          State : active, degraded
 Active Devices : 7
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 0
     Chunk Size : 256K
           UUID : d57cea81:3be21b7d:183a67d9:782c3329
         Events : 0.854919
    Number   Major   Minor   RaidDevice State
   9421816     0        0   1912995864      removed
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1
       4       8       81        4      active sync   /dev/sdf1
       5       8       97        5      active sync   /dev/sdg1
       6       8      113        6      active sync   /dev/sdh1
       7       8      129        7      active sync   /dev/sdi1
So I've obviously got a degraded array. Where does the "dirty" part come in? Why can't I simply force this thing back together in active degraded mode with 7 drives and then add a fresh /dev/sdb1?
I know as a last resort I can create a "new" array over my old one and as long as I get everything juuuuust right, it'll work, but that seems a rather drastic solution to what should be a trivial (and all too common) situation -- dealing with a single failed drive. I mean... I run RAID6 to provide a little extra protection, not to slam into these kinds of brick walls. Heck, I might as well run RAID0! ARGH!!! Ok... ok... I'll calm down.
FWIW, here's my mdadm.conf:
Code:
[root@ornery ~]# grep -v '^#' /etc/mdadm.conf
DEVICE /dev/sd[bcdefghi]1
ARRAY /dev/md0 UUID=d57cea81:3be21b7d:183a67d9:782c3329
MAILADDR root
Have I missed something obvious? Thanks in advance for any clues...
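For anyone repeating the event-counter check from the first post, here is a minimal sketch that picks the stale members out of `mdadm -E` output instead of eyeballing the grep. The function name `stale_members` is made up for illustration, and it assumes the output looks like the examine listing in the post: a `/dev/...:` header line per device followed by an `Events :` line. It compares the two halves of a 0.90-style counter such as `0.854919` separately rather than treating it as a decimal fraction.

```shell
#!/bin/sh
# stale_members (hypothetical helper): read `mdadm -E` output on stdin and
# print every device whose event counter is lower than the highest one seen.
stale_members() {
    awk '
        # Header line such as "/dev/sdc1:" starts a new device section.
        /^\/dev\/.*:$/ { dev = $1; sub(/:$/, "", dev) }
        # "Events : 0.854919" -> split into high and low halves.
        $1 == "Events" {
            split($3, p, ".")
            count[dev] = p[1] * 4294967296 + p[2]
            if (count[dev] > max) max = count[dev]
        }
        END {
            for (d in count)
                if (count[d] < max) print d
        }
    '
}
```

Typical use would be `mdadm -E /dev/sd[b-i]1 | stale_members`, which for the counters shown in the post should name only /dev/sdb1.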
- 11-28-2006 #2 · Just Joined! · Join Date: Nov 2006 · Posts: 4
Ok, done a bit more poking around... I tried zeroing out the superblock on the failed device and adding it back into the array. It just sat there looking stupid. The status of the new drive became "sync", the array status remained inactive, and no resync took place:
Code:
[root@ornery ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive sdb1[0](S) sdc1[1] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2]
      3125669888 blocks
unused devices: <none>
Another thing I noticed was the new drive didn't fill the slot for the missing drive, but instead occupied a new slot. Here's a detail for the array:
Code:
[root@ornery ~]# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Mar 21 11:14:56 2006
     Raid Level : raid6
    Device Size : 390708736 (372.61 GiB 400.09 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Mon Nov 27 10:10:36 2006
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
     Chunk Size : 256K
           UUID : d57cea81:3be21b7d:183a67d9:782c3329
         Events : 0.854919
    Number   Major   Minor   RaidDevice State
   4150256     0        0   1912995872      removed
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1
       4       8       81        4      active sync   /dev/sdf1
       5       8       97        5      active sync   /dev/sdg1
       6       8      113        6      active sync   /dev/sdh1
       7       8      129        7      active sync   /dev/sdi1
       0       8       17        -      active sync   /dev/sdb1
It's like it's just adding the new /dev/sdb1 in as a spare or something. My hunch is that the problem stems from the superblock indicating that the bad device is simply "removed" rather than failed. Yet trying to fail the device... well, failed.
Barring any sudden insights from my fellow Linuxens, it's looking like I have another romp with mddump looming in my future. By my reckoning, I would need to set the SB's to indicate that device 0's status is failed rather than removed, and set the counters to indicate 1 failed device and 7 active/working devices.
If anyone has suggestions, feel free to jump in at any time!!
- 11-28-2006 #3 · Just Joined! · Join Date: Nov 2006 · Posts: 4
Ok, I tried hacking up the superblocks with mddump. The good news is I didn't screw anything up permanently. The bad news is I made no progress either.
Ultimately, I started reading through the kernel source and wandered into a helpful text file, Documentation/md.txt, in the kernel source tree. I was able to start the array, for reading at least. (baby steps...) Here's how:
Code:
[root@ornery ~]# cat /sys/block/md0/md/array_state
inactive
[root@ornery ~]# echo "clean" > /sys/block/md0/md/array_state
[root@ornery ~]# cat /sys/block/md0/md/array_state
clean
[root@ornery ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdc1[1] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2]
      2344252416 blocks level 6, 256k chunk, algorithm 2 [8/7] [_UUUUUUU]
unused devices: <none>
[root@ornery ~]# mount -o ro /dev/md0 /data
[root@ornery ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda2             226G   46G  168G  22% /
/dev/hda1             251M   52M  187M  22% /boot
/dev/shm              2.9G     0  2.9G   0% /dev/shm
/dev/sda2              65G   35G   27G  56% /var
/dev/md0              2.2T  307G  1.8T  15% /data
At least I can get to my data now. Yay!
Once I've got a full backup (fingers crossed), I can apply some riskier methods of getting this array into a sane condition again.
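The sysfs step in this post can be wrapped in a small guard so it only fires when the array really is sitting in the "inactive" state. This is just a sketch: `mark_array_clean` is a hypothetical name, the default path assumes the array is md0 as in the thread, and refusing to touch any state other than "inactive" is a cautious choice of this sketch, not something md.txt requires. A later reply notes that the echo may print an error and still take effect, so the function reports the state it reads back rather than trusting the exit status of the write.

```shell
#!/bin/sh
# mark_array_clean (hypothetical helper): write "clean" to an md array_state
# file, but only if it currently reads "inactive". The path defaults to md0,
# which is an assumption taken from the thread.
mark_array_clean() {
    state_file=${1:-/sys/block/md0/md/array_state}

    if [ ! -w "$state_file" ]; then
        echo "cannot write $state_file (are you root?)" >&2
        return 1
    fi

    current=$(cat "$state_file")
    if [ "$current" != "inactive" ]; then
        echo "array_state is '$current', not 'inactive'; leaving it alone" >&2
        return 1
    fi

    # The write itself may complain; what matters is the state read back.
    echo clean > "$state_file"
    cat "$state_file"
}
```

After `mark_array_clean` prints `clean`, the next steps would be the same as in the post: check /proc/mdstat, then `mount -o ro /dev/md0 /data` and take a backup before trying anything riskier.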
- 11-29-2006 #4 · Just Joined! · Join Date: Nov 2006 · Posts: 4
Backup successful!
So after that, I did the following:
Code:
umount /data
Code:
mdadm /dev/md0 -a /dev/sdb1
The drive was added without error. A quick check of the array:
Code:
[root@ornery ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdb1[8] sdc1[1] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2]
      2344252416 blocks level 6, 256k chunk, algorithm 2 [8/7] [_UUUUUUU]
      [>....................]  recovery =  0.2% (823416/390708736) finish=13924.4min speed=465K/sec
unused devices: <none>
...and...
Code:
[root@ornery ~]# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Mar 21 11:14:56 2006
     Raid Level : raid6
     Array Size : 2344252416 (2235.65 GiB 2400.51 GB)
    Device Size : 390708736 (372.61 GiB 400.09 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Wed Nov 29 11:03:51 2006
          State : clean, degraded, recovering
 Active Devices : 7
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 1
     Chunk Size : 256K
 Rebuild Status : 0% complete
           UUID : d57cea81:3be21b7d:183a67d9:782c3329
         Events : 0.854924
    Number   Major   Minor   RaidDevice State
       8       8       17        0      spare rebuilding   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1
       4       8       81        4      active sync   /dev/sdf1
       5       8       97        5      active sync   /dev/sdg1
       6       8      113        6      active sync   /dev/sdh1
       7       8      129        7      active sync   /dev/sdi1
Now that's what I was looking for! It's moving kinda slow right now, probably because I'm also doing an fsck. I can't be certain, but I think the problem was that the state of the good drives (and the array) was marked as "active" rather than "clean." (active == dirty?) I expect this was caused by doing a hard reset on a system with a degraded array, in the midst of it being brought to a crawl trying to talk to the failed drive. Seems like some work might be needed to be able to handle these situations a little more gracefully.
Anyway, it appears I might be firmly on the road to recovery now. (If not, you'll hear the screams...) Hopefully my posts will be helpful to others encountering this problem.
-cw-
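Since a rebuild like the one above can run for hours, a small helper for pulling the progress figures out of /proc/mdstat may be handy. This is a rough sketch, with `recovery_status` as a made-up name, and it assumes the progress line has the same shape as the one quoted in the post (a `recovery =` or `resync =` field, a percentage, and a `finish=` estimate).

```shell
#!/bin/sh
# recovery_status (hypothetical helper): print "<array> <percent> <eta>" for
# every array in an mdstat-format file that shows a recovery or resync line.
# Reads /proc/mdstat unless another file is given as the first argument.
recovery_status() {
    awk '
        # Remember which array the following lines belong to.
        /^md[0-9]+ :/ { array = $1 }
        /recovery =|resync =/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) pct = $i
                if ($i ~ /^finish=/) { fin = $i; sub(/^finish=/, "", fin) }
            }
            print array, pct, fin
        }
    ' "${1:-/proc/mdstat}"
}
```

Something like `watch -n 60 cat /proc/mdstat` does the same job interactively; the function is only useful when the numbers need to go into a log or a mail.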
- 10-16-2008 #5 · Just Joined! · Join Date: Oct 2008 · Posts: 1
Quick thank you...
I can confirm that this works.
I was in a similar position: RAID 5, four 1 TB drives. Had a drive fail, so off it goes to the WD repair shop, only it takes 3 weeks to return. During this time the system was available as a degraded array, only being powered up when critical data was needed.
Then, as you can only imagine, the machine loses power in one of these "I must have that file.." moments. The UPS kicked in but died (whilst at lunch 'Ahem!').
Needless to say the reboot resulted in some shock as one of the raid partitions (in my case 2.6TB MD3) was 'inactive'.
I spent an age looking at the web, reading what are embarrassingly old "Linux Software RAID HOWTO" documents. (Btw. people should date their howtos in bold at the top of the document.)
And then I was most fortunate in that I found this post and it hit the nail on the head. This describes what can only be a common occurrence, and I guess it should be handled more obviously or at least reported more clearly in the log files: a simple "The raid partition terminated un-cleanly whilst in a degraded state, please use obscure echo "clean" > ...." message,
or ideally a new switch in mdadm which lets you 'force clean' on a degraded array.
Anyway I followed the ideas presented here and was soon looking at the best bit of news possible:
root@NAS:/home/andy# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid5 sdb4[4] sda4[0] sdd4[3] sdc4[2]
2881891968 blocks level 5, 64k chunk, algorithm 2 [4/3] [U_UU]
[>....................] recovery = 1.4% (13577932/960630656) finish=295.4min speed=53416K/sec
NOTE: Some points to look out for....
1) When you echo "clean" > to the system block device you may get an error; it will however still work!
2) Remember you need to put your clean new hard drive in and then set up the partitions BEFORE adding the drive back into the array. In my case I run a simple setup with all the drives having exactly the same partition table, so I can simply copy the partition table from any working drive in a couple of seconds:
sfdisk -d /dev/sda | sfdisk /dev/sdb
where I want to copy from good device A (/dev/sda) to new and clean device B (/dev/sdb).
So a big thank you for spending the time sharing your experience and for trawling the source code.



