  1. #1

    Eeek! Can't assemble degraded/dirty RAID6 array!


    Ok, I'm a Linux software RAID veteran and I have the scars to prove it (google for mddump if you're bored), but that's not doing me much good now. I'm at the end of my rope... er... SATA cable. Help? Please??

    The subject platform is a PC running FC5 (Fedora Core 5, patched to latest) with eight 400 GB SATA drives (/dev/sd[b-i]1) assembled into a RAID6 md0 device. Originally built with mdadm. No LVM or other exotics. /dev/md0 holds the /data filesystem; nothing there is needed at boot time. It's been humming along nicely for months.

    Then... This morning I found that /dev/sdb1 had been kicked out of the array and there was the requisite screaming in /var/log/messages about failed read/writes, SMART errors, highly miffed SATA controllers, etc., all associated with /dev/sdb1. (It appears to have been a temporary failure -- badblocks found no problems.) Tried shutting the system down cleanly, which didn't seem to be working, so finally crossed my fingers and hit the reset button.
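
    (The drive checks were nothing exotic -- roughly along these lines, with smartctl shown as the obvious companion to badblocks rather than a literal transcript of my session:)

    Code:
    # SMART status and error log for the suspect drive (smartmontools)
    smartctl -a /dev/sdb
    # Non-destructive read-only surface scan of the kicked partition
    badblocks -sv /dev/sdb1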

    No surprise, it booted back up refusing to assemble the array. More specifically:

    Code:
    Nov 27 19:03:52 ornery kernel: md: bind<sdb1>
    Nov 27 19:03:52 ornery kernel: md: bind<sdd1>
    Nov 27 19:03:52 ornery kernel: md: bind<sde1>
    Nov 27 19:03:52 ornery kernel: md: bind<sdf1>
    Nov 27 19:03:52 ornery kernel: md: bind<sdg1>
    Nov 27 19:03:52 ornery kernel: md: bind<sdh1>
    Nov 27 19:03:52 ornery kernel: md: bind<sdi1>
    Nov 27 19:03:52 ornery kernel: md: bind<sdc1>
    Nov 27 19:03:52 ornery kernel: md: kicking non-fresh sdb1 from array!
    Nov 27 19:03:52 ornery kernel: md: unbind<sdb1>
    Nov 27 19:03:52 ornery kernel: md: export_rdev(sdb1)
    Nov 27 19:03:52 ornery kernel: md: md0: raid array is not clean -- starting background reconstruction
    Nov 27 19:03:52 ornery kernel: raid5: device sdc1 operational as raid disk 1
    Nov 27 19:03:52 ornery kernel: raid5: device sdi1 operational as raid disk 7
    Nov 27 19:03:52 ornery kernel: raid5: device sdh1 operational as raid disk 6
    Nov 27 19:03:52 ornery kernel: raid5: device sdg1 operational as raid disk 5
    Nov 27 19:03:52 ornery kernel: raid5: device sdf1 operational as raid disk 4
    Nov 27 19:03:52 ornery kernel: raid5: device sde1 operational as raid disk 3
    Nov 27 19:03:52 ornery kernel: raid5: device sdd1 operational as raid disk 2
    Nov 27 19:03:52 ornery kernel: raid5: cannot start dirty degraded array for md0
    Nov 27 19:03:52 ornery kernel: RAID5 conf printout:
    Nov 27 19:03:52 ornery kernel:  --- rd:8 wd:7 fd:1
    Nov 27 19:03:52 ornery kernel:  disk 1, o:1, dev:sdc1
    Nov 27 19:03:52 ornery kernel:  disk 2, o:1, dev:sdd1
    Nov 27 19:03:52 ornery kernel:  disk 3, o:1, dev:sde1
    Nov 27 19:03:52 ornery kernel:  disk 4, o:1, dev:sdf1
    Nov 27 19:03:52 ornery kernel:  disk 5, o:1, dev:sdg1
    Nov 27 19:03:52 ornery kernel:  disk 6, o:1, dev:sdh1
    Nov 27 19:03:52 ornery kernel:  disk 7, o:1, dev:sdi1
    Nov 27 19:03:52 ornery kernel: raid5: failed to run raid set md0
    Nov 27 19:03:52 ornery kernel: md: pers->run() failed ...
    Code:
    [root@ornery ~]# cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : inactive sdc1[1] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2]
          2734961152 blocks
    
    unused devices: <none>
    Attempts to force assembly fail:

    Code:
    [root@ornery ~]# mdadm -S /dev/md0
    [root@ornery ~]# mdadm --assemble --force --scan /dev/md0
    mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
    Leaving out the bad drive:

    Code:
    [root@ornery ~]# mdadm -S /dev/md0
    [root@ornery ~]# mdadm --assemble --force /dev/md0 /dev/sd[c-i]1
    mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
    [root@ornery ~]# mdadm -S /dev/md0
    [root@ornery ~]# mdadm --assemble --force --run /dev/md0 /dev/sd[c-i]1
    mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
    Trying to fail or remove the bad drive doesn't work either:

    Code:
    [root@ornery ~]# mdadm -f /dev/md0 /dev/sdb1
    mdadm: set device faulty failed for /dev/sdb1:  No such device
    [root@ornery ~]# mdadm -r /dev/md0 /dev/sdb1
    mdadm: hot remove failed for /dev/sdb1: No such device
    A quick check of the event counters shows that only /dev/sdb is stale:

    Code:
    [root@ornery ~]# mdadm -E /dev/sd[b-i]1 | grep Event
             Events : 0.851758
             Events : 0.854919
             Events : 0.854919
             Events : 0.854919
             Events : 0.854919
             Events : 0.854919
             Events : 0.854919
             Events : 0.854919
    Here's a full examine from one of the good drives:

    Code:
    [root@ornery ~]# mdadm -E /dev/sdc1
    /dev/sdc1:
              Magic : a92b4efc
            Version : 00.90.03
               UUID : d57cea81:3be21b7d:183a67d9:782c3329
      Creation Time : Tue Mar 21 11:14:56 2006
         Raid Level : raid6
        Device Size : 390708736 (372.61 GiB 400.09 GB)
         Array Size : 2344252416 (2235.65 GiB 2400.51 GB)
       Raid Devices : 8
      Total Devices : 8
    Preferred Minor : 0
    
        Update Time : Mon Nov 27 10:10:36 2006
              State : active
     Active Devices : 7
    Working Devices : 7
     Failed Devices : 0
      Spare Devices : 0
           Checksum : ebd6e3a8 - correct
             Events : 0.854919
    
    
          Number   Major   Minor   RaidDevice State
    this     1       8       33        1      active sync   /dev/sdc1
    
       0     0       0        0        0      removed
       1     1       8       33        1      active sync   /dev/sdc1
       2     2       8       49        2      active sync   /dev/sdd1
       3     3       8       65        3      active sync   /dev/sde1
       4     4       8       81        4      active sync   /dev/sdf1
       5     5       8       97        5      active sync   /dev/sdg1
       6     6       8      113        6      active sync   /dev/sdh1
       7     7       8      129        7      active sync   /dev/sdi1
    And detail for the array:

    Code:
    [root@ornery ~]# mdadm -D /dev/md0
    /dev/md0:
            Version : 00.90.03
      Creation Time : Tue Mar 21 11:14:56 2006
         Raid Level : raid6
        Device Size : 390708736 (372.61 GiB 400.09 GB)
       Raid Devices : 8
      Total Devices : 7
    Preferred Minor : 0
        Persistence : Superblock is persistent
    
        Update Time : Mon Nov 27 10:10:36 2006
              State : active, degraded
     Active Devices : 7
    Working Devices : 7
     Failed Devices : 0
      Spare Devices : 0
    
         Chunk Size : 256K
    
               UUID : d57cea81:3be21b7d:183a67d9:782c3329
             Events : 0.854919
    
        Number   Major   Minor   RaidDevice State
       9421816       0        0    1912995864      removed
           1       8       33        1      active sync   /dev/sdc1
           2       8       49        2      active sync   /dev/sdd1
           3       8       65        3      active sync   /dev/sde1
           4       8       81        4      active sync   /dev/sdf1
           5       8       97        5      active sync   /dev/sdg1
           6       8      113        6      active sync   /dev/sdh1
           7       8      129        7      active sync   /dev/sdi1
    So I've obviously got a degraded array. Where does the "dirty" part come in? Why can't I simply force this thing back together in active degraded mode with 7 drives and then add a fresh /dev/sdb1?

    I know as a last resort I can create a "new" array over my old one and, as long as I get everything juuuuust right, it'll work, but that seems a rather drastic solution to what should be a trivial (and all too common) situation -- dealing with a single failed drive. I mean... I run RAID6 to provide a little extra protection, not to slam into these kinds of brick walls. Heck, I might as well run RAID0! ARGH!!! Ok... ok... I'll calm down.
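
    (To spell out that last resort: it means recreating the array in place over the existing superblocks, which only works if every parameter -- level, chunk size, metadata version, device order -- matches the original exactly. A sketch only, and emphatically not something to run casually:)

    Code:
    # DANGEROUS last resort: recreate over the existing data, preserving it
    # only if all parameters and the device order match the original array.
    # Order shown follows the -E output above; device 0 (sdb1) is left out.
    mdadm --create /dev/md0 --assume-clean --metadata=0.90 \
          --level=6 --raid-devices=8 --chunk=256 \
          missing /dev/sd[c-i]1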

    FWIW, here's my mdadm.conf:

    Code:
    [root@ornery ~]# grep -v '^#' /etc/mdadm.conf
    DEVICE /dev/sd[bcdefghi]1
    ARRAY /dev/md0 UUID=d57cea81:3be21b7d:183a67d9:782c3329
    MAILADDR root
    Have I missed something obvious? Thanks in advance for any clues...

  2. #2
    Ok, done a bit more poking around... I tried zeroing out the superblock on the failed device and adding it back into the array. It just sat there looking stupid. The status of the new drive became "sync", the array status remained inactive, and no resync took place:

    Code:
    [root@ornery ~]# cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : inactive sdb1[0](S) sdc1[1] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2]
          3125669888 blocks
    
    unused devices: <none>
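    (The zero-and-re-add step itself isn't shown above; it was essentially the standard two commands, something along these lines:)

    Code:
    mdadm --zero-superblock /dev/sdb1
    mdadm /dev/md0 -a /dev/sdb1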
    Another thing I noticed was the new drive didn't fill the slot for the missing drive, but instead occupied a new slot. Here's a detail for the array:

    Code:
    [root@ornery ~]# mdadm --detail /dev/md0
    /dev/md0:
            Version : 00.90.03
      Creation Time : Tue Mar 21 11:14:56 2006
         Raid Level : raid6
        Device Size : 390708736 (372.61 GiB 400.09 GB)
       Raid Devices : 8
      Total Devices : 8
    Preferred Minor : 0
        Persistence : Superblock is persistent
    
        Update Time : Mon Nov 27 10:10:36 2006
              State : active
     Active Devices : 8
    Working Devices : 8
     Failed Devices : 0
      Spare Devices : 0
    
         Chunk Size : 256K
    
               UUID : d57cea81:3be21b7d:183a67d9:782c3329
             Events : 0.854919
    
        Number   Major   Minor   RaidDevice State
       4150256       0        0    1912995872      removed
           1       8       33        1      active sync   /dev/sdc1
           2       8       49        2      active sync   /dev/sdd1
           3       8       65        3      active sync   /dev/sde1
           4       8       81        4      active sync   /dev/sdf1
           5       8       97        5      active sync   /dev/sdg1
           6       8      113        6      active sync   /dev/sdh1
           7       8      129        7      active sync   /dev/sdi1
    
           0       8       17        -      active sync   /dev/sdb1
    It's like it's just adding the new /dev/sdb1 in as a spare or something. My hunch is that the problem stems from the superblock indicating that the bad device is simply "removed" rather than failed. Yet trying to fail the device... well, failed.

    Barring any sudden insights from my fellow Linuxens, it's looking like I have another romp with mddump looming in my future. By my reckoning, I would need to set the superblocks to indicate that device 0's status is failed rather than removed, and set the counters to indicate 1 failed device and 7 active/working devices.

    If anyone has suggestions, feel free to jump in at any time!!

  3. #3
    Ok, I tried hacking up the superblocks with mddump. The good news is I didn't screw anything up permanently. The bad news is I made no progress either.

    Ultimately, I started reading through the kernel source and wandered into a helpful file, Documentation/md.txt, in the source tree. I was able to start the array, for reading at least (baby steps...). Here's how:

    Code:
    [root@ornery ~]# cat /sys/block/md0/md/array_state
    inactive
    [root@ornery ~]# echo "clean" > /sys/block/md0/md/array_state
    [root@ornery ~]# cat /sys/block/md0/md/array_state
    clean
    [root@ornery ~]# cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid6 sdc1[1] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2]
          2344252416 blocks level 6, 256k chunk, algorithm 2 [8/7] [_UUUUUUU]
    
    unused devices: <none>
    [root@ornery ~]# mount -o ro /dev/md0 /data
    [root@ornery ~]# df -h
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/hda2             226G   46G  168G  22% /
    /dev/hda1             251M   52M  187M  22% /boot
    /dev/shm              2.9G     0  2.9G   0% /dev/shm
    /dev/sda2              65G   35G   27G  56% /var
    /dev/md0              2.2T  307G  1.8T  15% /data
    At least I can get to my data now. Yay!

    Once I've got a full backup (fingers crossed), I can apply some riskier methods of getting this array into a sane condition again.
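
    (In case it's useful to anyone, a backup from the read-only mount can be as simple as something like the following; the destination path is just a placeholder:)

    Code:
    rsync -a --progress /data/ /mnt/backup/data/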

  4. #4
    Backup successful!

    So after that, I did the following:

    Code:
    umount /data
    mdadm /dev/md0 -a /dev/sdb1
    The drive was added without error. A quick check of the array:

    Code:
    [root@ornery ~]# cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md0 : active raid6 sdb1[8] sdc1[1] sdi1[7] sdh1[6] sdg1[5] sdf1[4] sde1[3] sdd1[2]
          2344252416 blocks level 6, 256k chunk, algorithm 2 [8/7] [_UUUUUUU]
          [>....................]  recovery =  0.2% (823416/390708736) finish=13924.4min speed=465K/sec
    
    unused devices: <none>
    ...and...

    Code:
    [root@ornery ~]# mdadm -D /dev/md0
    /dev/md0:
            Version : 00.90.03
      Creation Time : Tue Mar 21 11:14:56 2006
         Raid Level : raid6
         Array Size : 2344252416 (2235.65 GiB 2400.51 GB)
        Device Size : 390708736 (372.61 GiB 400.09 GB)
       Raid Devices : 8
      Total Devices : 8
    Preferred Minor : 0
        Persistence : Superblock is persistent
    
        Update Time : Wed Nov 29 11:03:51 2006
              State : clean, degraded, recovering
     Active Devices : 7
    Working Devices : 8
     Failed Devices : 0
      Spare Devices : 1
    
         Chunk Size : 256K
    
     Rebuild Status : 0% complete
    
               UUID : d57cea81:3be21b7d:183a67d9:782c3329
             Events : 0.854924
    
        Number   Major   Minor   RaidDevice State
           8       8       17        0      spare rebuilding   /dev/sdb1
           1       8       33        1      active sync   /dev/sdc1
           2       8       49        2      active sync   /dev/sdd1
           3       8       65        3      active sync   /dev/sde1
           4       8       81        4      active sync   /dev/sdf1
           5       8       97        5      active sync   /dev/sdg1
           6       8      113        6      active sync   /dev/sdh1
           7       8      129        7      active sync   /dev/sdi1
    Now that's what I was looking for! It's moving kinda slow right now, probably because I'm also doing an fsck. I can't be certain, but I think the problem was that the state of the good drives (and the array) was marked as "active" rather than "clean" (active == dirty?). I expect this was caused by doing a hard reset on a system with a degraded array, in the midst of it being brought to a crawl trying to talk to the failed drive. Seems like some work is needed to handle these situations a little more gracefully.
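
    (To pull it all together, the sequence that got me from a dirty degraded array back to a rebuilding one was, in essence:)

    Code:
    # Assemble the surviving members (md0 ends up bound but inactive)
    mdadm --assemble --force /dev/md0 /dev/sd[c-i]1
    # Tell md the degraded array is clean so it will actually start
    echo "clean" > /sys/block/md0/md/array_state
    # Mount read-only and take a full backup before anything risky
    mount -o ro /dev/md0 /data
    # ...backup...
    umount /data
    # Re-add the replaced disk and let the rebuild run
    mdadm /dev/md0 -a /dev/sdb1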

    Anyway, it appears I might be firmly on the road to recovery now. (If not, you'll hear the screams...) Hopefully my posts will be helpful to others encountering this problem.

    -cw-

  5. #5

    Quick thank you...

    I can confirm that this works.

    I was in a similar position: RAID 5 with four 1 TB drives. Had a drive fail, so off it went to the WD repair shop, only it took three weeks to return. During this time the system was available as a degraded array, only being powered up when critical data was needed.

    Then, as you can only imagine, the machine loses power in one of those "I must have that file..." moments. The UPS kicked in but died (whilst I was at lunch, ahem!).

    Needless to say, the reboot resulted in some shock, as one of the RAID partitions (in my case a 2.6 TB md3) was 'inactive'.

    I spent an age on the web reading what is an embarrassingly old "Linux RAID Software HOWTO" (by the way, people should date their HOWTOs in bold at the top of the document).

    And then I was most fortunate in that I found this post, and it hit the nail on the head. This describes what can only be a common occurrence, and I think it should be handled more obviously, or at least reported more clearly in the log files: a simple "The RAID partition terminated uncleanly whilst in a degraded state, please use echo 'clean' > ..." message, or ideally a new switch in mdadm that lets you force a degraded array clean.

    Anyway I followed the ideas presented here and was soon looking at the best bit of news possible:

    Code:
    root@NAS:/home/andy# cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md3 : active raid5 sdb4[4] sda4[0] sdd4[3] sdc4[2]
          2881891968 blocks level 5, 64k chunk, algorithm 2 [4/3] [U_UU]
          [>....................]  recovery =  1.4% (13577932/960630656) finish=295.4min speed=53416K/sec
    NOTE: Some points to look out for....

    1) When you echo "clean" into the array's array_state file in sysfs, you may get an error; it will, however, still work!
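
    (For an array named md3, as in my case, that's:)

    Code:
    echo "clean" > /sys/block/md3/md/array_state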

    2) Remember you need to put your clean new hard drive in and set up the partitions BEFORE adding the drive back into the array. In my case I run a simple setup with all the drives having exactly the same partition table, so I can simply copy the partition table from any working drive in a couple of seconds:

    Code:
    sfdisk -d /dev/sda | sfdisk /dev/sdb
    where I want to copy from the good device (sda) to the new, clean device (sdb).

    So a big thank you for taking the time to share your experience and for trawling through the source code.
