  1. #1
    Just Joined!
    Join Date
    Aug 2010
    Posts
    4

    md raid1 - errors


    Hello,

    I am running an md RAID1 array with two disks. For a couple of weeks now I have been seeing errors in the log file.

    Code:
    Sep  6 15:22:48 SSS kernel: [  509.057247] attempt to access beyond end of device
    Sep  6 15:22:48 SSS kernel: [  509.057247] md2: rw=0, want=34242316368, limit=1452532864
    Sep  6 15:22:48 SSS kernel: [  509.057247] attempt to access beyond end of device
    Sep  6 15:22:48 SSS kernel: [  509.057247] md2: rw=0, want=15761873664, limit=1452532864
    Sep  6 15:22:48 SSS kernel: [  509.057247] attempt to access beyond end of device
    Sep  6 15:22:48 SSS kernel: [  509.057247] md2: rw=0, want=16300352376, limit=1452532864
    Sep  6 15:22:48 SSS kernel: [  509.073905] attempt to access beyond end of device
    Sep  6 15:22:48 SSS kernel: [  509.073905] md2: rw=0, want=34242316368, limit=1452532864
    Sep  6 15:22:48 SSS kernel: [  509.073905] attempt to access beyond end of device
    Sep  6 15:22:48 SSS kernel: [  509.073905] md2: rw=0, want=15761873664, limit=1452532864
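    The limit in these messages appears to be simply the size of md2 in 512-byte sectors (726266432 1-KiB blocks x 2 = 1452532864), while the want values (e.g. 34242316368 sectors, roughly 16 TiB) point far beyond the 692 GiB array, which suggests corrupt filesystem metadata rather than a device size problem. A sketch of how the sector count can be confirmed (assuming blockdev is available in the rescue system):
    Code:
    blockdev --getsz /dev/md2       # size in 512-byte sectors -> 1452532864
    blockdev --getsize64 /dev/md2   # size in bytes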
    Therefore I booted a rescue system and ran e2fsck on /dev/md2. The result was weird.

    The result of the first run:
    Code:
    # e2fsck -f -y -D /dev/md2
    e2fsck 1.41.3 (12-Oct-2008)
    Pass 1: Checking inodes, blocks, and sizes
    Inode 21750079, i_blocks is 304, should be 104.  Fix? yes
    
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 3A: Optimizing directories
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    Block bitmap differences:  -(87009857--87009863) -(87009871--87009872) -(87009874--87009875) -(87009877--87009878) -(87009880--87009881) -(87009883--87009892)
    Fix? yes
    
    Free blocks count wrong for group #2655 (22944, counted=22969).
    Fix? yes
    
    Free blocks count wrong (171654137, counted=171654162).
    Fix? yes
    
    
    /dev/md2: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/md2: 458582/45391872 files (0.8% non-contiguous), 9912446/181566608 blocks
    The result of the second run:
    Code:
    # e2fsck -f -y -D /dev/md2
    e2fsck 1.41.3 (12-Oct-2008)
    Pass 1: Checking inodes, blocks, and sizes
    Inode 21750079, i_blocks is 104, should be 304.  Fix? yes
    
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 3A: Optimizing directories
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    Block bitmap differences:  +(87009857--87009863) +(87009871--87009872) +(87009874--87009875) +(87009877--87009878) +(87009880--87009881) +(87009883--87009892)
    Fix? yes
    
    Free blocks count wrong for group #2655 (22969, counted=22944).
    Fix? yes
    
    Free blocks count wrong (171654162, counted=171654137).
    Fix? yes
    
    
    /dev/md2: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/md2: 458582/45391872 files (0.8% non-contiguous), 9912471/181566608 blocks
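    Since md can satisfy a read from either half of the mirror, results that flip back and forth like this make me wonder whether the two disks still hold identical data. A sketch of how that could be tested (assuming the array is assembled as md2):
    Code:
    echo check > /sys/block/md2/md/sync_action   # compare the mirror halves
    cat /proc/mdstat                             # watch the check progress
    cat /sys/block/md2/md/mismatch_cnt           # non-zero = halves differ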

    smartctl states that both disks, sda and sdb, are fine.
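    (By "smartctl states" I mean checks along these lines; a sketch, the exact invocations may have differed:)
    Code:
    smartctl -H /dev/sda        # overall health self-assessment
    smartctl -a /dev/sda        # attributes and error logs
    smartctl -t long /dev/sda   # extended offline self-test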

    The md config looks like this:
    Code:
    DEVICES /dev/sda* /dev/sdb*
    ARRAY /dev/md0 level=raid1 num-devices=2 metadata=0.90 UUID=44f771b6:0147286b:776c2c25:004bd7b2
    ARRAY /dev/md1 level=raid1 num-devices=2 metadata=0.90 UUID=ef35b97a:7b8ff70f:776c2c25:004bd7b2
    ARRAY /dev/md2 level=raid1 num-devices=2 metadata=0.90 UUID=e8004a1c:bf724e88:776c2c25:004bd7b2
    MAILADDR root
    /proc/mdstat says this:
    Code:
    Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] 
    md2 : active raid1 sda3[0] sdb3[1]
          726266432 blocks [2/2] [UU]
          
    md1 : active raid1 sda2[0] sdb2[1]
          2104448 blocks [2/2] [UU]
          
    md0 : active (auto-read-only) raid1 sda1[0] sdb1[1]
          4200896 blocks [2/2] [UU]
          
    unused devices: <none>
    and mdadm tells me this:
    Code:
    mdadm -Q --detail /dev/md2
    /dev/md2:
            Version : 00.90
      Creation Time : Mon Jun 15 00:24:04 2009
         Raid Level : raid1
         Array Size : 726266432 (692.62 GiB 743.70 GB)
      Used Dev Size : 726266432 (692.62 GiB 743.70 GB)
       Raid Devices : 2
      Total Devices : 2
    Preferred Minor : 2
        Persistence : Superblock is persistent
    
        Update Time : Tue Sep  7 14:35:05 2010
              State : clean
     Active Devices : 2
    Working Devices : 2
     Failed Devices : 0
      Spare Devices : 0
    
               UUID : e8004a1c:bf724e88:776c2c25:004bd7b2
             Events : 0.76
    
        Number   Major   Minor   RaidDevice State
           0       8        3        0      active sync   /dev/sda3
           1       8       19        1      active sync   /dev/sdb3
    Any help would be very much appreciated.

    Best regards, Tom

  2. #2
    Linux Guru Rubberman
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,591
    Hmmm (I say that when I am bumfuzzled and thinking about the question)...

    Possibly your discs are not identical?
    Possibly your partition tables are not correct and do not reflect the physical size of the discs?

    From your posting, these appear to be mirrored discs. If so, can you remove one and then run fsck on a single disc?
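    Something along these lines, perhaps (a sketch only; I am assuming sda3 is the member you pull, and that you have backups):
    Code:
    mdadm /dev/md2 --fail /dev/sda3 --remove /dev/sda3   # drop one half
    e2fsck -f /dev/md2   # check the degraded array, now reading only sdb3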

    Sorry, but at this point, I am reverting to the SWAG (Stupid Wild Assed Guess) protocol...
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Just Joined!
    Join Date
    Aug 2010
    Posts
    4
    Hey,

    many thanks for your response. I will give it a try today and then post my findings.

    regards, Tom

  4. #4
    Just Joined!
    Join Date
    Aug 2010
    Posts
    4
    Hey,

    I gave it a try yesterday. I removed the sda disk, booted into rescue mode, and ran e2fsck; now it seems to be running smoothly. I will re-add the second disk to the array today, as sketched below.
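    For the re-add I plan something along these lines (a sketch; device names as in my first post):
    Code:
    mdadm /dev/md2 --add /dev/sda3   # without a write-intent bitmap this
                                     # triggers a full resync
    watch cat /proc/mdstat           # monitor the resync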

    Additionally, I have checked the partition tables of both devices and I can't see anything suspicious:

    Code:
    root-rescue ~ # sfdisk -l /dev/sda
    
    Disk /dev/sda: 91201 cylinders, 255 heads, 63 sectors/track
    Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
    
       Device Boot Start     End   #cyls    #blocks   Id  System
    /dev/sda1          0+    522     523-   4200997   fd  Linux raid autodetect
    /dev/sda2        523     784     262    2104515   fd  Linux raid autodetect
    /dev/sda3        785   91200   90416  726266520   fd  Linux raid autodetect
    /dev/sda4          0       -       0          0    0  Empty
    Code:
    root-rescue ~ # sfdisk -l /dev/sdb
    
    Disk /dev/sdb: 91201 cylinders, 255 heads, 63 sectors/track
    Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
    
       Device Boot Start     End   #cyls    #blocks   Id  System
    /dev/sdb1          0+    522     523-   4200997   fd  Linux raid autodetect
    /dev/sdb2        523     784     262    2104515   fd  Linux raid autodetect
    /dev/sdb3        785   91200   90416  726266520   fd  Linux raid autodetect
    /dev/sdb4          0       -       0          0    0  Empty
    Code:
    root-rescue ~ # sfdisk -s
    /dev/md0:   4200896
    /dev/sdb: 732574584
    /dev/sda: 732574584
    /dev/md1:   2104448
    /dev/md2: 726266432
    total: 2197720944 blocks
    Code:
    root-rescue ~ # sfdisk -s /dev/sda
    732574584
    Code:
    root-rescue ~ # sfdisk -s /dev/sdb
    732574584
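    The sizes also seem to line up with the array: the 0.90 superblock occupies the last 64 KiB of each member, with the usable size rounded down to a 64 KiB multiple (a sketch of the arithmetic):
    Code:
    # sda3/sdb3 are 726266520 blocks (KiB) each:
    echo $(( 726266520 / 64 * 64 - 64 ))   # -> 726266432, matches md2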
    I will post something as soon as I see the error again.

    regards, tom

  5. #5
    Just Joined!
    Join Date
    Aug 2010
    Posts
    4
    Hi,

    I removed one hard disk from the array and ran e2fsck again, which resulted in stable behavior. However, the RAID was degraded. After a while the removed hard disk started throwing errors, and it got replaced.
    Everything is back to normal... as it looks at the moment.

    regards, tom
