  1. #1
    Just Joined!
    Join Date
    Dec 2009
    Posts
    1

    please help: unexplained spike in load average & drive usage

    Hi, I'm looking for some help. I'm working with a friend on a Fedora 10 server that mainly runs several fairly high-traffic Apache/PHP/MySQL sites.

    Recently a second hard drive was added for some additional storage (this problem may or may not be related to that addition).

    We keep seeing big jumps in load average without any obvious cause (e.g. a high-CPU process). What we have noticed, using iostat, is that right before each jump, %util on both drives goes to 100% and await/svctm climb sharply for several seconds. I haven't been able to track down what causes this. Lately it has been happening every couple of minutes, which usually gives the load average time to settle down (it runs around 1-2 if left alone), but when it happens several times in a row the server can almost grind to a halt. There's plenty of memory and CPU available, and we're not swapping.

    Is there anything else I can look at, or are there any likely causes? I've been trying to track it down for several days now with no luck.

    Below are the load averages and iostat output for the several seconds during which this took place.

    Now, I'm no expert at this type of thing, so please be nice.

    Thanks very much for your time! Any thoughts, comments, or help would be greatly appreciated.


    load average: 6.65, 6.63, 6.66 08:26:50
    load average: 6.65, 6.63, 6.66 08:26:51
    load average: 6.65, 6.63, 6.66 08:26:52
    load average: 6.65, 6.63, 6.66 08:26:53
    load average: 6.65, 6.63, 6.66 08:26:57
    load average: 13.24, 7.99, 7.10 08:26:58
    load average: 13.24, 7.99, 7.10 08:26:59


    Code:
    Time: 08:26:50 PM
    Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
    sda              23.00     0.00   26.00    0.00   912.00     0.00    35.08     0.17    6.65   4.19  10.90
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda3             23.00     0.00   26.00    0.00   912.00     0.00    35.08     0.17    6.65   4.19  10.90
    sdb               0.00    60.00    1.00    2.00     8.00   496.00   168.00     0.33  110.33 110.33  33.10
    sdb1              0.00    60.00    1.00    2.00     8.00   496.00   168.00     0.33  110.33 110.33  33.10
    
    Time: 08:26:51 PM
    Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
    sda              23.00     0.00   12.00    0.00   480.00     0.00    40.00     5.42   45.83  53.50  64.20
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda3             23.00     0.00   12.00    0.00   480.00     0.00    40.00     5.42   45.83  53.50  64.20
    sdb              24.00     0.00    2.00    0.00   112.00     0.00    56.00     2.22  242.00 326.50  65.30
    sdb1             24.00     0.00    2.00    0.00   112.00     0.00    56.00     2.22  242.00 326.50  65.30
    
    Time: 08:26:52 PM
    Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
    sda              26.00     0.00    1.00    0.00   136.00     0.00   136.00    18.73  912.00 1000.00 100.00
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda3             26.00     0.00    1.00    0.00   136.00     0.00   136.00    18.73  912.00 1000.00 100.00
    sdb               0.00     0.00    1.00    0.00    88.00     0.00    88.00     4.59  686.00 1000.00 100.00
    sdb1              0.00     0.00    1.00    0.00    88.00     0.00    88.00     4.59  686.00 1000.00 100.00
    
    Time: 08:26:53 PM
    Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               0.00     0.00    2.00    0.00   112.00     0.00    56.00    29.34 2084.00 500.00 100.00
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda3              0.00     0.00    2.00    0.00   112.00     0.00    56.00    29.34 2084.00 500.00 100.00
    sdb               0.00     0.00    2.00    0.00    56.00     0.00    28.00     4.06 1927.00 500.00 100.00
    sdb1              0.00     0.00    2.00    0.00    56.00     0.00    28.00     4.06 1927.00 500.00 100.00
    
    Time: 08:26:54 PM
    Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               0.00     0.00    1.00    0.00     8.00     0.00     8.00    32.12 2719.00 1001.00 100.10
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda3              0.00     0.00    1.00    0.00     8.00     0.00     8.00    32.12 2719.00 1001.00 100.10
    sdb              12.00     0.00    1.00    0.00     8.00     0.00     8.00     2.61 2633.00 1000.00 100.00
    sdb1             12.00     0.00    1.00    0.00     8.00     0.00     8.00     2.61 2633.00 1000.00 100.00
    
    Time: 08:26:58 PM
    Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               9.18   113.61   74.37   22.78  1746.84  1091.14    29.21    24.62  505.05  10.17  98.77
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda3              9.18   113.61   74.37   22.78  1746.84  1091.14    29.21    24.62  505.05  10.17  98.77
    sdb               7.59     6.33    6.65    0.63   326.58    55.70    52.52     2.58  608.30  94.65  68.89
    sdb1              7.59     6.33    6.65    0.63   326.58    55.70    52.52     2.58  608.30  94.65  68.89
    
    Time: 08:26:59 PM
    Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               0.00     0.00   30.00    4.00   792.00    32.00    24.24     0.39   11.53   6.12  20.80
    sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
    sda3              0.00     0.00   30.00    4.00   792.00    32.00    24.24     0.39   11.53   6.12  20.80
    sdb              12.00     0.00    5.00    0.00   168.00     0.00    33.60     0.04    8.80   8.80   4.40
    sdb1             12.00     0.00    5.00    0.00   168.00     0.00    33.60     0.04    8.80   8.80   4.40
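
    A minimal sketch for pulling the saturated samples out of longer iostat output like the above, assuming the extended (`-x`) layout shown here. The function name `flag_saturated` and the default threshold of 90 are made up for illustration; the `await` and `%util` column positions are read from each `Device:` header line rather than hard-coded, since they differ between sysstat versions.

```shell
# Hypothetical helper: print every device sample whose %util is at or above
# a threshold. Reads `iostat -x` style text on stdin.
flag_saturated() {
    threshold="${1:-90}"   # assumed default, adjust to taste
    awk -v limit="$threshold" '
        # Remember the most recent timestamp line, e.g. "Time: 08:26:52 PM".
        $1 == "Time:" { stamp = $2 " " $3; next }
        # Locate the await and %util columns from the header line.
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "await") a = i
                if ($i == "%util") u = i
            }
            next
        }
        # Report data lines at or above the limit.
        a && u && NF >= u && ($u + 0) >= limit {
            print stamp, $1, "await=" $a, "util=" $u
        }
    '
}
```

    Run against saved output, for example `flag_saturated 95 < iostat.log` (the file name is only a placeholder), it should list just the seconds where a device was pegged, which makes it easier to line the spikes up with cron jobs or log entries.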

  2. #2
    Just Joined!
    Join Date
    Nov 2008
    Location
    Virginia, USA
    Posts
    18
    Just curious, when was the last time you took an outage and fsck'd all your partitions? Also, check the output of smartctl on your drives (it's in the smartmontools package).

    Code:
    yum install smartmontools
    Code:
    smartctl -A /dev/sdx
    Then from a recovery shell (no partitions mounted):
    Code:
    for i in /dev/sd*; do fsck -fy "$i"; done
    That last one is a little ugly and will throw a few errors, but it's quick and easy and it will fsck every sd* partition. Please post the output from these commands. I know an outage sucks, but if your partitions are healthy and not too big it shouldn't take long. smartctl doesn't need an outage, though.
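
    Since the iostat output shows two drives, here is a rough sketch for running the SMART check on both in one go. The function name `check_drives` and the `SMARTCTL` override variable are hypothetical, added only so the loop can be dry-run without root; `-H` (overall health) and `-A` (vendor attributes) are standard smartctl options.

```shell
# Sketch: print SMART health and attributes for each drive passed as an argument.
# SMARTCTL is a hypothetical override (defaults to the real smartctl binary).
check_drives() {
    for dev in "$@"; do
        echo "== $dev =="
        "${SMARTCTL:-smartctl}" -H -A "$dev"
    done
}

# Usage (as root): check_drives /dev/sda /dev/sdb
```

    Attributes such as reallocated or pending sector counts are the usual things to compare between the two drives, though what counts as worrying depends on the drive model.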
