  1. #1
    Just Joined!
    Join Date
    Jul 2009

    IOWait, overall system performance and DMA

    Hey to all,
    I'm new around here, have been using Linux for quite some time now (running Windows XP in VirtualBox, as it is still required as part of my job), and I have to say I couldn't be happier since I switched.

    I'm studying computer science, now in my 5th semester, and have attended lectures on operating systems, computer architecture and similar subjects, picking up as much knowledge from them as I could, which brings me to the question I would like to ask. This field is really interesting to me.

    I'm wondering why performing I/O (file copies, in fact) causes a large portion of my CPU to be used up, shown in 'top' as IOWait (the 'wa' value).

    As far as I know, the scheduler basically keeps processes in a number of queues called something like "Running", "Runnable", and "Blocked".

    From what I have learned, I would assume that while performing disc I/O a process issues a write() or read() call, which blocks the process until it returns. While a process is blocked (waiting for its I/O operation to complete), another should be able to run, so that CPU time is not wasted waiting for (comparatively) slow hard discs.

    Instead, what the kernel seems (!) to do is not actually block the process, but 'poll' the hard disc device continuously, wasting CPU cycles.
    I have to admit I cannot be certain of this; it is just a hypothesis that I would basically like to see falsified.
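    A quick way to test the polling hypothesis from userspace is to compare wall-clock time with CPU time around a blocking read. The sketch below (plain Python, nothing kernel-specific; a pipe plus a delayed writer thread stands in for a slow device) shows that almost no CPU time is consumed while the process is blocked:

```python
import os
import threading
import time

# Create a pipe; reading from the empty pipe blocks the caller.
read_fd, write_fd = os.pipe()

def delayed_writer():
    """Simulate a slow device: deliver data after half a second."""
    time.sleep(0.5)
    os.write(write_fd, b"done")

threading.Thread(target=delayed_writer).start()

wall_start = time.monotonic()
cpu_start = time.process_time()

os.read(read_fd, 4)  # blocks until the writer thread delivers data

wall_used = time.monotonic() - wall_start
cpu_used = time.process_time() - cpu_start

print(f"wall: {wall_used:.3f}s, cpu: {cpu_used:.3f}s")
# If the process were busy-polling, cpu would be close to wall;
# for a truly blocked process, cpu stays near zero.
```

    If the kernel really were spinning on your behalf, the CPU time charged to the process (or to the system) would track the wall time; in the blocked case it does not.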

    From what I know of technologies like DMA, they basically perform 'asynchronous' I/O operations. When performing a write(), this is the sequence of actions I would assume to happen:
    1. The process calls write().
    2. The CPU instructs the DMA chip to transfer a chunk of data from somewhere to somewhere.
    3. The CPU puts the process in the 'blocked' queue.
    4. The CPU receives a DMA-completed interrupt from the DMA chip and puts the process back into the 'runnable' state.
    That should save a whole lot of CPU cycles compared to the polling method.
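    The four steps above can be sketched as a toy model (pure Python; a thread stands in for the DMA chip and a threading.Event stands in for the completion interrupt, and all the names are made up for illustration):

```python
import threading
import time

dma_done = threading.Event()  # stands in for the DMA-completed interrupt

def dma_chip(src, dst):
    """Toy 'DMA chip': copies data while the 'CPU' does not busy-wait."""
    time.sleep(0.2)   # pretend the transfer takes a while
    dst.extend(src)   # the actual copy
    dma_done.set()    # raise the 'interrupt'

def write(src, dst):
    # 1. process calls write()
    # 2. CPU instructs the DMA chip to start the transfer
    threading.Thread(target=dma_chip, args=(src, dst)).start()
    # 3. process is blocked; no CPU is spent while waiting
    dma_done.wait()
    # 4. 'interrupt' received: process is runnable again

source = b"some data"
destination = bytearray()
write(source, destination)
print(bytes(destination))  # b'some data'
```

    The key point is step 3: while the event is not set, the waiting thread is asleep, which is exactly the behaviour you would expect of a blocked process rather than a polling one.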

    I have recorded 'vmstat 1' output and a screenshot of 'top' running inside xterm while copying a number of files from hard disc to a USB device; the symptoms occur during disc-to-disc transfers as well, but are less noticeable.

    As can be seen, while transferring the files the IOWait value ranges from 50 to above 70% of CPU usage, which is confirmed by 'top'.

    Needless to say, overall system performance drops noticeably, with effects ranging from sluggish window interaction to windows not redrawing themselves, and other things.

    I'm running Linux Mint with XFCE on:
    Intel Q6600 2.66GHz quadcore
    4GB RAM (only 3GB detected; 32-bit OS at the moment)
    Asus P5Q Pro (P45 chipset, ICH10R)
    BIOS set to AHCI (but that makes little difference)

    dfisch-xfce ~ $ uname -a
    Linux dfisch-xfce 2.6.28-13-generic #45-Ubuntu SMP Tue Jun 30 19:49:51 UTC 2009 i686 GNU/Linux
    I'm pretty sure that mere disc I/O should not load such a CPU to a noticeable degree.

    The same happened to me on openSuSe and every other distribution I used before.

    I'd be really glad to hear any explanation for this, for I'm pretty sure I got something wrong along the way, or something else is happening.

    Thanks in advance!

    Edit: I'm sorry, the images appear to have been resized (this is my first posting), so I have attached the output of 'vmstat 1' below:
    dfisch-xfce ~ $ vmstat 1
    procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
     r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
     1  0  19532  89616  19996 2396904    0    1   128   156  166  558  4  2 91  3
     2  0  19532  89592  19996 2396904    0    0     0     0  460 1584  3  1 96  0
     1  0  19532  89560  19996 2396904    0    0     0     0  550 2250  4  2 94  0
     0  0  19532  89576  20004 2396900    0    0     0    40  503 2205  6  1 94  0
     0  2  19924  89408  20128 2390860    0  392   128 19268 1128 4902  8 12 73  7
     0  3  19972  89348  20128 2390588    0   48     0  3744  741 5234  7  2 40 52
     0  3  20052  89760  20128 2389992    0   80     0  3628  717 2004  5  1 24 70
     0  1  20052  89628  20128 2389892    0    0     0  3840  612 1587  5  0 40 55
     0  4  20052  86432  20140 2392668    0    0     0  1957  610 1646  3  1 56 40
     0  2  20052  86312  20148 2392508    0    0     0  3763  613 1586  5  0 41 55
     0  4  20252  88400  20148 2390024    0  200   128  3960  633 1746  4  1 10 84

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    I don't know specifically how the kernel has implemented these services, but as I understand it, an IOWait state is much like an Idle state: the CPU isn't really using cycles, and top is just showing how much relative time it spends in the IOWait state. Sluggishness when moving/copying large data sets is usually due to a couple of factors:

    1. Controller performance
    2. Disc performance
    3. Memory bandwidth limitations

    Often copies from disc to disc go over the same controller, i.e. either both discs are on the same controller, or the file systems in question are on the same disc. Even if the source and destination are on different discs and controllers, there is the memory bandwidth of the motherboard to consider as well. All of these things contribute to perceived performance. Other processes are affected because they also need the same resources (disc, controller, memory bus, etc.). So, basically, copying large data sets has a BIG impact on overall system performance.
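    To illustrate the point that iowait is just an accounting category rather than real CPU consumption, here is a small sketch that computes the iowait share from a /proc/stat 'cpu' line. The sample line is made up for illustration; on a real Linux box you would read the first line of /proc/stat instead:

```python
def iowait_percent(cpu_line):
    """Compute iowait as a percentage of total jiffies from a /proc/stat cpu line.

    Field order on Linux: user nice system idle iowait irq softirq ...
    """
    fields = [int(v) for v in cpu_line.split()[1:]]
    total = sum(fields)
    iowait = fields[4]  # fifth counter is time spent waiting for I/O
    return 100.0 * iowait / total

# Made-up sample; replace with open("/proc/stat").readline() on Linux.
sample = "cpu  4000 100 2000 90000 3900 0 0 0 0 0"
print(f"{iowait_percent(sample):.1f}%")  # 3.9%
```

    Like idle, the iowait counter grows while the CPU has nothing runnable to execute; the difference is only that some task is blocked on disc I/O during that time.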

    My personal experience (CentOS 5.3, dual E5450 3GHz Xeon quad-core CPUs, 5 internal 3 Gbps SATA drives, 5 external 3 Gbps eSATA drives: 1 on the internal controller, 4 on an add-on eSATA RAID controller) is that copying from an internal drive to an external drive on the RAID controller has the least impact and the highest throughput, capable of saturating both drives.
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!
