  1. #1
    Just Joined!
    Join Date
    Mar 2013
    Posts
    3

    Huge lock contention when doing I/O


    Hi,

    I have a Fedora 16 system with Intel i7 970 processor, 12GB RAM, which seems sometimes to succumb to incredibly bad lock contention problems.

    Symptoms: any process attempting to read anything from the filesystem uses 100% system time for between several seconds and several minutes. When the problem gets bad, reading 10MB from /dev/zero can take several minutes. The system eventually becomes pretty much unusable.
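
    To put a number on the symptom, the read can be timed with wall-clock, user and system CPU time reported separately. This is only a minimal sketch, assuming Python is available on the affected machine. The function name timed_read and the chunk size are illustrative choices, not anything from the original report.

```python
import os
import time

def timed_read(path="/dev/zero", total_bytes=10 * 1024 * 1024, chunk=1024 * 1024):
    """Read total_bytes from path; return (bytes_read, wall, user, system) in seconds."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    done = 0
    # buffering=0 gives unbuffered reads, so each f.read() is a read() system call
    with open(path, "rb", buffering=0) as f:
        while done < total_bytes:
            data = f.read(min(chunk, total_bytes - done))
            if not data:  # end of file reached before total_bytes
                break
            done += len(data)
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return (done,
            wall_after - wall_before,
            cpu_after.user - cpu_before.user,
            cpu_after.system - cpu_before.system)

if __name__ == "__main__":
    n, wall, user, system = timed_read()
    print(f"read {n} bytes: wall={wall:.3f}s user={user:.3f}s sys={system:.3f}s")
```

    If system time accounts for nearly all of the wall time, the process is burning CPU inside the kernel rather than sleeping on a device.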

    This has been happening for a long time - I think since I upgraded from F13, but seems to be getting worse with recent kernels.

    The problem seems to occur only when VMWare Workstation is running, which suggests it's involved in some way. Quitting VMWare makes the problem go away until it's started again. (Unfortunately I need to run it most of the time.)

    When the problem is happening, perf top always shows this kind of output:

    49.14% [kernel] [k] mutex_spin_on_owner
    15.05% [kernel] [k] get_index
    7.92% [kernel] [k] prio_tree_next
    6.20% [kernel] [k] prio_tree_left
    5.98% [kernel] [k] prio_tree_right
    1.28% [kernel] [k] iter_walk_down

    It's always these calls, and basically similar percentages. Drilling down into mutex_spin_on_owner shows a call to static inline bool owner_running(), so presumably there's some hideous amount of lock contention going on.

    Attempting to attach a debugger to a process in this state hangs until the process finishes its read (which may be several minutes later). Such processes seem to be genuinely stuck in the read() call, but top shows them as running and using 100% system time, rather than in the "D" (device wait) state they would usually be in if blocked in read().
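
    The running-versus-"D" distinction can also be checked without a debugger by reading /proc/<pid>/stat, which reports the process state along with its accumulated user and system CPU time in clock ticks. The sketch below is a hedged illustration: parse_state is a hypothetical helper name, and the field positions follow the proc(5) layout (state is field 3, utime field 14, stime field 15).

```python
import sys

def parse_state(stat_line):
    """Return (state, utime_ticks, stime_ticks) from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split on the LAST ')' before tokenizing.
    rest = stat_line.rsplit(")", 1)[1].split()
    state = rest[0]              # field 3: R = running, D = uninterruptible sleep
    utime_ticks = int(rest[11])  # field 14: user-mode CPU time
    stime_ticks = int(rest[12])  # field 15: kernel-mode CPU time
    return state, utime_ticks, stime_ticks

if __name__ == "__main__":
    # Usage (assumed): python3 this_script.py <pid>; defaults to this process
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    with open(f"/proc/{pid}/stat") as f:
        print(parse_state(f.read()))
```

    Sampling this twice a few seconds apart for a stuck process shows whether stime keeps climbing while the state stays "R", which would match the spinning behaviour described above.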

    Does anyone know what might be happening here, if there's some setting which might help, or how I could find out more about what's going on? I posted on the Fedora and VMWare forums a while back, but got no answers.

    Any help/advice much appreciated.
    Last edited by mcgd; 03-07-2013 at 11:39 AM. Reason: Removed unnecessary code tag

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,392
    Have you determined if the core is in a high i/o wait situation? Is a lot of your I/O to USB devices?
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Just Joined!
    Join Date
    Mar 2013
    Posts
    3
    Quote Originally Posted by Rubberman View Post
    Have you determined if the core is in a high i/o wait situation? Is a lot of your I/O to USB devices?
    That's the odd thing - it seems to be running 100% system time, rather than I/O wait even while stuck in the read() call. It really is spinning, not waiting - you can hear the fans going on the system when a few processes start to do this. The problem happens without any USB storage devices connected except an SD card reader, which is rarely used.
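
    One way to confirm the system-time-versus-I/O-wait split is to take two samples of the aggregate "cpu" line in /proc/stat and compare how the elapsed ticks were distributed. This is a rough sketch rather than a polished tool. The helper names parse_cpu_line and shares are made up for illustration, and the column order is the one documented in proc(5).

```python
import time

# Column order of the "cpu" line in /proc/stat, per proc(5)
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def parse_cpu_line(line):
    """Map the leading counters of a 'cpu ...' line to their field names."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))

def shares(before, after):
    """Percentage of elapsed ticks spent in each state between two samples."""
    delta = {k: after[k] - before[k] for k in before}
    total = sum(delta.values())
    if total <= 0:
        return {k: 0.0 for k in delta}
    return {k: 100.0 * v / total for k, v in delta.items()}

def read_cpu_line(path="/proc/stat"):
    with open(path) as f:
        return parse_cpu_line(f.readline())

if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)  # sampling interval, chosen arbitrarily
    second = read_cpu_line()
    for name, pct in shares(first, second).items():
        print(f"{name:8s} {pct:5.1f}%")
```

    A high "system" share with "iowait" near zero would back up the observation that the processes are spinning in the kernel rather than waiting on a device.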

  4. #4
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,392
    Quote Originally Posted by mcgd View Post
    That's the odd thing - it seems to be running 100% system time, rather than I/O wait even while stuck in the read() call. It really is spinning, not waiting - you can hear the fans going on the system when a few processes start to do this. The problem happens without any USB storage devices connected except an SD card reader, which is rarely used.
    Hmmm (pause due to brain fart)... Have you tried a newer kernel, or upgrading to FC17 or 18? Remember that FC is Red Hat's sandbox for developing new stuff - not everything always works...

  5. #5
    Just Joined!
    Join Date
    Mar 2013
    Posts
    3
    Quote Originally Posted by Rubberman View Post
    Hmmm (pause due to brain fart)... Have you tried a newer kernel, or upgrading to FC17 or 18? Remember that FC is Red Hat's sandbox for developing new stuff - not everything always works...
    This issue has been dogging me since F14 I think, so it's an issue with a wide range of kernels. It seems to be getting worse if anything with more recent kernels, so I don't hold out much hope of a solution there, though I will probably upgrade at some point soonish. I'm running 3.6.11 at the moment.

    I was hoping that there might be some way of tracking this down a bit further, or that someone might recognise the issue. Seems that I must be about the only person with the problem.

    Boy, do I know that not everything works on Fedora - I've used it since its fork from RH, and it used to be reasonably stable years ago. I don't think it will be the distro I choose for my next system.

  6. #6
    Linux Engineer
    Join Date
    Apr 2012
    Location
    Virginia, USA
    Posts
    881
    If it's only happening when you're running VMWare Workstation, I'd say that's your problem right there. Have you considered using VirtualBox or KVM instead? Both are very mature at this point, and I can't think of any real reason to use VMWare if you're having problems with it. There are tools to convert your existing disk images as well.
