  1. #1
    Just Joined!
    Join Date
    Aug 2013
    Posts
    3

    Forever While loop cuts ping response time in half
    If I run a forever while loop on a specific core, and only for that specific core, the ping response time is cut in half.

    I have a Dell R510 server running Ubuntu 12.04 Server. It has 8 cores across 2 sockets with NUMA. It used to happen only when I ran the loop on core 0; then I used isolcpus to isolate core 0, and now, after boot, it can happen on core 1, 2, 3, or 4. I checked the interrupts in /proc, and the per-core interrupt counts don't seem to change whether the while loop is running or not. Also, using ps, I don't see any programs running on those cores that aren't running on them when this effect doesn't take place.

    Does anyone have any ideas how I can figure out why? Thanks.
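    The experiment described above can be sketched as follows. This is a minimal, Linux-specific reproduction, assuming `os.sched_setaffinity` is available (Python 3.3+); the core number is an arbitrary choice, since the thread reports the affected core varies after isolcpus. It is equivalent to running the loop under `taskset -c 0`.

    ```python
    import os
    import time

    CORE = 0  # arbitrary choice; the thread reports the effect moves among cores 0-4

    os.sched_setaffinity(0, {CORE})      # restrict this process to CORE
    pinned = os.sched_getaffinity(0)     # confirm the affinity mask took effect

    # Busy-spin. The original "forever while loop" never exits; here we stop
    # after a couple of seconds so the script terminates. While this runs,
    # measure latency from another terminal with `ping -c 100 <host>`.
    deadline = time.monotonic() + 2
    while time.monotonic() < deadline:
        pass
    ```

    Run one ping series with the loop going and one without, and compare the round-trip times.
    
    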

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,592
    Not so hard to understand. When you switch cores, register values have to migrate (cache may or may not be involved). If you can stick a bit of code to one specific core, then this doesn't happen, and the hardware can be utilized in a much more optimal fashion. It's a matter of speed vs. "fairness".
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Just Joined!
    Join Date
    Aug 2013
    Posts
    3
    Quote Originally Posted by Rubberman View Post
    Not so hard to understand. When you switch cores, register values have to migrate (cache may or may not be involved). If you can stick a bit of code to one specific core, then this doesn't happen, and the hardware can be utilized in a much more optimal fashion. It's a matter of speed vs. "fairness".
    But I have 8 cores, with eth0's interrupts spread out across various CPU cores. However, on each restart, only one core from 0-4 will give me a halved ping time if I run a while loop on it. If it were just better scheduling because I basically took a core out of the equation (leaving 7 remaining cores to handle ping echoes), shouldn't I get the same effect no matter which core I run the while loop on? And if I ran the while loop on more cores, would I get even better ping echo times?
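    The "interrupts don't change" observation above can be made systematic by diffing two snapshots of /proc/interrupts, one taken with the loop running and one without. A sketch, assuming a Linux /proc layout (the eth0 interface name is from this thread; substitute your NIC):

    ```python
    # Snapshot per-CPU interrupt totals from /proc/interrupts. Only the
    # numbered IRQ lines are summed; summary rows (NMI, LOC, ERR, ...) are
    # skipped. Watch the rows for your NIC (eth0 here) in particular.
    def irq_counts(path="/proc/interrupts"):
        with open(path) as f:
            cpus = f.readline().split()              # header: CPU0 CPU1 ...
            totals = {cpu: 0 for cpu in cpus}
            for line in f:
                fields = line.split()
                if not fields or not fields[0].rstrip(":").isdigit():
                    continue                         # skip non-numbered rows
                for cpu, count in zip(cpus, fields[1:]):
                    if count.isdigit():              # ignore trailing text columns
                        totals[cpu] += int(count)
            return totals

    before = irq_counts()
    print(before)
    ```

    Take `before` and `after` snapshots around a ping run and subtract per CPU; a core whose delta collapses or spikes while the loop runs would point at interrupt routing rather than scheduling.
    
    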

  4. #4
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,592
    Quote Originally Posted by quantumlight View Post
    But I have 8 cores, with eth0's interrupts spread out across various CPU cores. However, on each restart, only one core from 0-4 will give me a halved ping time if I run a while loop on it. If it were just better scheduling because I basically took a core out of the equation (leaving 7 remaining cores to handle ping echoes), shouldn't I get the same effect no matter which core I run the while loop on? And if I ran the while loop on more cores, would I get even better ping echo times?
    Re: better ping times? Probably not, due to kernel TCP/IP overhead and latency issues. Remember, part of the ping time is the response time of the remote system's TCP/IP stack.

    In my opinion, this is an interesting exercise that can help you understand where you can optimize necessary operations, but in the bigger picture it probably isn't too useful.
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  5. #5
    Just Joined!
    Join Date
    Aug 2013
    Posts
    3
    You are right, this is just out of pure curiosity because it is so odd. I'm thinking of something in the network stack at layer 3 and/or below that is somehow pinned at boot time to one specific core (and almost always pinned to core 0) but doesn't show up as a process in ps.

    Perhaps some kind of load balancer? A softirq handler? Perhaps ICMP echo replies are handled by a designated core since they are so simple, and running a while loop prevents other programs that would be scheduled before a top-half interrupt-handling thread from running at all, or gets them offloaded to another core? (So I'm not seeing this using ps.)

    It feels like it's a scheduler thing: somehow a while loop results in the top half of interrupt handling being run much, much quicker than it otherwise would be (I'm not even sure what kind of scheduler would do this), but I'm not sure how to test that, find which core it is running on, or look inside the Linux scheduler.
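    On the "find which core it is running on" point: kernel threads like ksoftirqd (the per-CPU threads that run deferred, bottom-half interrupt work) won't appear in a plain ps of user programs, but /proc exposes the CPU each task last ran on. A sketch, assuming the standard Linux /proc/<pid>/stat layout, where "processor" is the 39th field:

    ```python
    import os

    # comm (field 2 of /proc/<pid>/stat) may contain spaces, so split only
    # the part after the closing ')'. Field 39 overall = index 36 after comm.
    def last_cpu(pid):
        with open(f"/proc/{pid}/stat") as f:
            after_comm = f.read().rpartition(")")[2].split()
        return int(after_comm[36])

    # List each ksoftirqd thread and the core it last ran on.
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/comm") as f:
                name = f.read().strip()
        except OSError:
            continue                 # task exited while we were scanning
        if name.startswith("ksoftirqd"):
            print(f"{name} (pid {pid}) last ran on CPU {last_cpu(pid)}")
    ```

    The same `last_cpu` helper works on any pid, so it can also confirm which core the while loop itself is stuck to.
    
    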

    Also, a bit more information: I don't see this effect with Linux kernel 2.6.32-49-generic when I use the server to ping my own computer, which has 4 cores and runs Ubuntu 10.04, nor do I see it on my 4-core laptop running Ubuntu 12.04 (not Server) with kernel 3.5.0-34-generic. However, kernel 3.5.0-23-generic on the Dell R510 running Ubuntu 12.04 Server does exhibit it.

    It also doesn't seem to be the NIC, since it happens even when I ping the loopback address. The only difference I can see is that the R510 has two sockets with NUMA (each socket has its own memory, as with most servers), whereas my laptop and desktop have only one socket. Any theories?
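    The "designated core handles ICMP replies" guess is checkable: /proc/softirqs breaks softirq counts out per CPU, and incoming packet work lands in the NET_RX row. A sketch for reading one row as a per-CPU dict:

    ```python
    # Read one row of /proc/softirqs as {CPU name: count}. If ICMP echo
    # replies were funneled to one designated core, the NET_RX row should
    # be heavily skewed toward it; compare snapshots taken with and
    # without the while loop running.
    def softirq_row(name, path="/proc/softirqs"):
        with open(path) as f:
            cpus = f.readline().split()              # header: CPU0 CPU1 ...
            for line in f:
                fields = line.split()
                if fields and fields[0].rstrip(":") == name:
                    return dict(zip(cpus, map(int, fields[1:])))
        return {}

    net_rx = softirq_row("NET_RX")
    print(net_rx)
    ```

    Since the effect survives pinging loopback, a skew here that follows the "magic" core across reboots would support the softirq theory over anything NIC-specific.
    
    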
    Last edited by quantumlight; 08-08-2013 at 04:37 AM.
