  1. #1
    Just Joined!
    Join Date
    Jul 2009
    Posts
    4

    TCP connections stuck in FIN_WAIT2 state


    Hi,

    I'm developing a simple TCP server. 99% of incoming connections terminate correctly and the sockets disappear from netstat output. However, a few connections hang around indefinitely in the FIN_WAIT2 state. Now, I know that the clients in this case are misbehaving by not sending a FIN,ACK to close the connection. However, regardless of client behaviour, the connections should only remain in this state for a maximum of 60 seconds (set globally by /proc/sys/net/ipv4/tcp_fin_timeout).
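To confirm the value actually in effect, the setting can be read straight from the /proc file named above. The following is a minimal sketch in Python; the helper names `parse_fin_timeout` and `read_fin_timeout` are made up for illustration, and the code assumes a Linux host where that file exists.

```python
from pathlib import Path
from typing import Optional

# Path taken from the post above; only present on Linux.
FIN_TIMEOUT_PATH = "/proc/sys/net/ipv4/tcp_fin_timeout"


def parse_fin_timeout(text: str) -> Optional[int]:
    """Parse the file contents (e.g. "60\n") into seconds, or None if malformed."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_fin_timeout(path: str = FIN_TIMEOUT_PATH) -> Optional[int]:
    """Return the configured timeout in seconds, or None if the file is unreadable."""
    try:
        return parse_fin_timeout(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("tcp_fin_timeout:", read_fin_timeout())
```

A result of None just means the file could not be read or parsed, for example on a non-Linux system.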

    Code:
    Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name    Timer
    tcp        0      0 10.0.0.12:2000             10.0.0.6:50990              FIN_WAIT2   9507/perl           off (0.00/0/0)
    tcp        0      0 10.0.0.12:2000             10.0.0.6:57896              FIN_WAIT2   7247/perl           off (0.00/0/0)
    tcp        0      0 10.0.0.12:2000             10.0.0.6:60683              FIN_WAIT2   6835/perl           off (0.00/0/0)
    You will notice that the timer output of netstat is showing that these connections are not being timed. To me that suggests that these connections will hang around forever, contrary to what tcp(7) says (repeated below)
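With hundreds of sockets it is tedious to pick these entries out by eye. Below is a rough Python sketch that filters netstat output for FIN_WAIT2 sockets whose timer column reads "off". It assumes the column layout shown above (as produced by something like `netstat -tanpo`) and a program name without spaces; the function name `stuck_fin_wait2` is hypothetical.

```python
from typing import List, Tuple

# Sample lines copied from the netstat output in this post.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name    Timer
tcp        0      0 10.0.0.12:2000             10.0.0.6:50990              FIN_WAIT2   9507/perl           off (0.00/0/0)
tcp        0      0 10.0.0.12:2000             10.0.0.6:57896              FIN_WAIT2   7247/perl           off (0.00/0/0)
tcp        0      0 10.0.0.12:2000             10.0.0.6:60683              FIN_WAIT2   6835/perl           off (0.00/0/0)
"""


def stuck_fin_wait2(netstat_output: str) -> List[Tuple[str, str]]:
    """Return (foreign_address, pid/program) for FIN_WAIT2 sockets with no timer."""
    results = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, foreign, state,
        # pid/program, timer name, timer details.
        if len(fields) < 8 or not fields[0].startswith("tcp"):
            continue  # skips the header line and anything that is not TCP
        state, program, timer = fields[5], fields[6], fields[7]
        if state == "FIN_WAIT2" and timer == "off":
            results.append((fields[4], program))
    return results


if __name__ == "__main__":
    for foreign, program in stuck_fin_wait2(SAMPLE):
        print(foreign, program)
```

The PID in each result identifies the server process that still holds the socket, which is worth checking before blaming the kernel.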

    Code:
    tcp_fin_timeout (integer; default: 60)
                  This specifies how many seconds to wait for a final FIN packet before the socket is forcibly closed.  This is strictly a violation of the TCP specification, but required to prevent denial-of-service attacks.  In Linux 2.2, the default value was 180.
    I have watched the packets flowing back and forth, and the server is correctly performing a half close of the connection once it has finished sending data. The client is misbehaving by not sending its FIN,ACK, but so is the server by not closing the connection anyway. The end result is that I have hundreds of connections and processes hanging around forever.

    I'm running RHEL 5.3 with 2.6.18-128.1.6.el5.PAE (i386) as my kernel.

    Is there anything I can do to find out why these connections are not being forcibly closed by the kernel? Why are these connections NOT being timed?

    Cheers,
    Georgio

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,574
    If you are running RHEL 5, then I assume you have a support subscription with Red Hat? If so, shouldn't you be addressing this question to them? Personally, I haven't seen this problem on my CentOS 5.3 system (RHEL 5.3 clone), but that means very little I'm afraid since I don't have access to all of your system configuration information.
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Just Joined!
    Join Date
    Jul 2009
    Posts
    4
    Quote: Originally posted by Rubberman
    If you are running RHEL 5, then I assume you have a support subscription with Red Hat? If so, shouldn't you be addressing this question to them? Personally, I haven't seen this problem on my CentOS 5.3 system (RHEL 5.3 clone), but that means very little I'm afraid since I don't have access to all of your system configuration information.

    Hi Rubberman,

    Ooops, I don't know why I wrote RHEL5.3, I'm actually running CentOS 5.3. I'm about to upgrade to the latest kernel (version 2.6.18-128.2.1). I'll keep you posted on how things go with it. I have a bad feeling that the problem won't be fixed.

    Cheers,
    Georgio

  4. #4
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,574
    I'm running the latest kernel now. I haven't experienced any problems like this, but that doesn't necessarily mean a lot. I have to think that the problem is likely in your TCP/IP configuration, or more likely a problem in the NIC and/or driver. I've had similar issues in the past with HP-UX running on PA-RISC systems where a fault in the onboard NIC would cause this type of problem. After a lot of investigation, HP engineering confirmed that the problem was a bug in the NIC firmware and how it interacted with the TCP/IP stack. The only solution was to use a separate ethernet board in the system instead of the one built into the standard I/O board. So, if you can use another NIC from a different manufacturer, that would confirm whether or not it is a problem related to your specific network adapter and/or drivers for it.

    FWIW, I am running an Intel S5000XVN workstation/server motherboard with dual onboard gigabit NIC's (Intel chip set) and have had absolutely zero problems with them.

  5. #5
    Just Joined!
    Join Date
    Jul 2009
    Posts
    4
    Hi,

    I think the problem was due to my own stupidity, though I'm not sure. I have changed my server code so that once I perform a half-close, I put a timeout on my call to select(). I think the reason the kernel wasn't reaping the connections is that half-closed sockets are still active. With the timeout, my code finishes gracefully and the kernel then reaps the connections after a couple of minutes. Thanks for your helpful suggestions! If you think what I'm doing sounds incorrect, please let me know.
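The original server is a Perl program and its code is not shown in this thread, so the following is only a sketch of the approach described here, written in Python: half-close the connection, wait a bounded time for the peer's FIN with select(), then close regardless. The function name `half_close_and_wait` and the 5-second default are assumptions. The explanation above is consistent with the kernel's own documentation of tcp_fin_timeout, which describes it as applying to orphaned connections, meaning ones no longer referenced by any application. A socket the process still holds open would therefore not be timed.

```python
import select
import socket
import time


def half_close_and_wait(conn: socket.socket, timeout: float = 5.0) -> bool:
    """Half-close `conn`, wait up to `timeout` seconds in total for the peer's FIN,
    then close the socket either way.

    Returns True if the peer closed its side in time, False otherwise.
    """
    try:
        conn.shutdown(socket.SHUT_WR)  # sends our FIN; we can still read
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False  # gave up waiting for the peer
            readable, _, _ = select.select([conn], [], [], remaining)
            if not readable:
                return False  # select() timed out
            if not conn.recv(4096):
                return True  # empty read: the peer sent its FIN
            # Otherwise discard trailing data and keep waiting.
    except OSError:
        return False  # e.g. connection reset by the peer
    finally:
        # Closing releases the socket from this process, so the kernel is
        # free to time out whatever state is left.
        conn.close()
```

Because the socket is always closed in the `finally` clause, a client that never sends its FIN can hold up the process for at most `timeout` seconds.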

    Thanks!
    Georgio

  6. #6
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,574
    I've been on vacation this past week. I'll think about what you are doing (a code sample would be useful) and whether it is an appropriate way to handle close events. In any case, I'm glad you found a workable solution, though to me, having half-open connections linger for a couple of minutes on close is still far too long except in unusual circumstances. At least that's my opinion.
