- 05-29-2009 #1 Just Joined!
- Join Date
- May 2009
- Location
- Oregon
- Posts
- 51
netcat lockup / BASH / and select() or poll()
Hi,
I was wondering if anyone knows the semantics of select for Linux and other Unices...
Background info: (skip to enumerated Q's if impatient, but don't offer what I have already tried..)
I have Linux and Mac OS X (BSD UNIX) and have been programming them for some time.
I discovered, recently, that the netcat (nc) utility has a terrible habit of not hanging up when the TCP pipe is broken (under various circumstances, leaving tons of processes hanging up the system).
What ought to have been a very simple BASH script to get my Linux box talking to my Mac without bloated SSH has become a nightmare. For one thing, job control in BASH differs between the Unices: Linux BASH doesn't break out of wait when SIGINT is sent, but SIGINT can be trapped from a child process (even a disowned one), whereas OS X has exactly the opposite semantics and ignores any SIGINT not sent from the terminal, etc, etc, etc. So, I use SIGUSR1, turned job control off, and explicitly kill netcat when finished.... *gee*
But then I tried one more thing (of course! the important one...), using uuencode and uudecode, which really shows the problems with netcat clearly when used from a script.
essentially I am doing something like:
echo "this pipe dies quickly" | nc $myothercomputer $port | uudecode
The idea being that when the other computer finishes transferring the file, the pipeline would tear itself down. Instead, when the other side closes, nc continues to linger... indefinitely.
I thought to fix the open netcat source, but that appears to be big and ugly for such a small problem... (It sort of violates the UNIX pipe philosophy, too.) Also, the TCP and UDP drivers for Linux /dev/tcp don't exist on the Mac -- and I wanted one bash script to work on both machines WITHOUT the bloated SSH running...
(and who knows, my brother in law has SUN workstations...)
So, I thought writing a very simple nc replacement was in order... and want it to be general purpose and correct for Unix in general.... *sigh*
I tried forking a process to do blocking IO for half the pipe, which has semantics problems. Nonblocking IO has the side effect that if more than one program is using the same end of a pipe, they all become nonblocking (bad ... very bad...), not to mention that BASH will turn nonblocking off when it detects it. So that leaves poll and select, as I don't want to port a threaded app, nor do I want to install the GNU pthreads library for such a simple TCP replacement for netcat.
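The side effect described above follows from O_NONBLOCK being a file status flag, which belongs to the open file description that forked processes share rather than to each descriptor. A minimal sketch of how this can be observed; the function name `nonblock_is_shared` is made up for illustration:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if O_NONBLOCK set by a forked child is
 * visible in the parent, 0 if it is not, and -1 on a setup error. */
int nonblock_is_shared(void)
{
    int fds[2];
    int flags, wstatus;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: set O_NONBLOCK on its inherited copy of the write end. */
        int fl = fcntl(fds[1], F_GETFL, 0);
        _exit(fl < 0 || fcntl(fds[1], F_SETFL, fl | O_NONBLOCK) < 0);
    }

    /* Parent: wait for the child, then read the flags on its own descriptor. */
    if (waitpid(pid, &wstatus, 0) < 0 ||
        !WIFEXITED(wstatus) || WEXITSTATUS(wstatus) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    flags = fcntl(fds[1], F_GETFL, 0);
    close(fds[0]);
    close(fds[1]);
    if (flags < 0)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}
```

The parent never touches the flag itself, so a result of 1 means the child's change leaked through the shared description. That is the same mechanism by which one program can flip blocking behaviour for every other user of a shell-created pipe.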
So:
I chose select as it is portable, and wrote tcpsocket.c to detect and close half stream at a time ( via: shutdown() ) when the reader or writer closes.... (so far GREAT!) But -- the manual for select (and even W. Richard Stevens) don't tell me how to correctly avoid a deadlock possibility.
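The half-at-a-time close mentioned above can be sketched as follows. This is not the tcpsocket.c code: it uses an AF_UNIX socketpair() as a stand-in for the TCP connection (an assumption made so the example needs no network), and the function name `half_close_demo` is hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the peer saw EOF after shutdown(SHUT_WR)
 * and a reply still travelled back over the other half; -1 otherwise. */
int half_close_demo(void)
{
    int sv[2];
    char buf[8];
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Side 0 sends its data, then closes only its writing half. */
    if (write(sv[0], "req", 3) != 3)
        goto done;
    if (shutdown(sv[0], SHUT_WR) < 0)
        goto done;

    /* Side 1 reads the data, then sees end-of-file (read returns 0). */
    if (read(sv[1], buf, sizeof buf) != 3)
        goto done;
    if (read(sv[1], buf, sizeof buf) != 0)
        goto done;

    /* The opposite direction is still open, so a reply can be sent. */
    if (write(sv[1], "ack", 3) != 3)
        goto done;
    if (read(sv[0], buf, sizeof buf) != 3)
        goto done;
    ok = 0;

done:
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

The point of shutdown() over close() here is that the reader on the far side gets its EOF while the descriptor stays usable for the remaining direction.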
0. I have figured out that I can set a timeout for the occasional case where a blocking read or write is accidentally entered into because the conditions of select() change before I actually read or write, e.g., a third program fills the pipe before my program services the select() return.
I have yet to add that....
BUT: These are the problem conditions I am not sure what to do about.
---------------------------------------------------------------------------------------------------------------------
1. When select is issued on a file descriptor which subsequently *closes* while select is blocking: if that descriptor is an input descriptor, select will immediately return and the subsequent read() will return 0, so my program will close it. (YAY). But what if the descriptor is an OUTPUT unnamed pipe? E.g., the pipe may have been FULL at the moment the other end was closed, so it would not be writable before the close, and select ought not return.
But what actually happens after the close? Will Linux return from a select() in that case, and if so, is this portable to Mac OS X and perhaps Sun, etc.? (What is POSIXLY correct?)
2. If an unnamed pipe never has information to be written; SIGPIPE will never be generated even if the other end is closed.
Is there any portable UNIX way to detect that an unnamed pipe, which was never written to, has had the other end closed? { Obviously writing to it might send bad info if it isn't closed ... and everyone SAYS do not do a write( fd, buffer, 0 ) .... }
(I would love to use stream pipes, but as BASH and other shells set the pipe up, I have no control over the pipe type. What kind of pipe do BASH and other shells choose?)
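For questions 1 and 2, one thing that can be probed is what poll() reports on the write end of a full pipe once the read end is gone. The Linux poll(2) manual page documents that POLLERR is set for the write end of a pipe when the read end has been closed; whether other systems behave the same way is not assumed here and would need testing on each. A pipe only counts as having no readers once every descriptor for its read end is closed, including any copy held by the writing process itself. A minimal sketch, with the hypothetical function name `revents_after_reader_close`:

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical probe: fill a pipe, close its read end, then poll() the
 * write end. Returns the revents bits poll() reported, 0 if poll() timed
 * out, or -1 on a setup error. */
int revents_after_reader_close(int timeout_ms)
{
    int fds[2];
    int flags, rc;
    char byte = '-';
    struct pollfd pfd;

    if (pipe(fds) < 0)
        return -1;

    /* Fill the pipe in nonblocking mode so the loop stops when it is full. */
    flags = fcntl(fds[1], F_GETFL, 0);
    if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    while (write(fds[1], &byte, 1) == 1)
        ;
    if (errno != EAGAIN && errno != EWOULDBLOCK) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Close the only descriptor that refers to the read end. */
    close(fds[0]);

    pfd.fd = fds[1];
    pfd.events = POLLOUT;   /* POLLERR is reported even if not requested */
    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);
    close(fds[1]);

    if (rc < 0)
        return -1;
    return rc == 0 ? 0 : pfd.revents;
}
```

A caller can check the result with `(r > 0 && (r & POLLERR))`. A return of 0 means poll() timed out on that system, which is the deadlock case the questions above ask about.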
I'll post the working prototype code (next) that I have -- which works well --- with the exception of the 3 above points...
--Thanks.
- 05-29-2009 #2 Just Joined!
- Join Date
- May 2009
- Location
- Oregon
- Posts
- 51
source code. andrew d.o.t pub at sophistasis.com
Code:
// This is a simple tcp socket interface, nothing else
// it requires two parameters: url, port
// A third parameter will put it in listen mode.
// Written by Andrew F. Robinson. (sophistasis.com)
//
// Anyone may use this code free of royalty and restrictions except:
// Just (please) leave a note about the original
// author in your comments so my resume won't be totally empty... ;)
// (c) May 2009.

#include <stdio.h>
#include <string.h>
#include <errno.h>      // EAGAIN
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <netdb.h>      // gethostbyname
#include <netinet/in.h> // IPPROTO_TCP
#include <unistd.h>     // open, close, select, etc.
#include <fcntl.h>      // O_NONBLOCK, fcntl()
#include <sys/time.h>   // struct timeval for select.
#include <signal.h>     // Signal handler

// This is a breakpoint macro for auto extraction...
#define BREAKPOINT(Z) { int dummy=0; dummy+=1; }

void usage( char* arg ) {
  fprintf( stderr,
    "Usage:\n"
    "%s url port # for a remote connection\n"
    "%s -l ip port # To listen locally.\n"
    "%s -l port # to listen on all ip's locally\n",
    arg, arg, arg ); // one argument per %s
  return;
}

#define MAX_SPACE 4096

static int sockFd, sockFdParent;

static void closePort( int signo ) {
  switch( signo ) {
    case -1: break;
    case SIGINT:  fprintf(stderr,"SIGINT! shut down tcp.\n");  break;
    case SIGHUP:  fprintf(stderr,"SIGHUP! shut down tcp.\n");  break;
    case SIGTERM: fprintf(stderr,"SIGTERM! shut down tcp.\n"); break;
    case SIGQUIT: fprintf(stderr,"SIGQUIT! shut down tcp.\n"); break;
    case SIGPIPE: break;
    default:
      fprintf(stderr,"Unexpected sig%d. ignored.\n", signo );
      return;
  }
  shutdown( sockFd, SHUT_RDWR );
  close( sockFd );
  if (sockFdParent != -1) {
    shutdown( sockFdParent, SHUT_RDWR );
    close( sockFdParent );
  }
  close( STDIN_FILENO );
  close( STDOUT_FILENO );
  _exit( -1 ); // error because a signal killed it. close all Fd's.
}

int main( int narg, char **arg ) {
  struct sockaddr_in socketInfo;
  int port;
  int readSock=1, writeSock=1;
  int readPipe=1, writePipe=1;
  fd_set readset, writeset;
  struct timeval noTime = { 0, 0 };
  char inBuffer[MAX_SPACE];
  long inStart=0, inEnd=0, inUse=0;
  char outBuffer[MAX_SPACE];
  long outStart=0, outEnd=0, outUse=0;
  int status;

  sockFdParent = -1;
  BREAKPOINT();
  if ( signal( SIGHUP,  closePort ) == SIG_ERR ||
       signal( SIGTERM, closePort ) == SIG_ERR ||
       signal( SIGQUIT, closePort ) == SIG_ERR ||
       signal( SIGINT,  closePort ) == SIG_ERR ||
       signal( SIGPIPE, closePort ) == SIG_ERR ) {
    fprintf(stderr,"Unable to install signal handler.\n");
  }

  sockFd = socket( AF_INET, SOCK_STREAM, IPPROTO_TCP );
  if (sockFd == -1) {
    perror("Resource failure:");
    close(STDIN_FILENO);
    close(STDOUT_FILENO);
    return -1;
  }
  memset( &socketInfo, 0, sizeof(socketInfo) );
  socketInfo.sin_family = AF_INET;

  if ( narg > 4 || narg < 3 ) { usage(arg[0]); return -1; }

  if ( !strcmp( arg[1], "-l") ) { // Setup for listening.
    struct hostent *hostEntry;
    int res;
    switch (narg) {
      case 3: // Any address, specified port
        socketInfo.sin_addr.s_addr = htonl(INADDR_ANY);
        res = sscanf( arg[2], " %d", &port );
        if (res == 1) socketInfo.sin_port = htons( port );
        break;
      case 4: // address is specified.
        hostEntry = gethostbyname( arg[2] );
        if (!hostEntry) { // lookup failed; avoid a NULL dereference
          fprintf(stderr,"Unknown host: %s\n", arg[2]);
          return -1;
        }
        memcpy( &socketInfo.sin_addr, hostEntry->h_addr_list[0],
                hostEntry->h_length );
        res = sscanf( arg[3], " %d", &port );
        if (res == 1) socketInfo.sin_port = htons( port );
        break;
      default:
        usage(arg[0]);
        return -1;
    }
    if ( res != 1 ) { // sscanf may also return EOF, not just 0
      fprintf(stderr,"Port number was not identifiable.\n");
      return -1;
    }
    if ( bind( sockFd, (struct sockaddr*)&socketInfo,
               sizeof(socketInfo) ) < 0 ) {
      perror("Unable to bind:");
      closePort( -1 );
    }
    if (listen( sockFd, 1 )<0) {
      perror("Unable to listen:");
      closePort( -1 );
    }
    sockFdParent = sockFd;
    if ((sockFd = accept( sockFd, NULL, NULL )) < 0) {
      perror("Connect error");
      sockFd = sockFdParent;
      sockFdParent=-1;
      closePort( -1 );
    }
  } else { // setup for talking.
    struct hostent *hostEntry;
    if (narg != 3) { usage(arg[0]); return -1; }
    hostEntry = gethostbyname( arg[1] );
    if (!hostEntry) { // lookup failed; avoid a NULL dereference
      fprintf(stderr,"Unknown host: %s\n", arg[1]);
      return -1;
    }
    memcpy( &socketInfo.sin_addr, hostEntry->h_addr_list[0],
            hostEntry->h_length );
    if ( 1 != sscanf( arg[2], " %d", &port ) ) {
      fprintf(stderr,"Port number was not identifiable.\n");
      return -1;
    }
    socketInfo.sin_port = htons( port );
    if ( connect( sockFd, (struct sockaddr*)&socketInfo,
                  sizeof(socketInfo) ) < 0 ) {
      perror("Connect failure:");
      closePort( -1 );
    }
  }

  // Connection is correct, now operate it...
  while( writeSock || writePipe ) {
    int status;
    int blockSize;

    // Set-up for appropriate read and writes.
    // calculate the space available.
    FD_ZERO( &readset );
    FD_ZERO( &writeset );
    if (writeSock) {
      if (outUse != MAX_SPACE && readPipe) { FD_SET( STDIN_FILENO, &readset ); }
      if (outUse) { FD_SET( sockFd, &writeset ); }
    }
    if (writePipe) {
      if (inUse != MAX_SPACE && readSock) { FD_SET( sockFd, &readset ); }
      if (inUse) { FD_SET( STDOUT_FILENO, &writeset ); }
    }

    // Now wait for something interesting to happen...
    // If either STDIN_FILENO, or socket in close -- this returns
    // those descriptors as readable.
    //
    // FIXME:
    // I am not sure about checking the output on the FIFO.
    // Clearly I could use a select timeout and check if the output
    // stream of the socket is closed occasionally / but if select()
    // auto returns on that being closed, that would be redundant.
    // KISS. (KEEP IT SIMPLE) -- what is the best solution here?
    { // temporary variables & loop.
      fd_set tmpRead = readset, tmpWrite = writeset;
      // Parenthesized so that status holds select()'s return value;
      // the error code itself is reported in errno.
      while( (status = select( sockFd+1, &readset, &writeset,
                               NULL, NULL )) < 0 ) {
        if (errno == EBADF) {
          fprintf(stderr,"Internal error");
          closePort( -1 );
        }
        readset=tmpRead;
        writeset=tmpWrite;
      }
    }

    // The writes and reads to the pipe has the potential
    // of blocking if select() is not serviced before a third
    // program fills/empties the pipe in parallel with this one.
    // FIXME: A timeout needs to be added.

    // do the writes first.
    // WritePipe is the very first.
    if ( FD_ISSET( STDOUT_FILENO, &writeset ) ) {
      if (inStart+inUse >= MAX_SPACE) {
        blockSize = write( STDOUT_FILENO, inBuffer+inStart, MAX_SPACE-inStart );
      } else {
        blockSize = write( STDOUT_FILENO, inBuffer+inStart, inUse );
      }
      if (blockSize <= 0) {
        if (!blockSize || errno != EAGAIN) {
          writePipe=0; close( STDOUT_FILENO );
          readSock=0; shutdown( sockFd, SHUT_RD );
          fprintf(stderr,"TCP read close\n");
          if (!writeSock) close( sockFd );
        }
      } else { // some was written...
        inUse -= blockSize;
        inStart += blockSize;
        if (inStart>=MAX_SPACE) inStart=0;
        if (!inUse) {
          inStart=0;
          if (!readSock) { writePipe=0; close( STDOUT_FILENO ); }
        }
      }
    } // writePipe <-- readSock

    // WriteSock is next
    if ( FD_ISSET( sockFd, &writeset ) ) {
      if (outStart+outUse > MAX_SPACE) { // was inStart: wrong buffer index
        blockSize = write( sockFd, outBuffer+outStart, MAX_SPACE-outStart );
      } else {
        blockSize = write( sockFd, outBuffer+outStart, outUse );
      }
      if (blockSize<=0) {
        if (!blockSize || errno != EAGAIN) {
          writeSock=0; shutdown( sockFd, SHUT_WR );
          fprintf(stderr,"TCP write close\n");
          if (!readSock) close( sockFd );
          readPipe=0; close( STDIN_FILENO );
        }
      } else { // some was written...
        outUse -= blockSize;
        outStart += blockSize;
        if (outStart>=MAX_SPACE) outStart=0;
        if (!outUse) {
          outStart=0;
          if (!readPipe) {
            writeSock=0; shutdown( sockFd, SHUT_WR );
            fprintf(stderr, "TCP write close\n" );
          }
        }
      }
    } // writeSock <-- readPipe

    // Now do the reads.
    // readPipe, (writeSocket), output
    if ( FD_ISSET( STDIN_FILENO, &readset ) ) {
      outEnd = outStart + outUse; // was inUse: wrong buffer count
      if (outEnd < MAX_SPACE) {
        blockSize = read( STDIN_FILENO, outBuffer+outEnd, MAX_SPACE-outEnd );
      } else { // rolled !
        outEnd -= MAX_SPACE;
        // was write(): this branch must read into the wrapped region
        blockSize = read( STDIN_FILENO, outBuffer+outEnd, outStart-outEnd );
      }
      if (blockSize<=0) {
        if ( !blockSize || errno != EAGAIN) {
          readPipe=0; close( STDIN_FILENO );
          if (!outUse) {
            writeSock=0; shutdown( sockFd, SHUT_WR );
            fprintf( stderr, "TCP write close\n" );
            if (!readSock) close(sockFd);
          }
        }
      } else { // some was read...
        outUse += blockSize;
      }
    } // readPipe -> socket

    // readSock (writePipe) , input
    if ( FD_ISSET( sockFd, &readset ) ) {
      inEnd = inStart + inUse;
      if (inEnd < MAX_SPACE) { // was outEnd: wrong buffer index
        blockSize = read( sockFd, inBuffer+inEnd, MAX_SPACE-inEnd );
      } else { // rolled !
        inEnd -= MAX_SPACE;
        // was write(): this branch must read into the wrapped region
        blockSize = read( sockFd, inBuffer+inEnd, inStart-inEnd );
      }
      if (blockSize<=0) {
        if ( !blockSize || errno != EAGAIN) {
          readSock=0; shutdown( sockFd, SHUT_RD );
          fprintf(stderr, "TCP read close\n" );
          if ( !writeSock ) close( sockFd );
          if ( !inUse ) { close( STDOUT_FILENO ); writePipe=0; }
        }
      } else { // some was read...
        inUse += blockSize;
      }
    } // readSock -> STDOUT
  } // While something is still writable w/ data...

  if (sockFdParent != -1) close( sockFdParent );
  if (inUse) {
    fprintf( stderr, "TCP in aborted %ld characters.\n", inUse);
  }
  if (outUse) {
    fprintf( stderr, "TCP out aborted %ld characters.\n", outUse);
  }
  return 0; // success...
}
- 05-29-2009 #3 Just Joined!
- Join Date
- May 2009
- Location
- Oregon
- Posts
- 51
Since there are no answers, I wrote a test program.
It seems like Linux fails even the POSIX requirement for select().
Does anyone see a bug in my test program?
does it fail on your system too?
Code:
// testSelect.c
// Written By Andrew F. Robinson (sophistasis.com)
// (c) May 2009
// This sourcecode is free for use in your project, as long
// as a reference to my name and domain as original author is included.
// and any changes are noted NOT to be my work...
//
// This program is designed to fill an unnamed-fifo until full;
// It then closes it and checks whether or not select returns.
// This tests for possible deadlock conditions that I would like
// to rule out...although not?? ruled out by POSIX.
//
// Under typical scripting situations, an unnamed pipe can easily
// have more than one writer, but seldom more than one reader.
// in practice, O_NONBLOCK, is best not used as I do in the test
// for it can disrupt the operation of other writers to the same
// pipe.... weird...
//
// If there were a way to make multiple writers see different values
// of O_NONBLOCK, that would be useful ... but I don't know how to do it.
// do you?

#include <unistd.h>    // open(), close(), O_NONBLOCK, etc.
#include <fcntl.h>     // fcntl()
#include <sys/ioctl.h> // ioctl()
#include <stropts.h>   // ioctl values for testing for streampipe.
#include <stdio.h>     // fprintf, perror, etc.
#include <errno.h>     // error defs, and variable errno.
#if 1
# include <sys/time.h>   // struct timeval for select()
# include <sys/types.h>  // ?? reqd. for old select standards.
#else
# include <sys/select.h> // alternately; posix.1 standard, 2001.
#endif

int main( void ) {
  int status;
  int pipeFd[2];
  int readPipe, writePipe;
  int readTrigger, writeTrigger;

  if ( pipe( pipeFd ) < 0 ) {
    fprintf(stderr,"Test pipe IPC did not open.\n");
    return -1;
  }
  readPipe=pipeFd[0];
  writePipe=pipeFd[1];

  if ( pipe( pipeFd ) < 0 ) {
    fprintf(stderr,"second test pipe IPC did not open.\n");
    return -1;
  }
  readTrigger=pipeFd[0];
  writeTrigger=pipeFd[1];

  {
    // forking a child, just as BASH shell would,
    // can combine more than one writer on a pipe. For example,
    // if the pipe happens to replace stdout for BASH and its children...
    // unfortunately, changing the nonblocking status in the child
    // also changes it in the parent, typically.
    pid_t hasChild = fork(); // Create a child to test pipe with.
    if ( hasChild < 0 ) {
      perror("Fork failed; exiting early ::");
      return -1;
    }
    if ( ! hasChild ) { // therefore is a child of fork()
      char blockingDummy;
      fprintf(stderr,"fork() ed...\n");
      if (read( readTrigger, &blockingDummy, 1 ) <= 0) {
        fprintf(stderr,"exiting early\n");
        return -1; // internal error.
      }; // wait for test #1
      status = (fcntl( writePipe, F_GETFL, 0) & O_NONBLOCK);
      fprintf( stderr, "Child status of inherited pipe is:%s",
        (status)?"typical O_NONBLOCK passed on.\n"
                :"very good! O_NONBLOCK stopped.\n" );
      close( writePipe ); // finished write combined test.
      if (!read( readTrigger, &blockingDummy, 1 )) {
        return -1; // internal error.
      }; // wait for test #2
      fprintf( stderr, "Child exits to allow test #2\n" );
      // use the automatic close feature of the OS.
      return 0; // Finished all tests in child half.
    }
  } // end of forked section.

  // OKAY, This is the parent portion of the test.
  // test #1; change to O_NONBLOCK, and see what happens.
  status = fcntl( writePipe, F_GETFL, 0);
  if ( status < 0 ) {
    perror( "could not change to nonblocking mode::");
    return -1;
  }
  fcntl( writePipe, F_SETFL, status | O_NONBLOCK );
  fprintf(stderr,"test O_NONBLOCK\n");
  write( writeTrigger, "X", 1 ); // trigger test check #1
  // The child process prints whether or not the O_NONBLOCK
  // propagated...
  sleep( 5 );
  status = ( fcntl( writePipe, F_GETFL, 0) & O_NONBLOCK);
  fprintf( stderr, "Parent status of inherited pipe is:%s",
    (status)?"good since O_NONBLOCK is still set.\n"
            :"bad since O_NONBLOCK was reset!\n" );

  // Now, test what kind of pipe the system naturally uses.
  // stream pipes are desirable, because they can be
  // tested for close (I think, if I remember right...)
  // but my impression is that most Unices don't use them
  // by default for the pipe() system call
  // eg: the source code of BASH, et al....
  // So, let's find out what this system actually does.
  fprintf( stderr, "Test for streampipe:" );
  if ( ioctl( writePipe, I_CANPUT, 0 ) > -1 ) {
    fprintf( stderr, "pass! extra good.\n");
  } else {
    fprintf( stderr, "fail! eg: normal & ugly.\n");
  }

  // Set up for test#2 which would be in blocking mode.
  status = fcntl( writePipe, F_GETFL, 0 );
  if ( status < 0 ) {
    perror( "Could not change back to blocking mode::");
    return -1;
  }
  fcntl( writePipe, F_SETFL, status & ~O_NONBLOCK );

  // Now, do the final test. Fill the FIFO.
  fprintf( stderr, "Filling the fifo for select() test\n" );
  {
    int i;
    for( i=0; i<1000; ++i) // wasteful, but pretty good guarantee.
      do {
        status = write( writePipe, "-", 1 );
      } while( status && status != 1 );
  }
  if (!status) {
    fprintf(stderr,"Test died early! pipe not filled!\n");
    return -1;
  }

  // At this point, the FIFO is filled and the child is live.
  // test posix operation, and questionable operation...
  fprintf( stderr, "FIFO is full; testing basic select()....\n" );
  {
    fd_set writeset, tempset;
    struct timeval oneSec = { 1, 0 };
    FD_ZERO( &writeset );
    FD_SET( writePipe, &writeset );
    tempset = writeset;
    do {
      writeset = tempset;
      status = select( writePipe+2, NULL, &writeset, NULL, &oneSec );
    } while (status<0 && errno == EINTR );
    if (status != 0 ) {
      fprintf(stderr,"fail!!!\nno test possible.\n");
      if (status <0 ) perror("cause::");
      return -1;
    }
    fprintf(stderr,"ready\n");
    fprintf(stderr,"Doing a deadlock test!\n");
    write( writeTrigger, "X", 1 ); // cause child to exit.
    // Note: this process still holds readPipe open at this point.
    do {
      writeset = tempset;
      status = select( writePipe+2, NULL, &writeset, NULL, NULL );
    } while (status<0 && errno == EINTR );
    fprintf(stderr,
      "Even though not POSIX guaranteed\n"
      "This system did not deadlock... YAY!\n" );
  } // end of select test block.
  return 0;
} // main.
- 06-23-2009 #4 Just Joined!
- Join Date
- May 2009
- Location
- Oregon
- Posts
- 51
Well I rechecked my test code, found a bug -- eg: put it in nonblocking after filling the fifo with a
do {
...
} while( status==1);
I was asleep on that one....
I am surprised no one is at all interested in this. Just about all programs, it turns out, including threaded ones with Pthreads, rely on poll/select internally. The threaded technique I see occasionally does nothing to avoid this problem, although people seem to think it will.
I just tested Linux and OS X, and they *all* deadlock if the unnamed pipe is full. The program will never return from poll/select even though the file was closed by the child.
Some systems don't even have the STREAMS operation I_CANPUT (e.g., OS X), so the test program needs to have that test deleted on those systems. Therefore, that isn't a good workaround either!
This *stinks*: a program with output to flush, whose downstream program has died, will deadlock forever if that program dies at a moment when the FIFO is full and poll is called on the FIFO after that.
I have a better version of the TCP program now, which is much better than netcat (nc), but I am curious how other people work around this deadlock issue for non-socket programs.
It is rather a nuisance, and I find it difficult to believe no one else has noticed it...
any thoughts?
- 06-24-2009 #5 Linux Guru
- Join Date
- Apr 2009
- Location
- I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
- Posts
- 8,974
You'll get more and better responses by limiting the scope of your query. This is just a bit much for people to digest quickly. Remember, we are donating our time. No one gets paid to help folks out here.
Sometimes, real fast is almost as good as real time.
Just remember, Semper Gumbi - always be flexible!
- 07-05-2009 #6 Just Joined!
- Join Date
- May 2009
- Location
- Oregon
- Posts
- 51
I suppose so; although I don't need a quick answer anymore, as I have tested the systems and know they all fail differently. I have also written a TCP program to replace netcat which has options on how the pipe dies -- so that the deadlock issue is avoided. It's a kludge, but I figure it is the best that can be done given the issues involved.
I expect I'll post it once I have it more or less well worn in and pretty. Then, the next guy with netcat problems will at least have a partial solution. These threads get searched.... for years....


