- 12-03-2008 #1 Just Joined!
- Join Date: Dec 2008
- Posts: 4
How to gracefully exit when Ctrl-C and pthreads collide?
Hi,
I have a multithreaded application that allocates, opens, creates system resources like ports, semaphores, threads, memory etc. In the normal course of execution, these resources are deallocated, closed, deleted etc. before the application exits and there are no problems.
However, some users abort the application (it's run from the command line) using Ctrl-C. I have a signal handler to catch SIGINT that is supposed to do the cleanup which would be done during a normal exit. However, it doesn't appear that the code in the signal handler gets executed. Since signals are between processes, and I have only one process (with multiple pthreads) running, why is this signal handler not doing the needful? Specifically, I am running into issues where open ports are not getting closed and other resources not deleted so that the next time the application is run, I get errors in the early setup and initialization of the said system resources or related code. And the only solution is to reboot.
The code is C and C++ and adding the cleanup code in a class destructor doesn't help when the application is aborted by impatient Ctrl-C'ers.
Desperately seeking pointers.
Thank you.
Ebc
- 12-04-2008 #2
Using signals with POSIX threads can be really tricky. The concepts are simple, but the details can get tangled.
Don't do any more work in the signal handler than you have to. Keep in mind that you don't know which thread is going to receive the SIGINT.
The safest thing to do in your SIGINT handler is to set a global variable and then return. Let each thread be mindful of this variable and clean up after itself fairly quickly when necessary.
In theory, this should all be easy. If it doesn't work well for you, you may find yourself ultimately reconsidering exactly how your threads work together and communicate with each other. This will not be trivial. You may find yourself considering using pthread_cancel() within your signal handler. pthread_kill() may also be useful, but remember that using pthread_kill() to send, for example, the SIGKILL signal will kill all threads, not just the intended one.
Before investing much time on this, I'd strongly advise that you run (don't walk) to your nearest bookseller and get David R. Butenhof's fine book Programming with POSIX Threads, published by Addison Wesley. This guy knows all the gotchas associated with POSIX threads. Accept no substitutes, including anything from O'Reilly. The O'Reilly book I have on this subject is accurate in all respects, as far as I can tell, but it won't warn you of many of the gotchas. Therefore it will give you a false sense of security. Therefore it is dangerous.
Hope this helps.
--
Bill
Old age and treachery will overcome youth and skill.
- 12-04-2008 #3
Thanks, I will get Butenhof's book.
All that my signal handler (for SIGINT) does is post a semaphore which then enables the cleanup operation that is pending on that semaphore. It appears that the problem arises when users do multiple Ctrl-Cs in quick succession and the cleanup operation (killing threads, deleting semaphores etc. that would be done on exit) gets aborted.
What I don't understand is why it matters which thread is executing when SIGINT comes. The thread should stop executing and control should go to the signal handler which would then indirectly terminate all the threads soon anyway.
Ebc
- 12-04-2008 #4
Quote: the problem arises when users do multiple Ctrl-Cs in quick succession
Oh. Well. Then.
The first thing your signal handler should do is to disable SIGINT completely.
Quote: The thread should stop executing and control should go to the signal handler
So far, so good.
Quote: which would then indirectly terminate all the threads soon anyway
It depends on what you mean by "indirectly".
If absolutely no functions whose names begin with pthread_ are newly called at or after the interrupt handler starts execution, you should be fine, and it won't matter which thread catches the SIGINT. But if you do call such functions in your shutdown code, you could be in trouble, because the POSIX threads library code depends on certain things it does being atomic (uninterruptible by any other thread).
My money is on ignoring any additional SIGINT after the first. This is a problem regardless of whether you're using POSIX threads, which seem entirely irrelevant to the situation.
I'd still get that book, though. :)
--
Bill
- 12-04-2008 #5
Does not seem to make a difference. I think something else is going on: the signal handler never gets executed. I once had a printf there (followed by some useless math to give it time) and didn't see the message.
Quote: It depends on what you mean by "indirectly".
The cleanup is not done _in_ the signal handler but enabled by an action taken in the signal handler. Without thread priorities in pthreads, I don't know how I can ensure that the thread that actually does the cleanup gets scheduled next since it received the semaphore. So maybe, even though the semaphore is posted by the signal handler, the kernel thinks something else is more important.
Quote: If absolutely no functions whose names begin with pthread_ are newly called at or after the interrupt handler starts execution, you should be fine, and it won't matter which thread catches the SIGINT. But if you do call such functions in your shutdown code, you could be in trouble, because the POSIX threads library code depends on certain things it does being atomic (uninterruptible by any other thread).
Hmmm....
I do pthread_cancel() and pthread_join() as part of the cleanup process.
Quote: My money is on ignoring any additional SIGINT after the first. This is a problem regardless of whether you're using POSIX threads, which seem entirely irrelevant to the situation.
Yes, but I am not sure the handler gets executed. It used to show me the printfs when I had a single thread initially.
Quote: I'd still get that book, though.
Yes, yes, yes.
www.****.com ......... put in cart ..... proceed to checkout.....
Ebc
- 12-04-2008 #6
Ok, now I'm a little confused. But confusion might not be a bad thing at this point.
Quote: It appears that the problem arises when users do multiple Ctrl-Cs in quick succession
So if they only do one ^C, you see the printf() in the signal handler, and if they do more than one, you don't see it? Confusion finds itself upgraded to bewilderment.
Quote: The cleanup is not done _in_ the signal handler but enabled by an action taken in the signal handler.
Understood. But is this by manipulation of a semaphore, as through semop() or sem_post(), or is it by using pthread_mutex_unlock()?
If it's the latter, or even if it's the former and this results in pthread_something() being called, then you might be doing something with pthreads while the pthreads library thinks it's right in the middle of an atomic pthreads operation in the thread which was interrupted. I'm grasping at straws here, but I'm just trying to figure out why this isn't working.
It's tricky enough using pthread_cancel() to wind things down. But to ask a thread to dive in and take unilateral action to clean things up sounds really, really iffy.
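For reference, a small sketch of the mechanism POSIX provides for winding a thread down with pthread_cancel(): cleanup handlers registered with pthread_cleanup_push(). It assumes the default deferred cancellation type; `release_resource`, `worker`, `resource_released` and `cancel_and_join_demo` are hypothetical names, and the flag stands in for whatever real resource a thread holds.

```c
#include <pthread.h>
#include <unistd.h>

/* Stand-in for a real resource; set by the cleanup handler. */
static int resource_released = 0;

/* Runs in the cancelled thread while it is being torn down. */
static void release_resource(void *arg)
{
    int *flag = arg;
    *flag = 1;   /* a real program would close(), sem_destroy(), free() ... */
}

static void *worker(void *arg)
{
    pthread_cleanup_push(release_resource, arg);
    for (;;)
        sleep(1);            /* sleep() is a cancellation point */
    pthread_cleanup_pop(0);  /* never reached, but push and pop must pair up */
    return NULL;
}

/* Returns 0 if the worker was cancelled and its cleanup handler ran. */
static int cancel_and_join_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, worker, &resource_released) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (result == PTHREAD_CANCELED && resource_released) ? 0 : -1;
}
```

The pthread_cancel() and pthread_join() calls here are made from ordinary thread context. Neither is async-signal-safe, so this is not something to call from inside the SIGINT handler itself.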
Quote: Without thread priorities in pthreads, I don't know how I can ensure that the thread that actually does the cleanup gets scheduled next since it received the semaphore.
So, you're using priorities? You're not using priorities? Clue me in. :) If you're not using priorities, you can't ever be sure about which thread will execute first, as you already see. But if you use priorities, there are counterintuitive gotchas to worry about. I've had almost no experience with priorities, but the Butenhof book discusses this in nauseating detail, which is exactly what you want here.
Actually, the more I think about it, the more I think that multiple ^C's can't possibly be your problem, because you have a thread ready to pounce when it gets released from a semaphore? mutex? And once it does that, it presumably doesn't go back to block again, waiting to run. So that second ^C can't be the problem.
Oh. One more thing, and it sounds trivial. You mentioned
Quote: a printf there (followed by some useless math to give it time)
To give it time to do what? Flush the output buffer from the printf()?
If the final byte of the data you printf() is a line feed ("\n"), then it will print immediately when stdout is a terminal (assuming no priority problems). If it's not, you're going to have some data that doesn't get output until you either eventually get around to that "\n", or do an fflush(stdout).
This would not be necessary with fprintf(stderr, ...), since stderr is unbuffered by default.
--
Bill
- 12-04-2008 #7
Quote: So if they only do one ^C, you see the printf() in the signal handler, and if they do more than one, you don't see it? Confusion finds itself upgraded to bewilderment.
Mea culpa.
I was unnecessarily speculating and basing my bad conclusions on human behavior I disapprove of rather than thinking properly.
You never see a printf output. If you do just one ^C, nothing happens. Then if you let it be and let the app run to completion, and then invoke it again later, there are no issues. But if you start impatiently doing multiple ^Cs, the app does exit. Then the app won't run the next time as it is programmed to exit if some of the initialization calls don't succeed (which happens following ^C-based exits).
Quote: Understood. But is this by manipulation of a semaphore, as through semop() or sem_post(), or is it by using pthread_mutex_unlock()?
Quote: If it's the latter, or even if it's the former and this results in pthread_something() being called, then you might be doing something with pthreads while the pthreads library thinks it's right in the middle of an atomic pthreads operation in the thread which was interrupted. I'm grasping at straws here, but I'm just trying to figure out why this isn't working.
The SIGINT handler does a sem_post() on a semaphore on which the cleanup thread is sem_wait()ing.
Quote: It's tricky enough using pthread_cancel() to wind things down. But to ask a thread to dive in and take unilateral action to clean things up sounds really, really iffy.
Why? [I will read the book, and this may be an inelegant solution, but...] Once a ^C is done, the user/I don't care what happens to the operation of the threads at that point, i.e., if they do bad calculations or whatever, there is no interest in their outcome. So why not end deterministically, rather than leave something in an unknown state?
Quote: So, you're using priorities? You're not using priorities? Clue me in.
Quote: If you're not using priorities, you can't ever be sure about which thread will execute first, as you already see. But if you use priorities, there are counterintuitive gotchas to worry about. I've had almost no experience with priorities, but the Butenhof book discusses this in nauseating detail, which is exactly what you want here.
Not using priorities. Are there priorities in pthreads? I didn't see any in the pthread_** calls I looked at? The other OSs that support priorities that I've used before (VxWorks) assign the priority at the time of thread creation, if I recall right. Currently, I control my thread synchronization via semaphores and don't leave anything to the kernel's whims. But when I am killing the threads, it's not normal operation anymore.
Quote: Actually, the more I think about it, the more I think that multiple ^C's can't possibly be your problem, because you have a thread ready to pounce when it gets released from a semaphore? mutex? And once it does that, it presumably doesn't go back to block again, waiting to run. So that second ^C can't be the problem.
I agree. See para 1 above.
I think the issue is why is the signal handler not getting executed (even once) with multiple threads running?
Quote: To give it time to do what? Flush the output buffer from the printf()?
Yes. That was then.....
Quote: This would not be necessary with an fprintf(stderr,... .
Yes, that's what I've been doing for a while now when I want to see the output stat.
Thanks.
Ebc
- 12-05-2008 #8
- When I talked about winding down the threads in an orderly fashion, I didn't say that for the sake of letting the user have tidy output. I said that for the sake of seeing your application exit in a way that you, the programmer, would like it to exit. (Otherwise, don't even bother with ^C handling.)
- This is the first pthreads application I've heard of that used normal POSIX semaphores for any sort of synchronization, instead of using mutexes. You may very well be in uncharted, dangerous territory here.
- Once again, you want to do minimal work in a signal handler if you're using pthreads. Setting globals, yes. Synchronization operations, no. That's my hunch, anyway.
Perhaps, if this problem drags out too long, you might be interested in shifting into a lower gear and spending more time exploring. In that case, try something like this:
- Write a non-pthreads program which simply loops around a sleep() call. Put in a SIGINT handler. See whether the program gets there if you type ^C.
- Add pthreads. Let both the main thread and a subsidiary thread have sleep() loops. Do the ^C dance.
- Change the ^C handler so it releases a mutex (using a semaphore here makes my skin crawl, but be my guest :) ) to release another thread to say "Hey, I'm running now!\n".
- Continue to make the program more and more similar to your application, until something breaks.
This doesn't sound very helpful, I know, but I sense that somewhere down the road you'll be doing this anyway. :(
I'll probably be indisposed for a few days, but good luck!
--
Bill
