  1. #1
    Just Joined!
    Join Date
    Apr 2011
    Posts
    9

    How to destroy pthread mutexes?


    All,

    Say I have a pthread which is blocked on a mutex which is never unlocked, and I want to destroy the mutex. Since the behavior is undefined if pthread_mutex_destroy is called on a locked mutex, how can I ensure the mutex is ultimately destroyed? Trying it on Linux gives an EBUSY error.

    I'm thinking that cancelling the thread and unlocking the mutex in the cancellation handler may not be a suitable option, since pthread_mutex_lock is not async-cancel-safe and I don't know the behavior afterwards. I'd like to return the system to a known good state.

    Please let me know if you have any ideas. I'm ultimately trying to put a system in place where at any point I can put the system back in an initial state without leaving the process and restarting. I hope (HOPE!) this is possible. A little success with the mutex API would give me confidence that POSIX allows programs to cleanup resources at any point.

    Thanks,
    --Shaun
    Last edited by shaunp; 08-18-2012 at 03:05 AM.

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,572
    Find the cause of the deadlock. This is one of the principal problem areas of multi-threaded applications. There are no simple solutions that allow the software to continue to execute.

    FWIW, as a consultant, I am paid $200 USD / hour (one day minimum) to help people find/fix this sort of problem, so as you can imagine, it is not a trivial one in cases of any but the simplest application.
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Just Joined!
    Join Date
    Aug 2012
    Posts
    6
    Best solution is to redesign; something is wrong if you get stuck in such a state.

    Anyway.

    If (C++), I would wrap a smart-pointer-like class around the mutex, and in the destructor I would make sure I unlock the mutex before I destroy it.

    If (C), I would let the thread go into a loop and use pthread_mutex_trylock(). If the try won't take the lock, I would usleep for a while.

    Then you can destroy the thread, because it is only spinning anyway.

  4. #4
    Just Joined!
    Join Date
    Apr 2011
    Posts
    9
    Quote Originally Posted by Rubberman View Post
    Find the cause of the deadlock. This is one of the principal problem areas of multi-threaded applications. There are no simple solutions that allow the software to continue to execute.
    Thanks for the reply. I absolutely agree that deadlocks must be investigated and fixed to prevent the deadlock from occurring again. For resource reclamation, I mainly used this as a quick example to illustrate a case where I don't see how to reclaim resources. I do think I found an approach to solve the problem, even if it is a bit involved and requires too much thinking. I'll include it in my next reply on this thread; I'd appreciate your critiques and thoughts, if you've got some interest. Thanks.

    BTW - I'm an embedded sw engineer (new to Linux, but studying as much as I can) and it's a great skill to find and resolve deadlocks. I hope you never have to fix anything I've broken

  5. #5
    Just Joined!
    Join Date
    Apr 2011
    Posts
    9
    So far, the only reliable approach I can see is to ensure that all threads are enabled for deferred cancel, and ensure their cancellation handlers will unlock the mutex if they own the lock. Then, the procedure to delete a mutex is to cancel all app threads that use the mutex and use pthread_join in a special 'cleanup' thread to wait for all cancels to complete, then destroy the mutex.

    This approach also requires replacing all pthread_mutex_lock calls with a pthread_mutex_trylock delay loop, where the loop calls either sleep, usleep, or pthread_testcancel. I can see a similar approach working to destroy condition variables, replacing pthread_cond_wait with a pthread_cond_timedwait delay loop that calls sleep/usleep/pthread_testcancel.
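
    Roughly, I mean something like this (my own untested sketch; the names lock_with_cancel, unlock_on_cancel and worker are just for illustration). The lock call becomes a trylock/testcancel loop, and a cleanup handler pushed once the lock is held guarantees the mutex is released if the thread is cancelled inside the critical section.

    Code:
    #include <pthread.h>
    #include <unistd.h>
    #include <errno.h>
    
    /* Cancellation cleanup handler: release the mutex this thread holds. */
    static void unlock_on_cancel(void *arg)
    {
        pthread_mutex_unlock((pthread_mutex_t *)arg);
    }
    
    /* Replacement for pthread_mutex_lock: poll with trylock so a deferred
     * cancel can be honoured between attempts instead of blocking forever. */
    static int lock_with_cancel(pthread_mutex_t *m)
    {
        int rc;
        while ((rc = pthread_mutex_trylock(m)) == EBUSY) {
            pthread_testcancel();   /* deferred cancellation point */
            usleep(1000);           /* back off briefly before retrying */
        }
        return rc;
    }
    
    static void *worker(void *arg)
    {
        pthread_mutex_t *m = arg;
    
        if (lock_with_cancel(m) == 0) {
            pthread_cleanup_push(unlock_on_cancel, m);
            /* ... use the shared resource; any cancellation point in here
             * will run unlock_on_cancel before the thread terminates ... */
            pthread_cleanup_pop(1);   /* 1 = also unlock on normal exit */
        }
        return NULL;
    }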

    FYI - I think it is necessary to use deferred, rather than asynchronous cancel, since most POSIX functions are not async-cancel-safe and may not behave correctly afterwards. This runs the risk of breaking the cleanup thread above and does not allow the application to reliably 'restart' itself and run predictably.

    Honestly, this all seems excessive to me, but POSIX does not seem to provide another way, due to limitations on destroy and thread cancellation. Maybe POSIX has, or could add, an easier way to reclaim resources to help make the world a better place. Please critique this and let me know what you think. And especially if it's broken in some way!! Thanks!
    Last edited by shaunp; 08-22-2012 at 11:22 PM.

  6. #6
    Just Joined!
    Join Date
    Apr 2011
    Posts
    9
    Quote Originally Posted by mariuschincisan View Post
    Best solution is to redesign; something is wrong if you get stuck in such a state.
    Thanks for the great advice. I think you're right that pthread_mutex_lock needs to be replaced with a pthread_mutex_trylock delay loop. Please check out my previous post and let me know if you see any bugs. Thanks!

  7. #7
    Just Joined!
    Join Date
    Apr 2011
    Posts
    9
    Just for reference, a related topic to destroying mutexes is that any shared resource protected by a mutex may be left in an inconsistent state if threads are cancelled while they are operating on the shared resource. So, to use the resource again (without restarting the process), it is necessary to have a recovery procedure in place for each shared resource that puts the resource back into a safe state so the program can use it again. The recovery procedure may vary from resource to resource.

    As an example, a global structure in the app left in an inconsistent state due to cancellation could be restored simply by using memset to set to zero, or otherwise reinitialized back to its initial values. Other types of shared resources can require much more complex recovery procedures. And of course, apps cannot practically guarantee that shared resources in external libraries can be put back into a safe state since the effect of cancelling a thread in the middle of a library call is usually not documented and time is not usually available to perform an audit. I think this is the point where most rational people would say 'then why bother?', which is unfortunate, but a good question nonetheless.
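
    For the simple case above, the recovery procedure might be no more than this (a sketch with a made-up structure; it assumes the cleanup thread has already cancelled and joined every thread using the resource and destroyed the old mutex):

    Code:
    #include <pthread.h>
    #include <string.h>
    
    /* Hypothetical shared resource and the mutex that protects it. */
    struct shared_state {
        int  count;
        char buf[256];
    };
    
    static struct shared_state g_state;
    static pthread_mutex_t     g_mutex;
    
    /* Per-resource recovery procedure, run by the cleanup thread after all
     * users of g_state have been cancelled/joined and the old mutex destroyed. */
    static void recover_shared_state(void)
    {
        memset(&g_state, 0, sizeof g_state);   /* back to the initial values */
        pthread_mutex_init(&g_mutex, NULL);    /* fresh mutex so the app can restart */
    }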

    I'd have to say that, using the linker, it is possible to intercept all calls a library may make to pthread_mutex_lock/unlock. These calls could be replaced with the 'delay loop' code from above, with pthread_setcancelstate calls added around them to disable cancellation around shared-resource handling. This gives a partial, but ultimately incomplete, solution. Apps could still guarantee that all app mutexes are cleaned up and app shared resources are restored to safe states, and also that the shared resources a library protects with mutexes are not left in inconsistent states due to thread deletion.

    Assuming the library does not have other shared resources that are not protected by mutexes, there is no need for the library to have a procedure to put its shared resources back in a safe state after unexpected cancellation, since they will always be in a safe state. However, this assumption may not hold (e.g. file accesses), and even if it does, a deadlock inside the external library could mean that some library resources are never recovered, since thread deletion would be disabled at those points by the pthread_setcancelstate calls.
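
    For concreteness, here is a rough sketch of the interception (my own illustration; it uses run-time symbol interposition via LD_PRELOAD and dlsym(RTLD_NEXT), which is a different mechanism than wrapping at static link time but the same idea, and it ignores nested locking and error handling):

    Code:
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <dlfcn.h>
    #include <errno.h>
    #include <unistd.h>
    
    /* Caller's cancel state, stashed per thread so the unlock wrapper can restore it. */
    static __thread int prev_cancel_state = PTHREAD_CANCEL_ENABLE;
    
    int pthread_mutex_lock(pthread_mutex_t *m)
    {
        static int (*real_trylock)(pthread_mutex_t *);
        int rc;
    
        if (!real_trylock)
            real_trylock = (int (*)(pthread_mutex_t *))dlsym(RTLD_NEXT, "pthread_mutex_trylock");
    
        /* Poll so a deferred cancel is honoured while we do NOT hold the lock. */
        while ((rc = real_trylock(m)) == EBUSY) {
            pthread_testcancel();
            usleep(1000);
        }
    
        /* Lock held: disable cancellation until the matching unlock. */
        if (rc == 0)
            pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &prev_cancel_state);
        return rc;
    }
    
    int pthread_mutex_unlock(pthread_mutex_t *m)
    {
        static int (*real_unlock)(pthread_mutex_t *);
        int rc, ignored;
    
        if (!real_unlock)
            real_unlock = (int (*)(pthread_mutex_t *))dlsym(RTLD_NEXT, "pthread_mutex_unlock");
    
        rc = real_unlock(m);
        pthread_setcancelstate(prev_cancel_state, &ignored);   /* restore cancellation */
        return rc;
    }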

    All in all, with the current POSIX API, a full solution appears to require library developers to also follow the same approach as apps above. This ensures an appropriate recovery procedure is in place for all shared resources and so pthread_setcancelstate is not required around pthread_mutex_lock/unlock. Also, use of pthread_mutex_lock would need to be replaced with the delay loop from above. This would allow all mutexes to be reclaimed at any point (app and library) and all shared resources to be placed back in safe states following cancellation (app and library), while allowing the program to recover and restart itself without exiting the process. But, I digress.

    I think most people will not want the complication of handling these details (myself included), but it is good to have some ideas for a solution with the current POSIX API. But really, I'd like to see the POSIX people do more work to ease resource reclamation to hide as much of this complexity from app developers as possible. IMO, allowing pthread_mutex_destroy (and pthread_cond_destroy) to succeed when threads are blocked, and changing pthread_mutex_lock (and related functions) to return an errno to blocked threads when destroyed would simplify things greatly. Also, refining how asynchronous cancellation affects the different functions (e.g. does async-cancel during pthread_mutex_lock break all future mutex calls [except destroy] on every mutex or only that one mutex?) would be useful to better identify when it can and can't be used.
    Last edited by shaunp; 08-22-2012 at 11:42 PM.

  8. #8
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,572
    Myself, I find that using a finite state machine to model and execute concurrent resource access is most useful. It makes you think rigorously about what states things can get into and what events are needed to make them transition to a new state (locked vs. unlocked, for example).
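
    A toy illustration of the idea (not production code, and the state bookkeeping would itself need protection in a real multi-threaded program): every operation on a guarded resource goes through one transition function, so an illegal sequence such as destroying while locked is rejected explicitly instead of becoming undefined behavior.

    Code:
    #include <pthread.h>
    
    enum lock_state { ST_UNLOCKED, ST_LOCKED, ST_DESTROYED };
    enum lock_event { EV_LOCK, EV_UNLOCK, EV_DESTROY };
    
    struct tracked_mutex {
        enum lock_state state;
        pthread_mutex_t mu;
    };
    
    /* Returns 0 on a legal transition, -1 if the event is illegal in this state. */
    static int handle_event(struct tracked_mutex *t, enum lock_event ev)
    {
        switch (t->state) {
        case ST_UNLOCKED:
            if (ev == EV_LOCK)    { pthread_mutex_lock(&t->mu);    t->state = ST_LOCKED;    return 0; }
            if (ev == EV_DESTROY) { pthread_mutex_destroy(&t->mu); t->state = ST_DESTROYED; return 0; }
            break;
        case ST_LOCKED:
            if (ev == EV_UNLOCK)  { pthread_mutex_unlock(&t->mu);  t->state = ST_UNLOCKED;  return 0; }
            break;      /* EV_DESTROY while locked is rejected, not undefined */
        case ST_DESTROYED:
            break;      /* nothing is legal after destroy */
        }
        return -1;
    }
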
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  9. #9
    Just Joined!
    Join Date
    Aug 2012
    Posts
    6
    When I work with threads I always carry around the same classes I wrapped around mutexes and threads, to have a kind of unified cross-platform Windows/Linux interface. That is, whenever I don't use boost (which lately looks like procedural programming with lots of ::).

    An excerpt of a queue between a listener thread which pushes and handler threads which pop may give you some ideas.

    Code:
    #include <pthread.h>
    #include <unistd.h>
    #include <cstddef>
    #include <deque>
    using std::deque;
    
    class Context;   // forward declaration: the element type queued between threads
    
    // NOTE: in a real source file the class definitions further below must
    // appear before these member-function definitions.
    int CtxQueue::push(Context* c)
    {
        _c.lock();
        _q.push_back(c);
        size_t n = _q.size();   // sample the size while still holding the lock
        _c.signal(); // let a thread go
        _c.unlock();
        if(n < _cap)
            return 1;   // in range; app-specific return codes
        if(n > _maxcap)
            return -1;  // too big; close incoming connections
        return 0;       // overflow but still accepting
    }
    
    //-----------------------------------------------------------------------------
    bool CtxQueue::pop(Context** ppc)
    {
        bool br = false;
        _c.lock();
    
        if(_sync){  // the queue is configured to block threads until it has items
            while(0 == _q.size()){
                _c.wait();
                usleep(0xF);
            }
        } // otherwise the thread just spins and checks the return code
    
        if(_q.size()) {
            *ppc = _q.front();
            br = true;
            _q.pop_front();
        }
        bool more = !_q.empty();   // sample under the lock; reading after unlock would race
        _c.unlock();
        if(more)
            _c.signal(); // keep unblocking threads that are waiting to consume
        return br;
    }
    
    
    class CtxQueue
    {
    public:
        CtxQueue();
        virtual ~CtxQueue();
        int push(Context*);
        bool pop(Context**);
        void signal() {
            _c.signal();
        }
        void broadcast() {
            _c.broadcast();
        }
    private:
        deque<Context*> _q;
        condition       _c;
        size_t          _cap;     // soft capacity threshold used by push()
        size_t          _maxcap;  // hard capacity threshold used by push()
        bool            _sync;    // block consumers in pop() until items are queued
    };
    
    
    class condition
    {
    public:
        condition()
        {
            pthread_cond_init(&_cond, NULL);
            pthread_mutex_init(&_mutex  ,NULL);
        }
        ~condition()
        {
            // wake any waiter and release the mutex before tearing them down
            pthread_cond_signal(&_cond);
            pthread_mutex_unlock(&_mutex);
    
            pthread_cond_destroy(&_cond);
            pthread_mutex_destroy(&_mutex);
        }
        void lock()
        {
            pthread_mutex_lock(&_mutex);
        }
        void signal()
        {
            pthread_cond_signal(&_cond);
        }
        void broadcast()
        {
            pthread_cond_broadcast(&_cond);
        }
    
        void wait()     // threads wait here on the condition
        {
            pthread_cond_wait(&_cond, &_mutex);
        }
        void unlock()
        {
            pthread_mutex_unlock(&_mutex);
        }
    private:
    
        pthread_cond_t _cond;
        pthread_mutex_t _mutex;
    };
    //-----------------------------------------------------------------------------
    class mutex
    {
        mutable pthread_mutex_t _mut;
    public:
        mutex()
        {
            pthread_mutexattr_t     attr;
    
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_settype(&attr,PTHREAD_MUTEX_RECURSIVE);
            pthread_mutex_init(&_mut,&attr);
            pthread_mutexattr_destroy(&attr);
        }
    
        virtual ~mutex()
        {
            pthread_mutex_unlock(&_mut);
            pthread_mutex_destroy(&_mut);
        }
    
        int mlock() const
        {
            return pthread_mutex_lock(&_mut);
        }
    
        int try_lock() const
        {
            return pthread_mutex_trylock(&_mut);
        }
    
        int munlock() const
        {
            return pthread_mutex_unlock(&_mut);
        }
    };
