Detecting Abandoned Critical Sections
Multithreading is a powerful way to improve the processing throughput and responsiveness of your software. We use it to great effect at Software Verify. To manage multithreading successfully you need some form of synchronization between the threads that read and write shared data, and that synchronization can lead to deadlocks. The main cause of deadlocks is two or more locks (critical sections being one example) acquired in different orders on different threads. Much has been written on that subject, so I won’t repeat it here.
There is another cause of deadlock which is less well known: the abandoned critical section.
In this article I’m going to describe how to detect abandoned critical sections. But first I need to describe them to you and explain how abandoned critical sections get created.
What is an Abandoned Critical Section?
An abandoned critical section is a critical section that has been locked but then the thread that owns the lock ends without unlocking the critical section. This creates a critical section that cannot be unlocked, and is thus permanently locked. If any other thread attempts to enter the critical section it will wait forever, in a deadlock caused by an infinite wait.
How does this happen?
There are several ways that a critical section can become abandoned.
- Incorrect code.
- Incorrect exception handling.
- Terminate Thread.
Incorrect code
This is where the thread code enters a critical section to do some work and forgets to unlock it. Then the thread exits. If you use object oriented code (CSingleLock for example) to manage the lifetime of critical section ownership then this problem should never happen. But if you control the locking manually, using, say, CCriticalSection::Lock() and CCriticalSection::Unlock(), or EnterCriticalSection(&cs) and LeaveCriticalSection(&cs), then it’s possible to forget to leave a locked critical section, or for a logic failure to result in a critical section not being unlocked.
If you’re using object oriented synchronization locking methods you might want to look at Thread Lock Checker to automate checking for some simple and common errors that can happen.
DWORD WINAPI doThread(LPVOID param)
{
    EnterCriticalSection(&dataCS);

    doWork(data);

    return 0;   // forgot to call LeaveCriticalSection(&dataCS);
}
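If you have to work with raw CRITICAL_SECTIONs, a small scope guard removes the need to remember LeaveCriticalSection() on every return path. What follows is a minimal sketch and not code from any of our tools: ScopedCriticalSection, updateData() and the data variable are hypothetical names, and the #else branch is only a counting stand-in (not a real lock) so that the sketch can also be compiled on a non-Windows machine.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
// Assumption: counting stand-in, NOT a real lock, so the sketch builds off Windows.
struct CRITICAL_SECTION { int depth; };
inline void InitializeCriticalSection(CRITICAL_SECTION *cs) { cs->depth = 0; }
inline void DeleteCriticalSection(CRITICAL_SECTION *) {}
inline void EnterCriticalSection(CRITICAL_SECTION *cs) { ++cs->depth; }
inline void LeaveCriticalSection(CRITICAL_SECTION *cs) { --cs->depth; }
#endif

// Hypothetical guard: enters on construction, leaves on destruction.
class ScopedCriticalSection
{
public:
    explicit ScopedCriticalSection(CRITICAL_SECTION &cs) : m_cs(cs)
    {
        EnterCriticalSection(&m_cs);
    }

    ~ScopedCriticalSection()
    {
        LeaveCriticalSection(&m_cs);
    }

    // Copying a guard would unlock twice, so forbid it.
    ScopedCriticalSection(const ScopedCriticalSection &) = delete;
    ScopedCriticalSection &operator=(const ScopedCriticalSection &) = delete;

private:
    CRITICAL_SECTION &m_cs;
};

static CRITICAL_SECTION dataCS;   // InitializeCriticalSection(&dataCS) must be called once at startup
static int data = 0;              // protected by dataCS

// Every return path, including the early one, leaves the critical section.
bool updateData(int value)
{
    ScopedCriticalSection lock(dataCS);

    if (value < 0)
        return false;   // early return: the guard's destructor still unlocks

    data = value;
    return true;
}
```

A thread function that calls updateData() can simply return afterwards. There is no unlock call left to forget.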
Incorrect exception handling
This is where some code in a thread is protected by an exception handler (you’re calling a 3rd party library, or working with data of unknown integrity) and a critical section is locked when an exception is thrown. In an ideal world the exception handler will leave that locked critical section. Unfortunately the writer of the exception handler may not know about the critical section, or may have forgotten about it. Either way the locked critical section doesn’t get unlocked. As with the previous case, if you use object oriented access to critical sections (CCriticalSection, CSingleLock) the process of unwinding the stack during the exception handling should automatically unlock these locks. This won’t happen if you’re using CRITICAL_SECTIONs with the Win32 API.
DWORD WINAPI doThread(LPVOID param)
{
    __try
    {
        EnterCriticalSection(&dataCS);

        doWork(data);   // something inside here throws an exception

        LeaveCriticalSection(&dataCS);
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        // forgot to call LeaveCriticalSection(&dataCS);
    }

    return 0;
}
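When the failure is a C++ exception rather than a structured exception, stack unwinding can release the lock for you. The sketch below illustrates that idea under some assumptions: CriticalSectionLock is a hypothetical adapter that lets std::lock_guard drive a CRITICAL_SECTION, doWork() and guardedWork() are stand-ins invented for the example, and the #else branch is a counting stand-in (not a real lock) so that the sketch can also be compiled on a non-Windows machine.

```cpp
#include <mutex>
#include <stdexcept>

#ifdef _WIN32
#include <windows.h>
#else
// Assumption: counting stand-in, NOT a real lock, so the sketch builds off Windows.
struct CRITICAL_SECTION { int depth; };
inline void InitializeCriticalSection(CRITICAL_SECTION *cs) { cs->depth = 0; }
inline void DeleteCriticalSection(CRITICAL_SECTION *) {}
inline void EnterCriticalSection(CRITICAL_SECTION *cs) { ++cs->depth; }
inline void LeaveCriticalSection(CRITICAL_SECTION *cs) { --cs->depth; }
#endif

// Hypothetical adapter: gives a CRITICAL_SECTION the lock()/unlock()
// interface that std::lock_guard expects.
class CriticalSectionLock
{
public:
    CriticalSectionLock()  { InitializeCriticalSection(&m_cs); }
    ~CriticalSectionLock() { DeleteCriticalSection(&m_cs); }

    CriticalSectionLock(const CriticalSectionLock &) = delete;
    CriticalSectionLock &operator=(const CriticalSectionLock &) = delete;

    void lock()   { EnterCriticalSection(&m_cs); }
    void unlock() { LeaveCriticalSection(&m_cs); }

    CRITICAL_SECTION *native_handle() { return &m_cs; }

private:
    CRITICAL_SECTION m_cs;
};

static CriticalSectionLock dataLock;

// Stand-in for work that may fail part way through.
static void doWork(bool fail)
{
    if (fail)
        throw std::runtime_error("bad data");
}

// Returns true if the work completed, false if it threw.
bool guardedWork(bool fail)
{
    try
    {
        std::lock_guard<CriticalSectionLock> guard(dataLock);

        doWork(fail);
        return true;
    }
    catch (const std::exception &)
    {
        // Unwinding destroyed 'guard' before control reached this handler,
        // so the critical section has already been left.
        return false;
    }
}
```

This relies on C++ exception unwinding. MSVC does not allow objects that need unwinding in a function that uses __try, so code that has to stay with __try/__except can get the same effect by placing LeaveCriticalSection() in a __try/__finally block around the locked region.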
Terminate Thread
This is where a thread that holds one or more critical sections while doing some work is killed by another thread calling TerminateThread(). There are occasions where TerminateThread() can be useful, but it is a last-ditch method for dealing with threads: the terminated thread gets no chance to release the locks it owns. If your code uses TerminateThread() to manage your own threads, it is worth spending some time working out how to avoid it and make your threads end normally (by returning from the thread function or calling ExitThread()).
// correctly written thread
DWORD WINAPI doThread(LPVOID param)
{
    EnterCriticalSection(&dataCS);

    doWork(data);

    LeaveCriticalSection(&dataCS);

    return 0;
}

void mainThread()
{
    HANDLE hThread;
    DWORD threadId;

    hThread = CreateThread(NULL, 0, doThread, NULL, 0, &threadId);
    if (hThread != NULL)
    {
        doSomeWork();

        TerminateThread(hThread, 0);   // this is a bit brutal
        CloseHandle(hThread);
    }
}
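One way to avoid TerminateThread() is to ask the thread to stop and then wait for it to finish. The sketch below shows the shape of that approach and is not a drop-in replacement for the code above. To keep it short, std::thread and std::atomic stand in for CreateThread() and a Win32 event. workerLoop(), runAndStopWorker() and stopRequested are hypothetical names, and the #else branch is a counting stand-in (not a real lock) so that the sketch can also be compiled on a non-Windows machine.

```cpp
#include <atomic>
#include <thread>

#ifdef _WIN32
#include <windows.h>
#else
// Assumption: counting stand-in, NOT a real lock, so the sketch builds off Windows.
struct CRITICAL_SECTION { int depth; };
inline void InitializeCriticalSection(CRITICAL_SECTION *cs) { cs->depth = 0; }
inline void DeleteCriticalSection(CRITICAL_SECTION *) {}
inline void EnterCriticalSection(CRITICAL_SECTION *cs) { ++cs->depth; }
inline void LeaveCriticalSection(CRITICAL_SECTION *cs) { --cs->depth; }
#endif

static CRITICAL_SECTION dataCS;
static int data = 0;                           // protected by dataCS
static std::atomic<bool> stopRequested(false);

// Worker: holds the lock for one unit of work at a time and only
// checks the stop flag when it is NOT holding the lock.
static void workerLoop()
{
    do
    {
        EnterCriticalSection(&dataCS);
        ++data;                                // stand-in for doWork(data)
        LeaveCriticalSection(&dataCS);

        std::this_thread::yield();
    }
    while (!stopRequested.load());
}

// Starts the worker, asks it to stop, and waits for it to exit normally.
// Returns the number of units of work the worker completed.
int runAndStopWorker()
{
    InitializeCriticalSection(&dataCS);
    data = 0;
    stopRequested.store(false);

    std::thread worker(workerLoop);

    // ... the main thread's own work (doSomeWork() above) would go here ...

    stopRequested.store(true);                 // ask the thread to stop
    worker.join();                             // wait until it has returned

    // The worker left the critical section before it exited,
    // so entering it here cannot block on an abandoned lock.
    EnterCriticalSection(&dataCS);
    int completed = data;
    LeaveCriticalSection(&dataCS);

    DeleteCriticalSection(&dataCS);
    return completed;
}
```

The design choice that matters is that the stop flag is tested only at a point where the thread owns no locks, so the thread can never end with a critical section still locked.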
How to detect Abandoned Critical Sections?
We have two ways to detect Abandoned Critical Sections.
- Thread Wait Chain Inspector
- Thread Validator
Thread Wait Chain Inspector
Thread Wait Chain Inspector is a free software tool that we wrote that uses the Win32 Wait Chain API to identify various wait chain states of the locks and waits in a given application. Just select the application in question and look at the results.
This tool tells you process ids and thread ids, but it can’t give you symbols, filenames and line numbers. It will provide thread names if you’re working on Windows 10 and you’ve named your threads using the SetThreadDescription() API.
Thread Validator
Thread Validator is our thread analysis software tool for analysing thread synchronization problems, deadlocks, busy locks, slow locks, contended locks and recursing locks. We’ve recently added some reporting options to Thread Validator that will help you identify the location of abandoned critical sections.
I’ve used the nativeExample demonstration application that ships with Thread Validator (you’ll need to build it) to deliberately create two abandoned critical sections. From the test menu choose “Exit thread with a locked critical section” and “Terminate thread with a locked critical section”.
The summary display will show an abandoned count of 2 in the Errors panel.
The various locks displays will colour the abandoned thread dark purple and list the Lock status as Abandoned.
If you click the Abandoned bar in the Errors panel, the display will move to the Analysis tab and the callstacks for the abandoned critical sections will be displayed.
Expanding each entry reveals the callstacks so that you can see where each critical section is abandoned. Note that each entry shows two callstacks. The first is where the critical section was created. The second is where the critical section was abandoned. You can expand any entry on any callstack to see the source code.
Abandoned because of thread exit
Abandoned because of TerminateThread()
Expanding the callstack entries to reveal the source code…
Conclusion
Abandoned Critical Sections are bad news. They cause deadlocks.
But they don’t need to be hard to track down when you’ve got the right tools to put to work.