Thread safety

A key, and unfortunately often not clearly apparent, issue when developing multithreaded applications is that of thread safety. A thread-safe (or, as the man pages like to specify it, MT-Safe) function or API is one that can be safely executed in parallel by multiple threads with no adverse effects.

To understand what this thread-safety issue actually is, let's go back to one of the programs we saw in Appendix A, File I/O Essentials; you can find the source code within the book's GitHub repository: https://github.com/PacktPublishing/Hands-on-System-Programming-with-Linux/blob/master/A_fileio/iobuf.c. In this program, we used fopen(3) to open a file in append mode and then performed some I/O (reads/writes) upon it; we duplicate a small paragraph of that chapter here:

  • We fopen(3) a stream (in append mode: a) to our destination, just a regular file in the /tmp directory (it will be created if it does not exist)
  • Then, in a loop, for a number of iterations provided by the user as a parameter, we will do the following:
    • Read several (512) bytes from the source stream (they will be random values) via the fread(3) stdio library API
    • Write those values to our destination stream via the fwrite(3) stdio library API (checking for EOF and/or error conditions)

Here's a snippet of the code, mainly the testit function, which performs the actual I/O; refer to: https://github.com/PacktPublishing/Hands-on-System-Programming-with-Linux/blob/master/A_fileio/iobuf.c:

static char *gbuf = NULL;    /* global buffer, shared by every caller of testit() */

static void testit(FILE * wrstrm, FILE * rdstrm, int numio)
{
    int i, syscalls = NREAD*numio/getpagesize();
    size_t fnr=0;

    if (syscalls <= 0)
        syscalls = 1;
    VPRINT("numio=%d total rdwr=%u expected # rw syscalls=%d ",
           numio, NREAD*numio, NREAD*numio/getpagesize());

    for (i = 0; i < numio; i++) {
        /* Read up to NREAD bytes from the source stream into the global buffer */
        fnr = fread(gbuf, 1, NREAD, rdstrm);
        if (!fnr)
            FATAL("fread on /dev/urandom failed ");

        /* Write the bytes just read from the global buffer to the destination */
        if (!fwrite(gbuf, 1, fnr, wrstrm)) {
            free(gbuf);
            if (feof(wrstrm))
                return;
            if (ferror(wrstrm))
                FATAL("fwrite on our file failed ");
        }
    }
}

Notice the first line of code; it's really important to our discussion. The memory buffer used to hold the source and destination data is a global (static) variable, gbuf.

Here's where it's allocated in the main() function of the app:

...
gbuf = malloc(NREAD);
if (!gbuf)
    FATAL("malloc %zu failed! ", NREAD);
...

So what? In Appendix A, File I/O Essentials, we worked with the implicit assumption that the process is single-threaded; so long as this assumption remains true, the program will work well. But think carefully about this; the moment we want to port this program to become multithreaded-capable, the code is not good enough. Why? It should be quite clear: if multiple threads simultaneously execute the code of the testit function (which is exactly the expectation), the presence of the global shared writable memory variable, gbuf, tells us that we will have critical sections in the code path. As we learned in detail in Chapter 15, Multithreading with Pthreads Part II - Synchronization, every critical section must be either eliminated or protected to prevent data races. 

In the preceding code fragment, we happily invoke both fread(3) and fwrite(3) on this global buffer without any protection whatsoever. Just visualize multiple threads running through this code path simultaneously: one thread's fread(3) can overwrite the buffer before another thread's fwrite(3) has written out its contents. The result is havoc.
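One way to protect this critical section can be sketched with a mutex. This is only an illustration, not the book's code: the names testit_locked and gbuf_lock are hypothetical, NREAD is assumed to be 512 (per the description above), the buffer is declared as a static array to keep the sketch self-contained, and each thread is assumed to pass in its own streams.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NREAD 512   /* assumption: bytes per read, as described in the text */

static char gbuf[NREAD];   /* still global and shared between threads */
static pthread_mutex_t gbuf_lock = PTHREAD_MUTEX_INITIALIZER;  /* hypothetical name */

/*
 * Hypothetical mutex-protected variant of testit().
 * Returns the number of bytes copied, or -1 on an I/O error.
 */
static long testit_locked(FILE *wrstrm, FILE *rdstrm, int numio)
{
    long total = 0;

    assert(wrstrm != NULL && rdstrm != NULL);

    for (int i = 0; i < numio; i++) {
        size_t fnr, fnw = 0;

        /* Critical section: the read into, and write out of, the shared
         * buffer must not interleave with another thread's use of it. */
        pthread_mutex_lock(&gbuf_lock);
        fnr = fread(gbuf, 1, NREAD, rdstrm);
        if (fnr > 0)
            fnw = fwrite(gbuf, 1, fnr, wrstrm);
        pthread_mutex_unlock(&gbuf_lock);

        if (fnr == 0)                       /* EOF (not an error) or read error */
            return ferror(rdstrm) ? -1 : total;
        if (fnw != fnr)                     /* short write */
            return -1;
        total += (long)fnr;
    }
    return total;
}
```

Note the cost of this approach: the lock is held across blocking I/O, so the threads are effectively serialized while copying.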

So, now we can see it and conclude that the testit function is not thread-safe (at the very least, the programmer must document this fact, preventing others from using the code in a multithreaded application!).

Worse, imagine that the preceding thread-unsafe function we developed is merged into a shared library (often referred to as a shared object file on Unix/Linux); any (multithreaded) application that links into this library will have access to this function. If multiple threads of such an application ever invoke it, we have a potential race: a bug, a defect! Not just that, such defects are among the really hard-to-spot and hard-to-understand ones, causing issues and perhaps all kinds of temporary bandage fixes (which only make the situation worse and the customer even less confident in the software). Disasters are caused in seemingly innocent ways indeed.

Our conclusion on this: either render the function thread-safe, or clearly document it as being thread-unsafe (and only use it, if at all, in a single-threaded context).
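As a minimal sketch of rendering the function thread-safe, the critical section can be eliminated altogether by giving every call its own buffer. The name testit_r is hypothetical, NREAD is assumed to be 512 (per the description above), and the sketch assumes each thread passes in its own streams; it is not the book's implementation.

```c
#include <assert.h>
#include <stdio.h>

#define NREAD 512   /* assumption: bytes per read, as described in the text */

/*
 * Hypothetical reentrant variant of testit(): the buffer lives on the
 * calling thread's stack, so no writable memory is shared between threads.
 * Returns the number of bytes copied, or -1 on an I/O error.
 */
static long testit_r(FILE *wrstrm, FILE *rdstrm, int numio)
{
    char buf[NREAD];    /* per-call, hence per-thread, buffer */
    long total = 0;

    assert(wrstrm != NULL && rdstrm != NULL);

    for (int i = 0; i < numio; i++) {
        size_t fnr = fread(buf, 1, NREAD, rdstrm);

        if (fnr == 0)                       /* EOF (not an error) or read error */
            return ferror(rdstrm) ? -1 : total;
        if (fwrite(buf, 1, fnr, wrstrm) != fnr)
            return -1;                      /* short write */
        total += (long)fnr;
    }
    return total;
}
```

With no shared writable state there is nothing to lock, so threads copying between distinct streams can run fully in parallel. If several threads were handed the same FILE stream, their output could still interleave, which would need separate handling.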
