Chapter 5. Parallel System Tools

“Telling the Monkeys What to Do”

Most computers spend a lot of time doing nothing. If you start a system monitor tool and watch the CPU utilization, you’ll see what I mean—it’s rare to see one hit 100 percent, even when you are running multiple programs.[12] There are just too many delays built into software: disk accesses, network traffic, database queries, waiting for users to click a button, and so on. In fact, the majority of a modern CPU’s capacity is often spent in an idle state; faster chips help speed up performance demand peaks, but much of their power can go largely unused.

Early on in computing, programmers realized that they could tap into such unused processing power by running more than one program at the same time. By dividing the CPU’s attention among a set of tasks, its capacity need not go to waste while any given task is waiting for an external event to occur. The technique is usually called parallel processing (and sometimes “multiprocessing” or even “multitasking”) because many tasks seem to be performed at once, overlapping and parallel in time. It’s at the heart of modern operating systems, and it gave rise to the notion of multiple-active-window computer interfaces we’ve all come to take for granted. Even within a single program, dividing processing into tasks that run in parallel can make the overall system faster, at least as measured by the clock on your wall.

Just as important is that modern software systems are expected to be responsive to users regardless of the amount of work they must perform behind the scenes. It’s usually unacceptable for a program to stall while busy carrying out a request. Consider an email-browser user interface, for example; when asked to fetch email from a server, the program must download text from a server over a network. If you have enough email or a slow enough Internet link, that step alone can take minutes to finish. But while the download task proceeds, the program as a whole shouldn’t stall—it still must respond to screen redraws, mouse clicks, and so on.

Parallel processing comes to the rescue here, too. By performing such long-running tasks in parallel with the rest of the program, the system at large can remain responsive no matter how busy some of its parts may be. Moreover, the parallel processing model is a natural fit for structuring such programs and others; some tasks are more easily conceptualized and coded as components running as independent, parallel entities.

There are two fundamental ways to get tasks running at the same time in Python: process forks and spawned threads. Functionally, both rely on underlying operating system services to run bits of Python code in parallel. Procedurally, they are very different in terms of interface, portability, and communication. For instance, at this writing direct process forks are not supported on Windows under standard Python (though they are under Cygwin Python on Windows).

By contrast, Python’s thread support works on all major platforms. Moreover, the os.spawn family of calls provides additional ways to launch programs in a platform-neutral way that is similar to forks, and the os.popen and os.system calls and subprocess module we studied in Chapters 2 and 3 can be used to portably spawn programs with shell commands. The newer multiprocessing module offers additional ways to run processes portably in many contexts.

In this chapter, which is a continuation of our look at system interfaces available to Python programmers, we explore Python’s built-in tools for starting tasks in parallel, as well as communicating with those tasks. In some sense, we’ve already begun doing so—os.system, os.popen, and subprocess, which we learned and applied over the last three chapters, are a fairly portable way to spawn and speak with command-line programs, too. We won’t repeat full coverage of those tools here.

Instead, our emphasis in this chapter is on introducing more direct techniques—forks, threads, pipes, signals, sockets, and other launching techniques—and on using Python’s built-in tools that support them, such as the os.fork call and the threading, queue, and multiprocessing modules. In the next chapter (and in the remainder of this book), we use these techniques in more realistic programs, so be sure you understand the basics here before flipping ahead.

One note up front: although the process, thread, and IPC mechanisms we will explore in this chapter are the primary parallel processing tools in Python scripts, the third-party domain offers additional options that may serve more advanced or specialized roles. As just one example, the MPI for Python system allows Python scripts to also employ the Message Passing Interface (MPI) standard, allowing Python programs to exploit multiple processors in various ways (see the Web for details). While such specific extensions are beyond our scope in this book, the fundamentals of multiprocessing that we will explore here should apply to more advanced techniques you may encounter in your parallel futures.

Forking Processes

Forked processes are a traditional way to structure parallel tasks, and they are a fundamental part of the Unix tool set. Forking is a straightforward way to start an independent program, whether it is different from the calling program or not. Forking is based on the notion of copying programs: when a program calls the fork routine, the operating system makes a new copy of that program and its process in memory and starts running that copy in parallel with the original. Some systems don’t really copy the original program (it’s an expensive operation), but the new copy works as if it were a literal copy.

After a fork operation, the original copy of the program is called the parent process, and the copy created by os.fork is called the child process. In general, parents can make any number of children, and children can create child processes of their own; all forked processes run independently and in parallel under the operating system’s control, and children may continue to run after their parent exits.

This is probably simpler in practice than in theory, though. The Python script in Example 5-1 forks new child processes until you type the letter q at the console.

Example 5-1. PP4E\System\Processes\fork1.py
"forks child processes until you type 'q'"

import os

def child():
    print('Hello from child',  os.getpid())
    os._exit(0)  # else goes back to parent loop

def parent():
    while True:
        newpid = os.fork()
        if newpid == 0:
            child()
        else:
            print('Hello from parent', os.getpid(), newpid)
        if input() == 'q': break

parent()

Python’s process forking tools, available in the os module, are simply thin wrappers over standard forking calls in the system library also used by C language programs. To start a new, parallel process, call the os.fork built-in function. Because this function generates a copy of the calling program, it returns a different value in each copy: zero in the child process and the process ID of the new child in the parent.

Programs generally test this result to begin different processing in the child only; this script, for instance, runs the child function in child processes only.[13]
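To make the two return values concrete, here is a minimal sketch that forks a single child and has the parent collect its exit status. The function name fork_once and the status value 7 are made up for illustration, and the os.waitpid and os.WEXITSTATUS calls are not part of Example 5-1; like os.fork itself, this assumes a Unix-like platform.

```python
import os

def fork_once():
    """Fork one child; return (child_pid, exit_code) as seen by the parent."""
    pid = os.fork()
    if pid == 0:
        # In the child copy: fork returned zero, so exit with a marker status
        os._exit(7)
    # In the parent copy: fork returned the new child's process ID
    _, status = os.waitpid(pid, 0)             # wait for that child to finish
    return pid, os.WEXITSTATUS(status)         # decode the child's exit code

if __name__ == '__main__':
    child_pid, code = fork_once()
    print('parent', os.getpid(), 'forked', child_pid, 'which exited with', code)
```

The same test on the fork result selects the child branch here as in the book's script; only the work done in each branch differs.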

Because forking is ingrained in the Unix programming model, this script works well on Unix, Linux, and modern Macs. Unfortunately, this script won’t work on the standard version of Python for Windows today, because fork is too much at odds with the Windows model. Python scripts can always spawn threads on Windows, and the multiprocessing module described later in this chapter provides an alternative for running processes portably, which can obviate the need for process forks on Windows in contexts that conform to its constraints (albeit at some potential cost in low-level control).

The script in Example 5-1 does work on Windows, however, if you use the Python shipped with the Cygwin system (or build one of your own from source code with Cygwin’s libraries). Cygwin is a free, open source system that provides full Unix-like functionality for Windows (and is described further in “More on Cygwin Python for Windows”). You can fork with Python on Windows under Cygwin, even though its behavior is not exactly the same as true Unix forks. Because it’s close enough for this book’s examples, though, let’s use it to run our script live:

[C:\...\PP4E\System\Processes]$ python fork1.py
Hello from parent 7296 7920
Hello from child 7920

Hello from parent 7296 3988
Hello from child 3988

Hello from parent 7296 6796
Hello from child 6796
q

These messages represent three forked child processes; the unique identifiers of all the processes involved are fetched and displayed with the os.getpid call. A subtle point: the child process function is also careful to exit explicitly with an os._exit call. We’ll discuss this call in more detail later in this chapter, but if it’s not made, the child process would live on after the child function returns (remember, it’s just a copy of the original process). The net effect is that the child would go back to the loop in parent and start forking children of its own (i.e., the parent would have grandchildren). If you delete the exit call and rerun, you’ll likely have to type more than one q to stop, because multiple processes are running in the parent function.

In Example 5-1, each process exits very soon after it starts, so there’s little overlap in time. Let’s do something slightly more sophisticated to better illustrate multiple forked processes running in parallel. Example 5-2 starts up 5 copies of itself, each copy counting up to 5 with a one-second delay between iterations. The time.sleep standard library call simply pauses the calling process for a number of seconds (you can pass a floating-point value to pause for fractions of seconds).

Example 5-2. PP4E\System\Processes\fork-count.py
"""
fork basics: start 5 copies of this program running in parallel with
the original; each copy counts up to 5 on the same stdout stream--forks
copy process memory, including file descriptors; fork doesn't currently
work on Windows without Cygwin: use os.spawnv or multiprocessing on
Windows instead; spawnv is roughly like a fork+exec combination;
"""

import os, time

def counter(count):                                    # run in new process
    for i in range(count):
        time.sleep(1)                                  # simulate real work
        print('[%s] => %s' % (os.getpid(), i))

for i in range(5):
    pid = os.fork()
    if pid != 0:
        print('Process %d spawned' % pid)              # in parent: continue
    else:
        counter(5)                                     # else in child/new process
        os._exit(0)                                    # run function and exit

print('Main process exiting.')                         # parent need not wait

When run, this script starts 5 processes immediately and exits. All 5 forked processes check in with their first count display one second later and every second thereafter. Notice that child processes continue to run, even if the parent process that created them terminates:

[C:\...\PP4E\System\Processes]$ python fork-count.py
Process 4556 spawned
Process 3724 spawned
Process 6360 spawned
Process 6476 spawned
Process 6684 spawned
Main process exiting.
[4556] => 0
[3724] => 0
[6360] => 0
[6476] => 0
[6684] => 0
[4556] => 1
[3724] => 1
[6360] => 1
[6476] => 1
[6684] => 1
[4556] => 2
[3724] => 2
[6360] => 2
[6476] => 2
[6684] => 2
...more output omitted...

The output of all of these processes shows up on the same screen, because all of them share the standard output stream (and a system prompt may show up along the way, too). Technically, a forked process gets a copy of the original process’s global memory, including open file descriptors. Because of that, global objects like files start out with the same values in a child process, so all the processes here are tied to the same single stream. But it’s important to remember that global memory is copied, not shared; if a child process changes a global object, it changes only its own copy. (As we’ll see, this works differently in threads, the topic of the next section.)
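The copied-not-shared point is easy to check for yourself. In the following sketch, which assumes a Unix-like platform, a child changes a global and reports its new value through its exit status, while the parent's copy stays put. The names counter and child_changes_global are illustrative only, and os.waitpid is used here just to collect the child's status.

```python
import os

counter = 0                                    # global: copied into each child

def child_changes_global():
    """Return (value seen in child, value seen in parent) after a fork."""
    global counter
    counter = 0
    pid = os.fork()
    if pid == 0:
        counter += 41                          # changes the child's copy only
        os._exit(counter)                      # report it as the exit status
    _, status = os.waitpid(pid, 0)             # parent: wait for the child
    return os.WEXITSTATUS(status), counter     # parent's counter is unchanged

if __name__ == '__main__':
    print(child_changes_global())
```

Exit statuses are small integers, so this trick only works for tiny values; real programs pass data between processes with the communication tools covered later in this chapter.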

The fork/exec Combination

In Examples 5-1 and 5-2, child processes simply ran a function within the Python program and then exited. On Unix-like platforms, forks are often the basis of starting independently running programs that are completely different from the program that performed the fork call. For instance, Example 5-3 forks new processes until we type q again, but child processes run a brand-new program instead of calling a function in the same file.

Example 5-3. PP4E\System\Processes\fork-exec.py
"starts programs until you type 'q'"

import os

parm = 0
while True:
    parm += 1
    pid = os.fork()
    if pid == 0:                                             # copy process
        os.execlp('python', 'python', 'child.py', str(parm)) # overlay program
        assert False, 'error starting program'               # shouldn't return
    else:
        print('Child is', pid)
        if input() == 'q': break

If you’ve done much Unix development, the fork/exec combination will probably look familiar. The main thing to notice is the os.execlp call in this code. In a nutshell, this call replaces (overlays) the program running in the current process with a brand new program. Because of that, the combination of os.fork and os.execlp means start a new process and run a new program in that process—in other words, launch a new program in parallel with the original program.

os.exec call formats

The arguments to os.execlp specify the program to be run by giving command-line arguments used to start the program (i.e., what Python scripts know as sys.argv). If successful, the new program begins running and the call to os.execlp itself never returns (since the original program has been replaced, there’s really nothing to return to). If the call does return, an error has occurred, so we code an assert after it that will always raise an exception if reached.
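As a self-contained variation on this idea, the sketch below forks a child and overlays it with a second Python program given inline. It is not the book's script: it assumes a Unix-like platform, swaps in sys.executable and the interpreter's -c option for the python and child.py names used in Example 5-3, and traps a failed exec with an except clause rather than an assert. The launch function name is made up.

```python
import os, sys

def launch(parm):
    """Fork, overlay the child with a new Python program, return its exit code."""
    pid = os.fork()
    if pid == 0:
        # New program: sees ['-c', str(parm)] as its sys.argv
        code = 'import sys; sys.exit(int(sys.argv[1]))'
        try:
            os.execv(sys.executable, [sys.executable, '-c', code, str(parm)])
        except OSError:
            os._exit(127)                      # reached only if the exec failed
    _, status = os.waitpid(pid, 0)             # parent: wait for the new program
    return os.WEXITSTATUS(status)

if __name__ == '__main__':
    for parm in (1, 2, 3):
        print('child program exited with', launch(parm))
```

Because a successful exec never returns, nothing after the os.execv call runs in the child unless the overlay itself fails.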

There are a handful of os.exec variants in the Python standard library; some allow us to configure environment variables for the new program, pass command-line arguments in different forms, and so on. All are available on both Unix and Windows, and they replace the calling program (i.e., the Python interpreter). exec comes in eight flavors, which can be a bit confusing unless you generalize:

os.execv(program, commandlinesequence)

The basic “v” exec form is passed an executable program’s name, along with a list or tuple of command-line argument strings used to run the executable (that is, the words you would normally type in a shell to start a program).

os.execl(program, cmdarg1, cmdarg2,... cmdargN)

The basic “l” exec form is passed an executable’s name, followed by one or more command-line arguments passed as individual function arguments. This is the same as os.execv(program, (cmdarg1, cmdarg2,...)).

os.execlp
os.execvp

Adding the letter p to the execv and execl names means that Python will locate the executable’s directory using your system search-path setting (i.e., PATH).

os.execle
os.execve

Adding a letter e to the execv and execl names means an extra, last argument is a dictionary containing shell environment variables to send to the program.

os.execvpe
os.execlpe

Adding the letters p and e to the basic exec names means to use the search path and to accept a shell environment settings dictionary.

So when the script in Example 5-3 calls os.execlp, individually passed parameters specify a command line for the program to be run on, and the word python maps to an executable file according to the underlying system search-path setting environment variable (PATH). It’s as if we were running a command of the form python child.py 1 in a shell, but with a different command-line argument on the end each time.
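To illustrate the “e” variants described above, the following sketch passes a custom environment dictionary to the new program with os.execve. It assumes a Unix-like platform, and the DEMO_SETTING variable and run_with_env function are hypothetical names chosen for this example; the launched program simply exits with status 0 if it sees the expected setting.

```python
import os, sys

def run_with_env(value):
    """Fork and exec a program with an extra environment variable; return its exit code."""
    pid = os.fork()
    if pid == 0:
        env = dict(os.environ, DEMO_SETTING=value)     # copy, plus one new setting
        code = ('import os, sys; '
                'sys.exit(0 if os.environ.get("DEMO_SETTING") == sys.argv[1] else 1)')
        try:
            # "v" form: arguments in a list; "e" form: environment dict comes last
            os.execve(sys.executable, [sys.executable, '-c', code, value], env)
        except OSError:
            os._exit(127)                              # exec itself failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

if __name__ == '__main__':
    print('exit code:', run_with_env('spam'))
```

The os.execle form should behave the same way here, with the list items passed as individual arguments ahead of the environment dictionary.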

Spawned child program

Just as when typed at a shell, the string of arguments passed to os.execlp by the fork-exec script in Example 5-3 starts another Python program file, as shown in Example 5-4.

Example 5-4. PP4E\System\Processes\child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])

Here is this code in action on Linux. It doesn’t look much different from the original fork1.py, but it’s really running a new program in each forked process. More observant readers may notice that the child process ID displayed is the same in the parent program and the launched child.py program; os.execlp simply overlays a program in the same process:

[C:\...\PP4E\System\Processes]$ python fork-exec.py
Child is 4556
Hello from child 4556 1

Child is 5920
Hello from child 5920 2

Child is 316
Hello from child 316 3
q

There are other ways to start up programs in Python besides the fork/exec combination. For example, the os.system and os.popen calls and subprocess module which we explored in Chapters 2 and 3 allow us to spawn shell commands. And the os.spawnv call and multiprocessing module, which we’ll meet later in this chapter, allow us to start independent programs and processes more portably. In fact, we’ll see later that multiprocessing’s process spawning model can be used as a sort of portable replacement for os.fork in some contexts (albeit a less efficient one) and used in conjunction with the os.exec* calls shown here to achieve a similar effect in standard Windows Python.
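For comparison, here is roughly how the same sort of child program might be started portably with the subprocess module instead of fork and exec. This is a sketch only: the run_child name is made up, the child's code is given inline with -c rather than in child.py, and the argument list is run directly rather than through a shell.

```python
import subprocess, sys

def run_child(parm):
    """Start a new Python program, wait for it, and return its printed output."""
    code = 'import sys; print("Hello from child", sys.argv[1])'
    result = subprocess.run([sys.executable, '-c', code, str(parm)],
                            capture_output=True,   # collect stdout and stderr
                            text=True,             # decode output bytes to str
                            check=True)            # raise if exit status != 0
    return result.stdout.strip()

if __name__ == '__main__':
    print(run_child(1))
```

Unlike the fork/exec script, this version waits for each child to finish before continuing, so it trades parallelism for portability and simplicity.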

We’ll see more process fork examples later in this chapter, especially in the program exits and process communication sections, so we’ll forgo additional examples here. We’ll also discuss additional process topics in later chapters of this book. For instance, forks are revisited in Chapter 12 to deal with servers and their zombies: dead processes lurking in system tables after their demise. For now, let’s move on to threads, a subject which at least some programmers find to be substantially less frightening…

Threads

Threads are another way to start activities running at the same time. In short, they run a call to a function (or any other type of callable object) in parallel with the rest of the program. Threads are sometimes called “lightweight processes,” because they run in parallel like forked processes, but all of them run within the same single process. While processes are commonly used to start independent programs, threads are commonly used for tasks such as nonblocking input calls and long-running tasks in a GUI. They also provide a natural model for algorithms that can be expressed as independently running tasks. For applications that can benefit from parallel processing, some developers consider threads to offer a number of advantages:

Performance

Because all threads run within the same process, they don’t generally incur a big startup cost to copy the process itself. The costs of both copying forked processes and running threads can vary per platform, but threads are usually considered less expensive in terms of performance overhead.

Simplicity

To many observers, threads can be noticeably simpler to program, too, especially when some of the more complex aspects of processes enter the picture (e.g., process exits, communication schemes, and zombie processes, covered in Chapter 12).

Shared global memory

On a related note, because threads run in a single process, every thread shares the same global memory space of the process. This provides a natural and easy way for threads to communicate—by fetching and setting names or objects accessible to all the threads. To the Python programmer, this means that things like global scope variables, passed objects and their attributes, and program-wide interpreter components such as imported modules are shared among all threads in a program; if one thread assigns a global variable, for instance, its new value will be seen by other threads. Some care must be taken to control access to shared items, but to some this seems generally simpler to use than the process communication tools necessary for forked processes, which we’ll meet later in this chapter and book (e.g., pipes, streams, signals, sockets, etc.). Like much in programming, this is not a universally shared view, however, so you’ll have to weigh the difference for your programs and platforms yourself.

Portability

Perhaps most important is the fact that threads are more portable than forked processes. At this writing, os.fork is not supported by the standard version of Python on Windows, but threads are. If you want to run parallel tasks portably in a Python script today and you are unwilling or unable to install a Unix-like library such as Cygwin on Windows, threads may be your best bet. Python’s thread tools automatically account for any platform-specific thread differences, and they provide a consistent interface across all operating systems. Having said that, the relatively new multiprocessing module described later in this chapter offers another answer to the process portability issue in some use cases.

So what’s the catch? There are three potential downsides you should be aware of before you start spinning your threads:

Function calls versus programs

First of all, threads are not a way—at least, not a direct way—to start up another program. Rather, threads are designed to run a call to a function (technically, any callable, including bound and unbound methods) in parallel with the rest of the program. As we saw in the prior section, by contrast, forked processes can either call a function or start a new program. Naturally, the threaded function can run scripts with the exec built-in function and can start new programs with tools such as os.system, os.popen and the subprocess module, especially if doing so is itself a long-running task. But fundamentally, threads run in-program functions.

In practice, this is usually not a limitation. For many applications, parallel functions are sufficiently powerful. For instance, if you want to implement nonblocking input and output and avoid blocking a GUI or its users with long-running tasks, threads do the job; simply spawn a thread to run a function that performs the potentially long-running task. The rest of the program will continue independently.

Thread synchronization and queues

Secondly, the fact that threads share objects and names in global process memory is both good news and bad news—it provides a communication mechanism, but we have to be careful to synchronize a variety of operations. As we’ll see, even operations such as printing are a potential conflict since there is only one sys.stdout per process, which is shared by all threads.

Luckily, the Python queue module, described in this section, makes this simple: realistic threaded programs are usually structured as one or more producer (a.k.a. worker) threads that add data to a queue, along with one or more consumer threads that take the data off the queue and process it. In a typical threaded GUI, for example, producers may download or compute data and place it on the queue; the consumer—the main GUI thread—checks the queue for data periodically with a timer event and displays it in the GUI when it arrives. Because the shared queue is thread-safe, programs structured this way automatically synchronize much cross-thread data communication.

The global interpreter lock (GIL)

Finally, as we’ll learn in more detail later in this section, Python’s implementation of threads means that only one thread is ever really running its Python language code in the Python virtual machine at any point in time. Python threads are true operating system threads, but all threads must acquire a single shared lock when they are ready to run, and each thread may be swapped out after running for a short period of time (currently, after a set number of virtual machine instructions, though this implementation may change in Python 3.2).

Because of this structure, the Python language parts of Python threads cannot today be distributed across multiple CPUs on a multi-CPU computer. To leverage more than one CPU, you’ll simply need to use process forking, not threads (the amount and complexity of code required for both are roughly the same). Moreover, the parts of a thread that perform long-running tasks implemented as C extensions can run truly independently if they release the GIL to allow the Python code of other threads to run while their task is in progress. Python code, however, cannot truly overlap in time.

The advantage of Python’s implementation of threads is performance—when it was attempted, making the virtual machine truly thread-safe reportedly slowed all programs by a factor of two on Windows and by an even larger factor on Linux. Even nonthreaded programs ran at half speed.

Even though the GIL’s multiplexing of Python language code makes Python threads less useful for leveraging capacity on multiple CPU machines, threads are still useful as programming tools to implement nonblocking operations, especially in GUIs. Moreover, the newer multiprocessing module we’ll meet later offers another solution here, too—by providing a portable thread-like API that is implemented with processes, programs can both leverage the simplicity and programmability of threads and benefit from the scalability of independent processes across CPUs.
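The producer and consumer structure described under thread synchronization above can be sketched as follows. Here several producer threads put tuples on a shared queue and the main thread consumes them; the function names and message format are made up for this illustration, and a real GUI would poll the queue from a timer event instead of blocking on get.

```python
import queue, threading

def producer(idnum, dataqueue, nummessages):
    """Run in each producer thread: place results on the shared queue."""
    for msgnum in range(nummessages):
        dataqueue.put((idnum, msgnum))         # put is thread-safe: no lock needed

def run(numproducers=3, nummessages=4):
    """Start producers, consume everything they send, and return it as a list."""
    dataqueue = queue.Queue()                  # shared, synchronized queue
    threads = [threading.Thread(target=producer, args=(i, dataqueue, nummessages))
               for i in range(numproducers)]
    for thread in threads:
        thread.start()
    total = numproducers * nummessages
    received = [dataqueue.get() for _ in range(total)]   # get blocks until data arrives
    for thread in threads:
        thread.join()
    return received

if __name__ == '__main__':
    for item in run():
        print('consumed', item)
```

The order in which items from different producers arrive may vary from run to run, but each producer's own items come off the queue in the order they were put.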

Despite what you may think after reading the preceding overview, threads are remarkably easy to use in Python. In fact, when a program is started it is already running a thread, usually called the “main thread” of the process. To start new, independent threads of execution within a process, Python code uses either the low-level _thread module to run a function call in a spawned thread, or the higher-level threading module to manage threads with high-level class-based objects. Both modules also provide tools for synchronizing access to shared objects with locks.

Note

This book presents both the _thread and threading modules, and its examples use both interchangeably. Some Python users would recommend that you always use threading rather than _thread in general. In fact, the latter was renamed from thread to _thread in 3.X to suggest such a lesser status for it. Personally, I think that is too extreme (and this is one reason this book sometimes uses as thread in imports to retain the original module name). Unless you need the more powerful tools in threading, the choice is largely arbitrary, and the threading module’s extra requirements may be unwarranted.

The basic _thread module does not impose OOP and, as you’ll see in the examples of this section, is very straightforward to use. The threading module may be better for more complex tasks that require per-thread state retention or joins, but not all threaded programs require its extra tools, and many use threads in more limited scopes. In fact, this is roughly the same as comparing the os.walk call and visitor classes we’ll meet in Chapter 6: both have valid audiences and use cases. The most general Python rule of thumb applies here as always: keep it simple, unless it has to be complex.

The _thread Module

Since the basic _thread module is a bit simpler than the more advanced threading module covered later in this section, let’s look at some of its interfaces first. This module provides a portable interface to whatever threading system is available in your platform: its interfaces work the same on Windows, Solaris, SGI, and any system with an installed pthreads POSIX threads implementation (including Linux and others). Python scripts that use the Python _thread module work on all of these platforms without changing their source code.

Basic usage

Let’s start off by experimenting with a script that demonstrates the main thread interfaces. The script in Example 5-5 spawns threads until you reply with a q at the console; it’s similar in spirit to (and a bit simpler than) the script in Example 5-1, but it goes parallel with threads instead of process forks.

Example 5-5. PP4E\System\Threads\thread1.py
"spawn threads until you type 'q'"

import _thread

def child(tid):
    print('Hello from thread', tid)

def parent():
    i = 0
    while True:
        i += 1
        _thread.start_new_thread(child, (i,))
        if input() == 'q': break

parent()

This script really contains only two thread-specific lines: the import of the _thread module and the thread creation call. To start a thread, we simply call the _thread.start_new_thread function, no matter what platform we’re programming on.[14] This call takes a function (or other callable) object and an arguments tuple and starts a new thread to execute a call to the passed function with the passed arguments. It’s almost like Python’s function(*args) call syntax, and similarly accepts an optional keyword arguments dictionary, too, but in this case the function call begins running in parallel with the rest of the program.
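As a small illustration of the call's full form, the next sketch passes both an arguments tuple and a keyword arguments dictionary to _thread.start_new_thread. Because the call returns immediately, it also uses a lock from _thread.allocate_lock as a simple completion signal so that the main thread can wait for the result. The greeting and done parameters and the run function are hypothetical names for this example.

```python
import _thread

def child(tid, results, greeting='Hello', done=None):
    """Run in the spawned thread: record a message, then signal completion."""
    results.append('%s from thread %s' % (greeting, tid))
    if done is not None:
        done.release()                         # let the waiting main thread go on

def run():
    results = []
    done = _thread.allocate_lock()
    done.acquire()                             # hold the lock until the thread finishes
    # function, positional arguments tuple, optional keyword arguments dict
    _thread.start_new_thread(child, (1, results), {'greeting': 'Howdy', 'done': done})
    done.acquire()                             # blocks until child releases the lock
    return results

if __name__ == '__main__':
    print(run())
```

The list passed in is the same object in both threads, which is why the main thread can read the message the child appended.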

Operationally speaking, the _thread.start_new_thread call itself returns immediately with no useful value, and the thread it spawns silently exits when the function being run returns (the return value of the threaded function call is simply ignored). Moreover, if a function run in a thread raises an uncaught exception, a stack trace is printed and the thread exits, but the rest of the program continues. With the _thread module, the entire program exits silently on most platforms when the main thread does (though as we’ll see later, the threading module may require special handling if child threads are still running).

In practice, though, it’s almost trivial to use threads in a Python script. Let’s run this program to launch a few threads; we can run it on both Unix-like platforms and Windows this time, because threads are more portable than process forks—here it is spawning threads on Windows:

C:\...\PP4E\System\Threads> python thread1.py
Hello from thread 1

Hello from thread 2

Hello from thread 3

Hello from thread 4
q

Each message here is printed from a new thread, which exits almost as soon as it is started.

Other ways to code threads with _thread

Although the preceding script runs a simple function, any callable object may be run in the thread, because all threads live in the same process. For instance, a thread can also run a lambda function or bound method of an object (the following code is part of file thread-alts.py in the book examples package):

import _thread                                       # all 3 print 4294967296

def action(i):                                       # function run in threads
    print(i ** 32)

class Power:
    def __init__(self, i):
        self.i = i
    def action(self):                                # bound method run in threads
        print(self.i ** 32)

_thread.start_new_thread(action, (2,))               # simple function

_thread.start_new_thread((lambda: action(2)), ())    # lambda function to defer

obj = Power(2)
_thread.start_new_thread(obj.action, ())             # bound method object

As we’ll see in larger examples later in this book, bound methods are especially useful in this role—because they remember both the method function and instance object, they also give access to state information and class methods for use within and during the thread.

More fundamentally, because threads all run in the same process, bound methods run by threads reference the original in-process instance object, not a copy of it. Hence, any changes to its state made in a thread will be visible to all threads automatically. Moreover, since bound methods of a class instance pass for callables interchangeably with simple functions, using them in threads this way just works. And as we’ll see later, the fact that they are normal objects also allows them to be stored freely on shared queues.
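Here is a minimal sketch of that last point: several threads run the same bound method, and every change they make lands on the one original instance. The Tally class and its names are made up for this example; it uses threading rather than _thread so that the main thread can join, and a lock so that concurrent updates are not lost.

```python
import threading

class Tally:
    def __init__(self):
        self.count = 0                         # state shared by all threads
        self.lock = threading.Lock()
    def bump(self, times):                     # bound method run in threads
        for _ in range(times):
            with self.lock:                    # synchronize updates to self.count
                self.count += 1

def run(numthreads=4, times=1000):
    """Run obj.bump in several threads and return the instance's final count."""
    obj = Tally()
    threads = [threading.Thread(target=obj.bump, args=(times,))
               for _ in range(numthreads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return obj.count                           # all threads changed the same object

if __name__ == '__main__':
    print(run())
```

Had these been forked processes instead, each child would have bumped its own private copy of the instance and the parent's count would still be zero.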

Running multiple threads

To really understand the power of threads running in parallel, though, we have to do something more long-lived in our threads, just as we did earlier for processes. Let’s mutate the fork-count program of the prior section to use threads. The script in Example 5-6 starts 5 copies of its counter function running in parallel threads.

Example 5-6. PP4E\System\Threads\thread-count.py
"""
thread basics: start 5 copies of a function running in parallel;
uses time.sleep so that the main thread doesn't die too early--
this kills all other threads on some platforms; stdout is shared:
thread outputs may be intermixed in this version arbitrarily.
"""

import _thread as thread, time

def counter(myId, count):                        # function run in threads
    for i in range(count):
        time.sleep(1)                            # simulate real work
        print('[%s] => %s' % (myId, i))

for i in range(5):                               # spawn 5 threads
    thread.start_new_thread(counter, (i, 5))     # each thread loops 5 times

time.sleep(6)
print('Main thread exiting.')                    # don't exit too early

Each parallel copy of the counter function simply counts from zero up to four here and prints a message to standard output for each count.

Notice how this script sleeps for 6 seconds at the end. On the Windows and Linux machines where this has been tested, the main thread shouldn’t exit while any spawned threads are still running if it cares about their work; if it does exit, all spawned threads are terminated immediately. This differs from processes, where spawned children live on when parents exit. Without the sleep here, the spawned threads would die almost immediately after they are started.

This may seem ad hoc, but it isn’t required on all platforms, and programs are usually structured such that the main thread naturally lives as long as the threads it starts. For instance, a user interface may start an FTP download running in a thread, but the download lives a much shorter life than the user interface itself. Later in this section, we’ll also see different ways to avoid this sleep using global locks and flags that let threads signal their completion.

Moreover, we’ll later find that the threading module both provides a join method that lets us wait for spawned threads to finish explicitly, and refuses to allow a program to exit at all if any of its normal threads are still running (which may be useful in this case, but can require extra work to shut down in others). The multiprocessing module we’ll meet later in this chapter also allows spawned children to outlive their parents, though this is largely an artifact of its process-based model.

Now, when Example 5-6 is run on Windows 7 under Python 3.1, here is the output I get:

C:\...\PP4E\System\Threads> python thread-count.py
[1] => 0
[1] => 0
[0] => 0
[1] => 0
[0] => 0
[2] => 0
[3] => 0
[3] => 0

[1] => 1
[3] => 1
[3] => 1
[0] => 1[2] => 1
[3] => 1
[0] => 1[2] => 1
[4] => 1

[1] => 2
[3] => 2[4] => 2
[3] => 2[4] => 2
[0] => 2
[3] => 2[4] => 2
[0] => 2
[2] => 2
[3] => 2[4] => 2
[0] => 2
[2] => 2

...more output omitted...
Main thread exiting.

If this looks odd, it’s because it should. In fact, this demonstrates probably the most unusual aspect of threads. What’s happening here is that the output of the 5 threads run in parallel is intermixed—because all the threaded function calls run in the same process, they all share the same standard output stream (in Python terms, there is just one sys.stdout file between them, which is where printed text is sent). The net effect is that their output can be combined and confused arbitrarily. In fact, this script’s output can differ on each run. This jumbling of output grew even more pronounced in Python 3, presumably due to its new file output implementation.

More fundamentally, when multiple threads can access a shared resource like this, their access must be synchronized to avoid overlap in time—as explained in the next section.

Synchronizing access to shared objects and names

One of the nice things about threads is that they automatically come with a cross-task communications mechanism: objects and namespaces in a process that span the life of threads are shared by all spawned threads. For instance, because every thread runs in the same process, if one Python thread changes a global scope variable, the change can be seen by every other thread in the process, main or child. Similarly, threads can share and change mutable objects in the process’s memory as long as they hold a reference to them (e.g., passed-in arguments). This serves as a simple way for a program’s threads to pass information—exit flags, result objects, event indicators, and so on—back and forth to each other.

The downside to this scheme is that our threads must sometimes be careful to avoid changing global objects and names at the same time. If two threads may change a shared object at once, it’s not impossible that one of the two changes will be lost (or worse, will silently corrupt the state of the shared object completely): one thread may step on the work done so far by another whose operations are still in progress. The extent to which this becomes an issue varies per application, and sometimes it isn’t an issue at all.

But even things that aren’t obviously at risk may be at risk. Files and streams, for example, are shared by all threads in a program; if multiple threads write to one stream at the same time, the stream might wind up with interleaved, garbled data. Example 5-6 of the prior section was a simple demonstration of this phenomenon in action, but it’s indicative of the sorts of clashes in time that can occur when our programs go parallel. Even simple changes can go awry if they might happen concurrently. To be robust, threaded programs need to control access to shared global items like these so that only one thread uses them at once.

Luckily, Python’s _thread module comes with its own easy-to-use tools for synchronizing access to objects shared among threads. These tools are based on the concept of a lock—to change a shared object, threads acquire a lock, make their changes, and then release the lock for other threads to grab. Python ensures that only one thread can hold a lock at any point in time; if others request it while it’s held, they are blocked until the lock becomes available. Lock objects are allocated and processed with simple and portable calls in the _thread module that are automatically mapped to thread locking mechanisms on the underlying platform.

For instance, in Example 5-7, a lock object created by _thread.allocate_lock is acquired and released by each thread around the print call that writes to the shared standard output stream.

Example 5-7. PP4E\System\Threads\thread-count-mutex.py
"""
synchronize access to stdout: because it is shared global,
thread outputs may be intermixed if not synchronized
"""

import _thread as thread, time

def counter(myId, count):                        # function run in threads
    for i in range(count):
        time.sleep(1)                            # simulate real work
        mutex.acquire()
        print('[%s] => %s' % (myId, i))          # print isn't interrupted now
        mutex.release()

mutex = thread.allocate_lock()                   # make a global lock object
for i in range(5):                               # spawn 5 threads
    thread.start_new_thread(counter, (i, 5))     # each thread loops 5 times

time.sleep(6)
print('Main thread exiting.')                    # don't exit too early

Really, this script simply augments Example 5-6 to synchronize prints with a thread lock. The net effect of the additional lock calls in this script is that no two threads will ever execute a print call at the same point in time; the lock ensures mutually exclusive access to the stdout stream. Hence, the output of this script is similar to that of the original version, except that standard output text is never mangled by overlapping prints:

C:\...\PP4E\System\Threads> thread-count-mutex.py
[0] => 0
[1] => 0
[3] => 0
[2] => 0
[4] => 0
[0] => 1
[1] => 1
[3] => 1
[2] => 1
[4] => 1
[0] => 2
[1] => 2
[3] => 2
[4] => 2
[2] => 2
[0] => 3
[1] => 3
[3] => 3
[4] => 3
[2] => 3
[0] => 4
[1] => 4
[3] => 4
[4] => 4
[2] => 4
Main thread exiting.

Though somewhat platform-specific, the order in which the threads check in with their prints may still be arbitrary from run to run because they execute in parallel (getting work done in parallel is the whole point of threads, after all); but they no longer collide in time while printing their text. We’ll see other cases where the lock idiom comes into play later in this chapter; it’s a core component of the multithreading model.
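The lock calls used in the preceding example can also be tried in isolation. The following is a minimal single-threaded sketch; the variable names are made up for illustration, and passing False to acquire requests a nonblocking attempt that reports failure instead of waiting.

```python
import _thread

lock = _thread.allocate_lock()
before = lock.locked()             # a new lock starts out unlocked

lock.acquire()                     # would block if held elsewhere; free here
during = lock.locked()             # lock is now held

second_try = lock.acquire(False)   # nonblocking attempt: fails rather than waits
lock.release()                     # let other acquirers in
after = lock.locked()

print(before, during, second_try, after)
```

When run, this should print False True False False: the second acquire fails because the lock is already held, which is exactly the condition that makes a blocking acquire in another thread wait its turn.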

Waiting for spawned thread exits

Besides avoiding print collisions, thread module locks are surprisingly useful. They can form the basis of higher-level synchronization paradigms (e.g., semaphores) and can be used as general thread communication devices.[15] For instance, Example 5-8 uses a global list of locks to know when all child threads have finished.

Example 5-8. PP4E\System\Threads\thread-count-wait1.py
"""
uses mutexes to know when threads are done in parent/main thread,
instead of time.sleep; lock stdout to avoid commingled prints;
"""

import _thread as thread
stdoutmutex = thread.allocate_lock()
exitmutexes = [thread.allocate_lock() for i in range(10)]

def counter(myId, count):
    for i in range(count):
        stdoutmutex.acquire()
        print('[%s] => %s' % (myId, i))
        stdoutmutex.release()
    exitmutexes[myId].acquire()    # signal main thread

for i in range(10):
    thread.start_new_thread(counter, (i, 100))

for mutex in exitmutexes:
    while not mutex.locked(): pass
print('Main thread exiting.')

A lock’s locked method can be used to check its state. To make this work, the main thread makes one lock per child and tacks them onto a global exitmutexes list (remember, the threaded function shares global scope with the main thread). On exit, each thread acquires its lock on the list, and the main thread simply watches for all locks to be acquired. This is much more accurate than naïvely sleeping while child threads run in hopes that all will have exited after the sleep. Run this on your own to see its output: all 10 spawned threads count from 0 up to 99 (they run in arbitrarily interleaved order that can vary per run and platform, but their prints run atomically and do not commingle) before the main thread exits.

Depending on how your threads run, this could be even simpler: since threads share global memory anyhow, we can usually achieve the same effect with a simple global list of integers instead of locks. In Example 5-9, the module’s namespace (scope) is shared by top-level code and the threaded function, as before. exitmutexes refers to the same list object in the main thread and all threads it spawns. Because of that, changes made in a thread are still noticed in the main thread without resorting to extra locks.

Example 5-9. PP4E\System\Threads\thread-count-wait2.py
"""
uses simple shared global data (not mutexes) to know when threads
are done in parent/main thread; threads share list but not its items,
assumes list won't move in memory once it has been created initially
"""

import _thread as thread
stdoutmutex = thread.allocate_lock()
exitmutexes = [False] * 10

def counter(myId, count):
    for i in range(count):
        stdoutmutex.acquire()
        print('[%s] => %s' % (myId, i))
        stdoutmutex.release()
    exitmutexes[myId] = True  # signal main thread

for i in range(10):
    thread.start_new_thread(counter, (i, 100))

while False in exitmutexes: pass
print('Main thread exiting.')

The output of this script is similar to that of the prior one: 10 threads count from 0 to 99 in parallel, synchronizing their prints along the way. In fact, both of the last two counting thread scripts produce roughly the same output as the original thread-count.py, albeit without stdout corruption and with larger counts and different random ordering of output lines. The main difference is that the main thread exits immediately after (and no sooner than!) the spawned child threads:

C:\...\PP4E\System\Threads> python thread-count-wait2.py
...more deleted...
[4] => 98
[6] => 98
[8] => 98
[5] => 98
[0] => 99
[7] => 98
[9] => 98
[1] => 99
[3] => 99
[2] => 99
[4] => 99
[6] => 99
[8] => 99
[5] => 99
[7] => 99
[9] => 99
Main thread exiting.

Coding alternatives: busy loops, arguments, and context managers

Notice how the main threads of both of the last two scripts fall into busy-wait loops at the end, which might become significant performance drains in tight applications. If so, simply add a time.sleep call in the wait loops to insert a pause between end tests and to free up the CPU for other tasks: this call pauses the calling thread only (in this case, the main one). You might also try experimenting with adding sleep calls to the thread function to simulate real work.

Passing in the lock to threaded functions as an argument instead of referencing it in the global scope might be more coherent, too. When passed in, all threads reference the same object, because they are all part of the same process. Really, the process’s object memory is shared memory for threads, regardless of how objects in that shared memory are referenced (whether through global scope variables, passed argument names, object attributes, or another way).

And while we’re at it, the with statement can be used to ensure that a thread lock is acquired and released around a nested block of code, much like its use to ensure file closure in the prior chapter. The thread lock’s context manager acquires the lock on with statement entry and releases it on statement exit regardless of exception outcomes. The net effect is to save one line of code, but also to guarantee lock release when exceptions are possible. Example 5-10 adds all these coding alternatives to our threaded counter script.

Example 5-10. PP4E\System\Threads\thread-count-wait3.py
"""
passed in mutex object shared by all threads instead of globals;
use with context manager statement for auto acquire/release;
sleep calls added to avoid busy loops and simulate real work
"""

import _thread as thread, time
stdoutmutex = thread.allocate_lock()
numthreads  = 5
exitmutexes = [thread.allocate_lock() for i in range(numthreads)]

def counter(myId, count, mutex):                     # shared object passed in
    for i in range(count):
        time.sleep(1 / (myId+1))                     # diff fractions of second
        with mutex:                                  # auto acquire/release: with
            print('[%s] => %s' % (myId, i))
    exitmutexes[myId].acquire()                      # global: signal main thread

for i in range(numthreads):
    thread.start_new_thread(counter, (i, 5, stdoutmutex))

while not all(mutex.locked() for mutex in exitmutexes): time.sleep(0.25)
print('Main thread exiting.')

When run, the different sleep times per thread make them run more independently:

C:\...\PP4E\System\Threads> thread-count-wait3.py
[4] => 0
[3] => 0
[2] => 0
[4] => 1
[1] => 0
[3] => 1
[4] => 2
[2] => 1
[3] => 2
[4] => 3
[4] => 4
[0] => 0
[1] => 1
[2] => 2
[3] => 3
[3] => 4
[2] => 3
[1] => 2
[2] => 4
[0] => 1
[1] => 3
[1] => 4
[0] => 2
[0] => 3
[0] => 4
Main thread exiting.
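The claim that a lock’s context manager releases the lock regardless of exception outcomes can be checked with a small sketch. This is an illustration only, not part of the book’s examples package; the ValueError is raised deliberately to simulate a failure inside the locked block.

```python
import _thread

lock = _thread.allocate_lock()

try:
    with lock:                              # acquired on entry to the block
        held_inside = lock.locked()         # lock is held while the block runs
        raise ValueError('simulated failure')
except ValueError:
    pass                                    # exception propagated out of the with

held_after = lock.locked()                  # released on exit despite the error
print(held_inside, held_after)
```

This should print True False. With manual acquire and release calls, the same guarantee would require wrapping the block in a try/finally statement.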

Of course, threads are for much more than counting. We’ll put shared global data to more practical use in Adding a User Interface, where it will serve as completion signals from child processing threads transferring data over a network to a main thread controlling a tkinter GUI display, and again later in Chapter 10’s threadtools and Chapter 14’s PyMailGUI to post results of email operations to a GUI (watch for Preview: GUIs and Threads for more pointers on this topic). Global data shared among threads also turns out to be the basis of queues, which are discussed later in this chapter; each thread gets or puts data using the same shared queue object.

The threading Module

The Python standard library comes with two thread modules: _thread, the basic lower-level interface illustrated thus far, and threading, a higher-level interface based on objects and classes. The threading module internally uses the _thread module to implement objects that represent threads and common synchronization tools. It is loosely based on a subset of the Java language’s threading model, but it differs in ways that only Java programmers would notice.[16] Example 5-11 morphs our counting threads example again to demonstrate this new module’s interfaces.

Example 5-11. PP4E\System\Threads\thread-classes.py
"""
thread class instances with state and run() for thread's action;
uses higher-level Java-like threading module object join method (not
mutexes or shared global vars) to know when threads are done in main
parent thread; see library manual for more details on threading;
"""

import threading

class Mythread(threading.Thread):              # subclass Thread object
    def __init__(self, myId, count, mutex):
        self.myId  = myId
        self.count = count                     # per-thread state information
        self.mutex = mutex                     # shared objects, not globals
        threading.Thread.__init__(self)

    def run(self):                             # run provides thread logic
        for i in range(self.count):            # still sync stdout access
            with self.mutex:
                print('[%s] => %s' % (self.myId, i))

stdoutmutex = threading.Lock()                 # same as thread.allocate_lock()
threads = []
for i in range(10):
    thread = Mythread(i, 100, stdoutmutex)     # make/start 10 threads
    thread.start()                             # starts run method in a thread
    threads.append(thread)

for thread in threads:
    thread.join()                              # wait for thread exits
print('Main thread exiting.')

The output of this script is the same as that shown for its ancestors earlier (again, threads may be randomly distributed in time, depending on your platform):

C:\...\PP4E\System\Threads> python thread-classes.py
...more deleted...
[4] => 98
[8] => 97
[9] => 97
[5] => 98
[3] => 99
[6] => 98
[7] => 98
[4] => 99
[8] => 98
[9] => 98
[5] => 99
[6] => 99
[7] => 99
[8] => 99
[9] => 99
Main thread exiting.

Using the threading module this way is largely a matter of specializing classes. Threads in this module are implemented with a Thread object, a Python class which we may customize per application by providing a run method that defines the thread’s action. For example, this script subclasses Thread with its own Mythread class; the run method will be executed by the Thread framework in a new thread when we make a Mythread and call its start method.

In other words, this script simply provides methods expected by the Thread framework. The advantage of taking this more coding-intensive route is that we get both per-thread state information (the usual instance attribute namespace), and a set of additional thread-related tools from the framework “for free.” The Thread.join method used near the end of this script, for instance, waits until the thread exits (by default); we can use this method to prevent the main thread from exiting before its children, rather than using the time.sleep calls and global locks and variables we relied on in earlier threading examples.

The example script also uses threading.Lock to synchronize stream access as before (though this name is really just a synonym for _thread.allocate_lock in the current implementation). The threading module may provide the extra structure of classes, but it doesn’t remove the specter of concurrent updates in the multithreading model in general.
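The “by default” qualifier on join refers to its optional timeout argument. As a rough sketch, the following holds a worker thread back with a lock so the behavior is predictable; the gate and worker names are hypothetical, invented for this illustration.

```python
import threading

gate = threading.Lock()
gate.acquire()                       # worker will block until main releases this

def worker():
    with gate:                       # waits here until the gate is opened
        pass

t = threading.Thread(target=worker)
t.start()

t.join(0.05)                         # wait at most 0.05 seconds, then give up
still_running = t.is_alive()         # worker is still blocked on the gate

gate.release()                       # open the gate so the worker can finish
t.join()                             # no timeout: wait until the thread exits
finished = not t.is_alive()

print(still_running, finished)
```

This should print True True. A join with a timeout returns whether or not the thread has ended, so callers that pass a timeout normally check is_alive afterward to find out which case occurred.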

Other ways to code threads with threading

The Thread class can also be used to start a simple function, or any other type of callable object, without coding subclasses at all—if not redefined, the Thread class’s default run method simply calls whatever you pass to its constructor’s target argument, with any provided arguments passed to args (which defaults to () for none). This allows us to use Thread to run simple functions, too, though this call form is not noticeably simpler than the basic _thread module. For instance, the following code snippets sketch four different ways to spawn the same sort of thread (see four-threads*.py in the examples tree; you can run all four in the same script, but would have to also synchronize prints to avoid overlap):

import threading, _thread
def action(i):
    print(i ** 32)

# subclass with state
class Mythread(threading.Thread):
    def __init__(self, i):
        self.i = i
        threading.Thread.__init__(self)
    def run(self):                                        # redefine run for action
        print(self.i ** 32)
Mythread(2).start()                                       # start invokes run()

# pass action in
thread = threading.Thread(target=(lambda: action(2)))     # run invokes target
thread.start()

# same but no lambda wrapper for state
threading.Thread(target=action, args=(2,)).start()        # callable plus its args

# basic thread module
_thread.start_new_thread(action, (2,))                    # all-function interface

As a rule of thumb, class-based threads may be better if your threads require per-thread state, or can leverage any of OOP’s many benefits in general. Your thread classes don’t necessarily have to subclass Thread, though. In fact, just as in the _thread module, the thread’s target in threading may be any type of callable object. When combined with techniques such as bound methods and nested scope references, the choice between coding techniques becomes even less clear-cut:

# a non-thread class with state, OOP
class Power:
    def __init__(self, i):
        self.i = i
    def action(self):
        print(self.i ** 32)

obj = Power(2)
threading.Thread(target=obj.action).start()        # thread runs bound method

# nested scope to retain state
def action(i):
    def power():
        print(i ** 32)
    return power

threading.Thread(target=action(2)).start()         # thread runs returned function

# both with basic thread module
_thread.start_new_thread(obj.action, ())           # thread runs a callable object
_thread.start_new_thread(action(2), ())

As usual, the threading APIs are as flexible as the Python language itself.

Synchronizing access to shared objects and names revisited

Earlier, we saw how print operations in threads need to be synchronized with locks to avoid overlap, because the output stream is shared by all threads. More formally, threads need to synchronize their changes to any item that may be shared across threads in a process, both objects and namespaces. Depending on a given program’s goals, this might include:

  • Mutable objects in memory (passed or otherwise referenced objects whose lifetimes span threads)

  • Names in global scopes (changeable variables outside thread functions and classes)

  • The contents of modules (each has just one shared copy in the system’s module table)

For instance, even simple global variables can require coordination if concurrent updates are possible, as in Example 5-12.

Example 5-12. PP4E\System\Threads\thread-add-random.py
"prints different results on different runs on Windows 7"

import threading, time
count = 0

def adder():
    global count
    count = count + 1             # update a shared name in global scope
    time.sleep(0.5)               # threads share object memory and global names
    count = count + 1

threads = []
for i in range(100):
    thread = threading.Thread(target=adder, args=())
    thread.start()
    threads.append(thread)

for thread in threads: thread.join()
print(count)

Here, 100 threads are spawned to update the same global scope variable twice (with a sleep between updates to better interleave their operations). When run on Windows 7 with Python 3.1, different runs produce different results:

C:\...\PP4E\System\Threads> thread-add-random.py
189

C:\...\PP4E\System\Threads> thread-add-random.py
200

C:\...\PP4E\System\Threads> thread-add-random.py
194

C:\...\PP4E\System\Threads> thread-add-random.py
191

This happens because threads overlap arbitrarily in time: statements, even the simple assignment statements like those here, are not guaranteed to run to completion by themselves (that is, they are not atomic). As one thread updates the global, it may be using the partial result of another thread’s work in progress. The net effect is this seemingly random behavior. To make this script work correctly, we need to again use thread locks to synchronize the updates—when Example 5-13 is run, it always prints 200 as expected.

Example 5-13. PP4E\System\Threads\thread-add-synch.py
"prints 200 each time, because shared resource access synchronized"

import threading, time
count = 0

def adder(addlock):                 # shared lock object passed in
    global count
    with addlock:
        count = count + 1           # auto acquire/release around stmt
    time.sleep(0.5)
    with addlock:
        count = count + 1           # only 1 thread updating at once

addlock = threading.Lock()
threads = []
for i in range(100):
    thread = threading.Thread(target=adder, args=(addlock,))
    thread.start()
    threads.append(thread)

for thread in threads: thread.join()
print(count)

Although some basic operations in the Python language are atomic and need not be synchronized, you’re probably better off doing so for every potential concurrent update. Not only might the set of atomic operations change over time, but the internal implementation of threads in general can as well (and in fact, it may in Python 3.2, as described ahead).
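One way to see why an assignment like count = count + 1 is not atomic is to look at the bytecode it compiles to. The sketch below assumes CPython and a release that provides dis.get_instructions (Python 3.4 or later, newer than the version used for this chapter’s examples); the bump function is a hypothetical stand-in for the adder function, and exact instruction names vary between Python releases.

```python
import dis

count = 0

def bump():
    global count
    count = count + 1              # one statement, several bytecode steps

# Collect the instruction names that CPython generates for bump
ops = [ins.opname for ins in dis.get_instructions(bump)]
print(ops)

load_at  = ops.index('LOAD_GLOBAL')     # fetch the current value of count
store_at = ops.index('STORE_GLOBAL')    # write the new value back
print(store_at - load_at)               # steps separating the fetch and the store
```

The fetch of the global and the store back to it are separate instructions with the addition between them. If a thread switch happens after one thread’s fetch but before its store, another thread can fetch the same old value, and one of the two increments is lost.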

Of course, this is an artificial example (spawning 100 threads to add twice isn’t exactly a real-world use case for threads!), but it illustrates the issues that threads must address for any sort of potentially concurrent updates to a shared object or name. Luckily, for many or most realistic applications, the queue module of the next section can make thread synchronization an automatic artifact of program structure.

Before we move ahead, I should point out that besides Thread and Lock, the threading module also includes higher-level objects for synchronizing access to shared items (e.g., Semaphore, Condition, Event), many more, in fact, than we have space to cover here; see the library manual for details. For more examples of threads and forks in general, see the remainder of this chapter as well as the examples in the GUI and network scripting parts of this book. We will thread GUIs, for instance, to avoid blocking them, and we will thread and fork network servers to avoid denying service to clients.
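As a brief taste of one of these higher-level objects, the following sketch uses threading.Event as a completion signal in place of the exit locks and flags coded earlier. It is only an illustration under simple assumptions; the results list and worker function are hypothetical names.

```python
import threading

results = []                          # shared object updated by the worker
ready = threading.Event()             # starts out unset

def worker():
    results.append('done')            # do the work first...
    ready.set()                       # ...then wake up anyone waiting

t = threading.Thread(target=worker)
t.start()

ready.wait()                          # blocks without a busy loop until set
print(results, ready.is_set())
t.join()
```

This should print ['done'] True. Unlike the busy-wait loops used with locks and flags earlier, the wait call suspends the main thread until the worker signals, so no CPU time is spent polling.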

We’ll also explore the threading module’s approach to program exits in the absence of join calls in conjunction with queues—our next topic.

The queue Module

You can synchronize your threads’ access to shared resources with locks, but you often don’t have to. As mentioned, realistically scaled threaded programs are often structured as a set of producer and consumer threads, which communicate by placing data on, and taking it off of, a shared queue. As long as the queue synchronizes access to itself, this automatically synchronizes the threads’ interactions.

The Python queue module implements this storage device. It provides a standard queue data structure—a first-in first-out (fifo) list of Python objects, in which items are added on one end and removed from the other. Like normal lists, the queues provided by this module may contain any type of Python object, including both simple types (strings, lists, dictionaries, and so on) and more exotic types (class instances, arbitrary callables like functions and bound methods, and more).

Unlike normal lists, though, the queue object is automatically controlled with thread lock acquire and release operations, such that only one thread can modify the queue at any given point in time. Because of this, programs that use a queue for their cross-thread communication will be thread-safe and can usually avoid dealing with locks of their own for data passed between threads.
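Before adding threads to the mix, the queue’s basic behavior can be seen in a single-threaded sketch. The item values here are arbitrary; the point is only the first-in first-out ordering and the queue.Empty exception raised by a nonblocking get on an empty queue.

```python
import queue

q = queue.Queue()                         # no size limit by default
for item in ('spam', 42, [1, 2]):         # any kind of object may be queued
    q.put(item)

taken = [q.get() for _ in range(3)]       # removed in the order they were added

try:
    q.get(block=False)                    # nothing left: don't wait, raise instead
except queue.Empty:
    drained = True
else:
    drained = False

print(taken, drained)
```

This should print ['spam', 42, [1, 2]] True. A plain q.get() with no arguments would instead block the caller until some other thread put an item on the queue, which is what makes queues convenient for consumer threads.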

Like the other tools in Python’s threading arsenal, queues are surprisingly simple to use. The script in Example 5-14, for instance, spawns two consumer threads that watch for data to appear on the shared queue and four producer threads that place data on the queue periodically after a sleep interval (each of their sleep durations differs to simulate a real, long-running task). In other words, this program runs 7 threads (including the main one), 6 of which access the shared queue in parallel.

Example 5-14. PP4E\System\Threads\queuetest.py
"producer and consumer threads communicating with a shared queue"

numconsumers = 2                  # how many consumers to start
numproducers = 4                  # how many producers to start
nummessages  = 4                  # messages per producer to put

import _thread as thread, queue, time
safeprint = thread.allocate_lock()    # else prints may overlap
dataQueue = queue.Queue()             # shared global, infinite size

def producer(idnum):
    for msgnum in range(nummessages):
        time.sleep(idnum)
        dataQueue.put('[producer id=%d, count=%d]' % (idnum, msgnum))

def consumer(idnum):
    while True:
        time.sleep(0.1)
        try:
            data = dataQueue.get(block=False)
        except queue.Empty:
            pass
        else:
            with safeprint:
                print('consumer', idnum, 'got =>', data)

if __name__ == '__main__':
    for i in range(numconsumers):
        thread.start_new_thread(consumer, (i,))
    for i in range(numproducers):
        thread.start_new_thread(producer, (i,))
    time.sleep(((numproducers-1) * nummessages) + 1)
    print('Main thread exit.')

Before I show you this script’s output, I want to highlight a few points in its code.

Arguments versus globals

Notice how the queue is assigned to a global variable; because of that, it is shared by all of the spawned threads (all of them run in the same process and in the same global scope). Since these threads change an object instead of a variable name, it would work just as well to pass the queue object in to the threaded functions as an argument—the queue is a shared object in memory, regardless of how it is referenced (see queuetest2.py in the examples tree for a full version that does this):

dataQueue = queue.Queue()             # shared object, infinite size

def producer(idnum, dataqueue):
    for msgnum in range(nummessages):
        time.sleep(idnum)
        dataqueue.put('[producer id=%d, count=%d]' % (idnum, msgnum))

def consumer(idnum, dataqueue): ...

if __name__ == '__main__':
    for i in range(numconsumers):
        thread.start_new_thread(consumer, (i, dataQueue))
    for i in range(numproducers):
        thread.start_new_thread(producer, (i, dataQueue))

Program exit with child threads

Also notice how this script exits when the main thread does, even though consumer threads are still running in their infinite loops. This works fine on Windows (and most other platforms)—with the basic _thread module, the program ends silently when the main thread does. This is why we’ve had to sleep in some examples to give threads time to do their work, but is also why we do not need to be concerned about exiting while consumer threads are still running here.

In the alternative threading module, though, the program will not exit if any spawned threads are running, unless they are set to be daemon threads. Specifically, the entire program exits when only daemon threads are left. Threads inherit a default initial daemonic value from the thread that creates them. The initial thread of a Python program is considered not daemonic, though alien threads created outside this module’s control are considered daemonic (including some threads created in C code). To override inherited defaults, a thread object’s daemon flag can be set manually. In other words, nondaemon threads prevent program exit, and programs by default do not exit until all threading-managed threads finish.

This is either a feature or nonfeature, depending on your program—it allows spawned worker threads to finish their tasks in the absence of join calls or sleeps, but it can prevent programs like the one in Example 5-14 from shutting down when they wish. To make this example work with threading, use the following alternative code (see queuetest3.py in the examples tree for a complete version of this, as well as thread-count-threading.py, also in the tree, for a case where this refusal to exit can come in handy):

import threading, queue, time

def producer(idnum, dataqueue): ...

def consumer(idnum, dataqueue): ...

if __name__ == '__main__':
    for i in range(numconsumers):
        thread = threading.Thread(target=consumer, args=(i, dataQueue))
        thread.daemon = True  # else cannot exit!
        thread.start()

    waitfor = []
    for i in range(numproducers):
        thread = threading.Thread(target=producer, args=(i, dataQueue))
        waitfor.append(thread)
        thread.start()

    for thread in waitfor: thread.join()    # or time.sleep() long enough here
    print('Main thread exit.')

We’ll revisit the daemons and exits issue in Chapter 10 while studying GUIs; as we’ll see, it’s no different in that context, except that the main thread is usually the GUI itself.
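
To make the inheritance rule concrete, here is a minimal sketch of how the daemon flag propagates under threading. The report and spawn_child functions and the results dictionary are names I made up for illustration; only the Thread class, its daemon attribute, and current_thread come from the module itself.

```python
import threading

def report(results, name):
    # record the daemon flag of whichever thread runs this function
    results[name] = threading.current_thread().daemon

def spawn_child(results):
    # a thread created here inherits the daemon flag of its creator
    report(results, 'parent')
    child = threading.Thread(target=report, args=(results, 'child'))
    child.start()
    child.join()

results = {}
results['main'] = threading.current_thread().daemon   # flag of the launching thread

parent = threading.Thread(target=spawn_child, args=(results,))
parent.daemon = True                                   # override the inherited default
parent.start()
parent.join()

print(results)
```

The child never sets its own flag, yet it reports the same daemonic value as the thread that created it. In a real program you would normally skip the join calls for daemon threads, since the point of the flag is to let the program exit without waiting for them.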

Running the script

Now, as coded in Example 5-14, the following is the output of this example when run on my Windows machine. Notice that even though the queue automatically coordinates the communication of data between the threads, this script still must use a lock to manually synchronize access to the standard output stream; queues synchronize data passing, but some programs may still need to use locks for other purposes. As in prior examples, if the safeprint lock is not used, the printed lines from one consumer may be intermixed with those of another. That can happen because a consumer may be paused in the middle of a print operation:

C:\...\PP4E\System\Threads> queuetest.py
consumer 1 got => [producer id=0, count=0]
consumer 0 got => [producer id=0, count=1]
consumer 1 got => [producer id=0, count=2]
consumer 0 got => [producer id=0, count=3]
consumer 1 got => [producer id=1, count=0]
consumer 1 got => [producer id=2, count=0]
consumer 0 got => [producer id=1, count=1]
consumer 1 got => [producer id=3, count=0]
consumer 0 got => [producer id=1, count=2]
consumer 1 got => [producer id=2, count=1]
consumer 1 got => [producer id=1, count=3]
consumer 1 got => [producer id=3, count=1]
consumer 0 got => [producer id=2, count=2]
consumer 1 got => [producer id=2, count=3]
consumer 1 got => [producer id=3, count=2]
consumer 1 got => [producer id=3, count=3]
Main thread exit.

Try adjusting the parameters at the top of this script to experiment with different scenarios. A single consumer, for instance, would simulate a GUI’s main thread. Here is the output of a single-consumer run—producers still add to the queue in fairly random fashion, because threads run in parallel with each other and with the consumer:

C:\...\PP4E\System\Threads> queuetest.py
consumer 0 got => [producer id=0, count=0]
consumer 0 got => [producer id=0, count=1]
consumer 0 got => [producer id=0, count=2]
consumer 0 got => [producer id=0, count=3]
consumer 0 got => [producer id=1, count=0]
consumer 0 got => [producer id=2, count=0]
consumer 0 got => [producer id=1, count=1]
consumer 0 got => [producer id=3, count=0]
consumer 0 got => [producer id=1, count=2]
consumer 0 got => [producer id=2, count=1]
consumer 0 got => [producer id=1, count=3]
consumer 0 got => [producer id=3, count=1]
consumer 0 got => [producer id=2, count=2]
consumer 0 got => [producer id=2, count=3]
consumer 0 got => [producer id=3, count=2]
consumer 0 got => [producer id=3, count=3]
Main thread exit.

In addition to the basics used in our script, queues may be fixed or infinite in size, and get and put calls may or may not block; see the Python library manual for more details on queue interface options. Since we just simulated a typical GUI structure, though, let’s explore the notion a bit further.
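
As a quick illustration of those interface options, the following sketch exercises a fixed-size queue with nonblocking and timed calls, and then an unbounded one. The variable names are mine; the calls themselves (Queue with maxsize, put_nowait, get with a timeout, and the Full and Empty exceptions) are standard queue module tools.

```python
import queue

bounded = queue.Queue(maxsize=2)        # fixed size: holds at most 2 items
bounded.put('a')
bounded.put('b')

try:
    bounded.put_nowait('c')             # nonblocking put on a full queue
    overflow = False
except queue.Full:
    overflow = True                     # raised instead of waiting for room

items = [bounded.get(), bounded.get()]  # first in, first out

try:
    bounded.get(timeout=0.1)            # waits up to 0.1 seconds, then gives up
    drained = False
except queue.Empty:
    drained = True

unbounded = queue.Queue()               # default maxsize of 0: no size limit
for i in range(1000):
    unbounded.put(i)                    # never blocks for lack of room

print(overflow, items, drained, unbounded.qsize())
```

A bounded queue is one way to keep fast producers from running arbitrarily far ahead of a slow consumer; whether that matters depends on your program.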

Preview: GUIs and Threads

We will return to threads and queues and see additional thread and queue examples when we study GUIs later in this book. The PyMailGUI example in Chapter 14, for instance, will make extensive use of thread and queue tools introduced here and developed further in Chapter 10, and Chapter 9 will discuss threading in the context of the tkinter GUI toolkit once we’ve had a chance to study it. Although we can’t get into code at this point, threads are usually an integral part of most nontrivial GUIs. In fact, the activity model of many GUIs is a combination of threads, a queue, and a timer-based loop.

Here’s why. In the context of a GUI, any operation that can block or take a long time to complete must be spawned off to run in parallel so that the GUI (the main thread) remains active and continues responding to its users. Although such tasks can be run as processes, the efficiency and shared-state model of threads make them ideal for this role. Moreover, since most GUI toolkits do not allow multiple threads to update the GUI in parallel, updates are best restricted to the main thread.

Because only the main thread should generally update the display, GUI programs typically take the form of a main GUI thread and one or more long-running producer threads—one for each long-running task being performed. To synchronize their points of interface, all of the threads share data on a global queue: non-GUI threads post results, and the GUI thread consumes them.

More specifically:

  • The main thread handles all GUI updates and runs a timer-based loop that wakes up periodically to check for new data on the queue to be displayed on-screen. In Python’s tkinter toolkit, for instance, the widget after(msecs, func, *args) method can be used to schedule queue-check events. Because such events are dispatched by the GUI’s event processor, all GUI updates occur only in this main thread (and often must, due to the lack of thread safety in GUI toolkits).

  • The child threads don’t do anything GUI-related. They just produce data and put it on the queue to be picked up by the main thread. Alternatively, child threads can place a callback function on the queue, to be picked up and run by the main thread. It’s not generally sufficient to simply pass in a GUI update callback function from the main thread to the child thread and run it from there; the function in shared memory will still be executed in the child thread, and potentially in parallel with other threads.

Since threads are much more responsive than a timer event loop in the GUI, this scheme both avoids blocking the GUI (producer threads run in parallel with the GUI), and avoids missing incoming events (producer threads run independent of the GUI event loop and as fast as they can). The main GUI thread will display the queued results as quickly as it can, in the context of a slower GUI event loop.
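
Although real GUI code must wait for later chapters, the structure just described can be sketched without any GUI at all. In the following, a plain loop stands in for the GUI's event loop and a list stands in for the display; the names callbacks, display, producer, and check_queue are all hypothetical. In tkinter, the polling function would instead be rescheduled with the widget after method.

```python
import threading, queue, time

callbacks = queue.Queue()      # shared queue: producers post, main thread consumes
display = []                   # stands in for GUI state; touched by main thread only

def producer(idnum, count):
    # long-running task: never updates 'display' directly
    for msgnum in range(count):
        time.sleep(0.01)                                  # simulate blocking work
        text = '[producer id=%d, count=%d]' % (idnum, msgnum)
        callbacks.put((display.append, (text,)))          # post a callback and its args

def check_queue(maxperevent=5):
    # what a timer event would do: run a few queued callbacks, then return
    for i in range(maxperevent):
        try:
            func, args = callbacks.get_nowait()           # never block the "GUI"
        except queue.Empty:
            break
        func(*args)                                       # runs in the main thread

workers = [threading.Thread(target=producer, args=(i, 4)) for i in range(3)]
for worker in workers: worker.start()

# stand-in for the GUI event loop: wake up periodically and poll the queue
while len(display) < 12:
    check_queue()
    time.sleep(0.005)

for worker in workers: worker.join()
print(len(display), 'results displayed')
```

Limiting the number of callbacks run per timer event is a design choice rather than a requirement; it keeps a burst of queued results from starving other GUI events.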

Also keep in mind that regardless of the thread safety of a GUI toolkit, threaded GUI programs must still adhere to the principles of threaded programs in general—access to shared resources may still need to be synchronized if it falls outside the scope of the producer/consumer shared queue model. If spawned threads might also update another shared state that is used by the main GUI thread, thread locks may also be required to avoid operation overlap. For instance, spawned threads that download and cache email probably cannot overlap with others that use or update the same cache. That is, queues may not be enough; unless you can restrict threads’ work to queuing their results, threaded GUIs still must address concurrent updates.

We’ll see how the threaded GUI model can be realized in code later in this book. For more on this subject, see especially the discussion of threaded tkinter GUIs in Chapter 9, the thread queue tools implemented in Chapter 10, and the PyMailGUI example in Chapter 14.

Later in this chapter, we’ll also meet the multiprocessing module, whose process and queue support offers new options for implementing this GUI model using processes instead of threads; as such, they work around the limitations of the thread GIL, but may incur extra performance overheads that can vary per platform, and may not be directly usable at all in threading contexts (the direct shared and mutable object state of threads is not supported, though messaging is). For now, let’s cover a few final thread fine points.

More on the Global Interpreter Lock

Although it’s a lower-level topic than you generally need to do useful thread work in Python, the implementation of Python’s threads can have impacts on both performance and coding. This section summarizes implementation details and some of their ramifications.

Note

Threads implementation in the upcoming Python 3.2: This section describes the current implementation of threads up to and including Python 3.1. At this writing, Python 3.2 is still in development, but one of its likely enhancements is a new version of the GIL that provides better performance, especially on some multicore CPUs. The new GIL implementation will still synchronize access to the PVM (Python language code is still multiplexed as before), but it will use a context switching scheme that is more efficient than the current N-bytecode-instruction approach.

Among other things, the current sys.setcheckinterval call will likely be replaced with a timer duration call in the new scheme. Specifically, the concept of a check interval for thread switches will be abandoned and replaced by an absolute time duration expressed in seconds. It’s anticipated that this duration will default to 5 milliseconds, but it will be tunable through sys.setswitchinterval.

Moreover, there have been a variety of plans made to remove the GIL altogether (including goals of the Unladen Swallow project being conducted by Google employees), though none have managed to produce any fruit thus far. Since I cannot predict the future, please see Python release documents to follow this (well…) thread.

Strictly speaking, Python currently uses the global interpreter lock (GIL) mechanism introduced at the start of this section, which guarantees that one thread, at most, is running code within the Python interpreter at any given point in time. In addition, to make sure that each thread gets a chance to run, the interpreter automatically switches its attention between threads at regular intervals (in Python 3.1, by releasing and acquiring the lock after a number of bytecode instructions) as well as at the start of long-running operations (e.g., on some file input/output requests).

This scheme avoids problems that could arise if multiple threads were to update Python system data at the same time. For instance, if two threads were allowed to simultaneously change an object’s reference count, the result might be unpredictable. This scheme can also have subtle consequences. In this chapter’s threading examples, for instance, the stdout stream can be corrupted unless each thread’s call to write text is synchronized with thread locks.

Moreover, even though the GIL prevents more than one Python thread from running at the same time, it is not enough to ensure thread safety in general, and it does not address higher-level synchronization issues at all. For example, as we saw, when more than one thread might attempt to update the same variable at the same time, the threads should generally be given exclusive access to the object with locks. Otherwise, it’s not impossible that thread switches will occur in the middle of an update statement’s bytecode.

Locks are not strictly required for all shared object access, especially if a single thread updates an object inspected by other threads. As a rule of thumb, though, you should generally use locks to synchronize threads whenever update rendezvous are possible instead of relying on artifacts of the current thread implementation.

The thread switch interval

Some concurrent updates might work without locks if the thread-switch interval is set high enough to allow each thread to finish without being swapped out. The sys.setcheckinterval(N) call sets the frequency with which the interpreter checks for things like thread switches and signal handlers.

This interval defines the number of bytecode instructions before a switch. It does not need to be reset for most programs, but it can be used to tune thread performance. Setting higher values means switches happen less often: threads incur less overhead but they are less responsive to events. Setting lower values makes threads more responsive to events but increases thread switch overhead.
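
For readers on Python 3.2 or later, the timer-based replacement mentioned in the earlier note can be tried as follows. This is a small sketch that assumes the sys.getswitchinterval and sys.setswitchinterval calls of those releases (sys.setcheckinterval itself was removed in later versions of Python); the 0.05 value is arbitrary.

```python
import sys

original = sys.getswitchinterval()     # seconds between requested thread switches
sys.setswitchinterval(0.05)            # larger value: fewer switches, less overhead
changed = sys.getswitchinterval()
sys.setswitchinterval(original)        # restore the prior setting
print(original, changed)
```

The same trade-off applies as with the bytecode-based interval: larger values reduce switching overhead but make threads less responsive.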

Atomic operations

Because of the way Python uses the GIL to synchronize threads’ access to the virtual machine, whole statements are not generally thread-safe, but each bytecode instruction is. Because of this bytecode indivisibility, some Python language operations are thread-safe—also called atomic, because they run without interruption—and do not require the use of locks or queues to avoid concurrent update issues. For instance, as of this writing, list.append, fetches and some assignments for variables, list items, dictionary keys, and object attributes, and other operations were still atomic in standard C Python; others, such as x = x+1 (and any operation in general that reads data, modifies it, and writes it back) were not.

As mentioned earlier, though, relying on these rules is a bit of a gamble, because they require a deep understanding of Python internals and may vary per release. Indeed, the set of atomic operations may be radically changed if a new free-threaded implementation ever appears. As a rule of thumb, it may be easier to use locks for all access to global and shared objects than to try to remember which types of access may or may not be safe across multiple threads.
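
Following that rule of thumb, here is a minimal sketch of a read-modify-write update guarded by a lock. The totals dictionary, the countlock name, and the adder function are illustrative only; the lock makes the three-step update behave as a single unit regardless of where thread switches fall.

```python
import threading

totals = {'count': 0}                  # shared mutable object, updated by every thread
countlock = threading.Lock()

def adder(repeats):
    for i in range(repeats):
        with countlock:                # acquire on entry, release on exit
            totals['count'] = totals['count'] + 1   # read, modify, write back

threads = [threading.Thread(target=adder, args=(10000,)) for i in range(4)]
for thread in threads: thread.start()
for thread in threads: thread.join()
print(totals['count'])
```

With the lock in place the final count is always 4 times 10000. Without it, the result may or may not come out right, depending on the Python release and timing, which is exactly why relying on it is a gamble.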

C API thread considerations

Finally, if you plan to mix Python with C, also see the thread interfaces described in the Python/C API standard manual. In threaded programs, C extensions must release and reacquire the GIL around long-running operations to let the Python language portions of other Python threads run during the wait. Specifically, the long-running C extension function should release the lock on entry and reacquire it on exit when resuming Python code.

Also note that even though the Python code in Python threads cannot truly overlap in time due to the GIL synchronization, the C-coded portions of threads can. Any number may be running in parallel, as long as they do work outside the scope of the Python virtual machine. In fact, C threads may overlap both with other C threads and with Python language threads run in the virtual machine. Because of this, splitting code off to C libraries is one way that Python applications can still take advantage of multi-CPU machines.

Still, it may often be easier to leverage such machines by simply writing Python programs that fork processes instead of starting threads. The complexity of process and thread code is similar. For more on C extensions and their threading requirements, see Chapter 20. In short, Python includes C language tools (including a pair of GIL management macros) that can be used to wrap long-running operations in C-coded extensions and that allow other Python language threads to run in parallel.

A process-based alternative: multiprocessing (ahead)

By now, you should have a basic grasp of parallel processes and threads, and Python’s tools that support them. Later in this chapter, we’ll revisit both ideas to study the multiprocessing module—a standard library tool that seeks to combine the simplicity and portability of threads with the benefits of processes, by implementing a threading-like API that runs processes instead of threads. It seeks to address the portability issue of processes, as well as the multiple-CPU limitations imposed in threads by the GIL, but it cannot be used as a replacement for forking in some contexts, and it imposes some constraints that threads do not, which stem from its process-based model (for instance, mutable object state is not directly shared because objects are copied across process boundaries, and unpickleable objects such as bound methods cannot be as freely used).

Because the multiprocessing module also implements tools to simplify tasks such as inter-process communication and exit status, though, let’s first get a handle on Python’s support in those domains as well, and explore some more process and thread examples along the way.

Program Exits

As we’ve seen, unlike C, there is no “main” function in Python. When we run a program, we simply execute all of the code in the top-level file, from top to bottom (i.e., in the filename we listed in the command line, clicked in a file explorer, and so on). Scripts normally exit when Python falls off the end of the file, but we may also call for program exit explicitly with tools in the sys and os modules.

sys Module Exits

For example, the built-in sys.exit function ends a program when called, and earlier than normal:

>>> sys.exit(N)           # exit with status N, else exits on end of script

Interestingly, this call really just raises the built-in SystemExit exception. Because of this, we can catch it as usual to intercept early exits and perform cleanup activities; if uncaught, the interpreter exits as usual. For instance:

C:\...\PP4E\System> python
>>> import sys
>>> try:
...     sys.exit()              # see also: os._exit, Tk().quit()
... except SystemExit:
...     print('ignoring exit')
...
ignoring exit
>>>

Programming tools such as debuggers can make use of this hook to avoid shutting down. In fact, explicitly raising the built-in SystemExit exception with a Python raise statement is equivalent to calling sys.exit. More realistically, a try block would catch the exit exception raised elsewhere in a program; the script in Example 5-15, for instance, exits from within a processing function.

Example 5-15. PP4E\System\Exits\testexit_sys.py
def later():
    import sys
    print('Bye sys world')
    sys.exit(42)
    print('Never reached')

if __name__ == '__main__': later()

Running this program as a script causes it to exit before the interpreter falls off the end of the file. But because sys.exit raises a Python exception, importers of its function can trap and override its exit exception or specify a finally cleanup block to be run during program exit processing:

C:\...\PP4E\System\Exits> python testexit_sys.py
Bye sys world

C:\...\PP4E\System\Exits> python
>>> from testexit_sys import later
>>> try:
...     later()
... except SystemExit:
...     print('Ignored...')
...
Bye sys world
Ignored...
>>> try:
...     later()
... finally:
...     print('Cleanup')
...
Bye sys world
Cleanup

C:\...\PP4E\System\Exits>              # interactive session process exits
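
Because sys.exit works by raising SystemExit, a handler can also inspect the status that was passed to it. The following sketch compares the two spellings; the worker and direct functions are made up for the demo, while the code attribute is part of the built-in exception.

```python
import sys

def worker():
    sys.exit(42)                       # raises SystemExit(42) internally

def direct():
    raise SystemExit(42)               # same effect, spelled out

codes = []
for func in (worker, direct):
    try:
        func()
    except SystemExit as exc:
        codes.append(exc.code)         # status passed to the exit call
print(codes)
```

Both calls leave 42 in the list, and the program keeps running because the exception was caught.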

os Module Exits

It’s possible to exit Python in other ways, too. For instance, within a forked child process on Unix, we typically call the os._exit function rather than sys.exit; threads may exit with a _thread.exit call; and tkinter GUI applications often end by calling something named Tk().quit(). We’ll meet the tkinter module later in this book; let’s take a look at os exits here.

When os._exit is called, the calling process exits immediately instead of raising an exception that could be trapped and ignored. In fact, the process also exits without flushing output stream buffers or running cleanup handlers (defined by the atexit standard library module), so this generally should be used only by child processes after a fork, where overall program shutdown actions aren’t desired. Example 5-16 illustrates the basics.

Example 5-16. PP4E\System\Exits\testexit_os.py
def outahere():
    import os
    print('Bye os world')
    os._exit(99)
    print('Never reached')

if __name__ == '__main__': outahere()

Unlike sys.exit, os._exit is immune to both try/except and try/finally interception:

C:\...\PP4E\System\Exits> python testexit_os.py
Bye os world

C:\...\PP4E\System\Exits> python
>>> from testexit_os import outahere
>>> try:
...     outahere()
... except:
...     print('Ignored')
...
Bye os world                                 # exits interactive process

C:\...\PP4E\System\Exits> python
>>> from testexit_os import outahere
>>> try:
...     outahere()
... finally:
...     print('Cleanup')
...
Bye os world                                 # ditto
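
The difference in cleanup handling can also be observed from the outside. This sketch runs the same snippet twice in a fresh interpreter, once ending with sys.exit and once with os._exit, and captures what each prints. The template string and run_child function are hypothetical helpers, and subprocess.run assumes Python 3.5 or later; prints are flushed explicitly so that only the atexit behavior differs.

```python
import subprocess, sys

template = """
import atexit, os, sys
atexit.register(lambda: print('cleanup handler ran', flush=True))
print('exiting', flush=True)
%s
"""

def run_child(exitcall):
    # run the snippet in a new interpreter; collect its output lines and status
    proc = subprocess.run([sys.executable, '-c', template % exitcall],
                          stdout=subprocess.PIPE)
    return proc.stdout.decode().splitlines(), proc.returncode

normal = run_child('sys.exit(3)')      # atexit handler runs before shutdown
forced = run_child('os._exit(3)')      # handler is skipped entirely
print(normal)
print(forced)
```

Both children report status 3, but only the sys.exit version prints the handler's line.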

Shell Command Exit Status Codes

Both the sys and os exit calls we just met accept an argument that denotes the exit status code of the process (it’s optional in the sys call but required by os). After exit, this code may be interrogated in shells and by programs that ran the script as a child process. On Linux, for example, we ask for the status shell variable’s value in order to fetch the last program’s exit status (this is csh syntax; in bash and other Bourne-style shells, the equivalent is echo $?); by convention, a nonzero status generally indicates that some sort of problem occurred:

[mark@linux]$ python testexit_sys.py
Bye sys world
[mark@linux]$ echo $status
42
[mark@linux]$ python testexit_os.py
Bye os world
[mark@linux]$ echo $status
99

In a chain of command-line programs, exit statuses could be checked along the way as a simple form of cross-program communication.

We can also grab hold of the exit status of a program run by another script. For instance, as introduced in Chapters 2 and 3, when launching shell commands, exit status is provided as:

  • The return value of an os.system call

  • The return value of the close method of an os.popen object (for historical reasons, None is returned if the exit status was 0, which means no error occurred)

  • A variety of interfaces in the subprocess module (e.g., the call function’s return value, a Popen object’s returncode attribute and wait method result)

In addition, when running programs by forking processes, the exit status is available through the os.wait and os.waitpid calls in a parent process.

Exit status with os.system and os.popen

Let’s look at the case of the shell commands first. The following, run on Linux, spawns Example 5-15 and Example 5-16, reads their output streams through pipes, and fetches their exit status codes:

[mark@linux]$ python
>>> import os
>>> pipe = os.popen('python testexit_sys.py')
>>> pipe.read()
'Bye sys world\n'
>>> stat = pipe.close()              # returns exit code
>>> stat
10752
>>> hex(stat)
'0x2a00'
>>> stat >> 8                        # extract status from bitmask on Unix-likes
42

>>> pipe = os.popen('python testexit_os.py')
>>> stat = pipe.close()
>>> stat, stat >> 8
(25344, 99)

This code works the same under Cygwin Python on Windows. When using os.popen on such Unix-like platforms, for reasons we won’t go into here, the exit status is actually packed into specific bit positions of the return value; it’s really there, but we need to shift the result right by eight bits to see it. Commands run with os.system send their statuses back directly through the Python library call:

>>> stat = os.system('python testexit_sys.py')
Bye sys world
>>> stat, stat >> 8
(10752, 42)

>>> stat = os.system('python testexit_os.py')
Bye os world
>>> stat, stat >> 8
(25344, 99)

All of this code works under the standard version of Python for Windows, too, though exit status is not encoded in a bit mask (test sys.platform if your code must handle both formats):

C:\...\PP4E\System\Exits> python
>>> os.system('python testexit_sys.py')
Bye sys world
42
>>> os.system('python testexit_os.py')
Bye os world
99

>>> pipe = os.popen('python testexit_sys.py')
>>> pipe.read()
'Bye sys world\n'
>>> pipe.close()
42
>>>
>>> os.popen('python testexit_os.py').close()
99
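
If your code must handle both formats, the decoding step can be wrapped in a small function. The exit_code helper below is hypothetical and assumes the child exited normally rather than being killed by a signal; its platform argument exists only so the logic can be tried on any machine.

```python
import sys

def exit_code(stat, platform=sys.platform):
    """
    Convert an os.system or os.popen(...).close() result to a plain exit code.
    Hypothetical helper: assumes a normal exit, not termination by a signal.
    """
    if stat is None:                   # os.popen close returns None for status 0
        return 0
    if platform.startswith('win'):     # standard Windows Python: not encoded
        return stat
    return stat >> 8                   # Unix-likes: code is in the high byte

print(exit_code(10752, 'linux'), exit_code(42, 'win32'), exit_code(None))
```

Note that Cygwin reports a sys.platform of 'cygwin', so it takes the shifting path here, in line with the earlier Cygwin results. More recent Pythons (3.9 and later) also provide os.waitstatus_to_exitcode for this job.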

Output stream buffering: A first look

Notice that the last test in the preceding code didn’t attempt to read the command’s output pipe. If we do, we may have to run the target script in unbuffered mode with the -u Python command-line flag or change the script to flush its output manually with sys.stdout.flush. Otherwise, the text printed to the standard output stream might not be flushed from its buffer when os._exit is called in this case for immediate shutdown. By default, standard output is fully buffered when connected to a pipe like this; it’s only line-buffered when connected to a terminal:

>>> pipe = os.popen('python testexit_os.py')
>>> pipe.read()                                     # streams not flushed on exit
''

>>> pipe = os.popen('python -u testexit_os.py')     # force unbuffered streams
>>> pipe.read()
'Bye os world\n'

Confusingly, you can pass mode and buffering arguments to specify line buffering in both os.popen and subprocess.Popen, but this won’t help here: arguments passed to these tools pertain to the calling process’s input end of the pipe, not to the spawned program’s output stream:

>>> pipe = os.popen('python testexit_os.py', 'r', 1)   # line buffered only
>>> pipe.read()                                        # but my pipe, not program's!
''

>>> from subprocess import Popen, PIPE
>>> pipe = Popen('python testexit_os.py', bufsize=1, stdout=PIPE)   # for my pipe
>>> pipe.stdout.read()                                              # doesn't help
b''

Really, buffering mode arguments in these tools pertain to output the caller writes to a command’s standard input stream, not to output read from that command.

If required, the spawned script itself can also manually flush its output buffers periodically or before forced exits. More on buffering when we discuss the potential for deadlocks later in this chapter, and again in Chapters 10 and 12 where we’ll see how it applies to sockets. Since we brought up subprocess, though, let’s turn to its exit tools next.
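
The manual-flush option looks like the following in practice. This is a sketch only: the child code is a variation on Example 5-16 held in a string rather than a file, and subprocess.run assumes Python 3.5 or later.

```python
import subprocess, sys

child = """
import os, sys
print('Bye os world')
sys.stdout.flush()        # push buffered text to the pipe before the forced exit
os._exit(99)
"""

proc = subprocess.run([sys.executable, '-c', child], stdout=subprocess.PIPE)
output = proc.stdout.decode().strip()
print(repr(output), proc.returncode)
```

With the flush in place, the text arrives on the pipe even though os._exit skips normal stream shutdown, and no -u flag is needed on the command line.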

Exit status with subprocess

The alternative subprocess module offers exit status in a variety of ways, as we saw in Chapters 2 and 3 (a None value in returncode indicates that the spawned program has not yet terminated):

C:\...\PP4E\System\Exits> python
>>> from subprocess import Popen, PIPE, call
>>> pipe = Popen('python testexit_sys.py', stdout=PIPE)
>>> pipe.stdout.read()
b'Bye sys world\n'
>>> pipe.wait()
42

>>> call('python testexit_sys.py')
Bye sys world
42

>>> pipe = Popen('python testexit_sys.py', stdout=PIPE)
>>> pipe.communicate()
(b'Bye sys world\n', None)
>>> pipe.returncode
42

The subprocess module works the same on Unix-like platforms like Cygwin, but unlike os.popen, the exit status is not encoded, and so it matches the Windows result (note that shell=True is needed to run this as is on Cygwin and Unix-like platforms, as we learned in Chapter 2; on Windows this argument is required only to run commands built into the shell, like dir):

[C:\...\PP4E\System\Exits]$ python
>>> from subprocess import Popen, PIPE, call
>>> pipe = Popen('python testexit_sys.py', stdout=PIPE, shell=True)
>>> pipe.stdout.read()
b'Bye sys world\n'
>>> pipe.wait()
42

>>> call('python testexit_sys.py', shell=True)
Bye sys world
42

Process Exit Status and Shared State

Now, to learn how to obtain the exit status from forked processes, let’s write a simple forking program: the script in Example 5-17 forks child processes and prints child process exit statuses returned by os.wait calls in the parent until a “q” is typed at the console.

Example 5-17. PP4E\System\Exits\testexit_fork.py
"""
fork child processes to watch exit status with os.wait; fork works on Unix
and Cygwin but not standard Windows Python 3.1; note: spawned threads share
globals, but each forked process has its own copy of them (forks share file
descriptors)--exitstat is always the same here but will vary if for threads;
"""

import os
exitstat = 0

def child():                                  # could os.exit a script here
    global exitstat                           # change this process's global
    exitstat += 1                             # exit status to parent's wait
    print('Hello from child', os.getpid(), exitstat)
    os._exit(exitstat)
    print('never reached')

def parent():
    while True:
        newpid = os.fork()                     # start a new copy of process
        if newpid == 0:                        # if in copy, run child logic
            child()                            # loop until 'q' console input
        else:
            pid, status = os.wait()
            print('Parent got', pid, status, (status >> 8))
            if input() == 'q': break

if __name__ == '__main__': parent()

Running this program on Linux, Unix, or Cygwin (remember, fork still doesn’t work on standard Windows Python as I write the fourth edition of this book) produces the following sort of results:

[C:\...\PP4E\System\Exits]$ python testexit_fork.py
Hello from child 5828 1
Parent got 5828 256 1

Hello from child 9540 1
Parent got 9540 256 1

Hello from child 3152 1
Parent got 3152 256 1
q

If you study this output closely, you’ll notice that the exit status (the last number printed) is always the same—the number 1. Because forked processes begin life as copies of the process that created them, they also have copies of global memory. Because of that, each forked child gets and changes its own exitstat global variable without changing any other process’s copy of this variable. At the same time, forked processes copy and thus share file descriptors, which is why prints go to the same place.

Thread Exits and Shared State

In contrast, threads run in parallel within the same process and share global memory. Each thread in Example 5-18 changes the single shared global variable, exitstat.

Example 5-18. PP4E\System\Exits\testexit_thread.py
"""
spawn threads to watch shared global memory change; threads normally exit
when the function they run returns, but _thread.exit() can be called to
exit calling thread; _thread.exit is the same as sys.exit and raising
SystemExit; threads communicate with possibly locked global vars; caveat:
may need to make print/input calls atomic on some platforms--shared stdout;
"""

import _thread as thread
exitstat = 0

def child():
    global exitstat                               # process global names
    exitstat += 1                                 # shared by all threads
    threadid = thread.get_ident()
    print('Hello from child', threadid, exitstat)
    thread.exit()
    print('never reached')

def parent():
    while True:
        thread.start_new_thread(child, ())
        if input() == 'q': break

if __name__ == '__main__': parent()

The following shows this script in action on Windows; unlike forks, threads run in the standard version of Python on Windows, too. Thread identifiers created by Python differ each time—they are arbitrary but unique among all currently active threads and so may be used as dictionary keys to keep per-thread information (a thread’s id may be reused after it exits on some platforms):

C:\...\PP4E\System\Exits> python testexit_thread.py
Hello from child 4908 1

Hello from child 4860 2

Hello from child 2752 3

Hello from child 8964 4
q

Notice how the value of this script’s global exitstat is changed by each thread, because threads share global memory within the process. In fact, this is often how threads communicate in general. Rather than exit status codes, threads assign module-level globals or change shared mutable objects in-place to signal conditions, and they use thread module locks and queues to synchronize access to shared items if needed. This script might need to synchronize, too, if it ever does something more realistic—for global counter changes, but even print and input may have to be synchronized if they overlap stream access badly on some platforms. For this simple demo, we forego locks by assuming threads won’t mix their operations oddly.

As we’ve learned, a thread normally exits silently when the function it runs returns, and the function return value is ignored. Optionally, the _thread.exit function can be called to terminate the calling thread explicitly and silently. This call works almost exactly like sys.exit (but takes no return status argument), and it works by raising a SystemExit exception in the calling thread. Because of that, a thread can also prematurely end by calling sys.exit or by directly raising SystemExit. Be sure not to call os._exit within a thread function, though—doing so can have odd results (the last time I tried, it hung the entire process on my Linux system and killed every thread in the process on Windows!).

The alternative threading module for threads has no method equivalent to _thread.exit(), but since all that the latter does is raise a system-exit exception, doing the same in threading has the same effect—the thread exits immediately and silently, as in the following sort of code (see testexit-threading.py in the example tree for this code):

import threading, sys, time

def action():
   sys.exit()                 # or raise SystemExit()
   print('not reached')

threading.Thread(target=action).start()
time.sleep(2)
print('Main exit')

On a related note, keep in mind that threads and processes have default lifespan models, which we explored earlier. By way of review, when child threads are still running, the two thread modules’ behavior differs—programs on most platforms exit when the parent thread does under _thread, but not normally under threading unless children are made daemons. When using processes, children normally outlive their parent. This different process behavior makes sense if you remember that threads are in-process function calls, but processes are more independent and autonomous.

When used well, exit status can be used to implement error detection and simple communication protocols in systems composed of command-line scripts. But having said that, I should underscore that most scripts do simply fall off the end of the source to exit, and most thread functions simply return; explicit exit calls are generally employed for exceptional conditions and in limited contexts only. More typically, programs communicate with richer tools than integer exit codes; the next section shows how.

Interprocess Communication

As we saw earlier, when scripts spawn threads—tasks that run in parallel within the program—they can naturally communicate by changing and inspecting names and objects in shared global memory. This includes both accessible variables and attributes, as well as referenced mutable objects. As we also saw, some care must be taken to use locks to synchronize access to shared items that can be updated concurrently. Still, threads offer a fairly straightforward communication model, and the queue module can make this nearly automatic for many programs.

Things aren’t quite as simple when scripts start child processes and independent programs that do not share memory in general. If we limit the kinds of communications that can happen between programs, many options are available, most of which we’ve already seen in this and the prior chapters. For example, the following simple mechanisms can all be interpreted as cross-program communication devices:

  • Simple files

  • Command-line arguments

  • Program exit status codes

  • Shell environment variables

  • Standard stream redirections

  • Stream pipes managed by os.popen and subprocess

For instance, sending command-line options and writing to input streams lets us pass in program execution parameters; reading program output streams and exit codes gives us a way to grab a result. Because shell environment variable settings are inherited by spawned programs, they provide another way to pass context in. And pipes made by os.popen or subprocess allow even more dynamic communication. Data can be sent between programs at arbitrary times, not only at program start and exit.
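As a rough illustration of how these simple devices combine, the following sketch passes a command-line argument and an environment variable into a child program, then reads back its output stream and exit status. It assumes a Python new enough to provide subprocess.run (3.5 and later); the child code, the DEMO_SETTING variable name, and the exit status 3 are all made up for the demo.

```python
import os, subprocess, sys

# hypothetical child program, passed inline with -c for brevity
childcode = (
    "import os, sys\n"
    "print(sys.argv[1].upper(), os.environ['DEMO_SETTING'])\n"
    "sys.exit(3)\n"
)

env = dict(os.environ, DEMO_SETTING='from-parent')    # inherited by the child
result = subprocess.run(
    [sys.executable, '-c', childcode, 'spam'],        # 'spam' is a cmdline arg
    env=env, stdout=subprocess.PIPE, universal_newlines=True)

print('output:', result.stdout.strip())               # text from child's stdout
print('status:', result.returncode)                   # child's exit status code
```

Everything here flows at program start or exit only; the pipe tools in the rest of this section are what allow data to move while both programs are still running.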

Beyond this set, there are other tools in the Python library for performing Inter-Process Communication (IPC). These include sockets, shared memory, signals, anonymous and named pipes, and more. Some vary in portability, and all vary in complexity and utility. For instance:

  • Signals allow programs to send simple notification events to other programs.

  • Anonymous pipes allow threads and related processes that share file descriptors to pass data, but generally rely on the Unix-like forking model for processes, which is not universally portable.

  • Named pipes are mapped to the system’s filesystem—they allow completely unrelated programs to converse, but are not available in Python on all platforms.

  • Sockets map to system-wide port numbers—they similarly let us transfer data between arbitrary programs running on the same computer, but also between programs located on remote networked machines, and offer a more portable option.

While some of these can be used as communication devices by threads, too, their full power becomes more evident when leveraged by separate processes which do not share memory at large.

In this section, we explore directly managed pipes (both anonymous and named), as well as signals. We also take a first look at sockets here, but largely as a preview; sockets can be used for IPC on a single machine, but because the larger socket story also involves their role in networking, we’ll save most of their details until the Internet part of this book.

Other IPC tools are available to Python programmers (e.g., shared memory as provided by the mmap module) but are not covered here for lack of space; search the Python manuals and website for more details on other IPC schemes if you’re looking for something more specific.

After this section, we’ll also study the multiprocessing module, which offers additional and portable IPC options as part of its general process-launching API, including shared memory, and pipes and queues of arbitrary pickled Python objects. For now, let’s study traditional approaches first.

Anonymous Pipes

Pipes, a cross-program communication device, are implemented by your operating system and made available in the Python standard library. Pipes are unidirectional channels that work something like a shared memory buffer, but with an interface resembling a simple file on each of two ends. In typical use, one program writes data on one end of the pipe, and another reads that data on the other end. Each program sees only its end of the pipe and processes it using normal Python file calls.

Within the operating system, though, pipes are much more than that. For instance, calls to read a pipe will normally block the caller until data becomes available (i.e., is sent by the program on the other end) instead of returning an end-of-file indicator. Moreover, read calls on a pipe always return the oldest data written to the pipe, resulting in a first-in-first-out model: the first data written is the first to be read. Because of such properties, pipes are also a way to synchronize the execution of independent programs.
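The blocking and first-in-first-out behavior just described can be seen in a small sketch that uses a thread as the writer, so it does not depend on forks. The DELAY value and the writer function are hypothetical, chosen only for this demo.

```python
import os, threading, time

pipein, pipeout = os.pipe()
DELAY = 0.5                                # hypothetical pause before writing

def writer():
    time.sleep(DELAY)                      # reader is blocked during this pause
    os.write(pipeout, b'first')
    os.write(pipeout, b'second')

start = time.time()
thread = threading.Thread(target=writer)
thread.start()
one = os.read(pipein, 5)                   # blocks until the writer sends data
waited = time.time() - start
two = os.read(pipein, 6)                   # oldest unread bytes come out next
thread.join()
print(one, two, 'after %.1f seconds' % waited)
os.close(pipein)
os.close(pipeout)
```

The first read cannot return until the writer wakes up and sends something, which is the synchronization effect in miniature; the second read then picks up where the first left off in the byte stream.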

Pipes come in two flavors—anonymous and named. Named pipes (often called fifos) are represented by a file on your computer. Because named pipes are really external files, the communicating processes need not be related at all; in fact, they can be independently started programs.

By contrast, anonymous pipes exist only within processes and are typically used in conjunction with process forks as a way to link parent and spawned child processes within an application. Parent and child converse over shared pipe file descriptors, which are inherited by spawned processes. Because threads run in the same process and share all global memory in general, anonymous pipes apply to them as well.

Anonymous pipe basics

Since they are more traditional, let’s start with a look at anonymous pipes. To illustrate, the script in Example 5-19 uses the os.fork call to make a copy of the calling process as usual (we met forks earlier in this chapter). After forking, the original parent process and its child copy speak through the two ends of a pipe created with os.pipe prior to the fork. The os.pipe call returns a tuple of two file descriptors (the low-level file identifiers we met in Chapter 4) representing the input and output sides of the pipe. Because forked child processes get copies of their parents’ file descriptors, writing to the pipe’s output descriptor in the child sends data back to the parent on the pipe created before the child was spawned.

Example 5-19. PP4E\System\Processes\pipe1.py
import os, time

def child(pipeout):
    zzz = 0
    while True:
        time.sleep(zzz)                          # make parent wait
        msg = ('Spam %03d' % zzz).encode()       # pipes are binary bytes
        os.write(pipeout, msg)                   # send to parent
        zzz = (zzz+1) % 5                        # goto 0 after 4

def parent():
    pipein, pipeout = os.pipe()                  # make 2-ended pipe
    if os.fork() == 0:                           # copy this process
        child(pipeout)                           # in copy, run child
    else:                                        # in parent, listen to pipe
        while True:
            line = os.read(pipein, 32)           # blocks until data sent
            print('Parent %d got [%s] at %s' % (os.getpid(), line, time.time()))

parent()

If you run this program on Linux, Cygwin, or another Unix-like platform (pipe is available on standard Windows Python, but fork is not), the parent process waits for the child to send data on the pipe each time it calls os.read. It’s almost as if the child and parent act as client and server here—the parent starts the child and waits for it to initiate communication.[17] To simulate differing task durations, the child keeps the parent waiting one second longer between messages with time.sleep calls, until the delay has reached four seconds. When the zzz delay counter hits 005, it rolls back down to 000 and starts again:

[C:\...\PP4E\System\Processes]$ python pipe1.py
Parent 6716 got [b'Spam 000'] at 1267996104.53
Parent 6716 got [b'Spam 001'] at 1267996105.54
Parent 6716 got [b'Spam 002'] at 1267996107.55
Parent 6716 got [b'Spam 003'] at 1267996110.56
Parent 6716 got [b'Spam 004'] at 1267996114.57
Parent 6716 got [b'Spam 000'] at 1267996114.57
Parent 6716 got [b'Spam 001'] at 1267996115.59
Parent 6716 got [b'Spam 002'] at 1267996117.6
Parent 6716 got [b'Spam 003'] at 1267996120.61
Parent 6716 got [b'Spam 004'] at 1267996124.62
Parent 6716 got [b'Spam 000'] at 1267996124.62
Parent 6716 got [b'Spam 001'] at 1267996125.63
...etc.: Ctrl-C to exit...

Notice how the parent received a bytes string through the pipe. Raw pipes normally deal in binary byte strings when their descriptors are used directly this way with the descriptor-based file tools we met in Chapter 4 (as we saw there, descriptor read and write tools in os always return and expect byte strings). That’s why we also have to manually encode to bytes when writing in the child: the message is built with str formatting, because the % formatting operation was not available on bytes in the Python 3.X releases this example was written for (it was added for bytes in Python 3.5). As the next section shows, it’s also possible to wrap a pipe descriptor in a text-mode file object, much as we did in the file examples in Chapter 4, but that object simply performs encoding and decoding automatically on transfers; it’s still bytes in the pipe.

Wrapping pipe descriptors in file objects

If you look closely at the preceding output, you’ll see that when the child’s delay counter hits 004, the parent ends up reading two messages from the pipe at the same time; the child wrote two distinct messages, but on some platforms or configurations (other than that used here) they might be interleaved or processed close enough in time to be fetched as a single unit by the parent. Really, the parent blindly asks to read, at most, 32 bytes each time, but it gets back whatever text is available in the pipe, when it becomes available.
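To see why message boundaries are not preserved, consider this small single-process sketch (not from the example tree). It writes two messages before reading at all, so both are sitting in the pipe when the read call arrives.

```python
import os

pipein, pipeout = os.pipe()
os.write(pipeout, b'Spam 004')             # two separate writes...
os.write(pipeout, b'Spam 000')
data = os.read(pipein, 32)                 # ...fetched by one 32-byte read request
print(data)
os.close(pipein)
os.close(pipeout)
```

On a Unix-like system I would expect the single read to return all 16 bytes run together, with nothing to show where one message ends and the next begins. A pipe is just a byte stream; if the reader needs to tell messages apart, the writer has to mark them somehow.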

To distinguish messages better, we can mandate a separator character in the pipe. An end-of-line makes this easy, because we can wrap the pipe descriptor in a file object with os.fdopen and rely on the file object’s readline method to scan up through the next separator in the pipe. This also lets us leverage the more powerful tools of the text-mode file object we met in Chapter 4. Example 5-20 implements this scheme for the parent’s end of the pipe.

Example 5-20. PP4E\System\Processes\pipe2.py
# same as pipe1.py, but wrap pipe input in stdio file object
# to read by line, and close unused pipe fds in both processes

import os, time

def child(pipeout):
    zzz = 0
    while True:
        time.sleep(zzz)                          # make parent wait
        msg = ('Spam %03d\n' % zzz).encode()     # pipes are binary in 3.X
        os.write(pipeout, msg)                   # send to parent
        zzz = (zzz+1) % 5                        # roll to 0 at 5

def parent():
    pipein, pipeout = os.pipe()                  # make 2-ended pipe
    if os.fork() == 0:                           # in child, write to pipe
        os.close(pipein)                         # close input side here
        child(pipeout)
    else:                                        # in parent, listen to pipe
        os.close(pipeout)                        # close output side here
        pipein = os.fdopen(pipein)               # make text mode input file object
        while True:
            line = pipein.readline()[:-1]        # blocks until data sent
            print('Parent %d got [%s] at %s' % (os.getpid(), line, time.time()))

parent()

This version has also been augmented to close the unused end of the pipe in each process (e.g., after the fork, the parent process closes its copy of the output side of the pipe written by the child); programs should close unused pipe ends in general. Running with this new version reliably returns a single child message to the parent each time it reads from the pipe, because they are separated with markers when written:

[C:\...\PP4E\System\Processes]$ python pipe2.py
Parent 8204 got [Spam 000] at 1267997789.33
Parent 8204 got [Spam 001] at 1267997790.03
Parent 8204 got [Spam 002] at 1267997792.05
Parent 8204 got [Spam 003] at 1267997795.06
Parent 8204 got [Spam 004] at 1267997799.07
Parent 8204 got [Spam 000] at 1267997799.07
Parent 8204 got [Spam 001] at 1267997800.08
Parent 8204 got [Spam 002] at 1267997802.09
Parent 8204 got [Spam 003] at 1267997805.1
Parent 8204 got [Spam 004] at 1267997809.11
Parent 8204 got [Spam 000] at 1267997809.11
Parent 8204 got [Spam 001] at 1267997810.13
...etc.: Ctrl-C to exit...

Notice that this version’s reads also return a text data str object now, per the default r text mode for os.fdopen. As mentioned, pipes normally deal in binary byte strings when their descriptors are used directly with os file tools, but wrapping in text-mode files allows us to use str strings to represent text data instead of bytes. In this example, bytes are decoded to str when read by the parent; using os.fdopen and text mode in the child would allow us to avoid its manual encoding call, but the file object would encode the str data anyhow (though the encoding is trivial for ASCII bytes like those used here). As for simple files, the best mode for processing pipe data is determined by the nature of that data.
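As a sketch of the alternative just mentioned, the following wraps both ends of a pipe in text-mode file objects, so neither side encodes or decodes by hand. It runs in a single process to stay short and portable, and its variable names are hypothetical. It also illustrates one reason for closing pipe ends you no longer need: a reader sees end-of-file only after every descriptor for the write end has been closed.

```python
import os

pipein, pipeout = os.pipe()
writer = os.fdopen(pipeout, 'w')           # text-mode wrapper: str in, bytes out
reader = os.fdopen(pipein, 'r')            # text-mode wrapper: bytes in, str out

for i in range(3):
    writer.write('Spam %03d\n' % i)        # no manual encode call needed
writer.close()                             # flushes buffer, closes the write end

lines = [line.rstrip('\n') for line in reader]   # loop ends at end-of-file
reader.close()
print(lines)
```

If the write end were left open here, the loop over reader would block waiting for more data after the third line instead of finishing.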

Anonymous pipes and threads

Although the os.fork call required by the prior section’s examples isn’t available on standard Windows Python, os.pipe is. Because threads all run in the same process and share file descriptors (and global memory in general), this makes anonymous pipes usable as a communication and synchronization device for threads, too. This is an arguably lower-level mechanism than queues or shared names and objects, but it provides an additional IPC option for threads. Example 5-21, for instance, demonstrates the same type of pipe-based communication occurring between threads instead of processes.

Example 5-21. PP4E\System\Processes\pipe-thread.py
# anonymous pipes and threads, not processes; this version works on Windows

import os, time, threading

def child(pipeout):
    zzz = 0
    while True:
        time.sleep(zzz)                              # make parent wait
        msg = ('Spam %03d' % zzz).encode()           # pipes are binary bytes
        os.write(pipeout, msg)                       # send to parent
        zzz = (zzz+1) % 5                            # goto 0 after 4

def parent(pipein):
    while True:
        line = os.read(pipein, 32)                   # blocks until data sent
        print('Parent %d got [%s] at %s' % (os.getpid(), line, time.time()))

pipein, pipeout = os.pipe()
threading.Thread(target=child, args=(pipeout,)).start()
parent(pipein)

Since threads work on standard Windows Python, this script does too. The output is similar here, but the speakers are in-process threads, not processes (note that because of its simple-minded infinite loops, at least one of its threads may not die on a Ctrl-C—on Windows you may need to use Task Manager to kill the python.exe process running this script or close its window to exit):

C:\...\PP4E\System\Processes> pipe-thread.py
Parent 8876 got [b'Spam 000'] at 1268579215.71
Parent 8876 got [b'Spam 001'] at 1268579216.73
Parent 8876 got [b'Spam 002'] at 1268579218.74
Parent 8876 got [b'Spam 003'] at 1268579221.75
Parent 8876 got [b'Spam 004'] at 1268579225.76
Parent 8876 got [b'Spam 000'] at 1268579225.76
Parent 8876 got [b'Spam 001'] at 1268579226.77
Parent 8876 got [b'Spam 002'] at 1268579228.79
...etc.: Ctrl-C or Task Manager to exit...

Bidirectional IPC with anonymous pipes

Pipes normally let data flow in only one direction—one side is input, one is output. What if you need your programs to talk back and forth, though? For example, one program might send another a request for information and then wait for that information to be sent back. A single pipe can’t generally handle such bidirectional conversations, but two pipes can. One pipe can be used to pass requests to a program and another can be used to ship replies back to the requestor.
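Before turning to a process-based version, here is a minimal sketch of the two-pipe idea using a thread as the responder, which keeps it runnable where forks are not available. The server function, the echo protocol, and all variable names are hypothetical; both writing ends are opened line-buffered so each message is sent as soon as its newline is written.

```python
import os, threading

req_in, req_out = os.pipe()                # pipe 1: requests, client to server
rep_in, rep_out = os.pipe()                # pipe 2: replies, server to client

def server():
    requests = os.fdopen(req_in, 'r')
    replies = os.fdopen(rep_out, 'w', 1)   # 1 = line-buffered text mode
    while True:
        line = requests.readline()         # blocks until a request arrives
        if not line:                       # '' means the client closed its end
            break
        replies.write('Echo: ' + line)     # reply goes back on the other pipe
    requests.close()
    replies.close()

thread = threading.Thread(target=server)
thread.start()

requests = os.fdopen(req_out, 'w', 1)      # line-buffered: sent at each '\n'
replies = os.fdopen(rep_in, 'r')
answers = []
for text in ('spam', 'eggs'):
    requests.write(text + '\n')            # send a request on pipe 1...
    answers.append(replies.readline().rstrip('\n'))   # ...await reply on pipe 2
requests.close()                           # lets the server's loop see end-of-file
thread.join()
replies.close()
print(answers)
```

Each pipe still carries data in one direction only; the conversation is two-way simply because there are two of them.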

This really does have real-world applications. For instance, I once added a GUI interface to a command-line debugger for a C-like programming language by connecting two processes with pipes this way. The GUI ran as a separate process that constructed and sent commands to the non-GUI debugger’s input stream pipe and parsed the results that showed up in the debugger’s output stream pipe. In effect, the GUI acted like a programmer typing commands at a keyboard and a client to the debugger server. More generally, by spawning command-line programs with streams attached by pipes, systems can add new interfaces to legacy programs. In fact, we’ll see a simple example of this sort of GUI program structure in Chapter 10.

The module in Example 5-22 demonstrates one way to apply this idea to link the input and output streams of two programs. Its spawn function forks a new child program and connects the input and output streams of the parent to the output and input streams of the child. That is:

  • When the parent reads from its standard input, it is reading text sent to the child’s standard output.

  • When the parent writes to its standard output, it is sending data to the child’s standard input.

The net effect is that the two independent programs communicate by speaking over their standard streams.

Example 5-22. PP4E\System\Processes\pipes.py
"""
spawn a child process/program, connect my stdin/stdout to child process's
stdout/stdin--my reads and writes map to output and input streams of the
spawned program; much like tying together streams with subprocess module;
"""

import os, sys

def spawn(prog, *args):                       # pass progname, cmdline args
    stdinFd  = sys.stdin.fileno()             # get descriptors for streams
    stdoutFd = sys.stdout.fileno()            # normally stdin=0, stdout=1

    parentStdin, childStdout  = os.pipe()     # make two IPC pipe channels
    childStdin,  parentStdout = os.pipe()     # pipe returns (inputfd, outputfd)
    pid = os.fork()                           # make a copy of this process
    if pid:
        os.close(childStdout)                 # in parent process after fork:
        os.close(childStdin)                  # close child ends in parent
        os.dup2(parentStdin,  stdinFd)        # my sys.stdin copy  = pipe1[0]
        os.dup2(parentStdout, stdoutFd)       # my sys.stdout copy = pipe2[1]
    else:
        os.close(parentStdin)                 # in child process after fork:
        os.close(parentStdout)                # close parent ends in child
        os.dup2(childStdin,  stdinFd)         # my sys.stdin copy  = pipe2[0]
        os.dup2(childStdout, stdoutFd)        # my sys.stdout copy = pipe1[1]
        args = (prog,) + args
        os.execvp(prog, args)                 # new program in this process
        assert False, 'execvp failed!'        # os.exec call never returns here

if __name__ == '__main__':
    mypid = os.getpid()
    spawn('python', 'pipes-testchild.py', 'spam')     # fork child program

    print('Hello 1 from parent', mypid)               # to child's stdin
    sys.stdout.flush()                                # subvert stdio buffering
    reply = input()                                   # from child's stdout
    sys.stderr.write('Parent got: "%s"\n' % reply)    # stderr not tied to pipe!

    print('Hello 2 from parent', mypid)
    sys.stdout.flush()
    reply = sys.stdin.readline()
    sys.stderr.write('Parent got: "%s"\n' % reply[:-1])

The spawn function in this module does not work on standard Windows Python (remember that fork isn’t yet available there today). In fact, most of the calls in this module map straight to Unix system calls (and may be arbitrarily terrifying at first glance to non-Unix developers!). We’ve already met some of these (e.g., os.fork), but much of this code depends on Unix concepts we don’t have time to address well in this text. But in simple terms, here is a brief summary of the system calls demonstrated in this code:

os.fork

Copies the calling process as usual and returns the child’s process ID in the parent process only.

os.execvp

Overlays a new program in the calling process; it’s just like the os.execlp used earlier but takes a tuple or list of command-line argument strings (collected with the *args form in the function header).

os.pipe

Returns a tuple of file descriptors representing the input and output ends of a pipe, as in earlier examples.

os.close(fd)

Closes the descriptor-based file fd.

os.dup2(fd1,fd2)

Copies all system information associated with the file named by the file descriptor fd1 to the file named by fd2.

In terms of connecting standard streams, os.dup2 is the real nitty-gritty here. For example, the call os.dup2(parentStdin,stdinFd) essentially assigns the parent process’s stdin file to the input end of one of the two pipes created; all stdin reads will henceforth come from the pipe. By connecting the other end of this pipe to the child process’s copy of the stdout stream file with os.dup2(childStdout,stdoutFd), text written by the child to its stdout winds up being routed through the pipe to the parent’s stdin stream. The effect is reminiscent of the way we tied together streams with the subprocess module in Chapter 3, but this script is more low-level and less portable.
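If os.dup2 still seems mysterious, the following sketch shows its effect in isolation. To avoid disturbing the real standard streams, it uses a descriptor opened on the null device as a stand-in for a stream descriptor; the name target is hypothetical. In pipes.py the same call is applied to descriptors 0 and 1 instead.

```python
import os

pipein, pipeout = os.pipe()
target = os.open(os.devnull, os.O_WRONLY)  # stand-in for a stream descriptor

os.write(target, b'lost')                  # goes to the null device
os.dup2(pipeout, target)                   # target now refers to the pipe's output end
os.close(pipeout)                          # original descriptor no longer needed
os.write(target, b'routed')                # same descriptor number, new destination

data = os.read(pipein, 32)                 # only the second write reached the pipe
print(data)
os.close(target)
os.close(pipein)
```

Code that writes to target does not need to know anything changed: the descriptor number is the same, but the data now lands in the pipe. That is how print calls in the spawned programs end up feeding a pipe without being modified.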

To test this utility, the self-test code at the end of the file spawns the program shown in Example 5-23 in a child process and reads and writes standard streams to converse with it over two pipes.

Example 5-23. PP4E\System\Processes\pipes-testchild.py
import os, time, sys
mypid     = os.getpid()
parentpid = os.getppid()
sys.stderr.write('Child %d of %d got arg: "%s"\n' %
                                (mypid, parentpid, sys.argv[1]))
for i in range(2):
    time.sleep(3)              # make parent process wait by sleeping here
    recv = input()             # stdin tied to pipe: comes from parent's stdout
    time.sleep(3)
    send = 'Child %d got: [%s]' % (mypid, recv)
    print(send)                # stdout tied to pipe: goes to parent's stdin
    sys.stdout.flush()         # make sure it's sent now or else process blocks

The following is our test in action on Cygwin (it’s similar on other Unix-like platforms like Linux); its output is not incredibly impressive to read, but it represents two programs running independently and shipping data back and forth through a pipe device managed by the operating system. This is even more like a client/server model (if you imagine the child as the server, responding to requests sent from the parent). The text in square brackets in this output went from the parent process to the child and back to the parent again, all through pipes connected to standard streams:

[C:\...\PP4E\System\Processes]$ python pipes.py
Child 9228 of 9096 got arg: "spam"
Parent got: "Child 9228 got: [Hello 1 from parent 9096]"
Parent got: "Child 9228 got: [Hello 2 from parent 9096]"

Output stream buffering revisited: Deadlocks and flushes

The two processes of the prior section’s example engage in a simple dialog, but it’s already enough to illustrate some of the dangers lurking in cross-program communications. First of all, notice that both programs need to write to stderr to display a message; their stdout streams are tied to the other program’s input stream. Because processes share file descriptors, stderr is the same in both parent and child, so status messages show up in the same place.

More subtly, note that both parent and child call sys.stdout.flush after they print text to the output stream. Input requests on pipes normally block the caller if no data is available, but it seems that this shouldn’t be a problem in our example because there are as many writes as there are reads on the other side of the pipe. By default, though, sys.stdout is buffered in this context, so the printed text may not actually be transmitted until some time in the future (when the output buffers fill up). In fact, if the flush calls are not made, both processes may get stuck on some platforms waiting for input from the other—input that is sitting in a buffer and is never flushed out over the pipe. They wind up in a deadlock state, both blocked on input calls waiting for events that never occur.

Technically, by default stdout is just line-buffered when connected to a terminal, but it is fully buffered when connected to other devices such as files, sockets, and the pipes used here. This is why you see a script’s printed text in a shell window immediately as it is produced, but not until the process exits or its buffer fills when its output stream is connected to something else.

This output buffering is really a function of the system libraries used to access pipes, not of the pipes themselves (pipes do queue up output data, but they never hide it from readers!). In fact, it appears to occur in this example only because we copy the pipe’s information over to sys.stdout, a built-in file object that uses stream buffering by default. However, such anomalies can also occur when using other cross-process tools.
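The claim that buffering lives in the file object rather than in the pipe can be checked with a short sketch. It assumes a Unix-like platform, because it uses select.select to ask whether a pipe descriptor has data waiting; the readable helper is hypothetical and written just for this demo.

```python
import os, select

def readable(fd):
    "True if a read on fd would not block (Unix-like select on a pipe)"
    return bool(select.select([fd], [], [], 0)[0])

pipein, pipeout = os.pipe()
out = os.fdopen(pipeout, 'w')              # text mode, default full buffering

out.write('hello\n')                       # held in the file object's buffer
before = readable(pipein)                  # nothing has reached the pipe yet
out.flush()                                # push buffered text into the pipe
after = readable(pipein)                   # now a reader can see it

data = os.read(pipein, 32)
print(before, after, data)
out.close()
os.close(pipein)
```

Until the flush, a reader blocked on the other end would have nothing to wake it, even though the writer has already "printed" its text. That is the setup for the deadlock described above.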

In general terms, if your programs engage in a two-way dialog like this, there are a variety of ways to avoid buffering-related deadlock problems:

  • Flushes: As demonstrated in Examples 5-22 and 5-23, manually flushing output pipe streams by calling the file object flush method is an easy way to force buffers to be cleared. Use sys.stdout.flush for the output stream used by print.

  • Arguments: As introduced earlier in this chapter, the -u Python command-line flag turns off full buffering for the sys.stdout stream in Python programs. Setting your PYTHONUNBUFFERED environment variable to a nonempty value is equivalent to passing this flag but applies to every program run.

  • Open modes: It’s possible to use pipes themselves in unbuffered mode. Either use low-level os module calls to read and write pipe descriptors directly, or pass a buffer size argument of 0 (for unbuffered) or 1 (for line-buffered) to os.fdopen to disable buffering in the file object used to wrap the descriptor. You can use open arguments the same way to control buffering for output to fifo files (described in the next section). Note that in Python 3.X, fully unbuffered mode is allowed only for binary mode files, not text.

  • Command pipes: As mentioned earlier in this chapter, you can similarly specify buffering mode arguments for command-line pipes when they are created by os.popen and subprocess.Popen, but this pertains to the caller’s end of the pipe, not those of the spawned program. Hence it cannot prevent delayed outputs from the latter, but can be used for text sent to another program’s input pipe.

  • Sockets: As we’ll see later, the socket.makefile call accepts a similar buffering mode argument for sockets (described later in this chapter and book), but in Python 3.X this call requires buffering for text-mode access and appears to not support line-buffered mode (more on this in Chapter 12).

  • Tools: For more complex tasks, we can also use higher-level tools that essentially fool a program into believing it is connected to a terminal. These address programs not written in Python, for which neither manual flush calls nor -u are an option. See More on Stream Buffering: pty and Pexpect.
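As a sketch of the open modes option in the list above, the following passes buffer size arguments of 1 and 0 to os.fdopen and checks when data actually shows up in the pipe. Like any select-based pipe test it assumes a Unix-like platform, and the readable helper and variable names are hypothetical.

```python
import os, select

def readable(fd):
    "True if a read on fd would not block (Unix-like select on a pipe)"
    return bool(select.select([fd], [], [], 0)[0])

# line-buffered text: each completed line is sent without an explicit flush
pipein, pipeout = os.pipe()
lines = os.fdopen(pipeout, 'w', 1)
lines.write('partial')
sent_partial = readable(pipein)            # no newline yet: still buffered
lines.write(' line\n')
sent_line = readable(pipein)               # newline written: data is in the pipe
first = os.read(pipein, 32)

# unbuffered binary: every write goes straight to the descriptor
rawin, rawout = os.pipe()
raw = os.fdopen(rawout, 'wb', 0)
raw.write(b'now')
sent_raw = readable(rawin)
second = os.read(rawin, 32)

print(sent_partial, sent_line, first)
print(sent_raw, second)
lines.close()
raw.close()
os.close(pipein)
os.close(rawin)
```

Line-buffered text mode is usually enough for dialogs built from whole lines; the fully unbuffered form requires binary mode, so the sender must supply bytes.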

Threads can avoid blocking a main GUI, too, but really just delegate the problem (the spawned thread will still be deadlocked). Of the options listed, the first two (manual flushes and command-line arguments) are often the simplest solutions. In fact, because it is so useful, the second technique listed above merits a few more words. Try this: comment out all the sys.stdout.flush calls in Examples 5-22 and 5-23 (the files pipes.py and pipes-testchild.py) and change the parent’s spawn call in pipes.py to this (i.e., add a -u command-line argument):

spawn('python', '-u', 'pipes-testchild.py', 'spam')

Then start the program with a command line like this: python -u pipes.py. It will work as it did with the manual stdout flush calls, because stdout will be operating in unbuffered mode in both parent and child.
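The same effect can be observed without the fork-based spawn function, using subprocess so the test also runs where forks are unavailable. In this sketch the child code is hypothetical: it prints one line with no flush call and then sleeps, and the parent kills it once the line has been read so the demo does not have to wait out the sleep.

```python
import subprocess, sys, time

# hypothetical child: prints one line, then stays alive without flushing
childcode = "import time; print('ready'); time.sleep(30)"

start = time.time()
child = subprocess.Popen([sys.executable, '-u', '-c', childcode],
                         stdout=subprocess.PIPE)
line = child.stdout.readline()             # returns as soon as the child prints
elapsed = time.time() - start
child.kill()                               # don't wait out the child's sleep
child.wait()
child.stdout.close()
print(line, 'arrived after %.1f seconds' % elapsed)
```

With the -u flag removed from the command list, I would expect the readline call to sit blocked until the child's buffer is flushed at exit, roughly 30 seconds later, which is the delay that unbuffered mode avoids.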

We’ll revisit the effects of unbuffered output streams in Chapter 10, where we’ll code a simple GUI that displays the output of a non-GUI program by reading it over both a nonblocking socket and a pipe in a thread. We’ll explore the topic again in more depth in Chapter 12, where we will redirect standard streams to sockets in more general ways. Deadlock in general, though, is a bigger problem than we have space to address fully here. On the other hand, if you know enough that you want to do IPC in Python, you’re probably already a veteran of the deadlock wars.

Anonymous pipes allow related tasks to communicate but are not directly suited for independently launched programs. To allow the latter group to converse, we need to move on to the next section and explore devices that have broader visibility.

Named Pipes (Fifos)

On some platforms, it is also possible to create a long-lived pipe that exists as a real named file in the filesystem. Such files are called named pipes (or, sometimes, fifos) because they behave just like the pipes created by the previous section’s programs. Because fifos are associated with a real file on your computer, though, they are external to any particular program—they do not rely on memory shared between tasks, and so they can be used as an IPC mechanism for threads, processes, and independently launched programs.

Once a named pipe file is created, clients open it by name and read and write data using normal file operations. Fifos are unidirectional streams. In typical operation, a server program reads data from the fifo, and one or more client programs write data to it. In addition, a set of two fifos can be used to implement bidirectional communication just as we did for anonymous pipes in the prior section.

Because fifos reside in the filesystem, they are longer-lived than in-process anonymous pipes and can be accessed by programs started independently. The unnamed, in-process pipe examples thus far depend on the fact that file descriptors (including pipes) are copied to child processes’ memory. That makes it difficult to use anonymous pipes to connect programs started independently. With fifos, pipes are accessed instead by a filename visible to all programs running on the computer, regardless of any parent/child process relationships. In fact, like normal files, fifos typically outlive the programs that access them. Unlike normal files, though, the operating system synchronizes fifo access, making them ideal for IPC.

Because of their distinctions, fifo pipes are better suited as general IPC mechanisms for independent client and server programs. For instance, a perpetually running server program may create and listen for requests on a fifo that can be accessed later by arbitrary clients not forked by the server. In a sense, fifos are an alternative to the socket port interface we’ll meet in the next section. Unlike sockets, though, fifos do not directly support remote network connections, are not available in standard Windows Python today, and are accessed using the standard file interface instead of the more unique socket port numbers and calls we’ll study later.

Named pipe basics

In Python, named pipe files are created with the os.mkfifo call, which is available today on Unix-like platforms, including Cygwin’s Python on Windows, but is not currently available in standard Windows Python. This call creates only the external file, though; to send and receive data through a fifo, it must be opened and processed as if it were a standard file.

To illustrate, Example 5-24 is a derivation of the pipe2.py script listed in Example 5-20, but rewritten here to use fifos rather than anonymous pipes. Much like pipe2.py, this script opens the fifo using os.open in the child for low-level byte string access, but with the open built-in in the parent to treat the pipe as text; in general, either end may use either technique to treat the pipe’s data as bytes or text.

Example 5-24. PP4E\System\Processes\pipefifo.py
"""
named pipes; os.mkfifo is not available on Windows (without Cygwin);
there is no reason to fork here, since fifo file pipes are external
to processes--shared fds in parent/child processes are irrelevant;
"""

import os, time, sys
fifoname = '/tmp/pipefifo'                       # must open same name

def child():
    pipeout = os.open(fifoname, os.O_WRONLY)     # open fifo pipe file as fd
    zzz = 0
    while True:
        time.sleep(zzz)
        msg = ('Spam %03d\n' % zzz).encode()     # binary as opened here
        os.write(pipeout, msg)
        zzz = (zzz+1) % 5

def parent():
    pipein = open(fifoname, 'r')                 # open fifo as text file object
    while True:
        line = pipein.readline()[:-1]            # blocks until data sent
        print('Parent %d got "%s" at %s' % (os.getpid(), line, time.time()))

if __name__ == '__main__':
    if not os.path.exists(fifoname):
        os.mkfifo(fifoname)                      # create a named pipe file
    if len(sys.argv) == 1:
        parent()                                 # run as parent if no args
    else:                                        # else run as child process
        child()

Because the fifo exists independently of both parent and child, there’s no reason to fork here. The child may be started independently of the parent as long as it opens a fifo file by the same name. Here, for instance, on Cygwin the parent is started in one shell window and then the child is started in another. Messages start appearing in the parent window only after the child is started and begins writing messages onto the fifo file:

[C:...PP4ESystemProcesses] $ python pipefifo.py           # parent window
Parent 8324 got "Spam 000" at 1268003696.07
Parent 8324 got "Spam 001" at 1268003697.06
Parent 8324 got "Spam 002" at 1268003699.07
Parent 8324 got "Spam 003" at 1268003702.08
Parent 8324 got "Spam 004" at 1268003706.09
Parent 8324 got "Spam 000" at 1268003706.09
Parent 8324 got "Spam 001" at 1268003707.11
Parent 8324 got "Spam 002" at 1268003709.12
Parent 8324 got "Spam 003" at 1268003712.13
Parent 8324 got "Spam 004" at 1268003716.14
Parent 8324 got "Spam 000" at 1268003716.14
Parent 8324 got "Spam 001" at 1268003717.15
...etc: Ctrl-C to exit...

[C:...PP4ESystemProcesses]$ file /tmp/pipefifo            # child window
/tmp/pipefifo: fifo (named pipe)

[C:...PP4ESystemProcesses]$ python pipefifo.py -child
...Ctrl-C to exit...
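
The two-fifo arrangement mentioned earlier for bidirectional conversations might be sketched along the following lines. Treat this as an illustration under stated assumptions rather than a finished design: the fifo names, the serve and ask functions, and the uppercasing "service" are all hypothetical, a thread stands in for what would normally be an independently launched server program, and os.mkfifo limits it to Unix-like platforms.

```python
import os, tempfile, threading

def serve(reqname, repname):
    # server side: read requests from one fifo, write replies to the other
    with open(reqname, 'r') as requests, open(repname, 'w') as replies:
        for line in requests:                 # loop ends when client closes
            replies.write(line.upper())       # hypothetical "service"
            replies.flush()                   # push the reply through now

def ask(reqname, repname, questions):
    # client side: opens the fifos in the same order as the server, so
    # each blocking open call is matched by the other end
    answers = []
    with open(reqname, 'w') as requests, open(repname, 'r') as replies:
        for question in questions:
            requests.write(question + '\n')
            requests.flush()
            answers.append(replies.readline().rstrip('\n'))
    return answers

def converse(questions):
    tmpdir = tempfile.mkdtemp()               # private place for the fifos
    reqname = os.path.join(tmpdir, 'requests')
    repname = os.path.join(tmpdir, 'replies')
    os.mkfifo(reqname)
    os.mkfifo(repname)
    server = threading.Thread(target=serve, args=(reqname, repname))
    server.start()
    try:
        return ask(reqname, repname, questions)
    finally:
        server.join()                         # server exits on end-of-file
        os.remove(reqname)
        os.remove(repname)
        os.rmdir(tmpdir)

if __name__ == '__main__':
    print(converse(['spam', 'eggs']))
```

Both sides open the request fifo first and the reply fifo second. Because opening a fifo blocks until the other end is opened too, a mismatched order could leave the two programs waiting on each other.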

Named pipe use cases

By mapping communication points to a file system entity accessible to all programs run on a machine, fifos can address a broad range of IPC goals on platforms where they are supported. For instance, although this section’s example runs independent programs, named pipes can also be used as an IPC device by both in-process threads and directly forked related processes, much as we saw for anonymous pipes earlier.

By also supporting unrelated programs, though, fifo files are more widely applicable to general client/server models. For example, named pipes can make the GUI and command-line debugger integration I described earlier for anonymous pipes even more flexible—by using fifo files to connect the GUI to the non-GUI debugger’s streams, the GUI could be started independently when needed.

Sockets provide similar functionality but also buy us both inherent network awareness and broader portability to Windows—as the next section explains.

Sockets: A First Look

Sockets, implemented by the Python socket module, are a more general IPC device than the pipes we’ve seen so far. Sockets let us transfer data between programs running on the same computer, as well as programs located on remote networked machines. When used as an IPC mechanism on the same machine, programs connect to sockets by a machine-global port number and transfer data. When used as a networking connection, programs provide both a machine name and port number to transfer data to a remotely-running program.

Socket basics

Although sockets are one of the most commonly used IPC tools, it’s impossible to fully grasp their API without also seeing its role in networking. Because of that, we’ll defer most of our socket coverage until we can explore their use in network scripting in Chapter 12. This section provides a brief introduction and preview, so you can compare with the prior section’s named pipes (a.k.a. fifos). In short:

  • Like fifos, sockets are global across a machine; they do not require shared memory among threads or processes, and are thus applicable to independent programs.

  • Unlike fifos, sockets are identified by port number, not filesystem path name; they employ a very different nonfile API, though they can be wrapped in a file-like object; and they are more portable: they work on nearly every Python platform, including standard Windows Python.

In addition, sockets support networking roles that go beyond both IPC and this chapter’s scope. To illustrate the basics, though, Example 5-25 launches a server and 5 clients in threads running in parallel on the same machine, to communicate over a socket—because all threads connect to the same port, the server consumes the data added by each of the clients.

Example 5-25. PP4E\System\Processes\socket_preview.py
"""
sockets for cross-task communication: start threads to communicate over sockets;
independent programs can too, because sockets are system-wide, much like fifos;
see the GUI and Internet parts of the book for more realistic socket use cases;
some socket servers may also need to talk to clients in threads or processes;
sockets pass byte strings, but can be pickled objects or encoded Unicode text;
caveat: prints in threads may need to be synchronized if their output overlaps;
"""

from socket import socket, AF_INET, SOCK_STREAM     # portable socket api

port = 50008                 # port number identifies socket on machine
host = 'localhost'           # server and client run on same local machine here

def server():
    sock = socket(AF_INET, SOCK_STREAM)         # ip addresses tcp connection
    sock.bind(('', port))                       # bind to port on this machine
    sock.listen(5)                              # allow up to 5 pending clients
    while True:
        conn, addr = sock.accept()              # wait for client to connect
        data = conn.recv(1024)                  # read bytes data from this client
        reply = 'server got: [%s]' % data       # conn is a new connected socket
        conn.send(reply.encode())               # send bytes reply back to client

def client(name):
    sock = socket(AF_INET, SOCK_STREAM)
    sock.connect((host, port))                  # connect to a socket port
    sock.send(name.encode())                    # send bytes data to listener
    reply = sock.recv(1024)                     # receive bytes data from listener
    sock.close()                                # up to 1024 bytes in message
    print('client got: [%s]' % reply)

if __name__ == '__main__':
    from threading import Thread
    sthread = Thread(target=server)
    sthread.daemon = True                       # don't wait for server thread
    sthread.start()                             # do wait for children to exit
    for i in range(5):
        Thread(target=client, args=('client%s' % i,)).start()

Study this script’s code and comments to see how the socket objects’ methods are used to transfer data. In a nutshell, with this type of socket the server accepts a client connection, which by default blocks until a client requests service, and returns a new socket connected to the client. Once connected, the client and server transfer byte strings by using send and receive calls instead of writes and reads, though as we’ll see later in the book, sockets can be wrapped in file objects much as we did earlier for pipe descriptors. Also like pipe descriptors, unwrapped sockets deal in binary bytes strings, not text str; that’s why string formatting results are manually encoded again here.

Here is this script’s output on Windows:

C:...PP4ESystemProcesses> socket_preview.py
client got: [b"server got: [b'client1']"]
client got: [b"server got: [b'client3']"]
client got: [b"server got: [b'client4']"]
client got: [b"server got: [b'client2']"]
client got: [b"server got: [b'client0']"]

This output isn’t much to look at, but each line reflects data sent from client to server, and then back again: the server receives a bytes string from a connected client and echoes it back in a larger reply string. Because all threads run in parallel, the order in which the clients are served is random on this machine.
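
The earlier remark that sockets can be wrapped in file objects can be previewed with a short sketch. It assumes the socket module's makefile method and uses socket.socketpair as a stand-in for the connected sockets that accept and connect would produce; the exchange function name is hypothetical. Because the wrappers here are opened in text mode, they encode and decode for us, so the nested b'...' displays seen in the output above do not arise.

```python
import socket

def exchange(lines):
    # socketpair stands in for the connected sockets that
    # accept and connect produce in a real server and client
    sender, receiver = socket.socketpair()
    with sender, receiver:
        out = sender.makefile('w')            # text-mode wrapper: encodes
        inp = receiver.makefile('r')          # text-mode wrapper: decodes
        for line in lines:
            out.write(line + '\n')
        out.close()                           # flush buffered text to socket
        sender.shutdown(socket.SHUT_WR)       # signal end-of-data to reader
        received = [line.rstrip('\n') for line in inp]
        inp.close()
    return received

if __name__ == '__main__':
    print(exchange(['client0', 'client1']))
```

This sketch writes everything before reading anything, which is fine for a few short lines but would not suit large transfers; real programs usually read and write in separate threads or processes.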

Sockets and independent programs

Although sockets work for threads, the shared memory model of threads often allows them to employ simpler communication devices such as shared names and objects and queues. Sockets tend to shine brighter when used for IPC by separate processes and independently launched programs. Example 5-26, for instance, reuses the server and client functions of the prior example, but runs them in both processes and threads of independently launched programs.

Example 5-26. PP4E\System\Processes\socket-preview-progs.py
"""
same socket, but talk between independent programs too, not just threads;
server here runs in a process and serves both process and thread clients;
sockets are machine-global, much like fifos: don't require shared memory
"""

from socket_preview import server, client         # both use same port number
import sys, os
from threading import Thread

mode = int(sys.argv[1])
if mode == 1:                                     # run server in this process
    server()
elif mode == 2:                                   # run client in this process
    client('client:process=%s' % os.getpid())
else:                                             # run 5 client threads in process
    for i in range(5):
        Thread(target=client, args=('client:thread=%s' % i,)).start()

Let’s run this script on Windows, too (again, this portability is a major advantage of sockets). First, start the server in a process as an independently launched program in its own window; this process runs perpetually waiting for clients to request connections (and as for our prior pipe example you may need to use Task Manager or a window close to kill the server process eventually):

C:...PP4ESystemProcesses> socket-preview-progs.py 1

Now, in another window, run a few clients in both processes and threads by launching them as independent programs. Using 2 as the command-line argument runs a single client process, but 3 spawns five threads to converse with the server in parallel:

C:...PP4ESystemProcesses> socket-preview-progs.py 2
client got: [b"server got: [b'client:process=7384']"]

C:...PP4ESystemProcesses> socket-preview-progs.py 2
client got: [b"server got: [b'client:process=7604']"]

C:...PP4ESystemProcesses> socket-preview-progs.py 3
client got: [b"server got: [b'client:thread=1']"]
client got: [b"server got: [b'client:thread=2']"]
client got: [b"server got: [b'client:thread=0']"]
client got: [b"server got: [b'client:thread=3']"]
client got: [b"server got: [b'client:thread=4']"]

C:...PP4ESystemProcesses> socket-preview-progs.py 3
client got: [b"server got: [b'client:thread=3']"]
client got: [b"server got: [b'client:thread=1']"]
client got: [b"server got: [b'client:thread=2']"]
client got: [b"server got: [b'client:thread=4']"]
client got: [b"server got: [b'client:thread=0']"]

C:...PP4ESystemProcesses> socket-preview-progs.py 2
client got: [b"server got: [b'client:process=6428']"]

Socket use cases

This section’s examples illustrate the basic IPC role of sockets, but this only hints at their full utility. Despite their seemingly limited byte string nature, higher-order use cases for sockets are not difficult to imagine. With a little extra work, for instance:

  • Arbitrary Python objects like lists and dictionaries (or at least copies of them) can be transferred over sockets, too, by shipping the serialized byte strings produced by Python’s pickle module introduced in Chapter 1 and covered in full in Chapter 17.

  • As we’ll see in Chapter 10, the printed output of a simple script can be redirected to a GUI window, by connecting the script’s output stream to a socket on which a GUI is listening in nonblocking mode.

  • Programs that fetch arbitrary text off the Web might read it as byte strings over sockets, but manually decode it using encoding names embedded in content-type headers or tags in the data itself.

  • In fact, the entire Internet can be seen as a socket use case—as we’ll see in Chapter 12, at the bottom, email, FTP, and web pages are largely just formatted byte string messages shipped over sockets.

Plus any other context in which programs exchange data—sockets are a general, portable, and flexible tool. For instance, they would provide the same utility as fifos for the GUI/debugger example used earlier, but would also work in Python on Windows and would even allow the GUI to connect to a debugger running on a different computer altogether. As such, they are seen by many as a more powerful IPC tool.
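
As a rough sketch of the first item in the list above, an object can be pickled to a byte string, sent over a socket, and rebuilt on the other side. The send_object and recv_object names are hypothetical, socket.socketpair again stands in for a real client/server connection, and the end of the message is marked simply by shutting down the sending side, which limits this scheme to one object per connection.

```python
import pickle, socket

def send_object(sock, obj):
    sock.sendall(pickle.dumps(obj))           # object -> byte string -> socket
    sock.shutdown(socket.SHUT_WR)             # no more data: reader sees EOF

def recv_object(sock):
    chunks = []
    while True:
        chunk = sock.recv(1024)               # may arrive in several pieces
        if not chunk: break                   # empty bytes: sender is done
        chunks.append(chunk)
    return pickle.loads(b''.join(chunks))     # byte string -> copy of object

if __name__ == '__main__':
    left, right = socket.socketpair()
    with left, right:
        send_object(left, {'name': 'Bob', 'jobs': ['dev', 'mgr']})
        print(recv_object(right))
```

The receiver gets an equal copy, not the original object. Also keep in mind that unpickling can run arbitrary code, so this technique is appropriate only between programs that trust each other.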

Again, you should consider this section just a preview; because the grander socket story also entails networking concepts, we’ll defer a more in-depth look at the socket API until Chapter 12. We’ll also see sockets again briefly in Chapter 10 in the GUI stream redirection use case listed above, and we’ll explore a variety of additional socket use cases in the Internet part of this book. In Part IV, for instance, we’ll use sockets to transfer entire files and write more robust socket servers that spawn threads or processes to converse with clients to avoid denying connections. For the purposes of this chapter, let’s move on to one last traditional IPC tool—the signal.

Signals

For lack of a better analogy, signals are a way to poke a stick at a process. Programs generate signals to trigger a handler for that signal in another process. The operating system pokes, too: some signals are generated on unusual system events and may kill the program if not handled. If this sounds a little like raising exceptions in Python, it should; signals are software-generated events and the cross-process analog of exceptions. Unlike exceptions, though, signals are identified by number, are not stacked, and are really an asynchronous event mechanism controlled by the operating system, outside the scope of the Python interpreter.

In order to make signals available to scripts, Python provides a signal module that allows Python programs to register Python functions as handlers for signal events. This module is available on both Unix-like platforms and Windows (though the Windows version may define fewer kinds of signals to be caught). To illustrate the basic signal interface, the script in Example 5-27 installs a Python handler function for the signal number passed in as a command-line argument.

Example 5-27. PP4E\System\Processes\signal1.py
"""
catch signals in Python; pass signal number N as a command-line arg,
use a "kill -N pid" shell command to send this process a signal;  most
signal handlers restored by Python after caught (see network scripting
chapter for SIGCHLD details); on Windows, signal module is available,
but it defines only a few signal types there, and os.kill is missing;
"""

import sys, signal, time
def now(): return time.ctime(time.time())        # current time string

def onSignal(signum, stackframe):                # python signal handler
    print('Got signal', signum, 'at', now())     # most handlers stay in effect

signum = int(sys.argv[1])
signal.signal(signum, onSignal)                  # install signal handler
while True: signal.pause()                       # wait for signals (or: pass)

There are only two signal module calls at work here:

signal.signal

Takes a signal number and function object and installs that function to handle that signal number when it is raised. Python automatically restores most signal handlers when signals occur, so there is no need to call this function again within the signal handler itself to reregister the handler. That is, except for SIGCHLD, a signal handler remains installed until explicitly reset (e.g., by setting the handler to SIG_DFL to restore default behavior or to SIG_IGN to ignore the signal). SIGCHLD behavior is platform specific.

signal.pause

Makes the process sleep until the next signal is caught. A time.sleep call is similar but doesn’t work with signals on my Linux box; it generates an interrupted system call error. A busy while True: pass loop here would pause the script, too, but may squander CPU resources.

Here is what this script looks like running on Cygwin on Windows (it works the same on other Unix-like platforms like Linux): a signal number to watch for (12) is passed in on the command line, and the program is made to run in the background with an & shell operator (available in most Unix-like shells):

[C:...PP4ESystemProcesses]$ python signal1.py 12 &
[1] 8224

$ ps
      PID    PPID    PGID     WINPID  TTY  UID    STIME COMMAND
I    8944       1    8944       8944  con 1004 18:09:54 /usr/bin/bash
     8224    7336    8224      10020  con 1004 18:26:47 /usr/local/bin/python
     8380    7336    8380        428  con 1004 18:26:50 /usr/bin/ps

$ kill -12 8224
Got signal 12 at Sun Mar  7 18:27:28 2010

$ kill -12 8224
Got signal 12 at Sun Mar  7 18:27:30 2010

$ kill -9 8224
[1]+  Killed                  python signal1.py 12

Inputs and outputs can be a bit jumbled here because the process prints to the same screen used to type new shell commands. To send the program a signal, the kill shell command takes a signal number and a process ID to be signaled (8224); every time a new kill command sends a signal, the process replies with a message generated by a Python signal handler function. Signal 9 always kills the process altogether.

The signal module also exports a signal.alarm function for scheduling a SIGALRM signal to occur at some number of seconds in the future. To trigger and catch timeouts, set the alarm and install a SIGALRM handler as shown in Example 5-28.

Example 5-28. PP4E\System\Processes\signal2.py
"""
set and catch alarm timeout signals in Python; time.sleep doesn't play
well with alarm (or signals in general on my Linux PC), so we call
signal.pause here to do nothing until a signal is received;
"""

import sys, signal, time
def now(): return time.asctime()

def onSignal(signum, stackframe):                 # python signal handler
    print('Got alarm', signum, 'at', now())       # most handlers stay in effect

while True:
    print('Setting at', now())
    signal.signal(signal.SIGALRM, onSignal)       # install signal handler
    signal.alarm(5)                               # do signal in 5 seconds
    signal.pause()                                # wait for signals

Running this script on Cygwin on Windows causes its onSignal handler function to be invoked every five seconds:

[C:...PP4ESystemProcesses]$ python signal2.py
Setting at Sun Mar  7 18:37:10 2010
Got alarm 14 at Sun Mar  7 18:37:15 2010
Setting at Sun Mar  7 18:37:15 2010
Got alarm 14 at Sun Mar  7 18:37:20 2010
Setting at Sun Mar  7 18:37:20 2010
Got alarm 14 at Sun Mar  7 18:37:25 2010
Setting at Sun Mar  7 18:37:25 2010
Got alarm 14 at Sun Mar  7 18:37:30 2010
Setting at Sun Mar  7 18:37:30 2010
...Ctrl-C to exit...
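
The same alarm machinery can be turned into a simple timeout for a slow call. The following is a minimal sketch, not a general-purpose tool: the Timeout class and run_with_timeout function are hypothetical names, returning None on timeout is an arbitrary choice, and the code assumes a Unix-like platform, the main thread, and a Python 3 release in which an exception raised by a handler propagates out of the interrupted call.

```python
import signal, time

class Timeout(Exception): pass                 # hypothetical exception type

def onAlarm(signum, stackframe):
    raise Timeout()                            # surfaces in the main thread

def run_with_timeout(func, seconds):
    previous = signal.signal(signal.SIGALRM, onAlarm)
    signal.alarm(seconds)                      # schedule a SIGALRM
    try:
        return func()                          # interrupted if alarm fires
    except Timeout:
        return None                            # arbitrary "timed out" result
    finally:
        signal.alarm(0)                        # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)   # put prior handler back

if __name__ == '__main__':
    print(run_with_timeout(lambda: 'done', 2))
    print(run_with_timeout(lambda: time.sleep(10), 1))
```

Canceling the alarm and restoring the previous handler in the finally clause keeps a stray SIGALRM from firing later in unrelated code.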

Generally speaking, signals must be used with cautions not made obvious by the examples we’ve just seen. For instance, some system calls don’t react well to being interrupted by signals, and only the main thread can install signal handlers and respond to signals in a multithreaded program.

When used well, though, signals provide an event-based communication mechanism. They are less powerful than data streams such as pipes, but are sufficient in situations in which you just need to tell a program that something important has occurred and don’t need to pass along any details about the event itself. Signals are sometimes also combined with other IPC tools. For example, an initial signal may inform a program that a client wishes to communicate over a named pipe—the equivalent of tapping someone’s shoulder to get their attention before speaking. Most platforms reserve one or more SIGUSR signal numbers for user-defined events of this sort. Such an integration structure is sometimes an alternative to running a blocking input call in a spawned thread.

See also the os.kill(pid, sig) call for sending signals to known processes from within a Python script on Unix-like platforms, much like the kill shell command used earlier; the required process ID can be obtained from the os.fork call’s child process ID return value or from other interfaces. Like os.fork, this call is also available in Cygwin Python, but not in standard Windows Python. Also watch for the discussion about using signal handlers to clean up “zombie” processes in Chapter 12.
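
To make the os.kill call concrete, here is a small sketch in which a process signals itself with one of the user-defined signals. The poke_self and onUser names are hypothetical, and the code assumes a Unix-like platform where signal.SIGUSR1 is defined; a real program would normally pass some other process's ID instead of its own.

```python
import os, signal, time

received = []                                  # signal numbers seen so far

def onUser(signum, stackframe):
    received.append(signum)                    # just record the event

def poke_self(timeout=2.0):
    del received[:]                            # start with a clean slate
    previous = signal.signal(signal.SIGUSR1, onUser)
    try:
        os.kill(os.getpid(), signal.SIGUSR1)   # much like shell "kill"
        deadline = time.time() + timeout
        while not received and time.time() < deadline:
            time.sleep(0.01)                   # let the handler run
        return list(received)
    finally:
        signal.signal(signal.SIGUSR1, previous)   # put prior handler back

if __name__ == '__main__':
    print(poke_self())
```

The short wait loop is there because a Python-level handler runs only when the interpreter next checks for pending signals, not necessarily at the instant the signal arrives.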

The multiprocessing Module

Now that you know about IPC alternatives and have had a chance to explore processes, threads, and both process nonportability and thread GIL limitations, it turns out that there is another alternative that aims to provide the best of both worlds. As mentioned earlier, Python’s standard library multiprocessing package allows scripts to spawn processes using an API very similar to that of the threading module.

This relatively new package works on both Unix and Windows, unlike low-level process forks. It supports a process spawning model which is largely platform-neutral, and provides tools for related goals, such as IPC, including locks, pipes, and queues. In addition, because it uses processes instead of threads to run code in parallel, it effectively works around the limitations of the thread GIL. Hence, multiprocessing allows the programmer to leverage the capacity of multiple processors for parallel tasks, while retaining much of the simplicity and portability of the threading model.

Why multiprocessing?

So why learn yet another parallel processing paradigm and toolkit when we already have the threads, processes, and IPC tools like sockets, pipes, and thread queues that we’ve studied so far? Before we get into the details, I want to begin with a few words about why you may (or may not) care about this package. In more specific terms, although this module’s performance may not compete with that of pure threads or process forks for some applications, this module offers a compelling solution for many:

  • Compared to raw process forks, you gain cross-platform portability and powerful IPC tools.

  • Compared to threads, you essentially trade some potential and platform-dependent extra task start-up time for the ability to run tasks in truly parallel fashion on multi-core or multi-CPU machines.

On the other hand, this module imposes some constraints and tradeoffs that threads do not:

  • Since objects are copied across process boundaries, shared mutable state does not work as it does for threads—changes in one process are not generally noticed in the other. Really, freely shared state may be the most compelling reason to use threads; its absence in this module may prove limiting in some threading contexts.

  • Because this module requires pickleability for its processes on Windows, as well as for some of its IPC tools in general, some coding paradigms are difficult or nonportable, especially if they use bound methods or pass unpickleable objects such as sockets to spawned processes.

For instance, common coding patterns with lambda that work for the threading module cannot be used as process target callables in this module on Windows, because they cannot be pickled. Similarly, because bound object methods are also not pickleable, a threaded program may require a more indirect design if it either runs bound methods in its threads or implements thread exit actions by posting arbitrary callables (possibly including bound methods) on shared queues. The in-process model of threads supports such direct lambda and bound method use, but the separate processes of multiprocessing do not.

In fact, we’ll write a thread manager for GUIs in Chapter 10 that relies on queueing in-process callables this way to implement thread exit actions: the callables are queued by worker threads, and fetched and dispatched by the main thread. Because the threaded PyMailGUI program we’ll code in Chapter 14 both uses this manager to queue bound methods for thread exit actions and runs bound methods as the main action of a thread itself, it could not be directly translated to the separate process model implied by multiprocessing.

Without getting into too many details here, to use multiprocessing, PyMailGUI’s actions might have to be coded as simple functions or complete process subclasses for pickleability. Worse, they may have to be implemented as simpler action identifiers dispatched in the main process, if they update either the GUI itself or object state in general: pickling results in an object copy in the receiving process, not a reference to the original, and forks on Unix essentially copy an entire process. Updating the state of a mutable message cache copied by pickling it to pass to a new process, for example, has no effect on the original.
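
The copy semantics at work here can be seen without spawning any processes at all, because they come from pickling itself. In this small sketch, the cache dictionary is a hypothetical stand-in for a message cache, and the pickle round trip simulates what a receiving process would be handed.

```python
import pickle

cache = {'inbox': ['msg1', 'msg2']}            # hypothetical mutable state

clone = pickle.loads(pickle.dumps(cache))      # what a receiver would get
clone['inbox'].append('msg3')                  # update made "in the child"

print(cache)                                   # original is unchanged
print(clone)                                   # only the copy has msg3
```

Any change that must be visible to the original process has to be sent back explicitly, for example as a message that the main process applies itself.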

The pickleability constraints for process arguments on Windows can limit multiprocessing’s scope in other contexts as well. For instance, in Chapter 12, we’ll find that this module doesn’t directly solve the lack of portability for the os.fork call for traditionally coded socket servers on Windows, because connected sockets are not pickled correctly when passed into a new process created by this module to converse with a client. In this context, threads provide a more portable and likely more efficient solution.

Applications that pass simpler types of messages, of course, may fare better. Message constraints are easier to accommodate when they are part of an initial process-based design. Moreover, other tools in this module, such as its managers and shared memory API, while narrowly focused and not as general as shared thread state, offer additional mutable state options for some programs.

Fundamentally, though, because multiprocessing is based on separate processes, it may be best geared for tasks which are relatively independent, do not share mutable object state freely, and can make do with the message passing and shared memory tools provided by this module. This includes many applications, but this module is not necessarily a direct replacement for every threaded program, and it is not an alternative to process forks in all contexts.

To truly understand this package’s benefits, as well as its tradeoffs, let’s turn to a first example and explore this package’s implementation along the way.

The Basics: Processes and Locks

We don’t have space to do full justice to this sophisticated module in this book; see its coverage in the Python library manual for the full story. But as a brief introduction, by design most of this module’s interfaces mirror the threading and queue modules we’ve already met, so they should already seem familiar. For example, the multiprocessing module’s Process class is intended to mimic the threading module’s Thread class we met earlier—it allows us to launch a function call in parallel with the calling script; with this module, though, the function runs in a process instead of a thread. Example 5-29 illustrates these basics in action:

Example 5-29. PP4E\System\Processes\multi1.py
"""
multiprocess basics: Process works like threading.Thread, but
runs function call in parallel in a process instead of a thread;
locks can be used to synchronize, e.g. prints on some platforms;
starts new interpreter on windows, forks a new process on unix;
"""

import os
from multiprocessing import Process, Lock

def whoami(label, lock):
    msg = '%s: name:%s, pid:%s'
    with lock:
        print(msg % (label, __name__, os.getpid()))

if __name__ == '__main__':
    lock = Lock()
    whoami('function call', lock)

    p = Process(target=whoami, args=('spawned child', lock))
    p.start()
    p.join()

    for i in range(5):
        Process(target=whoami, args=(('run process %s' % i), lock)).start()

    with lock:
        print('Main process exit.')

When run, this script first calls a function directly and in-process; then launches a call to that function in a new process and waits for it to exit; and finally spawns five function call processes in parallel in a loop—all using an API identical to that of the threading.Thread model we studied earlier in this chapter. Here’s this script’s output on Windows; notice how the five child processes spawned at the end of this script outlive their parent, as is the usual case for processes:

C:...PP4ESystemProcesses> multi1.py
function call: name:__main__, pid:8752
spawned child: name:__main__, pid:9268
Main process exit.
run process 3: name:__main__, pid:9296
run process 1: name:__main__, pid:8792
run process 4: name:__main__, pid:2224
run process 2: name:__main__, pid:8716
run process 0: name:__main__, pid:6936

Just like the threading.Thread class we met earlier, the multiprocessing.Process object can either be passed a target with arguments (as done here) or subclassed to redefine its run action method. Its start method invokes its run method in a new process, and the default run simply calls the passed-in target. Also like threading, a join method waits for child process exit, and a Lock object is provided as one of a handful of process synchronization tools; it’s used here to ensure that prints don’t overlap among processes on platforms where this might matter (it may not on Windows).
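
The subclassing alternative just described might look like the following sketch. The Whoami class and its label attribute are hypothetical names chosen for illustration; the only parts taken from the package itself are Process and its run, start, and join methods.

```python
import os
from multiprocessing import Process

class Whoami(Process):
    def __init__(self, label):
        Process.__init__(self)                 # run superclass setup first
        self.label = label                     # simple, pickleable state

    def run(self):                             # invoked in new process by start
        print('%s: pid:%s' % (self.label, os.getpid()))

if __name__ == '__main__':
    children = [Whoami('subclass %s' % i) for i in range(3)]
    for child in children: child.start()       # each runs its run in parallel
    for child in children: child.join()        # wait for all children to exit
    print('Main process exit.')
```

State for the task is stored on the instance instead of being passed as target arguments, and the top-level logic is nested under a __name__ test so that the file can be imported safely by a newly started interpreter.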

Implementation and usage rules

Technically, to achieve its portability, this module currently works by selecting from platform-specific alternatives:

  • On Unix, it forks a new child process and invokes the Process object’s run method in the new child.

  • On Windows, it spawns a new interpreter by using Windows-specific process creation tools, passing the pickled Process object in to the new process over a pipe, and starting a “python -c” command line in the new process, which runs a special Python-coded function in this package that reads and unpickles the Process and invokes its run method.

We met pickling briefly in Chapter 1, and we will study it further later in this book. The implementation is a bit more complex than this, and is prone to change over time, of course, but it’s really quite an amazing trick. While the portable API generally hides these details from your code, its basic structure can still have subtle impacts on the way you’re allowed to use it. For instance:

  • On Windows, the main process’s logic should generally be nested under a __name__ == '__main__' test as done here when using this module, so it can be imported freely by a new interpreter without side effects. As we’ll learn in more detail in Chapter 17, unpickling classes and functions requires an import of their enclosing module, and this is the root of this requirement.

  • Moreover, when globals are accessed in child processes on Windows, their values may not be the same as those in the parent at start time, because their module will be imported into a new process.

  • Also on Windows, all arguments to Process must be pickleable. Because this includes target, targets should be simple functions so they can be pickled; they cannot be bound or unbound object methods and cannot be functions created with a lambda. See pickle in Python’s library manual for more on pickleability rules; nearly every object type works, but callables like functions and classes must be importable—they are pickled by name only, and later imported to recreate bytecode. On Windows, objects with system state, such as connected sockets, won’t generally work as arguments to a process target either, because they are not pickleable.

  • Similarly, instances of custom Process subclasses must be pickleable on Windows as well. This includes all their attribute values. Objects available in this package (e.g., Lock in Example 5-29) are pickleable, and so may be used as both Process constructor arguments and subclass attributes.

  • IPC objects in this package that appear in later examples like Pipe and Queue accept only pickleable objects, because of their implementation (more on this in the next section).

  • On Unix, although a child process can make use of a shared global item created in the parent, it’s better to pass the object as an argument to the child process’s constructor, both for portability to Windows and to avoid potential problems if such objects were garbage collected in the parent.
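Because so many of these rules come down to pickleability, it can help to test an object before handing it to Process. The following is a small sketch under that assumption; is_pickleable is a hypothetical helper, and the set of exceptions it catches is deliberately broad because the error raised can vary by object type and Python release:

```python
import os.path
import pickle

def is_pickleable(obj):
    """
    True if obj survives pickle.dumps, as Process arguments must on Windows
    """
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True

if __name__ == '__main__':
    candidates = [
        ('importable function', os.path.join),            # pickled by name
        ('plain data',          ['spam', 42, {'a': 1}]),
        ('lambda',              lambda x: x * 2),         # no importable name
    ]
    for label, obj in candidates:
        print('%-20s => %s' % (label, is_pickleable(obj)))
```

The first two candidates should pass and the lambda should fail, which mirrors the restrictions on process targets listed above.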

There are additional rules documented in the library manual. In general, though, if you stick to passing in shared objects to processes and using the synchronization and communication tools provided by this package, your code will usually be portable and correct. Let’s look next at a few of those tools in action.

IPC Tools: Pipes, Shared Memory, and Queues

While the processes created by this package can always communicate using general system-wide tools like the sockets and fifo files we met earlier, the multiprocessing module also provides portable message passing tools specifically geared to this purpose for the processes it spawns:

  • Its Pipe object provides an anonymous pipe, which serves as a connection between two processes. When called, Pipe returns two Connection objects that represent the ends of the pipe. Pipes are bidirectional by default, and allow arbitrary pickleable Python objects to be sent and received. On Unix they are implemented internally today with either a connected socket pair or the os.pipe call we met earlier, and on Windows with named pipes specific to that platform. Much like the Process object described earlier, though, the Pipe object’s portable API spares callers from such things.

  • Its Value and Array objects implement shared process/thread-safe memory for communication between processes. These calls return scalar and array objects based in the ctypes module and created in shared memory, with access synchronized by default.

  • Its Queue object serves as a FIFO list of Python objects, which allows multiple producers and consumers. A queue is essentially a pipe with extra locking mechanisms to coordinate more arbitrary accesses, and inherits the pickleability constraints of Pipe.

Because these devices are safe to use across multiple processes, they can often serve to synchronize points of communication and obviate lower-level tools like locks, much the same as the thread queues we met earlier. As usual, a pipe (or a pair of them) may be used to implement a request/reply model. Queues support more flexible models; in fact, a GUI that wishes to avoid the limitations of the GIL might use the multiprocessing module’s Process and Queue to spawn long-running tasks that post results, rather than threads. As mentioned, although this may incur extra start-up overhead on some platforms, unlike threads today, tasks coded this way can be as truly parallel as the underlying platform allows.

One constraint worth noting here: this package’s pipes (and by proxy, queues) pickle the objects passed through them, so that they can be reconstructed in the receiving process (as we’ve seen, on Windows the receiver process may be a fully independent Python interpreter). Because of that, they do not support unpickleable objects; as suggested earlier, this includes some callables like bound methods and lambda functions (see file multi-badq.py in the book examples package for a demonstration of code that violates this constraint). Objects with system state, such as sockets, may fail as well. Most other Python object types, including classes and simple functions, work fine on pipes and queues.

Also keep in mind that because they are pickled, objects transferred this way are effectively copied in the receiving process; direct in-place changes to mutable objects’ state won’t be noticed in the sender. This makes sense if you remember that this package runs independent processes with their own memory spaces; state cannot be as freely shared as in threading, regardless of which IPC tools you use.
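To see this copy behavior in isolation, the following sketch sends a list through a pipe and then changes what comes out of the other end. For brevity it uses both ends of the pipe within a single process, which is an assumption made only to keep the demo short; the same copying applies when the ends live in different processes. The roundtrip name is hypothetical:

```python
from multiprocessing import Pipe

def roundtrip(obj):
    """
    send obj through a pipe and return what the other end receives
    """
    sendEnd, recvEnd = Pipe()
    sendEnd.send(obj)                         # pickled here...
    copy = recvEnd.recv()                     # ...and rebuilt here as a new object
    sendEnd.close()
    recvEnd.close()
    return copy

if __name__ == '__main__':
    original = ['spam', [1, 2, 3]]
    received = roundtrip(original)
    received[1].append(4)                     # change the receiver's copy only
    print('sender sees  :', original)         # nested list is unchanged
    print('receiver sees:', received)
```

The received object compares equal to the original at first, but it is a distinct object, so later in-place changes on one side are invisible to the other.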

multiprocessing pipes

To demonstrate the IPC tools listed above, the next three examples implement three flavors of communication between parent and child processes. Example 5-30 uses a simple shared pipe object to send and receive data between parent and child processes.

Example 5-30. PP4E\System\Processes\multi2.py
"""
Use multiprocess anonymous pipes to communicate. Pipe() returns 2 Connection
objects representing the ends of the pipe: objects are sent on one end and
received on the other, though pipes are bidirectional by default
"""

import os
from multiprocessing import Process, Pipe

def sender(pipe):
    """
    send object to parent on anonymous pipe
    """
    pipe.send(['spam'] +  [42, 'eggs'])
    pipe.close()

def talker(pipe):
    """
    send and receive objects on a pipe
    """
    pipe.send(dict(name='Bob', spam=42))
    reply = pipe.recv()
    print('talker got:', reply)

if __name__ == '__main__':
    (parentEnd, childEnd) = Pipe()
    Process(target=sender, args=(childEnd,)).start()        # spawn child with pipe
    print('parent got:', parentEnd.recv())                  # receive from child
    parentEnd.close()                                       # or auto-closed on gc

    (parentEnd, childEnd) = Pipe()
    child = Process(target=talker, args=(childEnd,))
    child.start()
    print('parent got:', parentEnd.recv())                  # receive from child
    parentEnd.send({x * 2 for x in 'spam'})                 # send to child
    child.join()                                            # wait for child exit
    print('parent exit')

When run on Windows, here’s this script’s output—one child passes an object to the parent, and the other both sends and receives on the same pipe:

C:\...\PP4E\System\Processes> multi2.py
parent got: ['spam', 42, 'eggs']
parent got: {'name': 'Bob', 'spam': 42}
talker got: {'ss', 'aa', 'pp', 'mm'}
parent exit

This module’s pipe objects make communication between two processes portable (and nearly trivial).
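The request/reply model mentioned earlier can be coded along the same lines. The following is one possible sketch, not a listing from the book's examples package: the server and client function names, the use of None as a "finished" marker, and the uppercase conversion standing in for real work are all assumptions made for illustration:

```python
from multiprocessing import Process, Pipe

def server(pipe):
    """
    reply to each request until None arrives
    """
    while True:
        request = pipe.recv()
        if request is None:                   # sentinel: client is finished
            break
        pipe.send(request.upper())            # stand-in for real work
    pipe.close()

def client(pipe, requests):
    """
    send each request, wait for its reply, then tell the server to stop
    """
    replies = []
    for request in requests:
        pipe.send(request)
        replies.append(pipe.recv())
    pipe.send(None)
    return replies

if __name__ == '__main__':
    parentEnd, childEnd = Pipe()
    child = Process(target=server, args=(childEnd,))
    child.start()
    print(client(parentEnd, ['spam', 'eggs', 'ham']))
    child.join()
```

Because the pipe is bidirectional, a single Pipe call suffices here; each end both sends and receives.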

Shared memory and globals

Example 5-31 uses shared memory to serve as both inputs and outputs of spawned processes. To make this work portably, we must create objects defined by the package and pass them to Process constructors. The last test in this demo (“loop4”) probably represents the most common use case for shared memory—that of distributing computation work to multiple parallel processes.

Example 5-31. PP4E\System\Processes\multi3.py
"""
Use multiprocess shared memory objects to communicate.
Passed objects are shared, but globals are not on Windows.
Last test here reflects common use case: distributing work.
"""

import os
from multiprocessing import Process, Value, Array

procs = 3
count = 0    # per-process globals, not shared

def showdata(label, val, arr):
    """
    print data values in this process
    """
    msg = '%-12s: pid:%4s, global:%s, value:%s, array:%s'
    print(msg % (label, os.getpid(), count, val.value, list(arr)))

def updater(val, arr):
    """
    communicate via shared memory
    """
    global count
    count += 1                         # global count not shared
    val.value += 1                     # passed in objects are
    for i in range(3): arr[i] += 1

if __name__ == '__main__':
    scalar = Value('i', 0)             # shared memory: process/thread safe
    vector = Array('d', procs)         # type codes from ctypes: int, double

    # show start value in parent process
    showdata('parent start', scalar, vector)

    # spawn child, pass in shared memory
    p = Process(target=showdata, args=('child ', scalar, vector))
    p.start(); p.join()

    # pass in shared memory updated in parent, wait for each to finish
    # each child sees updates in parent so far for args (but not global)

    print('\nloop1 (updates in parent, serial children)...')
    for i in range(procs):
        count += 1
        scalar.value += 1
        vector[i] += 1
        p = Process(target=showdata, args=(('process %s' % i), scalar, vector))
        p.start(); p.join()

    # same as prior, but allow children to run in parallel
    # all see the last iteration's result because all share objects

    print('\nloop2 (updates in parent, parallel children)...')
    ps = []
    for i in range(procs):
        count += 1
        scalar.value += 1
        vector[i] += 1
        p = Process(target=showdata, args=(('process %s' % i), scalar, vector))
        p.start()
        ps.append(p)
    for p in ps: p.join()

    # shared memory updated in spawned children, wait for each

    print('\nloop3 (updates in serial children)...')
    for i in range(procs):
        p = Process(target=updater, args=(scalar, vector))
        p.start()
        p.join()
    showdata('parent temp', scalar, vector)

    # same, but allow children to update in parallel

    ps = []
    print('\nloop4 (updates in parallel children)...')
    for i in range(procs):
        p = Process(target=updater, args=(scalar, vector))
        p.start()
        ps.append(p)
    for p in ps: p.join()
                                           # global count=6 in parent only
    # show final results here              # scalar=12:  +6 parent, +6 in 6 children
    showdata('parent end', scalar, vector) # array[i]=8: +2 parent, +6 in 6 children

The following is this script’s output on Windows. Trace through this and the code to see how it runs; notice how the changed value of the global variable is not shared by the spawned processes on Windows, but passed-in Value and Array objects are. The final output line reflects changes made to shared memory in both the parent and spawned children—the array’s final values are all 8.0, because they were incremented twice in the parent, and once in each of six spawned children; the scalar value similarly reflects changes made by both parent and child; but unlike for threads, the global is per-process data on Windows:

C:\...\PP4E\System\Processes> multi3.py
parent start: pid:6204, global:0, value:0, array:[0.0, 0.0, 0.0]
child       : pid:9660, global:0, value:0, array:[0.0, 0.0, 0.0]

loop1 (updates in parent, serial children)...
process 0   : pid:3900, global:0, value:1, array:[1.0, 0.0, 0.0]
process 1   : pid:5072, global:0, value:2, array:[1.0, 1.0, 0.0]
process 2   : pid:9472, global:0, value:3, array:[1.0, 1.0, 1.0]

loop2 (updates in parent, parallel children)...
process 1   : pid:9468, global:0, value:6, array:[2.0, 2.0, 2.0]
process 2   : pid:9036, global:0, value:6, array:[2.0, 2.0, 2.0]
process 0   : pid:9548, global:0, value:6, array:[2.0, 2.0, 2.0]

loop3 (updates in serial children)...
parent temp : pid:6204, global:6, value:9, array:[5.0, 5.0, 5.0]

loop4 (updates in parallel children)...
parent end  : pid:6204, global:6, value:12, array:[8.0, 8.0, 8.0]

If you imagine the last test here run with a much larger array and many more parallel children, you might begin to sense some of the power of this package for distributing work.
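One caution applies when many children update the same shared value in parallel: although individual reads and writes of a Value are synchronized by default, an update like val.value += 1 is a read followed by a write, and Python's library manual notes that such operations are not atomic. The sketch below shows one way to guard them by holding the Value's own lock for the whole update; the add_many name and the counts used are hypothetical:

```python
from multiprocessing import Process, Value

def add_many(counter, times):
    """
    increment a shared counter, holding its lock for each update
    """
    for _ in range(times):
        with counter.get_lock():              # read and write as one step
            counter.value += 1

if __name__ == '__main__':
    counter = Value('i', 0)
    workers = [Process(target=add_many, args=(counter, 10000))
               for _ in range(4)]
    for w in workers: w.start()
    for w in workers: w.join()
    print('final count:', counter.value)      # 4 * 10000 if no update is lost
```

For short demos the unguarded form often appears to work, but with larger counts and truly parallel children, lost updates become more likely.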

Queues and subclassing

Finally, besides basic spawning and IPC tools, the multiprocessing module also:

  • Allows its Process class to be subclassed to provide structure and state retention (much like threading.Thread, but for processes).

  • Implements a process-safe Queue object which may be shared by any number of processes for more general communication needs (much like queue.Queue, but for processes).

Queues support a more flexible multiple client/server model. Example 5-32, for instance, spawns three producer processes to post to a shared queue and repeatedly polls for results to appear, in much the same fashion that a GUI might collect results in parallel with the display itself, though here the concurrency is achieved with processes instead of threads.

Example 5-32. PP4E\System\Processes\multi4.py
"""
Process class can also be subclassed just like threading.Thread;
Queue works like queue.Queue but for cross-process, not cross-thread
"""

import os, time, queue
from multiprocessing import Process, Queue           # process-safe shared queue
                                                     # queue is a pipe + locks/semas
class Counter(Process):
    label = '  @'
    def __init__(self, start, queue):                # retain state for use in run
        self.state = start
        self.post  = queue
        Process.__init__(self)

    def run(self):                                   # run in new process on start()
        for i in range(3):
            time.sleep(1)
            self.state += 1
            print(self.label, self.pid, self.state)  # self.pid is this child's pid
            self.post.put([self.pid, self.state])    # stdout file is shared by all
        print(self.label, self.pid, '-')

if __name__ == '__main__':
    print('start', os.getpid())
    expected = 9

    post = Queue()
    p = Counter(0, post)                        # start 3 processes sharing queue
    q = Counter(100, post)                      # children are producers
    r = Counter(1000, post)
    p.start(); q.start(); r.start()

    while expected:                             # parent consumes data on queue
        time.sleep(0.5)                         # this is essentially like a GUI,
        try:                                    # though GUIs often use threads
            data = post.get(block=False)
        except queue.Empty:
            print('no data...')
        else:
            print('posted:', data)
            expected -= 1

    p.join(); q.join(); r.join()                # must get before join putter
    print('finish', os.getpid(), r.exitcode)    # exitcode is child exit status

Notice in this code how:

  • The time.sleep calls in this code’s producer simulate long-running tasks.

  • All four processes share the same output stream; print calls go to the same place and don’t overlap badly on Windows (as we saw earlier, the multiprocessing module also has a shareable Lock object to synchronize access if required).

  • The exit status of each child process is available in its exitcode attribute after it finishes.

When run, the output of the main consumer process traces its queue fetches, and the (indented) output of spawned child producer processes gives process IDs and state.

C:\...\PP4E\System\Processes> multi4.py
start 6296
no data...
no data...
  @ 8008 101
posted: [8008, 101]
  @ 6068 1
  @ 3760 1001
posted: [6068, 1]
  @ 8008 102
posted: [3760, 1001]
  @ 6068 2
  @ 3760 1002
posted: [8008, 102]
  @ 8008 103
  @ 8008 -
posted: [6068, 2]
  @ 6068 3
  @ 6068 -
  @ 3760 1003
  @ 3760 -
posted: [3760, 1002]
posted: [8008, 103]
posted: [6068, 3]
posted: [3760, 1003]
finish 6296 0

If you imagine the “@” lines here as results of long-running operations and the others as a main GUI thread, the wide relevance of this package may become more apparent.
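When the consumer has nothing else to do between results, it need not poll or know the number of results in advance. As a sketch of one common alternative (not part of the book's examples), each producer below posts a final sentinel value, and the consumer blocks on the queue until it has seen one sentinel per producer. The DONE marker and the producer and consume names are hypothetical:

```python
from multiprocessing import Process, Queue

DONE = None                                   # sentinel: one per producer

def producer(start, post):
    """
    post three results, then announce completion
    """
    for i in range(1, 4):
        post.put(start + i)
    post.put(DONE)

def consume(post, producers):
    """
    collect results until every producer has posted its sentinel
    """
    results = []
    while producers:
        data = post.get()                     # blocks until an item arrives
        if data is DONE:
            producers -= 1
        else:
            results.append(data)
    return results

if __name__ == '__main__':
    post = Queue()
    children = [Process(target=producer, args=(start, post))
                for start in (0, 100, 1000)]
    for c in children: c.start()
    print(sorted(consume(post, len(children))))
    for c in children: c.join()               # join only after queue is drained
```

A GUI would still prefer the nonblocking polling of Example 5-32, since a blocking get would freeze its display while it waits.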

Starting Independent Programs

As we learned earlier, independent programs generally communicate with system-global tools such as sockets and the fifo files we studied earlier. Although processes spawned by multiprocessing can leverage these tools, too, their closer relationship affords them the host of additional IPC devices provided by this module.

Like threads, multiprocessing is designed to run function calls in parallel, not to start entirely separate programs directly. Spawned functions might use tools like os.system, os.popen, and subprocess to start a program if such an operation might block the caller, but there’s otherwise often no point in starting a process that just starts a program (you might as well start the program and skip a step). In fact, on Windows, multiprocessing today uses the same process creation call as subprocess, so there’s little point in starting two processes to run one.

It is, however, possible to start new programs in the child processes spawned, using tools like the os.exec* calls we met earlier—by spawning a process portably with multiprocessing and overlaying it with a new program this way, we start a new independent program, and effectively work around the lack of the os.fork call in standard Windows Python.

This generally assumes that the new program doesn’t require any resources passed in by the Process API, of course (once a new program starts, it erases that which was running), but it offers a portable equivalent to the fork/exec combination on Unix. Furthermore, programs started this way can still make use of more traditional IPC tools, such as the sockets and fifos we met earlier in this chapter. Example 5-33 illustrates the technique.

Example 5-33. PP4E\System\Processes\multi5.py
"Use multiprocessing to start independent programs, os.fork or not"

import os
from multiprocessing import Process

def runprogram(arg):
    os.execlp('python', 'python', 'child.py', str(arg))

if __name__ == '__main__':
    for i in range(5):
        Process(target=runprogram, args=(i,)).start()
    print('parent exit')

This script starts 5 instances of the child.py script we wrote in Example 5-4 as independent processes, without waiting for them to finish. Here’s this script at work on Windows, after deleting a superfluous system prompt that shows up arbitrarily in the middle of its output (it runs the same on Cygwin, but the output is not interleaved there):

C:\...\PP4E\System\Processes> type child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])

C:\...\PP4E\System\Processes> multi5.py
parent exit
Hello from child 9844 2
Hello from child 8696 4
Hello from child 1840 0
Hello from child 6724 1
Hello from child 9368 3

This technique isn’t possible with threads, because all threads run in the same process; overlaying it with a new program would kill all its threads. Though this is unlikely to be as fast as a fork/exec combination on Unix, it at least provides similar and portable functionality on Windows when required.

And Much More

Finally, multiprocessing provides many more tools than these examples deploy, including condition, event, and semaphore synchronization tools, and local and remote managers that implement servers for shared objects. For instance, Example 5-34 demonstrates its support for pools: spawned children that work in concert on a given task.

Example 5-34. PP4E\System\Processes\multi6.py
"Plus much more: process pools, managers, locks, condition,..."

import os
from multiprocessing import Pool

def powers(x):
    #print(os.getpid())                  # enable to watch children
    return 2 ** x

if __name__ == '__main__':
    workers = Pool(processes=5)

    results = workers.map(powers, [2]*100)
    print(results[:16])
    print(results[-2:])

    results = workers.map(powers, range(100))
    print(results[:16])
    print(results[-2:])

When run, Python arranges to delegate portions of the task to workers run in parallel:

C:\...\PP4E\System\Processes> multi6.py
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
[4, 4]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768]
[316912650057057350374175801344, 633825300114114700748351602688]
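Pools offer other calling patterns besides map. As a brief sketch, assuming Python 3.3 or later (where Pool can be used in a with statement), the following submits a single call with apply_async and then collects a batch of results in whatever order they finish with imap_unordered. The pool size and inputs are arbitrary choices for illustration:

```python
from multiprocessing import Pool

def powers(x):
    return 2 ** x

if __name__ == '__main__':
    with Pool(processes=3) as workers:                  # pool is shut down on exit
        pending = workers.apply_async(powers, (10,))    # one call, returns at once
        print(pending.get(timeout=30))                  # wait for its result

        for result in workers.imap_unordered(powers, range(8)):
            print(result, end=' ')                      # arrival order may vary
        print()
```

Unlike map, imap_unordered makes no promise about result order, which can suit tasks whose results are independent of one another.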

And a little less…

To be fair, besides such additional features and tools, multiprocessing also comes with additional constraints beyond those we’ve already covered (pickleability, mutable state, and so on). For example, consider the following sort of code:

from multiprocessing import Process

def action(arg1, arg2):
    print(arg1, arg2)

if __name__ == '__main__':
    Process(target=action, args=('spam', 'eggs')).start()    # shell waits for child

This works as expected, but if we change the last line to the following it fails on Windows because lambdas are not pickleable (really, not importable):

Process(target=(lambda: action('spam', 'eggs'))).start()  # fails!-not pickleable

This precludes a common coding pattern that uses lambda to add data to calls, which we’ll use often for callbacks in the GUI part of this book. Moreover, this differs from the threading module that is the model for this package—calls like the following which work for threads must be translated to a callable and arguments:

threading.Thread(target=(lambda: action(2, 4))).start()   # but lambdas work here
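When data must be attached to a call ahead of time, one possible substitute for lambda is functools.partial from the standard library. This is offered as a sketch rather than as the book's own technique: partial objects can be pickled as long as the function they wrap is importable, so they may be used where a lambda may not:

```python
import pickle
from functools import partial
from multiprocessing import Process

def action(arg1, arg2):
    print(arg1, arg2)

if __name__ == '__main__':
    bound = partial(action, 'spam', 'eggs')     # data attached, no lambda needed
    pickle.dumps(bound)                         # succeeds: action is importable
    p = Process(target=bound)
    p.start()
    p.join()
```

Passing the function and its data separately through the target and args arguments, as in the first form above, remains the simplest option when it is available.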

Conversely, some behavior of the threading module is mimicked by multiprocessing, whether you wish it did or not. Because programs using this package wait for child processes to end by default, we must mark processes as daemon if we don’t want to block the shell where the following sort of code is run (technically, parents attempt to terminate daemonic children on exit, which means that the program can exit when only daemonic children remain, much like threading):

import time
from multiprocessing import Process

def action(arg1, arg2):
    print(arg1, arg2)
    time.sleep(5)          # normally prevents the parent from exiting

if __name__ == '__main__':
    p = Process(target=action, args=('spam', 'eggs'))
    p.daemon = True                                        # don't wait for it
    p.start()

There’s more on some of these issues in the Python library manual; they are not show-stoppers by any stretch, but special cases and potential pitfalls to some. We’ll revisit the lambda and daemon issues in a more realistic context in Chapter 8, where we’ll use multiprocessing to launch GUI demos independently.

Why multiprocessing? The Conclusion

As this section’s examples suggest, multiprocessing provides a powerful alternative which aims to combine the portability and much of the utility of threads with the fully parallel potential of processes and offers additional solutions to IPC, exit status, and other parallel processing goals.

Hopefully, this section has also given you a better understanding of this module’s tradeoffs discussed at its beginning. In particular, its separate process model precludes the freely shared mutable state of threads, and bound methods and lambdas are prohibited both by the pickleability requirements of its IPC pipes and queues and by its process action implementation on Windows. Moreover, its requirement of pickleability for process arguments on Windows also precludes it as an option for conversing with clients in socket servers portably.

While not a replacement for threading in all applications, though, multiprocessing offers compelling solutions for many. Especially for parallel-programming tasks which can be designed to avoid its limitations, this module can offer both performance and portability that Python’s more direct multitasking tools cannot.

Unfortunately, beyond this brief introduction, we don’t have space for a more complete treatment of this module in this book. For more details, refer to the Python library manual. Here, we turn next to a handful of additional program launching tools and a wrap up of this chapter.

Other Ways to Start Programs

We’ve seen a variety of ways to launch programs in this book so far—from the os.fork/exec combination on Unix, to portable shell command-line launchers like os.system, os.popen, and subprocess, to the portable multiprocessing module options of the last section. There are still other ways to start programs in the Python standard library, some of which are more platform neutral or obscure than others. This section wraps up this chapter with a quick tour through this set.

The os.spawn Calls

The os.spawnv and os.spawnve calls were originally introduced to launch programs on Windows, much like a fork/exec call combination on Unix-like platforms. Today, these calls work on both Windows and Unix-like systems, and additional variants have been added to parrot os.exec.

In recent versions of Python, the portable subprocess module has started to supersede these calls. In fact, Python’s library manual includes a note stating that this module has more powerful and equivalent tools and should be preferred to os.spawn calls. Moreover, the newer multiprocessing module can achieve similarly portable results today when combined with os.exec calls, as we saw earlier. Still, the os.spawn calls continue to work as advertised and may appear in Python code you encounter.

The os.spawn family of calls executes a program named by a command line in a new process, on both Windows and Unix-like systems. In basic operation, they are similar to the fork/exec call combination on Unix and can be used as alternatives to the system and popen calls we’ve already learned. In the following interaction, for instance, we start a Python program with a command line in two traditional ways (the second also reads its output):

C:\...\PP4E\System\Processes> python
>>> print(open('makewords.py').read())
print('spam')
print('eggs')
print('ham')

>>> import os
>>> os.system('python makewords.py')
spam
eggs
ham
0

>>> result = os.popen('python makewords.py').read()
>>> print(result)
spam
eggs
ham

The equivalent os.spawn calls achieve the same effect, with a slightly more complex call signature that provides more control over the way the program is launched:

>>> os.spawnv(os.P_WAIT, r'C:\Python31\python', ('python', 'makewords.py'))
spam
eggs
ham
0
>>> os.spawnl(os.P_NOWAIT, r'C:\Python31\python', 'python', 'makewords.py')
1820
>>> spam
eggs
ham

The spawn calls are also much like forking programs in Unix. They don’t actually copy the calling process (so shared descriptor operations won’t work), but they can be used to start a program running completely independent of the calling program, even on Windows. The script in Example 5-35 makes the similarity to Unix programming patterns more obvious. It launches a program with a fork/exec combination on Unix-like platforms (including Cygwin), or an os.spawnv call on Windows.

Example 5-35. PP4E\System\Processes\spawnv.py
"""
start up 10 copies of child.py running in parallel;
use spawnv to launch a program on Windows (like fork+exec);
P_OVERLAY replaces, P_DETACH makes child stdout go nowhere;
or use portable subprocess or multiprocessing options today!
"""

import os, sys

for i in range(10):
    if sys.platform[:3] == 'win':
        pypath = sys.executable
        os.spawnv(os.P_NOWAIT, pypath, ('python', 'child.py', str(i)))
    else:
        pid = os.fork()
        if pid != 0:
            print('Process %d spawned' % pid)
        else:
            os.execlp('python', 'python', 'child.py', str(i))
print('Main process exiting.')

To make sense of these examples, you have to understand the arguments being passed to the spawn calls. In this script, we call os.spawnv with a process mode flag, the full directory path to the Python interpreter, and a tuple of strings representing the shell command line with which to start a new program. The path to the Python interpreter executable program running a script is available as sys.executable. In general, the process mode flag is taken from these predefined values:

os.P_NOWAIT and os.P_NOWAITO

The spawn functions will return as soon as the new process has been created, with the process ID as the return value. Available on Unix and Windows.

os.P_WAIT

The spawn functions will not return until the new process has run to completion and will return the exit code of the process if the run is successful or “-signal” if a signal kills the process. Available on Unix and Windows.

os.P_DETACH and os.P_OVERLAY

P_DETACH is similar to P_NOWAIT, but the new process is detached from the console of the calling process. If P_OVERLAY is used, the current program will be replaced (much like os.exec). Available on Windows.
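The difference between the two portable modes shows up in what the call returns. The following sketch, written with Unix-like systems in mind, spawns a child Python that exits with a chosen status: under P_WAIT the status comes back as the result, and under P_NOWAIT the call returns at once with an ID that can be waited on later. The function names are hypothetical, and the one-line child programs are stand-ins kept free of spaces as a precaution, because arguments with embedded spaces may need extra quoting on Windows:

```python
import os, sys

def run_and_wait(status):
    """
    spawn a child Python that exits with status; return spawnv's result
    """
    code = 'raise(SystemExit(%d))' % status
    args = [sys.executable, '-c', code]       # args[0] is the program's own name
    return os.spawnv(os.P_WAIT, sys.executable, args)

def run_in_background():
    """
    spawn a child and return at once with its process ID
    """
    args = [sys.executable, '-c', 'pass']
    return os.spawnv(os.P_NOWAIT, sys.executable, args)

if __name__ == '__main__':
    print('exit status:', run_and_wait(3))
    pid = run_in_background()
    print('started:', pid)
    os.waitpid(pid, 0)                        # collect the child when convenient
```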

In fact, there are eight different calls in the spawn family, which all start a program but vary slightly in their call signatures. In their names, an “l” means you list arguments individually, “p” means the executable file is looked up on the system path, and “e” means a dictionary is passed in to provide the shelled environment of the spawned program: the os.spawnve call, for example, works the same way as os.spawnv but accepts an extra fourth dictionary argument to specify a different shell environment for the spawned program (which, by default, inherits all of the parent’s settings):

os.spawnl(mode, path, ...)
os.spawnle(mode, path, ..., env)
os.spawnlp(mode, file, ...)                 # Unix only
os.spawnlpe(mode, file, ..., env)           # Unix only
os.spawnv(mode, path, args)
os.spawnve(mode, path, args, env)
os.spawnvp(mode, file, args)                # Unix only
os.spawnvpe(mode, file, args, env)          # Unix only

Because these calls mimic the names and call signatures of the os.exec variants, see earlier in this chapter for more details on the differences between these call forms. Unlike the os.exec calls, only half of the os.spawn forms—those without system path checking (and hence without a “p” in their names)—are currently implemented on Windows. All the process mode flags are supported on Windows, but detach and overlay modes are not available on Unix. Because this sort of detail may be prone to change, to verify which are present, be sure to see the library manual or run a dir built-in function call on the os module after an import.
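The check suggested above is easy to script. As a minimal sketch (the list contents come from this section, and the present helper is a hypothetical name), the following reports which spawn variants and mode flags the os module defines on the platform where it is run:

```python
import os

ALL_SPAWN = ['spawnl', 'spawnle', 'spawnlp', 'spawnlpe',
             'spawnv', 'spawnve', 'spawnvp', 'spawnvpe']
ALL_MODES = ['P_WAIT', 'P_NOWAIT', 'P_NOWAITO', 'P_DETACH', 'P_OVERLAY']

def present(names):
    """
    return the subset of names that os defines on this platform
    """
    return [name for name in names if hasattr(os, name)]

if __name__ == '__main__':
    print('spawn calls:', present(ALL_SPAWN))
    print('mode flags :', present(ALL_MODES))
```

Testing with hasattr before calling a variant is one way to keep a script from failing outright on a platform that lacks it.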

Here is the script in Example 5-35 at work on Windows, spawning 10 independent copies of the child.py Python program we met earlier in this chapter:

C:\...\PP4E\System\Processes> type child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])

C:\...\PP4E\System\Processes> python spawnv.py
Hello from child -583587 0
Hello from child -558199 2
Hello from child -586755 1
Hello from child -562171 3
Main process exiting.
Hello from child -581867 6
Hello from child -588651 5
Hello from child -568247 4
Hello from child -563527 7
Hello from child -543163 9
Hello from child -587083 8

Notice that the copies print their output in random order, and the parent program exits before all children do; all of these programs are really running in parallel on Windows. Also observe that the child program’s output shows up in the console box where spawnv.py was run; when using P_NOWAIT, standard output comes to the parent’s console, but it seems to go nowhere when using P_DETACH (which is most likely a feature when spawning GUI programs).

But having shown you this call, I need to again point out that both the subprocess and multiprocessing modules offer more portable alternatives for spawning programs with command lines today. In fact, unless os.spawn calls provide unique behavior you can’t live without (e.g., control of shell window pop ups on Windows), the platform-specific code of Example 5-35 can be replaced altogether with the portable multiprocessing code in Example 5-33.
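For comparison, here is roughly how the same parallel launch might look with subprocess. This is a sketch under a few assumptions: the child program is given inline with Python's -c switch as a stand-in for child.py so that the snippet is self-contained, output is captured through pipes only so that it can be shown in order, and the launch name is hypothetical:

```python
import subprocess, sys

CHILD = "import os, sys; print('Hello from child', os.getpid(), sys.argv[1])"

def launch(count):
    """
    start count children in parallel; return their Popen objects
    """
    return [subprocess.Popen([sys.executable, '-c', CHILD, str(i)],
                             stdout=subprocess.PIPE, text=True)
            for i in range(count)]

if __name__ == '__main__':
    children = launch(10)                     # all ten are now running
    print('Main process waiting.')
    for child in children:
        output, _ = child.communicate()       # wait for exit, collect output
        print(output, end='')
```

The same code runs on both Windows and Unix-like platforms, with no platform test and no fork call.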

The os.startfile call on Windows

Although os.spawn calls may be largely superfluous today, there are other tools that can still make a strong case for themselves. For instance, the os.system call can be used on Windows to launch a DOS start command, which opens (i.e., runs) a file independently based on its Windows filename associations, as though it were clicked. os.startfile makes this even simpler in recent Python releases, and it can avoid blocking its caller, unlike some other tools.

Using the DOS start command

To understand why, first you need to know how the DOS start command works in general. Roughly, a DOS command line of the form start command works as if command were typed in the Windows Run dialog box available in the Start button menu. If command is a filename, it is opened exactly as if its name was double-clicked in the Windows Explorer file selector GUI.

For instance, the following three DOS commands automatically start Internet Explorer, my registered image viewer program, and my sound media player program on the files named in the commands. Windows simply opens the file with whatever program is associated to handle filenames of that form. Moreover, all three of these programs run independently of the DOS console box where the command is typed:

C:\...\PP4E\System\Media> start lp4e-preface-preview.html
C:\...\PP4E\System\Media> start ora-lp4e.jpg
C:\...\PP4E\System\Media> start sousa.au

Because the start command can run any file and command line, there is no reason it cannot also be used to start an independently running Python program:

C:\...\PP4E\System\Processes> start child.py 1

This works because Python is registered to open names ending in .py when it is installed. The script child.py is launched independently of the DOS console window even though we didn’t provide the name or path of the Python interpreter program. Because child.py simply prints a message and exits, though, the result isn’t exactly satisfying: a new DOS window pops up to serve as the script’s standard output, and it immediately goes away when the child exits. To do better, add an input call at the bottom of the program file to wait for a key press before exiting:

C:\...\PP4E\System\Processes> type child-wait.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])
input("Press <Enter>")       # don't flash on Windows

C:\...\PP4E\System\Processes> start child-wait.py 2

Now the child’s DOS window pops up and stays up after the start command has returned. Pressing the Enter key in the pop-up DOS window makes it go away.

Using start in Python scripts

Since we know that Python’s os.system and os.popen can be called by a script to run any command line that can be typed at a DOS shell prompt, we can also start independently running programs from a Python script by simply running a DOS start command line. For instance:

C:\...\PP4E\System\Media> python
>>> import os
>>> cmd = 'start lp4e-preface-preview.html'            # start IE browser
>>> os.system(cmd)                                     # runs independent
0

The os.system call here starts whatever web browser is registered on your machine to open .html files (unless that program is already running). The launched program runs completely independently of the Python session: when running a DOS start command, os.system does not wait for the spawned program to exit.

The os.startfile call

In fact, start is so useful that recent Python releases also include an os.startfile call, which is essentially the same as spawning a DOS start command with os.system and works as though the named file were double-clicked. The following calls, for instance, have a similar effect:

>>> os.startfile('lp-code-readme.txt')
>>> os.system('start lp-code-readme.txt')

Both pop up the text file in Notepad on my Windows computer. Unlike the second of these calls, though, os.startfile provides no option to wait for the application to close (the DOS start command’s /WAIT option does) and no way to retrieve the application’s exit status (returned from os.system).

On recent versions of Windows, the following has a similar effect, too, because the registry is used at the command line (though this form pauses until the file’s viewer is closed—like using start /WAIT):

>>> os.system('lp-code-readme.txt')       # 'start' is optional today

This is a convenient way to open arbitrary document and media files, but keep in mind that the os.startfile call works only on Windows, because it uses the Windows registry to know how to open a file. In fact, there are even more obscure and nonportable ways to launch programs, including Windows-specific options in the PyWin32 package, which we’ll finesse here. If you want to be more platform neutral, consider using one of the many other program launcher tools we’ve seen, such as os.popen or os.spawnv. Or better yet, write a module to hide the details, as the next and final section demonstrates.
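
For a taste of what hiding such details can look like, here is a minimal sketch of a document opener that falls back on other tools where os.startfile is absent. The function names are hypothetical, and the non-Windows branches rest on assumptions: that the open command is present on Mac OS X, and that xdg-open (part of the xdg-utils package) is installed on other Unix-like systems.

```python
import os
import subprocess
import sys

def opener_command(path, platform=sys.platform):
    """Return the argv list that opens `path`, or None where os.startfile applies."""
    if platform.startswith('win'):
        return None                      # Windows: use os.startfile instead
    if platform == 'darwin':
        return ['open', path]            # Mac OS X "open" command
    return ['xdg-open', path]            # assumption: xdg-utils is installed

def open_document(path):
    """Open `path` with its associated application without waiting for it."""
    command = opener_command(path)
    if command is None:
        os.startfile(path)               # this call exists only on Windows
    else:
        subprocess.Popen(command)        # returns immediately; viewer runs on
```

Splitting the platform test into its own function keeps the decision easy to check without actually popping up a viewer; a call such as opener_command('lp-code-readme.txt', 'darwin') simply returns the command list that would be run.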

A Portable Program-Launch Framework

With all of these different ways to start programs on different platforms, it can be difficult to remember what tools to use in a given situation. Moreover, some of these tools are called in ways that are complicated and thus easy to forget. Although modules like subprocess and multiprocessing offer fully portable options today, other tools sometimes provide more specific behavior that’s better on a given platform; shell window pop ups on Windows, for example, are often better suppressed.

I write scripts that need to launch Python programs often enough that I eventually wrote a module to try to hide most of the underlying details. By encapsulating the details in this module, I’m free to change them to use new tools in the future without breaking code that relies on them. While I was at it, I made this module smart enough to automatically pick a “best” launch scheme based on the underlying platform. Laziness is the mother of many a useful module.

Example 5-36 collects in a single module many of the techniques we’ve met in this chapter. It implements an abstract superclass, LaunchMode, which defines what it means to start a Python program named by a shell command line, but it doesn’t define how. Instead, its subclasses provide a run method that actually starts a Python program according to a given scheme and (optionally) define an announce method to display a program’s name at startup time.

Example 5-36. PP4E\launchmodes.py
"""
###################################################################################
launch Python programs with command lines and reusable launcher scheme classes;
auto inserts "python" and/or path to Python executable at front of command line;
some of this module may assume 'python' is on your system path (see Launcher.py);

subprocess module would work too, but os.popen() uses it internally, and the goal
is to start a program running independently here, not to connect to its streams;
multiprocessing module also is an option, but this is command-lines, not functions:
doesn't make sense to start a process which would just do one of the options here;

new in this edition: runs script filename path through normpath() to change any
/ to \ for Windows tools where required; fix is inherited by PyEdit and others;
on Windows, / is generally allowed for file opens, but not by all launcher tools;
###################################################################################
"""

import sys, os
pyfile = (sys.platform[:3] == 'win' and 'python.exe') or 'python'
pypath = sys.executable     # use sys in newer pys

def fixWindowsPath(cmdline):
    """
    change all / to \ in script filename path at front of cmdline;
    used only by classes which run tools that require this on Windows;
    on other platforms, this does not hurt (e.g., os.system on Unix);
    """
    splitline = cmdline.lstrip().split(' ')           # split on spaces
    fixedpath = os.path.normpath(splitline[0])        # fix forward slashes
    return ' '.join([fixedpath] + splitline[1:])      # put it back together

class LaunchMode:
    """
    on call to instance, announce label and run command;
    subclasses format command lines as required in run();
    command should begin with name of the Python script
    file to run, and not with "python" or its full path;
    """
    def __init__(self, label, command):
        self.what  = label
        self.where = command
    def __call__(self):                     # on call, ex: button press callback
        self.announce(self.what)
        self.run(self.where)                # subclasses must define run()
    def announce(self, text):               # subclasses may redefine announce()
        print(text)                         # methods instead of if/elif logic
    def run(self, cmdline):
        assert False, 'run must be defined'

class System(LaunchMode):
    """
    run Python script named in shell command line
    caveat: may block caller, unless & added on Unix
    """
    def run(self, cmdline):
        cmdline = fixWindowsPath(cmdline)
        os.system('%s %s' % (pypath, cmdline))

class Popen(LaunchMode):
    """
    run shell command line in a new process
    caveat: may block caller, since pipe closed too soon
    """
    def run(self, cmdline):
        cmdline = fixWindowsPath(cmdline)
        os.popen(pypath + ' ' + cmdline)           # assume nothing to be read

class Fork(LaunchMode):
    """
    run command in explicitly created new process
    for Unix-like systems only, including cygwin
    """
    def run(self, cmdline):
        assert hasattr(os, 'fork')
        cmdline = cmdline.split()                  # convert string to list
        if os.fork() == 0:                         # start new child process
            os.execvp(pypath, [pyfile] + cmdline)  # run new program in child

class Start(LaunchMode):
    """
    run command independent of caller
    for Windows only: uses filename associations
    """
    def run(self, cmdline):
        assert sys.platform[:3] == 'win'
        cmdline = fixWindowsPath(cmdline)
        os.startfile(cmdline)

class StartArgs(LaunchMode):
    """
    for Windows only: args may require real start
    forward slashes are okay here
    """
    def run(self, cmdline):
        assert sys.platform[:3] == 'win'
        os.system('start ' + cmdline)              # may create pop-up window

class Spawn(LaunchMode):
    """
    run python in new process independent of caller
    for Windows or Unix; use P_NOWAIT for dos box;
    forward slashes are okay here
    """
    def run(self, cmdline):
        os.spawnv(os.P_DETACH, pypath, (pyfile, cmdline))

class Top_level(LaunchMode):
    """
    run in new window, same process
    tbd: requires GUI class info too
    """
    def run(self, cmdline):
        assert False, 'Sorry - mode not yet implemented'

#
# pick a "best" launcher for this platform
# may need to specialize the choice elsewhere
#

if sys.platform[:3] == 'win':
    PortableLauncher = Spawn
else:
    PortableLauncher = Fork

class QuietPortableLauncher(PortableLauncher):
    def announce(self, text):
        pass

def selftest():
    file = 'echo.py'
    input('default mode...')
    launcher = PortableLauncher(file, file)
    launcher()                                             # no block

    input('system mode...')
    System(file, file)()                                   # blocks

    if sys.platform[:3] == 'win':
        input('DOS start mode...')                         # no block
        StartArgs(file, file)()

if __name__ == '__main__': selftest()

Near the end of the file, the module picks a default class based on the sys.platform attribute: PortableLauncher is set to a class that uses spawnv on Windows and one that uses the fork/exec combination elsewhere; in recent Pythons, we could probably just use the spawnv scheme on most platforms, but the alternatives in this module are used in additional contexts. If you import this module and always use its PortableLauncher attribute, you can forget many of the platform-specific details enumerated in this chapter.

To run a Python program, simply import the PortableLauncher class, make an instance by passing a label and command line (without a leading “python” word), and then call the instance object as though it were a function. The program is started by a call operation—by its __call__ operator-overloading method, instead of a normally named method—so that the classes in this module can also be used to generate callback handlers in tkinter-based GUIs. As we’ll see in the upcoming chapters, button-presses in tkinter invoke a callable object with no arguments; by registering a PortableLauncher instance to handle the press event, we can automatically start a new program from another program’s GUI. A GUI might associate a launcher with a GUI’s button press with code like this:

 Button(root, text=name, command=PortableLauncher(name, commandLine))
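
To see the call-operator idea in isolation, the following sketch swaps the GUI and the real launchers for stand-ins. None of it comes from launchmodes.py: LoggingLauncher only records the command lines it is handed instead of starting anything, and press is a hypothetical function that mimics what a GUI toolkit does when it invokes a registered handler with no arguments.

```python
class LoggingLauncher:
    """Hypothetical stand-in launcher: records commands instead of running them."""
    started = []                               # shared log, for demonstration only

    def __init__(self, label, command):
        self.what = label
        self.where = command

    def __call__(self):                        # invoked with no arguments
        self.announce(self.what)
        self.run(self.where)

    def announce(self, text):
        print('starting', text)

    def run(self, cmdline):
        self.started.append(cmdline)           # a real subclass would launch here

class QuietLauncher(LoggingLauncher):
    def announce(self, text):                  # override just one step
        pass

def press(callback):
    """Simulate a GUI toolkit invoking a registered handler with no arguments."""
    callback()

if __name__ == '__main__':
    press(LoggingLauncher('echo', 'echo.py'))
    press(QuietLauncher('echo', 'echo.py -v'))
    print(LoggingLauncher.started)
```

Because the instance retains its label and command line, the caller needs nothing more than a callable object, which is the same reason a launcher instance can be passed directly as a button's command option.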

When run standalone, this module’s selftest function is invoked as usual. As coded, System blocks the caller until the program exits, but PortableLauncher (really, Spawn or Fork) and Start do not:

C:\...\PP4E> type echo.py
print('Spam')
input('press Enter')

C:\...\PP4E> python launchmodes.py
default mode...
echo.py
system mode...
echo.py
Spam
press Enter
DOS start mode...
echo.py

As more practical applications, this file is also used in Chapter 8 to launch GUI dialog demos independently, and again in a number of Chapter 10’s examples, including PyDemos and PyGadgets—launcher scripts designed to run major examples in this book in a portable fashion, which live at the top of this book’s examples distribution directory. Because these launcher scripts simply import PortableLauncher and register instances to respond to GUI events, they run on both Windows and Unix unchanged (tkinter’s portability helps, too, of course). The PyGadgets script even customizes PortableLauncher to update a GUI label at start time:

class Launcher(launchmodes.PortableLauncher):    # use wrapped launcher class
    def announce(self, text):                    # customize to set GUI label
        Info.config(text=text)

We’ll explore these two client scripts, and others, such as Chapter 11’s PyEdit, after we start coding GUIs in Part III. Partly because of its role in PyEdit, this edition extends this module to automatically replace forward slashes with backward slashes in the script’s file path name. PyEdit uses forward slashes in some filenames because they are allowed in file opens on Windows, but some Windows launcher tools require the backslash form instead. Specifically, system, popen, and startfile in os require backslashes, but spawnv does not. PyEdit and others inherit the new pathname fix of fixWindowsPath here simply by importing and using this module’s classes; PyEdit eventually changed so as to make this fix irrelevant for its own use case (see Chapter 11), but other clients still acquire the fix for free.
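
The effect of the pathname fix is easy to try on any platform with a small variation on fixWindowsPath. This sketch is an adaptation, not the module's own code: it calls ntpath.normpath (the Windows flavor of os.path, importable everywhere) so that the slash conversion also happens on Unix, where os.path.normpath would leave forward slashes alone.

```python
import ntpath      # Windows flavor of os.path, importable on every platform

def fix_windows_path(cmdline):
    """Convert forward slashes to backslashes in the first word of cmdline only."""
    script, *args = cmdline.lstrip().split(' ')       # split on spaces
    return ' '.join([ntpath.normpath(script)] + args) # later words are untouched

if __name__ == '__main__':
    print(fix_windows_path('C:/PP4E/Gui/echo.py -out /tmp/log'))
```

Only the leading script path is converted; slashes in later arguments pass through unchanged. Note that splitting on spaces also means a script path that itself contains a space is only partly converted, so this scheme suits simple command lines best.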

Also notice how some of the classes in this example use the sys.executable path string to obtain the Python executable’s full path name. This is partly due to their role in user-friendly demo launchers. In prior versions that predated sys.executable, these classes instead called two functions exported by a module named Launcher.py to find a suitable Python executable, regardless of whether the user had added its directory to the system PATH variable’s setting.

This search is no longer required. Since I’ll describe this module’s other roles in the next chapter, and since this search has been largely precluded by Python’s perpetual pandering to programmers’ professional proclivities, I’ll postpone any pointless pedagogical presentation here. (Period.)

Other System Tools Coverage

That concludes our tour of Python system tools. In this and the prior three chapters, we’ve met most of the commonly used system tools in the Python library. Along the way, we’ve also learned how to use them to do useful things such as start programs, process directories, and so on. The next chapter wraps up this domain by using the tools we’ve just met to implement scripts that do useful and more realistic system-level work.

Still other system-related tools in Python appear later in this text. For instance:

  • Sockets, used to communicate with other programs and networks and introduced briefly here, show up again in Chapter 10 in a common GUI use case and are covered in full in Chapter 12.

  • Select calls, used to multiplex among tasks, are also introduced in Chapter 12 as a way to implement servers.

  • File locking with os.open, introduced in Chapter 4, is discussed again in conjunction with later examples.

  • Regular expressions, string pattern matching used by many text processing tools in the system administration domain, don’t appear until Chapter 19.

Moreover, things like forks and threads are used extensively in the Internet scripting chapters: see the discussion of threaded GUIs in Chapters 9 and 10; the server implementations in Chapter 12; the FTP client GUI in Chapter 13; and the PyMailGUI program in Chapter 14. Along the way, we’ll also meet higher-level Python modules, such as socketserver, which implement fork and thread-based socket server code for us. In fact, many of the last four chapters’ tools will pop up constantly in later examples in this book—about what one would expect of general-purpose portable libraries.

Last, but not necessarily least, I’d like to point out one more time that many additional tools in the Python library don’t appear in this book at all. With hundreds of library modules, more appearing all the time, and even more in the third-party domain, Python book authors have to pick and choose their topics frugally! As always, be sure to browse the Python library manuals and Web early and often in your Python career.



[12] To watch on Windows, click the Start button, select All Programs → Accessories → System Tools → Resource Monitor, and monitor CPU/Processor usage (Task Manager’s Performance tab may give similar results). The graph rarely climbed above single-digit percentages on my laptop machine while writing this footnote (at least until I typed while True: pass in a Python interactive session window…).

[13] At least in the current Python implementation, calling os.fork in a Python script actually copies the Python interpreter process (if you look at your process list, you’ll see two Python entries after a fork). But since the Python interpreter records everything about your running script, it’s OK to think of fork as copying your program directly. It really will if Python scripts are ever compiled to binary machine code.

[14] The _thread examples in this book now all use start_new_thread. This call is also available as _thread.start_new for historical reasons, but this synonym may be removed in a future Python release. As of Python 3.1, both names are still available, but the help documentation for start_new claims that it is obsolete; in other words, you should probably prefer the other if you care about the future (and this book must!).

[15] They cannot, however, be used to directly synchronize processes. Since processes are more independent, they usually require locking mechanisms that are more long-lived and external to programs. Chapter 4’s os.open call with an open flag of O_EXCL allows scripts to lock and unlock files and so is ideal as a cross-process locking tool. See also the synchronization tools in the multiprocessing and threading modules and the IPC section later in this chapter for other general synchronization ideas.

[16] But in case this means you, Python’s lock and condition variables are distinct objects, not something inherent in all objects, and Python’s Thread class doesn’t have all the features of Java’s. See Python’s library manual for further details.

[17] We will clarify the notions of “client” and “server” in the Internet programming part of this book. There, we’ll communicate with sockets (which we’ll see later in this chapter are roughly like bidirectional pipes for programs running both across networks and on the same machine), but the overall conversation model is similar. Named pipes (fifos), described ahead, are also a better match to the client/server model because they can be accessed by arbitrary, unrelated processes (no forks are required). But as we’ll see, the socket port model is generally used by most Internet scripting protocols—email, for instance, is mostly just formatted strings shipped over sockets between programs on standard port numbers reserved for the email protocol.
