Concurrency is the art of making a computer do (or appear to do) multiple things at once. Historically, this meant inviting the processor to switch between different tasks many times per second. In modern systems, it can also literally mean doing two or more things simultaneously on separate processor cores.
Concurrency is not inherently an object-oriented topic, but Python's concurrent systems are built on top of the object-oriented constructs we've covered throughout the module. This chapter will introduce you to the following topics:
Concurrency is complicated. The basic concepts are fairly simple, but the bugs that can occur are notoriously difficult to track down. However, for many projects, concurrency is the only way to get the performance we need. Imagine if a web server couldn't respond to a user's request until the previous one was completed! We won't be going into all the details of just how hard it is (another full module would be required) but we'll see how to do basic concurrency in Python, and some of the most common pitfalls to avoid.
Most often, concurrency is created so that work can continue happening while the program is waiting for I/O to happen. For example, a server can start processing a new network request while it waits for data from a previous request to arrive. An interactive program might render an animation or perform a calculation while waiting for the user to press a key. Bear in mind that while a person can type more than 500 characters per minute, a computer can perform billions of instructions per second. Thus, a ton of processing can happen between individual key presses, even when typing quickly.
It's theoretically possible to manage all this switching between activities within your program, but it would be virtually impossible to get right. Instead, we can rely on Python and the operating system to take care of the tricky switching part, while we create objects that appear to be running independently, but simultaneously. These objects are called threads; in Python they have a very simple API. Let's take a look at a basic example:
from threading import Thread


class InputReader(Thread):
    def run(self):
        self.line_of_text = input()


print("Enter some text and press enter: ")
thread = InputReader()
thread.start()

count = result = 1
while thread.is_alive():
    result = count * count
    count += 1

print("calculated squares up to {0} * {0} = {1}".format(
    count, result))
print("while you typed '{}'".format(thread.line_of_text))
This example runs two threads. Can you see them? Every program has one thread, called the main thread. The code that executes from the beginning is happening in this thread. The second thread, more obviously, exists as the InputReader class.
To construct a thread, we must extend the Thread class and implement the run method. Any code inside the run method (or that is called from within that method) is executed in a separate thread.
The new thread doesn't start running until we call the start() method on the object. In this case, the thread immediately pauses to wait for input from the keyboard. In the meantime, the original thread continues executing at the point start was called. It starts calculating squares inside a while loop. The condition in the while loop checks if the InputReader thread has exited its run method yet; once it does, it outputs some summary information to the screen.
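As an aside, subclassing isn't the only option: the standard library's Thread constructor also accepts a target callable, which is often simpler for one-off tasks. A minimal sketch (the record function and its argument are illustrative):

```python
from threading import Thread

results = []

def record(value):
    # This function body runs in the new thread once start() is called
    results.append(value * 2)

# Passing a target callable and its arguments instead of subclassing Thread
thread = Thread(target=record, args=(21,))
thread.start()
thread.join()  # wait for the thread to finish before reading its result
print(results)  # [42]
```

Either style works; subclassing is convenient when the thread needs to carry its own state, as in the examples in this chapter.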
If we run the example and type the string "hello world", the output looks as follows:
Enter some text and press enter: hello world
calculated squares up to 1044477 * 1044477 = 1090930114576
while you typed 'hello world'
You will, of course, calculate more or fewer squares while typing the string; the numbers depend on both our relative typing speeds and the processor speeds of the computers we are running.
A thread only starts running in concurrent mode when we call the start method. If we want to take out the concurrent call to see how it compares, we can call thread.run() in the place that we originally called thread.start(). The output is telling:
Enter some text and press enter: hello world
calculated squares up to 1 * 1 = 1
while you typed 'hello world'
In this case, the thread never becomes alive and the while loop never executes. We wasted a lot of CPU power sitting idle while we were typing.
There are a lot of different patterns for using threads effectively. We won't be covering all of them, but we will look at a common one so we can learn about the join method. Let's check the current temperature in the capital city of every province in Canada:
from threading import Thread
import json
from urllib.request import urlopen
import time

CITIES = [
    'Edmonton', 'Victoria', 'Winnipeg', 'Fredericton',
    "St. John's", 'Halifax', 'Toronto', 'Charlottetown',
    'Quebec City', 'Regina'
]


class TempGetter(Thread):
    def __init__(self, city):
        super().__init__()
        self.city = city

    def run(self):
        url_template = (
            'http://api.openweathermap.org/data/2.5/'
            'weather?q={},CA&units=metric')
        response = urlopen(url_template.format(self.city))
        data = json.loads(response.read().decode())
        self.temperature = data['main']['temp']


threads = [TempGetter(c) for c in CITIES]
start = time.time()
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
for thread in threads:
    print(
        "it is {0.temperature:.0f}°C in {0.city}".format(thread))
print(
    "Got {} temps in {} seconds".format(
        len(threads), time.time() - start))
This code constructs 10 threads before starting them. Notice how we can override the constructor to pass data into the Thread object, remembering to call super() to ensure the Thread is properly initialized. Pay attention to this: the new thread isn't running yet, so the __init__ method is still executing from inside the main thread. Data we construct in one thread is accessible from other running threads.
After the 10 threads have been started, we loop over them again, calling the join() method on each. This method essentially says "wait for the thread to complete before doing anything". We call this ten times in sequence; the for loop won't exit until all ten threads have completed.
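It may be worth noting that join also accepts an optional timeout in seconds, after which it returns whether or not the thread has finished; we can then check is_alive to see what happened. A small sketch (the sleep stands in for real work; the durations are arbitrary):

```python
import time
from threading import Thread

def slow_task():
    # Stand-in for a long-running job
    time.sleep(0.5)

thread = Thread(target=slow_task)
thread.start()

# Wait at most 0.1 seconds; the thread is still sleeping when this returns
thread.join(timeout=0.1)
print(thread.is_alive())  # True

# Without a timeout, join blocks until the thread completes
thread.join()
print(thread.is_alive())  # False
```

This pattern is handy when we want to report progress or abandon a stalled worker rather than block forever.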
At this point, we can print the temperature that was stored on each thread object. Notice once again that we can access data that was constructed within the thread from the main thread. In threads, all state is shared by default.
Executing this code on my 100 megabit connection takes about two-tenths of a second:
it is 5°C in Edmonton
it is 11°C in Victoria
it is 0°C in Winnipeg
it is -10°C in Fredericton
it is -12°C in St. John's
it is -8°C in Halifax
it is -6°C in Toronto
it is -13°C in Charlottetown
it is -12°C in Quebec City
it is 2°C in Regina
Got 10 temps in 0.18970298767089844 seconds
If we run this code in a single thread (by changing the start() call to run() and commenting out the join() call), it takes closer to 2 seconds because each 0.2 second request has to complete before the next one begins. This tenfold speedup shows just how useful concurrent programming can be.
Threads can be useful, especially in other programming languages, but modern Python programmers tend to avoid them for several reasons. As we'll see, there are other ways to do concurrent programming that are receiving more attention from the Python developers. Let's discuss some of these pitfalls before moving on to more salient topics.
The main problem with threads is also their primary advantage. Threads have access to all the memory and thus all the variables in the program. This can too easily cause inconsistencies in the program state. Have you ever encountered a room where a single light has two switches and two different people turn them on at the same time? Each person (thread) expects their action to turn the lamp (a variable) on, but the resulting value (the lamp is off) is inconsistent with those expectations. Now imagine if those two threads were transferring funds between bank accounts or managing the cruise control in a vehicle.
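The light-switch analogy can be made concrete with a small sketch. Here the sleep forces an ill-timed context switch between reading and writing the shared variable, so both threads read the same starting value and one update is lost (the names and timing are illustrative, not a real-world pattern):

```python
import time
from threading import Thread

counter = 0  # shared state, like the lamp in the analogy

def unsafe_increment():
    global counter
    current = counter      # read the shared value
    time.sleep(0.05)       # simulate a context switch mid-update
    counter = current + 1  # write back, clobbering any concurrent update

threads = [Thread(target=unsafe_increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads read 0 before either wrote, so one increment is lost
print(counter)  # 1, not the expected 2
```

In real code the window between the read and the write is tiny, which is exactly why such bugs appear rarely and unpredictably rather than on every run.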
The solution to this problem in threaded programming is to "synchronize" access to any code that reads or writes a shared variable. There are a few different ways to do this, but we won't go into them here so we can focus on more Pythonic constructs. The synchronization solution works, but it is way too easy to forget to apply it. Worse, bugs due to inappropriate use of synchronization are really hard to track down because the order in which threads perform operations is inconsistent. We can't easily reproduce the error. Usually, it is safest to force communication between threads to happen using a lightweight data structure that already uses locks appropriately. Python offers the queue.Queue class to do this; its functionality is basically the same as the multiprocessing.Queue that we will discuss in the next section.
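A minimal producer/consumer sketch with queue.Queue might look like this. The Queue does its own locking internally, so the worker and the main thread never touch a shared variable directly; the None sentinel used to shut the worker down is one common convention, not part of the Queue API itself:

```python
import queue
from threading import Thread

work = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        item = work.get()         # blocks until an item is available
        if item is None:          # sentinel value: time to stop
            break
        results.put(item * item)  # Queue handles its own locking

thread = Thread(target=worker)
thread.start()

for n in range(5):
    work.put(n)
work.put(None)  # tell the worker to finish
thread.join()

squares = []
while not results.empty():
    squares.append(results.get())
print(sorted(squares))  # [0, 1, 4, 9, 16]
```

All coordination happens through the queues, which is exactly the "communicate instead of sharing" style the paragraph above recommends.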
In some cases, these disadvantages might be outweighed by the one advantage of allowing shared memory: it's fast. If multiple threads need access to a huge data structure, shared memory can provide that access quickly. However, this advantage is usually nullified by the fact that, in Python, it is impossible for two threads running on different CPU cores to be performing calculations at exactly the same time. This brings us to our second problem with threads.
In order to efficiently manage memory, garbage collection, and calls to machine code in libraries, Python has a utility called the global interpreter lock, or GIL. It's impossible to turn off, and it means that threads are useless in Python for one thing that they excel at in other languages: parallel processing. The GIL's primary effect, for our purposes, is to prevent any two threads from doing work at the exact same time, even if they have work to do. In this case, "doing work" means using the CPU, so it's perfectly OK for multiple threads to access the disk or network; the GIL is released as soon as a thread starts to wait for something.
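We can see the GIL's effect on CPU-bound work with a rough sketch: two calls to a pure-Python loop take about as long in two threads as they do one after the other, because only one thread can execute Python bytecode at a time. Exact timings will vary by machine, and experimental free-threaded CPython builds behave differently, so treat this as an illustration rather than a benchmark:

```python
import time
from threading import Thread

def cpu_bound(n=2_000_000):
    # Pure Python arithmetic: the thread holds the GIL the whole time
    total = 0
    for i in range(n):
        total += i
    return total

start = time.time()
cpu_bound()
cpu_bound()
sequential = time.time() - start

threads = [Thread(target=cpu_bound) for _ in range(2)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.time() - start

# On a standard CPython build, the threaded timing is typically
# no better than the sequential one
print(sequential, threaded)
```

Contrast this with the temperature example earlier, where the threads spent their time waiting on the network with the GIL released, and threading delivered a real speedup.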
The GIL is much disparaged, mostly by people who don't understand what it is or all the benefits it brings to Python. It would definitely be nice if our language didn't have this restriction, but the developers of Python's reference implementation have determined that, for now at least, it brings more value than it costs. It makes the reference implementation easier to maintain and develop, and during the single-core processor days when Python was originally developed, it actually made the interpreter faster. The net result of the GIL, however, is that it limits the benefits that threads bring us, without alleviating the costs.
One final limitation of threads as compared to the asynchronous system we will be discussing later is the cost of maintaining the thread. Each thread takes up a certain amount of memory (both in the Python process and the operating system kernel) to record the state of that thread. Switching between the threads also uses a (small) amount of CPU time. This work happens seamlessly without any extra coding (we just have to call start() and the rest is taken care of), but the work still has to happen somewhere.
This can be alleviated somewhat by structuring our workload so that threads can be reused to perform multiple jobs. Python provides a ThreadPool feature to handle this. It is shipped as part of the multiprocessing library and behaves identically to the ProcessPool that we will discuss shortly, so let's defer discussion until the next section.