Chapter 13. Concurrency

Concurrency, together with one of its manifestations, parallel processing, is one of the broadest topics in software engineering. Most chapters in this book cover vast areas too, and almost any of them could fill a separate book. But concurrency is so large a topic that dozens of books could be written about it and we would still not cover all of its important aspects and models.

This is why I will state from the very beginning that we will barely scratch the surface of this topic. The purpose of this chapter is to show why concurrency may be required in your application, when to use it, and which concurrency models are the most important ones available in Python:

  • Multithreading
  • Multiprocessing
  • Asynchronous programming

We will also discuss some of the language features, built-in modules, and third-party packages that allow you to implement these models in your code. But we won't cover them in much detail. Treat the content of this chapter as an entry point for your further research and reading. It is here to guide you through the basic ideas and help in deciding if you really need concurrency, and if so, which approach will best suit your needs.

Why concurrency?

Before we answer the question of why we need concurrency, we need to ask what concurrency is in the first place.

The answer to the second question may be surprising to those who think of concurrency as a synonym for parallel processing. Concurrency is not the same as parallelism. It is not a matter of how an application is implemented but a property of a program, algorithm, or problem. Parallelism is only one of the possible approaches to problems that are concurrent.

Leslie Lamport, in his 1978 paper Time, Clocks, and the Ordering of Events in a Distributed System, says:

"Two events are concurrent if neither can causally affect the other."

By extrapolating events to programs, algorithms, or problems, we can say that something is concurrent if it can be fully or partially decomposed into components (units) that are order-independent. Such units may be processed independently from each other, and the order of processing does not affect the final result. This means that they can also be processed simultaneously or in parallel. If we process information this way, then we are indeed dealing with parallel processing. But this is still not obligatory.
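Order independence can be illustrated with a minimal sketch. The problem (a sum of squares), the chunk size, and the helper names process_unit and split are assumptions chosen for illustration only; they do not come from any library. The units are processed once in their natural order and once in a random order, and the final result is the same:

```python
import random


def process_unit(chunk):
    """Process one independent unit: sum the squares of its numbers."""
    return sum(n * n for n in chunk)


def split(data, size):
    """Decompose the problem into units of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


data = list(range(1, 101))
units = split(data, 10)

# Process the units in their natural order.
in_order = sum(process_unit(unit) for unit in units)

# Process the same units in a random order.
shuffled = units[:]
random.shuffle(shuffled)
out_of_order = sum(process_unit(unit) for unit in shuffled)

print(in_order == out_of_order)  # True
```

Because no unit depends on the result of another, nothing in this sketch would change if the units were handed to separate workers instead of a single loop.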

Doing work in a distributed manner, preferably using the capabilities of multicore processors or computing clusters, is a natural consequence of concurrent problems. However, this does not mean that it is the only way of dealing with concurrency efficiently. There are many use cases where concurrent problems can be approached in ways other than synchronous processing, without the need for parallel execution.

So, once we know what concurrency really is, it is time to explain what the fuss is about. When the problem is concurrent, it gives you the opportunity to deal with it in a special, preferably more efficient, way.

We are used to dealing with problems in a classical way, by performing a sequence of steps. This is how most of us think and process information: using synchronous algorithms that do one thing at a time, step by step. But this way of processing information is not well suited to solving large-scale problems or to satisfying the demands of multiple users or software agents simultaneously:

  • The time to process the job is limited by the performance of the single processing unit (single machine, CPU core, and so on)
  • You are not able to accept and process new inputs until your program has finished processing the previous one

So, generally, handling concurrent problems concurrently is the best approach when:

  • The scale of problems is so big that the only way to process them in an acceptable time or within the range of available resources is to distribute execution to multiple processing units that can handle the work in parallel
  • Your application needs to maintain responsiveness (accept new inputs) even if it has not finished processing the old ones

This covers most of the situations where concurrent processing is a reasonable option. The first group of problems definitely needs a parallel processing solution, so it is usually solved with the multithreading and multiprocessing models. The second group does not necessarily need to be processed in parallel, so the actual solution depends on the details of the problem. Note that this group also covers cases where the application needs to serve multiple clients (users or software agents) independently, without any of them having to wait for the others to be served.
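The second group can be sketched without any parallel execution at all. In the following minimal example, which uses the standard asyncio module, the serve_client coroutine and the client names are hypothetical, and asyncio.sleep stands in for waiting on real I/O. Three clients are served on a single thread, and none of them waits for the others to finish:

```python
import asyncio
import time


async def serve_client(name, delay):
    """Pretend to serve one client; the sleep stands in for waiting on I/O."""
    await asyncio.sleep(delay)
    return f"{name} served"


async def main():
    # All three clients are served concurrently on a single thread.
    return await asyncio.gather(
        serve_client("client-1", 0.2),
        serve_client("client-2", 0.2),
        serve_client("client-3", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

The total time should be close to the longest single wait rather than the sum of all three, even though nothing here runs in parallel.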

The other thing worth mentioning is that the preceding two groups are not mutually exclusive. Very often you need to maintain application responsiveness and, at the same time, you are not able to handle the input on a single processing unit. This is the reason why different and seemingly alternative or conflicting approaches to concurrency may often be used at the same time. This is especially common in the development of web servers, where it may be necessary to use asynchronous event loops, or threads in conjunction with multiple processes, in order to utilize all the available resources and still maintain low latencies under high load.
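One possible way of mixing approaches is sketched below, assuming an asyncio event loop that hands blocking jobs to a pool of workers through loop.run_in_executor. The heavy_job and handle_request functions are hypothetical. A thread pool is used here only to keep the sketch short and portable; for CPU-bound work a process pool would be the usual substitute:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def heavy_job(n):
    """Hypothetical blocking job that would otherwise stall the event loop."""
    return sum(i * i for i in range(n))


async def handle_request(loop, pool, n):
    # Offload the blocking call so the loop stays free to accept other requests.
    return await loop.run_in_executor(pool, heavy_job, n)


async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        return await asyncio.gather(
            *(handle_request(loop, pool, n) for n in (10, 100, 1000))
        )


totals = asyncio.run(main())
print(totals)  # [285, 328350, 332833500]
```

The event loop keeps the application responsive, while the pool decides how many jobs may be in progress at once.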
