18. Multithreading

Two significant trends of the past decade have had an enormous effect on the field of software development. First, the continued decrease in the cost of performing computations is no longer driven by increases in clock speed and transistor density, as illustrated by Figure 18.1. Rather, the cost of computation is now falling because it is economical to make hardware that has multiple CPUs.


FIGURE 18.1: Clock Speeds over Time
(Graph compiled by Herb Sutter. Used with permission. Original at www.gotw.ca.)

Second, computations now routinely involve enormous latency. Latency is, simply put, the amount of time required to obtain a desired result. There are two principal causes of latency. Processor-bound latency occurs when the computational task is complex; if a computation requires performing 12 billion arithmetic operations and the total processing power available is only 6 billion operations per second, at least 2 seconds of processor-bound latency will be incurred between asking for the result and obtaining it. I/O-bound latency, by contrast, is latency incurred by the need to obtain data from an external source such as a disk drive, web server, and so on. Any computation that requires fetching data from a web server physically located far from the client machine will incur latency equivalent to millions of processor cycles.

These two trends together create an enormous challenge for modern software developers. Given that machines have more computing power than ever, how are we to make effective use of that power to deliver results to the user quickly, and without compromising on the user experience? How do we avoid creating frustrating user interfaces that freeze up when a high-latency operation is triggered? Moreover, how do we go about splitting CPU-bound work among multiple processors to decrease the time required for the computation?

The standard technique for engineering software that keeps the user interface responsive and CPU utilization high is to write multithreaded programs that do multiple computations “in parallel.” Unfortunately, multithreading logic is notoriously difficult to get right; we’ll spend the next two chapters exploring what makes multithreading difficult, and learning how to use higher-level abstractions and new language features to ease that burden.

The higher-level abstractions we’ll discuss are, first, the two principal components of the Parallel Extensions library that was released with .NET 4.0¹—the Task Parallel Library (TPL) and Parallel LINQ (PLINQ)—and second, the Task-based Asynchronous Pattern (TAP) and its accompanying language support in C# 5.0. Although we strongly encourage you to use these higher-level abstractions, we will also cover some of the lower-level threading APIs from previous versions of the .NET runtime in this chapter. Additional multithreading patterns from before C# 5.0 are available for download at http://IntelliTect.com/EssentialCSharp, along with the chapters from Essential C# 3.0; if you want to fully understand multithreaded programming without the later features, you still have access to that material.

1. These libraries are available in .NET 3.5 by downloading the Reactive Extensions library for .NET 3.5, but this is not officially supported.

We’ll start this chapter with a few beginner topics in case you are new to multithreading. Then we’ll briefly discuss “traditional” thread manipulation without using the Parallel Extensions libraries to ensure that you have a basic understanding of thread manipulation; the following chapter goes into more details on that topic. We’ll then spend most of this chapter covering the TPL, TAP, and PLINQ, in that order.

Multithreading Basics

There is a lot of confusing jargon associated with multithreading, so let’s define a few terms.

A CPU (central processing unit) or core² is the unit of hardware that actually executes a given program. Every machine has at least one CPU, though today multiple-CPU machines are common. Many modern CPUs support simultaneous multithreading (which Intel trademarks as Hyper-Threading), a mode where a single CPU can appear as multiple “virtual” CPUs.

2. Technically we ought to say that “CPU” always refers to the physical chip and “core” may refer to a physical or virtual CPU. This distinction is unimportant for the purposes of this book, so we will use these terms interchangeably.

A process is a currently executing instance of a given program; the fundamental purpose of the operating system is to manage processes. Each process contains one or more threads. A process is represented by an instance of the Process class in the System.Diagnostics namespace.
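For example, the following minimal sketch (ours, not one of this chapter’s numbered listings) uses the Process class to inspect the process in which it is running:

using System;
using System.Diagnostics;

public class ProcessInfo
{
  public static void Main()
  {
      // Retrieve the process in which this code is executing.
      Process current = Process.GetCurrentProcess();
      Console.WriteLine(
          $"Process { current.Id } contains { current.Threads.Count } thread(s).");
  }
}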

C# programming at the level of statements and expressions is fundamentally about describing flow of control, and thus far in this book we’ve made the implicit assumption that a given program has only a single “point of control.” You can imagine the point of control as being a cursor that enters the text of your program at the Main method when you start it up, and then moves around the program as the various conditions, loops, method calls, and so on, are executed. A thread is this point of control. A thread is represented by an instance of the System.Threading.Thread class and the API for manipulating a Thread is in the same System.Threading namespace.

A single-threaded program is one in which there is only one thread in the process. A multithreaded program has two or more threads in the process.

A piece of code is said to be thread safe if it behaves correctly when used in a multithreaded program. The threading model of a piece of code is the set of requirements that the code places upon its caller in exchange for guaranteeing thread safety. (For example, the threading model of many classes is “static methods may be called from any thread but instance methods may be called only from the thread that allocated the instance.”)

A task is a unit of potentially high-latency work that produces a resultant value or desired side effect. The distinction between tasks and threads is as follows: A task represents a job that needs to be performed, whereas a thread represents the worker that does the job. A task that is useful only for its side effects is represented by an instance of the Task class. A task used to produce a value of a given type is represented by the Task<T> class, which derives from the nongeneric Task type. These can be found in the System.Threading.Tasks namespace.

A thread pool is a collection of threads, along with logic for determining how to assign work to those threads. When your program has a task to perform, it can obtain a worker thread from the pool, assign it the task, and then return the thread to the pool when the work completes, thereby making it available the next time additional work is requested.


Working with System.Threading

The Parallel Extensions library is extraordinarily useful because it allows you to manipulate a higher-level abstraction, the task, rather than working directly with threads. However, you might need to work with code written before the TPL and PLINQ were available (prior to .NET 4.0), or you might have a programming problem not directly addressed by them. In this section, we briefly cover some of the basic underlying APIs for directly manipulating threads.

Asynchronous Operations with System.Threading.Thread

The operating system implements threads and provides various unmanaged APIs to create and manage those threads. The CLR wraps these unmanaged threads and exposes them in managed code via the System.Threading.Thread class, an instance of which represents a “point of control” in the program. As mentioned earlier, you can think of a thread as a “worker” that independently follows the instructions that make up your program.

Listing 18.1 provides an example. The independent point of control is represented by an instance of Thread that runs concurrently. A thread needs to know which code to run when it starts up, so its constructor takes a delegate that refers to the code that is to be executed. In this case we convert a method group, DoWork, to the appropriate delegate type, ThreadStart. We then start the thread running by calling Start(). While the new thread is running, the main thread attempts to print 1,000 hyphens to the console. We then instruct the main thread to wait for the worker thread to complete its work by calling Join(). The result is shown in Output 18.1.

LISTING 18.1: Starting a Method Using System.Threading.Thread


using System;
using System.Threading;                                                

public class RunningASeparateThread
{
  public const int Repetitions = 1000;

  public static void Main()
  {
      ThreadStart threadStart = DoWork;                                
      Thread thread = new Thread(threadStart);                         
      thread.Start();                                                  
      for(int count = 0; count < Repetitions; count++)
      {
          Console.Write('-');
      }
      thread.Join();                                                   
  }

  public static void DoWork()
  {
      for(int count = 0; count < Repetitions; count++)
      {
          Console.Write('+');
      }
  }
}


OUTPUT 18.1

++++++++++++++++++++++++++++++++----------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++-------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------------------------------------------------+++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++------------------------------------------------------
------------------------------------------------------------------------
-----------------------------------------------+++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++

As you can see, the threads appear to be taking turns executing, each printing out a few hundred characters before the context switches. The two loops are running “in parallel,” rather than the first one running to completion before the second one begins, as it would if the delegate had been executed synchronously.

For code to run under the context of a different thread, you need a delegate of type ThreadStart or ParameterizedThreadStart to identify the code to execute. (The latter allows for a single parameter of type object; both are found in the System.Threading namespace.) Given a Thread instance created using the thread-start delegate constructor, you can start the thread executing with a call to thread.Start(). (Listing 18.1 creates a variable of type ThreadStart explicitly to show the delegate type in the source code. The method group DoWork could have been passed directly to the thread constructor.) The call to Thread.Start() tells the operating system to begin concurrent execution of the new thread; control on the main thread immediately returns from the call and executes the for loop in the Main() method. The threads are now independent, and neither waits for the other until the call to Join().
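For example, here is a minimal sketch (again ours, rather than one of the numbered listings) that uses ParameterizedThreadStart to pass the character to be printed into the worker:

using System;
using System.Threading;

public class ParameterizedThreadExample
{
  public static void Main()
  {
      // DoWork(object) converts to ParameterizedThreadStart.
      Thread thread = new Thread(DoWork);
      thread.Start('+');  // The argument is boxed and passed to DoWork().
      thread.Join();
  }

  public static void DoWork(object state)
  {
      Console.Write((char)state);
  }
}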

Thread Management

Threads include a number of methods and properties for managing their execution. Here are some of the basic ones, followed by a short sketch that exercises several of them:

As we saw in Listing 18.1, you can cause one thread to wait for another with Join(). This tells the operating system to suspend execution of the current thread until the other thread is terminated. The Join() method is overloaded to take either an int or a TimeSpan to support a maximum time to wait for thread completion before continuing execution.

By default, a new thread is a “foreground” thread; the operating system will terminate a process when all its foreground threads are complete. You can mark a thread as a “background” thread by setting the IsBackground property to true. The operating system will then allow the process to be terminated even if the background thread is still running. However, it is still a good idea to ensure that all threads exit cleanly before the process exits, rather than being aborted; see the section on thread aborting later in this chapter for more details.

Every thread has an associated priority, which you can change by setting the Priority property to a new ThreadPriority enum value. The possible values are Lowest, BelowNormal, Normal, AboveNormal, and Highest. The operating system prefers to schedule time slices to higher-priority threads. Be careful; if you set the priorities incorrectly, you can end up with “starvation” situations where one high-priority thread prevents many low-priority threads from ever running.

If you simply want to know whether a thread is still “alive” or has finished all of its work, you can use the Boolean IsAlive property. A more informative picture of a thread’s state is accessible through the ThreadState property. The ThreadState enum values are Aborted, AbortRequested, Background, Running, Stopped, StopRequested, Suspended, SuspendRequested, Unstarted, and WaitSleepJoin. These are flags; some of these values can be combined.
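The following sketch, provided for illustration only, exercises several of these members together:

using System;
using System.Threading;

public class ThreadManagementExample
{
  public static void Main()
  {
      Thread thread = new Thread(DoWork)
      {
          // A background thread will not keep the process alive,
          // so the process may exit before DoWork() finishes.
          IsBackground = true,
          // A hint to the operating system's scheduler.
          Priority = ThreadPriority.BelowNormal
      };
      thread.Start();

      // Join() returns false if the timeout elapses before
      // the thread terminates.
      if(!thread.Join(TimeSpan.FromSeconds(1)))
      {
          Console.WriteLine($"IsAlive: { thread.IsAlive }");
          Console.WriteLine($"ThreadState: { thread.ThreadState }");
      }
  }

  public static void DoWork()
  {
      // Simulate high-latency work; as discussed in the next
      // section, Sleep() is acceptable in demonstration code.
      Thread.Sleep(5000);
  }
}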

There are two commonly used, and commonly abused, methods for controlling threads that deserve to be discussed in their own sections: Sleep() and Abort().

Do Not Put Threads to Sleep in Production Code

The static Thread.Sleep() method puts the current thread to sleep, essentially telling the operating system to not schedule any time slices to this thread until the given amount of time has passed. A single parameter—either a number of milliseconds or a TimeSpan—specifies how long the operating system will wait before continuing execution. While it is waiting, the operating system will, of course, schedule time slices for any other threads that might be waiting their turn to execute. This might sound like a sensible thing to do, but it is a “bad code smell” that indicates the design of the program could probably be better.

Threads are often put to sleep to try to synchronize a thread with some event in time. However, the operating system does not guarantee any level of precision in its timing. That is, if you say, “Put me to sleep for 123 milliseconds,” the operating system will put the thread to sleep for at least 123 milliseconds, and possibly much longer. The actual amount of time between the thread going to sleep and then waking up again is not deterministic and can be arbitrarily long. Do not attempt to use Thread.Sleep() as a high-precision timer, because it is not.

Worse, Thread.Sleep() is often used as a “poor man’s synchronization system.” That is, if you have some unit of asynchronous work, and the current thread cannot proceed until that work is done, you might be tempted to put the thread to sleep for much longer than you think the asynchronous work will take, in the hopes that it will be finished when the current thread wakes up. This is a bad idea: Asynchronous work, by its very nature, can take longer than you think. Use proper thread synchronization mechanisms, described in the next chapter, to synchronize threads. (We’ll give an example of this sort of abuse in Listing 18.2.)

Putting a thread to sleep is also a bad programming practice because it means that the sleeping thread is, obviously, unresponsive to attempts to run code on it. If you put the main thread of a Windows application to sleep, that thread will no longer be processing messages from the user interface, and will therefore appear to be hung.

More generally, putting a thread to sleep is a bad programming practice because the whole point of allocating an expensive resource like a thread is to get work out of that resource. You wouldn’t pay an employee to sleep, so do not pay the price of allocating an expensive thread only to put it to sleep for millions or billions of processor cycles.

That said, there are some valid uses of Thread.Sleep(). First, putting a thread to sleep with a time delay of zero tells the operating system “the current thread is politely giving up the rest of its quantum to another thread if there is one that can use it.” The polite thread will then be scheduled normally, without any further delay. Second, Thread.Sleep() is commonly used in test code to simulate a thread that is working on some high-latency operation without actually having to burn a processor doing some pointless arithmetic. Other uses in production code should be reviewed carefully to ensure that there is not a better way to obtain the desired effect.
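For example (illustrative only):

Thread.Sleep(0);    // Politely yield the rest of this thread's quantum.
Thread.Sleep(5000); // In test code only: simulate a high-latency operation.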

In task-based asynchronous programming in C# 5, you can use the await operator on the result of the Task.Delay() method to introduce an asynchronous delay without blocking the current thread. See the “Timers” section in Chapter 19 for further detail.
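As a minimal sketch (assuming .NET 4.5 and C# 5.0), the following method pauses for one second without blocking the calling thread:

using System;
using System.Threading.Tasks;

public class DelayExample
{
  public static async Task PauseAsync()
  {
      Console.WriteLine("Before the delay...");
      // Unlike Thread.Sleep(), this does not block the thread;
      // the thread is released while the delay elapses.
      await Task.Delay(TimeSpan.FromSeconds(1));
      Console.WriteLine("...after the delay.");
  }
}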


Guidelines

AVOID calling Thread.Sleep() in production code.


Do Not Abort Threads in Production Code

The Thread object has an Abort() method that, when executed, attempts to destroy the thread. It does so by causing the runtime to throw a ThreadAbortException in the thread; this exception can be caught, but even if it is caught and ignored, it is automatically rethrown to try to ensure that the thread is, in fact, destroyed. There are many reasons why it is a very bad idea to attempt to abort a thread. Here are some of them:

The method promises only to try to abort the thread; there is no guarantee that it will succeed. For example, the runtime will not attempt to cause a ThreadAbortException if the point of control of the thread is currently inside a finally block (because critical cleanup code could be running right now and should not be interrupted) or is in unmanaged code (because doing so could corrupt the CLR itself). Rather, the CLR defers throwing the exception until control leaves the finally block or returns to managed code. But there is no guarantee that this ever happens. The thread being aborted might contain an infinite loop inside a finally block. (Ironically, the fact that the thread has an infinite loop might be the reason you are attempting to abort it in the first place.)

The aborted thread might be in critical code protected by a lock statement. (See Chapter 19 for details.) Unlike a finally block, a lock will not prevent the exception. The critical code will be interrupted halfway through by the exception, and the lock object will be automatically released, allowing other code that is waiting on the lock object to enter the critical section and observe the state of the halfway-executed code. The whole point of locking is to prevent that scenario, so aborting a thread can transform what looks like thread-safe code into dangerously incorrect code.

The CLR guarantees that its internal data structures will never be corrupted if a thread is aborted, but the BCL does not make this guarantee. Aborting a thread can leave any of your data structures or the BCL’s data structures in an arbitrarily bad state if the exception is thrown at the wrong time. Code running on other threads, or in the finally blocks of the aborted thread, can see this corrupted state and crash or behave badly.

In short, you should abort a thread only as a last resort; ideally, do so only as part of a larger emergency shutdown in which the entire AppDomain or the entire process is being destroyed. Fortunately, task-based asynchrony uses a more robust and safe cooperative cancellation pattern to terminate a “thread” whose results are no longer needed, as discussed in the next major section, “Asynchronous Tasks.”


Guidelines

AVOID aborting a thread in production code; doing so will yield unpredictable results and can destabilize a program.


Thread Pooling

As we discussed earlier, in the Beginner Topic titled “Performance Considerations,” it is possible for an excess of threads to negatively impact performance. Threads are expensive resources, thread context switching is not free, and running two jobs in simulated parallelism via time slicing can be hugely slower than running them one after the other.

To mitigate these problems, the BCL provides a thread pool. Instead of allocating threads directly, you can tell the thread pool what work you want performed. When the work is finished, rather than the thread terminating and being destroyed, it is returned to the pool, saving the cost of allocating a new thread when more work comes along. Listing 18.2 shows how to do the same thing as Listing 18.1, but this time with a pooled thread.

LISTING 18.2: Using ThreadPool Instead of Instantiating Threads Explicitly


using System;
using System.Threading;

public class Program
{
  public const int Repetitions = 1000;
  public static void Main()
  {
      ThreadPool.QueueUserWorkItem(DoWork, '+');                        

      for(int count = 0; count < Repetitions; count++)
      {
          Console.Write('-');
      }

      // Pause until the thread completes.
      // This is for illustrative purposes; do not
      // use Thread.Sleep for synchronization in
      // production code.
      Thread.Sleep(1000);                                               
  }
  public static void DoWork(object state)                               
  {
      for(int count = 0; count < Repetitions; count++)
      {
          Console.Write(state);
      }
  }
}


The output of Listing 18.2 is similar to Output 18.1—that is, an intermingling of plus signs and hyphens. If we had a lot of different jobs to perform asynchronously, this pooling technique would provide more efficient execution on single-processor and multiprocessor computers. The efficiency is achieved by reusing threads over and over, rather than reconstructing them for every asynchronous call. Unfortunately, thread pool use is not without its pitfalls: There are still performance and synchronization problems to consider when using a thread pool.

To make efficient use of processors, the thread pool assumes that all the work you schedule on the thread pool will finish in a timely manner so that the thread can be returned to the thread pool and reused by another task. The thread pool also assumes that all the work will be of a relatively short duration (that is, consuming milliseconds or seconds of processor time, not hours or days). By making this assumption, it can ensure that each processor is working full out on a task, and not inefficiently time-slicing multiple tasks, as described in the Beginner Topic on performance. The thread pool attempts to prevent excessive time slicing by ensuring that thread creation is “throttled” so that no one processor is “oversubscribed” with too many threads. The downside is that if all the threads in the pool are consumed by long-running or I/O-bound work, execution of any queued-up work will be delayed.

Unlike Thread and Task, which are objects that you can manipulate directly, the thread pool does not provide a reference to the thread used to execute a given piece of work. This prevents the calling thread from synchronizing with, or controlling, the worker thread via the thread management functions described earlier in the chapter. In Listing 18.2 we use the “poor man’s synchronization” that we earlier discouraged; this would be a bad idea in production code because we do not actually know how long the work will take to complete.

In short, the thread pool does its job well, but that job does not include providing services to deal with long-running jobs or jobs that need to be synchronized with the main thread or with one another. What we really need to do is build a higher-level abstraction that can use threads and thread pools as an implementation detail; that abstraction is implemented by the Task Parallel Library, which is the topic of most of the rest of this chapter.

For more details on other techniques for managing worker threads that were commonly used prior to .NET 4, see the Essential C# 3.0 multithreading chapters at IntelliTect.com/EssentialCSharp.


Guidelines

DO use the thread pool to efficiently assign processor time to processor-bound tasks.

AVOID allocating a pooled worker thread to a task that is I/O bound or long-running; use TPL instead.



Asynchronous Tasks

Multithreaded programming includes the following complexities:

1. Monitoring the state of an asynchronous operation for completion: This includes determining when an asynchronous operation has completed, preferably not by polling the thread’s state or by blocking and waiting.

2. Thread pooling: This avoids the significant cost of starting and tearing down threads. In addition, thread pooling avoids the creation of too many threads, such that the system spends more time switching threads than running them.

3. Avoiding deadlocks: This involves preventing the occurrence of deadlocks while attempting to protect the data from simultaneous access by two different threads.

4. Providing atomicity across operations and synchronizing data access: Adding synchronization around groups of operations ensures that operations execute as a single unit and that they are not inappropriately interrupted partway through by another thread. Locking is provided so that two different threads do not access the data simultaneously.

Furthermore, anytime a method is long-running, multithreaded programming will probably be required—that is, invoking the long-running method asynchronously. As developers write more multithreaded code, a common set of scenarios and programming patterns for handling those scenarios emerges.

C# 5.0 enhanced the programmability of one such pattern—TAP—by leveraging the TPL from .NET 4.0 and enhancing the C# language with new constructs to support it. This and the following section delve into the details of the TPL on its own and then the TPL with the async/await contextual keywords that simplify TAP programming. In the second half of Chapter 19, we consider several additional multithreading patterns that are important to be familiar with if the TPL and C# 5.0 are not available or you are programming against a non–TPL-based API.

From Thread to Task

Creating a thread is a relatively expensive operation, and each thread consumes a large amount (1 megabyte, by default) of virtual memory. We saw earlier in this chapter that it is potentially more efficient to use a thread pool to allocate threads when needed, assign asynchronous work to the thread, run the work to completion, and then reuse the thread for subsequent asynchronous work, rather than destroying the thread when the work is complete and creating a new one later.

In .NET Framework 4, instead of creating an operating system thread each time asynchronous work is started, the TPL creates a Task and tells the task scheduler that there is asynchronous work to perform. A task scheduler might use many different strategies to fulfill this purpose, but by default it requests a worker thread from the thread pool. The thread pool, as we’ve seen already, might decide that it is more efficient to run the task later, after some currently executing tasks have completed, or might decide to schedule the task’s worker thread to a particular processor. The thread pool determines whether it is more efficient to create an entirely new thread or to reuse an existing thread that previously finished executing.

By abstracting the concept of asynchronous work into the Task object, the TPL provides an object that represents asynchronous work and provides an object-oriented API for interacting with that work. Moreover, by providing an object that represents the unit of work, the TPL enables programmatically building up workflows by composing small tasks into larger ones, as we’ll see.

A task is an object that encapsulates work that executes asynchronously. This should sound familiar: A delegate is also an object that represents code. The difference between a task and a delegate is that delegates are synchronous and tasks are asynchronous. Executing a delegate, say, an Action, immediately transfers the point of control of the current thread to the delegate’s code; control does not return to the caller until the delegate is finished. By contrast, starting a task almost immediately returns control to the caller, no matter how much work the task has to perform. The task executes asynchronously, typically on another thread (though, as we will see later in this chapter, it is possible and even beneficial to execute tasks asynchronously with only one thread). A task essentially transforms a delegate from a synchronous to an asynchronous execution pattern.

Introducing Asynchronous Tasks

You know when a delegate is done executing on the current thread because the caller cannot do anything until the delegate is done. But how do you know when a task is done, and how do you get the result, if there is one? Consider the example of turning a synchronous delegate into an asynchronous task. We’ll do the same thing we did with threads in Listing 18.1 and thread pools in Listing 18.2, but this time with tasks: The worker thread will write hyphens to the console, while the main thread writes plus signs.

Starting the task obtains a new thread from the thread pool, creating a second “point of control,” and executes the delegate on that thread. The point of control on the main thread continues normally after the call to start the task (Task.Run()). The results of Listing 18.3 are almost identical to Output 18.1.

LISTING 18.3: Invoking an Asynchronous Task


using System;
using System.Threading.Tasks;                                        

public class Program
{
  public static void Main()
  {
      const int Repetitions = 10000;
      // Use Task.Factory.StartNew() for
      // TPL prior to .NET 4.5
      Task task = Task.Run(() =>
          {
              for(int count = 0;
                  count < Repetitions; count++)
              {
                  Console.Write('-');
              }
          });
      for(int count = 0; count < Repetitions; count++)
      {
          Console.Write('+');
      }

      // Wait until the Task completes
      task.Wait();
  }
}


The code that is to run in a new thread is defined in the delegate (of type Action in this case) passed to the Task.Run() method. This delegate (in the form of a lambda expression) prints out dashes to the console repeatedly. The loop that follows the starting of the task is almost identical, except that it displays plus signs.

Notice that, following the call to Task.Run(), the Action passed as the argument almost immediately starts executing. The Task is said to be “hot,” meaning that it has already been triggered to start executing—as opposed to a “cold” task, which needs to be explicitly started before the asynchronous work begins.

Although a Task can also be instantiated in a “cold” state via the Task constructor, doing so is generally appropriate only as an implementation detail internal to an API that returns an already running (“hot”) Task, one triggered by a call to Task.Start().
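For example (a sketch, not a pattern we recommend for application code):

// The delegate does not begin executing when the Task is constructed.
Task cold = new Task(() => Console.WriteLine("Now running"));
// Only the call to Start() makes the task "hot."
cold.Start();
cold.Wait();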

Notice that the exact state of a “hot” task is indeterminate immediately following the call to Run(). The state is instead determined by the operating system and whether it chooses to run the task’s worker thread immediately or delay it until additional resources are available. In fact, it is possible that the hot task is already finished by the time the code on the calling thread gets its turn to execute again. The call to Wait() forces the main thread to wait until all the work assigned to the task has completed executing. This is analogous to calling Join() on the worker thread, as we did in Listing 18.1.

In this scenario we have a single task, but it is also possible for many tasks to be running asynchronously. It is common to have a set of tasks where you want to wait for all of them to complete, or for any one of them to complete, before continuing execution of the current thread. The Task.WaitAll() and Task.WaitAny() methods do so.
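For example (a sketch):

using System;
using System.Threading.Tasks;

public class WaitExample
{
  public static void Main()
  {
      Task[] tasks =
      {
          Task.Run(() => Console.WriteLine("First task")),
          Task.Run(() => Console.WriteLine("Second task"))
      };

      // Block until any one of the tasks completes; the return
      // value is the index of the completed task in the array.
      int index = Task.WaitAny(tasks);
      Console.WriteLine($"Task { index } finished first.");

      // Block until all of the tasks complete.
      Task.WaitAll(tasks);
  }
}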

So far, we’ve seen how a task can take an Action and run it asynchronously. But what if the work executed in the task returns a result? We can use the Task<T> type to run a Func<T> asynchronously. When executing a delegate synchronously, we know that control will not return until the result is available. When executing a Task<T> asynchronously, we can poll it from one thread to see if it is done, and fetch the result when it is.³ Listing 18.4 demonstrates how to do so in a console application. Note that this sample uses a PiCalculator.Calculate() method that we will delve into further in the section “Executing Loop Iterations in Parallel.”

3. Exercise caution when using this polling technique. When creating a task from a delegate, as we have here, the task will be scheduled to run on a worker thread from the thread pool. As a consequence, the current thread will loop until the work is complete on the worker thread. This technique works, but it might consume CPU resources unnecessarily. Such a polling technique is dangerously broken if, instead of scheduling the task to run on a worker thread, you schedule the task to execute in the future on the current thread. Since the current thread is in a loop polling the task, it will loop forever because the task will not complete until the current thread exits the loop.

LISTING 18.4: Polling a Task<T>


using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      // Use Task.Factory.StartNew<string>() for
      // TPL prior to .NET 4.5
      Task<string> task =
          Task.Run<string>(
              () => PiCalculator.Calculate(100));

      foreach(
          char busySymbol in Utility.BusySymbols())
      {
          if(task.IsCompleted)
          {
              Console.Write('\b');
              break;
          }
          Console.Write(busySymbol);
      }

      Console.WriteLine();

      Console.WriteLine(task.Result);
      System.Diagnostics.Trace.Assert(
          task.IsCompleted);
  }
}


public class PiCalculator
{
  public static string Calculate(int digits = 100)
  {
      // ...
  }
}


public class Utility
{
  public static IEnumerable<char> BusySymbols()
  {
      string busySymbols = @"-\|/-\|/";
      int next = 0;
      while(true)
      {
          yield return busySymbols[next];
          next = (next + 1) % busySymbols.Length;
          yield return '\b'; // Backspace to erase the previous symbol
      }
  }
}


This listing shows that the data type of the task is Task<string>. The generic type includes a Result property from which to retrieve the value returned by the Func<string> that the Task<string> executes.

Note that Listing 18.4 does not make a call to Wait(). Instead, reading from the Result property automatically causes the current thread to block until the result is available, if it isn’t already; in this case we know that it will already be complete when the result is fetched.

In addition to the IsCompleted and Result properties on Task<T>, there are several others worth noting (a short sketch follows this list):

The IsCompleted property is set to true when a task completes, whether it completed normally or faulted (that is, ended because it threw an exception). More detailed information on the status of a task can be obtained by reading the Status property, which returns a value of type TaskStatus. Possible values are Created, WaitingForActivation, WaitingToRun, Running, WaitingForChildrenToComplete, RanToCompletion, Canceled, and Faulted. IsCompleted is true whenever the Status is RanToCompletion, Canceled, or Faulted. Of course, if the task is running on another thread and you read the status as “Running,” the status could change to “Completed” at any time, including immediately after you read the value of the property. The same is true of many other states—even Created could potentially change if a different thread starts it. Only RanToCompletion, Canceled, and Faulted are final states from which no further transitions occur.

A task can be uniquely identified by the value of the Id property. The static Task.CurrentId property provides the identifier for the currently executing Task (that is, the task that is executing the Task.CurrentId call). These properties are especially useful when debugging.

You can use the AsyncState property to associate additional data with a task. For example, imagine a List<T> whose values will be computed by various tasks. Each task could contain the index of the value in the AsyncState property. This way, when the task completes, the code can index into the list using the AsyncState (first casting it to an int).⁴

4. Be careful when using tasks to asynchronously mutate collections. The tasks might be running on worker threads, and the collection might not be thread safe. It is safer to fill in the collection from the main thread after the tasks are completed.
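As a sketch of the last two members (note that Task.Run() does not accept a state argument, so here the AsyncState value is supplied via Task.Factory.StartNew()):

using System;
using System.Diagnostics;
using System.Threading.Tasks;

public class TaskPropertiesExample
{
  public static void Main()
  {
      // The state object (42 here) becomes the task's AsyncState.
      Task task = Task.Factory.StartNew(
          state => Console.WriteLine($"Computing item { state }..."),
          42);
      task.Wait();

      // RanToCompletion, Canceled, and Faulted are final states,
      // and all three imply IsCompleted.
      Trace.Assert(task.Status == TaskStatus.RanToCompletion);
      Trace.Assert(task.IsCompleted);
      Console.WriteLine($"AsyncState: { task.AsyncState }");  // 42
  }
}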

There are other useful properties that we will discuss later in this chapter, in the section on task cancellation.

Task Continuation

We’ve talked several times about the “control flow” of a program without ever saying what the most fundamental nature of control flow is: Control flow determines what happens next. When you have a simple control flow like Console.WriteLine(x.ToString());, the control flow tells you that when ToString completes normally, the next thing that will happen is a call to WriteLine with the value returned as the argument. The concept of “what happens next” is called continuation; each point in a control flow has a continuation. In our example, the continuation of ToString is WriteLine (and the continuation of WriteLine is whatever code runs in the next statement). The idea of continuation is so elementary to C# programming that most programmers don’t even think about it; it’s part of the invisible air that they breathe. The act of C# programming is the act of constructing continuation upon continuation until the control flow of the entire program is complete.

Notice that the continuation of a given piece of code in a normal C# program will be executed immediately upon the completion of that code. When ToString() returns, the point of control on the current thread immediately does a synchronous call to WriteLine. Notice also that there are actually two possible continuations of a given piece of code: the “normal” continuation and the “exceptional” continuation that will be executed if the current piece of code throws an exception.

Asynchronous method calls, such as starting a Task, add an additional dimension to the control flow. With an asynchronous Task invocation, the control flow goes immediately to the statement after the Task.Start(), while at the same time the body of the Task’s delegate begins executing. In other words, “what happens next” when asynchrony is involved is multidimensional. Unlike with exceptions, where the continuation is just a different path, with asynchrony the continuation is an additional, parallel path.

Asynchronous tasks also allow composition of larger tasks out of smaller tasks by describing asynchronous continuations. Just as with regular control flow, a task can have different continuations to handle error situations, and tasks can be melded together by manipulating their continuations. There are several techniques for doing so, the most explicit of which is the ContinueWith() method (see Listing 18.5 and its corresponding output, Output 18.2).

LISTING 18.5: Calling Task.ContinueWith()


using System;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      Console.WriteLine("Before");
      // Use Task.Factory.StartNew() for
      // TPL prior to .NET 4.5
      Task taskA =
          Task.Run( () =>
               Console.WriteLine("Starting..."))
          .ContinueWith(antecedent =>
               Console.WriteLine("Continuing A..."));
      Task taskB = taskA.ContinueWith( antecedent =>
          Console.WriteLine("Continuing B..."));
      Task taskC = taskA.ContinueWith( antecedent =>
          Console.WriteLine("Continuing C..."));
      Task.WaitAll(taskB, taskC);
      Console.WriteLine("Finished!");
  }
}


OUTPUT 18.2

Before
Starting...
Continuing A...
Continuing C...
Continuing B...
Finished!

The ContinueWith() method enables “chaining” two tasks together, such that when the predecessor task—the antecedent task—completes, the second task—the continuation task—is automatically started asynchronously. In Listing 18.5, for example, Console.WriteLine("Starting...") is the antecedent task body and Console.WriteLine("Continuing A...") is its continuation task body. The continuation task takes a Task as its argument (antecedent), thereby allowing the continuation task’s code to access the antecedent task’s completion state. When the antecedent task is completed, the continuation task starts automatically, asynchronously executing the second delegate, and passing the just-completed antecedent task as an argument to that delegate. Furthermore, since the ContinueWith() method returns a Task as well, that Task can be used as the antecedent of yet another Task, and so on, forming a continuation chain of Tasks that can be arbitrarily long.

If you call ContinueWith() twice on the same antecedent task (as Listing 18.5 shows with taskB and taskC representing continuation tasks for taskA), the antecedent task (taskA) has two continuation tasks and when the antecedent task completes, both continuation tasks will be executed asynchronously. Notice that the order of execution of the continuation tasks from a single antecedent is indeterminate at compile time. Output 18.2 happens to show taskC executing before taskB, but in a second execution of the program, the order might be reversed. However, taskA will always execute before taskB and taskC because the latter are continuation tasks of taskA and, therefore, can’t start before taskA completes. Similarly, the Console.WriteLine("Starting...") delegate will always execute to completion before taskA (Console.WriteLine("Continuing A...")) because the latter is a continuation task of the former. Furthermore, “Finished!” will always appear last because of the call to Task.WaitAll(taskB, taskC) that blocks the control flow from continuing until both taskB and taskC complete.

Many different overloads of ContinueWith() are possible, and some of them take a TaskContinuationOptions value to tweak the behavior of the continuation chain. These values are flags, so they can be combined using the logical OR operator (|). A brief description of some of the possible flag values appears in Table 18.1; see the online MSDN documentation⁵ for more details.

5. MSDN, .NET Framework Developer Center, http://msdn.microsoft.com/en-us/library/system.threading.tasks.taskcontinuationoptions(v=vs.110).aspx.


TABLE 18.1: List of Available TaskContinuationOptions Enums

The items denoted with a star (*) indicate under which conditions the continuation task will be executed; thus they are particularly useful for creating continuations that act like event handlers for the antecedent task’s behavior. Listing 18.6 demonstrates how an antecedent task can be given multiple continuations that execute conditionally, depending on how the antecedent task completed.

LISTING 18.6: Registering for Notifications of Task Behavior with ContinueWith()


using System;
using System.Threading.Tasks;
using System.Diagnostics;
using AddisonWesley.Michaelis.EssentialCSharp.Shared;

public class Program
{
  public static void Main()
  {
      // Use Task.Factory.StartNew<string>() for
      // TPL prior to .NET 4.5
      Task<string> task =
          Task.Run<string>(
              () => PiCalculator.Calculate(10));

      Task faultedTask = task.ContinueWith(
          (antecedentTask) =>
          {
              Trace.Assert(antecedentTask.IsFaulted);
              Console.WriteLine(
                  "Task State: Faulted");
          },
          TaskContinuationOptions.OnlyOnFaulted);

      Task canceledTask = task.ContinueWith(
          (antecedentTask) =>
          {
              Trace.Assert(antecedentTask.IsCanceled);
              Console.WriteLine(
                  "Task State: Canceled");
          },
          TaskContinuationOptions.OnlyOnCanceled);

      Task completedTask = task.ContinueWith(
          (antecedentTask) =>
          {
              Trace.Assert(antecedentTask.IsCompleted);
              Console.WriteLine(
                  "Task State: Completed");
          },  TaskContinuationOptions.
                  OnlyOnRanToCompletion);

      completedTask.Wait();
  }
}


In this listing, we effectively register “listeners” for “events” on the antecedent’s task so that when the task completes normally or abnormally, the particular “listening” task will begin executing. This is a powerful capability, particularly if the original task is a “fire and forget” task—that is, a task that we start, hook up to continuation tasks, and then never refer to again.

In Listing 18.6, notice that the final Wait() call is on completedTask, not on task—the original antecedent task created with Task.Run(). Although each delegate’s antecedentTask is a reference to the parent (antecedent) task (task), from outside the delegate listeners we can effectively discard the reference to the original task. We can then rely solely on the continuation tasks that begin executing asynchronously without any need for follow-up code that checks the status of the original task.

In this case, we call completedTask.Wait() so that the main thread does not exit the program before the completed output appears (see Output 18.3).

OUTPUT 18.3

Task State: Completed

In this case, invoking completedTask.Wait() is somewhat contrived because we know that the original task will complete successfully. However, invoking Wait() on canceledTask or faultedTask will result in an exception. Those continuation tasks run only if the antecedent task is canceled or throws an exception; given that will not happen in this program, those tasks will never be scheduled to run, and waiting for them to complete would throw an exception. The continuation options in Listing 18.6 happen to be mutually exclusive, so when the antecedent task runs to completion and the task associated with completedTask executes, the task scheduler automatically cancels the tasks associated with canceledTask and faultedTask. The canceled tasks end with their state set to Canceled. Therefore, calling Wait() (or any other invocation that would cause the current thread to wait for a task completion) on either of these tasks will throw an exception indicating that they are canceled.

A less contrived approach might be to call Task.WaitAny(completedTask, canceledTask, faultedTask), which will throw an AggregateException that then needs to be handled.

Unhandled Exception Handling on Task with AggregateException

When calling a method synchronously, we can wrap it in a try block with a catch clause to identify to the compiler which code we want to execute when an exception occurs. This does not work with an asynchronous call, however. We cannot simply wrap a try block around a call to Start() to catch an exception, because control immediately returns from the call, and control will then leave the try block, possibly long before the exception occurs on the worker thread. One solution is to wrap the body of the task delegate with a try/catch block. Exceptions thrown on and subsequently caught by the worker thread will consequently not present problems, as a try block will work normally on the worker thread. This is not the case, however, for unhandled exceptions—those that the worker thread does not catch.

Generally (starting with version 2.0⁶ of the CLR), unhandled exceptions on any thread are treated as fatal, trigger the Windows Error Reporting dialog, and cause the application to terminate abnormally. All exceptions on all threads must be caught, and if they are not, the application is not allowed to continue to run. (For some advanced techniques for dealing with unhandled exceptions, see the upcoming Advanced Topic titled “Dealing with Unhandled Exceptions on a Thread.”) This is not the case, however, for unhandled exceptions in an asynchronously running task. Rather, the task scheduler inserts a “catchall” exception handler around the delegate so that if the task throws an otherwise unhandled exception, the catchall handler will catch it and record the details of the exception in the task, thereby preventing the CLR from automatically terminating the process.

6. In version 1.0 of the CLR, an unhandled exception on a worker thread terminated the thread but not the application. As a result, it was possible for a buggy program to have all its worker threads die, but the main thread would continue to run, even though the program was no longer doing any work. This is a confusing situation for users to be in; it is better to signal to the user that the application is in a bad state and terminate it before it can do any more harm.

As we saw in Listing 18.6, one technique for dealing with a faulted task is to explicitly create a continuation task that is the “fault handler” for that task; the task scheduler will automatically schedule the continuation when it detects that the antecedent task threw an unhandled exception. If no such handler is present, however, and Wait() (or an attempt to get the Result) executes on a faulted task, an AggregateException will be thrown (see Listing 18.7 and Output 18.4).

LISTING 18.7: Handling a Task’s Unhandled Exception


using System;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      // Use Task.Factory.StartNew() for
      // TPL prior to .NET 4.5
      Task task = Task.Run(() =>
      {
          throw new InvalidOperationException();
      });

      try
      {
          task.Wait();
      }
      catch(AggregateException exception)
      {
          exception.Handle(eachException =>
            {
                Console.WriteLine(
                    $"ERROR: { eachException.Message }");
                return true;
            });
      }
  }
}


OUTPUT 18.4

ERROR: Operation is not valid due to the current state of the object.

The aggregate exception is so named because it may contain many exceptions collected from one or more faulted tasks. Imagine, for example, asynchronously executing ten tasks in parallel and five of them throwing exceptions. To report all five exceptions and have them handled in a single catch block, the framework uses the AggregateException as a means of collecting the exceptions and reporting them as a single exception. Furthermore, since it is unknown at compile time whether a worker task will throw one or more exceptions, an unhandled faulted task will always throw an AggregateException. Listing 18.7 and Output 18.4 demonstrate this behavior. Even though the unhandled exception thrown on the worker thread was of type InvalidOperationException, the type of the exception caught on the main thread is still an AggregateException. As a result, catching the exception requires an AggregateException catch block.

A list of the exceptions contained within an AggregateException is available from the InnerExceptions property. As a result, you can iterate over this property to examine each exception and determine the appropriate course of action. Alternatively, and as shown in Listing 18.7, you can use the AggregateException.Handle() method, specifying an expression to execute against each individual exception contained within the AggregateException. One important characteristic of the Handle() method to consider, however, is that it is a predicate. As such, the predicate should return true for any exceptions that the Handle() delegate successfully addresses. If any exception handling invocation returns false for an exception, the Handle() method will throw a new AggregateException that contains the composite list of such corresponding exceptions.
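As a sketch of the first approach, the catch block in Listing 18.7 could instead iterate over InnerExceptions directly:

      catch(AggregateException exception)
      {
          foreach(Exception item in exception.InnerExceptions)
          {
              if(item is InvalidOperationException)
              {
                  Console.WriteLine($"ERROR: { item.Message }");
              }
              else
              {
                  throw;  // Rethrow exceptions we do not know how to handle
              }
          }
      }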

You can also observe the state of a faulted task without causing the exception to be rethrown on the current thread by simply looking at the Exception property of the task. Listing 18.8 demonstrates this approach by waiting for the completion of a fault continuation of a task⁷ that we know will throw an exception.

7. As we discussed earlier, waiting for a fault continuation to complete is a strange thing to do because most of the time it will never be scheduled to run in the first place. This code is provided for illustrative purposes only.

LISTING 18.8: Observing Unhandled Exceptions on a Task Using ContinueWith()


using System;
using System.Diagnostics;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      bool parentTaskFaulted = false;

      Task task = new Task(() =>
          {
              throw new InvalidOperationException();
          });
      Task continuationTask = task.ContinueWith(
          (antecedentTask) =>
          {
              parentTaskFaulted =
                  antecedentTask.IsFaulted;
          }, TaskContinuationOptions.OnlyOnFaulted);
      task.Start();
      continuationTask.Wait();
      Trace.Assert(parentTaskFaulted);
      Trace.Assert(task.IsFaulted);
      task.Exception.Handle(eachException =>
      {
          Console.WriteLine(
              $"ERROR: { eachException.Message }");
          return true;
      });
  }
}


Notice that to retrieve the unhandled exception on the original task, we use the Exception property. The result is output identical to Output 18.4.

If an exception that occurs within a task goes entirely unobserved—that is, (1) it isn’t caught from within the task; (2) the completion of the task is never observed, via Wait(), Result, or accessing the Exception property, for example; and (3) the faulted ContinueWith() is never observed—then the exception is likely to go unhandled entirely, resulting in a process-wide unhandled exception. In .NET 4.0, such a faulted task’s exception would be rethrown on the finalizer thread and would likely crash the process. In contrast, in .NET 4.5, the crashing has been suppressed (although the CLR can be configured for the crashing behavior if preferred).

In either case, you can register for an unhandled task exception via the TaskScheduler.UnobservedTaskException event.
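Registration is an ordinary event subscription; in this sketch, the handler simply logs the exception and marks it as observed:

using System;
using System.Threading.Tasks;

public class UnobservedExceptionExample
{
  public static void Main()
  {
      TaskScheduler.UnobservedTaskException +=
          (sender, eventArgs) =>
          {
              // eventArgs.Exception is an AggregateException;
              // SetObserved() prevents any further escalation.
              Console.WriteLine(
                  $"Unobserved: { eventArgs.Exception.Message }");
              eventArgs.SetObserved();
          };
      // ... application code that runs tasks ...
  }
}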

Canceling a Task

Earlier in this chapter, we described why it’s a bad idea to rudely abort a thread so as to cancel a task being performed by that thread. The TPL uses cooperative cancellation, a far more polite, robust, and reliable technique for safely canceling a task that is no longer needed. A task that supports cancellation monitors a CancellationToken object (found in the System.Threading namespace) by periodically polling it to see if a cancellation request has been issued. Listing 18.10 demonstrates both the cancellation request and the response to the request. Output 18.6 shows the results.

LISTING 18.10: Canceling a Task Using CancellationToken


using System;
using System.Threading;
using System.Threading.Tasks;
using AddisonWesley.Michaelis.EssentialCSharp.Shared;

public class Program
{
  public static void Main()
  {
      string stars =
          "*".PadRight(Console.WindowWidth-1, '*');
      Console.WriteLine("Push ENTER to exit.");

      CancellationTokenSource cancellationTokenSource=                     
          new CancellationTokenSource();                                   
      // Use Task.Factory.StartNew() for
      // TPL prior to .NET 4.5
       Task task = Task.Run(
           () => WritePi(cancellationTokenSource.Token),
           cancellationTokenSource.Token);

      // Wait for the user's input
      Console.ReadLine();

      cancellationTokenSource.Cancel();
      Console.WriteLine(stars);
      task.Wait();
      Console.WriteLine();
  }

  private static void WritePi(
      CancellationToken cancellationToken)
  {
      const int batchSize = 1;
      string piSection = string.Empty;
      int i = 0;

       while(!cancellationToken.IsCancellationRequested
           && i < int.MaxValue)
      {
          piSection = PiCalculator.Calculate(
              batchSize, (i++) * batchSize);
          Console.Write(piSection);
      }
  }
}


OUTPUT 18.6

Push ENTER to exit.
3.141592653589793238462643383279502884197169399375105820974944592307816
40628620899862803482534211706798214808651328230664709384460955058223172
5359408128481117450
***********************************************************************
2

After starting the task, a Console.ReadLine() blocks the main thread. At the same time, the task continues to execute, calculating the next digit of pi and printing it out. Once the user presses Enter, the execution encounters a call to CancellationTokenSource.Cancel(). In Listing 18.10, we split the call to cancellationTokenSource.Cancel() from the call to task.Wait() and print a line of asterisks in between. The purpose of this step is to show that quite possibly an additional iteration will occur before the cancellation token is observed—hence the additional 2 in Output 18.6 following the stars. The 2 appears because CancellationTokenSource.Cancel() doesn’t rudely stop the task from executing. The task keeps on running until it checks the token, and politely shuts down when it sees that the owner of the token is requesting its cancellation.

The Cancel() call effectively sets the IsCancellationRequested property on all cancellation tokens copied from CancellationTokenSource.Token. There are a few things to note, however:

A CancellationToken, not a CancellationTokenSource, is given to the asynchronous task. A CancellationToken enables polling for a cancellation request; the CancellationTokenSource provides the token and signals it when it is canceled (see Figure 18.2). By passing the CancellationToken rather than the CancellationTokenSource, we don’t have to worry about thread synchronization issues on the CancellationTokenSource because the latter remains accessible to only the original thread.


FIGURE 18.2: CancellationTokenSource and CancellationToken Class Diagrams

A CancellationToken is a struct, so it is copied by value; the value returned by CancellationTokenSource.Token is a copy of the token. Because of this, and because CancellationToken’s members are safe to access from multiple threads, the token copy can be freely read from within the WritePi() method without additional synchronization.

To monitor the IsCancellationRequested property, a copy of the CancellationToken (retrieved from CancellationTokenSource.Token) is passed to the task. In Listing 18.10, we then occasionally check the IsCancellationRequested property on the CancellationToken parameter—in this case, after each digit calculation. If IsCancellationRequested returns true, the while loop exits. Unlike a thread abort, which would throw an exception at essentially a random point, exiting the loop uses normal control flow. We guarantee that the code is responsive to cancellation requests by polling frequently.

One other point to note about CancellationToken is the overloaded Register() method. Via this method, you can register an action that will be invoked whenever the token is canceled. In other words, calling Register() effectively subscribes a listener delegate to the corresponding CancellationTokenSource’s Cancel() invocation.
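In this sketch (our own construction), the first delegate runs when Cancel() is invoked, and a delegate registered after cancellation runs immediately:


using System;
using System.Threading;

public class Program
{
  public static void Main()
  {
      CancellationTokenSource cancellationTokenSource =
          new CancellationTokenSource();

      // Invoked when Cancel() is called
      cancellationTokenSource.Token.Register(
          () => Console.WriteLine("Cancellation requested."));

      cancellationTokenSource.Cancel(); // Triggers the callback

      // Registering on an already-canceled token invokes the
      // delegate immediately
      cancellationTokenSource.Token.Register(
          () => Console.WriteLine("Registered after the fact."));
  }
}
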

Given that canceling before completing is the expected behavior in this program, the code in Listing 18.10 does not throw a System.Threading.Tasks.TaskCanceledException. As a consequence, task.Status will return TaskStatus.RanToCompletion—providing no indication that the work of the task was, in fact, canceled. In this example, there is no need for such an indication; however, the TPL does include the capability to provide it. If the cancel call were disruptive in some way—preventing a valid result from returning, for example—throwing a TaskCanceledException (which derives from System.OperationCanceledException) would be the TPL pattern for reporting it. Instead of throwing the exception explicitly, CancellationToken includes a ThrowIfCancellationRequested() method to report the exception more easily, assuming an instance of CancellationToken is available.
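The following sketch (our own; cancellation is requested from within the task body purely to keep the timing deterministic) shows the pattern. Because the thrown OperationCanceledException carries the same token that was passed to Task.Run(), the task ends in the Canceled state rather than Faulted:


using System;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      CancellationTokenSource cancellationTokenSource =
          new CancellationTokenSource();
      CancellationToken token = cancellationTokenSource.Token;

      Task task = Task.Run(() =>
      {
          // Request cancellation mid-work (sketch only)
          cancellationTokenSource.Cancel();

          // Throws OperationCanceledException now that
          // cancellation has been requested
          token.ThrowIfCancellationRequested();
      }, token);

      try
      {
          task.Wait();
      }
      catch(AggregateException exception)
      {
          exception.Handle(
              innerException =>
                  innerException is TaskCanceledException);

          // The token matched, so the task reports cancellation
          Console.WriteLine(task.Status); // Canceled
      }
  }
}
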

If you attempt to call Wait() (or obtain the Result) on a task that threw TaskCanceledException, the behavior is the same as if any other exception had been thrown in the task: The call will throw an AggregateException. The exception is a means of communicating that the state of execution following the task is potentially incomplete. Unlike a successfully completed task, in which all expected work executed, a canceled task potentially has only partially completed work—so the state of that work cannot be trusted.

This example demonstrates how a long-running processor-bound operation (calculating pi almost indefinitely) can monitor for a cancellation request and respond if one occurs. There are some cases, however, when cancellation can occur without explicitly coding for it within the target task. For example, the Parallel class discussed later in the chapter offers such a behavior by default.

Begin 5.0

Task.Run(): A Shortcut and Simplification to Task.Factory.StartNew()

In .NET 4.0, the general practice for obtaining a task was to call Task.Factory.StartNew(). In .NET 4.5, a simpler calling structure was provided in Task.Run(). Like Task.Run(), Task.Factory.StartNew() could be used in C# 4.0 scenarios to invoke CPU-intensive methods that require an additional thread to be created.

Given .NET 4.5, Task.Run() should be used by default unless it proves insufficient. For example, if you need to control the task with TaskCreationOptions, if you need to specify an alternative scheduler, or if, for performance reasons, you want to pass in object state, you should consider using Task.Factory.StartNew(). Only in rare cases, where you need to separate creation from scheduling, should constructor instantiation followed by a call to Start() be considered.
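For reference, a long-hand sketch of what Task.Run(action) amounts to—based on the defaults documented for .NET 4.5—is as follows:


using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskRunEquivalent
{
  public static Task Run(Action action)
  {
      // The documented .NET 4.5 defaults behind Task.Run(action)
      return Task.Factory.StartNew(
          action,
          CancellationToken.None,
          TaskCreationOptions.DenyChildAttach,
          TaskScheduler.Default);
  }
}
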

Listing 18.11 provides an example of using Task.Factory.StartNew().

LISTING 18.11: Using Task.Factory.StartNew()


public Task<string> CalculatePiAsync(int digits)
{
  return Task.Factory.StartNew<string>(
      () => CalculatePi(digits));
}

private string CalculatePi(int digits)
{
    // ...
}


End 5.0

Long-Running Tasks

As we discussed earlier in the commentary on Listing 18.2, the thread pool assumes that work items will be processor-bound and relatively short-lived; it makes these assumptions to effectively throttle the number of threads created. This prevents both overallocation of expensive thread resources and oversubscription of processors that would lead to excessive context switching and time slicing.

But what if the developer knows that a task will be long-running and, therefore, will hold on to an underlying thread resource for a long time? In this case, the developer can notify the scheduler that the task is unlikely to complete its work anytime soon. This has two effects. First, it hints to the scheduler that perhaps a dedicated thread ought to be created specifically for this task, rather than attempting to use a thread from the thread pool. Second, it hints to the scheduler that perhaps this would be a good time to allow more tasks to be scheduled than there are processors to handle them. This will cause more time slicing to happen, which is a good thing. We do not want one long-running task to hog an entire processor and prevent shorter-running tasks from using it. The short-running tasks will be able to use their time slice to finish a large percentage of their work, and the long-running task is unlikely to notice the relatively slight delays caused by sharing a processor with other tasks. To accomplish this, use the TaskCreationOptions.LongRunning option when calling StartNew(), as shown in Listing 18.12. (Task.Run() does not support a TaskCreationOptions parameter.)

LISTING 18.12: Cooperatively Executing Long-Running Tasks


using System.Threading.Tasks;

// ...

       Task task = Task.Factory.StartNew(
           () => WritePi(cancellationTokenSource.Token),
           TaskCreationOptions.LongRunning);
// ...



Guidelines

DO inform the task factory that a newly created task is likely to be long-running so that it can manage it appropriately.

DO use TaskCreationOptions.LongRunning sparingly.


Tasks Are Disposable

Note that Task also supports IDisposable. This is necessary because Task may allocate a WaitHandle when waiting for it to complete; since WaitHandle supports IDisposable, Task also supports IDisposable in accordance with best practices. However, readers will note that the preceding code samples do not include a Dispose() call, nor do they rely on such a call implicitly via the using statement. The listings instead rely on an automatic WaitHandle finalizer invocation when the program exits.

This approach leads to two notable results. First, the handles live longer and hence consume more resources than they ought to. Second, the garbage collector is slightly less efficient because finalized objects survive into the next generation. However, both of these concerns are inconsequential in the Task case unless an extraordinarily large number of tasks are being finalized. Therefore, even though technically speaking all code should be disposing of tasks, you needn’t bother to do so unless performance metrics require it and it’s easy—that is, if you’re certain that Tasks have completed and no other code is using them.
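If performance metrics ever do justify it, the disposal itself is trivial, as this sketch shows; it is safe only because the task has completed and nothing else references it:


using System;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      Task task = Task.Run(() => Console.WriteLine("Working..."));
      task.Wait();

      // Safe only because the task has completed and no other
      // code retains a reference to it
      task.Dispose();
  }
}
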

The Task-Based Asynchronous Pattern

As we’ve seen so far, tasks provide a better abstraction for the manipulation of asynchronous work than threads do. Tasks are automatically scheduled to the right number of threads and large tasks can be composed by chaining together small tasks, just as large programs can be composed from multiple small methods.

However, there are some drawbacks to tasks. The principal difficulty with tasks is that they turn your program logic “inside out.” To illustrate this, we first consider a synchronous method that is blocked on an I/O-bound, high-latency operation—a web request. Next, we compare it to an asynchronous version prior to C# 5.0 and the Task-based Asynchronous Pattern (TAP). Lastly, we revise the same example by using C# 5.0 (and higher) and the async/await contextual keywords.

Synchronously Invoking a High-Latency Operation

In Listing 18.13, the code uses a WebRequest to download a web page and display its size. If the operation fails, an exception is thrown.

LISTING 18.13: A Synchronous Web Request


using System;
using System.IO;
using System.Net;
using System.Linq;

public class Program
{
  public static void Main(string[] args)
  {
      string url = "http://www.IntelliTect.com";
      if(args.Length > 0)
      {
          url = args[0];
      }

      try
      {
          Console.Write(url);
          WebRequest webRequest =
              WebRequest.Create(url);

          WebResponse response =
              webRequest.GetResponse();

          Console.Write(".....");

          using(StreamReader reader =
              new StreamReader(
                  response.GetResponseStream()))
          {
              string text =
                  reader.ReadToEnd();
              Console.WriteLine(
                  FormatBytes(text.Length));
          }
      }
      catch(WebException)
      {
          // ...
      }
      catch(IOException)
      {
          // ...
      }
      catch(NotSupportedException)
      {
          // ...
      }
  }

  static public string FormatBytes(long bytes)
  {
      string[] magnitudes =
          new string[] { "GB", "MB", "KB", "Bytes" };
      long max =
          (long)Math.Pow(1024, magnitudes.Length);

      return string.Format("{1:##.##} {0}",
          magnitudes.FirstOrDefault(
              magnitude =>
                  bytes > (max /= 1024)) ?? "0 Bytes",
              (decimal)bytes / (decimal)max);
  }
}


The logic in Listing 18.13 is relatively straightforward—using common C# idioms like try/catch blocks and return statements to describe the control flow. Given a WebRequest, this code calls GetResponse() to download the page. To gain stream access to the page, it calls GetResponseStream() and assigns the result to a StreamReader. Finally, it reads to the end of the stream with ReadToEnd() to determine the size of the page, and then prints that size to the screen.

The problem with this approach is, of course, that the calling thread is blocked until the I/O operation completes; this wastes a thread that could be doing useful work while the high-latency operation executes. As a result, we cannot execute any other code in the meantime—such as code that indicates progress.

Asynchronously Invoking a High-Latency Operation Using the TPL

To address this problem, Listing 18.14 takes a similar approach but instead uses task-based asynchrony with the TPL.

LISTING 18.14: An Asynchronous Web Request


using System;
using System.IO;
using System.Net;
using System.Linq;
using System.Threading.Tasks;
using System.Runtime.ExceptionServices;

public class Program
{
  public static void Main(string[] args)
  {
      string url = "http://www.IntelliTect.com";
      if(args.Length > 0)
      {
          url = args[0];
      }

      Console.Write(url);

      Task task = WriteWebRequestSizeAsync(url);

      try
      {
          while(!task.Wait(100))
          {
              Console.Write(".");
          }
      }
      catch(AggregateException exception)
      {
          exception = exception.Flatten();
          try
          {
              exception.Handle(innerException =>
              {
                  // Rethrowing rather than using
                  // an if condition on the type
                  ExceptionDispatchInfo.Capture(
                      innerException)
                      .Throw();
                  return true;
              });
          }
          catch(WebException)
          {
              // ...
          }
          catch(IOException)
          {
              // ...
          }
          catch(NotSupportedException)
          {
              // ...
          }
      }
  }


  private static Task WriteWebRequestSizeAsync(
      string url)
  {
      StreamReader reader = null;
      WebRequest webRequest =
           WebRequest.Create(url);

      Task task =
          webRequest.GetResponseAsync()
      .ContinueWith( antecedent =>
      {
          WebResponse response =
             antecedent.Result;

          reader =
              new StreamReader(
                  response.GetResponseStream());
          return reader.ReadToEndAsync();
      })
      .Unwrap()
      .ContinueWith(antecedent =>
      {
          if(reader != null) reader.Dispose();
          string text = antecedent.Result;
          Console.WriteLine(
              FormatBytes(text.Length));
      });

      return task;
  }

  // ...
}


Unlike Listing 18.13, when Listing 18.14 executes, it prints periods to the console while the page is downloading. The result is that instead of simply printing five periods (“.....”) to the console, Listing 18.14 is able to continuously print periods for as long as it takes to download the file, read it from the stream, and determine its size.

Unfortunately, this asynchrony comes at the cost of complexity. Interspersed throughout the code is TPL-related code that interrupts the flow. Rather than simply following the WebRequest.GetResponseAsync() call with steps to retrieve the StreamReader and call ReadToEndAsync(), the asynchronous version of the code requires ContinueWith() statements. The first ContinueWith() statement identifies what to execute after WebRequest.GetResponseAsync() completes. Notice that the return statement in the first ContinueWith() expression returns the result of StreamReader.ReadToEndAsync(), which is itself another Task—specifically, a Task<string>.

Without the Unwrap() call, therefore, the antecedent in the second ContinueWith() statement would be a Task<Task<string>>—a type whose very shape indicates the complexity. It would then be necessary to call Result twice: once on the antecedent directly and a second time on the Task<string> that antecedent.Result returned, with the latter call blocking until the ReadToEndAsync() operation completes. To avoid the Task<Task<TResult>> structure, we preface the call to ContinueWith() with a call to Unwrap(), thereby shedding the outer Task and appropriately propagating any errors or cancellation requests.

The complexity doesn’t stop with Tasks and ContinueWith(), however: The exception handling adds an entirely new dimension. As mentioned earlier, the TPL generally throws an AggregateException because of the possibility that an asynchronous operation could encounter multiple exceptions. However, because we are calling the Result property from within ContinueWith() blocks, it is possible that inside the worker thread we might also throw an AggregateException.

As you learned earlier in the chapter, there are multiple ways to handle these exceptions:

1. We can add continuation tasks to all *Async methods that return a task along with each ContinueWith() method call. However, doing so would prevent us from using the fluid API in which the ContinueWith() statements are chained together one after the other. Furthermore, this would force us to deeply embed error-handling logic into the control flow rather than simply relying on exception handling.

2. We can surround each delegate body with a try/catch block so that no exceptions go unhandled from the task. Unfortunately, this approach is less than ideal as well. First, some exceptions (like those triggered when calling antecedent.Result) will throw an AggregateException from which we will need to unwrap the InnerException(s) to handle them individually. Upon unwrapping them, we either rethrow them so as to catch a specific type or conditionally check for the type of the exception separately from any other catch blocks (even catch blocks for the same type). Second, each delegate body will require its own separate try/catch handler, even if some of the exception types between blocks are the same. Third, Main’s call to task.Wait() could still throw an exception because WebRequest.GetResponseAsync() could potentially throw an exception, and there is no way to surround it with a try/catch block. Therefore, there is no way to eliminate the try/catch block in Main that surrounds task.Wait().

3. We can ignore all exception handling from within WriteWebRequestSizeAsync() and instead rely solely on the try/catch block that surrounds Main’s task.Wait(). Given that we know the exception will be an AggregateException, we can have a catch for only that exception. Within the catch block, we can handle the exception by calling AggregateException.Handle() and throwing each exception using the ExceptionDispatchInfo object so as not to lose the original stack trace. These exceptions are then caught by the expected exception handlers and addressed accordingly. Notice, however, that before handling the AggregateException’s InnerExceptions, we first call AggregateException.Flatten(). This step addresses the issue of an AggregateException wrapping inner exceptions that are also of type AggregateException (and so on). By calling Flatten(), we ensure that all exceptions are moved to the first level and all contained AggregateExceptions are removed.

As shown in Listing 18.14, option 3 is probably the preferred approach because it keeps the exception handling outside the control flow for the most part. This doesn’t eliminate the error-handling complexity entirely; rather, it simply minimizes the occasions on which it is interspersed within the regular control flow.

Although the asynchronous version in Listing 18.14 has almost the same logical control flow as the synchronous version in Listing 18.13, both versions attempt to download a resource from a server and, if the download succeeds, the result is returned. (If the download fails, the exception’s type is interrogated to determine the right course of action.) However, it is clear that the asynchronous version of Listing 18.14 is significantly more difficult to read, understand, and change than the corresponding synchronous version in Listing 18.13. Unlike the synchronous version, which uses standard control flow statements, the asynchronous version is forced to create multiple lambda expressions to express the continuation logic in the form of delegates.

And this is a fairly simple example! Imagine what the asynchronous code would look like if, for example, the synchronous code contained a loop that retried the operation three times if it failed, if it tried to contact multiple different servers, if it took a collection of resources rather than a single one, or if all of these possible features occurred together. Adding those features to the synchronous version would be straightforward, but it is not at all clear how to do so in the asynchronous version. Rewriting synchronous methods into asynchronous methods by explicitly specifying the continuation of each task gets very complicated very quickly even if the synchronous continuations are what appear to be very simple control flows.

The Task-Based Asynchronous Pattern with async and await

Fortunately, it turns out that it is actually not too difficult to write a computer program that does these complex code transformations for you. The designers of the C# language realized this need would crop up, and they have added such a capability to the C# 5.0 compiler. Starting with C# 5.0, you can rewrite the synchronous program given earlier into an asynchronous program much more easily using the Task-based Asynchronous Pattern (TAP); the C# compiler then does the tedious work of transforming your method into a series of task continuations. Listing 18.15 shows how to rewrite Listing 18.13 into an asynchronous method without the major structural changes of Listing 18.14.

LISTING 18.15: An Asynchronous Web Request Using the Task-Based Asynchronous Pattern


using System;
using System.IO;
using System.Net;
using System.Linq;
using System.Threading.Tasks;

public class Program
{
  private static async Task WriteWebRequestSizeAsync(
      string url)
  {
      try
      {
          WebRequest webRequest =
              WebRequest.Create(url);
          WebResponse response =
              await webRequest.GetResponseAsync();
          using(StreamReader reader =
              new StreamReader(
                  response.GetResponseStream()))
          {
              string text =
                  await reader.ReadToEndAsync();
              Console.WriteLine(
                  FormatBytes(text.Length));
          }
      }
      catch(WebException)
      {
          // ...
      }
      catch(IOException)
      {
          // ...
      }
      catch(NotSupportedException)
      {
          // ...
      }
  }

  public static void Main(string[] args)
  {
      string url = "http://www.IntelliTect.com";
      if(args.Length > 0)
      {
          url = args[0];
      }

      Console.Write(url);

      Task task = WriteWebRequestSizeAsync(url);                         

      while(!task.Wait(100))
      {
          Console.Write(".");
      }
  }

  // ...

}


Notice the small differences between Listing 18.13 and Listing 18.15. First, we refactor the body of the web request functionality into a new method (WriteWebRequestSizeAsync()) and add the new contextual keyword async to the method’s declaration. A method decorated with this keyword must return Task, Task<T>, or void. In this case, since there is no data returned by the body of the method but we still want the capability of returning information about the asynchronous activity to the caller, WriteWebRequestSizeAsync() returns Task. Notice the method name suffix is Async; this is not necessary, but it is conventional to mark asynchronous methods this way so as to identify their asynchronous behavior. Finally, everywhere there is an asynchronous equivalent for the synchronous method, we insert the new contextual keyword await before invoking the asynchronous version.

Notice that nothing else changes between Listings 18.13 and 18.15. The asynchronous method versions seemingly still return the same data types as before—despite the fact that each actually returns a Task<T>. This is not via some magical implicit cast, either. GetResponseAsync() is declared as follows:

public virtual Task<WebResponse> GetResponseAsync() { ... }

At the call site, we assign the return value to WebResponse:

WebResponse response = await webRequest.GetResponseAsync();

The await contextual keyword plays the critical role here: It signals to the compiler that it should rewrite the method into a state machine that represents all the control flow we saw in Listing 18.14 (and more).

Also notice the try/catch logic improvements over Listing 18.14 that appear in Listing 18.15. In Listing 18.15, there is no need to catch an AggregateException. The catch clauses continue to catch the exact types of exception expected, with no unwrapping of the inner exceptions required. Rather, the compiler’s rewrite ensures that an exception stored in the task is processed just as if it were a normal, synchronously thrown exception: When you await the task, the rewrite pulls the first exception from the AggregateException’s collection and throws it. The aim is to make the asynchronous code look as much as possible like the synchronous code.
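The difference is easy to demonstrate. In the following sketch (our own), Wait() surfaces an AggregateException, whereas await surfaces the original InvalidOperationException directly:


using System;
using System.Threading.Tasks;

public class Program
{
  // An Action variable avoids any overload ambiguity with
  // Task.Run()
  private static readonly Action ThrowAction =
      () => { throw new InvalidOperationException(); };

  public static void Main()
  {
      try
      {
          Task.Run(ThrowAction).Wait();
      }
      catch(AggregateException exception)
      {
          Console.WriteLine(
              $"Wait(): { exception.GetType().Name }");
      }

      AwaitFaultedTaskAsync().Wait();
  }

  private static async Task AwaitFaultedTaskAsync()
  {
      try
      {
          // await rethrows the first inner exception rather
          // than the AggregateException itself
          await Task.Run(ThrowAction);
      }
      catch(InvalidOperationException exception)
      {
          Console.WriteLine(
              $"await: { exception.GetType().Name }");
      }
  }
}
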

To better understand the control flow, Table 18.2 shows each task in a separate column along with the execution that occurs on each task.


TABLE 18.2: Control Flow within Each Task

There are a couple of important misconceptions that the table helps to dismiss:

Misconception #1: A method decorated with the async keyword is automatically executed on a worker thread when called. This is absolutely not true; the method is executed normally, on the calling thread, and if the implementation doesn’t await any incomplete awaitable tasks, it will complete synchronously on the same thread. It’s the method’s implementation that is responsible for starting any asynchronous work. Just using the async keyword does not change where the method’s code executes. Also, there is nothing unusual about a call to an async method from the caller’s perspective; it is a method typed as returning a Task, it is called normally, and it returns an object of its return type normally.

Misconception #2: The await keyword causes the current thread to block until the awaited task is completed. That is also absolutely not true. If you want the current thread to block until the task completes, call the Wait() method, as we have already described. In fact, the Main thread does so repeatedly while waiting for the other tasks to complete. However, the while(!task.Wait(100)) { } call executes concurrently with the other tasks—not synchronously. The await keyword evaluates the expression that follows it, which is usually of type Task or Task<T>, adds a continuation to the resultant task, and then immediately returns control to the caller. The creation of the task has started asynchronous work; the await keyword means that the developer wishes the caller of this method to continue executing its work on this thread while the asynchronous work is processed. At some point after that asynchronous work is complete, execution will resume at the point of control following the await expression.

In fact, the principal reasons why the async keyword exists in the first place are twofold. First, it makes it crystal clear to the reader of the code that the method that follows will be automatically rewritten by the compiler. Second, it informs the compiler that usages of the await contextual keyword in the method are to be treated as asynchronous control flow, and not as an ordinary identifier.

Asynchronous Lambdas

Just as a lambda expression converted to a delegate can be used as a concise syntax for declaring a normal method, so C# 5.0 (and later) also allows lambdas containing await expressions to be converted to delegates. To do so, just precede the lambda expression with the async keyword. In Listing 18.16, we rewrite the WriteWebRequestSizeAsync() method from Listing 18.15 from an async method to an async lambda.

LISTING 18.16: An Asynchronous Client-Server Interaction As a Lambda Expression


using System;
using System.IO;
using System.Net;
using System.Linq;
using System.Threading.Tasks;

public class Program
{

  public static void Main(string[] args)
  {
      string url = "http://www.IntelliTect.com";
      if(args.Length > 0)
      {
          url = args[0];
      }

      Console.Write(url);

      Func<string, Task> writeWebRequestSizeAsync =
          async (string webRequestUrl) =>
          {
              // Error handling omitted for
              // elucidation
              WebRequest webRequest =
                 WebRequest.Create(webRequestUrl);

              WebResponse response =
                  await webRequest.GetResponseAsync();
              using(StreamReader reader =
                  new StreamReader(
                      response.GetResponseStream()))
              {
                  string text =
                      (await reader.ReadToEndAsync());
                  Console.WriteLine(
                      FormatBytes(text.Length));
              }
          };

      Task task = writeWebRequestSizeAsync(url);                        

      while (!task.Wait(100))
      {
          Console.Write(".");
      }
  }

  // ...

}


Note that an async lambda expression has the exact same restrictions as the named async method:

An async lambda expression must be converted to a delegate whose return type is void, Task, or Task<T>.

The lambda is rewritten so that return statements become signals that the task returned by the lambda has completed with the given result.

Execution within the lambda expression occurs synchronously until the first await on an incomplete awaitable is executed.

All instructions following the await will execute as continuations on the return from the invoked asynchronous method (or, if the awaitable is already complete, will be simply executed synchronously rather than as continuations).

The task returned from invoking an async lambda expression can itself be awaited (not shown in Listing 18.16).

Wrapping your head around precisely what is happening in an async method can be difficult, but it is far less difficult than trying to figure out what asynchronous code written with explicit continuations in lambdas is doing. The key points to remember are as follows:

When control reaches an await keyword, the expression that follows it produces a task.9 Control then returns to the caller so that it can continue to do work while the task completes asynchronously.

9. Technically, it is an awaitable type as described in the Advanced Topic titled “Awaiting Non-Task<T> Values.”

Some time after the task completes, control resumes at the point following the await. If the awaited task produces a result, that result is then obtained. If it faulted, the exception is thrown.

A return statement in an async method causes the task associated with the method invocation to become completed; if the return statement has a value, the value returned becomes the result of the task.
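These points can be seen together in a minimal sketch (ours, not from the listings): the method returns control at the await, and its return statement later becomes the result of the task:


using System;
using System.Threading.Tasks;

public class Program
{
  private static async Task<int> GetAnswerAsync()
  {
      // Control returns to the caller at this await; the rest
      // of the method runs as a continuation
      await Task.Delay(100);

      // The return value becomes the result of the task
      // associated with this method invocation
      return 42;
  }

  public static void Main()
  {
      Task<int> task = GetAnswerAsync();
      Console.WriteLine("Answer pending...");
      Console.WriteLine($"Answer: { task.Result }"); // 42
  }
}
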

Task Schedulers and the Synchronization Context

On occasion, this chapter has mentioned the task scheduler and its role in determining how to assign work to threads efficiently. Programmatically, the task scheduler is an instance of System.Threading.Tasks.TaskScheduler. This class, by default, uses the thread pool to schedule tasks appropriately, determining when to reuse threads, when to dispose of them, and when to create additional ones so that tasks execute safely and efficiently.

It is possible to create your own task scheduler that makes different choices about how to schedule tasks by deriving a new type from the TaskScheduler class. You can obtain a TaskScheduler that will schedule a task to the current thread (or, more precisely, to the synchronization context associated with the current thread), rather than to a different worker thread, by using the static FromCurrentSynchronizationContext() method.10

10. For an example, see Listing C.8 in Multithreading Patterns Prior to C# 5.0, available at IntelliTect.com/EssentialCSharp.

The synchronization context under which a task executes and, in turn, the continuation task(s) execute(s), is important because the awaiting task consults the synchronization context (assuming there is one) so that a task can execute efficiently and safely. Listing 18.20 (along with Output 18.7) is similar to Listing 18.5 except that it also prints out the thread ID when it displays the message.

LISTING 18.20: Calling Task.ContinueWith()


using System;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      DisplayStatus("Before");
      Task taskA =
          Task.Run(() =>
               DisplayStatus("Starting..."))
          .ContinueWith( antecedent =>
               DisplayStatus("Continuing A..."));
      Task taskB = taskA.ContinueWith( antecedent =>
          DisplayStatus("Continuing B..."));
      Task taskC = taskA.ContinueWith( antecedent =>
          DisplayStatus("Continuing C..."));

      Task.WaitAll(taskB, taskC);
      DisplayStatus("Finished!");
  }

  private static void DisplayStatus(string message)
  {
      // string.Format() is unnecessary here (and unsafe if
      // message contains braces); interpolation suffices
      string text =
          $@"{ Thread.CurrentThread.ManagedThreadId
              }: { message }";
      Console.WriteLine(text);
  }
}


OUTPUT 18.7

1: Before
3: Starting...
4: Continuing A...
3: Continuing C...
4: Continuing B...
1: Finished!

What is noteworthy about this output is that the thread ID changes sometimes and gets repeated at other times. In this kind of plain console application, the synchronization context (accessible from SynchronizationContext.Current) is null, so continuations are simply scheduled to the thread pool. This explains why the thread ID changes between tasks: Sometimes the thread pool determines that it is more efficient to use a new thread, and sometimes it decides that the best course of action is to reuse an existing thread.

Fortunately, the synchronization context gets set automatically for types of applications where that is critical. For example, if the code creating tasks is running in a thread created by ASP.NET, the thread will have a synchronization context of type AspNetSynchronizationContext associated with it. In contrast, if your code is running in a thread created in a Windows UI application (WPF or Windows Forms), the thread will have an instance of DispatcherSynchronizationContext associated with it. (For console applications, there is no synchronization context by default.) Since the TPL consults the synchronization context and the synchronization context varies depending on the circumstances of the execution, the TPL is able to schedule continuations executing in contexts that are both efficient and safe.
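You can verify which synchronization context (if any) is present with a simple check. In a plain console application, the following sketch prints “<null>”, whereas on a WPF UI thread it would report DispatcherSynchronizationContext:


using System;
using System.Threading;

public class Program
{
  public static void Main()
  {
      SynchronizationContext context =
          SynchronizationContext.Current;

      Console.WriteLine(
          context == null
              ? "<null>"
              : context.GetType().ToString());
  }
}
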

To modify the code so that the synchronization context is leveraged instead, you must (1) set the synchronization context and (2) use async/await so that the synchronization context is consulted.11

11. For a simple example of how to set the synchronization context of a thread, and how to use a task scheduler to schedule a task to that thread, see Listing C.8 in Multithreading Patterns Prior to C# 5.0, available at IntelliTect.com/EssentialCSharp.

It is possible to define custom synchronization contexts, and to work with existing synchronization contexts to improve their performance in some specific scenarios. However, describing how to do so is beyond the scope of this text.

async/await with the Windows UI

One place where synchronization is especially important is in the context of UI and Web programming. With the Windows UI, for example, a message pump processes messages such as mouse click and move events. Furthermore, the UI is single-threaded, so interaction with any UI component (a text box, for example) must always occur from the single UI thread. One of the key advantages of the async/await pattern is that it leverages the synchronization context to ensure that continuation work—work that appears after the await expression—will always execute on the same synchronization context from which the await was issued. This approach is of significant value because it eliminates the need to explicitly switch back to the UI thread to update a control.

To better appreciate this benefit, consider the example of a UI event for a button click in WPF, as shown in Listing 18.21.

LISTING 18.21: Synchronous High-Latency Invocation in WPF


using System;
using System.Net.NetworkInformation;
using System.Windows;

private void PingButton_Click(
  object sender, RoutedEventArgs e)
{
  StatusLabel.Content = "Pinging...";
  UpdateLayout();
  Ping ping = new Ping();
  PingReply pingReply =
      ping.Send("www.IntelliTect.com");
  StatusLabel.Text = pingReply.Status.ToString();
}


Given that StatusLabel is a WPF System.Windows.Controls.TextBlock control and we have updated its Text property twice within the PingButton_Click() event subscriber, it would be a reasonable assumption that first “Pinging...” would be displayed until Ping.Send() returned, and then the label would be updated with the status of the Send() reply. As those experienced with Windows UI frameworks well know, this is not, in fact, what happens. Rather, a message is posted to the Windows message pump to update the text with “Pinging...” but, because the UI thread is busy executing the PingButton_Click() method, the Windows message pump is not processed. By the time the UI thread frees up to look at the Windows message pump, a second Text property update request has been queued, and the only message that the user is able to observe is the final status.

To fix this problem using TAP, we change the code highlighted in Listing 18.22.

LISTING 18.22: Synchronous High-Latency Invocation in WPF Using await


using System;
using System.Net.NetworkInformation;
using System.Windows;

async private void PingButton_Click(
  object sender, RoutedEventArgs e)
{
  StatusLabel.Text = "Pinging...";
  UpdateLayout();
  Ping ping = new Ping();
  PingReply pingReply =
      await ping.SendPingAsync("www.IntelliTect.com");                  
  StatusLabel.Text = pingReply.Status.ToString();
}


This change offers two advantages. First, the asynchronous nature of the ping call frees the caller thread to return to the Windows message pump, so the update to StatusLabel.Text is processed and “Pinging...” appears to the user. Second, when the await of ping.SendPingAsync() completes, the continuation will always execute on the same synchronization context as the caller. Because the synchronization context is specifically appropriate for a Windows UI, it is single-threaded and, therefore, the return will always be to the same thread—the UI thread. In other words, rather than immediately executing the continuation task, the TPL consults the synchronization context, which instead posts a message regarding the continuation work to the message pump. Next, because the UI thread monitors the message pump, upon picking up the continuation work message, it invokes the code following the await call. (As a result, the invocation of the continuation code is on the same thread as the caller that processed the message pump.)

There is a key code readability feature built into the TAP language pattern. Notice in Listing 18.22 that the assignment of pingReply.Status to StatusLabel.Text appears to flow naturally after the await, providing a clear indication that it will execute immediately following the previous line. However, writing what really happens from scratch would be far less understandable, for multiple reasons.

await Operators

There is no limit to the number of await expressions a single method may contain, nor are they restricted to appearing one after another in a sequence of statements. Rather, await expressions can be placed into loops and processed consecutively, following the natural control flow of the code as it appears. Consider the example in Listing 18.23.

LISTING 18.23: Iterating over an Await Operation


async private void PingButton_Click(
  object sender, RoutedEventArgs e)
{
  List<string> urls = new List<string>()
      {
          "www.habitat-spokane.org",
          "www.partnersintl.org",
          "www.iassist.org",
          "www.fh.org",
          "www.worldvision.org"
      };
  IPStatus status;

  Func<string, Task<IPStatus>> func =
      async (localUrl) =>
      {
          Ping ping = new Ping();
          PingReply pingReply =
              await ping.SendPingAsync(localUrl);
          return pingReply.Status;
      };

  StatusLabel.Content = "Pinging...";

  foreach(string url in urls)                                             
  {                                                                       
      status = await func(url);                                           
      StatusLabel.Text =                                                  
          $@"{ url }: { status.ToString() } ({                            
              Thread.CurrentThread.ManagedThreadId })";                   
  }                                                                       
}


Regardless of whether the await statements occur within an iteration or as separate entries, they will execute serially, one after the other and in the same order they were invoked from the calling thread. The underlying implementation is to string them together in the semantic equivalent of Task.ContinueWith(), except that all of the code between the await operators will execute in the caller’s synchronization context.
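Incidentally, when the operations need not complete one at a time, an alternative is to start them all first and then await the .NET 4.5 combinator Task.WhenAll(). The following sketch (our own, with hypothetical URLs) pings all of the sites concurrently; the results arrive in the same order as the inputs:


using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.NetworkInformation;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      PingAllAsync().Wait();
  }

  private static async Task PingAllAsync()
  {
      List<string> urls = new List<string>()
          {
              "www.intellitect.com",
              "www.microsoft.com"
          };

      // Start every ping before awaiting any of them
      PingReply[] replies = await Task.WhenAll(
          urls.Select(url => new Ping().SendPingAsync(url)));

      foreach(PingReply reply in replies)
      {
          Console.WriteLine(reply.Status);
      }
  }
}
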

Support for TAP from the UI is one of the key scenarios that led to TAP’s creation. A second scenario takes place on the server, when a request comes in from a client to query an entire table’s worth of data from the database. As querying the data could be time-consuming, you might be tempted to create a new thread rather than consume one from the limited number allocated to the thread pool. The problem with this approach is that the work of querying the database executes entirely on another machine; there is no reason to block an entire thread, given that the thread is generally not active anyway.

To summarize, TAP was created to address these key problems:

There is a need to allow long-running activities to occur without blocking the UI thread.

Creating a new thread (or Task) for non–CPU-intensive work is relatively expensive when you consider that all the thread is doing is waiting for the activity to complete.

When the activity completes (either by using a new thread or via a callback), it is frequently necessary to make a thread synchronization context switch back to the original caller that initiated the activity.

TAP provides a new pattern that works for both CPU-intensive and non–CPU-intensive asynchronous invocations—one that all .NET languages support explicitly.

Executing Loop Iterations in Parallel

Consider the for loop statement and associated code shown in Listing 18.24 (with the corresponding results in Output 18.8). The listing calls a method for calculating a section of the decimal expansion of pi, where the parameters are the number of digits and the digit to start with. The actual calculation is not germane to the discussion. What is interesting about this calculation is that it is embarrassingly parallelizable; that is, it is almost embarrassing how easy it is to split up a large task—say, computing 1 million decimal digits of pi—into any desired number of smaller tasks that can all be run in parallel. These types of computations are the easiest ones to speed up by adding parallelism.

LISTING 18.24: For Loop Synchronously Calculating Pi in Sections


using System;
using AddisonWesley.Michaelis.EssentialCSharp.Shared;

class Program
{
  const int TotalDigits = 100;
  const int BatchSize = 10;

  static void Main()
  {
      string pi = null;
      const int iterations = TotalDigits / BatchSize;
      for(int i = 0; i < iterations; i++)
      {
          pi += PiCalculator.Calculate(
              BatchSize, i * BatchSize);
      }

      Console.WriteLine(pi);
  }
}


using System;

class PiCalculator
{
  public static string Calculate(
      int digits, int startingAt)
  {
     // ...
  }

   // ...
}


OUTPUT 18.8

3.14159265358979323846264338327950288419716939937510582097494459230781
64062862089986280348253421170679821480865132823066470938446095505822317
25359408128481117450284102701938521105559644622948954930381964428810975
66593344612847564823378678316527120190914564856692346034861045432664821
33936072602491412737245870066063155881748815209209628292540917153643678
92590360011330530548820466521384146951941511609433057270365759591953092
18611738193261179310511854807446237996274956735188575272489122793818301
194912


The for loop executes each iteration synchronously and sequentially. However, because the pi calculation algorithm splits the pi calculation into independent pieces, it is not necessary to compute the pieces sequentially as long as the results are appended in the right order. Imagine what would happen if you could have all the iterations of this loop run concurrently: Each processor could take a single iteration and execute it in parallel with other processors executing other iterations. Given the simultaneous execution of iterations, we could decrease the execution time roughly in proportion to the number of available processors.

The TPL provides a convenient method, Parallel.For(), that does precisely that. Listing 18.25 shows how to modify the sequential, single-threaded program in Listing 18.24 to use the helper method.

LISTING 18.25: For Loop Calculating Pi in Sections in Parallel


using System;
using System.Threading.Tasks;                                          
using AddisonWesley.Michaelis.EssentialCSharp.Shared;

// ...

class Program
{
  // ...

  static void Main()
  {
      string pi = null;
      const int iterations = TotalDigits / BatchSize;
      string[] sections = new string[iterations];
      Parallel.For(0, iterations, (i) =>
      {
          sections[i] = PiCalculator.Calculate(
              BatchSize, i * BatchSize);
      });
      pi = string.Join("", sections);
      Console.WriteLine(pi);
  }
}


The output for Listing 18.25 is identical to Output 18.8; however, the execution time is significantly faster if you have multiple CPUs (and possibly slower if you do not). The Parallel.For() API is designed to look similar to a standard for loop. The first parameter is the fromInclusive value, the second is the toExclusive value, and the last is the Action<int> to perform as the loop body. When using an expression lambda for the action, the code looks similar to a for loop statement except that now each iteration may execute in parallel. As with the for loop, the call to Parallel.For() will not complete until all iterations are complete. In other words, by the time execution reaches the string.Join() statement, all sections of pi will have been calculated.

Note that the code for combining the various sections of pi no longer occurs inside the iteration (action) in Listing 18.25. As sections of the pi calculation will very likely not complete sequentially, appending a section whenever an iteration completes would likely append them out of order. Even if sequence were not a problem, there would still be a potential race condition, because the += operator is not atomic. To address both of these problems, each section of pi is stored into its own element of an array, and no two iterations access the same element of the array simultaneously. Only once all sections of pi are calculated does string.Join() combine them. In other words, we postpone concatenating the sections until after the Parallel.For() loop has completed. This avoids any race condition caused by sections not yet calculated or sections concatenating out of order.

The TPL uses the same sorts of thread pooling techniques that it uses for task scheduling to ensure good performance of the parallel loop: It will try to ensure that CPUs are not overscheduled, and so on.


Guidelines

DO use parallel loops when the computations performed can be easily split up into many mutually independent processor-bound computations that can be executed in any order on any thread.


The TPL also provides a similar parallel version of the foreach statement, as shown in Listing 18.26.

LISTING 18.26: Parallel Execution of a foreach Loop


using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;                                        

class Program
{
  // ...
  static void EncryptFiles(
      string directoryPath, string searchPattern)
  {
      IEnumerable<string> files = Directory.EnumerateFiles(
          directoryPath, searchPattern,
          SearchOption.AllDirectories);

      Parallel.ForEach(files, (fileName) =>                          
      {                                                              
          Encrypt(fileName);                                         
      });                                                            
  }
  // ...
}


In this example, we call a method that encrypts each file within the files collection. It does so in parallel, executing as many threads as the TPL determines is efficient.

Canceling a Parallel Loop

Unlike a task, which requires an explicit call if it is to block until it completes, a parallel loop executes iterations in parallel but does not itself return until the entire parallel loop completes. Canceling a parallel loop, therefore, generally involves invoking the cancellation request from a thread other than the one executing the parallel loop. In Listing 18.28, we invoke Parallel.ForEach<T>() using Task.Run(). In this manner, not only does the loop execute in parallel, but it also executes asynchronously, allowing the code to prompt the user to “Push ENTER to exit.”

LISTING 18.28: Canceling a Parallel Loop


using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
      // ...

  static void EncryptFiles(
      string directoryPath, string searchPattern)
  {

      string stars =
          "*".PadRight(Console.WindowWidth-1, '*');

      IEnumerable<string> files = Directory.GetFiles(
         directoryPath, searchPattern,
         SearchOption.AllDirectories);

      CancellationTokenSource cts =                                        
          new CancellationTokenSource();                                   
      ParallelOptions parallelOptions =                                    
          new ParallelOptions                                              
              { CancellationToken = cts.Token };                           
      cts.Token.Register(                                                  
          () => Console.WriteLine("Cancelling..."));                       

      Console.WriteLine("Push ENTER to exit.");

      // Use Task.Factory.StartNew<string>() for
      // TPL prior to .NET 4.5
      Task task = Task.Run(() =>
          {
              try
              {
                  Parallel.ForEach(
                      files, parallelOptions,
                      (fileName, loopState) =>
                          {
                              Encrypt(fileName);
                          });
              }
              catch(OperationCanceledException){}
          });

      // Wait for the user's input
      Console.Read();

      // Cancel the query
      cts.Cancel();                                                         
      Console.Write(stars);
      task.Wait();
  }
}


The parallel loops use the same cancellation token pattern that tasks use. The token obtained from a CancellationTokenSource is associated with the parallel loop by calling an overload of the ForEach() method that has a parameter of type ParallelOptions. This object contains the cancellation token.

Note that if you cancel a parallel loop operation, any iterations that have not started yet are prevented from starting by checking the IsCancellationRequested property. Existing executing iterations will run to their respective termination points. Furthermore, calling Cancel() even after all iterations have completed will still cause the registered cancel event (via cts.Token.Register()) to execute.

The only means by which the ForEach() method is able to acknowledge that the loop has been canceled is via the OperationCanceledException. Given that cancellation in this example is expected, the exception is caught and ignored, allowing the application to display “Canceling...”, followed by a line of stars before exiting.

Running LINQ Queries in Parallel

Just as it is possible to execute a loop in parallel using Parallel.For(), so it is also possible to execute LINQ queries in parallel using the Parallel LINQ API (PLINQ, for short). An example of a simple nonparallel LINQ expression is shown in Listing 18.29; in Listing 18.30, we modify it to run in parallel.

LISTING 18.29: LINQ Select()


using System.Collections.Generic;
using System.Linq;

class Cryptographer
{
  // ...
  public List<string>
    Encrypt(IEnumerable<string> data)
  {
      return data.Select(
          item => Encrypt(item)).ToList();
  }
  // ...
}


In Listing 18.29, a LINQ query uses the Select() standard query operator to encrypt each string within a sequence of strings, and convert the resultant sequence to a list. This seems like an “embarrassingly parallel” operation; each encryption is likely to be a high-latency processor-bound operation that could be farmed out to a worker thread on another CPU.

Listing 18.30 shows how to modify Listing 18.29 so that the code that encrypts the strings is executed in parallel.

LISTING 18.30: Parallel LINQ Select()


using System.Linq;

class Cryptographer
{
  // ...
  public List<string> Encrypt(IEnumerable<string> data)
  {
      return data.AsParallel().Select(
          item => Encrypt(item)).ToList();
  }
  // ...
}


As Listing 18.30 shows, the change to enable parallel support is extremely small! All that it uses is a .NET Framework 4.0–introduced standard query operator, AsParallel(), which can be found on the static class System.Linq.ParallelEnumerable. This simple extension method tells the runtime that it can execute the query in parallel. The result is that on machines with multiple available CPUs, the total time taken to execute the query can be significantly shorter.
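One caveat: by default, a parallel query is free to yield its results out of order. When the output order must match the input order, the AsOrdered() operator can be added, at some cost to throughput. The following sketch assumes a hypothetical EncryptOrdered() variation (with a placeholder Encrypt() so that it compiles standalone):


using System.Collections.Generic;
using System.Linq;

class Cryptographer
{
  // ...
  public List<string> EncryptOrdered(IEnumerable<string> data)
  {
      // AsOrdered() preserves the source ordering in the
      // output, trading away some parallel throughput
      return data.AsParallel().AsOrdered().Select(
          item => Encrypt(item)).ToList();
  }

  private string Encrypt(string item)
  {
      // Placeholder for the real encryption logic
      return item;
  }
  // ...
}
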

System.Linq.ParallelEnumerable includes a superset of the query operators available on System.Linq.Enumerable, resulting in possible performance improvements for all of the common query operators, including those used for sorting, filtering (Where()), projecting (Select()), joining, grouping, and aggregating. Listing 18.31 shows how to do a parallel sort.

LISTING 18.31: Parallel LINQ with Standard Query Operators


// ...
      OrderedParallelQuery<string> parallelSorted =
          data.AsParallel().OrderBy(item => item);

      // Show that the total count of items still
      // matches the original count
      System.Diagnostics.Trace.Assert(
          data.Count == parallelSorted.Count());
// ...


As Listing 18.31 shows, invoking the parallel version simply involves a call to the AsParallel() extension method. Notice that the type of the result returned by the parallel standard query operators is either ParallelQuery<T> or OrderedParallelQuery<T>; both inform the compiler that it should continue to use the parallel versions of the standard query operations that are available.

Given that query expressions are simply syntactic sugar for the method call form of the query used in Listings 18.30 and 18.31, you can just as easily use AsParallel() with the expression form. Listing 18.32 shows an example of executing a grouping operation in parallel using query expression syntax.

LISTING 18.32: Parallel LINQ with Query Expressions


// ...
      ParallelQuery<IGrouping<char, string>> parallelGroups;
      parallelGroups =
          from text in data.AsParallel()                                 
            orderby text                                                 
            group text by text[0];                                       

      // Show the total count of items still
      // matches the original count
      System.Diagnostics.Trace.Assert(
          data.Count == parallelGroups.Sum(
              item => item.Count()));
// ...


As you saw in the previous examples, converting a query or iteration loop to execute in parallel is simple. There is one significant caveat, however: As we will discuss in depth in Chapter 19, you must take care not to allow multiple threads to inappropriately access and modify the same memory simultaneously. Doing so will cause a race condition.
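
To see why, consider a deliberately broken sketch: incrementing a shared counter from a parallel loop without synchronization frequently loses updates, because increments from different threads overlap:

using System;
using System.Threading.Tasks;

public class Program
{
  public static void Main()
  {
      int count = 0;

      // BUG: count++ is a nonatomic read-modify-write, so
      // parallel iterations can overwrite one another's updates
      Parallel.For(0, 1000000, i => { count++; });

      // Frequently prints a value less than 1000000
      Console.WriteLine(count);
  }
}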

As we saw earlier in this chapter, the Parallel.For() and Parallel.ForEach<T>() methods will gather up any exceptions thrown during the parallel iterations and then throw one aggregating exception containing all of the original exceptions. PLINQ operations are no different. That is, they also have the potential of returning multiple exceptions for the exact same reason: When the query logic is run on each element in parallel, the code executing on each element can independently throw an exception. Unsurprisingly, PLINQ deals with this situation in exactly the same way as do parallel loops and the TPL: Exceptions thrown during parallel queries are accessible via the InnerExceptions property of the AggregateException. Therefore, wrapping a PLINQ query in a try/catch block that catches System.AggregateException will handle any exceptions that went unhandled within the individual iterations.
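
A minimal sketch of that pattern follows; the selector that fails on one particular element is contrived for illustration:

using System;
using System.Linq;

public class Program
{
  public static void Main()
  {
      try
      {
          int[] results = Enumerable.Range(0, 100)
              .AsParallel()
              .Select(n =>
              {
                  if (n == 42)
                  {
                      throw new InvalidOperationException(
                          "Element 42 failed");
                  }
                  return n;
              })
              .ToArray();
      }
      catch (AggregateException exception)
      {
          foreach (Exception inner in
              exception.Flatten().InnerExceptions)
          {
              Console.WriteLine(inner.Message);
          }
      }
  }
}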

Canceling a PLINQ Query

As expected, the cancellation request pattern is also available on PLINQ queries. Listing 18.33 (with Output 18.10) provides an example. Like the parallel loops, canceled PLINQ queries will throw a System.OperationCanceledException. Also like the parallel loops, executing a PLINQ query is a synchronous operation on the invoking thread. Thus, a common technique is to wrap the parallel query in a task that runs on another thread so that the current thread can cancel it if necessary—the same solution used in Listing 18.28.

LISTING 18.33: Canceling a PLINQ Query


using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Program
{

  public static List<string> ParallelEncrypt(
      List<string> data,
      CancellationToken cancellationToken)
  {
      return data.AsParallel().WithCancellation(
          cancellationToken).Select(
              (item) => Encrypt(item)).ToList();
  }

  public static void Main()
  {
      ConsoleColor originalColor = Console.ForegroundColor;
      List<string> data = Utility.GetData(100000).ToList();

      CancellationTokenSource cts =
          new CancellationTokenSource();

      Console.WriteLine("Push ENTER to Exit.");

      // Use Task.Factory.StartNew<string>() for
      // TPL prior to .NET 4.5
      Task task = Task.Run(() =>
      {
          data = ParallelEncrypt(data, cts.Token);
      }, cts.Token);

      // Wait for the user's input
      Console.Read();

      if (!task.IsCompleted)
      {
          cts.Cancel();
          try { task.Wait(); }
          catch (AggregateException exception)
          {
              Console.ForegroundColor = ConsoleColor.Red;
              TaskCanceledException taskCanceledException =
                  (TaskCanceledException)exception.Flatten()
                      .InnerExceptions
                      .FirstOrDefault(
                          innerException =>
                              innerException.GetType() ==
                              typeof(TaskCanceledException));
              if (taskCanceledException != null)
              {
                  Console.WriteLine($@"Cancelled: {
                      taskCanceledException.Message }");
              }
              else
              {
                  // ...
              }
          }
      }
      else
      {
          task.Wait();
          Console.ForegroundColor = ConsoleColor.Green;
          Console.Write("Completed successfully");
      }
      Console.ForegroundColor = originalColor;
  }
}


OUTPUT 18.10

Cancelled: A task was canceled.

As with a parallel loop or task, canceling a PLINQ query requires a CancellationToken, which is available from a CancellationTokenSource. However, rather than overloading every PLINQ query operator to support the cancellation token, the ParallelQuery<T> object returned by the AsParallel() extension method includes a WithCancellation() extension method that simply takes a CancellationToken. As a result, calling Cancel() on the CancellationTokenSource object will request cancellation of the parallel query, because the query checks the IsCancellationRequested property on the CancellationToken.

As mentioned, canceling a PLINQ query will throw an exception in place of returning the complete result. One common technique for dealing with a possibly canceled PLINQ query is to wrap the query in a try block and catch the OperationCanceledException. A second common technique, used in Listing 18.33, is to pass the CancellationToken both to ParallelEncrypt() and as a second parameter on Run(). This will cause task.Wait() to throw an AggregateException whose InnerExceptions collection will contain a TaskCanceledException. The aggregating exception can then be caught, just as you would catch any other exception from a parallel operation.
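
A minimal sketch of the first technique, with the query executing synchronously on the calling thread (the cancellation delay is arbitrary, and CancelAfter() assumes .NET 4.5):

using System;
using System.Linq;
using System.Threading;

public class Program
{
  public static void Main()
  {
      CancellationTokenSource cts =
          new CancellationTokenSource();

      // Request cancellation shortly after the query starts
      cts.CancelAfter(50);

      try
      {
          int[] results = ParallelEnumerable.Range(0, int.MaxValue)
              .WithCancellation(cts.Token)
              .Select(n => n * n)
              .ToArray();
      }
      catch (OperationCanceledException)
      {
          // The canceled query acknowledges the request by throwing
          Console.WriteLine("Query canceled.");
      }
  }
}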

Summary

In this chapter, we started by examining the basic parts of multithreaded programs: the Thread class, which represents an independent “point of control” in a program, and the ThreadPool, which encourages efficient allocation and scheduling of threads to multiple CPUs. However, these APIs are low-level entities that are difficult to work with directly. Starting with Version 4.0, the .NET Framework provides the Parallel Extensions library, which includes the Task Parallel Library (TPL) and Parallel LINQ (PLINQ). Both provide new APIs for creating and scheduling units of work represented by Task objects, executing loops in parallel using Parallel.For() and Parallel.ForEach(), and automatically parallelizing LINQ queries with AsParallel().

We also discussed how C# 5.0 makes programming complex workflows with Task objects much easier by automatically rewriting your programs to manage the continuation “wiring” that composes larger tasks out of smaller tasks.

At the beginning of this chapter, we briefly glossed over some of the difficult problems that developers often face when writing multithreaded programs: atomicity problems, deadlocks, and other “race conditions” that introduce uncertainty and bad behavior into multithreaded programs. The standard way to avoid these problems is to carefully write code that uses “locks” to synchronize access to shared resources; this is the topic of the next chapter.

