Chapter 6. Async implementation

This chapter covers

  • The structure of asynchronous code
  • Interacting with the framework builder types
  • Performing a single step in an async method
  • Understanding execution context flow across await expressions
  • Interacting with custom task types

I vividly remember the evening of October 28, 2010. Anders Hejlsberg was presenting async/await at PDC, and shortly before his talk started, an avalanche of downloadable material was made available, including a draft of the changes to the C# specification, a Community Technology Preview (CTP) of the C# 5 compiler, and the slides Anders was presenting. At one point, I was watching the talk live and skimming through the slides while the CTP installed. By the time Anders had finished, I was writing async code and trying things out.

In the next few weeks, I started taking bits apart and looking at exactly what code the compiler was generating, trying to write my own simplistic implementation of the library that came with the CTP, and generally poking at it from every angle. As new versions came out, I worked out what had changed and became more and more comfortable with what was going on behind the scenes. The more I saw, the more I appreciated how much boilerplate code the compiler is happy to write on our behalf. It’s like looking at a beautiful flower under a microscope: the beauty is still there to be admired, but there’s so much more to it than can be seen at first glance.

Not everyone is like me, of course. If you just want to rely on the behavior I’ve already described and simply trust that the compiler will do the right thing, that’s absolutely fine. Alternatively, you won’t miss out on anything if you skip this chapter for now and come back to it at a later date; none of the rest of the book relies on it. It’s unlikely that you’ll ever have to debug your code down to the level that you’ll look at here, but I believe this chapter will give you more insight into how async/await hangs together. Both the awaitable pattern and the requirements for custom task types make more sense after you’ve looked at the generated code. I don’t want to get too mystical about this, but there’s a certain connection between the language and the developer that’s enriched by studying these implementation details.

As a rough approximation, we’ll pretend that the C# compiler performs a transformation from C# code using async/await to C# code without using async/await. Of course, the compiler is able to operate at a lower level than this with intermediate representations that can be emitted as IL. Indeed, in some aspects of async/await, the IL generated can’t be represented in regular C#, but it’s easy enough to explain those places.

Debug and release builds differ, and future implementations may, too

While writing this chapter, I became aware of a difference between debug and release builds of async code: in debug builds, the generated state machines are classes rather than structs. (This is to give a better debugger experience; in particular, it gives more flexibility in Edit and Continue scenarios.) This wasn’t true when I wrote the third edition; the compiler implementation has changed. It may change again in the future, too. If you decompile async code compiled by a C# 8 compiler, it could look slightly different from what’s presented here.

Although this is surprising, it shouldn’t be too alarming. By definition, implementation details can change over time. None of this invalidates any of the insight to be gained from studying a particular implementation. Just be aware that this is a different kind of learning from “these are the rules of C#, and they’ll change only in well-specified ways.”

In this chapter, I show the code generated by a release build. The differences mostly affect performance, and I believe most readers will be more interested in the performance of release builds than debug builds.

The generated code is somewhat like an onion; it has layers of complexity. We’ll start from the very outside and work our way in toward the tricky bit: await expressions and the dance of awaiters and continuations. For the sake of brevity, I’m going to present only asynchronous methods, not async anonymous functions; the machinery between the two is the same anyway, so there’s nothing particularly interesting to learn by repeating the work.

6.1. Structure of the generated code

As I mentioned in chapter 5, the implementation (both in this approximation and in the code generated by the real compiler) is in the form of a state machine. The compiler will generate a private nested struct to represent the asynchronous method, and the containing type must also include a method with the same signature as the one you’ve declared. I call this the stub method; there’s not much to it, but it starts all of the rest going.

Note

Frequently, I’m going to talk about the state machine pausing. This corresponds to a point where the async method reaches an await expression and the operation being awaited hasn’t completed yet. As you may remember from chapter 5, when that happens, a continuation is scheduled to execute the rest of the async method when the awaited operation has completed, and then the async method returns. Similarly, it’s useful to talk about the async method taking a step: the code it executes between pauses, effectively. These aren’t official terms, but they’re useful as shorthand.

The state machine keeps track of where you are within the async method. Logically, there are four kinds of state, in common execution order:

  • Not started
  • Executing
  • Paused
  • Complete (either successfully or faulted)

Only the Paused set of states depends on the structure of the async method. Each await expression within the method is a distinct state to be returned to in order to trigger more execution. While the state machine is executing, it doesn’t need to keep track of the exact piece of code that’s executing; at that point, it’s just regular code, and the CPU keeps track of the instruction pointer just as with synchronous code. The state is recorded when the state machine needs to pause; the whole purpose is to allow it to continue the code execution later from the point it reached. Figure 6.1 shows the transitions between the possible states.

Figure 6.1. State transition diagram

Let’s make this concrete with a real piece of code. The following listing shows a simple async method. It’s not quite as simple as you could make it, but it can demonstrate a few things at the same time.

Listing 6.1. Simple introductory async method
static async Task PrintAndWait(TimeSpan delay)
{
    Console.WriteLine("Before first delay");
    await Task.Delay(delay);
    Console.WriteLine("Between delays");
    await Task.Delay(delay);
    Console.WriteLine("After second delay");
}

Three points to note at this stage are as follows:

  • You have a parameter that you’ll need to use in the state machine.
  • The method includes two await expressions.
  • The method returns Task, so you need to return a task that will complete after the final line is printed, but there’s no specific result.

This is nice and simple because you have no loops or try/catch/finally blocks to worry about. The control flow is simple, apart from the awaiting, of course. Let’s see what the compiler generates for this code.

Do try this at home

I typically use a mixture of ildasm and Redgate Reflector for this sort of work, setting the Optimization level to C# 1 to prevent the decompiler from reconstructing the async method for us. Other decompilers are available, but whichever one you pick, I recommend checking the IL as well. I’ve seen subtle bugs in decompilers when it comes to await, often in terms of the execution order.

You don’t have to do any of this if you don’t want to, but if you find yourself wondering what the compiler does with a particular code construct, and this chapter doesn’t provide the answer, just go for it. Don’t forget the difference between debug and release builds, though, and don’t be put off by the names generated by the compiler, which can make the result harder to read.

Using the tools available, you can decompile listing 6.1 into something like listing 6.2. Many of the names that the C# compiler generates aren’t valid C#; I’ve rewritten them as valid identifiers for the sake of getting runnable code. In other cases, I’ve renamed the identifiers to make the code more readable. In later listings, I’ve taken a few liberties with how the cases and labels for the state machine are ordered; the result is logically equivalent to the generated code but much easier to read. In other places, I’ve used a switch statement even with only two cases, where the compiler might effectively use if/else. In these places, the switch statement represents the more general case that can work when there are multiple points to jump to, but the compiler can generate simpler code for simpler situations.

Listing 6.2. Generated code for listing 6.1 (except for MoveNext)
Stub method
[AsyncStateMachine(typeof(PrintAndWaitStateMachine))]
[DebuggerStepThrough]
private static Task PrintAndWait(TimeSpan delay)
{
    var machine = new PrintAndWaitStateMachine               1
    {                                                        1
        delay = delay,                                       1
        builder = AsyncTaskMethodBuilder.Create(),           1
        state = -1                                           1
    };                                                       1
    machine.builder.Start(ref machine);                      2
    return machine.builder.Task;                             3
}

Private struct for the state machine
[CompilerGenerated]
private struct PrintAndWaitStateMachine : IAsyncStateMachine
{
    public int state;                                        4
    public AsyncTaskMethodBuilder builder;                   5
    private TaskAwaiter awaiter;                             6
    public TimeSpan delay;                                   7

    void IAsyncStateMachine.MoveNext()                       8
    {                                                        8
    }                                                        8

    [DebuggerHidden]
    void IAsyncStateMachine.SetStateMachine(
        IAsyncStateMachine stateMachine)
    {
        this.builder.SetStateMachine(stateMachine);          9
    }
}

  • 1 Initializes the state machine, including method parameters
  • 2 Runs the state machine until it needs to wait
  • 3 Returns the task representing the async operation
  • 4 State of the state machine (where to resume)
  • 5 The builder hooking into async infrastructure types
  • 6 Awaiter to fetch result from when resuming
  • 7 Original method parameter
  • 8 Main state machine work goes here.
  • 9 Connects the builder and the boxed state machine

This listing looks somewhat complicated already, but I should warn you that the bulk of the work is done in the MoveNext method, and I’ve completely removed the implementation of that for now. The point of listing 6.2 is to set the scene and provide the structure so that when you get to the MoveNext implementation, it makes sense. Let’s look at the pieces of the listing in turn, starting with the stub method.

6.1.1. The stub method: Preparation and taking the first step

The stub method from listing 6.2 is simple apart from the AsyncTaskMethodBuilder. This is a value type, and it’s part of the common async infrastructure. You’ll see over the rest of the chapter how the state machine interacts with the builder.

[AsyncStateMachine(typeof(PrintAndWaitStateMachine))]
[DebuggerStepThrough]
private static Task PrintAndWait(TimeSpan delay)
{
    var machine = new PrintAndWaitStateMachine
    {
        delay = delay,
        builder = AsyncTaskMethodBuilder.Create(),
        state = -1
    };
    machine.builder.Start(ref machine);
    return machine.builder.Task;
}

The attributes applied to the method are essentially for tooling. They have no effect on regular execution, and you don’t need to know any details about them in order to understand the generated asynchronous code. The state machine is always created in the stub method with three pieces of information:

  • Any parameters (in this case, just delay), each as separate fields in the state machine
  • The builder, which varies depending on the return type of the async method
  • The initial state, which is always –1

Note

The name AsyncTaskMethodBuilder may make you think of reflection, but it’s not creating a method in IL or anything like that. The builder provides functionality that the generated code uses to propagate success and failure, handle awaiting, and so forth. If the name “helper” works better for you, feel free to think of it that way.

After creating the state machine, the stub method asks the machine’s builder to start it, passing the machine itself by reference. You’ll see quite a lot of passing by reference in the following few pages, and this comes down to a need for efficiency and consistency. Both the state machine and the AsyncTaskMethodBuilder are mutable value types. Passing machine by reference to the Start method avoids making a copy of the state, which is more efficient and ensures that any changes made to the state within Start are still visible when the Start method returns. In particular, the builder state within the machine may well change during Start. That’s why it’s important that you use machine.builder for both the Start call and the Task property afterward. Suppose you extracted machine.builder to a local variable, like this:

var builder = machine.builder;     1
builder.Start(ref machine);        1
return builder.Task;               1

  • 1 Invalid attempt at refactoring

With that code, state changes made directly within builder.Start() wouldn’t be seen within machine.builder (or vice versa) because it would be a copy of the builder. This is where it’s important that machine.builder refers to a field, not a property. You don’t want to operate on a copy of the builder in the state machine; rather, you want to operate directly on the value that the state machine contains. This is precisely the sort of detail that you don’t want to have to deal with yourself and is why mutable value types and public fields are almost always a bad idea. (You’ll see in chapter 11 how they can be useful when carefully considered.)
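The copying problem isn’t specific to builders. As a minimal sketch, the following code uses two hypothetical types of my own, Counter and Holder (neither is part of the async infrastructure), to stand in for the builder and the state machine. It compares mutating a struct through the field that holds it with mutating a local copy.

```csharp
// Hypothetical mutable value type, standing in for AsyncTaskMethodBuilder.
public struct Counter
{
    public int Count;
    public void Increment() => Count++;
}

// Hypothetical container, standing in for the state machine.
public struct Holder
{
    public Counter counter;   // A public field, like machine.builder
}

public static class CopyDemo
{
    // Mutates the Counter value stored directly in the field.
    public static int ViaField()
    {
        var holder = new Holder();
        holder.counter.Increment();
        return holder.counter.Count;
    }

    // Mutates a copy; the value stored in the field is left untouched.
    public static int ViaLocalCopy()
    {
        var holder = new Holder();
        var copy = holder.counter;   // Copies the struct value
        copy.Increment();            // Changes only the copy
        return holder.counter.Count;
    }
}
```

ViaField() returns 1, whereas ViaLocalCopy() returns 0: the increment was applied to the copy and then lost, which is the same fate that would befall changes made by builder.Start() in the invalid refactoring.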

Starting the machine doesn’t create any new threads. It just runs the state machine’s MoveNext() method until the state machine either needs to pause while it awaits another asynchronous operation or completes. In other words, it takes one step. Either way, MoveNext() returns, at which point machine.builder.Start() returns, and you can return a task representing the overall asynchronous method to the caller. The builder is responsible for creating the task and ensuring that it changes state appropriately over the course of the asynchronous method.
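You can observe the first step running on the caller’s thread without decompiling anything. The following sketch assumes a plain console application with no synchronization context; the type and method names are mine, and the TaskCompletionSource is just a way of producing a task that’s guaranteed to be incomplete when it’s awaited.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class FirstStepDemo
{
    // Logs the thread used for the first step, then pauses on the supplied task.
    static async Task RecordAsync(List<string> log, Task gate)
    {
        log.Add($"first step on thread {Thread.CurrentThread.ManagedThreadId}");
        await gate;
        log.Add("second step");
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        var gate = new TaskCompletionSource<bool>();   // Incomplete until SetResult is called

        log.Add($"caller on thread {Thread.CurrentThread.ManagedThreadId}");
        Task task = RecordAsync(log, gate.Task);       // Takes the first step synchronously
        log.Add($"stub returned, completed: {task.IsCompleted}");

        gate.SetResult(true);                          // Allows the state machine to resume
        task.Wait();                                   // Blocking is only for the sake of the demo
        return log;
    }
}
```

The first two log entries report the same thread ID, and the third shows that the stub method returned an incomplete task before the second step ran.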

That’s the stub method. Now let’s look at the state machine itself.

6.1.2. Structure of the state machine

I’m still omitting the majority of the code from the state machine (in the MoveNext() method), but here’s a reminder of the structure of the type:

[CompilerGenerated]
private struct PrintAndWaitStateMachine : IAsyncStateMachine
{
    public int state;
    public AsyncTaskMethodBuilder builder;
    private TaskAwaiter awaiter;
    public TimeSpan delay;
 
    void IAsyncStateMachine.MoveNext()
    {
                 1
    }

    [DebuggerHidden]
    void IAsyncStateMachine.SetStateMachine(
        IAsyncStateMachine stateMachine)
    {
        this.builder.SetStateMachine(stateMachine);
    }
}

  • 1 Implementation omitted

Again, the attributes aren’t important. The important aspects of the type are as follows:

  • It implements the IAsyncStateMachine interface, which is used for the async infrastructure. The interface has only the two methods shown.
  • The fields, which store the information the state machine needs to remember between one step and the next.
  • The MoveNext() method, which is called once when the state machine is started and once each time it resumes after being paused.
  • The SetStateMachine() method, which always has the same implementation (in release builds).

You’ve seen one use of the type implementing IAsyncStateMachine already, although it was somewhat hidden: AsyncTaskMethodBuilder.Start() is a generic method with a constraint that the type parameter has to implement IAsyncStateMachine. After performing a bit of housekeeping, Start() calls MoveNext() to make the state machine take the first step of the async method.
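To see the interaction between a state machine and Start() in isolation, you can write a trivial state machine by hand. This is a sketch rather than anything the compiler would emit: the names are mine, there are no await expressions, and I’ve used AsyncTaskMethodBuilder<int> (the generic counterpart used for Task<int> methods) so there’s a result to observe.

```csharp
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class HandWrittenDemo
{
    // A hand-written state machine for a method with no await expressions.
    private struct ConstantStateMachine : IAsyncStateMachine
    {
        public int state;
        public AsyncTaskMethodBuilder<int> builder;
        public int value;

        void IAsyncStateMachine.MoveNext()
        {
            this.state = -2;                      // Finished
            this.builder.SetResult(this.value);   // Completes the task with the result
        }

        void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
        {
            this.builder.SetStateMachine(stateMachine);
        }
    }

    // Hand-written stub method, following the same three steps as listing 6.2.
    public static Task<int> ConstantAsync(int value)
    {
        var machine = new ConstantStateMachine
        {
            value = value,
            builder = AsyncTaskMethodBuilder<int>.Create(),
            state = -1
        };
        machine.builder.Start(ref machine);   // Calls MoveNext() once
        return machine.builder.Task;
    }
}
```

Because MoveNext() completes in a single step, the task returned by HandWrittenDemo.ConstantAsync(42) is already complete, with a result of 42.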

The fields involved can be broadly split into five categories:

  • The current state (for example, not started, paused at a particular await expression, and so forth)
  • The method builder used to communicate with the async infrastructure and to provide the Task to return
  • Awaiters
  • Parameters and local variables
  • Temporary stack variables

The state and builder are fairly simple. The state is just an integer with one of the following values:

  • –1—Not started, or currently executing (it doesn’t matter which)
  • –2—Finished (either successfully or faulted)
  • Anything else—Paused at a particular await expression

As I mentioned before, the type of the builder depends on the return type of the async method. Before C# 7, the builder type was always AsyncVoidMethodBuilder, AsyncTaskMethodBuilder, or AsyncTaskMethodBuilder<T>. With C# 7 and custom task types, it’s the builder type specified by the AsyncMethodBuilderAttribute applied to the custom task type.
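If you’re curious which builder a task type asks for, you can look up the attribute with reflection. This sketch assumes a target framework that includes ValueTask<T> and System.Runtime.CompilerServices.AsyncMethodBuilderAttribute (for example, .NET Core 2.0 or later); BuilderLookupDemo and FindBuilderType are names I’ve made up for the example.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class BuilderLookupDemo
{
    // Returns the builder type named by the attribute on a task type,
    // or null if the type doesn't have the attribute.
    public static Type FindBuilderType(Type taskType)
    {
        var attribute = taskType.GetCustomAttribute<AsyncMethodBuilderAttribute>();
        return attribute?.BuilderType;
    }
}
```

Passing typeof(ValueTask<int>) should give you the open generic type AsyncValueTaskMethodBuilder<>. Passing typeof(Task) gives null, because the compiler knows about the builders for Task and Task<T> without needing an attribute.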

The other fields are slightly trickier in that all of them depend on the body of the async method, and the compiler tries to use as few fields as it can. The crucial point to remember is that you need fields only for values that you need to come back to after the state machine resumes at some point. Sometimes the compiler can use fields for multiple purposes, and sometimes it can omit them entirely.

The first example of how the compiler can reuse fields is with awaiters. Only one awaiter is relevant at a time, because any particular state machine can await only one value at a time. The compiler creates a single field for each awaiter type that’s used. If you await two Task<int> values, one Task<string>, and three nongeneric Task values in an async method, you’ll end up with three fields: a TaskAwaiter<int>, a TaskAwaiter<string>, and a nongeneric TaskAwaiter. The compiler uses the appropriate field for each await expression based on the awaiter type.

Note

This assumes the awaiter is introduced by the compiler. If you call GetAwaiter() yourself and assign the result to a local variable, that’s treated like any other local variable. I’m talking about the awaiters that are produced as the result of await expressions.
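You can check the awaiter fields for yourself with reflection rather than a decompiler. The following sketch uses the AsyncStateMachineAttribute on the stub method to find the generated type and then picks out the fields whose types implement INotifyCompletion. The demo names are mine, and the exact set of fields is an implementation detail of whichever compiler you’re using, so treat the output as an observation rather than a guarantee.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class AwaiterFieldDemo
{
    // Awaits two Task<int> values, one Task<string>, and three nongeneric Task values.
    public static async Task<string> SixAwaitsAsync()
    {
        int a = await Task.FromResult(1);
        int b = await Task.FromResult(2);
        string s = await Task.FromResult("x");
        await Task.CompletedTask;
        await Task.CompletedTask;
        await Task.CompletedTask;
        return $"{a} {b} {s}";
    }

    // Finds the generated state machine type and returns the types of its awaiter fields.
    public static Type[] AwaiterFieldTypes()
    {
        MethodInfo method = typeof(AwaiterFieldDemo).GetMethod(nameof(SixAwaitsAsync));
        var attribute = method.GetCustomAttribute<AsyncStateMachineAttribute>();
        return attribute.StateMachineType
            .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .Select(field => field.FieldType)
            .Where(type => typeof(INotifyCompletion).IsAssignableFrom(type))
            .ToArray();
    }
}
```

With the compilers I’ve tried, AwaiterFieldTypes() reports one field per awaiter type, matching the description above, even though the method contains six await expressions.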

Next, let’s consider local variables. Here, the compiler doesn’t reuse fields but can omit them entirely. If a local variable is used only between two await expressions rather than across await expressions, it can stay as a local variable in the MoveNext() method.

It’s easier to see what I mean with an example. Consider the following async method:

public async Task LocalVariableDemoAsync()
{
    int x = DateTime.UtcNow.Second;   1
    int y = DateTime.UtcNow.Second;   2
    Console.WriteLine(y);             2
    await Task.Delay(10);
    Console.WriteLine(x);             3
}

  • 1 x is assigned before the await.
  • 2 y is used only before the await.
  • 3 x is used after the await.

The compiler would generate a field for x because the value has to be preserved while the state machine is paused, but y can just be a local variable on the stack while the code is executing.

Note

The compiler does a pretty good job of creating only as many fields as it needs. But at times, you might spot an optimization that the compiler could perform but doesn’t. For example, if two variables have the same type and are both used across await expressions (so they need fields), but they’re never both in scope at the same time, the compiler could use just one field for both as it does for awaiters. At the time of this writing, it doesn’t, but who knows what the future could hold?

Finally, there are temporary stack variables. These are introduced when an await expression is used as part of a bigger expression and some intermediate values need to be remembered. Our simple example in listing 6.1 doesn’t need any, which is why listing 6.2 shows only four fields: the state, builder, awaiter, and parameter. As an example that does need them, consider the following method:

public async Task TemporaryStackDemoAsync()
{
    Task<int> task = Task.FromResult(10);
    DateTime now = DateTime.UtcNow;
    int result = now.Second + now.Hour * await task;
}

The C# rules for operand evaluation don’t change just because you’re within an async method. The properties now.Second and now.Hour both have to be evaluated before the task is awaited, and their results have to be remembered in order to perform the arithmetic later, after the state machine resumes when the task completes. That means the compiler needs to store them in fields.

Note

In this case, you know that Task.FromResult always returns a completed task. But the compiler doesn’t know that, and it has to generate the state machine in a way that would let it pause and resume if the task weren’t complete.

You can think of it as if the compiler rewrites the code to introduce extra local variables:

public async Task TemporaryStackDemoAsync()
{
    Task<int> task = Task.FromResult(10);
    DateTime now = DateTime.UtcNow;
    int tmp1 = now.Second;
    int tmp2 = now.Hour;
    int result = tmp1 + tmp2 * await task;
}

Then the local variables are converted into fields. Unlike real local variables, the compiler does reuse temporary stack variables of the same type and generates only as many fields as it needs to.
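The evaluation order is easy to observe. In this sketch (all names are mine), the two operands to the left of the await expression log their evaluation, and the awaited task is deliberately incomplete so that the state machine has to pause with both values already computed.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class OperandOrderDemo
{
    // Logs that an operand has been evaluated, then returns its value.
    static int Operand(List<string> log, string name, int value)
    {
        log.Add(name);
        return value;
    }

    // Same shape as the expression in TemporaryStackDemoAsync.
    static async Task<int> ComputeAsync(List<string> log, Task<int> task)
    {
        return Operand(log, "left", 1) + Operand(log, "middle", 2) * await task;
    }

    public static (List<string> LogBeforeCompletion, int Result) Run()
    {
        var log = new List<string>();
        var source = new TaskCompletionSource<int>();
        Task<int> resultTask = ComputeAsync(log, source.Task);   // Pauses at the await
        var snapshot = new List<string>(log);                    // Taken while still paused
        source.SetResult(10);
        return (snapshot, resultTask.Result);                    // Blocking is only for the demo
    }
}
```

The snapshot contains "left" and then "middle" before the awaited task has completed, and the final result is 21 (1 + 2 * 10). Both intermediate values had to survive the pause, which is exactly what the temporary stack variable fields are for.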

That explains all the fields in the state machine. Next, you need to look at the MoveNext() method—but only conceptually, to start with.

6.1.3. The MoveNext() method (high level)

I’m not going to show you the decompiled code for listing 6.1’s MoveNext() method yet, because it’s long and scary.[1] After you know what the flow looks like, it’s more manageable, so I’ll describe it in the abstract here.

[1] If A Few Good Men had been about async, the line would have been, “You want the MoveNext? You can’t handle the MoveNext!”

Each time MoveNext() is called, the state machine takes another step. Each time it reaches an await expression, it’ll continue if the value being awaited has already completed and pause otherwise. MoveNext() returns if any of the following occurs:

  • The state machine needs to pause to await an incomplete value.
  • Execution reaches the end of the method or a return statement.
  • An exception is thrown but not caught in the async method.

Note that in the final case, the MoveNext() method doesn’t end up throwing an exception. Instead, the task associated with the async call becomes faulted. (If that surprises you, see section 5.6.5 for a reminder of the behavior of async methods with respect to exceptions.)
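Here’s a small sketch of that behavior as seen from the caller’s side. The names are mine; the await of Task.CompletedTask is there only so the method contains an await expression, and because that task is already complete, the exception is thrown during the very first step.

```csharp
using System;
using System.Threading.Tasks;

public static class FaultedTaskDemo
{
    static async Task ThrowAsync()
    {
        await Task.CompletedTask;                       // Already complete, so execution continues
        throw new InvalidOperationException("Bang!");   // Caught by the generated code
    }

    // Returns true if calling the async method completed without throwing.
    public static bool CallReturnsNormally(out Task task)
    {
        try
        {
            task = ThrowAsync();
            return true;
        }
        catch (InvalidOperationException)
        {
            task = null;
            return false;
        }
    }
}
```

CallReturnsNormally returns true, and the task it hands back is faulted. The InvalidOperationException is available as the InnerException of the task’s AggregateException, or it’s rethrown directly if you await the task.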

Figure 6.2 shows a general flowchart of an async method that focuses on the MoveNext() method. I haven’t included exception handling in the figure, as flowcharts don’t have a way of representing try/catch blocks. You’ll see how that’s managed when you eventually look at the code. Likewise, I haven’t shown where SetStateMachine is called, as the flowchart is complicated enough as it is.

Figure 6.2. Flowchart of an async method

One final point about the MoveNext() method: its return type is void, not a task type. Only the stub method needs to return the task, which it gets from the state machine’s builder after the builder’s Start() method has called MoveNext() to take the first step. All the other calls to MoveNext() are part of the infrastructure for resuming the state machine from a paused state, and those don’t need the associated task. You’ll see what all of this looks like in code in section 6.2 (not long to go now), but first, a brief word on SetStateMachine.

6.1.4. The SetStateMachine method and the state machine boxing dance

I’ve already shown the implementation of SetStateMachine. It’s simple:

void IAsyncStateMachine.SetStateMachine(
    IAsyncStateMachine stateMachine)
{
    this.builder.SetStateMachine(stateMachine);
}

The implementation in release builds always looks like this. (In debug builds, where the state machine is a class, the implementation is empty.) The purpose of the method is easy to explain at a high level, but the details are fiddly. When a state machine takes its first step, it’s on the stack as a local variable of the stub method. If it pauses, it has to box itself (onto the heap) so that all that information is still in place when it resumes. After it’s been boxed, SetStateMachine is called on the boxed value using the boxed value as the argument. In other words, somewhere deep in the heart of the infrastructure, there’s code that looks a bit like this:

void BoxAndRemember<TStateMachine>(ref TStateMachine stateMachine)
    where TStateMachine : IAsyncStateMachine
{
    IAsyncStateMachine boxed = stateMachine;
    boxed.SetStateMachine(boxed);
}

It’s not quite as simple as that, but that conveys the essence of what’s going on. The implementation of SetStateMachine then makes sure that the AsyncTaskMethodBuilder has a reference to the single boxed version of the state machine that it’s a part of. The method has to be called on the boxed value. It can be called only after boxing, because that’s when you have the reference to the boxed value. If you called it on the unboxed value after boxing, that wouldn’t affect the boxed value. (Remember, AsyncTaskMethodBuilder is itself a value type.) This intricate dance ensures that when a continuation delegate is passed to the awaiter, that continuation will call MoveNext() on the same boxed instance.

The result is that the state machine isn’t boxed at all if it doesn’t need to be and is boxed exactly once if necessary. After it’s boxed, everything happens on the boxed version. It’s a lot of complicated code in the name of efficiency.
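If the distinction between the boxed and unboxed values feels abstract, this sketch may help. It uses a hypothetical ICounter interface and BoxedCounter struct of my own in place of IAsyncStateMachine and the generated state machine.

```csharp
public interface ICounter
{
    int Count { get; }
    void Increment();
}

// Hypothetical mutable struct, standing in for the state machine.
public struct BoxedCounter : ICounter
{
    private int count;
    public int Count => count;
    public void Increment() => count++;
}

public static class BoxingDemo
{
    public static (int Unboxed, int Boxed) Run()
    {
        var original = new BoxedCounter();
        ICounter boxed = original;   // Boxing copies the value onto the heap

        original.Increment();        // Changes only the unboxed value
        boxed.Increment();           // Changes only the boxed copy
        boxed.Increment();

        return (original.Count, boxed.Count);
    }
}
```

Run() returns (1, 2): after boxing, the two values live separate lives. That’s why the infrastructure has to call SetStateMachine on the boxed state machine rather than on the one still sitting on the stack.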

I find this little dance one of the most intriguing and bizarre bits of the whole async machinery. It sounds like it’s utterly pointless, but it’s necessary because of the way boxing works, and boxing is necessary to preserve information while the state machine is paused.

It’s absolutely fine not to fully understand this code. If you ever find yourself debugging async code at a low level, you can come back to this section. For all other intents and purposes, this code is more of a novelty than anything else.

That’s what the state machine consists of. Most of the rest of the chapter is devoted to the MoveNext() method and how it operates in various situations. We’ll start with the simple case and work up from there.

6.2. A simple MoveNext() implementation

We’re going to start with the simple async method that you saw in listing 6.1. It’s simple not because it’s short (although that helps) but because it doesn’t contain any loops, try statements, or using statements. It has simple control flow, which leads to a relatively simple state machine. Let’s get cracking.

6.2.1. A full concrete example

I’m going to show you the full method to start with. Don’t expect this to all make sense yet, but do spend a few minutes looking through it. With this concrete example in hand, the more general structure is easier to understand, because you can always look back to see how each part of that structure is present in this example. At the risk of boring you, here’s listing 6.1 yet again as a reminder of the compiler’s input:

static async Task PrintAndWait(TimeSpan delay)
{
    Console.WriteLine("Before first delay");
    await Task.Delay(delay);
    Console.WriteLine("Between delays");
    await Task.Delay(delay);
    Console.WriteLine("After second delay");
}

The following listing is a version of the decompiled code that has been slightly rewritten for readability. (Yes, this is the easy-to-read version.)

Listing 6.3. The decompiled MoveNext() method from listing 6.1
void IAsyncStateMachine.MoveNext()
{
    int num = this.state;
    try
    {
        TaskAwaiter awaiter1;
        switch (num)
        {
            default:
                goto MethodStart;
            case 0:
                goto FirstAwaitContinuation;
            case 1:
                goto SecondAwaitContinuation;
        }
    MethodStart:
        Console.WriteLine("Before first delay");
        awaiter1 = Task.Delay(this.delay).GetAwaiter();
        if (awaiter1.IsCompleted)
        {
            goto GetFirstAwaitResult;
        }
        this.state = num = 0;
        this.awaiter = awaiter1;
        this.builder.AwaitUnsafeOnCompleted(ref awaiter1, ref this);
        return;
    FirstAwaitContinuation:
        awaiter1 = this.awaiter;
        this.awaiter = default(TaskAwaiter);
        this.state = num = -1;
    GetFirstAwaitResult:
        awaiter1.GetResult();
        Console.WriteLine("Between delays");
        TaskAwaiter awaiter2 = Task.Delay(this.delay).GetAwaiter();
        if (awaiter2.IsCompleted)
        {
            goto GetSecondAwaitResult;
        }
        this.state = num = 1;
        this.awaiter = awaiter2;
        this.builder.AwaitUnsafeOnCompleted(ref awaiter2, ref this);
        return;
    SecondAwaitContinuation:
        awaiter2 = this.awaiter;
        this.awaiter = default(TaskAwaiter);
        this.state = num = -1;
    GetSecondAwaitResult:
        awaiter2.GetResult();
        Console.WriteLine("After second delay");
    }
    catch (Exception exception)
    {
        this.state = -2;
        this.builder.SetException(exception);
        return;
    }
    this.state = -2;
    this.builder.SetResult();
}

That’s a lot of code, and you may notice that it has a lot of goto statements and code labels, which you hardly ever see in handwritten C#. At the moment, I expect it to be somewhat impenetrable, but I wanted to show you a concrete example to start with, so you can refer to it anytime it’s useful to you. I’m going to break this down further into general structure and then the specifics of await expressions. By the end of this section, listing 6.3 will probably still look extremely ugly to you, but you’ll be in a better position to understand what it’s doing and why.

6.2.2. MoveNext() method general structure

We’re into the next layer of the async onion. The MoveNext() method is at the heart of the async state machine, and its complexity is a reminder of how hard it is to get async code right. The more complex the state machine, the more reason you have to be grateful that it’s the C# compiler that has to write the code rather than you.

Note

It’s time to introduce more terminology for the sake of brevity. At each await expression, the value being awaited may already have completed or may still be incomplete. If it has already completed by the time you await it, the state machine keeps executing. I call this the fast path. If it hasn’t already completed, the state machine schedules a continuation and pauses. I call this the slow path.
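The two paths can be sketched by following the awaitable pattern by hand. This isn’t how the generated code schedules its continuation (as listing 6.3 shows, it goes through the builder’s AwaitUnsafeOnCompleted method); it’s a simplified illustration using a helper method of my own and the awaiter’s OnCompleted method directly.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class AwaitPathDemo
{
    // Follows the awaitable pattern manually and reports which path was taken.
    public static string AwaitManually(Task<int> task, Action<int> continuation)
    {
        TaskAwaiter<int> awaiter = task.GetAwaiter();
        if (awaiter.IsCompleted)
        {
            // Fast path: the result is available, so carry straight on.
            continuation(awaiter.GetResult());
            return "fast";
        }
        // Slow path: schedule the rest of the work and return to the caller.
        awaiter.OnCompleted(() => continuation(awaiter.GetResult()));
        return "slow";
    }
}
```

Passing Task.FromResult(5) takes the fast path and invokes the continuation before AwaitManually returns. Passing the Task of a TaskCompletionSource<int> that hasn’t been given a result takes the slow path, and the continuation runs only after the result is set.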

As a reminder, the MoveNext() method is invoked once when the async method is first called and then once each time it needs to resume from being paused at an await expression. (If every await expression takes the fast path, MoveNext() will be called only once.) The method is responsible for the following:

  • Executing from the right place (whether that’s the start of the original async code or partway through)
  • Preserving state when it needs to pause, both in terms of local variables and location within the code
  • Scheduling a continuation when it needs to pause
  • Retrieving return values from awaiters
  • Propagating exceptions via the builder (rather than letting MoveNext() itself fail with an exception)
  • Propagating any return value or method completion via the builder

With this in mind, the following listing shows pseudocode for the general structure of a MoveNext() method. You’ll see in later sections how this can end up being more complicated because of extra control flow, but it’s a natural extension.

Listing 6.4. Pseudocode of a MoveNext() method
void IAsyncStateMachine.MoveNext()
{
    try
    {
        switch (this.state)
        {
            default: goto MethodStart;
            case 0: goto Label0A;
            case 1: goto Label1A;
            case 2: goto Label2A;
                                        1
        }
    MethodStart:
                                        2
                                        3
    Label0A:
                                        4
    Label0B:
                                        5
                                        6
    }
    catch (Exception e)                 7
    {                                   7
        this.state = -2;                7
        builder.SetException(e);        7
        return;                         7
    }                                   7
    this.state = -2;                    8
    builder.SetResult();                8
}

  • 1 As many cases as there are await expressions
  • 2 Code before the first await expression
  • 3 Sets up the first awaiter
  • 4 Code resuming from a continuation
  • 5 Fast and slow paths rejoin
  • 6 Remainder of code, with more labels, awaiters, and so on
  • 7 Propagates all exceptions via the builder
  • 8 Propagates method completion via the builder

The big try/catch block covers all the code from the original async method. If anything in there throws an exception, however it’s thrown (via awaiting a faulted operation, calling a synchronous method that throws, or simply throwing an exception directly), that exception is caught and then propagated via the builder. Only special exceptions (ThreadAbortException and StackOverflowException, for example) will ever cause MoveNext() to end with an exception.
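You can observe this from the outside. In the following sketch (ExceptionPropagationDemo and its members are hypothetical names), the async method throws before it reaches any await expression, yet calling it doesn't throw. The exception is caught inside MoveNext() and stored in the returned task via the builder.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo type, for illustration only.
public static class ExceptionPropagationDemo
{
    // Fails synchronously, before the first await expression is reached.
    public static async Task FailBeforeAwaitAsync()
    {
        ThrowHelper();
        await Task.Delay(1);
    }

    private static void ThrowHelper()
    {
        throw new InvalidOperationException("Thrown before the first await");
    }

    // Calling the async method doesn't throw; the failure is stored in the task.
    public static Task CallWithoutAwaiting()
    {
        return FailBeforeAwaitAsync();
    }
}
```

The task returned by CallWithoutAwaiting() is already faulted, and the InvalidOperationException surfaces only when the task is awaited or its Exception property is inspected.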

Within the try/catch block, the start of the MoveNext() method is always effectively a switch statement used to jump to the right piece of code within the method based on the state. If the state is non-negative, that means you’re resuming after an await expression. Otherwise, it’s assumed that you’re executing MoveNext() for the first time.

What about other states?

In section 6.1, I listed the possible states as not started, executing, paused, and complete (where paused is a separate state per await expression). Why doesn’t the state machine handle not started, executing, and complete differently?

The answer is that MoveNext() should never end up being called in the executing or complete states. You can force it to by writing a broken awaiter implementation or by using reflection, but under normal operation, MoveNext() is called only to start or resume the state machine. There aren’t even distinct state numbers for not started and executing; both use –1. There’s a state number of –2 for completed, but the state machine never checks for that value.

One bit of trickiness to be aware of is the difference between a return statement in the state machine and a return statement in the original async code. Within the state machine, return is used when the state machine is paused after scheduling a continuation for an awaiter. Any return statement in the original code ends up dropping to the bottom part of the state machine outside the try/catch block, where the method completion is propagated via the builder.

If you compare listings 6.3 and 6.4, hopefully you can see how our concrete example fits into the general pattern. At this point, I’ve explained almost everything about the code generated by the simple async method you started with. The only bit that’s missing is exactly what happens around await expressions.

6.2.3. Zooming into an await expression

Let’s think again about what has to happen each time you hit an await expression when executing an async method, assuming you’ve already evaluated the operand to get something that’s awaitable:

  1. You fetch the awaiter from the awaitable by calling GetAwaiter(), storing it on the stack.
  2. You check whether the awaiter has already completed. If it has, you can skip straight to fetching the result (step 9). This is the fast path.
  3. It looks like you’re on the slow path. Oh well. Remember where you reached via the state field.
  4. Remember the awaiter in a field.
  5. Schedule a continuation with the awaiter, making sure that when the continuation is executed, you’ll be back to the right state (doing the boxing dance, if necessary).
  6. Return from the MoveNext() method either to the original caller, if this is the first time you’ve paused, or to whatever scheduled the continuation otherwise.
  7. When the continuation fires, set your state back to running (value of –1).
  8. Copy the awaiter out of the field and back onto the stack, clearing the field in order to potentially help the garbage collector. Now you’re ready to rejoin the fast path.
  9. Fetch the result from the awaiter, which is on the stack at this point regardless of which path you took. You have to call GetResult() even if there isn’t a result value to let the awaiter propagate errors if necessary.
  10. Continue on your merry way, executing the rest of the original code using the result value if there was one.
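If it helps, here's a hand-written approximation of those steps that doesn't use the await keyword at all. It's a simplification under stated assumptions: ManualAwaitDemo and ConsumeManually are hypothetical names, a lambda stands in for the state field and the jump back into MoveNext(), and the awaiter's OnCompleted method is called directly rather than going through the builder.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical demo type, for illustration only.
public static class ManualAwaitDemo
{
    // Roughly "onResult(await task);" written by hand.
    // Returns which path was taken so the caller can see it.
    public static string ConsumeManually(Task<int> task, Action<int> onResult)
    {
        // Step 1: fetch the awaiter from the awaitable.
        TaskAwaiter<int> awaiter = task.GetAwaiter();

        // Step 2: already completed? Then take the fast path.
        if (awaiter.IsCompleted)
        {
            // Steps 9 and 10: fetch the result and carry on.
            onResult(awaiter.GetResult());
            return "fast";
        }

        // Steps 3 to 5: the closure remembers the awaiter and where to resume.
        // Steps 7 to 10 happen inside the continuation when it fires.
        awaiter.OnCompleted(() => onResult(awaiter.GetResult()));

        // Step 6: return to the caller while the operation is still in progress.
        return "slow";
    }
}
```

Calling ConsumeManually(Task.FromResult(5), ...) invokes the callback immediately and returns "fast"; passing the task of a TaskCompletionSource<int> that hasn't been completed returns "slow", and the callback runs only after the source is completed.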

With that list in mind, let’s review a section of listing 6.3 that corresponds to our first await expression.

Listing 6.5. A section of listing 6.3 corresponding to a single await
    awaiter1 = Task.Delay(this.delay).GetAwaiter();
    if (awaiter1.IsCompleted)
    {
        goto GetFirstAwaitResult;
    }
    this.state = num = 0;
    this.awaiter = awaiter1;
    this.builder.AwaitUnsafeOnCompleted(ref awaiter1, ref this);
    return;
FirstAwaitContinuation:
    awaiter1 = this.awaiter;
    this.awaiter = default(TaskAwaiter);
    this.state = num = -1;
GetFirstAwaitResult:
    awaiter1.GetResult();

Unsurprisingly, the code follows the set of steps precisely.[2] The two labels represent the two places you have to jump to, depending on the path:

  • In the fast path, you jump over the slow-path code.
  • In the slow path, you jump back into the middle of the code when the continuation is called. (Remember, that’s what the switch statement at the start of the method is for.)

[2] It’s unsurprising in that it would have been pretty odd of me to write that list of steps and then present code that didn’t follow the list.

The call to builder.AwaitUnsafeOnCompleted(ref awaiter1, ref this) is the part that does the boxing dance with a call back into SetStateMachine (if necessary; it happens only once per state machine) and schedules the continuation. In some cases, you’ll see a call to AwaitOnCompleted instead of AwaitUnsafeOnCompleted. These differ only in terms of how the execution context is handled. You’ll look at this in more detail in section 6.4.

One aspect that may seem slightly unclear is the use of the num local variable. It’s always assigned a value at the same time as the state field but is always read instead of the field. (Its initial value is copied out of the field, but that’s the only time the field is read.) I believe this is purely for optimization. Whenever you read num, it’s fine to think of it as this.state instead.

Looking at listing 6.5, that’s 15 lines of code for what was originally just the following:

await Task.Delay(delay);

The good news is that you almost never need to see all that code unless you’re going through this kind of exercise. There’s a small amount of bad news in that the code inflation means that even small async methods (even those using ValueTask<TResult>) can’t be sensibly inlined by the JIT compiler. In most cases, that’s a minuscule price to pay for the benefits afforded by async/await, though.

That’s the simple case with simple control flow. With that background, you can explore a couple of more-complex cases.

6.3. How control flow affects MoveNext()

The example you’ve been looking at so far has just been a sequence of method calls with only the await operator introducing complexity. Life gets a little harder when you want to write real code with all the normal control-flow statements you’re used to.

In this section, I’ll show you just two elements of control flow: loops and try/finally statements. This isn’t intended to be comprehensive, but it should give you enough of a glimpse at the control-flow gymnastics the compiler has to perform to help you understand other situations if you need to.

6.3.1. Control flow between await expressions is simple

Before we get into the tricky part, I’ll give an example of where introducing control flow doesn’t add to the generated code complexity any more than it would in the synchronous code. In the following listing, a loop is introduced into our example method, so you print Between delays three times instead of once.

Listing 6.6. Introducing a loop between await expressions
static async Task PrintAndWaitWithSimpleLoop(TimeSpan delay)
{
    Console.WriteLine("Before first delay");
    await Task.Delay(delay);
    for (int i = 0; i < 3; i++)
    {
        Console.WriteLine("Between delays");
    }
    await Task.Delay(delay);
    Console.WriteLine("After second delay");
}

What does this look like when decompiled? Very much like listing 6.3! The only difference is that this

GetFirstAwaitResult:
    awaiter1.GetResult();
    Console.WriteLine("Between delays");
    TaskAwaiter awaiter2 = Task.Delay(this.delay).GetAwaiter();

becomes the following:

GetFirstAwaitResult:
    awaiter1.GetResult();
    for (int i = 0; i < 3; i++)
    {
        Console.WriteLine("Between delays");
    }
    TaskAwaiter awaiter2 = Task.Delay(this.delay).GetAwaiter();

The change in the state machine is exactly the same as the change in the original code. There are no extra fields and no complexities in terms of how to continue execution; it’s just a loop.

The reason I bring this up is to help you think about why extra complexity is required in our next examples. In listing 6.6, you never need to jump into the loop from outside, and you never need to pause execution and jump out of the loop, thereby pausing the state machine. Those are the situations introduced by await expressions when you await within the loop. Let’s do that now.

6.3.2. Awaiting within a loop

Our example so far has contained two await expressions. To keep the code somewhat manageable as I introduce other complexities, I’m going to reduce that to one. The following listing shows the async method you’re going to decompile in this subsection.

Listing 6.7. Awaiting in a loop
static async Task AwaitInLoop(TimeSpan delay)
{
    Console.WriteLine("Before loop");
    for (int i = 0; i < 3; i++)
    {
        Console.WriteLine("Before await in loop");
        await Task.Delay(delay);
        Console.WriteLine("After await in loop");
    }
    Console.WriteLine("After loop delay");
}

The Console.WriteLine calls are mostly present as signposts within the decompiled code, which makes it easier to map to the original listing.

What does the compiler generate for this? I’m not going to show the complete code, because most of it is similar to what you’ve seen before. (It’s all in the downloadable source, though.) The stub method and state machine are almost exactly as they were for earlier examples but with one additional field in the state machine corresponding to i, the loop counter. The interesting part is in MoveNext().

You can represent the code faithfully in C# but not using a loop construct. The problem is that after the state machine returns from pausing at Task.Delay, you want to jump into the middle of the original loop. You can’t do that with a goto statement in C#; the language forbids a goto statement specifying a label if the goto statement isn’t in the scope of that label.

That’s okay; you can implement your for loop with a lot of goto statements without introducing any extra scopes at all. That way, you can jump to the middle of it without a problem. The following listing shows the bulk of the decompiled code for the body of the MoveNext() method. I’ve included only the part within the try block, as that’s what we’re focusing on here. (The rest is simple boilerplate.)

Listing 6.8. Decompiled loop without using any loop constructs
    switch (num)
    {
        default:
            goto MethodStart;
        case 0:
            goto AwaitContinuation;
    }
MethodStart:
    Console.WriteLine("Before loop");
    this.i = 0;                                    1
    goto ForLoopCondition;                         2
ForLoopBody:                                       3
    Console.WriteLine("Before await in loop");
    TaskAwaiter awaiter = Task.Delay(this.delay).GetAwaiter();
    if (awaiter.IsCompleted)
    {
        goto GetAwaitResult;
    }
    this.state = num = 0;
    this.awaiter = awaiter;
    this.builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
    return;
AwaitContinuation:                                 4
    awaiter = this.awaiter;
    this.awaiter = default(TaskAwaiter);
    this.state = num = -1;
GetAwaitResult:
    awaiter.GetResult();
    Console.WriteLine("After await in loop");
    this.i++;                                      5
ForLoopCondition:                                  6
    if (this.i < 3)                                6
    {                                              6
        goto ForLoopBody;                          6
    }                                              6
    Console.WriteLine("After loop delay");

  • 1 For loop initializer
  • 2 Skips straight to checking the loop condition
  • 3 Body of the for loop
  • 4 Target for jump when the state machine resumes
  • 5 For loop iterator
  • 6 Checks for loop condition and jumps back to body if it holds
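To convince yourself that this shape really is a for loop, you can try the same structure without any async code. The following sketch is valid C# because every label lives in the method’s outermost block, so no goto jumps into a nested scope. GotoLoopDemo and CountWithGotos are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo type, for illustration only.
public static class GotoLoopDemo
{
    // Equivalent to: for (int i = 0; i < limit; i++) { visited.Add(i); }
    public static List<int> CountWithGotos(int limit)
    {
        var visited = new List<int>();
        int i = 0;                  // For loop initializer
        goto ForLoopCondition;      // Check the condition before the first iteration
    ForLoopBody:
        visited.Add(i);             // Body of the for loop
        i++;                        // For loop iterator
    ForLoopCondition:
        if (i < limit)
        {
            goto ForLoopBody;
        }
        return visited;
    }
}
```

Because ForLoopBody is an ordinary label in the method block, a switch statement at the top of the method could jump straight to it (or to any label placed within the body), which is exactly what the state machine needs when it resumes.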

I could’ve skipped this example entirely, but it brings up a few interesting points. First, the C# compiler doesn’t convert an async method into equivalent C# that doesn’t use async/await. It only has to generate appropriate IL. In some places, C# has rules that are stricter than those in IL. (The set of valid identifiers is another example of this.)

Second, although decompilers can be useful when looking at async code, sometimes they produce invalid C#. When I first decompiled the output of listing 6.7, the output included a while loop containing a label and a goto statement outside that loop trying to jump into it. You can sometimes get valid (but harder-to-read) C# by telling the decompiler not to work as hard to produce idiomatic C#, at which point you’ll see an awful lot of goto statements.

Third, in case you weren’t already convinced, you don’t want to be writing this sort of code by hand. If you had to write C# 4 code for this sort of task, you’d no doubt do it in a very different way, but it would still be significantly uglier than the async method you can use in C# 5.

You’ve seen how awaiting within a loop might cause humans some stress, but it doesn’t cause the compiler to break a sweat. For our final control-flow example, you’ll give it some harder work to do: a try/finally block.

6.3.3. Awaiting within a try/finally block

Just to remind you, it’s always been valid to use await in a try block, but in C# 5, it was invalid to use it in a catch or finally block. That restriction was lifted in C# 6, although I’m not going to show any code that takes advantage of it.

Note

There are simply too many possibilities to go through here. The aim of this chapter is to give you insight into the kind of thing the C# compiler does with async/await rather than provide an exhaustive list of translations.

In this section, I’m only going to show you an example of awaiting within a try block that has just a finally block. That’s probably the most common kind of try block, because it’s the one that using statements are equivalent to. The following listing shows the async method you’re going to decompile. Again, all the console output is present only to make it simpler to understand the state machine.

Listing 6.9. Awaiting within a try block
static async Task AwaitInTryFinally(TimeSpan delay)
{
    Console.WriteLine("Before try block");
    try
    {
        Console.WriteLine("Before await");
        await Task.Delay(delay);
        Console.WriteLine("After await");
    }
    finally
    {
        Console.WriteLine("In finally block");
    }
    Console.WriteLine("After finally block");
}

You might imagine that the decompiled code would look something like this:

    switch (num)
    {
        default:
            goto MethodStart;
        case 0:
            goto AwaitContinuation;
    }
MethodStart:
    ...
    try
    {
        ...
    AwaitContinuation:
        ...
    GetAwaitResult:
        ...
    }
    finally
    {
        ...
    }
    ...

Here, each ellipsis (...) represents more code. There’s a problem with that approach, though: even in IL, you’re not allowed to jump from outside a try block to inside it. It’s a little bit like the problem you saw in the previous section with loops, but this time instead of a C# rule, it’s an IL rule.

To work around this, the C# compiler uses a technique I like to think of as a trampoline. (This isn’t official terminology, although the term is used elsewhere for similar purposes.) It jumps to just before the try block, and then the first thing inside the try block is a piece of code that jumps to the right place within the block.

In addition to the trampoline, the finally block needs to be handled with care, too. There are three situations in which you’ll execute the finally block of the generated code:

  • You reach the end of the try block.
  • The try block throws an exception.
  • You need to pause within the try block because of an await expression.

(If the async method contained a return statement, that would be another option.) If the finally block is executing because you’re pausing the state machine and returning to the caller, the code in the original async method’s finally block shouldn’t execute. After all, you’re logically paused inside the try block and will be resuming there when the delay completes. Fortunately, this is easy to detect: the num local variable (which always has the same value as the state field) is negative if the state machine is still executing or finished and non-negative if you’re pausing.

All of this together leads to the following listing, which again is the code within the outer try block of MoveNext(). Although there’s still a lot of code, most of it is similar to what you’ve seen before. The numbered callouts mark the try/finally-specific aspects.

Listing 6.10. Decompiled await within try/finally
    switch (num)
    {
        default:
            goto MethodStart;
        case 0:
            goto AwaitContinuationTrampoline;    1
    }
MethodStart:
    Console.WriteLine("Before try block");
AwaitContinuationTrampoline:
    try
    {
        switch (num)                             2
        {                                        2
            default:                             2
                goto TryBlockStart;              2
            case 0:                              2
                goto AwaitContinuation;          2
        }                                        2
    TryBlockStart:                               2
        Console.WriteLine("Before await");
        TaskAwaiter awaiter = Task.Delay(this.delay).GetAwaiter();
        if (awaiter.IsCompleted)
        {
            goto GetAwaitResult;
        }
        this.state = num = 0;
        this.awaiter = awaiter;
        this.builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
        return;
    AwaitContinuation:                           3
        awaiter = this.awaiter;
        this.awaiter = default(TaskAwaiter);
        this.state = num = -1;
    GetAwaitResult:
        awaiter.GetResult();
        Console.WriteLine("After await");
    }
    finally
    {
        if (num < 0)                               4
        {                                          4
            Console.WriteLine("In finally block"); 4
        }                                          4
    }
    Console.WriteLine("After finally block");

  • 1 Jumps to just before the trampoline, so it can bounce execution to the right place
  • 2 Trampoline within the try block
  • 3 Real continuation target
  • 4 Effectively ignores finally block if you’re pausing
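The trampoline and the guarded finally block can be tried out without any awaiters. The following sketch is a hand-written, synchronous imitation under stated assumptions: TrampolineDemo is a hypothetical class, pausing is simulated by simply returning, and each call to MoveNext() plays the role of the method starting or a continuation firing. Messages are recorded in a list rather than written to the console so they’re easy to inspect.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo type, for illustration only.
public sealed class TrampolineDemo
{
    private int state = -1;
    public List<string> Log { get; } = new List<string>();

    // First call runs up to the simulated pause; second call resumes inside the try block.
    public void MoveNext()
    {
        int num = state;
        switch (num)
        {
            default:
                goto MethodStart;
            case 0:
                goto ContinuationTrampoline;   // Jump to just before the try block
        }
    MethodStart:
        Log.Add("Before try block");
    ContinuationTrampoline:
        try
        {
            switch (num)                       // Trampoline within the try block
            {
                default:
                    goto TryBlockStart;
                case 0:
                    goto Continuation;
            }
        TryBlockStart:
            Log.Add("Before pause");
            state = num = 0;
            return;                            // Simulated pause; finally still runs
        Continuation:                          // Real continuation target
            state = num = -1;
            Log.Add("After pause");
        }
        finally
        {
            if (num < 0)                       // Skipped when pausing
            {
                Log.Add("In finally block");
            }
        }
        state = -2;
        Log.Add("After finally block");
    }
}
```

After the first call, the log contains only the two messages before the pause, because num is 0 when the finally block runs. After the second call, it also contains After pause, In finally block, and After finally block, in that order.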

That’s the final decompilation in the chapter, I promise. I wanted to get to that level of complexity to help you navigate the generated code if you ever need to. That’s not to say you won’t need to keep your wits about you when looking through it, particularly bearing in mind the many transformations the compiler can perform to make the code simpler than what I’ve shown. As I said earlier, where I’ve always used a switch statement for “jump to X” pieces of code, the compiler can sometimes use simpler branching code. Consistency in multiple situations is important when reading source code, but that doesn’t matter to the compiler.

One of the aspects I’ve skimmed over so far is why awaiters have to implement INotifyCompletion but can also implement ICriticalNotifyCompletion, and the effect that has on the generated code. Let’s take a closer look now.

6.4. Execution contexts and flow

In section 5.2.2, I described synchronization contexts, which are used to govern the thread that code executes on. This is just one of many contexts in .NET, although it’s probably the best known. Context provides an ambient way of maintaining information transparently. For example, SecurityContext keeps track of the current security principal and code access security. You don’t need to pass all that information around explicitly; it just follows your code, doing the right thing in almost all cases. A single class is used to manage all the other contexts: ExecutionContext.

Deep and scary stuff

I almost didn’t include this section. It’s at the very limits of my knowledge about async. If you ever need to know the intimate details, you’ll want to know far more about the topic than I’ve included here.

I’ve covered this at all only because otherwise there’d be no explanation whatsoever for having both AwaitOnCompleted and AwaitUnsafeOnCompleted in the builder or why awaiters usually implement ICriticalNotifyCompletion.

As a reminder, Task and Task<T> manage the synchronization context for any tasks being awaited. If you’re on a UI thread and you await a task, the continuation of your async method will be executed on the UI thread, too. You can opt out of that by using Task.ConfigureAwait. You need that in order to explicitly say “I know I don’t need the rest of my method to execute in the same synchronization context.” Execution contexts aren’t like that; you pretty much always want the same execution context when your async method continues, even if it’s on a different thread.

This preservation of the execution context is called flow. An execution context is said to flow across await expressions, meaning that all your code operates in the same execution context. What makes sure that happens? Well, AsyncTaskMethodBuilder always does, and TaskAwaiter sometimes does. This is where things get tricky.

The INotifyCompletion.OnCompleted method is just a normal method; anyone can call it. By contrast, ICriticalNotifyCompletion.UnsafeOnCompleted is marked with [SecurityCritical]. It can be called only by trusted code, such as the framework’s AsyncTaskMethodBuilder class.

If you ever write your own awaiter class and you care about running code correctly and safely in partially trusted environments, you should ensure that your INotifyCompletion.OnCompleted code flows the execution context (via ExecutionContext.Capture and ExecutionContext.Run). You can also implement ICriticalNotifyCompletion and not flow the execution context in that case, trusting that the async infrastructure will already have done so. Effectively, this is an optimization for the common case in which awaiters are used only by the async infrastructure. There’s no point in capturing and restoring the execution context twice in cases where you can safely do it only once.
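As a rough sketch of that pattern, the following awaiter flows the execution context in OnCompleted but not in UnsafeOnCompleted. Everything here is hypothetical and heavily simplified: ManualAwaiter is completed by hand by calling Complete(), it isn’t thread-safe, and ContextFlowDemo uses an AsyncLocal<string> purely as a convenient way to observe which execution context a continuation runs in.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical awaiter that is completed manually by calling Complete().
public sealed class ManualAwaiter : ICriticalNotifyCompletion
{
    private Action continuation;

    public bool IsCompleted { get; private set; }
    public void GetResult() { }

    // Callable by anyone, so it captures and restores the execution context itself.
    public void OnCompleted(Action action)
    {
        ExecutionContext context = ExecutionContext.Capture();
        if (context == null)   // Flow has been suppressed; nothing to restore.
        {
            continuation = action;
            return;
        }
        continuation = () =>
            ExecutionContext.Run(context, state => ((Action) state)(), action);
    }

    // Assumes the caller (normally the builder) has already flowed the context.
    public void UnsafeOnCompleted(Action action)
    {
        continuation = action;
    }

    public void Complete()
    {
        IsCompleted = true;
        continuation?.Invoke();
    }
}

public static class ContextFlowDemo
{
    private static readonly AsyncLocal<string> Ambient = new AsyncLocal<string>();

    // Returns the ambient value observed by the continuation.
    public static string Observe(bool useUnsafe)
    {
        string seen = null;
        var awaiter = new ManualAwaiter();
        Ambient.Value = "at registration";
        if (useUnsafe)
        {
            awaiter.UnsafeOnCompleted(() => seen = Ambient.Value);
        }
        else
        {
            awaiter.OnCompleted(() => seen = Ambient.Value);
        }
        Ambient.Value = "at completion";
        awaiter.Complete();
        return seen;
    }
}
```

With OnCompleted, the continuation sees the value that was current when it was registered, because the captured context is restored around it. With UnsafeOnCompleted, it sees whatever context happens to be current when Complete() is called, which is why that method is safe only when something else, such as the builder, takes care of flow.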

When compiling an async method, the compiler will create a call to either builder.AwaitOnCompleted or builder.AwaitUnsafeOnCompleted at each await expression, depending on whether the awaiter implements ICriticalNotifyCompletion. Those builder methods are generic and have constraints to ensure that the awaiters that are passed into them implement the appropriate interface.
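The compiler makes that choice at compile time, but you can mirror the decision with a run-time check if you want to predict which builder method an await will use. This is only an approximation of the rule described above, and BuilderMethodDemo, BasicAwaiter, and BuilderMethodFor are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical demo type, for illustration only.
public static class BuilderMethodDemo
{
    // Hypothetical awaiter implementing only INotifyCompletion.
    public sealed class BasicAwaiter : INotifyCompletion
    {
        public bool IsCompleted => true;
        public void GetResult() { }
        public void OnCompleted(Action continuation) => continuation();
    }

    // Mirrors the compile-time choice: awaiters implementing
    // ICriticalNotifyCompletion go through AwaitUnsafeOnCompleted.
    public static string BuilderMethodFor<TAwaiter>()
        where TAwaiter : INotifyCompletion
    {
        return typeof(ICriticalNotifyCompletion).IsAssignableFrom(typeof(TAwaiter))
            ? "AwaitUnsafeOnCompleted"
            : "AwaitOnCompleted";
    }
}
```

BuilderMethodFor<TaskAwaiter>() returns AwaitUnsafeOnCompleted, matching the calls in the decompiled listings, whereas BuilderMethodFor<BuilderMethodDemo.BasicAwaiter>() returns AwaitOnCompleted.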

If you ever implement your own custom task type (and again, that’s extremely unlikely for anything other than educational purposes), you should follow the same pattern as AsyncTaskMethodBuilder: capture the execution context in both AwaitOnCompleted and AwaitUnsafeOnCompleted, so it’s safe to call ICriticalNotifyCompletion.UnsafeOnCompleted when you’re asked to. Speaking of custom tasks, let’s review the requirements for a custom task builder now that you’ve seen how the compiler uses AsyncTaskMethodBuilder.

6.5. Custom task types revisited

Listing 6.11 shows a repeat of the builder part of listing 5.10, where you first looked at custom task types. The set of methods may feel a lot more familiar now after you’ve looked at so many decompiled state machines. You can use this section as a reminder of how the methods on AsyncTaskMethodBuilder are called, as the compiler treats all builders the same way.

Listing 6.11. A sample custom task builder
public class CustomTaskBuilder<T>
{
    public static CustomTaskBuilder<T> Create();
    public void Start<TStateMachine>(ref TStateMachine stateMachine)
        where TStateMachine : IAsyncStateMachine;
    public CustomTask<T> Task { get; }

    public void AwaitOnCompleted<TAwaiter, TStateMachine>
        (ref TAwaiter awaiter, ref TStateMachine stateMachine)
        where TAwaiter : INotifyCompletion
        where TStateMachine : IAsyncStateMachine;
    public void AwaitUnsafeOnCompleted<TAwaiter, TStateMachine>
        (ref TAwaiter awaiter, ref TStateMachine stateMachine)
        where TAwaiter : ICriticalNotifyCompletion
        where TStateMachine : IAsyncStateMachine;
    public void SetStateMachine(IAsyncStateMachine stateMachine);

    public void SetException(Exception exception);
    public void SetResult(T result);
}

I’ve grouped the methods in the normal chronological order in which they’re called.

The stub method calls Create to create a builder instance as part of the newly created state machine. It then calls Start to make the state machine take the first step and returns the result of the Task property.

Within the state machine, each await expression will generate a call to AwaitOnCompleted or AwaitUnsafeOnCompleted as discussed in the previous section. Assuming a task-like design, the first such call will end up calling IAsyncStateMachine.SetStateMachine, which will in turn call the builder’s SetStateMachine so that any boxing is resolved in a consistent way. See section 6.1.4 for a reminder of the details.

Finally, a state machine indicates that the async operation has completed by calling either SetException or SetResult on the builder. That final state should be propagated to the custom task that was originally returned by the stub method.

This chapter is by far the deepest dive in this book. Nowhere else do I look at the code generated by the C# compiler in such detail. To many developers, everything in this chapter would be superfluous; you don’t really need it to write correct async code in C#. But for curious developers, I hope it’s been enlightening. You may never need to decompile generated code, but having some idea of what’s going on under the hood can be useful. And if you ever do need to look at what’s going on in detail, I hope this chapter will help you make sense of what you see.

I’ve taken two chapters to cover the one major feature of C# 5. In the next short chapter, I’ll cover the remaining two features. After the details of async, they come as a bit of light relief.

Summary

  • Async methods are converted into stub methods and state machines by using builders as async infrastructure.
  • The state machine keeps track of the builder, method parameters, local variables, awaiters, and where to resume in a continuation.
  • The compiler creates code to get back into the middle of a method when it resumes.
  • The INotifyCompletion and ICriticalNotifyCompletion interfaces help control execution context flow.
  • The methods of custom task builders are called by the C# compiler.