I vividly remember the evening of October 28, 2010. Anders Hejlsberg was presenting async/await at PDC, and shortly before his talk started, an avalanche of downloadable material was made available, including a draft of the changes to the C# specification, a Community Technology Preview (CTP) of the C# 5 compiler, and the slides Anders was presenting. At one point, I was watching the talk live and skimming through the slides while the CTP installed. By the time Anders had finished, I was writing async code and trying things out.
In the next few weeks, I started taking bits apart and looking at exactly what code the compiler was generating, trying to write my own simplistic implementation of the library that came with the CTP, and generally poking at it from every angle. As new versions came out, I worked out what had changed and became more and more comfortable with what was going on behind the scenes. The more I saw, the more I appreciated how much boilerplate code the compiler is happy to write on our behalf. It’s like looking at a beautiful flower under a microscope: the beauty is still there to be admired, but there’s so much more to it than can be seen at first glance.
Not everyone is like me, of course. If you just want to rely on the behavior I’ve already described and simply trust that the compiler will do the right thing, that’s absolutely fine. Alternatively, you won’t miss out on anything if you skip this chapter for now and come back to it at a later date; none of the rest of the book relies on it. It’s unlikely that you’ll ever have to debug your code down to the level that you’ll look at here, but I believe this chapter will give you more insight into how async/await hangs together. Both the awaitable pattern and the requirements for custom task types make more sense after you’ve looked at the generated code. I don’t want to get too mystical about this, but there’s a certain connection between the language and the developer that’s enriched by studying these implementation details.
As a rough approximation, we’ll pretend that the C# compiler performs a transformation from C# code using async/await to C# code without using async/await. Of course, the compiler is able to operate at a lower level than this with intermediate representations that can be emitted as IL. Indeed, in some aspects of async/await, the IL generated can’t be represented in regular C#, but it’s easy enough to explain those places.
While writing this chapter, I became aware of a difference between debug and release builds of async code: in debug builds, the generated state machines are classes rather than structs. (This is to give a better debugger experience; in particular, it gives more flexibility in Edit and Continue scenarios.) This wasn’t true when I wrote the third edition; the compiler implementation has changed. It may change again in the future, too. If you decompile async code compiled by a C# 8 compiler, it could look slightly different from what’s presented here.
Although this is surprising, it shouldn’t be too alarming. By definition, implementation details can change over time. None of this invalidates any of the insight to be gained from studying a particular implementation. Just be aware that this is a different kind of learning from “these are the rules of C#, and they’ll change only in well-specified ways.”
In this chapter, I show the code generated by a release build. The differences mostly affect performance, and I believe most readers will be more interested in the performance of release builds than debug builds.
The generated code is somewhat like an onion; it has layers of complexity. We’ll start from the very outside and work our way in toward the tricky bit: await expressions and the dance of awaiters and continuations. For the sake of brevity, I’m going to present only asynchronous methods, not async anonymous functions; the machinery between the two is the same anyway, so there’s nothing particularly interesting to learn by repeating the work.
As I mentioned in chapter 5, the implementation (both in this approximation and in the code generated by the real compiler) is in the form of a state machine. The compiler will generate a private nested struct to represent the asynchronous method, and it must also include a method with the same signature as the one you’ve declared. I call this the stub method; there’s not much to it, but it starts all of the rest going.
Frequently, I’m going to talk about the state machine pausing. This corresponds to a point where the async method reaches an await expression and the operation being awaited hasn’t completed yet. As you may remember from chapter 5, when that happens, a continuation is scheduled to execute the rest of the async method when the awaited operation has completed, and then the async method returns. Similarly, it’s useful to talk about the async method taking a step: the code it executes between pauses, effectively. These aren’t official terms, but they’re useful as shorthand.
The state machine keeps track of where you are within the async method. Logically, there are four kinds of state, in common execution order:
Only the Paused set of states depends on the structure of the async method. Each await expression within the method is a distinct state to be returned to in order to trigger more execution. While the state machine is executing, it doesn’t need to keep track of the exact piece of code that’s executing; at that point, it’s just regular code, and the CPU keeps track of the instruction pointer just as with synchronous code. The state is recorded when the state machine needs to pause; the whole purpose is to allow it to continue the code execution later from the point it reached. Figure 6.1 shows the transitions between the possible states.
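The idea of pausing and taking steps can be observed from the outside. The following is a minimal sketch, not generated code: TwoStepsAsync and PauseDemo are hypothetical names, and the sketch assumes a console application with no synchronization context, so blocking with Wait() is safe.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PauseDemo
{
    // Hypothetical async method: one step before the await, one step after it.
    static async Task TwoStepsAsync(Task gate, List<string> log)
    {
        log.Add("first step");
        await gate;                 // Pauses here if gate hasn't completed yet
        log.Add("second step");
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        var gate = new TaskCompletionSource<bool>();

        Task task = TwoStepsAsync(gate.Task, log);
        // The async method has paused and returned, but it hasn't finished.
        log.Add("returned to caller, completed: " + task.IsCompleted);

        gate.SetResult(true);       // Allows the scheduled continuation to run
        task.Wait();                // Assumes no synchronization context
        log.Add("completed: " + task.IsCompleted);
        return log;
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```

The log shows the first step, then control returning to the caller with an incomplete task, and only then the second step.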
Let’s make this concrete with a real piece of code. The following listing shows a simple async method. It’s not quite as simple as you could make it, but it can demonstrate a few things at the same time.
static async Task PrintAndWait(TimeSpan delay)
{
    Console.WriteLine("Before first delay");
    await Task.Delay(delay);
    Console.WriteLine("Between delays");
    await Task.Delay(delay);
    Console.WriteLine("After second delay");
}
Three points to note at this stage are as follows:
This is nice and simple because you have no loops or try/catch/finally blocks to worry about. The control flow is simple, apart from the awaiting, of course. Let’s see what the compiler generates for this code.
I typically use a mixture of ildasm and Redgate Reflector for this sort of work, setting the Optimization level to C# 1 to prevent the decompiler from reconstructing the async method for us. Other decompilers are available, but whichever one you pick, I recommend checking the IL as well. I’ve seen subtle bugs in decompilers when it comes to await, often in terms of the execution order.
You don’t have to do any of this if you don’t want to, but if you find yourself wondering what the compiler does with a particular code construct, and this chapter doesn’t provide the answer, just go for it. Don’t forget the difference between debug and release builds, though, and don’t be put off by the names generated by the compiler, which can make the result harder to read.
Using the tools available, you can decompile listing 6.1 into something like listing 6.2. Many of the names that the C# compiler generates aren’t valid C#; I’ve rewritten them as valid identifiers for the sake of getting runnable code. In other cases, I’ve renamed the identifiers to make the code more readable. Similarly, I’ve taken a few liberties with how the cases and labels for the state machine are ordered; the result is logically equivalent to the generated code but much easier to read. In other places, I’ve used a switch statement even with only two cases, where the compiler might effectively use if/else. In these places, the switch statement represents the more general case that can work when there are multiple points to jump to, but the compiler can generate simpler code for simpler situations.
// Stub method
[AsyncStateMachine(typeof(PrintAndWaitStateMachine))]
[DebuggerStepThrough]
private static unsafe Task PrintAndWait(TimeSpan delay)
{
    // Creates the state machine, including the method parameter
    var machine = new PrintAndWaitStateMachine
    {
        delay = delay,
        builder = AsyncTaskMethodBuilder.Create(),
        state = -1
    };
    machine.builder.Start(ref machine);   // Runs the state machine until it pauses or completes
    return machine.builder.Task;          // Returns the task representing the async method
}

// Private struct for the state machine
[CompilerGenerated]
private struct PrintAndWaitStateMachine : IAsyncStateMachine
{
    public int state;                        // Where the state machine is within the method
    public AsyncTaskMethodBuilder builder;   // Hooks into the async infrastructure
    private TaskAwaiter awaiter;             // Awaiter to use when resuming
    public TimeSpan delay;                   // Original method parameter

    void IAsyncStateMachine.MoveNext()
    {
        // Implementation omitted for now
    }

    [DebuggerHidden]
    void IAsyncStateMachine.SetStateMachine(
        IAsyncStateMachine stateMachine)
    {
        this.builder.SetStateMachine(stateMachine);
    }
}
This listing looks somewhat complicated already, but I should warn you that the bulk of the work is done in the MoveNext method, and I’ve completely removed the implementation of that for now. The point of listing 6.2 is to set the scene and provide the structure so that when you get to the MoveNext implementation, it makes sense. Let’s look at the pieces of the listing in turn, starting with the stub method.
The stub method from listing 6.2 is simple apart from the AsyncTaskMethodBuilder. This is a value type, and it’s part of the common async infrastructure. You’ll see over the rest of the chapter how the state machine interacts with the builder.
[AsyncStateMachine(typeof(PrintAndWaitStateMachine))]
[DebuggerStepThrough]
private static unsafe Task PrintAndWait(TimeSpan delay)
{
    var machine = new PrintAndWaitStateMachine
    {
        delay = delay,
        builder = AsyncTaskMethodBuilder.Create(),
        state = -1
    };
    machine.builder.Start(ref machine);
    return machine.builder.Task;
}
The attributes applied to the method are essentially for tooling. They have no effect on regular execution, and you don’t need to know any details about them in order to understand the generated asynchronous code. The state machine is always created in the stub method with three pieces of information:
The name AsyncTaskMethodBuilder may make you think of reflection, but it’s not creating a method in IL or anything like that. The builder provides functionality that the generated code uses to propagate success and failure, handle awaiting, and so forth. If the name “helper” works better for you, feel free to think of it that way.
After creating the state machine, the stub method asks the machine’s builder to start it, passing the machine itself by reference. You’ll see quite a lot of passing by reference in the following few pages, and this comes down to a need for efficiency and consistency. Both the state machine and the AsyncTaskMethodBuilder are mutable value types. Passing machine by reference to the Start method avoids making a copy of the state, which is more efficient and ensures that any changes made to the state within Start are still visible when the Start method returns. In particular, the builder state within the machine may well change during Start. That’s why it’s important that you use machine.builder for both the Start call and the Task property afterward. Suppose you extracted machine.builder to a local variable, like this:
// Broken refactoring: this operates on a copy of the builder
var builder = machine.builder;
builder.Start(ref machine);
return builder.Task;
With that code, state changes made directly within builder.Start() wouldn’t be seen within machine.builder (or vice versa) because it would be a copy of the builder. This is where it’s important that machine.builder refers to a field, not a property. You don’t want to operate on a copy of the builder in the state machine; rather, you want to operate directly on the value that the state machine contains. This is precisely the sort of detail that you don’t want to have to deal with yourself and is why mutable value types and public fields are almost always a bad idea. (You’ll see in chapter 11 how they can be useful when carefully considered.)
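The copying problem isn’t specific to the async infrastructure; it applies to any mutable value type. Here’s a minimal sketch using hypothetical stand-in types (CounterBuilder and DemoMachine aren’t part of the framework) that shows why calling through the field and calling through a local copy behave differently.

```csharp
using System;

// Hypothetical stand-in for a mutable builder struct.
public struct CounterBuilder
{
    public int Calls;
    public void Start() { Calls++; }
}

// Hypothetical stand-in for a state machine that holds its builder in a public field.
public struct DemoMachine
{
    public CounterBuilder builder;
}

public static class CopyDemo
{
    public static int ViaField()
    {
        var machine = new DemoMachine();
        machine.builder.Start();          // Mutates the value stored inside machine
        return machine.builder.Calls;
    }

    public static int ViaLocalCopy()
    {
        var machine = new DemoMachine();
        var builder = machine.builder;    // Copies the struct value
        builder.Start();                  // Mutates only the copy
        return machine.builder.Calls;     // The original is unchanged
    }

    public static void Main()
    {
        Console.WriteLine(ViaField());      // 1
        Console.WriteLine(ViaLocalCopy());  // 0
    }
}
```

If builder were exposed as a property rather than a field, reading it would also produce a copy, which is why the generated code uses a field.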
Starting the machine doesn’t create any new threads. It just runs the state machine’s MoveNext() method until either the state machine needs to pause while it awaits another asynchronous operation or completes. In other words, it takes one step. Either way, MoveNext() returns, at which point machine.builder.Start() returns, and you can return a task representing the overall asynchronous method back to our caller. The builder is responsible for creating the task and ensuring that it changes state appropriately over the course of the asynchronous method.
That’s the stub method. Now let’s look at the state machine itself.
I’m still omitting the majority of the code from the state machine (in the MoveNext() method), but here’s a reminder of the structure of the type:
[CompilerGenerated]
private struct PrintAndWaitStateMachine : IAsyncStateMachine
{
    public int state;
    public AsyncTaskMethodBuilder builder;
    private TaskAwaiter awaiter;
    public TimeSpan delay;

    void IAsyncStateMachine.MoveNext()
    {
        // Implementation omitted for now
    }

    [DebuggerHidden]
    void IAsyncStateMachine.SetStateMachine(
        IAsyncStateMachine stateMachine)
    {
        this.builder.SetStateMachine(stateMachine);
    }
}
Again, the attributes aren’t important. The important aspects of the type are as follows:
You’ve seen one use of the type implementing IAsyncStateMachine already, although it was somewhat hidden: AsyncTaskMethodBuilder.Start() is a generic method with a constraint that the type parameter has to implement IAsyncStateMachine. After performing a bit of housekeeping, Start() calls MoveNext() to make the state machine take the first step of the async method.
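To see the interaction between Start() and MoveNext() without any compiler involvement, you can write a trivial state machine by hand. This is a sketch rather than anything the compiler would emit: ImmediateStateMachine and HandWrittenStub are hypothetical names, and the machine completes on its first step, so it never needs to pause.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical handwritten state machine that completes in its first step.
public struct ImmediateStateMachine : IAsyncStateMachine
{
    public AsyncTaskMethodBuilder builder;
    public int moveNextCalls;

    void IAsyncStateMachine.MoveNext()
    {
        moveNextCalls++;
        builder.SetResult();   // No awaits, so the method completes immediately
    }

    void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
    {
        builder.SetStateMachine(stateMachine);
    }
}

public static class HandWrittenStub
{
    // Plays the role of the stub method.
    public static Task Run(out int moveNextCalls)
    {
        var machine = new ImmediateStateMachine
        {
            builder = AsyncTaskMethodBuilder.Create()
        };
        machine.builder.Start(ref machine);     // Calls MoveNext() once
        moveNextCalls = machine.moveNextCalls;  // Visible because machine was passed by reference
        return machine.builder.Task;
    }

    public static void Main()
    {
        int calls;
        Task task = Run(out calls);
        Console.WriteLine(calls);
        Console.WriteLine(task.Status);
    }
}
```

Because Start() is generic and takes the machine by reference, the call to MoveNext() operates on the stub’s local variable directly, with no boxing.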
The fields involved can be broadly split into five categories:
The state and builder are fairly simple. The state is just an integer with one of the following values:
As I mentioned before, the type of the builder depends on the return type of the async method. Before C# 7, the builder type was always AsyncVoidMethodBuilder, AsyncTaskMethodBuilder, or AsyncTaskMethodBuilder<T>. With C# 7 and custom task types, the builder type is the one specified by the AsyncMethodBuilderAttribute applied to the custom task type.
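If you’d rather check the builder type than take my word for it, reflection can find it. The following sketch relies on the AsyncStateMachineAttribute that the compiler applies to the stub method; the class and method names are hypothetical, and finding the builder field by looking for "MethodBuilder" in the field type name is an assumption about naming, not a documented contract.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class BuilderTypeDemo
{
    // Three hypothetical async methods, one per built-in return type.
    static async void VoidAsync() { await Task.CompletedTask; }
    static async Task TaskAsync() { await Task.CompletedTask; }
    static async Task<int> TaskOfIntAsync() { await Task.CompletedTask; return 1; }

    public static Type BuilderTypeOf(string methodName)
    {
        MethodInfo method = typeof(BuilderTypeDemo).GetMethod(
            methodName, BindingFlags.NonPublic | BindingFlags.Static);
        // The attribute on the stub method identifies the generated state machine type.
        Type machineType = method
            .GetCustomAttribute<AsyncStateMachineAttribute>()
            .StateMachineType;
        // Assumption: the builder is the only field whose type name contains "MethodBuilder".
        return machineType
            .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .Select(field => field.FieldType)
            .First(type => type.Name.Contains("MethodBuilder"));
    }

    public static void Main()
    {
        Console.WriteLine(BuilderTypeOf("VoidAsync"));
        Console.WriteLine(BuilderTypeOf("TaskAsync"));
        Console.WriteLine(BuilderTypeOf("TaskOfIntAsync"));
    }
}
```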
The other fields are slightly trickier in that all of them depend on the body of the async method, and the compiler tries to use as few fields as it can. The crucial point to remember is that you need fields only for values that you need to come back to after the state machine resumes at some point. Sometimes the compiler can use fields for multiple purposes, and sometimes it can omit them entirely.
The first example of how the compiler can reuse fields is with awaiters. Only one awaiter is relevant at a time, because any particular state machine can await only one value at a time. The compiler creates a single field for each awaiter type that’s used. If you await two Task<int> values, one Task<string>, and three nongeneric Task values in an async method, you’ll end up with three fields: a TaskAwaiter<int>, a TaskAwaiter<string>, and a nongeneric TaskAwaiter. The compiler uses the appropriate field for each await expression based on the awaiter type.
This assumes the awaiter is introduced by the compiler. If you call GetAwaiter() yourself and assign the result to a local variable, that’s treated like any other local variable. I’m talking about the awaiters that are produced as the result of await expressions.
Next, let’s consider local variables. Here, the compiler doesn’t reuse fields but can omit them entirely. If a local variable is used only between two await expressions rather than across await expressions, it can stay as a local variable in the MoveNext() method.
It’s easier to see what I mean with an example. Consider the following async method:
public async Task LocalVariableDemoAsync()
{
    int x = DateTime.UtcNow.Second;   // x is used after the await
    int y = DateTime.UtcNow.Second;   // y is used only before the await
    Console.WriteLine(y);
    await Task.Delay(10);
    Console.WriteLine(x);
}
The compiler would generate a field for x because the value has to be preserved while the state machine is paused, but y can just be a local variable on the stack while the code is executing.
The compiler does a pretty good job of creating only as many fields as it needs. But at times, you might spot an optimization that the compiler could perform but doesn’t. For example, if two variables have the same type and are both used across await expressions (so they need fields), but they’re never both in scope at the same time, the compiler could use just one field for both as it does for awaiters. At the time of this writing, it doesn’t, but who knows what the future could hold?
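One way to find out which fields a particular compiler build actually generates is to list them with reflection. This is a sketch under the assumption that the stub method carries AsyncStateMachineAttribute, as it does in the listings in this chapter; FieldListDemo is a hypothetical name, and the field names and count you see will vary between compiler versions and between debug and release builds.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class FieldListDemo
{
    public static async Task LocalVariableDemoAsync()
    {
        int x = DateTime.UtcNow.Second;   // Used after the await
        int y = DateTime.UtcNow.Second;   // Used only before the await
        Console.WriteLine(y);
        await Task.Delay(10);
        Console.WriteLine(x);
    }

    public static Type StateMachineType()
    {
        MethodInfo method = typeof(FieldListDemo).GetMethod(nameof(LocalVariableDemoAsync));
        return method.GetCustomAttribute<AsyncStateMachineAttribute>().StateMachineType;
    }

    // Returns "FieldType FieldName" for every instance field of the state machine.
    public static string[] StateMachineFields()
    {
        return StateMachineType()
            .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .Select(field => field.FieldType.Name + " " + field.Name)
            .ToArray();
    }

    public static void Main()
    {
        foreach (string description in StateMachineFields())
        {
            Console.WriteLine(description);
        }
    }
}
```

Comparing the output of a debug build with that of a release build is a quick way to see which local variables end up as fields in each case.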
Finally, there are temporary stack variables. These are introduced when an await expression is used as part of a bigger expression and some intermediate values need to be remembered. Our simple example in listing 6.1 doesn’t need any, which is why listing 6.2 shows only four fields: the state, builder, awaiter, and parameter. As an example of this, consider the following method:
public async Task TemporaryStackDemoAsync()
{
    Task<int> task = Task.FromResult(10);
    DateTime now = DateTime.UtcNow;
    int result = now.Second + now.Hour * await task;
}
The C# rules for operand evaluation don’t change just because you’re within an async method. The properties now.Second and now.Hour both have to be evaluated before the task is awaited, and their results have to be remembered in order to perform the arithmetic later, after the state machine resumes when the task completes. That means the compiler needs to use fields for them.
In this case, you know that Task.FromResult always returns a completed task. But the compiler doesn’t know that, and it has to generate the state machine in a way that would let it pause and resume if the task weren’t complete.
You can think of it as if the compiler rewrites the code to introduce extra local variables:
public async Task TemporaryStackDemoAsync()
{
    Task<int> task = Task.FromResult(10);
    DateTime now = DateTime.UtcNow;
    int tmp1 = now.Second;
    int tmp2 = now.Hour;
    int result = tmp1 + tmp2 * await task;
}
Then the local variables are converted into fields. Unlike real local variables, the compiler does reuse temporary stack variables of the same type and generates only as many fields as it needs to.
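The evaluation order is easy to observe if the operands have side effects. In this sketch, Read is a hypothetical helper that logs its name before returning a value, and a TaskCompletionSource stands in for an operation that hasn’t completed yet, so the state machine really does pause with two intermediate values to remember.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class OperandOrderDemo
{
    // Hypothetical helper: records that an operand was evaluated.
    static int Read(string name, int value, List<string> log)
    {
        log.Add(name);
        return value;
    }

    static async Task<int> ComputeAsync(Task<int> task, List<string> log)
    {
        // Both Read calls happen before the await; their results must survive the pause.
        return Read("left", 1, log) + Read("middle", 2, log) * await task;
    }

    public static int Run(List<string> log)
    {
        var source = new TaskCompletionSource<int>();
        Task<int> result = ComputeAsync(source.Task, log);
        log.Add("paused");        // Reached only after the async method has paused
        source.SetResult(10);
        return result.Result;     // Assumes no synchronization context
    }

    public static void Main()
    {
        var log = new List<string>();
        Console.WriteLine(Run(log));                 // 21
        Console.WriteLine(string.Join(", ", log));   // left, middle, paused
    }
}
```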
That explains all the fields in the state machine. Next, you need to look at the MoveNext() method—but only conceptually, to start with.
I’m not going to show you the decompiled code for listing 6.1’s MoveNext() method yet, because it’s long and scary.[1] After you know what the flow looks like, it’s more manageable, so I’ll describe it in the abstract here.
[1] If A Few Good Men had been about async, the line would have been, “You want the MoveNext? You can’t handle the MoveNext!”
Each time MoveNext() is called, the state machine takes another step. Each time it reaches an await expression, it’ll continue if the value being awaited has already completed and pause otherwise. MoveNext() returns if any of the following occurs:
Note that in the final case, the MoveNext() method doesn’t end up throwing an exception. Instead, the task associated with the async call becomes faulted. (If that surprises you, see section 5.6.5 for a reminder of the behavior of async methods with respect to exceptions.)
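You can see that behavior from the calling side with a method that fails during its first step. This is a small sketch with hypothetical names; awaiting Task.CompletedTask takes no time, so the exception is thrown before the async method ever pauses.

```csharp
using System;
using System.Threading.Tasks;

public static class FaultedTaskDemo
{
    static async Task FailAsync()
    {
        await Task.CompletedTask;   // Already completed, so execution carries straight on
        throw new InvalidOperationException("boom");
    }

    public static Task CallWithoutAwaiting()
    {
        // No exception escapes from this call, even though the method
        // throws during its first (and only) step.
        return FailAsync();
    }

    public static void Main()
    {
        Task task = CallWithoutAwaiting();
        Console.WriteLine(task.IsFaulted);                         // True
        Console.WriteLine(task.Exception.InnerException.Message);  // boom
    }
}
```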
Figure 6.2 shows a general flowchart of an async method that focuses on the MoveNext() method. I haven’t included exception handling in the figure, as flowcharts don’t have a way of representing try/catch blocks. You’ll see how that’s managed when you eventually look at the code. Likewise, I haven’t shown where SetStateMachine is called, as the flowchart is complicated enough as it is.
One final point about the MoveNext() method: its return type is void, not a task type. Only the stub method needs to return the task, which it gets from the state machine’s builder after the builder’s Start() method has called MoveNext() to take the first step. All the other calls to MoveNext() are part of the infrastructure for resuming the state machine from a paused state, and those don’t need the associated task. You’ll see what all of this looks like in code in section 6.2 (not long to go now), but first, a brief word on SetStateMachine.
I’ve already shown the implementation of SetStateMachine. It’s simple:
void IAsyncStateMachine.SetStateMachine(
    IAsyncStateMachine stateMachine)
{
    this.builder.SetStateMachine(stateMachine);
}
The implementation in release builds always looks like this. (In debug builds, where the state machine is a class, the implementation is empty.) The purpose of the method is easy to explain at a high level, but the details are fiddly. When a state machine takes its first step, it’s on the stack as a local variable of the stub method. If it pauses, it has to box itself (onto the heap) so that all that information is still in place when it resumes. After it’s been boxed, SetStateMachine is called on the boxed value using the boxed value as the argument. In other words, somewhere deep in the heart of the infrastructure, there’s code that looks a bit like this:
void BoxAndRemember<TStateMachine>(ref TStateMachine stateMachine)
    where TStateMachine : IAsyncStateMachine
{
    IAsyncStateMachine boxed = stateMachine;
    boxed.SetStateMachine(boxed);
}
It’s not quite as simple as that, but that conveys the essence of what’s going on. The implementation of SetStateMachine then makes sure that the AsyncTaskMethodBuilder has a reference to the single boxed version of the state machine that it’s a part of. The method has to be called on the boxed value; it can be called only after boxing, because that’s when you have the reference to the boxed value, and if you called it on the unboxed value after boxing, that wouldn’t affect the boxed value. (Remember, AsyncTaskMethodBuilder is itself a value type.) This intricate dance ensures that when a continuation delegate is passed to the awaiter, that continuation will call MoveNext() on the same boxed instance.
The result is that the state machine isn’t boxed at all if it doesn’t need to be and is boxed exactly once if necessary. After it’s boxed, everything happens on the boxed version. It’s a lot of complicated code in the name of efficiency.
I find this little dance one of the most intriguing and bizarre bits of the whole async machinery. It sounds like it’s utterly pointless, but it’s necessary because of the way boxing works, and boxing is necessary to preserve information while the state machine is paused.
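The reason the call has to be made on the boxed value becomes clearer with a toy version. Everything in this sketch is hypothetical (ISelfAware, SelfAwareMachine, and this BoxAndRemember aren’t framework types); it only demonstrates that boxing creates a separate copy and that a call through the interface changes the boxed copy but not the original.

```csharp
using System;

// Hypothetical interface playing the role of IAsyncStateMachine.
public interface ISelfAware
{
    void Remember(ISelfAware boxed);
    bool RemembersItself { get; }
}

// Hypothetical mutable struct playing the role of the state machine.
public struct SelfAwareMachine : ISelfAware
{
    private ISelfAware self;

    public void Remember(ISelfAware boxed) { self = boxed; }
    public bool RemembersItself { get { return self != null; } }
}

public static class BoxingDemo
{
    public static ISelfAware BoxAndRemember<T>(ref T machine) where T : ISelfAware
    {
        ISelfAware boxed = machine;   // Boxing copies the struct onto the heap
        boxed.Remember(boxed);        // Changes the boxed copy only
        return boxed;
    }

    public static void Main()
    {
        var machine = new SelfAwareMachine();
        ISelfAware boxed = BoxAndRemember(ref machine);
        Console.WriteLine(boxed.RemembersItself);    // True
        Console.WriteLine(machine.RemembersItself);  // False
    }
}
```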
It’s absolutely fine not to fully understand this code. If you ever find yourself debugging async code at a low level, you can come back to this section. For all other intents and purposes, this code is more of a novelty than anything else.
That’s what the state machine consists of. Most of the rest of the chapter is devoted to the MoveNext() method and how it operates in various situations. We’ll start with the simple case and work up from there.
We’re going to start with the simple async method that you saw in listing 6.1. It’s simple not because it’s short (although that helps) but because it doesn’t contain any loops, try statements, or using statements. It has simple control flow, which leads to a relatively simple state machine. Let’s get cracking.
I’m going to show you the full method to start with. Don’t expect this to all make sense yet, but do spend a few minutes looking through it. With this concrete example in hand, the more general structure is easier to understand, because you can always look back to see how each part of that structure is present in this example. At the risk of boring you, here’s listing 6.1 yet again as a reminder of the compiler’s input:
static async Task PrintAndWait(TimeSpan delay)
{
    Console.WriteLine("Before first delay");
    await Task.Delay(delay);
    Console.WriteLine("Between delays");
    await Task.Delay(delay);
    Console.WriteLine("After second delay");
}
The following listing is a version of the decompiled code that has been slightly rewritten for readability. (Yes, this is the easy-to-read version.)
void IAsyncStateMachine.MoveNext()
{
    int num = this.state;
    try
    {
        TaskAwaiter awaiter1;
        switch (num)
        {
            default:
                goto MethodStart;
            case 0:
                goto FirstAwaitContinuation;
            case 1:
                goto SecondAwaitContinuation;
        }
    MethodStart:
        Console.WriteLine("Before first delay");
        awaiter1 = Task.Delay(this.delay).GetAwaiter();
        if (awaiter1.IsCompleted)
        {
            goto GetFirstAwaitResult;
        }
        this.state = num = 0;
        this.awaiter = awaiter1;
        this.builder.AwaitUnsafeOnCompleted(ref awaiter1, ref this);
        return;
    FirstAwaitContinuation:
        awaiter1 = this.awaiter;
        this.awaiter = default(TaskAwaiter);
        this.state = num = -1;
    GetFirstAwaitResult:
        awaiter1.GetResult();
        Console.WriteLine("Between delays");
        TaskAwaiter awaiter2 = Task.Delay(this.delay).GetAwaiter();
        if (awaiter2.IsCompleted)
        {
            goto GetSecondAwaitResult;
        }
        this.state = num = 1;
        this.awaiter = awaiter2;
        this.builder.AwaitUnsafeOnCompleted(ref awaiter2, ref this);
        return;
    SecondAwaitContinuation:
        awaiter2 = this.awaiter;
        this.awaiter = default(TaskAwaiter);
        this.state = num = -1;
    GetSecondAwaitResult:
        awaiter2.GetResult();
        Console.WriteLine("After second delay");
    }
    catch (Exception exception)
    {
        this.state = -2;
        this.builder.SetException(exception);
        return;
    }
    this.state = -2;
    this.builder.SetResult();
}
That’s a lot of code, and you may notice that it has a lot of goto statements and code labels, which you hardly ever see in handwritten C#. At the moment, I expect it to be somewhat impenetrable, but I wanted to show you a concrete example to start with, so you can refer to it anytime it’s useful to you. I’m going to break this down further into general structure and then the specifics of await expressions. By the end of this section, listing 6.3 will probably still look extremely ugly to you, but you’ll be in a better position to understand what it’s doing and why.
We’re into the next layer of the async onion. The MoveNext() method is at the heart of the async state machine, and its complexity is a reminder of how hard it is to get async code right. The more complex the state machine, the more reason you have to be grateful that it’s the C# compiler that has to write the code rather than you.
It’s time to introduce more terminology for the sake of brevity. At each await expression, the value being awaited may already have completed or may still be incomplete. If it has already completed by the time you await it, the state machine keeps executing. I call this the fast path. If it hasn’t already completed, the state machine schedules a continuation and pauses. I call this the slow path.
As a reminder, the MoveNext() method is invoked once when the async method is first called and then once each time it needs to resume from being paused at an await expression. (If every await expression takes the fast path, MoveNext() will be called only once.) The method is responsible for the following:
With this in mind, the following listing shows pseudocode for the general structure of a MoveNext() method. You’ll see in later sections how this can end up being more complicated because of extra control flow, but it’s a natural extension.
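void IAsyncStateMachine.MoveNext()
{
    try
    {
        switch (this.state)
        {
            default: goto MethodStart;
            case 0: goto Label0A;
            case 1: goto Label1A;
            case 2: goto Label2A;
            // One case per await expression
        }
    MethodStart:
        // Code before the first await expression
        // Setup of the first awaiter
    Label0A:
        // Code for resuming from a continuation
    Label0B:
        // Fast and slow paths rejoin here
        // Remainder of the code, with more labels, awaiters, and so on
    }
    catch (Exception e)
    {
        // Propagates all exceptions via the builder
        this.state = -2;
        builder.SetException(e);
        return;
    }
    // Propagates method completion via the builder
    this.state = -2;
    builder.SetResult();
}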
The big try/catch block covers all the code from the original async method. If anything in there throws an exception, however it’s thrown (via awaiting a faulted operation, calling a synchronous method that throws, or simply throwing an exception directly), that exception is caught and then propagated via the builder. Only special exceptions (ThreadAbortException and StackOverflowException, for example) will ever cause MoveNext() to end with an exception.
Within the try/catch block, the start of the MoveNext() method is always effectively a switch statement used to jump to the right piece of code within the method based on the state. If the state is non-negative, that means you’re resuming after an await expression. Otherwise, it’s assumed that you’re executing MoveNext() for the first time.
In section 6.1, I listed the possible states as not started, executing, paused, and complete (where paused is a separate state per await expression). Why doesn’t the state machine handle not started, executing, and complete differently?
The answer is that MoveNext() should never end up being called in the executing or complete states. You can force it to by writing a broken awaiter implementation or by using reflection, but under normal operation, MoveNext() is called only to start or resume the state machine. There aren’t even distinct state numbers for not started and executing; both use –1. There’s a state number of –2 for completed, but the state machine never checks for that value.
One bit of trickiness to be aware of is the difference between a return statement in the state machine and a return statement in the original async code. Within the state machine, return is used when the state machine is paused after scheduling a continuation for an awaiter. Any return statement in the original code ends up dropping to the bottom part of the state machine outside the try/catch block, where the method completion is propagated via the builder.
If you compare listings 6.3 and 6.4, hopefully you can see how our concrete example fits into the general pattern. At this point, I’ve explained almost everything about the code generated by the simple async method you started with. The only bit that’s missing is exactly what happens around await expressions.
Let’s think again about what has to happen each time you hit an await expression when executing an async method, assuming you’ve already evaluated the operand to get something that’s awaitable:
With that list in mind, let’s review a section of listing 6.3 that corresponds to our first await expression.
    awaiter1 = Task.Delay(this.delay).GetAwaiter();
    if (awaiter1.IsCompleted)
    {
        goto GetFirstAwaitResult;
    }
    this.state = num = 0;
    this.awaiter = awaiter1;
    this.builder.AwaitUnsafeOnCompleted(ref awaiter1, ref this);
    return;
FirstAwaitContinuation:
    awaiter1 = this.awaiter;
    this.awaiter = default(TaskAwaiter);
    this.state = num = -1;
GetFirstAwaitResult:
    awaiter1.GetResult();
Unsurprisingly, the code follows the set of steps precisely.[2] The two labels represent the two places you have to jump to, depending on the path:
[2] It’s unsurprising in that it would have been pretty odd of me to write that list of steps and then present code that didn’t follow the list.
The call to builder.AwaitUnsafeOnCompleted(ref awaiter1, ref this) is the part that does the boxing dance with a call back into SetStateMachine (if necessary; it happens only once per state machine) and schedules the continuation. In some cases, you’ll see a call to AwaitOnCompleted instead of AwaitUnsafeOnCompleted. These differ only in terms of how the execution context is handled. You’ll look at this in more detail in section 6.5.
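If you want to watch the generated code talk to an awaiter, a custom awaitable that logs every call is a handy tool. The following is a sketch with hypothetical types (LoggingAwaitable and its nested Awaiter). The awaiter implements only INotifyCompletion, so the compiler has to use AwaitOnCompleted for it rather than AwaitUnsafeOnCompleted, and the test code invokes the stored continuation by hand to stand in for the awaited operation completing.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical awaitable that records each call the generated code makes.
public sealed class LoggingAwaitable
{
    private readonly bool completed;
    public readonly List<string> Log = new List<string>();
    public Action Continuation;   // Stored on the slow path, invoked manually later

    public LoggingAwaitable(bool completed) { this.completed = completed; }

    public Awaiter GetAwaiter()
    {
        Log.Add("GetAwaiter");
        return new Awaiter(this);
    }

    public struct Awaiter : INotifyCompletion
    {
        private readonly LoggingAwaitable owner;
        public Awaiter(LoggingAwaitable owner) { this.owner = owner; }

        public bool IsCompleted
        {
            get { owner.Log.Add("IsCompleted"); return owner.completed; }
        }

        public void OnCompleted(Action continuation)
        {
            owner.Log.Add("OnCompleted");
            owner.Continuation = continuation;
        }

        public void GetResult() { owner.Log.Add("GetResult"); }
    }
}

public static class AwaiterLogDemo
{
    public static async Task AwaitOnce(LoggingAwaitable awaitable)
    {
        await awaitable;
    }

    public static void Main()
    {
        var fast = new LoggingAwaitable(true);
        AwaitOnce(fast);
        Console.WriteLine(string.Join(", ", fast.Log));

        var slow = new LoggingAwaitable(false);
        Task task = AwaitOnce(slow);
        Console.WriteLine(string.Join(", ", slow.Log));
        slow.Continuation();   // Simulates the awaited operation completing
        Console.WriteLine(string.Join(", ", slow.Log));
        Console.WriteLine(task.IsCompleted);
    }
}
```

When the awaitable reports that it has already completed, OnCompleted is never called; otherwise, GetResult is called only after the continuation has been invoked.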
One aspect that may seem slightly unclear is the use of the num local variable. It’s always assigned a value at the same time as the state field but is always read instead of the field. (Its initial value is copied out of the field, but that’s the only time the field is read.) I believe this is purely for optimization. Whenever you read num, it’s fine to think of it as this.state instead.
Looking at listing 6.5, that’s 16 lines of code for what was originally just the following:
await Task.Delay(delay);
The good news is that you almost never need to see all that code unless you’re going through this kind of exercise. There’s a small amount of bad news in that the code inflation means that even small async methods, including those using ValueTask<TResult>, can’t be sensibly inlined by the JIT compiler. In most cases, that’s a minuscule price to pay for the benefits afforded by async/await, though.
That’s the simple case with simple control flow. With that background, you can explore a couple of more-complex cases.
The example you’ve been looking at so far has just been a sequence of method calls with only the await operator introducing complexity. Life gets a little harder when you want to write real code with all the normal control-flow statements you’re used to.
In this section, I’ll show you just two elements of control flow: loops and try/finally statements. This isn’t intended to be comprehensive, but it should give you enough of a glimpse at the control-flow gymnastics the compiler has to perform to help you understand other situations if you need to.
Before we get into the tricky part, I’ll give an example of where introducing control flow doesn’t add to the generated code complexity any more than it would in the synchronous code. In the following listing, a loop is introduced into our example method, so you print Between delays three times instead of once.
static async Task PrintAndWaitWithSimpleLoop(TimeSpan delay)
{
    Console.WriteLine("Before first delay");
    await Task.Delay(delay);
    for (int i = 0; i < 3; i++)
    {
        Console.WriteLine("Between delays");
    }
    await Task.Delay(delay);
    Console.WriteLine("After second delay");
}
What does this look like when decompiled? Very much like listing 6.2! The only difference is that this
GetFirstAwaitResult:
awaiter1.GetResult();
Console.WriteLine("Between delays");
TaskAwaiter awaiter2 = Task.Delay(this.delay).GetAwaiter();
becomes the following:
GetFirstAwaitResult:
awaiter1.GetResult();
for (int i = 0; i < 3; i++)
{
    Console.WriteLine("Between delays");
}
TaskAwaiter awaiter2 = Task.Delay(this.delay).GetAwaiter();
The change in the state machine is exactly the same as the change in the original code. There are no extra fields and no complexities in terms of how to continue execution; it’s just a loop.
The reason I bring this up is to help you think about why extra complexity is required in our next examples. In listing 6.6, you never need to jump into the loop from outside, and you never need to pause execution and jump out of the loop, thereby pausing the state machine. Those are the situations introduced by await expressions when you await within the loop. Let’s do that now.
Our example so far has contained two await expressions. To keep the code somewhat manageable as I introduce other complexities, I’m going to reduce that to one. The following listing shows the async method you’re going to decompile in this subsection.
static async Task AwaitInLoop(TimeSpan delay)
{
    Console.WriteLine("Before loop");
    for (int i = 0; i < 3; i++)
    {
        Console.WriteLine("Before await in loop");
        await Task.Delay(delay);
        Console.WriteLine("After await in loop");
    }
    Console.WriteLine("After loop delay");
}
The Console.WriteLine calls are mostly present as signposts within the decompiled code, which makes it easier to map to the original listing.
What does the compiler generate for this? I’m not going to show the complete code, because most of it is similar to what you’ve seen before. (It’s all in the downloadable source, though.) The stub method and state machine are almost exactly as they were for earlier examples but with one additional field in the state machine corresponding to i, the loop counter. The interesting part is in MoveNext().
You can represent the code faithfully in C# but not using a loop construct. The problem is that after the state machine returns from pausing at Task.Delay, you want to jump into the middle of the original loop. You can’t do that with a goto statement in C#; the language forbids a goto statement specifying a label if the goto statement isn’t in the scope of that label.
That’s okay; you can implement your for loop with a lot of goto statements without introducing any extra scopes at all. That way, you can jump to the middle of it without a problem. The following listing shows the bulk of the decompiled code for the body of the MoveNext() method. I’ve included only the part within the try block, as that’s what we’re focusing on here. (The rest is simple boilerplate.)
switch (num)
{
    default:
        goto MethodStart;
    case 0:
        goto AwaitContinuation;
}
MethodStart:
Console.WriteLine("Before loop");
this.i = 0;                        // For loop initializer
goto ForLoopCondition;             // Skips straight to checking the loop condition
ForLoopBody:                       // Body of the loop
Console.WriteLine("Before await in loop");
TaskAwaiter awaiter = Task.Delay(this.delay).GetAwaiter();
if (awaiter.IsCompleted)
{
    goto GetAwaitResult;
}
this.state = num = 0;
this.awaiter = awaiter;
this.builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
return;
AwaitContinuation:                 // Target for the jump when the state machine resumes
awaiter = this.awaiter;
this.awaiter = default(TaskAwaiter);
this.state = num = -1;
GetAwaitResult:
awaiter.GetResult();
Console.WriteLine("After await in loop");
this.i++;                          // For loop iterator
ForLoopCondition:                  // Checks the loop condition and jumps back to the body if it holds
if (this.i < 3)
{
    goto ForLoopBody;
}
Console.WriteLine("After loop delay");
I could’ve skipped this example entirely, but it brings up a few interesting points. First, the C# compiler doesn’t convert an async method into equivalent C# that doesn’t use async/await. It only has to generate appropriate IL. In some places, C# has rules that are stricter than those in IL. (The set of valid identifiers is another example of this.)
Second, although decompilers can be useful when looking at async code, sometimes they produce invalid C#. When I first decompiled the output of listing 6.7, the output included a while loop containing a label and a goto statement outside that loop trying to jump into it. You can sometimes get valid (but harder-to-read) C# by telling the decompiler not to work as hard to produce idiomatic C#, at which point you’ll see an awful lot of goto statements.
Third, in case you weren’t already convinced, you don’t want to be writing this sort of code by hand. If you had to write C# 4 code for this sort of task, you’d no doubt do it in a very different way, but it would still be significantly uglier than the async method you can use in C# 5.
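To make the goto-based loop a little more concrete, here’s a minimal sketch of the same shape with the await removed, so it’s ordinary synchronous C# that you can run. The GotoLoopDemo class, the RunLoopWithGotos method, and the log messages are names I’ve made up for illustration; only the label layout mirrors the decompiled code.

```csharp
using System;
using System.Collections.Generic;

public static class GotoLoopDemo
{
    // Behaves like: for (int i = 0; i < 3; i++) { log.Add($"Body {i}"); }
    // but every label lives in the method's outermost block, so a goto
    // from anywhere in the method is allowed to target it.
    public static List<string> RunLoopWithGotos()
    {
        var log = new List<string>();
        int i = 0;                 // For loop initializer
        goto ForLoopCondition;     // Check the condition before the first iteration
    ForLoopBody:
        log.Add($"Body {i}");
        i++;                       // For loop iterator
    ForLoopCondition:
        if (i < 3)
        {
            goto ForLoopBody;
        }
        log.Add("After loop");
        return log;
    }

    public static void Main()
    {
        // Prints "Body 0", "Body 1", "Body 2", "After loop"
        foreach (string line in RunLoopWithGotos())
        {
            Console.WriteLine(line);
        }
    }
}
```

Because ForLoopBody isn’t nested inside a loop statement, nothing stops another part of the method from jumping straight to a label in the middle of the body, which is exactly the freedom the state machine needs.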
You’ve seen how awaiting within a loop might cause humans some stress, but it doesn’t cause the compiler to break a sweat. For our final control-flow example, you’ll give it some harder work to do: a try/finally block.
Just to remind you, it’s always been valid to use await in a try block, but in C# 5, it was invalid to use it in a catch or finally block. That restriction was lifted in C# 6, although I’m not going to show any code that takes advantage of it.
There are simply too many possibilities to go through here. The aim of this chapter is to give you insight into the kind of thing the C# compiler does with async/await rather than provide an exhaustive list of translations.
In this section, I’m only going to show you an example of awaiting within a try block that has just a finally block. That’s probably the most common kind of try block, because it’s the one that using statements are equivalent to. The following listing shows the async method you’re going to decompile. Again, all the console output is present only to make it simpler to understand the state machine.
static async Task AwaitInTryFinally(TimeSpan delay)
{
    Console.WriteLine("Before try block");
    await Task.Delay(delay);
    try
    {
        Console.WriteLine("Before await");
        await Task.Delay(delay);
        Console.WriteLine("After await");
    }
    finally
    {
        Console.WriteLine("In finally block");
    }
    Console.WriteLine("After finally block");
}
You might imagine that the decompiled code would look something like this:
switch (num)
{
    default:
        goto MethodStart;
    case 0:
        goto AwaitContinuation;
}
MethodStart:
...
try
{
    ...
AwaitContinuation:
    ...
GetAwaitResult:
    ...
}
finally
{
    ...
}
...
Here, each ellipsis (...) represents more code. There’s a problem with that approach, though: even in IL, you’re not allowed to jump from outside a try block to inside it. It’s a little bit like the problem you saw in the previous section with loops, but this time instead of a C# rule, it’s an IL rule.
To get around this restriction, the C# compiler uses a technique I like to think of as a trampoline. (This isn’t official terminology, although the term is used elsewhere for similar purposes.) It jumps to just before the try block, and then the first thing inside the try block is a piece of code that jumps to the right place within the block.
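In addition to the trampoline, the finally block needs to be handled with care, too. There are three situations in which you’ll execute the finally block of the generated code: you reach the end of the try block, the try block throws an exception, or you need to pause within the try block because of an await expression.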
(If the async method contained a return statement, that would be another option.) If the finally block is executing because you’re pausing the state machine and returning to the caller, the code in the original async method’s finally block shouldn’t execute. After all, you’re logically paused inside the try block and will be resuming there when the delay completes. Fortunately, this is easy to detect: the num local variable (which always has the same value as the state field) is negative if the state machine is still executing or finished and non-negative if you’re pausing.
All of this together leads to the following listing, which again is the code within the outer try block of MoveNext(). Although there’s still a lot of code, most of it is similar to what you’ve seen before. I’ve marked the try/finally-specific aspects with comments.
switch (num)
{
    default:
        goto MethodStart;
    case 0:
        goto AwaitContinuationTrampoline;   // Jumps to just before the try block
}
MethodStart:
Console.WriteLine("Before try block");
AwaitContinuationTrampoline:
try
{
    switch (num)                            // Trampoline within the try block
    {
        default:
            goto TryBlockStart;
        case 0:
            goto AwaitContinuation;
    }
TryBlockStart:
    Console.WriteLine("Before await");
    TaskAwaiter awaiter = Task.Delay(this.delay).GetAwaiter();
    if (awaiter.IsCompleted)
    {
        goto GetAwaitResult;
    }
    this.state = num = 0;
    this.awaiter = awaiter;
    this.builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
    return;
AwaitContinuation:                          // Real continuation target
    awaiter = this.awaiter;
    this.awaiter = default(TaskAwaiter);
    this.state = num = -1;
GetAwaitResult:
    awaiter.GetResult();
    Console.WriteLine("After await");
}
finally
{
    if (num < 0)                            // Skips the original finally code if you're pausing
    {
        Console.WriteLine("In finally block");
    }
}
Console.WriteLine("After finally block");
That’s the final decompilation in the chapter, I promise. I wanted to get to that level of complexity to help you navigate the generated code if you ever need to. That’s not to say you won’t need to keep your wits about you when looking through it, particularly bearing in mind the many transformations the compiler can perform to make the code simpler than what I’ve shown. As I said earlier, where I’ve always used a switch statement for “jump to X” pieces of code, the compiler can sometimes use simpler branching code. Consistency in multiple situations is important when reading source code, but that doesn’t matter to the compiler.
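If you’d like to experiment with the trampoline and the guarded finally block without any async infrastructure, the following sketch simulates them with a hand-written resumable method. Everything here is my own invention for illustration: the TrampolineDemo class, its Log list, and the convention that MoveNext returns false to mean “paused” stand in for the awaiter and builder calls in the real generated code.

```csharp
using System;
using System.Collections.Generic;

public sealed class TrampolineDemo
{
    private int state = -1;
    public readonly List<string> Log = new List<string>();

    // Returns false when "pausing" (standing in for an incomplete await)
    // and true when the method has run to completion.
    public bool MoveNext()
    {
        int num = state;
        switch (num)
        {
            default:
                goto MethodStart;
            case 0:
                goto Trampoline;        // Can't jump into the try block directly
        }
    MethodStart:
        Log.Add("Before try block");
    Trampoline:
        try
        {
            switch (num)                // Second hop, now legally inside the try block
            {
                default:
                    goto TryBlockStart;
                case 0:
                    goto Continuation;
            }
        TryBlockStart:
            Log.Add("Before await");
            state = num = 0;            // Record where to resume, then pause
            return false;
        Continuation:
            state = num = -1;
            Log.Add("After await");
        }
        finally
        {
            if (num < 0)                // Non-negative means pausing: skip the user's finally code
            {
                Log.Add("In finally block");
            }
        }
        Log.Add("After finally block");
        state = -2;
        return true;
    }

    public static void Main()
    {
        var demo = new TrampolineDemo();
        while (!demo.MoveNext())
        {
            Console.WriteLine("(paused)");
        }
        foreach (string line in demo.Log)
        {
            Console.WriteLine(line);
        }
    }
}
```

The first call logs “Before try block” and “Before await” and then pauses; the finally block still runs on the way out, but the guard means it logs nothing. The second call bounces through both switch statements, logs “After await”, and only then executes the original finally code.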
One of the aspects I’ve skimmed over so far is why awaiters have to implement INotifyCompletion but can also implement ICriticalNotifyCompletion, and the effect that has on the generated code. Let’s take a closer look now.
In section 5.2.2, I described synchronization contexts, which are used to govern the thread that code executes on. This is just one of many contexts in .NET, although it’s probably the best known. Context provides an ambient way of maintaining information transparently. For example, SecurityContext keeps track of the current security principal and code access security. You don’t need to pass all that information around explicitly; it just follows your code, doing the right thing in almost all cases. A single class is used to manage all the other contexts: ExecutionContext.
I almost didn’t include this section. It’s at the very limits of my knowledge about async. If you ever need to know the intimate details, you’ll want to know far more about the topic than I’ve included here.
I’ve covered this at all only because otherwise there’d be no explanation whatsoever for having both AwaitOnCompleted and AwaitUnsafeOnCompleted in the builder or why awaiters usually implement ICriticalNotifyCompletion.
As a reminder, Task and Task<T> manage the synchronization context for any tasks being awaited. If you’re on a UI thread and you await a task, the continuation of your async method will be executed on the UI thread, too. You can opt out of that by using Task.ConfigureAwait. You need that in order to explicitly say “I know I don’t need the rest of my method to execute in the same synchronization context.” Execution contexts aren’t like that; you pretty much always want the same execution context when your async method continues, even if it’s on a different thread.
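One way to see the difference for yourself is with AsyncLocal<T>, whose values are stored in the execution context. I haven’t discussed that type in this chapter, so treat the following as a small sketch under that assumption; the ExecutionContextFlowDemo class and its member names are hypothetical. The method opts out of the synchronization context with ConfigureAwait(false), but the ambient value is still there when the method continues.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ExecutionContextFlowDemo
{
    // AsyncLocal<T> values travel with the execution context.
    private static readonly AsyncLocal<string> Ambient = new AsyncLocal<string>();

    public static async Task<string> ReadAmbientAfterAwaitAsync()
    {
        Ambient.Value = "set before await";
        // Opts out of the synchronization context only; the continuation
        // will typically run on a thread-pool thread.
        await Task.Delay(50).ConfigureAwait(false);
        // The execution context has been restored for the continuation,
        // so the ambient value is still visible here.
        return Ambient.Value;
    }

    public static void Main()
    {
        // Prints "set before await"
        Console.WriteLine(ReadAmbientAfterAwaitAsync().GetAwaiter().GetResult());
    }
}
```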
This preservation of the execution context is called flow. An execution context is said to flow across await expressions, meaning that all your code operates in the same execution context. What makes sure that happens? Well, AsyncTaskMethodBuilder always does, and TaskAwaiter sometimes does. This is where things get tricky.
The INotifyCompletion.OnCompleted method is just a normal method; anyone can call it. By contrast, ICriticalNotifyCompletion.UnsafeOnCompleted is marked with [SecurityCritical]. It can be called only by trusted code, such as the framework’s AsyncTaskMethodBuilder class.
If you ever write your own awaiter class and you care about running code correctly and safely in partially trusted environments, you should ensure that your INotifyCompletion.OnCompleted code flows the execution context (via ExecutionContext.Capture and ExecutionContext.Run). You can also implement ICriticalNotifyCompletion and not flow the execution context in that case, trusting that the async infrastructure will already have done so. Effectively, this is an optimization for the common case in which awaiters are used only by the async infrastructure. There’s no point in capturing and restoring the execution context twice in cases where you can safely do it only once.
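As a rough sketch of that advice, here’s a hypothetical awaiter that wraps a TaskAwaiter. The FlowingAwaiter and FlowingAwaiterDemo names are mine, and I’m again assuming AsyncLocal<T> as a convenient way of observing the execution context. OnCompleted captures the context and restores it around the continuation; UnsafeOnCompleted does neither and leaves that job to whoever calls it.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Security;
using System.Threading;
using System.Threading.Tasks;

public readonly struct FlowingAwaiter : ICriticalNotifyCompletion
{
    private readonly TaskAwaiter inner;

    public FlowingAwaiter(TaskAwaiter inner) { this.inner = inner; }

    public bool IsCompleted { get { return inner.IsCompleted; } }

    public void GetResult() { inner.GetResult(); }

    // Anyone can call this, so flow the execution context ourselves.
    public void OnCompleted(Action continuation)
    {
        ExecutionContext context = ExecutionContext.Capture();
        if (context == null)
        {
            // Flow has been suppressed, so there's nothing to restore.
            UnsafeOnCompleted(continuation);
            return;
        }
        UnsafeOnCompleted(
            () => ExecutionContext.Run(context, state => ((Action)state)(), continuation));
    }

    // Trusted callers only: assume the caller has already handled the context.
    [SecurityCritical]
    public void UnsafeOnCompleted(Action continuation)
    {
        inner.UnsafeOnCompleted(continuation);
    }
}

public static class FlowingAwaiterDemo
{
    private static readonly AsyncLocal<string> Ambient = new AsyncLocal<string>();

    // Calls OnCompleted by hand (no async method involved) and reports the
    // ambient value the continuation observed.
    public static string ObserveViaOnCompleted()
    {
        Ambient.Value = "caller";
        var observed = new TaskCompletionSource<string>();
        var awaiter = new FlowingAwaiter(Task.Delay(20).GetAwaiter());
        awaiter.OnCompleted(() => observed.SetResult(Ambient.Value));
        return observed.Task.GetAwaiter().GetResult();
    }

    public static void Main()
    {
        Console.WriteLine(ObserveViaOnCompleted());   // Prints "caller"
    }
}
```

I’ve deliberately built OnCompleted on top of UnsafeOnCompleted so the only difference between the two paths is the capture-and-run wrapper, which makes the cost the async infrastructure avoids easy to see.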
When compiling an async method, the compiler will create a call to either builder.AwaitOnCompleted or builder.AwaitUnsafeOnCompleted at each await expression, depending on whether the awaiter implements ICriticalNotifyCompletion. Those builder methods are generic and have constraints to ensure that the awaiters that are passed into them implement the appropriate interface.
If you ever implement your own custom task type (and again, that’s extremely unlikely for anything other than educational purposes), you should follow the same pattern as AsyncTaskMethodBuilder: capture the execution context in both AwaitOnCompleted and AwaitUnsafeOnCompleted, so it’s safe to call ICriticalNotifyCompletion.UnsafeOnCompleted when you’re asked to. Speaking of custom tasks, let’s review the requirements for a custom task builder now that you’ve seen how the compiler uses AsyncTaskMethodBuilder.
Listing 6.11 shows a repeat of the builder part of listing 5.10, where you first looked at custom task types. The set of methods may feel a lot more familiar now after you’ve looked at so many decompiled state machines. You can use this section as a reminder of how the methods on AsyncTaskMethodBuilder are called, as the compiler treats all builders the same way.
public class CustomTaskBuilder<T>
{
    public static CustomTaskBuilder<T> Create();

    public void Start<TStateMachine>(ref TStateMachine stateMachine)
        where TStateMachine : IAsyncStateMachine;

    public CustomTask<T> Task { get; }

    public void AwaitOnCompleted<TAwaiter, TStateMachine>
        (ref TAwaiter awaiter, ref TStateMachine stateMachine)
        where TAwaiter : INotifyCompletion
        where TStateMachine : IAsyncStateMachine;

    public void AwaitUnsafeOnCompleted<TAwaiter, TStateMachine>
        (ref TAwaiter awaiter, ref TStateMachine stateMachine)
        where TAwaiter : ICriticalNotifyCompletion
        where TStateMachine : IAsyncStateMachine;

    public void SetStateMachine(IAsyncStateMachine stateMachine);

    public void SetException(Exception exception);
    public void SetResult(T result);
}
I’ve grouped the methods in the normal chronological order in which they’re called.
The stub method calls Create to create a builder instance as part of the newly created state machine. It then calls Start to make the state machine take the first step and returns the result of the Task property.
Within the state machine, each await expression will generate a call to AwaitOnCompleted or AwaitUnsafeOnCompleted as discussed in the previous section. Assuming a task-like design, the first such call will end up calling IAsyncStateMachine.SetStateMachine, which will in turn call the builder’s SetStateMachine so that any boxing is resolved in a consistent way. See section 6.1.4 for a reminder of the details.
Finally, a state machine indicates that the async operation has completed by calling either SetException or SetResult on the builder. That final state should be propagated to the custom task that was originally returned by the stub method.
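To see those calls arrive in practice, here’s a deliberately lazy sketch of a builder that satisfies the compiler by forwarding every call to the framework’s AsyncTaskMethodBuilder<T>. It isn’t how you’d write a builder with genuinely different behavior; it’s just the smallest thing I could come up with that compiles and runs. The CustomTask<T> wrapper and the CustomTaskDemo class are hypothetical, and I’m assuming a framework version that includes AsyncMethodBuilderAttribute.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical task type; the attribute tells the compiler which builder to use.
[AsyncMethodBuilder(typeof(CustomTaskBuilder<>))]
public sealed class CustomTask<T>
{
    private readonly Task<T> inner;
    internal CustomTask(Task<T> inner) { this.inner = inner; }
    public TaskAwaiter<T> GetAwaiter() { return inner.GetAwaiter(); }
}

public sealed class CustomTaskBuilder<T>
{
    // Assumption: all the real work is delegated to the framework's builder,
    // which also takes care of capturing the execution context.
    private AsyncTaskMethodBuilder<T> inner = AsyncTaskMethodBuilder<T>.Create();
    private CustomTask<T> task;

    public static CustomTaskBuilder<T> Create() { return new CustomTaskBuilder<T>(); }

    public void Start<TStateMachine>(ref TStateMachine stateMachine)
        where TStateMachine : IAsyncStateMachine
    {
        inner.Start(ref stateMachine);
    }

    public CustomTask<T> Task
    {
        get { return task ?? (task = new CustomTask<T>(inner.Task)); }
    }

    public void AwaitOnCompleted<TAwaiter, TStateMachine>
        (ref TAwaiter awaiter, ref TStateMachine stateMachine)
        where TAwaiter : INotifyCompletion
        where TStateMachine : IAsyncStateMachine
    {
        inner.AwaitOnCompleted(ref awaiter, ref stateMachine);
    }

    public void AwaitUnsafeOnCompleted<TAwaiter, TStateMachine>
        (ref TAwaiter awaiter, ref TStateMachine stateMachine)
        where TAwaiter : ICriticalNotifyCompletion
        where TStateMachine : IAsyncStateMachine
    {
        inner.AwaitUnsafeOnCompleted(ref awaiter, ref stateMachine);
    }

    public void SetStateMachine(IAsyncStateMachine stateMachine)
    {
        inner.SetStateMachine(stateMachine);
    }

    public void SetException(Exception exception) { inner.SetException(exception); }
    public void SetResult(T result) { inner.SetResult(result); }
}

public static class CustomTaskDemo
{
    // The compiler uses CustomTaskBuilder<int> for this method.
    public static async CustomTask<int> ComputeAsync()
    {
        await Task.Delay(10);
        return 42;
    }

    public static void Main()
    {
        Console.WriteLine(ComputeAsync().GetAwaiter().GetResult());   // Prints 42
    }
}
```

Making the builder a class rather than a struct keeps the sketch simple: the state machine holds only a reference to it, so it doesn’t matter when the state machine itself is copied or boxed.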
This chapter is by far the deepest dive in this book. Nowhere else do I look at the code generated by the C# compiler in such detail. To many developers, everything in this chapter would be superfluous; you don’t really need it to write correct async code in C#. But for curious developers, I hope it’s been enlightening. You may never need to decompile generated code, but having some idea of what’s going on under the hood can be useful. And if you ever do need to look at what’s going on in detail, I hope this chapter will help you make sense of what you see.
I’ve taken two chapters to cover the one major feature of C# 5. In the next short chapter, I’ll cover the remaining two features. After the details of async, they come as a bit of light relief.