Chapter 15. C# 8 and beyond

This chapter covers

  • Expressing null and non-null expectations for reference types
  • Using switch expressions with pattern matching
  • Matching patterns recursively against properties
  • Using index and range syntax for concise and consistent code
  • Using asynchronous versions of the using, foreach, and yield statements

At the time of this writing, C# 8 is still being designed. The GitHub repository shows a lot of potential features, but only a few have reached the stage of publicly available preview builds of the compiler. This chapter is educated guesswork; nothing here is set in stone. It’s almost inconceivable that all the features being considered would be included in C# 8, and I’ve restricted myself to the ones I consider reasonably likely to make the cut. I’ve provided the most detail about the features available in preview at the time of writing, but even so, that doesn’t mean further changes won’t occur.

Note

At the time of this writing, only a few C# 8 features are available in preview builds, and there are different builds with different features. The preview for nullable reference types supports only full .NET projects (rather than .NET Core SDK-style projects), which makes it harder to experiment with the feature on real code if all your projects use the new project format. I expect these limitations to be overcome in later builds, possibly by the time you read this.

We’ll start with nullable reference types.

15.1. Nullable reference types

Ah, null references. The so-called billion-dollar mistake that Tony Hoare apologized for in 2009 after introducing them in the 1960s. It’s hard to find an experienced C# developer who hasn’t been bitten by a NullReferenceException at least a few times. The C# team has a plan to tame null references, making it clearer where we should expect to find them.

15.1.1. What problem do nullable reference types solve?

As an example that I’ll expand on over the course of this section, let’s consider the classes in the following listing. If you’re following along in the downloadable source code, you’ll see that I’ve declared them as separate nested classes within each example, as the code changes over time.

Listing 15.1. Initial model before C# 8
public class Customer
{
    public string Name { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    public string Country { get; set; }
}

An address would usually contain far more information than a country, but a single property is enough for the examples in this chapter. With those classes in place, how safe is this code?

Customer customer = ...;
Console.WriteLine(customer.Address.Country);

If you know (somehow) that customer is non-null and that a customer always has an associated address, that may be fine. But how can you know that? If you know that only because you’ve looked at documentation, what has to change to make the code safer?
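To see the risk concretely, the following sketch compiles without any warnings under a C# 7 compiler and still fails at execution time when the address is missing. NullDemo and DescribeCountry are names invented for this illustration; the model classes are those of listing 15.1. Catching NullReferenceException is done here only to report the failure and isn’t a recommended practice.

```csharp
using System;

public class Customer
{
    public string Name { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    public string Country { get; set; }
}

public static class NullDemo
{
    // Returns the country, or the name of the exception if the dereference fails.
    public static string DescribeCountry(Customer customer)
    {
        try
        {
            return customer.Address.Country;
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
    }

    public static void Main()
    {
        var withAddress = new Customer
        {
            Name = "Jon",
            Address = new Address { Country = "UK" }
        };
        var withoutAddress = new Customer { Name = "Jon" };   // Address is left as null

        Console.WriteLine(DescribeCountry(withAddress));      // UK
        Console.WriteLine(DescribeCountry(withoutAddress));   // NullReferenceException
    }
}
```

Nothing in the declarations of Customer or Address tells the caller which of the two outcomes to expect.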

Since C# 2, we’ve had nullable value types, non-nullable value types, and implicitly nullable reference types. A grid of nullable/non-nullable against value/reference types has had three of the four cells filled in, but the fourth has been elusive, as shown in table 15.1.

Table 15.1. Support for nullability and non-nullability for reference and value types in C# 7

                | Nullable                | Non-nullable
Reference types | Implicit                | Not supported
Value types     | Nullable<T> or ? suffix | Default

The fact that there’s only one supported cell in the top row means we have no way of expressing an intention that some reference values may be null and others should never be null. When you run into a problem with an unexpected null value, it can be hard to determine where the fault lies unless the code has been carefully documented with null checks implemented consistently.[1]

[1] The day before I wrote this paragraph, most of my time was spent trying to track down a problem of exactly this kind. The issue is very real.

Given the huge body of .NET code that now exists with no machine-readable discrimination between references that can reasonably be null and those that must always be non-null, any attempt to rectify this situation can only be a cautious one. What can we do?

15.1.2. Changing the meaning when using reference types

The broad idea of the null safety feature is to assume that when a developer is intentionally discriminating between non-null and nullable reference types, the default is to be non-nullable. New syntax is introduced for nullable reference types: string is a non-nullable reference type, and string? is a nullable reference type. The grid then evolves, as shown in table 15.2.

Table 15.2. Support for nullability and non-nullability for reference and value types in C# 8

                | Nullable                                                      | Non-nullable
Reference types | No CLR type representation, but the ? suffix as an annotation | Default when nullable reference type support is enabled
Value types     | Nullable<T> or ? suffix                                       | Default

That sounds like the opposite of caution; it’s changing the meaning of all C# code that deals with reference types! Turning on the feature changes the default from nullable to non-nullable. The expectation is that there are far fewer places where a null reference is intended to be valid than places where it should never occur.

Let’s go back to our customer and address example. Without any changes to the code, the compiler warns us that our Customer and Address classes are allowing non-nullable properties to be uninitialized. That can be fixed by adding constructors with non-nullable parameters, as shown in the following listing.

Listing 15.2. Model with non-nullable properties everywhere
public class Customer
{
    public string Name { get; set; }
    public Address Address { get; set; }

    public Customer(string name, Address address) =>
        (Name, Address) = (name, address);
}

public class Address
{
    public string Country { get; set; }

    public Address(string country) =>
        Country = country;
}

At this point, you “can’t” construct a Customer without providing a non-null name and address, and you “can’t” construct an Address without providing a non-null country. I’ve deliberately put can’t in scare quotes for reasons you’ll see in section 15.1.4.

But now consider our console output code again:

Customer customer = ...;
Console.WriteLine(customer.Address.Country);

This is safe, assuming everyone is obeying the contracts properly. Not only will it not throw an exception, but you won’t be passing a null value to Console.WriteLine, because the country in the address won’t be null.

Okay, so the compiler can check that things aren’t null. But what about when you want to allow null values? It’s time to explore the new syntax I mentioned before.

15.1.3. Enter nullable reference types

The syntax used to indicate a reference type that can be null is designed to be immediately familiar. It’s the same as the syntax for nullable value types: adding a question mark after the type name. This can be used in most places that a reference type can appear. For example, consider this method:

string FirstOrSecond(string? first, string second) =>
    first ?? second;

The signature of the method shows the following:

  • The type of first is nullable string.
  • The type of second is non-nullable string.
  • The return type is non-nullable string.

The compiler then uses that information to warn you if you try to misuse a value that might be null. For example, it can warn you if you do the following:

  • Assign a possibly null value to a non-nullable variable or property.
  • Pass a possibly null value as an argument for a non-nullable parameter.
  • Dereference a possibly null value.
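As a rough illustration of those three warnings, consider the sketch below. It assumes a compiler in which the feature is switched on with a #nullable enable directive (in the preview described here, checking is on without it). WarningDemo, Find, and Measure are hypothetical names. Each value is deliberately non-null at execution time so that the program runs to completion; the warnings reflect what the compiler can prove, not what happens to be true on a given run.

```csharp
#nullable enable   // Assumption: the compiler in use requires this directive to enable checking.
using System;

public static class WarningDemo
{
    // Returns null for unknown keys, so the return type is string?.
    public static string? Find(string key) =>
        key == "a" ? "alpha" : null;

    // The parameter is declared as non-nullable.
    public static int Measure(string text) => text.Length;

    public static void Main()
    {
        string? first = Find("a");
        string assigned = first;           // Warning expected: possibly null value assigned to a non-nullable variable

        string? second = Find("a");
        int measured = Measure(second);    // Warning expected: possibly null argument for a non-nullable parameter

        string? third = Find("a");
        int length = third.Length;         // Warning expected: dereference of a possibly null value

        Console.WriteLine($"{assigned} {measured} {length}");   // alpha 5 5
    }
}
```

A separate variable is used for each case so that one warning doesn’t influence what the compiler reports for the next.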

Let’s build this into our customer model. Let’s suppose the customer address could be null. You need to modify the Customer class as follows:

  • Change the property type.
  • Either remove the constructor parameter for the address, make it nullable, or overload it.

The Address type itself doesn’t change, only how it’s used. The following listing shows the new Customer class. I’ve chosen to remove the constructor parameter for the address.

Listing 15.3. Making the customer Address property nullable
public class Customer
{
    public string Name { get; set; }
    public Address? Address { get; set; }   // 1

    public Customer(string name) =>         // 2
        Name = name;
}

  • 1 The address is now optional information.
  • 2 Removes the address parameter from the constructor

Great, you’ve now made your intent clear: the Name property won’t be null, but the Address property might be. The compiler now gives you a different warning when you try to display the country of the customer’s address:

CS8602 Possible dereference of a null reference.

Great! It’s now identifying the problem you originally faced, which caused a NullReferenceException. How do you fix the problem? It’s time to look at the behavior of nullable reference types rather than just the syntax.

15.1.4. Nullable reference types at compile time and execution time

One golden rule of the new feature is that no behavior is changed implicitly. Even though the meaning of your code has changed to assume an intent of non-nullable types, the behavior hasn’t. The only difference is at compile time in terms of the warnings generated. No new real types are involved; the CLR has no notion of a nullable versus non-nullable reference type. Attributes are used to propagate nullability information, but that’s all. This is similar to the extra information about tuple element names, which are not part of the type at execution time. This has two important consequences:

  • Defensive programming remains a best practice. With the code you’ve written so far, it’s possible for Name to be null, because a user could be ignoring warnings or using code from another project that uses only C# 7. Argument validation is still important.
  • To understand the feature fully, you need to understand the compiler warnings. You definitely shouldn’t just ignore them; they’re present to provide value.

Let’s look at the warning you’re currently facing and consider all the ways you could avoid it. You currently have this:

Console.WriteLine(customer.Address.Country);

The compiler is correctly telling you this is dangerous because customer.Address could be null. You’ll look at three ways you can make the code safer. First, you can use the null conditional and null coalescing operators in tandem, as shown in the next listing.

Listing 15.4. Safe dereferencing using the null conditional operator
Console.WriteLine(customer.Address?.Country ?? "(Address unknown)");

If customer.Address is null, the expression customer.Address?.Country won’t try to evaluate the Country property, and the result of the expression will be null. The null coalescing operator then provides a default value to print. The compiler understands that you’re no longer trying to dereference anything that might be null, and the warning goes away.
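Here is a minimal runnable sketch of the same expression, exercised with and without an address. SafeDereference and CountryOrDefault are invented names, the model follows listing 15.3, and the #nullable enable directive is an assumption about the compiler in use.

```csharp
#nullable enable   // Assumption: the compiler in use requires this directive to enable checking.
using System;

public class Address
{
    public string Country { get; set; }

    public Address(string country) =>
        Country = country;
}

public class Customer
{
    public string Name { get; set; }
    public Address? Address { get; set; }

    public Customer(string name) =>
        Name = name;
}

public static class SafeDereference
{
    // ?. yields null when Address is null; ?? then supplies the fallback text.
    public static string CountryOrDefault(Customer customer) =>
        customer.Address?.Country ?? "(Address unknown)";

    public static void Main()
    {
        var customer = new Customer("Jon");
        Console.WriteLine(CountryOrDefault(customer));   // (Address unknown)

        customer.Address = new Address("UK");
        Console.WriteLine(CountryOrDefault(customer));   // UK
    }
}
```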

You may be a little uneasy with this at the moment. It’s easy to get lost in a sea of question marks if you’re not careful. I believe that C# developers will become more comfortable with this over time, but it’s not the only solution available. You could take a more verbose approach that’s simple to follow, as shown in the following listing.

Listing 15.5. Checking a reference with a local variable
Address? address = customer.Address;      // 1
if (address != null)                      // 2
{                                         // 2
    Console.WriteLine(address.Country);   // 2
}                                         // 2
else
{
    Console.WriteLine("(Address unknown)");
}

  • 1 Extracts address to a new local variable
  • 2 Checks for nullity and dereferences only if non-null

There’s an interesting point to note here: the compiler needs to keep track of more than just the type of the variable. If the rule were as simple as “dereferencing a value of a nullable reference type causes a warning,” this code would still generate a warning, despite being safe. Instead, the compiler keeps track of whether a variable’s value can be null at each place in the code in a similar manner to the way it keeps track of definite assignment. By the time you reach the body of the if statement, the compiler knows that the value of address can’t be null, so it doesn’t warn when you dereference it. Our third approach, shown in the following listing, is similar to the second one, but without the local variable.

Listing 15.6. Checking a reference with repeated property access
if (customer.Address != null)
{
    Console.WriteLine(customer.Address.Country);
}
else
{
    Console.WriteLine("(Address unknown)");
}

Even when you understand how the second example compiles without a warning, listing 15.6 can be a little surprising. The compiler doesn’t just keep track of whether a variable value can be null; it does that for properties, too. It assumes that if you access the same property on the same value twice, the result will be the same both times.

This may worry you. It means the feature isn’t guaranteed to stop your code from dereferencing null values. Another thread could modify the Address property between the two calls you’ve seen, or the Address property itself could be written to randomly return a null value sometimes. There are other ways you can fool the compiler into believing your code is fine when it’s not absolutely safe. This is known and accepted by the C# design team, because it’s a pragmatic balance between safety and awkwardness. Code using the C# 8 features will be much more null-safe than code written before, but making it 100% safe would almost certainly require more-invasive changes that would put a lot of developers off. So long as you understand the limits of what it’s trying to achieve, you’ll be fine.
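The second of those scenarios can be sketched as follows. FlakyCustomer is a deliberately perverse, hypothetical class whose Address getter alternates between a real value and null on successive reads; FoolingTheCompiler and Describe are also invented names. The #nullable enable directive is an assumption about the compiler in use, and the NullReferenceException is caught only so that the outcome can be reported.

```csharp
#nullable enable   // Assumption: the compiler in use requires this directive to enable checking.
using System;

public class Address
{
    public string Country { get; }

    public Address(string country) =>
        Country = country;
}

public class FlakyCustomer
{
    private readonly Address address = new Address("UK");
    private bool returnNull = true;

    // Alternates between the real address and null each time it is read.
    public Address? Address
    {
        get
        {
            returnNull = !returnNull;
            return returnNull ? null : address;
        }
    }
}

public static class FoolingTheCompiler
{
    public static string Describe(FlakyCustomer customer)
    {
        try
        {
            if (customer.Address != null)           // First read: non-null
            {
                return customer.Address.Country;    // Second read: null, yet no warning is expected here
            }
            return "(Address unknown)";
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new FlakyCustomer()));   // NullReferenceException
    }
}
```

The code has the same shape as listing 15.6, which is exactly why the compiler accepts it.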

You’ve seen that the compiler works hard to understand what might or might not be null. What can you do when it doesn’t have as much context as you do?

15.1.5. The damn it or bang operator

There’s one additional piece of syntax you haven’t looked at yet: the dammit, or damn it, or bang operator.[2] This is an exclamation mark at the end of an expression, and it’s a way of telling the compiler to ignore whatever it thinks it knows about that expression and just treat it as non-null.

[2] I doubt that it’ll ever officially be called the damn it operator, but I suspect the name will live on in the community, just like everyone calls the Microsoft .NET Compiler Platform by its original name of Roslyn.

This is useful in two opposite situations:

  • Sometimes you have more information than the compiler does, so you know a value won’t be null, even though the compiler thinks it might be.
  • Sometimes you want to deliberately pass in a null value to check your argument validation.

Brief examples of the first situation are somewhat contrived, because you’d typically try to reorganize the code to avoid getting into that situation. In small examples, that’s almost always feasible, but it’s harder in real applications. The following listing shows a method to print the length of a string with input that can be null.

Listing 15.7. Using the bang operator to satisfy the compiler
static void PrintLength(string? text)                    // 1
{
    if (!string.IsNullOrEmpty(text))                     // 2
    {
        Console.WriteLine($"{text}: {text!.Length}");    // 3
    }
    else
    {
        Console.WriteLine("Empty or null");
    }
}

  • 1 Input can be null
  • 2 If IsNullOrEmpty returns false, it’s not null.
  • 3 Use the bang operator to convince the compiler.

In this example, you know something the compiler doesn’t in terms of the relationship between the input to string.IsNullOrEmpty and the return value. If string.IsNullOrEmpty returns false, the input can’t be null, so it’s fine to dereference that value to get the length of the string. If you just try to use text.Length, the compiler issues a warning. With text!.Length, you’re telling the compiler that you know better, effectively taking responsibility for reasoning about the value.

Now it’d be nice if the compiler did understand that input/result relationship for the string.IsNullOrEmpty method. We’ll come back to that idea in section 15.1.7.

The second use of the bang operator is far easier to demonstrate with a realistic example. I mentioned earlier that you should still validate parameters for null, because it’s still entirely possible for you to receive null values. You may then want to add a unit test for that validation, but then the compiler warns you because you’re providing a null value when you’ve said it shouldn’t be null. The following listing shows how the bang operator fixes this.

Listing 15.8. Using the bang operator in unit tests
public class Customer
{
    public string Name { get; }
    public Address? Address { get; }

    public Customer(string name, Address? address)
    {
        Name = name ?? throw new ArgumentNullException(nameof(name));
        Address = address;
    }
}

public class Address
{
    public string Country { get; }

    public Address(string country)
    {
        Country = country ??                                     
            throw new ArgumentNullException(nameof(country));    
    }
}

[Test]
public void Customer_NameValidation()
{
    Address address = new Address("UK");
    Assert.Throws<ArgumentNullException>(
        () => new Customer(null!, address));         // 1
}

  • 1 Deliberately passes in a null value for the non-nullable parameter

I’ve made the Customer and Address types immutable in listing 15.8 for simplicity. It’s interesting to note that the compiler doesn’t raise any kind of warning on the validation itself. Even though it knows the value shouldn’t be null, it doesn’t complain that the code checks whether it is null. But it does try to enforce that when you call the constructor in the test, the first argument is non-null. In an earlier version of C#, the lambda expression in the test would look like this:

() => new Customer(null, address)

That code generates a warning, as you’d want it to in almost all cases. Changing the argument to null! satisfies the compiler, and the test does what you want. This raises the question of what it’s like working with nullable reference types in practice, and, in particular, how to migrate existing code to use the feature.
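If you don’t have a test framework to hand, the same check can be sketched with a plain try/catch block. ValidationCheck and ThrowsForNullName are invented names, the Customer class is trimmed to the one property under test, and #nullable enable is an assumption about the compiler in use.

```csharp
#nullable enable   // Assumption: the compiler in use requires this directive to enable checking.
using System;

public class Customer
{
    public string Name { get; }

    public Customer(string name) =>
        Name = name ?? throw new ArgumentNullException(nameof(name));
}

public static class ValidationCheck
{
    // Returns true only if the constructor rejects a null name
    // and reports the right parameter name.
    public static bool ThrowsForNullName()
    {
        try
        {
            new Customer(null!);   // null! silences the warning for this deliberate misuse
            return false;
        }
        catch (ArgumentNullException e)
        {
            return e.ParamName == "name";
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsForNullName());   // True
    }
}
```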

15.1.6. Experiences of nullable reference type migration

There’s no better way to get a feel for how a feature works than to try it. I used the C# 8 preview build with Noda Time to see how much work would be required to make it warning free and to see whether it found any bugs. This section describes this experience and some guidelines I found myself following. Your code may face different challenges, but I suspect there’ll be plenty of commonality.

Using attributes to express nullable intent before C# 8

For a long time, Noda Time has used attributes (at least for all public methods) to indicate whether reference type parameters can be null and likewise whether return values may return null. For example, here’s the signature for a method in IDateTimeZoneProvider:

[CanBeNull] DateTimeZone GetZoneOrNull([NotNull] string id);

This shows that the argument for the id parameter must not be null, but the method may return a null reference. I’d already expressed the intent around nullity, just not in a way that the C# compiler understood. That meant my first pass was just to go to all the places in the code where I’d said that null values were allowed and change them to use nullable reference types.

I happened to use the JetBrains annotations provided with ReSharper. This allows ReSharper to perform the same kind of inspection that C# 8 does in the language. I won’t go into the details of these annotations other than to note that they’re available. You don’t have to use a third-party set of annotations at all, however. You can easily create your own attributes and apply them right now. Even without any tooling support, this can make your code easier to maintain, and you’ll be in a better position to move to the C# 8 nullable reference types in the future.
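As a sketch of rolling your own annotations, the code below declares two marker attributes and applies them in the same way as the signature shown earlier. The attributes are declared locally in a hypothetical Example.Annotations namespace rather than taken from any library, and IZoneNameLookup and FixedZoneNameLookup are invented for illustration; they aren’t part of Noda Time. No nullable reference type support is assumed, so this compiles with C# 7.

```csharp
using System;
using System.Collections.Generic;

namespace Example.Annotations
{
    // Marker attributes: they carry no behavior, only documented intent.
    [AttributeUsage(AttributeTargets.Method | AttributeTargets.Parameter |
                    AttributeTargets.Property | AttributeTargets.Field)]
    public sealed class CanBeNullAttribute : Attribute { }

    [AttributeUsage(AttributeTargets.Method | AttributeTargets.Parameter |
                    AttributeTargets.Property | AttributeTargets.Field)]
    public sealed class NotNullAttribute : Attribute { }

    public interface IZoneNameLookup
    {
        [CanBeNull] string GetZoneNameOrNull([NotNull] string id);
    }

    public sealed class FixedZoneNameLookup : IZoneNameLookup
    {
        private readonly Dictionary<string, string> names =
            new Dictionary<string, string> { { "Europe/London", "London" } };

        [CanBeNull]
        public string GetZoneNameOrNull([NotNull] string id)
        {
            // The attribute documents the contract; the check enforces it.
            if (id == null)
            {
                throw new ArgumentNullException(nameof(id));
            }
            string name;
            return names.TryGetValue(id, out name) ? name : null;
        }
    }
}
```

When the time comes to migrate, each [CanBeNull] marks a place that’s a candidate for a ? suffix.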

Iteration is natural

After this first pass, I had about 100 warnings. I went through and fixed most of those and then rebuilt. After the second pass, I had about 110 warnings—more than before! I went through and fixed most of those and then rebuilt. After the third pass, I still had about 100 warnings. I went through and fixed most of those and then rebuilt.

I don’t remember how many iterations this took, but it’s not a sign of anything being wrong. The process of making a codebase nullable-reference-type compliant is like playing whack-a-mole: you decide to change the nullability in one place, and then that can cause warnings everywhere that value is used. You change those, and the problem moves again. Decisions about nullability propagate through the code and need careful checking. This is fine and is what you should expect to happen.

But when part of the code needs a value to be nullable and another part needs it to be non-nullable, you’ve discovered a problem. This isn’t a problem that C# 8 has introduced; it’s a problem that the feature has revealed. How you handle it will be context specific.

Best practices for using the bang operator

If you have to use the bang operator in production code, add a comment to explain why you did so. If you use a nicely searchable format (for example, including NULLABLEREF in the comment), you’ll be able to find them later. You may be able to remove the operator later through further tooling improvements. It’s not that using the operator is wrong, but it’s an assertion that you know better than the compiler, and I prefer not to trust myself that much.
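A commented use of the operator might look like the following sketch. The Report class and its members are hypothetical; the situation shown (a field populated by a helper method, which the compiler doesn’t follow across the call) is just one example of knowing more than the compiler does. The #nullable enable directive is an assumption about the compiler in use.

```csharp
#nullable enable   // Assumption: the compiler in use requires this directive to enable checking.
using System;

public class Report
{
    private string? cachedTitle;

    private void EnsureLoaded()
    {
        if (cachedTitle == null)
        {
            cachedTitle = "Untitled";
        }
    }

    public int TitleLength()
    {
        EnsureLoaded();
        // NULLABLEREF: EnsureLoaded always assigns cachedTitle, but the
        // compiler doesn't track that across the method call.
        return cachedTitle!.Length;
    }

    public static void Main()
    {
        Console.WriteLine(new Report().TitleLength());   // 8
    }
}
```

Searching the codebase for NULLABLEREF then lists every place where a human assertion has replaced a compiler check.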

I used the operator more often in test code and mostly for performing the sort of validation tests you saw in the previous section. Beyond that, if I expect a value to be non-null because of the way I’ve set up the test, I’m usually comfortable using the operator to silence the compiler, particularly if I know that the value will be validated by the code I’m calling afterward anyway. If I’m wrong, the result should be the test failing with either an ArgumentNullException or NullReferenceException, which is fine, as I’d still know that my assumptions were invalid. Arguably, test code should be less defensive than production code in general; instead of trying to handle unexpected situations in a graceful way, it’s fine for them to fail.

Null-inconsistent generics

I found it odd to implement IEqualityComparer<T> for reference types in Noda Time, because it was defined long before nullable reference types were considered. Both Equals and GetHashCode are defined in terms of parameters of type T, but they’re inconsistent in terms of null handling: Equals is meant to handle null values, but GetHashCode is meant to throw an ArgumentNullException.

It’s unclear how this should be expressed in implementations. If I have an equality comparer for the Period class, should I implement IEqualityComparer<Period?> to allow null arguments or IEqualityComparer<Period> to prohibit them? Either way, callers could be surprised, at either compile time or execution time.

Beyond just an implementation issue, it’s unclear to me how this could be expressed more clearly in the interface itself. More language design work may be required here in order to express how generic type parameters should be handled. Just using T? in the interface would feel wrong, as you wouldn’t want to accept Nullable<T> when T is a value type.

Although I happened to encounter this with IEqualityComparer<T>, I anticipate the same issue cropping up in other interfaces and even in generic classes. I’m mostly mentioning it here so that you don’t think you’ve done anything wrong when you come across it.
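The inconsistency is easy to reproduce. The sketch below follows the documented contract of IEqualityComparer<T> for a hypothetical Person class (used instead of Period so that nothing here is mistaken for Noda Time code). It’s written without nullable annotations, as it would have been before C# 8, so a compiler with nullability checking enabled may warn about it.

```csharp
using System;
using System.Collections.Generic;

public sealed class Person
{
    public string Name { get; }

    public Person(string name)
    {
        Name = name;
    }
}

public sealed class PersonNameComparer : IEqualityComparer<Person>
{
    // Equals is expected to cope with null arguments...
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y))
        {
            return true;
        }
        if (x is null || y is null)
        {
            return false;
        }
        return x.Name == y.Name;
    }

    // ...whereas GetHashCode is expected to reject them.
    public int GetHashCode(Person obj)
    {
        if (obj is null)
        {
            throw new ArgumentNullException(nameof(obj));
        }
        return obj.Name?.GetHashCode() ?? 0;
    }

    public static void Main()
    {
        var comparer = new PersonNameComparer();
        Console.WriteLine(comparer.Equals(null, null));                // True
        Console.WriteLine(comparer.Equals(new Person("Jon"), null));   // False
        try
        {
            comparer.GetHashCode(null);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("ArgumentNullException");
        }
    }
}
```

Neither IEqualityComparer<Person> nor IEqualityComparer<Person?> describes both methods accurately, which is the dilemma described above.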

The end result

The Noda Time codebase isn’t huge, but it’s not tiny either. The whole process took me about five hours, including time diagnosing a bug in the preview build of Roslyn. In the end, I found a bug (now fixed) in Noda Time around inconsistent handling of an odd situation where TimeZoneInfo.Local returns null in some environments on Mono. I also found some missing annotations and had to clarify the intent for some internal members.

I was pleased with the result; knowing the compiler was checking the consistency of the code improves my confidence in it. Additionally, after I’ve published a version of Noda Time built with C# 8, anyone using the library from C# 8 will benefit from the extra information. This will help move more errors from execution time to compile time, giving users more confidence in how they’re using Noda Time. It’s a win-win situation.

All of this experience was with the preview from the first half of 2018. This isn’t the end state of the language design or the implementation, however. Let’s take a speculative look at the future.

15.1.7. Future improvements

In June 2018, I spent time at conferences and user groups with Mads Torgersen, the lead of the C# language design team. I traveled with a laundry list of feature requests and issues based on my experience with Noda Time, and his responses reassured me about the future of the features.

The C# team is aware that the preview that’s available already isn’t quite ready for mainstream adoption. A few things need a bit more work, but the preview allows the team to gather early feedback. The changes listed here won’t be the only ones, but they’re the ones I was most interested in.

Providing the compiler with more semantic information

When I introduced the bang operator in section 15.1.5, I showed that the compiler didn’t understand the semantics of string.IsNullOrEmpty. (The compiler doesn’t infer that if the method returns false, the input couldn’t have been null.) This isn’t the only situation in which a relationship between input and output should be able to help the compiler. Here are three examples that feel like they should compile without warnings (including string.IsNullOrEmpty again for completeness):

string? a = ...;
if (!string.IsNullOrEmpty(a))
{
    Console.WriteLine(a.Length);
}

object? b = ...;
if (!ReferenceEquals(b, null))
{
    Console.WriteLine(b.GetHashCode());
}

XElement c = ...;
string d = (string) c;

In each case, the semantics of the code you’re calling are important. For these examples, the compiler would need to know the following:

  • If the result of string.IsNullOrEmpty is false, the input can’t be null.
  • If the result of ReferenceEquals is false and one of the inputs is known to be a null reference, the other input can’t be null.
  • If the input to the XElement to string conversion operator is non-null, the output is also non-null.

These are all examples of relationships between inputs and outputs, and those relationships can’t be expressed at the moment. I suspect that most uses of the bang operator in the preview build could be avoided if the compiler understood these relationships. How can the compiler get that extra information?

One approach that could work for these specific examples would be for the compiler to have the information hardcoded. That would be easy for the C# design team but unsatisfactory in other ways. It’d put the framework libraries on a different footing to third-party libraries, which would be annoying. I may want to express relationships like this in Noda Time, for example, which would make it more pleasant to use.

It’s likely that the C# team will instead design a whole new mini-language that can be expressed in attributes to give the compiler the extra semantic information it needs to be smarter about determining whether a particular value should be considered “definitely not null.” This will require a lot of work to design and implement but will provide a much more complete solution.
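One shape such attributes could take is sketched below. It assumes a toolchain that provides NotNullWhenAttribute in the System.Diagnostics.CodeAnalysis namespace (the case from .NET Core 3.0 onward, which is later than the preview described here) and a compiler that honors the #nullable enable directive. TextChecks, IsBlank, and LengthOrZero are hypothetical names.

```csharp
#nullable enable   // Assumption: the compiler in use requires this directive to enable checking.
using System;
using System.Diagnostics.CodeAnalysis;

public static class TextChecks
{
    // The attribute states: when this method returns false, text was not null.
    public static bool IsBlank([NotNullWhen(false)] string? text) =>
        text == null || text.Trim().Length == 0;

    public static int LengthOrZero(string? text)
    {
        if (!IsBlank(text))
        {
            return text.Length;   // No warning and no bang operator needed
        }
        return 0;
    }

    public static void Main()
    {
        Console.WriteLine(LengthOrZero("hello"));   // 5
        Console.WriteLine(LengthOrZero(null));      // 0
        Console.WriteLine(LengthOrZero("   "));     // 0
    }
}
```

Because the relationship is declared on the method rather than hardcoded in the compiler, a third-party library can use it just as easily as the framework can.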

Deeper thinking about generics

Generics present interesting challenges for nullability design. I mentioned one example when implementing IEqualityComparer<T>, but the issue goes well beyond that. Consider the following simple class that’s already valid in C# 7:

public class Wrapper<T>
{
    public T Value { get; set; }
}

Should that be valid, and what does it mean? In particular, what’s the result of constructing an instance of it without setting the Value property?

  • For Wrapper<int>, the value of Value will be 0 by default.
  • For Wrapper<int?>, the value of Value will be the null value for int? by default.
  • For Wrapper<string>, the value of Value will be a null reference by default. That’s bad, as it goes against the type of Value being the non-nullable string type.
  • For Wrapper<string?>, the value of Value will be a null reference by default. That’s okay, as the type of Value is the nullable string type.

It gets even more confusing when you consider that at execution time, Wrapper<int> and Wrapper<int?> will be different CLR types, but Wrapper<string> and Wrapper<string?> will be the same CLR type.
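These defaults and type identities can be checked directly. In the sketch below, WrapperDemo and SameRuntimeType are invented names, and #nullable enable is an assumption about the compiler in use (it’s needed so that Wrapper<string?> can be written at all). Depending on the compiler version, the Value property may itself attract a warning.

```csharp
#nullable enable   // Assumption: the compiler in use requires this directive to enable checking.
using System;

public class Wrapper<T>
{
    public T Value { get; set; }   // Some compilers may warn that this can be left uninitialized
}

public static class WrapperDemo
{
    // Compares the execution-time types of two objects.
    public static bool SameRuntimeType(object first, object second) =>
        first.GetType() == second.GetType();

    public static void Main()
    {
        Console.WriteLine(new Wrapper<int>().Value);               // 0
        Console.WriteLine(new Wrapper<int?>().Value.HasValue);     // False
        Console.WriteLine(new Wrapper<string>().Value == null);    // True
        Console.WriteLine(new Wrapper<string?>().Value == null);   // True

        Console.WriteLine(SameRuntimeType(
            new Wrapper<int>(), new Wrapper<int?>()));             // False
        Console.WriteLine(SameRuntimeType(
            new Wrapper<string>(), new Wrapper<string?>()));       // True
    }
}
```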

I don’t know how this confusion will be resolved in C# 8, but the team is aware of it. I’m glad it’s their job rather than mine to make sense of it, as it makes my head hurt just thinking about it.

That example uses only syntax that’s valid in C# 7 and doesn’t explicitly refer to nullable types at all. What if you try to use T? within a generic type or method?

In C# 7, if you have a type parameter T, the type T? can be used only when T is constrained to be a non-nullable value type, at which point it means Nullable<T>. That’s reasonably simple, but what can you do for nullable reference types? It seems likely that you’ll need a new generic constraint of non-nullable reference type, at which point T? could be used when T is either constrained to be a non-nullable value type or is constrained to be a non-nullable reference type. I wouldn’t expect a single constraint to indicate “some non-nullable type,” because the representation of the corresponding nullable type is very different between value types and reference types.
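The difference in representation can be seen by writing the same method under each constraint. The sketch assumes a later compiler than the preview described here, one in which a class constraint inside a #nullable enable context is read as “non-nullable reference type,” so that T? is permitted. FirstOrNull, OfValues, and OfReferences are hypothetical names.

```csharp
#nullable enable   // Assumption: the compiler in use requires this directive to enable checking.
using System;

public static class FirstOrNull
{
    // Here T? means Nullable<T>, a distinct type at execution time.
    public static T? OfValues<T>(T[] items) where T : struct =>
        items.Length > 0 ? items[0] : (T?) null;

    // Here T? is the same CLR type as T; only the annotation differs.
    public static T? OfReferences<T>(T[] items) where T : class =>
        items.Length > 0 ? items[0] : null;

    public static void Main()
    {
        int? noValue = OfValues(new int[0]);
        string? noReference = OfReferences(new string[0]);

        Console.WriteLine(noValue.HasValue);               // False
        Console.WriteLine(noReference == null);            // True
        Console.WriteLine(OfValues(new[] { 42 }));         // 42
        Console.WriteLine(OfReferences(new[] { "x" }));    // x
    }
}
```

The two method bodies look almost identical, but they can’t be merged into one method with a single constraint, which is the point made above.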

Opt-in parameter validation

The only changes implemented so far have been at compile time. The IL generated by the compiler doesn’t change, and you still need to perform parameter validation to protect against code that ignores compiler warnings, uses the bang operator, or is compiled against an earlier version of C#.

That makes sense, but the validation feels like boilerplate code. The null-coalescing operator, nameof operator, and throw expressions are all features that have helped improve the code required for validation in some cases, but it’s still annoying and easy to forget.
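One common way to trim the boilerplate with C# 7 features alone is a small helper method. The Preconditions class and its CheckNotNull method below are a hypothetical sketch rather than a framework API; the Address class is the one from listing 15.8, rewritten to use the helper.

```csharp
using System;

public static class Preconditions
{
    // Returns the argument unchanged if it's non-null; otherwise throws.
    public static T CheckNotNull<T>(T argument, string paramName) where T : class =>
        argument ?? throw new ArgumentNullException(paramName);
}

public class Address
{
    public string Country { get; }

    public Address(string country) =>
        Country = Preconditions.CheckNotNull(country, nameof(country));

    public static void Main()
    {
        Console.WriteLine(new Address("UK").Country);   // UK
        try
        {
            new Address(null);
        }
        catch (ArgumentNullException e)
        {
            Console.WriteLine(e.ParamName);             // country
        }
    }
}
```

The helper shortens each check to a single expression, but you still have to remember to write it for every parameter.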

One feature under discussion is to allow an exclamation mark after a parameter name to indicate that the compiler should generate a null validation at the start of a method. Consider a method that might currently be written like this:

static void PrintLength(string text)
{
    string validated =                                            
        text ?? throw new ArgumentNullException(nameof(text));    
    Console.WriteLine(validated.Length);
}

You could instead write this:

static void PrintLength(string text!)     // 1
{
    Console.WriteLine(text.Length);
}

  • 1 Automatic null validation

It’s possible that properties could have automatic validation in the same way.

Enabling nullability checking

In the preview build I’ve used, nullability checking is turned on by default. Although you can suppress warnings in the normal way, it’s likely that the C# 8 compiler will have more nuanced settings before it launches. There are lots of different scenarios to consider.

When developers upgrade to the C# 8 compiler, they’re likely to want to do this without seeing any new warnings. This is particularly important if the project settings treat warnings as errors. I suspect this means nullability checking will be turned off by default, at least for existing projects.

Not all class libraries will embrace C# 8 at the same time. It’ll be important for code that uses C# 8 with nullability checking turned on to be able to consume libraries that haven’t migrated yet. This is likely to be geared toward reporting as few errors as possible. For example, the compiler could treat all inputs to the library as nullable but all outputs from the library as non-nullable. Additionally, there’ll need to be a way for a library to indicate when it has migrated.

When developers decide to migrate a project to use nullable reference types, they may want to do so over the course of several changes. It’s possible that their project may contain generated code that can’t be easily modified to express nullability. This suggests it’d be useful to be able to express the concept of “this code expresses nullability” on a per-type basis.

These considerations are new for C#. We’ve never had a language feature with such a broad impact on compatibility. I expect the team to iterate on this aspect several times before the final launch of C# 8.

Nullable reference types likely will be the biggest feature in C# 8, but others are also available in preview builds already. One of my favorites is switch expressions.

15.2. Switch expressions

The switch statement has been available in C# right from the start, and the only way it has changed in all that time is to permit pattern matching in C# 7. It remains an imperative control structure: if this case matches, do this; if that case matches, do that. A lot of the uses of switch statements are more functional, though, with each case computing a result: if this case matches, the result is X; if that case matches, the result is Y. This is a common construct in functional programming languages in which many functions are expressed purely in terms of pattern matching.

The introduction of expression-bodied members has made this stick out like a sore thumb. Many methods can be implemented with a single expression, but if you want to use a switch/case structure, you have to use a block body. This is usually just an inconvenience, but it’s still a point of friction.

C# 8 introduces switch expressions as an alternative to switch statements. This uses somewhat different syntax from switch statements, so it’s worth comparing the two. In chapter 12, when I introduced pattern matching, you looked at an example of a switch statement to compute the perimeter of different shapes. Here’s the code used in chapter 12:

static double Perimeter(Shape shape)
{
    switch (shape)
    {
        case null:
            throw new ArgumentNullException(nameof(shape));
        case Rectangle rect:
            return 2 * (rect.Height + rect.Width);
        case Circle circle:
            return 2 * PI * circle.Radius;
        case Triangle triangle:
            return triangle.SideA + triangle.SideB + triangle.SideC;
        default:
            throw new ArgumentException(
                $"Shape type {shape.GetType()} perimeter unknown",
                nameof(shape));
    }
}

The following listing shows the equivalent code using a switch expression instead but still using a regular block-bodied method.

Listing 15.9. Converting a switch statement into a switch expression
static double Perimeter(Shape shape)
{
    return shape switch
    {
        null => throw new ArgumentNullException(nameof(shape)),
        Rectangle rect => 2 * (rect.Height + rect.Width),
        Circle circle => 2 * PI * circle.Radius,
        Triangle triangle =>
            triangle.SideA + triangle.SideB + triangle.SideC,
        _ => throw new ArgumentException(
            $"Shape type {shape.GetType()} perimeter unknown",
            nameof(shape))
    };
}

There are a lot of things to point out here, so I haven’t tried to cram them all into the code as annotations. Here are all the differences between a switch statement and a switch expression:

  • Instead of switch (value), the introductory syntax for switch expressions is value switch.
  • A fat arrow => comes between the pattern and the result to return if that pattern is matched. (In a switch statement, a colon is used instead.)
  • The case keyword isn’t used at all in switch expressions. The left side of the => is just a pattern with an optional guard clause with the when keyword.
  • The right side of the => is just an expression. The return keyword isn’t used, because every pattern results in a value or throws. Likewise, there’s never a break statement.
  • The patterns are comma separated. If you’re converting a switch statement into a switch expression, this usually means changing semicolons into commas.
  • There’s no default case. Instead, the discard _ (underscore) is used to match anything that hasn’t already been matched.
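The optional guard clause mentioned in the list doesn’t appear in listing 15.9. As a minimal sketch, assuming the switch expression syntax shown in this chapter, a guard could distinguish squares from other rectangles. The Shape, Rectangle, and Circle classes here are simplified stand-ins for the chapter 12 types, and Describe is a hypothetical method rather than part of the downloadable source code.

```csharp
using System;

public abstract class Shape { }

public sealed class Rectangle : Shape
{
    public double Height { get; }
    public double Width { get; }
    public Rectangle(double height, double width) =>
        (Height, Width) = (height, width);
}

public sealed class Circle : Shape
{
    public double Radius { get; }
    public Circle(double radius) => Radius = radius;
}

public static class ShapeDescriber
{
    // Hypothetical example method, for illustration only.
    public static string Describe(Shape shape) => shape switch
    {
        null => throw new ArgumentNullException(nameof(shape)),
        // The guard is evaluated only after the type pattern has matched.
        Rectangle rect when rect.Height == rect.Width =>
            $"Square of side {rect.Width}",
        Rectangle rect => $"Rectangle {rect.Width} x {rect.Height}",
        Circle circle => $"Circle of radius {circle.Radius}",
        _ => "Unknown shape"
    };
}
```

The guarded arm has to come before the unguarded Rectangle arm; otherwise, the more general pattern would match first.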

My experience has mostly been writing methods that return a switch expression result directly, but you can also use it like any other expression. For example, you could write this:

double circumference = shape switch
{
          1
};

  • 1 Body of switch expression as before

This is fine, but as I mentioned before, one of the nicest aspects of switch expressions is to use them for expression-bodied methods. The following listing shows the evolution of listing 15.9 into an expression-bodied method.

Listing 15.10. Using a switch expression to implement an expression-bodied method
static double Perimeter(Shape shape) =>
    shape switch
    {
        null => throw new ArgumentNullException(nameof(shape)),
        Rectangle rect => 2 * (rect.Height + rect.Width),
        Circle circle => 2 * PI * circle.Radius,
        Triangle triangle =>
            triangle.SideA + triangle.SideB + triangle.SideC,
        _ => throw new ArgumentException(
            $"Shape type {shape.GetType()} perimeter unknown",
            nameof(shape))
    };

You can format this however you like, perhaps moving the shape switch onto the first line, or maybe outdenting the braces to the same level as the method declaration.

One important difference between switch statements and switch expressions is that there must always be some result (which could be an exception) from a switch expression. A switch expression isn’t allowed to do nothing and produce no value. You can use the _ discard to make sure of that, but it’s possible to write a switch expression that isn’t exhaustive—in other words, an expression that may not always match. With the preview build I’ve been working with, this produces a compiler warning, and then the compiler emits invalid IL. This might become a compile-time error instead, or the compiler may inject code to throw an exception (possibly InvalidOperationException) to indicate that the code encountered a situation it didn’t expect.
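Until the behavior for non-exhaustive switch expressions is settled, ending with a discard arm that throws keeps the outcome under your control. The following is a minimal sketch using a hypothetical TrafficLight enum; it assumes the switch expression syntax shown in this chapter.

```csharp
using System;

public enum TrafficLight { Red, Amber, Green }

public static class TrafficLightActions
{
    public static string Action(TrafficLight light) => light switch
    {
        TrafficLight.Red => "Stop",
        TrafficLight.Amber => "Prepare",
        TrafficLight.Green => "Go",
        // An enum variable can hold values with no named member, such as
        // (TrafficLight) 42, so the named cases alone aren't exhaustive.
        _ => throw new ArgumentOutOfRangeException(
            nameof(light), light, "Unknown traffic light value")
    };
}
```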

The one issue I have with switch expressions at the moment is that there’s no way of expressing multiple patterns that should evaluate to the same result. In a switch statement, you can specify multiple case labels, but there’s no equivalent in switch expressions yet. The C# team is aware of the desire for this, so hopefully it will be included before C# 8 is released.
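In the meantime, one possible workaround is to fold the alternatives into a guard clause on a var pattern. This is a sketch rather than a recommendation, and DayKind is a hypothetical method.

```csharp
using System;

public static class DayClassifier
{
    public static string DayKind(DayOfWeek day) => day switch
    {
        // A var pattern always matches, so the guard does the real work.
        var d when d == DayOfWeek.Saturday || d == DayOfWeek.Sunday =>
            "Weekend",
        _ => "Weekday"
    };
}
```

The cost is that the alternatives become ordinary Boolean expressions rather than patterns, so this reads less declaratively than multiple case labels in a switch statement.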

The use of patterns in C# 8 isn’t just improved via switch expressions. The patterns themselves are growing in scope.

15.3. Recursive pattern matching

As a reminder, the patterns introduced in C# 7 were as follows:

  • Type patterns (expression is Type t)
  • Constant patterns (expression is 10, expression is null, and so on)
  • The var pattern (expression is var v)

C# 8 will introduce recursive patterns (patterns can be nested within bigger patterns) as well as deconstruction patterns. The simplest way of explaining recursive patterns is to show them in action. We’ll come back to deconstruction patterns.

15.3.1. Matching properties in patterns

To match properties with additional patterns inside an overall pattern, you use braces containing a comma-separated list of patterns against properties. The property patterns match the property value against the nested pattern using any of the normal pattern types. As an example, let’s have another look at the three patterns we’re using to work out the perimeters of rectangles, circles, and triangles taken from listing 15.10:

Rectangle rect => 2 * (rect.Height + rect.Width),
Circle circle => 2 * PI * circle.Radius,
Triangle triangle => triangle.SideA + triangle.SideB + triangle.SideC,

In each case, you don’t need the shape itself; you just need properties from it. You can use nested var patterns to match those properties against any value and extract pattern variables for each of the properties you need. The following listing shows the full method with the nested patterns.

Listing 15.11. Matching nested patterns
static double Perimeter(Shape shape) => shape switch
{
    null => throw new ArgumentNullException(nameof(shape)),
    Rectangle { Height: var h, Width: var w } => 2 * (h + w),
    Circle { Radius: var r } => 2 * PI * r,
    Triangle { SideA: var a, SideB: var b, SideC: var c } => a + b + c,
    _ => throw new ArgumentException(
        $"Shape type {shape.GetType()} perimeter unknown", nameof(shape))
};

Is this clearer than the previous code? I’m not sure. I’ve used it as an example that follows neatly from the previous one, but I might easily stick with the code in listing 15.10. You’ll look at a more complicated example later, in which the feature becomes more compelling but would be harder to immediately understand.

Note that although here you’ve stopped capturing the Rectangle, Circle, or Triangle in their own pattern variables (rect, circle, and triangle before), that’s only because you don’t need them for anything. It’s still valid to introduce a pattern variable that way. For example, if you were describing shapes, you might have a pattern to describe a flat rectangle with zero height:

Rectangle { Height: 0 } rect => $"Flat rectangle of width {rect.Width}"

This is useful when you have a lot of properties but you’re just testing patterns against a few of them. Next up, we’ll look at deconstruction patterns.

15.3.2. Deconstruction patterns

You saw deconstruction of tuples in section 12.1 and deconstruction via the Deconstruct method in section 12.2. Patterns in C# 8 will be extended to allow deconstruction with nested patterns inside. As a somewhat contrived example, you might decide that it’s natural to deconstruct a Triangle to all three of its sides:

public void Deconstruct
    (out double sideA, out double sideB, out double sideC) =>
    (sideA, sideB, sideC) = (SideA, SideB, SideC);

You could then simplify our perimeter computation to deconstruct to three variables instead of specifying each property name. So instead of this case in our switch expression

Triangle { SideA: var a, SideB: var b, SideC: var c } => a + b + c

you could have this:

Triangle (var a, var b, var c) => a + b + c

Again, is that more readable than just matching against the type? Maybe. Over time, I suspect each developer will work out their own preferences around pattern matching and ideally come to a convention within the codebases they’re working in, too.
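Deconstruction patterns may earn their keep when the extracted values feed a guard clause. The following sketch puts the pieces together, assuming the syntax shown above makes it to release; the Triangle class is a simplified stand-in, and Describe is a hypothetical method.

```csharp
using System;

public sealed class Triangle
{
    public double SideA { get; }
    public double SideB { get; }
    public double SideC { get; }

    public Triangle(double sideA, double sideB, double sideC) =>
        (SideA, SideB, SideC) = (sideA, sideB, sideC);

    public void Deconstruct(
        out double sideA, out double sideB, out double sideC) =>
        (sideA, sideB, sideC) = (SideA, SideB, SideC);
}

public static class TriangleDescriber
{
    public static string Describe(Triangle triangle) => triangle switch
    {
        null => throw new ArgumentNullException(nameof(triangle)),
        // Nested var patterns capture the sides; the guards compare them.
        Triangle (var a, var b, var c) when a == b && b == c =>
            "Equilateral",
        Triangle (var a, var b, var c) when a == b || b == c || a == c =>
            "Isosceles",
        _ => "Scalene"
    };
}
```

Each arm has its own scope, so reusing the names a, b, and c across arms isn’t a problem.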

15.3.3. Omitting types from patterns

The ability to look inside objects makes patterns useful even when you’re not testing the value’s type. At that point, it feels redundant to specify the type as part of the pattern. For this example, let’s go back to the customer and address example used for nullable reference types. You’ll go back to the first data model: all mutable, all nullable:

public class Customer
{
    public string Name { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    public string Country { get; set; }
}

Now suppose you want to greet customers in different ways depending on the country in their address. Your input is already known to be of type Customer, so you don’t want to have to repeat that within the pattern. When you match the Address of a customer within a pattern, that will always be of type Address, so you don’t need to specify that type either.

The following listing shows multiple patterns matching different kinds of customers. It also demonstrates the { } pattern, which is a special case of a property pattern that doesn’t have any properties to match. That pattern matches any non-null value.

Listing 15.12. Matching customers against multiple patterns concisely
static void Greet(Customer customer)
{
    string greeting = customer switch
    {
        { Address: { Country: "UK" } } =>                      1
             "Welcome, customer from the United Kingdom!",      
        { Address: { Country: "USA" } } =>                     2
             "Welcome, customer from the USA!",                 
        { Address: { Country: string country } } =>            3
             $"Welcome, customer from {country}!",              
        { Address: { } } =>                                    4
             "Welcome, customer whose address has no country!",
        { } =>                                                 5
            "Welcome, customer of an unknown address!",      
        _ =>                                                   6
            "Welcome, nullness my old friend!"                 
    };
    Console.WriteLine(greeting);
}

  • 1 Matches a country of UK
  • 2 Matches a country of USA
  • 3 Matches any country, but it must be present
  • 4 Matches any address
  • 5 Matches any customer, even with a null address
  • 6 Matches anything, even a null customer reference

The ordering is important here. For example, a customer with an address with a country of USA could match every pattern except the first one. You could make the patterns more selective instead (using the constant null pattern to match customers with a null Address property value, for example), but it’s simpler to rely on the ordering.
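For comparison, here’s a sketch of the more selective alternative, using the constant null pattern so that the null-handling arms no longer depend on where they appear. It assumes the property pattern syntax from listing 15.12; Greeting is a hypothetical method that returns the text instead of printing it.

```csharp
public class Customer
{
    public string Name { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    public string Country { get; set; }
}

public static class Greeter
{
    public static string Greeting(Customer customer) => customer switch
    {
        // These three arms match only null values, so they could be
        // listed in any position without changing the result.
        null => "Welcome, nullness my old friend!",
        { Address: null } => "Welcome, customer of an unknown address!",
        { Address: { Country: null } } =>
            "Welcome, customer whose address has no country!",
        // The specific countries still have to precede the general case.
        { Address: { Country: "UK" } } =>
            "Welcome, customer from the United Kingdom!",
        { Address: { Country: "USA" } } =>
            "Welcome, customer from the USA!",
        { Address: { Country: string country } } =>
            $"Welcome, customer from {country}!"
    };
}
```

The patterns are longer than in listing 15.12, which is the trade-off for relying less on ordering.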

The enhancements to pattern matching in C# 8 will allow them to be used in more cases where currently you need if statements. Switch expressions add to this flexibility, too. I expect more and more code to be written with patterns. As always, it’s important to avoid going over the top; not all code will be simpler when written with patterns than with the control structures we had before. Still, this area of C#’s evolution definitely has a lot of potential. Our next feature is really a pair of features enabled by two new framework types.

15.4. Indexes and ranges

Compared with nullable reference types and improved pattern handling, indexes and ranges feel like a small feature, even combined. But I suspect over time we’ll come to wonder why it took so long to have them. The following listing provides a tiny taste before you look at the details.

Listing 15.13. Trimming the first and last character from a string with a range
string quotedText = "'This text was in quotes'";
Console.WriteLine(quotedText);
Console.WriteLine(quotedText.Substring(1..^1));       1

  • 1 Takes a substring of the string with a range literal

The output is as follows:

'This text was in quotes'
This text was in quotes

The highlighted expression of 1..^1 is the interesting part here. To understand this code, you need to learn about two new types.

15.4.1. Index and Range types and literals

The idea is simple. Index and Range are two structs that will be provided in the framework but currently need to be defined in your own code:

  • Index is an integer from either the start or end of something indexable. The value of the index is never negative.
  • Range is a pair of indexes: one for the start of the range and one for the end.

There are then three pieces of important syntax:

  • A regular implicit conversion from int to create a “from the start” Index.
  • A new unary operator (^) that can be used with int to create a “from the end” Index. Here a value of 0 means the element just past the end, and a value of 1 means the last element.[3]

    3

    This is slightly counterintuitive when using an Index with an indexer, but it makes a lot more sense with ranges, which have exclusive upper bounds. A range with an upper bound of ^0 is effectively “to the end of the sequence,” which is probably what you’d expect.

  • A new binary-ish operator (..) with optional operands for the start and end to create a Range.

The .. operator is binary-ish because there can be zero, one, or two operands. The following listing shows examples of all of these. You’re not applying the indexes or ranges to anything; you’re just creating the values.

Listing 15.14. Index and range literals
Index start = 2;
Index end = ^2;
Range all = ..;
Range startOnly = start..;
Range endOnly = ..end;
Range startAndEnd = start..end;
Range implicitIndexes = 1..5;

One point to note is that the start and end points of a range can be any index. For example, you could have a range of ^5..10 representing the fifth element from the end to the tenth element from the start. This would be unusual, but valid.
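To make the arithmetic concrete, the following sketch resolves an index or range against a sequence length using plain integers. It illustrates the idea only and isn’t the framework’s implementation; RangeMath, ToOffset, and Resolve are hypothetical names.

```csharp
using System;

public static class RangeMath
{
    // Hypothetical helper: converts a from-the-start or from-the-end
    // index value into an ordinary zero-based offset.
    public static int ToOffset(int value, bool fromEnd, int length) =>
        fromEnd ? length - value : value;

    // Hypothetical helper: resolves a range such as ^5..10 against a
    // sequence length, treating the upper bound as exclusive.
    public static (int Offset, int Count) Resolve(
        int start, bool startFromEnd, int end, bool endFromEnd, int length)
    {
        int startOffset = ToOffset(start, startFromEnd, length);
        int endOffset = ToOffset(end, endFromEnd, length);
        if (startOffset < 0 || endOffset > length || startOffset > endOffset)
        {
            throw new ArgumentOutOfRangeException(
                nameof(length), length, "Range isn't valid for this length");
        }
        return (startOffset, endOffset - startOffset);
    }
}
```

With a sequence of length 12, Resolve(5, true, 10, false, 12) represents ^5..10 and gives an offset of 7 and a count of 3. With a length of 20, the same range starts at offset 15, which is after its end of 10, so the helper throws. Whether a range is usable depends on the sequence it’s applied to.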

This is the sum total of the direct language support for indexes and ranges. It’s when they also have framework support that they become useful.

15.4.2. Applying indexes and ranges

All the examples in this section require extension methods and extension operators supported by the C# 8 preview build. The exact APIs may change, and the extensions provided in the preview work with only a limited set of types; this is just enough to demonstrate the benefits. In listing 15.13, I showed how the Substring method can be used with a Range. Both indexes and ranges will most often be applied to types that represent sequences of some form, such as

  • Arrays
  • Spans
  • Strings (as sequences of UTF-16 code units)

These all support two operations:

  • Retrieving a single element
  • Creating a slice to represent part of the sequence

The single-element-retrieval operation already has a common representation using an indexer accepting an int parameter, but this makes it hard to retrieve the last element in a uniform way. The Index type solves this with its “from the start” or “from the end” aspect. The slice operation has previously taken different forms depending on the type involved. For example, Span<T> has a Slice method, whereas String has a Substring method.

By adding indexer overloads accepting Index and Range values, you can use a consistent and convenient syntax to perform both operations on all of the relevant types. The following listing shows similar calls working for a string and a Span<int>.

Listing 15.15. Using indexer overloads for index and range in a string and a span
string text = "hello world";
Console.WriteLine(text[2]);                                  1
Console.WriteLine(text[^3]);                                 2
Console.WriteLine(text[2..7]);                               3

Span<int> span = stackalloc int[] { 5, 2, 7, 8, 2, 4, 3 };
Console.WriteLine(span[2]);                                  4
Console.WriteLine(span[^3]);                                 5
Span<int> slice = span[2..7];                                6
Console.WriteLine(string.Join(", ", slice.ToArray()));

  • 1 Accesses a single character by index from start
  • 2 Accesses a single character by index from end
  • 3 Takes a substring using a range
  • 4 Accesses a single element by index from start
  • 5 Accesses a single element by index from end
  • 6 Creates a slice using a range

The output is as follows:

l
r
llo w
7
2
7, 8, 2, 4, 3

Both the string and span indexers accepting a Range treat the upper bound of the range as exclusive: the range [2..7] returns the elements with indexes 2, 3, 4, 5, and 6.

In listing 15.15, the ranges included both start and end indexes, and both index values were computed from the start. You can use any range with the indexers so long as the indexes are valid for the sequence they’re applied to. For example, using text[^5..] with the code in listing 15.15 would return world as the last five characters of text.

Likewise, you could write text[^10..5], which would return ello. In the context of a string of length 11 (hello world), an index of ^10 is equivalent to an index of 1, so text[^10..5] is equivalent (in this case, it does depend on the length of text) to text[1..5], returning the four characters after the first. Next, we’ll look at increased language support for asynchrony.

15.5. More async integration

When async/await was introduced in C# 5, it revolutionized asynchrony for many C# developers. But a few language features have so far stayed synchronous, making it hard to go all in on asynchrony. In this section, we’ll look at the following:

  • Async disposal
  • Async iteration (foreach)
  • Async iterators (yield return)

These require framework support as well as language support. It wouldn’t be appropriate for the compiler to approximate asynchrony by executing the synchronous code on a different thread, for example. Let’s start with async disposal, which is the simplest of the three features.

15.5.1. Asynchronous resource disposal with using await

The IDisposable interface with its single Dispose method is naturally synchronous. If that method needs to perform I/O, such as to flush a stream, then it can block with all the normal issues that causes.

A new interface will be introduced for classes that support asynchronous disposal:

public interface IAsyncDisposable
{
    Task DisposeAsync();
}

There’s no requirement that a type that implements IAsyncDisposable also implements IDisposable, although I suspect many types will do so.

There’s then corresponding language support in the form of the using await statement, which works as you’d expect it to, calling DisposeAsync automatically and awaiting the resulting task. The following listing shows an example of implementing the interface and then using it.

Listing 15.16. Implementing IAsyncDisposable and calling it with using await
class AsyncResource : IAsyncDisposable
{
    public async Task DisposeAsync()
    {
        Console.WriteLine("Disposing asynchronously...");
        await Task.Delay(2000);
        Console.WriteLine("... done");
    }

    public async Task PerformWorkAsync()
    {
        Console.WriteLine("Performing work asynchronously...");
        await Task.Delay(2000);
        Console.WriteLine("... done");
    }
}
async static Task Main()
{
    using await (var resource = new AsyncResource())
    {
        await resource.PerformWorkAsync();
    }
    Console.WriteLine("After the using await statement");
}

The output shows the resource disposal:

Performing work asynchronously...
... done
Disposing asynchronously...
... done
After the using await statement

This is simple, but it hides two aspects of complexity that need to be addressed:

  • Libraries typically await tasks with ConfigureAwait(false). Applications typically await tasks without this. If the compiler is doing the awaiting automatically, how can the user configure this?
  • It’d be natural to have cancellation available for disposal. Where does that fit into the interface and the call site?
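Until the language answers the first question, library code can write the disposal out by hand and configure each await itself. The following sketch shows one plausible shape for that. The interface is declared locally to mirror the one shown earlier (the real framework type may differ), and UseAsync and SampleResource are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

namespace AsyncDisposalSketch
{
    // Declared locally, mirroring the interface shown in this section.
    public interface IAsyncDisposable
    {
        Task DisposeAsync();
    }

    // Hypothetical resource that records whether it has been disposed.
    public sealed class SampleResource : IAsyncDisposable
    {
        public bool Disposed { get; private set; }

        public Task DisposeAsync()
        {
            Disposed = true;
            return Task.CompletedTask;
        }
    }

    public static class AsyncDisposal
    {
        // Hypothetical helper: a hand-written try/finally equivalent of
        // the using await statement, with the awaits configured explicitly.
        public static async Task UseAsync<T>(T resource, Func<T, Task> body)
            where T : IAsyncDisposable
        {
            try
            {
                await body(resource).ConfigureAwait(false);
            }
            finally
            {
                if (resource != null)
                {
                    await resource.DisposeAsync().ConfigureAwait(false);
                }
            }
        }
    }
}
```

This doesn’t address cancellation; it only shows where ConfigureAwait would go if you were writing the code manually.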

The C# team is aware of both points, and I expect them to be addressed in some form before release. The same problems occur for the other async features in C# 8, and I hope they’ll all be solved in a similar way. Let’s look at the next feature now: asynchronous iteration with foreach.

15.5.2. Asynchronous iteration with foreach await

Spoiler alert: there’s quite a lot of text before we reach the language feature in this section. That’s necessary in order to explain it properly, but the upshot is that code like this will be valid, where asyncSequence requires asynchronous work to retrieve the items:

foreach await (var item in asyncSequence)
{
       1
}

  • 1 Uses item

The interfaces introduced for asynchronous iteration aren’t quite as straightforward as the one for disposal. There are two interfaces, mirroring IEnumerable<T> and IEnumerator<T> to some extent, but not quite so obviously:

public interface IAsyncEnumerable<out T>
{
    IAsyncEnumerator<T> GetAsyncEnumerator();
}

public interface IAsyncEnumerator<out T>
{
    Task<bool> WaitForNextAsync();
    T TryGetNext(out bool success);
}

IAsyncEnumerable<T> may be closer to IEnumerable<T> than you expect; there’s nothing asynchronous in it. Instead of GetEnumerator(), it has GetAsyncEnumerator(), and that returns an IAsyncEnumerator<T>, but it does so synchronously. It’s possible that for some implementations this will be problematic, but I expect it to be the natural approach for most asynchronous sequences. Any implementation that wants to perform asynchronous operations as part of setup will probably need to defer that work until the caller starts iterating over the result.

The IAsyncEnumerator<T> interface is much further from IEnumerator<T> and reflects a common pattern in real-world implementations. Asynchrony is often used when I/O is involved, such as retrieving results over a network. That often naturally results in sequences being retrieved in chunks; you may perform a query and retrieve the first 10 results together, then the next 7, and then be told that’s the complete result set.

While you’re iterating within a set of results that has been buffered, there’s no need for asynchrony. Although asynchrony is quite efficient, it’s not completely free, so it’s worth avoiding if you can. Instead, you can iterate synchronously, so long as you have a way of determining when you’ve reached the end of the current result set. At that point, you can asynchronously fetch the next one and iterate through that synchronously again.

The IAsyncEnumerator<T> interface exposes this pattern through its two methods:

  • WaitForNextAsync is asynchronous, returning a task that indicates whether any more results were retrieved or whether you’ve reached the end of the sequence.
  • TryGetNext is synchronous, returning the next item. The out parameter is used to indicate whether there was a next item to return.[4] When this is false, that doesn’t necessarily mean you’ve reached the end of the sequence; it just means you need to call WaitForNextAsync again.

    4

    This is oddly inconsistent with most TryXyz methods, which return bool and use an out parameter for the value. This could change before release.

That may all sound complicated, but the good news is that you’re unlikely to need to do any of this yourself; the new foreach await statement handles it all for you.
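If you’re curious what that handling might look like, the two-method pattern can be sketched by hand as a pair of nested loops. The interface below is declared locally to match the preview shape shown above, which could change before release; ChunkedEnumerator and ToListAsync are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

namespace AsyncIterationSketch
{
    // Declared locally to match the preview interface shown above.
    public interface IAsyncEnumerator<out T>
    {
        Task<bool> WaitForNextAsync();
        T TryGetNext(out bool success);
    }

    // Hypothetical in-memory enumerator that hands out items in chunks.
    public sealed class ChunkedEnumerator<T> : IAsyncEnumerator<T>
    {
        private readonly Queue<T[]> chunks;
        private T[] current = new T[0];
        private int position;

        public ChunkedEnumerator(IEnumerable<T[]> chunks) =>
            this.chunks = new Queue<T[]>(chunks);

        public async Task<bool> WaitForNextAsync()
        {
            await Task.Yield();    // Stands in for real I/O
            if (chunks.Count == 0)
            {
                return false;
            }
            current = chunks.Dequeue();
            position = 0;
            return true;
        }

        public T TryGetNext(out bool success)
        {
            success = position < current.Length;
            return success ? current[position++] : default(T);
        }
    }

    public static class AsyncConsumer
    {
        public static async Task<List<T>> ToListAsync<T>(
            IAsyncEnumerator<T> enumerator)
        {
            var results = new List<T>();
            // Outer loop: asynchronous, once per chunk.
            while (await enumerator.WaitForNextAsync())
            {
                // Inner loop: synchronous, once per buffered item.
                while (true)
                {
                    T item = enumerator.TryGetNext(out bool success);
                    if (!success)
                    {
                        break;
                    }
                    results.Add(item);
                }
            }
            return results;
        }
    }
}
```

The await occurs once per chunk rather than once per item, which is the efficiency the interface design is aiming for.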

Let’s look at an example, which draws heavily from my experience working with Google Cloud Platform APIs. Many APIs have list operations, such as listing contacts in an address book or virtual machines in a cluster. There may be too many results to return in a single RPC response, so we have a page-based pattern: each response contains a “next page token” that the client supplies on a subsequent request to retrieve more data. For the first request, the client doesn’t supply a page token, and the final response doesn’t contain a page token. A simplified view of the API might look like the following listing.

Listing 15.17. Simplified RPC-based service for listing cities
public interface IGeoService
{
    Task<ListCitiesResponse> ListCitiesAsync(ListCitiesRequest request);
}

public class ListCitiesRequest
{
    public string PageToken { get; }
    public ListCitiesRequest(string pageToken) =>
        PageToken = pageToken;
}

public class ListCitiesResponse
{
    public string NextPageToken { get; }
    public List<string> Cities { get; }

    public ListCitiesResponse(string nextPageToken, List<string> cities) =>
        (NextPageToken, Cities) = (nextPageToken, cities);
}

That’s unwieldy to use directly, but it can easily be wrapped in a client that exposes this API instead, as shown in the next listing.

Listing 15.18. Wrapper around the RPC service to provide a simpler API
public class GeoClient
{
    public GeoClient(IGeoService service) { ... }               1
    public IAsyncEnumerable<string> ListCitiesAsync() { ... }   2
}

  • 1 Constructs a GeoClient with an RPC service
  • 2 Provides a simple async sequence of cities

With GeoClient in place, you can finally use foreach await, as in the following listing.

Listing 15.19. Using foreach await with a GeoClient
var client = new GeoClient(service);

foreach await (var city in client.ListCitiesAsync())
{
    Console.WriteLine(city);
}

The final code here is a lot simpler than all the code I had to show you to set up the example, and that’s without even looking at the implementation of GeoClient. But that’s a good thing; it shows the benefit of the feature. You’ve taken relatively complex definitions in both IGeoService and IAsyncEnumerable<T> and consumed them in a simple and efficient manner with foreach await.

Note

The downloadable source code contains a complete example with an in-memory fake service implementation.

One thing you may be surprised about is that IAsyncEnumerator<T> doesn’t implement IAsyncDisposable. That could change before release, but even if it doesn’t, I expect the compiler to dispose of an enumerator if it turns out to implement IAsyncDisposable at execution time.

Just like the synchronous foreach statement, foreach await won’t require the IAsyncEnumerable<T> and IAsyncEnumerator<T> interfaces to be implemented. It’ll be pattern based, so any type providing a GetAsyncEnumerator() method that returns a type that in turn provides the appropriate WaitForNextAsync and TryGetNext methods will be supported. This could allow some optimizations, but I expect the interfaces to be used most of the time.

So far, you’ve seen how to consume asynchronous sequences. What about producing them?

15.5.3. Asynchronous iterators

C# 2 introduced iterators with yield return and yield break statements to make it easy to write methods returning IEnumerable<T> or IEnumerator<T>. C# 8 will have the same feature for asynchronous sequences. The feature isn’t available in the preview, but the following listing shows how I expect it to work.

Listing 15.20. Implementing ListCitiesAsync with an iterator
public async IAsyncEnumerable<string> ListCitiesAsync()
{
    string pageToken = null;
    do
    {
        var request = new ListCitiesRequest(pageToken);
        var response = await service.ListCitiesAsync(request);
        foreach (var city in response.Cities)
        {
            yield return city;
        }
        pageToken = response.NextPageToken;
    } while (pageToken != null);
}

The mapping between the async iterator method and the IAsyncEnumerator<T> interface, with its mixture of asynchronous and synchronous parts, will be complex to implement. Whenever you continue executing code in the async method, it can complete that specific call in several ways:

  • It could await an incomplete asynchronous operation.
  • It could reach a yield return statement.
  • It could reach a yield break statement.
  • It could reach the end of the method.
  • It could throw an exception.

How those are handled will depend on whether the caller is executing WaitForNextAsync() or TryGetNext(). To make this efficient, the generated code should effectively switch between synchronous mode (if you’re yielding values with no intervening awaits) and asynchronous mode (if you’re awaiting an asynchronous operation). I can broadly picture how this might be achieved, but I’m glad I’m not the one having to implement it.

There are other features not available in the C# 8 preview yet. We’ll look at these more briefly.

15.6. Features not yet in preview

If C# 8 turns out to have only the features I’ve listed so far, it’ll still be a big deal. In some ways, I wish we could have a release with just nullable reference types, wait a year or so for most codebases to be updated to it, and then continue with more features. But C# 8 likely will ship with more features than I’ve shown so far.

This section discusses the features I think are the most likely to be included in C# 8. Even more features have been proposed either by members of the C# team or by external developers. The C# team uses GitHub to keep track of language proposals, which makes it easy to see what’s going on and contribute yourself; see https://github.com/dotnet/csharplang. We’ll start with a feature inspired by Java.

15.6.1. Default interface methods

Whereas C# introduced extension methods for LINQ, Java took a different approach to enable its support for streams, which covers many of the same use cases as LINQ. In Java 8, Oracle introduced default methods in Java interfaces: an interface could declare a method and a default implementation for it, which could then be overridden within a concrete implementation. The default implementation can’t declare any state in terms of fields; it has to be expressed in terms of the other members of the interface.

The two features are similar in some ways: they both allow logic to be expressed so the consumer of an interface can call a method without every interface implementation having to directly know about it or implement it. There are pros and cons with each approach:

  • Extension methods can be introduced by anyone, not just the author of the interface. You can’t add a default method to an interface you can’t control. (Extension methods can also be applied to classes and structs, of course.)
  • Default methods can be overridden by implementing classes, often for the sake of optimization. Extension methods can’t be overridden; they’re just static methods with syntactic sugar to make calling them look more like they’re regular instance methods.

The second point can be easily appreciated using LINQ’s Enumerable.Count() method as an example. By default, it counts the elements in a sequence by calling GetEnumerator() and then counting how many calls to MoveNext() on that enumerator return true.

Many implementations of IEnumerable<T> have far more efficient ways of determining the number of elements. Enumerable.Count() is specifically optimized for some of those, such as ICollection and ICollection<T> implementations. But what about a collection that doesn’t want to implement either of those interfaces but still wants to provide the Count cheaply? It’s stuck; it has no way of communicating to Enumerable.Count() that it can implement that part of LINQ itself more efficiently. If Count() had been a method in IEnumerable<T> with a default implementation, however, our new collection could just override that method.
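The fast paths just described might look roughly like the following. This is a simplified sketch rather than the framework’s actual source; CountSketch and CountElements are hypothetical names, chosen so they don’t clash with the real Enumerable.Count().

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CountSketch
{
    // Hypothetical stand-in for Enumerable.Count(); not the real implementation.
    public static int CountElements<T>(this IEnumerable<T> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        // Fast paths: collections that already know their own size.
        if (source is ICollection<T> genericCollection)
        {
            return genericCollection.Count;
        }
        if (source is ICollection nonGenericCollection)
        {
            return nonGenericCollection.Count;
        }

        // Slow path: walk the whole sequence, counting successful MoveNext() calls.
        int count = 0;
        using (var iterator = source.GetEnumerator())
        {
            while (iterator.MoveNext())
            {
                // Throw on overflow instead of silently wrapping around.
                checked { count++; }
            }
        }
        return count;
    }
}
```

A collection that implements neither interface always ends up on the slow path, however cheaply it could have answered the question itself.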

Here’s an example of how IEnumerable<T> could’ve been declared using C# 8 default interface methods:

public interface IEnumerable<T>
{
    IEnumerator<T> GetEnumerator();

    int Count()
    {
        using (var iterator = GetEnumerator())
        {
            int count = 0;
            while (iterator.MoveNext())
            {
                count++;
            }
            return count;
        }
    }
}
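To see how a collection would take advantage of this, here’s a minimal sketch. Redeclaring IEnumerable<T> isn’t practical, so it uses a hypothetical ICountableSequence<T> interface with the same shape; Countdown and RepeatedValue<T> are likewise invented for illustration. It assumes a compiler and runtime that support default interface methods in roughly the form shown above.

```csharp
using System.Collections.Generic;

// Hypothetical interface with the same shape as the reimagined IEnumerable<T>.
public interface ICountableSequence<T>
{
    IEnumerator<T> GetEnumerator();

    // Default implementation: iterate and count.
    int Count()
    {
        using (var iterator = GetEnumerator())
        {
            int count = 0;
            while (iterator.MoveNext())
            {
                count++;
            }
            return count;
        }
    }
}

// Doesn't declare Count(), so callers get the default implementation.
public sealed class Countdown : ICountableSequence<int>
{
    private readonly int start;

    public Countdown(int start)
    {
        this.start = start;
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = start; i > 0; i--)
        {
            yield return i;
        }
    }
}

// Knows its size up front, so it supplies a constant-time Count().
public sealed class RepeatedValue<T> : ICountableSequence<T>
{
    private readonly T value;
    private readonly int repetitions;

    public RepeatedValue(T value, int repetitions)
    {
        this.value = value;
        this.repetitions = repetitions;
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; i < repetitions; i++)
        {
            yield return value;
        }
    }

    // Implements ICountableSequence<T>.Count() without iterating.
    public int Count() => repetitions;
}
```

In this sketch, the default Count() is reachable only through a reference of the interface type, because Countdown itself declares no such member. RepeatedValue<T> answers without iterating at all.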

Default interface methods also allow interfaces to be expanded over time in a rather more version-friendly way. New methods can be added with a default implementation that either implements the new functionality using the existing members or potentially throws a NotSupportedException. That way, old implementations will still build, even if the new method can’t be called reliably. Versioning is a tricky subject, to say the least, but having another option in our toolbox is welcome. In numerous situations, this would’ve made things simpler in code that I maintain.
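As a rough illustration of that versioning story, consider the sketch below. IDocumentStore and LegacyStore are hypothetical names, and the code assumes default interface methods are available. Delete is imagined as a member added in a second version of the interface, long after LegacyStore was written.

```csharp
using System;

// Hypothetical interface that has grown a member since its first release.
public interface IDocumentStore
{
    // Present since version 1.
    string Load(string id);

    // Added in version 2. Implementations written against version 1 still build,
    // but calling this on them fails at execution time.
    void Delete(string id) =>
        throw new NotSupportedException("This store doesn't support deletion.");
}

// Written against version 1 of the interface; knows nothing about Delete.
public sealed class LegacyStore : IDocumentStore
{
    public string Load(string id) => "Content of " + id;
}
```

The old implementation keeps compiling, which is the benefit; the cost is that a caller of Delete can’t tell from the type alone whether the call will succeed.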

Default interface methods are proving to be a controversial feature. They require CLR support, which makes the feature harder to experiment with before committing to it wholeheartedly. If the feature is included, it’ll be interesting to see its adoption rate. It may remain rarely used until the runtime versions that support it are widely adopted, too. Next, we’ll look at a feature that has been talked about and even prototyped for a long time.

15.6.2. Record types

The forerunner of record types was a feature called primary constructors, which was originally intended to be present in C# 6. The language team wasn’t happy with some of the rough edges in the original design, so they decided to delay its introduction until it could be improved.

Record types are designed to make it easy to create immutable classes or structs with a given set of properties. I tend to think of them in terms of starting with anonymous types but adding all kinds of features. They can be declared incredibly simply. For example, here’s a complete class declaration:

public class Point(int X, int Y, int Z);

That generates a bunch of members for you, although you can still introduce your own behavior as well. The generated members are a constructor, properties, equality methods, a Deconstruct method for deconstruction, and a With method like this:

public Point With(int X = this.X, int Y = this.Y, int Z = this.Z) =>
    new Point(X, Y, Z);

That isn’t valid syntax for optional parameter default values at the moment, and it’s not clear whether it’ll be valid to write that code explicitly, but it at least shows the intention of the method’s behavior.

The With method is designed to interoperate with new syntax in the form of with expressions. The idea is that both the method and the syntax make it easy to create a new instance of the immutable type that’s the same as an existing one but with one or more properties changed. WithFoo methods are common in immutable types already (where Foo is the name of a property in the type), but they typically work on one property at a time. For example, with an immutable Point class with X, Y, and Z properties, you might use the following code to create a new point that has the same Z value as a previous point, but new X and Y values:

var newPoint = oldPoint.WithX(10).WithY(20);

Each WithFoo method calls a constructor, passing in all the existing properties other than the one named in the method, where the new value specified in the parameter is used. These methods become tedious to write and have a performance implication, too: to “change” N properties, you need to make N method calls, each one of which creates a new object.
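For reference, a handwritten version of this pattern might look like the following minimal sketch, which is valid in current versions of C#. The constructor parameter names are my own choice.

```csharp
public sealed class Point
{
    public int X { get; }
    public int Y { get; }
    public int Z { get; }

    public Point(int x, int y, int z)
    {
        X = x;
        Y = y;
        Z = z;
    }

    // Each method copies every property except the one it replaces.
    public Point WithX(int x) => new Point(x, Y, Z);
    public Point WithY(int y) => new Point(X, y, Z);
    public Point WithZ(int z) => new Point(X, Y, z);
}
```

With this class, oldPoint.WithX(10).WithY(20) constructs two Point objects, and the first is discarded immediately.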

The With method for record types is different: it has one parameter for each property of the type, with new syntax for a default parameter value if that parameter isn’t specified, indicating that the value should be taken from the current object. For example, consider the With method in our Point type. You could either call that directly

var newPoint = oldPoint.With(X: 10, Y: 20);

or use the new with expression syntax, which looks more like an object initializer:

var newPoint = oldPoint with { X = 10, Y = 20 };

The two would compile to the same IL. This way, only a single new object is constructed.
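Until something like this exists, you can approximate the With method in current C# by using nullable optional parameters. The following is a sketch of that workaround, not the code the compiler would generate: here null stands for “keep the current value.”

```csharp
public sealed class Point
{
    public int X { get; }
    public int Y { get; }
    public int Z { get; }

    public Point(int x, int y, int z)
    {
        X = x;
        Y = y;
        Z = z;
    }

    // Approximation of the proposed With method: an omitted (null) argument
    // means "take the value from the current object".
    public Point With(int? X = null, int? Y = null, int? Z = null) =>
        new Point(X ?? this.X, Y ?? this.Y, Z ?? this.Z);
}
```

A call such as oldPoint.With(X: 10, Y: 20) then creates exactly one new object. The workaround breaks down for any property where null is itself a meaningful value, which is one reason dedicated language support is attractive.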

This is only a simple example. It becomes trickier when you have a complex type and you want to modify just one leaf node. For example, if you have a Contact type with an Address property, you may want to create a new contact that’s the same as the old one but with one part of the Address property different. It’s possible that’ll still be tricky in C# 8 but that with expression syntax may be enhanced to make that simpler over time, just as the syntax for pattern matching has grown.
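To make the leaf-node problem concrete, here’s a sketch using handwritten WithFoo methods. The exact shape of Contact and Address (a Name, a Street, and a City) is an assumption for illustration, as is the ContactDemo helper.

```csharp
public sealed class Address
{
    public string Street { get; }
    public string City { get; }

    public Address(string street, string city)
    {
        Street = street;
        City = city;
    }

    public Address WithCity(string city) => new Address(Street, city);
}

public sealed class Contact
{
    public string Name { get; }
    public Address Address { get; }

    public Contact(string name, Address address)
    {
        Name = name;
        Address = address;
    }

    public Contact WithAddress(Address address) => new Contact(Name, address);
}

public static class ContactDemo
{
    // Changing one leaf means rebuilding every object on the path to it.
    public static Contact MoveToCity(Contact contact, string city) =>
        contact.WithAddress(contact.Address.WithCity(city));
}
```

Even at two levels, the caller has to mention contact twice and spell out the path by hand; each extra level of nesting adds another layer of wrapping.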

I’m excited about the possibilities here. Immutable types have been a pain to create and work with in C# for a long time. Whereas C# 7 tuples filled one gap left by anonymous types, record types fill another. I’ve always loved anonymous types for the work the compiler does for you in terms of equality, constructor, and property code. It’s just a shame we couldn’t name them or add more functionality later. Record types fix all of this and more. Finally, I want to highlight a few features that involve a little more thinking outside the box.

15.6.3. Even more features in brief

Although some minor features are more likely to make it into C# 8, they’re not as interesting as the ones I discuss here. Remember, you can always check GitHub to learn more about what might be included and its up-to-date status.

Type classes (aka concepts, shapes, or structural generic constraints)

Although generics are great for many situations, they have limitations. There are “shapes” of data types that can’t be expressed with generics, such as operators and constructors. Although you can require that a type argument has a parameterless constructor, you can’t require that it has a constructor with a specific parameter list. Additionally, at times types can have the same shape in some useful way but not implement any common interfaces or have any common base classes other than System.Object. Type classes would be a new kind of type to address these concerns. They’d be a little like interfaces, but the implementing class wouldn’t need to know about them. You would be able to constrain a generic type parameter by the type class instead.
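The constructor limitation is easy to demonstrate with today’s constraints. In the following sketch, Factory and its methods are hypothetical names; the second method shows the usual workaround of asking the caller for a delegate.

```csharp
using System;

public static class Factory
{
    // Expressible today: T must have a public parameterless constructor.
    public static T CreateDefault<T>() where T : new() => new T();

    // No constraint can say "T has a constructor taking a string",
    // so the caller has to supply the construction logic as a delegate.
    public static T CreateFrom<T>(string text, Func<string, T> constructor) =>
        constructor(text);
}
```

A call such as Factory.CreateFrom("abc", s => new StringBuilder(s)) works, but the knowledge that StringBuilder has a suitable constructor lives at every call site rather than in the constraint.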

This has the potential to be powerful but somewhat confusing; I’m of two minds about it myself. It’s likely to require runtime changes in order to execute efficiently. It may take C# developers (or me, at least) a while to work out when it’s useful and when it’s just confusing. Adding a whole new kind of type at this stage in the language’s evolution feels like a giant step. For all these caveats, this feature definitely fills a gap: where you need this functionality, the current tools don’t offer any clean solutions.

Extension everything

At the time of this writing, this has a milestone of X.0 in GitHub, but I wouldn’t be overly surprised to see it move up the priority list. The name does a good job of explaining the feature: the concept of extension methods would be applied to other member types, such as properties, constructors, and operators. It may also allow static extension members to be introduced—ones that look like they’re static methods on the extended type. (For example, you could write a method in StringExtensions that could be called as string.IsNullOrTabs as a more specific version of string.IsNullOrWhiteSpace.)

The syntax used for extension methods doesn’t lend itself to other member types, so it’s probable that a whole new syntax would be used instead. This might be an extension type that’s purely present to create multiple extension members all on one specific extended type.

Extension types still wouldn’t be able to introduce new state. Any extension properties would be likely to present a different view of existing properties. For example, you could have an extension property on DateTime called FinancialQuarter that knew your company’s financial reporting dates and used the existing Year/Month/Day properties to compute the appropriate quarter.
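With only extension methods available, the closest you can get today is a method rather than a property. The sketch below assumes a financial year starting on April 1; that start month, along with the DateTimeExtensions and GetFinancialQuarter names, is invented for illustration.

```csharp
using System;

public static class DateTimeExtensions
{
    // Assumption for this sketch: the financial year starts on April 1.
    private const int FirstMonthOfFinancialYear = 4;

    // Computed purely from existing state (the Month property); no new state is stored.
    public static int GetFinancialQuarter(this DateTime date)
    {
        // Number of whole months since the start of the financial year (0 to 11).
        int monthsIntoYear = (date.Month - FirstMonthOfFinancialYear + 12) % 12;
        return monthsIntoYear / 3 + 1;
    }
}
```

Under an extension-everything feature, the same logic could presumably be exposed as date.FinancialQuarter instead of date.GetFinancialQuarter().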

Target-typed new

Implicit typing with var can be useful for reducing clutter when long type names are involved. It doesn’t help for fields, though, because they can’t be implicitly typed. We still end up with code like this:

Dictionary<string, List<DateTime>> entryTimesByName =
    new Dictionary<string, List<DateTime>>();

The target-typed new feature wouldn’t affect where you could use var. Instead, it would shorten the right-hand side of the declaration:

Dictionary<string, List<DateTime>> entryTimesByName = new();

Anytime the compiler can tell which type you probably mean when calling a constructor, you’d be able to leave out the type name entirely. This introduces interesting complexity with member invocations. For example, Method(new()) would take the target type from the method parameter, which is fine until Method is generic or overloaded.

I love and hate this feature proposal, in roughly equal measure. It could certainly make code unreadable if used excessively, but almost any feature can be misused. On the other hand, I relish the possibility of removing the duplication of long field initialization.

I expect this to be even more controversial than default interface methods. We’ll see what happens, and you can be part of the conversation.

15.7. Getting involved

The C# design process is more open than ever before. Although a lot of work goes on in the background with Language Design Meetings (LDMs) in Microsoft offices, there’s plenty of room for community involvement, too. The GitHub repository at https://github.com/dotnet/csharplang is the place to start. It contains notes from LDMs, proposals, discussions, and specifications. You’re welcome to engage at any of the following levels:

  • Trying out preview builds to see how well new features fit with your existing code
  • Discussing currently proposed features
  • Proposing new features
  • Prototyping new features in Roslyn
  • Helping draft language in the specification for new features
  • Spotting mistakes in the existing specification (it happens!)

You may feel it’s a better use of your time to wait for full releases with complete documentation and a polished implementation. That’s perfectly fine, too. It’s easy enough to dip your toe in the water at any time, if only to look at the set of proposed features for a given milestone.

This open design process is relatively new, and I expect it to be fine-tuned over time. I’d be surprised if the team ever went back to a more closed process. Although community engagement like this is expensive in terms of time, there are huge benefits in making sure the new features are ones developers really need.

Conclusion

There’s been a lot more text than code in this chapter, mostly because I don’t want to present too much code that’ll be wrong by the time C# 8 ships. I doubt that all the features I’ve described will be present in C# 8, but I think it’s at least likely that some of them will be. I’d be surprised if nullable reference types or the pattern-related features didn’t make it into C# 8.

What comes beyond that? Well, minor releases in the C# 8 line, presumably, and then on to C# 9. Some of the features of C# 9 are probably already on GitHub as proposals, but I suspect there’ll be some that haven’t been talked about at all yet. I expect C# to continue to evolve to meet the needs of developers as the computing landscape changes.
