Chapter 14. Concise code in C# 7

This chapter covers

  • Declaring methods within methods
  • Simplifying calls by using out parameters
  • Writing numeric literals more readably
  • Using throw as an expression
  • Using default literals

C# 7 comes with large features that change the way we approach code: tuples, deconstruction, and patterns. It comes with complex but effective features that are squarely aimed at high-performance scenarios. It also comes with a set of small features that just make life a little bit more pleasant. There’s no single feature in this chapter that’s earth-shattering; each makes a small difference, and the combination of all of them can lead to beautifully concise, clear code.

14.1. Local methods

If this weren’t C# in Depth, this section would be short indeed; you can write methods within methods. There’s more to it than that, of course, but let’s start with a simple example. The following listing shows a simple local method within a regular Main method. The local method prints and then increments a local variable declared within Main, demonstrating that variable capture works with local methods.

Listing 14.1. A simple local method that accesses a local variable
static void Main()
{
    int x = 10;                                  1
    PrintAndIncrementX();                        2
    PrintAndIncrementX();                        2
    Console.WriteLine($"After calls, x = {x}");

    void PrintAndIncrementX()                    3
    {                                            3
        Console.WriteLine($"x = {x}");           3
        x++;                                     3
    }                                            3
}

  • 1 Declares local variable used within method
  • 2 Calls local method twice
  • 3 Local method

This looks a bit odd when you see it for the first time, but you soon get used to it. Local methods can appear anywhere you have a block of statements: methods, constructors, properties, indexers, event accessors, finalizers, and even within anonymous functions or nested within another local method.

A local method declaration is like a normal method declaration but with the following restrictions:

  • It can’t have any access modifiers (public and so on).
  • It can’t have the extern, virtual, new, override, static, or abstract modifiers.
  • It can’t have any attributes (such as [MethodImpl]) applied to it.
  • It can’t have the same name as another local method within the same parent; there’s no way to overload local methods.

On the other hand, a local method acts like standard methods in other ways, such as the following:

  • It can be void or return a value.
  • It can have the async modifier.
  • It can have the unsafe modifier.
  • It can be implemented via an iterator block.
  • It can have parameters, including optional ones.
  • It can be generic.
  • It can refer to any enclosing type parameters.
  • It can be the target of a method group conversion to a delegate type.
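Several of these capabilities can be combined in one small sketch. Everything named here (LocalMethodCapabilities, Demo, Repeat, Describe, Echo) is hypothetical and chosen only for illustration; it isn’t taken from the book’s sample code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LocalMethodCapabilities
{
    // Hypothetical demo method; T is the enclosing type parameter.
    public static List<string> Demo<T>(T seed)
    {
        // Method group conversion: the local method becomes a delegate.
        Func<T, string> describe = Describe;

        // The optional count parameter is omitted, so it defaults to 2.
        return Repeat(Echo(seed)).Select(describe).ToList();

        // Generic local method with its own type parameter.
        TItem Echo<TItem>(TItem item) => item;

        // Iterator block with an optional parameter, using the enclosing T.
        IEnumerable<T> Repeat(T value, int count = 2)
        {
            for (int i = 0; i < count; i++)
            {
                yield return value;
            }
        }

        // Expression-bodied local method that returns a value.
        string Describe(T value) => $"<{value}>";
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Demo(42)));   // <42>, <42>
    }
}
```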

As shown in listing 14.1, it’s fine to declare the method after it’s used. Local methods can call themselves or other local methods that are in scope. Positioning can still be important, though, largely in terms of how the local methods refer to captured variables: local variables declared in the enclosing code but used in the local method.
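As a minimal sketch of local methods calling themselves and each other, consider the following. The class and method names are hypothetical, and the deliberately slow even/odd check exists only to show mutual recursion.

```csharp
using System;

public static class RecursiveLocalMethods
{
    public static long Factorial(int n)
    {
        return Compute(n);

        // A local method that calls itself.
        long Compute(int value) => value <= 1 ? 1 : value * Compute(value - 1);
    }

    public static bool IsEvenSlowly(int n)
    {
        // IsEven is used here before its declaration appears.
        return IsEven(Math.Abs(n));

        // Two local methods in the same scope that call each other.
        bool IsEven(int value) => value == 0 || IsOdd(value - 1);
        bool IsOdd(int value) => value != 0 && IsEven(value - 1);
    }

    public static void Main()
    {
        Console.WriteLine(Factorial(5));      // 120
        Console.WriteLine(IsEvenSlowly(7));   // False
    }
}
```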

Indeed, much of the complexity around local methods, both in language rules and implementation, revolves around the ability for them to read and write captured variables. Let’s start off by talking about the rules that the language imposes.

14.1.1. Variable access within local methods

You’ve already seen that local variables in the enclosing block can be read and written, but there’s more nuance to it than that. There are a lot of small rules here, but you don’t need to worry about learning them exhaustively. Most of the time you won’t even notice them, and you can refer back to this section if the compiler complains about code that you expect to be valid.

A local method can capture only variables that are in scope

You can’t refer to a local variable outside its scope, which is, broadly speaking, the block in which it’s declared. For example, suppose you want your local method to use an iteration variable declared in a loop; the local method itself has to be declared in the loop, too. As a trivial example, this isn’t valid:

static void Invalid()
{
    for (int i = 0; i < 10; i++)
    {
        PrintI();
    }

    void PrintI() => Console.WriteLine(i);      1
}

  • 1 Unable to access i; it’s not in scope.

But with the local method inside the loop, it’s valid:[1]

1

It may be a little strange to read, but it’s valid.

static void Valid()
{
    for (int i = 0; i < 10; i++)
    {
        PrintI();

        void PrintI() => Console.WriteLine(i);      1
    }
}

  • 1 Local method declared within loop; i is in scope.
A local method must be declared after the declaration of any variables it captures

Just as you can’t use a variable earlier than its declaration in regular code, you can’t use a captured variable in a local method until after its declaration, either. This rule is more for consistency than out of necessity; it would’ve been feasible to specify the language to require that any calls to the method occur after the variable’s declaration, for example, but it’s simpler to require all access to occur after declaration. Here’s another trivial example of invalid code:

static void Invalid()
{
    void PrintI() => Console.WriteLine(i);        1
    int i = 10;
    PrintI();
}

  • 1 CS0841: Can’t use local variable i before it’s declared

Just moving the local method declaration to after the variable declaration (whether before or after the PrintI() call) fixes the error.
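For comparison, here is a sketch of a version that compiles. The names are hypothetical, and the local method returns the value rather than printing it, purely so that the result is easy to check.

```csharp
using System;

public static class DeclarationOrder
{
    public static int Valid()
    {
        int i = 10;          // The variable is declared first...
        int ReadI() => i;    // ...so the local method is allowed to capture it.
        return ReadI();      // ReadI could equally be declared after this call.
    }

    public static void Main()
    {
        Console.WriteLine(Valid());   // 10
    }
}
```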

A local method can’t capture ref parameters of the enclosing method

Just like anonymous functions, local methods aren’t permitted to use reference parameters of their enclosing method. For example, this is invalid:

static void Invalid(ref int p)
{
    PrintAndIncrementP();
    void PrintAndIncrementP() =>
        Console.WriteLine(p++);      1
}

  • 1 Invalid access to reference parameter

The reason for this prohibition for anonymous functions is that the created delegate might outlive the variable being captured. In most cases, this reason wouldn’t apply to local methods, but as you’ll see later, it’s possible for local methods to have the same kind of issue. In most cases, you can work around this by declaring an extra parameter in the local method and passing the reference parameter by reference again:

static void Valid(ref int p)
{
    PrintAndIncrement(ref p);
    void PrintAndIncrement(ref int x) => Console.WriteLine(x++);
}

If you don’t need to modify the parameter within the local method, you can make it a value parameter instead.

As a corollary of this restriction (again, mirroring a restriction for anonymous functions), local methods declared within structs can’t access this. Imagine that this is an implicit extra parameter at the start of every instance method’s parameter list. For class methods, it’s a value parameter; for struct methods, it’s a reference parameter. Therefore, you can capture this in local methods in classes but not in structs. The same workaround applies as for other reference parameters.

Note

I’ve provided an example in the source code accompanying the book in LocalMethodUsingThisInStruct.cs.
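The following is a minimal sketch of the workaround applied to a struct. It is not the book’s LocalMethodUsingThisInStruct.cs; the Counter type and its members are hypothetical. The local method takes the struct as an explicit ref parameter instead of capturing this.

```csharp
using System;

public struct Counter
{
    private int count;

    public int IncrementTwice()
    {
        // The local method can't use this (or count) directly in a struct,
        // so the struct is passed by reference explicitly.
        Increment(ref this);
        Increment(ref this);
        return count;

        // Works on its parameter only; nothing is captured.
        void Increment(ref Counter counter) => counter.count++;
    }
}

public static class CounterDemo
{
    public static void Main()
    {
        var counter = new Counter();
        Console.WriteLine(counter.IncrementTwice());   // 2
    }
}
```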

Local methods interact with definite assignment

The rules of definite assignment in C# are complicated, and local methods complicate them further. The simplest way to think about it is as if the method were inlined at any point where it’s called. That impacts assignment in two ways.

First, if a method that reads a captured variable is called before it’s definitely assigned, that causes a compile-time error. Here’s an example that tries to print the value of a captured variable in two places: once before it’s been assigned a value and once afterward:

static void AttemptToReadNotDefinitelyAssignedVariable()
{
    int i;
    void PrintI() => Console.WriteLine(i);
    PrintI();                                 1
    i = 10;
    PrintI();                                 2
}

  • 1 CS0165: Use of unassigned local variable ‘i’
  • 2 No error: i is definitely assigned here.

Notice that it’s the location of the call to PrintI that causes the error here; the location of the method declaration itself is fine. If you move the assignment to i before any calls to PrintI(), that’s fine, even if it’s still after the declaration of PrintI().
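A sketch of that valid ordering follows, with hypothetical names and a returned value instead of console output so that the behavior is easy to verify.

```csharp
using System;

public static class DefiniteAssignmentOrder
{
    public static int AssignBeforeFirstCall()
    {
        int i;
        int ReadI() => i;    // Declaring the method doesn't read i.
        i = 10;              // Assigned after the declaration, before any call.
        return ReadI();      // i is definitely assigned at the call site.
    }

    public static void Main()
    {
        Console.WriteLine(AssignBeforeFirstCall());   // 10
    }
}
```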

Second, if a local method writes to a captured variable in all possible execution flows, the variable will be definitely assigned at the end of any call to that method. Here’s an example that assigns a value within a local method but then reads it within the containing method:

static void DefinitelyAssignInMethod()
{
    int i;
    AssignI();                   1
    Console.WriteLine(i);        2
    void AssignI() => i = 10;    3
}

  • 1 Call to the method makes i definitely assigned.
  • 2 So it’s fine to print it out.
  • 3 Method performs the assignment.

There are a couple of final points to make about local methods and variables, but this time the variables under discussion are not captured variables but fields.

Local methods can’t assign read-only fields

Read-only fields can be assigned values only in field initializers or constructors. That rule doesn’t change with local methods, but it’s made a little stricter: even if a local method is declared within a constructor, it doesn’t count as being inside the constructor in terms of field initialization. This code is invalid:

class Demo
{
    private readonly int value;

    public Demo()
    {
        AssignValue();
        void AssignValue()
        {
            value = 10;      1
        }
    }
}

  • 1 Invalid assignment to read-only field

This restriction isn’t likely to be a significant problem, but it’s worth being aware of. It stems from the fact that the CLR hasn’t had to change in order to support local methods. They’re just a compiler transformation. This leads us to considering exactly how the compiler does implement local methods, particularly with respect to captured variables.
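If you do hit the read-only field restriction, one possible workaround is to have the local method return the value and keep the assignment itself in the constructor body. This is only a sketch; the ReadOnlyDemo class, its Value property, and the length-based computation are hypothetical.

```csharp
using System;

public class ReadOnlyDemo
{
    private readonly int value;

    public int Value => value;

    public ReadOnlyDemo(string text)
    {
        // The assignment happens directly in the constructor, which is allowed.
        value = ComputeValue();

        // The local method only computes the value; it never touches the field.
        int ComputeValue() => text?.Length ?? 0;
    }

    public static void Main()
    {
        Console.WriteLine(new ReadOnlyDemo("hello").Value);   // 5
    }
}
```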

14.1.2. Local method implementations

Local methods don’t exist at the CLR level.[2] The C# compiler converts local methods into regular methods by performing whatever transformations are required to make the final code behave according to the language rules. This section provides examples of the transformations implemented by Roslyn (the Microsoft C# compiler) and focuses on how captured variables are treated, as that’s the most complex aspect of the transformation.

2

If a C# compiler were to target an environment where local methods did exist, all of the information in this section would probably be irrelevant for that compiler.

Implementation details: Nothing guaranteed here

This section really is about how the C# 7.0 version of Roslyn implements local methods. This implementation could change in future versions of Roslyn, and other C# compilers may use a different implementation. It also means there’s quite a lot of detail here that you may not be interested in.

The implementation does have performance implications that may affect how comfortable you are with using local methods in performance-sensitive code. But as with all performance matters, you should be basing your decisions more on careful measurement than on theory.

Local methods feel like anonymous functions in the way they can capture local variables from their surrounding code. But significant differences in the implementation can make local methods rather more efficient in many cases. At the root of this difference is the lifetime of the local variables involved. If an anonymous function is converted into a delegate instance, that delegate could be invoked long after the method has returned, so the compiler has to perform tricks, hoisting the captured variables into a class and making the delegate refer to a method in that class.

Compare that with local methods: in most cases, the local method can be invoked only during the call of the enclosing method; you don’t need to worry about it referring to captured variables after that call has completed. That allows for a more efficient, stack-based implementation with no heap allocations. Let’s start reasonably simply with a local method that increments a captured variable by an amount specified as an argument to the local method.

Listing 14.2. Local method modifying a local variable
static void Main()
{
    int i = 0;
    AddToI(5);
    AddToI(10);
    Console.WriteLine(i);
    void AddToI(int amount) => i += amount;
}

What does Roslyn do with this method? It creates a private mutable struct with public fields to represent all the local variables in the same scope that are captured by any local method. In this case, that’s just the i variable. It creates a local variable within the Main method of that struct type and passes the variable by reference to the regular method created from AddToI along with the declared amount parameter, of course. You end up with something like the following listing.

Listing 14.3. What Roslyn does with listing 14.2
private struct MainLocals                               1
{
    public int i;
}

static void Main()
{
    MainLocals locals = new MainLocals();               2
    locals.i = 0;                                       2
    AddToI(5, ref locals);                              3
    AddToI(10, ref locals);                             3
    Console.WriteLine(locals.i);
}

static void AddToI(int amount, ref MainLocals locals)   4
{                                                       4
    locals.i += amount;                                 4
}                                                       4

  • 1 Generated mutable struct to store the local variables from Main
  • 2 Creates and uses a value of the struct within the method
  • 3 Passes the struct by reference to the generated method
  • 4 Generated method to represent the original local method

As usual, the compiler generates unspeakable names for the method and the struct. Note that in this example, the generated method is static. That’s the case when either the local method is originally contained in a static member or when it’s contained in an instance member but the local method doesn’t capture this (explicitly or implicitly by using instance members within the local method).

The important point about generating this struct is that the transformation is almost free in terms of performance: all the local variables that would’ve been on the stack before are still on the stack; they are just bunched together in a struct so that they can be passed by reference to the generated method. Passing the struct by reference has two benefits:

  • It allows the local method to modify the local variables.
  • However many local variables are captured, calling the local method is cheap. (Compare that with passing them all by value, which would mean creating a second copy of each captured local variable.)

All of this without any garbage being generated on the heap. Hooray! Now let’s make things a little more complex.

Capturing variables in multiple scopes

In an anonymous function, if local variables are captured from multiple scopes, multiple classes are generated with a field in each class representing the inner scope holding a reference to an instance of the class representing the outer scope. That wouldn’t work with the struct approach for local methods that you just saw because of the copying involved. Instead, the compiler generates one struct for each scope containing a captured variable and uses a separate parameter for each scope. The following listing deliberately creates two scopes, so we can see how the compiler handles it.

Listing 14.4. Capturing variables from multiple scopes
static void Main()
{
    DateTime now = DateTime.UtcNow;
    int hour = now.Hour;
    if (hour > 5)
    {
        int minute = now.Minute;
        PrintValues();

        void PrintValues() =>
            Console.WriteLine($"hour = {hour}; minute = {minute}");
    }
}

I used a simple if statement to introduce a new scope rather than a for or foreach loop, because this made the translation simpler to represent reasonably accurately. The following listing shows how the compiler translates the local method into a regular one.

Listing 14.5. What Roslyn does with listing 14.4
struct OuterScope                                 1
{                                                 1
    public int hour;                              1
}                                                 1
struct InnerScope                                 2
{                                                 2
    public int minute;                            2
}                                                 2

static void Main()
{
    DateTime now = DateTime.UtcNow;               3
    OuterScope outer = new OuterScope();          4
    outer.hour = now.Hour;                        4
    if (outer.hour > 5)                           4
    {
        InnerScope inner = new InnerScope();      5
        inner.minute = now.Minute;                5
        PrintValues(ref outer, ref inner);        6
    }
}

static void PrintValues(                          7
    ref OuterScope outer, ref InnerScope inner)   7
{
    Console.WriteLine($"hour = {outer.hour}; minute = {inner.minute}");
}

  • 1 Generated struct for outer scope
  • 2 Generated struct for inner scope
  • 3 Uncaptured local variable
  • 4 Creates and uses struct for outer scope variable hour
  • 5 Creates and uses struct for inner scope variable minute
  • 6 Passes both structs by reference to generated method
  • 7 Generated method to represent the original local method

In addition to demonstrating how multiple scopes are handled, this listing shows that uncaptured local variables aren’t included in the generated structs.

So far, we’ve looked at cases where the local method can execute only while the containing method is executing, which makes it safe for the local variables to be captured in this efficient way. In my experience, this covers most of the cases where I’ve wanted to use local methods. There are occasional exceptions to that safe situation, though.

Prison break! How local methods can escape their containing code

Local methods behave like regular methods in four ways that can stop the compiler from performing the “keep everything on the stack” optimization we’ve discussed so far:

  • They can be asynchronous, so a call that returns a task almost immediately won’t necessarily have finished executing the logical operation.
  • They can be implemented with iterators, so a call that creates a sequence will need to continue executing the method when the next value in the sequence is requested.
  • They can be called from anonymous functions, which could in turn be called (as delegates) long after the original method has finished.
  • They can be the targets of method group conversions, again creating delegates that can outlive the original method call.

The following listing shows a simple example of the last bullet point. A local Count method captures a local variable in its enclosing CreateCounter method. The Count method is used to create an Action delegate, which is then invoked after the CreateCounter method has returned.

Listing 14.6. Method group conversion of a local method
static void Main()
{
    Action counter = CreateCounter();     
    counter();                                    1
    counter();                                    1
}

static Action CreateCounter()
{
    int count = 0;                                2
    return Count;                                 3
    void Count() => Console.WriteLine(count++);   4
}

  • 1 Invokes the delegate after CreateCounter has finished
  • 2 Local variable captured by Count
  • 3 Method group conversion of Count to an Action delegate
  • 4 Local method

You can’t use a struct on the stack for count anymore. The stack for CreateCounter won’t exist by the time the delegate is invoked. But this feels very much like an anonymous function now; you could’ve implemented CreateCounter by using a lambda expression instead:

static Action CreateCounter()
{
    int count = 0;
    return () => Console.WriteLine(count++);     1
}

  • 1 Alternative implementation using a lambda expression

That gives you a clue as to how the compiler can implement the local method: it can apply a similar transformation for the local method as it would for the lambda expression, as shown in the following listing.

Listing 14.7. What Roslyn does with listing 14.6
static void Main()
{
    Action counter = CreateCounter();
    counter();
    counter();
}

static Action CreateCounter()
{
    CountHolder holder = new CountHolder();              1
    holder.count = 0;                                    1
    return holder.Count;                                 2
}

private class CountHolder                                3
{
    public int count;                                    4

    public void Count() => Console.WriteLine(count++);   5
}

  • 1 Creates and initializes object holding captured variables
  • 2 Method group conversion of instance method from holder
  • 3 Private class with captured variables and local method
  • 4 Captured variable
  • 5 Local method is now an instance method in generated class

The same kind of transformation is performed if the local method is used within an anonymous function, if it’s an async method, or if it’s an iterator (with yield statements). The performance-minded may wish to be aware that async methods and iterators can end up generating multiple objects; if you’re working hard to prevent allocations and you’re using local methods, you may wish to pass parameters explicitly to those local methods instead of capturing local variables. An example of this is shown in the next section.
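To make the iterator case concrete, here is a small sketch with hypothetical names. The local iterator captures a local variable, and its body runs only when the caller iterates, which is after CountFrom has returned. The sketch shows the observable behavior only; it makes no claim about the exact code Roslyn generates.

```csharp
using System;
using System.Collections.Generic;

public static class EscapingIterator
{
    public static IEnumerable<int> CountFrom(int start)
    {
        int next = start;      // Captured by the local iterator below.
        return Generate();     // Returns before any of Generate's body runs.

        IEnumerable<int> Generate()
        {
            while (true)
            {
                yield return next++;
            }
        }
    }

    public static void Main()
    {
        foreach (int value in CountFrom(5))
        {
            Console.WriteLine(value);   // 5, then 6, then 7
            if (value == 7)
            {
                break;
            }
        }
    }
}
```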

Of course, the set of possible scenarios is pretty huge; one local method could use a method conversion for another local method, or you could use a local method within an async method, and so on. I’m certainly not going to try to cover every possible case here. This section is intended to give you a good idea of the two kinds of transformation the compiler can use when dealing with captured variables. To see what it’s doing with your code, use a decompiler or ildasm, remembering to disable any “optimizations” the decompiler might do for you. (Otherwise, it could easily just show the local method, which doesn’t help you at all.) Now that you’ve seen what you can do with local methods and how the compiler handles them, let’s consider when it’s appropriate to use them.

14.1.3. Usage guidelines

There are two primary patterns to spot where local methods might be applicable:

  • You have the same logic repeated multiple times in a method.
  • You have a private method that’s used from only one other method.

The second case is a special case of the first in which you’ve taken the time to refactor the common code already. But the first case can occur when there’s enough local state to make that refactoring ugly. Local methods can make the extraction significantly more appealing because of the ability to capture local variables.

When refactoring an existing method to become a local method, I advise consciously taking a two-stage approach. First, move the single-use method into the code that uses it without changing its signature.[3] Second, look at the method parameters: are all the calls to the method using the same local variables as arguments? If so, those are good candidates for using captured variables instead, removing the parameter from the local method. Sometimes you may even be able to remove the parameters entirely.

3

Sometimes this requires changes to the type parameters in the signature. Often if you have one generic method calling another, when you move the second method into the first, it can just use the type parameters of the first. Listing 14.9 demonstrates this.

Depending on the number and size of the parameters, this second step could even have a performance impact. If you were previously passing large value types by value, those were being copied on each call. Using captured variables instead can eliminate that copy, which could be significant if the method is being called a lot.
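The two stages can be sketched as follows. ReportBuilder and its members are hypothetical and invented purely to show the shape of the refactoring; the two methods are meant to behave identically.

```csharp
using System;
using System.Collections.Generic;

public static class ReportBuilder
{
    // Stage 1: the former private method is moved inside, signature unchanged.
    public static List<string> BuildStage1(string prefix, IEnumerable<string> items)
    {
        var lines = new List<string>();
        foreach (string item in items)
        {
            AddLine(lines, prefix, item);
        }
        return lines;

        void AddLine(List<string> target, string linePrefix, string text) =>
            target.Add($"{linePrefix}: {text}");
    }

    // Stage 2: every call passed the same lines and prefix, so capture them.
    public static List<string> BuildStage2(string prefix, IEnumerable<string> items)
    {
        var lines = new List<string>();
        foreach (string item in items)
        {
            AddLine(item);
        }
        return lines;

        void AddLine(string text) => lines.Add($"{prefix}: {text}");
    }

    public static void Main()
    {
        var result = BuildStage2("item", new[] { "a", "b" });
        Console.WriteLine(string.Join("; ", result));   // item: a; item: b
    }
}
```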

The important point about local methods is that it becomes clear that they’re an implementation detail of a method rather than of a type. If you have a private method that makes sense as an operation in its own right but happens to be used in only one place at the moment, you may be better off leaving it where it is. The payoff—in terms of logical type structure—is much bigger when a private method is tightly bound to a single operation and you can’t easily imagine any other circumstances where you’d use it.

Iterator/async argument validation and local method optimization

One common example of this is when you have iterator or async methods and want to eagerly perform argument validation. For example, listing 14.8 provides a sample implementation of one overload of Select in LINQ to Objects. The argument validation isn’t in an iterator block, so it’s performed as soon as the method is called, whereas the foreach loop doesn’t execute at all until the caller starts iterating over the returned sequence.

Listing 14.8. Implementing Select without local methods
public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TResult> selector)
{
    Preconditions.CheckNotNull(source, nameof(source));     1
    Preconditions.CheckNotNull(                             1
        selector, nameof(selector));                        1
    return SelectImpl(source, selector);                    2
}

private static IEnumerable<TResult> SelectImpl<TSource, TResult>(
    IEnumerable<TSource> source,
    Func<TSource, TResult> selector)
{
    foreach (TSource item in source)                        3
    {                                                       3
        yield return selector(item);                        3
    }                                                       3
}

  • 1 Eagerly checks arguments
  • 2 Delegates to the implementation
  • 3 Implementation executes lazily

Now, with local methods available, you can move the implementation into the Select method, as shown in the following listing.

Listing 14.9. Implementing Select with a local method
public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TResult> selector)
{
    Preconditions.CheckNotNull(source, nameof(source));
    Preconditions.CheckNotNull(selector, nameof(selector));
    return SelectImpl(source, selector);

    IEnumerable<TResult> SelectImpl(
        IEnumerable<TSource> validatedSource,
        Func<TSource, TResult> validatedSelector)
    {
        foreach (TSource item in validatedSource)
        {
            yield return validatedSelector(item);
        }
    }
}

One interesting aspect of the implementation is worth pointing out: you still pass the (now-validated) parameters into the local method. This isn’t required; you could make the local method parameterless and just use the captured source and selector variables, but it’s a performance tweak: it reduces the number of allocations required. Is this performance difference important? Would the version using variable capture be significantly more readable? Answers to both questions depend on the context and are likely to be somewhat subjective.
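The same shape applies to async methods. In the sketch below, TextLoader and CountCharactersAsync are hypothetical names, and the validation is written inline rather than through the Preconditions helper used in the listings. The outer method isn’t async, so the null check throws immediately instead of producing a faulted task.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class TextLoader
{
    public static Task<int> CountCharactersAsync(TextReader reader)
    {
        // Not an async method, so this check runs before any task is created.
        if (reader == null)
        {
            throw new ArgumentNullException(nameof(reader));
        }
        return CountImpl(reader);

        // The validated argument is passed explicitly rather than captured.
        async Task<int> CountImpl(TextReader validatedReader)
        {
            string text = await validatedReader.ReadToEndAsync();
            return text.Length;
        }
    }

    public static void Main()
    {
        Task<int> task = CountCharactersAsync(new StringReader("hello"));
        Console.WriteLine(task.GetAwaiter().GetResult());   // 5
    }
}
```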

Readability suggestions

Local methods are still new enough to me that I’m slightly wary of them. I’m erring on the side of leaving code as it is rather than refactoring toward local methods at the moment. In particular, I’m avoiding using the following two features:

  • Even though you can declare a local method within the scope of a loop or other block, I find that odd to read. I prefer to use local methods only when I can declare them right at the bottom of the enclosing method. I can’t capture any variables declared within loops, but I’m okay with that.
  • You can declare local methods within other local methods, but that feels like a rabbit hole I’d rather not go down.

Your tastes may vary, of course, but as always, I caution against using a new feature just because you can. (Experiment with it for the sake of experimentation, certainly, but don’t let the new shiny things lure you into sacrificing readability.)

Time for some good news: the first feature of this chapter was the big one. The remaining features are much simpler.

14.2. Out variables

Before C# 7, out parameters were slightly painful to work with. An out parameter required a variable to already be declared before you could use it as an argument for the parameter. Because declarations are separate statements, this meant that in some places where you wanted a single expression—initializing a variable, for example—you had to reorganize your code to have multiple statements.

14.2.1. Inline variable declarations for out parameters

C# 7 removes this pain point by allowing new variables to be declared within the method invocation itself. As a trivial example, consider a method that takes textual input, attempts to parse it as an integer using int.TryParse, and then returns either the parsed value as a nullable integer (if it parsed successfully) or null (if it didn’t). In C# 6, this would have to be implemented using at least two statements: one to declare the variable and a second to call int.TryParse passing the newly declared variable for the out parameter:

static int? ParseInt32(string text)
{
    int value;
    return int.TryParse(text, out value) ? value : (int?) null;
}

In C# 7, the value variable can be declared within the method call itself, which means you can implement the method with an expression body:

static int? ParseInt32(string text) =>
    int.TryParse(text, out int value) ? value : (int?) null;

In several ways, out variable arguments behave similarly to variables introduced by pattern matches:

  • If you don’t care about the value, you can use a single underscore as the name to create a discard.
  • You can use var to declare an implicitly typed variable (the type is inferred from the type of the parameter).
  • You can’t use an out variable argument in an expression tree.
  • The scope of the variable is the surrounding block.
  • You can’t use out variables in field, property, or constructor initializers or in query expressions before C# 7.3. You’ll look at an example of this shortly.
  • The variable will be definitely assigned if (and only if) the method is definitely invoked.

To demonstrate the last point, consider the following code, which tries to parse two strings and sum the results:

static int? ParseAndSum(string text1, string text2) =>
    int.TryParse(text1, out int value1) &&
    int.TryParse(text2, out int value2)
    ? value1 + value2 : (int?) null;

In the third operand of the conditional operator, value1 is definitely assigned (so you could return that if you like), but value2 isn’t definitely assigned; if the first call to int.TryParse returned false, you wouldn’t call int.TryParse the second time because of the short-circuiting nature of the && operator.
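Three of the other points from the list earlier in this section (discards, var, and scope) can be sketched briefly. The class and method names below are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class OutVariableDemo
{
    // Discard: only the success flag matters here.
    public static bool IsInt32(string text) => int.TryParse(text, out _);

    // var: the type (int) is inferred from TryGetValue's out parameter.
    public static int GetOrDefault(Dictionary<string, int> map, string key) =>
        map.TryGetValue(key, out var value) ? value : -1;

    // Scope: value is still usable after the if statement that declared it.
    public static int ParseOrThrow(string text)
    {
        if (!int.TryParse(text, out int value))
        {
            throw new FormatException($"Not an integer: {text}");
        }
        return value;
    }

    public static void Main()
    {
        var map = new Dictionary<string, int> { ["a"] = 1 };
        Console.WriteLine(IsInt32("x"));              // False
        Console.WriteLine(GetOrDefault(map, "a"));    // 1
        Console.WriteLine(ParseOrThrow("42"));        // 42
    }
}
```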

14.2.2. Restrictions lifted in C# 7.3 for out variables and pattern variables

As I mentioned in section 12.5, pattern variables can’t be used when initializing fields or properties, in constructor initializers (this(...) and base(...)), or in query expressions. The same restriction applies to out variables until C# 7.3, which lifts all those restrictions. The following listing demonstrates this and shows that the result of the out variable is also available within the constructor body.

Listing 14.10. Using an out variable in a constructor initializer
class ParsedText
{
    public string Text { get; }
    public bool Valid { get; }

    protected ParsedText(string text, bool valid)
    {
        Text = text;
        Valid = valid;
    }
}

class ParsedInt32 : ParsedText
{
    public int? Value { get; }

    public ParsedInt32(string text)
        : base(text, int.TryParse(text, out int parseResult))  
    {
        Value = Valid ? parseResult : (int?) null;             
    }
}

Although the restrictions prior to C# 7.3 never bothered me, it’s nice that they’ve now been removed. In the rare cases that you needed to use patterns or out variables for initializers, the alternatives were relatively annoying and usually involved creating a new method just for this purpose.
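The other contexts opened up by C# 7.3 look much the same. The following is a sketch under the assumption that you're compiling with C# 7.3 or later; OutVariableContexts, Port, and ParseAll are hypothetical names I've chosen for the example. As I understand the scoping rules, an out variable declared in a query clause isn't visible in later clauses, which is why the let clause captures the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OutVariableContexts
{
    // Field initializer: rejected before C# 7.3, allowed from 7.3 onward.
    private static readonly int DefaultPort =
        int.TryParse("8080", out int port) ? port : 80;

    public static int Port => DefaultPort;

    // Query expression: the let clause stores the parsed value (or null)
    // so that the where and select clauses can use it.
    public static List<int> ParseAll(IEnumerable<string> inputs) =>
        (from text in inputs
         let parsed = int.TryParse(text, out int value) ? value : (int?) null
         where parsed.HasValue
         select parsed.Value).ToList();

    public static void Main()
    {
        Console.WriteLine(Port);                                                  // 8080
        Console.WriteLine(string.Join(", ", ParseAll(new[] { "1", "x", "3" })));  // 1, 3
    }
}
```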

That’s about it for out variable arguments. They’re just a useful little shorthand to avoid otherwise-annoying variable declaration statements.

14.3. Improvements to numeric literals

Literals haven’t changed much in the course of C#’s history. No changes at all occurred from C# 1 until C# 6, when interpolated string literals were introduced, but that didn’t change numbers at all. In C# 7, two features are aimed at number literals, both for the sake of improving readability: binary integer literals and underscore separators.

14.3.1. Binary integer literals

Unlike floating-point literals (for float, double, and decimal), integer literals have always had two options for the base of the literal: you could use decimal (no prefix) or hex (a prefix of 0x or 0X).[4] C# 7 extends this to binary literals, which use a prefix of 0b or 0B. This is particularly useful if you’re implementing a protocol with specific bit patterns for certain values. It doesn’t affect the execution-time behavior at all, but it can make the code a lot easier to read. For example, which of these three lines initializes a byte with the top bit and the bottom three bits set and the other bits unset?

[4] The C# designers wisely eschewed the ghastly octal literals that Java inherited from C. What’s the value of 011? Why, 9, “of course.”

byte b1 = 135;
byte b2 = 0x87;
byte b3 = 0b10000111;

They all do. But you can tell that easily in the third line, whereas the other two take slightly longer to check (at least for me). Even that last one takes longer than it might, because you still have to check that you have the right number of bits in total. If only there were a way of clarifying it even more.

14.3.2. Underscore separators

Let’s jump straight into underscore separators by improving the previous example. If you want to specify all the bits of a byte and do so in binary, it’s easier to spot that you have two nibbles than to count all eight bits. Here’s the same code with a fourth line that uses an underscore to separate the nibbles:

byte b1 = 135;
byte b2 = 0x87;
byte b3 = 0b10000111;
byte b4 = 0b1000_0111;

Love it! I can really check that at a glance. Underscore separators aren’t restricted to binary literals, though, or even to integer literals. You can use them in any numeric literal and put them (almost) anywhere within the literal. In decimal literals, you’re most likely to use them every three digits like thousands separators (at least in Western cultures). In hex literals, they’re generally most useful every two, four, or eight digits to separate 8-, 16-, or 32-bit parts within the literal. For example:

int maxInt32 = 2_147_483_647;
decimal largeSalary = 123_456_789.12m;
ulong alternatingBytes = 0xff_00_ff_00_ff_00_ff_00;
ulong alternatingWords = 0xffff_0000_ffff_0000;
ulong alternatingDwords = 0xffffffff_00000000;

This flexibility comes at a price: the compiler doesn’t check that you’re putting the underscores in sensible places. You can even put multiple underscores together. Valid but unfortunate examples include the following:

int wideFifteen = 1____________________5;
ulong notQuiteAlternatingWords = 0xffff_000_ffff_0000;

You also should be aware of a few restrictions:

  • You can’t put an underscore at the start of the literal.
  • You can’t put an underscore at the end of the literal (including just before the suffix).
  • You can’t put an underscore directly before or after the period in a floating-point literal.
  • In C# 7.0 and 7.1, you can’t put an underscore after the base specifier (0x or 0b) of an integer literal.

The final restriction has been lifted in C# 7.2. Although readability is subjective, I definitely prefer to use an underscore after the base specifier when there are underscores elsewhere, as in the following examples:

  • 0b_1000_0111 versus 0b1000_0111
  • 0x_ffff_0000 versus 0xffff_0000
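To check that these spellings really are just different ways of writing the same values, here's a small sketch. The flag names are hypothetical, standing in for whatever a protocol might define, and the code sticks to C# 7.0 syntax, so there's no underscore directly after a base specifier.

```csharp
using System;

public static class NumericLiteralDemo
{
    // Hypothetical protocol flags: the bit pattern matters more than the magnitude.
    public const byte HighBit = 0b1000_0000;
    public const byte LowThreeBits = 0b0000_0111;
    public const byte Combined = HighBit | LowThreeBits;

    public static void Main()
    {
        // The same value written in decimal, hex, and binary.
        Console.WriteLine(Combined == 135);          // True
        Console.WriteLine(Combined == 0x87);         // True
        Console.WriteLine(Combined == 0b1000_0111);  // True

        // Separators are purely lexical; they don't change the value.
        Console.WriteLine(2_147_483_647 == int.MaxValue);  // True
        Console.WriteLine(0xffff_ffff == uint.MaxValue);   // True
    }
}
```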

That’s it! A nice simple feature with very little nuance. The next feature is similarly straightforward and permits a simplification in some cases where you need to throw an exception conditionally.

14.4. Throw expressions

Earlier versions of C# always included the throw statement, but you couldn’t use throw as an expression. Presumably, the reasoning was that you wouldn’t want to, because it would always throw an exception. It turns out that as more language features were added that needed expressions, this classification became increasingly irritating. In C# 7, you can use throw expressions in a limited set of contexts:

  • As the body of a lambda expression
  • As the body of an expression-bodied member
  • As the second operand of the ?? operator
  • As the second or third operand of the conditional ?: operator (but not both in the same expression)

All of these are valid:

public void UnimplementedMethod() =>                     1
    throw new NotImplementedException();                 1

public void TestPredicateNeverCalledOnEmptySequence()
{
    int count = new string[0]
        .Count(x => throw new Exception("Bang!"));       2
    Assert.AreEqual(0, count);
}

public static T CheckNotNull<T>(T value, string paramName) where T : class
    => value ??                                          3
    throw new ArgumentNullException(paramName);          3

public static string Name =>
    initialized                                          4
    ? data["name"]                                       4
    : throw new Exception("...");                        4

  • 1 Expression-bodied method
  • 2 Lambda expression
  • 3 ?? operator (in expression-bodied method)
  • 4 ?: operator (in expression-bodied property)

You can’t use throw expressions everywhere, though; that just wouldn’t make sense. For example, you can’t use them unconditionally in assignments or as method arguments:

int invalid = throw new Exception("This would make no sense");
Console.WriteLine(throw new Exception("Nor would this"));

The C# team has given us flexibility where it’s useful (typically, where it allows you to express the exact same concepts as before, but in a more concise fashion) but prevented us from shooting ourselves in the foot with throw expressions that would be ludicrous in context.
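The place I'd expect throw expressions to show up most often is argument validation. Here's a minimal, self-contained sketch; the Person type and its members are invented for the example rather than taken from any library.

```csharp
using System;

public sealed class Person
{
    public string Name { get; }

    // ?? with a throw expression: validate and assign in a single expression.
    public Person(string name) =>
        Name = name ?? throw new ArgumentNullException(nameof(name));

    // ?: with a throw expression in an expression-bodied method.
    public char Initial() =>
        Name.Length > 0
        ? Name[0]
        : throw new InvalidOperationException("Name is empty");
}

public static class ThrowExpressionDemo
{
    public static void Main()
    {
        Console.WriteLine(new Person("Jon").Initial());  // J
        try
        {
            new Person(null);
        }
        catch (ArgumentNullException e)
        {
            // Prints: Rejected null for parameter 'name'
            Console.WriteLine($"Rejected null for parameter '{e.ParamName}'");
        }
    }
}
```

Before C# 7, the constructor would have needed an if statement followed by a separate assignment.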

Our next feature continues the theme of allowing us to express the same logic but with less fluff by simplifying the default operator with default literals.

14.5. Default literals (C# 7.1)

The default(T) operator was introduced in C# 2.0 primarily for use with generic types. For example, to retrieve a value from a list if the index was in bounds or the default for the type instead, you could write a method like this:

static T GetValueOrDefault<T>(IList<T> list, int index)
{
    return index >= 0 && index < list.Count ? list[index] : default(T);
}

The result of the default operator is the same default value for a type that you observe when you leave a field uninitialized: a null reference for reference types, an appropriately typed zero for all numeric types, U+0000 for char, false for bool, and a value with all fields set to the corresponding default for other value types.

When C# 4 introduced optional parameters, one way of specifying the default value for a parameter was to use the default operator. This can be unwieldy if the type name is long, because you end up with the type name in both the parameter type and its default value. One of the worst offenders for this is CancellationToken, particularly because the conventional name for a parameter of that type is cancellationToken. A common async method signature might be something like this:

public async Task<string> FetchValueAsync(
    string key,
    CancellationToken cancellationToken = default(CancellationToken))

The second parameter declaration is so long it needs a whole line to itself for book formatting; it’s 64 characters.

In C# 7.1, in certain contexts, you can use default instead of default(T) and let the compiler figure out which type you intended. Although there are definitely benefits beyond the preceding example, I suspect it was one of the main motivating factors. The preceding example can become this:

public async Task<string> FetchValueAsync(
    string key, CancellationToken cancellationToken = default)

That’s much cleaner. Without the type after it, default is a literal rather than an operator, and it works similarly to the null literal, except that it works for all types. The literal itself has no type, just like the null literal has no type, but it can be converted to any type. That type might be inferred from elsewhere, such as an implicitly typed array:

var intArray = new[] { default, 5 };
var stringArray = new[] { default, "text" };

That code snippet doesn’t list any type names explicitly, but intArray is implicitly an int[] (with the default literal being converted to 0), and stringArray is implicitly a string[] (with the default literal being converted to a null reference). Just like the null literal, there does have to be some type involved to convert it to; you can’t just ask the compiler to infer a type with no information:

var invalid = default;
var alsoInvalid = new[] { default };

The default literal is classified as a constant expression if the type it’s converted to is a reference type or a primitive type. This allows you to use it in attributes if you want to.
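A few more places where the compiler has enough context to pick a type are sketched next. All the names here (DefaultLiteralDemo, FirstOrFallback, Describe) are made up for illustration, and the code assumes C# 7.1 or later.

```csharp
using System;
using System.Collections.Generic;

public static class DefaultLiteralDemo
{
    // Constant expression: the target type (int) is a primitive type.
    public const int Zero = default;

    // Conditional operator: the type (T) comes from the other operand.
    public static T FirstOrFallback<T>(IList<T> list) =>
        list.Count > 0 ? list[0] : default;

    // Optional parameter and equality comparison: both are typed as TimeSpan.
    public static string Describe(TimeSpan timeout = default) =>
        timeout == default ? "no timeout" : $"{timeout.TotalSeconds} seconds";

    public static void Main()
    {
        Console.WriteLine(Zero);                                         // 0
        Console.WriteLine(FirstOrFallback(new List<int> { 7 }));         // 7
        Console.WriteLine(FirstOrFallback(new List<string>()) == null);  // True
        Console.WriteLine(Describe());                                   // no timeout
    }
}
```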

One quirk to be aware of is that the term default has multiple meanings. It can mean the default value of a type or the default value of an optional parameter. The default literal always refers to the default value of the appropriate type. That could lead to some confusion if you use it as an argument for an optional parameter that has a different default value. Consider the following listing.

Listing 14.11. Specifying a default literal as a method argument
static void PrintValue(int value = 10)   1
{
    Console.WriteLine(value);
}

static void Main()
{
    PrintValue(default);                 2
}

  • 1 Parameter’s default value is 10.
  • 2 Method argument is default for int.

This prints 0, because that’s the default value for int. The language is entirely consistent, but this code could cause confusion because of the different possible meanings of default. I’d try to avoid using the default literal in situations like this.

14.6. Nontrailing named arguments (C# 7.2)

Optional parameters and named arguments were introduced as complementary features in C# 4, and both had ordering requirements: optional parameters had to come after all required parameters (other than parameter arrays), and named arguments had to come after all positional arguments. Optional parameters haven’t changed, but the C# team has noticed that often named arguments can be useful as tools for increasing clarity, even for arguments in the middle of an argument list. This is particularly true when the argument is a literal (typically, a number, Boolean literal, or null) where the context doesn’t clarify the purpose of the value.

As an example, I’ve been writing samples for the BigQuery client library recently. When you upload a CSV file to BigQuery, you can specify a schema, let the server determine the schema, or fetch it from the table if that already exists. When writing the samples for the autodetection, I wanted to make it clear that you can pass a null reference for the schema parameter. Written in the simplest—but not clearest—form, it’s not at all obvious what the null argument means:

client.UploadCsv(table, null, csvData, options);

Before C# 7.2, my options for making this clearer were to either use named arguments for the last three parameters, which ended up looking a little awkward, or use an explanatory local variable:

TableSchema schema = null;
client.UploadCsv(table, schema, csvData, options);

That’s clearer, but it’s still not great. C# 7.2 allows named arguments anywhere in the argument list, so I can make it clear what the second argument means without any extra statements:

client.UploadCsv(table, schema: null, csvData, options);

This can also help differentiate between overloads in some cases in which the argument (typically null) could be converted to the same parameter position in multiple overloads.

The rules for nontrailing named arguments have been designed carefully to prevent any subsequent positional arguments from becoming ambiguous: if there are any unnamed arguments after a named one, the named argument has to correspond to the same parameter as it would if it were a simple positional argument. For example, consider this method declaration and three calls to it:

void M(int x, int y, int z){}

M(5, z: 15, y: 10);         1
M(5, y: 10, 15);            2
M(y: 10, 5, 15);            3

  • 1 Valid: trailing named arguments out of order
  • 2 Valid: nontrailing named argument in order
  • 3 Invalid: nontrailing named argument out of order

The first call is valid because it consists of one positional argument followed by two named arguments; it’s obvious that the positional argument corresponds to the parameter x, and the other two are named. No ambiguity.

The second call is valid because although there’s a named argument with a later positional argument, the named argument corresponds to the same parameter as it would if it were positional (y). Again, it’s clear what value each parameter should take.

The third call is invalid: the first argument is named but corresponds to the second parameter (y). Should the second argument correspond to the first parameter (x) on the grounds that it’s the first non-named argument? Although the rules could work this way, it all becomes a bit confusing; it’s even worse when optional parameters get involved. It’s simpler to prohibit it, so that’s what the language team decided to do. Next is a feature that has been in the CLR forever but was exposed only in C# 7.2.
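If you want to try the valid forms yourself, here's a runnable sketch that assumes C# 7.2 or later. The Upload method is a hypothetical stand-in I've written for the example; it isn't the BigQuery client API.

```csharp
using System;

public static class NamedArgumentDemo
{
    // Hypothetical stand-in for an upload method; schema may be null.
    public static string Upload(string table, string schema, string data)
    {
        string schemaText = schema ?? "autodetect";
        return $"{table}: schema={schemaText}, data={data}";
    }

    public static void Main()
    {
        // C# 7.2: schema is named but still in its positional slot,
        // so the positional argument after it remains unambiguous.
        // Prints: people: schema=autodetect, data=a,b,c
        Console.WriteLine(Upload("people", schema: null, "a,b,c"));

        // Trailing named arguments may appear in any order (valid since C# 4).
        // Prints: people: schema=name:string, data=a,b,c
        Console.WriteLine(Upload("people", data: "a,b,c", schema: "name:string"));
    }
}
```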

14.7. Private protected access (C# 7.2)

A few years ago, private protected was going to be part of C# 6 (and perhaps the team planned to introduce it even earlier than that). The problem was coming up with a name. By the time the team had reached C# 7.2, they decided they weren’t going to find a better name than private protected. This combination of access modifiers is more restrictive than either protected or internal. You have access to a private protected member only from code that’s in the same assembly and is within a subclass of the member declaration (or is in the same type).

Compare this with protected internal, which is less restrictive than either protected or internal. You have access to a protected internal member from code that’s in the same assembly or is within a subclass of the member declaration (or is in the same type).

That’s all there is to say about it; it doesn’t even merit an example. It’s nice to have from a completeness perspective, as it was odd for there to be an access level that could be expressed in the CLR but not in C#. I’ve used it only once so far in my own code, and I don’t expect it to be something I find much more useful in the future. We’ll finish this chapter with a few odds and ends that don’t fit in neatly anywhere else.

14.8. Minor improvements in C# 7.3

As you’ve already seen in this chapter and earlier in the book, the C# design team didn’t stop work on C# 7 after releasing C# 7.0. Small tweaks were made, mostly to enhance the features released in C# 7.0. Where possible, I’ve included those details along with the general feature description. A few of the features in C# 7.3 don’t fit in that way, and they don’t really fit in with this chapter’s theme of concise code, either. But it wouldn’t feel right to leave them out.

14.8.1. Generic type constraints

When I briefly described type constraints in section 2.1.5, I left out a few restrictions. Prior to C# 7.3, a type constraint couldn’t specify that the type argument must derive from Enum or Delegate. This restriction has been lifted, and a new kind of constraint has been added: a constraint of unmanaged. The following listing gives examples of how these constraints are specified and used.

Listing 14.12. New constraints in C# 7.3
enum SampleEnum {}
static void EnumMethod<T>() where T : struct, Enum {}
static void DelegateMethod<T>() where T : Delegate {}
static void UnmanagedMethod<T>() where T : unmanaged {}
...
EnumMethod<SampleEnum>();                 1
EnumMethod<Enum>();                       2

DelegateMethod<Action>();                 3
DelegateMethod<Delegate>();               3
DelegateMethod<MulticastDelegate>();      3

UnmanagedMethod<int>();                   4
UnmanagedMethod<string>();                5

  • 1 Valid: enum value type
  • 2 Invalid: doesn’t meet struct constraint
  • 3 All valid (unfortunately)
  • 4 Valid: System.Int32 is an unmanaged type.
  • 5 Invalid: System.String is a managed type.

I’ve shown a constraint of where T : struct, Enum for the enum constraint, because that’s how you almost always want to use it. That constrains T to be a real enum type: a value type derived from Enum. The struct constraint excludes the Enum type itself. If you’re trying to write a method that works with any enum type, you usually wouldn’t want to handle Enum, which isn’t really an enum type in itself. Unfortunately, it’s far too late to add these constraints onto the various enum parsing methods in the framework.

The delegate constraint doesn’t have an equivalent, unfortunately. There’s no way of expressing a constraint of “only the types declared with a delegate declaration.” You could use a constraint of where T : MulticastDelegate instead, but then you’d still be able to use MulticastDelegate itself as a type argument.

The final constraint is for unmanaged types. I’ve mentioned these in passing before, but an unmanaged type is a non-nullable, nongeneric value type whose fields aren’t reference types, recursively. Most of the value types in the framework (Int32, Double, Decimal, Guid) are unmanaged types. As an example of a value type that isn’t, a ZonedDateTime in Noda Time wouldn’t be an unmanaged type because it contains a reference to a DateTimeZone instance.
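To give a feel for what the new constraints buy you inside a method body, here's a sketch that assumes C# 7.3 or later. Color, Point, and the three helper methods are hypothetical; Enum.Parse and Delegate.Combine are the real framework methods.

```csharp
using System;

public enum Color { Red, Green, Blue }

// Only unmanaged fields, so Point is itself an unmanaged type.
public struct Point
{
    public int X;
    public int Y;
}

public static class ConstraintDemo
{
    // struct, Enum: T is guaranteed to be a real enum type,
    // so Enum.Parse can't be handed a non-enum typeof(T).
    public static T ParseEnum<T>(string text) where T : struct, Enum =>
        (T) Enum.Parse(typeof(T), text);

    // Delegate: Delegate.Combine returns Delegate, which is cast back to T.
    public static T Chain<T>(T first, T second) where T : Delegate =>
        (T) Delegate.Combine(first, second);

    // unmanaged: accepts int, Point, and so on, but not string.
    public static string NameOfUnmanaged<T>() where T : unmanaged =>
        typeof(T).Name;

    public static void Main()
    {
        Console.WriteLine(ParseEnum<Color>("Green"));  // Green

        int calls = 0;
        Action both = Chain<Action>(() => calls++, () => calls++);
        both();
        Console.WriteLine(calls);                      // 2

        Console.WriteLine(NameOfUnmanaged<Point>());   // Point
    }
}
```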

14.8.2. Overload resolution improvements

The rules around overload resolution have been tweaked over and over again, usually in hard-to-explain ways, but the change in C# 7.3 is welcome and reasonably simple. A few conditions that used to be checked after overload resolution had finished are now checked earlier. Some calls that would have been considered to be ambiguous or invalid in an earlier version of C# are now fine. The checks are as follows:

  • Generic type arguments must meet any constraints on the type parameters.
  • Static methods can’t be called as if they were instance methods.
  • Instance methods can’t be called as if they were static methods.

As an example of the first scenario, consider these overloads:

static void Method<T>(object x) where T : struct =>          1
    Console.WriteLine($"{typeof(T)} is a struct");

static void Method<T>(string x) where T : class =>           2
    Console.WriteLine($"{typeof(T)} is a reference type");
...
Method<int>("text");

  • 1 Method with a struct constraint
  • 2 Method with a class constraint

In previous versions of C#, overload resolution would’ve ignored the type parameter constraints to start with. It would’ve picked the second overload, because string is a more specific regular parameter type than object, and then discovered that the supplied type argument (int) violated the type constraint.

With C# 7.3, the code compiles with no error or ambiguity because the type constraint is checked as part of finding applicable methods. The other checks are similar; the compiler discards methods that would be invalid for the call earlier than it used to. Examples of all three scenarios are in the downloadable source code.
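The second check in the list can be sketched in a similar way. This is my own illustration rather than the downloadable sample, and the names are invented. As I understand the C# 7.3 rule, a call with no receiver in a static context considers only static methods, so the instance overload is discarded even though string is the better parameter type.

```csharp
using System;

public class OverloadDemo
{
    // Static overload with the less specific parameter type.
    public static string Describe(object x) => "static Describe(object)";

    // Instance overload with the more specific parameter type.
    public string Describe(string x) => "instance Describe(string)";

    // Static context with no receiver: C# 7.3 drops the instance overload
    // before choosing the best candidate.
    public static string CallFromStaticContext() => Describe("text");

    public static void Main()
    {
        Console.WriteLine(CallFromStaticContext());  // static Describe(object)
    }
}
```

An earlier compiler would pick the instance overload first and then report an error because no instance is available.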

14.8.3. Attributes for fields backing automatically implemented properties

Suppose you want a trivial property backed by a field, but you need to apply an attribute to the field to enable other infrastructure. Prior to C# 7.3, you’d have to declare the field separately and then write a simple property with boilerplate code. For example, suppose you wanted to apply a DemoAttribute (just an attribute I’ve made up) to a field backing a string property. You’d have needed code like this:

[Demo]
private string name;
public string Name
{
    get { return name; }
    set { name = value; }
}

That’s annoying when automatically implemented properties do almost everything you want. In C# 7.3, you can apply an attribute to the backing field directly on an automatically implemented property:

[field: Demo]
public string Name { get; set; }

This isn’t a new modifier for attributes, but previously it wasn’t available in this context. (At least not officially and not in the Microsoft compiler. The Mono compiler has allowed it for some time.) It’s just another rough edge of the specification where the language wasn’t consistent that has been smoothed out for C# 7.3.
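You can confirm where the attribute ends up with a little reflection. In this sketch, DemoAttribute is the same made-up attribute as before, and Customer and FieldAttributeDemo are further hypothetical names; it assumes C# 7.3 or later. The code searches all private instance fields instead of relying on the compiler-generated name of the backing field, which is an implementation detail.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Made-up attribute, restricted to fields to show where it ends up.
[AttributeUsage(AttributeTargets.Field)]
public sealed class DemoAttribute : Attribute { }

public class Customer
{
    [field: Demo]
    public string Name { get; set; }
}

public static class FieldAttributeDemo
{
    // Counts private instance fields of Customer that carry DemoAttribute.
    public static int CountDemoFields() =>
        typeof(Customer)
            .GetFields(BindingFlags.Instance | BindingFlags.NonPublic)
            .Count(field => field.IsDefined(typeof(DemoAttribute), false));

    // Checks whether the property itself carries the attribute.
    public static bool PropertyHasDemo() =>
        typeof(Customer).GetProperty("Name").IsDefined(typeof(DemoAttribute), false);

    public static void Main()
    {
        Console.WriteLine(CountDemoFields());  // 1
        Console.WriteLine(PropertyHasDemo());  // False
    }
}
```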

Summary

  • Local methods allow you to clearly express that a particular piece of code is an implementation detail of a single operation rather than being of general use within the type itself.
  • out variables are pure ceremony reduction that allow some cases that involved multiple statements (declaring a variable and then using it) to be reduced to a single expression.
  • Binary literals allow more clarity when you need to express an integer value, but the bit pattern is more important than the magnitude.
  • Literals with many digits that could easily become confusing to the reader are clearer when digit separators are inserted.
  • Like out variables, throw expressions often allow logic that previously had to be expressed in multiple statements to be represented in a single expression.
  • Default literals remove redundancy. They also stop you from having to say the same thing twice.[5]

    [5] See how annoying redundancy is? Sorry, I couldn’t resist.

  • Unlike the other features, using nontrailing named arguments may increase the size of your source code, but all in the name of clarity. Or, if you were previously specifying lots of named arguments when you wanted to name only one in the middle, you’ll be able to remove some names without losing readability.