Chapter 3. Abstracting Ideas with Classes and Structs

In the previous couple of chapters, we looked at some basic programming techniques such as loops and conditions, and used some of the data types built into the language and platform, such as int and string.

Unfortunately, real programs—even fairly simple ones—are much, much more complicated than the examples we’ve built so far. They need to model the behavior of real-world objects like cars and planes, or ideas like mathematical expressions, or behaviors, like the transaction between you and your favorite coffee shop when you buy a double espresso and a brownie with your bank card.

Divide and Conquer

The best way to manage this complexity is to break a system down into manageable pieces, where each piece is small enough for us to understand completely. We should aim to craft each piece so that it fits neatly into the system as a whole with a small enough number of connections to the other pieces that we can comprehend all of those too.

Abstracting Ideas with Methods

We’ve already seen one tool for dividing our code into manageable pieces: methods. A method is a piece of a program that encapsulates a particular behavior completely. It’s worth understanding the benefits of methods, because the same principles apply to the classes and structs that are this chapter’s main subject.

Note

You will often see the term function used instead of method; they’re related, but not identical. A function is a method that returns something. Some methods just do some work, and do not return any value. So in C#, all functions are methods, but not all methods are functions.

Methods offer a contract: if we meet particular conditions, a method will do certain things for us. Conditions come in various forms: we might need to pass arguments of suitable types, perhaps with limits on the range (e.g., negative numbers may not be allowed). We may need to ensure certain things about the program’s environment—maybe we need to check that certain directories exist on disk, or that there’s sufficient free memory and disk space. There may be constraints on when we are allowed to call the method—perhaps we’re not allowed to call it if some related work we started earlier hasn’t completed yet.

Likewise, there are several ways in which a method can hold up its side of the bargain. Perhaps it will just return a string or a number that is the result of a calculation involving the method’s inputs. It might change the state of some entity in our system in some way, such as modifying an employee’s salary. It may change something about the system environment—the method might install a new device driver, or change the current user’s color scheme, for example. Some methods interact with the outside world by sending messages over the network.

Some aspects of the contract are formalized—a method’s parameter list defines the number and type of arguments we need to pass, for example, and its return type tells us what, if anything, to expect as a return value. But most of the contract is informally specified—we rely on documentation (or sometimes, conversations with the developer who wrote the method) to understand the full contract. But understand it we must, because the contract is at the heart of how methods make our lives easier.
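To make the split between the formal and informal parts of a contract concrete, here is a minimal sketch. The ApplyRaise method and its rule against negative raises are hypothetical, invented purely for illustration. The parameter list and return type form the part the compiler checks; the range restriction exists only in a comment and a runtime check.

```csharp
using System;

class SalaryCalculator
{
    // Informal contract (lives only in this comment): percent must not be negative.
    // Formal contract (enforced by the compiler): two doubles in, one double out.
    public static double ApplyRaise(double currentSalary, double percent)
    {
        if (percent < 0.0)
        {
            // The compiler cannot check ranges, so we check at runtime
            throw new ArgumentOutOfRangeException(
                "percent", "A raise must not be negative.");
        }

        return currentSalary * (1.0 + (percent / 100.0));
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(SalaryCalculator.ApplyRaise(1000.0, 25.0));

        try
        {
            // This call compiles, because the types are right...
            SalaryCalculator.ApplyRaise(1000.0, -5.0);
        }
        catch (ArgumentOutOfRangeException)
        {
            // ...but it violates the informal part of the contract
            Console.WriteLine("The caller broke the contract.");
        }
    }
}
```

The second call shows why the informal part matters: the compiler accepts it happily, and only the documentation would have warned us.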

Methods simplify things for us in two ways. If we are the user of a method, then, as long as its internal implementation conforms to the contract, we can treat it as a “black box.” We call it, we expect it to work as described, and we don’t need to worry about how it worked. All its internal complexity is hidden from us, freeing us to think about ideas like “increase this employee’s salary,” without getting bogged down by details such as “open a connection to the database and execute some SQL.”

If, on the other hand, we are the developer of a method, we don’t need to worry about who might call us, and why. As long as our implementation works as promised, we can choose any means of implementation we like—perhaps optimizing for speed, or size, or (more often than not) simplicity and maintainability. We can concentrate on details like whether we’re using the right connection string, and whether the SQL query modifies the database as intended, without needing to ask ourselves questions like “should we even be adjusting this particular employee’s salary at all?”

So, one objective of good design is to hide distracting details and expose a simple model to your client. This practice is called encapsulation, and it’s harder than it looks. As is so often the case in life, making something look easy takes years of practice and hard work. It can also be a thankless task: if you devise a contract that is a model of clarity, people will probably think it was easy to design. Conversely, unnecessary complexity is often mistaken for cleverness.

Warning

While methods are essential for achieving encapsulation, they do not guarantee it. It’s all too easy to write methods whose contract is unclear. This often happens when developers do something as an afterthought: it can be oh so tempting to add a bit of extra code to an existing method as a quick solution to a problem, but this risks making that method’s responsibilities less clear.

A method’s name is often a good indicator of the clarity of the contract—if the name is vague, or worse, if it’s an inaccurate description of what the method does, you’re probably looking at a method that does a bad job of encapsulation.

One of the great things about methods is that we can use them to keep breaking things into smaller and smaller pieces. Suppose we have some method called PlaceOrder, which has a well-defined responsibility, but which is getting a bit complicated. We can just split its implementation into smaller methods—say, CheckCustomerCredit, AllocateStock, and IssueRequestToWarehouse. These smaller methods do different bits of the work for us.
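As a rough sketch of that split, the code below shows one way PlaceOrder might delegate to the three smaller methods. The parameters, return types, and the stand-in rules inside each helper (a non-empty customer ID, ten units in stock) are all assumptions made up for this illustration; a real system would consult a database or another service.

```csharp
using System;

class OrderProcessor
{
    // The top-level method now reads like a summary of the steps involved
    public static bool PlaceOrder(string customerId, string productCode, int quantity)
    {
        if (!CheckCustomerCredit(customerId))
        {
            return false;
        }

        if (!AllocateStock(productCode, quantity))
        {
            return false;
        }

        IssueRequestToWarehouse(productCode, quantity);
        return true;
    }

    // Stand-in rule (an assumption): any customer with an ID has credit
    static bool CheckCustomerCredit(string customerId)
    {
        return !string.IsNullOrEmpty(customerId);
    }

    // Stand-in rule (an assumption): every product has ten units in stock
    static bool AllocateStock(string productCode, int quantity)
    {
        int unitsInStock = 10;
        return quantity > 0 && quantity <= unitsInStock;
    }

    static void IssueRequestToWarehouse(string productCode, int quantity)
    {
        Console.WriteLine("Warehouse request: {0} x {1}", quantity, productCode);
    }
}

class Program
{
    static void Main()
    {
        bool placed = OrderProcessor.PlaceOrder("C42", "BROWNIE", 2);
        Console.WriteLine("Order placed: {0}", placed);
    }
}
```

Only PlaceOrder is public here; the helpers are details that callers never need to see.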

This general technique, sometimes called functional decomposition, has a long history in mathematics. It was explored academically in computing applications as early as the 1930s. Bearing in mind that the first working programmable computers didn’t appear until the 1940s, that’s quite a pedigree. In fact, it has been around for so long that it now seems “obvious” to most people who have had anything to do with computer programming.

That’s not the end of the story, though. Methods are great for describing the dynamics of a system—how things change in response to particular input data (the method arguments), and the results of those changes (a function’s return value, or a method’s side effects). What they’re not so good at is describing the current state of the system. If we examine a set of functions and a load of variables, how can we work out which pieces of information are supposed to be operated on by which functions? If methods were the only tool available for abstraction, we’d have a hard time telling the difference between the double that describes my blood pressure, and can be operated on by this method:

void LowerMyBloodPressure(double pressureDelta)

and the double that describes my weight and can be affected by this method:

void EatSomeDonuts(int quantityOfDonuts)

As programs get ever larger, the number of system state variables floating around increases, and the number of methods can explode exponentially. But the problems aren’t just about the sheer number of functions and variables you end up with. As you try to model a more complex system, it becomes harder to work out which functions and variables you actually need—what is a good “decomposition” of the system? Which methods relate to one another, and to which variables?

Abstracting Ideas with Objects and Classes

In the 1960s, two guys called Dahl and Nygaard (they’re Norwegian) were working on big simulation systems and were struggling with this problem. Because they worked on simulating real things, they realized that their code would be easier to understand if they had some clear way to group together all of the data and functions related to a particular type of real thing (or a particular object, we might say).

They designed a programming language that could do this, called Simula 67 (after the year of its birth), and it is generally recognized as the grandmother of all the languages we’d call object-oriented, which (of course) includes C#.

They had hit upon two important concepts:

  • The class: a description of a collection of data and the functions that operate on them

  • The object: an instance of a collection of data and the functions that operate on them (i.e., an instance of a class)

With these simple ideas, we can remove all doubt over which functions operate on which data—the class describes for us exactly what goes with what, and we can handle multiple entities of the same kind by creating several objects of a particular class.
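Returning to the blood pressure and weight example, here is a minimal sketch of how a class removes the ambiguity. The Patient class, its member names, and the quarter-kilogram-per-donut figure are hypothetical and chosen only for illustration. It uses fields, properties, and a constructor, all of which are explained later in this chapter.

```csharp
using System;

class Patient
{
    // The data: each Patient object gets its own copy of these
    double bloodPressure;
    double weightInKilograms;

    public Patient(double initialBloodPressure, double initialWeightInKilograms)
    {
        bloodPressure = initialBloodPressure;
        weightInKilograms = initialWeightInKilograms;
    }

    // The functions: it is now obvious which value each one operates on
    public void LowerBloodPressure(double pressureDelta)
    {
        bloodPressure -= pressureDelta;
    }

    public void EatSomeDonuts(int quantityOfDonuts)
    {
        // 0.25 kg per donut is an arbitrary figure for the sketch
        weightInKilograms += quantityOfDonuts * 0.25;
    }

    public double BloodPressure { get { return bloodPressure; } }
    public double WeightInKilograms { get { return weightInKilograms; } }
}

class Program
{
    static void Main()
    {
        // Two objects of the same class, each with independent state
        Patient me = new Patient(140.0, 80.0);
        Patient you = new Patient(120.0, 70.0);

        me.LowerBloodPressure(10.0);
        you.EatSomeDonuts(2);

        Console.WriteLine("Me: {0} / {1}", me.BloodPressure, me.WeightInKilograms);
        Console.WriteLine("You: {0} / {1}", you.BloodPressure, you.WeightInKilograms);
    }
}
```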

Object-oriented analysis

As an example, let’s think about a very simple computer system that maintains the information for an air traffic control (ATC) operation. (Safety notice: if you happen to be building an ATC system, I strongly recommend that you don’t base it on this one.)

How does (this particular, slightly peculiar) ATC system work? It turns out that we’ve got a bunch of people in a big room in Washington, tracking a large number of planes that buzz around the airport in Seattle. Each plane has an identifier (BA0049, which flies in from London Heathrow, for instance). We need to know the plane’s position, which we’ll represent using three numbers: an altitude (in feet); the distance from the airport control tower (in miles); and a compass heading (measured in degrees from North), which will also be relative to the tower. Just to be clear, that’s not the direction the aircraft itself is facing—it’s the direction we’d have to face in order to be looking at the plane if we’re standing in the tower. We also need to know whether the aircraft is coming in to us, or away from us, and how fast. This, apparently, is quite important. (A more comprehensive model might include a second compass heading, representing the exact direction the plane is facing. But to keep this example simple, we’ll just track whether planes are approaching or departing.)

As the planes come in, the controllers give them permission to take off or land, and instruct them to change their heading, height, or speed. The aim is to avoid them hitting each other at any point. This, apparently, is also quite important.

At present they have a system where each controller is responsible for a particular piece of airspace. They have a rack which contains little slips of plastic, each with an aircraft’s ID on it, ordered by the height at which the aircraft are flying. If a plane is coming in to the airport, they use a piece of blue plastic. If it is going away, they use white plastic. To keep track of the heading, distance, and speed, they just write on the slip with a chinagraph pencil.[9] If the plane moves out of their airspace, they hand it over to another controller, who slips it into his own rack.

So that’s our specification.

Note

In reality, a safety-critical system such as ATC would have a more robust spec. However, when lives are not at stake, software specifications are often pretty nebulous, so this example is, sadly, a fair representation of what to expect on your average software project.

Armed with this brilliant description we need to come up with a design for a program which can model the system. We’re going to do that using object-oriented techniques.

When we do an object-oriented analysis we’re looking for the different classes of object that we are going to describe. Very often, they will correspond to real things in the system. For a class to represent these real objects properly, we need to work out what information it is going to hold, and what functions it will define to manipulate that information. In general, any one piece of information will belong to exactly one object, of exactly one class.

Note

Not all of your classes will represent real-world objects. Some will relate to more abstract concepts like collections, or commands. However, designs that wander too far into the realms of the wholly abstract are often “clever” but not necessarily “good”.

In our ATC example, it’s clear that we have a whole lot of different planes buzzing round the airport. It would therefore seem logical that we would model each one as an object, for which we would define a class called Plane.

Because C# is a language with object-oriented features, we have a simple and expressive way of doing that.

Defining Classes

We can start out with the simplest possible class. It will have no methods, and no data, so as a model of a plane in our system, it leaves something to be desired, but it gets us started.

If you want to build your own version as you read, create a new Console Application project just as we did in Chapter 2. To add a new class, use the Project→Add Class menu item (or right-click on the project in the Solution Explorer and select Add→Class). It’ll add a new file for the class, and if we call it Plane.cs, Visual Studio will create a new source file with the usual using directives and namespace declaration. And most importantly, the file will contain a new, empty class definition, as shown in Example 3-1.

Example 3-1. The empty Plane class

class Plane
{
}

Right; if we look back at the specification, there’s clearly a whole bunch of information we’ve got about the plane that we need to store somewhere. C# gives us a handy mechanism for this called a property.

Representing State with Properties

Each plane has an identifier which is just a string of letters and numbers. We’ve already seen a built-in type ideal for representing this kind of data: string. So, we can add a property called Identifier, of type string, as Example 3-2 shows.

Example 3-2. Adding a property

class Plane
{
    string Identifier
    {
        get;
        set;
    }
}

A property definition always states the type of data the property holds (string in this case), followed by its name. By convention, we use PascalCasing for this name (see the sidebar on the next page). As with most nontrivial elements of a C# program, this is followed by a pair of braces, and inside these we say that we want to provide a getter and a setter for the property. You might be wondering why we need to declare these: wouldn’t any property need to be gettable and settable? But as we’ll see, these explicit declarations turn out to be useful.

If we create an instance of this class, we could use this Identifier property to get and set its identifier. Example 3-3 shows this in a modified version of the Main function in our Program.cs file.

Example 3-3. Using the Plane class’s property

static void Main(string[] args)
{
    Plane someBoeing777 = new Plane();

    someBoeing777.Identifier = "BA0049";

    Console.WriteLine(
        "Your plane has identifier {0}",
        someBoeing777.Identifier);

    // Wait for the user to press a key, so
    // that we can see what happened
    Console.ReadKey();
}

But wait! If you try to compile this, you end up with an error message:

'Plane.Identifier' is inaccessible due to its protection level

What’s that all about?

Protection Levels

Earlier, we mentioned that one of the objectives of good design is encapsulation: hiding the implementation details so that other developers can use our objects without relying on (or knowing about) how they work. As the error we just saw in Example 3-3 shows, a class’s members are hidden by default. If we want them to be visible to users of our class, we must change their protection level.

Every entity that we declare has its own protection level, whether we specify it or not. A class, for example, has a default protection level called internal. This means that it can only be seen by other classes in its own assembly. We’ll talk a lot more about assemblies in Chapter 15. For now, though, we’re only using one assembly (our example application itself), so we can leave the class at its default protection level.

While classes default to being internal, the default protection level for a class member (such as a property) is private. This means that it is only accessible to other members of the class. To make it accessible from outside the class, we need to change its protection level to public, as Example 3-4 shows.

Example 3-4. Making a property public

class Plane
{
    public string Identifier
    {
        get;
        set;
    }
}

Now when we compile and run the application, we see the correct output:

Your plane has identifier BA0049

Notice how this is an opt-in scheme. If you don’t do anything to the contrary, you get the lowest sensible visibility. Your classes are visible to any code inside your assembly, but aren’t accessible to anyone else; a class’s properties and methods are only visible inside the class, unless you explicitly choose to make them more widely accessible.

When different layers specify different protection, the effective accessibility is the lowest specified. For example, although our property has public accessibility, the class of which it is a member has internal accessibility. The lower of the two wins, so the Identifier property is, in practice, only accessible to code in the same assembly.

It is a good practice to design your classes with the smallest possible public interface (part of something we sometimes call “minimizing the surface area”). This makes it easier for clients to understand how they’re supposed to be used and often cuts down on the amount of testing you need to do. Having a clean, simple public API can also improve the security characteristics of your class framework, because the larger and more complex the API gets, the harder it generally gets to spot all the possible lines of attack.

That being said, there’s a common misconception that accessibility modifiers “secure” your class, by preventing people from accessing private members. Hence this warning:

Warning

It is important to recognize that these protection levels are a convenient design constraint, to help us structure our applications properly. They are not a security feature. It’s possible to use the reflection features described in Chapter 17 to circumvent these constraints and to access these supposedly hidden details.
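As a small taste of what that looks like, the following sketch uses reflection to write and then read a private property, much like the private Identifier in Example 3-2. It is an illustration rather than a recommended technique, and it assumes the code runs with sufficient trust for reflection over nonpublic members to be permitted.

```csharp
using System;
using System.Reflection;

class Plane
{
    // No modifier, so this property is private (as in Example 3-2)
    string Identifier
    {
        get;
        set;
    }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane();

        // plane.Identifier = "BA0049";  // would not compile: Identifier is private

        // Ask reflection for a nonpublic instance property by name
        PropertyInfo property = typeof(Plane).GetProperty(
            "Identifier",
            BindingFlags.NonPublic | BindingFlags.Instance);

        // Reflection lets us set and get the value despite the protection level
        property.SetValue(plane, "BA0049", null);
        string leaked = (string) property.GetValue(plane, null);

        Console.WriteLine("Read through reflection: {0}", leaked);
    }
}
```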

To finish this discussion, you should know that there are two other protection levels available to us—protected and protected internal—which we can use to expose (or hide) members to developers who derive new classes from our class without making the members visible to all. But since we won’t be talking about derived classes until Chapter 4, we’ll defer the discussion of these protection levels until then.

We can take advantage of protection in our Plane class. A plane’s identifier shouldn’t change mid-flight, and it’s a good practice for code to prevent things from happening that we know shouldn’t happen. We should therefore add that constraint to our class. Fortunately, we have the ability to change the accessibility of the getter and the setter individually, as Example 3-5 shows. (This is one reason the property syntax makes us declare the get and set explicitly: it gives us a place to put the protection level.)

Example 3-5. Making a property setter private

class Plane
{
    public string Identifier
    {
        get;
        private set;
    }
}

Compiling again, we get a new error message:

The property or indexer 'Plane.Identifier' cannot be used in this context because
the set accessor is inaccessible

The problem is with this bit of code from Example 3-3:

someBoeing777.Identifier = "BA0049";

We’re no longer able to set the property, because we’ve made the setter private (which means that we can only set it from other members of our class). We wanted to prevent the property from changing, but we’ve gone too far: we don’t even have a way of giving it a value in the first place. Fortunately, there’s a language feature that’s perfect for this situation: a constructor.

Initializing with a Constructor

A constructor is a special method which allows you to perform some “setup” when you create an instance of a class. Just like any other method, you can provide it with parameters, but it doesn’t have an explicit return value. Constructors always have the same name as their containing class.

Example 3-6 adds a constructor that takes the plane’s identifier. Because the constructor is a member of the class, it’s allowed to use the Identifier property’s private setter.

Example 3-6. Defining a constructor

class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier
    {
        get;
        private set;
    }
}

Notice how the constructor looks like a standard method declaration, except that since there’s no need for a return type specifier, we leave that out. We don’t even write void, like we would for a normal method that returns nothing. And it would be weird if we did; in a sense this does return something—the newly created Plane—it just does so implicitly.

What sort of work should you do in a constructor? Opinion is divided on the subject—should you do everything required to make the object ready to use, or the minimum necessary to make it safe? The truth is that it is a judgment call—there are no hard and fast rules. Developers tend to think of a constructor as being a relatively low-cost operation, so enormous amounts of heavy lifting (opening files, reading data) might be a bad idea. Getting the object into a fit state for use is a good objective, though, because requiring other functions to be called before the object is fully operational tends to lead to bugs.
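One modest interpretation of “a fit state for use” is to have the constructor reject arguments that could never produce a usable object. The sketch below assumes, purely for illustration, that a plane without an identifier makes no sense; the check is our own addition, not something the specification demands.

```csharp
using System;

class Plane
{
    public Plane(string newIdentifier)
    {
        // Cheap validation: refuse to create a Plane that could never be used
        if (string.IsNullOrEmpty(newIdentifier))
        {
            throw new ArgumentException(
                "A plane needs an identifier.", "newIdentifier");
        }

        Identifier = newIdentifier;
    }

    public string Identifier
    {
        get;
        private set;
    }
}

class Program
{
    static void Main()
    {
        Plane someBoeing777 = new Plane("BA0049");
        Console.WriteLine("Created plane {0}", someBoeing777.Identifier);

        try
        {
            Plane nameless = new Plane("");
        }
        catch (ArgumentException)
        {
            Console.WriteLine("Refused to create a plane with no identifier");
        }
    }
}
```

The check is inexpensive, so it fits the expectation that constructors are low-cost, while ensuring every Plane that exists has a usable identifier.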

We need to update our Main function to use this new constructor and to get rid of the line of code that was setting the property, as Example 3-7 shows.

Example 3-7. Using a constructor

static void Main(string[] args)
{
    Plane someBoeing777 = new Plane("BA0049");

    Console.WriteLine(
        "Your plane has identifier {0}",
        someBoeing777.Identifier);

    Console.ReadKey();
}

Notice how we pass the argument to the constructor inside the parentheses, in much the same way that we pass arguments in a normal method call.

If you compile and run that, you’ll see the same output as before—but now we have an identifier that can’t be changed by users of the object.

Warning

Be very careful when you talk about properties that “can’t be changed” because they have a private setter. Even if you can’t set a property, you may still be able to modify the state of the object referred to by that property. The built-in string type happens to be immune to that because it is immutable (i.e., it can’t be changed once it has been created), so making the setter on a string property private does actually prevent clients from changing the property, but most types aren’t like that.
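The following sketch illustrates the trap. The PassengerNames property is hypothetical and is not part of our ATC design; it holds a List<string>, which, unlike string, can be modified after it has been created.

```csharp
using System;
using System.Collections.Generic;

class Plane
{
    public Plane()
    {
        PassengerNames = new List<string>();
    }

    // Clients cannot replace the list, but they can still change its contents
    public List<string> PassengerNames
    {
        get;
        private set;
    }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane();

        // plane.PassengerNames = new List<string>();  // would not compile: private setter

        // This compiles and runs: we modify the object the property refers to
        plane.PassengerNames.Add("Stowaway");

        Console.WriteLine("Passengers: {0}", plane.PassengerNames.Count);
    }
}
```

The private setter stops clients from making the property refer to a different list, but it does nothing to protect the list itself.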

Speaking of properties that might need to change, our specification requires us to know the speed at which each plane is traveling. Sadly, our specification didn’t mention the units in which we were expected to express that speed. Let’s assume it is miles per hour, and add a suitable property. We’ll use the floating-point double data type for this. Example 3-8 shows the code to add to Plane.

Example 3-8. A modifiable speed property

public double SpeedInMilesPerHour
{
    get;
    set;
}

If we were to review this design with the customer, they might point out that while they have some systems that do indeed want the speed in miles per hour, the people they liaise with in European air traffic control want the speed in kilometers per hour. To avoid confusion, we will add another property so that they can get or set the speed in the units with which they are familiar. Example 3-9 shows a suitable property.

Example 3-9. Property with code in its get and set

public double SpeedInKilometersPerHour
{
    get
    {
        return SpeedInMilesPerHour * 1.609344;
    }
    set
    {
        SpeedInMilesPerHour = value / 1.609344;
    }
}

We’ve done something different here—rather than just writing get; and set; we’ve provided code for these accessors. This is another reason we have to declare the accessors explicitly—the C# compiler needs to know whether we want to write a custom property implementation.

We don’t want to use an ordinary property in Example 3-9, because our SpeedInKilometersPerHour is not really a property in its own right—it’s an alternative representation for the information stored in the SpeedInMilesPerHour property. If we used the normal property syntax for both, it would be possible to set the speed as being both 100 mph and 400 km/h, which would clearly be inconsistent. So instead we’ve chosen to implement SpeedInKilometersPerHour as a wrapper around the SpeedInMilesPerHour property.
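To see the inconsistency we are avoiding, consider this sketch of the design we rejected. The class name InconsistentPlane is ours, used only to keep it distinct from the real Plane class.

```csharp
using System;

class InconsistentPlane
{
    // Two independent properties, each with its own hidden storage
    public double SpeedInMilesPerHour
    {
        get;
        set;
    }

    public double SpeedInKilometersPerHour
    {
        get;
        set;
    }
}

class Program
{
    static void Main()
    {
        InconsistentPlane plane = new InconsistentPlane();

        // Nothing stops us from recording two contradictory speeds
        plane.SpeedInMilesPerHour = 100.0;
        plane.SpeedInKilometersPerHour = 400.0;

        Console.WriteLine(
            "{0} mph, but also {1} km/h",
            plane.SpeedInMilesPerHour,
            plane.SpeedInKilometersPerHour);
    }
}
```

With the wrapper approach of Example 3-9 there is only one stored value, so the two properties cannot disagree.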

If you look at the getter, you’ll see that it returns a value of type double. It is equivalent to a function with this signature:

public double get_SpeedInKilometersPerHour()

The setter seems to provide an invisible parameter called value, which is also of type double. So it is equivalent to a method with this signature:

public void set_SpeedInKilometersPerHour(double value)

Note

This value parameter is a contextual keyword—C# only considers it to be a keyword in property or event accessors. (Events are described in Chapter 5.) This means you’re allowed to use value as an identifier in other contexts—for example, you can write a method that takes a parameter called value. You can’t do that with other keywords—you can’t have a parameter called class, for example.
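A tiny sketch makes the point; the SpeedMath class and its Twice method are hypothetical names chosen for the example.

```csharp
using System;

class SpeedMath
{
    // Outside a property or event accessor, value is just an ordinary
    // identifier, so it is a perfectly legal parameter name
    public static double Twice(double value)
    {
        return value * 2.0;
    }

    // public static double Twice(double class) { ... }  // would not compile
}

class Program
{
    static void Main()
    {
        Console.WriteLine(SpeedMath.Twice(21.0));
    }
}
```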

This is a very flexible system indeed. You can provide properties that provide real storage in the class to store their data, or calculated properties that use any mechanism you like to get and/or set the values concerned. This choice is an implementation detail hidden from users of our class—we can switch between one and the other without changing our class’s public face. For example, we could switch the implementation of these speed properties around so that we stored the value in kilometers per hour, and calculated the miles per hour—Example 3-10 shows how these two properties would look if the “real” value was in km/h.

Example 3-10. Swapping over the real and calculated properties

public double SpeedInMilesPerHour
{
    get
    {
        return SpeedInKilometersPerHour / 1.609344;
    }
    set
    {
        SpeedInKilometersPerHour = value * 1.609344;
    }
}

public double SpeedInKilometersPerHour
{
    get;
    set;
}

As far as users of the Plane class are concerned, there’s no discernible difference between the two approaches—the way in which properties work is an encapsulated implementation detail. Example 3-11 shows an updated Main function that uses the new properties. It neither knows nor cares which one is the “real” one.

Example 3-11. Using the speed properties

static void Main(string[] args)
{
    Plane someBoeing777 = new Plane("BA0049");

    someBoeing777.SpeedInMilesPerHour = 150.0;

    Console.WriteLine(
        "Your plane has identifier {0}, " +
        "and is traveling at {1:0.00}mph [{2:0.00}kph]",
        someBoeing777.Identifier,
        someBoeing777.SpeedInMilesPerHour,
        someBoeing777.SpeedInKilometersPerHour);

    someBoeing777.SpeedInKilometersPerHour = 140.0;

    Console.WriteLine(
        "Your plane has identifier {0}, " +
        "and is traveling at {1:0.00}mph [{2:0.00}kph]",
        someBoeing777.Identifier,
        someBoeing777.SpeedInMilesPerHour,
        someBoeing777.SpeedInKilometersPerHour);

    Console.ReadKey();
}

Although our public API supports two different units for speed while successfully keeping the implementation for that private, there’s something unsatisfactory about that implementation. Our conversion relies on a magic number (1.609344) that appears repeatedly. Repetition impedes readability, and is prone to typos. (I know that for a fact: I’ve typed it incorrectly once already this morning while preparing the example!) There’s an important principle in programming: don’t repeat yourself (or DRY, as it’s sometimes abbreviated). Your code should aim to express any single fact or concept no more than once, because that way, you only need to get it right once.

It would be much better to put this conversion factor in one place, give it a name, and refer to it by that instead. We can do that by declaring a field.

Fields: A Place to Put Data

A field is a place to put some data of a particular type. There’s no option to add code like you can in a property—a field is nothing more than data. Back before C# 3.0 the compiler didn’t let us write just get; and set;—we always had to write properties with code as in Example 3-9, and if we wanted a simple property that stored a value, we had to provide a field, with code such as Example 3-12.

Example 3-12. Writing your own simple property

// Field to hold the SpeedInMilesPerHour property's value
double speedInMilesPerHourValue;

public double SpeedInMilesPerHour
{
    get
    {
        return speedInMilesPerHourValue;
    }
    set
    {
        speedInMilesPerHourValue = value;
    }
}

When you write just get; and set; as we did in Example 3-8, the C# compiler generates code that’s more or less identical to Example 3-12, except it gives the field a peculiar name to prevent us from accessing it directly. (These compiler-generated properties are called auto properties.) So, if we want to store a value in an object, there’s always a field involved, even if it’s a hidden one provided automatically by the compiler. Fields are the only class members that can hold information—properties are really just methods in disguise.

As you can see, a field declaration looks similar to the start of a property declaration. There’s the type (double), and a name. By convention, this name is camelCased, to make fields visibly different from properties. (Some developers like to distinguish fields further by giving them a name that starts with an underscore.)

We can modify a field’s protection level if we want, but, conventionally, we leave all fields with the default private accessibility. That’s because a field is just a place for some data, and if we make it public, we lose control over the internal state of our object. Properties always involve some code, even if it’s generated automatically by the compiler. We can use private backing fields as we wish, or calculate property values any way we like, and we’re free to modify the implementation without ever changing the public face of the class. But with a field, we have nowhere to put code, so if we decide to change our implementation by switching from a field to a calculated value, we would need to remove the field entirely. If the field was part of the public contract of the class, that could break our clients. In short, fields have no innate capacity for encapsulation, so it’s a bad idea to make them public.
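Here is a sketch of the kind of freedom a property gives us and a public field would not. It uses the altitude (in feet) from our specification; the rule that altitude must not be negative is an assumption invented for this illustration.

```csharp
using System;

class Plane
{
    // Private backing field: only code inside Plane can touch it directly
    double altitudeInFeetValue;

    public double AltitudeInFeet
    {
        get
        {
            return altitudeInFeetValue;
        }
        set
        {
            // A property gives us somewhere to put code; a field does not.
            // The no-negative-altitude rule is an assumption for this sketch.
            if (value < 0.0)
            {
                throw new ArgumentOutOfRangeException(
                    "value", "Altitude must not be negative.");
            }

            altitudeInFeetValue = value;
        }
    }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane();
        plane.AltitudeInFeet = 35000.0;
        Console.WriteLine("Altitude: {0} feet", plane.AltitudeInFeet);

        try
        {
            plane.AltitudeInFeet = -1.0;
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Rejected a negative altitude");
        }
    }
}
```

Had AltitudeInFeet started life as a public field, adding this check later would have meant changing the public face of the class.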

Example 3-13 shows a modified version of the Plane class. Instead of repeating the magic number for our speed conversion factor, we declare a single field initialized to the required value. Not only does this mean that we get to state the conversion value just once, but we’ve also been able to give it a descriptive name—in the conversions, it’s now obvious that we’re multiplying and dividing by the number of kilometers in a mile, even if you happen not to have committed the conversion factor to memory.

Example 3-13. Storing the conversion factor in a field

class Plane
{
    // Constructor with a parameter
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier
    {
        get;
        private set;
    }

    double kilometersPerMile = 1.609344;

    public double SpeedInMilesPerHour
    {
        get
        {
            return SpeedInKilometersPerHour / kilometersPerMile;
        }
        set
        {
            SpeedInKilometersPerHour = value * kilometersPerMile;
        }
    }

    public double SpeedInKilometersPerHour
    {
        get;
        set;
    }
}

Notice how we’re able to initialize the field to a default value right where we declare it, by using the = operator. (This sort of code is called, predictably enough, a field initializer.) Alternatively, we could have initialized it inside a constructor, but if the default is a constant value, it is conventional to set it at the point of declaration.
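As a minimal sketch of the two options, the class below sets one field with a field initializer and another inside its constructor. ConversionDemo and its members are hypothetical names invented for this illustration; they are not part of the Plane example. Field initializers run before the constructor body, so the constructor can safely read the initialized value.

```csharp
using System;

// Hypothetical class, for illustration only.
class ConversionDemo
{
    // Field initializer: the value is set at the point of declaration.
    double kilometersPerMile = 1.609344;

    // No initializer: this field is set in the constructor instead.
    double milesPerKilometer;

    public ConversionDemo()
    {
        // Field initializers run before the constructor body,
        // so kilometersPerMile already holds its value here.
        milesPerKilometer = 1.0 / kilometersPerMile;
    }

    public double ToKilometers(double miles)
    {
        return miles * kilometersPerMile;
    }

    public double ToMiles(double kilometers)
    {
        return kilometers * milesPerKilometer;
    }
}

class Program
{
    static void Main()
    {
        ConversionDemo demo = new ConversionDemo();
        Console.WriteLine(demo.ToKilometers(100));
        Console.WriteLine(demo.ToMiles(160.9344));
    }
}
```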

What about the first example of a field that we saw, the one we used as the backing data for a property in Example 3-12? We didn’t explicitly initialize it. In some other languages that would be a ghastly mistake. (Failure to initialize fields correctly is a major source of bugs in C++, for example.) Fortunately, the designers of .NET decided that the performance gained by skipping initialization wasn’t worth the loss of robustness, and kindly initialize all fields to a default value for us: numeric fields are set to zero and fields of other types get whatever the nearest equivalent of zero is. (Boolean fields are initialized to false, for example.)

Note

There’s also a security reason for this initialization. Because a new object’s memory is always zeroed out before we get to see it, we can’t just allocate a whole load of objects and then peer at the “uninitialized” values to see if anything interesting was left behind by the last object that used the same memory.
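The default values described above can be observed with a short sketch. The Defaults class and its field names are hypothetical, chosen only to cover a few different field types; none of the fields is ever assigned. (The compiler may warn that the fields are never assigned, which is exactly the point here.)

```csharp
using System;

// Hypothetical type, used only to show default field values.
class Defaults
{
    int count;        // numeric: starts at 0
    double ratio;     // numeric: starts at 0
    bool isReady;     // Boolean: starts at false
    string name;      // reference type: starts at null

    public bool AllAtDefaults()
    {
        return count == 0 && ratio == 0.0 && !isReady && name == null;
    }

    public void Print()
    {
        Console.WriteLine(count);          // 0
        Console.WriteLine(ratio);          // 0
        Console.WriteLine(isReady);        // False
        Console.WriteLine(name == null);   // True
    }
}

class Program
{
    static void Main()
    {
        new Defaults().Print();
    }
}
```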

Defining a field for our scale factor is an improvement, but we could do better. Our 1.609344 isn’t ever going to change. There are always that many kilometers per mile, not just for this instance of a Plane, but for any Plane there ever will be. Why allocate the storage for the field in every single instance? Wouldn’t it be better if we could define this value just once, and not store it in every Plane instance?

Fields Can Be Fickle, but const Is Forever

C# provides a mechanism for declaring that a field holds a constant value, and will never, ever change. You use the const modifier, as Example 3-14 shows.

Example 3-14. Defining a constant value

const double kilometersPerMile = 1.609344;

Because this value can never change, it belongs to the Plane type as a whole rather than to any one object, so nothing is stored for it in each instance, no matter how many instances of Plane you new up. Handy.

This isn’t just a storage optimization, though. By making the field const, there’s no danger that someone might accidentally change it for some reason inside another function he’s building in the class—the C# compiler prevents you from assigning a value to a const field anywhere other than at the point of declaration.
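Here is a cut-down sketch of Plane using the const field, with the kind of accidental assignment the compiler rejects left in as a comment. The class is trimmed to the two speed properties; the Main method is an assumption added only to make the sketch runnable.

```csharp
using System;

class Plane
{
    const double kilometersPerMile = 1.609344;

    public double SpeedInKilometersPerHour { get; set; }

    public double SpeedInMilesPerHour
    {
        get
        {
            return SpeedInKilometersPerHour / kilometersPerMile;
        }
        set
        {
            // kilometersPerMile = 1.6;   // would not compile: a const cannot be assigned to
            SpeedInKilometersPerHour = value * kilometersPerMile;
        }
    }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane();
        plane.SpeedInMilesPerHour = 100;
        Console.WriteLine(plane.SpeedInKilometersPerHour);
    }
}
```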

Note

In general, when we are developing software, we’re trying to make it as easy as possible for other developers (including our “future selves”) to do the right thing, almost by accident. You’ll often hear this approach called “designing for the pit of success.” The idea is that people will fall into doing the right things because of the choices you’ve made.

Some aspects of an object don’t fit well as either a normal modifiable field or a constant value. Take the plane’s identifier, for example—that’s fixed, in the sense that it never changes after construction, but it’s not a constant value like kilometersPerMile. Different planes have different identifiers. .NET supports this sort of information through read-only properties and fields, which aren’t quite the same as const.

Read-only Fields and Properties

In Example 3-5, we made the setter of our Plane class’s Identifier property private. This prevented users of our class from setting the property, but our class is still free to shoot itself in the foot. Suppose a careless developer added some code like that in Example 3-15, which prints out messages in the SpeedInMilesPerHour property, perhaps in order to debug some problem he was investigating.

Example 3-15. Badly written debugging code

public double SpeedInMilesPerHour
{
    get
    {
        return SpeedInKilometersPerHour / kilometersPerMile;
    }
    set
    {
        Identifier += ": speed modified to " + value;
        Console.WriteLine(Identifier);
        SpeedInKilometersPerHour = value * kilometersPerMile;
    }
}

The first time someone tries to modify a plane’s SpeedInMilesPerHour this will print out a message that includes the identifier, for example:

BA0048: speed modified to 400

Unfortunately, the developer who wrote this clearly wasn’t the sharpest tool in the box—he used the += operator to build that debug string, which will end up modifying the Identifier property. So, the plane now thinks its identifier is that whole text, including the part about the speed. And if we modified the speed again, we’d see:

BA0048: speed modified to 400: speed modified to 380

While it might be interesting to see the entire modification history, the fact that we’ve messed up the Identifier is bad. Example 3-15 was able to do this because the SpeedInMilesPerHour property is part of the Plane class, so it can still use the private setter. We can fix this (up to a point) by making the property read-only—rather than merely making the setter private, we can leave it out entirely. However, we can’t just write the code in Example 3-16.

Example 3-16. The wrong way to define a read-only property

class Plane
{
    // Wrong!
    public string Identifier
    {
        get;
    }
    ...
}

That won’t work in C# 4.0 because there’s no way we could ever set Identifier, not even in the constructor. In this version of the language, auto properties cannot be read-only, so we must write a getter with code. Example 3-17 will compile, although as we’re about to see, the job’s not done yet.

Example 3-17. A better, but incomplete, read-only property

class Plane
{
    public Plane(string newIdentifier)
    {
        _identifier = newIdentifier;
    }

    public string Identifier
    {
        get { return _identifier; }
    }
    private string _identifier;
    ...
}

This turns out to give us two problems. First, the original constructor from Example 3-6 would no longer compile, because it set Identifier, which is now read-only. That was easy to fix, though: Example 3-17 just sets the explicit backing field we’ve added. More worryingly, this hasn’t solved the original problem. The developer who wrote the code in Example 3-15 has “cleverly” realized that he can “fix” his code by doing exactly the same thing as the constructor. As Example 3-18 shows, he has just used the _identifier field directly.

Example 3-18. “Clever” badly written debugging code

public double SpeedInMilesPerHour
{
    get
    {
        return SpeedInKilometersPerHour / kilometersPerMile;
    }
    set
    {
        _identifier += ": speed modified to " + value;
        Console.WriteLine(Identifier);
        SpeedInKilometersPerHour = value * kilometersPerMile;
    }
}

That seemed like a long journey for no purpose. However, we can fix this problem—we can modify the backing field itself to be read-only, as shown in Example 3-19.

Example 3-19. A read-only field

private readonly string _identifier;

That will foil the developer who wrote Example 3-15 and Example 3-18. But doesn’t it also break our constructor again? In fact, it doesn’t: read-only fields behave differently from read-only properties. A read-only property can never be modified. A read-only field can be modified, but only by a constructor.

Since read-only fields only become truly read-only after construction completes, they are perfect for properties whose values need to differ from one instance to another, but which must stay fixed for the lifetime of an instance.
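Putting the pieces together, a minimal sketch of the read-only arrangement might look like the following. LogSpeedChange is a hypothetical method standing in for the debugging code discussed above, and Main is added only to make the sketch runnable; the rest follows the Plane class as it has been built up so far.

```csharp
using System;

class Plane
{
    private readonly string _identifier;

    public Plane(string newIdentifier)
    {
        // Allowed: a readonly field can be assigned in a constructor.
        _identifier = newIdentifier;
    }

    public string Identifier
    {
        get { return _identifier; }
    }

    // Hypothetical stand-in for the debugging code.
    public void LogSpeedChange(double newSpeed)
    {
        // _identifier += ": speed modified to " + newSpeed;   // would not compile
        Console.WriteLine(Identifier + ": speed modified to " + newSpeed);
    }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane("BA0048");
        plane.LogSpeedChange(400);            // BA0048: speed modified to 400
        plane.LogSpeedChange(380);            // BA0048: speed modified to 380
        Console.WriteLine(plane.Identifier);  // BA0048
    }
}
```

The message is now built in a separate expression, so the identifier survives any number of speed changes.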

Before we move on from const and readonly fields, there’s another property our Plane needs for which const seems like it could be relevant, albeit in a slightly different way. In addition to monitoring the speed of an aircraft, we also need to know whether it is approaching or heading away from the airport.

We could represent that with a bool property called something like IsApproaching (where true would mean that it was approaching, and false would, by implication, indicate that it was heading away). That’s a bit clumsy, though. You can often end up having to negate Boolean properties—you might need to write this sort of thing:

if (!somePlane.IsApproaching) { ... }

That reads as “if not plane is approaching” which sounds a bit awkward. We could go with:

if (somePlane.IsApproaching == false) { ... }

That’s “if is approaching is false” which isn’t much better. We could offer a second, calculated property called IsNotApproaching, but our code is likely to be simpler and easier to read (and therefore likely to contain fewer bugs) if, instead of using bool, we have a Direction property whose value could somehow be either Approaching or Leaving.

We’ve just seen a technique we could use for that. We could create two constant fields of any type we like (int, for example), and a property of type int called Direction (see Example 3-20).

Example 3-20. Named options with const int

class Plane
{
    public const int Approaching = 0;
    public const int Leaving = 1;

    // ...

    public int Direction { get; set; }
}

This lets us write code that reads a bit more naturally than it would if we had used just true and false:

someBoeing777.Direction = Plane.Approaching;
if (someAirbusA380.Direction == Plane.Leaving) { /* Do something */ }

But there’s one problem: if our Direction property’s type is int, there’s nothing to stop us from saying something like:

someBoeing777.Direction = 72;

This makes no sense, but the C# compiler doesn’t know that—after all, we told it the property’s type was int, so how’s it supposed to know that’s wrong? Fortunately, the designers of C# have thought of this, and have given us a kind of type for precisely this situation, called an enum, and it turns out to be a much better solution for this than const int.

Related Constants with enum

The enum[10] keyword lets us define a type whose values can be one of a fixed set of possibilities. Example 3-21 declares an enum for our Direction property. You can add this to an existing source file, above or below the Plane class, for example. Alternatively, you could add a whole new source file to the project, although Visual Studio doesn’t offer a file template for enum types, so either you’d have to add a new class and then change the class keyword to enum, or you could use the Code File template to add a new, empty source file.

Example 3-21. Direction enum

enum DirectionOfApproach
{
    Approaching,
    Leaving
}

This is similar in some respects to a class declaration. We can optionally begin with a protection level but if, like Example 3-21, we omit that, we get internal protection by default. Then there’s the enum specifier itself, followed by the name of the type, which by convention we PascalCase. Inside the braces, we declare the members, again using PascalCasing. Notice that we use commas to separate the list of constants—this is where the syntax starts to part company with class. Unusually, the members are publicly accessible by default. That’s because an enum has no behavior, and so there are no implementation details—it’s just a list of named values, and those need to be public for the type to serve any useful purpose.

Note

Notice that we’ve chosen to call this DirectionOfApproach, and not the plural DirectionsOfApproach. By convention, we give enum types a singular name even though they usually contain a list. This makes sense because when you use named entries from an enumeration, you use them one at a time, and so it would look odd if the type name were plural. Obviously, there won’t be any technical consequences for breaking this convention, but following it helps make your code consistent with the .NET Framework class libraries.

We can now declare our Direction property, using the enumeration instead of an integer. Example 3-22 shows the property to add to the Plane class.

Example 3-22. Property with enum type

public DirectionOfApproach Direction
{
    get;
    set;
}

There are some optional features we can use in an enum declaration. Example 3-23 uses these, and they provide some insight into how enum types work.

Example 3-23. Explicit type and values for enum

enum DirectionOfApproach : int
{
    Approaching = 0,
    Leaving = 1
}

In this declaration, we have explicitly specified the governing type for the enumeration. This is the type that stores the individual values for an enumeration, and we specify it with a colon and the type name. By default, it uses an int (exactly as we did in our original const-based implementation of this property), so we’ve not actually changed anything here; we’re just being more explicit. The governing type must be one of the built-in integer types: byte, sbyte, short, ushort, int, uint, long, or ulong.

Example 3-23 also specifies the numbers to use for each named value. As it happens, if you don’t provide these numbers, the first member is assigned the value 0, and we count off sequentially after that, so again, this example hasn’t changed anything, it’s just showing the values explicitly.

We could, if we wanted, specify any value for any particular member. Maybe we start from 10 and go up in powers of 2. And we’re also free to define duplicates, giving the same value several different names. (That might not be useful, but C# won’t stop you.)
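As a sketch of these options, the enum below mixes explicit values, an implicit one, and a duplicate. RunwayState and its members are hypothetical, with numbers made up purely for illustration.

```csharp
using System;

// Hypothetical enum; the names and numbers are invented for this sketch.
enum RunwayState
{
    Unknown = 0,
    Open = 10,
    Closing,        // no explicit value: continues from the previous member, so 11
    Closed = 20,
    Shut = 20       // duplicate: a second name for the same value as Closed
}

class Program
{
    static void Main()
    {
        Console.WriteLine((int) RunwayState.Open);                   // 10
        Console.WriteLine((int) RunwayState.Closing);                // 11
        Console.WriteLine(RunwayState.Closed == RunwayState.Shut);   // True
    }
}
```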

We normally leave all these explicit specifiers off, and accept the defaults. However, the sidebar on the next page describes a scenario in which you would need to control the numbers.

Note

If you don’t specify explicit values, the first item in your list is effectively the default value for the enum (because it corresponds to the zero value). If you provide explicit values, be sure to define a value that corresponds to zero—if you don’t, fields using your type will default to a value that’s not a valid member of the enum, which is not desirable.
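The following sketch shows the problem this note warns about. Priority and Message are hypothetical types; Priority deliberately has no member for zero, so a freshly created Message holds a value that matches none of the names.

```csharp
using System;

// Hypothetical enum with no member corresponding to zero.
enum Priority
{
    Low = 1,
    High = 2
}

// Hypothetical class whose property is never set explicitly.
class Message
{
    public Priority Level { get; set; }
}

class Program
{
    static void Main()
    {
        Message message = new Message();

        // The underlying storage was zeroed, and zero is not a named member.
        Console.WriteLine((int) message.Level);                              // 0
        Console.WriteLine(Enum.IsDefined(typeof(Priority), message.Level));  // False
    }
}
```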

We can now access the enumeration property like this:

someBoeing777.Direction = DirectionOfApproach.Approaching;
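To see what the enum buys us over const int, here is a minimal sketch with a stripped-down Plane containing only the Direction property. The nonsensical assignment of 72 is left in as a comment, because with an enum-typed property the compiler now rejects it. Main is added only to make the sketch runnable.

```csharp
using System;

enum DirectionOfApproach
{
    Approaching,
    Leaving
}

// Cut-down Plane with just the property under discussion.
class Plane
{
    public DirectionOfApproach Direction { get; set; }
}

class Program
{
    static void Main()
    {
        Plane someBoeing777 = new Plane();
        someBoeing777.Direction = DirectionOfApproach.Leaving;

        // someBoeing777.Direction = 72;   // would not compile: no implicit conversion from int

        if (someBoeing777.Direction == DirectionOfApproach.Leaving)
        {
            Console.WriteLine("Heading away");
        }
    }
}
```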

We’ve clearly made some progress with our Plane class, but we’re not done yet. We have a read-only property for its Identifier. We can store the speed, which we can get and set using two different properties representing different units, using a const field for the conversion factor. And we know the direction, which will be either the Approaching or the Leaving member of an enum.

We still need to store the aircraft’s position. According to the specification, we’ve got two polar coordinates (an angle and a distance) for its position on the ground, and another value for its height above sea level.

We’re likely to need to do a lot of calculations based on this position information. Every time we want to create a function to do that, we’d need three parameters per point, which seems overly complex. (And error-prone—it’d be all too easy to inadvertently pass two numbers from one position, and a third number from a different position.) It would be nicer if we could wrap the numbers up into a single, lightweight, “3D point” type that we can think of in the same kind of way we do int or double—a basic building block for other classes to use with minimum overhead.

This is a good candidate for a value type.

Value Types and Reference Types

So far, we’ve been building a class. When creating an instance of the class, we stored it in a named variable, as Example 3-24 shows.

Example 3-24. Storing a reference in a variable

Plane someBoeing777 = new Plane("BA0049");
someBoeing777.Direction = DirectionOfApproach.Approaching;

We can define another variable with a different name, and store a reference to the same plane in that new variable, as shown in Example 3-25.

Example 3-25. Copying a reference from one variable to another

Plane theSameBoeing777ByAnotherName = someBoeing777;

If we change a property through one variable, that change will be visible through the other. Example 3-26 modifies our plane’s Direction property through the second variable, but then reads it through the first variable, verifying that they really are referring to the same object.

Example 3-26. Using one object through two variables

theSameBoeing777ByAnotherName.Direction = DirectionOfApproach.Leaving;
if (someBoeing777.Direction == DirectionOfApproach.Leaving)
{
    Console.WriteLine("Oh, they are the same!");
}

As Shakespeare might have said, if only he’d found his true vocation as a C# developer:

That which we call someBoeing777
By any other name would smell as sweet.

Assuming you like the smell of jet fuel.

When we define a type using class, we always get this behavior—our variables behave as references to an underlying object. We therefore call a type defined as a class a reference type.

Note

It’s possible for a reference type variable to be in a state where it isn’t referring to any object at all. C# has a special keyword, null, to represent this. You can set a variable to null, or you can pass null as an argument to a method. And you can also test to see if a field, variable, or argument is equal to null in an if statement. Any field whose type is a reference type will automatically be initialized to null before the constructor runs, in much the same way as numeric fields are initialized to zero.
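A short sketch of working with null follows. Describe is a hypothetical helper invented for this illustration, and Plane is reduced to its identifier.

```csharp
using System;

class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier { get; private set; }
}

class Program
{
    // Hypothetical helper that copes with a missing plane.
    public static string Describe(Plane plane)
    {
        if (plane == null)
        {
            return "No plane";
        }
        return plane.Identifier;
    }

    static void Main()
    {
        Plane noPlane = null;                               // refers to no object at all
        Console.WriteLine(Describe(noPlane));               // No plane
        Console.WriteLine(Describe(new Plane("BA0049")));   // BA0049
    }
}
```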

The enum we declared earlier and the built-in numeric types (int, double) behave differently, though, as Example 3-27 illustrates.

Example 3-27. Copying values, not references

int firstInt = 3;
int secondInt = firstInt;

secondInt = 4;

if (firstInt != 4)
{
    Console.WriteLine("Well. They're not the same at all.");
}

When we assign firstInt to secondInt, we are copying the value. In this case, the variables hold the actual value, not a reference to a value. We call types that behave this way value types.

People often refer to reference types as being allocated “on the heap” and value types “on the stack.” C++ programmers will be familiar with these concepts, and C++ provided one syntax in the language to explicitly create items on the stack (a cheap form of storage local to a particular scope), and a different syntax for working on the heap (a slightly more expensive but sophisticated form of storage that could persist beyond the current scope). C# doesn’t make that distinction in its syntax, because the .NET Framework itself makes no such distinction. These aspects of memory management are completely opaque to the developer, and it is actively wrong to think of value types as being always allocated on a stack.

For people familiar with C++ this can take a while to get used to, especially as the myth is perpetuated on the Web, in the MSDN documentation and elsewhere. (For example, at the time of this writing, http://msdn.microsoft.com/library/aa288471 states that structs are created on the stack, and while that happens to be true of the ones in that example when running against the current version of .NET, it would have been helpful if the page had mentioned that it’s not always true. For example, if a class has a field of value type, that field doesn’t live on the stack—it lives inside the object, and in all the versions of .NET released so far, objects live on the heap.)

Note

The important difference for the C# developer between these two kinds of types is the one of reference versus copy semantics.

As well as understanding the difference in behavior, you also need to be aware of some constraints. To be useful, a value type should be:

  • Immutable

  • Lightweight

Something is immutable if it doesn’t change over time. So, the integer 3 is immutable. It doesn’t have any internal workings that can change its “three-ness”. You can replace the value of an int variable that currently contains a 3, by copying a 4 into it, but you can’t change a 3 itself. (Unlike, say, a particular Plane object, which has a Direction property that you can change anytime you like without needing to replace the whole Plane.)

Warning

There’s nothing in C# that stops you from creating a mutable value type. It is just a bad idea (in general). If your type is mutable, it is probably safer to make it a reference type, by declaring it as a class. Mutable value types cause problems because of the copy semantics—if you modify a value, it’s all too easy to end up modifying the wrong one, because there may be many copies.
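The following sketch illustrates the trap. MutablePoint is a hypothetical, deliberately mutable value type (exactly what this warning advises against), and Plane is cut down to a single property of that type.

```csharp
using System;

// Hypothetical mutable value type, shown only to demonstrate the pitfall.
struct MutablePoint
{
    public double Distance { get; set; }
}

class Plane
{
    public MutablePoint Position { get; set; }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane();

        // Reading the property hands back a copy of the value...
        MutablePoint position = plane.Position;

        // ...so this modifies the copy, not the plane's own position.
        position.Distance = 100;

        Console.WriteLine(position.Distance);         // 100
        Console.WriteLine(plane.Position.Distance);   // 0
    }
}
```

Had MutablePoint been a class, both lines would have printed 100, because both variables would have referred to the same object.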

It should be fairly apparent that a value type also needs to be pretty lightweight, because of all that copying going on. Every time you pass it into a function, or assign it to a variable, a copy is made. And copies are generally the enemy of good performance. If your value type consists of more than two or three of the built-in types, it may be getting too big.

These constraints mean it is very rare that you will actually want to declare a value type yourself. A lot of the obviously useful ones you might want are already defined in the .NET Framework class libraries (things like 2D points, times, and dates). Custom value types are so rare that it was hard to come up with a useful example for this book that wasn’t already provided in the class libraries. (If you were wondering why our example application represents aircraft positions in such an idiosyncratic fashion, this is the reason.)

But that doesn’t mean you should never, ever declare a value type. Value types can have performance benefits when used in arrays (although as with most performance issues, this is not entirely clear-cut), and the immutability and copy semantics can make them safer when passing them in to functions—you won’t normally introduce side effects by working with a value type because you end up using a copy, rather than modifying shared data that other code might be relying on.

Our polar 3D point seems to comply with the requirements. Any given point is just that: a specific point in 3D space—a good candidate for immutability. (We might want to move a plane to a different point, but we can’t change what a particular point means.) It is also no more than three doubles in size, which is small enough for copy semantics. Example 3-28 shows our declaration of this type, which we can add to our project. (As with enum, Visual Studio doesn’t offer a template for value types. Again, we can use the Class template, replacing the class with the code we want.)

Example 3-28. A value type

struct PolarPoint3D
{
    public PolarPoint3D(double distance, double angle, double altitude)
    {
        Distance = distance;
        Angle = angle;
        Altitude = altitude;
    }

    public double Distance
    {
        get;
        private set;
    }

    public double Angle
    {
        get;
        private set;
    }

    public double Altitude
    {
        get;
        private set;
    }
}

If you think that it looks just like a class declaration, but using the struct keyword instead of class, you’d be right—these two kinds of types are very similar. However, if we try to compile it, we get an error on the first line of the constructor:

The 'this' object cannot be used before all of its fields are assigned to

So, although the basic syntax of a struct looks just like a class, there are important differences. Remember that when you allocate an instance of a particular type, it is always initialized to some default value. With classes, all fields are initialized to zero (or the nearest equivalent value). But things work a little differently with value types, and we need to do slightly more work.

Anytime we write a struct, C# automatically generates a default, parameterless constructor that initializes all of our storage to zero, so if we don’t want to write any custom constructors, we won’t have any problems. (Unlike with a class, we aren’t allowed to replace the default constructor. We can define extra constructors, but the default constructor is always present and we’re not allowed to write our own—see the sidebar on the next page for details.)
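As a minimal sketch of the generated constructor at work, the struct below defines no constructors of its own. Reading is a hypothetical type invented for this illustration.

```csharp
using System;

// Hypothetical struct with no custom constructors.
struct Reading
{
    public double Value { get; private set; }
    public int Count { get; private set; }
}

class Program
{
    static void Main()
    {
        // Uses the default parameterless constructor, which zeroes all storage.
        Reading reading = new Reading();

        Console.WriteLine(reading.Value);   // 0
        Console.WriteLine(reading.Count);   // 0
    }
}
```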

Example 3-28 has hit trouble because we’re trying to provide an additional constructor, which initializes the properties to particular values. If we write a constructor in a struct, the compiler refuses to let us invoke any methods until we’ve initialized all the fields. (It doesn’t do the normal zero initialization for custom constructors.) This restriction turns out to include properties, because get and set accessors are methods under the covers. So C# won’t let us use our properties until the underlying fields have been initialized, and we can’t do that because these are auto properties—the C# compiler has generated hidden fields that we can only access through the properties. This is a bit of a chicken-and-egg bootstrapping problem!

Fortunately, C# gives us a way of calling one of our constructors from another. We can use this to call the default constructor to do the initialization; then our constructor can set the properties to whatever values it wishes. We call the constructor using the this keyword, and the standard function calling syntax with any arguments enclosed in parentheses. As Example 3-29 shows, we can invoke the default constructor with an empty argument list.

Example 3-29. Calling one constructor from another

public PolarPoint3D(double distance, double angle, double altitude)
    : this()
{
    Distance = distance;
    Angle = angle;
    Altitude = altitude;
}

You add the call just before the opening brace for the body of the constructor, and prefix it with a colon. We can also use this technique to avoid writing common initialization code multiple times. Say we wanted to provide another utility constructor that just took the polar coordinates, and initialized the altitude to zero by default. Instead of repeating all the code from the first constructor, we could just add this extra constructor to our definition for PolarPoint3D, as shown in Example 3-30.

Example 3-30. Sharing common initialization code

public PolarPoint3D(double distance, double angle)
    : this(distance, angle, 0)
{
}

public PolarPoint3D(
    double distance,
    double angle,
    double altitude)
    : this()
{
    Distance = distance;
    Angle = angle;
    Altitude = altitude;
}

Incidentally, this syntax for calling one constructor from another works equally well in classes, and is a great way of avoiding code duplication.
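For instance, a class version might look like the sketch below. The two-parameter Plane constructor is hypothetical (the chapter’s Plane takes only an identifier); it is added here purely to show one constructor delegating to another.

```csharp
using System;

class Plane
{
    private readonly string _identifier;

    // Simple overload: delegates to the fuller constructor with a speed of zero.
    public Plane(string newIdentifier)
        : this(newIdentifier, 0)
    {
    }

    // Hypothetical fuller constructor; all initialization lives here.
    public Plane(string newIdentifier, double speedInKilometersPerHour)
    {
        _identifier = newIdentifier;
        SpeedInKilometersPerHour = speedInKilometersPerHour;
    }

    public string Identifier
    {
        get { return _identifier; }
    }

    public double SpeedInKilometersPerHour { get; set; }
}

class Program
{
    static void Main()
    {
        Plane parked = new Plane("BA0048");
        Plane cruising = new Plane("BA0049", 800);

        Console.WriteLine(parked.Identifier + " " + parked.SpeedInKilometersPerHour);       // BA0048 0
        Console.WriteLine(cruising.Identifier + " " + cruising.SpeedInKilometersPerHour);   // BA0049 800
    }
}
```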

Too Many Constructors, Mr. Mozart

You should be careful of adding too many constructors to a class or struct. It is easy to lose track of which parameters are which, or to make arbitrary choices about which constructors you provide and which you don’t.

For example, let’s say we wanted to add yet another constructor to PolarPoint3D that lets callers pass just the angle and altitude, initializing the distance to a default of zero, as Example 3-31 shows.

Example 3-31. A constructor too far

public PolarPoint3D(
    double altitude,
    double angle )
    : this( 0, angle, altitude )
{
}

Even before we compile, we can see that there’s a problem: we happen to have added the altitude parameter so that it is the first in the list, and angle stays second. In our main constructor, the altitude comes after the angle. Because they are both just doubles, there’s nothing to stop you from accidentally passing the parameters “the wrong way round.” This is exactly the kind of thing that surprises users of your class, and leads to hard-to-find bugs. But while inconsistent parameter ordering is bad design, it’s not a showstopper.

However, when we compile, things get even worse. We get another error:

Type 'PolarPoint3D' already defines a member called 'PolarPoint3D' with the same
parameter types

We have too many constructors. But how many is too many?

Overloading

When we define more than one member in a type with the same name (be it a constructor or, as we’ll see later, a method) we call this overloading.

Initially, we created two constructors (two overloads of the constructor) for PolarPoint3D, and they compiled just fine. This is because they took different sets of parameters. One took three doubles, the other two. In fact, there was also the third, hidden constructor that took no parameters at all. All three constructors took different numbers of parameters, meaning there’s no ambiguity about which constructor we want when we initialize a new PolarPoint3D.

The constructor in Example 3-31 seems different: the two doubles have different names. Unfortunately, this doesn’t matter to the C# compiler—it only looks at the types of the parameters, and the order in which they are declared. It does not use names for disambiguation. This should hardly be surprising, because we’re not required to provide argument names when we call methods or constructors. If we add the overload in Example 3-31, it’s not clear what new PolarPoint3D(0, 0) would mean, and that’s why we get an error—we’ve got two members with the same name (PolarPoint3D—the constructor), and exactly the same parameter types, in the same order.

Looking at overloaded functions will emphasize that it really is only the method name and the parameters that matter—a function’s return type is not considered to be a disambiguating aspect of the member for overload purposes.
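A small sketch makes the rule concrete. Converter and Describe are hypothetical names; the two overloads differ in parameter type and compile happily, while the commented-out third method differs from the first only in its return type and would be rejected.

```csharp
using System;

// Hypothetical class, for illustration only.
class Converter
{
    // Two valid overloads: same name, different parameter types.
    public static string Describe(int value)
    {
        return "int";
    }

    public static string Describe(double value)
    {
        return "double";
    }

    // Would not compile: same name and parameter types as Describe(int),
    // and a different return type does not make it a distinct overload.
    // public static int Describe(int value) { return value; }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(Converter.Describe(3));     // int
        Console.WriteLine(Converter.Describe(3.5));   // double
    }
}
```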

That means there’s nothing we can do about it: we’re going to have to get rid of this third constructor (just delete it); and while we’re in the code, we’ll finish up the declaration of the data portion of our Plane by adding a property for its position, shown in Example 3-32.

Example 3-32. Using our custom value type for a property

public PolarPoint3D Position
{
    get;
    set;
}

Overloaded Methods and Default Named Parameters

Just as with constructors, we can provide more than one method with the same name, but a different list of parameter types. It is, in general, a bad idea to provide two overloads with the same name if they perform a semantically different operation (again—that’s the kind of thing that surprises developers using your class), so the most common reason for overloading is to provide several different ways to do something. We can provide users of our code with flexible methods that take lots of arguments to control different aspects of the code, and we can also provide developers that don’t need this flexibility with simpler options by providing overloads that don’t need as many arguments.

Suppose we added a method to our Plane class enabling messages to be sent to aircraft. Perhaps in our first attempt we define a method whose signature looks like this:

public void SendMessage(string messageText)

But suppose that as the project progresses, we find that it would be useful to be able to delay transmission of certain messages. We could modify the SendMessage method so that it accepts an extra argument. There’s a handy type in the framework called TimeSpan which lets us specify duration. We could modify the method to make use of it:

public void SendMessage(string messageText, TimeSpan delay)

Alas! If we already had code in our project depending on the original signature, we’d start to see this compiler error:

No overload for method 'SendMessage' takes '1' arguments

We’ve changed the signature of that method, so all our clients are sadly broken. They need to be rewritten to use the new method. That’s not great.

A better alternative is to provide both signatures—keep the old single-parameter contract around, but add an overload with the extra argument. And to ensure that the overloads behave consistently (and to avoid duplicating code) we can make the simpler method call the new method as its actual implementation. The old method was just the equivalent of calling the new method with a delay of zero, so we could replace it with the method shown in Example 3-33. This lets us provide the newly enhanced SendMessage, while continuing to support the old, simpler version.

Example 3-33. Implementing one overload in terms of another

public void SendMessage(string messageText)
{
    // Delegate to the two-parameter overload with no delay.
    SendMessage(messageText, TimeSpan.Zero);
}

(TimeSpan.Zero is a static read-only field that holds a duration of zero.)

Until C# 4.0 that’s as far as we could go. However, the C# designers noticed that a lot of member overloads were just like this one: facades over an über-implementation, with a bunch of parameters defaulted out to particular values. So they decided to make it easier for us to support multiple variations on the same method. Rather than writing lots of overloads, we can now just specify default values for a method’s arguments, which saves us typing a lot of boilerplate, and helps make our default choices more transparent.

Let’s take out the single-parameter method overload we just added, and instead change the declaration of our multiparameter implementation, as shown in Example 3-34.

Example 3-34. Parameter with default value

public void SendMessage(
    string messageText,
    TimeSpan delay = default(TimeSpan))

Even though we’ve only got one method, which supports two arguments, code that tries to call it with a single argument will still work. That’s because default values can fill in for missing arguments. (If we tried to call SendMessage with no arguments at all, we’d get a compiler error, because there’s no default for the first argument here.)
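The declaration uses default(TimeSpan) rather than TimeSpan.Zero because a default value has to be a compile-time constant, and a static field does not qualify. The sketch below shows the default being filled in. The Describe method and DefaultValueDemo class are hypothetical names, and returning a string is our own choice so that the received delay is visible.

```csharp
using System;

class DefaultValueDemo
{
    // Hypothetical variant of SendMessage that reports what it received.
    public static string Describe(
        string messageText,
        TimeSpan delay = default(TimeSpan))
    {
        return messageText + " after " + delay.TotalSeconds + "s";
    }

    static void Main()
    {
        // The compiler supplies default(TimeSpan) for the missing argument.
        Console.WriteLine(Describe("Hold position"));
        Console.WriteLine(Describe("Hold position", TimeSpan.FromSeconds(5)));

        // default(TimeSpan) represents the same duration as TimeSpan.Zero.
        Console.WriteLine(default(TimeSpan) == TimeSpan.Zero);
    }
}
```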

But it doesn’t end there. Say we had a method with four parameters, like this one:

public void MyMethod(
    int firstOne,
    double secondInLine = 3.1416,
    string thirdHere = "The third parameter",
    TimeSpan lastButNotLeast = default(TimeSpan))
{
    // ...
}

If we want to call it and specify the first parameter (which we have to, because it has no default), and the third, but not the second or the fourth, we can do so by using the names of the parameters, like this:

MyMethod(127, thirdHere: "New third parameter");

With just one method, we now have many different ways to call it—we can provide all the arguments, or just the first and second, or perhaps the first, second, and third. There are many combinations. Before named arguments and defaults were added in C# 4.0, the only way to get this kind of flexibility was to write an overload for each distinct combination.
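To make those combinations concrete, here is a sketch that calls the same four-parameter method in several ways. The parameter list matches MyMethod above. Returning a formatted string instead of void, and the NamedArgumentDemo class name, are assumptions made purely so that we can see which values arrived.

```csharp
using System;
using System.Globalization;

class NamedArgumentDemo
{
    public static string MyMethod(
        int firstOne,
        double secondInLine = 3.1416,
        string thirdHere = "The third parameter",
        TimeSpan lastButNotLeast = default(TimeSpan))
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0} | {1} | {2} | {3}",
            firstOne, secondInLine, thirdHere, lastButNotLeast);
    }

    static void Main()
    {
        // Only the required argument.
        Console.WriteLine(MyMethod(127));
        // First and second, by position.
        Console.WriteLine(MyMethod(127, 2.5));
        // First and third, skipping the second by naming the third.
        Console.WriteLine(MyMethod(127, thirdHere: "New third parameter"));
        // Named arguments may appear in any order after the positional ones.
        Console.WriteLine(MyMethod(
            127,
            lastButNotLeast: TimeSpan.FromMinutes(1),
            secondInLine: 1.5));
    }
}
```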

This is not just limited to normal methods—you can use this same syntax to provide default values for parameters in your constructors, if you wish.

Being forced to delete the extra constructor we tried to add back in Example 3-31 was a little disappointing—we’re constraining the number of ways users of our type can initialize it. Named arguments and default values have helped, but can we do more?

Object Initializers

Until C# 3.0, the only real solution to this was to write one or more factory methods. These are described in the sidebar below. But now we have another option.

With C# 3.0 the language was extended to support object initializers—an extension to the new syntax that lets us set up a load of properties, by name, as we create our object instance.

Example 3-35 shows how an object initializer looks when we use it in our Main function.

Example 3-35. Using object initializers

static void Main(string[] args)
{
    Plane someBoeing777 = new Plane("BA0049")
                          {
                              Direction = DirectionOfApproach.Approaching,
                              SpeedInMilesPerHour = 150
                          };

    Console.WriteLine(
        "Your plane has identifier {0}," +
        " and is traveling at {1:0.00}mph [{2:0.00}kph]",
        // Use the property getter
        someBoeing777.Identifier,
        someBoeing777.SpeedInMilesPerHour,
        someBoeing777.SpeedInKilometersPerHour);

    someBoeing777.SpeedInKilometersPerHour = 140.0;

    Console.WriteLine(
        "Your plane has identifier {0}," +
        " and is traveling at {1:0.00}mph [{2:0.00}kph]",
        // Use the property getter
        someBoeing777.Identifier,
        someBoeing777.SpeedInMilesPerHour,
        someBoeing777.SpeedInKilometersPerHour);

    Console.ReadKey();

}

Note

Object initializers are mostly just a convenient syntax for constructing a new object and then setting some properties. Consequently, this only works with writable properties—you can’t use it for immutable types,[11] so this wouldn’t work with our PolarPoint3D.

We still use the constructor parameter for the read-only Identifier property; but then we add an extra section in braces, between the closing parenthesis and the semicolon, in which we have a list of property assignments, separated by commas. What’s particularly interesting is that the purpose of the constructor parameter is normally identifiable only by the value we happen to assign to it, but the object initializer is “self-documenting”—we can easily see what is being initialized to which values, at a glance.
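Since an object initializer is mostly shorthand, it can help to see it next to the longhand form. The Plane below is a cut-down stand-in written for this sketch, and the Leaving member of DirectionOfApproach is an assumption. Only Approaching appears in this section.

```csharp
using System;

enum DirectionOfApproach { Approaching, Leaving }

// Cut-down stand-in for the chapter's Plane class.
class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier { get; private set; }
    public DirectionOfApproach Direction { get; set; }
    public double SpeedInMilesPerHour { get; set; }
}

class InitializerDemo
{
    static void Main()
    {
        // Object initializer form.
        Plane a = new Plane("BA0049")
        {
            Direction = DirectionOfApproach.Approaching,
            SpeedInMilesPerHour = 150
        };

        // Roughly what the compiler does on our behalf:
        // construct first, then assign each property in turn.
        Plane b = new Plane("BA0049");
        b.Direction = DirectionOfApproach.Approaching;
        b.SpeedInMilesPerHour = 150;

        Console.WriteLine(a.Direction == b.Direction);
        Console.WriteLine(a.SpeedInMilesPerHour == b.SpeedInMilesPerHour);
    }
}
```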

The job isn’t quite done yet, though. While there’s nothing technically wrong with using both the constructor parameter and the object initializer, it does look a little bit clumsy. It might be easier for our clients if we allow them to use a default, parameterless constructor, and then initialize all the members using this new syntax. As we’ll see in Chapter 6, we have other ways of enforcing invariants in the object state, and dealing with incorrect usages. Object initializers are certainly a more expressive syntax, and on the basis that self-documenting and transparent is better, we’re going to change how Plane works so that we can initialize the whole object with an object initializer.

Note

As with any design consideration, there is a counter argument. Some classes may be downright difficult to put into a “default” (zero-ish) state that isn’t actively dangerous. We’re also increasing the size of the public API by the changes we’re making—we’re adding a public setter. Here, we’ve decided that the benefits outweigh the disadvantages in this particular case (although it’s really a judgment call; no doubt some developers would disagree).

First, as Example 3-36 shows, we’ll delete the special constructor from Plane, and then make Identifier an ordinary read/write property. We can also remove the _identifier backing field we added earlier, because we’ve gone back to using an auto property.

Example 3-36. Modifying Plane to work better with object initializers

class Plane
{
    // Remove the constructor that we no longer require
    // public Plane(string newIdentifier)
    // {
    //    Identifier = newIdentifier;
    // }

    public string Identifier
    {
        get;
        // remove the access modifier
        // to make it public
        set;
    }

    // ...
}

We can now use the object initializer syntax for all the properties we want to set. As Example 3-37 shows, this makes our code look somewhat neater—we only need one style of code to initialize the object.

Example 3-37. Nothing but object initializer syntax

Plane someBoeing777 = new Plane
                          {
                              Identifier = "BA0049",
                              Direction = DirectionOfApproach.Approaching,
                              SpeedInMilesPerHour = 150
                          };

Object initializer syntax provides one big advantage over offering lots of specialized constructors: people using your class can provide any combination of properties they want. They might decide to set the Position property inline in this object initializer too, as Example 3-38 does—if we’d been relying on constructors, default or named arguments wouldn’t have helped if there was no constructor available that accepted a Position. We’ve not had to provide an additional constructor overload to make this possible—developers using our class have a great deal of flexibility. Of course, this approach only makes sense if our type is able to work sensibly with default values for the properties in question. If you absolutely need certain values to be provided on initialization, you’re better off with constructors.

Example 3-38. Providing an extra property

Plane someBoeing777 = new Plane
                          {
                              Identifier = "BA0049",
                              Direction = DirectionOfApproach.Approaching,
                              SpeedInMilesPerHour = 150,
                              Position = new PolarPoint3D(20, 180, 14500)
                          };

So, we’ve addressed the data part of our Plane; but the whole point of a class is that it can encapsulate both state and operations. What methods are we going to define in our class?

Defining Methods

When deciding what methods a class might need, we generally scan our specifications or scenarios for verbs that relate to the object of that class. If we look back at the ATC system description at the beginning of this chapter, we can see several plane-related actions, to do with granting permissions to land and permissions to take off. But do we need functions on the Plane class to deal with that? Possibly not. It might be better to deal with that in another part of the model, to do with our ground control, runways, and runway management (that, you’ll be pleased to hear, we won’t be building).

But we will periodically need to update the position of all the planes. This involves changing the state of the plane—we will need to modify its Position. And it’s a change of state whose details depend on the existing state—we need to take the direction and speed into account. This sounds like a good candidate for a method that the Plane class should offer. Example 3-39 shows the code to add inside the class.

Example 3-39. A method

public void UpdatePosition(double minutesToAdvance)
{
    double hours = minutesToAdvance / 60.0;
    double milesMoved = SpeedInMilesPerHour * hours;
    double milesToTower = Position.Distance;
    if (Direction == DirectionOfApproach.Approaching)
    {
        milesToTower -= milesMoved;
        if (milesToTower < 0)
        {
            // We've arrived!
            milesToTower = 0;
        }
    }
    else
    {
        milesToTower += milesMoved;
    }
    PolarPoint3D newPosition = new PolarPoint3D(
        milesToTower, Position.Angle, Position.Altitude);
    // Store the result so that the plane's state actually changes.
    Position = newPosition;
}

This method takes a single argument, indicating how much elapsed time the calculation should take into account. It looks at the speed, the direction, and the current position, and uses this information to calculate the new position.
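A quick worked example shows the arithmetic. A plane 50 miles out, approaching at 120 mph, covers 120 × (15 / 60) = 30 miles in 15 minutes, leaving it 20 miles from the tower. A further 15 minutes would overshoot, so the distance is clamped to zero. The sketch below assumes a shape for PolarPoint3D (an immutable struct built from distance, angle, and altitude, in that order) and uses a cut-down Plane. It stores the calculated position back into the Position property, since the purpose of the method is to change the plane’s state.

```csharp
using System;

enum DirectionOfApproach { Approaching, Leaving }

// Assumed shape of the chapter's immutable PolarPoint3D.
struct PolarPoint3D
{
    public PolarPoint3D(double distance, double angle, double altitude)
        : this()
    {
        Distance = distance;
        Angle = angle;
        Altitude = altitude;
    }

    public double Distance { get; private set; }
    public double Angle { get; private set; }
    public double Altitude { get; private set; }
}

// Cut-down stand-in for Plane with just the members the method needs.
class Plane
{
    public PolarPoint3D Position { get; set; }
    public DirectionOfApproach Direction { get; set; }
    public double SpeedInMilesPerHour { get; set; }

    public void UpdatePosition(double minutesToAdvance)
    {
        double hours = minutesToAdvance / 60.0;
        double milesMoved = SpeedInMilesPerHour * hours;
        double milesToTower = Position.Distance;
        if (Direction == DirectionOfApproach.Approaching)
        {
            milesToTower -= milesMoved;
            if (milesToTower < 0)
            {
                milesToTower = 0;   // never go past the tower
            }
        }
        else
        {
            milesToTower += milesMoved;
        }
        // Store the result so the plane's state really changes.
        Position = new PolarPoint3D(
            milesToTower, Position.Angle, Position.Altitude);
    }
}

class UpdatePositionDemo
{
    static void Main()
    {
        Plane plane = new Plane
        {
            Position = new PolarPoint3D(50, 180, 14500),
            Direction = DirectionOfApproach.Approaching,
            SpeedInMilesPerHour = 120
        };

        plane.UpdatePosition(15);
        Console.WriteLine(plane.Position.Distance);   // 20

        plane.UpdatePosition(15);
        Console.WriteLine(plane.Position.Distance);   // 0
    }
}
```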

Note

This code illustrates that our design is some way from being finished. We never change the altitude, which suggests that our planes are going to have a hard time reaching the ground. (Although since this code makes them stop moving when they get directly above the tower, they’ll probably reach the ground soon enough...) Apparently our initial specification did not fully and accurately describe the problem our software should be solving. This will not come as astonishing news to anyone who has worked in the software industry. Clearly we need to talk to the client to get clarification, but let’s implement what we can for now.

Notice that our code is able to use all of the properties—SpeedInMilesPerHour, Direction, and so on—without needing to qualify them with a variable. Whereas in Example 3-35 we had to write someBoeing777.SpeedInMilesPerHour, here we just write SpeedInMilesPerHour. Methods are meant to access and modify an object’s state, and so you can refer directly to any member of the method’s containing class.

There’s one snag with that. It can mean that for someone reading the code, it’s not always instantly obvious when the code uses a local variable or argument, and when it uses some member of the class. Our properties use PascalCasing, while we’re using camelCasing for arguments and variables, which helps, but what if we wanted to access a field? Those conventionally use camelCasing too. That’s why some developers put an underscore in front of their field names: it makes it more obvious when we’re doing something with the object’s state. But there’s an alternative, a more explicit style, shown in Example 3-40.

Example 3-40. Explicit member access

public void UpdatePosition(double minutesToAdvance)
{
    double hours = minutesToAdvance / 60.0;
    double milesMoved = this.SpeedInMilesPerHour * hours;
    double milesToTower = this.Position.Distance;
    if (this.Direction == DirectionOfApproach.Approaching)
    {
        milesToTower -= milesMoved;
        if (milesToTower < 0)
        {
            // We've arrived!
            milesToTower = 0;
        }
    }
    else
    {
        milesToTower += milesMoved;
    }
    PolarPoint3D newPosition = new PolarPoint3D(
        milesToTower,
        this.Position.Angle,
        this.Position.Altitude);
    // Store the result so that the plane's state actually changes.
    this.Position = newPosition;
}

This is almost the same as Example 3-39, except every member access goes through a variable called this. But we’ve not defined any such variable—where did that come from?

The UpdatePosition method effectively has an implied extra argument called this, and it’s the object on which the method has been invoked. So, if our Main method were to call someBoeing777.UpdatePosition(10), the this variable would refer to whatever object the Main method’s someBoeing777 variable referred to.
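One way to picture the implied argument is to write the same operation twice: once as an ordinary method that relies on this, and once as a static method that receives the object explicitly. The Counter class here is hypothetical and exists only for this comparison.

```csharp
using System;

// Hypothetical class used only to contrast implicit and explicit 'this'.
class Counter
{
    public int Count { get; set; }

    // Instance method: the object it was invoked on arrives as 'this'.
    public void Increment()
    {
        this.Count += 1;
    }

    // Static equivalent: the object arrives as an ordinary argument.
    public static void Increment(Counter target)
    {
        target.Count += 1;
    }
}

class ThisDemo
{
    static void Main()
    {
        Counter counter = new Counter();
        counter.Increment();          // 'this' refers to counter
        Counter.Increment(counter);   // same effect, spelled out
        Console.WriteLine(counter.Count);   // 2
    }
}
```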

Methods get a this argument by default, but they can opt out, because sometimes it makes sense to write methods that don’t apply to any particular object. The Main method of our Program class is one example—it has no this argument, because the .NET Framework doesn’t presume to create an object; it just calls the method and lets us decide what objects, if any, to create. You can tell a method has no this argument because it will be marked with the static keyword—you may recall from Chapter 2 that this means the method can be run without needing an instance of its defining type.

Aside from our Main method, why might we not want a method to be associated with a particular instance? Well, one case comes to mind for our example application. There’s a rather important feature of airspace management that we’re likely to need to cope with: ensuring that we don’t let two planes hit each other. So, another method likely to be useful is one that allows us to check whether one plane is too close to another one, within some margin of error (say, 5,000 feet). And this method isn’t associated with any single plane: it always involves two planes.

Now we could define a method on Plane that accepted another Plane as an argument, but that’s a slightly misleading design—it has a lack of symmetry which suggests that the planes play different roles, because you’re invoking the method on one while passing in the other as an argument. So it would make more sense to define a static method—one not directly associated with any single plane—and to have that take two Plane objects.

Declaring Static Methods

We’ll add the method shown in Example 3-41 to the Plane class. Because it is marked static, it’s not associated with a single Plane, and will have no implicit this argument. Instead, we pass in both of the Plane objects we want to look at as explicit arguments, to emphasize the fact that neither of the objects is in any way more significant than the other in this calculation.

Example 3-41. Detecting when Planes are too close

public static bool TooClose(Plane first, Plane second, double minimumMiles)
{
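    // Math.Cos and Math.Sin expect radians, so this assumes Position.Angle
    // is stored in radians; convert first if it is held in degrees.
    double x1 = first.Position.Distance * Math.Cos(first.Position.Angle);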
    double x2 = second.Position.Distance * Math.Cos(second.Position.Angle);
    double y1 = first.Position.Distance * Math.Sin(first.Position.Angle);
    double y2 = second.Position.Distance * Math.Sin(second.Position.Angle);
    double z1 = first.Position.Altitude / feetPerMile;
    double z2 = second.Position.Altitude / feetPerMile;

    double dx = x1 - x2;
    double dy = y1 - y2;
    double dz = z1 - z2;

    double distanceSquared = dx * dx + dy * dy + dz * dz;
    double minimumSquared = minimumMiles * minimumMiles;
    return distanceSquared < minimumSquared;
}
private const double feetPerMile = 5280;

We’ve seen plenty of function declarations like this before, but we’ll quickly recap its anatomy. This one returns a bool to indicate whether the two planes are too close (true) or safely separated (false). In its parameter list, we have the references to the two Plane objects, and a double for the minimum permitted separation (in miles).

Note

Because there’s no implicit this parameter, any attempt to use nonstatic members of the class without going through an argument or variable such as first and second in Example 3-41 will cause an error. This often catches people out when learning C#. They try adding a method to the Program class of a new program, and they forget to mark it as static (or don’t realize that they need to), and then are surprised by the error they get when attempting to call it from Main. Main is a static method, and like any static method, it cannot use nonstatic members of its containing type unless you provide it with an instance.
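The following sketch reproduces that situation in miniature. StaticDemo and its two helper methods are hypothetical names. The commented-out line is the one that would fail to compile, with error CS0120 ("An object reference is required for the non-static field, method, or property").

```csharp
using System;

class StaticDemo
{
    // Instance method: needs an object to run against.
    public string InstanceHelper()
    {
        return "instance helper";
    }

    // Static method: callable without any object.
    public static string StaticHelper()
    {
        return "static helper";
    }

    static void Main()
    {
        Console.WriteLine(StaticHelper());

        // Console.WriteLine(InstanceHelper());   // compile error CS0120

        // Supplying an instance makes the nonstatic member reachable.
        StaticDemo demo = new StaticDemo();
        Console.WriteLine(demo.InstanceHelper());
    }
}
```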

Example 3-41 performs some calculations to work out how close the planes are. The details aren’t particularly important here—we’re more interested in how this uses C# methods. But just for completeness, the method converts the position into Cartesian coordinates, and then calculates the sum of the squares of the differences of the coordinates in all three dimensions, which will give us the square of the distance between the two planes. We could calculate the actual distance by taking the square root, but since we only want to know whether or not we’re too close, we can just compare with the minimum distance squared. (Computers are much faster at squaring than they are at calculating square roots, so given that we could do it either way, we may as well avoid the square root.)
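Some numbers make the comparison easier to check by hand. Take two planes on the same bearing, 10 and 13 miles from the tower, at 10,560 and 31,680 feet (2 and 6 miles up). The differences are 3 miles horizontally and 4 miles vertically, so the squared distance is 9 + 16 = 25, which means the planes are 5 miles apart. The sketch below wraps the same calculation in a cut-down Plane and an assumed PolarPoint3D, and it treats Angle as radians because that is what Math.Cos and Math.Sin expect.

```csharp
using System;

// Assumed shape of the chapter's immutable PolarPoint3D (angle in radians here).
struct PolarPoint3D
{
    public PolarPoint3D(double distance, double angle, double altitude)
        : this()
    {
        Distance = distance;
        Angle = angle;
        Altitude = altitude;
    }

    public double Distance { get; private set; }
    public double Angle { get; private set; }
    public double Altitude { get; private set; }
}

// Cut-down stand-in for Plane.
class Plane
{
    private const double feetPerMile = 5280;

    public PolarPoint3D Position { get; set; }

    public static bool TooClose(Plane first, Plane second, double minimumMiles)
    {
        double dx = first.Position.Distance * Math.Cos(first.Position.Angle)
                  - second.Position.Distance * Math.Cos(second.Position.Angle);
        double dy = first.Position.Distance * Math.Sin(first.Position.Angle)
                  - second.Position.Distance * Math.Sin(second.Position.Angle);
        double dz = (first.Position.Altitude - second.Position.Altitude) / feetPerMile;

        // Compare squared values to avoid taking a square root.
        double distanceSquared = dx * dx + dy * dy + dz * dz;
        return distanceSquared < minimumMiles * minimumMiles;
    }
}

class TooCloseDemo
{
    static void Main()
    {
        Plane first = new Plane { Position = new PolarPoint3D(10, 0, 10560) };
        Plane second = new Plane { Position = new PolarPoint3D(13, 0, 31680) };

        Console.WriteLine(Plane.TooClose(first, second, 6));   // True: 25 < 36
        Console.WriteLine(Plane.TooClose(first, second, 4));   // False: 25 >= 16
    }
}
```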

Static Fields and Properties

It isn’t just functions that we can declare as static. Fields and properties can be static, too. In fact, we’ve already seen a special kind of static field—the const value we defined for the conversion between miles and kilometers. There was only one conversion factor value, however many objects we instantiated.

For our purposes, the key difference between a const field and a static field is that we can modify the static field. (Remember: the const field was immutable, and its value has to be known at compile time.) So, a static property or field effectively lets us get or set data associated with the class, rather than the object. No matter how many objects we create, we are always getting and setting the same value.

Let’s look at a trivial illustration, shown in Example 3-42, to explore how it works, before we think about why we might want to use it.

Example 3-42. Static state

public class MyClassWithAStaticProperty
{
    public static bool TrueOrFalse
    {
        get;
        set;
    }

    public void SayWhetherTrueOrFalse()
    {
        Console.WriteLine("Object is {0}", TrueOrFalse);
    }
}


class Program
{
    static void Main(string[] args)
    {
        // Create two objects
        MyClassWithAStaticProperty object1 = new MyClassWithAStaticProperty();
        MyClassWithAStaticProperty object2 = new MyClassWithAStaticProperty();

        // Check how the property looks to each object,
        // and accessed through the class name


        object1.SayWhetherTrueOrFalse();
        object2.SayWhetherTrueOrFalse();
        Console.WriteLine("Class is {0}",
           MyClassWithAStaticProperty.TrueOrFalse);

        // Change the value
        MyClassWithAStaticProperty.TrueOrFalse = true;

        // And see that it has changed everywhere
        object1.SayWhetherTrueOrFalse();
        object2.SayWhetherTrueOrFalse();
        Console.WriteLine("Class is {0}",
           MyClassWithAStaticProperty.TrueOrFalse);

        Console.ReadKey();
    }
}

If you compile and run this code in a console application project, you’ll see the following output:

Object is False
Object is False
Class is False
Object is True
Object is True
Class is True

This demonstrates that there’s clearly just the one piece of information here, no matter how many different object instances we may try to look at it through. But why might we want this kind of static, class-level data storage?

The principal use for class-level data is to enforce the reality that there is exactly one instance of some piece of data throughout the whole system. If you think about it, that’s exactly what our miles-to-kilometers value is all about: we only need one instance of that number for the whole system, so we declare it as const (which, as we’ve already seen, is like a special case of static). A similar pattern crops up in lots of places in the .NET Framework class library. For example, on a computer running Windows, there is a specific directory containing certain OS system files (typically C:\Windows\system32). The class library provides a class called Environment which offers, among other things, a SystemDirectory property that returns that location, and since there’s only one such directory, this is a static property.

Another common use for static is when we want to cache information that is expensive to calculate, or which is frequently reused by lots of different objects of the same type. To get a benefit when lots of objects use the common data, it needs to be available to all instances.
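As a rough illustration of the caching case, the sketch below keeps results in a static dictionary so that every instance benefits from work done by any other. RouteCalculator, its placeholder calculation, and the ComputeCount counter are all invented for this example, and the code makes no attempt to be safe for use from multiple threads.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class; the "expensive" work is just a placeholder.
class RouteCalculator
{
    // One cache shared by every RouteCalculator instance.
    private static readonly Dictionary<string, double> cache =
        new Dictionary<string, double>();

    // Counts how often the expensive work really ran (for demonstration only).
    public static int ComputeCount { get; private set; }

    public double GetDistance(string route)
    {
        double result;
        if (!cache.TryGetValue(route, out result))
        {
            result = ExpensiveLookup(route);
            cache[route] = result;
        }
        return result;
    }

    private static double ExpensiveLookup(string route)
    {
        ComputeCount += 1;
        return route.Length * 100.0;   // stands in for a slow calculation
    }
}

class CacheDemo
{
    static void Main()
    {
        RouteCalculator first = new RouteCalculator();
        RouteCalculator second = new RouteCalculator();

        Console.WriteLine(first.GetDistance("LHR-JFK"));
        Console.WriteLine(second.GetDistance("LHR-JFK"));   // served from the cache

        Console.WriteLine(RouteCalculator.ComputeCount);    // 1
    }
}
```

Had the dictionary been an ordinary instance field, each object would have started with an empty cache and repeated the work.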

Static Constructors

We can even apply the static keyword to a constructor. This lets us write a special constructor that only runs once for the whole class. We could add the constructor in Example 3-43 to our Plane class to illustrate this.

Example 3-43. Static constructor

static Plane()
{
    Console.WriteLine("Plane static constructor");
}

With this code in place, you would see the message printed out by that constructor just once at the beginning of the program—static constructors run exactly once.

Note

In case you’re wondering, yes, static fields can be marked as readonly. And just as a normal readonly field can only be modified in a constructor, a static readonly field can only be modified in a static constructor.

But when exactly do static constructors run? We know when regular members get initialized and when normal constructors run—that happens when we new up the object. Everything gets initialized to zero, and then our constructor(s) are called to do any other initialization that we need doing. But what about static initialization?

The static constructor will run no later than the first time either of the following happens: you create an instance of the class; you use any static member of the class. There are no guarantees about the exact moment the code will run, and it’s possible you’ll see it running earlier than you would have expected for optimization reasons.
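A small experiment confirms the "exactly once" behavior and shows a static readonly field being assigned in the static constructor. The Plane here is a cut-down stand-in, and the counters and the DefaultCarrier field are hypothetical additions that exist only so we can observe what happened.

```csharp
using System;

// Cut-down stand-in for Plane; the counters exist only for this demonstration.
class Plane
{
    // A static readonly field may be assigned in the static constructor.
    public static readonly string DefaultCarrier;

    public static int StaticConstructorRuns;
    public static int InstancesCreated;

    static Plane()
    {
        DefaultCarrier = "BA";
        StaticConstructorRuns += 1;
    }

    public Plane()
    {
        InstancesCreated += 1;
    }
}

class StaticConstructorDemo
{
    static void Main()
    {
        new Plane();
        new Plane();
        new Plane();

        Console.WriteLine(Plane.InstancesCreated);        // 3
        Console.WriteLine(Plane.StaticConstructorRuns);   // 1
        Console.WriteLine(Plane.DefaultCarrier);          // BA
    }
}
```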

Field initializers for static fields add some slight complication. (Remember, a field initializer is an expression that provides a default value for a field, and which appears in the field declaration itself, rather than the constructor. Example 3-44 shows some examples.) .NET initializes the statics in the order in which they are declared. So, if you reference one static field from the initializer for another static field in the same class, you need to be careful, or you can get errors at runtime. Example 3-44 illustrates how this can go wrong. (Also, the .NET Framework is somewhat noncommittal about exactly when field initializers will run—in theory it has more freedom than with a static constructor, and could run them either later or earlier than you might expect, although in practice, it’s not something you’d normally need to worry about unless you’re writing multithreaded code that depends on the order in which static initialization occurs.)

Example 3-44. Unwise ordering of static field initializers

class Bar
{
    public bool myField;
}

// Bad - field1 is still null when field2's initializer runs, so the
// first use of Foo fails with a null reference exception
class Foo
{
    public static bool field2 = field1.myField;
    public static Bar field1 = new Bar();
}

// OK - initialized in the right order (an alternative to the Foo above)
class Foo
{
    public static Bar field1 = new Bar();
    public static bool field2 = field1.myField;
}
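To see the failure for ourselves, we can put both orderings into one program. The two versions need distinct names to compile together, so this sketch calls them BadOrder and GoodOrder. Those names, and the Try helper, are our own. The runtime reports a failed static initializer as a TypeInitializationException whose InnerException holds the original error.

```csharp
using System;

class Bar
{
    public bool myField;
}

class BadOrder
{
    // field1 has not been initialized yet, so this dereferences null.
    public static bool field2 = field1.myField;
    public static Bar field1 = new Bar();
}

class GoodOrder
{
    public static Bar field1 = new Bar();
    public static bool field2 = field1.myField;
}

class OrderingDemo
{
    // Reads a value, reporting the underlying error type if static
    // initialization of the class fails.
    public static string Try(Func<bool> read)
    {
        try
        {
            return read().ToString();
        }
        catch (TypeInitializationException e)
        {
            return e.InnerException.GetType().Name;
        }
    }

    static void Main()
    {
        Console.WriteLine(Try(() => GoodOrder.field2));   // False
        Console.WriteLine(Try(() => BadOrder.field2));    // NullReferenceException
    }
}
```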

Summary

We saw how to define classes from which we can create instances called objects, and that this can be useful when attempting to model real-world entities. We can also define value types, using the struct keyword, and the main difference is that when we assign variables or pass arguments, value types always copy the whole value, whereas ordinary classes (which are reference types) only copy a reference to the underlying object. We also saw a simpler kind of type: enum. This lets us define named sets of constant values, and is useful when we need a value representing a choice from a fixed set of options.

So, now we know how to abstract basic ideas of information storage (through fields and simple properties) and manipulation (through functions and calculated properties), using classes and objects. In the next chapter, we’re going to look at how we can extend these ideas further using a concept called polymorphism to model a hierarchy of related classes that can extend or refine some basic contract.



[9] A special kind of crayon, designed for writing on glossy surfaces such as plastic.

[10] It’s short for “enumeration,” by the way. So it’s often pronounced “e-noom” or, depending on where you’re from, “e-nyoom.” However, some developers (and one of the authors) ignore the etymology and pronounce it “ee numb” because that’s how it looks like it should sound.

[11] This is a slight oversimplification. In Chapter 8, we’ll encounter anonymous types, which are always immutable, and yet we can use object initializers with those. In fact, we are required to. But anonymous types are a special case.
