In the previous couple of chapters, we looked at some basic programming techniques such as loops and conditions, and used some of the data types built into the language and platform, such as int and string.
Unfortunately, real programs—even fairly simple ones—are much, much more complicated than the examples we’ve built so far. They need to model the behavior of real-world objects like cars and planes, or ideas like mathematical expressions, or behaviors, like the transaction between you and your favorite coffee shop when you buy a double espresso and a brownie with your bank card.
The best way to manage this complexity is to break a system down into manageable pieces, where each piece is small enough for us to understand completely. We should aim to craft each piece so that it fits neatly into the system as a whole with a small enough number of connections to the other pieces that we can comprehend all of those too.
We’ve already seen one tool for dividing our code into manageable pieces: methods. A method is a piece of a program that encapsulates a particular behavior completely. It’s worth understanding the benefits of methods, because the same principles apply to the classes and structs that are this chapter’s main subject.
You will often see the term function used instead of method; they’re related, but not identical. A function is a method that returns something. Some methods just do some work, and do not return any value. So in C#, all functions are methods, but not all methods are functions.
Methods offer a contract: if we meet particular conditions, a method will do certain things for us. Conditions come in various forms: we might need to pass arguments of suitable types, perhaps with limits on the range (e.g., negative numbers may not be allowed). We may need to ensure certain things about the program’s environment—maybe we need to check that certain directories exist on disk, or that there’s sufficient free memory and disk space. There may be constraints on when we are allowed to call the method—perhaps we’re not allowed to call it if some related work we started earlier hasn’t completed yet.
Likewise, there are several ways in which a method can hold up its side of the bargain. Perhaps it will just return a string or a number that is the result of a calculation involving the method’s inputs. It might change the state of some entity in our system in some way, such as modifying an employee’s salary. It may change something about the system environment—the method might install a new device driver, or change the current user’s color scheme, for example. Some methods interact with the outside world by sending messages over the network.
Some aspects of the contract are formalized—a method’s parameter list defines the number and type of arguments we need to pass, for example, and its return type tells us what, if anything, to expect as a return value. But most of the contract is informally specified—we rely on documentation (or sometimes, conversations with the developer who wrote the method) to understand the full contract. But understand it we must, because the contract is at the heart of how methods make our lives easier.
Methods simplify things for us in two ways. If we are the user of a method, then, as long as its internal implementation conforms to the contract, we can treat it as a “black box.” We call it, we expect it to work as described, and we don’t need to worry about how it worked. All its internal complexity is hidden from us, freeing us to think about ideas like “increase this employee’s salary,” without getting bogged down by details such as “open a connection to the database and execute some SQL.”
If, on the other hand, we are the developer of a method, we don’t need to worry about who might call us, and why. As long as our implementation works as promised, we can choose any means of implementation we like—perhaps optimizing for speed, or size, or (more often than not) simplicity and maintainability. We can concentrate on details like whether we’re using the right connection string, and whether the SQL query modifies the database as intended, without needing to ask ourselves questions like “should we even be adjusting this particular employee’s salary at all?”
So, one objective of good design is to hide distracting details and expose a simple model to your client. This practice is called encapsulation, and it’s harder than it looks. As is so often the case in life, making something look easy takes years of practice and hard work. It can also be a thankless task: if you devise a contract that is a model of clarity, people will probably think it was easy to design. Conversely, unnecessary complexity is often mistaken for cleverness.
While methods are essential for achieving encapsulation, they do not guarantee it. It’s all too easy to write methods whose contract is unclear. This often happens when developers do something as an afterthought: it can be oh so tempting to add a bit of extra code to an existing method as a quick solution to a problem, but this risks making that method’s responsibilities less clear.
A method’s name is often a good indicator of the clarity of the contract—if the name is vague, or worse, if it’s an inaccurate description of what the method does, you’re probably looking at a method that does a bad job of encapsulation.
One of the great things about methods is that we can use them to keep breaking things into smaller and smaller pieces. Suppose we have some method called PlaceOrder, which has a well-defined responsibility, but which is getting a bit complicated. We can just split its implementation into smaller methods, say, CheckCustomerCredit, AllocateStock, and IssueRequestToWarehouse. These smaller methods do different bits of the work for us.
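To make the idea concrete, here is a minimal sketch of that kind of decomposition. The method names come from the text above; the parameters, the credit limit, the stock rule, and the OrderProcessor class are all assumptions invented purely for illustration.

```csharp
using System;

class OrderProcessor
{
    // The top-level method now just sequences the smaller steps.
    public static bool PlaceOrder(string customer, string item, int quantity)
    {
        // Hypothetical pricing: every item costs 10.0
        if (!CheckCustomerCredit(customer, quantity * 10.0))
        {
            return false;
        }
        if (!AllocateStock(item, quantity))
        {
            return false;
        }
        IssueRequestToWarehouse(item, quantity);
        return true;
    }

    // Hypothetical rule: every customer has a credit limit of 1000.0
    static bool CheckCustomerCredit(string customer, double orderValue)
    {
        return orderValue <= 1000.0;
    }

    // Hypothetical rule: we can allocate between 1 and 50 of any item
    static bool AllocateStock(string item, int quantity)
    {
        return quantity > 0 && quantity <= 50;
    }

    static void IssueRequestToWarehouse(string item, int quantity)
    {
        Console.WriteLine("Warehouse request: {0} x {1}", quantity, item);
    }

    static void Main()
    {
        Console.WriteLine(PlaceOrder("Alice", "Brownie", 2));   // within every limit
        Console.WriteLine(PlaceOrder("Bob", "Espresso", 500));  // exceeds the credit limit
    }
}
```

Each helper can be read, tested, and changed on its own, while PlaceOrder stays short enough to take in at a glance.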
This general technique, sometimes called functional decomposition, has a long history in mathematics. It was explored academically in computing applications as early as the 1930s. Bearing in mind that the first working programmable computers didn’t appear until the 1940s, that’s quite a pedigree. In fact, it has been around for so long that it now seems “obvious” to most people who have had anything to do with computer programming.
That’s not the end of the story, though. Methods are great for describing the dynamics of a system: how things change in response to particular input data (the method arguments), and the results of those changes (a function’s return value, or a method’s side effects). What they’re not so good at is describing the current state of the system. If we examine a set of functions and a load of variables, how can we work out which pieces of information are supposed to be operated on by which functions? If methods were the only tool available for abstraction, we’d have a hard time telling the difference between the double that describes my blood pressure, and can be operated on by this method:
void LowerMyBloodPressure(double pressureDelta)
and the double that describes my weight and can be affected by this method:
void EatSomeDonuts(int quantityOfDonuts)
As programs get ever larger, the number of system state variables floating around increases, and the number of methods can explode exponentially. But the problems aren’t just about the sheer number of functions and variables you end up with. As you try to model a more complex system, it becomes harder to work out which functions and variables you actually need—what is a good “decomposition” of the system? Which methods relate to one another, and to which variables?
In the 1960s, two guys called Dahl and Nygaard (they’re Norwegian) were working on big simulation systems and were struggling with this problem. Because they worked on simulating real things, they realized that their code would be easier to understand if they had some clear way to group together all of the data and functions related to a particular type of real thing (or a particular object, we might say).
They designed a programming language that could do this, called Simula 67 (after the year of its birth), and it is generally recognized as the grandmother of all the languages we’d call object-oriented, which (of course) includes C#.
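They had hit upon two important concepts: the class, which groups together the data and functions that describe a particular kind of thing, and the object, which is one particular instance of a class.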
With these simple ideas, we can remove all doubt over which functions operate on which data—the class describes for us exactly what goes with what, and we can handle multiple entities of the same kind by creating several objects of a particular class.
As an example, let’s think about a very simple computer system that maintains the information for an air traffic control (ATC) operation. (Safety notice: if you happen to be building an ATC system, I strongly recommend that you don’t base it on this one.)
How does (this particular, slightly peculiar) ATC system work?
It turns out that we’ve got a bunch of people in a big room in Washington, tracking a large number of planes that buzz around the airport in Seattle. Each plane has an identifier (BA0049, which flies in from London Heathrow, for instance). We need to know the plane’s position, which we’ll represent using three numbers: an altitude (in feet); the distance from the airport control tower (in miles); and a compass heading (measured in degrees from North), which will also be relative to the tower. Just to be clear, that’s not the direction the aircraft itself is facing; it’s the direction we’d have to face in order to be looking at the plane if we’re standing in the tower. We also need to know whether the aircraft is coming in to us, or away from us, and how fast. This, apparently, is quite important. (A more comprehensive model might include a second compass heading, representing the exact direction the plane is facing. But to keep this example simple, we’ll just track whether planes are approaching or departing.)
As the planes come in, the controllers give them permission to take off or land, and instruct them to change their heading, height, or speed. The aim is to avoid them hitting each other at any point. This, apparently, is also quite important.
At present they have a system where each controller is responsible for a particular piece of airspace. Each controller has a rack containing little slips of plastic, one per aircraft, marked with the aircraft’s ID and ordered by the height at which the planes are flying. A plane coming in to the airport gets a piece of blue plastic; one going away gets white plastic. To keep track of the heading, distance, and speed, the controllers just write on the slip with a chinagraph pencil. If the plane moves out of a controller’s airspace, it is handed over to another controller, who slips it into their own rack.
So that’s our specification.
In reality, a safety-critical system such as ATC would have a more robust spec. However, when lives are not at stake, software specifications are often pretty nebulous, so this example is, sadly, a fair representation of what to expect on your average software project.
Armed with this brilliant description we need to come up with a design for a program which can model the system. We’re going to do that using object-oriented techniques.
When we do an object-oriented analysis we’re looking for the different classes of object that we are going to describe. Very often, they will correspond to real things in the system. For a class to represent these real objects properly, we need to work out what information it is going to hold, and what functions it will define to manipulate that information. In general, any one piece of information will belong to exactly one object, of exactly one class.
Not all of your classes will represent real-world objects. Some will relate to more abstract concepts like collections, or commands. However, designs that wander too far into the realms of the wholly abstract are often “clever” but not necessarily “good”.
In our ATC example, it’s clear that we have a whole lot of different planes buzzing round the airport. It would therefore seem logical that we would model each one as an object, for which we would define a class called Plane.
Because C# is a language with object-oriented features, we have a simple and expressive way of doing that.
We can start out with the simplest possible class. It will have no methods, and no data, so as a model of a plane in our system, it leaves something to be desired, but it gets us started.
If you want to build your own version as you read, create a new Console Application project just as we did in Chapter 2. To add a new class, use the Project→Add Class menu item (or right-click on the project in the Solution Explorer and select Add→Class). If we call the new file Plane.cs, Visual Studio will create a source file with the usual using directives and namespace declaration. And most importantly, the file will contain a new, empty class definition, as shown in Example 3-1.
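A freshly added Plane.cs might look something like the following sketch. The namespace name AirTrafficControl is an assumption (Visual Studio uses your project’s name), and the exact set of using directives varies between Visual Studio versions.

```csharp
using System;

// Hypothetical namespace; yours will match your project's name
namespace AirTrafficControl
{
    // An empty class: no data and no methods yet
    class Plane
    {
    }
}
```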
Right; if we look back at the specification, there’s clearly a whole bunch of information we’ve got about the plane that we need to store somewhere. C# gives us a handy mechanism for this called a property.
Each plane has an identifier which is just a string of letters and numbers. We’ve already seen a built-in type ideal for representing this kind of data: string. So, we can add a property called Identifier, of type string, as Example 3-2 shows.
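A sketch of what such a property declaration could look like, assuming nothing beyond a type, a name, and a getter and setter inside braces:

```csharp
class Plane
{
    // A property of type string, with a getter and a setter
    string Identifier { get; set; }
}
```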
A property definition always states the type of data the property holds (string in this case), followed by its name. By convention, we use PascalCasing for this name (see the sidebar on the next page). As with most nontrivial elements of a C# program, this is followed by a pair of braces, and inside these we say that we want to provide a getter and a setter for the property, using the get and set keywords. You might be wondering why we need to declare these; wouldn’t any property need to be gettable and settable? But as we’ll see, these explicit declarations turn out to be useful.
If we create an instance of this class, we could use this Identifier property to get and set its identifier. Example 3-3 shows this in a modified version of the Main function in our Program.cs file.
Example 3-3. Using the Plane class’s property
static void Main(string[] args)
{
    Plane someBoeing777 = new Plane();
    someBoeing777.Identifier = "BA0049";
    Console.WriteLine(
        "Your plane has identifier {0}",
        someBoeing777.Identifier);
    // Wait for the user to press a key, so
    // that we can see what happened
    Console.ReadKey();
}
But wait! If you try to compile this, you end up with an error message:
'Plane.Identifier' is inaccessible due to its protection level
Earlier, we mentioned that one of the objectives of good design is encapsulation: hiding the implementation details so that other developers can use our objects without relying on (or knowing about) how they work. As the error we just saw in Example 3-3 shows, a class’s members are hidden by default. If we want them to be visible to users of our class, we must change their protection level.
Every entity that we declare has its own protection level, whether we specify it or not. A class, for example, has a default protection level called internal. This means that it can only be seen by other classes in its own assembly. We’ll talk a lot more about assemblies in Chapter 15. For now, though, we’re only using one assembly (our example application itself), so we can leave the class at its default protection level.
While classes default to being internal, the default protection level for a class member (such as a property) is private. This means that it is only accessible to other members of the class. To make it accessible from outside the class, we need to change its protection level to public, as Example 3-4 shows.
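A sketch of that change, together with a small Main that exercises it (the Program class here stands in for the one in Program.cs):

```csharp
using System;

class Plane
{
    // Marked public, so code outside the class can get and set it
    public string Identifier { get; set; }
}

class Program
{
    static void Main()
    {
        Plane someBoeing777 = new Plane();
        someBoeing777.Identifier = "BA0049";
        Console.WriteLine(
            "Your plane has identifier {0}",
            someBoeing777.Identifier);
    }
}
```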
Now when we compile and run the application, we see the correct output:
Your plane has identifier BA0049
Notice how this is an opt-in scheme. If you don’t do anything to the contrary, you get the lowest sensible visibility. Your classes are visible to any code inside your assembly, but aren’t accessible to anyone else; a class’s properties and methods are only visible inside the class, unless you explicitly choose to make them more widely accessible.
When different layers specify different protection, the effective accessibility is the lowest specified. For example, although our property has public accessibility, the class of which it is a member has internal accessibility. The lower of the two wins, so the Identifier property is, in practice, only accessible to code in the same assembly.
It is a good practice to design your classes with the smallest possible public interface (part of something we sometimes call “minimizing the surface area”). This makes it easier for clients to understand how they’re supposed to be used and often cuts down on the amount of testing you need to do. Having a clean, simple public API can also improve the security characteristics of your class framework, because the larger and more complex the API gets, the harder it generally gets to spot all the possible lines of attack.
That being said, there’s a common misconception that accessibility modifiers “secure” your class, by preventing people from accessing private members. Hence this warning:
It is important to recognize that these protection levels are a convenient design constraint, to help us structure our applications properly. They are not a security feature. It’s possible to use the reflection features described in Chapter 17 to circumvent these constraints and to access these supposedly hidden details.
To finish this discussion, you should know that there are two other protection levels available to us, protected and protected internal, which we can use to expose (or hide) members to developers who derive new classes from our class without making the members visible to all. But since we won’t be talking about derived classes until Chapter 4, we’ll defer the discussion of these protection levels until then.
We can take advantage of protection in our Plane class. A plane’s identifier shouldn’t change mid-flight, and it’s a good practice for code to prevent things from happening that we know shouldn’t happen. We should therefore add that constraint to our class. Fortunately, we have the ability to change the accessibility of the getter and the setter individually, as Example 3-5 shows. (This is one reason the property syntax makes us declare the get and set explicitly: it gives us a place to put the protection level.)
Example 3-5. Making a property setter private
class Plane
{
    public string Identifier { get; private set; }
}
Compiling again, we get a new error message:
The property or indexer 'Plane.Identifier' cannot be used in this context because the set accessor is inaccessible
The problem is with this bit of code from Example 3-3:
someBoeing777.Identifier = "BA0049";
We’re no longer able to set the property, because we’ve made the setter private (which means that we can only set it from other members of our class). We wanted to prevent the property from changing, but we’ve gone too far: we don’t even have a way of giving it a value in the first place. Fortunately, there’s a language feature that’s perfect for this situation: a constructor.
A constructor is a special method which allows you to perform some “setup” when you create an instance of a class. Just like any other method, you can provide it with parameters, but it doesn’t have an explicit return value. Constructors always have the same name as their containing class.
Example 3-6 adds a constructor that takes the plane’s identifier. Because the constructor is a member of the class, it’s allowed to use the Identifier property’s private setter.
Example 3-6. Defining a constructor
class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier { get; private set; }
}
Notice how the constructor looks like a standard method declaration, except that since there’s no need for a return type specifier, we leave that out. We don’t even write void, like we would for a normal method that returns nothing. And it would be weird if we did; in a sense this does return something (the newly created Plane), it just does so implicitly.
What sort of work should you do in a constructor? Opinion is divided on the subject—should you do everything required to make the object ready to use, or the minimum necessary to make it safe? The truth is that it is a judgment call—there are no hard and fast rules. Developers tend to think of a constructor as being a relatively low-cost operation, so enormous amounts of heavy lifting (opening files, reading data) might be a bad idea. Getting the object into a fit state for use is a good objective, though, because requiring other functions to be called before the object is fully operational tends to lead to bugs.
We need to update our Main function to use this new constructor and to get rid of the line of code that was setting the property, as Example 3-7 shows.
Example 3-7. Using a constructor
static void Main(string[] args)
{
    Plane someBoeing777 = new Plane("BA0049");
    Console.WriteLine(
        "Your plane has identifier {0}",
        someBoeing777.Identifier);
    Console.ReadKey();
}
Notice how we pass the argument to the constructor inside the parentheses, in much the same way that we pass arguments in a normal method call.
If you compile and run that, you’ll see the same output as before—but now we have an identifier that can’t be changed by users of the object.
Be very careful when you talk about properties that “can’t be changed” because they have a private setter. Even if you can’t set a property, you may still be able to modify the state of the object referred to by that property. The built-in string type happens to be immune to that because it is immutable (i.e., it can’t be changed once it has been created), so making the setter on a string property private does actually prevent clients from changing the property, but most types aren’t like that.
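The following sketch illustrates the distinction. The PassengerNames property is hypothetical (it is not part of the chapter’s Plane class) and exists only to show a mutable object sitting behind a private setter.

```csharp
using System;
using System.Collections.Generic;

class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
        PassengerNames = new List<string>();
    }

    // string is immutable, so a private setter really does fix this value
    public string Identifier { get; private set; }

    // Hypothetical property: clients cannot replace the list,
    // but they can still change what the list contains
    public List<string> PassengerNames { get; private set; }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane("BA0049");

        // plane.PassengerNames = new List<string>();  // would not compile: setter is private
        plane.PassengerNames.Add("A. Passenger");      // compiles, and modifies the plane's state

        Console.WriteLine(plane.PassengerNames.Count); // prints 1
    }
}
```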
Speaking of properties that might need to change, our specification requires us to know the speed at which each plane is traveling. Sadly, our specification didn’t mention the units in which we were expected to express that speed. Let’s assume it is miles per hour, and add a suitable property. We’ll use the floating-point double data type for this. Example 3-8 shows the code to add to Plane.
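A sketch of how the class might look with such a property added. Making it public with an ordinary getter and setter is an assumption, based on how the property is used in the later examples:

```csharp
class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier { get; private set; }

    // Speed is expected to change during a flight, so the setter stays public
    public double SpeedInMilesPerHour { get; set; }
}
```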
If we were to review this design with the customer, they might point out that while they have some systems that do indeed want the speed in miles per hour, the people they liaise with in European air traffic control want the speed in kilometers per hour. To avoid confusion, we will add another property so that they can get or set the speed in the units with which they are familiar. Example 3-9 shows a suitable property.
Example 3-9. Property with code in its get and set
public double SpeedInKilometersPerHour
{
    get
    {
        return SpeedInMilesPerHour * 1.609344;
    }
    set
    {
        SpeedInMilesPerHour = value / 1.609344;
    }
}
We’ve done something different here: rather than just writing get; and set; we’ve provided code for these accessors. This is another reason we have to declare the accessors explicitly: the C# compiler needs to know whether we want to write a custom property implementation.
We don’t want to use an ordinary property in Example 3-9, because our SpeedInKilometersPerHour is not really a property in its own right; it’s an alternative representation for the information stored in the SpeedInMilesPerHour property. If we used the normal property syntax for both, it would be possible to set the speed as being both 100 mph and 400 km/h, which would clearly be inconsistent. So instead we’ve chosen to implement SpeedInKilometersPerHour as a wrapper around the SpeedInMilesPerHour property.
If you look at the getter, you’ll see that it returns a value of type double. It is equivalent to a function with this signature:
public double get_SpeedInKilometersPerHour()
The setter seems to provide an invisible parameter called value, which is also of type double. So it is equivalent to a method with this signature:
public void set_SpeedInKilometersPerHour(double value)
This value parameter is a contextual keyword: C# only considers it to be a keyword in property or event accessors. (Events are described in Chapter 5.) This means you’re allowed to use value as an identifier in other contexts; for example, you can write a method that takes a parameter called value. You can’t do that with other keywords: you can’t have a parameter called class, for example.
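A short sketch of value being used as an ordinary parameter name. The SpeedConverter class and its method are hypothetical, made up for this illustration:

```csharp
using System;

class SpeedConverter
{
    // Outside a property or event accessor, 'value' is just an identifier,
    // so it is a legal parameter name
    public static double MilesToKilometers(double value)
    {
        return value * 1.609344;
    }

    // A parameter called 'class' would not compile, because class is
    // a reserved keyword everywhere:
    // public static double Broken(double class) { return class; }

    static void Main()
    {
        Console.WriteLine(MilesToKilometers(100.0));
    }
}
```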
This is a very flexible system indeed. You can provide properties that provide real storage in the class to store their data, or calculated properties that use any mechanism you like to get and/or set the values concerned. This choice is an implementation detail hidden from users of our class—we can switch between one and the other without changing our class’s public face. For example, we could switch the implementation of these speed properties around so that we stored the value in kilometers per hour, and calculated the miles per hour—Example 3-10 shows how these two properties would look if the “real” value was in km/h.
Example 3-10. Swapping over the real and calculated properties
public double SpeedInMilesPerHour
{
    get
    {
        return SpeedInKilometersPerHour / 1.609344;
    }
    set
    {
        SpeedInKilometersPerHour = value * 1.609344;
    }
}

public double SpeedInKilometersPerHour { get; set; }
As far as users of the Plane class are concerned, there’s no discernible difference between the two approaches: the way in which properties work is an encapsulated implementation detail. Example 3-11 shows an updated Main function that uses the new properties. It neither knows nor cares which one is the “real” one.
Example 3-11. Using the speed properties
static void Main(string[] args)
{
    Plane someBoeing777 = new Plane("BA0049");
    someBoeing777.SpeedInMilesPerHour = 150.0;
    Console.WriteLine(
        "Your plane has identifier {0}, " +
        "and is traveling at {1:0.00}mph [{2:0.00}kph]",
        someBoeing777.Identifier,
        someBoeing777.SpeedInMilesPerHour,
        someBoeing777.SpeedInKilometersPerHour);
    someBoeing777.SpeedInKilometersPerHour = 140.0;
    Console.WriteLine(
        "Your plane has identifier {0}, " +
        "and is traveling at {1:0.00}mph [{2:0.00}kph]",
        someBoeing777.Identifier,
        someBoeing777.SpeedInMilesPerHour,
        someBoeing777.SpeedInKilometersPerHour);
    Console.ReadKey();
}
Although our public API supports two different units for speed while successfully keeping the implementation for that private, there’s something unsatisfactory about that implementation. Our conversion relies on a magic number (1.609344) that appears repeatedly. Repetition impedes readability, and is prone to typos. (I know that for a fact: I’ve typed it incorrectly once already this morning while preparing the example!) There’s an important principle in programming: don’t repeat yourself (or DRY, as it’s sometimes abbreviated). Your code should aim to express any single fact or concept no more than once, because that way, you only need to get it right once.
It would be much better to put this conversion factor in one place, give it a name, and refer to it by that instead. We can do that by declaring a field.
A field is a place to put some data of a particular type. There’s no option to add code like you can in a property; a field is nothing more than data. Back before C# 3.0 the compiler didn’t let us write just get; and set; so we always had to write properties with code as in Example 3-9, and if we wanted a simple property that stored a value, we had to provide a field, with code such as Example 3-12.
Example 3-12. Writing your own simple property
// Field to hold the SpeedInMilesPerHour property's value
double speedInMilesPerHourValue;

public double SpeedInMilesPerHour
{
    get { return speedInMilesPerHourValue; }
    set { speedInMilesPerHourValue = value; }
}
When you write just get; and set; as we did in Example 3-8, the C# compiler generates code that’s more or less identical to Example 3-12, except it gives the field a peculiar name to prevent us from accessing it directly. (These compiler-generated properties are called auto properties.) So, if we want to store a value in an object, there’s always a field involved, even if it’s a hidden one provided automatically by the compiler. Fields are the only class members that can hold information; properties are really just methods in disguise.
As you can see, a field declaration looks similar to the start of a property declaration. There’s the type (double), and a name. By convention, this name is camelCased, to make fields visibly different from properties. (Some developers like to distinguish fields further by giving them a name that starts with an underscore.)
We can modify a field’s protection level if we want, but, conventionally, we leave all fields with the default private accessibility. That’s because a field is just a place for some data, and if we make it public, we lose control over the internal state of our object. Properties always involve some code, even if it’s generated automatically by the compiler. We can use private backing fields as we wish, or calculate property values any way we like, and we’re free to modify the implementation without ever changing the public face of the class. But with a field, we have nowhere to put code, so if we decide to change our implementation by switching from a field to a calculated value, we would need to remove the field entirely. If the field was part of the public contract of the class, that could break our clients. In short, fields have no innate capacity for encapsulation, so it’s a bad idea to make them public.
Example 3-13 shows a modified version of the Plane class. Instead of repeating the magic number for our speed conversion factor, we declare a single field initialized to the required value. Not only does this mean that we get to state the conversion value just once, but we’ve also been able to give it a descriptive name. In the conversions, it’s now obvious that we’re multiplying and dividing by the number of kilometers in a mile, even if you happen not to have committed the conversion factor to memory.
Example 3-13. Storing the conversion factor in a field
class Plane
{
    // Constructor with a parameter
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier { get; private set; }

    double kilometersPerMile = 1.609344;

    public double SpeedInMilesPerHour
    {
        get
        {
            return SpeedInKilometersPerHour / kilometersPerMile;
        }
        set
        {
            SpeedInKilometersPerHour = value * kilometersPerMile;
        }
    }

    public double SpeedInKilometersPerHour { get; set; }
}
Notice how we’re able to initialize the field to a default value right where we declare it, by using the = operator. (This sort of code is called, predictably enough, a field initializer.) Alternatively, we could have initialized it inside a constructor, but if the default is a constant value, it is conventional to set it at the point of declaration.
What about the first example of a field that we saw, the one we used as the backing data for a property in Example 3-12? We didn’t explicitly initialize it. In some other languages that would be a ghastly mistake. (Failure to initialize fields correctly is a major source of bugs in C++, for example.) Fortunately, the designers of .NET decided that the small performance gain from skipping initialization wasn’t worth the pain, and kindly initialize all fields to a default value for us: numeric fields are set to zero and fields of other types get whatever the nearest equivalent of zero is. (Boolean fields are initialized to false, for example.)
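A small sketch that shows these defaults in action. The Defaults class, its field names, and the Describe method are all hypothetical, chosen only to cover a few representative types:

```csharp
using System;

class Defaults
{
    // None of these fields has an initializer
    int count;
    double speed;
    bool isApproaching;
    string identifier;

    // Reports the values the runtime gave the fields
    public string Describe()
    {
        return string.Format(
            "{0} {1} {2} {3}",
            count, speed, isApproaching, identifier == null);
    }

    static void Main()
    {
        // Numeric fields start at zero, bool at false, and a string
        // reference at null, so this prints "0 0 False True"
        Console.WriteLine(new Defaults().Describe());
    }
}
```

The compiler may warn that these fields are never assigned; that is expected here, since leaving them unassigned is the point of the example.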
There’s also a security reason for this initialization. Because a new object’s memory is always zeroed out before we get to see it, we can’t just allocate a whole load of objects and then peer at the “uninitialized” values to see if anything interesting was left behind by the last object that used the same memory.
Defining a field for our scale factor is an improvement, but we could do better. Our 1.609344 isn’t ever going to change. There are always that many kilometers per mile, not just for this instance of a Plane, but for any Plane there ever will be. Why allocate the storage for the field in every single instance? Wouldn’t it be better if we could define this value just once, and not store it in every Plane instance?
C# provides a mechanism for declaring that a field holds a
constant value, and will never, ever change. You use the const
modifier, as
Example 3-14 shows.
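A minimal sketch of such a declaration follows, assuming it differs from Example 3-13 only in the const modifier on the field. The Program class and its Main method are additions that make the sketch runnable.

```csharp
using System;

class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier { get; private set; }

    // const: the value is fixed at compile time and is not stored per instance.
    const double kilometersPerMile = 1.609344;

    public double SpeedInMilesPerHour
    {
        get { return SpeedInKilometersPerHour / kilometersPerMile; }
        set { SpeedInKilometersPerHour = value * kilometersPerMile; }
    }

    public double SpeedInKilometersPerHour { get; set; }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane("BA0048");
        plane.SpeedInMilesPerHour = 100;
        // Prints the speed converted to kilometers per hour.
        Console.WriteLine(plane.SpeedInKilometersPerHour);
    }
}
```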
The platform now takes advantage of the fact that this can never change, and allocates storage for it only once, no matter how many instances of Plane you new up. Handy.
This isn’t just a storage optimization, though. By making the field const, there’s no danger that someone might accidentally change it for some reason inside another function he’s building in the class—the C# compiler prevents you from assigning a value to a const field anywhere other than at the point of declaration.
In general, when we are developing software, we’re trying to make it as easy as possible for other developers (including our “future selves”) to do the right thing, almost by accident. You’ll often hear this approach called “designing for the pit of success.” The idea is that people will fall into doing the right things because of the choices you’ve made.
Some aspects of an object don’t fit well as either a normal modifiable field or a constant value. Take the plane’s identifier, for example—that’s fixed, in the sense that it never changes after construction, but it’s not a constant value like kilometersPerMile. Different planes have different identifiers. .NET supports this sort of information through read-only properties and fields, which aren’t quite the same as const.
In Example 3-5, we made the setter of our Plane class’s Identifier property private. This prevented users of our class from setting the property, but our class is still free to shoot itself in the foot. Suppose a careless developer added some code like that in Example 3-15, which prints out messages in the SpeedInMilesPerHour property, perhaps in order to debug some problem he was investigating.
Example 3-15. Badly written debugging code
public double SpeedInMilesPerHour
{
    get
    {
        return SpeedInKilometersPerHour / kilometersPerMile;
    }
    set
    {
        Identifier += ": speed modified to " + value;
        Console.WriteLine(Identifier);
        SpeedInKilometersPerHour = value * kilometersPerMile;
    }
}
The first time someone tries to modify a plane’s SpeedInMilesPerHour this will print out a message that includes the identifier, for example:
BA0048: speed modified to 400
Unfortunately, the developer who wrote this clearly wasn’t the sharpest tool in the box—he used the += operator to build that debug string, which will end up modifying the Identifier property. So, the plane now thinks its identifier is that whole text, including the part about the speed. And if we modified the speed again, we’d see:
BA0048: speed modified to 400: speed modified to 380
While it might be interesting to see the entire modification history, the fact that we’ve messed up the Identifier is bad. Example 3-15 was able to do this because the SpeedInMilesPerHour property is part of the Plane class, so it can still use the private setter. We can fix this (up to a point) by making the property read-only—rather than merely making the setter private, we can leave it out entirely. However, we can’t just write the code in Example 3-16.
Example 3-16. The wrong way to define a read-only property
class Plane
{
    // Wrong!
    public string Identifier { get; }
    ...
}
That won’t work in C# 4.0 because there’s no way we could ever set Identifier—not even in the constructor. In this version of the language, auto properties cannot be read-only, so we must write a getter with code. Example 3-17 will compile, although as we’re about to see, the job’s not done yet.
Example 3-17. A better, but incomplete, read-only property
class Plane
{
    public Plane(string newIdentifier)
    {
        _identifier = newIdentifier;
    }
    public string Identifier
    {
        get { return _identifier; }
    }
    private string _identifier;
    ...
}
This turns out to give us two problems. First, the original constructor from Example 3-6 would no longer compile—it set Identifier, but that’s now read-only. That was easy to fix, though—Example 3-17 just sets the explicit backing field we’ve added. More worryingly, this hasn’t solved the original problem—the developer who wrote the code in Example 3-15 has “cleverly” realized that he can “fix” his code by doing exactly the same thing as the constructor. As Example 3-18 shows, he has just used the _identifier field directly.
Example 3-18. “Clever” badly written debugging code
public double SpeedInMilesPerHour
{
    get
    {
        return SpeedInKilometersPerHour / kilometersPerMile;
    }
    set
    {
        _identifier += ": speed modified to " + value;
        Console.WriteLine(Identifier);
        SpeedInKilometersPerHour = value * kilometersPerMile;
    }
}
That seemed like a long journey for no purpose. However, we can fix this problem—we can modify the backing field itself to be read-only, as shown in Example 3-19.
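A sketch of what a read-only backing field might look like, assuming the class from Example 3-17 is changed only by adding the readonly modifier. The Rename method is hypothetical, added here to show where the compiler would object, and the Program class just makes the sketch runnable.

```csharp
using System;

class Plane
{
    public Plane(string newIdentifier)
    {
        // Allowed: a readonly field can be assigned inside a constructor.
        _identifier = newIdentifier;
    }

    public string Identifier
    {
        get { return _identifier; }
    }

    // readonly: no assignments are permitted once construction is complete.
    private readonly string _identifier;

    // Hypothetical method, not part of the chapter's Plane class.
    public void Rename(string newIdentifier)
    {
        // Uncommenting the next line causes a compile-time error,
        // because _identifier is readonly outside a constructor:
        // _identifier = newIdentifier;
    }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane("BA0048");
        plane.Rename("BA0049");               // has no effect
        Console.WriteLine(plane.Identifier);  // BA0048
    }
}
```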
That will foil the developer who wrote Example 3-15 and Example 3-18. But doesn’t it also break our constructor again? In fact, it doesn’t: read-only fields behave differently from read-only properties. A read-only property can never be modified. A read-only field can be modified, but only by a constructor.
Because read-only fields only become truly read-only after construction completes, they are perfect for properties that need to be able to differ from one instance to another, but which need to be fixed for the lifetime of an instance.
Before we move on from const and readonly fields, there’s another property our Plane needs for which const seems like it could be relevant, albeit in a slightly different way. In addition to monitoring the speed of an aircraft, we also need to know whether it is approaching or heading away from the airport.
We could represent that with a bool property called something like IsApproaching (where true would mean that it was approaching, and false would, by implication, indicate that it was heading away). That’s a bit clumsy, though. You can often end up having to negate Boolean properties—you might need to write this sort of thing:
if (!plane.IsApproaching) { ... }
That reads as “if not plane is approaching” which sounds a bit awkward. We could go with:
if (somePlane.IsApproaching == false) { ... }
That’s “if is approaching is false” which isn’t much better. We could offer a second, calculated property called IsNotApproaching, but our code is likely to be simpler and easier to read (and therefore likely to contain fewer bugs) if, instead of using bool, we have a Direction property whose value could somehow be either Approaching or Leaving.
We’ve just seen a technique we could use for that. We could create two constant fields of any type we like (int, for example), and a property of type int called Direction (see Example 3-20).
Example 3-20. Named options with const int
class Plane
{
    public const int Approaching = 0;
    public const int Leaving = 1;
    // ...
    public int Direction { get; set; }
}
This lets us write code that reads a bit more naturally than it would if we had used just true and false:
someBoeing777.Direction = Plane.Approaching;
if (someAirbusA380.Direction == Plane.Leaving)
{
    /* Do something */
}
But there’s one problem: if our Direction property’s type is int, there’s nothing to stop us from saying something like:
someBoeing777.Direction = 72;
This makes no sense, but the C# compiler doesn’t know that—after all, we told it the property’s type was int, so how’s it supposed to know that’s wrong? Fortunately, the designers of C# have thought of this, and have given us a kind of type for precisely this situation, called an enum, and it turns out to be a much better solution for this than const int.
The enum[10] keyword lets us define a type whose values can be one of a fixed set of possibilities. Example 3-21 declares an enum for our Direction property. You can add this to an existing source file, above or below the Plane class, for example. Alternatively, you could add a whole new source file to the project, although Visual Studio doesn’t offer a file template for enum types, so either you’d have to add a new class and then change the class keyword to enum, or you could use the Code File template to add a new, empty source file.
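A declaration along these lines might look like the following sketch. The type name DirectionOfApproach and its two members are taken from the surrounding text; the Program class is an addition that makes the sketch runnable.

```csharp
using System;

// No protection level is specified, so the type is internal by default.
enum DirectionOfApproach
{
    Approaching,
    Leaving
}

class Program
{
    static void Main()
    {
        DirectionOfApproach direction = DirectionOfApproach.Leaving;
        Console.WriteLine(direction);  // Leaving

        if (direction == DirectionOfApproach.Leaving)
        {
            Console.WriteLine("Heading away from the airport");
        }
    }
}
```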
This is similar in some respects to a class declaration. We can optionally begin with a protection level but if, like Example 3-21, we omit that, we get internal protection by default. Then there’s the enum specifier itself, followed by the name of the type, which by convention we PascalCase. Inside the braces, we declare the members, again using PascalCasing. Notice that we use commas to separate the list of constants—this is where the syntax starts to part company with class. Unusually, the members are publicly accessible by default. That’s because an enum has no behavior, and so there are no implementation details—it’s just a list of named values, and those need to be public for the type to serve any useful purpose.
Notice that we’ve chosen to call this DirectionOfApproach, and not the plural DirectionsOfApproach. By convention, we give enum types a singular name even though they usually contain a list. This makes sense because when you use named entries from an enumeration, you use them one at a time, and so it would look odd if the type name were plural. Obviously, there won’t be any technical consequences for breaking this convention, but following it helps make your code consistent with the .NET Framework class libraries.
We can now declare our Direction property, using the enumeration instead of an integer. Example 3-22 shows the property to add to the Plane class.
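As a rough sketch, the property might be declared as an auto property of the enum type. The Plane class here is stripped down to that one member, which is an assumption made to keep the sketch short, and the Program class only exercises it.

```csharp
using System;

enum DirectionOfApproach
{
    Approaching,
    Leaving
}

// Stripped-down Plane containing only the new property.
class Plane
{
    public DirectionOfApproach Direction { get; set; }
}

class Program
{
    static void Main()
    {
        Plane someBoeing777 = new Plane();
        someBoeing777.Direction = DirectionOfApproach.Leaving;
        Console.WriteLine(someBoeing777.Direction);  // Leaving

        // The compiler rejects the nonsensical assignment from earlier,
        // because there is no implicit conversion from 72 to the enum:
        // someBoeing777.Direction = 72;
    }
}
```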
There are some optional features we can use in an enum declaration. Example 3-23 uses these, and they provide some insight into how enum types work.
Example 3-23. Explicit type and values for enum
enum DirectionOfApproach : int
{
    Approaching = 0,
    Leaving = 1
}
In this declaration, we have explicitly specified the governing type for the enumeration. This is the type that stores the individual values for an enumeration, and we specify it with a colon and the type name. By default, it uses an int (exactly as we did in our original const-based implementation of this property), so we’ve not actually changed anything here; we’re just being more explicit. If you choose a different governing type, it must be one of the other built-in integer types: byte, sbyte, short, ushort, uint, long, or ulong.
Example 3-23 also specifies the numbers to use for each named value. As it happens, if you don’t provide these numbers, the first member is assigned the value 0, and we count off sequentially after that, so again, this example hasn’t changed anything, it’s just showing the values explicitly.
We could, if we wanted, specify any value for any particular member. Maybe we start from 10 and go up in powers of 2. And we’re also free to define duplicates, giving the same value several different names. (That might not be useful, but C# won’t stop you.)
We normally leave all these explicit specifiers off, and accept the defaults. However, the sidebar on the next page describes a scenario in which you would need to control the numbers.
If you don’t specify explicit values, the first item in your list is effectively the default value for the enum (because it corresponds to the zero value). If you provide explicit values, be sure to define a value that corresponds to zero—if you don’t, fields using your type will default to a value that’s not a valid member of the enum, which is not desirable.
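To see why a missing zero member is a problem, consider this sketch. The Runway enum and Tower class are hypothetical and exist only for this illustration; Enum.IsDefined is the framework method that reports whether a value corresponds to a named member.

```csharp
using System;

// Hypothetical enum with explicit values and no member equal to zero.
enum Runway
{
    North = 1,
    South = 2
}

// Hypothetical class with a field that is never assigned.
class Tower
{
    public Runway Active;
}

class Program
{
    static void Main()
    {
        Tower tower = new Tower();

        // The field holds the zero value, which has no name in Runway,
        // so the number itself is printed.
        Console.WriteLine(tower.Active);  // 0

        Console.WriteLine(Enum.IsDefined(typeof(Runway), tower.Active));  // False
    }
}
```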
We can now access the enumeration property like this:
someBoeing777.Direction = DirectionOfApproach.Approaching;
We’ve clearly made some progress with our Plane class, but we’re not done yet. We have a read-only property for its Identifier. We can store the speed, which we can get and set using two different properties representing different units, using a const field for the conversion factor. And we know the direction, which will be either the Approaching or the Leaving member of an enum.
We still need to store the aircraft’s position. According to the specification, we’ve got two polar coordinates (an angle and a distance) for its position on the ground, and another value for its height above sea level.
We’re likely to need to do a lot of calculations based on this position information. Every time we want to create a function to do that, we’d need three parameters per point, which seems overly complex. (And error-prone—it’d be all too easy to inadvertently pass two numbers from one position, and a third number from a different position.) It would be nicer if we could wrap the numbers up into a single, lightweight, “3D point” type that we can think of in the same kind of way we do int or double—a basic building block for other classes to use with minimum overhead.
So far, we’ve been building a class. When creating an instance of the class, we stored it in a named variable, as Example 3-24 shows.
Example 3-24. Storing a reference in a variable
Plane someBoeing777 = new Plane("BA0049");
someBoeing777.Direction = DirectionOfApproach.Approaching;
We can define another variable with a different name, and store a reference to the same plane in that new variable, as shown in Example 3-25.
Example 3-25. Copying a reference from one variable to another
Plane theSameBoeing777ByAnotherName = someBoeing777;
If we change a property through one variable, that change will be visible through the other. Example 3-26 modifies our plane’s Direction property through the second variable, but then reads it through the first variable, verifying that they really are referring to the same object.
Example 3-26. Using one object through two variables
theSameBoeing777ByAnotherName.Direction = DirectionOfApproach.Leaving;
if (someBoeing777.Direction == DirectionOfApproach.Leaving)
{
    Console.WriteLine("Oh, they are the same!");
}
As Shakespeare might have said, if only he’d found his true vocation as a C# developer:
That which we call someBoeing777
By any other name would smell as sweet.
Assuming you like the smell of jet fuel.
When we define a type using class, we always get this behavior—our variables behave as references to an underlying object. We therefore call a type defined as a class a reference type.
It’s possible for a reference type variable to be in a state where it isn’t referring to any object at all. C# has a special keyword, null, to represent this. You can set a variable to null, or you can pass null as an argument to a method. And you can also test to see if a field, variable, or argument is equal to null in an if statement. Any field whose type is a reference type will automatically be initialized to null before the constructor runs, in much the same way as numeric fields are initialized to zero.
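The following sketch shows each of those uses of null together. The Describe method is hypothetical, and the Plane class is cut down to its identifier to keep things short.

```csharp
using System;

class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    public string Identifier { get; private set; }
}

class Program
{
    // Hypothetical helper that copes with a missing plane.
    static string Describe(Plane plane)
    {
        if (plane == null)
        {
            return "No plane";
        }
        return plane.Identifier;
    }

    static void Main()
    {
        Plane somePlane = new Plane("BA0049");
        Console.WriteLine(Describe(somePlane));  // BA0049

        somePlane = null;                        // no longer refers to any object
        Console.WriteLine(Describe(somePlane));  // No plane
        Console.WriteLine(Describe(null));       // No plane
    }
}
```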
The enum we declared earlier and the built-in numeric types (int, double) behave differently, though, as Example 3-27 illustrates.
Example 3-27. Copying values, not references
int firstInt = 3;
int secondInt = firstInt;
secondInt = 4;
if (firstInt != 4)
{
    Console.WriteLine("Well. They're not the same at all.");
}
When we assign firstInt to secondInt, we are copying the value. In this case, the variables hold the actual value, not a reference to a value. We call types that behave this way value types.
People often refer to reference types as being allocated “on the heap” and value types “on the stack.” C++ programmers will be familiar with these concepts, and C++ provided one syntax in the language to explicitly create items on the stack (a cheap form of storage local to a particular scope), and a different syntax for working on the heap (a slightly more expensive but sophisticated form of storage that could persist beyond the current scope). C# doesn’t make that distinction in its syntax, because the .NET Framework itself makes no such distinction. These aspects of memory management are completely opaque to the developer, and it is actively wrong to think of value types as being always allocated on a stack.
For people familiar with C++ this can take a while to get used to, especially as the myth is perpetuated on the Web, in the MSDN documentation and elsewhere. (For example, at the time of this writing, http://msdn.microsoft.com/library/aa288471 states that structs are created on the stack, and while that happens to be true of the ones in that example when running against the current version of .NET, it would have been helpful if the page had mentioned that it’s not always true. For example, if a class has a field of value type, that field doesn’t live on the stack—it lives inside the object, and in all the versions of .NET released so far, objects live on the heap.)
The important difference for the C# developer between these two kinds of types is the one of reference versus copy semantics.
As well as understanding the difference in behavior, you also need to be aware of some constraints. To be useful, a value type should be:
Immutable
Lightweight
Something is immutable if it doesn’t change over time. So, the integer 3 is immutable. It doesn’t have any internal workings that can change its “three-ness”. You can replace the value of an int variable that currently contains a 3, by copying a 4 into it, but you can’t change a 3 itself. (Unlike, say, a particular Plane object, which has a Direction property that you can change anytime you like without needing to replace the whole Plane.)
There’s nothing in C# that stops you from creating a mutable value type. It is just a bad idea (in general). If your type is mutable, it is probably safer to make it a reference type, by declaring it as a class. Mutable value types cause problems because of the copy semantics—if you modify a value, it’s all too easy to end up modifying the wrong one, because there may be many copies.
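Here is a small sketch of how that goes wrong. MutablePoint is a hypothetical mutable value type, declared with the struct keyword that the chapter introduces shortly, and MoveRight is a hypothetical method that looks as though it should move the caller's point.

```csharp
using System;

// Hypothetical mutable value type, shown only to illustrate the pitfall.
struct MutablePoint
{
    public double X;
    public double Y;
}

class Program
{
    static void MoveRight(MutablePoint point)
    {
        point.X += 10;  // modifies the method's own copy, not the caller's
    }

    static void Main()
    {
        MutablePoint original = new MutablePoint();
        original.X = 1;

        MoveRight(original);
        Console.WriteLine(original.X);  // 1

        MutablePoint copy = original;   // assignment makes another copy
        copy.X = 99;
        Console.WriteLine(original.X);  // 1
    }
}
```

Had MutablePoint been declared as a class, both changes would have been visible through original.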
It should be fairly apparent that a value type also needs to be pretty lightweight, because of all that copying going on. Every time you pass it into a function, or assign it to a variable, a copy is made. And copies are generally the enemy of good performance. If your value type consists of more than two or three of the built-in types, it may be getting too big.
These constraints mean it is very rare that you will actually want to declare a value type yourself. A lot of the obviously useful ones you might want are already defined in the .NET Framework class libraries (things like 2D points, times, and dates). Custom value types are so rare that it was hard to come up with a useful example for this book that wasn’t already provided in the class libraries. (If you were wondering why our example application represents aircraft positions in such an idiosyncratic fashion, this is the reason.)
But that doesn’t mean you should never, ever declare a value type. Value types can have performance benefits when used in arrays (although as with most performance issues, this is not entirely clear-cut), and the immutability and copy semantics can make them safer when passing them in to functions—you won’t normally introduce side effects by working with a value type because you end up using a copy, rather than modifying shared data that other code might be relying on.
Our polar 3D point seems to comply with the requirements. Any given point is just that: a specific point in 3D space—a good candidate for immutability. (We might want to move a plane to a different point, but we can’t change what a particular point means.) It is also no more than three doubles in size, which is small enough for copy semantics. Example 3-28 shows our declaration of this type, which we can add to our project. (As with enum, Visual Studio doesn’t offer a template for value types. Again, we can use the Class template, replacing the class with the code we want.)
Example 3-28. A value type
struct PolarPoint3D
{
    public PolarPoint3D(double distance, double angle, double altitude)
    {
        Distance = distance;
        Angle = angle;
        Altitude = altitude;
    }
    public double Distance { get; private set; }
    public double Angle { get; private set; }
    public double Altitude { get; private set; }
}
If you think that it looks just like a class declaration, but using the struct keyword instead of class, you’d be right—these two kinds of types are very similar. However, if we try to compile it, we get an error on the first line of the constructor:
The 'this' object cannot be used before all of its fields are assigned to
So, although the basic syntax of a struct looks just like a class, there are important differences. Remember that when you allocate an instance of a particular type, it is always initialized to some default value. With classes, all fields are initialized to zero (or the nearest equivalent value). But things work slightly differently with value types—we need to do slightly more work.
Anytime we write a struct, C# automatically generates a default, parameterless constructor that initializes all of our storage to zero, so if we don’t want to write any custom constructors, we won’t have any problems. (Unlike with a class, we aren’t allowed to replace the default constructor. We can define extra constructors, but the default constructor is always present and we’re not allowed to write our own—see the sidebar on the next page for details.)
Example 3-28 has hit trouble because we’re trying to provide an additional constructor, which initializes the properties to particular values. If we write a constructor in a struct, the compiler refuses to let us invoke any methods until we’ve initialized all the fields. (It doesn’t do the normal zero initialization for custom constructors.) This restriction turns out to include properties, because get and set accessors are methods under the covers. So C# won’t let us use our properties until the underlying fields have been initialized, and we can’t do that because these are auto properties—the C# compiler has generated hidden fields that we can only access through the properties. This is a bit of a chicken-and-egg bootstrapping problem!
Fortunately, C# gives us a way of calling one of our constructors from another. We can use this to call the default constructor to do the initialization; then our constructor can set the properties to whatever values it wishes. We call the constructor using the this keyword, and the standard function calling syntax with any arguments enclosed in parentheses. As Example 3-29 shows, we can invoke the default constructor with an empty argument list.
Example 3-29. Calling one constructor from another
public PolarPoint3D(double distance, double angle, double altitude)
    : this()
{
    Distance = distance;
    Angle = angle;
    Altitude = altitude;
}
You add the call just before the opening brace for the body of the constructor, and prefix it with a colon. We can also use this technique to avoid writing common initialization code multiple times. Say we wanted to provide another utility constructor that just took the polar coordinates, and initialized the altitude to zero by default. Instead of repeating all the code from the first constructor, we could just add this extra constructor to our definition for PolarPoint3D, as shown in Example 3-30.
Example 3-30. Sharing common initialization code
public PolarPoint3D(double distance, double angle)
    : this(distance, angle, 0)
{
}
public PolarPoint3D(
    double distance,
    double angle,
    double altitude)
    : this()
{
    Distance = distance;
    Angle = angle;
    Altitude = altitude;
}
Incidentally, this syntax for calling one constructor from another works equally well in classes, and is a great way of avoiding code duplication.
You should be careful of adding too many constructors to a class or struct. It is easy to lose track of which parameters are which, or to make arbitrary choices about which constructors you provide and which you don’t.
For example, let’s say we wanted to add yet another constructor to PolarPoint3D that lets callers pass just the angle and altitude, initializing the distance to a default of zero, as Example 3-31 shows.
Example 3-31. A constructor too far
public PolarPoint3D(
    double altitude,
    double angle)
    : this(0, angle, altitude)
{
}
Even before we compile, we can see that there’s a problem—we happen to have added the altitude parameter so that it is the first in the list, and angle stays second. In our main constructor, the altitude comes after the angle. Because they are both just doubles, there’s nothing to stop you from accidentally passing the parameters “the wrong way round.” This is exactly the kind of thing that surprises users of your class, and leads to hard-to-find bugs. But while inconsistent parameter ordering is bad design, it’s not a showstopper.
However, when we compile, things get even worse. We get another error:
Type 'PolarPoint3D' already defines a member called 'PolarPoint3D' with the same parameter types
When we define more than one member in a type with the same name (be it a constructor or, as we’ll see later, a method) we call this overloading.
Initially, we created two constructors (two overloads of the constructor) for PolarPoint3D, and they compiled just fine. This is because they took different sets of parameters. One took three doubles, the other two. In fact, there was also the third, hidden constructor that took no parameters at all. All three constructors took different numbers of parameters, meaning there’s no ambiguity about which constructor we want when we initialize a new PolarPoint3D.
The constructor in Example 3-31 seems different: the two doubles have different names. Unfortunately, this doesn’t matter to the C# compiler—it only looks at the types of the parameters, and the order in which they are declared. It does not use names for disambiguation. This should hardly be surprising, because we’re not required to provide argument names when we call methods or constructors. If we add the overload in Example 3-31, it’s not clear what new PolarPoint3D(0, 0) would mean, and that’s why we get an error—we’ve got two members with the same name (PolarPoint3D—the constructor), and exactly the same parameter types, in the same order.
Looking at overloaded functions will emphasize that it really is only the method name and the parameters that matter—a function’s return type is not considered to be a disambiguating aspect of the member for overload purposes.
That means there’s nothing we can do about it: we’re going to have to get rid of this third constructor (just delete it); and while we’re in the code, we’ll finish up the declaration of the data portion of our Plane by adding a property for its position, shown in Example 3-32.
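A sketch of how such a property might be declared follows. The property name Position is an assumption, the Plane class is reduced to that single member, and PolarPoint3D is repeated from Example 3-29 so that the sketch compiles on its own.

```csharp
using System;

struct PolarPoint3D
{
    public PolarPoint3D(double distance, double angle, double altitude)
        : this()
    {
        Distance = distance;
        Angle = angle;
        Altitude = altitude;
    }

    public double Distance { get; private set; }
    public double Angle { get; private set; }
    public double Altitude { get; private set; }
}

// Stripped-down Plane containing only the position property.
class Plane
{
    // Assumed name for the position property.
    public PolarPoint3D Position { get; set; }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane();

        // Moving the plane means replacing the whole point value.
        plane.Position = new PolarPoint3D(20, 0.5, 3000);
        Console.WriteLine(plane.Position.Altitude);  // 3000
    }
}
```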
Just as with constructors, we can provide more than one method with the same name, but a different list of parameter types. It is, in general, a bad idea to provide two overloads with the same name if they perform a semantically different operation (again—that’s the kind of thing that surprises developers using your class), so the most common reason for overloading is to provide several different ways to do something. We can provide users of our code with flexible methods that take lots of arguments to control different aspects of the code, and we can also provide developers that don’t need this flexibility with simpler options by providing overloads that don’t need as many arguments.
Suppose we added a method to our Plane class enabling messages to be sent to aircraft. Perhaps in our first attempt we define a method whose signature looks like this:
public void SendMessage(string messageText)
But suppose that as the project progresses, we find that it would be useful to be able to delay transmission of certain messages. We could modify the SendMessage method so that it accepts an extra argument. There’s a handy type in the framework called TimeSpan which lets us specify duration. We could modify the method to make use of it:
public void SendMessage(string messageText, TimeSpan delay)
Alas! If we already had code in our project depending on the original signature, we’d start to see this compiler error:
No overload for method 'SendMessage' takes '1' arguments
We’ve changed the signature of that method, so all our clients are sadly broken. They need to be rewritten to use the new method. That’s not great.
A better alternative is to provide both signatures—keep the old single-parameter contract around, but add an overload with the extra argument. And to ensure that the overloads behave consistently (and to avoid duplicating code) we can make the simpler method call the new method as its actual implementation. The old method was just the equivalent of calling the new method with a delay of zero, so we could replace it with the method shown in Example 3-33. This lets us provide the newly enhanced SendMessage, while continuing to support the old, simpler version.
Example 3-33. Implementing one overload in terms of another
public void SendMessage(string messageText)
{
    SendMessage(messageText, TimeSpan.Zero);
}
(TimeSpan.Zero is a static field that holds a duration of zero.)
Until C# 4.0 that’s as far as we could go. However, the C# designers noticed that a lot of member overloads were just like this one: facades over an über-implementation, with a bunch of parameters defaulted out to particular values. So they decided to make it easier for us to support multiple variations on the same method. Rather than writing lots of overloads, we can now just specify default values for a method’s arguments, which saves us typing a lot of boilerplate, and helps make our default choices more transparent.
Let’s take out the single-parameter method overload we just added, and instead change the declaration of our multiparameter implementation, as shown in Example 3-34.
Example 3-34. Parameter with default value
public void SendMessage(
    string messageText,
    TimeSpan delay = default(TimeSpan))
Even though we’ve only got one method, which supports two arguments, code that tries to call it with a single argument will still work. That’s because default values can fill in for missing arguments. (If we tried to call SendMessage with no arguments at all, we’d get a compiler error, because there’s no default for the first argument here.)
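The sketch below calls a single SendMessage method both ways. The method body is hypothetical (the chapter shows only the signature), and the message text is made up for the example.

```csharp
using System;

class Plane
{
    // One method serves both call styles: delay falls back to
    // default(TimeSpan), which is a duration of zero.
    public void SendMessage(
        string messageText,
        TimeSpan delay = default(TimeSpan))
    {
        // Hypothetical body: just report what would be sent.
        Console.WriteLine("Sending '{0}' after {1}", messageText, delay);
    }
}

class Program
{
    static void Main()
    {
        Plane plane = new Plane();

        // Sending 'Descend' after 00:00:00
        plane.SendMessage("Descend");

        // Sending 'Descend' after 00:00:30
        plane.SendMessage("Descend", TimeSpan.FromSeconds(30));
    }
}
```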
But it doesn’t end there. Say we had a method with four parameters, like this one:
public void MyMethod(
    int firstOne,
    double secondInLine = 3.1416,
    string thirdHere = "The third parameter",
    TimeSpan lastButNotLeast = default(TimeSpan))
{
    // ...
}
If we want to call it and specify the first parameter (which we have to, because it has no default), and the third, but not the second or the fourth, we can do so by using the names of the parameters, like this:
MyMethod(127, thirdHere: "New third parameter");
With just one method, we now have many different ways to call it—we can provide all the arguments, or just the first and second, or perhaps the first, second, and third. There are many combinations. Before named arguments and defaults were added in C# 4.0, the only way to get this kind of flexibility was to write an overload for each distinct combination.
This is not just limited to normal methods—you can use this same syntax to provide default values for parameters in your constructors, if you wish.
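As a sketch of how that might look for PolarPoint3D, the constructor below gives altitude a default of zero, which is one possible alternative to the two-parameter overload in Example 3-30 rather than the chapter's own listing. The named-argument call also shows one way callers can avoid the parameter ordering mistakes discussed earlier.

```csharp
using System;

struct PolarPoint3D
{
    // altitude is optional and defaults to zero when omitted.
    public PolarPoint3D(double distance, double angle, double altitude = 0)
        : this()
    {
        Distance = distance;
        Angle = angle;
        Altitude = altitude;
    }

    public double Distance { get; private set; }
    public double Angle { get; private set; }
    public double Altitude { get; private set; }
}

class Program
{
    static void Main()
    {
        // Altitude omitted, so the default applies.
        PolarPoint3D onGround = new PolarPoint3D(10, 0.5);
        Console.WriteLine(onGround.Altitude);  // 0

        // Named arguments can be supplied in any order.
        PolarPoint3D inFlight =
            new PolarPoint3D(angle: 0.5, distance: 10, altitude: 3000);
        Console.WriteLine(inFlight.Distance);  // 10
        Console.WriteLine(inFlight.Altitude);  // 3000
    }
}
```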
Being forced to delete the extra constructor we tried to add back in Example 3-31 was a little disappointing—we’re constraining the number of ways users of our type can initialize it. Named arguments and default values have helped, but can we do more?
Until C# 3.0, the only real solution to this was to write one or more factory methods. These are described in the sidebar below. But now we have another option.
With C# 3.0 the language was extended to support object initializers—an extension to the new syntax that lets us set up a load of properties, by name, as we create our object instance. Example 3-35 shows how an object initializer looks when we use it in our Main function.
Example 3-35. Using object initializers
static void Main(string[] args)
{
    Plane someBoeing777 = new Plane("BA0049")
    {
        Direction = DirectionOfApproach.Approaching,
        SpeedInMilesPerHour = 150
    };

    Console.WriteLine(
        "Your plane has identifier {0}," +
        " and is traveling at {1:0.00}mph [{2:0.00}kph]",
        // Use the property getter
        someBoeing777.Identifier,
        someBoeing777.SpeedInMilesPerHour,
        someBoeing777.SpeedInKilometersPerHour);

    someBoeing777.SpeedInKilometersPerHour = 140.0;

    Console.WriteLine(
        "Your plane has identifier {0}," +
        " and is traveling at {1:0.00}mph [{2:0.00}kph]",
        // Use the property getter
        someBoeing777.Identifier,
        someBoeing777.SpeedInMilesPerHour,
        someBoeing777.SpeedInKilometersPerHour);

    Console.ReadKey();
}
Object initializers are mostly just a convenient syntax for constructing a new object and then setting some properties. Consequently, this only works with writable properties; you can’t use it for immutable types,[11] so this wouldn’t work with our PolarPoint3D.
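The "construct, then set" equivalence can be seen in a small sketch. The Plane class here is a cut-down, hypothetical stand-in with only two properties, and it assumes the Identifier setter is private, which is one way of making the property read-only to outside code.

```csharp
using System;

// A cut-down, hypothetical stand-in for the chapter's Plane class.
class Plane
{
    public Plane(string newIdentifier)
    {
        Identifier = newIdentifier;
    }

    // Assumed to have a private setter, so callers cannot assign it.
    public string Identifier { get; private set; }

    public double SpeedInMilesPerHour { get; set; }
}

static class Program
{
    static void Main()
    {
        // Object initializer syntax...
        Plane a = new Plane("BA0049") { SpeedInMilesPerHour = 150 };

        // ...is roughly equivalent to constructing and then assigning.
        Plane b = new Plane("BA0049");
        b.SpeedInMilesPerHour = 150;

        Console.WriteLine(a.SpeedInMilesPerHour == b.SpeedInMilesPerHour);

        // This would not compile, because the setter is not accessible here:
        // Plane c = new Plane("BA0049") { Identifier = "BA0050" };
    }
}
```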
We still use the constructor parameter for the read-only Identifier property; but then we add an extra section in braces, between the closing parenthesis and the semicolon, in which we have a list of property assignments, separated by commas. What’s particularly interesting is that the purpose of the constructor parameter is normally identifiable only by the value we happen to assign to it, but the object initializer is “self-documenting”: we can easily see what is being initialized to which values, at a glance.
The job isn’t quite done yet, though. While there’s nothing technically wrong with using both the constructor parameter and the object initializer, it does look a little bit clumsy. It might be easier for our clients if we allow them to use a default, parameterless constructor, and then initialize all the members using this new syntax. As we’ll see in Chapter 6, we have other ways of enforcing invariants in the object state, and dealing with incorrect usages. Object initializers are certainly a more expressive syntax, and on the basis that self-documenting and transparent is better, we’re going to change how Plane works so that we can initialize the whole object with an object initializer.
As with any design consideration, there is a counter argument. Some classes may be downright difficult to put into a “default” (zero-ish) state that isn’t actively dangerous. We’re also increasing the size of the public API by the changes we’re making—we’re adding a public setter. Here, we’ve decided that the benefits outweigh the disadvantages in this particular case (although it’s really a judgment call; no doubt some developers would disagree).
First, as Example 3-36 shows, we’ll delete the special constructor from Plane, and then make Identifier an ordinary read/write property. We can also remove the _identifier backing field we added earlier, because we’ve gone back to using an auto property.
Example 3-36. Modifying Plane to work better with object initializers
class Plane
{
    // Remove the constructor that we no longer require
    // public Plane(string newIdentifier)
    // {
    //     Identifier = newIdentifier;
    // }

    public string Identifier
    {
        get;
        // remove the access modifier
        // to make it public
        set;
    }

    // ...
}
We can now use the object initializer syntax for all the properties we want to set. As Example 3-37 shows, this makes our code look somewhat neater—we only need one style of code to initialize the object.
Example 3-37. Nothing but object initializer syntax
Plane someBoeing777 = new Plane
{
    Identifier = "BA0049",
    Direction = DirectionOfApproach.Approaching,
    SpeedInMilesPerHour = 150
};
Object initializer syntax provides one big advantage over offering lots of specialized constructors: people using your class can provide any combination of properties they want. They might decide to set the Position property inline in this object initializer too, as Example 3-38 does. If we’d been relying on constructors, default or named arguments wouldn’t have helped if there was no constructor available that accepted a Position. We’ve not had to provide an additional constructor overload to make this possible, so developers using our class have a great deal of flexibility. Of course, this approach only makes sense if our type is able to work sensibly with default values for the properties in question. If you absolutely need certain values to be provided on initialization, you’re better off with constructors.
Example 3-38. Providing an extra property
Plane someBoeing777 = new Plane
{
    Identifier = "BA0049",
    Direction = DirectionOfApproach.Approaching,
    SpeedInMilesPerHour = 150,
    Position = new PolarPoint3D(20, 180, 14500)
};
So, we’ve addressed the data part of our Plane; but the whole point of a class is that it can encapsulate both state and operations. What methods are we going to define in our class?
When deciding what methods a class might need, we generally scan our specifications or scenarios for verbs that relate to the object of that class. If we look back at the ATC system description at the beginning of this chapter, we can see several plane-related actions, to do with granting permissions to land and permissions to take off. But do we need functions on the Plane class to deal with that? Possibly not. It might be better to deal with that in another part of the model, to do with our ground control, runways, and runway management (that, you’ll be pleased to hear, we won’t be building).
But we will periodically need to update the position of all the planes. This involves changing the state of the plane: we will need to modify its Position. And it’s a change of state whose details depend on the existing state, because we need to take the direction and speed into account. This sounds like a good candidate for a method that the Plane class should offer. Example 3-39 shows the code to add inside the class.
Example 3-39. A method
public void UpdatePosition(double minutesToAdvance)
{
    double hours = minutesToAdvance / 60.0;
    double milesMoved = SpeedInMilesPerHour * hours;
    double milesToTower = Position.Distance;
    if (Direction == DirectionOfApproach.Approaching)
    {
        milesToTower -= milesMoved;
        if (milesToTower < 0)
        {
            // We've arrived!
            milesToTower = 0;
        }
    }
    else
    {
        milesToTower += milesMoved;
    }
    PolarPoint3D newPosition = new PolarPoint3D(
        milesToTower, Position.Angle, Position.Altitude);
    // Store the result; without this the plane's state never changes
    Position = newPosition;
}
This method takes a single argument, indicating how much elapsed time the calculation should take into account. It looks at the speed, the direction, and the current position, and uses this information to calculate the new position.
This code illustrates that our design is some way from being finished. We never change the altitude, which suggests that our planes are going to have a hard time reaching the ground. (Although since this code makes them stop moving when they get directly above the tower, they’ll probably reach the ground soon enough...) Apparently our initial specification did not fully and accurately describe the problem our software should be solving. This will not come as astonishing news to anyone who has worked in the software industry. Clearly we need to talk to the client to get clarification, but let’s implement what we can for now.
Notice that our code is able to use all of the properties (SpeedInMilesPerHour, Direction, and so on) without needing to qualify them with a variable. Whereas in Example 3-35 we had to write someBoeing777.SpeedInMilesPerHour, here we just write SpeedInMilesPerHour. Methods are meant to access and modify an object’s state, and so you can refer directly to any member of the method’s containing class.
There’s one snag with that. It can mean that for someone reading the code, it’s not always instantly obvious when the code uses a local variable or argument, and when it uses some member of the class. Our properties use PascalCasing, while we’re using camelCasing for arguments and variables, which helps, but what if we wanted to access a field? Those conventionally use camelCasing too. That’s why some developers put an underscore in front of their field names: it makes it more obvious when we’re doing something with the object’s state. But there’s an alternative, a more explicit style, shown in Example 3-40.
Example 3-40. Explicit member access
public void UpdatePosition(double minutesToAdvance)
{
    double hours = minutesToAdvance / 60;
    double milesMoved = this.SpeedInMilesPerHour * hours;
    double milesToTower = this.Position.Distance;
    if (this.Direction == DirectionOfApproach.Approaching)
    {
        milesToTower -= milesMoved;
        if (milesToTower < 0)
        {
            // We've arrived!
            milesToTower = 0;
        }
    }
    else
    {
        milesToTower += milesMoved;
    }
    PolarPoint3D newPosition = new PolarPoint3D(
        milesToTower, this.Position.Angle, this.Position.Altitude);
    // Store the result; without this the plane's state never changes
    this.Position = newPosition;
}
This is almost the same as Example 3-39, except every member access goes through a variable called this. But we’ve not defined any such variable, so where did that come from?
The UpdatePosition method effectively has an implied extra argument called this, and it’s the object on which the method has been invoked. So, if our Main method were to call someBoeing777.UpdatePosition(10), the this variable would refer to whatever object the Main method’s someBoeing777 variable referred to.
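A small sketch may help show which object this refers to on each call. The Counter class is hypothetical and unrelated to the Plane example; it deliberately gives a parameter the same name as a field, which is the one situation where the this qualifier is required rather than optional.

```csharp
using System;

// Hypothetical minimal class, used only to illustrate 'this'.
class Counter
{
    private int count;

    public void Add(int count)
    {
        // The parameter hides the field of the same name, so 'this.'
        // is needed to reach the field.
        this.count += count;
    }

    public int Count
    {
        get { return this.count; }
    }
}

static class Program
{
    static void Main()
    {
        Counter first = new Counter();
        Counter second = new Counter();

        first.Add(3);    // inside Add, 'this' is the object 'first' refers to
        second.Add(10);  // inside Add, 'this' is the object 'second' refers to

        Console.WriteLine("{0} {1}", first.Count, second.Count); // prints "3 10"
    }
}
```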
Methods get a this argument by default, but they can opt out, because sometimes it makes sense to write methods that don’t apply to any particular object. The Main method of our Program class is one example: it has no this argument, because the .NET Framework doesn’t presume to create an object; it just calls the method and lets us decide what objects, if any, to create. You can tell a method has no this argument because it will be marked with the static keyword; you may recall from Chapter 2 that this means the method can be run without needing an instance of its defining type.
Aside from our Main method, why might we not want a method to be associated with a particular instance? Well, one case comes to mind for our example application. There’s a rather important feature of airspace management that we’re likely to need to cope with: ensuring that we don’t let two planes hit each other. So, another method likely to be useful is one that allows us to check whether one plane is too close to another one, within some margin of error (say, 5,000 feet). And this method isn’t associated with any single plane: it always involves two planes.
Now we could define a method on Plane that accepted another Plane as an argument, but that’s a slightly misleading design. It has a lack of symmetry which suggests that the planes play different roles, because you’re invoking the method on one while passing in the other as an argument. So it would make more sense to define a static method, one not directly associated with any single plane, and to have that take two Plane objects.
We’ll add the method shown in Example 3-41 to the Plane class. Because it is marked static, it’s not associated with a single Plane, and will have no implicit this argument. Instead, we pass in both of the Plane objects we want to look at as explicit arguments, to emphasize the fact that neither of the objects is in any way more significant than the other in this calculation.
Example 3-41. Detecting when Planes are too close
public static bool TooClose(Plane first, Plane second, double minimumMiles)
{
    // Convert each polar position to Cartesian coordinates, in miles.
    // Math.Cos and Math.Sin expect radians, so this assumes that
    // Position.Angle is expressed in radians.
    double x1 = first.Position.Distance * Math.Cos(first.Position.Angle);
    double x2 = second.Position.Distance * Math.Cos(second.Position.Angle);
    double y1 = first.Position.Distance * Math.Sin(first.Position.Angle);
    double y2 = second.Position.Distance * Math.Sin(second.Position.Angle);
    double z1 = first.Position.Altitude / feetPerMile;
    double z2 = second.Position.Altitude / feetPerMile;

    double dx = x1 - x2;
    double dy = y1 - y2;
    double dz = z1 - z2;

    // Compare squared distances to avoid taking a square root
    double distanceSquared = dx * dx + dy * dy + dz * dz;
    double minimumSquared = minimumMiles * minimumMiles;
    return distanceSquared < minimumSquared;
}

private const double feetPerMile = 5280;
We’ve seen plenty of function declarations like this before, but we’ll quickly recap its anatomy. This one returns a bool to indicate whether the two planes are too close (true) or safely separated (false). In its parameter list, we have the references to the two Plane objects, and a double for the margin of error (in miles).
Because there’s no implicit this parameter, any attempt to use nonstatic members of the class without going through an argument or variable such as first and second in Example 3-41 will cause an error. This often catches people out when learning C#. They try adding a method to the Program class of a new program, and they forget to mark it as static (or don’t realize that they need to), and then are surprised by the error they get when attempting to call it from Main. Main is a static method, and like any static method, it cannot use nonstatic members of its containing type unless you provide it with an instance.
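The following sketch reproduces that situation in miniature. The two greeting methods are hypothetical names chosen for illustration; the commented-out line shows the call that the compiler rejects (with error CS0120, which reports that an object reference is required for a nonstatic member).

```csharp
using System;

class Program
{
    // An instance method: it has an implicit 'this'.
    public string InstanceGreeting()
    {
        return "Hello from an instance method";
    }

    // A static method: no 'this', so no object is needed to call it.
    public static string StaticGreeting()
    {
        return "Hello from a static method";
    }

    static void Main()
    {
        // Fine: one static method calling another.
        Console.WriteLine(StaticGreeting());

        // Does not compile (error CS0120): Main has no 'this' to supply.
        // Console.WriteLine(InstanceGreeting());

        // Fine: we provide an instance explicitly.
        Program p = new Program();
        Console.WriteLine(p.InstanceGreeting());
    }
}
```

Either marking the method static or creating an instance to call it on resolves the error; which is appropriate depends on whether the method needs any per-object state.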
Example 3-41 performs some calculations to work out how close the planes are. The details aren’t particularly important here—we’re more interested in how this uses C# methods. But just for completeness, the method converts the position into Cartesian coordinates, and then calculates the sum of the squares of the differences of the coordinates in all three dimensions, which will give us the square of the distance between the two planes. We could calculate the actual distance by taking the square root, but since we only want to know whether or not we’re too close, we can just compare with the minimum distance squared. (Computers are much faster at squaring than they are at calculating square roots, so given that we could do it either way, we may as well avoid the square root.)
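The squared-distance comparison can be tried in isolation. This is a sketch only: CloserThan is a hypothetical helper that takes Cartesian coordinates directly, so it sidesteps the polar conversion, and it assumes the minimum distance is not negative.

```csharp
using System;

static class DistanceSketch
{
    // Hypothetical helper: true if two points, given as Cartesian
    // coordinates, are closer together than 'minimum'. No square root is
    // needed, because for non-negative a and b, a < b exactly when
    // a * a < b * b.
    public static bool CloserThan(
        double x1, double y1, double z1,
        double x2, double y2, double z2,
        double minimum)
    {
        double dx = x1 - x2;
        double dy = y1 - y2;
        double dz = z1 - z2;
        double distanceSquared = dx * dx + dy * dy + dz * dz;
        return distanceSquared < minimum * minimum;
    }

    static void Main()
    {
        // These two points are exactly 5 units apart (a 3-4-5 triangle).
        Console.WriteLine(CloserThan(0, 0, 0, 3, 4, 0, 6)); // True: 25 < 36
        Console.WriteLine(CloserThan(0, 0, 0, 3, 4, 0, 5)); // False: 25 is not < 25
    }
}
```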
It isn’t just functions that we can declare as static. Fields and properties can be static, too. In fact, we’ve already seen a special kind of static field: the const value we defined for the conversion between miles and kilometers. There was only one conversion factor value, however many objects we instantiated.
The main difference between a const field and a static field, for our purposes, is that we can modify the static field. (Remember: the const field was immutable.) So, a static property or field effectively lets us get or set data associated with the class, rather than the object. No matter how many objects we create, we are always getting and setting the same value.
Let’s look at a trivial illustration, shown in Example 3-42, to explore how it works, before we think about why we might want to use it.
Example 3-42. Static state
public class MyClassWithAStaticProperty
{
    public static bool TrueOrFalse
    {
        get;
        set;
    }

    public void SayWhetherTrueOrFalse()
    {
        Console.WriteLine("Object is {0}", TrueOrFalse);
    }
}

class Program
{
    static void Main(string[] args)
    {
        // Create two objects
        MyClassWithAStaticProperty object1 = new MyClassWithAStaticProperty();
        MyClassWithAStaticProperty object2 = new MyClassWithAStaticProperty();

        // Check how the property looks to each object,
        // and accessed through the class name
        object1.SayWhetherTrueOrFalse();
        object2.SayWhetherTrueOrFalse();
        Console.WriteLine("Class is {0}", MyClassWithAStaticProperty.TrueOrFalse);

        // Change the value
        MyClassWithAStaticProperty.TrueOrFalse = true;

        // And see that it has changed everywhere
        object1.SayWhetherTrueOrFalse();
        object2.SayWhetherTrueOrFalse();
        Console.WriteLine("Class is {0}", MyClassWithAStaticProperty.TrueOrFalse);

        Console.ReadKey();
    }
}
If you compile and run this code in a console application project, you’ll see the following output:
Object is False
Object is False
Class is False
Object is True
Object is True
Class is True
This demonstrates that there’s clearly just the one piece of information here, no matter how many different object instances we may try to look at it through. But why might we want this kind of static, class-level data storage?
The principal use for class-level data is to enforce the reality that there is exactly one instance of some piece of data throughout the whole system. If you think about it, that’s exactly what our miles-to-kilometers value is all about: we only need one instance of that number for the whole system, so we declare it as const (which, as we’ve already seen, is like a special case of static). A similar pattern crops up in lots of places in the .NET Framework class library. For example, on a computer running Windows, there is a specific directory containing certain OS system files (typically C:\Windows\system32). The class library provides a class called Environment which offers, among other things, a SystemDirectory property that returns that location, and since there’s only one such directory, this is a static property.
Another common use for static is when we want to cache information that is expensive to calculate, or which is frequently reused by lots of different objects of the same type. To get a benefit when lots of objects use the common data, it needs to be available to all instances.
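As a rough sketch of the caching idea, the hypothetical SquareCalculator class below keeps one static dictionary that all of its instances share. Squaring a number merely stands in for a genuinely expensive calculation, and the ComputationCount property exists only to show how often the work was really done. This simple version assumes single-threaded use.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example: every instance shares one cache of results.
class SquareCalculator
{
    // One dictionary for the whole class, not one per object.
    private static Dictionary<int, long> cache = new Dictionary<int, long>();

    // Counts how many times the "expensive" work was actually performed.
    public static int ComputationCount { get; private set; }

    public long Square(int value)
    {
        long result;
        if (!cache.TryGetValue(value, out result))
        {
            // Cache miss: do the work once and remember the answer.
            ComputationCount += 1;
            result = (long)value * value;
            cache[value] = result;
        }
        return result;
    }
}

static class Program
{
    static void Main()
    {
        SquareCalculator first = new SquareCalculator();
        SquareCalculator second = new SquareCalculator();

        Console.WriteLine(first.Square(12));   // computed
        Console.WriteLine(second.Square(12));  // served from the shared cache

        Console.WriteLine(SquareCalculator.ComputationCount); // 1
    }
}
```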
We can even apply the static keyword to a constructor. This lets us write a special constructor that only runs once for the whole class. We could add the constructor in Example 3-43 to our Plane class to illustrate this.
With this code in place, you would see the message printed out by that constructor just once at the beginning of the program—static constructors run exactly once.
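As a rough sketch of such a static constructor, assuming it does nothing more than print a message: the cut-down Plane class, the instance constructor message, and the StaticConstructorRuns counter are all hypothetical additions made so that the behavior can be observed.

```csharp
using System;

// Cut-down, hypothetical Plane; only the static constructor matters here.
class Plane
{
    // Hypothetical counter, used to confirm how often the static
    // constructor has run.
    public static int StaticConstructorRuns;

    // A static constructor has no access modifier and no parameters.
    static Plane()
    {
        StaticConstructorRuns += 1;
        Console.WriteLine("Plane static constructor");
    }

    public Plane()
    {
        Console.WriteLine("Plane instance constructor");
    }
}

static class Program
{
    static void Main()
    {
        // The static constructor runs before the first instance is built,
        // and is not run again for the second.
        new Plane();
        new Plane();

        Console.WriteLine(Plane.StaticConstructorRuns); // 1
    }
}
```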
In case you’re wondering, yes, static fields can be marked as readonly. And just as a normal readonly field can only be modified in a constructor, a static readonly field can only be modified in a static constructor.
But when exactly do static constructors run? We know when regular members get initialized and when normal constructors run: that happens when we new up the object. Everything gets initialized to zero, and then our constructor(s) are called to do any other initialization that we need doing. But what about static initialization?
The static constructor will run no later than the first time either of the following happens: you create an instance of the class; you use any static member of the class. There are no guarantees about the exact moment the code will run; it’s possible you’ll see it running earlier than you would have expected for optimization reasons.
Field initializers for static fields add some slight complication. (Remember, a field initializer is an expression that provides a default value for a field, and which appears in the field declaration itself, rather than the constructor. Example 3-44 shows some examples.) .NET initializes the statics in the order in which they are declared. So, if you reference one static field from the initializer for another static field in the same class, you need to be careful, or you can get errors at runtime. Example 3-44 illustrates how this can go wrong. (Also, the .NET Framework is somewhat noncommittal about exactly when field initializers will run—in theory it has more freedom than with a static constructor, and could run them either later or earlier than you might expect, although in practice, it’s not something you’d normally need to worry about unless you’re writing multithreaded code that depends on the order in which static initialization occurs.)
Example 3-44. Unwise ordering of static field initializers
class Bar
{
    public bool myField;
}

// Bad - null reference exception on construction
class Foo
{
    public static bool field2 = field1.myField;
    public static Bar field1 = new Bar();
}

// OK - initialized in the right order
class Foo
{
    public static Bar field1 = new Bar();
    public static bool field2 = field1.myField;
}
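The same ordering rule can be observed without triggering an exception. In this sketch the WrongOrder and RightOrder classes are hypothetical, and they use int fields so that the not-yet-initialized field simply reads as zero rather than null.

```csharp
using System;

// Hypothetical classes illustrating declaration-order initialization.
class WrongOrder
{
    // Seed has not been initialized yet when this runs, so it is still 0.
    public static int Doubled = Seed * 2;
    public static int Seed = 21;
}

class RightOrder
{
    public static int Seed = 21;
    // Seed already holds 21 by the time this initializer runs.
    public static int Doubled = Seed * 2;
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(WrongOrder.Doubled); // 0
        Console.WriteLine(RightOrder.Doubled); // 42
    }
}
```

The compiler accepts both versions, which is why this kind of mistake tends to surface only at runtime.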
We saw how to define classes from which we can create instances called objects, and that this can be useful when attempting to model real-world entities. We can also define value types, using the struct keyword, and the main difference is that when we assign variables or pass arguments, value types always copy the whole value, whereas ordinary classes (which are reference types) only copy a reference to the underlying object. We also saw a simpler kind of type: enum. This lets us define named sets of constant values, and is useful when we need a value representing a choice from a fixed set of options.
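The copying difference between value types and reference types can be seen in a few lines. The PointValue and PointReference types here are hypothetical, and they expose a public field purely to keep the sketch short.

```csharp
using System;

// Hypothetical types that differ only in being a struct or a class.
struct PointValue
{
    public int X;
}

class PointReference
{
    public int X;
}

static class Program
{
    static void Main()
    {
        PointValue v1 = new PointValue();
        v1.X = 1;
        PointValue v2 = v1;   // copies the whole value
        v2.X = 2;             // changes only the copy

        PointReference r1 = new PointReference();
        r1.X = 1;
        PointReference r2 = r1;  // copies only the reference
        r2.X = 2;                // changes the one shared object

        Console.WriteLine(v1.X); // 1
        Console.WriteLine(r1.X); // 2
    }
}
```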
So, now we know how to abstract basic ideas of information storage (through fields and simple properties) and manipulation (through functions and calculated properties), using classes and objects. In the next chapter, we’re going to look at how we can extend these ideas further using a concept called polymorphism to model a hierarchy of related classes that can extend or refine some basic contract.
[9] A special kind of crayon, designed for writing on glossy surfaces such as plastic.
[10] It’s short for “enumeration,” by the way. So it’s often pronounced “e-noom” or, depending on where you’re from, “e-nyoom.” However, some developers (and one of the authors) ignore the etymology and pronounce it “ee numb” because that’s how it looks like it should sound.