Chapter 7

Operators and Casts

WHAT’S IN THIS CHAPTER?

  • Operators in C#
  • The idea of equality when dealing with reference and value types
  • Data conversion between primitive data types
  • Converting value types to reference types using boxing
  • Converting between reference types by casting
  • Overloading the standard operators for custom types
  • Adding cast operators to custom types

WROX.COM CODE DOWNLOADS FOR THIS CHAPTER

The wrox.com code downloads for this chapter are found at http://www.wrox.com/remtitle.cgi?isbn=1118314425 on the Download Code tab. The code for this chapter is divided into the following major examples:

  • SimpleCurrency
  • SimpleCurrency2
  • VectorStruct
  • VectorStructMoreOverloads

OPERATORS AND CASTS

The preceding chapters have covered most of what you need to start writing useful programs using C#. This chapter completes the discussion of the essential language elements and illustrates some powerful aspects of C# that enable you to extend its capabilities.

OPERATORS

Although most of C#’s operators should be familiar to C and C++ developers, this section discusses the most important operators for the benefit of new programmers and Visual Basic converts, and sheds light on some of the changes introduced with C#.

C# supports the operators listed in the following table:

CATEGORY OPERATOR
Arithmetic + - * / %
Logical & | ^ ~ && || !
String concatenation +
Increment and decrement ++ --
Bit shifting << >>
Comparison == != < > <= >=
Assignment = += -= *= /= %= &= |= ^= <<= >>=
Member access (for objects and structs) .
Indexing (for arrays and indexers) []
Cast ()
Conditional (the ternary operator) ?:
Delegate concatenation and removal (discussed in Chapter 8, “Delegates, Lambdas, and Events”) + -
Object creation new
Type information sizeof is typeof as
Overflow exception control checked unchecked
Indirection and address []
Namespace alias qualifier (discussed in Chapter 2, “Core C#”) ::
Null coalescing operator ??

Note, however, that four specific operators (sizeof, *, ->, and &, listed in the following table) are associated with unsafe code (code that bypasses C#’s type-safety checking), which is discussed in Chapter 14, “Memory Management and Pointers.” The *, ->, and & operators, when used as pointer operators, are available only in unsafe code. The sizeof operator required unsafe mode in all cases only in the earliest versions of the .NET Framework (1.0 and 1.1); since the .NET Framework 2.0, it no longer does when applied to the predefined types.

CATEGORY OPERATOR
Operator keywords sizeof (for .NET Framework versions 1.0 and 1.1 only)
Operators * -> &

One of the biggest pitfalls to watch out for when using C# operators is that, as with other C-style languages, C# uses different operators for assignment (=) and comparison (==). For instance, the following statement means “let x equal three”:

x = 3;

If you now want to compare x to a value, you need to use the double equals sign ==:

if (x == 3)
{
      
}

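Fortunately, C#’s strict type-safety rules prevent the very common C error whereby assignment is performed instead of comparison in logical statements. The condition of an if statement must be a Boolean expression, and the assignment x = 3 evaluates to an int, which is not implicitly converted to bool. This means that in C# the following statement generates a compiler error: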

if (x = 3)
{
      
}

Visual Basic programmers who are accustomed to using the ampersand (&) character to concatenate strings will have to make an adjustment. In C#, the plus sign (+) is used instead for concatenation, whereas the & symbol denotes a bitwise AND between two different integer values. The pipe symbol, |, enables you to perform a bitwise OR between two integers. Visual Basic programmers also might not recognize the modulus (%) arithmetic operator. This returns the remainder after division, so, for example, x % 5 returns 2 if x is equal to 7.
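The operators just described can be tried out with a short program. This is only a minimal sketch; the class name BasicOperatorsDemo and the sample values are chosen purely for illustration.

```csharp
using System;

public static class BasicOperatorsDemo
{
    public static void Main()
    {
        string greeting = "Hello, " + "world";   // + concatenates strings
        Console.WriteLine(greeting);             // Hello, world

        int x = 12;   // binary 1100
        int y = 10;   // binary 1010
        Console.WriteLine(x & y);   // 8  (binary 1000): bitwise AND
        Console.WriteLine(x | y);   // 14 (binary 1110): bitwise OR

        Console.WriteLine(7 % 5);   // 2: remainder after division
    }
}
```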

You will use few pointers in C#, and therefore few indirection operators. More specifically, the only place you will use them is within blocks of unsafe code, because that is the only place in C# where pointers are allowed. Pointers and unsafe code are discussed in Chapter 14.

Operator Shortcuts

The following table shows the full list of shortcut assignment operators available in C#:

SHORTCUT OPERATOR EQUIVALENT TO
x++, ++x x = x + 1
x--, --x x = x - 1
x += y x = x + y
x -= y x = x - y
x *= y x = x * y
x /= y x = x / y
x %= y x = x % y
x >>= y x = x >> y
x <<= y x = x << y
x &= y x = x & y
x |= y x = x | y
x ^= y x = x ^ y

You may be wondering why there are two examples each for the ++ increment and the -- decrement operators. Placing the operator before the expression is known as a prefix; placing the operator after the expression is known as a postfix. Note that there is a difference in the way they behave.

The increment and decrement operators can act both as entire expressions and within expressions. When used by themselves, the effect of both the prefix and postfix versions is identical and corresponds to the statement x = x + 1. When used within larger expressions, the prefix operator increments the value of x before the expression is evaluated; in other words, x is incremented and the new value is used in the expression. Conversely, the postfix operator increments the value of x after the expression is evaluated; the expression is evaluated using the original value of x. The following example uses the increment operator (++) to demonstrate the difference between the prefix and postfix behavior:

int x = 5;
      
if (++x == 6)  // true – x is incremented to 6 before the evaluation
{
   Console.WriteLine("This will execute");
}
      
if (x++ == 7) // false – x is incremented to 7 after the evaluation
{
   Console.WriteLine("This won't");
}

The first if condition evaluates to true because x is incremented from 5 to 6 before the expression is evaluated. The condition in the second if statement is false, however, because x is incremented to 7 only after the entire expression has been evaluated (while x == 6).

The prefix and postfix operators --x and x-- behave in the same way, but decrement rather than increment the operand.

The other shortcut operators, such as += and -=, require two operands, and are used to modify the value of the first operand by performing an arithmetic, logical, or bitwise operation on it. For example, the next two lines are equivalent:

x += 5;
x = x + 5;

The following sections look at some of the primary and cast operators that you will frequently use within your C# code.

The Conditional Operator (?:)

The conditional operator (?:), also known as the ternary operator, is a shorthand form of the if...else construction. It gets its name from the fact that it involves three operands. It allows you to evaluate a condition, returning one value if that condition is true, or another value if it is false. The syntax is as follows:

condition ? true_value : false_value

Here, condition is the Boolean expression to be evaluated, true_value is the value that will be returned if condition is true, and false_value is the value that will be returned otherwise.

When used sparingly, the conditional operator can add a dash of terseness to your programs. It is especially handy for providing one of a couple of arguments to a function that is being invoked. You can use it to quickly convert a Boolean value to a string value of true or false. It is also handy for displaying the correct singular or plural form of a word:

int x = 1;
string s = x + " ";
s += (x == 1 ? "man" : "men");
Console.WriteLine(s);

This code displays 1 man if x is equal to one but will display the correct plural form for any other number. Note, however, that if your output needs to be localized to different languages, you have to write more sophisticated routines to take into account the different grammatical rules of different languages.

The checked and unchecked Operators

Consider the following code:

byte b = 255;
b++;
Console.WriteLine(b.ToString());

The byte data type can hold values only in the range 0 to 255, so incrementing the value of b causes an overflow. How the CLR handles this depends on a number of issues, including compiler options; so whenever there’s a risk of an unintentional overflow, you need some way to ensure that you get the result you want.

To do this, C# provides the checked and unchecked operators. If you mark a block of code as checked, the CLR will enforce overflow checking, throwing an OverflowException if an overflow occurs. The following changes the preceding code to include the checked operator:

byte b = 255;
checked
{
   b++;
}
Console.WriteLine(b.ToString());

When you try to run this code, you will get an error message like this:

Unhandled Exception: System.OverflowException: Arithmetic operation resulted in an 
   overflow at Wrox.ProCSharp.Basics.OverflowTest.Main(String[] args)
 

NOTE You can enforce overflow checking for all unmarked code in your program by specifying the /checked compiler option.

If you want to suppress overflow checking, you can mark the code as unchecked:

byte b = 255;
unchecked
{
   b++;
}
Console.WriteLine(b.ToString());

In this case, no exception is raised, but you lose data. Because the byte type cannot hold a value of 256, the overflowing bits are discarded, and your b variable holds a value of zero (0).

Note that unchecked is the default behavior. The only time you are likely to need to explicitly use the unchecked keyword is when you need a few unchecked lines of code inside a larger block that you have explicitly marked as checked.
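As a minimal sketch of how the two keywords combine, the following program catches the OverflowException raised by a checked increment and then uses the expression form unchecked(...) inside a checked block. The names OverflowDemo, IncrementOverflows, and IncrementWrapped are hypothetical and exist only for this illustration.

```csharp
using System;

public static class OverflowDemo
{
    // Returns true if incrementing the value overflows a byte under checked arithmetic.
    public static bool IncrementOverflows(byte value)
    {
        try
        {
            checked
            {
                value++;   // throws OverflowException when value is 255
            }
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // Increments with overflow checking switched off, even inside a checked block.
    public static byte IncrementWrapped(byte value)
    {
        checked
        {
            // The expression form applies only to the parenthesized expression.
            return unchecked((byte)(value + 1));
        }
    }

    public static void Main()
    {
        Console.WriteLine(IncrementOverflows(255));   // True
        Console.WriteLine(IncrementOverflows(10));    // False
        Console.WriteLine(IncrementWrapped(255));     // 0
    }
}
```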

The is Operator

The is operator allows you to check whether an object is compatible with a specific type. The phrase “is compatible” means that an object either is of that type or is derived from that type. For example, to check whether a variable is compatible with the object type, you could use the following bit of code:

int i = 10;
if (i is object)
{
   Console.WriteLine("i is an object");
}

int, like all C# data types, inherits from object; therefore, the expression i is object evaluates to true in this case, and the appropriate message will be displayed.

The as Operator

The as operator is used to perform explicit type conversions of reference types. If the type being converted is compatible with the specified type, conversion is performed successfully. However, if the types are incompatible, the as operator returns the value null. As shown in the following code, attempting to convert an object reference to a string will return null if the object reference does not actually refer to a string instance:

object o1 = "Some String";
object o2 = 5;
      
string s1 = o1 as string;   // s1 = "Some String"
string s2 = o2 as string;   // s2 = null

The as operator allows you to perform a safe type conversion in a single step without the need to first test the type using the is operator and then perform the conversion.
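The difference between the two approaches can be sketched as follows. The helper methods LengthUsingIs and LengthUsingAs, and the use of -1 to signal "not a string," are assumptions made for this example only.

```csharp
using System;

public static class AsOperatorDemo
{
    // Two-step approach: test with is, then cast.
    public static int LengthUsingIs(object o)
    {
        if (o is string)
        {
            string s = (string)o;
            return s.Length;
        }
        return -1;
    }

    // Single-step approach: convert with as, then test for null.
    public static int LengthUsingAs(object o)
    {
        string s = o as string;
        return s != null ? s.Length : -1;
    }

    public static void Main()
    {
        Console.WriteLine(LengthUsingIs("Some String"));   // 11
        Console.WriteLine(LengthUsingAs("Some String"));   // 11
        Console.WriteLine(LengthUsingAs(5));               // -1
    }
}
```

Both methods return the same results; the as version performs the type check only once.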

The sizeof Operator

You can determine the size (in bytes) required on the stack by a value type using the sizeof operator:

Console.WriteLine(sizeof(int));

This will display the number 4, because an int is 4 bytes long.

If you are using the sizeof operator with complex types (and not primitive types), you need to enclose the code within an unsafe block, as illustrated here:

unsafe
{
   Console.WriteLine(sizeof(Customer));
}

Chapter 14 looks at unsafe code in more detail.

The typeof Operator

The typeof operator returns a System.Type object representing a specified type. For example, typeof(string) will return a Type object representing the System.String type. This is useful when you want to use reflection to find information about an object dynamically. For more information, see Chapter 15, “Reflection.”
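A short sketch may help here. typeof takes a type name that is known at compile time, whereas the GetType() method inherited from System.Object reports the type of an instance at runtime. The class name TypeofDemo is hypothetical.

```csharp
using System;

public static class TypeofDemo
{
    public static void Main()
    {
        Type compileTime = typeof(string);   // type named in the source code
        object o = "hello";
        Type runTime = o.GetType();          // type of the instance at runtime

        Console.WriteLine(compileTime.FullName);           // System.String
        Console.WriteLine(compileTime == runTime);         // True
        Console.WriteLine(typeof(int) == typeof(Int32));   // True: int is an alias for System.Int32
    }
}
```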

Nullable Types and Operators

Looking at the Boolean type, you have a true or false value that you can assign to this type. However, what if you wanted to define the value of the type as undefined? This is where using nullable types can add a distinct value to your applications. If you use nullable types in your programs, you must always consider the effect a null value can have when used in conjunction with the various operators. Usually, when using a unary or binary operator with nullable types, the result will be null if one or both of the operands is null. For example:

int? a = null;
      
int? b = a + 4;      // b = null
int? c = a * 5;      // c = null

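However, when comparing nullable types with the relational operators (<, >, <=, and >=), if only one of the operands is null, the comparison always evaluates to false. This means that you cannot assume a condition is true just because its opposite is false, as often happens in programs using non-nullable types. In the following example, neither comparison of a with b is true, so the else branch runs even though its message does not actually hold: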

int? a = null;
int? b = -5;
      
if (a >= b)
   Console.WriteLine("a >= b");
else
   Console.WriteLine("a < b");
 

NOTE The possibility of a null value means that you cannot freely combine nullable and non-nullable types in an expression. This is discussed in the section “Type Conversions” later in this chapter.
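The following sketch prints the result of each comparison when one operand is null, and shows one possible way to make the null case explicit by testing HasValue first. The class name NullableComparisonDemo is hypothetical.

```csharp
using System;

public static class NullableComparisonDemo
{
    public static void Main()
    {
        int? a = null;
        int? b = -5;

        Console.WriteLine(a >= b);   // False
        Console.WriteLine(a < b);    // False
        Console.WriteLine(a == b);   // False
        Console.WriteLine(a != b);   // True

        // Testing HasValue first makes the null case explicit.
        if (a.HasValue && b.HasValue)
        {
            Console.WriteLine(a.Value >= b.Value ? "a >= b" : "a < b");
        }
        else
        {
            Console.WriteLine("at least one operand is null");
        }
    }
}
```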

The Null Coalescing Operator

The null coalescing operator (??) provides a shorthand mechanism to cater to the possibility of null values when working with nullable and reference types. The operator is placed between two operands — the first operand must be a nullable type or reference type, and the second operand must be of the same type as the first or of a type that is implicitly convertible to the type of the first operand. The null coalescing operator evaluates as follows:

  • If the first operand is not null, then the overall expression has the value of the first operand.
  • If the first operand is null, then the overall expression has the value of the second operand.

For example:

int? a = null;
int b;
      
b = a ?? 10;     // b has the value 10
a = 3;
b = a ?? 10;     // b has the value 3

If the second operand cannot be implicitly converted to the type of the first operand, a compile-time error is generated.
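The operator works the same way with reference types and can be chained, because it is evaluated from right to left. The following is a small sketch; the DisplayName method and the "(unknown)" fallback are invented for the example.

```csharp
using System;

public static class NullCoalescingDemo
{
    // Returns the first non-null name, or a fixed fallback.
    public static string DisplayName(string nickname, string fullName)
    {
        return nickname ?? fullName ?? "(unknown)";
    }

    public static void Main()
    {
        Console.WriteLine(DisplayName(null, "Ada Lovelace"));    // Ada Lovelace
        Console.WriteLine(DisplayName("Ada", "Ada Lovelace"));   // Ada
        Console.WriteLine(DisplayName(null, null));              // (unknown)

        int? count = null;
        int total = count ?? 0;     // falls back to 0 when count is null
        Console.WriteLine(total);   // 0
    }
}
```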

Operator Precedence

The following table shows the order of precedence of the C# operators. The operators at the top of the table are those with the highest precedence (that is, the ones evaluated first in an expression containing multiple operators).

GROUP OPERATORS
Primary () . [] x++ x-- new typeof sizeof checked unchecked
Unary + - ! ~ ++x --x and casts
Multiplication/division * / %
Addition/subtraction + -
Bitwise shift operators << >>
Relational < > <= >= is as
Comparison == !=
Bitwise AND &
Bitwise XOR ^
Bitwise OR |
Boolean AND &&
Boolean OR ||
Conditional operator ?:
Assignment = += -= *= /= %= &= |= ^= <<= >>=

NOTE In complex expressions, avoid relying on operator precedence to produce the correct result. Using parentheses to specify the order in which you want operators applied clarifies your code and prevents potential confusion.
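A few expressions illustrate how the precedence table applies and how parentheses change the result. This is only a sketch; the class name PrecedenceDemo and the values are arbitrary.

```csharp
using System;

public static class PrecedenceDemo
{
    public static void Main()
    {
        Console.WriteLine(2 + 3 * 4);        // 14: * binds tighter than +
        Console.WriteLine((2 + 3) * 4);      // 20

        Console.WriteLine(1 << 2 + 1);       // 8: + binds tighter than <<
        Console.WriteLine((1 << 2) + 1);     // 5

        bool a = true, b = false, c = false;
        Console.WriteLine(a || b && c);      // True: && binds tighter than ||
        Console.WriteLine((a || b) && c);    // False
    }
}
```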

TYPE SAFETY

Chapter 1, “.NET Architecture,” noted that the Intermediate Language (IL) enforces strong type safety upon its code. Strong typing enables many of the services provided by .NET, including security and language interoperability. As you would expect from a language compiled into IL, C# is also strongly typed. Among other things, this means that data types are not always seamlessly interchangeable. This section looks at conversions between primitive types.


NOTE C# also supports conversions between different reference types and allows you to define how data types that you create behave when converted to and from other types. Both of these topics are discussed later in this chapter.
Generics, however, enable you to avoid some of the most common situations in which you would need to perform type conversions. See Chapter 5, “Generics” and Chapter 10, “Collections,” for details.

Type Conversions

Often, you need to convert data from one type to another. Consider the following code:

byte value1 = 10;
byte value2 = 23;
byte total;
total = value1 + value2;
Console.WriteLine(total);

When you attempt to compile these lines, you get the following error message:

Cannot implicitly convert type 'int' to 'byte'

The problem here is that when you add two byte values together, the result is returned as an int, not as another byte. This is because a byte can contain only 8 bits of data, so adding two byte values together could very easily result in a value that cannot be stored in a single byte. If you want to store this result in a byte variable, you have to convert it back to a byte. The following sections discuss the two conversion mechanisms supported by C#: implicit and explicit.

Implicit Conversions

Conversion between types can normally be achieved automatically (implicitly) only if you can guarantee that the value is not changed in any way. This is why the previous code failed; by attempting a conversion from an int to a byte, you were potentially losing 3 bytes of data. The compiler won’t let you do that unless you explicitly specify that’s what you want to do. If you store the result in a long instead of a byte, however, you will have no problems:

byte value1 = 10;
byte value2 = 23;
long total;               // this will compile fine
total = value1 + value2;
Console.WriteLine(total);

Your program has compiled with no errors at this point because a long holds more bytes of data than a byte, so there is no risk of data being lost. In these circumstances, the compiler is happy to make the conversion for you, without your needing to ask for it explicitly.

The following table shows the implicit type conversions supported in C#:

FROM TO
sbyte short, int, long, float, double, decimal, BigInteger
byte short, ushort, int, uint, long, ulong, float, double, decimal, BigInteger
short int, long, float, double, decimal, BigInteger
ushort int, uint, long, ulong, float, double, decimal, BigInteger
int long, float, double, decimal, BigInteger
uint long, ulong, float, double, decimal, BigInteger
long, ulong float, double, decimal, BigInteger
float double
char ushort, int, uint, long, ulong, float, double, decimal, BigInteger

As you would expect, you can perform implicit conversions only from a smaller integer type to a larger one, not from larger to smaller. You can also convert between integers and floating-point values; however, the rules are slightly different here. Though you can convert between types of the same size, such as int/uint to float and long/ulong to double, you can also convert from long/ulong back to float. You might lose 4 bytes of data doing this, but it only means that the value of the float you receive will be less precise than if you had used a double; the compiler regards this as an acceptable possible error because the magnitude of the value is not affected. You can also assign an unsigned variable to a signed variable as long as the value limits of the unsigned type fit between the limits of the signed variable.

Nullable types introduce additional considerations when implicitly converting value types:

  • Nullable types implicitly convert to other nullable types following the conversion rules described for non-nullable types in the previous table; that is, int? implicitly converts to long?, float?, double?, and decimal?.
  • Non-nullable types implicitly convert to nullable types according to the conversion rules described in the preceding table; that is, int implicitly converts to long?, float?, double?, and decimal?.
  • Nullable types do not implicitly convert to non-nullable types; you must perform an explicit conversion as described in the next section. That’s because there is a chance that a nullable type will have the value null, which cannot be represented by a non-nullable type.
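The three rules above can be sketched in a few lines. The class name NullableConversionDemo and the variable names are hypothetical.

```csharp
using System;

public static class NullableConversionDemo
{
    public static void Main()
    {
        int i = 7;
        int? ni = i;          // non-nullable to nullable: implicit
        long? nl = ni;        // int? to long?: implicit
        double? nd = i;       // int to double?: implicit

        // int back = ni;     // does not compile: int? to int needs a cast
        int back = (int)ni;   // explicit; would throw if ni were null

        int? none = null;
        long? stillNone = none;   // a null value converts to a null value

        Console.WriteLine(nl);                   // 7
        Console.WriteLine(nd);                   // 7
        Console.WriteLine(back);                 // 7
        Console.WriteLine(stillNone.HasValue);   // False
    }
}
```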

Explicit Conversions

Many conversions cannot be implicitly made between types, and the compiler will return an error if any are attempted. These are some of the conversions that cannot be made implicitly:

  • int to short — Data loss is possible.
  • int to uint — Data loss is possible.
  • uint to int — Data loss is possible.
  • float to int — Everything is lost after the decimal point.
  • Any numeric type to char — Data loss is possible.
  • decimal to any numeric type — The decimal type is internally structured differently from both integers and floating-point numbers.
  • int? to int — The nullable type may have the value null.

However, you can explicitly carry out such conversions using casts. When you cast one type to another, you deliberately force the compiler to make the conversion. A cast looks like this:

long val = 30000;
int i = (int)val;   // A valid cast. The maximum int is 2147483647

You indicate the type to which you are casting by placing its name in parentheses before the value to be converted. If you are familiar with C, this is the typical syntax for casts. If you are familiar with the C++ special cast keywords such as static_cast, note that these do not exist in C#; you have to use the older C-type syntax.

Casting can be a dangerous operation to undertake. Even a simple cast from a long to an int can cause problems if the value of the original long is greater than the maximum value of an int:

long val = 3000000000;
int i = (int)val;         // An invalid cast. The maximum int is 2147483647

In this case, you will not get an error, but nor will you get the result you expect. If you run this code and output the value stored in i, this is what you get:

-1294967296

It is good practice to assume that an explicit cast will not return the results you expect. As shown earlier, C# provides a checked operator that you can use to test whether an operation causes an arithmetic overflow. You can use the checked operator to confirm that a cast is safe and to force the runtime to throw an overflow exception if it is not:

long val = 3000000000;
int i = checked((int)val);

Bearing in mind that all explicit casts are potentially unsafe, take care to include code in your application to deal with possible failures of the casts. Chapter 16, “Errors and Exceptions,” introduces structured exception handling using the try and catch statements.
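One way to deal with such a failure is to wrap the checked cast in a try block, as in the following sketch. The TryNarrow helper is a hypothetical name, not part of the .NET class library.

```csharp
using System;

public static class CheckedCastDemo
{
    // Tries to narrow a long to an int; reports failure instead of wrapping around.
    public static bool TryNarrow(long value, out int result)
    {
        try
        {
            result = checked((int)value);
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }

    public static void Main()
    {
        int i;
        Console.WriteLine(TryNarrow(30000, out i));        // True
        Console.WriteLine(i);                              // 30000
        Console.WriteLine(TryNarrow(3000000000, out i));   // False
    }
}
```

An alternative that avoids exceptions altogether is to compare the value against int.MinValue and int.MaxValue before casting.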

Using casts, you can convert most primitive data types from one type to another; for example, in the following code, the value 0.5 is added to price, and the total is cast to an int:

double price = 25.30;
int approximatePrice = (int)(price + 0.5);

This gives the price rounded to the nearest dollar. However, in this conversion, data is lost — namely, everything after the decimal point. Therefore, such a conversion should never be used if you want to continue to do more calculations using this modified price value. However, it is useful if you want to output the approximate value of a completed or partially completed calculation — if you don’t want to bother the user with a lot of figures after the decimal point.

This example shows what happens if you convert an unsigned integer into a char:

ushort c = 43;
char symbol = (char)c;
Console.WriteLine(symbol);

The output is the character that has an ASCII number of 43, the + sign. You can try any kind of conversion you want between the numeric types (including char) and it will work, such as converting a decimal into a char, or vice versa.

Converting between value types is not restricted to isolated variables, as you have seen. You can convert an array element of type double to a struct member variable of type int:

struct ItemDetails
{
   public string Description;
   public int ApproxPrice;
}
      
//..
      
double[] Prices = { 25.30, 26.20, 27.40, 30.00 };
      
ItemDetails id;
id.Description = "Hello there.";
id.ApproxPrice = (int)(Prices[0] + 0.5);

To convert a nullable type to a non-nullable type or another nullable type where data loss may occur, you must use an explicit cast. This is true even when converting between elements with the same basic underlying type — for example, int? to int or float? to float. This is because the nullable type may have the value null, which cannot be represented by the non-nullable type. As long as an explicit cast between two equivalent non-nullable types is possible, so is the explicit cast between nullable types. However, when casting from a nullable type to a non-nullable type and the variable has the value null, an InvalidOperationException is thrown. For example:

int? a = null;
int  b = (int)a;     // Will throw exception

Using explicit casts and a bit of care and attention, you can convert any instance of a simple value type to almost any other. However, there are limitations on what you can do with explicit type conversions — as far as value types are concerned, you can only convert to and from the numeric and char types and enum types. You cannot directly cast Booleans to any other type or vice versa.
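These limits can be sketched as follows. The Weekday enum and the class name ExplicitLimitsDemo are invented for the example; the commented-out line shows a cast that the compiler rejects.

```csharp
using System;

// Hypothetical enum used only for illustration.
public enum Weekday { Monday = 1, Tuesday = 2 }

public static class ExplicitLimitsDemo
{
    public static void Main()
    {
        int n = (int)Weekday.Tuesday;   // enum to int
        Weekday d = (Weekday)1;         // int to enum
        Console.WriteLine(n);           // 2
        Console.WriteLine(d);           // Monday

        bool flag = true;
        // int bad = (int)flag;         // does not compile: no cast from bool
        int asInt = flag ? 1 : 0;       // use the conditional operator instead
        bool fromInt = (n != 0);        // compare rather than cast
        Console.WriteLine(asInt);       // 1
        Console.WriteLine(fromInt);     // True
    }
}
```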

If you need to convert between numeric and string, you can use methods provided in the .NET class library. The Object class implements a ToString() method, which has been overridden in all the .NET predefined types and which returns a string representation of the object:

int i = 10;
string s = i.ToString();

Similarly, if you need to parse a string to retrieve a numeric or Boolean value, you can use the Parse() method supported by all the predefined value types:

string s = "100";
int i = int.Parse(s);
Console.WriteLine(i + 50);   // Add 50 to prove it is really an int

Note that Parse() registers an error by throwing an exception if it is unable to convert the string (for example, if you try to convert the string “Hello” to an integer). Again, exceptions are covered in Chapter 16.
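When invalid input is expected rather than exceptional, the TryParse() method offered by the predefined numeric types reports failure through its return value instead of throwing. The following is a minimal sketch; the ParseOrDefault helper and its fallback parameter are hypothetical.

```csharp
using System;

public static class ParseDemo
{
    // Parses text as an int, returning a fallback when the text is not numeric.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }

    public static void Main()
    {
        Console.WriteLine(ParseOrDefault("100", -1) + 50);   // 150
        Console.WriteLine(ParseOrDefault("Hello", -1));      // -1

        try
        {
            int.Parse("Hello");
        }
        catch (FormatException)
        {
            Console.WriteLine("Parse threw FormatException");
        }
    }
}
```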

Boxing and Unboxing

In Chapter 2, “Core C#,” you learned that all types — both the simple predefined types such as int and char, and the complex types such as classes and structs — derive from the object type. This means you can treat even literal values as though they are objects:

string s = 10.ToString();

However, you also saw that C# data types are divided into value types, which are allocated on the stack, and reference types, which are allocated on the managed heap. How does this square with the capability to call methods on an int, if the int is nothing more than a 4-byte value on the stack?

C# achieves this through a bit of magic called boxing. Boxing and its counterpart, unboxing, enable you to convert value types to reference types and then back to value types. We include this in the section on casting because this is essentially what you are doing — you are casting your value to the object type. Boxing is the term used to describe the transformation of a value type to a reference type. Basically, the runtime creates a temporary reference-type box for the object on the heap.

This conversion can occur implicitly, as in the preceding example, but you can also perform it explicitly:

int myIntNumber = 20;
object myObject = myIntNumber;

Unboxing is the term used to describe the reverse process, whereby the value of a previously boxed value type is cast back to a value type. We use the term cast here because this has to be done explicitly. The syntax is similar to explicit type conversions already described:

int myIntNumber = 20;
object myObject = myIntNumber;        // Box the int
int mySecondNumber = (int)myObject;   // Unbox it back into an int

A variable can be unboxed only if it has been boxed. If you execute the last line when myObject is not a boxed int, an exception is thrown at runtime.

One word of warning: When unboxing, you have to be careful that the receiving value variable has enough room to store all the bytes in the value being unboxed. C#’s ints, for example, are only 32 bits long, so unboxing a long value (64 bits) into an int, as shown here, will result in an InvalidCastException:

long myLongNumber = 333333423;
object myObject = (object)myLongNumber;
int myIntNumber = (int)myObject;
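The following sketch catches the exception and then shows one way around it: unbox to the exact type that was boxed, and only then convert to the type you want. The class name UnboxingDemo is hypothetical.

```csharp
using System;

public static class UnboxingDemo
{
    public static void Main()
    {
        long myLongNumber = 333333423;
        object myObject = myLongNumber;     // box the long

        try
        {
            int wrong = (int)myObject;      // unboxing to the wrong type
            Console.WriteLine(wrong);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Cannot unbox a long as an int");
        }

        // Unbox to the exact boxed type first, then convert.
        int myIntNumber = (int)(long)myObject;
        Console.WriteLine(myIntNumber);     // 333333423
    }
}
```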

COMPARING OBJECTS FOR EQUALITY

After discussing operators and briefly touching on the equality operator, it is worth considering for a moment what equality means when dealing with instances of classes and structs. Understanding the mechanics of object equality is essential for programming logical expressions and is important when implementing operator overloads and casts, the topic of the rest of this chapter.

The mechanisms of object equality vary depending on whether you are comparing reference types (instances of classes) or value types (the primitive data types, instances of structs, or enums). The following sections present the equality of reference types and value types independently.

Comparing Reference Types for Equality

You might be surprised to learn that System.Object defines three different methods for comparing objects for equality: ReferenceEquals() and two versions of Equals(). Add to this the comparison operator (==) and you actually have four ways to compare for equality. Some subtle differences exist between the different methods, which are examined next.

The ReferenceEquals() Method

ReferenceEquals() is a static method that tests whether two references refer to the same instance of a class, specifically whether the two references contain the same address in memory. As a static method, it cannot be overridden, so the System.Object implementation is what you always have. ReferenceEquals() always returns true if supplied with two references that refer to the same object instance, and false otherwise. It does, however, consider null to be equal to null:

SomeClass x, y;
x = new SomeClass();
y = new SomeClass();
bool B1 = ReferenceEquals(null, null);     // returns true
bool B2 = ReferenceEquals(null,x);         // returns false
bool B3 = ReferenceEquals(x, y);           // returns false because x and y
                                           // point to different objects

The Virtual Equals() Method

The System.Object implementation of the virtual version of Equals() also works by comparing references. However, because this method is virtual, you can override it in your own classes to compare objects by value. In particular, if you intend instances of your class to be used as keys in a dictionary, you need to override this method to compare values. Otherwise, depending on how you override Object.GetHashCode(), the dictionary class that contains your objects will either not work at all or work very inefficiently. Note that when overriding Equals(), your override should never throw exceptions. Again, that’s because doing so can cause problems for dictionary classes and possibly some other .NET base classes that internally call this method.

The Static Equals() Method

The static version of Equals() actually does the same thing as the virtual instance version. The difference is that the static version takes two parameters and compares them for equality. This method is able to cope when either of the objects is null; therefore, it provides an extra safeguard against throwing exceptions if there is a risk that an object might be null. The static overload first checks whether the references it has been passed are null. If they are both null, it returns true (because null is considered to be equal to null). If just one of them is null, it returns false. If both references actually refer to something, it calls the virtual instance version of Equals(). This means that when you override the instance version of Equals(), the effect is the same as if you were overriding the static version as well.
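The interplay between the three methods can be sketched with a small class that overrides Equals() and GetHashCode() together. ProductCode is a hypothetical class invented for this example; it is not one of the chapter's downloadable samples.

```csharp
using System;

// Hypothetical class used only to illustrate overriding Equals().
public sealed class ProductCode
{
    private readonly string code;

    public ProductCode(string code)
    {
        this.code = code;
    }

    // Value comparison; returns false rather than throwing for null or other types.
    public override bool Equals(object obj)
    {
        ProductCode other = obj as ProductCode;
        return other != null && code == other.code;
    }

    // Objects that compare equal must return the same hash code.
    public override int GetHashCode()
    {
        return code == null ? 0 : code.GetHashCode();
    }
}

public static class EqualsDemo
{
    public static void Main()
    {
        ProductCode a = new ProductCode("AB-1");
        ProductCode b = new ProductCode("AB-1");

        Console.WriteLine(object.ReferenceEquals(a, b));   // False: two instances
        Console.WriteLine(a.Equals(b));                    // True: same value
        Console.WriteLine(object.Equals(a, b));            // True: calls a.Equals(b)
        Console.WriteLine(object.Equals(null, b));         // False
        Console.WriteLine(object.Equals(null, null));      // True
    }
}
```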

Comparison Operator (==)

It is best to think of the comparison operator as an intermediate option between strict value comparison and strict reference comparison. In most cases, writing the following means that you are comparing references:

bool b = (x == y);   // x, y object references

However, it is accepted that there are some classes whose meanings are more intuitive if they are treated as values. In those cases, it is better to overload the comparison operator to perform a value comparison. Overloading operators is discussed next, but the obvious example of this is the System.String class, for which Microsoft has overloaded this operator to compare the contents of the strings rather than their references.

Comparing Value Types for Equality

When comparing value types for equality, the same principles hold as for reference types: ReferenceEquals() is used to compare references, Equals() is intended for value comparisons, and the comparison operator is viewed as an intermediate case. However, the big difference is that value types need to be boxed to be converted to references so that methods can be executed on them. In addition, Microsoft has already overridden the instance Equals() method in the System.ValueType class to test equality appropriate to value types. If you call sA.Equals(sB) where sA and sB are instances of some struct, the return value will be true or false, according to whether sA and sB contain the same values in all their fields. On the other hand, no overload of == is available by default for your own structs. Writing (sA == sB) in any expression will result in a compilation error unless you have provided an overload of == in your code for the struct in question.

Another point is that ReferenceEquals() always returns false when applied to value types because, to call this method, the value types need to be boxed into objects. Even if you write the following, you will still get the result of false:

bool b = ReferenceEquals(v,v);   // v is a variable of some value type

The reason is that v will be boxed separately when converting each parameter, which means you get different references. Therefore, there is no reason to call ReferenceEquals() to compare value types.

Although the default override of Equals() supplied by System.ValueType will almost certainly be adequate for the vast majority of structs that you define, you might want to override it again for your own structs to improve performance. Also, if a value type contains reference types as fields, you might want to override Equals() to provide appropriate semantics for these fields because the default override simply calls Equals() on each of those fields, which for most classes compares references rather than contents.
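The following sketch pulls the points about value types together. Point2D is a hypothetical struct invented for this illustration; it deliberately defines neither Equals() nor ==:

```csharp
using System;

// Hypothetical struct used only for this illustration.
struct Point2D
{
    public int X;
    public int Y;

    public Point2D(int x, int y)
    {
        X = x;
        Y = y;
    }
}

static class ValueEqualityDemo
{
    static void Main()
    {
        Point2D a = new Point2D(1, 2);
        Point2D b = new Point2D(1, 2);

        // The override inherited from System.ValueType compares the fields.
        Console.WriteLine(a.Equals(b));                    // True

        // Each argument is boxed separately, so the references always differ.
        Console.WriteLine(object.ReferenceEquals(a, a));   // False

        // Console.WriteLine(a == b);   // would not compile: Point2D has no == overload
    }
}
```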

OPERATOR OVERLOADING

This section looks at another type of member that you can define for a class or a struct: the operator overload. Operator overloading is something that will be familiar to C++ developers. However, because the concept is new to both Java and Visual Basic developers, we explain it here. C++ developers will probably prefer to skip ahead to the main operator overloading example.

The point of operator overloading is that you do not always just want to call methods or properties on objects. Often, you need to do things like add quantities together, multiply them, or perform logical operations such as comparing objects. Suppose you defined a class that represents a mathematical matrix. In the world of math, matrices can be added together and multiplied, just like numbers. Therefore, it is quite plausible that you would want to write code like this:

Matrix a, b, c;
// assume a, b and c have been initialized
Matrix d = c * (a + b);

By overloading the operators, you can tell the compiler what + and * do when used in conjunction with a Matrix object, enabling you to write code like the preceding. If you were coding in a language that did not support operator overloading, you would have to define methods to perform those operations. The result would certainly be less intuitive and would probably look something like this:

Matrix d = c.Multiply(a.Add(b));

With what you have learned so far, operators like + and * have been strictly for use with the predefined data types, and for good reason: The compiler knows what all the common operators mean for those data types. For example, it knows how to add two longs or how to divide one double by another double, and it can generate the appropriate intermediate language code. When you define your own classes or structs, however, you have to tell the compiler everything: what methods are available to call, what fields to store with each instance, and so on. Similarly, if you want to use operators with your own types, you have to tell the compiler what the relevant operators mean in the context of that class. You do that by defining overloads for the operators.

The other thing to stress is that overloading is not just concerned with arithmetic operators. You also need to consider the comparison operators, ==, <, >, !=, >=, and <=. Take the statement if (a==b). For classes, this statement will, by default, compare the references a and b. It tests whether the references point to the same location in memory, rather than checking whether the instances actually contain the same data. For the string class, this behavior is overridden so that comparing strings really does compare the contents of each string. You might want to do the same for your own classes. For structs, the == operator does not do anything at all by default. Trying to compare two structs to determine whether they are equal produces a compilation error unless you explicitly overload == to tell the compiler how to perform the comparison.

In many situations, being able to overload operators enables you to generate more readable and intuitive code, including the following:

  • Almost any mathematical object such as coordinates, vectors, matrices, tensors, functions, and so on. If you are writing a program that does some mathematical or physical modeling, you will almost certainly use classes representing these objects.
  • Graphics programs that use mathematical or coordinate-related objects when calculating positions on-screen.
  • A class that represents an amount of money (for example, in a financial program).
  • A word processing or text analysis program that uses classes representing sentences, clauses, and so on. You might want to use operators to combine sentences (a more sophisticated version of concatenation for strings).

However, there are also many types for which operator overloading is not relevant. Using operator overloading inappropriately will make any code that uses your types far more difficult to understand. For example, multiplying two DateTime objects does not make any sense conceptually.

How Operators Work

To understand how to overload operators, it’s quite useful to think about what happens when the compiler encounters an operator. Using the addition operator (+) as an example, suppose that the compiler processes the following lines of code:

int myInteger = 3;
uint myUnsignedInt = 2;
double myDouble = 4.0;
long myLong = myInteger + myUnsignedInt;
double myOtherDouble = myDouble + myInteger;

Now consider what happens when the compiler encounters this line:

long myLong = myInteger + myUnsignedInt;

The compiler identifies that it needs to add an int and a uint and assign the result to a long. However, the expression myInteger + myUnsignedInt is really just an intuitive and convenient syntax for calling a method that adds two numbers. The method takes two parameters, myInteger and myUnsignedInt, and returns their sum. Therefore, the compiler does the same thing it does for any method call: It looks for the best matching overload of the addition operator based on the parameter types. There is no predefined overload that takes an int and a uint, and neither type can hold every value of the other, so the compiler implicitly converts both operands to long and selects the overload that takes two long parameters and returns a long. As with normal overloaded methods, the desired return type does not influence the compiler’s choice as to which version of a method it calls; it is the operand types, not the type of myLong, that cause the long overload to be chosen here.

The next line causes the compiler to use a different overload of the addition operator:

double myOtherDouble = myDouble + myInteger;

In this instance, the parameters are a double and an int, but there is no overload of the addition operator that takes this combination of parameters. Instead, the compiler identifies the best matching overload of the addition operator as being the version that takes two doubles as its parameters, and it implicitly casts the int to a double. Adding two doubles requires a different process from adding two integers. Floating-point numbers are stored as a mantissa and an exponent. Adding them involves bit-shifting the mantissa of one of the doubles so that the two exponents have the same value, adding the mantissas, then shifting the mantissa of the result and adjusting its exponent to maintain the highest possible accuracy in the answer.
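One way to see which predefined overload the compiler picked is to let var infer the result type and then print it. The sketch below reuses the variable names from the listing above; the class name and the sum1 and sum2 variables are made up. Under C#'s numeric promotion rules, an int and a uint operand are both converted to long, and an int paired with a double is converted to double:

```csharp
using System;

static class PromotionDemo
{
    static void Main()
    {
        int myInteger = 3;
        uint myUnsignedInt = 2;
        double myDouble = 4.0;

        // var takes the return type of whichever + overload the compiler selects.
        var sum1 = myInteger + myUnsignedInt;
        var sum2 = myDouble + myInteger;

        Console.WriteLine(sum1.GetType());   // System.Int64
        Console.WriteLine(sum2.GetType());   // System.Double
    }
}
```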

Now you are in a position to see what happens if the compiler finds something like this:

Vector vect1, vect2, vect3;
// initialize vect1 and vect2
vect3 = vect1 + vect2;
vect1 = vect1*2;

Here, Vector is a struct that is defined in the following section. The compiler sees that it needs to add two Vector instances, vect1 and vect2, together. It looks for an overload of the addition operator that takes two Vector instances as its parameters.

If the compiler finds an appropriate overload, it calls up the implementation of that operator. If it cannot find one, it checks whether there is any other overload for + that it can use as a best match — perhaps something with two parameters of other data types that can be implicitly converted to Vector instances. If the compiler cannot find a suitable overload, it raises a compilation error, just as it would if it could not find an appropriate overload for any other method call.

Operator Overloading Example: The Vector Struct

This section demonstrates operator overloading through developing a struct named Vector that represents a three-dimensional mathematical vector. Don’t worry if mathematics is not your strong point; the vector example is very simple. As far as you are concerned here, a 3D vector is just a set of three numbers (doubles) that tell you how far something is moving. The variables representing the numbers are called x, y, and z: the x tells you how far something moves east, y tells you how far it moves north, and z tells you how far it moves upward (in height). Combine the three numbers and you get the total movement. For example, if x=3.0, y=3.0, and z=1.0 (which you would normally write as (3.0, 3.0, 1.0)), you’re moving 3 units east, 3 units north, and rising upward by 1 unit.

You can add or multiply vectors by other vectors or by numbers. Incidentally, in this context, we use the term scalar, which is math-speak for a simple number — in C# terms that is just a double. The significance of addition should be clear. If you move first by the vector (3.0, 3.0, 1.0) then you move by the vector (2.0, -4.0, -4.0), the total amount you have moved can be determined by adding the two vectors. Adding vectors means adding each component individually, so you get (5.0, -1.0, -3.0). In this context, mathematicians write c=a+b, where a and b are the vectors and c is the resulting vector. You want to be able to use the Vector struct the same way.


NOTE The fact that this example is developed as a struct rather than a class is not significant. Operator overloading works in the same way for both structs and classes.

Following is the definition for Vector, containing the member fields, constructors, a ToString() override so you can easily view the contents of a Vector, and, finally, the operator overload:

namespace Wrox.ProCSharp.OOCSharp
{
   struct Vector
   {
      public double x, y, z;
      
      public Vector(double x, double y, double z)
      {
         this.x = x;
         this.y = y;
         this.z = z;
      }
      
      public Vector(Vector rhs)
      {
         x = rhs.x;
         y = rhs.y;
         z = rhs.z;
      }
      
      public override string ToString()
      {
         return "( " + x + ", " + y + ", " + z + " )";
      }

This example has two constructors that require specifying the initial value of the vector, either by passing in the values of each component or by supplying another Vector whose value can be copied. Constructors like the second one, which takes a single Vector argument, are often termed copy constructors because they effectively enable you to initialize a class or struct instance by copying another instance. Note that to keep things simple, the fields are left as public. We could have made them private and written corresponding properties to access them, but it would not make any difference to the example, other than to make the code longer.

Here is the interesting part of the Vector struct — the operator overload that provides support for the addition operator:

      public static Vector operator + (Vector lhs, Vector rhs)
      {
         Vector result = new Vector(lhs);
         result.x += rhs.x;
         result.y += rhs.y;
         result.z += rhs.z;
      
         return result;
      }
   }
}

The operator overload is declared in much the same way as a method, except that the operator keyword tells the compiler it is actually an operator overload you are defining. The operator keyword is followed by the actual symbol for the relevant operator, in this case the addition operator (+). The return type is whatever type you get when you use this operator. Adding two vectors results in a vector; therefore, the return type is also a Vector. For this particular overload of the addition operator, the return type is the same as the containing struct, but that is not necessarily the case, as you will see later in this example. The two parameters are the things you are operating on. For binary operators (those that take two parameters), such as the addition and subtraction operators, the first parameter is the value on the left of the operator, and the second parameter is the value on the right.


NOTE It is conventional to name your left-hand parameters lhs (for left-hand side) and your right-hand parameters rhs (for right-hand side).

C# requires that all operator overloads be declared as public and static, which means they are associated with their class or struct, not with a particular instance. Because of this, the body of the operator overload has no access to non-static class members or the this identifier. This is fine because the parameters provide all the input data the operator needs to know to perform its task.

Now that you understand the syntax for the addition operator declaration, examine what happens inside the operator:

      {
         Vector result = new Vector(lhs);
         result.x += rhs.x;
         result.y += rhs.y;
         result.z += rhs.z;
      
         return result;
      }

This part of the code is exactly the same as if you were declaring a method, and you should easily be able to convince yourself that this will return a vector containing the sum of lhs and rhs as defined. You simply add the members x, y, and z together individually.

Now all you need to do is write some simple code to test the Vector struct:

      static void Main()
      {
         Vector vect1, vect2, vect3;
      
         vect1 = new Vector(3.0, 3.0, 1.0);
         vect2 = new Vector(2.0, -4.0, -4.0);
         vect3 = vect1 + vect2;
      
         Console.WriteLine("vect1 = " + vect1.ToString());
         Console.WriteLine("vect2 = " + vect2.ToString());
         Console.WriteLine("vect3 = " + vect3.ToString());
      }

Saving this code as Vectors.cs and compiling and running it returns this result:

vect1 = ( 3, 3, 1 )
vect2 = ( 2, -4, -4 )
vect3 = ( 5, -1, -3 )

Adding More Overloads

In addition to adding vectors, you can multiply and subtract them and compare their values. In this section, you develop the Vector example further by adding a few more operator overloads. You won’t develop the complete set that you’d probably need for a fully functional Vector type, but just enough to demonstrate some other aspects of operator overloading. First, you’ll overload the multiplication operator to support multiplying vectors by a scalar and multiplying vectors by another vector.

Multiplying a vector by a scalar simply means multiplying each component individually by the scalar: for example, 2 * (1.0, 2.5, 2.0) returns (2.0, 5.0, 4.0). The relevant operator overload looks similar to this:

public static Vector operator * (double lhs, Vector rhs)
{
   return new Vector(lhs * rhs.x, lhs * rhs.y, lhs * rhs.z);
}

This by itself, however, is not sufficient. If a and b are declared as type Vector, you can write code like this:

b = 2 * a;

The compiler will implicitly convert the integer 2 to a double to match the operator overload signature. However, code like the following will not compile:

b = a * 2;

The point is that the compiler treats operator overloads exactly like method overloads. It examines all the available overloads of a given operator to find the best match. The preceding statement requires the first parameter to be a Vector and the second parameter to be an integer, or something to which an integer can be implicitly converted. You have not provided such an overload. The compiler cannot start swapping the order of parameters, so the fact that you’ve provided an overload that takes a double followed by a Vector is not sufficient. You need to explicitly define an overload that takes a Vector followed by a double as well. There are two possible ways of implementing this. The first way involves breaking down the vector multiplication operation in the same way that you have done for all operators so far:

public static Vector operator * (Vector lhs, double rhs)
{
   return new Vector(rhs * lhs.x, rhs * lhs.y, rhs * lhs.z);
}

Given that you have already written code to implement essentially the same operation, however, you might prefer to reuse that code by writing the following:

public static Vector operator * (Vector lhs, double rhs)
{
   return rhs * lhs;
}

This code works by effectively telling the compiler that when it sees a multiplication of a Vector by a double, it can simply reverse the parameters and call the other operator overload. The sample code for this chapter uses the second version, because it looks neater and illustrates the idea in action. This version also makes the code more maintainable because it saves duplicating the code to perform the multiplication in two separate overloads.

Next, you need to overload the multiplication operator to support vector multiplication. Mathematics provides a couple of ways to multiply vectors, but the one we are interested in here is known as the dot product or inner product, which actually returns a scalar as a result. That’s the reason for this example, to demonstrate that arithmetic operators don’t have to return the same type as the class in which they are defined.

In mathematical terms, if you have two vectors (x, y, z) and (X, Y, Z), then the inner product is defined to be the value of x*X + y*Y + z*Z. That might look like a strange way to multiply two things together, but it is actually very useful because it can be used to calculate various other quantities. If you ever write code that displays complex 3D graphics, such as using Direct3D or DirectDraw, you will almost certainly find that your code needs to work out inner products of vectors quite often as an intermediate step in calculating where to place objects on the screen. What concerns us here is that we want users of your Vector to be able to write double X = a*b to calculate the inner product of two Vector objects (a and b). The relevant overload looks like this:

public static double operator * (Vector lhs, Vector rhs)
{
   return lhs.x * rhs.x + lhs.y * rhs.y + lhs.z * rhs.z;
}

Now that you understand the arithmetic operators, you can confirm that they work using a simple test method:

static void Main()
{
   // stuff to demonstrate arithmetic operations
   Vector vect1, vect2, vect3;
   vect1 = new Vector(1.0, 1.5, 2.0);
   vect2 = new Vector(0.0, 0.0, -10.0);
      
   vect3 = vect1 + vect2;
      
   Console.WriteLine("vect1 = " + vect1);
   Console.WriteLine("vect2 = " + vect2);
   Console.WriteLine("vect3 = vect1 + vect2 = " + vect3);
   Console.WriteLine("2*vect3 = " + 2*vect3);
   vect3 += vect2;
      
   Console.WriteLine("vect3+=vect2 gives " + vect3);
      
   vect3 = vect1*2;
      
   Console.WriteLine("Setting vect3=vect1*2 gives " + vect3);
      
   double dot = vect1*vect3;
      
   Console.WriteLine("vect1*vect3 = " + dot);
}

Running this code (Vectors2.cs) produces the following result:

VECTORS2
  vect1 = ( 1, 1.5, 2 )
  vect2 = ( 0, 0, -10 )
  vect3 = vect1 + vect2 = ( 1, 1.5, -8 )
  2*vect3 = ( 2, 3, -16 )
  vect3+=vect2 gives ( 1, 1.5, -18 )
  Setting vect3=vect1*2 gives ( 2, 3, 4 )
  vect1*vect3 = 14.5

This shows that the operator overloads have given the correct results; but if you look at the test code closely, you might be surprised to notice that it actually used an operator that wasn’t overloaded — the addition assignment operator, +=:

   vect3 += vect2;
      
   Console.WriteLine("vect3+=vect2 gives " + vect3);

Although += normally counts as a single operator, it can be broken down into two steps: the addition and the assignment. Unlike the C++ language, C# does not allow you to overload the = operator; but if you overload +, the compiler will automatically use your overload of + to work out how to perform a += operation. The same principle works for all the assignment operators, such as -=, *=, /=, &=, and so on.
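The same effect can be reproduced with other operators. The sketch below uses a hypothetical Meters struct (not part of the chapter's samples) that overloads only - and *, yet supports -= and *= as a result:

```csharp
using System;

// Hypothetical struct used only for this illustration.
struct Meters
{
    public double Value;

    public Meters(double value)
    {
        Value = value;
    }

    public static Meters operator -(Meters lhs, Meters rhs)
    {
        return new Meters(lhs.Value - rhs.Value);
    }

    public static Meters operator *(Meters lhs, double rhs)
    {
        return new Meters(lhs.Value * rhs);
    }
}

static class CompoundAssignmentDemo
{
    static void Main()
    {
        Meters distance = new Meters(10.0);

        // Treated as: distance = distance - new Meters(4.0);
        distance -= new Meters(4.0);

        // Treated as: distance = distance * 2; the int 2 is implicitly converted to double.
        distance *= 2;

        Console.WriteLine(distance.Value);   // 12
    }
}
```

No -= or *= member appears anywhere in Meters; the compiler derives both from the binary overloads.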

Overloading the Comparison Operators

As shown earlier in the section “Operators,” C# has six comparison operators, and they are paired as follows:

  • == and !=
  • > and <
  • >= and <=

The C# language requires that you overload these operators in pairs. That is, if you overload ==, you must overload != too; otherwise, you get a compiler error. In addition, the comparison operators must return a bool. This is the fundamental difference between these operators and the arithmetic operators. The result of adding or subtracting two quantities, for example, can theoretically be any type depending on the quantities. You have already seen that multiplying two Vector objects can be implemented to give a scalar. Another example involves the .NET base class System.DateTime. It’s possible to subtract two DateTime instances, but the result is not a DateTime; instead it is a System.TimeSpan instance. By contrast, it doesn’t really make much sense for a comparison to return anything other than a bool.
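The DateTime case mentioned above is easy to try out. In this small sketch the dates and variable names are arbitrary choices:

```csharp
using System;

static class DateSubtractionDemo
{
    static void Main()
    {
        DateTime start = new DateTime(2012, 1, 1);
        DateTime end = new DateTime(2012, 1, 31);

        // Subtracting two DateTime values yields a TimeSpan, not a DateTime.
        TimeSpan elapsed = end - start;
        Console.WriteLine(elapsed.TotalDays);   // 30

        // A comparison of the same two values yields a bool.
        bool isLater = end > start;
        Console.WriteLine(isLater);             // True
    }
}
```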


NOTE If you overload == and !=, you should also override the Equals() and GetHashCode() methods inherited from System.Object; otherwise, you’ll get compiler warnings. The reasoning is that the Equals() method should implement the same kind of equality logic as the == operator.

Apart from these differences, overloading the comparison operators follows the same principles as overloading the arithmetic operators. However, comparing quantities isn’t always as simple as you might think. For example, if you simply compare two object references, you will compare the memory address where the objects are stored. This is rarely the desired behavior of a comparison operator, so you must code the operator to compare the value of the objects and return the appropriate Boolean response. The following example overloads the == and != operators for the Vector struct. Here is the implementation of ==:

public static bool operator == (Vector lhs, Vector rhs)
{
   // Two vectors are equal when all three components match.
   return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z;
}

This approach simply compares two Vector objects for equality based on the values of their components. For most structs, that is probably what you will want to do, though in some cases you may need to think carefully about what you mean by equality. For example, if there are embedded classes, should you simply compare whether the references point to the same object (shallow comparison) or whether the values of the objects are the same (deep comparison)?

A shallow comparison checks only whether two references point to the same object in memory, whereas a deep comparison examines the values and properties of the objects that the references point to. Which of the two you implement depends on what equality should mean for the type in question.


NOTE Don’t be tempted to overload the comparison operator by calling the instance version of the Equals() method inherited from System.Object. If you do and then an attempt is made to evaluate (objA == objB), when objA happens to be null, you will get an exception, as the .NET runtime tries to evaluate null.Equals(objB). Working the other way around (overriding Equals() to call the comparison operator) should be safe.
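For a class, one common way to keep == safe when either operand is null is to test the references with ReferenceEquals() before touching any members, and then let Equals() call the operator rather than the other way around. The following is a sketch only; the Label class and its Text field are hypothetical and exist purely to illustrate the pattern:

```csharp
using System;

// Hypothetical class used only for this illustration.
class Label
{
    public string Text;

    public Label(string text)
    {
        Text = text;
    }

    public static bool operator ==(Label lhs, Label rhs)
    {
        // Same object, or both null.
        if (object.ReferenceEquals(lhs, rhs)) return true;

        // Exactly one side is null.
        if (object.ReferenceEquals(lhs, null) || object.ReferenceEquals(rhs, null)) return false;

        return lhs.Text == rhs.Text;
    }

    public static bool operator !=(Label lhs, Label rhs)
    {
        return !(lhs == rhs);
    }

    // Equals() delegates to ==, which is the safe direction.
    public override bool Equals(object obj)
    {
        return this == (obj as Label);
    }

    public override int GetHashCode()
    {
        return Text == null ? 0 : Text.GetHashCode();
    }
}

static class NullSafeEqualityDemo
{
    static void Main()
    {
        Label a = null;
        Label b = new Label("x");
        Label c = new Label("x");

        Console.WriteLine(a == b);      // False, and no exception
        Console.WriteLine(a == null);   // True
        Console.WriteLine(b == c);      // True
    }
}
```

Using ReferenceEquals() inside the operator matters: writing lhs == null there would call the overloaded operator again and recurse.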

You also need to overload the != operator. Here is the simple way to do this:

public static bool operator != (Vector lhs, Vector rhs)
{
   return ! (lhs == rhs);
}

As usual, you should quickly confirm that your overload works with some test code. This time you’ll define three Vector objects and compare them:

static void Main()
{
   Vector vect1, vect2, vect3;
      
   vect1 = new Vector(3.0, 3.0, -10.0);
   vect2 = new Vector(3.0, 3.0, -10.0);
   vect3 = new Vector(2.0, 3.0, 6.0);
      
   Console.WriteLine("vect1==vect2 returns  " + (vect1==vect2));
   Console.WriteLine("vect1==vect3 returns  " + (vect1==vect3));
   Console.WriteLine("vect2==vect3 returns  " + (vect2==vect3));
      
   Console.WriteLine();
      
   Console.WriteLine("vect1!=vect2 returns  " + (vect1!=vect2));
   Console.WriteLine("vect1!=vect3 returns  " + (vect1!=vect3));
   Console.WriteLine("vect2!=vect3 returns  " + (vect2!=vect3));
}

Compiling this code (the Vectors3.cs sample in the code download) generates the following compiler warnings because you haven’t overridden Equals() and GetHashCode() for your Vector. For our purposes here, that doesn’t matter, so we will ignore them:

Microsoft (R) Visual C# 2010 Compiler version 4.0.21006.1
for Microsoft (R) .NET Framework version 4.0
Copyright (C) Microsoft Corporation. All rights reserved.
      
      
Vectors3.cs(5,11): warning CS0660: 'Wrox.ProCSharp.OOCSharp.Vector' defines
        operator == or operator != but does not override Object.Equals(object o)
Vectors3.cs(5,11): warning CS0661: 'Wrox.ProCSharp.OOCSharp.Vector' defines
        operator == or operator != but does not override Object.GetHashCode()

Running the example produces these results at the command line:

VECTORS3
  vect1==vect2 returns  True
  vect1==vect3 returns  False
  vect2==vect3 returns  False
    
  vect1!=vect2 returns  False
  vect1!=vect3 returns  True
  vect2!=vect3 returns  True
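If you did want to remove warnings CS0660 and CS0661, you would override Equals() and GetHashCode() so that they agree with ==. The following sketch shows one possible shape. Vector3 is a stand-in name chosen so that the example is self-contained, and the multiplier 397 in the hash calculation is an arbitrary choice for this illustration, not something taken from the chapter's code:

```csharp
using System;

// Stand-in for the chapter's Vector struct, renamed to keep this sketch self-contained.
struct Vector3
{
    public double x, y, z;

    public Vector3(double x, double y, double z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    public static bool operator ==(Vector3 lhs, Vector3 rhs)
    {
        return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z;
    }

    public static bool operator !=(Vector3 lhs, Vector3 rhs)
    {
        return !(lhs == rhs);
    }

    // Equals() reuses ==, so both always give the same answer.
    public override bool Equals(object obj)
    {
        return obj is Vector3 && this == (Vector3)obj;
    }

    // Equal vectors must produce equal hash codes, so the hash is built
    // from the same three fields that == compares.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = x.GetHashCode();
            hash = (hash * 397) ^ y.GetHashCode();
            hash = (hash * 397) ^ z.GetHashCode();
            return hash;
        }
    }
}

static class EqualsOverrideDemo
{
    static void Main()
    {
        Vector3 v1 = new Vector3(3.0, 3.0, -10.0);
        Vector3 v2 = new Vector3(3.0, 3.0, -10.0);

        Console.WriteLine(v1 == v2);                              // True
        Console.WriteLine(v1.Equals(v2));                         // True
        Console.WriteLine(v1.GetHashCode() == v2.GetHashCode());  // True
    }
}
```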

Which Operators Can You Overload?

It is not possible to overload all the available operators. The operators that you can overload are listed in the following table:

CATEGORY            OPERATORS                                   RESTRICTIONS
Arithmetic binary   +, *, /, -, %                               None
Arithmetic unary    +, -, ++, --                                None
Bitwise binary      &, |, ^, <<, >>                             None
Bitwise unary       !, ~, true, false                           The true and false operators must be overloaded as a pair.
Comparison          ==, !=, >=, <=, >, <                        Comparison operators must be overloaded in pairs.
Assignment          +=, -=, *=, /=, >>=, <<=, %=, &=, |=, ^=    You cannot explicitly overload these operators; they are overloaded implicitly when you overload the individual operators such as +, -, %, and so on.
Index               []                                          You cannot overload the index operator directly. The indexer member type, discussed in Chapter 2, allows you to support the index operator on your classes and structs.
Cast                ()                                          You cannot overload the cast operator directly. User-defined casts (discussed next) allow you to define custom cast behavior.
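As an illustration of the Index row, the sketch below supports the [] syntax through an indexer rather than through an operator overload. The Triple struct is hypothetical and is not part of the chapter's samples:

```csharp
using System;

// Hypothetical struct used only for this illustration.
struct Triple
{
    private readonly double x, y, z;

    public Triple(double x, double y, double z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // An indexer, not an operator overload, is what makes t[0], t[1], and t[2] legal.
    public double this[int index]
    {
        get
        {
            switch (index)
            {
                case 0: return x;
                case 1: return y;
                case 2: return z;
                default: throw new IndexOutOfRangeException("index must be 0, 1, or 2");
            }
        }
    }
}

static class IndexerDemo
{
    static void Main()
    {
        Triple t = new Triple(1.0, 1.5, 2.0);
        Console.WriteLine(t[0] + t[1] + t[2]);   // 4.5
    }
}
```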

USER-DEFINED CASTS

Earlier in this chapter (see the “Explicit Conversions” section), you learned that you can convert values between predefined data types through a process of casting. You also saw that C# allows two different types of casts: implicit and explicit. This section looks at these types of casts.

For an explicit cast, you explicitly mark the cast in your code by including the destination data type inside parentheses:

   int i = 3;
   long l = i;             // implicit
   short s = (short)i;     // explicit

For the predefined data types, explicit casts are required where there is a risk that the cast might fail or some data might be lost. The following are some examples:

  • When converting from an int to a short, the short might not be large enough to hold the value of the int.
  • When converting from signed to unsigned data types, incorrect results are returned if the signed variable holds a negative value.
  • When converting from floating-point to integer data types, the fractional part of the number will be lost.
  • When converting from a nullable type to a non-nullable type, a value of null causes an exception.

By making the cast explicit in your code, C# forces you to affirm that you understand there is a risk of data loss, and therefore presumably you have written your code to take this into account.
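Each of the risks in the preceding list can be reproduced in a few lines. This is a sketch with arbitrary values; unchecked and checked are used explicitly so that the behavior does not depend on compiler settings:

```csharp
using System;

static class ExplicitCastRisksDemo
{
    static void Main()
    {
        // int to short: the high-order bits are discarded.
        int big = 70000;
        short s = unchecked((short)big);
        Console.WriteLine(s);             // 4464

        // Signed to unsigned: a negative value becomes a large positive one.
        int negative = -1;
        uint u = unchecked((uint)negative);
        Console.WriteLine(u);             // 4294967295

        // Floating-point to integer: the fractional part is lost.
        double d = 3.99;
        int truncated = (int)d;
        Console.WriteLine(truncated);     // 3

        // Nullable to non-nullable: null causes an exception.
        int? nothing = null;
        try
        {
            int n = (int)nothing;
            Console.WriteLine(n);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("null has no value");
        }

        // In a checked context, the overflow is reported instead of being ignored.
        try
        {
            short t = checked((short)big);
            Console.WriteLine(t);
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```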

Because C# allows you to define your own data types (structs and classes), it follows that you need the facility to support casts to and from those data types. The mechanism is to define a cast as a member operator of one of the relevant classes. Your cast operator must be marked as either implicit or explicit to indicate how you are intending it to be used. The expectation is that you follow the same guidelines as for the predefined casts: if you know that the cast is always safe regardless of the value held by the source variable, then you define it as implicit. Conversely, if you know there is a risk of something going wrong for certain values — perhaps some loss of data or an exception being thrown — then you should define the cast as explicit.


NOTE You should define any custom casts you write as explicit if there are any source data values for which the cast will fail or if there is any risk of an exception being thrown.

The syntax for defining a cast is similar to that for overloading operators discussed earlier in this chapter. This is not a coincidence — a cast is regarded as an operator whose effect is to convert from the source type to the destination type. To illustrate the syntax, the following is taken from an example struct named Currency, which is introduced later in this section:

public static implicit operator float (Currency value)
{
   // processing
}

The return type of the operator defines the target type of the cast operation, and the single parameter is the source object for the conversion. The cast defined here allows you to implicitly convert the value of a Currency into a float. Note that if a conversion has been declared as implicit, the compiler permits its use either implicitly or explicitly. If it has been declared as explicit, the compiler only permits it to be used explicitly. In common with other operator overloads, casts must be declared as both public and static.


NOTE C++ developers will notice that this is different from C++, in which casts are instance members of classes.
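To see the two kinds of user-defined cast side by side before moving on to the Currency example, consider the following sketch. The Percentage struct, with its range of 0 to 100, is hypothetical and exists only for this illustration:

```csharp
using System;

// Hypothetical struct holding a whole number from 0 to 100.
struct Percentage
{
    private readonly byte value;

    public Percentage(byte value)
    {
        if (value > 100) throw new ArgumentOutOfRangeException("value");
        this.value = value;
    }

    // Always safe: every Percentage fits in an int, so the cast is implicit.
    public static implicit operator int(Percentage p)
    {
        return p.value;
    }

    // Can fail for values outside 0 to 100, so the cast is explicit.
    public static explicit operator Percentage(int i)
    {
        if (i < 0 || i > 100) throw new ArgumentOutOfRangeException("i");
        return new Percentage((byte)i);
    }
}

static class CastKindsDemo
{
    static void Main()
    {
        Percentage p = (Percentage)42;   // explicit cast: the parentheses are required
        int a = p;                       // implicit cast used implicitly
        int b = (int)p;                  // implicit cast used explicitly, also allowed

        Console.WriteLine(a);            // 42
        Console.WriteLine(b);            // 42

        // Percentage q = 42;            // would not compile: explicit cast used implicitly
    }
}
```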

Implementing User-Defined Casts

This section illustrates the use of implicit and explicit user-defined casts in an example called SimpleCurrency (which, as usual, is available in the code download). In this example, you define a struct, Currency, which holds a non-negative USD ($) monetary value. C# provides the decimal type for this purpose, but it is possible you will still want to write your own struct or class to represent monetary values if you need to perform sophisticated financial processing and therefore want to implement specific methods on such a class.


NOTE The syntax for casting is the same for structs and classes. This example happens to be for a struct, but it would work just as well if you declared Currency as a class.

Initially, the definition of the Currency struct is as follows:

   struct Currency
   {
      public uint Dollars;
      public ushort Cents;
      
      public Currency(uint dollars, ushort cents)
      {
         this.Dollars = dollars;
         this.Cents = cents;
      }
      
      public override string ToString()
      {
         return string.Format("${0}.{1,-2:00}", Dollars,Cents);
      }
   }

The use of unsigned data types for the Dollars and Cents fields ensures that a Currency instance can hold only non-negative values. It is restricted this way to illustrate some points about explicit casts later. You might want to use a struct like this to hold, for example, salary information for company employees (people’s salaries tend not to be negative!). To keep the struct simple, the fields are public, but usually you would make them private and define corresponding properties for the dollars and cents.

Start by assuming that you want to be able to convert Currency instances to float values, where the integer part of the float represents the dollars. In other words, you want to be able to write code like this:

   Currency balance = new Currency(10,50);
   float f = balance; // We want f to be set to 10.5

To be able to do this, you need to define a cast. Hence, you add the following to your Currency definition:

      public static implicit operator float (Currency value)
      {
         return value.Dollars + (value.Cents/100.0f);
      }

The preceding cast is implicit. It is a sensible choice in this case because, as it should be clear from the definition of Currency, any value that can be stored in the currency can also be stored in a float. There is no way that anything should ever go wrong in this cast.


NOTE There is a slight cheat here: in fact, when converting a uint to a float, there can be a loss in precision, but Microsoft has deemed this error sufficiently marginal to count the uint-to-float cast as implicit.
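The precision loss mentioned in the note can be seen with a small experiment. The following is a minimal sketch, not part of the SimpleCurrency download; the PrecisionDemo class and RoundTrip method are hypothetical names introduced only for illustration. A float has 24 significant bits, so a uint that needs more than that cannot survive the round trip unchanged.

```csharp
using System;

public static class PrecisionDemo
{
    // Hypothetical helper: converts a uint to float implicitly and back explicitly.
    public static uint RoundTrip(uint value)
    {
        float asFloat = value;      // implicit conversion, accepted by the compiler
        return (uint)asFloat;       // explicit conversion back to uint
    }

    public static void Main()
    {
        uint exact = 16777216;      // 2^24, fits in a float's 24 significant bits
        uint inexact = 16777217;    // 2^24 + 1, needs 25 significant bits

        Console.WriteLine(RoundTrip(exact));    // 16777216
        Console.WriteLine(RoundTrip(inexact));  // 16777216 (the lowest bit is lost)
    }
}
```

For dollar amounts below roughly 16.7 million the conversion is exact, which is why the implicit cast is a reasonable choice for this example.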

However, if you have a float that you would like to be converted to a Currency, the conversion is not guaranteed to work. A float can store negative values, which Currency instances can’t, and a float can store numbers of a far higher magnitude than can be stored in the (uint) Dollars field of Currency. Therefore, if a float contains an inappropriate value, converting it to a Currency could give unpredictable results. Because of this risk, the conversion from float to Currency should be defined as explicit. Here is the first attempt, which will not return quite the correct results, but it is instructive to examine why:

      public static explicit operator Currency (float value)
      {
         uint dollars = (uint)value;
         ushort cents = (ushort)((value-dollars)*100);
         return new Currency(dollars, cents);
      }

The following code will now successfully compile:

   float amount = 45.63f;
   Currency amount2 = (Currency)amount;

However, the following code, if you tried it, would generate a compilation error, because it attempts to use an explicit cast implicitly:

   float amount = 45.63f;
   Currency amount2 = amount;   // wrong

By making the cast explicit, you warn the developer to be careful because data loss might occur. However, as you will soon see, this is not how you want your Currency struct to behave. Try writing a test harness and running the sample. Here is the Main() method, which instantiates a Currency struct and attempts a few conversions. At the start of this code, you write out the value of balance in three different ways (this will be needed to illustrate something later in the example):

static void Main()
{
   try
   {
      Currency balance = new Currency(50,35);
      
      Console.WriteLine(balance);
      Console.WriteLine("balance is " + balance);
      Console.WriteLine("balance is (using ToString()) " + balance.ToString());
      
      float balance2= balance;
      
      Console.WriteLine("After converting to float, = " + balance2);
      
      balance = (Currency) balance2;
      
      Console.WriteLine("After converting back to Currency, = " + balance);
      Console.WriteLine("Now attempt to convert out of range value of " +
                        "-$50.50 to a Currency:");
      
      checked
      {
         balance = (Currency) (-50.50);
         Console.WriteLine("Result is " + balance.ToString());
      }
   }
   catch(Exception e)
   {
      Console.WriteLine("Exception occurred: " + e.Message);
   }
}

Notice that the entire code is placed in a try block to catch any exceptions that occur during your casts. In addition, the lines that test converting an out-of-range value to Currency are placed in a checked block in an attempt to trap negative values. Running this code produces the following output:

SIMPLECURRENCY
  50.35
  balance is $50.35
  balance is (using ToString()) $50.35
  After converting to float, = 50.35
  After converting back to Currency, = $50.34
  Now attempt to convert out of range value of -$50.50 to a Currency:
  Result is $4294967246.00

This output shows that the code did not quite work as expected. First, converting back from float to Currency gave a wrong result of $50.34 instead of $50.35. Second, no exception was generated when you tried to convert an obviously out-of-range value.

The first problem is caused by rounding errors. If a cast is used to convert from a float to a uint, the computer will truncate the number rather than round it. The computer stores numbers in binary rather than decimal, and the fraction 0.35 cannot be exactly represented as a binary fraction (just as 1/3 cannot be represented exactly as a decimal fraction; it comes out as 0.3333 recurring). The computer ends up storing a value very slightly lower than 0.35 that can be represented exactly in binary format. Multiply by 100 and you get a number fractionally less than 35, which is truncated to 34 cents. Clearly, in this situation, such errors caused by truncation are serious, and the way to avoid them is to ensure that some intelligent rounding is performed in numerical conversions instead.

Luckily, Microsoft has written a class that does this: System.Convert. The System.Convert class contains a large number of static methods to perform various numerical conversions, and the one that we want is Convert.ToUInt16(). Note that the extra care taken by the System.Convert methods does come at a performance cost. You should use them only when necessary.
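The difference between a truncating cast and Convert.ToUInt16() can be checked in isolation. The sketch below uses a hypothetical RoundingDemo class (not part of the chapter's downloads) and assumes the default unchecked compiler setting for the plain cast.

```csharp
using System;

public static class RoundingDemo
{
    // Plain cast: discards the fractional part.
    public static ushort Truncate(float value)
    {
        return (ushort)value;
    }

    // System.Convert: rounds to the nearest integer and checks the range.
    public static ushort Round(float value)
    {
        return Convert.ToUInt16(value);
    }

    public static void Main()
    {
        float almost35 = 34.99f;

        Console.WriteLine(Truncate(almost35));  // 34
        Console.WriteLine(Round(almost35));     // 35

        // Values exactly halfway between two integers go to the even neighbor.
        Console.WriteLine(Round(2.5f));         // 2
        Console.WriteLine(Round(3.5f));         // 4

        try
        {
            Round(-1.0f);                       // outside the ushort range
        }
        catch (OverflowException)
        {
            Console.WriteLine("Convert.ToUInt16 rejected -1");
        }
    }
}
```

Note the round-half-to-even behavior: it is worth knowing about when amounts can land exactly on a half cent.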

Let’s examine the second problem — why the expected overflow exception wasn’t thrown. The issue here is this: The place where the overflow really occurs isn’t actually in the Main() routine at all — it is inside the code for the cast operator, which is called from the Main() method. The code in this method was not marked as checked.
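The scoping rule behind this can be shown with plain integer arithmetic. The following sketch uses hypothetical names (CheckedScopeDemo, IncrementDefault, IncrementChecked) and assumes the project is compiled with the default setting, in which arithmetic is unchecked unless marked otherwise. A checked block affects only the operations written textually inside it, not the bodies of methods it calls.

```csharp
using System;

public static class CheckedScopeDemo
{
    // No checked marker: the addition runs in the compiler's default context.
    public static int IncrementDefault(int value)
    {
        return value + 1;
    }

    // The addition itself sits inside a checked block.
    public static int IncrementChecked(int value)
    {
        checked
        {
            return value + 1;
        }
    }

    public static void Main()
    {
        checked
        {
            // The surrounding checked block does not reach into IncrementDefault,
            // so with default compiler settings this wraps around silently.
            Console.WriteLine(IncrementDefault(int.MaxValue));
        }

        try
        {
            IncrementChecked(int.MaxValue);
        }
        catch (OverflowException)
        {
            Console.WriteLine("Overflow detected inside IncrementChecked");
        }
    }
}
```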

The solution is to ensure that the cast itself is computed in a checked context too. With both this change and the fix for the first problem, the revised code for the conversion looks like the following:

      public static explicit operator Currency (float value)
      {
         checked
         {
            uint dollars = (uint)value;
            ushort cents = Convert.ToUInt16((value-dollars)*100);
            return new Currency(dollars, cents);
         }
      }

Note that you use Convert.ToUInt16() to calculate the cents, as described earlier, but you do not use it for calculating the dollar part of the amount. System.Convert is not needed when calculating the dollar amount because truncating the float value is what you want there.


NOTE The System.Convert methods also carry out their own overflow checking. Hence, for the particular case we are considering, there is no need to place the call to Convert.ToUInt16() inside the checked context. The checked context is still required, however, for the explicit casting of value to dollars.

You won’t see a new set of results with this new checked cast just yet because you have some more modifications to make to the SimpleCurrency example later in this section.
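In the meantime, the revised cast can be exercised on its own. The following is a minimal, self-contained sketch that assembles the Currency pieces shown so far; the CurrencyDemo class is a hypothetical harness, not the one from the code download.

```csharp
using System;

public struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        this.Dollars = dollars;
        this.Cents = cents;
    }

    public override string ToString()
    {
        return string.Format("${0}.{1,-2:00}", Dollars, Cents);
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    public static explicit operator Currency(float value)
    {
        checked
        {
            uint dollars = (uint)value;                                // truncation wanted
            ushort cents = Convert.ToUInt16((value - dollars) * 100);  // rounding wanted
            return new Currency(dollars, cents);
        }
    }
}

public static class CurrencyDemo
{
    public static void Main()
    {
        Currency balance = new Currency(50, 35);
        float asFloat = balance;
        Currency roundTrip = (Currency)asFloat;
        Console.WriteLine(roundTrip);           // $50.35

        try
        {
            Currency bad = (Currency)(-50.50f);
            Console.WriteLine(bad);
        }
        catch (OverflowException)
        {
            Console.WriteLine("Negative value rejected by the checked cast");
        }
    }
}
```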


NOTE If you are defining a cast that will be used very often, and for which performance is at an absolute premium, you may prefer not to do any error checking. That is also a legitimate solution, provided that the behavior of your cast and the lack of error checking are very clearly documented.

Casts Between Classes

The Currency example involves only classes that convert to or from float — one of the predefined data types. However, it is not necessary to involve any of the simple data types. It is perfectly legitimate to define casts to convert between instances of different structs or classes that you have defined. You need to be aware of a couple of restrictions, however:

  • You cannot define a cast if one of the classes is derived from the other (these types of casts already exist, as you will see).
  • The cast must be defined inside the definition of either the source or the destination data type.

To illustrate these requirements, suppose that you have the class hierarchy shown in Figure 7-1.

In other words, B derives from A, and C and D each derive from B, so classes C and D are indirectly derived from A. In this case, the only legitimate user-defined cast between A, B, C, or D would be to convert between classes C and D, because these classes are not derived from each other. The code to do so might look like the following (assuming you want the casts to be explicit, which is usually the case when defining casts between user-defined classes):

   public static explicit operator D(C value)
   {
      // and so on
   }
   public static explicit operator C(D value)
   {
      // and so on
   }

For each of these casts, you can choose where you place the definitions — inside the class definition of C or inside the class definition of D, but not anywhere else. C# requires you to put the definition of a cast inside either the source class (or struct) or the destination class (or struct). A side effect of this is that you cannot define a cast between two classes unless you have access to edit the source code for at least one of them. This is sensible because it prevents third parties from introducing casts into your classes.

After you have defined a cast inside one of the classes, you cannot also define the same cast inside the other class. Obviously, there should be only one cast for each conversion; otherwise, the compiler would not know which one to use.
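As an illustration of these placement rules, here is one possible arrangement. It is only a sketch: the hierarchy is assumed to be A, then B, then C and D as siblings, and the Value and Text members are hypothetical, added so that the casts have something to convert. Both operators are placed in C, which is legal because C is the source of one cast and the destination of the other.

```csharp
using System;

public class A { }
public class B : A { }

public class C : B
{
    public int Value;               // hypothetical payload

    public C(int value)
    {
        Value = value;
    }

    // C is the source type here.
    public static explicit operator D(C source)
    {
        return new D(source.Value.ToString());
    }

    // C is the destination type here; D must not define this cast again.
    public static explicit operator C(D source)
    {
        return new C(int.Parse(source.Text));
    }
}

public class D : B
{
    public string Text;             // hypothetical payload

    public D(string text)
    {
        Text = text;
    }
}

public static class SiblingCastDemo
{
    public static void Main()
    {
        C original = new C(42);
        D converted = (D)original;
        C restored = (C)converted;

        Console.WriteLine(converted.Text);   // 42
        Console.WriteLine(restored.Value);   // 42
    }
}
```

Unlike the built-in casts between base and derived classes, each of these casts creates a new object.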

Casts Between Base and Derived Classes

To see how these casts work, start by considering the case in which both the source and the destination are reference types, and consider two classes, MyBase and MyDerived, where MyDerived is derived directly or indirectly from MyBase.

First, from MyDerived to MyBase, it is always possible (assuming the constructors are available) to write this:

MyDerived derivedObject = new MyDerived();
MyBase baseCopy = derivedObject;

Here, you are casting implicitly from MyDerived to MyBase. This works because of the rule that any reference to a type MyBase is allowed to refer to objects of class MyBase or anything derived from MyBase. In OO programming, instances of a derived class are, in a real sense, instances of the base class, plus something extra. All the functions and fields defined on the base class are defined in the derived class too.

Alternatively, you can write this:

MyBase derivedObject = new MyDerived();
MyBase baseObject = new MyBase();
MyDerived derivedCopy1 = (MyDerived) derivedObject;   // OK
MyDerived derivedCopy2 = (MyDerived) baseObject;      // Throws exception

This code is perfectly legal C# (in a syntactic sense, that is) and illustrates casting from a base class to a derived class. However, the final statement will throw an exception when executed. When you perform the cast, the object being referred to is examined. Because a base class reference can, in principle, refer to a derived class instance, it is possible that this object is actually an instance of the derived class that you are attempting to cast to. If that is the case, the cast succeeds, and the derived reference is set to refer to the object. If, however, the object in question is not an instance of the derived class (or of any class derived from it), the cast fails and an exception is thrown.
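When a failed downcast is an expected outcome rather than a bug, the as operator gives the same run-time type check without the exception. The sketch below uses empty MyBase and MyDerived classes and a hypothetical TryDowncast helper to contrast the two approaches.

```csharp
using System;

public class MyBase { }
public class MyDerived : MyBase { }

public static class DowncastDemo
{
    // Hypothetical helper: returns false instead of throwing when the object
    // is not actually a MyDerived instance.
    public static bool TryDowncast(MyBase source, out MyDerived result)
    {
        result = source as MyDerived;   // null if the run-time type does not match
        return result != null;
    }

    public static void Main()
    {
        MyBase derivedObject = new MyDerived();
        MyBase baseObject = new MyBase();
        MyDerived target;

        Console.WriteLine(TryDowncast(derivedObject, out target));  // True
        Console.WriteLine(TryDowncast(baseObject, out target));     // False

        try
        {
            target = (MyDerived)baseObject;     // cast syntax throws instead
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException from the explicit cast");
        }
    }
}
```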

Notice that the casts that the compiler has supplied, which convert between base and derived class, do not actually do any data conversion on the object in question. All they do is set the new reference to refer to the object if it is legal for that conversion to occur. To that extent, these casts are very different in nature from the ones that you normally define yourself. For example, in the SimpleCurrency example earlier, you defined casts that convert between a Currency struct and a float. In the float-to-Currency cast, you actually instantiated a new Currency struct and initialized it with the required values. The predefined casts between base and derived classes do not do this. If you want to convert a MyBase instance into a real MyDerived object with values based on the contents of the MyBase instance, you cannot use the cast syntax to do this. The most sensible option is usually to define a derived class constructor that takes a base class instance as a parameter, and have this constructor perform the relevant initializations:

class DerivedClass: BaseClass
{
   public DerivedClass(BaseClass rhs)
   {
      // initialize object from the Base instance
   }
   // etc.
}

Boxing and Unboxing Casts

The previous discussion focused on casting between base and derived classes where both participants were reference types. Similar principles apply when casting value types, although in this case it is not possible to simply copy references — some copying of data must occur.

It is not, of course, possible to derive from structs or primitive value types. Casting between base and derived structs invariably means casting between a primitive type or a struct and System.Object. (Theoretically, it is possible to cast between a struct and System.ValueType, though it is hard to see why you would want to do this.)

The cast from any struct (or primitive type) to object is always available as an implicit cast — because it is a cast from a derived type to a base type — and is just the familiar process of boxing. For example, using the Currency struct:

Currency balance = new Currency(40,0);
object baseCopy = balance;

When this implicit cast is executed, the contents of balance are copied onto the heap into a boxed object, and the baseCopy object reference is set to this object. What actually happens behind the scenes is this: When you originally defined the Currency struct, the .NET Framework implicitly supplied another (hidden) class, a boxed Currency class, which contains all the same fields as the Currency struct but is a reference type, stored on the heap. This happens whenever you define a value type, whether it is a struct or an enum, and similar boxed reference types exist corresponding to all the primitive value types of int, double, uint, and so on. It is not possible, or necessary, to gain direct programmatic access to any of these boxed classes in source code, but they are the objects that are working behind the scenes whenever a value type is cast to object. When you implicitly cast Currency to object, a boxed Currency instance is instantiated and initialized with all the data from the Currency struct. In the preceding code, it is this boxed Currency instance to which baseCopy refers. By these means, it is possible for casting from derived to base type to work syntactically in the same way for value types as for reference types.

Casting the other way is known as unboxing. Like casting between a base reference type and a derived reference type, it is an explicit cast because an exception will be thrown if the object being cast is not of the correct type:

object derivedObject = new Currency(40,0);
object baseObject = new object();
Currency derivedCopy1 = (Currency)derivedObject;   // OK
Currency derivedCopy2 = (Currency)baseObject;      // Exception thrown

This code works in a way similar to the code presented earlier for reference types. Casting derivedObject to Currency works fine because derivedObject actually refers to a boxed Currency instance — the cast is performed by copying the fields out of the boxed Currency object into a new Currency struct. The second cast fails because baseObject does not refer to a boxed Currency object.

When using boxing and unboxing, it is important to understand that both processes actually copy the data into the new boxed or unboxed object. Hence, manipulations on the boxed object, for example, will not affect the contents of the original value type.
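The copying behavior is easy to confirm. The following sketch uses a hypothetical Money struct as a cut-down stand-in for Currency; changing the original after boxing leaves the boxed copy untouched.

```csharp
using System;

// Hypothetical, simplified stand-in for the Currency struct.
public struct Money
{
    public uint Dollars;

    public Money(uint dollars)
    {
        Dollars = dollars;
    }
}

public static class BoxingDemo
{
    public static uint BoxThenModifyOriginal()
    {
        Money balance = new Money(40);
        object boxed = balance;         // boxing: data is copied onto the heap

        balance.Dollars = 99;           // changes only the local struct

        Money unboxed = (Money)boxed;   // unboxing: data is copied back out
        return unboxed.Dollars;         // still the value at the time of boxing
    }

    public static void Main()
    {
        Console.WriteLine(BoxThenModifyOriginal());   // 40
    }
}
```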

Multiple Casting

One thing you will have to watch for when you are defining casts is that if the C# compiler is presented with a situation in which no direct cast is available to perform a requested conversion, it will attempt to find a way of combining casts to do the conversion. For example, with the Currency struct, suppose the compiler encounters a few lines of code like this:

Currency balance = new Currency(10,50);
long amount = (long)balance;
double amountD = balance;

You first initialize a Currency instance, and then you attempt to convert it to a long. The trouble is that you haven’t defined the cast to do that. However, this code still compiles successfully. What will happen is that the compiler will realize that you have defined an implicit cast to get from Currency to float, and the compiler already knows how to explicitly cast a float to a long. Hence, it will compile that line of code into IL code that converts balance first to a float, and then converts that result to a long. The same thing happens in the final line of the code, when you convert balance to a double. However, because the cast from Currency to float and the predefined cast from float to double are both implicit, you can write this conversion in your code as an implicit cast. If you prefer, you could also specify the casting route explicitly:

Currency balance = new Currency(10,50);
long amount = (long)(float)balance;
double amountD = (double)(float)balance;

However, in most cases, this would be seen as needlessly complicating your code. The following code, by contrast, produces a compilation error:

Currency balance = new Currency(10,50);
long amount = balance;

The reason is that the best match for the conversion that the compiler can find is still to convert first to float and then to long. The conversion from float to long needs to be specified explicitly, though.

None of this by itself should give you too much trouble. The rules are, after all, fairly intuitive and designed to prevent any data loss from occurring without the developer knowing about it. However, the problem is that if you are not careful when you define your casts, it is possible for the compiler to select a path that leads to unexpected results. For example, suppose that it occurs to someone else in the group writing the Currency struct that it would be useful to be able to convert a uint containing the total number of cents in an amount into a Currency (cents, not dollars, because the idea is not to lose the fractions of a dollar). Therefore, this cast might be written to try to achieve this:

public static implicit operator Currency (uint value)
{
   return new Currency(value/100u, (ushort)(value%100));
} // Do not do this!

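Note the u suffix after the first 100 in this code, which makes the literal a uint so that value/100u is clearly a uint division. (The compiler would also accept value/100 here, because an in-range int constant converts implicitly to uint; with a non-constant int operand, however, both operands would be promoted to long.)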

The comment Do not do this! is clearly noted in this code, and here is why: The following code snippet merely converts a uint containing 350 into a Currency and back again; but what do you think bal2 will contain after executing this?

uint bal = 350;
Currency balance = bal;
uint bal2 = (uint)balance;

The answer is not 350 but 3! Moreover, it all follows logically. You convert 350 implicitly to a Currency, giving the result balance.Dollars = 3, balance.Cents = 50. Then the compiler does its usual figuring out of the best path for the conversion back. The balance variable ends up being implicitly converted to a float (value 3.5), and this is converted explicitly to a uint with value 3.
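This behavior can be reproduced with a stripped-down version of the struct. The sketch below keeps only the implicit Currency-to-float cast and the problematic cents-based uint-to-Currency cast; the RoundTripDemo class and its RoundTrip method are hypothetical names used for illustration.

```csharp
using System;

public struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        this.Dollars = dollars;
        this.Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    // The problematic cast: treats the uint as a number of cents.
    public static implicit operator Currency(uint value)
    {
        return new Currency(value / 100u, (ushort)(value % 100));
    }
}

public static class RoundTripDemo
{
    public static uint RoundTrip(uint amount)
    {
        Currency balance = amount;   // uint -> Currency, interpreted as cents
        return (uint)balance;        // Currency -> float -> uint, interpreted as dollars
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip(350));   // 3
    }
}
```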

Of course, other instances exist in which converting to another data type and back again causes data loss. For example, converting a float containing 5.8 to an int and back to a float again will lose the fractional part, giving you a result of 5, but there is a slight difference in principle between losing the fractional part of a number and dividing an integer by more than 100. Currency has suddenly become a rather dangerous class that does strange things to integers!

The problem is that there is a conflict between how your casts interpret integers. The casts between Currency and float interpret an integer value of 1 as corresponding to one dollar, but the latest uint-to-Currency cast interprets this value as one cent. This is an example of very poor design. If you want your classes to be easy to use, you should ensure that all your casts behave in a way that is mutually compatible, in the sense that they intuitively give the same results. In this case, the solution is obviously to rewrite the uint-to-Currency cast so that it interprets an integer value of 1 as one dollar:

public static implicit operator Currency (uint value)
{
   return new Currency(value, 0);
}

Incidentally, you might wonder whether this new cast is necessary at all. The answer is that it could be useful. Without this cast, the only way for the compiler to carry out a uint-to-Currency conversion would be via a float. Converting directly is a lot more efficient in this case, so having this extra cast provides performance benefits, though you need to ensure that it provides the same result as via a float, which you have now done. In other situations, you may also find that separately defining casts for different predefined data types enables more conversions to be implicit rather than explicit, though that is not the case here.

A good test of whether your casts are compatible is to ask whether a conversion will give the same results (other than perhaps a loss of accuracy as in float-to-int conversions) regardless of which path it takes. The Currency class provides a good example of this. Consider this code:

Currency balance = new Currency(50, 35);
ulong bal = (ulong) balance;

At present, there is only one way that the compiler can achieve this conversion: by converting the Currency to a float implicitly, then to a ulong explicitly. The float-to-ulong conversion requires an explicit conversion, but that is fine because you have specified one here.

Suppose, however, that you then added another cast, to convert implicitly from a Currency to a uint. You will actually do this by modifying the Currency struct by adding the casts both to and from uint. This code is available as the SimpleCurrency2 example:

      public static implicit operator Currency (uint value)
      {
         return new Currency(value, 0);
      }
      
      public static implicit operator uint (Currency value)
      {
         return value.Dollars;
      }

Now the compiler has another possible route to convert from Currency to ulong: to convert from Currency to uint implicitly, then to ulong implicitly. Which of these two routes will it take? C# has some precise rules about the best route for the compiler when there are several possibilities. (The rules are not covered in this book, but if you are interested in the details, see the MSDN documentation.) The best answer is that you should design your casts so that all routes give the same answer (other than possible loss of precision), in which case it doesn’t really matter which one the compiler picks. (As it happens in this case, the compiler picks the Currency-to-uint-to-ulong route in preference to Currency-to-float-to-ulong.)

To test the SimpleCurrency2 sample, add this code to the test code for SimpleCurrency:

try
{
   Currency balance = new Currency(50,35);
      
   Console.WriteLine(balance);
   Console.WriteLine("balance is " + balance);
   Console.WriteLine("balance is (using ToString()) " + balance.ToString());
      
   uint balance3 = (uint) balance;
      
   Console.WriteLine("Converting to uint gives " + balance3);

Running the sample now gives you these results:

SIMPLECURRENCY2
  50
  balance is $50.35
  balance is (using ToString()) $50.35
  Converting to uint gives 50
  After converting to float, = 50.35
  After converting back to Currency, = $50.35
  Now attempt to convert out of range value of -$50.50 to a Currency:
  Exception occurred: Arithmetic operation resulted in an overflow.

The output shows that the conversion to uint has been successful, though as expected, you have lost the cents part of the Currency in making this conversion. Casting a negative float to Currency has also produced the expected overflow exception now that the float-to-Currency cast itself defines a checked context.

However, the output also demonstrates one last potential problem that you need to be aware of when working with casts. The very first line of output does not display the balance correctly, displaying 50 instead of $50.35. Consider these lines:

   Console.WriteLine(balance);
   Console.WriteLine("balance is " + balance);
   Console.WriteLine("balance is (using ToString()) " + balance.ToString());

Only the last two lines correctly display the Currency as a string. So what is going on? The problem here is that when you combine casts with method overloads, you get another source of unpredictability. We will look at these lines in reverse order.

The third Console.WriteLine() statement explicitly calls the Currency.ToString() method, ensuring that the Currency is displayed as a string. The second does not. However, the string literal "balance is" passed to Console.WriteLine() makes it clear to the compiler that the parameter is to be interpreted as a string. Hence, the Currency.ToString() method is called implicitly.

The very first Console.WriteLine() method, however, simply passes a raw Currency struct to Console.WriteLine(). Now, Console.WriteLine() has many overloads, but none of them takes a Currency struct. Therefore, the compiler will start fishing around to see what it can cast the Currency to in order to make it match up with one of the overloads of Console.WriteLine(). As it happens, one of the Console.WriteLine() overloads is designed to display uints quickly and efficiently, and it takes a uint as a parameter — you have now supplied a cast that converts Currency implicitly to uint.

In fact, Console.WriteLine() also has overloads that take floating-point parameters, such as float and double, and display their values. If you look closely at the output from the first SimpleCurrency example, you will see that the first line of output displayed Currency as a plain floating-point number (50.35), using one of these overloads. In that example, there wasn’t a direct cast from Currency to uint, so the compiler relied on the implicit Currency-to-float cast to match up the available casts to the available Console.WriteLine() overloads. However, now that there is a direct cast to uint available in SimpleCurrency2, the compiler has opted for that route.

The upshot of this is that if you call a method that has several overloads and you pass it a parameter whose data type doesn’t match any of the overloads exactly, then you are forcing the compiler to decide not only what casts to use to perform the data conversion, but also which overload, and hence which data conversion, to pick. The compiler always works logically and according to strict rules, but the results may not be what you expected. If there is any doubt, you are better off specifying which cast to use explicitly.
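The same effect can be reproduced without Console.WriteLine(). The following sketch defines a hypothetical OverloadDemo.Describe method with uint, double, and object overloads, so that the chosen overload is visible in the returned string. The Currency struct here carries the two implicit casts from SimpleCurrency2.

```csharp
using System;

public struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        this.Dollars = dollars;
        this.Cents = cents;
    }

    public override string ToString()
    {
        return string.Format("${0}.{1,-2:00}", Dollars, Cents);
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    public static implicit operator uint(Currency value)
    {
        return value.Dollars;
    }
}

public static class OverloadDemo
{
    // Hypothetical overloads standing in for those of Console.WriteLine().
    public static string Describe(uint value) { return "uint overload: " + value; }
    public static string Describe(double value) { return "double overload: " + value; }
    public static string Describe(object value) { return "object overload: " + value; }

    public static void Main()
    {
        Currency balance = new Currency(50, 35);

        // The compiler prefers the implicit cast to uint, so the cents vanish.
        Console.WriteLine(Describe(balance));            // uint overload: 50

        // Stating the conversion explicitly removes the guesswork.
        Console.WriteLine(Describe((object)balance));    // object overload: $50.35
    }
}
```

Casting to object (or calling ToString() yourself) makes the intended overload unambiguous to both the compiler and the reader.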

SUMMARY

This chapter looked at the standard operators provided by C#, described the mechanics of object equality, and examined how the compiler converts the standard data types from one to another. It also demonstrated how you can implement custom operator support on your data types using operator overloads. Finally, you looked at a special type of operator overload, the cast operator, which enables you to specify how instances of your types are converted to other data types.
