WHAT’S IN THIS CHAPTER?
WROX.COM CODE DOWNLOADS FOR THIS CHAPTER
The wrox.com code downloads for this chapter are found at http://www.wrox.com/remtitle.cgi?isbn=1118314425 on the Download Code tab. The code for this chapter is divided into the following major examples:
Now that you understand more about what C# can do, you will want to learn how to use it. This chapter gives you a good start in that direction by providing a basic understanding of the fundamentals of C# programming, which is built on in subsequent chapters. By the end of this chapter, you will know enough C# to write simple programs (though without using inheritance or other object-oriented features, which are covered in later chapters).
Let’s start by compiling and running the simplest possible C# program — a simple console app consisting of a class that writes a message to the screen.
Type the following into a text editor (such as Notepad), and save it with a .cs extension (for example, First.cs). The Main() method is shown here (for more information, see “The Main Method” section later in this chapter):
using System;
namespace Wrox
{
    public class MyFirstClass
    {
        public static void Main()
        {
            Console.WriteLine("Hello from Wrox.");
            Console.ReadLine();
            return;
        }
    }
}
You can compile this program by simply running the C# command-line compiler (csc.exe) against the source file, like this:
csc First.cs
If you want to compile code from the command line using the csc command, you should be aware that the .NET command-line tools, including csc, are available only if certain environment variables have been set up. Depending on how you installed .NET (and Visual Studio 2011), this may or may not be the case on your machine.
Compiling the code produces an executable file named First.exe, which you can run from the command line or from Windows Explorer like any other executable. Give it a try:
csc First.cs
Microsoft (R) Visual C# Compiler version 4.0.30319.17379
for Microsoft(R) .NET Framework 4.5
Copyright (C) Microsoft Corporation. All rights reserved.
First.exe
Hello from Wrox.
First, a few general comments about C# syntax. In C#, as in other C-style languages, most statements end in a semicolon (;) and can continue over multiple lines without needing a continuation character. Statements can be joined into blocks using curly braces ({}). Single-line comments begin with two forward slash characters (//), and multiline comments begin with a slash and an asterisk (/*) and end with the same combination reversed (*/). In these aspects, C# is identical to C++ and Java but different from Visual Basic. It is the semicolons and curly braces that give C# code such a different visual appearance from Visual Basic code. If your background is predominantly Visual Basic, take extra care to remember the semicolon at the end of every statement. Omitting this is usually the biggest single cause of compilation errors among developers new to C-style languages. Another thing to remember is that C# is case sensitive. That means the variables named myVar and MyVar are two different variables.
The first few lines in the previous code example are related to namespaces (mentioned later in this chapter), which are a way to group associated classes. The namespace keyword declares the namespace with which your class should be associated. All code within the braces that follow it is regarded as being within that namespace. The using statement specifies a namespace that the compiler should look at to find any classes that are referenced in your code but aren’t defined in the current namespace. This serves the same purpose as the import statement in Java and the using namespace statement in C++.
using System;
namespace Wrox
{
The reason for the presence of the using statement in the First.cs file is that you are going to use a library class, System.Console. The using System statement enables you to refer to this class simply as Console (and similarly for any other classes in the System namespace). Without using, you would have to fully qualify the call to the Console.WriteLine method like this:
System.Console.WriteLine("Hello from Wrox.");
The standard System namespace is where the most commonly used .NET types reside. It is important to realize that everything you do in C# depends on the .NET base classes. In this case, you are using the Console class within the System namespace to write to the console window. C# has no built-in keywords of its own for input or output; it is completely reliant on the .NET classes.
Next, you declare a class called MyFirstClass. However, because it has been placed in a namespace called Wrox, the fully qualified name of this class is Wrox.MyFirstClass:
public class MyFirstClass
{
All C# code must be contained within a class. The class declaration consists of the class keyword, followed by the class name and a pair of curly braces. All code associated with the class should be placed between these braces.
Next, you declare a method called Main(). Every C# executable (such as console applications, Windows applications, and Windows services) must have an entry point — the Main() method (note the capital M):
public static void Main()
{
The method is called when the program is started. This method must return either nothing (void) or an integer (int). Note the format of method definitions in C#:
[modifiers] return_type MethodName([parameters])
{
    // Method body. NB. This code block is pseudo-code.
}
Here, the first square brackets represent certain optional keywords. Modifiers are used to specify certain features of the method you are defining, such as from where the method can be called. In this case, you have two modifiers: public and static. The public modifier means that the method can be accessed from anywhere, so it can be called from outside your class. The static modifier indicates that the method does not operate on a specific instance of your class and therefore is called without first instantiating the class. This is important because you are creating an executable rather than a class library. You set the return type to void, and in the example you don’t include any parameters.
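To make the template concrete, here is a minimal sketch of a method that uses each part: two modifiers, a return type, and a parameter list. The class name MethodFormatDemo and the method Add are hypothetical names chosen for illustration; they are not part of the chapter's download code.

```csharp
using System;

namespace Wrox
{
    public class MethodFormatDemo
    {
        // modifiers: public static; return type: int; parameters: (int a, int b)
        public static int Add(int a, int b)
        {
            return a + b;
        }

        public static void Main()
        {
            // Add is static, so it is called without creating a MethodFormatDemo instance.
            Console.WriteLine(Add(2, 3)); // prints 5
        }
    }
}
```

Because Add is declared public, code outside MethodFormatDemo could call it as MethodFormatDemo.Add(2, 3).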
Finally, we come to the code statements themselves:
Console.WriteLine("Hello from Wrox.");
Console.ReadLine();
return;
In this case, you simply call the WriteLine() method of the System.Console class to write a line of text to the console window. WriteLine() is a static method, so you don’t need to instantiate a Console object before calling it.
Console.ReadLine() reads user input. Adding this line forces the application to wait for the Enter (carriage-return) key to be pressed before the application exits and, when it is run from Visual Studio 2011, before the console window disappears.
You then call return to exit from the method (also, because this is the Main() method, you exit the program as well). You specified void in your method header, so you don’t return any values.
Now that you have had a taste of basic C# syntax, you are ready for more detail. Because it is virtually impossible to write any nontrivial program without variables, we will start by looking at variables in C#.
You declare variables in C# using the following syntax:
datatype identifier;
For example:
int i;
This statement declares an int named i. The compiler won’t actually let you use this variable in an expression until you have initialized it with a value.
After it has been declared, you can assign a value to the variable using the assignment operator, =:
i = 10;
You can also declare the variable and initialize its value at the same time:
int i = 10;
If you declare and initialize more than one variable in a single statement, all the variables will be of the same data type:
int x = 10, y = 20; // x and y are both ints
To declare variables of different types, you need to use separate statements. You cannot assign different data types within a multiple-variable declaration:
int x = 10;
bool y = true; // Creates a variable that stores true or false
int x = 10, bool y = true; // This won't compile!
Notice the // and the text after it in the preceding examples. These are comments. The // character sequence tells the compiler to ignore the text that follows on this line because it is included for a human to better understand the program, not part of the program itself. We further explain comments in code later in this chapter.
Variable initialization demonstrates an example of C#’s emphasis on safety. Briefly, the C# compiler requires that any variable be initialized with some starting value before you refer to that variable in an operation. Most modern compilers will flag violations of this as a warning, but the ever-vigilant C# compiler treats such violations as errors. This prevents you from unintentionally retrieving junk values from memory left over from other programs.
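C# has two methods for ensuring that variables are initialized before use: variables that are fields in a class or struct, if not initialized explicitly, are zeroed out by default when they are created; variables that are local to a method must be explicitly initialized in your code before any statement in which their values are used.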
For example, you can’t do the following in C#:
public static int Main()
{
    int d;
    Console.WriteLine(d); // Can't do this! Need to initialize d before use
    return 0;
}
Notice that this code snippet demonstrates defining Main() so that it returns an int instead of void.
If you attempt to compile the preceding lines, you will receive this error message:
Use of unassigned local variable 'd'
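As a sketch of the rule in practice: a field that is not explicitly initialized receives a default value (zero for an int), whereas a local variable must be assigned before it is read. The names InitializationDemo, counter, ReadField, and ReadLocal are illustrative assumptions.

```csharp
using System;

namespace Wrox
{
    public class InitializationDemo
    {
        // A field: zeroed by default because it is never initialized explicitly.
        private static int counter;

        public static int ReadField()
        {
            return counter;
        }

        public static int ReadLocal()
        {
            int d;   // declared but not yet assigned
            d = 42;  // must be assigned before the first read
            return d;
        }

        public static void Main()
        {
            Console.WriteLine(ReadField()); // prints 0
            Console.WriteLine(ReadLocal()); // prints 42
        }
    }
}
```

The compiler may warn that counter is never assigned, but that is a warning rather than an error, because the field's default value is well defined.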
Consider the following statement:
Something objSomething;
In C#, this line of code would create only a reference for a Something object, but this reference would not yet actually refer to any object. Any attempt to call a method or property against this variable would result in an error.
Instantiating a reference object in C# requires use of the new keyword. You create a reference as shown in the previous example and then point the reference at an object allocated on the heap using the new keyword:
objSomething = new Something(); // This creates a Something on the heap
Type inference makes use of the var keyword. The syntax for declaring the variable changes somewhat. The compiler “infers” what the type of the variable is by what the variable is initialized to. For example:
int someNumber = 0;
becomes:
var someNumber = 0;
Even though someNumber is never declared as being an int, the compiler figures this out and someNumber is an int for as long as it is in scope. Once compiled, the two preceding statements are equivalent.
Here is a short program to demonstrate:
using System;
namespace Wrox
{
    class Program
    {
        static void Main(string[] args)
        {
            var name = "Bugs Bunny";
            var age = 25;
            var isRabbit = true;
            Type nameType = name.GetType();
            Type ageType = age.GetType();
            Type isRabbitType = isRabbit.GetType();
            Console.WriteLine("name is type " + nameType.ToString());
            Console.WriteLine("age is type " + ageType.ToString());
            Console.WriteLine("isRabbit is type " + isRabbitType.ToString());
        }
    }
}
The output from this program is as follows:
name is type System.String
age is type System.Int32
isRabbit is type System.Boolean
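There are a few rules that you need to follow: the variable must be initialized when it is declared, because otherwise the compiler has nothing from which to infer the type; the initializer cannot be null; and the initializer must be an expression.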
We examine this more closely in the discussion of anonymous types in Chapter 3, “Objects and Types.”
After the variable has been declared and the type inferred, the variable’s type cannot be changed. When established, the variable’s type follows all the strong typing rules that any other variable type must follow.
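A short sketch of this rule follows. The class name VarDemo and the method InferredType are hypothetical; the commented-out line shows an assignment that the compiler would reject.

```csharp
using System;

namespace Wrox
{
    public class VarDemo
    {
        public static Type InferredType()
        {
            var someNumber = 0;      // inferred as int
            someNumber = 25;         // fine: the new value is still an int
            // someNumber = "text";  // would not compile: a string cannot be assigned to an int
            return someNumber.GetType();
        }

        public static void Main()
        {
            Console.WriteLine(InferredType()); // prints System.Int32
        }
    }
}
```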
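The scope of a variable is the region of code from which the variable can be accessed. In general, the scope is determined by the following rules: a field (also known as a member variable) of a class is in scope for as long as its containing class is in scope; a local variable is in scope until a closing brace indicates the end of the block statement or method in which it was declared; and a local variable that is declared in a for, while, or similar statement is in scope in the body of that loop.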
It’s common in a large program to use the same variable name for different variables in different parts of the program. This is fine as long as the variables are scoped to completely different parts of the program so that there is no possibility for ambiguity. However, bear in mind that local variables with the same name can’t be declared twice in the same scope. For example, you can’t do this:
int x = 20;
// some more code
int x = 30;
Consider the following code sample:
using System;
namespace Wrox.ProCSharp.Basics
{
    public class ScopeTest
    {
        public static int Main()
        {
            for (int i = 0; i < 10; i++)
            {
                Console.WriteLine(i);
            } // i goes out of scope here
            // We can declare a variable named i again, because
            // there's no other variable with that name in scope
            for (int i = 9; i >= 0; i--)
            {
                Console.WriteLine(i);
            } // i goes out of scope here.
            return 0;
        }
    }
}
This code simply prints out the numbers from 0 to 9, and then back again from 9 to 0, using two for loops. The important thing to note is that you declare the variable i twice in this code, within the same method. You can do this because i is declared in two separate loops, so each i variable is local to its own loop.
Here’s another example:
public static int Main()
{
    int j = 20;
    for (int i = 0; i < 10; i++)
    {
        int j = 30; // Can't do this: j is still in scope
        Console.WriteLine(j + i);
    }
    return 0;
}
If you try to compile this, you’ll get an error like the following:
ScopeTest.cs(12,15): error CS0136: A local variable named 'j' cannot be declared in
this scope because it would give a different meaning to 'j', which is already used
in a 'parent or current' scope to denote something else.
This occurs because the variable j, which is defined before the start of the for loop, is still in scope within the for loop, and won’t go out of scope until the Main() method has finished executing. Although the second j (the illegal one) is in the loop’s scope, that scope is nested within the Main() method’s scope. The compiler has no way to distinguish between these two variables, so it won’t allow the second one to be declared.
In certain circumstances, however, you can distinguish between two identifiers with the same name (although not the same fully qualified name) and the same scope, and in this case the compiler allows you to declare the second variable. That’s because C# makes a fundamental distinction between variables that are declared at the type level (fields) and variables that are declared within methods (local variables).
Consider the following code snippet:
using System;
namespace Wrox
{
    class ScopeTest2
    {
        static int j = 20;
        public static void Main()
        {
            int j = 30;
            Console.WriteLine(j);
            return;
        }
    }
}
This code will compile even though you have two variables named j in scope within the Main() method: the j that was defined at the class level, and doesn’t go out of scope until the class is destroyed (when the Main() method terminates and the program ends); and the j defined in Main(). In this case, the new variable named j that you declare in the Main() method hides the class-level variable with the same name, so when you run this code, the number 30 is displayed.
What if you want to refer to the class-level variable? You can actually refer to fields of a class or struct from outside the object, using the syntax object.fieldname. In the previous example, you are accessing a static field (you’ll learn what this means in the next section) from a static method, so you can’t use an instance of the class; you just use the name of the class itself:
..
public static void Main()
{
    int j = 30;
    Console.WriteLine(j);
    Console.WriteLine(ScopeTest2.j);
}
..
If you were accessing an instance field (a field that belongs to a specific instance of the class), you would need to use the this keyword instead.
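The instance-field case might look like the following sketch. The class ScopeTest3 and its Sum method are hypothetical names introduced only for this illustration.

```csharp
using System;

namespace Wrox
{
    public class ScopeTest3
    {
        private int j = 20; // instance field

        public int Sum()
        {
            int j = 30;        // local variable hides the field
            return j + this.j; // j is the local; this.j is the instance field
        }

        public static void Main()
        {
            ScopeTest3 test = new ScopeTest3();
            Console.WriteLine(test.Sum()); // prints 50
        }
    }
}
```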
As the name implies, a constant is a variable whose value cannot be changed throughout its lifetime. Prefixing a variable with the const keyword when it is declared and initialized designates that variable as a constant:
const int a = 100; // This value cannot be changed.
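Constants have the following characteristics: they must be initialized when they are declared, and after a value has been assigned it can never be overwritten; the value of a constant must be computable at compile time, so you can’t initialize a constant with a value taken from a variable; and constants are always implicitly static, so you don’t include (and are not permitted to include) the static modifier in the declaration.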
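At least three advantages exist for using constants in your programs: they make programs easier to read by replacing magic numbers and strings with readable names; they make programs easier to modify, because a value that must change is edited in one place; and they help prevent mistakes, because the compiler flags any attempt to assign another value to a constant after it has been given one.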
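The following sketch shows constants in use. The class ConstDemo and the constant names are assumptions made for illustration. Note that a constant can be computed from other constants, because the whole expression is evaluated at compile time, and that constants are accessed through the type name, in the same way as static members.

```csharp
using System;

namespace Wrox
{
    public class ConstDemo
    {
        public const int SecondsPerMinute = 60;

        // Allowed: the initializer uses only other constants and literals.
        public const int SecondsPerHour = SecondsPerMinute * 60;

        public static void Main()
        {
            // No instance is needed; the constant belongs to the type.
            Console.WriteLine(ConstDemo.SecondsPerHour); // prints 3600

            // ConstDemo.SecondsPerHour = 1; // would not compile: a constant cannot be assigned to
        }
    }
}
```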
Now that you have seen how to declare variables and constants, let’s take a closer look at the data types available in C#. As you will see, C# is much stricter about the types available and their definitions than some other languages.
Before examining the data types in C#, it is important to understand that C# distinguishes between two categories of data type: value types and reference types.
The next few sections look in detail at the syntax for value and reference types. Conceptually, the difference is that a value type stores its value directly, whereas a reference type stores a reference to the value.
These types are stored in different places in memory; value types are stored in an area known as the stack, and reference types are stored in an area known as the managed heap. It is important to be aware of whether a type is a value type or a reference type because of the different effect each assignment has. For example, int is a value type, which means that the following statement results in two locations in memory storing the value 20:
// i and j are both of type int
i = 20;
j = i;
However, consider the following example. For this code, assume you have defined a class called Vector, that Vector is a reference type, and that it has an int member variable called Value:
Vector x, y;
x = new Vector();
x.Value = 30; // Value is a field defined in Vector class
y = x;
Console.WriteLine(y.Value);
y.Value = 50;
Console.WriteLine(x.Value);
The crucial point to understand is that after executing this code, there is only one Vector object: x and y both point to the memory location that contains this object. Because x and y are variables of a reference type, declaring each variable simply reserves a reference — it doesn’t instantiate an object of the given type. In neither case is an object actually created. To create an object, you have to use the new keyword, as shown. Because x and y refer to the same object, changes made to x will affect y and vice versa. Hence, the code will display 30 and then 50.
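For readers who want to run the example, here is a self-contained sketch. The Vector class shown is a hypothetical minimal definition (a class with one public int field), since the chapter only asks you to assume that such a class exists; ReferenceDemo and Run are likewise illustrative names.

```csharp
using System;

namespace Wrox
{
    // Hypothetical minimal Vector: a reference type with a single int field.
    public class Vector
    {
        public int Value;
    }

    public class ReferenceDemo
    {
        // Returns the two values that the chapter's snippet writes to the console.
        public static int[] Run()
        {
            Vector x, y;
            x = new Vector();     // the only object that is ever created
            x.Value = 30;
            y = x;                // copies the reference, not the object
            int first = y.Value;  // read through y
            y.Value = 50;         // write through y
            int second = x.Value; // read through x: sees the change
            return new int[] { first, second };
        }

        public static void Main()
        {
            int[] results = Run();
            Console.WriteLine(results[0]); // prints 30
            Console.WriteLine(results[1]); // prints 50
        }
    }
}
```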
If a variable is a reference, it is possible to indicate that it does not refer to any object by setting its value to null:
y = null;
If a reference is set to null, then clearly it is not possible to call any nonstatic member functions or access any nonstatic fields against it; doing so would cause an exception to be thrown at runtime.
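The runtime exception in question is System.NullReferenceException. The sketch below provokes it deliberately and catches it; the Widget class and the ThrowsOnNull method are hypothetical names used only for this illustration.

```csharp
using System;

namespace Wrox
{
    // Hypothetical reference type for illustration.
    public class Widget
    {
        public int Size;
    }

    public class NullDemo
    {
        public static bool ThrowsOnNull()
        {
            Widget w = null; // w refers to no object
            try
            {
                Console.WriteLine(w.Size); // accessing a field through null throws
                return false;
            }
            catch (NullReferenceException)
            {
                return true;
            }
        }

        public static void Main()
        {
            Console.WriteLine(ThrowsOnNull()); // prints True
        }
    }
}
```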
In C#, basic data types such as bool and long are value types. This means that if you declare a bool variable and assign it the value of another bool variable, you will have two separate bool values in memory. Later, if you change the value of the original bool variable, the value of the second bool variable does not change. These types are copied by value.
In contrast, most of the more complex C# data types, including classes that you yourself declare, are reference types. They are allocated upon the heap, have lifetimes that can span multiple function calls, and can be accessed through one or several aliases. The Common Language Runtime (CLR) implements an elaborate algorithm to track which reference variables are still reachable and which have been orphaned. Periodically, the CLR will destroy orphaned objects and return the memory that they once occupied back to the operating system. This is done by the garbage collector.
C# has been designed this way because high performance is best served by keeping primitive types (such as int and bool) as value types, and larger types that contain many fields (as is usually the case with classes) as reference types. If you want to define your own type as a value type, you should declare it as a struct.
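The difference in copy behavior between a struct and a class can be seen in the following sketch. PointStruct, PointClass, and CopyDemo are hypothetical types that differ only in being declared as a struct or as a class.

```csharp
using System;

namespace Wrox
{
    // Hypothetical types for illustration: same field, different kind of type.
    public struct PointStruct
    {
        public int X;
    }

    public class PointClass
    {
        public int X;
    }

    public class CopyDemo
    {
        public static int CopyStruct()
        {
            PointStruct a = new PointStruct();
            a.X = 1;
            PointStruct b = a; // copies the whole value
            b.X = 2;           // changes only the copy
            return a.X;
        }

        public static int CopyClass()
        {
            PointClass a = new PointClass();
            a.X = 1;
            PointClass b = a;  // copies only the reference
            b.X = 2;           // changes the single shared object
            return a.X;
        }

        public static void Main()
        {
            Console.WriteLine(CopyStruct()); // prints 1
            Console.WriteLine(CopyClass());  // prints 2
        }
    }
}
```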
As mentioned in Chapter 1, “.NET Architecture,” the basic predefined types recognized by C# are not intrinsic to the language but are part of the .NET Framework. For example, when you declare an int in C#, you are actually declaring an instance of a .NET struct, System.Int32. This may sound like a small point, but it has a profound significance: It means that you can treat all the primitive data types syntactically, as if they were classes that supported certain methods. For example, to convert an int i to a string, you can write the following:
string s = i.ToString();
It should be emphasized that behind this syntactical convenience, the types really are stored as primitive types, so absolutely no performance cost is associated with the idea that the primitive types are notionally represented by .NET structs.
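As a brief sketch of what this means in practice, the keyword int and the struct System.Int32 name the same type, and members of that struct can be used directly on int values. The class name PrimitiveDemo and its methods are assumptions for illustration.

```csharp
using System;

namespace Wrox
{
    public class PrimitiveDemo
    {
        public static string AsText(int i)
        {
            return i.ToString(); // a method call on a primitive value
        }

        public static bool IntIsInt32()
        {
            return typeof(int) == typeof(System.Int32); // same type, two names
        }

        public static void Main()
        {
            Console.WriteLine(AsText(42).Length); // prints 2
            Console.WriteLine(IntIsInt32());      // prints True
            Console.WriteLine(int.MaxValue);      // prints 2147483647
        }
    }
}
```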
The following sections review the types that are recognized as built-in types in C#. Each type is listed, along with its definition and the name of the corresponding .NET type (CTS type). C# has 15 predefined types: 13 value types and 2 reference types (string and object).
The built-in CTS value types represent primitives, such as integer and floating-point numbers, character, and Boolean types.
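C# supports eight predefined integer types, shown in the following table.
NAME | CTS TYPE | DESCRIPTION | RANGE |
sbyte | System.SByte | 8-bit signed integer | -128 to 127 |
short | System.Int16 | 16-bit signed integer | -32,768 to 32,767 |
int | System.Int32 | 32-bit signed integer | -2,147,483,648 to 2,147,483,647 |
long | System.Int64 | 64-bit signed integer | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
byte | System.Byte | 8-bit unsigned integer | 0 to 255 |
ushort | System.UInt16 | 16-bit unsigned integer | 0 to 65,535 |
uint | System.UInt32 | 32-bit unsigned integer | 0 to 4,294,967,295 |
ulong | System.UInt64 | 64-bit unsigned integer | 0 to 18,446,744,073,709,551,615 |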
Some C# types have the same names as C++ and Java types but have different definitions. For example, in C# an int is always a 32-bit signed integer. In C++ an int is a signed integer, but the number of bits is platform-dependent (32 bits on Windows). In C#, all data types have been defined in a platform-independent manner to allow for the possible future porting of C# and .NET to other platforms.
A byte is the standard 8-bit type for values in the range 0 to 255 inclusive. Be aware that, in keeping with its emphasis on type safety, C# regards the byte type and the char type as completely distinct, and any programmatic conversions between the two must be explicitly requested. Also be aware that unlike the other types in the integer family, a byte type is by default unsigned. Its signed version bears the special name sbyte.
With .NET, a short is no longer quite so short; it is now 16 bits long. The int type is 32 bits long. The long type reserves 64 bits for values. All integer-type variables can be assigned values in decimal or hex notation. The latter requires the 0x prefix:
long x = 0x12ab;
If there is any ambiguity about whether an integer is int, uint, long, or ulong, it will default to an int. To specify which of the other integer types the value should take, you can append one of the following characters to the number:
uint ui = 1234U;
long l = 1234L;
ulong ul = 1234UL;
You can also use lowercase u and l, although the latter could be confused with the integer 1 (one).
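The effect of the suffixes can be checked with type inference, as in this sketch. LiteralDemo and LiteralTypes are hypothetical names; var is used so that each variable takes exactly the type of its literal.

```csharp
using System;

namespace Wrox
{
    public class LiteralDemo
    {
        public static Type[] LiteralTypes()
        {
            var i = 1234;    // no suffix: int
            var ui = 1234U;  // U: uint
            var l = 1234L;   // L: long
            var ul = 1234UL; // UL: ulong
            return new Type[] { i.GetType(), ui.GetType(), l.GetType(), ul.GetType() };
        }

        public static void Main()
        {
            foreach (Type t in LiteralTypes())
            {
                Console.WriteLine(t); // prints System.Int32, System.UInt32, System.Int64, System.UInt64
            }

            long x = 0x12ab;          // hex notation
            Console.WriteLine(x);     // prints 4779
        }
    }
}
```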
Although C# provides a plethora of integer data types, it supports floating-point types as well.
The float data type is for smaller floating-point values, for which less precision is required. The double data type is bulkier than the float data type but offers twice the precision (15 digits).
If you hard-code a non-integer number (such as 12.3), the compiler will normally assume that you want the number interpreted as a double. To specify that the value is a float, append the character F (or f) to it:
float f = 12.3F;
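A small sketch of the default-to-double rule follows; FloatDemo and its methods are hypothetical names. The commented-out line shows the assignment that the compiler rejects when the suffix is missing.

```csharp
using System;

namespace Wrox
{
    public class FloatDemo
    {
        public static Type UnsuffixedLiteralType()
        {
            var d = 12.3;   // no suffix: double
            return d.GetType();
        }

        public static Type SuffixedLiteralType()
        {
            var f = 12.3F;  // F suffix: float
            return f.GetType();
        }

        public static void Main()
        {
            // float g = 12.3; // would not compile: no implicit conversion from double to float
            Console.WriteLine(UnsuffixedLiteralType()); // prints System.Double
            Console.WriteLine(SuffixedLiteralType());   // prints System.Single
            Console.WriteLine(sizeof(float));           // prints 4 (bytes)
            Console.WriteLine(sizeof(double));          // prints 8 (bytes)
        }
    }
}
```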
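The decimal type represents higher-precision floating-point numbers, as shown in the following table.
NAME | CTS TYPE | DESCRIPTION | SIGNIFICANT FIGURES | RANGE (APPROXIMATE) |
decimal | System.Decimal | 128-bit high-precision decimal notation | 28 | ±1.0 × 10^-28 to ±7.9 × 10^28 |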
One of the great things about the CTS and C# is the provision of a dedicated decimal type for financial calculations. How you use the 28 digits that the decimal type provides is up to you. In other words, you can track smaller dollar amounts with greater accuracy for cents or larger dollar amounts with more rounding in the fractional portion. Bear in mind, however, that decimal is not implemented under the hood as a primitive type, so using decimal has a performance effect on your calculations.
To specify that your number is a decimal type rather than a double, float, or an integer, you can append the M (or m) character to the value, as shown here:
decimal d = 12.30M;
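One reason decimal suits financial calculations is that it stores decimal fractions such as 0.1 exactly, whereas double stores a binary approximation. The following sketch, with the hypothetical class DecimalDemo, compares the same sum in both types.

```csharp
using System;

namespace Wrox
{
    public class DecimalDemo
    {
        public static bool DoubleSumIsExact()
        {
            double a = 0.1;
            double b = 0.2;
            return a + b == 0.3;  // binary rounding makes the sum differ from 0.3
        }

        public static bool DecimalSumIsExact()
        {
            decimal a = 0.1M;
            decimal b = 0.2M;
            return a + b == 0.3M; // decimal represents these fractions exactly
        }

        public static void Main()
        {
            Console.WriteLine(DoubleSumIsExact());  // prints False
            Console.WriteLine(DecimalSumIsExact()); // prints True
        }
    }
}
```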
The C# bool type is used to contain Boolean values of either true or false.
You cannot implicitly convert bool values to and from integer values. If a variable (or a function return type) is declared as a bool, you can only use values of true and false. You will get an error if you try to use zero for false and a nonzero value for true.
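In practice this means writing the comparison explicitly, as in the following sketch. BoolDemo and IsEmpty are hypothetical names, and the commented-out lines show the C-style shortcuts that the compiler rejects.

```csharp
using System;

namespace Wrox
{
    public class BoolDemo
    {
        public static bool IsEmpty(int count)
        {
            // if (count) { ... }  // would not compile: an int is not a bool
            // bool flag = 1;      // would not compile either
            return count == 0;     // the comparison itself produces a bool
        }

        public static void Main()
        {
            Console.WriteLine(IsEmpty(0)); // prints True
            Console.WriteLine(IsEmpty(3)); // prints False
        }
    }
}
```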
For storing the value of a single character, C# supports the char data type.
NAME | CTS TYPE | VALUES |
char | System.Char | Represents a single 16-bit (Unicode) character |
Literals of type char are signified by being enclosed in single quotation marks, for example, 'A'. If you try to enclose a character in double quotation marks, the compiler will treat it as a string and report an error.
As well as representing chars as character literals, you can represent them with four-digit hex Unicode values (for example, '\u0041'), as integer values with a cast (for example, (char)65), or as hexadecimal values (for example, '\x0041'). You can also represent them with an escape sequence, as shown in the following table.
ESCAPE SEQUENCE | CHARACTER |
\' | Single quotation mark |
\" | Double quotation mark |
\\ | Backslash |