Chapter 7
IN THIS CHAPTER
Working with directories and files as collections
Enumerating a collection
Implementing an indexer for easy access to collection objects
Looping through a collection by using C# iterator blocks
Chapter 6 in this minibook explores the collection classes provided by the .NET Framework class library for use with C# and other .NET languages. Collection classes are constructs in .NET that can be instantiated to hold groups of items.
The first part of this chapter extends the notion of collections a bit. For instance, consider the following collections: a file as a collection of lines or records of data, and a directory as a collection of files. Thus, this chapter builds on both the collection material in Chapter 6 of this minibook and the file material in Book 3.
However, the focus in this chapter is on several ways to step through, or iterate, all sorts of collections, from file directories to arrays and lists of all sorts.
Sometimes you want to skim a directory of files, looking for something. The following LoopThroughFiles program looks at all files in a given directory, reading each file and dumping its contents in hexadecimal format to the console. That may sound like a silly thing to do, but this program also demonstrates how to write out a file in a format other than just string types. (You can find a description of hexadecimal format in the “Getting hexed” sidebar.)
From the command line, the user specifies the directory to use as an argument to the program. The following command “hex-dumps” each file in the temp directory (including binary files as well as text files):
loopthroughfiles c:\temp
If you don't enter a directory name, the program uses the current directory by default. (A hex dump displays the output as numbers in the hexadecimal — base 16 — system. See the nearby sidebar “Getting hexed.”)
The following example shows what happens when the user specifies the invalid directory x:
Directory "x" invalid
Could not find a part of the path "C:\C#Programs\LoopThroughFiles\bin\Debug\x".
No files left
As with all examples in this book, you begin with a basic program structure, as shown in the following code. Note that you must include a separate using statement for the System.IO namespace. To this basic structure, you add the individual functions described in the sections that follow.
using System;
using System.IO;
// LoopThroughFiles -- Loop through all files contained in a directory;
// this time perform a hex dump, though it could have been anything.
namespace LoopThroughFiles
{
public class Program
{
}
}
Every console application begins with a Main() function, as previous chapters indicate. Don't worry if you don’t quite understand how the Main() function is supposed to work as part of the console application. For now, just know that the first function that C# calls is the Main() function of your console application, as shown in the following code:
public static void Main(string[] args)
{
// If no directory name provided…
string directoryName;
if (args.Length == 0)
{
// …get the name of the current directory…
directoryName = Directory.GetCurrentDirectory();
}
else
{
// …otherwise, assume that the first argument
// is the name of the directory to use.
directoryName = args[0];
}
Console.WriteLine(directoryName);
// Get a list of all files in that directory.
FileInfo[] files = GetFileList(directoryName);
// Now iterate through the files in that list,
// performing a hex dump of each file.
foreach (FileInfo file in files)
{
// Write the name of the file.
Console.WriteLine("\nhex dump of file {0}:", file.FullName);
// Now "dump" the file to the console.
DumpHex(file);
// Wait before outputting next file.
Console.WriteLine("\nPress Enter to continue to next file");
Console.ReadLine();
}
// That's it!
Console.WriteLine("\nNo files left");
Console.Read();
}
The first line in LoopThroughFiles looks for a program argument. If the argument list is empty (args.Length is zero), the program calls Directory.GetCurrentDirectory(). If you run inside Visual Studio rather than from the command line, that value defaults to the bin\Debug subdirectory of your LoopThroughFiles project directory.
The program then creates a list of all files in the specified directory by calling GetFileList(). This method returns an array of FileInfo objects. Each FileInfo object contains information about a file: for example, the filename (with the full path to the file, FullName, or without the path, Name), the creation date, and the last modified date. Main() iterates through the list of files using your old friend, the foreach statement. It displays the name of each file and then passes off the file to the DumpHex() method for display to the console. At the end of each pass through the loop, it pauses to allow the programmer a chance to gaze on the output from DumpHex().
Before you can process a list of files, you need to create one. The GetFileList() method begins by creating an empty FileInfo array and then filling it with a list of files. Here's the required code.
// GetFileList -- Get a list of all files in a specified directory.
public static FileInfo[] GetFileList(string directoryName)
{
// Start with an empty list.
FileInfo[] files = new FileInfo[0];
try
{
// Get directory information.
DirectoryInfo di = new DirectoryInfo(directoryName);
// That information object has a list of the contents.
files = di.GetFiles();
}
catch(Exception e)
{
Console.WriteLine("Directory \"{0}\" invalid", directoryName);
Console.WriteLine(e.Message);
}
return files;
}
GetFileList() then creates a DirectoryInfo object. Just as its name implies, a DirectoryInfo object contains the same type of information about a directory that a FileInfo object does about a file: name, rank, and serial-number-type stuff. However, the DirectoryInfo object has access to one thing that a FileInfo doesn't: a list of the files in the directory, in the form of a FileInfo array.
To help trap errors, GetFileList() wraps the directory- and file-related code in a big try block. (For an explanation of try and catch, see Chapter 9 in this minibook.) The catch at the end traps any errors that are generated. Just to embarrass you further, the catch block flaunts the name of the directory (which probably doesn't exist, because you entered it incorrectly).
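As a point of comparison, here is a minimal sketch of the same directory listing done lazily. It assumes that DirectoryInfo.EnumerateFiles() is available (it is part of System.IO in current .NET versions) and uses hypothetical names, FileLister and GetFileNames, that aren't part of the LoopThroughFiles example. Unlike GetFiles(), which builds the whole array first, EnumerateFiles() hands back one FileInfo at a time. The sketch also catches only DirectoryNotFoundException rather than every Exception, which is a design choice of this sketch, not of the original program.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// FileLister is a hypothetical helper, not part of LoopThroughFiles.
public static class FileLister
{
    // Returns the names of the files in directoryName, or an empty
    // list if the directory doesn't exist.
    public static List<string> GetFileNames(string directoryName)
    {
        var names = new List<string>();
        try
        {
            var di = new DirectoryInfo(directoryName);
            // EnumerateFiles() yields FileInfo objects one at a time
            // instead of filling an array up front.
            foreach (FileInfo file in di.EnumerateFiles())
            {
                names.Add(file.Name);
            }
        }
        catch (DirectoryNotFoundException e)
        {
            Console.WriteLine("Directory \"{0}\" invalid", directoryName);
            Console.WriteLine(e.Message);
        }
        return names;
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        // Use the first argument if present; otherwise the current directory.
        string directoryName = args.Length == 0
            ? Directory.GetCurrentDirectory()
            : args[0];
        foreach (string name in FileLister.GetFileNames(directoryName))
        {
            Console.WriteLine(name);
        }
    }
}
```

Because the foreach loop sits inside the try block, a missing directory is reported the same way as in GetFileList(), and the caller simply receives an empty list.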
You can do anything you want with the list of files you collect. This example displays the content of each file in hexadecimal format, which can be useful in certain circumstances, such as when you need to know how files are actually put together. Before you can display a line of hexadecimal output, however, you need to split the file's contents into individual output lines. The DumpHex() method, shown here, is a little tricky only because of the difficulties in formatting the output just right.
// DumpHex -- Given a file, dump the file contents to the console.
public static void DumpHex(FileInfo file)
{
// Open the file.
FileStream fs;
BinaryReader reader;
try
{
fs = file.OpenRead();
// Wrap the file stream in a BinaryReader.
reader = new BinaryReader(fs);
}
catch (Exception e)
{
Console.WriteLine("\ncan't read from \"{0}\"", file.FullName);
Console.WriteLine(e.Message);
return;
}
// Iterate through the contents of the file one line at a time.
for (int line = 1; true; line++)
{
// Read another 10 bytes across (all that will fit on a single
// line) -- stop when no data remains.
byte[] buffer = new byte[10];
// Use the BinaryReader to read bytes.
// Note: Using FileStream is just as easy in this case.
int numBytes = reader.Read(buffer, 0, buffer.Length);
if (numBytes == 0)
{
break;
}
// Write the data in a single line preceded by line number.
Console.Write("{0:D3} - ", line);
DumpBuffer(buffer, numBytes);
// Stop every 20 lines so that the data doesn't scroll
// off the top of the Console screen.
if ((line % 20) == 0)
{
Console.WriteLine("Press Enter to continue another 20 lines" +
" or type Q to go to the next file.");
string input = Console.ReadLine();
// ReadLine() returns null when no more input is available.
if (input != null && input.ToUpper() == "Q")
break;
}
}
// Close the reader (and the underlying stream) so that the
// file isn't left open.
reader.Close();
}
DumpHex() starts by opening file. A FileInfo object contains information about the file; it doesn't open the file. DumpHex() calls the FileInfo object's OpenRead() method to obtain a FileStream in read-only mode and then wraps that stream in a BinaryReader. The catch block handles the exception that's thrown if the file can't be opened for some reason: It displays the filename and the error message, and then returns.
DumpHex() then reads through the file, 10 bytes at a time. It displays every 10 bytes in hexadecimal format as a single line. Every 20 lines, it pauses until the user presses Enter. The code uses the modulo operator, %, to accomplish that task.
The modulo operator (%) returns the remainder after division. Thus (line % 20) == 0 is true when line equals 20, 40, 60, 80, and so on. This trick is useful in all sorts of looping situations where you want to perform an operation only every so often.
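To see the modulo test in isolation, consider this small sketch. The PauseDemo class and its PausePoints() method are hypothetical names used only for illustration; they collect the line numbers at which a loop like the one in DumpHex() would pause.

```csharp
using System;
using System.Collections.Generic;

// PauseDemo is a hypothetical helper used only to illustrate line % interval.
public static class PauseDemo
{
    // Returns the 1-based line numbers at which a pause would occur.
    public static List<int> PausePoints(int totalLines, int interval)
    {
        var points = new List<int>();
        for (int line = 1; line <= totalLines; line++)
        {
            // The remainder is zero only on every interval-th line.
            if ((line % interval) == 0)
            {
                points.Add(line);
            }
        }
        return points;
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints: 20, 40, 60
        Console.WriteLine(string.Join(", ", PauseDemo.PausePoints(65, 20)));
    }
}
```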
After you have a single line of output to display, you can output it in hexadecimal form. DumpBuffer() writes each member of a byte array using the X2 format control. Although X2 sounds like the name of a secret military experiment, it simply means “display a number as two hexadecimal digits.”
// DumpBuffer -- Write a buffer of characters as a single line in
// hex format.
public static void DumpBuffer(byte[] buffer, int numBytes)
{
for(int index = 0; index < numBytes; index++)
{
byte b = buffer[index];
Console.Write("{0:X2}, ", b);
}
Console.WriteLine();
}
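The effect of the X2 format can be checked without touching the console. The following sketch uses a hypothetical HexDemo.ToHex() method that builds the same comma-separated line that DumpBuffer() writes, but returns it as a string. It also contrasts X2 with plain X, which doesn't pad to two digits.

```csharp
using System;
using System.Text;

// HexDemo is a hypothetical helper that mirrors DumpBuffer() but
// returns a string instead of writing to the console.
public static class HexDemo
{
    public static string ToHex(byte[] buffer, int numBytes)
    {
        var sb = new StringBuilder();
        for (int index = 0; index < numBytes; index++)
        {
            // X2 = uppercase hexadecimal, padded to two digits.
            sb.Append(buffer[index].ToString("X2"));
            sb.Append(", ");
        }
        return sb.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        // The ASCII bytes for "Stream" followed by carriage return and line feed.
        byte[] data = Encoding.ASCII.GetBytes("Stream\r\n");
        Console.WriteLine(HexDemo.ToHex(data, data.Length));

        // Without the 2, small values aren't padded.
        Console.WriteLine(((byte)10).ToString("X"));  // A
        Console.WriteLine(((byte)10).ToString("X2")); // 0A
    }
}
```

The first line printed is 53, 74, 72, 65, 61, 6D, 0D, 0A, followed by a trailing comma and space, just as DumpBuffer() leaves them.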
The range of a byte is 0 to 255, or 0xFF, which is two hex digits per byte. Here are the first 20 lines of an example file:
Hex dump of file C:\Temp\output.txt:
001 - 53, 74, 72, 65, 61, 6D, 20, 28, 70, 72,
002 - 6F, 74, 65, 63, 74, 65, 64, 29, 0D, 0A,
003 - 20, 20, 46, 69, 6C, 65, 53, 74, 72, 65,
004 - 61, 6D, 28, 73, 74, 72, 69, 6E, 67, 2C,
005 - 20, 46, 69, 6C, 65, 4D, 6F, 64, 65, 2C,
006 - 20, 46, 69, 6C, 65, 41, 63, 63, 65, 73,
007 - 73, 29, 0D, 0A, 20, 20, 4D, 65, 6D, 6F,
008 - 72, 79, 53, 74, 72, 65, 61, 6D, 28, 29,
009 - 3B, 0D, 0A, 20, 20, 4E, 65, 74, 77, 6F,
010 - 72, 6B, 53, 74, 72, 65, 61, 6D, 0D, 0A,
011 - 20, 20, 42, 75, 66, 66, 65, 72, 53, 74,
012 - 72, 65, 61, 6D, 20, 2D, 20, 62, 75, 66,
013 - 66, 65, 72, 73, 20, 61, 6E, 20, 65, 78,
014 - 69, 73, 74, 69, 6E, 67, 20, 73, 74, 72,
015 - 65, 61, 6D, 20, 6F, 62, 6A, 65, 63, 74,
016 - 0D, 0A, 0D, 0A, 42, 69, 6E, 61, 72, 79,
017 - 52, 65, 61, 64, 65, 72, 20, 2D, 20, 72,
018 - 65, 61, 64, 20, 69, 6E, 20, 76, 61, 72,
019 - 69, 6F, 75, 73, 20, 74, 79, 70, 65, 73,
020 - 20, 28, 43, 68, 61, 72, 2C, 20, 49, 6E,
Press Enter to continue another 20 lines or type Q to go to the next file.
The output codes are also valid for the lower part of the much vaster Unicode character set, which C# uses by default. (The site at http://www.i18nguy.com/unicode/codepages.html provides you with listings of character sets of all kinds and is very useful if you have to deal with input from devices like mainframes.)
To run LoopThroughFiles, you need to do one of the following:
The second option in the preceding list, that of supplying a command-line argument in Visual Studio, requires a little special setting up on your part by following these steps:
Choose Project⇒ LoopThroughFiles Properties.
You see a Properties dialog box for the application.
Select Debug in the left pane.
You see the debug options shown in Figure 7-1.
Type the path you want to use, such as C:\Temp, in the Command Line Arguments field.
The path you type will work within Visual Studio whether you're in debug mode or not.
Choose File⇒ Save All.
Visual Studio saves the new path to disk.
Choose Debug⇒ Start Debugging or Debug⇒ Start Without Debugging.
You see the program execute in the path that you chose.
In the rest of this chapter, you see three different approaches to the general problem of iterating a collection. This section continues discussing the most traditional approach (at least for C# programmers), the iterator class, or enumerator, which implements the IEnumerator interface.
Different collection types may have different accessing schemes. Not all types of collections can be accessed efficiently with an index like an array’s — the linked list, for example. A linked list just contains a reference to the next item in the list and is made to be consecutively — not randomly — accessed. Differences between collection types make it impossible to write a method such as the following without special provisions:
// Pass in any kind of collection:
void MyClearMethod(Collection aColl, int index)
{
aColl[index] = 0; // Indexing doesn't work for all types of collections.
// …continues…
}
Each collection type can (and does) define its own access methods. You decide on which access method to use based on the task requirements. The CollectionMoveNext example, shown here, demonstrates three access methods for a List<string> object, Colors:
static void Main(string[] args)
{
List<string> Colors = new List<string> {
"Red", "Yellow", "Green", "Blue" };
Console.WriteLine("Using a delegate.");
Colors.ForEach(delegate (string value)
{
Console.WriteLine(value);
});
Console.WriteLine("\nUsing a foreach.");
foreach (string col in Colors)
Console.WriteLine(col);
Console.WriteLine("\nUsing an enumerator.");
var colEnum = Colors.GetEnumerator();
while (colEnum.MoveNext())
Console.WriteLine(colEnum.Current);
Console.ReadLine();
}
This example shows how to use a delegate (described in detail in Book 2, Chapter 8 of this minibook), a foreach loop (described in Chapter 6 of this minibook), and an enumerator (which Microsoft tends to confuse with iterators). The Colors.ForEach() approach has an advantage in that you can use lambda expressions with it and it's extremely flexible, but sometimes it’s hard to read. The foreach loop method is easy to read and quite common, but it lacks flexibility. The call to GetEnumerator() obtains a special object that knows how to move between entries in a List<string>. This is the best approach when you need to perform additional levels of processing and want strict control over when the Current property value changes. The iterator (enumerator) approach offers these advantages:
IEnumerator interface, it's usually straightforward to code.
To make the foreach loop possible, the IEnumerator interface must support all different types of collections, from arrays to linked lists. Consequently, its methods must be as general as possible. For example, you can't use the iterator to access locations within the collection class randomly because most collections don’t provide random access. (You’d need to invent a different enumeration interface with that capability, but it wouldn’t work with foreach.) IEnumerator provides these three features:
Reset(): Sets the enumerator to point to the beginning of the collection. Note: The generic version of IEnumerator, IEnumerator<T>, doesn't declare a Reset() method of its own; it inherits the method from IEnumerator, and enumerators aren't required to support it. With .NET’s generic LinkedList, for example, just begin with a call to MoveNext(). That generic LinkedList is found in System.Collections.Generic.
MoveNext(): Moves the enumerator from the current object in the collection to the next one.
Current: A property, rather than a method, that retrieves the data object stored at the current position of the enumerator.
The following method demonstrates this principle. The programmer of the MyCollection class (not shown) creates a corresponding iterator class, say, IteratorMyCollection. The application programmer stores ContainedDataObjects in MyCollection. The following code segment uses the standard IEnumerator members MoveNext() and Current to read these objects:
// The MyCollection class holds ContainedDataObject type objects as data.
void MyMethod(MyCollection myColl)
{
// The programmer who created the MyCollection class also
// creates an iterator class IteratorMyCollection;
// the application program creates an iterator object
// in order to navigate through the myColl object.
IEnumerator iterator = new IteratorMyCollection(myColl);
// Move the enumerator to the "next location" within the collection.
while (iterator.MoveNext())
{
// Fetch a reference to the data object at the current location
// in the collection.
ContainedDataObject contained; // Data
contained = (ContainedDataObject)iterator.Current;
// …use the contained data object…
}
}
The method MyMethod() accepts as its argument the collection of ContainedDataObjects. It begins by creating an iterator of class IteratorMyCollection. The method starts a loop by calling MoveNext(). On this first call, MoveNext() moves the iterator to the first element in the collection. On each subsequent call, MoveNext() moves the pointer to the next position. MoveNext() returns false when the collection is exhausted and the iterator cannot be moved any farther.
The Current property returns a reference to the data object at the current location of the iterator. The program converts the object returned into a ContainedDataObject before assigning it to contained. Calls to Current are invalid if the MoveNext() method didn't return true on the previous call or if MoveNext() hasn't yet been called.
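Because MyCollection and IteratorMyCollection aren't shown, here is a runnable sketch of the same MoveNext()/Current pattern applied to a built-in collection. The EnumeratorDemo class and its SumWithEnumerator() method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// EnumeratorDemo is a hypothetical helper that walks any sequence of
// int values by calling MoveNext() and Current directly.
public static class EnumeratorDemo
{
    public static int SumWithEnumerator(IEnumerable<int> numbers)
    {
        int total = 0;
        // The generic enumerator is disposable, so wrap it in using.
        using (IEnumerator<int> iterator = numbers.GetEnumerator())
        {
            // The first MoveNext() call moves to the first element;
            // the loop ends when MoveNext() returns false.
            while (iterator.MoveNext())
            {
                // Current is valid only after MoveNext() returns true.
                total += iterator.Current;
            }
        }
        return total;
    }
}

public static class Program
{
    public static void Main()
    {
        var values = new List<int> { 1, 2, 3 };
        Console.WriteLine(EnumeratorDemo.SumWithEnumerator(values)); // 6
    }
}
```

With an empty collection, the very first MoveNext() call returns false, so the loop body never runs and the method returns 0.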
The IEnumerator methods are standard enough that C# uses them automatically to implement the foreach statement. The foreach statement can access any class that implements IEnumerable or IEnumerable<T>. This section discusses foreach in terms of IEnumerable<T>, as shown in this general method that is capable of processing any such class, from arrays to linked lists to stacks and queues:
// The type parameter T must be declared on the method itself.
void MyMethod<T>(IEnumerable<T> containerOfThings)
{
foreach (T thing in containerOfThings)
{
Console.WriteLine("The next thing is {0}", thing);
}
}
A class implements IEnumerable<T> by defining the method GetEnumerator(), which returns an instance of IEnumerator<T>. Under the hood, foreach invokes the GetEnumerator() method to retrieve an iterator. It uses this iterator to make its way through the collection. Each element it retrieves has been cast appropriately before continuing into the block of code contained within the braces. Note that IEnumerable<T> and IEnumerator<T> are different, but related, interfaces. C# provides nongeneric versions of both as well, but you should prefer the generic versions for their increased type safety. IEnumerable<T> looks like this:
interface IEnumerable<T>
{
IEnumerator<T> GetEnumerator();
}
while IEnumerator<T>, shown here in simplified form, looks like this:
interface IEnumerator<T>
{
bool MoveNext();
T Current { get; }
}
The nongeneric IEnumerator interface adds a Reset() method that moves the iterator back to the beginning of the collection, and its Current property returns type Object. Note that IEnumerator<T> inherits from IEnumerator. (Interface inheritance, covered in Book 2, Chapter 7, is different from normal object inheritance.)
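To make the relationship between the two interfaces concrete, here is a minimal sketch of a collection that implements IEnumerable<T> with a handwritten enumerator. The Countdown class is hypothetical and exists only for this illustration. Notice that the enumerator has to supply both the generic and the nongeneric Current, plus Reset() and Dispose(), because IEnumerator<T> inherits from IEnumerator and IDisposable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Countdown is a hypothetical "collection" that produces the values
// start, start - 1, ..., 1.
public class Countdown : IEnumerable<int>
{
    private readonly int _start;

    public Countdown(int start) { _start = start; }

    // The generic version, which foreach prefers.
    public IEnumerator<int> GetEnumerator() => new CountdownEnumerator(_start);

    // The nongeneric version, required because IEnumerable<T>
    // inherits from IEnumerable.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    private class CountdownEnumerator : IEnumerator<int>
    {
        private readonly int _start;
        private int _current;

        public CountdownEnumerator(int start)
        {
            _start = start;
            _current = start + 1; // Positioned before the first element.
        }

        public int Current => _current;

        object IEnumerator.Current => Current;

        public bool MoveNext()
        {
            if (_current <= 1)
            {
                return false; // Nothing left to count.
            }
            _current--;
            return true;
        }

        public void Reset() { _current = _start + 1; }

        public void Dispose() { } // Nothing to release.
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (int n in new Countdown(3))
        {
            Console.WriteLine(n); // 3, then 2, then 1
        }
    }
}
```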
C# arrays (embodied in the Array class they're based on) and all the .NET collection classes already implement both interfaces. So it’s only when you’re writing your own custom collection class that you need to take care of implementing these interfaces. For built-in collections, you can just use them. See the System.Collections.Generic namespace topic at https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic?view=net-5.0 for details. Thus you can write the foreach loop this way:
foreach(int nValue in myCollection)
{
// …
}
Accessing the elements of an array is simple: The statement container[n] accesses the nth element of the container array. The value in brackets is an index, while the [] are called the subscript operator. If only indexing into other types of collections were so simple.
C# enables you to write your own implementation of the index operation. You can provide an index feature for collections that wouldn't otherwise enjoy such a feature. In addition, you can index on subscript types other than the simple integers to which C# arrays are limited. For example, by writing your own index feature, you can interact with string types. As another example, you could create an index feature for a programming construct like container["Joe"]. (The “Indexers” section of Book 2, Chapter 11 shows how to add an indexer to a struct.)
The indexer looks much like an ordinary get/set property (Book 2, Chapter 3 describes accessors in more detail), except for the appearance of the keyword this and the subscript operator [] instead of the property name, as shown in this bit of code:
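class MyArray
{
// Backing storage for the indexer (the size of 10 is arbitrary).
private string[] _items = new string[10];
public string this[int index] // Notice the "this" keyword.
{
get => _items[index];
set => _items[index] = value;
}
}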
The example shows a short form of an indexer that you use when you don't need to do anything except get and set values. The “Working with indexers” section, later in this chapter, shows a longer version. Under the hood, the expression s = myArray[i]; invokes the get accessor method, passing it the value of i as the index. In addition, the expression myArray[i] = "some string"; invokes the set accessor method, passing it the same index i and "some string" as value.
The index type isn't limited to int. You may choose to index a collection of houses by their owners’ names, by house address, or by any number of other indices. In addition, the indexer property can be overloaded with multiple index types, so you can index on a variety of elements in the same collection. The following sections discuss the Indexer program, which generates the virtual array class KeyedArray. This virtual array looks and acts like an array except that it uses a string value as the index. (Note that you could replicate the functionality found in this example by using a C# Dictionary, as described at https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.dictionary-2.)
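As a small illustration of overloading an indexer on more than one index type, consider the following sketch. The Street class, its owner names, and its addresses are all made up for this example; the class offers one read-only indexer that takes an int position and another that takes a string owner name.

```csharp
using System;

// Street is a hypothetical collection of houses, used only to show
// two indexers with different index types on the same class.
public class Street
{
    private readonly string[] _owners = { "Ann", "Bob", "Cy" };
    private readonly string[] _addresses = { "1 Elm St", "2 Elm St", "3 Elm St" };

    // Index by position, just like an array.
    public string this[int index] => _addresses[index];

    // Index by owner name; returns null if the owner isn't found.
    public string this[string owner]
    {
        get
        {
            int i = Array.IndexOf(_owners, owner);
            return i < 0 ? null : _addresses[i];
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var street = new Street();
        Console.WriteLine(street[1]);    // 2 Elm St
        Console.WriteLine(street["Cy"]); // 3 Elm St
    }
}
```

The compiler picks the indexer based on the type of the expression in the brackets, in the same way that it resolves overloaded methods.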
The Indexer example relies on a special class, which means you must create a class framework for it. (Don't worry if some of the terms for this example seem strange; you discover a lot more about classes and other object-oriented programming, or OOP, techniques in Book 2.) Here is the framework used to hold the class methods discussed in sections that follow.
public class KeyedArray
{
// The following string provides the "key" into the array --
// the key is the string used to identify an element.
private string[] _keys;
// The object is the actual data associated with that key.
private object[] _arrayElements;
// KeyedArray -- Create a fixed-size KeyedArray.
public KeyedArray(int size)
{
_keys = new string[size];
_arrayElements = new object[size];
}
}
The class KeyedArray holds two ordinary arrays. The _arrayElements array of objects contains the actual KeyedArray data. The string types that inhabit the _keys array act as identifiers for the object array. The ith element of _keys corresponds to the ith entry of _arrayElements. The application program can then index KeyedArray via string identifiers that have meaning to the application. A noninteger index is referred to as a key.
The line that reads public KeyedArray(int size) is the start of a special kind of function called a constructor. Think of a constructor as an instruction to build an instance of the class. You don't need to worry about it for now, but the constructor actually assigns values to _keys and _arrayElements.
At this point, you need to define an indexer to make your code work, as shown in the following code. The indexer, public object this[string key], requires the use of two functions, Find() and FindEmpty(). Note that you add this code to the end of the KeyedArray class.
// Find -- Find the index of the element corresponding to the
// string targetKey (return a negative if it can't be found).
private int Find(string targetKey)
{
for (int i = 0; i < _keys.Length; i++)
{
if (String.Compare(_keys[i], targetKey) == 0)
{
return i;
}
}
return -1;
}
// FindEmpty -- Find room in the array for a new entry.
private int FindEmpty()
{
for (int i = 0; i < _keys.Length; i++)
{
if (_keys[i] == null)
{
return i;
}
}
throw new Exception("Array is full");
}
// Look up contents by string key -- this is the indexer.
public object this[string key]
{
set
{
// See if the string is already there.
int index = Find(key);
if (index < 0)
{
// It isn't -- find a new spot.
index = FindEmpty();
_keys[index] = key;
}
// Save the object in the corresponding spot.
_arrayElements[index] = value;
}
get
{
int index = Find(key);
if (index < 0)
{
return null;
}
return _arrayElements[index];
}
}
The set[string] indexer starts by checking to see whether the specified key already exists by calling the method Find(). If Find() returns an index, set[] stores the new data object into the corresponding index in _arrayElements. If Find() can't find the key, set[] calls FindEmpty() to return an empty slot in which to store the object provided.
The get[] side of the indexer follows similar logic. It first searches for the specified key using the Find() method. If Find() returns a non-negative index, get[] returns the corresponding member of _arrayElements where the data is stored. If Find() returns -1, get[] returns null, indicating that it can't find the provided key anywhere in the list.
The Find() method loops through the members of _keys to look for the element with the same value as the string targetKey passed in. Find() returns the index of the found element (or -1 if none was found). FindEmpty() returns the index of the first element that has no key element.
The Main() method, which is part of the Indexer program and not part of the class, demonstrates the KeyedArray class in a trivial way:
static void Main(string[] args)
{
// Create an array with enough room.
KeyedArray ma = new KeyedArray(100);
// Save the ages of the Simpson kids.
ma["Bart"] = 10;
ma["Lisa"] = 8;
ma["Maggie"] = 2;
// Look up the age of Lisa.
Console.WriteLine("Let's find Lisa's age");
int age = (int)ma["Lisa"];
Console.WriteLine("Lisa is {0}", age);
Console.Read();
}
The program creates a KeyedArray object ma of length 100 (that is, with 100 free elements). It continues by storing the ages of the children in The Simpsons TV show, indexed by each child's name. Finally, the program retrieves Lisa’s age using the expression (int)ma["Lisa"] and displays the result.
Notice that the program has to cast the value returned from ma[] because KeyedArray is written to hold any type of object. The cast wouldn't be necessary if the indexer were written to handle only int values, or if the KeyedArray were generic. (For more information about generics, see Chapter 8 in this minibook.) The output of the program is simple yet elegant:
Let's find Lisa's age
Lisa is 8
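To show what the generic alternative might look like, here is a hedged sketch of a KeyedArray<T> class. It isn't part of the Indexer program; it's an assumption about how you could adapt the class, and it uses Array.IndexOf() in place of the handwritten Find() and FindEmpty() loops. One behavioral difference is worth noting: For a missing key, this version returns default(T), which is 0 for int, rather than null.

```csharp
using System;

// KeyedArray<T> is a hypothetical generic variant of KeyedArray.
public class KeyedArray<T>
{
    private readonly string[] _keys;
    private readonly T[] _arrayElements;

    public KeyedArray(int size)
    {
        _keys = new string[size];
        _arrayElements = new T[size];
    }

    // Returns the index of key, or -1 if the key isn't present.
    private int Find(string key) => Array.IndexOf<string>(_keys, key);

    public T this[string key]
    {
        get
        {
            int index = Find(key);
            // default(T) stands in for "not found" (0 for int).
            return index < 0 ? default(T) : _arrayElements[index];
        }
        set
        {
            int index = Find(key);
            if (index < 0)
            {
                // Look for the first unused slot (a null key).
                index = Array.IndexOf<string>(_keys, null);
                if (index < 0)
                {
                    throw new InvalidOperationException("Array is full");
                }
                _keys[index] = key;
            }
            _arrayElements[index] = value;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var ages = new KeyedArray<int>(100);
        ages["Bart"] = 10;
        ages["Lisa"] = 8;
        ages["Maggie"] = 2;
        int age = ages["Lisa"]; // No cast needed.
        Console.WriteLine("Lisa is {0}", age);
    }
}
```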
In previous versions of C#, the techniques associated with linked lists, discussed in the section “Accessing Collections the Array Way: Indexers,” earlier in this chapter, were the primary practice for moving through collections, just as in C++ and C before it. Although that solution does work, C# versions 2.0 and above have simplified this process so that
You don't have to call GetEnumerator() (and cast the results).
You don't have to call MoveNext().
You don't have to call Current and cast its return value.
You can simply use foreach to iterate the collection. (C# does the rest for you under the hood; it even writes the enumerator class.)
Rather than implement all those interface methods in collection classes that you write, you can provide an iterator block as shown in the IteratorBlocks example, and you don't have to write your own iterator class to support the collection. You can use iterator blocks for a host of other chores, too, as shown in the next example.
The best approach to iteration uses iterator blocks. When you write a collection class (and the need still exists for custom collection classes such as KeyedList and PriorityQueue), you implement an iterator block in its code rather than implement the IEnumerator interface. Then users of that class can simply iterate the collection with foreach. Here is the basic framework used for this example, which contains the functions that follow in the upcoming sections:
static void Main(string[] args)
{
// Instantiate a MonthDays "collection" class.
MonthDays md = new MonthDays();
// Iterate it.
Console.WriteLine("Stream of months:\n");
foreach (string month in md)
{
Console.WriteLine(month);
}
// Instantiate a StringChunks "collection" class.
StringChunks sc = new StringChunks();
// Iterate it: prints pieces of text.
// This iteration puts each chunk on its own line.
Console.WriteLine("\nstream of string chunks:\n");
foreach (string chunk in sc)
{
Console.WriteLine(chunk);
}
// And this iteration puts it all on one line.
Console.WriteLine("\nstream of string chunks on one line:\n");
foreach (string chunk in sc)
{
Console.Write(chunk);
}
Console.WriteLine();
// Instantiate a YieldBreakEx "collection" class.
YieldBreakEx yb = new YieldBreakEx();
// Iterate it, but stop after 13.
Console.WriteLine("\nstream of primes:\n");
foreach (int prime in yb)
{
Console.WriteLine(prime);
}
// Instantiate an EvenNumbers "collection" class.
EvenNumbers en = new EvenNumbers();
// Iterate it: prints even numbers from 10 down to 4.
Console.WriteLine("\nstream of descending evens :\n");
foreach (int even in en.DescendingEvens(11, 3))
{
Console.WriteLine(even);
}
// Instantiate a PropertyIterator "collection" class.
PropertyIterator prop = new PropertyIterator();
// Iterate it: produces one double at a time.
Console.WriteLine("\nstream of double values:\n");
foreach (double db in prop.DoubleProp)
{
Console.WriteLine(db);
}
Console.Read();
}
The Main() method shown provides basic testing functions for the iterator block code. Each of the sections that follow tells you how the code in the Main() method interacts with the iterator block. In other words, the example won't compile until you add the code from the upcoming sections. For now, just know that the Main() method is just one function, and the following sections break it apart so that you can understand it better.
The following class provides an iterator that steps through the months of the year:
//MonthDays -- Define an iterator that returns the months
// and their lengths in days -- sort of a "collection" class.
class MonthDays
{
// Here's the "collection."
string[] months =
{ "January 31", "February 28", "March 31",
"April 30", "May 31", "June 30", "July 31",
"August 31", "September 30", "October 31",
"November 30", "December 31" };
//GetEnumerator -- Here's the iterator. See how it's invoked
// in Main() with foreach.
public System.Collections.IEnumerator GetEnumerator()
{
foreach (string month in months)
{
// Return one month per iteration.
yield return month;
}
}
}
Here’s part of a Main() method that iterates this collection using a foreach loop:
// Instantiate a MonthDays "collection" class.
MonthDays md = new MonthDays();
// Iterate it.
foreach (string month in md)
{
Console.WriteLine(month);
}
This collection class is based on an array, as KeyedArray is. The class contains an array whose items are string types. When a client iterates this collection, the collection's iterator block delivers string types one by one. Each string contains the name of a month (in sequence), with the number of days in the month tacked on to the string.
The class defines its own iterator block, in this case as a method named GetEnumerator(), which returns an object of type System.Collections.IEnumerator. Now, it's true that you had to write such a method before, but you also had to write your own enumerator class to support your custom collection class. Here, you just write a fairly simple method to return an enumerator based on the new yield return keywords. C# does the rest for you: It creates the underlying enumerator class and takes care of calling MoveNext() to iterate the array. You get away with much less work and much simpler code.
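One consequence of letting C# generate the enumerator is that the code in an iterator block runs on demand, one yield return at a time, rather than all at once. The following sketch makes that visible by recording what happens in a log. The LazyDemo class, its Numbers() iterator, and its Trace() method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// LazyDemo is a hypothetical class that records when each part of
// an iterator block actually executes.
public static class LazyDemo
{
    // An iterator block: each MoveNext() call runs the code up to
    // the next yield return.
    public static IEnumerable<int> Numbers(List<string> log)
    {
        log.Add("start");
        yield return 1;
        log.Add("between");
        yield return 2;
        log.Add("end");
    }

    public static List<string> Trace()
    {
        var log = new List<string>();
        // Calling Numbers() runs none of the iterator's code yet.
        IEnumerable<int> sequence = Numbers(log);
        log.Add("created");
        foreach (int n in sequence)
        {
            log.Add("got " + n);
        }
        return log;
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints: created, start, got 1, between, got 2, end
        Console.WriteLine(string.Join(", ", LazyDemo.Trace()));
    }
}
```

The "created" entry appears before "start" because nothing inside Numbers() executes until foreach makes its first MoveNext() call.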
Note that the GetEnumerator() method of class MonthDays contains a foreach loop to yield the string types in its inner array. Iterator blocks often use a loop of some kind to do this, as you can see in several later examples. In effect, the foreach loop inside GetEnumerator() serves up item after item, and the foreach loop in your calling code, outside GetEnumerator(), consumes those items one at a time.
Take a moment to compare the little collection in this example with an elaborate LinkedList collection. Whereas LinkedList has a complex structure of nodes connected by pointers, this little months collection is based on a simple array (with canned content, at that). The example expands the collection notion a bit and then develops it even more before this chapter concludes.
Your collection class may not contain canned content; most collections are designed to hold things you put into them via Add() methods and the like. The KeyedArray class in the earlier section “Accessing Collections the Array Way: Indexers,” for example, uses the [] subscript operator to add items. Your collection could also provide an Add() method as well as an iterator block so that it works with foreach.
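As a minimal sketch of that idea, the following class pairs an Add() method with an iterator block. The class name NameBag and its internal List<string> are assumptions made for this illustration; they don't come from the chapter's example programs.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: filled through Add(), read through foreach.
class NameBag
{
    private readonly List<string> names = new List<string>();

    public void Add(string name)
    {
        names.Add(name);
    }

    // The iterator block is all that foreach needs to find.
    public IEnumerator GetEnumerator()
    {
        foreach (string name in names)
        {
            yield return name;
        }
    }
}

class Program
{
    static void Main()
    {
        NameBag bag = new NameBag();
        bag.Add("Ada");
        bag.Add("Grace");

        foreach (string name in bag)
        {
            Console.WriteLine(name);
        }
    }
}
```

The items come back in the order in which they were added because the underlying List<string> preserves insertion order.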
The point of a collection, in the most general sense, is to store multiple objects and to allow you to iterate those objects, retrieving them one at a time sequentially, and sometimes randomly (or apparently randomly) as well, as in the Indexer example. (Of course, an array can do that, even without the extra apparatus of a class such as MonthDays, but iterators go well beyond the MonthDays example.)
More generally, regardless of what an iterable collection does under the hood, it produces a “stream” of values, which you get at with foreach. To drive home the point, here's another simple collection class from IteratorBlocks, one that stretches the idea of a collection about as far as possible (you may think):
//StringChunks -- Define an iterator that returns chunks of text,
//    one per iteration -- another oddball "collection" class.
class StringChunks
{
    //GetEnumerator -- This is an iterator; see how it's invoked
    //    (twice) in Main.
    public System.Collections.IEnumerator GetEnumerator()
    {
        // Return a different chunk of text on each iteration.
        yield return "Using iterator ";
        yield return "blocks ";
        yield return "isn't all ";
        yield return "that hard";
        yield return ".";
    }
}
Oddly, the StringChunks collection stores nothing in the usual sense. It doesn't even contain an array. So where's the collection? It's in that sequence of yield return statements, which use a special syntax to return one item at a time until all have been returned. The collection “contains” five objects, each a simple string much like the ones stored in an array in the previous MonthDays example. And, from outside the class, in Main(), you can iterate those objects with a simple foreach loop because the yield return statements deliver one string at a time, in sequence. Here's part of a simple Main() method that iterates a StringChunks collection:
// Instantiate a StringChunks "collection" class.
StringChunks sc = new StringChunks();
// Iterate it: prints pieces of text.
foreach (string chunk in sc)
{
    Console.WriteLine(chunk);
}
The sections that follow focus on two useful statements: yield return and yield break. The yield return statement resembles the combination of MoveNext() and Current for retrieving the next item in a collection. The yield break statement resembles the C# break statement, which lets you break out of a loop or switch statement.
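The yield return syntax works this way: The first time it's called, it returns the first value in the collection. The next time it's called, it returns the second value. And so on, until the values run out.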
Using yield is much like calling the MoveNext() method explicitly, as in a LinkedList. Each MoveNext() call produces a new item from the collection. But here you don't need to call MoveNext(). (You can bet, though, that it's being done for you somewhere behind that yield return syntax.)
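If you're curious about what goes on behind the scenes, the following sketch drives the StringChunks iterator by hand, calling MoveNext() and reading Current directly. It's only an approximation of what foreach does for you (the real loop also handles cleanup of the enumerator), and the Program class is added here just to make the sketch runnable.

```csharp
using System;
using System.Collections;

class StringChunks
{
    public IEnumerator GetEnumerator()
    {
        yield return "Using iterator ";
        yield return "blocks ";
        yield return "isn't all ";
        yield return "that hard";
        yield return ".";
    }
}

class Program
{
    static void Main()
    {
        StringChunks sc = new StringChunks();

        // Ask the collection for its enumerator, as foreach would.
        IEnumerator enumerator = sc.GetEnumerator();

        // Each MoveNext() call runs the iterator block up to its
        // next yield return; Current then holds the yielded value.
        while (enumerator.MoveNext())
        {
            string chunk = (string)enumerator.Current;
            Console.Write(chunk);
        }
        Console.WriteLine();
    }
}
```

This version writes all five chunks on one line: Using iterator blocks isn't all that hard.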
You might wonder what's meant by “the next time it's called.” Here again, the foreach loop is used to iterate the StringChunks collection:
foreach (string chunk in sc)
{
    Console.WriteLine(chunk);
}
Each time the loop obtains a new chunk from the iterator (on each pass through the loop), the iterator stores the position it has reached in the collection (as all iterators do). On the next pass through the foreach loop, the iterator returns the next value in the collection, and so on.
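One way to watch the iterator remember its position is to put trace messages between the yield return statements. In this sketch, the TracedChunks class and its messages are invented for the demonstration; they aren't part of the IteratorBlocks program.

```csharp
using System;
using System.Collections;

// Hypothetical iterator that announces when it starts and resumes.
class TracedChunks
{
    public IEnumerator GetEnumerator()
    {
        Console.WriteLine("  (iterator: starting)");
        yield return "first";
        Console.WriteLine("  (iterator: resumed after first)");
        yield return "second";
        Console.WriteLine("  (iterator: resumed after second)");
    }
}

class Program
{
    static void Main()
    {
        foreach (string chunk in new TracedChunks())
        {
            Console.WriteLine("loop body got: " + chunk);
        }
    }
}
```

The trace messages and the loop-body messages come out interleaved: the iterator runs only as far as its next yield return, hands over one value, and picks up from that exact spot on the next pass through the loop.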
You need to understand an interesting bit of syntax related to yield. You can stop the progress of the iterator at some point by specifying the yield break statement in the iterator. Say that a threshold is reached after testing a condition in the collection class's iterator block, and you want to stop the iteration at that point. Here's a brief example of an iterator block that uses yield break in just that way:
//YieldBreakEx -- Another example of the yield break keyword
class YieldBreakEx
{
    int[] primes = { 2, 3, 5, 7, 11, 13, 17, 19, 23 };
    //GetEnumerator -- Returns a sequence of prime numbers
    //    Demonstrates yield return and yield break
    public System.Collections.IEnumerator GetEnumerator()
    {
        foreach (int prime in primes)
        {
            if (prime > 13) yield break;
            yield return prime;
        }
    }
}
In this case, the iterator block contains an if statement that checks each prime number as the iterator reaches it in the collection (using another foreach inside the iterator, by the way). If the prime number is greater than 13, the block invokes yield break to stop producing primes. Otherwise, it continues, with each yield return giving up another prime number until the collection is exhausted.
In earlier examples in this chapter, iterator blocks have looked like this:
public System.Collections.IEnumerator GetEnumerator()
{
    yield return something;
}
But iterator blocks can also take a couple of other forms: named iterators and class properties.
Rather than always write an iterator block as a method named GetEnumerator(), you can write a named iterator: a method that returns the System.Collections.IEnumerable interface instead of IEnumerator and that you don't have to name GetEnumerator(). You can name it something like MyMethod() instead. For example, you can use this simple method to iterate the even numbers from a top value that you specify down to a stop value (yes, in descending order). Iterators can do just about anything:
//EvenNumbers -- Define a named iterator that returns even numbers
//    from the "top" value you pass in DOWN to the "stop" value.
//    Another oddball "collection" class
class EvenNumbers
{
    //DescendingEvens -- This is a "named iterator."
    //    Also demonstrates the yield break keyword
    //    See how it's invoked in Main() with foreach.
    public System.Collections.IEnumerable DescendingEvens(int top,
                                                          int stop)
    {
        // Start top at nearest lower even number.
        if (top % 2 != 0) // If remainder after top / 2 isn't 0.
            top -= 1;
        // Iterate from top down to nearest even above stop.
        for (int i = top; i >= stop; i -= 2)
        {
            // The loop condition already guarantees i >= stop, so this
            // guard never fires; it's here only to show yield break.
            if (i < stop)
                yield break;
            // Return the next even number on each iteration.
            yield return i;
        }
    }
}
The DescendingEvens() method takes two parameters (a handy addition), which set the upper limit of even numbers that you want to start from and the lower limit where you want to stop. The first even number that's generated will equal the top parameter or, if top is odd, the nearest even number below it. The last even number generated will equal the value of the stop parameter (or if stop is odd, the nearest even number above it). The method doesn't return an int itself, however; it returns the IEnumerable interface. But it still contains a yield return statement that hands back one even number and then waits until the foreach loop asks for the next one. That's where the int is yielded up.
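A named iterator doesn't even need an end. The following method has no loop condition, so it keeps yielding integers (starting at 0, despite its name) for as long as the caller keeps asking:
public System.Collections.IEnumerable PositiveIntegers()
{
    for (int i = 0; ; i++)
    {
        yield return i;
    }
}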
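Because an unbounded iterator such as PositiveIntegers() never finishes on its own, the calling loop has to decide when to quit. Here's a minimal sketch of one way to do that. The containing class name Counter and the cutoff of 5 are assumptions made for the illustration.

```csharp
using System;
using System.Collections;

// Hypothetical home for the unbounded named iterator.
class Counter
{
    public IEnumerable PositiveIntegers()
    {
        // No end condition: the sequence stops only when the
        // caller stops asking for values.
        for (int i = 0; ; i++)
        {
            yield return i;
        }
    }
}

class Program
{
    static void Main()
    {
        Counter counter = new Counter();
        foreach (int n in counter.PositiveIntegers())
        {
            if (n > 5)
            {
                break;   // The caller ends the iteration.
            }
            Console.WriteLine(n);
        }
    }
}
```

The loop prints 0 through 5 and then stops. Nothing beyond the values that the loop requests is ever computed, which is why an endless iterator is safe to write as long as its caller eventually breaks out.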
Here's how Main() invokes the DescendingEvens() named iterator:
// Instantiate an EvenNumbers "collection" class.
EvenNumbers en = new EvenNumbers();
// Iterate it: prints even numbers from 10 down to 4.
Console.WriteLine("\nstream of descending evens :\n");
foreach (int even in en.DescendingEvens(11, 3))
{
    Console.WriteLine(even);
}
This call produces a list of even-numbered integers from 10 down through 4. Notice also how the foreach is specified. You have to instantiate an EvenNumbers object (the collection class). Then, in the foreach statement, you invoke the named iterator method through that object:
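EvenNumbers en = new EvenNumbers();
foreach (int even in en.DescendingEvens(nTop, nStop)) …
If DescendingEvens() were declared static, you wouldn't need the class instance at all. You would call the method through the class itself:
foreach (int even in EvenNumbers.DescendingEvens(nTop, nStop)) …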
If you can produce a “stream” of even numbers with a foreach statement, think of all the other useful things you may produce with special-purpose collections like these: streams of powers of two or of terms in a mathematical series such as prime numbers or squares, or even something exotic such as Fibonacci numbers. Or how about a stream of random numbers (which the Random class already hands out one call at a time) or of randomly generated objects?
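As one illustration, here's a sketch of a named iterator that produces a stream of Fibonacci numbers. The Sequences class, the Fibonacci() method, and its count parameter are all hypothetical names chosen for this example. The method is declared static, so the foreach loop can call it through the class without creating an instance.

```csharp
using System;
using System.Collections;

// Hypothetical holder for special-purpose "stream" iterators.
class Sequences
{
    // Named iterator: yields the first 'count' Fibonacci numbers,
    // starting from 0. (A long holds the values safely only for
    // modest counts.)
    public static IEnumerable Fibonacci(int count)
    {
        long previous = 0;
        long current = 1;
        for (int i = 0; i < count; i++)
        {
            yield return previous;
            long next = previous + current;
            previous = current;
            current = next;
        }
    }
}

class Program
{
    static void Main()
    {
        foreach (long f in Sequences.Fibonacci(10))
        {
            Console.Write(f + " ");
        }
        Console.WriteLine();
    }
}
```

This prints 0 1 1 2 3 5 8 13 21 34. As with DescendingEvens(), the method returns IEnumerable, and each value is computed only when the loop asks for it.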
You can also implement an iterator block as a property of a class, specifically in the get accessor for the property. In this simple class with a DoubleProp property, the property's get accessor acts as an iterator block to return a stream of double values:
//PropertyIterator -- Demonstrate implementing a class
//    property's get accessor as an iterator block.
class PropertyIterator
{
    double[] doubles = { 1.0, 2.0, 3.5, 4.67 };
    // DoubleProp -- A "get" property with an iterator block
    public System.Collections.IEnumerable DoubleProp
    {
        get
        {
            foreach (double db in doubles)
            {
                yield return db;
            }
        }
    }
}
You write the DoubleProp header in much the same way as you write the DescendingEvens() method's header in the named iterators example. The property's type is the IEnumerable interface, but as a property it has no parentheses after the property name and it has a get accessor, though no set accessor. The get accessor is implemented as a foreach loop that iterates the collection and uses the standard yield return to yield up, in turn, each item in the collection of doubles. Here's the way the property is accessed in Main():
// Instantiate a PropertyIterator "collection" class.
PropertyIterator prop = new PropertyIterator();
// Iterate it: produces one double at a time.
Console.WriteLine("\nstream of double values:\n");
foreach (double db in prop.DoubleProp)
{
    Console.WriteLine(db);
}
Running the complete IteratorBlocks program produces output like the following:
Stream of months:
January 31
February 28
March 31
April 30
May 31
June 30
July 31
August 31
September 30
October 31
November 30
December 31
Stream of string chunks:
Using iterator
blocks
isn't all
that hard
.
stream of string chunks on one line:
Using iterator blocks isn't all that hard.
stream of primes:
2
3
5
7
11
13
stream of descending evens :
10
8
6
4
stream of double values:
1
2
3.5
4.67