Chapter 6. Query Operators

The preceding chapter covered the structure of query expressions. The next step is to begin embellishing queries with operators.

This book is not intended to be a reference; nevertheless, this chapter and the preceding one cover all the LINQ operators. You can supplement this material by referring to examples that are available for download on the book’s Web site (as described in the Appendix), the online help, or the excellent SampleQueries program that ships with Visual Studio. Like the samples available for download, the SampleQueries program provides code showing how to use each of the operators. Written primarily by C# PM Alex Turner, SampleQueries contains more than 500 sample methods. Included in the sample program are more than 100 LINQ to SQL queries, 100 LINQ to Objects queries, and 100 LINQ to XML queries, as well as a set of LINQ to DataSet queries. See Appendix A for more information on how to locate and install SampleQueries.

Locating and Grouping the LINQ Operators

The C# team broke the LINQ operators into groups. I will use these categories to give the discussion structure. Some of these operators should already be familiar to you. For instance, five operators used to create query expressions were discussed in the preceding chapters: Where, GroupBy, Join, Select, and SelectMany. Two of the operators, Concat and Reverse, stand on their own. I merged them into the Set and Ordering groups, respectively. All the other operators are sorted into the default groups established by the C# team.

You have, of course, been working with LINQ operators such as where, orderby, and select since Chapter 2, “Getting Started.” You have seen that they are implemented as extension methods and are in a class called Enumerable. You know that you can extend LINQ by creating your own operators, and you can modify its behavior by overriding the existing operators. Nevertheless, it is the 49 LINQ operators that ship with the product that usually define what LINQ can and cannot do. Table 6.1 shows the complete set of those operators and three accompanying utilities (marked with a +). In this table and the others in this chapter, the operators that are not deferred are marked with a *.

Table 6.1. The LINQ Query Operators and Three Utilities Can Be Assigned to 12 Categories

image

image

These operators provide support for set operations, joins, ordering, grouping, and aggregation. Other operators, such as the Element and Partitioning types, allow you to easily access individual elements returned by a query.

Don’t look at the list of operators shown in Table 6.1 as a democratic brotherhood of equals. As you learned in the previous chapter, you must master five core operators (Where, OrderBy, GroupBy, Join, and Select) if you want to understand LINQ. The clauses based on these operators, plus those formed with let and from, are the body and limbs of a query expression. They are the structure on which a query expression is built.

When trying to decide which LINQ operator to use, there is no need to scan the list of all the operators to find the one that fits your current needs. Instead, you should first master the big five and then learn to pick and choose from the others as you find the need. Do you need to perform a calculation? If so, take a look at the Aggregate operators. Do you need to find the union or intersection of two sequences? Take a look at the Set operators. Do you want to convert an IEnumerable<T> to a List<T>? Take a look at the Conversion operators.

The LINQ operators form a rich and varied API. Studying them can help you reach a level of proficiency sufficient to support writing sophisticated LINQ queries. If you can create simple queries quickly and efficiently, you will find that the small gaps in your knowledge will be filled in automatically during the course of your daily work.

Code Reuse

Throughout this chapter I will need to repeat the foreach loop that displays the data returned from a LINQ to Objects query. Because the lines of code for this process rarely change, I have created the following simple method, which I will call instead of showing you the same foreach loop repeatedly:

public static void ShowList<T>(IEnumerable<T> list)
{
    foreach (var item in list)
    {
        Console.WriteLine(item);
    }
}

I will also frequently use this code to display a title to the console:

public static void ShowTitle(string p)
{
    Console.WriteLine("=================");
    Console.WriteLine(p);
    Console.WriteLine("=================");
}

Furthermore, I will use a list of famous Romans on several occasions:

public class Roman
{
    public int Id { get; set; }
    public string Name { get; set; }
    public char Gender { get; set; }

    public override string ToString()
    {

        return String.Format(
            "{{ Id = {0}; Gender = {1}; Name = {2} }}",
            Id.ToString().PadLeft(2, '0'), Gender, Name);
    }
}

public static List<Roman> romans = new List<Roman>()
{
    new Roman { Id=00, Gender='f', Name = "Aelia Paetina" },
    new Roman { Id=01, Gender='f', Name = "Agrippina the Younger" },
    new Roman { Id=02, Gender='m', Name = "Augustus" },
    new Roman { Id=03, Gender='f', Name = "Caesonia" },
    new Roman { Id=04, Gender='m', Name = "Caligula" },
    new Roman { Id=05, Gender='f', Name = "Claudia Octavia" },
    new Roman { Id=06, Gender='m', Name = "Claudius" },
    new Roman { Id=07, Gender='f', Name = "Clodia Pulchra" },
    new Roman { Id=08, Gender='f', Name = "Julia the Elder" },
    new Roman { Id=09, Gender='f', Name = "Junia Claudilla" },
    new Roman { Id=10, Gender='f', Name = "Livia Drusilla" },
    new Roman { Id=11, Gender='f', Name = "Livia Orestilla" },
    new Roman { Id=12, Gender='f', Name = "Lollia Paulina" },
    new Roman { Id=13, Gender='f', Name = "Messalina" },
    new Roman { Id=14, Gender='m', Name = "Nero" },
    new Roman { Id=15, Gender='f', Name = "Plautia Urgulanilla" },
    new Roman { Id=16, Gender='f', Name = "Poppaea Sabina" },
    new Roman { Id=17, Gender='f', Name = "Scribonia" },
    new Roman { Id=18, Gender='f', Name = "Statilia Messalina" },
    new Roman { Id=19, Gender='m', Name = "Tiberius" },
    new Roman { Id=20, Gender='f', Name = "Vipsania Agrippina" },
};

Locating the LINQ Operators

The LINQ to Objects operators are implemented in a class called System.Linq.Enumerable. You have already seen how to use LINQ to Objects to write a Reflection query that enumerates these methods. You can also find their declarations, but not their full source code, by using the tools built into the Visual Studio IDE. To find the declarations, first start a standard console application. In the main block, type in System.Linq.Enumerable, put the cursor on the word Enumerable, and press F12. Alternatively, you can right-click and select Go to Definition. You are taken to the metadata for the Enumerable class, as shown in Figure 6.1.

Figure 6.1. The IDE uses metadata to display the members of the Enumerable class.

image

You can learn a lot about these operators simply by looking at these declarations. After a time, you may even find that you can implement some of the simpler operators yourself just by looking at their declaration and having a little knowledge of how they behave.
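As a rough illustration of that last point, here is a minimal sketch of how one of the simpler operators, the predicate overload of Any, might be written by hand from nothing more than its declaration. The names MyOperators and MyAny are hypothetical, and this is not the framework’s source code; it is only one plausible implementation that matches the declared signature.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MyOperators
{
    // Hypothetical hand-written counterpart to Enumerable.Any(predicate).
    public static bool MyAny<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (predicate == null) throw new ArgumentNullException("predicate");

        foreach (TSource item in source)
        {
            // Stop at the first match; the rest of the sequence is never read.
            if (predicate(item)) return true;
        }
        return false;
    }
}

public static class MyOperatorsDemo
{
    public static void Main()
    {
        var numbers = Enumerable.Range(1, 10);
        Console.WriteLine(numbers.MyAny(i => i == 8));   // True
        Console.WriteLine(numbers.MyAny(i => i > 10));   // False
    }
}
```

Because MyAny is declared with the this modifier on its first parameter, it can be called on any IEnumerable<T> exactly as the built-in operators are.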

When looking through these operators, you might notice that all but three of them are extension methods. The exceptions are a set of three utilities. Here are the declarations for Empty, Range, and Repeat, which clearly are not extension methods:

public static IEnumerable<TResult> Empty<TResult>();
public static IEnumerable<int> Range(int start, int count);
public static IEnumerable<TResult> Repeat<TResult>(TResult element,
                                                   int count);

If you’re curious, the following query finds the three utility methods, returning the declaration for Empty, Range, and Repeat.

var query = from method in typeof(Enumerable).GetMethods()
            where method.DeclaringType == typeof(Enumerable)
            where method.GetCustomAttributes(true).Count() == 0
            select new { method };

This query is found in the sample program called LinqReflectionQuery that accompanies this book.

Generation Operators

The simple Generation operators shown in Table 6.2 allow you to create enumerations and test the contents of existing sequences that implement IEnumerable<T>. Some of these operators are designed to act as helper methods that you might use in test code, or for highly targeted scenarios in production code. The samples in this section are found in the GenerationOperators program that is available on the book’s web site. The Range and Repeat utilities are deferred, whereas Any, All, and Contains execute immediately and return a bool. Range, Repeat, and Empty are not implemented as extension methods.

Table 6.2. Generation Operators

image

The Range, Repeat, and Empty methods are utilities that create lists. The Any, All, and Contains operators allow you to test the contents of a list against certain conditions.

Range

The Range operator allows you to quickly generate a sequence of integers. It is declared like this:

public static IEnumerable<int> Range(int start, int count);

Notice that it is not an extension method. For this reason, it is not really a traditional LINQ operator. But it is grouped with them and implemented in the Enumerable class because it is used frequently in LINQ programs as a utility. You can, however, use this utility any place in your code that you think appropriate. There is no reason to use it only with LINQ. Note that Range is deferred, which means that it is implemented with yield return, and hence does not execute until it is enumerated.
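To make the idea of deferral concrete, here is a minimal sketch that assumes a hypothetical MyRange method written with yield return, plus a counter named Produced that records how many values have actually been generated. It is not the framework’s implementation of Range, only a stand-in that behaves the same way for this purpose.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredSketch
{
    // Counts how many values the iterator has actually produced.
    public static int Produced = 0;

    // Hypothetical stand-in for Enumerable.Range, written with yield return.
    public static IEnumerable<int> MyRange(int start, int count)
    {
        for (int i = 0; i < count; i++)
        {
            Produced++;
            yield return start + i;
        }
    }

    public static void Main()
    {
        var list = MyRange(1, 3);          // nothing has run yet
        Console.WriteLine(Produced);       // 0

        foreach (var item in list)         // the iterator body runs here
        {
            Console.WriteLine(item);
        }
        Console.WriteLine(Produced);       // 3
    }
}
```

Calling MyRange only builds an object that knows how to produce the values; the loop body inside the method does not run until foreach asks for each item.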

Here is how to call Range:

var list = Enumerable.Range(1, 3);

This call to Range returns a sequence containing the values 1, 2, and 3.

Because Range returns a sequence, you can use it as a data source in a query:

var query = from x in Enumerable.Range(1, 10)
            where (x % 2) == 0
            select x;

This produces a sequence containing the even numbers between 1 and 10.

Here the Range operator is used to calculate the area of a range of circles:

var query = from radius in Enumerable.Range(1, 9)
            let pi = 3.14159
            let area = (radius * radius) * pi
            select new { radius, pi, area };

ShowTitle("Radius | Area");

foreach (var item in query)
{
    Console.WriteLine("{0,6} |{1, 10}", item.radius, item.area );
}

This code produces the following results:

image

Here is a somewhat more complex query:

var query = from x in Enumerable.Range(1, 2)
            from y in Enumerable.Range(1, 3)
            select new { x, y };

This says, in effect, for each x, show me a range of numbers between 1 and 3. The result sequence looks like this if enumerated with our ShowList method:

{ x = 1, y = 1 }
{ x = 1, y = 2 }
{ x = 1, y = 3 }
{ x = 2, y = 1 }
{ x = 2, y = 2 }
{ x = 2, y = 3 }

The point here is that by using multiple from clauses you can start generating relatively complex sequences. Here the x values form the sequence 1, 1, 1, 2, 2, 2, while the y values cycle through 1, 2, 3 for each x. A simple modification to the code would allow you to extend this sequence for as long as you want. If this idea intrigues you, spend a little time playing with this code, passing in different parameters until you begin to get a feeling for what can be done.
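A query with two from clauses corresponds to a call to SelectMany. The following sketch shows one way the same pairing could be written in method syntax; the Pairs helper and its parameter names are assumptions made for this illustration, and the pairs are formatted as strings simply to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrossJoinSketch
{
    // Hypothetical helper: pairs every x in 1..xCount with every y in 1..yCount.
    public static IEnumerable<string> Pairs(int xCount, int yCount)
    {
        return Enumerable.Range(1, xCount)
            .SelectMany(x => Enumerable.Range(1, yCount),
                        (x, y) => string.Format("x = {0}, y = {1}", x, y));
    }

    public static void Main()
    {
        // Prints six lines, from "x = 1, y = 1" to "x = 2, y = 3".
        foreach (var pair in Pairs(2, 3))
        {
            Console.WriteLine(pair);
        }
    }
}
```

The first lambda supplies the inner sequence for each x, and the second combines each x with each y, which is the role the select clause plays in the query expression.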

Repeat

The Repeat utility is also not implemented as an extension method, so it is not a standard operator. It returns a sequence containing the same value repeated multiple times:

var list = Enumerable.Repeat(108, 12);

This call returns an enumeration containing 12 copies of the number 108. Consider the code shown in Listing 6.1. The ShowRepeat method stores 15 identical copies of an Item object with a Length of 5 and a Width of 6. Note also that I’ve implemented the ToString method so that a foreach loop can easily print the content of each instance of the Item class.

Listing 6.1. The ShowRepeat Method Creates an Instance of the Item Class and Then Uses the Repeat Operator to Add the Instance to the Enumeration 15 Times

class Item
{
    public int Width { get; set; }
    public int Length { get; set; }

    public override string ToString()
    {
        return string.Format("Width: {0}, Length: {1}", Width, Length);
    }
}

public void ShowRepeat()
{
    var item = new Item() { Length = 5, Width = 6 };
    var list = Enumerable.Repeat(item, 15);
    Console.WriteLine(
       Object.ReferenceEquals(list.ElementAt(1), list.ElementAt(2)));
    ShowList(list);
}

If you compare any two items from the list created in Listing 6.1, you will find that they have object identity: only one instance of the object exists, and the list refers to it 15 times. This is demonstrated by the call to Object.ReferenceEquals, which returns True.
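One practical consequence of this shared identity is that a change made through one element is visible through all the others. The sketch below illustrates this and shows one possible way to get separate instances instead, by combining Range with Select. The Box class and the method names are hypothetical and exist only for this example.

```csharp
using System;
using System.Linq;

public class Box
{
    public int Width { get; set; }
}

public static class RepeatSketch
{
    // Repeat evaluates its argument once, so every slot refers to the same Box.
    public static int WidthAfterEditingShared()
    {
        var shared = Enumerable.Repeat(new Box { Width = 6 }, 3).ToList();
        shared[0].Width = 99;
        return shared[2].Width;
    }

    // Select runs its lambda once per element, so each slot gets its own Box.
    public static int WidthAfterEditingDistinct()
    {
        var distinct = Enumerable.Range(0, 3)
                                 .Select(i => new Box { Width = 6 })
                                 .ToList();
        distinct[0].Width = 99;
        return distinct[2].Width;
    }

    public static void Main()
    {
        Console.WriteLine(WidthAfterEditingShared());    // 99
        Console.WriteLine(WidthAfterEditingDistinct());  // 6
    }
}
```

The call to ToList matters in the second method: without it, each enumeration of the deferred query would create a fresh set of objects.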

Here is code that selects 12 random “Celsius temperatures” from 0 through 29 (the upper bound passed to Random.Next is exclusive) and converts them to Fahrenheit:

Random random = new Random();

var query = from c in Enumerable.Repeat(0, 12)
                .Select(i => random.Next(0, 30))
            let f = (1.8 * c) + 32
            select new { c = c, f };

ShowTitle("Temp Convert");
foreach (var i in query)
{
    Console.WriteLine("{0, 3} c = {1, 4} f", i.c, i.f.ToString("F02"));
}

Note that Repeat never directly generates any numbers. Instead, the call to Select generates this list. It should not be hard to imagine how you could write this code in a more imperative fashion using a standard class constructor and a loop. This code accomplishes the same end with a more concise declarative syntax. The question you have to ask is whether the code is sufficiently readable to make the space savings worthwhile.


A Note on the Code and a Thank-You

In this text I’ve had to break the from clause into two lines because of line-width limitations. I should also add that I got the idea for using Repeat this way from a blog post written by Igor Ostrovsky, an engineer on Microsoft’s parallel team.


The code produces the following output:

=================
Temp Convert
=================
 28 c = 82.40 f
  6 c = 42.80 f
 19 c = 66.20 f
 19 c = 66.20 f
 24 c = 75.20 f
 14 c = 57.20 f
 24 c = 75.20 f
 25 c = 77.00 f
  8 c = 46.40 f
  2 c = 35.60 f
  9 c = 48.20 f
 18 c = 64.40 f

To me this looks like a list of daily temperatures in a computer simulation of a world where the temperature is not very stable.

Empty

The Empty operator is the last of the three declarations in the Enumerable class that are not extension methods. It returns an IEnumerable<T> with no elements in the sequence. Consider this simple call to Empty, which creates a sequence of System.Double[0] that contains zero elements:

var list = Enumerable.Empty<double>();
Console.WriteLine(list);

I am unable to think of any good uses for this operator that you could not achieve just as easily using a standard constructor. I have a feeling the team included it for the sake of completeness, or to give you a declarative way to create an empty sequence of some arbitrary type.

Any

The Boolean Any operator can be used to tell whether a sequence is empty, or whether it meets the conditions of a particular predicate. The following code first checks to see if two lists are empty, and then it uses a simple predicate to detect whether a list contains the number 8:

public void ShowAny()
{
    var listA = Enumerable.Empty<double>();
    var listB = Enumerable.Range(1, 10);

    Console.WriteLine("Are there any items in ListA: {0}, ListB: {1}",
       listA.Any(), listB.Any());

    Console.WriteLine("Does listB contain the number {0}: {1}",
       8, listB.Any(i => i == 8));
}

The output for this method looks like this:

Are there any items in ListA: False, ListB: True
Does listB contain the number 8: True

The first version of Any called in this code takes no parameters, and you should be able to easily imagine the declaration for it:

public static bool Any<TSource>(this IEnumerable<TSource> source);

Note that it is a simple extension method for IEnumerable<T> that returns a bool.

The second version of Any is declared like this:

public static bool Any<TSource>(
   this IEnumerable<TSource> source,
   Func<TSource, bool> predicate);

Notice the simple delegate expected as the second parameter. As you recall, functions like this that return a Boolean value are called predicates by mathematicians—hence the name of this parameter to the Any operator. The lambda we pass in to fulfill the contract inherent in this parameter looks like this:

i => i == 8

This delegate returns a Boolean value specifying whether any item in the list is equal to 8.

The preceding example calls Any with a lambda, but the following code also compiles and runs correctly:

public bool Predicate(int value)
{
    return value == 8;
}

public void ShowAny()
{
    var listB = Enumerable.Range(1, 10);

    Console.WriteLine("Does listB contain the number {0}: {1}",
        8, listB.Any(Predicate));
}

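Some developers might prefer to use lambdas in these situations because a short inline expression makes it easy to confirm at a glance that the predicate has no side effects. Nothing in the language guarantees this, however; a lambda can modify captured state just as a named method can.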

All

The All operator detects whether the elements of a list meet a certain condition specified in a predicate. In this example, the predicate asks whether all the items in a list are smaller than the number 11:

public void ShowAll()
{
    var list = Enumerable.Range(1, 10);

    if (list.All(i => i < 11))
    {
        Console.WriteLine("Condition met");
    }
    else
    {
        Console.WriteLine("Condition not met");
    }
}
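One detail worth knowing is how All treats an empty sequence: because no element fails the test, it returns true, whereas Any returns false because no element passes it. The following sketch illustrates that behavior; the AllSketch class and its method names are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllSketch
{
    // True when no element is 11 or greater, including when there are no elements.
    public static bool AllBelowEleven(IEnumerable<int> source)
    {
        return source.All(i => i < 11);
    }

    // The same question phrased with Any: is there no element that fails the test?
    public static bool NoneFailsBelowEleven(IEnumerable<int> source)
    {
        return !source.Any(i => !(i < 11));
    }

    public static void Main()
    {
        var empty = Enumerable.Empty<int>();
        var list = Enumerable.Range(1, 12);

        Console.WriteLine(AllBelowEleven(empty));          // True
        Console.WriteLine(empty.Any(i => i < 11));         // False
        Console.WriteLine(AllBelowEleven(list));           // False (11 and 12 fail)
        Console.WriteLine(NoneFailsBelowEleven(list));     // False
    }
}
```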

Contains

The Contains operator can be used to test for the inclusion of a particular element in a sequence. This operator is overloaded. In the simplest case, you pass in an element of the same type as the list, and the method returns whether it is a member of the sequence. In this code, a sequence with the numbers 3 through 12 is generated, and the code checks to see which of the numbers from 0 through 14 are included in that list:

var list = Enumerable.Range(3, 10);

for (int i = 0; i < 15; i++)
{
   Console.WriteLine("List contains value {0}: {1}",
      i, list.Contains(i));
}

The second overload of the Contains operator uses the IEqualityComparer interface:

public static bool Contains<TSource>(this IEnumerable<TSource> source,
   TSource value, IEqualityComparer<TSource> comparer);

This gives you latitude to make more complex decisions about whether a particular value is in a list. Listing 6.2 shows an implementation of the IEqualityComparer interface.

Listing 6.2. This IEqualityComparer Asks You to Implement the Equals and GetHashCode Methods

class EqualityCompare : IEqualityComparer<int>
{

    #region IEqualityComparer<int> Members

    public bool Equals(int x, int y)
    {
        if (x == y)
        {
            return y % 2 == 0;
        }
        else
        {
            return false;
        }
    }

    public int GetHashCode(int obj)
    {
        return obj.GetHashCode();
    }

    #endregion
}

Here is a method that uses this implementation of IEqualityComparer:

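public void ShowContains()
{
  // The same sequence used in the previous example: 3 through 12.
  var list = Enumerable.Range(3, 10);

  for (int i = 0; i < 15; i++)
  {
    Console.WriteLine("List contains value {0:D2}: {1}",
      i, list.Contains(i, new EqualityCompare()));
  }
}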

The important method in Listing 6.2 is the implementation of Equals. It tests whether the values passed into it are equal and whether they are even. If both conditions are met, it returns true; otherwise, it returns false. I will show other implementations of IEqualityComparer later in this chapter. Note also that Contains has an overload that takes only a single parameter that uses a default implementation of IEqualityComparer:

list.Contains(i)
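You do not always have to write a comparer yourself. As a small sketch, the framework’s StringComparer.OrdinalIgnoreCase already implements IEqualityComparer<string>, so it can be passed to the second overload of Contains to perform a case-insensitive test. The ContainsSketch class, its list of codes, and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsSketch
{
    // Hypothetical sample data for this illustration.
    static readonly List<string> codes = new List<string> { "WA", "OR", "CA" };

    // Uses the default comparer, so the match is case-sensitive.
    public static bool HasCode(string code)
    {
        return codes.Contains(code);
    }

    // Uses the LINQ overload that accepts an IEqualityComparer<string>.
    public static bool HasCodeIgnoringCase(string code)
    {
        return codes.Contains(code, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(HasCode("wa"));              // False
        Console.WriteLine(HasCodeIgnoringCase("wa"));  // True
    }
}
```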


Contains Can Help You Write Succinct Code

One of the goals of LINQ is to help you write clear, succinct code that is easy to read. The Contains operator can help you achieve this goal. You might remember the IsState extension method from the preceding chapter:

public static bool IsState01(this string source)
{
    if (source == null) return false;
    source = source.ToUpper();
    foreach (var item in stateCodes)
    {
        if (source == item)
        {
            return true;
        }
    }
    return false;
}

This method contains 10 lines of code. Here is a rewrite of the IsState() extension method from the preceding chapter that contains one line of code, which I have wrapped here due to line-width considerations:

public static bool IsState02(this string source)
{
    return (source == null) ? false :
       stateCodes.Contains(source.ToUpper());
}

With LINQ, the goal is often to allow developers to create the simplest syntax possible that properly expresses their logic. This is an example of how that end can be achieved.


SequenceEqual

The SequenceEqual operator stands on its own and traditionally is not considered to be part of the Generation operators. I will include it in this section, however, because it bears some similarity to the Any, All, and Contains operators.

Consider the following code:

var listA = Enumerable.Range(1, 3);
var listB = Enumerable.Range(1, 5);

var query1 = from a in listB
             where a < 4
             select a;

The SequenceEqual operator tests to see if two sequences are equal. Given the first two lists we see here, the following query returns false:

listA.SequenceEqual(listB);

This returns false because listA does not contain the same sequence as listB. In particular, the sequence 1, 2, 3 is not equivalent to the sequence 1, 2, 3, 4, 5.

We find that query1 does have the same sequence as listA. As a result, the following query returns true:

listA.SequenceEqual(query1)

This returns true because query1 returns the numbers from listB that are smaller than 4. These are the numbers 1, 2, and 3, which are the same numbers, found in the same order, as those found in listA.

You can also implement IEqualityComparer<T> and pass that in to SequenceEqual. Here is a simple implementation of that interface that works for our current example:

class EqualityCompare : IEqualityComparer<int>
{

    public bool Equals(int x, int y)
    {
        return (x == y);
    }

    public int GetHashCode(int obj)
    {
        return obj.GetHashCode();
    }
}

Given the existence of this class, you can write the following code, which returns true:

listA.SequenceEqual(query1, new EqualityCompare())

The EqualOperators sample that comes with this book contains the code shown in this section, and a few other samples that you might find interesting. It demonstrates, for instance, that order matters when using this operator. The following code returns false:

var listC = new List<int> { 2, 1, 3 };
Console.WriteLine(listC.SequenceEqual(listA));
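If you want to know whether two sequences hold the same items regardless of order, one possible approach is to sort both before comparing them. The sketch below assumes a hypothetical helper named SameItemsAnyOrder; it is only one way to do this and is not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceEqualSketch
{
    // Sorts both sequences first, so only the contents (including duplicates) matter.
    public static bool SameItemsAnyOrder(IEnumerable<int> first, IEnumerable<int> second)
    {
        return first.OrderBy(n => n).SequenceEqual(second.OrderBy(n => n));
    }

    public static void Main()
    {
        var listA = Enumerable.Range(1, 3);
        var listC = new List<int> { 2, 1, 3 };

        Console.WriteLine(listC.SequenceEqual(listA));       // False
        Console.WriteLine(SameItemsAnyOrder(listC, listA));  // True
    }
}
```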

Partitioning Operators

The four Partitioning operators, shown in Table 6.3, allow you to divide a sequence into two sections, where the partition between the sections is defined either by a count (Take and Skip) or by a simple Boolean condition (TakeWhile and SkipWhile). All these operators are deferred.

Table 6.3. Partitioning Operators

image

Throughout this section I’ll run variations on a query run against the list of famous Romans shown near the beginning of this chapter. Here is an unadorned version of the query that does not use any Partitioning operators:

var query = from r in romans
            where r.Gender == 'm'
            select r.Name;

foreach (var e in query)
{
   Console.WriteLine(e);
}

The output from this query would look like this:

Augustus
Caligula
Claudius
Nero
Tiberius

This gives you a baseline against which you can compare the output returned from the other queries in this section.

Take

The Take operator retrieves the first n elements from a list, where n is the number you pass as the operator’s sole argument. Here is a query using the partitioning operator Take. This query returns the first two male emperors from the list of five shown in the preceding section:

var query1 = (from r in romans
              where r.Gender == 'm'
              select r.Name).Take(2);

If you write the results of the query with a foreach loop, you see the following output:

Augustus
Caligula

Skip

The Skip operator takes the opposite tack. It skips n elements in a list and then shows the remainder:

query1 = (from r in romans
          where r.Gender == 'm'
          select r.Name).Skip(2);

This code skips Augustus and Caligula and returns Claudius, Nero, and Tiberius.

Here is how to take the last 25 items from a list of 100 items:

var list = Enumerable.Range(1, 100);
var lastItems = list.Skip(75).Take(25);
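Skip and Take are often combined in this way to page through a sequence. Here is a minimal sketch that assumes a hypothetical Page extension method with a zero-based page index; neither the method nor its parameter names are part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Hypothetical helper: pageIndex is zero-based, so page 0 is the first page.
    public static IEnumerable<T> Page<T>(
        this IEnumerable<T> source, int pageIndex, int pageSize)
    {
        return source.Skip(pageIndex * pageSize).Take(pageSize);
    }

    public static void Main()
    {
        var list = Enumerable.Range(1, 100);

        var thirdPage = list.Page(2, 25);   // items 51 through 75
        Console.WriteLine("{0} to {1}", thirdPage.First(), thirdPage.Last());

        var lastPage = list.Page(3, 25);    // items 76 through 100
        Console.WriteLine("{0} to {1}", lastPage.First(), lastPage.Last());
    }
}
```

Because both Skip and Take are deferred, the helper does no work until the returned page is enumerated.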

TakeWhile

The TakeWhile operator is perhaps most useful when you are working with an infinite list, or a very long list. Consider this somewhat contrived method:

public IEnumerable<long> GetInfiniteList()
{
    int bottom = 7000;
    long count = bottom;

    while (true)
    {
        yield return count;
        count += 7;
        if (count >= 20000) count = bottom;
    }
}

This code returns, in an infinite loop, the numbers between 7,000 and 20,000 that are divisible by 7. Here is one way to view the first five members of this list:

foreach (var item in GetInfiniteList().Take(5))
{
    Console.WriteLine(item);
}

The output from this list looks like this:

7000
7007
7014
7021
7028

The following code takes all the items from the list that are divisible by 11 and that are smaller than 10,000:

var query4 = (from r in GetInfiniteList()
              where r % 11 == 0
              select r).TakeWhile(r => r < 10000);

The result sequence contains many members, but here are the first few:

7007
7084
7161

Here is a variation on this same code, expressed in query method syntax:

var query = Enumerable
            .Range(1, 100)
            .Where(x => x % 3 == 0)
            .TakeWhile(x => x % 11 != 0);

Here we ask for the multiples of 3 between 1 and 100, but we stop taking items as soon as we reach one that is also evenly divisible by 11. The result is the sequence 3, 6, 9, and so on up to 30; the next multiple of 3 is 33, which ends the sequence. Note that the Range operator steps through every integer in its range, whereas our GetInfiniteList method generates only the numbers that are multiples of 7. Furthermore, our method generates only the exact number of items you request; it stops working when you stop asking for the next item in the sequence.
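It is easy to confuse TakeWhile with Where, so here is a small sketch that contrasts the two using the same predicate. Where tests every element, while TakeWhile stops for good at the first element that fails the test. The class name, method names, and sample array are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TakeWhileSketch
{
    // Keeps every element that is not divisible by 11, wherever it appears.
    public static IEnumerable<int> Filtered(IEnumerable<int> source)
    {
        return source.Where(x => x % 11 != 0);
    }

    // Keeps elements only until the first one divisible by 11, then stops.
    public static IEnumerable<int> Prefix(IEnumerable<int> source)
    {
        return source.TakeWhile(x => x % 11 != 0);
    }

    public static void Main()
    {
        int[] numbers = { 3, 6, 9, 33, 36, 39 };

        Console.WriteLine(string.Join(", ", Filtered(numbers).Select(n => n.ToString()).ToArray()));
        // 3, 6, 9, 36, 39

        Console.WriteLine(string.Join(", ", Prefix(numbers).Select(n => n.ToString()).ToArray()));
        // 3, 6, 9
    }
}
```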


An Interesting Case

Suppose we begin our series with a number that is not evenly divisible by 7, such as 8,000:

public IEnumerable<int> GetWeirdRepeatingList()
{
    int bottom = 8000;
    int top = 10000;
    int count = bottom;

    while (true)
    {
        count += 7;
        if (count >= 20000) count = bottom;
        yield return count;
    }
}

Now we ask for all the values in this list that are evenly divisible by 11, using the query just shown. The result is a list of numbers that begins like this: 8063, 8140, 8217, 8294, 8371. If you divide any of these numbers by 7, you end up with a repeating set of decimal values:

8063 / 7 = 1151.857142857142857142857142
8140 / 7 = 1162.857142857142857142857142
8217 / 7 = 1173.857142857142857142857142

As I say, it’s an interesting case.


SkipWhile

The SkipWhile operator is very much like TakeWhile, except that it skips items from the beginning of a list for as long as they meet a particular condition and then returns everything that remains. We don’t want to use this operator with an infinite list, because it would never define a closing condition. We can, however, use it with a method that could potentially generate a fairly long list:

public IEnumerable<long> GetLongList()
{
    int top = 1000;
    long count = -1001;

    while (count < top)
    {
        count += 7;
        yield return count;
    }
}

The sequence generated by this method runs in increments of 7 from −994 to 1001 (the counter is incremented before each value is yielded), but it could easily be extended to range over all the integers that fit in a 64-bit long, that is, the numbers between long.MinValue and long.MaxValue.

Here is a query run against GetLongList:

var query3 = (from r in GetLongList()
              where r % 11 == 0
              select r).SkipWhile(r => r < 0);

This code skips the leading negative numbers and retrieves the remaining numbers from the list that are divisible by 11. Note that the result begins with 0, which is not negative. The first few elements in the result sequence look like this:

0
77
154
231
308
385

Something about these operators tempts you to show how they can be used to manipulate numbers. Nevertheless, it would probably be wise to finish this section with a query against string data:

query1 = (from r in romans
          where r.Gender == 'f'
          select r.Name)
          .SkipWhile(r => r.StartsWith("A"));

This query asks for all the female Romans from our list but asks that we skip the leading names that begin with the letter A:

Caesonia
Claudia Octavia
Clodia Pulchra
Julia the Elder
Junia Claudilla
Livia Drusilla
Etc...
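SkipWhile stops skipping at the first element that fails the test, so a later name that begins with A would still appear in the result. The sketch below illustrates the difference between SkipWhile and a Where filter. The sample array is hypothetical and deliberately places a name beginning with A after Caesonia, which is not the order used in the romans list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipWhileSketch
{
    // Skips only the run of names starting with "A" at the front of the sequence.
    public static IEnumerable<string> SkipLeadingA(IEnumerable<string> names)
    {
        return names.SkipWhile(n => n.StartsWith("A"));
    }

    // Removes every name starting with "A", wherever it appears.
    public static IEnumerable<string> RemoveAllA(IEnumerable<string> names)
    {
        return names.Where(n => !n.StartsWith("A"));
    }

    public static void Main()
    {
        // Hypothetical ordering chosen for this illustration.
        string[] names = { "Aelia", "Agrippina", "Caesonia", "Antonia", "Livia" };

        Console.WriteLine(string.Join(", ", SkipLeadingA(names).ToArray()));
        // Caesonia, Antonia, Livia

        Console.WriteLine(string.Join(", ", RemoveAllA(names).ToArray()));
        // Caesonia, Livia
    }
}
```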

In reading this section you may have sensed three things:

• Although you have seen only a few LINQ operators so far, it should be clear that, taken together, these operators form a complete language for querying data. One senses the team’s desire to be sure to include a way to express all the possible ways to query data.

• Some of these operators can be useful for writing mathematical formulas or expressing mathematical ideas.

• LINQ is quite at home working with infinite lists. You might, for instance, use LINQ to query an infinite stream of data coming in over the Internet, a stream of bytes from a network, a sequence of numbers such as primes, and so on.

Element Operators

As shown in Table 6.4, there are nine Element operators. Except for DefaultIfEmpty, which is deferred, their purpose is to force execution of a query and immediately return a single item from an enumeration. Examples of using all of these operators are found in the ElementOperators sample program available on the book’s web site.

Table 6.4. Element Operators

image

The Element operators allow you to access individual items from a sequence. With the exception of DefaultIfEmpty, none of these operators are deferred.

First and FirstOrDefault

The following query uses the First operator to return the initial item from a query:

var query = (from r in romans
             where r.Gender == 'm'
             select r.Name).First();

Because First forces execution, this query immediately returns Augustus.

Here is another common way to call the First operator:

Console.WriteLine("First item: {0}", romans.First());

Pass a predicate to First to specify a filter:

var firstLambda = (from r in romans
                   where r.Gender == 'm'
                   select r.Name).First(n => n.Length > 4);

If the sequence you are querying is empty, First throws an InvalidOperationException. If that is a possibility, call FirstOrDefault instead. When the sequence is empty, this operator returns the default value for the type of element in your sequence. The default value for reference types is null; for numeric value types, it is 0. For instance, if you are working with a sequence of string, it returns null.

Because the filter in this query causes it to return a result sequence with zero elements, this call to the First operator throws an exception:

var query = (from r in romans
             where r.Gender == 'z'
             select r.Name).First();

This query, however, does not throw an exception and returns null:

var query = (from r in romans
             where r.Gender == 'z'
             select r.Name).FirstOrDefault();

FirstOrDefault is overloaded to allow you to pass in a lambda, just as you can when you use the First operator.
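With value types, the default returned by FirstOrDefault can be indistinguishable from a real element: a result of 0 might mean that nothing matched or that the first match really was 0. One possible way around this, sketched below, is to project the elements to a nullable type first. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstOrDefaultSketch
{
    // Returns 0 both when nothing matches and when the first match is 0.
    public static int FirstBelow(IEnumerable<int> source, int limit)
    {
        return source.FirstOrDefault(n => n < limit);
    }

    // Projecting to int? makes "no match" come back as null instead of 0.
    public static int? FirstBelowOrNull(IEnumerable<int> source, int limit)
    {
        return source.Where(n => n < limit)
                     .Select(n => (int?)n)
                     .FirstOrDefault();
    }

    public static void Main()
    {
        var numbers = new List<int> { 0, 5, 10 };

        Console.WriteLine(FirstBelow(numbers, 1));                  // 0 (a real element)
        Console.WriteLine(FirstBelow(numbers, -5));                 // 0 (no match)
        Console.WriteLine(FirstBelowOrNull(numbers, -5).HasValue);  // False
        Console.WriteLine(FirstBelowOrNull(numbers, 1).HasValue);   // True
    }
}
```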

Last and LastOrDefault

You can use the Last() operator to retrieve the last item from a sequence. All the features of the First operator also work for Last. Here is a call to the Last operator that returns the name Tiberius from the list:

query = (from r in romans
         where r.Gender == 'm'
         select r.Name).Last();

You can pass in a predicate to filter a list:

var clawClawClaudius = (from r in romans
                        where r.Gender == 'm'
                        select r.Name).LastOrDefault(n =>
                                                     n.StartsWith("C"));

This returns the value Claudius.

If you call Last on a query that returns nothing, it throws an exception, which you can catch in a try/catch block:

try
{
    var q = (from r in romans
             where r.Gender == 'z'
             select r).Last();
}
catch (System.InvalidOperationException)
{
    Console.WriteLine("You threw an InvalidOperationException");
}

You can call LastOrDefault to sidestep this problem:

var nullRoman = (from r in romans
                where r.Gender == 'z'
                select r).LastOrDefault();

Console.WriteLine("{0}", (nullRoman == null) ? "Null" : nullRoman.Name);

Because this query returns zero elements, LastOrDefault returns the default value of null for a reference type such as Roman. In the WriteLine statement, the conditional operator is used to determine whether the query returned an instance of Roman.

Single

The Single operator returns one, and only one, element from a sequence. If your sequence contains only one element, you can use the operator with zero parameters, as shown here:

string name = (from r in romans
               where r.Name.Length == 4
               select r.Name).Single();

Console.WriteLine(name);

The point is that Single forces execution of the query. If we did not call Single, the return value would be an enumeration with one element. By calling Single, we get the result of the query without having to explicitly enumerate the list with foreach.

It is a runtime error to call Single on a sequence that contains more than one element:

try
{
    var n = (from r in romans
             select r.Name).Single();
}
catch (System.InvalidOperationException)
{
    Console.WriteLine("You threw an InvalidOperationException");
}

If your sequence contains multiple elements, you can pass a simple predicate that acts as a filter to return only the element you want. The predicate must match exactly one element, or you receive an InvalidOperationException at runtime.

string name = (from r in romans
               where r.Gender == 'm'
               select r.Name).Single(r => r.Length == 4);

Console.WriteLine("{0}", name);
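Single can fail in two ways: when more than one element matches and when none does. Here is a minimal sketch of both cases, assuming an illustrative array of names rather than the romans collection:

```csharp
using System;
using System.Linq;

class SingleSketch
{
    static void Main()
    {
        // Illustrative data; not the romans collection from the chapter.
        string[] names = { "Augustus", "Caligula", "Claudius", "Nero", "Tiberius" };

        // Exactly one name has four letters, so Single succeeds.
        Console.WriteLine(names.Single(n => n.Length == 4));

        // Several names have eight letters: more than one match throws.
        try { names.Single(n => n.Length == 8); }
        catch (InvalidOperationException) { Console.WriteLine("More than one match"); }

        // No name has two letters: zero matches also throws.
        try { names.Single(n => n.Length == 2); }
        catch (InvalidOperationException) { Console.WriteLine("No match"); }
    }
}
```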

ElementAt

Here is how to pull Claudius’ name from the list:

query = (from r in romans
         where r.Gender == 'm'
         select r.Name).ElementAt(2);

Element Operators and Composition

I have been showing the Element operators in the context of a query expression, but you can use them directly on any type that supports IEnumerable<T>:

Console.WriteLine("{0,10}: {1}", "First", romans.First());
Console.WriteLine("{0,10}: {1}", "Last", romans.Last());
Console.WriteLine("{0,10}: {1}", "ElementAt", romans.ElementAt(2));
Console.WriteLine("{0,10}: {1}", "Single",
   romans.Single(q => q.Name.Length == 4));

These queries produce the following output:

     First: { Id = 00; Gender = f; Name = Aelia Paetina }
      Last: { Id = 20; Gender = f; Name = Vipsania Agrippina }
 ElementAt: { Id = 02; Gender = m; Name = Augustus }
    Single: { Id = 14; Gender = m; Name = Nero }

Or you can use them on the results of a query:

var query = from r in romans
            where r.Gender == 'm'
            select r.Name;

ShowList(query.Take(2));
Console.WriteLine(query.First());
Console.WriteLine(query.Last());
Console.WriteLine(query.ElementAt(2));

The query expression found at the beginning of the method retrieves the list of Julio-Claudian emperors from our collection:

Augustus
Caligula
Claudius
Nero
Tiberius

It then uses

• The Take operator to pull the first two items: Augustus and Caligula.

• First to pull the name Augustus from the list.

• Last to pull the name Tiberius from the list.

• ElementAt(2) to pull our friend Claudius’ name from the list.

This ability to reuse the results of a query is a form of composition, as described in Chapter 3, “The Essence of LINQ.”

DefaultIfEmpty

Use the DefaultIfEmpty operator if you want to be sure that a result sequence has at least one element. Consider this query:

var query = from r in romans
            where r.Gender == 'z'
            select r;

It returns an empty list with zero elements.

This query, on the other hand, returns a list with one element that is set to null:

var query = (from r in romans
              where r.Gender == 'z'
              select r).DefaultIfEmpty();

Console.WriteLine("Count: {0}", query.Count());
foreach (var item in query)
{
    Console.WriteLine("{0}", (item == null) ? "null" : item.Name);
}

The output from these lines of code looks like this:

Count: 1
null

If a query returns a normal result sequence with one or more elements, a call to DefaultIfEmpty does nothing. It is as if you never called it. Here is an example:

var query = (from r in romans
              where r.Gender == 'm'
              select r).DefaultIfEmpty();

This query returns our five male emperors, exactly as it would without the call to DefaultIfEmpty.

You can specify a default value so that the single item in the list returned by DefaultIfEmpty is a valid instance rather than null:

var query = (from r in romans
             where r.Gender == 'z'
             select r)
             .DefaultIfEmpty(new Roman
                 {
                     Id = -1,
                     Gender = 'N',
                     Name = "Empty Roman"
                 });

This code uses our ToString implementation in the Roman class to return the following:

{ Id = -1; Gender = N; Name = Empty Roman }
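DefaultIfEmpty behaves the same way with value types, except that the placeholder is the type's default value rather than null. Here is a minimal sketch, assuming illustrative int arrays instead of the Roman class:

```csharp
using System;
using System.Linq;

class DefaultIfEmptySketch
{
    static void Main()
    {
        int[] empty = { };

        // With no argument, the placeholder is default(int), which is 0.
        foreach (int n in empty.DefaultIfEmpty())
        {
            Console.WriteLine(n);
        }

        // With an argument, the placeholder is the value you supply.
        foreach (int n in empty.DefaultIfEmpty(-1))
        {
            Console.WriteLine(n);
        }

        // A non-empty sequence passes through unchanged: still two elements.
        int[] full = { 1, 2 };
        Console.WriteLine(full.DefaultIfEmpty(-1).Count());
    }
}
```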

Set Operators

Continuing our tour of the LINQ operators, we can now turn our attention to the Set operators, which are shown in Table 6.5. They allow you to perform set operations on various sequences. You can apply the Set operators to any two sequences that implement IEnumerable<T>. All the set operators are deferred. The Concat operator is also discussed in this section.

Table 6.5. Set Operators

image

image

Union

The Union operator shows the unique items from two lists, as shown in Listing 6.3. Here we have two lists—one containing the numbers 1, 2, and 3, and the other 3, 4, 5, and 6. The union of these two lists is the numbers 1, 2, 3, 4, 5, and 6.

Listing 6.3. The ShowUnion Method Displays the Numbers 1, 2, 3, 4, 5, and 6

public void ShowUnion()
{
    var listA = Enumerable.Range(1, 3);
    var listB = new List<int> { 3, 4, 5, 6 };

    var listC = listA.Union(listB);

    ShowList(listC);
}

After these two sequences are combined, only the unique members of each list are retained. The elements of listA appear before elements of listB in the merged sequence.

If you don’t want the union of two lists, consider using the Concat operator:

var listA = Enumerable.Range(1, 3);
var listB = new List<int> { 3, 4, 5, 6 };

var listC = listA.Concat(listB);

ShowList(listC);

This operator returns a sequence containing all the items in both enumerations, including duplicates:

1
2
3
3
4
5
6
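One way to think about the relationship between the two operators is that Union produces the same elements as Concat followed by Distinct. The following sketch checks this for the two lists used above; SequenceEqual is a standard LINQ operator that compares two sequences element by element:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class UnionConcatSketch
{
    static void Main()
    {
        var listA = Enumerable.Range(1, 3);
        var listB = new List<int> { 3, 4, 5, 6 };

        var union = listA.Union(listB);
        var concatDistinct = listA.Concat(listB).Distinct();

        Console.WriteLine(string.Join(", ", union));           // 1, 2, 3, 4, 5, 6
        Console.WriteLine(union.SequenceEqual(concatDistinct)); // True
    }
}
```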

Intersect

The Intersect operator shows the items that two lists have in common. In this case we have one list containing the numbers 1, 2, 3, and 4 and a second list containing the numbers 3, 4, 5, and 6. The intersection of the two lists is the numbers 3 and 4, as shown in Figure 6.2. Listing 6.4 demonstrates how to use this operator.

Figure 6.2. The intersection of two lists.

image

Listing 6.4. The ShowIntersect Method Prints the Numbers 3 and 4

public void ShowIntersect()
{
    var listA = Enumerable.Range(1, 4);
    var listB = new List<int> { 3, 4, 5, 6 };

    var listC = listA.Intersect(listB);

    ShowList(listC);
}

Here two collections are joined, and only the unique, shared members of each list are retained.

Consider what happens if we make the following change to the order in which the items are declared:

var listB = new List<int> { 4, 5, 3, 6 };

The result is still {3,4} because Intersect() is called on listA, and the operator keeps the order of the sequence it is called on.
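The effect of the receiver's order can be seen by swapping the two operands. This is a small sketch that reuses the list names from the listing; the output comments assume the LINQ to Objects implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class IntersectOrderSketch
{
    static void Main()
    {
        var listA = Enumerable.Range(1, 4);
        var listB = new List<int> { 4, 5, 3, 6 };

        // Elements come back in the order of the sequence Intersect is called on.
        Console.WriteLine(string.Join(", ", listA.Intersect(listB)));  // 3, 4
        Console.WriteLine(string.Join(", ", listB.Intersect(listA)));  // 4, 3
    }
}
```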

Distinct

The Distinct operator finds all the unique items in a single list, as shown in Listing 6.5. This method works with a list containing the numbers 1, 2, 3, 3, 2, and 1. The unique, or distinct, numbers in this list are 1, 2, and 3, as illustrated in Figure 6.3.

Figure 6.3. The unique, or distinct, items in the sequence 1, 2, 3, 3, 2, 1 are the numbers 1, 2, and 3.

image

Listing 6.5. This Code Prints the Numbers 1, 2, and 3

public void ShowDistinct()
{
    var listA = new List<int> { 1, 2, 3, 3, 2, 1 };
    var listB = listA.Distinct();

    ShowList(listB);
}

Except

The Except operator shows all the items in one list minus the items in a second list, as shown in Listing 6.6. Here we have one list containing the numbers 1, 2, 3, 4, 5, and 6 and a second list containing the numbers 3 and 4. If we use the Except operator to remove the items in the second list from the first list, we end up with the numbers 1, 2, 5, and 6, as illustrated in Figure 6.4.

Figure 6.4. The items of one list minus, or except, the items in a second list. In this case we take 3 and 4 from the list 1, 2, 3, 4, 5, and 6 to yield the list 1, 2, 5, and 6.

image

Listing 6.6. The ShowExcept Method Prints the Numbers 1, 2, 5, and 6

public void ShowExcept()
{
    var listA = Enumerable.Range(1, 6);
    var listB = new List<int> { 3, 4 };

    var listC = listA.Except(listB);

    ShowList(listC);
}

In the Context of LINQ

The type of code just shown is useful, but it might be helpful to see these same operators used in the context of LINQ query expressions. In that context, you can see how the Set operators can be used to analyze the results of queries to better understand the data that is returned.

You probably know that two similar collections are used to create lists. One is the generic List<T> collection, and the other is the old-style collection called ArrayList. We can use Set operators to help us better understand the difference between these two classes.

Here are two LINQ to Objects queries that use reflection to retrieve the names of the methods declared by the List<int> class and the ArrayList class:

var queryList = from m in typeof(List<int>).GetMethods()
                where m.DeclaringType == typeof(List<int>)
                group m by m.Name into g
                select g.Key;


var queryArray = from m in typeof(ArrayList).GetMethods()
                 where m.DeclaringType == typeof(ArrayList)
                 group m by m.Name into g
                 select g.Key;

Here is code that shows the intersection of these two lists:

var listIntersect = queryList.Intersect(queryArray);
Console.WriteLine("Count: {0}", listIntersect.Count());

ShowList(listIntersect);

Alternatively, you could write the query like this:

var listIntersect = (from m in typeof(List<int>).GetMethods()
                     where m.DeclaringType == typeof(List<int>)
                     group m by m.Name into g
                     select g.Key).Intersect(
                        from m in typeof(ArrayList).GetMethods()
                        where m.DeclaringType == typeof(ArrayList)
                        group m by m.Name into g
                        select g.Key);

In either case, the following list would be displayed:

get_Capacity                     GetRange
set_Capacity                     IndexOf
get_Count                        Insert
get_Item                         InsertRange
set_Item                         LastIndexOf
Add                              Remove
AddRange                         RemoveAt
BinarySearch                     RemoveRange
Clear                            Reverse
Contains                         Sort
CopyTo                           ToArray
GetEnumerator

Here is how to see the items that the generic list supports that are not part of the old-style collection:

var listDifference = queryList.Except(listIntersect);

Here is the result of this query:

ConvertAll                            FindLast
AsReadOnly                            FindLastIndex
Exists                                ForEach
Find                                  RemoveAll
FindAll                               TrimExcess
FindIndex                             TrueForAll

Aggregate Operators

The Aggregate operators allow you to perform simple mathematical operations over the elements in a sequence. Because they return the results of that operation, none of them is deferred. All the samples shown in this section are found in the AggregateOperators sample that accompanies this book. Table 6.6 lists the seven Aggregate operators.

Table 6.6. Aggregate Operators

image

Except for the Aggregate operator, all these operators have a simple, obvious default use. Several of these operators, however, have overloads that need a few sentences of explanation. I will show you a simple example of using the operators’ default behavior. Then we will look a bit deeper with a second example that shows how to use at least one of the overloads.

The Count and LongCount Operators

The Count and LongCount operators return the number of elements in a sequence. Classes such as List<T> that implement the ICollection<T> interface already track their count. This means that the Count operator can simply ask these objects for the count—an operation that executes very quickly.

The LongCount operator provides the same basic functionality but returns a long, which lets you work with sequences that contain more elements than the maximum value of an int (int.MaxValue). Calling Count on a List<long> returns quickly, because the list tracks its own count, but calling Count on an arbitrary IEnumerable<long> forces the operator to walk the entire sequence, which can be a very lengthy operation. In fact, working with any collection of that size is likely to be very cumbersome.

Listing 6.7 shows a simple example of using the Count operator. LongCount works the same way.

Listing 6.7. A Simple Example of Using the Count Operator

public void ShowCount()
{
    var list = Enumerable.Range(5, 12);
    Console.WriteLine(list.Count());
}

The overloads for Count and LongCount allow you to perform calculations to derive a count for a sequence. For instance, you can write code that counts the number of even numbers in a collection:

var list = Enumerable.Range(1, 25);

Console.WriteLine("Total Count: {0}, Count the even numbers: {1}",
   list.Count(),
   list.Count(n => n % 2 == 0));

Our list consists of the numbers between 1 and 25. We call Count once with the first version of the Count operator and get back the number 25.

The second overload of the Count operator takes a simple predicate that you can use for calculations of this type. The declaration looks like this:

public static int Count<TSource>(this IEnumerable<TSource> source,
   Func<TSource, bool> predicate);

The predicate takes an integer and returns a bool specifying whether a particular value from the list passes a test. In our case, we simply ask whether the number is even:

n % 2 == 0

This computation returns the values 2, 4, 6, and so on up to 24, for a total of 12 elements.
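The predicate overload is shorthand for filtering with Where and then counting. The following sketch shows the three equivalent forms side by side:

```csharp
using System;
using System.Linq;

class CountPredicateSketch
{
    static void Main()
    {
        var list = Enumerable.Range(1, 25);

        int viaPredicate = list.Count(n => n % 2 == 0);
        int viaWhere = list.Where(n => n % 2 == 0).Count();
        long viaLongCount = list.LongCount(n => n % 2 == 0);

        // All three report the 12 even numbers between 1 and 25.
        Console.WriteLine("{0}, {1}, {2}", viaPredicate, viaWhere, viaLongCount);
    }
}
```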

The Min and Max Operators

The Min and Max operators are equally simple, as you can see by glancing at Listings 6.8 and 6.10. The first shows the behavior of the first overload of Min and Max, and the second shows how to use one of the other overloads to pose more complex questions.

Listing 6.8. A Simple Example of Using the Min and Max Operators to Determine the Highest and Lowest Values in a Sequence

public void ShowMinMax()
{
    var list = Enumerable.Range(6, 10);

    ShowList(list);

    Console.WriteLine("Min: {0}, Max: {1}", list.Min(), list.Max());
}

Our list consists of the numbers 6 through 15, so the code writes the values 6 and 15 to the console. If the source sequence is null, Min and Max throw an ArgumentNullException.

For the more complex examples, I need a few rows of simple data, which I provide in Listing 6.9.

Listing 6.9. The Following Item Class and the GetItems Method Are Used by Most of the Examples in This Section

class Item
{
    public int Width { get; set; }
    public int Length { get; set; }

    public override string ToString()
    {
        return string.Format("Width: {0}, Length: {1}", Width, Length);
    }
}

private List<Item> GetItems()
{
    return new List<Item>
    {
       new Item { Length = 0, Width = 5 },
       new Item { Length = 1, Width = 6 },
       new Item { Length = 2, Width = 7 },
       new Item { Length = 3, Width = 8 },
       new Item { Length = 4, Width = 9 }
    };
}

The Item class has two simple properties called Length and Width. It also overrides the ToString method so that it can be displayed easily in a foreach loop.

It is easy to understand how to discover a default maximum value for a list of integers, but how can you find the maximum or minimum values for a list of Items? Do you choose the element with the greatest Length, the greatest Width, a combination of the two, or some other value? There is no set answer to this question. The user must make a custom decision based on the requirements of his or her application. You can define your solution by implementing a delegate used with an overload of the Min and Max operators:

public static int Max<TSource>(this IEnumerable<TSource> source,
   Func<TSource, int> selector);

This delegate is not a predicate. Instead, it asks you to return your custom Max value for an Item class. That is, it asks for the integer value that should be used to represent a given Item instance. The Min and Max overloads use this value to compute and return the minimum/maximum of a collection of Item instances. To see how this works, look at Listing 6.10.

Listing 6.10. A Somewhat More Complex Use of Min and Max, Demonstrating How to Get Minimum and Maximum Values for Complex Types with Multiple Fields

List<Item> items = GetItems();

ShowList(items);

Console.WriteLine("MinItem: {0}, MaxItem: {1}",
    items.Min(x => x.Length + x.Width),
    items.Max(x => x.Length + x.Width));

The lambda passed to Max by our code looks like this: x => x.Length + x.Width. This delegate shows our definition of what we mean by max: the largest value returned by adding together the width and length of the Item.


Implementing Max

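I mentioned earlier that many of these operators have very simple implementations. Without peeking at the real source code, it seems that Max might look something like the following code. (One difference is that the real operator throws an InvalidOperationException when the sequence is empty, whereas this sketch returns int.MinValue.)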

public static int
  Max<TSource>(this IEnumerable<TSource> source,
    Func<TSource, int> selector)
{
    int largest = int.MinValue;
    foreach (var item in source)
    {
        int nextItem = selector(item);
        if (nextItem > largest)
        {
            largest = nextItem;
        }
    }
    return largest;
}


The Average Operator

After you discover the pattern shown in our examination of the Min and Max operators, you find that it can be easily applied to most of the other Aggregate operators. Let’s look at the Average operator, which returns the average value from an enumeration.

Obtaining the average for a range of numbers looks like this:

var list = Enumerable.Range(0, 5);
Console.WriteLine("Average: {0}", list.Average());

When run, this code tells us that the average of the numbers 0, 1, 2, 3, and 4 is the value 2.

When working with a collection of Items, we face the same problem we had with Min and Max: How do we discover the average value for a list of Items that define two properties called Length and Width? The answer, of course, is that we proceed just as we did with the Min and Max operators:

List<Item> items = GetItems();

double averageValue = items.Average(v => v.Length + v.Width);
Console.WriteLine("AverageValue: {0}", averageValue);

The implementation of Average is probably similar to what is shown in the custom implementation for the Max operator found in the Note at the end of the preceding section. The code must iterate over the list, passing in each item to our lambda, which defines the value we want the Average operator to use in its calculations.
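As a rough illustration of that idea, here is a sketch of what a selector-based average could look like. MyAverage is a hypothetical helper written for this example; it is not the framework's actual implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AverageSketch
{
    // Hypothetical helper, not the real Enumerable.Average.
    public static double MyAverage<TSource>(
        this IEnumerable<TSource> source, Func<TSource, int> selector)
    {
        long sum = 0;
        long count = 0;
        foreach (var item in source)
        {
            sum += selector(item);   // the lambda picks the value to average
            count++;
        }
        if (count == 0)
        {
            throw new InvalidOperationException("Sequence contains no elements");
        }
        return (double)sum / count;
    }
}

class Program
{
    static void Main()
    {
        var list = Enumerable.Range(0, 5);

        Console.WriteLine(list.MyAverage(n => n));  // 2
        Console.WriteLine(list.Average());          // 2
    }
}
```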

The Sum Operator

The Sum operator tallies the values in an enumeration. Consider the following simple example:

var list = Enumerable.Range(5, 3);
Console.WriteLine("List sum = {0}", list.Sum());

Our list consists of the numbers 5, 6, and 7. The Sum operator adds them together, producing the value 18.

When working with a list of Items, the Sum operator faces the same problem we saw with the Min, Max, and Average operators. It should come as no surprise that the solution is nearly identical:

var items = GetItems();
Console.WriteLine("Sum the lengths of the items: {0}",
  items.Sum(x => x.Length));

This is the same pattern you saw with the Average, Min, and Max operators: We pass in a simple lambda to define what the Sum operator should use in its calculations. The result printed to the console is the value 10. If only the rest of our lives were this simple!

The Aggregate Operator

The Aggregate operator follows in the footsteps of the Sum operator but is more flexible. Rather than taking a simple delegate like the other operators in this series, it asks for one similar to the lambda we worked with in Chapter 3:

public static T Aggregate<T>(this IEnumerable<T> source,
   Func<T, T, T> func);

We know what to do with delegates that look like this. We could, for instance, revisit the lesson on lambdas in Chapter 3 and create a delegate that adds up a range of numbers:

var list = Enumerable.Range(5, 3);
Console.WriteLine("Aggregation: {0}", list.Aggregate((a, b) => (a + b)));

The Aggregate operator gets passed the numbers 5, 6, and 7. The first time the lambda is called, it gets passed 5 and 6 and adds them together to produce 11. The next time it is called, it is passed the accumulated result of the previous calculation plus the next number in the series: 11 + 7, which yields 18:

5+6 = 11
11 + 7 = 18

This overload of the Aggregate operator is more flexible than the Sum operator because it allows you to choose the operation. For instance, this code performs multiplication, yielding the value 210:

list.Aggregate((a, b) => (a * b))


The Aggregate Corner Cases

Everyone asks two questions about the Aggregate operator. I’ll answer them here. If it is passed a list with one item, it returns that item. If it is passed a list with zero items, it throws an InvalidOperationException.
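Both corner cases can be checked with a few lines of code. In this sketch the calls counter is an illustrative addition that shows the lambda is never invoked for a one-item list:

```csharp
using System;
using System.Linq;

class AggregateCornerCasesSketch
{
    static void Main()
    {
        int[] one = { 5 };
        int[] none = { };

        // One item: the item itself is returned and the lambda is never called.
        int calls = 0;
        int result = one.Aggregate((a, b) => { calls++; return a + b; });
        Console.WriteLine("{0}, lambda calls: {1}", result, calls);  // 5, lambda calls: 0

        // Zero items: there is nothing to return, so the operator throws.
        try
        {
            none.Aggregate((a, b) => a + b);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Empty sequence threw InvalidOperationException");
        }
    }
}
```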


The second and perhaps most commonly used overload of the Aggregate operator allows you to seed the calculations it performs with an accumulator:

public static TAccumulate Aggregate<TSource, TAccumulate>(
   this IEnumerable<TSource> source, TAccumulate seed,
   Func<TAccumulate, TSource, TAccumulate> func);

This is essentially the same operator as shown in the previous example, but now you can decide the seed for the value that will be accumulated:

Console.WriteLine("Aggregation: {0}",
   list.Aggregate(0, (a, b) => (a + b)));

If we pass in a list with one item—say, the number 5—the first time the lambda is called, it is passed the seed plus the sole item in the list:

(0, 5) => (0 + 5)

The result, of course, is the number 5. Suppose we pass in an accumulator of 0 plus the numbers 5, 6, and 7:

var list = Enumerable.Range(5, 3);
Console.WriteLine("Aggregation: {0}", list.Aggregate(0, (a, b) =>
                                                     (a + b)));

In this case we would step through the following sequence:

0 + 5 = 5
5 + 6 = 11
11 + 7 = 18

Again, we are doing essentially what we did with the Sum operator.

If you pass in a different seed, you get a different result:

Console.WriteLine("Aggregation: {0}",
   list.Aggregate(3, (a, b) => (a + b)));

With a seed of 3, we get this:

 3 + 5 = 8
 8 + 6 = 14
14 + 7 = 21

If we use the multiplication operation, we should avoid passing in a seed of 0:

Console.WriteLine("Aggregation: {0}",
   list.Aggregate(1, (a, b) => (a * b)));

In this case the series looks like this:

1 * 5 = 5
5 * 6 = 30
30 * 7 = 210

If we passed in an accumulator of 0, we’d end up with the following series of operations:

0 * 5 = 0
0 * 6 = 0
0 * 7 = 0
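Two further properties of the seeded overload follow from its signature: an empty sequence simply returns the seed, and the accumulator may have a different type than the elements. Here is a minimal sketch with illustrative data:

```csharp
using System;
using System.Linq;

class AggregateSeedSketch
{
    static void Main()
    {
        int[] none = { };
        int[] numbers = { 5, 6, 7 };

        // Empty source: the lambda never runs and the seed comes back unchanged.
        Console.WriteLine(none.Aggregate(3, (a, b) => a + b));  // 3

        // TAccumulate is string while TSource is int.
        string joined = numbers.Aggregate("start", (text, n) => text + "-" + n);
        Console.WriteLine(joined);  // start-5-6-7
    }
}
```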

In what I sometimes suspect might have been an excess of good spirits, the team added one final overload to the Aggregate operator:

public static TResult Aggregate<TSource, TAccumulate, TResult>(
   this IEnumerable<TSource> source, TAccumulate seed,
   Func<TAccumulate, TSource, TAccumulate> func,
   Func<TAccumulate, TResult> resultSelector);

This overload is identical to the previous one, but you are given one more delegate that you can use to transform the result of your aggregation. For instance, consider this use of the Aggregate operator:

Console.WriteLine("Aggregation: {0}",
  list.Aggregate(0, (a, b) => (a + b),
  (a) => (string.Format("{0:C}", a))));

Notice that the first two-thirds of this call mirror what we did earlier; only the third parameter is new.

Suppose we pass in a sequence with the values 5, 6, and 7. As we’ve already seen, the process begins by performing the following series of operations:

0 + 5 = 5
5 + 6 = 11
11 + 7 = 18

Now the Aggregate operator passes this result to our second lambda, which uses the string’s Format method to transform it into a string in currency format. On a machine whose current culture is en-US, the result looks like this:

$18.00
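Because the C format specifier depends on the current culture, the same call can print a different currency symbol on another machine. If you want a predictable result, you can pass a culture explicitly. This sketch assumes that the en-US culture is available on the machine running it:

```csharp
using System;
using System.Globalization;
using System.Linq;

class AggregateResultSelectorSketch
{
    static void Main()
    {
        var list = Enumerable.Range(5, 3);

        // Assumption: en-US culture data is installed.
        CultureInfo enUS = CultureInfo.GetCultureInfo("en-US");

        string total = list.Aggregate(
            0,
            (a, b) => a + b,                        // accumulate the sum
            a => string.Format(enUS, "{0:C}", a));  // format the final result

        Console.WriteLine(total);
    }
}
```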

Like nearly everything in LINQ, this seems terribly complicated at first, only to end up being reasonably simple. These kinds of simple operations, however, provide us with the building blocks from which we can safely create complex programs. This is what we mean when we apply the word “elegant” to a technology.

Ordering Operators

You have already seen several examples of the OrderBy operator. I will, however, quickly show examples of OrderByDescending and ThenBy. In this section I also include the related Reverse operator. Table 6.7 lists the Ordering operators, all of which are deferred.

Table 6.7. Ordering Operators

image

In this section I’ll run queries against one of the result sets from the section on joins in the preceding chapter. To keep things as simple as possible, I will embody the result of the query in a new class, as shown in Listing 6.11.

Listing 6.11. This Code, and All the Sample Code from This Section, Is Found in the OrderingOperators Sample That Accompanies This Book

class Musician
{
    public int OrderId { get; set; }
    public string Name { get; set; }
    public string Instrument { get; set; }

    public override string ToString()
    {
        return string.Format(
            "OrderId = {0}, Name = {1}, Instrument = {2}",
            OrderId, Name, Instrument);
    }

    public static List<Musician> GetList()
    {
        return new List<Musician>
        {
            new Musician { OrderId = 1, Name = "Sonny Rollins",
               Instrument = "Tenor Saxophone" },
            new Musician { OrderId = 2, Name = "Miles Davis",
               Instrument = "Trumpet" },
            new Musician { OrderId = 6, Name = "Miles Davis",
               Instrument = "Keyboard" },
            new Musician { OrderId = 4, Name = "John Coltrane",
               Instrument = "Tenor Saxophone" },
            new Musician { OrderId = 3, Name = "John Coltrane",
               Instrument = "Soprano Saxophone" },
            new Musician { OrderId = 5, Name = "Charlie Parker",
               Instrument = "Tenor Saxophone" }
        };
    }
}

OrderBy

Here is the simplest possible query we can run against the sequence of Musician objects shown in Listing 6.11:

var query = from m in list
            select m;

When passed to ShowList, the result of this query simply echoes our sequence to the console:

Name = Sonny Rollins, Instrument = Tenor Saxophone
Name = Miles Davis, Instrument = Trumpet
Name = Miles Davis, Instrument = Keyboard
Name = John Coltrane, Instrument = Tenor Saxophone
Name = John Coltrane, Instrument = Soprano Saxophone
Name = Charlie Parker, Instrument = Tenor Saxophone

We can order the sequence alphabetically by writing this code:

var query2 = from m in list
             orderby m.Name
             select m;

If run through the ShowList method, the output looks like this:

Name = Charlie Parker, Instrument = Tenor Saxophone
Name = John Coltrane, Instrument = Tenor Saxophone
Name = John Coltrane, Instrument = Soprano Saxophone
Name = Miles Davis, Instrument = Trumpet
Name = Miles Davis, Instrument = Keyboard
Name = Sonny Rollins, Instrument = Tenor Saxophone

OrderByDescending

If we use the descending keyword, the query and output are as shown in Listings 6.12 and 6.13.

Listing 6.12. Using the Keyword descending in a Query

var query2 = from m in list
             orderby m.Name descending
             select m;

Listing 6.13. The Output from the Query Shown in Listing 6.12

Name = Sonny Rollins, Instrument = Tenor Saxophone
Name = Miles Davis, Instrument = Trumpet
Name = Miles Davis, Instrument = Keyboard
Name = John Coltrane, Instrument = Tenor Saxophone
Name = John Coltrane, Instrument = Soprano Saxophone
Name = Charlie Parker, Instrument = Tenor Saxophone

When translated into query method syntax, the query shown in Listing 6.12 looks like this:

var query2 = list.OrderByDescending(m => m.Name);

The output from this query is identical to that shown in Listing 6.13.

ThenBy

In the queries shown in the previous two sections, we sorted on the Name field but ignored the Instrument field. You can, in fact, sort on multiple fields, or multiple keys, at the same time. Here is how it looks:

var query4 = from m in list
             orderby m.Name, m.Instrument
             select m;

The output shown after the query is enumerated with a foreach loop reveals that the artists are listed alphabetically, and the instruments played by Trane and Miles are also listed alphabetically:

Name = Charlie Parker, Instrument = Tenor Saxophone
Name = John Coltrane, Instrument = Soprano Saxophone
Name = John Coltrane, Instrument = Tenor Saxophone
Name = Miles Davis, Instrument = Keyboard
Name = Miles Davis, Instrument = Trumpet
Name = Sonny Rollins, Instrument = Tenor Saxophone

You may have noticed that this section of the text is titled ThenBy. The query I’ve shown, however, makes no mention of that word. The operator ThenBy becomes manifest only when you use method syntax rather than query expressions. When translated into method syntax, the query looks like this:

var query4a = list.OrderBy(m => m.Name).ThenBy(m => m.Instrument);

Now you can see the ThenBy operator! The output from this query is identical to that shown for the first query in this section.

The following syntax is also valid:

var query6 = from m in list
             orderby m.Instrument descending, m.Name descending
             select m;

You can also write code that looks like this, but it does not produce the same output as that derived from the previous query:

var query5 = from m in list
             orderby m.Instrument descending
             orderby m.Name descending
             select m;
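The reason the two forms differ is that a second orderby clause starts a new sort rather than refining the previous one. Because the LINQ to Objects sort is stable, the earlier ordering survives only as a tiebreaker. The following sketch uses a small illustrative array of anonymous objects rather than the Musician list:

```csharp
using System;
using System.Linq;

class OrderingSketch
{
    static void Main()
    {
        // Illustrative pairs; not the Musician list from Listing 6.11.
        var pairs = new[]
        {
            new { Name = "Miles", Instrument = "Trumpet" },
            new { Name = "John",  Instrument = "Tenor" },
            new { Name = "Miles", Instrument = "Keyboard" },
            new { Name = "John",  Instrument = "Soprano" }
        };

        // One clause, two keys: Instrument is primary, Name breaks ties.
        var oneClause = from p in pairs
                        orderby p.Instrument descending, p.Name descending
                        select p;

        // Two clauses: the second sort makes Name the primary key.
        var twoClauses = from p in pairs
                         orderby p.Instrument descending
                         orderby p.Name descending
                         select p;

        foreach (var x in oneClause)
        {
            Console.WriteLine("{0}, {1}", x.Instrument, x.Name);
        }
        Console.WriteLine("---");
        foreach (var x in twoClauses)
        {
            Console.WriteLine("{0}, {1}", x.Instrument, x.Name);
        }
    }
}
```

The first loop lists Trumpet, Tenor, Soprano, Keyboard. The second groups by name instead: both Miles entries come first (Trumpet, then Keyboard), followed by both John entries (Tenor, then Soprano).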

I’ll leave it up to you to open the OrderingOperators sample program that accompanies this book and experiment with these various combinations to see exactly how they work.

Reverse

Here is a simple example of how to use the Reverse operator:

List<int> list = new List<int> { 1, 2, 3 };

list.Reverse();

foreach (var item in list)
{
    Console.WriteLine(item);
}

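This code prints out the values 3, 2, 1. Because list is declared as List<int>, the call binds to the instance method List<T>.Reverse, which reverses the list in place, rather than to the deferred LINQ operator Enumerable.Reverse.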
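To call the LINQ Reverse operator rather than List<T>.Reverse, the receiver needs to be typed as IEnumerable<T>. Here is a minimal sketch; the variable names sequence and reversed are illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ReverseSketch
{
    static void Main()
    {
        List<int> list = new List<int> { 1, 2, 3 };

        // Typed as IEnumerable<int>, so Reverse binds to Enumerable.Reverse.
        IEnumerable<int> sequence = list;
        IEnumerable<int> reversed = sequence.Reverse();

        // Added before enumeration; the deferred query still sees it.
        list.Add(4);

        Console.WriteLine(string.Join(", ", reversed));  // 4, 3, 2, 1
        Console.WriteLine(string.Join(", ", list));      // 1, 2, 3, 4
    }
}
```

The original list is left untouched, and the reversed sequence is computed only when it is enumerated.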

Conversion Operators

LINQ provides several Conversion operators that help you transform one list type into another. In this chapter I have shown that you can perform many powerful operations using the operators implemented on IEnumerable<T>. However, there will be times when you will want to transform the results of a query into a more familiar collection type, or when you will want to transform a type that does not support IEnumerable<T> into a type that you can use in a LINQ query. These operators are designed to help you achieve that goal. The Conversion operators are shown in Table 6.8.

Table 6.8. Conversion Operators

image

The ToArray, ToList, ToDictionary, and ToLookup operators are not deferred. Like the element operators, they force immediate execution of a query.

ToList

By default, a typical LINQ query returns a computation on an IEnumerable<T>:

public IEnumerable<Roman> GetWives()
{
  var women = from r in romans
              where r.Gender == 'f'
              select r;

  return women;
}

You may have code that works with the common List<T> type, or you may want to call methods such as Add, which are unavailable on IEnumerable<T>. To convert the results of your query to a List<T>, just write this code:

public List<Roman> CreateList(char gender)
{
  return (from r in romans
          where r.Gender == gender
          select r).ToList();
}

The List<T> type is very commonly used by C# programmers, so there will be many occasions when developers choose to work with that type rather than IEnumerable<T>. However, calling ToList also forces the execution of the query. In other words, it puts an end to deferred execution and to the other benefits that come with functional or declarative code. In saying this, I do not mean to discourage you from using the very useful ToList operator. I only ask that you be aware of the consequences of what you are doing.
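To make the consequence concrete, here is a small sketch that compares a deferred query with the list produced by ToList. The names collection is hypothetical and merely stands in for the romans collection used elsewhere in this chapter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data, invented for this sketch.
var names = new List<string> { "Livia", "Julia", "Agrippina" };

// Deferred: the filter runs each time the query is enumerated.
IEnumerable<string> deferred = from n in names
                               where n.Length == 5
                               select n;

// Immediate: ToList runs the query now and stores the results.
List<string> snapshot = deferred.ToList();

// Change the source after the snapshot has been taken.
names.Add("Paula");

int deferredCount = deferred.Count(); // re-runs the query and sees Paula
int snapshotCount = snapshot.Count;   // still reflects the earlier state

Console.WriteLine(deferredCount); // 3
Console.WriteLine(snapshotCount); // 2
```

Neither behavior is wrong. The point is simply that the list is a copy taken at one moment, while the query is recomputed on demand.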

Remember that you can always use Visual Studio’s QuickInfo to retrieve the type that a query returns. For instance, in Figure 6.5 you can see that this call to ToList creates a List<Roman>.

Figure 6.5. Using QuickInfo to see the type returned by a call to the ToList() operator.

image

As mentioned earlier, IEnumerable<T>, sans its LINQ operators, is a very simple type:

public interface IEnumerable<T> : IEnumerable
{
  IEnumerator<T> GetEnumerator();
}

You cannot, for instance, add an item to a sequence of type IEnumerable<T>:

IEnumerable<int> list = from n in Enumerable.Range(1, 3)
                        where n < 3
                        select n;
list.Add(3); // Will not compile: member does not exist.

The following code, however, does compile, and it behaves as expected:

List<int> list = (from n in Enumerable.Range(1, 3)
                  where n < 3
                  select n).ToList();

list.Add(4);

ShowList(list);

ToArray

ToArray() can convert a sequence into an array, as shown in Listing 6.14. Thus, you can quickly convert the results of a LINQ query into an array of string or an array of int.

Listing 6.14. You Can Use the ToArray() Method to Convert a Sequence—an IEnumerable<T>—into a More Traditional Array

List<int> list = new List<int> { 1, 2, 3 };

int[] data = (from num in list
              where num < 3
              select num).ToArray();

foreach (var item in data)
{
    Console.WriteLine(item);
}

Here we return not a computation, but an array of integers. Note that this forces execution of the query, so the query is no longer deferred.

OfType

Here is the OfType<T> operator, which returns only the members of a collection that are of a specified type:

ArrayList list = new ArrayList { 1, "That", 2, "This" };

IEnumerable<string> elist = list.OfType<string>();

var query = from str in elist
            select str;

The output from this query looks like this:

That
This

Notice that the integers in the ArrayList are both ignored, and that only the strings are retrieved. This is because we asked explicitly for the members of the list that are “of type” string:

list.OfType<string>();

The point here is that an ArrayList is not type-safe, because it is not a generic collection. As a result, you can never completely trust that you know what type is in an ArrayList. This operator can help relieve your anxiety by ensuring that you will not end up with a runtime exception when you stumble across an unexpected type that was infesting an ArrayList.
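The contrast with the Cast<T> operator may help here. The sketch below reuses the ArrayList from the example above. OfType<string> quietly skips the integers, whereas Cast<string> tries to cast every element and fails when it reaches the first one that is not a string.

```csharp
using System;
using System.Collections;
using System.Linq;

var list = new ArrayList { 1, "That", 2, "This" };

// OfType<string> skips elements that are not strings.
string[] strings = list.OfType<string>().ToArray();

// Cast<string> attempts to cast every element and throws
// an InvalidCastException on the first boxed int it meets.
bool castFailed = false;
try
{
    _ = list.Cast<string>().ToArray();
}
catch (InvalidCastException)
{
    castFailed = true;
}

Console.WriteLine(string.Join(", ", strings)); // That, This
Console.WriteLine(castFailed);                 // True
```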

ToDictionary

Here is an example of how to use the ToDictionary operator:

public void ToDictionary()
{
  var query = from r in romans
              select new { r.Name, r.Gender, r.Id };

  var romanDictionary = query.ToDictionary(r => r.Name);

  Console.WriteLine(romanDictionary["Augustus"]);
  Console.WriteLine(romanDictionary["Livia Drusilla"]);
}

This code produces the following output:

{ Name = Augustus, Gender = m, Id = 0 }
{ Name = Livia Drusilla, Gender = f, Id = 7 }

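As you can see, the ToDictionary() operator converts the results of a LINQ query into a generic dictionary. In the overload used here, ToDictionary takes as its sole argument a lambda expression that defines which property you want to use as the Key for the dictionary. Other overloads also accept a second lambda that selects the value to store, or an equality comparer for the keys.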
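Two details about ToDictionary are easy to check with a short sketch. The first is the overload that takes a second lambda to choose the stored value. The second is that keys must be unique, so choosing a property that repeats causes an ArgumentException. The people array here is a stand-in for the romans collection, and the Tiberius entry and its Id are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in data: Tiberius and his Id are invented for this sketch.
var people = new[]
{
    new { Name = "Augustus", Gender = 'm', Id = 0 },
    new { Name = "Livia Drusilla", Gender = 'f', Id = 7 },
    new { Name = "Tiberius", Gender = 'm', Id = 3 },
};

// Two-argument overload: the first lambda picks the key,
// the second picks the value stored under that key.
Dictionary<string, int> idsByName = people.ToDictionary(p => p.Name, p => p.Id);
Console.WriteLine(idsByName["Livia Drusilla"]); // 7

// Gender is not unique in this data, so it cannot serve as a key.
bool duplicateRejected = false;
try
{
    _ = people.ToDictionary(p => p.Gender);
}
catch (ArgumentException)
{
    duplicateRejected = true;
}
Console.WriteLine(duplicateRejected); // True
```

When a key can legitimately repeat, ToLookup is usually the better fit, because it groups every matching element under its key.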


Anonymous Types Usually Have a Limited Scope

The dictionary shown here uses the anonymous type returned from the original LINQ query shown in the first two lines of our method. This limits your options, because you can’t declare a dictionary of type Dictionary<string, MyAnonymousType>, so you can’t pass it outside the scope of this method. You could get around this problem by altering the code to return a Dictionary<string, Roman> or Dictionary<string, SomeCustomType> and then creating a ToString method for whichever type you used. An example of this is shown in the RomanOperators program that comes with this book.
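As a rough illustration of the idea, and not the approach taken in RomanOperators, the following sketch swaps the anonymous type for a value tuple. A tuple type can be written out in a method signature, so the dictionary can leave the method that builds it. This assumes a newer compiler than the one this book covers, since it relies on tuples and top-level statements. BuildDictionary is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: the return type can be spelled out because the
// values are tuples rather than instances of an anonymous type.
Dictionary<string, (string Name, char Gender, int Id)> BuildDictionary()
{
    var romans = new[]
    {
        (Name: "Augustus", Gender: 'm', Id: 0),
        (Name: "Livia Drusilla", Gender: 'f', Id: 7),
    };

    return romans.ToDictionary(r => r.Name);
}

// The caller receives a fully typed dictionary.
var byName = BuildDictionary();
int liviaId = byName["Livia Drusilla"].Id;

Console.WriteLine(liviaId); // 7
```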


Conversion Between IEnumerable and IQueryable

This discussion would be incomplete without mentioning the AsQueryable operator and its companion, the AsEnumerable operator. AsQueryable is defined in the Queryable class, and not in Enumerable. AsEnumerable is defined with the other LINQ to Objects operators in the Enumerable class. Here is a simple example showing how they work:

var query = from r in romans
            where r.Gender == 'm'
            select r;

var queryable = query.AsQueryable();

Console.WriteLine(queryable.Expression.NodeType);

var query1 = queryable.AsEnumerable();

ShowList(query1);

The call to AsQueryable converts an IEnumerable<T> into an IQueryable<T>. As mentioned earlier, this is most often helpful when you want to create a LINQ provider. You can read about working examples of third-party providers that ship with source in Chapter 17, “LINQ Everywhere.”

The sample code shown here also demonstrates the conversion in the other direction: the call to AsEnumerable converts a variable of type IQueryable<T> into an IEnumerable<T>. Because IQueryable<T> implements the IEnumerable<T> interface, you probably won’t often need to explicitly call this operator, but I mention it here for the sake of completeness.
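If you are curious about what these two calls do to an in-memory collection, the following sketch may help. It uses a hypothetical list of integers rather than the romans collection. It shows that the expression behind a freshly wrapped collection is a constant, that adding a Where call turns it into a method call node, and that AsEnumerable changes only the compile-time type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

var numbers = new List<int> { 1, 2, 3 };

// Wrapping the list yields an expression tree whose root is the list itself.
IQueryable<int> queryable = numbers.AsQueryable();
ExpressionType rootNode = queryable.Expression.NodeType;

// Operators applied to an IQueryable<T> extend the expression tree.
IQueryable<int> filtered = queryable.Where(n => n > 1);
ExpressionType filteredNode = filtered.Expression.NodeType;

// AsEnumerable returns the same object, typed as IEnumerable<T>.
IEnumerable<int> backAgain = filtered.AsEnumerable();
bool sameObject = object.ReferenceEquals(filtered, backAgain);

Console.WriteLine(rootNode);                     // Constant
Console.WriteLine(filteredNode);                 // Call
Console.WriteLine(sameObject);                   // True
Console.WriteLine(string.Join(", ", backAgain)); // 2, 3
```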

Summary

In this chapter you have learned about the LINQ query operators. You have had a look at nearly all the operators, and you should now have a secure foothold in this landscape that will allow you to keep your balance in any situation.

This is the end of the introductory part of this book. If you understand the material that has been presented so far, you can consider yourself well established as an intermediate-level LINQ developer. The next chapter begins exploring LINQ to SQL, an important subject, and one that many developers will use every day in their work.

When thinking back on the material that has been covered, it is important to begin to understand how this style of programming differs from the traditional imperative style that has dominated programming for the last 20 or 30 years. The declarative style of programming offers interesting and exciting challenges for developers willing to explore this fascinating technology.

For additional information on the material covered in this chapter, see the page in the online help called “The .NET Standard Query Operators” at http://msdn.microsoft.com/en-us/library/bb394939.aspx. It was written by Anders Hejlsberg and Mads Torgersen.
