To write it, it took three months; to conceive it three minutes; to collect the data in it--all my life.
--F. Scott Fitzgerald
Science is feasible when the variables are few and can be enumerated...
--Paul Valéry
You shall listen to all sides and filter them from your self.
--Walt Whitman
The portraitist can select one tiny aspect of everything shown at a moment to incorporate into the final painting.
--Robert Nozick
List, list, O, list!
--William Shakespeare
In this chapter you’ll learn:
- Basic LINQ concepts.
- How to query an array using LINQ.
- Basic .NET collections concepts.
- How to create and use a generic List collection.
- How to query a generic List collection using LINQ.
The preceding chapter introduced arrays, simple data structures used to store data items of a specific type. Although commonly used, arrays have limited capabilities. For instance, you must specify an array’s size when you create it, and if you wish to change that size at execution time, you must do so manually, either by creating a new array or by using the Array class’s Resize method, which creates a new array and copies the existing elements into it for you.
Here, we introduce a set of prepackaged data structures, the .NET Framework’s collection classes, that offer greater capabilities than traditional arrays. They’re reusable, reliable, powerful and efficient, and they have been carefully designed and tested to ensure quality and performance. This chapter focuses on the List collection. Lists are similar to arrays but provide additional functionality, such as dynamic resizing: they automatically increase their size at execution time to accommodate additional elements. We use the List collection to implement several examples similar to those used in the preceding chapter.
Large amounts of data are often stored in a database, an organized collection of data. (We discuss databases in detail in Chapter 18.) A database management system (DBMS) provides mechanisms for storing, organizing, retrieving and modifying data in the database. A language called SQL (pronounced “sequel”) is the international standard used to perform queries (i.e., to request information that satisfies given criteria) and to manipulate data. For years, programs accessing a relational database passed SQL queries to the database management system, then processed the results. This chapter introduces C#’s new LINQ (Language Integrated Query) capabilities. LINQ allows you to write query expressions, similar to SQL queries, that retrieve information from a wide variety of data sources, not just databases. We use LINQ to Objects in this chapter to query arrays and Lists, selecting elements that satisfy a set of conditions; this is known as filtering. Figure 9.1 shows where and how we use LINQ throughout the book to retrieve information from many data sources.
Table 9.1. LINQ usage throughout the book.
Chapter | Used to |
---|---|
| Query arrays and Lists. |
| Select GUI controls in a Windows Forms application. |
| Search a directory and manipulate text files. |
| Retrieve information from a database. |
| Retrieve information from a database to be used in a web-based application. |
Chapter 26, XML and LINQ to XML | Query an XML document. |
Chapter 28, Windows Communication Foundation (WCF) Web Services | Query and update a database. Process XML returned by WCF services. |
Chapter 29, Silverlight and Rich Internet Applications | Process XML returned by web services to a Silverlight application. |
The syntax of LINQ is built into C#, but LINQ queries may be used in many different contexts because of libraries known as providers. A LINQ provider is a set of classes that implement LINQ operations and enable programs to interact with data sources to perform tasks such as sorting, grouping and filtering elements.
In this book, we discuss LINQ to SQL and LINQ to XML, which allow you to query databases and XML documents using LINQ. These providers, along with LINQ to Objects, mentioned above, are included with Visual Studio and the .NET Framework. There are many providers that are more specialized, allowing you to interact with a specific website or data format. An extensive list of available providers is located at:
blogs.msdn.com/charlie/archive/2006/10/05/Links-to-LINQ.aspx
Figure 9.2 demonstrates querying an array of integers using LINQ. Repetition statements that filter arrays focus on the process of getting the results—iterating through the elements and checking whether they satisfy the desired criteria. LINQ specifies the conditions that selected elements must satisfy. This is known as declarative programming—as opposed to imperative programming (which we’ve been doing so far) in which you specify the actual steps to perform a task. The query in lines 20–22 specifies that the results should consist of all the int
s in the values
array that are greater than 4
. It does not specify how those results are obtained—the C# compiler generates all the necessary code automatically, which is one of the great strengths of LINQ. To use LINQ to Objects, you must import the System.Linq
namespace (line 4).
Example 9.2. LINQ to Objects using an int array.
 1  // Fig. 9.2: LINQWithSimpleTypeArray.cs
 2  // LINQ to Objects using an int array.
 3  using System;
 4  using System.Linq;
 5
 6  class LINQWithSimpleTypeArray
 7  {
 8     public static void Main( string[] args )
 9     {
10        // create an integer array
11        int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };
12
13        // display original values
14        Console.Write( "Original array:" );
15        foreach ( var element in values )
16           Console.Write( " {0}", element );
17
18        // LINQ query that obtains values greater than 4 from the array
19        var filtered =
20           from value in values
21           where value > 4
22           select value;
23
24        // display filtered results
25        Console.Write( "\nArray values greater than 4:" );
26        foreach ( var element in filtered )
27           Console.Write( " {0}", element );
28
29        // use orderby clause to sort original array in ascending order
30        var sorted =
31           from value in values
32           orderby value
33           select value;
34
35        // display sorted results
36        Console.Write( "\nOriginal array, sorted:" );
37        foreach ( var element in sorted )
38           Console.Write( " {0}", element );
39
40        // sort the filtered results into descending order
41        var sortFilteredResults =
42           from value in filtered
43           orderby value descending
44           select value;
45
46        // display the sorted results
47        Console.Write(
48           "\nValues greater than 4, descending order (separately):" );
49        foreach ( var element in sortFilteredResults )
50           Console.Write( " {0}", element );
51
52        // filter original array and sort in descending order
53        var sortAndFilter =
54           from value in values
55           where value > 4
56           orderby value descending
57           select value;
58
59        // display the filtered and sorted results
60        Console.Write(
61           "\nValues greater than 4, descending order (one query):" );
62        foreach ( var element in sortAndFilter )
63           Console.Write( " {0}", element );
64
65        Console.WriteLine();
66     } // end Main
67  } // end class LINQWithSimpleTypeArray
A LINQ query begins with a from
clause (line 20), which specifies a range variable (value
) and the data source to query (values
). The range variable represents each item in the data source (one at a time), much like the control variable in a foreach
statement. We do not specify the range variable’s type. Since it is assigned one element at a time from the array values
, which is an int
array, the compiler determines that the range variable value
should be of type int
. This is a C# feature called implicitly typed local variables, which enables the compiler to infer a local variable’s type based on the context in which it’s used.
Introducing the range variable in the from
clause at the beginning of the query allows the IDE to provide IntelliSense while you write the rest of the query. The IDE knows the range variable’s type, so when you enter the range variable’s name followed by a dot (.
) in the code editor, the IDE can display the range variable’s methods and properties.
You can also declare a local variable and let the compiler infer the variable’s type based on the variable’s initializer. To do so, the var
keyword is used in place of the variable’s type when declaring the variable. Consider the declaration
var x = 7;
Here, the compiler infers that the variable x
should be of type int
, because the compiler assumes that whole-number values, like 7
, are of type int
. Similarly, in the declaration
var y = -123.45;
the compiler infers that y
should be of type double
, because the compiler assumes that floating-point number values, like -123.45
, are of type double
. Typically, implicitly typed local variables are used for more complex types, such as the collections of data returned by LINQ queries. We use this feature in lines 19, 30, 41 and 53 to enable the compiler to determine the type of each variable that stores the results of a LINQ query. We also use this feature to declare the control variable in the foreach
statements at lines 15–16, 26–27, 37–38, 49–50 and 62–63. In each case, the compiler infers that the control variable is of type int
because the array values
and the LINQ query results all contain int
values.
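The inference rules described above can be checked directly. The following is a minimal sketch, not one of the chapter's figures: the class name VarInferenceDemo is our own, and the program simply prints the types the compiler chose for a few implicitly typed variables.

```csharp
using System;
using System.Linq;

public static class VarInferenceDemo
{
   public static void Main( string[] args )
   {
      var x = 7; // whole-number literal, so x is an int
      var y = -123.45; // floating-point literal, so y is a double
      int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };

      // the compiler infers the type of the query result
      var filtered =
         from value in values
         where value > 4
         select value;

      Console.WriteLine( x.GetType() ); // System.Int32
      Console.WriteLine( y.GetType() ); // System.Double

      // element is inferred to be an int
      foreach ( var element in filtered )
         Console.Write( "{0} ", element );

      Console.WriteLine();
   } // end Main
} // end class VarInferenceDemo
```

Because var only asks the compiler to fill in the type, x and y are still statically typed; assigning a string to x later would be a compilation error.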
If the condition in the where
clause (line 21) evaluates to true
, the element is selected—i.e., it’s included in the results. Here, the int
s in the array are included only if they’re greater than 4
. An expression that takes an element of a collection and returns true
or false
by testing a condition on that element is known as a predicate.
For each item in the data source, the select
clause (line 22) determines what value appears in the results. In this case, it’s the int
that the range variable currently represents. A LINQ query typically ends with a select
clause.
Lines 26–27 use a foreach statement to display the query results. As you know, a foreach statement can iterate through the contents of an array, allowing you to process each element in the array. Actually, the foreach statement can iterate through the contents of arrays, collections and the results of LINQ queries. The foreach statement in lines 26–27 iterates over the query result filtered, displaying each of its items.
It would be simple to display the integers greater than 4 using a repetition statement that tests each value before displaying it. However, this would intertwine the code that selects elements and the code that displays them. With LINQ, these are kept separate, making the code easier to understand and maintain.
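To make the contrast concrete, here is a small sketch that places a loop-based filter next to the equivalent query. The class and method names (FilterComparison, GreaterThanFourLoop, GreaterThanFourQuery) are illustrative and do not appear in the chapter's figures.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterComparison
{
   // imperative version: the code spells out how to walk the array,
   // test each value and collect the matches
   public static List< int > GreaterThanFourLoop( int[] values )
   {
      List< int > results = new List< int >();

      foreach ( var value in values )
      {
         if ( value > 4 )
            results.Add( value );
      } // end foreach

      return results;
   } // end method GreaterThanFourLoop

   // declarative version: the query states only the condition
   public static IEnumerable< int > GreaterThanFourQuery( int[] values )
   {
      return
         from value in values
         where value > 4
         select value;
   } // end method GreaterThanFourQuery

   public static void Main( string[] args )
   {
      int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };

      foreach ( var element in GreaterThanFourLoop( values ) )
         Console.Write( "{0} ", element );

      Console.WriteLine();

      foreach ( var element in GreaterThanFourQuery( values ) )
         Console.Write( "{0} ", element );

      Console.WriteLine();
   } // end Main
} // end class FilterComparison
```

Both loops in Main display 9 5 7 8 5. In each case the display code is the same, because selection has been moved out of it.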
The orderby
clause (line 32) sorts the query results in ascending order. Lines 43 and 56 use the descending
modifier in the orderby
clause to sort the results in descending order. An ascending
modifier also exists but isn’t normally used, because it’s the default. Any value that can be compared with other values of the same type may be used with the orderby
clause. A value of a simple type (e.g., int
) can always be compared to another value of the same type; we’ll say more about comparing values of reference types in Chapter 12.
The queries in lines 42–44 and 54–57 generate the same results, but in different ways. The first query uses LINQ to sort the results of the query from lines 20–22. The second query uses both the where
and orderby
clauses. Because queries can operate on the results of other queries, it’s possible to build a query one step at a time, and pass the results of queries between methods for further processing.
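As an illustration of passing query results between methods, the sketch below splits the filter and the sort into two methods. The names QueryComposition, GreaterThan and Descending are hypothetical helpers introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryComposition
{
   // first step: keep only the values greater than a threshold
   public static IEnumerable< int > GreaterThan(
      IEnumerable< int > source, int threshold )
   {
      return
         from value in source
         where value > threshold
         select value;
   } // end method GreaterThan

   // second step: sort whatever sequence is passed in, largest first
   public static IEnumerable< int > Descending( IEnumerable< int > source )
   {
      return
         from value in source
         orderby value descending
         select value;
   } // end method Descending

   public static void Main( string[] args )
   {
      int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };

      // the result of one query is the data source of the next
      var result = Descending( GreaterThan( values, 4 ) );

      foreach ( var element in result )
         Console.Write( "{0} ", element ); // displays 9 8 7 5 5

      Console.WriteLine();
   } // end Main
} // end class QueryComposition
```

Each method accepts and returns IEnumerable< int >, so the two steps can be combined in either order or reused with other data sources.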
Implicitly typed local variables can also be used to initialize arrays without explicitly giving their type. For example, the following statement creates an array of int
values:
var array = new[] { 32, 27, 64, 18, 95, 14, 90, 70, 60, 37 };
Note that there are no square brackets on the left side of the assignment operator, and that new[]
is used to specify that the variable is an array.
As we mentioned, the foreach
statement can iterate through the contents of arrays, collections and LINQ query results. Actually, foreach
iterates over any so-called IEnumerable<T>
object, which just happens to be what a LINQ query returns.
IEnumerable<T>
is an interface. Interfaces define and standardize the ways in which people and systems can interact with one another. For example, the controls on a radio serve as an interface between radio users and the radio’s internal components. The controls allow users to perform a limited set of operations (e.g., changing the station, adjusting the volume, and choosing between AM and FM), and different radios may implement the controls in different ways (e.g., using push buttons, dials or voice commands). The interface specifies what operations a radio permits users to perform but does not specify how the operations are implemented. Similarly, the interface between a driver and a car with a manual transmission includes the steering wheel, the gear shift, the clutch, the gas pedal and the brake pedal. This same interface is found in nearly all manual-transmission cars, enabling someone who knows how to drive one manual-transmission car to drive another.
Software objects also communicate via interfaces. A C# interface describes a set of methods that can be called on an object—to tell the object, for example, to perform some task or return some piece of information. The IEnumerable<T>
interface describes the functionality of any object that can be iterated over and thus offers methods to access each element. A class that implements an interface must define each method in the interface with a signature identical to the one in the interface definition. Implementing an interface is like signing a contract with the compiler that states, “I will declare all the methods specified by the interface.” Chapter 12 covers use of interfaces in more detail, as well as how to define your own interfaces.
Arrays are IEnumerable<T>
objects, so a foreach
statement can iterate over an array’s elements. Similarly, each LINQ query returns an IEnumerable<T>
object. Therefore, you can use a foreach
statement to iterate over the results of any LINQ query. The notation <T>
indicates that the interface is a generic interface that can be used with any type of data (for example, int
s, string
s or Employee
s). You’ll learn more about the <T>
notation in Section 9.4. You’ll learn more about interfaces in Section 12.7.
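One practical consequence is that a single method can process arrays, Lists and query results alike, provided its parameter is declared as IEnumerable<T>. The following sketch assumes a hypothetical helper named Total; it is not part of the .NET Framework or of the chapter's figures.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableDemo
{
   // accepts any object that implements IEnumerable< int >
   public static int Total( IEnumerable< int > source )
   {
      int total = 0;

      foreach ( var element in source )
         total += element;

      return total;
   } // end method Total

   public static void Main( string[] args )
   {
      int[] array = { 1, 2, 3, 4 };
      List< int > list = new List< int >( array ); // copy of the array

      var evens =
         from value in array
         where value % 2 == 0
         select value;

      Console.WriteLine( Total( array ) ); // array: displays 10
      Console.WriteLine( Total( list ) ); // List: displays 10
      Console.WriteLine( Total( evens ) ); // query result: displays 6
   } // end Main
} // end class EnumerableDemo
```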
LINQ is not limited to querying arrays of primitive types such as ints. It can be used with most data types, including strings and user-defined classes. It cannot be used when a query does not have a defined meaning; for example, you cannot use orderby on objects that are not comparable. Comparable types in .NET are those that implement the IComparable interface, which is discussed in Section 22.4. Built-in types such as string, int and double implement IComparable. Figure 9.3 presents the Employee class. Figure 9.4 uses LINQ to query an array of Employee objects.
Example 9.3. Employee class.
 1  // Fig. 9.3: Employee.cs
 2  // Employee class with FirstName, LastName and MonthlySalary properties.
 3  public class Employee
 4  {
 5     private decimal monthlySalaryValue; // monthly salary of employee
 6
 7     // auto-implemented property FirstName
 8     public string FirstName { get; set; }
 9
10     // auto-implemented property LastName
11     public string LastName { get; set; }
12
13     // constructor initializes first name, last name and monthly salary
14     public Employee( string first, string last, decimal salary )
15     {
16        FirstName = first;
17        LastName = last;
18        MonthlySalary = salary;
19     } // end constructor
20
21     // property that gets and sets the employee's monthly salary
22     public decimal MonthlySalary
23     {
24        get
25        {
26           return monthlySalaryValue;
27        } // end get
28        set
29        {
30           if ( value >= 0M ) // if salary is nonnegative
31           {
32              monthlySalaryValue = value;
33           } // end if
34        } // end set
35     } // end property MonthlySalary
36
37     // return a string containing the employee's information
38     public override string ToString()
39     {
40        return string.Format( "{0,-10} {1,-10} {2,10:C}",
41           FirstName, LastName, MonthlySalary );
42     } // end method ToString
43  } // end class Employee
Example 9.4. LINQ to Objects using an array of Employee objects.
 1  // Fig. 9.4: LINQWithArrayOfObjects.cs
 2  // LINQ to Objects using an array of Employee objects.
 3  using System;
 4  using System.Linq;
 5
 6  public class LINQWithArrayOfObjects
 7  {
 8     public static void Main( string[] args )
 9     {
10        // initialize array of employees
11        Employee[] employees = {
12           new Employee( "Jason", "Red", 5000M ),
13           new Employee( "Ashley", "Green", 7600M ),
14           new Employee( "Matthew", "Indigo", 3587.5M ),
15           new Employee( "James", "Indigo", 4700.77M ),
16           new Employee( "Luke", "Indigo", 6200M ),
17           new Employee( "Jason", "Blue", 3200M ),
18           new Employee( "Wendy", "Brown", 4236.4M ) }; // end init list
19
20        // display all employees
21        Console.WriteLine( "Original array:" );
22        foreach ( var element in employees )
23           Console.WriteLine( element );
24
25        // filter a range of salaries using && in a LINQ query
26        var between4K6K =
27           from e in employees
28           where e.MonthlySalary >= 4000M && e.MonthlySalary <= 6000M
29           select e;
30
31        // display employees making between 4000 and 6000 per month
32        Console.WriteLine( string.Format(
33           "\nEmployees earning in the range {0:C}-{1:C} per month:",
34           4000, 6000 ) );
35        foreach ( var element in between4K6K )
36           Console.WriteLine( element );
37
38        // order the employees by last name, then first name with LINQ
39        var nameSorted =
40           from e in employees
41           orderby e.LastName, e.FirstName
42           select e;
43
44        // header
45        Console.WriteLine( "\nFirst employee when sorted by name:" );
46
47        // attempt to display the first result of the above LINQ query
48        if ( nameSorted.Any() )
49           Console.WriteLine( nameSorted.First() );
50        else
51           Console.WriteLine( "not found" );
52
53        // use LINQ to select employee last names
54        var lastNames =
55           from e in employees
56           select e.LastName;
57
58        // use method Distinct to select unique last names
59        Console.WriteLine( "\nUnique employee last names:" );
60        foreach ( var element in lastNames.Distinct() )
61           Console.WriteLine( element );
62
63        // use LINQ to select first and last names
64        var names =
65           from e in employees
66           select new { e.FirstName, Last = e.LastName };
67
68        // display full names
69        Console.WriteLine( "\nNames only:" );
70        foreach ( var element in names )
71           Console.WriteLine( element );
72
73        Console.WriteLine();
74     } // end Main
75  } // end class LINQWithArrayOfObjects
Original array:
Jason      Red         $5,000.00
Ashley     Green       $7,600.00
Matthew    Indigo      $3,587.50
James      Indigo      $4,700.77
Luke       Indigo      $6,200.00
Jason      Blue        $3,200.00
Wendy      Brown       $4,236.40

Employees earning in the range $4,000.00-$6,000.00 per month:
Jason      Red         $5,000.00
James      Indigo      $4,700.77
Wendy      Brown       $4,236.40

First employee when sorted by name:
Jason      Blue        $3,200.00

Unique employee last names:
Red
Green
Indigo
Blue
Brown

Names only:
{ FirstName = Jason, Last = Red }
{ FirstName = Ashley, Last = Green }
{ FirstName = Matthew, Last = Indigo }
{ FirstName = James, Last = Indigo }
{ FirstName = Luke, Last = Indigo }
{ FirstName = Jason, Last = Blue }
{ FirstName = Wendy, Last = Brown }
Line 28 of Fig. 9.4 shows a where
clause that accesses the properties of the range variable. In this example, the compiler infers that the range variable is of type Employee
based on its knowledge that employees
was defined as an array of Employee
objects (lines 11–18). Any bool
expression can be used in a where
clause. Line 28 uses the conditional AND (&&
) operator to combine conditions. Here, only employees that have a salary between $4,000 and $6,000 per month, inclusive, are included in the query result, which is displayed in lines 35–36.
Line 41 uses an orderby clause to sort the results according to multiple properties, specified in a comma-separated list. In this query, the employees are sorted alphabetically by last name. Each group of Employees that have the same last name is then sorted within the group by first name.
Line 48 introduces the query result’s Any
method, which returns true
if there’s at least one element, and false
if there are no elements. The query result’s First
method (line 49) returns the first element in the result. You should check that the query result is not empty (line 48) before calling First
.
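The reason for the check is that First throws an InvalidOperationException when the sequence is empty. The sketch below shows the guarded call, the exception, and the FirstOrDefault extension method, which returns the element type's default value instead of throwing. The class name FirstDemo, the method FirstAbove and the sentinel value -1 are our own choices for illustration.

```csharp
using System;
using System.Linq;

public static class FirstDemo
{
   // returns the first value above the threshold, or -1 if there is none
   public static int FirstAbove( int[] values, int threshold )
   {
      var matches =
         from value in values
         where value > threshold
         select value;

      if ( matches.Any() ) // guard: First throws on an empty sequence
         return matches.First();
      else
         return -1;
   } // end method FirstAbove

   public static void Main( string[] args )
   {
      int[] values = { 2, 9, 5, 0 };

      Console.WriteLine( FirstAbove( values, 4 ) ); // displays 9
      Console.WriteLine( FirstAbove( values, 100 ) ); // displays -1

      var none =
         from value in values
         where value > 100
         select value;

      try
      {
         Console.WriteLine( none.First() ); // no elements, so this throws
      } // end try
      catch ( InvalidOperationException )
      {
         Console.WriteLine( "First failed: no elements" );
      } // end catch

      // FirstOrDefault returns default( int ), which is 0
      Console.WriteLine( none.FirstOrDefault() );
   } // end Main
} // end class FirstDemo
```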
We’ve not specified the class that defines methods First
and Any
. Your intuition probably tells you they’re methods declared in the IEnumerable<T>
interface, but they aren’t. They’re actually extension methods, but they can be used as if they were methods of IEnumerable<T>
.
LINQ defines many more extension methods, such as Count
, which returns the number of elements in the results. Rather than using Any
, we could have checked that Count
was nonzero, but it’s more efficient to determine whether there’s at least one element than to count all the elements. The LINQ query syntax is actually transformed by the compiler into extension method calls, with the results of one method call used in the next. It’s this design that allows queries to be run on the results of previous queries, as it simply involves passing the result of a method call to another method.
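To give a feel for that transformation, the sketch below writes one query in both forms. Where and OrderByDescending are extension methods from namespace System.Linq; the arguments such as value => value > 4 are lambda expressions (small unnamed functions), which this chapter has not introduced. Treat the second method as an approximation of what the compiler generates rather than its exact output; the class and method names are ours.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MethodSyntaxDemo
{
   // query written with LINQ query syntax
   public static IEnumerable< int > QuerySyntax( int[] values )
   {
      return
         from value in values
         where value > 4
         orderby value descending
         select value;
   } // end method QuerySyntax

   // roughly equivalent chain of extension method calls; each call
   // receives the result of the previous one
   public static IEnumerable< int > MethodSyntax( int[] values )
   {
      return values
         .Where( value => value > 4 )
         .OrderByDescending( value => value );
   } // end method MethodSyntax

   public static void Main( string[] args )
   {
      int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };

      foreach ( var element in QuerySyntax( values ) )
         Console.Write( "{0} ", element ); // displays 9 8 7 5 5

      Console.WriteLine();

      foreach ( var element in MethodSyntax( values ) )
         Console.Write( "{0} ", element ); // displays 9 8 7 5 5

      Console.WriteLine();
   } // end Main
} // end class MethodSyntaxDemo
```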
Line 56 uses the select
clause to select the range variable’s LastName
property rather than the range variable itself. This causes the results of the query to consist of only the last names (as string
s), instead of complete Employee
objects. Lines 60–61 display the unique last names. The Distinct
extension method (line 60) removes duplicate elements, causing all elements in the result to be unique.
The last LINQ query in the example (lines 65–66) selects the properties FirstName
and LastName
. The syntax
new { e.FirstName, Last = e.LastName }
creates a new object of an anonymous type (a type with no name), which the compiler generates for you based on the properties listed in the curly braces ({}
). In this case, the anonymous type consists of properties for the first and last names of the selected Employee
. The LastName
property is assigned to the property Last
in the select
clause. This shows how you can specify a new name for the selected property. If you don’t specify a new name, the property’s original name is used—this is the case for FirstName
in this example. The preceding query is an example of a projection—it performs a transformation on the data. In this case, the transformation creates new objects containing only the FirstName
and Last
properties. Transformations can also manipulate the data. For example, you could give all employees a 10% raise by multiplying their MonthlySalary
properties by 1.1
.
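A sketch of such a raise follows. To stay self-contained it uses a simplified stand-in class, SimpleEmployee, rather than the Employee class of Fig. 9.3, and it sets the properties with object-initializer syntax; both are assumptions of this example. Note that the multiplier is written 1.1M, because a decimal cannot be multiplied by a double literal such as 1.1 without a cast.

```csharp
using System;
using System.Linq;

// simplified stand-in for the chapter's Employee class
public class SimpleEmployee
{
   public string LastName { get; set; }
   public decimal MonthlySalary { get; set; }
} // end class SimpleEmployee

public static class RaiseProjection
{
   public static void Main( string[] args )
   {
      SimpleEmployee[] employees = {
         new SimpleEmployee { LastName = "Red", MonthlySalary = 5000M },
         new SimpleEmployee { LastName = "Green", MonthlySalary = 7600M } };

      // project each employee to a new anonymous object that holds
      // the last name and the salary after a 10% raise
      var raised =
         from e in employees
         select new { e.LastName, NewSalary = e.MonthlySalary * 1.1M };

      foreach ( var element in raised )
         Console.WriteLine( "{0}: {1}",
            element.LastName, element.NewSalary );

      // the projection created new objects; the originals are unchanged
      Console.WriteLine( employees[ 0 ].MonthlySalary ); // displays 5000
   } // end Main
} // end class RaiseProjection
```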
When creating a new anonymous type, you can select any number of properties by specifying them in a comma-separated list within the curly braces ({}
) that delineate the anonymous type definition. In this example, the compiler automatically creates a new class having properties FirstName
and Last
, and the values are copied from the Employee
objects. These selected properties can then be accessed when iterating over the results. Implicitly typed local variables allow you to use anonymous types because you do not have to explicitly state the type when declaring such variables.
When the compiler creates an anonymous type, it automatically generates a ToString
method that returns a string
representation of the object. You can see this in the program’s output—it consists of the property names and their values, enclosed in braces. Anonymous types are discussed in more detail in Chapter 18.
The .NET Framework Class Library provides several classes, called collections, used to store groups of related objects. These classes provide efficient methods that organize, store and retrieve your data without requiring knowledge of how the data is being stored. This reduces application-development time.
You’ve used arrays to store sequences of objects. Arrays do not automatically change their size at execution time to accommodate additional elements—you must do so manually by creating a new array or by using the Array
class’s Resize
method.
The collection class List<T>
(from namespace System.Collections.Generic
) provides a convenient solution to this problem. The T
is a placeholder—when declaring a new List
, replace it with the type of elements that you want the List
to hold. This is similar to specifying the type when declaring an array. For example,
List< int > list1;
declares list1
as a List
collection that can store only int
values, and
List< string > list2;
declares list2
as a List
of string
s. Classes with this kind of placeholder that can be used with any type are called generic classes. Generic classes and additional generic collection classes are discussed in Chapters 22 and 23, respectively. Figure 23.2 provides a table of collection classes. Figure 9.5 shows some common methods and properties of class List<T>
.
Table 9.5. Some methods and properties of class List<T>.
Method or property | Description |
---|---|
Add | Adds an element to the end of the List. |
Capacity | Property that gets or sets the number of elements a List can store without resizing. |
Clear | Removes all the elements from the List. |
Contains | Returns true if the List contains the specified element and false otherwise. |
Count | Property that returns the number of elements stored in the List. |
IndexOf | Returns the index of the first occurrence of the specified value in the List. |
Insert | Inserts an element at the specified index. |
Remove | Removes the first occurrence of the specified value. |
RemoveAt | Removes the element at the specified index. |
RemoveRange | Removes a specified number of elements starting at a specified index. |
Sort | Sorts the List. |
TrimExcess | Sets the Capacity of the List to the number of elements the List currently contains (Count). |
Figure 9.6 demonstrates dynamically resizing a List
object. The Add
and Insert
methods add elements to the List
(lines 13–14). The Add
method appends its argument to the end of the List
. The Insert
method inserts a new element at the specified position. The first argument is an index—as with arrays, collection indices start at zero. The second argument is the value that’s to be inserted at the specified index. All elements at the specified index and above are shifted up by one position. This is usually slower than adding an element to the end of the List
.
Example 9.6. Generic List<T> collection demonstration.
 1  // Fig. 9.6: ListCollection.cs
 2  // Generic List collection demonstration.
 3  using System;
 4  using System.Collections.Generic;
 5
 6  public class ListCollection
 7  {
 8     public static void Main( string[] args )
 9     {
10        // create a new List of strings
11        List< string > items = new List< string >();
12
13        items.Add( "red" ); // append an item to the List
14        items.Insert( 0, "yellow" ); // insert the value at index 0
15
16        // display the colors in the list
17        Console.Write(
18           "Display list contents with counter-controlled loop:" );
19        for ( int i = 0; i < items.Count; i++ )
20           Console.Write( " {0}", items[ i ] );
21
22        // display colors using foreach
23        Console.Write(
24           "\nDisplay list contents with foreach statement:" );
25        foreach ( var item in items )
26           Console.Write( " {0}", item );
27
28        items.Add( "green" ); // add "green" to the end of the List
29        items.Add( "yellow" ); // add "yellow" to the end of the List
30
31        // display the List
32        Console.Write( "\nList with two new elements:" );
33        foreach ( var item in items )
34           Console.Write( " {0}", item );
35
36        items.Remove( "yellow" ); // remove the first "yellow"
37
38        // display the List
39        Console.Write( "\nRemove first instance of yellow:" );
40        foreach ( var item in items )
41           Console.Write( " {0}", item );
42
43        items.RemoveAt( 1 ); // remove item at index 1
44
45        // display the List
46        Console.Write( "\nRemove second list element (green):" );
47        foreach ( var item in items )
48           Console.Write( " {0}", item );
49
50        // check if a value is in the List
51        Console.WriteLine( "\n\"red\" is {0}in the list",
52           items.Contains( "red" ) ? string.Empty : "not " );
53
54        // display number of elements in the List
55        Console.WriteLine( "Count: {0}", items.Count );
56
57        // display the capacity of the List
58        Console.WriteLine( "Capacity: {0}", items.Capacity );
59     } // end Main
60  } // end class ListCollection
Display list contents with counter-controlled loop: yellow red
Display list contents with foreach statement: yellow red
List with two new elements: yellow red green yellow
Remove first instance of yellow: red green yellow
Remove second list element (green): red yellow
"red" is in the list
Count: 2
Capacity: 4
Lines 19–20 display the items in the List
. The Count
property returns the number of elements currently in the List
. List
s can be indexed like arrays by placing the index in square brackets after the List
variable’s name. The indexed List
expression can be used to modify the element at the index. Lines 25–26 output the List
by using a foreach
statement. More elements are then added to the List
, and it’s displayed again (lines 28–34).
The Remove
method is used to remove the first element with a specific value (line 36). If no such element is in the List
, Remove
does nothing. A similar method, RemoveAt
, removes the element at the specified index (line 43). When an element is removed through either of these methods, all elements above that index are shifted down by one—the opposite of the Insert
method.
Line 52 uses the Contains
method to check if an item is in the List
. The Contains
method returns true
if the element is found in the List
, and false
otherwise. The method compares its argument to each element of the List
in order until the item is found, so using Contains
on a large List
is inefficient.
Lines 55 and 58 display the List
’s Count
and Capacity
. Recall that the Count
property (line 55) indicates the number of items in the List
. The Capacity
property (line 58) indicates how many items the List
can hold without growing. List
is implemented using an array behind the scenes. When the List
grows, it must create a larger internal array and copy each element to the new array. This is a time-consuming operation. It would be inefficient for the List
to grow each time an element is added. Instead, the List
grows only when an element is added and the Count
and Capacity
properties are equal—there’s no space for the new element.
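The relationship between Count and Capacity can be observed with a short experiment. The sketch below (class and method names are ours) adds elements one at a time and reports each moment at which Capacity changes. The chapter does not say by how much a List grows, and the exact capacities printed depend on the .NET implementation, so no specific values are assumed here.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
   // adds the given number of elements one at a time and returns
   // how many times the List's Capacity changed along the way
   public static int CountGrowths( int elements )
   {
      List< int > list = new List< int >();
      int growths = 0;
      int lastCapacity = list.Capacity;

      for ( int i = 0; i < elements; i++ )
      {
         list.Add( i );

         if ( list.Capacity != lastCapacity ) // internal array replaced
         {
            growths++;
            lastCapacity = list.Capacity;
            Console.WriteLine( "Count: {0}, Capacity: {1}",
               list.Count, list.Capacity );
         } // end if
      } // end for

      return growths;
   } // end method CountGrowths

   public static void Main( string[] args )
   {
      int growths = CountGrowths( 100 );
      Console.WriteLine( "100 calls to Add caused {0} resizes", growths );
   } // end Main
} // end class CapacityDemo
```

If the final size is known in advance, passing it to the List constructor that takes an initial capacity, as in new List< int >( 100 ), avoids the intermediate copies.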
You can use LINQ to Objects to query Lists just as you query arrays. In Fig. 9.7, the strings in a List are converted to uppercase, and the query selects those that begin with “R”.
Example 9.7. LINQ to Objects using a List<string>.
 1  // Fig. 9.7: LINQWithListCollection.cs
 2  // LINQ to Objects using a List< string >.
 3  using System;
 4  using System.Linq;
 5  using System.Collections.Generic;
 6
 7  public class LINQWithListCollection
 8  {
 9     public static void Main( string[] args )
10     {
11        // populate a List of strings
12        List< string > items = new List< string >();
13        items.Add( "aQua" ); // add "aQua" to the end of the List
14        items.Add( "RusT" ); // add "RusT" to the end of the List
15        items.Add( "yElLow" ); // add "yElLow" to the end of the List
16        items.Add( "rEd" ); // add "rEd" to the end of the List
17
18        // convert all strings to uppercase; select those starting with "R"
19        var startsWithR =
20           from item in items
21           let uppercaseString = item.ToUpper()
22           where uppercaseString.StartsWith( "R" )
23           orderby uppercaseString
24           select uppercaseString;
25
26        // display query results
27        foreach ( var item in startsWithR )
28           Console.Write( "{0} ", item );
29
30        Console.WriteLine(); // output end of line
31
32        items.Add( "rUbY" ); // add "rUbY" to the end of the List
33        items.Add( "SaFfRon" ); // add "SaFfRon" to the end of the List
34
35        // display updated query results
36        foreach ( var item in startsWithR )
37           Console.Write( "{0} ", item );
38
39        Console.WriteLine(); // output end of line
40     } // end Main
41  } // end class LINQWithListCollection
Line 21 uses LINQ’s let clause to create a new range variable. This is useful if you need to store a temporary result for use later in the LINQ query. Typically, let declares a new range variable to which you assign the result of an expression that operates on the query’s original range variable. In this case, we use string method ToUpper to convert each item to uppercase, then store the result in the new range variable uppercaseString. We then use uppercaseString in the where, orderby and select clauses. The where clause (line 22) uses string method StartsWith to determine whether uppercaseString starts with the string "R". Method StartsWith performs a case-sensitive comparison to determine whether a string starts with the string received as an argument. If uppercaseString starts with "R", method StartsWith returns true, and the element is included in the query results. More powerful string matching can be done using the regular-expression capabilities introduced in Chapter 16, Strings and Characters.
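To see why the let clause matters here, consider the minimal sketch below. The class LetClauseSketch and the helper UppercaseStartingWith are hypothetical names used only for illustration. The sketch also passes StringComparison.Ordinal to StartsWith to make the case-sensitive comparison explicit; that argument is our assumption and does not appear in Fig. 9.7.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LetClauseSketch
{
    // Hypothetical helper: returns the uppercase form of each word whose
    // uppercase form starts with `prefix`, sorted in ascending order.
    public static List<string> UppercaseStartingWith(
        IEnumerable<string> words, string prefix)
    {
        var query =
            from word in words
            let upper = word.ToUpper() // computed once, reused below
            where upper.StartsWith(prefix, StringComparison.Ordinal)
            orderby upper
            select upper;

        return query.ToList();
    }

    public static void Main()
    {
        // a case-sensitive comparison does not match "rEd" against "R"
        Console.WriteLine("rEd".StartsWith("R", StringComparison.Ordinal));

        // after conversion to uppercase, both "rEd" and "RusT" match
        var colors = new List<string> { "aQua", "RusT", "yElLow", "rEd" };

        foreach (var color in UppercaseStartingWith(colors, "R"))
            Console.Write("{0} ", color);

        Console.WriteLine();
    }
}
```

Without let, the query would have to call ToUpper separately in the where, orderby and select clauses.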
The query is created only once (lines 20–24), yet iterating over the results (lines 27–28 and 36–37) gives two different lists of colors. This demonstrates LINQ’s deferred execution: the query executes only when you access the results (for example, by iterating over them or calling the Count method), not when you define the query. This allows you to create a query once and execute it many times. Any changes to the data source are reflected in the results each time the query executes.
There may be times when you do not want this behavior, and want to retrieve a collection of the results immediately. LINQ provides extension methods ToArray and ToList for this purpose. These methods execute the query on which they’re called and give you the results as an array or List<T>, respectively. These methods can also improve efficiency if you’ll be iterating over the results multiple times, as you execute the query only once.
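The difference between a deferred query and a collection produced by ToList can be sketched as follows. The class DeferredExecutionSketch and its BuildQuery helper are hypothetical names for this illustration; the query itself mirrors the one in Fig. 9.7.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Hypothetical helper: defines (but does not execute) a query over items.
    public static IEnumerable<string> BuildQuery(List<string> items)
    {
        return
            from item in items
            let uppercaseString = item.ToUpper()
            where uppercaseString.StartsWith("R")
            orderby uppercaseString
            select uppercaseString;
    }

    public static void Main()
    {
        var items = new List<string> { "aQua", "RusT", "yElLow", "rEd" };
        IEnumerable<string> query = BuildQuery(items);

        // ToList executes the query now and stores the results
        List<string> snapshot = query.ToList();

        items.Add("rUbY"); // modify the data source afterward

        // iterating the query executes it again, so it sees "rUbY"
        Console.WriteLine("Deferred: {0}", string.Join(" ", query));

        // the stored List is unaffected by the later Add
        Console.WriteLine("Snapshot: {0}", string.Join(" ", snapshot));
    }
}
```

Which form you want depends on whether later changes to the data source should appear in the results.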
C# has a feature called collection initializers, which provide a convenient syntax (similar to array initializers) for initializing a collection. For example, lines 12–16 of Fig. 9.7 could be replaced with the following statement:
List< string > items = new List< string > { "aQua", "RusT", "yElLow", "rEd" };
This chapter introduced LINQ (Language Integrated Query), a powerful feature for querying data. We showed how to filter an array or collection using LINQ’s where clause, and how to sort the query results using the orderby clause. We used the select clause to select specific properties of an object, and the let clause to introduce a new range variable to make writing queries more convenient. The StartsWith method of class string was used to filter strings starting with a specified character or series of characters. We used several LINQ extension methods to perform operations not provided by the query syntax: the Distinct method to remove duplicates from the results, the Any method to determine if the results contain any items, and the First method to retrieve the first element in the results.
We introduced the List<T> generic collection, which provides all the functionality of arrays, along with other useful capabilities such as dynamic resizing. We used method Add to append new items to the end of the List, method Insert to insert new items into specified locations in the List, method Remove to remove the first occurrence of a specified item, method RemoveAt to remove an item at a specified index and method Contains to determine if an item was in the List. We used property Count to get the number of items in the List, and property Capacity to determine the size the List can grow to without reallocating the internal array. In Chapter 10 we take a deeper look at classes and objects.
We use more advanced features of LINQ in later chapters. We’ve also created a LINQ Resource Center (www.deitel.com/LINQ/) that contains many links to additional information, including blogs by Microsoft LINQ team members, books, sample chapters, FAQs, tutorials, videos, webcasts and more. We encourage you to browse the LINQ Resource Center to learn more about this powerful technology.
.NET’s collection classes provide reusable data structures that are reliable, powerful and efficient.
Lists automatically increase their size to accommodate additional elements.
Large amounts of data are often stored in a database—an organized collection of data. Today’s most popular database systems are relational databases. SQL is the international standard language used almost universally with relational databases to perform queries (i.e., to request information that satisfies given criteria).
LINQ allows you to write query expressions (similar to SQL queries) that retrieve information from a wide variety of data sources. You can query arrays and Lists, selecting elements that satisfy a set of conditions; this is known as filtering.
A LINQ provider is a set of classes that implement LINQ operations and enable programs to interact with data sources to perform tasks such as sorting, grouping and filtering elements.
Repetition statements focus on the process of iterating through elements and checking whether they satisfy the desired criteria. LINQ specifies the conditions that selected elements must satisfy, not the steps necessary to get the results.
The System.Linq namespace contains the classes for LINQ to Objects.
A from clause specifies a range variable and the data source to query. The range variable represents each item in the data source (one at a time), much like the control variable in a foreach statement.
If the condition in the where clause evaluates to true for an element, that element is included in the results.
The select clause determines what value appears in the results.
A C# interface describes a set of methods and properties that can be used to interact with an object.
The IEnumerable<T> interface describes the functionality of any object that’s capable of being iterated over and thus offers methods to access each element in some order.
A class that implements an interface must define each method in the interface.
Arrays and collections implement the IEnumerable<T> interface.
A foreach statement can iterate over any object that implements the IEnumerable<T> interface.
A LINQ query returns an object that implements the IEnumerable<T> interface.
The orderby clause sorts query results in ascending order by default. Results can also be sorted in descending order using the descending modifier.
C# provides implicitly typed local variables, which enable the compiler to infer a local variable’s type based on the variable’s initializer.
To distinguish such an initialization from a simple assignment statement, the var keyword is used in place of the variable’s type.
You can use local type inference with control variables in the header of a for or foreach statement.
Implicitly typed local variables can be used to initialize arrays without explicitly giving their type. To do so, use new[] to specify that the variable is an array.
LINQ can be used with collections of most data types.
Any bool expression can be used in a where clause.
An orderby clause can sort the results according to multiple properties specified in a comma-separated list.
Method Any returns true if there’s at least one element in the result; otherwise, it returns false.
The First method returns the first element in the query result. You should check that the query result is not empty before calling First, because First throws an InvalidOperationException when the result contains no elements.
The Count method returns the number of elements in the query result.
The Distinct method removes duplicate values from query results.
You can select any number of properties in a select clause by specifying them in a comma-separated list in braces after the new keyword. The compiler automatically creates a new class, called an anonymous type, that has these properties.
The .NET collection classes provide efficient methods that organize, store and retrieve data without requiring knowledge of how the data is being stored.
Class List<T> is similar to an array but provides richer functionality, such as dynamic resizing.
The Add method appends an element to the end of a List.
The Insert method inserts a new element at a specified position in the List.
The Count property returns the number of elements currently in a List.
Lists can be indexed like arrays by placing the index in square brackets after the List object’s name.
The Remove method removes the first element with a specific value.
The RemoveAt method removes the element at the specified index.
The Contains method returns true if the element is found in the List, and false otherwise.
The Capacity property indicates how many items a List can hold without growing.
LINQ to Objects can query Lists.
LINQ’s let clause creates a new range variable. This is useful if you need to store a temporary result for use later in the LINQ query.
The StartsWith method of the string class determines whether a string starts with the string passed to it as an argument.
A LINQ query uses deferred execution—it executes only when you access the results, not when you create the query.
Add method of class List<T>
Contains method of class List<T>
Count extension method for IEnumerable<T>
Count property of class List<T>
First extension method for IEnumerable<T>
from clause of a LINQ query
Insert method of class List<T>
let clause of a LINQ query
List<T> collection class
orderby clause of a LINQ query
query using LINQ
Remove method of class List<T>
RemoveAt method of class List<T>
select clause of a LINQ query
ToUpper method of class string
where clause of a LINQ query
9.3 (Querying an Array of
Table 9.8. Sample data for Exercise 9.3.
9.4 (Duplicate Word Removal) Write a console application that inputs a sentence from the user (assume no punctuation), then determines and displays the nonduplicate words in alphabetical order. Treat uppercase and lowercase letters the same. [Hint: You can use
9.5 (Sorting Letters and Removing Duplicates) Write a console application that inserts 30 random letters into a