Chapter 2. Querying with DbContext

There are two things that almost every application that accesses a database has in common: the need to retrieve data from the database and to save changes to that data back into the database. Over the next two chapters you will see how the DbContext API makes it easy to achieve these tasks using the Entity Framework. The focus of this chapter will be on retrieving data from the database.

One of the great benefits of using an Object Relational Mapper (ORM), such as Entity Framework, is that once we have set up the mapping, we can interact with our data in terms of the objects and properties that make up our model, rather than tables and columns. When querying for objects, this means we no longer need to know how to write queries using the SQL syntax of our database.

Writing Queries with LINQ to Entities

Entity Framework queries are written using a .NET Framework feature known as Language Integrated Query, or LINQ for short. As the name suggests, LINQ is tightly integrated with the .NET programming experience and provides a strongly typed query language over your model. Strongly typed simply means that the query is defined using the classes and properties that make up your model. This provides a number of benefits such as compile-time checks to ensure your queries are valid and the ability to provide IntelliSense as you write your queries.

LINQ is a general query framework and isn’t specific to Entity Framework, or even databases for that matter. A LINQ Provider is responsible for taking your LINQ query, translating it into a query against the data, and then returning results. For Entity Framework this provider is known as LINQ to Entities and is responsible for taking your LINQ query and translating it into a SQL query against the database you are targeting. The information you supplied to Entity Framework about the shape of your model and how it maps to the database is used to perform this translation. Once the query returns, Entity Framework is responsible for copying the data into instances of the classes that make up your model.

The capabilities of LINQ and its use within Entity Framework are beyond the scope of this book. This chapter will provide an overview to help you get up and running with queries using DbContext, but is not an exhaustive query guide. Programming Entity Framework, 2e, provides a much more in-depth look at the query capabilities of Entity Framework, not only in Chapter 3 and Chapter 4, which are dedicated to querying, but throughout the book.

Note

In addition to LINQ, Entity Framework also supports a text-based query language known as Entity SQL, or ESQL for short. ESQL is typically used in more advanced scenarios where queries need to be dynamically constructed at runtime. Because ESQL is text-based, it is also useful in scenarios where the application needs to build a query against a model that isn’t known until runtime. Given that ESQL is less commonly used, it is not exposed directly on the DbContext API. If your application requires the use of ESQL, you will need to access the ObjectContext API using the IObjectContextAdapter interface.

To follow along with the examples in this book you will need a Visual Studio solution containing a console application that references the BAGA model built in Programming Entity Framework: Code First. You can download a prebuilt solution from http://learnentityframework.com/downloads. This prebuilt solution also includes a database initializer that will reset the database and insert some seed data into the database each time you run the application. The seed data is used in the examples throughout this book.

The Model project of the prebuilt solution contains classes that make up the BAGA domain model. The BAGA model includes a Destination class (Example 2-1) that represents all the wonderful places that our intrepid travelers can venture to.

Example 2-1. Destination class as listed in download solution
[Table("Locations", Schema = "baga")]
public class Destination
{
  public Destination()
  {
    this.Lodgings = new List<Lodging>();
  }

  [Column("LocationID")]
  public int DestinationId { get; set; }
  [Required, Column("LocationName")]
  [MaxLength(200)]
  public string Name { get; set; }
  public string Country { get; set; }
  [MaxLength(500)]
  public string Description { get; set; }
  [Column(TypeName = "image")]
  public byte[] Photo { get; set; }
  public string TravelWarnings { get; set; }
  public string ClimateInfo { get; set; }

  public List<Lodging> Lodgings { get; set; }
}

The BAGA model also includes a Lodging class (Example 2-2) that represents the accommodation that is available at the various Destinations.

Example 2-2. Lodging class as listed in download solution
public class Lodging
{
  public int LodgingId { get; set; }
  [Required]
  [MaxLength(200)]
  [MinLength(10)]
  public string Name { get; set; }
  public string Owner { get; set; }
  public decimal MilesFromNearestAirport { get; set; }

  [Column("destination_id")]
  public int DestinationId { get; set; }
  public Destination Destination { get; set; }
  public List<InternetSpecial> InternetSpecials { get; set; }
  public Nullable<int> PrimaryContactId { get; set; }
  [InverseProperty("PrimaryContactFor")]
  [ForeignKey("PrimaryContactId")]
  public Person PrimaryContact { get; set; }
  public Nullable<int> SecondaryContactId { get; set; }
  [InverseProperty("SecondaryContactFor")]
  [ForeignKey("SecondaryContactId")]
  public Person SecondaryContact { get; set; }
}

The Destination and Lodging classes will be used extensively for the examples throughout this book. To perform data access using these classes you will be using the BreakAwayContext from the DataAccess project. The project contains additional classes that are represented in BreakAwayContext as well as the Lodgings and Destinations. We’ll be using Code First for the examples in this book, but the techniques you will learn apply to any context that derives from DbContext. This includes contexts created using the Model First or Database First workflows.

Example 2-3. BreakAwayContext class as listed in download solution
public class BreakAwayContext : DbContext
{
  public DbSet<Destination> Destinations { get; set; }
  public DbSet<Lodging> Lodgings { get; set; }
  public DbSet<Trip> Trips { get; set; }
  public DbSet<Person> People { get; set; }
  public DbSet<Reservation> Reservations { get; set; }
  public DbSet<Payment> Payments { get; set; }
  public DbSet<Activity> Activities { get; set; }
}

Querying All the Data from a Set

Arguably the simplest query you can write is one that fetches all the data for a given entity type. This is the equivalent of a SELECT * FROM mytable query in SQL. Fortunately you don’t need to know SQL, because Entity Framework will take care of translating LINQ queries into SQL for you.

Getting all the data from a set doesn’t require you to really write a query. You can simply iterate over the contents of any given DbSet and Entity Framework will send a query to the database to find all the data in that set. Let’s add a PrintAllDestinations method to our console application that iterates over the Destinations set defined in our BreakAwayContext and prints out the name of each Destination (Example 2-4).

Example 2-4. Query for all destinations
private static void PrintAllDestinations()
{
  using (var context = new BreakAwayContext())
  {
    foreach (var destination in context.Destinations)
    {
      Console.WriteLine(destination.Name);
    }
  }
}

Note

When you debug the application, the console window will close when the application has finished executing, which may prevent you from inspecting the output. You can put a breakpoint at the end of the method for debugging. Alternatively, you can run without debugging (CTRL + F5), in which case Visual Studio will ensure that the console window remains open after the program has finished executing.

If you update the Main method to call this new PrintAllDestinations method and run the application, you will see that the name of each Destination in the database is printed to the console:

Grand Canyon
Hawaii
Wine Glass Bay
Great Barrier Reef

As the code began iterating over the contents of the Destinations set, Entity Framework issued a SQL query against the database to load the required data:

SELECT
[Extent1].[LocationID] AS [LocationID],
[Extent1].[LocationName] AS [LocationName],
[Extent1].[Country] AS [Country],
[Extent1].[Description] AS [Description],
[Extent1].[Photo] AS [Photo]
FROM [baga].[Locations] AS [Extent1]

The SQL may not look like the SQL you would have written. This is because Entity Framework has a generic query building algorithm that not only caters to this very simple query, but also for much more complex scenarios.

The query is sent to the database when the first result is requested by the application: that’s during the first iteration of the foreach loop. Entity Framework doesn’t pull back all the data at once, though. The query remains active and the results are read from the database as they are needed. By the time the foreach loop is completed, all the results have been read from the database.

One important thing to note is that Entity Framework will query the database every time you trigger an iteration over the contents of a DbSet. This has performance implications if you are continually querying the database for the same data. To avoid this, you can use a LINQ operator such as ToList to copy the results into a list. You can then iterate over the contents of this list multiple times without causing multiple trips to the database. Example 2-5 introduces a PrintAllDestinationsTwice method that demonstrates this approach.

Example 2-5. Iterating all Destinations twice with one database query
private static void PrintAllDestinationsTwice()
{
  using (var context = new BreakAwayContext())
  {
    var allDestinations = context.Destinations.ToList();

    foreach (var destination in allDestinations)
    {
      Console.WriteLine(destination.Name);
    }

    foreach (var destination in allDestinations)
    {
      Console.WriteLine(destination.Name);
    }
  }
}

Because a query is sent to the database to find the items in a DbSet, iterating a DbSet will only contain items that exist in the database. Any objects that are sitting in memory waiting to be saved to the database will not be returned. To ensure added objects are included you can use the techniques described in Querying Local Data.

Using LINQ for Sorting, Filtering, and More

While this chapter will not be an exhaustive list of everything you can do with LINQ and Entity Framework, let’s take a look at the patterns used to achieve some common query tasks. Let’s say you want to print out the names of Destinations again, but this time you want them ordered alphabetically by Name. Add a new PrintAllDestinationsSorted method that uses a LINQ query to perform this sort (Example 2-6).

Example 2-6. Query for destinations sorted by name
private static void PrintAllDestinationsSorted()
{
  using (var context = new BreakAwayContext())
  {
    var query = from d in context.Destinations
                orderby d.Name
                select d;

    foreach (var destination in query)
    {
      Console.WriteLine(destination.Name);
    }
  }
}

The above code uses LINQ to create a query and then iterates the results of the query and displays the name of each destination. The query is expressed using a syntax that looks a little bit like SQL. You start by telling it what you want to select from (in our case, the Destinations set on our context). You give the set a name so that you can refer to it throughout the rest of the query (in our case that name is d). Following this, you use operators such as orderby, groupby, and where to define the query. Finally you specify what you want returned using the select operator. In our case we want the actual Destination objects returned, so we specify the name that we gave the set in the first line.

Remember that Entity Framework won’t execute the query against the database until it needs the first result. During the first iteration of the foreach loop, the query is sent to the database. The query remains active and each result is read from the database as it is needed by the application. LINQ also includes methods that will copy the results of a query into a collection. For example, ToList can be called on a query to copy the results into a new List<T>. Calling a method such as this will cause all the results to be retrieved from the database and be copied into the new List<T>.

The code shown in Example 2-6 uses the LINQ query syntax to express the query. While most people find this the easiest to understand, there is an alternate method syntax that can be used if you prefer. Example 2-7 shows the same query expressed using method syntax.

Example 2-7. LINQ method syntax in C#
var query = context.Destinations
  .OrderBy(d => d.Name);

The method syntax makes use of lambda expressions to define the query. The LINQ methods are strongly typed, which gives you IntelliSense and compile-time checking for the lambda expressions you write. For example, in the OrderBy method we are using a lambda expression to specify that we want to order by the Name property. You start a lambda expression by giving a name to the thing you are operating on; this forms the left side of the expression. In our case we are operating on a Destination and we have chosen to call it d. Then, on the right side of the expression, you specify the body of the expression. In our case we just want to identify the Name property.

C# uses the lambda sign (=>) to separate the left and right sides of the expression. VB.NET uses the Function keyword followed by brackets to identify the left side of the expression. Example 2-8 shows the same query written in VB.NET using the method syntax.

Example 2-8. LINQ method syntax in VB.NET
context.Destinations.OrderBy(Function(d) d.Name)

Another common task is to filter the results of a query. For example, we may only want Destinations from Australia. Add the PrintAustralianDestinations method shown in Example 2-9.

Example 2-9. Query for Australian destinations
private static void PrintAustralianDestinations()
{
  using (var context = new BreakAwayContext())
  {
    var query = from d in context.Destinations
                where d.Country == "Australia"
                select d;

    foreach (var destination in query)
    {
      Console.WriteLine(destination.Name);
    }
  }
}

This code looks very similar to the PrintAllDestinationsSorted we saw in Example 2-6, except we are using the where operator instead of orderby. You can also combine these operators. Example 2-10 shows how to query for Australian Destinations sorted by name.

Example 2-10. Query combining filter and sort
var query = from d in context.Destinations
            where d.Country == "Australia"
            orderby d.Name
            select d;

Operators can also be combined in the method syntax. The same query from Example 2-10 is shown using method syntax in Example 2-11.

Example 2-11. Method syntax for combining filter and sort
var query = context.Destinations
  .Where(d => d.Country == "Australia")
  .OrderBy(d => d.Name);

So far our queries have returned collections of entities from our model, but this may not always be the case. In fact, we have been returning complete Destination objects when we really only need the name. You can use projection to create a query that selects from a set of entities in your model but returns results that are of a different type. For example, you can use projection to create a query that selects from a set of entities type but only returns a subset of the properties of that entity. It’s called projection because you are projecting data from the shape of the source that you are selecting from onto the shape of the result set you want.

In our case we want to project a query about Destinations into a result set that just has a string representing the destination’s name. Example 2-12 adds a PrintDestinationNameOnly method that shows how we use the select section of our query to specify what we want the result set to contain.

Example 2-12. Querying for just the Destination name
private static void PrintDestinationNameOnly()
{
  using (var context = new BreakAwayContext())
  {
    var query = from d in context.Destinations
                where d.Country == "Australia"
                orderby d.Name
                select d.Name;

    foreach (var name in query)
    {
      Console.WriteLine(name);
    }
  }
}

Example 2-13 shows how this same query can be written using method syntax by making use of the Select method.

Example 2-13. Method syntax for projection
var query = context.Destinations
  .Where(d => d.Country == "Australia")
  .OrderBy(d => d.Name)
  .Select(d => d.Name);

LINQ is a powerful query language and this section has just grazed the surface of its capabilities. Programming Entity Framework, 2e, contains a much deeper look into using LINQ with the Entity Framework. There are also more example queries available in the Entity Framework MSDN documentation: http://msdn.microsoft.com/en-us/library/bb399367.aspx.

Finding a Single Object

So far you’ve seen queries that return a collection of entities, but sometimes you will want to run a query that just returns a single object. The most common scenario for querying for a single object is to find the object with a given key. The DbContext API makes this very simple by exposing a Find method on DbSet. Find accepts the value to be searched for and will return the corresponding object if it is found. If there is no entity with the provided key, Find will return null.

One of the great things about Find is that it doesn’t unnecessarily query the database. It’s also capable of finding newly added objects that haven’t yet been saved to the database. Find uses a simple set of rules to locate the object (in order of precedence):

  1. Look in memory for an existing entity that has been loaded from the database or attached to the context (you’ll learn more about attaching objects in Chapter 4).

  2. Look at added objects that have not yet been saved to the database.

  3. Look in the database for entities that have not yet been loaded into memory.

To see this behavior, add the FindDestination method shown in Example 2-14. This method accepts an ID from the user and then attempts to locate the Destination with the specified ID.

Example 2-14. Using Find to locate a Destination
private static void FindDestination()
{
  Console.Write("Enter id of Destination to find: ");
  var id = int.Parse(Console.ReadLine());
  using (var context = new BreakAwayContext())
  {
    var destination = context.Destinations.Find(id);
    if (destination == null)
    {
      Console.WriteLine("Destination not found!");
    }
    else
    {
      Console.WriteLine(destination.Name);
    }
  }
}

The code above uses the Find method to look up the Destination with the specified ID. If one is found, it prints out the name of the destination. If Find returns null, indicating there is no Destination with the specified ID, an error message is displayed to the user.

There may be times when you want to query for a single object but are not able to use Find. These could include wanting to query by something other than the key or wanting to include related data in the query (as described in Eager Loading). To do this, you will need to create a standard LINQ query and then use the Single method to get a single object as the result.

Let’s say we want to locate the Destination that has the name Great Barrier Reef. Name isn’t the key of Destination but we know there is, and only ever will be, one Great Barrier Reef. Example 2-10 introduces a FindGreatBarrierReef method that will locate this single Destination.

Example 2-15. Query for single entity based on name
private static void FindGreatBarrierReef()
{
  using (var context = new BreakAwayContext())
  {
    var query = from d in context.Destinations
                where d.Name == "Great Barrier Reef"
                select d;

    var reef = query.Single();

    Console.WriteLine(reef.Description);
  }
}

The LINQ query looks the same as any other query that filters based on name. We then use the Single method to let Entity Framework know that we expect a single result. If the query returns no results, or more than one result, an exception will be thrown. If there are potentially no matches, you can use the SingleOrDefault method, which will return null if no results are found. Example 2-16 shows the FindGreatBarrierReef method updated to account for the fact it may not exist in the database.

Example 2-16. Query for single entity that may not exist
private static void FindGreatBarrierReef()
{
  using (var context = new BreakAwayContext())
  {
    var query = from d in context.Destinations
                where d.Name == "Great Barrier Reef"
                select d;

    var reef = query.SingleOrDefault();

    if (reef == null)
    {
      Console.WriteLine("Can't find the reef!");
    }
    else
    {
      Console.WriteLine(reef.Description);
    }
  }
}

SingleOrDefault uses the same database query that Find uses when it looks for entities in the database. The SQL selects the TOP two results so that it can ensure there is only one match:

SELECT TOP (2)
  [Extent1].[LocationID] AS [LocationID],
  [Extent1].[LocationName] AS [LocationName],
  [Extent1].[Country] AS [Country],
  [Extent1].[Description] AS [Description],
  [Extent1].[Photo] AS [Photo],
  [Extent1].[TravelWarnings] AS [TravelWarnings],
  [Extent1].[ClimateInfo] AS [ClimateInfo]
FROM [baga].[Locations] AS [Extent1]
WHERE N'Great Barrier Reef' = [Extent1].[LocationName]

If two rows are found, Single and SingleOrDefault will throw because there is not a single result. If you just want the first result, and aren’t concerned if there is more than one result, you can use First or FirstOrDefault.

One important thing to remember is that LINQ queries against a DbSet always send a query to the database to find the data. So, if the Great Barrier Reef was a newly added Destination that hadn’t been saved to the database yet, the queries in Example 2-15 and Example 2-16 won’t be able to locate it. To look for newly added entities, you would also need to query the in-memory data using the techniques shown in Querying Local Data.

Querying Local Data

So far you’ve used LINQ to query a DbSet directly, which always results in a SQL query being sent to the database to load the data. You’ve also used the Find method, which will look for in-memory data before querying that database. Find will only query based on the key property though, and there may be times when you want to use a more complex query against data that is already in memory and being tracked by your DbContext.

One of the reasons you may want to do this is to avoid sending multiple queries to the database when you know that all the data you need is already loaded into memory. Back in Example 2-5, we saw one way to do this was to use ToList to copy the results of a query into a list. While this works well if we are using the data within the same block of code, things get a little messy if we need to start passing that list around our application. For example, we might want to load all Destinations from the database when our application loads. Different areas of our application are then going to want to run different queries against that data. In some places we might want to display all Destinations, in others we might want to sort by Name, and in others we might want to filter by Country. Rather than passing around a list of Destination objects, we can take advantage of the fact that our context is tracking all the instances and query its local data.

Another reason may be that you want the results to include newly added data, which doesn’t yet exist in the database. Using ToList on a LINQ query against a DbSet will always send a query to the database. This means that any new objects that don’t yet exist in the database won’t be included in the results. Local queries, however, will include newly created objects in the results.

The in-memory data for a DbSet is available via the Local property. Local will return all the data that has been loaded from the database plus any newly added data. Any data that has been marked as deleted but hasn’t been deleted from the database yet will be filtered out for you. More information on how entities get into these different states is available in Chapter 3.

Let’s start with the very simple task of finding out how many Destinations are in memory and available to be queried. Go ahead and add the GetLocalDestinationCount method, as shown in Example 2-17.

Example 2-17. Checking how many Destinations are in-memory
private static void GetLocalDestinationCount()
{
  using (var context = new BreakAwayContext())
  {
    var count = context.Destinations.Local.Count;
    Console.WriteLine("Destinations in memory: {0}", count);
  }
}

The code accesses the Local property of the Destinations set that we created on our BreakAwayContext. Rather than running a query, we simply store the count in a variable and then print it to the console. If you run the application you will see that the count is zero:

Destinations in memory: 0

We’re getting a zero count because we haven’t run any queries to load Destinations from the database, and we haven’t added any new Destination objects either. Let’s update the GetLocalDestinationCount method to query some data from the database before getting the local count (Example 2-18).

Example 2-18. Checking in-memory data after a query
private static void GetLocalDestinationCount()
{
  using (var context = new BreakAwayContext())
  {
    foreach (var destination in context.Destinations)
    {
      Console.WriteLine(destination.Name);
    }

    var count = context.Destinations.Local.Count;
    Console.WriteLine("Destinations in memory: {0}", count);
  }
}

This new code iterates over the Destinations set, causing the data to be loaded from the database. Because the data is loaded when we get the count from the Local property, we now see a nonzero result when we run the application:

Grand Canyon
Hawaii
Wine Glass Bay
Great Barrier Reef
Destinations in memory: 4

Using the Load Method to Bring Data into Memory

Iterating over the contents of a DbSet with a foreach loop is one way to get all the data into memory, but it’s a little inefficient to do that just for the sake of loading data. It’s also a little unclear what the intent of the code is, especially if the iteration code doesn’t directly precede the local query.

Fortunately the DbContext API includes a Load method, which can be used on a DbSet to pull all the data from the database into memory. Go ahead and add the GetLocalDestinationCountWithLoad method (Example 2-19) that uses Load on the Destinations set and then prints out the count of in-memory Destinations.

Example 2-19. Using the Load to bring data into memory
private static void GetLocalDestinationCountWithLoad()
{
  using (var context = new BreakAwayContext())
  {
    context.Destinations.Load();

    var count = context.Destinations.Local.Count;
    Console.WriteLine("Destinations in memory: {0}", count);
  }
}

Compare this code with the GetLocalDestinationCount method we wrote back in Example 2-18. This updated code makes it much clearer that our intent is to load the contents of the Destinations set and then query the in-memory data.

Note

Load is actually an extension method on IQueryable<T> and is defined in the System.Data.Entity namespace. If you want to use Load, you will need to have this namespace imported.

Because Load is an extension method on IQueryable<T>, we can also use it to load the results of a LINQ query into memory, rather than the entire contents of a set. For example, let’s say we only wanted to load Australian Destinations into memory and then run a few local queries on that subset of data. Let’s add the LoadAustralianDestinations method shown in Example 2-20.

Example 2-20. Loading results of a LINQ query into memory
private static void LoadAustralianDestinations()
{
  using (var context = new BreakAwayContext())
  {
    var query = from d in context.Destinations
                where d.Country == "Australia"
                select d;

    query.Load();

    var count = context.Destinations.Local.Count;
    Console.WriteLine("Aussie destinations in memory: {0}", count);
  }
}

This time just the Destinations with Country set to Australia are loaded into memory. When we run the application, we see that the count we get from Local is reduced to reflect this.

Note

Using Load on a LINQ query will bring the results of that query into memory but it does not remove the results of previous queries. For example if you called Load on a query for Australian destinations and then Load on a query for American destinations, both Australian and American destinations would be in memory and would be returned from Local.

Running LINQ Queries Against Local

So far we have just looked at getting the count from Local to make sure that it is returning the correct data that we brought into memory. Because Local is just a collection of in-memory objects, we can also run queries against it. One of the great things about LINQ is that it’s not specific to Entity Framework. We can use the same LINQ syntax to query a number of different data sources, including in-memory collections of objects.

Let’s add a LocalLinqQueries method that pulls data into memory using a single database query and then runs some in-memory queries using Local (Example 2-21).

Example 2-21. Using LINQ to query Local
private static void LocalLinqQueries()
{
  using (var context = new BreakAwayContext())
  {
    context.Destinations.Load();

    var sortedDestinations = from d in context.Destinations.Local
                             orderby d.Name
                             select d;

    Console.WriteLine("All Destinations:");
    foreach (var destination in sortedDestinations)
    {
      Console.WriteLine(destination.Name);
    }

    var aussieDestinations = from d in context.Destinations.Local
                             where d.Country == "Australia"
                             select d;

    Console.WriteLine();
    Console.WriteLine("Australian Destinations:");
    foreach (var destination in aussieDestinations)
    {
      Console.WriteLine(destination.Name);
    }
  }
}

The code loads all Destinations into memory and then runs one query to sort them by Name and another to pull out just the Australian Destinations. Remember that Find also defaults to using in-memory data where possible. So we could also use Find and it would use the data we loaded rather than sending more queries to the database.

While Load and Local are great if you want to reduce the number of queries that get run against the database just remember that pulling all your data into memory may be an expensive operation. If you are running multiple queries that only return a subset of your data you’ll probably get better performance by letting these queries hit the database and just pull back the data you actually need.

Working with the ObservableCollection Returned by Local

If you’ve looked at the API closely you may have noticed that Local returns an ObservableCollection<TEntity>. This type of collection allows subscribers to be notified whenever objects are added or removed from the collection. ObservableCollection is useful in a number of data-binding scenarios, but it can also be useful if your application needs to know when new data comes into memory.

Local will raise the CollectionChanged event whenever the contents of Local change. This can be when data is brought back from that database via a query, when new objects are added to the DbContext, or when objects previously brought into memory are marked for deletion.

Let’s add a ListenToLocalChanges method that uses this functionality to log any changes to Destinations.Local to the console (Example 2-22).

Example 2-22. Using CollectionChanged to print out changes to Local
private static void ListenToLocalChanges()
{
  using (var context = new BreakAwayContext())
  {
    context.Destinations.Local
      .CollectionChanged += (sender, args) =>
    {
      if (args.NewItems != null)
      {
        foreach (Destination item in args.NewItems)
        {
          Console.WriteLine("Added: " + item.Name);
        }
      }

      if (args.OldItems != null)
      {
        foreach (Destination item in args.OldItems)
        {
          Console.WriteLine("Removed: " + item.Name);
        }
      }
    };

    context.Destinations.Load();
  }
}

The code adds a new event handler to the Local collection of Destinations. This handler looks at items entering or leaving the collection and prints out the name of the affected Destination and indicates if it is being added or removed. Once the event handler is in place, we use Load to pull all the data from the database into memory. If you run the application, you can see the output appearing as items are returned from the database:

Added: Grand Canyon
Added: Hawaii
Added: Wine Glass Bay
Added: Great Barrier Reef

These events could be handy if you have a screen that needs to be refreshed whenever some data in your context changes. For example, you might have a screen that displays all Destinations and another screen where the user can add a new Destination. You could wire up the screen displaying all Destinations to listen to the CollectionChanged event and refresh whenever anything is added or removed.

Some UI frameworks, such as WPF, will take care of this for you so that you don’t have to write code to listen to changes. If you bind a WPF ListBox to the contents of Local, whenever any other area of the application adds or removes an entity from the DbSet, the ListBox will be updated to reflect those changes.

Note

If you use LINQ to query the contents of Local, the result of the query is no longer an ObservableCollection. This means if you run a LINQ query against Local and bind the results to a WPF ListBox, it will no longer get automatically updated for you when entities are added or removed. You would need to write code that listens to OnCollectionChanged on DbSet.Local and rerun the query to refresh the ListBox.

So far we have looked at accessing data for a single type of entity and everything has been about Destinations. But if we were writing a real application, we would probably want to know something about the Lodging that is available at each Destination. If we want to access the Lodgings associated with a Destination, that means working with related data.

You’ll need to pull related data into memory so that we can look at it. There are three approaches you can use to load related data: lazily, eagerly, or explicitly. While they may achieve the same end result, there are some differences between each approach that can have a significant impact on performance. This isn’t a one-time decision either. Different approaches may be better at different times. This section will walk through the three available options and help you work out which one is best for you in different situations.

Note

The “Demystifying Entity Framework Strategies: Loading Related Data” MSDN article gives a detailed look at the pros and cons of the different strategies and some pointers on choosing the right strategy for you.

Lazy Loading

Lazy loading related data is the most transparent to your application and involves letting Entity Framework automatically retrieve the related data for you when you try to access it. For example, you may have the Grand Canyon destination loaded. If you then use the Lodgings property of this Destination, Entity Framework will automatically send a query to the database to load all Lodgings at the Grand Canyon. It will appear to your application code as if the Lodgings property was always populated.

Entity Framework achieves lazy loading using a dynamic proxy. Here’s how that works. When Entity Framework returns the results of a query, it creates instances of your classes and populates them with the data that was returned from the database. Entity Framework has the ability to dynamically create a new type at runtime that derives from your POCO class. This new class acts as a proxy to your POCO class and is referred to as a dynamic proxy. It will override the navigation properties of your POCO class and include some additional logic to retrieve the data from the database when the property is accessed. Because the dynamic proxy derives from your POCO class, your application can be written in terms of the POCO class and doesn’t need to be aware that there may be a dynamic proxy at runtime.

Note

DbContext has a configuration setting that enables lazy loading: DbContext.Configuration.LazyLoadingEnabled. This setting is true by default and therefore if you have not changed the default, the dynamic proxy will perform lazy loading.

In order to use dynamic proxies, and therefore lazy loading, there are a couple of criteria your class must meet. If these criteria are not met, Entity Framework will not create a dynamic proxy for the class and will just return instances of your POCO class, which cannot perform lazy loading:

  • Your POCO class must be public and not sealed.

  • The navigation properties that you want to be lazy loaded must also be marked as virtual (Overridable in Visual Basic) so that Entity Framework can override the properties to include the lazy loading logic.

Before we make any changes to our classes, let’s see what the behavior is like without dynamic proxies. Add a TestLazyLoading method that attempts to access the Lodgings associated with a specific Destination (Example 2-23).

Example 2-23. Method to access related data
private static void TestLazyLoading()
{
  using (var context = new BreakAwayContext())
  {
    var query = from d in context.Destinations
                where d.Name == "Grand Canyon"
                select d;

    var canyon = query.Single();

    Console.WriteLine("Grand Canyon Lodging:");
    if (canyon.Lodgings != null)
    {
      foreach (var lodging in canyon.Lodgings)
      {
        Console.WriteLine(lodging.Name);
      }
    }
  }
}

The code locates the Grand Canyon Destination and then tests if the Lodgings property is populated. If it is populated, the name of each associated Lodging is printed to the console. If you update the Main method to call TestLazyLoading and run the application, you will see that nothing is printed out to the console. This is because the Lodgings property on Destination isn’t marked as virtual (Overridable in Visual Basic), so Entity Framework can’t override the property in a dynamic proxy. Entity Framework is forced to use your implementation of the property (that doesn’t perform lazy loading) rather than replacing it with an implementation that includes the lazy loading logic. Let’s go ahead and edit the Destination class so that the property is marked as virtual:

public virtual List<Lodging> Lodgings { get; set; }

Now Entity Framework can create a dynamic proxy for the Destination class. If you run the application again, you’ll see that the individual Lodgings for the Grand Canyon are displayed because the data was automatically loaded for you when the code encountered the first request for Lodgings:

Grand Canyon Lodging:
Grand Hotel
Dave's Dump

As the code executed, Entity Framework sent two queries to the database (Figure 2-1). The first query retrieves the data for the Grand Canyon Destination and was executed when the code called the Single method on query. Remember that the Single method uses a SELECT TOP (2) query to ensure there is one result and only one result. The second query selects all Lodgings associated with the Grand Canyon. This query was sent at the moment the code first tried to access the Lodgings property for the Grand Canyon Destination.

Lazy loading query
Figure 2-1. Lazy loading query

Understanding the downsides of lazy loading

Lazy loading is very simple because your application doesn’t really need to be aware that data is being loaded from the database. But that is also one of its dangers! Improper use of lazy loading can result in a lot of queries being sent to the database. For example, you might load fifty Destinations and then access the Lodgings property on each. That would result in 51 queries against the database—one query to get the Destinations and then for each of the fifty Destinations, to load that Destination’s Lodgings. In cases like this it may be much more efficient to load all that data in a single query, using a SQL join in the database query. This is where eager loading comes into play.

Note

If you decide that lazy loading is just too much magic, you can choose to disable it altogether by using the DbContext.Configuration.LazyLoadingEnabled property. If this switch is set to false, lazy loading will never occur, even if a navigation property is marked as virtual.

Eager Loading

Eager loading related data relies on you telling Entity Framework what related data to include when you query for an entity type. Entity Framework will then use a JOIN in the generated SQL to pull back all of the data in a single query. Let’s assume we want to run though all Destinations and print out the Lodgings for each. Add a TestEagerLoading method that queries for all Destinations and uses Include to also query for the associated Lodgings (Example 2-24).

Example 2-24. Using eager loading to load related data
private static void TestEagerLoading()
{
  using (var context = new BreakAwayContext())
  {
    var allDestinations = context
      .Destinations
      .Include(d => d.Lodgings);

    foreach (var destination in allDestinations)
    {
      Console.WriteLine(destination.Name);

      foreach (var lodging in destination.Lodgings)
      {
        Console.WriteLine(" - " + lodging.Name);
      }
    }
  }
}

The code uses the Include method to indicate that the query for all destinations should include the related Lodging data. Include uses a lambda expression to specify which properties to include the data for. When the application runs, we see a single query is executed against the database (Figure 2-2). This query uses a join to return the Destination and Lodging data as a single result set.

Eager loading returns all data in a single query
Figure 2-2. Eager loading returns all data in a single query

There is also a string-based overload of Include that just accepts the name of the property to include data for (Include(“Lodgings”) in our case). Previous versions of Entity Framework only included this string option. The string-based overload is problematic because it’s not strongly typed and therefore there is no compile-time checking of the parameter. This can lead to issues with mistyped property names or failing to update the Include call if the property is renamed in the future.

Note

The lambda version of the Include method is defined as an extension method in System.Data.Entity. To use the lambda overload you will need to import this namespace.

It is possible to include more than one related set of data in a single query. Say we wanted to query for Lodgings and include the PrimaryContact plus the associated Photo. We do this by “dotting through” the navigation properties in the lambda expression:

context.Lodgings
  .Include(l => l.PrimaryContact.Photo)

The syntax gets a little more complicated if you have a collection navigation property in the middle of the path to be included. What if you want to query for Destinations and include Lodgings and also the PrimaryContact for each of the related Lodging instances? Following the collection, you need to use the LINQ Select method to identify which property you want to load:

context.Destinations
  .Include(d => d.Lodgings.Select(l => l.PrimaryContact))

Include can be used multiple times in the same query to identify different data to be loaded. For example, you may want to query the Lodgings set and include both PrimaryContact and SecondaryContact. This requires two separate calls to Include:

context.Lodgings
  .Include(l => l.PrimaryContact)
  .Include(l => l.SecondaryContact)

Note

Eager loading is currently only able to include the entire contents of a navigation property. The ability to only include a subset of the contents of a collection navigation property is a common request, but it is not currently supported by the Entity Framework.

Understanding the downsides of eager loading

One thing to bear in mind with eager loading is that fewer queries aren’t always better. The reduction in the number of queries comes at the expense of the simplicity of the queries being executed. As you include more and more data, the number of joins in the query that is sent to the database increases and results in a slower and more complex query. If you need a significant amount of related data, multiple simpler queries will often be significantly faster than one big query that returns all the data.

Using Include in LINQ queries

You can also use Include as part of a LINQ query by adding the Include method to the DbSet being queried. If you are using query syntax, the Include goes in the from part of the query:

var query = from d in context.Destinations.Include(d => d.Lodgings)
            where d.Country == "Australia"
            select d;

If you are using method syntax, you can simply put Include in line with the other method calls:

var query = context.Destinations
  .Include(d => d.Lodgings)
  .Where(d => d.Country == "Australia");

Include is defined as an extension method on IQueryable<T> and can therefore be added to a query at any point. It doesn’t have to immediately follow the DbSet from which you are selecting. For example, you can call Include on an existing query for Australian Destinations to specify that Lodgings should also be included:

var query = from d in context.Destinations
            where d.Country == "Australia"
            select d;

query = query.Include(d => d.Lodgings);

Note that the code doesn’t just call Include on the existing query but overrides the query variable with the result of the Include call. This is necessary because Include doesn’t modify the query that it is called on, it returns a new query that will include the related data. Remember that Entity Framework doesn’t execute any queries until the code uses the results of the query. The above code doesn’t use the results of the query, so nothing will be executed against the database until some other code accesses the Destinations from the query variable.

Note

Although Include is defined as an extension method on IQueryable<T> it will only have an effect when used on a LINQ to Entities query. If another LINQ provider is being used, Include will have no effect unless the implementation of IQueryable<T> exposes an Include method that accepts a single string parameter. If this method exists, it will be called with a string representing the property path that was specified to be included.

Explicit Loading

Another loading option is explicit loading. Explicit loading is like lazy loading in that related data is loaded separately, after the main data has been loaded. However, unlike lazy loading, it doesn’t automatically happen for you; you need to call a method to load the data.

There are a number of reasons you might opt for explicit loading over lazy loading:

  • It removes the need to mark your navigation properties as virtual. To some this may seem like a trivial change, for others, the fact that a data access technology requires you to change your POCO classes is far from ideal.

  • You may be working with an existing class library where the navigation properties are not marked as virtual and you simply can’t change that.

  • Explicit loading allows you to be sure that you know exactly when queries are sent to the database. Lazy loading has the potential to generate a lot of queries; with explicit loading it is very obvious when and where queries are being run.

Explicit loading is achieved using the DbContext.Entry method. The Entry method gives you access to all the information that the DbContext has about an entity. This goes beyond the values that are stored in the properties of the actual entity and includes things such as the state of the entity and the original values for each property when it was retrieved from the database. You’ll see a lot more about this information in Chapters 4 and 5. In addition to information about the entity, the Entry method also gives you access to some operations you can perform on the entity, including loading data for navigation properties.

Once we have the entry for a given entity we can use the Collection and Reference methods to drill into the information and operations for navigation properties. One of the operations available is the Load method, which will send a query to the database to load the contents of the navigation property.

Let’s take another look at loading the Lodgings available at the Grand Canyon. This time let’s add a TestExplicitLoading method that uses the Entry method to load the data (Example 2-25).

Example 2-25. Loading related data with explicit load
private static void TestExplicitLoading()
{
  using (var context = new BreakAwayContext())
  {
    var query = from d in context.Destinations
                where d.Name == "Grand Canyon"
                select d;

    var canyon = query.Single();

    context.Entry(canyon)
      .Collection(d => d.Lodgings)
      .Load();

    Console.WriteLine("Grand Canyon Lodging:");
    foreach (var lodging in canyon.Lodgings)
    {
      Console.WriteLine(lodging.Name);
    }
  }
}

The first part of the code should be familiar—it uses a LINQ query to locate the Grand Canyon Destination. The code then calls the Entry method, passing in the canyon object. From there the Collection method is used to drill into the Lodgings navigation property. Collection and Reference use a lambda expression to specify the property to drill into. There are also string-based alternatives to these methods, but the lambda version ensures we get compile-time checking of the parameter. Finally, the Load method is used to query for the related data and bring it into memory.

If you update the Main method to call TestExplicitLoading and then run the application, you will see two queries run against the database (Figure 2-3). The first one runs when the code requests the single result of the query for the Grand Canyon, by calling Single on query. The second query is asking for all Lodging at the Grand Canyon and runs as a result of the call to Load.

Explicit loading runs separate queries for related data
Figure 2-3. Explicit loading runs separate queries for related data

You’ve seen that explicit loading can be used to load the entire contents of a collection navigation property but it can also be used to load just some of the contents, based on a LINQ query. You’ll see this in Explicit Loading a Subset of the Contents of a Navigation Property.

Explicit loading of a reference navigation property looks very similar, except you use the Reference method rather than Collection. For example, if you wanted to load the PrimaryContact of some lodging, you could write this:

var lodging = context.Lodgings.First();

context.Entry(lodging)
  .Reference(l => l.PrimaryContact)
  .Load();

Checking If a Navigation Property Has Been Loaded

The Reference and Collection methods also give you access to the IsLoaded property. The IsLoaded method will tell you whether the entire contents of the navigation property have been loaded from the database or not. The IsLoaded property will be set to true when lazy, eager, or explicit loading is used to load the contents of the navigation property. Add the TestIsLoaded method shown in Example 2-26.

Example 2-26. Testing if a navigation property has been loaded with IsLoaded
private static void TestIsLoaded()
{
  using (var context = new BreakAwayContext())
  {
    var canyon = (from d in context.Destinations
                  where d.Name == "Grand Canyon"
                  select d).Single();

    var entry = context.Entry(canyon);

    Console.WriteLine(
      "Before Load: {0}",
      entry.Collection(d => d.Lodgings).IsLoaded);

    entry.Collection(d => d.Lodgings).Load();

    Console.WriteLine(
      "After Load: {0}",
      entry.Collection(d => d.Lodgings).IsLoaded);
  }
}

The code uses a LINQ query to load the Grand Canyon Destination from the database. The value assigned to the IsLoaded property for the Lodgings property is then printed out to the console. Explicit loading is used to load the contents of the Lodgings property and the value of IsLoaded is printed to the console again. If you update the Main method to call TestIsLoaded and then run the application, you will see that the value of IsLoaded is set to true after the explicit load is performed:

Before Load: False
After Load: True

If you are performing an explicit load, and the contents of the navigation property may have already been loaded, you can use the IsLoaded flag to determine if the load is required or not.

Querying Contents of a Collection Navigation Property

So far you’ve looked at loading the entire contents of a collection navigation property so that you can work with the data in memory. If you wanted to filter the contents of a navigation property you could do this after you’d brought everything into memory, using LINQ to Objects. However, if you are only interested in a subset of the contents, it may make sense to just bring the bits you are interested in into memory. Or if you just want a count, or some other calculation, it may make sense just to calculate the result in the database and not bring any of the data into memory.

Once you’ve used Entry and Collection to drill into a collection navigation property, you can then use the Query method to get a LINQ query representing the contents of that property. Because it’s a LINQ query, you can then do further filtering, sorting, aggregation, and the like.

Assume you wanted to find all Lodgings at the Grand Canyon that are less than ten miles from the nearest airport. You could just use LINQ to query the contents of the Lodgings property of the Grand Canyon, something like Example 2-27.

Example 2-27. In-memory query of a navigation property
private static void QueryLodgingDistance()
{
  using (var context = new BreakAwayContext())
  {
    var canyonQuery = from d in context.Destinations
                      where d.Name == "Grand Canyon"
                      select d;

    var canyon = canyonQuery.Single();

    var distanceQuery = from l in canyon.Lodgings
                  where l.MilesFromNearestAirport <= 10
                  select l;

    foreach (var lodging in distanceQuery)
    {
      Console.WriteLine(lodging.Name);
    }
  }
}

The problem with this code is that distanceQuery is using LINQ to Objects to query the contents of the Lodgings navigation property. This will cause the property to be lazy loaded, pulling the entire contents into memory. The code then immediately filters out some of the data, meaning there was no need to pull it into memory. Let’s rewrite the QueryLodgingDistance method from Example 2-27 to use Query, as shown in Example 2-28.

Example 2-28. Database query of a navigation property
private static void QueryLodgingDistance()
{
  using (var context = new BreakAwayContext())
  {
    var canyonQuery = from d in context.Destinations
                      where d.Name == "Grand Canyon"
                      select d;

    var canyon = canyonQuery.Single();

    var lodgingQuery = context.Entry(canyon)
      .Collection(d => d.Lodgings)
      .Query();

    var distanceQuery = from l in lodgingQuery
                  where l.MilesFromNearestAirport <= 10
                  select l;

    foreach (var lodging in distanceQuery)
    {
      Console.WriteLine(lodging.Name);
    }
  }
}

This updated code uses the Query method to create a LINQ to Entities query for the Lodgings associated with the Grand Canyon. It then composes on that query to ask for just the Lodgings that are within ten miles of an airport. When iterating over this query, Entity Framework takes care of the translation to SQL and performs the filter on MilesFromNearestAirport in the database. This means that only the data you care about is brought back into memory.

Perhaps you want to know how many Lodgings are available at the Grand Canyon. You could load all the Lodgings and get a count, but why bring all that data into memory just to get a single integer result? Add a QueryLodgingCount method that uses Query to get the count without loading the data (Example 2-29).

Example 2-29. Using Query to get a count of Lodgings
private static void QueryLodgingCount()
{
  using (var context = new BreakAwayContext())
  {
    var canyonQuery = from d in context.Destinations
                where d.Name == "Grand Canyon"
                select d;

    var canyon = canyonQuery.Single();

    var lodgingQuery = context.Entry(canyon)
      .Collection(d => d.Lodgings)
      .Query();

    var lodgingCount = lodgingQuery.Count();

    Console.WriteLine("Lodging at Grand Canyon: " + lodgingCount);
  }
}

The code loads the Grand Canyon destination and then uses Entry and Collection to drill into the Lodgings navigation property. From there it uses the Query method to get a query representing the contents of the navigation property. It then uses the LINQ Count method to materialize just the count of the results of the query. Because it is using the LINQ to Entities provider, it recognizes that you want the count and pushes the entire query to the database so that only the single integer result is returned from the database. If you update the Main method to call QueryLodgingCount and run the application you will see the count correctly displayed:

Lodging at Grand Canyon: 2

Explicit Loading a Subset of the Contents of a Navigation Property

You can combine the Query and Load methods to perform a filtered explicit load. That’s an explicit load that only loads a subset of the contents of a navigation property. For example, you may want to just load the Lodgings at the Grand Canyon that contain the word “Hotel” in their Name:

context.Entry(canyon)
  .Collection(d => d.Lodgings)
  .Query()
  .Where(l => l.Name.Contains("Hotel"))
  .Load();

It’s important to remember that calling Load will not clear any objects that are already in the navigation property. So if you loaded Lodgings at the Grand Canyon that contain the word “Hotel” and then also loaded Lodgings that contain the word “Campsite”, the Lodgings navigation property will contain both hotels and campsites.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset