There are two things that almost every application that accesses a database has in common: the need to retrieve data from the database and to save changes to that data back into the database. Over the next two chapters you will see how the DbContext API makes it easy to achieve these tasks using the Entity Framework. The focus of this chapter will be on retrieving data from the database.
One of the great benefits of using an Object Relational Mapper (ORM), such as Entity Framework, is that once we have set up the mapping, we can interact with our data in terms of the objects and properties that make up our model, rather than tables and columns. When querying for objects, this means we no longer need to know how to write queries using the SQL syntax of our database.
Entity Framework queries are written using a .NET Framework feature known as Language Integrated Query, or LINQ for short. As the name suggests, LINQ is tightly integrated with the .NET programming experience and provides a strongly typed query language over your model. Strongly typed simply means that the query is defined using the classes and properties that make up your model. This provides a number of benefits such as compile-time checks to ensure your queries are valid and the ability to provide IntelliSense as you write your queries.
LINQ is a general query framework and isn’t specific to Entity Framework, or even databases for that matter. A LINQ Provider is responsible for taking your LINQ query, translating it into a query against the data, and then returning results. For Entity Framework this provider is known as LINQ to Entities and is responsible for taking your LINQ query and translating it into a SQL query against the database you are targeting. The information you supplied to Entity Framework about the shape of your model and how it maps to the database is used to perform this translation. Once the query returns, Entity Framework is responsible for copying the data into instances of the classes that make up your model.
The capabilities of LINQ and its use within Entity Framework are
beyond the scope of this book. This chapter will provide an overview to
help you get up and running with queries using DbContext
, but is not an exhaustive query guide.
Programming Entity
Framework, 2e, provides a much more in-depth look at
the query capabilities of Entity Framework, not only in Chapter 3 and
Chapter 4, which are dedicated to querying, but throughout the
book.
In addition to LINQ, Entity Framework also supports a text-based query language known as Entity SQL, or ESQL for short. ESQL is typically used in more advanced scenarios where queries need to be dynamically constructed at runtime. Because ESQL is text-based, it is also useful in scenarios where the application needs to build a query against a model that isn’t known until runtime. Given that ESQL is less commonly used, it is not exposed directly on the DbContext API. If your application requires the use of ESQL, you will need to access the ObjectContext API using the IObjectContextAdapter interface.
To follow along with the examples in this book you will need a Visual Studio solution containing a console application that references the BAGA model built in Programming Entity Framework: Code First. You can download a prebuilt solution from http://learnentityframework.com/downloads. This prebuilt solution also includes a database initializer that will reset the database and insert some seed data into the database each time you run the application. The seed data is used in the examples throughout this book.
The Model
project of the prebuilt
solution contains classes that make up the BAGA domain model. The BAGA
model includes a Destination
class
(Example 2-1) that represents all the wonderful
places that our intrepid travelers can venture to.
[Table
("Locations"
, Schema ="baga"
)]public class
Destination
{public
Destination() {this
.Lodgings =new
List
<Lodging
>(); } [Column
("LocationID"
)]public int
DestinationId {get
;set
; } [Required
,Column
("LocationName"
)] [MaxLength
(200)]public string
Name {get
;set
; }public string
Country {get
;set
; } [MaxLength
(500)]public string
Description {get
;set
; } [Column
(TypeName ="image"
)]public byte
[] Photo {get
;set
; }public string
TravelWarnings {get
;set
; }public string
ClimateInfo {get
;set
; }public
List
<Lodging
> Lodgings {get
;set
; } }
The BAGA model also includes a Lodging
class (Example 2-2)
that represents the accommodation that is available at the various
Destination
s.
public class
Lodging
{public int
LodgingId {get
;set
; } [Required
] [MaxLength
(200)] [MinLength
(10)]public string
Name {get
;set
; }public string
Owner {get
;set
; }public decimal
MilesFromNearestAirport {get
;set
; } [Column
("destination_id"
)]public int
DestinationId {get
;set
; }public
Destination
Destination {get
;set
; }public
List
<InternetSpecial
> InternetSpecials {get
;set
; }public
Nullable
<int
> PrimaryContactId {get
;set
; } [InverseProperty
("PrimaryContactFor"
)] [ForeignKey
("PrimaryContactId"
)]public
Person
PrimaryContact {get
;set
; }public
Nullable
<int
> SecondaryContactId {get
;set
; } [InverseProperty
("SecondaryContactFor"
)] [ForeignKey
("SecondaryContactId"
)]public
Person
SecondaryContact {get
;set
; } }
The Destination
and Lodging
classes will be used extensively for the
examples throughout this book. To perform data access using these classes
you will be using the BreakAwayContext
from the DataAccess
project. The
project contains additional classes that are represented in BreakAwayContext
as well as the Lodgings
and Destinations
. We’ll be using Code First for the
examples in this book, but the techniques you will learn apply to any
context that derives from DbContext
.
This includes contexts created using the Model First or Database First
workflows.
public class
BreakAwayContext
:DbContext
{public
DbSet
<Destination
> Destinations {get
;set
; }public
DbSet
<Lodging
> Lodgings {get
;set
; }public
DbSet
<Trip
> Trips {get
;set
; }public
DbSet
<Person
> People {get
;set
; }public
DbSet
<Reservation
> Reservations {get
;set
; }public
DbSet
<Payment
> Payments {get
;set
; }public
DbSet
<Activity
> Activities {get
;set
; } }
Arguably the simplest query you can write is one that fetches all
the data for a given entity type. This is the equivalent of a SELECT * FROM mytable
query in SQL. Fortunately
you don’t need to know SQL, because Entity Framework will take care of
translating LINQ queries into SQL for you.
Getting all the data from a set doesn’t require you to really write
a query. You can simply iterate over the contents of any given DbSet
and Entity Framework will send a query to
the database to find all the data in that set. Let’s add a PrintAllDestinations
method to our console
application that iterates over the Destinations
set defined in our BreakAwayContext
and prints out the name of each
Destination
(Example 2-4).
private static void
PrintAllDestinations() {using
(var
context =new
BreakAwayContext
()) {foreach
(var
destinationin
context.Destinations) {Console
.WriteLine(destination.Name); } } }
When you debug the application, the console window will close when the application has finished executing, which may prevent you from inspecting the output. You can put a breakpoint at the end of the method for debugging. Alternatively, you can run without debugging (CTRL + F5), in which case Visual Studio will ensure that the console window remains open after the program has finished executing.
If you update the Main
method to
call this new PrintAllDestinations
method and run the application, you will see that the name of each
Destination
in the database is printed
to the console:
Grand Canyon Hawaii Wine Glass Bay Great Barrier Reef
As the code began iterating over the contents of the Destinations
set, Entity Framework issued a SQL
query against the database to load the required data:
SELECT [Extent1].[LocationID] AS [LocationID], [Extent1].[LocationName] AS [LocationName], [Extent1].[Country] AS [Country], [Extent1].[Description] AS [Description], [Extent1].[Photo] AS [Photo] FROM [baga].[Locations] AS [Extent1]
The SQL may not look like the SQL you would have written. This is because Entity Framework has a generic query building algorithm that not only caters to this very simple query, but also for much more complex scenarios.
The query is sent to the database when the first result is requested
by the application: that’s during the first iteration of the foreach
loop. Entity Framework doesn’t pull back
all the data at once, though. The query remains active and the results are
read from the database as they are needed. By the time the foreach
loop is completed, all the results have
been read from the database.
One important thing to note is that Entity Framework will query the
database every time you trigger an iteration over the contents of a
DbSet
. This has performance
implications if you are continually querying the database for the same
data. To avoid this, you can use a LINQ operator such as ToList
to copy the results into a list. You can
then iterate over the contents of this list multiple times without causing
multiple trips to the database. Example 2-5 introduces a
PrintAllDestinationsTwice
method that
demonstrates this approach.
private static void
PrintAllDestinationsTwice() {using
(var
context =new
BreakAwayContext
()) {var
allDestinations = context.Destinations.ToList();foreach
(var
destinationin
allDestinations) {Console
.WriteLine(destination.Name); }foreach
(var
destinationin
allDestinations) {Console
.WriteLine(destination.Name); } } }
Because a query is sent to the database to find the items in a
DbSet
, iterating a DbSet
will only contain items that exist in the
database. Any objects that are sitting in memory waiting to be saved to
the database will not be returned. To ensure added objects are included
you can use the techniques described in Querying Local Data.
While this chapter will not be an exhaustive list of everything you
can do with LINQ and Entity Framework, let’s take a look at the patterns
used to achieve some common query tasks. Let’s say you want to print out
the names of Destination
s again, but
this time you want them ordered alphabetically by Name
. Add a new PrintAllDestinationsSorted
method that uses a
LINQ query to perform this sort (Example 2-6).
private static void
PrintAllDestinationsSorted() {using
(var
context =new
BreakAwayContext
()) {var
query =from
din
context.Destinationsorderby
d.Nameselect
d;foreach
(var
destinationin
query) {Console
.WriteLine(destination.Name); } } }
The above code uses LINQ to create a query and then iterates the
results of the query and displays the name of each destination. The query
is expressed using a syntax that looks a little bit like SQL. You start by
telling it what you want to select from (in our case, the Destinations
set on our context). You give the
set a name so that you can refer to it throughout the rest of the query
(in our case that name is d
). Following
this, you use operators such as orderby
, groupby
, and where
to define the query. Finally you specify
what you want returned using the select
operator. In our case we want the actual Destination
objects returned, so we specify the
name that we gave the set in the first line.
Remember that Entity Framework won’t execute the query against the
database until it needs the first result. During the first iteration of
the foreach
loop, the query is sent to
the database. The query remains active and each result is read from the
database as it is needed by the application. LINQ also includes methods
that will copy the results of a query into a collection. For example,
ToList
can be called on a query to copy
the results into a new List<T>
.
Calling a method such as this will cause all the results to be retrieved
from the database and be copied into the new List<T>
.
The code shown in Example 2-6 uses the LINQ query syntax to express the query. While most people find this the easiest to understand, there is an alternate method syntax that can be used if you prefer. Example 2-7 shows the same query expressed using method syntax.
The method syntax makes use of lambda expressions to define the
query. The LINQ methods are strongly typed, which gives you IntelliSense
and compile-time checking for the lambda expressions you write. For
example, in the OrderBy
method we are
using a lambda expression to specify that we want to order by the Name
property. You start a lambda expression by
giving a name to the thing you are operating on; this forms the left side
of the expression. In our case we are operating on a Destination
and we have chosen to call it
d
. Then, on the right side of the
expression, you specify the body of the expression. In our case we just
want to identify the Name
property.
C# uses the lambda sign (=>
)
to separate the left and right sides of the expression. VB.NET uses the
Function
keyword followed by brackets
to identify the left side of the expression. Example 2-8 shows the same query written in
VB.NET using the method syntax.
Another common task is to filter the results of a query. For
example, we may only want Destinations
from Australia. Add the PrintAustralianDestinations
method shown in
Example 2-9.
private static void
PrintAustralianDestinations() {using
(var
context =new
BreakAwayContext
()) {var
query =from
din
context.Destinationswhere
d.Country =="Australia"
select
d;foreach
(var
destinationin
query) {Console
.WriteLine(destination.Name); } } }
This code looks very similar to the PrintAllDestinationsSorted
we saw in Example 2-6, except we are using
the where
operator instead of orderby
. You can also combine these operators.
Example 2-10 shows how to query for
Australian Destination
s sorted by
name.
var
query =from
din
context.Destinationswhere
d.Country =="Australia"
orderby
d.Nameselect
d;
Operators can also be combined in the method syntax. The same query from Example 2-10 is shown using method syntax in Example 2-11.
var
query = context.Destinations .Where(d => d.Country =="Australia"
) .OrderBy(d => d.Name);
So far our queries have returned collections of entities from our
model, but this may not always be the case. In fact, we have been
returning complete Destination
objects
when we really only need the name. You can use
projection to create a query that selects from a set
of entities in your model but returns results that are of a different
type. For example, you can use projection to create a query that selects
from a set of entities type but only returns a subset of the properties of
that entity. It’s called projection because you are projecting data from
the shape of the source that you are selecting from onto the shape of the
result set you want.
In our case we want to project a query about Destinations
into a result set that just has a
string representing the destination’s name. Example 2-12 adds a PrintDestinationNameOnly
method that shows how
we use the select
section of our query
to specify what we want the result set to contain.
private static void
PrintDestinationNameOnly() {using
(var
context =new
BreakAwayContext
()) {var
query =from
din
context.Destinationswhere
d.Country =="Australia"
orderby
d.Nameselect
d.Name;foreach
(var
namein
query) {Console
.WriteLine(name); } } }
Example 2-13 shows how this same
query can be written using method syntax by making use of the Select
method.
var
query = context.Destinations .Where(d => d.Country =="Australia"
) .OrderBy(d => d.Name) .Select(d => d.Name);
LINQ is a powerful query language and this section has just grazed the surface of its capabilities. Programming Entity Framework, 2e, contains a much deeper look into using LINQ with the Entity Framework. There are also more example queries available in the Entity Framework MSDN documentation: http://msdn.microsoft.com/en-us/library/bb399367.aspx.
So far you’ve seen queries that return a collection of entities, but
sometimes you will want to run a query that just returns a single object.
The most common scenario for querying for a single object is to find the
object with a given key. The DbContext API makes this very simple by
exposing a Find
method on DbSet
. Find
accepts the value to be searched for and will return the corresponding
object if it is found. If there is no entity with the provided key,
Find
will return null
.
One of the great things about Find
is that it doesn’t unnecessarily query the
database. It’s also capable of finding newly added objects that haven’t
yet been saved to the database. Find
uses a simple set of rules to locate the object (in order of
precedence):
Look in memory for an existing entity that has been loaded from the database or attached to the context (you’ll learn more about attaching objects in Chapter 4).
Look at added objects that have not yet been saved to the database.
Look in the database for entities that have not yet been loaded into memory.
To see this behavior, add the FindDestination
method shown in Example 2-14. This method accepts an ID
from the user and then attempts to locate the Destination
with the specified ID.
private static void
FindDestination() {Console
.Write("Enter id of Destination to find: "
);var
id =int
.Parse(Console
.ReadLine());using
(var
context =new
BreakAwayContext
()) {var
destination = context.Destinations.Find(id);if
(destination ==null
) {Console
.WriteLine("Destination not found!"
); }else
{Console
.WriteLine(destination.Name); } } }
The code above uses the Find
method to look up the Destination
with
the specified ID. If one is found, it prints out the name of the
destination. If Find
returns null
, indicating there is no Destination
with the specified ID, an error
message is displayed to the user.
There may be times when you want to query for a single object but
are not able to use Find
. These could
include wanting to query by something other than the key or wanting to
include related data in the query (as described in Eager Loading). To do this, you will need to create a
standard LINQ query and then use the Single
method to get a single object as the
result.
Let’s say we want to locate the Destination
that has the name Great Barrier
Reef. Name
isn’t the key of Destination
but we know there is, and only ever
will be, one Great Barrier Reef. Example 2-10 introduces a FindGreatBarrierReef
method that will locate
this single Destination
.
private static void
FindGreatBarrierReef() {using
(var
context =new
BreakAwayContext
()) {var
query =from
din
context.Destinationswhere
d.Name =="Great Barrier Reef"
select
d;var
reef = query.Single();Console
.WriteLine(reef.Description); } }
The LINQ query looks the same as any other query that filters based
on name. We then use the Single
method
to let Entity Framework know that we expect a single result. If the query
returns no results, or more than one result, an exception will be thrown.
If there are potentially no matches, you can use the SingleOrDefault
method, which will return
null
if no results are found. Example 2-16 shows the FindGreatBarrierReef
method updated to account
for the fact it may not exist in the database.
private static void
FindGreatBarrierReef() {using
(var
context =new
BreakAwayContext
()) {var
query =from
din
context.Destinationswhere
d.Name =="Great Barrier Reef"
select
d;var
reef = query.SingleOrDefault();if
(reef ==null
) {Console
.WriteLine("Can't find the reef!"
); }else
{Console
.WriteLine(reef.Description); } } }
SingleOrDefault
uses the same
database query that Find
uses when it
looks for entities in the database. The SQL selects the TOP
two results so that it can ensure there is
only one match:
SELECT TOP (2) [Extent1].[LocationID] AS [LocationID], [Extent1].[LocationName] AS [LocationName], [Extent1].[Country] AS [Country], [Extent1].[Description] AS [Description], [Extent1].[Photo] AS [Photo], [Extent1].[TravelWarnings] AS [TravelWarnings], [Extent1].[ClimateInfo] AS [ClimateInfo] FROM [baga].[Locations] AS [Extent1] WHERE N'Great Barrier Reef' = [Extent1].[LocationName]
If two rows are found, Single
and
SingleOrDefault
will throw because
there is not a single result. If you just want the first result, and
aren’t concerned if there is more than one result, you can use First
or FirstOrDefault
.
One important thing to remember is that LINQ queries against a
DbSet
always send a query to the
database to find the data. So, if the Great Barrier Reef was a newly added
Destination
that hadn’t been saved to
the database yet, the queries in Example 2-15 and Example 2-16 won’t be able to
locate it. To look for newly added entities, you would also need to query
the in-memory data using the techniques shown in Querying Local Data.
So far you’ve used LINQ to query a DbSet
directly, which always results in a SQL
query being sent to the database to load the data. You’ve also used the
Find
method, which will look for
in-memory data before querying that database. Find
will only query based on the key property
though, and there may be times when you want to use a more complex query
against data that is already in memory and being tracked by your DbContext
.
One of the reasons you may want to do this is to avoid sending
multiple queries to the database when you know that all the data you need
is already loaded into memory. Back in Example 2-5, we saw one way to
do this was to use ToList
to copy the
results of a query into a list. While this works well if we are using the
data within the same block of code, things get a little messy if we need
to start passing that list around our application. For example, we might
want to load all Destination
s from the
database when our application loads. Different areas of our application
are then going to want to run different queries against that data. In some
places we might want to display all Destination
s, in others we might want to sort by
Name, and in others we might want to filter by Country
. Rather than passing around a list of
Destination
objects, we can take
advantage of the fact that our context is tracking all the instances and
query its local data.
Another reason may be that you want the results to include newly
added data, which doesn’t yet exist in the database. Using ToList
on a LINQ query against a DbSet
will always send a query to the database.
This means that any new objects that don’t yet exist in the database won’t
be included in the results. Local queries, however, will include newly
created objects in the results.
The in-memory data for a DbSet
is
available via the Local
property.
Local
will return all the data that has
been loaded from the database plus any newly added data. Any data that has
been marked as deleted but hasn’t been deleted from the database yet will
be filtered out for you. More information on how entities get into these
different states is available in Chapter 3.
Let’s start with the very simple task of finding out how many
Destination
s are in memory and
available to be queried. Go ahead and add the GetLocalDestinationCount
method, as shown in
Example 2-17.
private static void
GetLocalDestinationCount() {using
(var
context =new
BreakAwayContext
()) {var
count = context.Destinations.Local.Count;Console
.WriteLine("Destinations in memory: {0}"
, count); } }
The code accesses the Local
property of the Destinations
set that
we created on our BreakAwayContext
.
Rather than running a query, we simply store the count in a variable and
then print it to the console. If you run the application you will see that
the count is zero:
Destinations in memory: 0
We’re getting a zero count because we haven’t run any queries to
load Destination
s from the database,
and we haven’t added any new Destination
objects either. Let’s update the
GetLocalDestinationCount
method to
query some data from the database before getting the local count (Example 2-18).
private static void
GetLocalDestinationCount() {using
(var
context =new
BreakAwayContext
()) {foreach
(var
destinationin
context.Destinations) {Console
.WriteLine(destination.Name); }var
count = context.Destinations.Local.Count;Console
.WriteLine("Destinations in memory: {0}"
, count); } }
This new code iterates over the Destinations
set, causing the data to be loaded
from the database. Because the data is loaded when we get the count from
the Local
property, we now see a
nonzero result when we run the application:
Grand Canyon Hawaii Wine Glass Bay Great Barrier Reef Destinations in memory: 4
Iterating over the contents of a DbSet
with a foreach
loop is one way to get all the data
into memory, but it’s a little inefficient to do that just for the sake
of loading data. It’s also a little unclear what the intent of the code
is, especially if the iteration code doesn’t directly precede the local
query.
Fortunately the DbContext API includes a Load
method, which can be used on a DbSet
to pull all the data from the database
into memory. Go ahead and add the GetLocalDestinationCountWithLoad
method (Example 2-19) that uses
Load
on the Destinations
set and then prints out the count
of in-memory Destinations
.
private static void
GetLocalDestinationCountWithLoad() {using
(var
context =new
BreakAwayContext
()) { context.Destinations.Load();var
count = context.Destinations.Local.Count;Console
.WriteLine("Destinations in memory: {0}"
, count); } }
Compare this code with the GetLocalDestinationCount
method we wrote back
in Example 2-18. This
updated code makes it much clearer that our intent is to load the
contents of the Destinations
set and
then query the in-memory data.
Load
is actually an extension
method on IQueryable<T>
and
is defined in the System.Data.Entity
namespace. If you want to
use Load
, you will need to have
this namespace imported.
Because Load
is an extension
method on IQueryable<T>
, we can
also use it to load the results of a LINQ query into memory, rather than
the entire contents of a set. For example, let’s say we only wanted to
load Australian Destination
s into
memory and then run a few local queries on that subset of data. Let’s
add the LoadAustralianDestinations
method shown in Example 2-20.
private static void
LoadAustralianDestinations() {using
(var
context =new
BreakAwayContext
()) {var
query =from
din
context.Destinationswhere
d.Country =="Australia"
select
d; query.Load();var
count = context.Destinations.Local.Count;Console
.WriteLine("Aussie destinations in memory: {0}"
, count); } }
This time just the Destination
s
with Country
set to Australia are
loaded into memory. When we run the application, we see that the count
we get from Local
is reduced to
reflect this.
Using Load
on a LINQ query
will bring the results of that query into memory but it does not
remove the results of previous queries. For example if you called
Load
on a query for Australian destinations and
then Load
on a query for American
destinations, both Australian and American destinations would be in
memory and would be returned from Local
.
So far we have just looked at getting the count from Local
to make sure that it is returning the
correct data that we brought into memory. Because Local
is just a collection of in-memory
objects, we can also run queries against it. One of the great things
about LINQ is that it’s not specific to Entity Framework. We can use the
same LINQ syntax to query a number of different data sources, including
in-memory collections of objects.
Let’s add a LocalLinqQueries
method that pulls data into memory using a single database query and
then runs some in-memory queries using Local
(Example 2-21).
private static void
LocalLinqQueries() {using
(var
context =new
BreakAwayContext
()) { context.Destinations.Load();var
sortedDestinations =from
din
context.Destinations.Localorderby
d.Nameselect
d;Console
.WriteLine("All Destinations:"
);foreach
(var
destinationin
sortedDestinations) {Console
.WriteLine(destination.Name); }var
aussieDestinations =from
din
context.Destinations.Localwhere
d.Country =="Australia"
select
d;Console
.WriteLine();Console
.WriteLine("Australian Destinations:"
);foreach
(var
destinationin
aussieDestinations) {Console
.WriteLine(destination.Name); } } }
The code loads all Destination
s
into memory and then runs one query to sort them by Name
and another to pull out just the
Australian Destination
s. Remember
that Find
also defaults to using
in-memory data where possible. So we could also use Find
and it would use the data we loaded
rather than sending more queries to the database.
While Load
and Local
are great if you want to reduce the
number of queries that get run against the database just remember that
pulling all your data into memory may be an expensive operation. If you
are running multiple queries that only return a subset of your data
you’ll probably get better performance by letting these queries hit the
database and just pull back the data you actually need.
If you’ve looked at the API closely you may have noticed that
Local
returns an ObservableCollection<TEntity>
. This type
of collection allows subscribers to be notified whenever objects are
added or removed from the collection. ObservableCollection
is useful in a number of
data-binding scenarios, but it can also be useful if your application
needs to know when new data comes into memory.
Local
will raise the CollectionChanged
event whenever the contents
of Local
change. This can be when
data is brought back from that database via a query, when new objects
are added to the DbContext
, or when
objects previously brought into memory are marked for deletion.
Let’s add a ListenToLocalChanges
method that uses this
functionality to log any changes to Destinations.Local
to the console (Example 2-22).
private static void
ListenToLocalChanges() {using
(var
context =new
BreakAwayContext
()) { context.Destinations.Local .CollectionChanged += (sender, args) => {if
(args.NewItems !=null
) {foreach
(Destination
itemin
args.NewItems) {Console
.WriteLine("Added: "
+ item.Name); } }if
(args.OldItems !=null
) {foreach
(Destination
itemin
args.OldItems) {Console
.WriteLine("Removed: "
+ item.Name); } } }; context.Destinations.Load(); } }
The code adds a new event handler to the Local
collection of Destination
s. This handler looks at items
entering or leaving the collection and prints out the name of the
affected Destination
and indicates if
it is being added or removed. Once the event handler is in place, we use
Load
to pull all the data from the
database into memory. If you run the application, you can see the output
appearing as items are returned from the database:
Added: Grand Canyon Added: Hawaii Added: Wine Glass Bay Added: Great Barrier Reef
These events could be handy if you have a screen that needs to be
refreshed whenever some data in your context changes. For example, you
might have a screen that displays all Destination
s and another screen where the user
can add a new Destination
. You could
wire up the screen displaying all Destination
s to listen to the CollectionChanged
event and refresh whenever
anything is added or removed.
Some UI frameworks, such as WPF, will take care of this for you so
that you don’t have to write code to listen to changes. If you bind a
WPF ListBox
to the contents of
Local
, whenever any other area of the
application adds or removes an entity from the DbSet
, the ListBox
will be updated to reflect those
changes.
If you use LINQ to query the contents of Local
, the result of the query is no longer
an ObservableCollection
. This means
if you run a LINQ query against Local
and bind the results to a WPF ListBox
, it will no longer get automatically
updated for you when entities are added or removed. You would need to
write code that listens to OnCollectionChanged
on DbSet.Local
and rerun the query to refresh
the ListBox
.
So far we have looked at accessing data for a single type of entity
and everything has been about Destinations
. But if we were writing a real
application, we would probably want to know something about the Lodging
that is available at each Destination
. If we want to access the Lodgings
associated with a Destination
, that means working with related
data.
You’ll need to pull related data into memory so that we can look at it. There are three approaches you can use to load related data: lazily, eagerly, or explicitly. While they may achieve the same end result, there are some differences between each approach that can have a significant impact on performance. This isn’t a one-time decision either. Different approaches may be better at different times. This section will walk through the three available options and help you work out which one is best for you in different situations.
The “Demystifying Entity Framework Strategies: Loading Related Data” MSDN article gives a detailed look at the pros and cons of the different strategies and some pointers on choosing the right strategy for you.
Lazy loading related data is the most transparent to your
application and involves letting Entity Framework automatically retrieve
the related data for you when you try to access it. For example, you may
have the Grand Canyon destination loaded. If you then use the Lodgings
property of this Destination
, Entity Framework will
automatically send a query to the database to load all Lodgings
at the Grand Canyon. It will appear
to your application code as if the Lodgings
property was always populated.
Entity Framework achieves lazy loading using a dynamic proxy. Here’s how that works. When Entity Framework returns the results of a query, it creates instances of your classes and populates them with the data that was returned from the database. Entity Framework has the ability to dynamically create a new type at runtime that derives from your POCO class. This new class acts as a proxy to your POCO class and is referred to as a dynamic proxy. It will override the navigation properties of your POCO class and include some additional logic to retrieve the data from the database when the property is accessed. Because the dynamic proxy derives from your POCO class, your application can be written in terms of the POCO class and doesn’t need to be aware that there may be a dynamic proxy at runtime.
DbContext
has a configuration
setting that enables lazy loading: DbContext.Configuration.LazyLoadingEnabled
.
This setting is true
by default and
therefore if you have not changed the default, the dynamic proxy will
perform lazy loading.
In order to use dynamic proxies, and therefore lazy loading, there are a couple of criteria your class must meet. If these criteria are not met, Entity Framework will not create a dynamic proxy for the class and will just return instances of your POCO class, which cannot perform lazy loading:
Your POCO class must be public and not sealed.
The navigation properties that you want to be lazy loaded must
also be marked as virtual
(Overridable
in Visual Basic) so
that Entity Framework can override the properties to include the
lazy loading logic.
Before we make any changes to our classes, let’s see what the
behavior is like without dynamic proxies. Add a TestLazyLoading
method that attempts to access
the Lodgings
associated with a
specific Destination
(Example 2-23).
private static void
TestLazyLoading() {using
(var
context =new
BreakAwayContext
()) {var
query =from
din
context.Destinationswhere
d.Name =="Grand Canyon"
select
d;var
canyon = query.Single();Console
.WriteLine("Grand Canyon Lodging:"
);if
(canyon.Lodgings !=null
) {foreach
(var
lodgingin
canyon.Lodgings) {Console
.WriteLine(lodging.Name); } } } }
The code locates the Grand Canyon Destination
and then tests if the Lodgings
property is populated. If it is
populated, the name of each associated Lodging
is printed to the console. If you
update the Main
method to call
TestLazyLoading
and run the
application, you will see that nothing is printed out to the console.
This is because the Lodgings
property
on Destination
isn’t marked as
virtual
(Overridable
in Visual Basic), so Entity
Framework can’t override the property in a dynamic proxy. Entity
Framework is forced to use your implementation of the property (that
doesn’t perform lazy loading) rather than replacing it with an
implementation that includes the lazy loading logic. Let’s go ahead and
edit the Destination
class so that
the property is marked as virtual
:
public virtual
List
<Lodging
> Lodgings {get
;set
; }
Now Entity Framework can create a dynamic proxy for the Destination
class. If you run the application
again, you’ll see that the individual Lodgings
for the Grand Canyon are displayed
because the data was automatically loaded for you when the code
encountered the first request for Lodgings
:
Grand Canyon Lodging: Grand Hotel Dave's Dump
As the code executed, Entity Framework sent two queries to the
database (Figure 2-1). The first query
retrieves the data for the Grand Canyon Destination
and was executed when the code
called the Single
method on query
. Remember that the Single
method uses a SELECT TOP
(2)
query to ensure there is one result and only one result.
The second query selects all Lodgings
associated with the Grand Canyon. This query was sent at the moment the
code first tried to access the Lodgings
property for the Grand Canyon
Destination
.
Lazy loading is very simple because your application doesn’t
really need to be aware that data is being loaded from the database.
But that is also one of its dangers! Improper use of lazy loading can
result in a lot of queries being sent to the database. For example,
you might load fifty Destination
s
and then access the Lodgings
property on each. That would result in 51 queries against the
database—one query to get the Destinations
and then for each of the fifty
Destinations
, to load that Destination
’s Lodgings
. In cases like this it may be much
more efficient to load all that data in a single query, using a SQL
join in the database query. This is where eager loading comes into
play.
If you decide that lazy loading is just too much magic, you
can choose to disable it altogether by using the DbContext.Configuration.LazyLoadingEnabled
property. If this switch is set to false
, lazy
loading will never occur, even if a navigation property is marked as
virtual
.
Eager loading related data relies on you telling Entity Framework
what related data to include when you query for an entity type. Entity
Framework will then use a JOIN
in the
generated SQL to pull back all of the data in a single query. Let’s
assume we want to run though all Destinations
and print out the Lodgings
for each. Add a TestEagerLoading
method that queries for all
Destinations
and uses Include
to also query for the associated
Lodgings
(Example 2-24).
private static void
TestEagerLoading() {using
(var
context =new
BreakAwayContext
()) {var
allDestinations = context .Destinations .Include(d => d.Lodgings);foreach
(var
destinationin
allDestinations) {Console
.WriteLine(destination.Name);foreach
(var
lodgingin
destination.Lodgings) {Console
.WriteLine(" - "
+ lodging.Name); } } } }
The code uses the Include
method to indicate that the query for all destinations should include
the related Lodging
data. Include
uses a lambda expression to specify
which properties to include the data for. When the application runs, we
see a single query is executed against the database (Figure 2-2). This query uses
a join to return the Destination
and
Lodging
data as a single result set.
There is also a string-based overload of Include
that just accepts the name of the
property to include data for (Include(“Lodgings”)
in our case). Previous
versions of Entity Framework only included this string option. The
string-based overload is problematic because it’s not strongly typed and
therefore there is no compile-time checking of the parameter. This can
lead to issues with mistyped property names or failing to update the
Include call if the property is renamed in the future.
The lambda version of the Include
method is defined as an extension
method in System.Data.Entity
. To
use the lambda overload you will need to import this namespace.
It is possible to include more than one related set of data in a
single query. Say we wanted to query for Lodgings
and include the PrimaryContact
plus the associated Photo
. We do this by “dotting through” the
navigation properties in the lambda expression:
context.Lodgings .Include(l => l.PrimaryContact.Photo)
The syntax gets a little more complicated if you have a collection
navigation property in the middle of the path to be included. What if
you want to query for Destination
s
and include Lodgings
and also the
PrimaryContact
for each of the
related Lodging
instances? Following
the collection, you need to use the LINQ Select
method to identify which property you
want to load:
context.Destinations .Include(d => d.Lodgings.Select(l => l.PrimaryContact))
Include
can be used multiple
times in the same query to identify different data to be loaded. For
example, you may want to query the Lodgings
set and include both PrimaryContact
and SecondaryContact
. This requires two separate
calls to Include
:
context.Lodgings .Include(l => l.PrimaryContact) .Include(l => l.SecondaryContact)
Eager loading is currently only able to include the entire contents of a navigation property. The ability to only include a subset of the contents of a collection navigation property is a common request, but it is not currently supported by the Entity Framework.
One thing to bear in mind with eager loading is that fewer queries aren’t always better. The reduction in the number of queries comes at the expense of the simplicity of the queries being executed. As you include more and more data, the number of joins in the query that is sent to the database increases and results in a slower and more complex query. If you need a significant amount of related data, multiple simpler queries will often be significantly faster than one big query that returns all the data.
You can also use Include
as
part of a LINQ query by adding the Include
method to the DbSet
being queried. If you are using query
syntax, the Include
goes in the
from
part of the query:
var
query =from
din
context.Destinations.Include(d => d.Lodgings)where
d.Country =="Australia"
select
d;
If you are using method syntax, you can simply put Include in line with the other method calls:
var
query = context.Destinations .Include(d => d.Lodgings) .Where(d => d.Country =="Australia"
);
Include
is defined as an
extension method on IQueryable<T>
and can therefore be
added to a query at any point. It doesn’t have to immediately follow
the DbSet
from which you are
selecting. For example, you can call Include on an existing query for
Australian Destinations
to specify
that Lodgings
should also be
included:
var
query =from
din
context.Destinationswhere
d.Country =="Australia"
select
d; query = query.Include(d => d.Lodgings);
Note that the code doesn’t just call Include
on the existing query but overrides
the query
variable with the result
of the Include
call. This is
necessary because Include
doesn’t
modify the query that it is called on, it returns a new query that
will include the related data. Remember that Entity Framework doesn’t
execute any queries until the code uses the results of the query. The
above code doesn’t use the results of the query, so nothing will be
executed against the database until some other code accesses the
Destinations
from the query
variable.
Although Include
is defined
as an extension method on IQueryable<T>
it will only have an
effect when used on a LINQ to Entities query. If another LINQ
provider is being used, Include
will have no
effect unless the implementation of IQueryable<T>
exposes an Include
method that accepts a single
string parameter. If this method exists, it will be called with a
string representing the property path that was specified to be
included.
Another loading option is explicit loading. Explicit loading is like lazy loading in that related data is loaded separately, after the main data has been loaded. However, unlike lazy loading, it doesn’t automatically happen for you; you need to call a method to load the data.
There are a number of reasons you might opt for explicit loading over lazy loading:
It removes the need to mark your navigation properties as virtual. To some this may seem like a trivial change, for others, the fact that a data access technology requires you to change your POCO classes is far from ideal.
You may be working with an existing class library where the navigation properties are not marked as virtual and you simply can’t change that.
Explicit loading allows you to be sure that you know exactly when queries are sent to the database. Lazy loading has the potential to generate a lot of queries; with explicit loading it is very obvious when and where queries are being run.
Explicit loading is achieved using the DbContext.Entry
method. The Entry
method gives you access to all the
information that the DbContext
has
about an entity. This goes beyond the values that are stored in the
properties of the actual entity and includes things such as the state of
the entity and the original values for each property when it was
retrieved from the database. You’ll see a lot more about this
information in Chapters 4 and 5. In
addition to information about the entity, the Entry
method also gives you access to some
operations you can perform on the entity, including loading data for
navigation properties.
Once we have the entry for a given entity we can use the Collection
and Reference
methods to drill into the
information and operations for navigation properties. One of the
operations available is the Load
method, which will send a query to the database to load the contents of
the navigation property.
Let’s take another look at loading the Lodgings
available at the Grand Canyon. This
time let’s add a TestExplicitLoading
method that uses the Entry
method to
load the data (Example 2-25).
private static void
TestExplicitLoading() {using
(var
context =new
BreakAwayContext
()) {var
query =from
din
context.Destinationswhere
d.Name =="Grand Canyon"
select
d;var
canyon = query.Single(); context.Entry(canyon) .Collection(d => d.Lodgings) .Load();Console
.WriteLine("Grand Canyon Lodging:"
);foreach
(var
lodgingin
canyon.Lodgings) {Console
.WriteLine(lodging.Name); } } }
The first part of the code should be familiar—it uses a LINQ query
to locate the Grand Canyon Destination
. The code then calls the Entry
method, passing in the canyon
object. From there the Collection
method is used to drill into the
Lodgings
navigation property.
Collection
and Reference
use a lambda expression to specify
the property to drill into. There are also string-based alternatives to
these methods, but the lambda version ensures we get compile-time
checking of the parameter. Finally, the Load
method is used to query for the related
data and bring it into memory.
If you update the Main
method
to call TestExplicitLoading
and then
run the application, you will see two queries run against the database
(Figure 2-3). The first
one runs when the code requests the single result of the query for the
Grand Canyon, by calling Single
on
query. The second query is asking for all Lodging
at the Grand Canyon and runs as a
result of the call to Load
.
You’ve seen that explicit loading can be used to load the entire contents of a collection navigation property but it can also be used to load just some of the contents, based on a LINQ query. You’ll see this in Explicit Loading a Subset of the Contents of a Navigation Property.
Explicit loading of a reference navigation property looks very
similar, except you use the Reference
method rather than Collection
. For
example, if you wanted to load the PrimaryContact
of some lodging, you could
write this:
var
lodging = context.Lodgings.First();
context.Entry(lodging)
.Reference(l => l.PrimaryContact)
.Load();
The Reference
and Collection
methods also give you access to the
IsLoaded
property. The IsLoaded
method will tell you whether the
entire contents of the navigation property have been loaded from the
database or not. The IsLoaded
property will be set to true when lazy, eager, or explicit loading is
used to load the contents of the navigation property. Add the TestIsLoaded
method shown in Example 2-26.
private static void
TestIsLoaded() {using
(var
context =new
BreakAwayContext
()) {var
canyon = (from
din
context.Destinationswhere
d.Name =="Grand Canyon"
select
d).Single();var
entry = context.Entry(canyon);Console
.WriteLine("Before Load: {0}"
, entry.Collection(d => d.Lodgings).IsLoaded); entry.Collection(d => d.Lodgings).Load();Console
.WriteLine("After Load: {0}"
, entry.Collection(d => d.Lodgings).IsLoaded); } }
The code uses a LINQ query to load the Grand Canyon Destination
from the database. The value
assigned to the IsLoaded
property for
the Lodgings
property is then printed
out to the console. Explicit loading is used to load the contents of the
Lodgings
property and the value of
IsLoaded
is printed to the console
again. If you update the Main
method
to call TestIsLoaded
and then run the
application, you will see that the value of IsLoaded
is set to true after the explicit
load is performed:
Before Load: False After Load: True
If you are performing an explicit load, and the contents of the
navigation property may have already been loaded, you can use the
IsLoaded
flag to determine if the
load is required or not.
So far you’ve looked at loading the entire contents of a collection navigation property so that you can work with the data in memory. If you wanted to filter the contents of a navigation property you could do this after you’d brought everything into memory, using LINQ to Objects. However, if you are only interested in a subset of the contents, it may make sense to just bring the bits you are interested in into memory. Or if you just want a count, or some other calculation, it may make sense just to calculate the result in the database and not bring any of the data into memory.
Once you’ve used Entry
and
Collection
to drill into a collection
navigation property, you can then use the Query
method to get a LINQ query representing
the contents of that property. Because it’s a LINQ query, you can then do
further filtering, sorting, aggregation, and the like.
Assume you wanted to find all Lodging
s at the Grand Canyon that are less than
ten miles from the nearest airport. You could just use LINQ to query the
contents of the Lodgings
property of
the Grand Canyon, something like Example 2-27.
private static void
QueryLodgingDistance() {using
(var
context =new
BreakAwayContext
()) {var
canyonQuery =from
din
context.Destinationswhere
d.Name =="Grand Canyon"
select
d;var
canyon = canyonQuery.Single();var
distanceQuery =from
lin
canyon.Lodgingswhere
l.MilesFromNearestAirport <= 10select
l;foreach
(var
lodgingin
distanceQuery) {Console
.WriteLine(lodging.Name); } } }
The problem with this code is that distanceQuery
is using LINQ to Objects to query
the contents of the Lodgings
navigation
property. This will cause the property to be lazy loaded, pulling the
entire contents into memory. The code then immediately filters out some of
the data, meaning there was no need to pull it into memory. Let’s rewrite
the QueryLodgingDistance
method from
Example 2-27 to use
Query
, as shown in Example 2-28.
private static void
QueryLodgingDistance() {using
(var
context =new
BreakAwayContext
()) {var
canyonQuery =from
din
context.Destinationswhere
d.Name =="Grand Canyon"
select
d;var
canyon = canyonQuery.Single();var
lodgingQuery = context.Entry(canyon) .Collection(d => d.Lodgings) .Query();var
distanceQuery =from
lin
lodgingQuerywhere
l.MilesFromNearestAirport <= 10select
l;foreach
(var
lodgingin
distanceQuery) {Console
.WriteLine(lodging.Name); } } }
This updated code uses the Query
method to create a LINQ to Entities query for the Lodgings
associated with the Grand Canyon. It
then composes on that query to ask for just the Lodgings
that are within ten miles of an
airport. When iterating over this query, Entity Framework takes care of
the translation to SQL and performs the filter on MilesFromNearestAirport
in the database. This
means that only the data you care about is brought back into
memory.
Perhaps you want to know how many Lodgings
are available at the Grand Canyon. You
could load all the Lodgings
and get a
count, but why bring all that data into memory just to get a single
integer result? Add a QueryLodgingCount
method that uses Query
to get the count
without loading the data (Example 2-29).
private static void
QueryLodgingCount() {using
(var
context =new
BreakAwayContext
()) {var
canyonQuery =from
din
context.Destinationswhere
d.Name =="Grand Canyon"
select
d;var
canyon = canyonQuery.Single();var
lodgingQuery = context.Entry(canyon) .Collection(d => d.Lodgings) .Query();var
lodgingCount = lodgingQuery.Count();Console
.WriteLine("Lodging at Grand Canyon: "
+ lodgingCount); } }
The code loads the Grand Canyon destination and then uses Entry
and Collection
to drill into the Lodgings
navigation property. From there it uses
the Query
method to get a query
representing the contents of the navigation property. It then uses the
LINQ Count
method to materialize just
the count of the results of the query. Because it is using the LINQ to
Entities provider, it recognizes that you want the count and pushes the
entire query to the database so that only the single integer result is
returned from the database. If you update the Main
method to call QueryLodgingCount
and run the application you
will see the count correctly displayed:
Lodging at Grand Canyon: 2
You can combine the Query
and
Load
methods to perform a
filtered explicit load. That’s an explicit load
that only loads a subset of the contents of a navigation property. For
example, you may want to just load the Lodgings
at the Grand Canyon that contain the
word “Hotel” in their Name
:
context.Entry(canyon)
.Collection(d => d.Lodgings)
.Query()
.Where(l => l.Name.Contains("Hotel"
))
.Load();
It’s important to remember that calling Load
will not clear any objects that are already in the navigation property.
So if you loaded Lodgings at the Grand Canyon that contain the word
“Hotel” and then also loaded Lodgings that contain the word “Campsite”,
the Lodgings navigation property will contain both hotels and
campsites.