Chapter 12. XML

XML (the eXtensible Markup Language) provides an industry-standard method for encoding structured information. It defines syntactic and structural rules that enable software applications to process XML files even when they don’t understand all of the data.

XML specifications are defined and maintained by the World Wide Web Consortium (W3C). The latest version is XML 1.1 (Second Edition). However, XML 1.0 (currently in its fifth edition) is the most popular version, and is supported by all XML parsers. W3C states that:

You are encouraged to create or generate XML 1.0 documents if you do not need the new features in XML 1.1; XML Parsers are expected to understand both XML 1.0 and XML 1.1 (see http://www.w3.org/xml/core/#publications/).

This chapter will introduce XML 1.0 only, and in fact, will focus on just the most commonly used XML features. We’ll introduce you to the XDocument and XElement classes first, and you’ll learn how to create and manipulate XML documents.

Of course, once you have a large document, you’ll want to be able to find substrings, and we’ll show you two different ways to do that, using LINQ. The .NET Framework also allows you to serialize your objects as XML, and deserialize them at their destination. We’ll cover those methods at the end of the chapter.

XML Basics (A Quick Review)

XML is a markup language, not unlike HTML, except that it is extensible—that is, applications that use XML can (and do) create new kinds of elements and attributes.

Elements

In XML, a document is a hierarchy of elements. An element is typically defined by a pair of tags, called the start and end tags. In the following example, FirstName is an element:

<FirstName>Orlando</FirstName>

A start tag contains the element name surrounded by a pair of angle brackets:

<FirstName>

An end tag is similar, except that the element name is preceded by a forward slash:

</FirstName>

An element may contain content between its start and end tags. In this example, the element contains text, but content can also contain child elements. For example, this Customer element has three child elements:

  <Customer>
    <FirstName>Orlando</FirstName>
    <LastName>Gee</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>

The top-level element in an XML document is called its root element. Every document has exactly one root element.

An element does not have to contain content, but every element (except for the root element) has exactly one parent element. Elements with the same parent element are called sibling elements.

In this example, Customers (plural) is the root. The children of the root element, Customers, are the five Customer (singular) elements:

<Customers>
  <Customer>
    <FirstName>Orlando</FirstName>
    <LastName>Gee</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Keith</FirstName>
    <LastName>Harris</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Donna</FirstName>
    <LastName>Carreras</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Janet</FirstName>
    <LastName>Gates</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Lucy</FirstName>
    <LastName>Harrington</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
</Customers>

Each Customer has one parent (Customers) and three children (FirstName, LastName, and EmailAddress). Each of these, in turn, has one parent (Customer) and zero children.
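The parent and child relationships described above can be checked in code. The following is a minimal sketch that uses the XDocument and XElement classes introduced later in this chapter; the HierarchyDemo class is invented for illustration, the document is cut down to two customers, and the example.com email addresses are placeholders rather than data from the chapter.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class HierarchyDemo
{
    // Builds a two-customer document (placeholder email addresses).
    public static XDocument BuildSample()
    {
        return XDocument.Parse(
            "<Customers>" +
            "<Customer><FirstName>Orlando</FirstName><LastName>Gee</LastName>" +
            "<EmailAddress>orlando@example.com</EmailAddress></Customer>" +
            "<Customer><FirstName>Keith</FirstName><LastName>Harris</LastName>" +
            "<EmailAddress>keith@example.com</EmailAddress></Customer>" +
            "</Customers>");
    }

    public static void Main()
    {
        XDocument doc = BuildSample();

        // The root element has no parent element.
        Console.WriteLine("Root: {0}, has parent element: {1}",
            doc.Root.Name, doc.Root.Parent != null);

        // Each Customer has the root as its parent and three children.
        foreach (XElement customer in doc.Root.Elements())
        {
            Console.WriteLine("{0}: parent = {1}, children = {2}",
                customer.Name, customer.Parent.Name,
                customer.Elements().Count());
        }
    }
}
```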

When an element has no content—no child elements and no text—you can optionally use a more compact representation, where you write just a single tag, with a slash just before the closing angle bracket. For example, this:

<Customers/>

means exactly the same as this:

<Customers></Customers>

This empty element tag syntax is the only syntax in which an element is represented by just a single tag. Unless you are using this form, it is illegal to omit the closing tag.

XHTML

XHTML is a reformulation of HTML that follows the stricter syntax rules of XML. The two most important XML rules that make XHTML different from plain HTML follow:

  • No elements may overlap, though they may nest. So this is legal, because the elements are nested:

    <element1>
       <element2>
          ...
       </element2>
    </element1>

    You may not write:

    <element1>
       <element2>
          ...
       </element1>
    </element2>

    because in the latter case, element2 overlaps element1 rather than being neatly nested within it. (Ordinary HTML allows this.)

  • Every element must be closed, which means that for each opened element, you must have a closing tag (or the element tag must be self-closing). So while plain old HTML permits:

     <br>

    in XHTML we must either write this:

    <br></br>

    or use the empty element tag form:

    <br />
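One way to see these two rules enforced is to hand some text to an XML parser. The sketch below uses XDocument.Parse, which is covered later in this chapter and throws an XmlException when the input is not well formed. The WellFormednessDemo class and its IsWellFormed helper are invented for this illustration.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class WellFormednessDemo
{
    // Returns true if the text parses as well-formed XML.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Properly nested elements: prints True.
        Console.WriteLine(IsWellFormed(
            "<element1><element2></element2></element1>"));

        // Overlapping elements: prints False.
        Console.WriteLine(IsWellFormed(
            "<element1><element2></element1></element2>"));

        // An element that is never closed: prints False.
        Console.WriteLine(IsWellFormed("<p>line one<br>line two</p>"));

        // The same text using the empty element tag form: prints True.
        Console.WriteLine(IsWellFormed("<p>line one<br />line two</p>"));
    }
}
```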

X Stands for eXtensible

The key point of XML is to provide an extensible markup language. Here’s an incredibly short pop-history lesson: HTML was derived from the Standard Generalized Markup Language (SGML). HTML has many wonderful attributes (if you’ll pardon the pun), but if you want to add a new element to HTML, you have two choices: apply to the W3C and wait, or strike out on your own and be “nonstandard.”

There was a strong need for the ability for two organizations to get together and specify tags that they could use for data exchange. Hey! Presto! XML was born as a more general-purpose markup language that allows users to define their own tags. This is the critical distinction of XML.

Creating XML Documents

Because XML documents are structured text documents, you can create them using a text editor and process them using string manipulation functions. To paraphrase David Platt, you can also have an appendectomy through your mouth, but it takes longer and hurts more.

To make the job easier, .NET provides several sets of classes for working with XML. The streaming XML APIs (the XmlReader and XmlWriter classes) never attempt to hold the whole document in memory. You work with one element at a time, which lets you handle very large documents without using much memory, but can be tricky to code for. So there are simpler APIs that let you build an object model that represents an XML document. Even here, you have a choice. One set of XML APIs is based on the XML Document Object Model (DOM), a standard API implemented in many programming systems, not just .NET. However, the DOM is surprisingly cumbersome to work with, so .NET 3.5 introduced a set of APIs that are easier to use from .NET. These are designed to work well with LINQ, and so they’re often referred to as LINQ to XML. These are now the preferred XML API if you don’t need streaming. (Silverlight doesn’t even offer the XML DOM APIs, so LINQ to XML is your only nonstreaming option there.)
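To give a feel for the streaming style, here is a minimal sketch that writes a small document with XmlWriter, one start tag and end tag at a time. The StreamingWriterDemo class, the WriteCustomers helper, and the cut-down data are invented for this illustration; the rest of the chapter uses LINQ to XML instead.

```csharp
using System;
using System.Text;
using System.Xml;

public static class StreamingWriterDemo
{
    // Writes one Customer element per name, without ever building
    // an in-memory model of the document.
    public static string WriteCustomers(string[] firstNames)
    {
        var output = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("Customers");
            foreach (string firstName in firstNames)
            {
                writer.WriteStartElement("Customer");
                writer.WriteElementString("FirstName", firstName);
                writer.WriteEndElement();   // closes Customer
            }
            writer.WriteEndElement();       // closes Customers
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WriteCustomers(new[] { "Orlando", "Keith" }));
    }
}
```

Notice that the code has to keep track of which element is currently open. That bookkeeping is the price of not holding the document in memory.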

Despite the name, it’s not strictly necessary to use LINQ when using the LINQ to XML classes—Example 12-1 uses this API to write a list of customers to an XML document.

Example 12-1. Creating an XML document

using System;
using System.Collections.Generic;
using System.Xml.Linq;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }
    }

    // Main program
    public class Tester
    {
        static void Main()
        {
            List<Customer> customers = CreateCustomerList();

            var customerXml = new XDocument();
            var rootElem = new XElement("Customers");
            customerXml.Add(rootElem);
            foreach (Customer customer in customers)
            {
                // Create new element representing the customer object.
                var customerElem = new XElement("Customer");

                // Add element representing the FirstName property
                // to the customer element.
                var firstNameElem = new XElement("FirstName",
                    customer.FirstName);
                customerElem.Add(firstNameElem);

                // Add element representing the LastName property
                // to the customer element.
                var lastNameElem = new XElement("LastName",
                    customer.LastName);
                customerElem.Add(lastNameElem);

                // Add element representing the EmailAddress property
                // to the customer element.
                var emailAddress = new XElement("EmailAddress",
                    customer.EmailAddress);
                customerElem.Add(emailAddress);

                // Finally add the customer element to the XML document
                rootElem.Add(customerElem);
            }

            Console.WriteLine(customerXml.ToString());
            Console.Read();
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList()
        {
            List<Customer> customers = new List<Customer>
                {
                    new Customer { FirstName = "Orlando",
                                   LastName = "Gee",
                                   EmailAddress = "[email protected]"},
                    new Customer { FirstName = "Keith",
                                   LastName = "Harris",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Donna",
                                   LastName = "Carreras",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Janet",
                                   LastName = "Gates",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Lucy",
                                   LastName = "Harrington",
                                   EmailAddress = "[email protected]" }
                };
            return customers;
        }
    }
}

The program will produce this output:

<Customers>
  <Customer>
    <FirstName>Orlando</FirstName>
    <LastName>Gee</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Keith</FirstName>
    <LastName>Harris</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Donna</FirstName>
    <LastName>Carreras</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Janet</FirstName>
    <LastName>Gates</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Lucy</FirstName>
    <LastName>Harrington</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
</Customers>

Note

As it happens, this example would have needed less code if we had used LINQ, but for this first example, we wanted to keep things simple. We’ll show the LINQ version shortly.

In .NET, the System.Xml.Linq namespace contains the LINQ to XML classes we can use to create and process XML documents.

The Customer class and the CreateCustomerList function in the main Tester class contain straightforward code to give us some data to work with, so we will not go over them. The main attraction in this example is the XML creation in the Main function. First, we create a new XML document object:

var customerXml = new XDocument();

Next, we create the root element and add it to the document:

var rootElem = new XElement("Customers");
customerXml.Add(rootElem);

After these two operations, the customerXml object represents an XML document containing an empty element, which might look either like this:

<Customers></Customers>

or like this:

<Customers />

LINQ to XML tends to use the empty element tag form where possible, so if you were to call ToString() on customerXml at this point, it would produce that second version.

Of course, you may already have an XML document, and you may want to turn that into an XDocument object. Example 12-2 shows how to load a string into a new XDocument.

Example 12-2. Loading XML from a string

XDocument doc = XDocument.Parse("<Customers><Customer /></Customers>");

There’s also a Load method, which has several overloads. You can pass in a URL, in which case it will fetch the XML from there and then parse it. You can also pass in a Stream or a TextReader, the abstract types from the System.IO namespace that represent a stream of bytes (such as a file), or a source of text (such as a file of some known character encoding).
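As a small illustration of the TextReader overload, the following sketch wraps a string in a StringReader and passes it to XDocument.Load. The LoadDemo class and LoadFromText helper are invented for this example; in real code the reader would more likely come from a file or a network stream.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class LoadDemo
{
    // Loads a document from any source of text via a TextReader.
    public static XDocument LoadFromText(string xml)
    {
        using (TextReader reader = new StringReader(xml))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        XDocument doc = LoadFromText("<Customers><Customer /></Customers>");

        // Prints the root element name, then the number of child elements.
        Console.WriteLine(doc.Root.Name);
        Console.WriteLine(doc.Root.Elements().Count());
    }
}
```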

XML Elements

With the root element in hand, you can add each customer as a child node:

foreach (Customer customer in customers)
{
    // Create new element representing the customer object.
    var customerElem = new XElement("Customer");

In this example, we make each property of the customer object a child element of the customer element:

    // Add element representing the FirstName property to the Customer element.
    var firstNameElem = new XElement("FirstName", customer.FirstName);
    customerElem.Add(firstNameElem);

This adds the FirstName child element. We’re passing the customer’s first name as the second constructor argument, which will make that the content of the element. The result will look like this:

<FirstName>Orlando</FirstName>

The other two properties, LastName and EmailAddress, are added to the customer element in exactly the same way. Here’s an example of the complete customer element:

<Customer>
  <FirstName>Orlando</FirstName>
  <LastName>Gee</LastName>
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Finally, the newly created customer element is added to the XML document as a child of the root element:

    // Finally add the customer element to the XML document
    rootElem.Add(customerElem);
}

Once all customer elements are created, this example prints the XML document:

Console.WriteLine(customerXml.ToString());

When you call ToString() on any of the LINQ to XML objects (whether they represent the whole document, as in this case, or just some fragment of a document such as an XElement), it produces the XML text, and it formats it with indentation, making it easy to read. There are ways to produce more compact representations—if you’re sending the XML across a network to another computer, size may be more important than readability. To see a terser representation, we could do this:

Console.WriteLine(customerXml.ToString(SaveOptions.DisableFormatting));

That will print the XML as one long line, with no line breaks or indentation between the elements.
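The difference is easy to see on a small element. This sketch builds a cut-down Customers element (the FormattingDemo class and its single-customer data are invented for illustration) and prints it both ways.

```csharp
using System;
using System.Xml.Linq;

public static class FormattingDemo
{
    // Builds a tiny document fragment to format.
    public static XElement BuildSample()
    {
        return new XElement("Customers",
            new XElement("Customer",
                new XElement("FirstName", "Orlando")));
    }

    public static void Main()
    {
        XElement sample = BuildSample();

        // Default: one element per line, indented.
        Console.WriteLine(sample.ToString());

        // Compact: prints
        // <Customers><Customer><FirstName>Orlando</FirstName></Customer></Customers>
        Console.WriteLine(sample.ToString(SaveOptions.DisableFormatting));
    }
}
```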

XML Attributes

An XML element may have a set of attributes, which store additional information about the element. An attribute is a key/value pair contained in the start tag of an XML element:

<Customer FirstName="Orlando" LastName="Gee"></Customer>

If you’re using an empty element tag, the attributes appear in the one and only tag:

<Customer FirstName="Orlando" LastName="Gee" />

The next example demonstrates how you can mix the use of child elements and attributes. It creates customer elements with the customer’s name stored in attributes and the email address stored as a child element:

<Customer FirstName="Orlando" LastName="Gee">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

The only difference between this and Example 12-1 is that we create XAttribute objects for the FirstName and LastName properties instead of XElement objects:

// Add an attribute representing the FirstName property
// to the customer element.
var firstNameAttr = new XAttribute("FirstName", customer.FirstName);
customerElem.Add(firstNameAttr);
// Add an attribute representing the LastName property
// to the customer element.
var lastNameAttr = new XAttribute("LastName", customer.LastName);
customerElem.Add(lastNameAttr);

As with elements, we just add the attribute to the parent element. Example 12-3 shows the complete sample code and output.

Example 12-3. Creating an XML document containing elements and attributes

using System;
using System.Collections.Generic;
using System.Xml.Linq;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }
    }

    // Main program
    public class Tester
    {
        static void Main()
        {
            List<Customer> customers = CreateCustomerList();

            var customerXml = new XDocument();
            var rootElem = new XElement("Customers");
            customerXml.Add(rootElem);
            foreach (Customer customer in customers)
            {
                // Create new element representing the customer object.
                var customerElem = new XElement("Customer");

                // Add an attribute representing the FirstName property
                // to the customer element.
                var firstNameAttr = new XAttribute("FirstName",
                                      customer.FirstName);
                customerElem.Add(firstNameAttr);


                // Add an attribute representing the LastName property
                // to the customer element.
                var lastNameAttr = new XAttribute("LastName",
                                     customer.LastName);
                customerElem.Add(lastNameAttr);

                // Add element representing the EmailAddress property
                // to the customer element.
                var emailAddress = new XElement("EmailAddress",
                    customer.EmailAddress);
                customerElem.Add(emailAddress);

                // Finally add the customer element to the XML document
                rootElem.Add(customerElem);
            }

            Console.WriteLine(customerXml.ToString());
            Console.Read();
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList()
        {
            List<Customer> customers = new List<Customer>
                {
                    new Customer { FirstName = "Orlando",
                                   LastName = "Gee",
                                   EmailAddress = "[email protected]"},
                    new Customer { FirstName = "Keith",
                                   LastName = "Harris",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Donna",
                                   LastName = "Carreras",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Janet",
                                   LastName = "Gates",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Lucy",
                                   LastName = "Harrington",
                                   EmailAddress = "[email protected]" }
                };
            return customers;
        }
    }
}

Output:
<Customers>
  <Customer FirstName="Orlando" LastName="Gee">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Keith" LastName="Harris">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Donna" LastName="Carreras">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Janet" LastName="Gates">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Lucy" LastName="Harrington">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
</Customers>

While it’s often convenient to be able to create and add elements and attributes one step at a time, these classes offer constructors that allow us to do more work in a single step. If we know exactly what we want to put in an element, this can lead to neater looking code. For example, we can replace the foreach loop with the code in Example 12-4.

Example 12-4. Constructing an XElement all at once

    foreach (Customer customer in customers)
    {
        // Create new element representing the customer object.
        var customerElem = new XElement("Customer",
            new XAttribute("FirstName", customer.FirstName),
            new XAttribute("LastName", customer.LastName),
            new XElement("EmailAddress", customer.EmailAddress)
            );

        // Finally add the customer element to the XML document
        rootElem.Add(customerElem);
    }

The only difference is that we’re passing all the XAttribute and XElement objects to the containing XElement constructor, rather than passing them to Add one at a time. As well as being more compact, it’s pretty easy to see how this code relates to the structure of the XML element being produced. We can also use this technique in conjunction with LINQ.

Putting the LINQ in LINQ to XML

We’ve seen several examples that construct an XElement, passing the name as the first argument, and the content as the second. We’ve passed strings, child elements, and attributes, but we can also provide an implementation of IEnumerable<T>. So if we add a using System.Linq; directive to the top of our file, we could use a LINQ query as the second constructor argument as Example 12-5 shows.

Example 12-5. Generating XML elements with LINQ

var customerXml = new XDocument(new XElement("Customers",
    from customer in customers
    select new XElement("Customer",
        new XAttribute("FirstName", customer.FirstName),
        new XAttribute("LastName", customer.LastName),
        new XElement("EmailAddress", customer.EmailAddress)
        )));

This generates the whole of the XML document in a single statement. So the work that took 25 lines of code in Example 12-1 comes down to just seven. Example 12-6 shows the whole example, with its much simplified Main method.

Example 12-6. Building XML with LINQ

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }
    }

    // Main program
    public class Tester
    {
        static void Main()
        {
            List<Customer> customers = CreateCustomerList();

            var customerXml = new XDocument(new XElement("Customers",
                from customer in customers
                select new XElement("Customer",
                    new XAttribute("FirstName", customer.FirstName),
                    new XAttribute("LastName", customer.LastName),
                    new XElement("EmailAddress", customer.EmailAddress)
                    )));

            Console.WriteLine(customerXml.ToString());
            Console.Read();
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList()
        {
            List<Customer> customers = new List<Customer>
                {
                    new Customer { FirstName = "Orlando",
                                   LastName = "Gee",
                                   EmailAddress = "[email protected]"},
                    new Customer { FirstName = "Keith",
                                   LastName = "Harris",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Donna",
                                   LastName = "Carreras",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Janet",
                                   LastName = "Gates",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Lucy",
                                   LastName = "Harrington",
                                   EmailAddress = "[email protected]" }
                };
            return customers;
        }
    }
}

We’re not really doing anything special here—this LINQ query is just relying on plain old LINQ to Objects—the same techniques we already saw in Chapter 8. But this is only half the story. LINQ to XML is not just about creating XML. It also supports reading XML.

Being able to create XML documents to store data to be processed or exchanged is great, but it would not be of much use if you could not find information in them easily. LINQ to XML lets you use the standard LINQ operators to search for information in XML documents.

Searching in XML with LINQ

We’ll need an example document to search through. Here’s the document from Example 12-3, reproduced here for convenience:

<Customers>
  <Customer FirstName="Orlando" LastName="Gee">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Keith" LastName="Harris">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Donna" LastName="Carreras">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Janet" LastName="Gates">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Lucy" LastName="Harrington">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
</Customers>

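Example 12-7 lists the code for the example. Note that it builds a document with the same structure, but with a different set of sample customers.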

Example 12-7. Searching an XML document using LINQ

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace Programming_CSharp
{
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }
    }

    public class Tester
    {
        private static XDocument CreateCustomerListXml()
        {
            List<Customer> customers = CreateCustomerList();
            var customerXml = new XDocument(new XElement("Customers",
                from customer in customers
                select new XElement("Customer",
                    new XAttribute("FirstName", customer.FirstName),
                    new XAttribute("LastName", customer.LastName),
                    new XElement("EmailAddress", customer.EmailAddress)
                    )));

            return customerXml;
        }

        private static List<Customer> CreateCustomerList()
        {
            List<Customer> customers = new List<Customer>
                {
                    new Customer {FirstName = "Douglas",
                                  LastName = "Adams",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Richard",
                                  LastName = "Dawkins",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Kenji",
                                  LastName = "Yoshino",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Ian",
                                  LastName = "McEwan",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Neal",
                                  LastName = "Stephenson",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Randy",
                                  LastName = "Shilts",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Michelangelo",
                                  LastName = "Signorile",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Larry",
                                  LastName = "Kramer",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Jennifer",
                                  LastName = "Baumgardner",
                                  EmailAddress = "[email protected]"}
            };
            return customers;
        }

        static void Main()
        {
            XDocument customerXml = CreateCustomerListXml();

            Console.WriteLine("Search for single element...");
            var query =
              from customer in
                     customerXml.Element("Customers").Elements("Customer")
              where customer.Attribute("FirstName").Value == "Douglas"
              select customer;
            XElement oneCustomer = query.SingleOrDefault();

            if (oneCustomer != null)
            {
                Console.WriteLine(oneCustomer);
            }
            else
            {
                Console.WriteLine("Not found");
            }


            Console.WriteLine("\nSearch using descendant axis... ");
            query = from customer in customerXml.Descendants("Customer")
                    where customer.Attribute("FirstName").Value == "Douglas"
                    select customer;
            oneCustomer = query.SingleOrDefault();
            if (oneCustomer != null)
            {
                Console.WriteLine(oneCustomer);
            }
            else
            {
                Console.WriteLine("Not found");
            }

            Console.WriteLine("\nSearch using element values... ");
            query = from emailAddress in
                      customerXml.Descendants("EmailAddress")
                    where emailAddress.Value == "[email protected]"
                    select emailAddress;
            XElement oneEmail = query.SingleOrDefault();
            if (oneEmail != null)
            {
                Console.WriteLine(oneEmail);
            }
            else
            {
                Console.WriteLine("Not found");
            }

            Console.WriteLine("\nSearch using child element values... ");
            query = from customer in customerXml.Descendants("Customer")
                    where customer.Element("EmailAddress").Value
                            == "[email protected]"
                    select customer;
            oneCustomer = query.SingleOrDefault();
            if (oneCustomer != null)
            {
                Console.WriteLine(oneCustomer);
            }
            else
            {
                Console.WriteLine("Not found");
            }

        }       // end main
    }           // end class
}               // end namespace


Output:
Search for single element...
<Customer FirstName="Douglas" LastName="Adams">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Search using descendant axis...
<Customer FirstName="Douglas" LastName="Adams">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Search using element values...
<EmailAddress>[email protected]</EmailAddress>

Search using child element values...
<Customer FirstName="Douglas" LastName="Adams">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

This example refactors Example 12-3 by extracting the creation of the sample customer list XML document into the CreateCustomerListXml() method. You can now simply call this function in the Main() function to create the XML document.

Searching for a Single Node

The first search in Example 12-7 is to find a customer whose first name is “Douglas”:

var query =
  from customer in
         customerXml.Element("Customers").Elements("Customer")
  where customer.Attribute("FirstName").Value == "Douglas"
  select customer;
XElement oneCustomer = query.SingleOrDefault();

if (oneCustomer != null)
{
    Console.WriteLine(oneCustomer);
}
else
{
    Console.WriteLine("Not found");
}

In general, you will have some idea of the structure of the XML documents you are going to process; otherwise, it will be difficult to find the information you want. Here we know the node we are looking for sits just one level below the root element. So the source of the LINQ query (the part after the in keyword) fetches the root Customers element using the singular Element method, and then asks for all of its children called Customer by using the plural Elements method:

from customer in
       customerXml.Element("Customers").Elements("Customer")

We specify the search conditions with a where clause, as we would do in any LINQ query. In this case, we want to search on the value of the FirstName attribute:

where customer.Attribute("FirstName").Value == "Douglas"

The select clause is trivial—we just want the query to return all matching elements. Finally, we execute the query using the standard LINQ SingleOrDefault operator, which, as you may recall, returns the one result of the query, unless it failed to match anything, in which case it will return null. (And if there are multiple matches, it throws an exception.) We therefore test the result against null before attempting to use it:

if (oneCustomer != null)
{
    Console.WriteLine(oneCustomer);
}
else
{
    Console.WriteLine("Not found");
}

In this example, the query succeeds, and the resulting element is displayed.
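The three possible outcomes of SingleOrDefault can be seen side by side in a minimal sketch like the following. The FindByFirstName helper and the sample customers are invented for illustration and are not part of Example 12-7.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SingleNodeSearch
{
    // Hypothetical helper: returns the one Customer whose FirstName
    // attribute matches, or null when nothing matches.
    public static XElement FindByFirstName(XDocument doc, string firstName)
    {
        var query =
            from customer in doc.Element("Customers").Elements("Customer")
            where customer.Attribute("FirstName").Value == firstName
            select customer;
        return query.SingleOrDefault();
    }

    public static void Main()
    {
        // Invented sample data; two customers share a first name.
        XDocument doc = XDocument.Parse(
            "<Customers>" +
            "<Customer FirstName='Douglas' LastName='Adams' />" +
            "<Customer FirstName='Richard' LastName='Dawkins' />" +
            "<Customer FirstName='Richard' LastName='Feynman' />" +
            "</Customers>");

        // Exactly one match: the element is returned.
        Console.WriteLine(FindByFirstName(doc, "Douglas") != null);

        // No match: the result is null.
        Console.WriteLine(FindByFirstName(doc, "Kenji") == null);

        // Two matches: SingleOrDefault throws.
        try
        {
            FindByFirstName(doc, "Richard");
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("More than one match");
        }
    }
}
```

If several matches are a normal situation rather than an error, FirstOrDefault or a foreach loop over the query is probably a better fit than SingleOrDefault.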

Search Axes

In practice, you don’t always know exactly where the information you require will be in the XML document when you write the code. For these cases, LINQ to XML provides the ability to search in different ways. If you are familiar with the XPath query language[25] for XML, this is equivalent to the XPath concept of a search axis, which specifies the relationship between the element you’re starting from and the search target nodes.

The Element and Elements methods we used earlier only ever search one level—they look in the children of whatever object you call them on. But we can instead use the Descendants method to look not just in the children, but also in their children’s children, and so on. So the source for the next query in Example 12-7 looks for all elements called Customer anywhere in the document. This is more compact, but also less precise.

query = from customer in customerXml.Descendants("Customer")
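To see why the descendant axis is less precise, consider a document in which a Customer element also appears deeper in the tree. The following sketch uses an invented Region wrapper element (not part of the chapter's sample data) to compare the two approaches.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AxisDepth
{
    // Invented sample: one Customer directly under the root and
    // another nested inside a hypothetical Region element.
    public static XDocument CreateSample()
    {
        return XDocument.Parse(
            "<Customers>" +
            "<Customer FirstName='Douglas' />" +
            "<Region Name='West'>" +
            "<Customer FirstName='Orlando' />" +
            "</Region>" +
            "</Customers>");
    }

    // Looks only at the direct children of the root element.
    public static int CountChildren(XDocument doc)
    {
        return doc.Element("Customers").Elements("Customer").Count();
    }

    // Looks at every level of the document.
    public static int CountDescendants(XDocument doc)
    {
        return doc.Descendants("Customer").Count();
    }

    public static void Main()
    {
        XDocument doc = CreateSample();
        Console.WriteLine(CountChildren(doc));     // 1
        Console.WriteLine(CountDescendants(doc));  // 2
    }
}
```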

Other members available for querying along different axes include Parent, Ancestors, ElementsAfterSelf, ElementsBeforeSelf, and Attributes. The first two look up the tree and are similar to Elements and Descendants, in that the Parent property looks up just one level, while the Ancestors method will search up through the document all the way to the root. ElementsBeforeSelf and ElementsAfterSelf search for elements that have the same parent as the current item, and which appear either before or after it in the document. Attributes searches in an element’s attributes rather than its child elements. (If you are familiar with XPath, you will know that these correspond to the parent, ancestor, following-sibling, preceding-sibling, and attribute axes.)
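A short sketch may help make these axes concrete. The helper methods, the sample customers, and the placeholder email address below are all invented for illustration; only the LINQ to XML members themselves come from the framework.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class OtherAxes
{
    // Names of all ancestors, nearest first.
    public static string[] AncestorNames(XElement element)
    {
        return element.Ancestors().Select(a => a.Name.LocalName).ToArray();
    }

    // FirstName attributes of the siblings that precede the element.
    public static string[] FirstNamesBefore(XElement customer)
    {
        return customer.ElementsBeforeSelf()
                       .Select(c => c.Attribute("FirstName").Value)
                       .ToArray();
    }

    // FirstName attributes of the siblings that follow the element.
    public static string[] FirstNamesAfter(XElement customer)
    {
        return customer.ElementsAfterSelf()
                       .Select(c => c.Attribute("FirstName").Value)
                       .ToArray();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<Customers>" +
            "<Customer FirstName='Orlando' LastName='Gee' />" +
            "<Customer FirstName='Douglas' LastName='Adams'>" +
            "<EmailAddress>someone@example.com</EmailAddress>" +
            "</Customer>" +
            "<Customer FirstName='Keith' LastName='Harris' />" +
            "</Customers>");

        XElement email = doc.Descendants("EmailAddress").Single();

        // Parent: exactly one level up.
        XElement douglas = email.Parent;
        Console.WriteLine(douglas.Name);                                 // Customer

        // Ancestors: every level up to the root element.
        Console.WriteLine(string.Join(", ", AncestorNames(email)));      // Customer, Customers

        // Siblings on either side of the Douglas element.
        Console.WriteLine(string.Join(", ", FirstNamesBefore(douglas))); // Orlando
        Console.WriteLine(string.Join(", ", FirstNamesAfter(douglas)));  // Keith

        // Attributes: the element's own attributes, not its children.
        foreach (XAttribute attribute in douglas.Attributes())
        {
            Console.WriteLine("{0} = {1}", attribute.Name, attribute.Value);
        }
    }
}
```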

Where Clauses

The first query in Example 12-7 included a where clause that looked for a particular attribute value on an element. You can, of course, use other criteria. The third query looks at the content of the element itself—it uses the Value property to extract the content as text:

where emailAddress.Value == "[email protected]"

You can get more ambitious, though—the where clause can dig further into the structure of the XML. The fourth query’s where clause lets through only those elements whose child EmailAddress element has a particular value:

where customer.Element("EmailAddress").Value == "[email protected]"
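One thing to watch for with these where clauses: Element and Attribute return null when the requested node is absent, so reading Value will throw a NullReferenceException for any customer that lacks an EmailAddress child. LINQ to XML defines an explicit conversion from XElement and XAttribute to string that yields null instead. The sketch below shows that variation; the CountWithEmail helper, the sample data, and the placeholder address are assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SafeWhere
{
    // Hypothetical helper: counts customers whose EmailAddress child
    // has the given value, tolerating customers with no such child.
    public static int CountWithEmail(XDocument doc, string email)
    {
        var query =
            from customer in doc.Descendants("Customer")
            // The cast yields null when EmailAddress is missing,
            // where .Value would throw NullReferenceException.
            where (string)customer.Element("EmailAddress") == email
            select customer;
        return query.Count();
    }

    public static void Main()
    {
        // Invented sample: the second customer has no EmailAddress.
        XDocument doc = XDocument.Parse(
            "<Customers>" +
            "<Customer FirstName='Douglas'>" +
            "<EmailAddress>someone@example.com</EmailAddress>" +
            "</Customer>" +
            "<Customer FirstName='Orlando' />" +
            "</Customers>");

        Console.WriteLine(CountWithEmail(doc, "someone@example.com"));  // 1
    }
}
```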

XML Serialization

So far, our code has constructed the objects representing the Customer XML elements by hand. As XML is becoming popular, especially with the increasingly widespread use of web services, it can be useful to automate this process. If you expect to work with XML elements that always have a particular structure, it can be convenient to serialize objects to or from XML. Working with conventional objects can be a lot easier than using lots of explicit XML code.

The .NET Framework provides a built-in serialization mechanism to reduce the coding effort for application developers. The System.Xml.Serialization namespace defines the classes and utilities that implement the methods required for serializing and deserializing objects. Example 12-8 illustrates this.

Example 12-8. Simple XML serialization and deserialization

using System;
using System.IO;
using System.Xml.Serialization;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }

        // Overrides the Object.ToString() to provide a
        // string representation of the object properties.
        public override string ToString()
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }

    // Main program
    public class Tester
    {
        static void Main()
        {
            Customer c1 = new Customer
                          {
                              FirstName = "Orlando",
                              LastName = "Gee",
                              EmailAddress = "[email protected]"
                          };

            XmlSerializer serializer = new XmlSerializer(typeof(Customer));
            StringWriter writer = new StringWriter();

            serializer.Serialize(writer, c1);
            string xml = writer.ToString();
            Console.WriteLine("Customer in XML:\n{0}\n", xml);

            Customer c2 = serializer.Deserialize(new StringReader(xml))
                          as Customer;
            Console.WriteLine("Customer in Object:\n{0}", c2.ToString());

            Console.ReadKey();
        }
    }
}

Output:
Customer in XML:
<?xml version="1.0" encoding="utf-16"?>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <FirstName>Orlando</FirstName>
  <LastName>Gee</LastName>
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Customer in Object:
Orlando Gee
Email:   [email protected]

To serialize an object using .NET XML serialization, you need to create an XmlSerializer object:

XmlSerializer serializer = new XmlSerializer(typeof(Customer));

You must pass in the type of the object to be serialized to the XmlSerializer constructor. If you don’t know the object type at design time, you can discover it by calling its GetType() method:

XmlSerializer serializer = new XmlSerializer(c1.GetType());

You also need to decide where the serialized XML document should be stored. In this example, you simply send it to a StringWriter:

StringWriter writer = new StringWriter();

serializer.Serialize(writer, c1);
string xml = writer.ToString();
Console.WriteLine("Customer in XML:\n{0}\n", xml);

The resultant XML string is then displayed on the console:

<?xml version="1.0" encoding="utf-16"?>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <FirstName>Orlando</FirstName>
  <LastName>Gee</LastName>
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

The first line is an XML declaration. This is to let the consumers (human users and software applications) of this document know that this is an XML file, the official version to which this file conforms, and the encoding format used. This is optional in XML, but this code always produces one.

The root element here is the Customer element, with each property represented as a child element. The xmlns:xsi and xmlns:xsd attributes relate to the XML Schema specification. They are optional, and don’t do anything useful in this example, so we will not explain them further. If you are interested, please read the XML specification or other documentation, such as the MSDN Library, for more details.
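If you would rather not emit these optional parts, one possible approach is sketched below. It assumes a cut-down Customer class and a hypothetical ToCompactXml helper; XmlWriterSettings.OmitXmlDeclaration suppresses the declaration, and passing an XmlSerializerNamespaces object containing a single empty prefix and namespace suppresses the xsi and xsd declarations.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Cut-down version of the chapter's Customer class.
public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class CompactSerialization
{
    // Hypothetical helper: serializes without the XML declaration
    // and without the xmlns:xsi / xmlns:xsd attributes.
    public static string ToCompactXml(Customer customer)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));

        // An empty prefix/namespace pair replaces the default set
        // of namespace declarations.
        XmlSerializerNamespaces namespaces = new XmlSerializerNamespaces();
        namespaces.Add("", "");

        XmlWriterSettings settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true,
            Indent = true
        };

        StringWriter output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            serializer.Serialize(writer, customer, namespaces);
        }
        return output.ToString();
    }

    public static void Main()
    {
        Customer c = new Customer { FirstName = "Orlando", LastName = "Gee" };
        Console.WriteLine(ToCompactXml(c));
    }
}
```

The XmlSerializer can still deserialize the compact form, because neither the declaration nor the unused namespace declarations carry any of the object's data.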

Aside from those optional parts, this XML representation of the Customer object is equivalent to the one created in Example 12-1. However, instead of writing numerous lines of code to deal with the XML specifics, you need only three lines using .NET XML serialization classes.

Furthermore, it is just as easy to reconstruct an object from its XML form:

Customer c2 = serializer.Deserialize(new StringReader(xml))
              as Customer;
Console.WriteLine("Customer in Object:\n{0}", c2.ToString());

All you need to do is call the XmlSerializer.Deserialize method. It has several overloads, one of which takes a TextReader instance as an input parameter. Because StringReader is derived from TextReader, you just pass an instance of StringReader to read from the XML string. The Deserialize method returns an object, so it is necessary to cast it to the correct type.
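If you find yourself repeating the serialize and deserialize steps, you might wrap them in a pair of generic helpers. The following is only a sketch: the XmlRoundTrip class and its ToXml and FromXml methods are invented names, and the Customer class is a cut-down copy of the one in Example 12-8.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Cut-down version of the Customer class from Example 12-8.
public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class XmlRoundTrip
{
    // Serializes any XML-serializable object to a string.
    public static string ToXml<T>(T value)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Rebuilds an object of type T from its XML form.
    public static T FromXml<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringReader reader = new StringReader(xml))
        {
            // A direct cast fails immediately if the result is not a T,
            // whereas the as operator would quietly produce null.
            return (T)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Customer c1 = new Customer { FirstName = "Orlando", LastName = "Gee" };
        string xml = ToXml(c1);
        Customer c2 = FromXml<Customer>(xml);
        Console.WriteLine("{0} {1}", c2.FirstName, c2.LastName);  // Orlando Gee
    }
}
```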

Of course, there’s a price to pay. XML serialization is less flexible than working with the XML APIs directly—with serialization you decide exactly what XML elements and attributes you expect to see when you write the code. If you need to be able to adapt dynamically to elements whose names you only learn at runtime, you will need to stick with the XML-aware APIs.
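By way of contrast, here is a minimal sketch of the kind of code that adapts to element names at runtime, using LINQ to XML. The ReadFields helper and the Nickname element are invented for illustration, and the sketch assumes each child name appears only once (ToDictionary throws on duplicate keys).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DynamicElements
{
    // Collects each child element's name and text content without
    // knowing any of the names in advance.
    public static Dictionary<string, string> ReadFields(XElement element)
    {
        return element.Elements()
                      .ToDictionary(e => e.Name.LocalName, e => e.Value);
    }

    public static void Main()
    {
        // Nickname is not a property of the Customer class, so XML
        // serialization would have nowhere to put it.
        XElement customer = XElement.Parse(
            "<Customer>" +
            "<FirstName>Orlando</FirstName>" +
            "<Nickname>Orly</Nickname>" +
            "</Customer>");

        foreach (KeyValuePair<string, string> field in ReadFields(customer))
        {
            Console.WriteLine("{0}: {1}", field.Key, field.Value);
        }
    }
}
```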

Customizing XML Serialization Using Attributes

By default, all public read/write properties are serialized as child elements. You can customize your classes by specifying the type of XML node you want for each of your public properties, as shown in Example 12-9.

Example 12-9. Customizing XML serialization with attributes

using System;
using System.IO;
using System.Xml.Serialization;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        [XmlAttribute]
        public string FirstName { get; set; }

        [XmlIgnore]
        public string LastName { get; set; }

        public string EmailAddress { get; set; }

        // Overrides the Object.ToString() to provide a
        // string representation of the object properties.
        public override string ToString()
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }

    // Main program
    public class Tester
    {
        static void Main()
        {
            Customer c1 = new Customer
                          {
                              FirstName = "Orlando",
                              LastName = "Gee",
                              EmailAddress = "[email protected]"
                          };

            //XmlSerializer serializer = new XmlSerializer(c1.GetType());
            XmlSerializer serializer = new XmlSerializer(typeof(Customer));
            StringWriter writer = new StringWriter();

            serializer.Serialize(writer, c1);
            string xml = writer.ToString();
            Console.WriteLine("Customer in XML:\n{0}\n", xml);

            Customer c2 = serializer.Deserialize(new StringReader(xml)) as
                          Customer;
            Console.WriteLine("Customer in Object:\n{0}", c2.ToString());

            Console.ReadKey();
        }
    }
}

Output:
Customer in XML:
<?xml version="1.0" encoding="utf-16"?>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          FirstName="Orlando">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Customer in Object:
Orlando
Email:   [email protected]

The only changes in this example are a couple of XML serialization attributes added to the Customer class. The first change specifies that you want to serialize the FirstName property as an attribute of the Customer element, by adding the XmlAttributeAttribute to the property:

[XmlAttribute]
public string FirstName { get; set; }

The other change tells XML serialization that you do not want the LastName property to be serialized at all. You do this by adding the XmlIgnoreAttribute to the property:

[XmlIgnore]
public string LastName { get; set; }

As you can see from the sample output, the Customer object is serialized without LastName, exactly as we asked.

However, you have probably noticed that when the object is deserialized, its LastName property is lost. Because it is not serialized, the XmlSerializer is unable to assign it any value. Therefore, its value is left as the default, which for a string property is null. (string.Format renders a null argument as an empty string, which is why only the first name appears in the output.) So in practice, you would exclude from serialization only those properties you don’t need, can compute, or can retrieve in other ways.
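A quick way to check what the deserialized object actually holds is to round-trip a customer and inspect the ignored property directly. The sketch below reuses the attributes from Example 12-9; the RoundTrip helper and the placeholder email address are invented for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Customer
{
    [XmlAttribute]
    public string FirstName { get; set; }

    [XmlIgnore]
    public string LastName { get; set; }

    public string EmailAddress { get; set; }
}

public static class IgnoredPropertyCheck
{
    // Hypothetical helper: serializes a customer to a string and
    // immediately deserializes it again.
    public static Customer RoundTrip(Customer original)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));
        StringWriter writer = new StringWriter();
        serializer.Serialize(writer, original);
        return (Customer)serializer.Deserialize(
            new StringReader(writer.ToString()));
    }

    public static void Main()
    {
        Customer copy = RoundTrip(new Customer
        {
            FirstName = "Orlando",
            LastName = "Gee",
            EmailAddress = "someone@example.com"
        });

        Console.WriteLine(copy.FirstName);          // Orlando
        Console.WriteLine(copy.LastName == null);   // True
    }
}
```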

Summary

In this chapter, we saw how to use the LINQ to XML classes to build objects representing the structure of an XML document, which can then be converted into an XML document, and we saw how the same classes can be used to load XML from a string or file back into memory as objects. These classes support LINQ, both for building new XML documents and for searching for information in existing XML documents. And we also saw how XML serialization can hide some of the details of XML handling behind ordinary C# classes in situations where you know exactly what structure of XML to expect.



[25] XPath is supported by both LINQ to XML and the DOM APIs. (Unless you’re using Silverlight, in which case the DOM API is missing entirely, and the XPath support is absent from LINQ to XML.) So if you prefer that, you can use it instead, or you can use a mixture of LINQ and XPath.
