You have a large and complex XML document and need to find various pieces of information, such as all the content of a specific element that has a particular attribute setting. You want to query the XML structure without iterating through every node in the document and searching for a particular item by hand.
To query a database, you would normally use SQL. To query an XML document, you would use XPath. In .NET, this means using the System.Xml.XPath namespace and classes like XPathDocument, XPathNavigator, and XPathNodeIterator.
In the following example, we use these classes to select nodes from an XML document we construct, which holds characters from the board game “Clue” (or “Cluedo”, as it is known abroad) and their various roles. We want to select the married female participants who were witnesses to the crime. To do this, we pass an XPath expression to query the XML data set as follows:
// Requires: using System; using System.Xml; using System.Xml.XPath;
public static void QueryXML( )
{
    string xmlFragment = "<?xml version='1.0'?>" +
        "<Clue>" +
        "<Participant type='Perpetrator'>Professor Plum</Participant>" +
        "<Participant type='Witness'>Colonel Mustard</Participant>" +
        "<Participant type='Witness'>Mrs. White</Participant>" +
        "<Participant type='Witness'>Mrs. Peacock</Participant>" +
        "<Participant type='Witness'>Mr. Green</Participant>" +
        "</Clue>";

    XmlTextReader reader = new XmlTextReader(xmlFragment, XmlNodeType.Element, null);

    // Instantiate an XPathDocument using the XmlTextReader.
    XPathDocument xpathDoc = new XPathDocument(reader, XmlSpace.Preserve);

    // get the navigator
    XPathNavigator xpathNav = xpathDoc.CreateNavigator( );

    // set up the query looking for the married female participants
    // who were witnesses
    string xpathQuery =
        "/Clue/Participant[attribute::type='Witness'][contains(text( ),'Mrs.')]";

    // get the nodeset from the query
    XPathNodeIterator xpathIter = xpathNav.Select(xpathQuery);

    // write out the nodes found (Mrs. White and Mrs. Peacock in this instance)
    while (xpathIter.MoveNext( ))
    {
        Console.WriteLine(xpathIter.Current.Value);
    }

    // close the reader.
    reader.Close( );
}
This outputs the following:
Mrs. White
Mrs. Peacock
XPath is a very versatile language for performing queries on XML-based data. To accomplish our goal, we first created an XML fragment that looks like this:
<?xml version='1.0'?>
<Clue>
    <Participant type='Perpetrator'>Professor Plum</Participant>
    <Participant type='Witness'>Colonel Mustard</Participant>
    <Participant type='Witness'>Mrs. White</Participant>
    <Participant type='Witness'>Mrs. Peacock</Participant>
    <Participant type='Witness'>Mr. Green</Participant>
</Clue>
We then load this fragment into an XmlTextReader, as shown in Recipe 17.1, then construct an XPathDocument to allow us to create an XPathNavigator, which lets us use XPath syntax to query the XML document shown in the preceding listing. The XmlTextReader reads over the document, checking for well-formedness; the XPathDocument instance loads the content read by the XmlTextReader into a read-only, in-memory store so we can use XPath to locate nodes (as well as perform XSLT transforms directly); and the XPathNavigator gets the set of nodes selected by the XPath expression.
XmlTextReader reader = new XmlTextReader(xmlFragment, XmlNodeType.Element, null);

// Instantiate an XPathDocument using the XmlTextReader.
XPathDocument xpathDoc = new XPathDocument(reader, XmlSpace.Preserve);

// get the navigator
XPathNavigator xpathNav = xpathDoc.CreateNavigator( );
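As a point of comparison, the loading step can also be sketched without an XmlTextReader, since XPathDocument has a constructor that accepts a TextReader. The following is a minimal sketch, not the recipe's own method: the ClueLoader class and its CreateNavigator helper are hypothetical names introduced for illustration, and the XML is a shortened version of the Clue fragment.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class ClueLoader
{
    // Hypothetical helper (not part of the recipe): builds a navigator
    // from a string of XML by way of a StringReader.
    public static XPathNavigator CreateNavigator(string xml)
    {
        using (StringReader reader = new StringReader(xml))
        {
            // XPathDocument reads the whole input inside its constructor,
            // so the reader can be disposed once the constructor returns.
            XPathDocument doc = new XPathDocument(reader);
            return doc.CreateNavigator();
        }
    }

    public static void Main()
    {
        // Shortened version of the Clue fragment, for illustration only.
        string xml =
            "<Clue>" +
            "<Participant type='Perpetrator'>Professor Plum</Participant>" +
            "<Participant type='Witness'>Mrs. White</Participant>" +
            "</Clue>";

        XPathNavigator nav = CreateNavigator(xml);

        // A new navigator is positioned on the document root node;
        // its first child is the <Clue> element.
        nav.MoveToFirstChild();
        Console.WriteLine(nav.Name);
    }
}
```

Because the document is fully loaded when the constructor returns, the navigator stays usable after the reader is gone, which is why the recipe can safely close its reader at the end.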
Now we have to determine the XPath-based query to get all of the married female participants who were witnesses. This is set up in the xpathQuery string like this:
// set up the query looking for the married female participants
// who were witnesses
string xpathQuery =
    "/Clue/Participant[attribute::type='Witness'][contains(text( ),'Mrs.')]";
To make it clearer what is going on here, let's break the syntax down:
/Clue/Participant says “Get all of the Participants under the root-level node Clue.”
Participant[attribute::type='Witness'] says “Select only Participants with an attribute called type with a value of Witness.”
Participant[contains(text( ),'Mrs.')] says “Select only Participants with a value that contains 'Mrs.'”
Put them all together and we get all of the married female participants who were witnesses.
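To see how each piece narrows the result, the following sketch counts the nodes selected by each part of the query against the same Clue data. The PredicateDemo class and CountMatches helper are hypothetical names used only for this illustration. The last query uses @type, the standard XPath abbreviation for attribute::type, and selects the same nodes.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class PredicateDemo
{
    // Same participants as the recipe's XML fragment.
    public const string ClueXml =
        "<Clue>" +
        "<Participant type='Perpetrator'>Professor Plum</Participant>" +
        "<Participant type='Witness'>Colonel Mustard</Participant>" +
        "<Participant type='Witness'>Mrs. White</Participant>" +
        "<Participant type='Witness'>Mrs. Peacock</Participant>" +
        "<Participant type='Witness'>Mr. Green</Participant>" +
        "</Clue>";

    // Hypothetical helper: returns how many nodes an XPath expression selects.
    public static int CountMatches(string xml, string xpath)
    {
        using (StringReader reader = new StringReader(xml))
        {
            XPathDocument doc = new XPathDocument(reader);
            return doc.CreateNavigator().Select(xpath).Count;
        }
    }

    public static void Main()
    {
        string[] queries =
        {
            "/Clue/Participant",                                    // 5 nodes
            "/Clue/Participant[attribute::type='Witness']",         // 4 nodes
            "/Clue/Participant[contains(text(),'Mrs.')]",           // 2 nodes
            "/Clue/Participant[attribute::type='Witness']" +
                "[contains(text(),'Mrs.')]",                        // 2 nodes
            "/Clue/Participant[@type='Witness']" +
                "[contains(text(),'Mrs.')]"                         // 2 nodes
        };

        foreach (string query in queries)
        {
            Console.WriteLine("{0} -> {1}", CountMatches(ClueXml, query), query);
        }
    }
}
```

With this particular data the two predicates together select the same two nodes as the contains predicate alone; the type predicate would matter if a non-witness participant also had 'Mrs.' in her name.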
Once we have an XPathNavigator, we call its Select method, passing the XPath-based query. Select returns the matching nodes through an XPathNodeIterator, which we use to write out the names of the Participants we found. Finally, we close the XmlTextReader.
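The iteration step can be extended to read more than the node's value. The sketch below assumes .NET 2.0 or later (for List<string> and the static XPathExpression.Compile method) and uses a hypothetical WitnessLister class with a Describe helper. It compiles the query once, walks the XPathNodeIterator, and reads the type attribute of each match with XPathNavigator.GetAttribute.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class WitnessLister
{
    // Hypothetical helper: returns "value (type)" for every node the
    // XPath expression selects.
    public static List<string> Describe(string xml, string xpath)
    {
        List<string> results = new List<string>();

        using (StringReader reader = new StringReader(xml))
        {
            XPathNavigator nav = new XPathDocument(reader).CreateNavigator();

            // Compile the expression once; a compiled XPathExpression can be
            // passed to Select repeatedly without reparsing the query string.
            XPathExpression expr = XPathExpression.Compile(xpath);
            XPathNodeIterator iter = nav.Select(expr);

            while (iter.MoveNext())
            {
                // Current is a navigator positioned on the matched node.
                XPathNavigator current = iter.Current;

                // The empty string means the attribute is in no namespace.
                string type = current.GetAttribute("type", "");
                results.Add(current.Value + " (" + type + ")");
            }
        }

        return results;
    }

    public static void Main()
    {
        string xml =
            "<Clue>" +
            "<Participant type='Perpetrator'>Professor Plum</Participant>" +
            "<Participant type='Witness'>Colonel Mustard</Participant>" +
            "<Participant type='Witness'>Mrs. White</Participant>" +
            "<Participant type='Witness'>Mrs. Peacock</Participant>" +
            "<Participant type='Witness'>Mr. Green</Participant>" +
            "</Clue>";

        string query =
            "/Clue/Participant[attribute::type='Witness'][contains(text(),'Mrs.')]";

        foreach (string line in Describe(xml, query))
        {
            Console.WriteLine(line);
        }
    }
}
```

Collecting the results into a list rather than writing them directly is only a design choice for this sketch; it keeps the query logic separate from the console output.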