Appendix A. The XML and XSLT You Need to Know

Knowledge of XML is essential if you want to build applications around the browser’s Document Object Model (DOM) XML capabilities instead of using plain text for all of your Ajax responses. XSLT goes hand in hand with XML; it was originally expected to work in tandem with XML to produce Ajax results for a client. If you are already acquainted with XML and XSLT, you can skip this appendix. If not, read on.

The general overview of XML and XSLT given in this appendix should be sufficient to enable you to work with XML documents, transform them, and use them in Ajax applications. For a much more solid grounding in the many details of XML, you should consider these books:

  • XML in a Nutshell, Third Edition, by Elliotte Rusty Harold and W. Scott Means (O’Reilly)

  • Effective XML: 50 Specific Ways to Improve Your XML by Elliotte Rusty Harold (Addison-Wesley Professional)

  • Learning XML, Second Edition, by Erik T. Ray (O’Reilly)

  • XSLT Cookbook, Second Edition, by Sal Mangano (O’Reilly)

  • XSLT 2.0 Web Development by Dmitry Kirsanov (Prentice-Hall)

  • Learning XSLT by Michael Fitzgerald (O’Reilly)

Another good source of material on XML and XSLT is XML.com (http://www.xml.com).

What Is XML?

XML, the eXtensible Markup Language, is an Internet-friendly format for data and documents, invented by the World Wide Web Consortium (W3C). The word Markup in the term denotes a way to express a document’s structure within the document itself. XML has its roots in the Standard Generalized Markup Language (SGML), which is used in publishing. HTML was an application of SGML to web publishing. XML was created to do for machine-readable documents on the Web what HTML did for human-readable documents: provide a commonly agreed-upon syntax so that processing the underlying format becomes commonplace and documents are made accessible to all users. The current version of the W3C Recommendation is XML 1.1 (Second Edition), published on August 16, 2006 (and edited in place on September 29, 2006) and available at http://www.w3.org/TR/xml11/. Although this is the latest version, most XML documents are version 1.0 documents, and version 1.0 is what I describe in this appendix, especially because version 1.1 made only minor changes to the recommendation.

Unlike HTML, though, XML comes with very little predefined. HTML developers are accustomed both to the notion of using angle brackets (< >) for denoting elements, and to the set of element names (such as head, body, etc.). XML shares only the former feature. It has no predefined elements; it is merely a set of rules that lets you write other languages such as HTML.

Because XML defines so little, it is easy for everyone to agree to use the XML syntax and then to build applications on top of it. It is like agreeing to use a particular alphabet and set of punctuation symbols, but not saying which language to use. This offers immense flexibility for returning data sets from the server to the browser clients.

Anatomy of an XML Document

The best way to explain how an XML document is composed is to present one. This example shows an XML document you might use to describe two authors:

1  <?xml version="1.0" encoding="us-ascii"?>
2  <authors>
3      <person id="lear">
4          <name>J.K. Rowling</name>
5          <nationality>British</nationality>
6      </person>
7      <person id="lewis">
8          <name>C.S. Lewis</name>
9          <nationality>Irish</nationality>
10     </person>
11     <person id="mysteryperson"/>
12 </authors>

Line 1 of the document is known as the XML declaration. This tells a processing application which version of XML you are using—the version indicator is mandatory—and which character encoding you have used for the document. In this example, the document is encoded in ASCII. (I cover the significance of character encoding later in this appendix.)

If the XML declaration is omitted, a processor will make certain assumptions about your document. In particular, it will expect it to be encoded in UTF-8, an encoding of the Unicode character set (or in UTF-16, if the document begins with a byte order mark). However, it is best to use the XML declaration wherever possible, both to avoid confusion over the character encoding and to indicate to processors which version of XML you’re using. (Version 1.0 is most common, but 1.1, which makes relatively minor though potentially incompatible changes, has recently appeared.) The browser should handle the encoding automatically, but you may need to watch for documents you import from other sources.

Elements and Attributes

Line 2 of the example in the preceding section begins an element, which has been named authors. The contents of that element include everything between the right-angle bracket (>) in <authors> and the left-angle bracket (<) in </authors>. The actual syntactic constructs <authors> and </authors> are often referred to as the element start tag and end tag, respectively. Do not confuse tags with elements! Tags mark the boundaries of elements. Note that elements, like the authors element earlier, may include other elements as well as text. An XML document must contain exactly one root element, which contains all other content within the document. The name of the root element defines the type of the XML document.

Elements that contain both text and other elements simultaneously are classified as mixed content. Browsers support the use of mixed content, though other applications may not.

The sample authors document uses elements named person to describe the authors. Each person element has an attribute named id. Unlike elements, attributes can contain only textual content. Their values must be surrounded by quotes. You can use either single quotes (') or double quotes ("), as long as you use the same kind of closing quote as the opening one.

Within XML documents, attributes are frequently used for metadata (i.e., data about data)—describing properties of the element’s contents. This is the case in our example, where id contains a unique identifier for the person being described.
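To make the distinction between elements and attributes concrete, here is a minimal sketch that reads the sample authors document with Python’s standard xml.etree.ElementTree module. The variable name AUTHORS_XML and the "(unknown)" placeholder are my own choices for illustration, and Python is used only because it ships with an XML parser; a browser would expose the same structure through the DOM.

```python
import xml.etree.ElementTree as ET

# The sample authors document. Bytes are used so that the parser sees
# the encoding named in the XML declaration.
AUTHORS_XML = b"""<?xml version="1.0" encoding="us-ascii"?>
<authors>
    <person id="lear">
        <name>J.K. Rowling</name>
        <nationality>British</nationality>
    </person>
    <person id="lewis">
        <name>C.S. Lewis</name>
        <nationality>Irish</nationality>
    </person>
    <person id="mysteryperson"/>
</authors>"""

root = ET.fromstring(AUTHORS_XML)

# The root element contains everything else in the document.
print(root.tag)

for person in root.findall("person"):
    # Attributes hold plain text; child elements may be missing.
    person_id = person.get("id")
    name = person.findtext("name", default="(unknown)")
    print(person_id, "->", name)
```

The last person element has an id attribute but no child elements, so the sketch falls back to the placeholder for its name.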

As far as XML is concerned, it does not matter in what order attributes are presented in the element start tag. For example, these two elements contain exactly the same information as far as an XML 1.0 conformant processing application is concerned:

<animal name="dog" legs="4"></animal>
<animal legs="4" name="dog"></animal>

On the other hand, the information presented to an application by an XML processor on reading the following two lines will be different for each animal element because the ordering of elements is significant:

<animal><name>dog</name><legs>4</legs></animal>
<animal><legs>4</legs><name>dog</name></animal>

XML treats a set of attributes like a bunch of stuff in a bag—there is no implicit ordering—whereas elements are treated like items on a list, where ordering matters.
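The bag-versus-list distinction can be checked with a short sketch. It assumes Python’s standard xml.etree.ElementTree module, which exposes attributes as a dictionary and child elements as an ordered sequence; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

a1 = ET.fromstring('<animal name="dog" legs="4"></animal>')
a2 = ET.fromstring('<animal legs="4" name="dog"></animal>')
# Attributes behave like an unordered mapping, so these compare equal.
same_attributes = a1.attrib == a2.attrib

e1 = ET.fromstring("<animal><name>dog</name><legs>4</legs></animal>")
e2 = ET.fromstring("<animal><legs>4</legs><name>dog</name></animal>")
# Child elements form an ordered sequence, so the tag order differs.
order1 = [child.tag for child in e1]
order2 = [child.tag for child in e2]

print(same_attributes, order1, order2)
```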

New XML developers frequently ask when it is best to use attributes to represent information and when it is best to use elements. As you can see from the animal example, if order is important to you, elements are a good choice. In general, there is no hard-and-fast best practice for choosing whether to use attributes or elements, though elements can contain other elements and attributes, whereas attributes can contain only text.

The final author described in our document has no information available. All we know about this person is his or her ID, mysteryperson. The document uses the XML shortcut syntax for an empty element. The following is a reasonable alternative:

<person id="mysteryperson"></person>
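As a quick check that the two spellings of an empty element carry the same information, the following sketch parses both with Python’s standard xml.etree.ElementTree module and compares the results. The variable names are my own.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<person id="mysteryperson"/>')
long_form = ET.fromstring('<person id="mysteryperson"></person>')

# Neither form has text or child elements, and both carry the same attribute.
for element in (short_form, long_form):
    print(element.tag, element.attrib, element.text, len(element))

# Serializing both elements gives identical output.
print(ET.tostring(short_form) == ET.tostring(long_form))
```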

Name Syntax

XML 1.0 has certain rules about element and attribute names. In particular:

  • Names are case-sensitive; for example, <person/> is not the same as <Person/>.

  • Names beginning with “xml” (in any combination of uppercase or lowercase) are reserved for use by XML 1.0 and its companion specifications.

  • A name must start with a letter or an underscore, not a digit, and may continue with any letter, digit, underscore, hyphen, or period. (Actually, a name may also contain a colon, but the colon is used to delimit a namespace prefix and is not available for arbitrary use as of the Second Edition of XML 1.0.)

You can find a precise description of names in Section 2.3 of the XML 1.0 specification, at http://www.w3.org/TR/REC-xml#sec-common-syn.
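The rules above can be approximated in a few lines of code. This is only a rough sketch: the function name is_plausible_xml_name is hypothetical, and the check covers ASCII letters only, whereas the specification also allows a large set of non-ASCII letters. It accepts hyphens, which XML names may contain after the first character, and it rejects colons so that prefixes must be handled separately.

```python
import re

# ASCII-only approximation of the XML 1.0 name rules; the real
# production in the specification allows many more characters.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_plausible_xml_name(name: str) -> bool:
    """Return True if name fits the simplified rules and is not reserved."""
    if name.lower().startswith("xml"):
        return False  # reserved for XML and its companion specifications
    return _NAME.fullmatch(name) is not None


for candidate in ("person", "for-each", "_note.1", "2fast", "XmlThing"):
    print(candidate, is_plausible_xml_name(candidate))
```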

XML Namespaces

XML 1.0 lets developers create their own elements and attributes, but leaves open the potential for overlapping names. “Title” in one context may mean something entirely different from “Title” in a different context. The Namespaces in XML specification (which you can find at http://www.w3.org/TR/REC-xml-names/) provides a mechanism by which developers can identify particular vocabularies using Uniform Resource Identifiers (URIs).

URIs are a combination of the familiar Uniform Resource Locators (URLs) and Uniform Resource Names (URNs). From the perspective of XML namespaces, URIs are convenient because they combine an easily used syntax with a notion of ownership. Although it is possible for me to create namespace URIs that begin with http://microsoft.com, general practice holds that it would be better for me to create URIs that begin with http://holdener.com, a domain I own, and leave http://microsoft.com to Microsoft. In general, organizations and individuals who create XML vocabularies should choose a namespace URI in a space they control. This makes it possible (though it is not required) to put information there documenting the vocabulary, or other resources for processing the vocabulary.

The rules for XML names do not permit developers to create elements with names such as http://holdener.com/ns/mine:Title, and working with such names wouldn’t necessarily be much fun anyway. To get around these problems, the Namespaces in XML specification defines a mechanism for associating URIs with element and attribute names through prefixes. Instead of typing out the whole URI, developers can work with a much shorter prefix, or even set a default URI that applies to names without prefixes.

To create a prefix, you use a namespace declaration, which looks like an attribute. For example, to create a prefix of xhtml associated with the URI http://www.w3.org/1999/xhtml, you would use an xmlns:xhtml attribute, as shown here:

<container xmlns:xhtml="http://www.w3.org/1999/xhtml" >
.
.
.
</container>

To apply a prefix, you put it in front of the element or attribute name, with a colon separating the prefix from the name. To put an XHTML <p> element inside that container, you could write:

<container xmlns:xhtml="http://www.w3.org/1999/xhtml" >
<xhtml:p>This is an XHTML paragraph!</xhtml:p>
</container>

When a program encounters xhtml:p, it knows that p is the local name of the element, xhtml is the prefix, and http://www.w3.org/1999/xhtml is the URI for that element. The namespace declaration applies to the element that contains it and to all elements inside that element. For example, the xhtml prefix works for all three of these paragraphs:

<container xmlns:xhtml="http://www.w3.org/1999/xhtml" >
<xhtml:p>This is XHTML paragraph 1!</xhtml:p>
<xhtml:p>This is XHTML paragraph 2!</xhtml:p>
<xhtml:p>This is XHTML paragraph 3!</xhtml:p>
</container>

In most XML processing, the prefix does not matter; the local name and the URI are what count, and the prefix is just a mechanism for associating them. (This is especially important in XSLT processing and XML Schemas.) In some documents, particularly ones that use structures from only one namespace or where one vocabulary is dominant, developers choose to use the default namespace rather than prefixes. When the default namespace is used (assigned with an xmlns attribute), elements without a prefix are associated with a given URI. In XHTML, an XML derivative of HTML, this is the most typical path, because HTML developers are not used to putting prefixes on all of their element names. A typical XHTML document might look like this:

<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>My Document</title>
    </head>
    <body>
        <p>Could use some content here</p>
    </body>
</html>

In this case, the URI http://www.w3.org/1999/xhtml applies to every element in the document, including <html>, <head>, <title>, <body>, and <p>. The default namespace has one quirk, though: it does not apply to attributes. You can give attributes a namespace by explicitly using a prefix in their name, but unprefixed attributes have no namespace URI. This often does not matter, but it can be important when writing XSLT stylesheets and creating XML Schemas.
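The following sketch illustrates both points: the prefix is only a stand-in for the URI, and the default namespace applies to elements but not to unprefixed attributes. It assumes Python’s standard xml.etree.ElementTree module, which reports namespaced names in the form {URI}localname; the prefix h, the class attribute, and the variable names are my own additions for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

prefixed = ET.fromstring(
    f'<container xmlns:xhtml="{XHTML}"><xhtml:p>Hi</xhtml:p></container>'
)
other_prefix = ET.fromstring(
    f'<container xmlns:h="{XHTML}"><h:p>Hi</h:p></container>'
)
# The parser resolves each prefix to its URI, so the prefix itself drops out.
print(prefixed[0].tag)
print(prefixed[0].tag == other_prefix[0].tag)

default = ET.fromstring(
    f'<html xmlns="{XHTML}"><body><p class="note">Hi</p></body></html>'
)
# The mapping passed to find() is local to this query; "x" is arbitrary.
paragraph = default.find("x:body/x:p", {"x": XHTML})
print(default.tag)             # the element picks up the default namespace
print(list(paragraph.attrib))  # the unprefixed attribute does not
```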

Typically, the namespaces a document uses are declared on the root element of the document, which lets the namespaces apply to all the content inside that document. Of course, you also can declare them throughout the document, though this makes it more difficult to read. Declarations can override one another as well, and the declaration closest to a given use of a prefix in the hierarchy will be used. This lets developers mix and match XML vocabularies even when they use the same prefix.

Namespaces are very simple on the surface but are a well-known field of combat in the XML arena. For more information on namespaces, see Tim Bray’s “XML Namespaces by Example,” published at http://www.xml.com/pub/a/1999/01/namespaces.html; or the aforementioned books XML in a Nutshell and Learning XML.

Well Formed

An XML document that conforms to the rules of XML syntax is described as well formed. At its most basic level, being well formed means the elements are properly matched, and all opened elements are closed. You can find a formal definition of well formed in Section 2.1 of the XML 1.0 specification, at http://www.w3.org/TR/REC-xml#sec-well-formed. Table A-1 shows some XML documents that are not well formed.

Table A-1. Examples of poorly formed XML documents

Document

Reason why it is not well formed

<foo>
    <bar>
    </foo>
</bar>

The elements are not properly nested because foo is closed while inside its child element bar.

<foo>
    <bar>
</foo>

The bar element was not closed before its parent, foo, was closed.

<foo bar>
</foo>

The bar attribute has no value. Although this is permissible in HTML (e.g., <table border>), it is forbidden in XML.

<foo bar=23>
</foo>

The bar attribute value, 23, has no surrounding quotes. Unlike HTML, all attribute values must be quoted in XML.
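A parser reports each of these problems as an error rather than trying to recover. As a minimal sketch, assuming Python’s standard xml.etree.ElementTree module, the hypothetical helper is_well_formed below simply tries to parse a string and reports whether that succeeded. The four broken documents are the ones from Table A-1, written on one line each.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as a complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


BROKEN = [
    "<foo><bar></foo></bar>",  # improperly nested
    "<foo><bar></foo>",        # bar never closed
    "<foo bar></foo>",         # attribute without a value
    "<foo bar=23></foo>",      # unquoted attribute value
]

for document in BROKEN:
    print(document, is_well_formed(document))

# A corrected version of the last document, for comparison.
print(is_well_formed('<foo bar="23"><bar/></foo>'))
```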

Comments and Processing Instructions

As in HTML, it is possible to include comments within XML documents. XML comments are intended to be read only by people. With HTML, developers have occasionally employed comments to add application-specific functionality. For example, the server-side include functionality of most web servers uses instructions embedded in HTML comments. In XML, comments should not be used for any purpose other than those for which they were intended, as they are usually stripped from the document during parsing.

The start of a comment is indicated with <!--, and the end of the comment with -->. Any sequence of characters, aside from the string --, may appear within a comment. Comments can appear at the start or end of a document as well as inside elements. They cannot appear inside attributes or inside a tag. A comment might look like this:

<!--Hello, this is a comment -->

Comments tend to be used more in XML documents intended for human consumption than those intended for machine consumption. If you want to pass information to an XML application without affecting the document’s structure, you can use processing instructions, or PIs. PIs use <? as a starting delimiter and ?> as a closing delimiter, must contain a target conforming to the rules for XML names, and may contain additional data. A typical PI might look like this:

<?xml-stylesheet type="text/css" href="mystyle.css"?>

In this case, xml-stylesheet is the target and type="text/css" href="mystyle.css" is the data. For more information on PIs, see Section 2.6 of the XML 1.0 specification, at http://www.w3.org/TR/REC-xml#sec-pi.
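To see how a parser separates a PI’s target from its data, and that comments survive as nodes of their own in a DOM tree, here is a small sketch using Python’s standard xml.dom.minidom module. The doc root element is a placeholder of mine, and the PI uses the widely recognized xml-stylesheet target.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml-stylesheet type="text/css" href="mystyle.css"?>'
    "<doc><!--Hello, this is a comment --></doc>"
)

pi = doc.firstChild                       # the processing instruction
comment = doc.documentElement.firstChild  # the comment inside <doc>

print(pi.target)     # the name right after "<?"
print(pi.data)       # everything after the target, up to "?>"
print(comment.data)  # the text between "<!--" and "-->"
```

Note that the data portion only looks like a pair of attributes. As far as the parser is concerned it is a single string, and it is up to the application to interpret it.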

Entity References

You may occasionally need to use the mechanism for escaping characters. Because some characters have special significance in XML, you need a way to represent them. For example, in some cases the < symbol might really be intended to mean “less than” rather than to signal the start of an element name. Clearly, just inserting the character without any escaping mechanism would result in a poorly formed document because a processing application would assume you were starting another element. Another instance of this problem is the need to include both double quotes and single quotes simultaneously in an attribute’s value. Here is an example that illustrates both difficulties:

<badDoc>
    <para>
        I'd really like to use the < character
    </para>
    <note title="On the proper 'use' of the " character"/>
</badDoc>

XML avoids this problem by the use of the predefined entity reference. The word entity in the context of XML simply means a unit of content. The term entity reference means just that: a symbolic way to refer to a certain unit of content. XML predefines entities for the following symbols: left-angle bracket (<), right-angle bracket (>), apostrophe ('), double quote ("), and ampersand (&).

An entity reference is introduced with an ampersand (&), which is followed by a name (using the word name in its formal sense, as defined by the XML 1.0 specification), and terminated with a semicolon (;). Table A-2 shows how the five predefined entities can be used within an XML document.

Table A-2. Predefined entity references in XML 1.0

Literal character    Entity reference
<                    &lt;
>                    &gt;
'                    &apos;
"                    &quot;
&                    &amp;

Here is our problematic document revised to use entity references:

<badDoc>
    <para>
        I'd really like to use the &lt; character
    </para>
    <note title="On the proper &apos;use&apos; of the &quot; character"/>
</badDoc>

Being able to use the predefined entities is often all you need; in general, entities are provided as a convenience for human-created XML. XML 1.0 allows you to define your own entities and use entity references as “shortcuts” in your document. Section 4 of the XML 1.0 specification, available at http://www.w3.org/TR/REC-xml#sec-physical-struct, describes the use of entities.
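When XML is generated by a program rather than typed by hand, the escaping is best left to a library. The sketch below assumes Python’s standard xml.sax.saxutils helpers escape and quoteattr, then parses the result back to confirm that the original strings survive the round trip. The doc element name and the variable names are placeholders of mine.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

text = "I'd really like to use the < character"
title = "On the proper 'use' of the \" character"

# escape() replaces &, < and > in element content. quoteattr() also
# chooses the surrounding quotes and escapes any quote that would clash.
document = (
    f"<doc><para>{escape(text)}</para>"
    f"<note title={quoteattr(title)}/></doc>"
)
print(document)

# Parsing the result turns the entity references back into characters.
root = ET.fromstring(document)
print(root.findtext("para") == text)
print(root.find("note").get("title") == title)
```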

Character References

You may find character references in web services that pass information with XML. Character references allow you to denote a character by its numeric position in the Unicode character set (this position is known as its code point). Table A-3 contains a few examples that illustrate the syntax.

Table A-3. Example character references

Actual character    Character reference
0                   &#48;
A                   &#65;
Ñ                   &#xD1;
®                   &#xAE;

Note that you can express the code point in decimal or, with the use of x as a prefix, in hexadecimal.
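A short sketch shows character references being resolved by a parser and being generated from a character’s code point. It assumes Python’s standard xml.etree.ElementTree module; the helper char_ref and the c element are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references are resolved during parsing.
decoded = ET.fromstring("<c>&#48;&#65;&#xD1;&#xAE;</c>").text
print(decoded)


def char_ref(character: str, hexadecimal: bool = False) -> str:
    """Build a character reference from a single character's code point."""
    code_point = ord(character)
    if hexadecimal:
        return f"&#x{code_point:X};"
    return f"&#{code_point};"


print(char_ref("A"), char_ref("\u00ae", hexadecimal=True))
```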

Character Encodings

Character encoding is frequently a mysterious subject for developers. Most code tends to be written for one computing platform and, normally, to run within one organization. Although the Internet is changing things quickly, most of us have never had to think too deeply about internationalization.

XML, designed to be an Internet-friendly syntax for information exchange, has internationalization at its very core. One of the basic requirements for XML processors is that they support the Unicode standard and its character encodings. Unicode attempts to include the requirements of all the world’s languages within one character set. Consequently, it is very large!

Unicode encoding schemes

Unicode 3.0 has more than 57,700 code points, each corresponding to a character. (You can obtain charts of characters online by visiting http://www.unicode.org/charts/.) If you were to express a Unicode string by using the position of each character in the character set as its encoding (in the same way as ASCII does), expressing the whole range of characters would require four octets for each character (an octet is a string of eight binary digits, or bits; a byte is commonly but not always considered the same thing as an octet). Clearly, if a document is written in 100 percent American English, it will be four times larger than required, with all the characters in ASCII fitting into a 7-bit representation. This strains both storage space and memory requirements for processing applications.

Fortunately, two encoding schemes for Unicode alleviate this problem: UTF-8 and UTF-16. As you might guess from their names, applications can process documents in these encodings in 8- or 16-bit segments at a time. When code points are required in a document that cannot be represented by one chunk, a bit pattern is used that indicates that the following chunk is required to calculate the desired code point. In UTF-8, this is denoted by the most significant bit of the first octet being set to 1.

This scheme means that UTF-8 is a highly efficient encoding for representing languages using Latin alphabets, such as English. All of the ASCII character set is represented natively in UTF-8—an ASCII-only document and its equivalent in UTF-8 are byte-for-byte identical. UTF-16 is more efficient for representing languages that use Unicode characters represented by larger numeric values, notably Chinese, Japanese, and Korean.

This knowledge will also help you debug encoding errors. One frequent error arises because ASCII is a proper subset of UTF-8: programmers get used to this fact and produce UTF-8 documents, but use them as though they were ASCII. Things start to go awry when the XML parser processes a document containing, for example, a character such as Á (an A with an acute accent). Because you cannot represent this character using only one octet in UTF-8, this produces a two-octet sequence in the output document; in a non-Unicode viewer or text editor, it looks like a couple of characters of garbage.
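The byte-level behavior described above is easy to observe. This sketch uses only Python’s built-in string encoding support; the sample characters, an accented A and one Chinese character, are my own choices.

```python
ascii_text = "plain ASCII"
# An ASCII-only string is byte-for-byte identical in UTF-8.
print(ascii_text.encode("ascii") == ascii_text.encode("utf-8"))

a_acute = "\u00c1"  # A with an acute accent
utf8 = a_acute.encode("utf-8")
print(len(utf8))               # more than one octet is needed
print((utf8[0] & 0x80) != 0)   # most significant bit of the first octet is set

# Reading those octets as a single-byte encoding shows the "garbage".
print(repr(utf8.decode("iso-8859-1")))

han = "\u4e2d"  # a common Chinese character
print(len(han.encode("utf-8")), len(han.encode("utf-16-le")))
```

The last line shows why UTF-16 can be the more compact choice for Chinese, Japanese, and Korean text: this character takes three octets in UTF-8 but two in UTF-16.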

Other character encodings

Unicode, in the context of computing history, is a relatively new invention. Native operating system support for Unicode is by no means widespread. For instance, although Windows NT offers Unicode support, Windows 95 and 98 do not have it.

XML 1.0 allows a document to be encoded in any character set registered with the Internet Assigned Numbers Authority (IANA). European documents are commonly encoded in one of the ISO Latin character sets, such as ISO-8859-1. Japanese documents commonly use Shift-JIS, and Chinese documents use GB2312 and Big 5.

You can find a full list of registered character sets at http://www.iana.org/assignments/character-sets.

The XML 1.0 specification does not require XML processors to support anything more than UTF-8 and UTF-16, but most commonly support other encodings, such as US-ASCII and ISO-8859-1. Although many XML transactions are currently conducted in ASCII (or the ASCII subset of UTF-8), nothing can stop XML documents from containing, say, Korean text. You will probably have to dig into your computing platform’s encoding support to determine whether you can use alternative encodings, however.
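As an illustration of why the encoding declaration matters for documents that are not in UTF-8, the sketch below encodes a small document as ISO-8859-1 and parses it with and without the declaration. It assumes Python’s standard xml.etree.ElementTree module, whose underlying parser understands ISO-8859-1; the name element and its content are invented for the example.

```python
import xml.etree.ElementTree as ET

body = "<name>Jos\u00e9</name>"
declaration = '<?xml version="1.0" encoding="ISO-8859-1"?>'

# With the declaration, the parser knows how to decode the bytes.
declared = (declaration + body).encode("iso-8859-1")
print(ET.fromstring(declared).text)

# Without it, the parser assumes UTF-8 and rejects the stray octet.
undeclared = body.encode("iso-8859-1")
try:
    ET.fromstring(undeclared)
    rejected = False
except ET.ParseError:
    rejected = True
print(rejected)
```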

Validity

In addition to well-formedness, XML 1.0 offers another level of verification, called validity. To understand why validity is important, imagine that you invented a simple XML format for your friends’ telephone numbers:

<phonebook>
    <person>
        <name>Albert Smith</name>
        <number>123-456-7890</number>
    </person>
    <person>
        <name>Bertrand Jones</name>
        <number>456-123-9876</number>
    </person>
</phonebook>

Based on your format, you also construct a program to display and search your phone numbers. This program turns out to be so useful that you share it with your friends. However, your friends are not as hot on detail as you are, and they try to feed your program this phone book file:

<phonebook>
    <person>
        <name>Melanie Green</name>
        <phone>123-456-7893</phone>
    </person>
</phonebook>

Note that although this file is perfectly well formed, it doesn’t fit the format you prescribed for the phone book because there is a phone element where there should have been a number element. You will likely need to change your program to cope with this situation. If your friends had used number as you did to denote the phone number, and not phone, there would not have been a problem. However, as it is, this second file probably will not be usable by programs set up to work with the first file; from the program’s perspective, it is not valid.
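To show what “not valid from the program’s perspective” means in practice, here is a hand-rolled check for the phone book format. This is only a sketch of the idea and not a substitute for real validation: the function phonebook_problems and the REQUIRED list are hypothetical, and the rule that each person must contain exactly a name followed by a number is my reading of the format above.

```python
import xml.etree.ElementTree as ET

REQUIRED = ["name", "number"]  # children each person must have, in order


def phonebook_problems(xml_text: str) -> list:
    """Return a list of human-readable problems; empty means acceptable."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "phonebook":
        problems.append(f"unexpected root element <{root.tag}>")
    for index, person in enumerate(root.findall("person"), start=1):
        children = [child.tag for child in person]
        if children != REQUIRED:
            problems.append(
                f"person {index}: expected {REQUIRED}, found {children}"
            )
    return problems


GOOD = ("<phonebook><person><name>Albert Smith</name>"
        "<number>123-456-7890</number></person></phonebook>")
BAD = ("<phonebook><person><name>Melanie Green</name>"
       "<phone>123-456-7893</phone></person></phonebook>")

print(phonebook_problems(GOOD))
print(phonebook_problems(BAD))
```

Writing such checks by hand for every format quickly becomes tedious, which is the motivation for describing the rules in a machine-readable form instead.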

For validity to be a useful general concept, we need a machine-readable way to say what a valid document is; that is, which elements and attributes must be present and in what order. XML 1.0 achieves this by introducing document type definitions (DTDs).

DTDs

The purpose of a DTD is to express which elements and attributes are allowed in a certain document type and to constrain the order in which elements must appear within that document type. A DTD is generally composed of one file or a group of connected files, containing declarations defining element types, attribute lists, and entities.

Connecting DTDs to documents

Although you may not work with DTDs, you should be aware of how they are linked to XML documents. The connection is done with a document type declaration, <!DOCTYPE ...>, inserted at the beginning of the XML document, after the XML declaration in our fictitious example:

<?xml version="1.0" encoding="us-ascii"?>
<!DOCTYPE authors SYSTEM "http://example.com/authors.dtd">
<authors>
    <person id="lear">
        <name>J.K. Rowling</name>
        <nationality>British</nationality>
    </person>
    <person id="lewis">
        <name>C.S. Lewis</name>
        <nationality>Irish</nationality>
    </person>
    <person id="mysteryperson"/>
</authors>

This example assumes that the DTD file has been placed on a web server located at Example.com. Note that the document type declaration specifies the root element of the document, not the DTD itself. You could use the same DTD to define person, name, or nationality as the root element of a valid document. Certain DTDs, such as the DocBook DTD for technical documentation (see http://www.docbook.org/), use this feature to good effect, allowing you to use the same DTD while working with multiple document types.

A validating XML processor is obligated to check the input document against its DTD. If it does not validate, the document is rejected. To return to the phone book example, if your application validated its input files against a phone book DTD, you would have been spared the problems of debugging your program and correcting your friend’s XML because your application would have rejected the document as being invalid.
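Even a non-validating parser records the document type declaration, so an application can at least inspect it. The sketch below assumes Python’s standard xml.dom.minidom module, which does not validate and does not fetch the DTD; the document is a shortened form of the authors example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    b'<?xml version="1.0" encoding="us-ascii"?>'
    b'<!DOCTYPE authors SYSTEM "http://example.com/authors.dtd">'
    b'<authors><person id="mysteryperson"/></authors>'
)

doctype = doc.doctype
print(doctype.name)      # the root element named in the declaration
print(doctype.systemId)  # where the DTD is said to live

# The declared name should match the actual root element.
print(doctype.name == doc.documentElement.tagName)
```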

Extensible Stylesheet Language Transformations

We’ve covered the basics of XML, and now we’ll discuss what we can do with the data we have. By transforming XML, we can make our data more presentable to a user. XSL refers to a family of specifications: XSL Transformations (XSLT), the XML Path Language (XPath), and a formatting language (XSL Formatting Objects); this appendix concentrates on XSLT. XSLT became a W3C Recommendation on November 16, 1999 as XSL Transformations (XSLT) Version 1.0; the latest version became a W3C Recommendation on January 23, 2007 as XSL Transformations (XSLT) Version 2.0 (http://www.w3.org/TR/xslt20/). Browsers currently support XSLT 1.0.

XSLT is used to transform an XML file into another text-based format—often HTML or XHTML, but sometimes plain text or other XML vocabularies.

For example, this XSLT would transform the earlier phone book XML into a piece of XHTML code that the browser could style and view accordingly:

<xsl:template match="/phonebook">
    <div>
        <xsl:for-each select="person">
            <div>
                <xsl:text>Name: </xsl:text>
                <strong><xsl:value-of select="name" /></strong>
            </div>
            <div>
                <xsl:text>Number: </xsl:text>
                <strong><xsl:value-of select="number" /></strong>
            </div>
        </xsl:for-each>
    </div>
</xsl:template>
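To make the result of this template easier to picture, the following sketch builds the same output by hand. It is not XSLT and does not run the stylesheet; Python’s standard library has no XSLT processor, so the code simply mirrors each step of the template using xml.etree.ElementTree. The function name render and the one-person input are my own.

```python
import xml.etree.ElementTree as ET

PHONEBOOK = """<phonebook>
    <person>
        <name>Albert Smith</name>
        <number>123-456-7890</number>
    </person>
</phonebook>"""


def render(phonebook_xml: str) -> str:
    """Produce the markup the template describes for each person."""
    source = ET.fromstring(phonebook_xml)
    outer = ET.Element("div")
    for person in source.findall("person"):       # like xsl:for-each
        for label, field in (("Name: ", "name"), ("Number: ", "number")):
            row = ET.SubElement(outer, "div")
            row.text = label                      # like xsl:text
            strong = ET.SubElement(row, "strong")
            strong.text = person.findtext(field, "")  # like xsl:value-of
    return ET.tostring(outer, encoding="unicode")


print(render(PHONEBOOK))
```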

The Progression of XSL

XSLT developed in several distinct stages to become what it is today. As more developers began to use and understand XML and XSL, the requirements changed, and the definition changed along the way to accommodate new ideas:

XML Query Language

Proposed in 1998 by Microsoft, Texcel, and webMethods, XML Query Language (XQL) was intended to transform XML into HTML so that browsers of the time could read it. The general query mechanism that came out of this proposal was the XSL pattern language.

XSLT

In 1999, the W3C introduced XSLT as a way to unify all the research that had been going on to create a “common core semantic model for querying.”

XPath

While XSLT was being developed, XPointer was being defined as well. Both XPointer and XSLT required a way to address various portions of a document, and the solution was a subset of XSLT called XPath. XPath, though a subset of XSLT, can also be used as a standalone mechanism.

The Stylesheet

The XSLT stylesheet defines the transformations that should process the XML data being referenced. XSLT has traditionally been handled by external processes, often on the server, but some modern browsers are now handling transformations themselves. XML documents can specify which stylesheets are most appropriate for their processing. For example:

<?xml-stylesheet type="text/xml" href="transform.xsl"?>

This declaration must be made as part of the prolog of the XML document.

Document declaration

Just as with XML documents, XSLT documents require a root element at the beginning of the document after the XML prolog. The <xsl:stylesheet> element is used to declare the document’s relevant information:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
.
.
.
</xsl:stylesheet>

Tip

<xsl:stylesheet> and <xsl:transform> define the root element of an XSLT document and are completely synonymous.

Table A-4 lists the available attributes for the <xsl:stylesheet> or <xsl:transform> element.

Table A-4. Available attributes of the <xsl:stylesheet> or <xsl:transform> element

Attribute

Description

exclude-result-prefixes

This attribute is optional and should contain a whitespace-separated list of namespace prefixes that should not be sent with the output.

extension-element-prefixes

This attribute is optional and should contain a whitespace-separated list of namespace prefixes used for extension elements.

id

This attribute is optional and is the unique identifier for the stylesheet.

version

This attribute is required and contains the XSLT version of the stylesheet.

XSLT Elements

The <xsl:stylesheet> and <xsl:transform> elements I just introduced are examples of the XSLT elements available to create an XSLT document. In the following sections, I will discuss some of the more commonly used elements and how to use them in an XSLT document. Learning XSLT, by Michael Fitzgerald (O’Reilly), is a good resource for all of the XSLT elements.

<xsl:template>

You create a template rule using the <xsl:template> element. For example:

<xsl:template match="person">
    <div>
        <xsl:text>Name: </xsl:text>
        <strong><xsl:value-of select="name" /></strong>
    </div>
    <div>
        <xsl:text>Number: </xsl:text>
        <strong><xsl:value-of select="number" /></strong>
    </div>
</xsl:template>

All attributes for this element, shown in Table A-5, are optional. However, if no name is specified, a match must be, and vice versa.

Table A-5. Available attributes of the <xsl:template> element

Attribute

Description

Match

This attribute is optional and defines the pattern that should be matched for the template. If this attribute is omitted, there must be a name attribute.

mode

This attribute is optional and defines a specific mode for the template.

name

This attribute is optional and defines a specific name for the template. If this attribute is omitted, there must be a match attribute.

priority

This attribute is optional and defines a number to indicate the numeric priority of the template.
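The following sketch shows a template that has a name but no match attribute, assuming the phonebook structure used in the other examples. The template name format-entry and the parameter name label are hypothetical. A named template is invoked with <xsl:call-template>, and the current node does not change during the call, so name still refers to the child of the person being processed.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes" />

    <xsl:template match="/phonebook">
        <xsl:apply-templates select="person" />
    </xsl:template>

    <!-- Matched template: runs once for each person element -->
    <xsl:template match="person">
        <xsl:call-template name="format-entry">
            <xsl:with-param name="label" select="'Name: '" />
        </xsl:call-template>
    </xsl:template>

    <!-- Named template (hypothetical name): has name but no match -->
    <xsl:template name="format-entry">
        <xsl:param name="label" />
        <div>
            <xsl:value-of select="$label" />
            <strong><xsl:value-of select="name" /></strong>
        </div>
    </xsl:template>
</xsl:stylesheet>
```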

<xsl:text>

When literal text is to be written to the output, you use the <xsl:text> element. This element may contain any literal text and entity references. For example:

<xsl:text>Name: </xsl:text>

Only one attribute is available with this element, and it is optional: disable-output-escaping takes a yes or no value. With no (the default), special characters such as less than (<) are output as entity references (&lt;); with yes, they are written to the output as is.
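The two settings can be compared side by side in a short fragment, which is assumed to sit inside an <xsl:template>. The XSLT 1.0 specification does not require processors to support disabling output escaping, so treat the second form as something to verify in your target environment.

```xml
<!-- Default (disable-output-escaping="no"): the output contains &lt;br /&gt; -->
<xsl:text>&lt;br /&gt;</xsl:text>

<!-- Escaping disabled: a processor that supports this writes <br /> as markup -->
<xsl:text disable-output-escaping="yes">&lt;br /&gt;</xsl:text>
```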

<xsl:value-of>

You use the <xsl:value-of> element to extract the value out of a selected node. This is used to select the value of an XML element and add it to the transformed output. For example:

<xsl:value-of select="name" />

Table A-6 contains the attributes associated with this element.

Table A-6. Available attributes of the <xsl:value-of> element

Attribute

Description

disable-output-escaping

This attribute is optional and is a yes or no value that indicates whether special characters such as less than (<) should be left as is or output as an entity (&lt;). The default is no.

select

This attribute is required and contains an XPath expression that indicates the node/attribute from which to extract the value.

<xsl:for-each>

For basic looping within the XSLT document, you use the <xsl:for-each> element. This element can select elements of a specified node group, and you can use it to filter this group. For example:

<xsl:for-each select="person">
    <div>
        <xsl:text>Name: </xsl:text>
        <strong><xsl:value-of select="name" /></strong>
    </div>
    <div>
        <xsl:text>Number: </xsl:text>
        <strong><xsl:value-of select="number" /></strong>
    </div>
</xsl:for-each>

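You can filter the group by adding a criterion, written as a predicate in square brackets, to the select attribute. The comparison operators you can use in the predicate include the following (inside an attribute value, less than must be written as the entity reference &lt;):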

  • = (equal)

  • != (not equal)

  • &lt; (less than)

  • &gt; (greater than)

Here is an example of a basic filter:

<xsl:for-each select="person[name='Anthony Holdener']">
    <div>
        <xsl:text>Name: </xsl:text>
        <strong><xsl:value-of select="name" /></strong>
    </div>
    <div>
        <xsl:text>Number: </xsl:text>
        <strong><xsl:value-of select="number" /></strong>
    </div>
</xsl:for-each>

The only attribute that the <xsl:for-each> element takes is the required select attribute.
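A numeric comparison works the same way as the string filter. This sketch assumes the same phonebook structure and uses the XPath position( ) function to keep only the first two person elements; note that the less-than operator is written as &lt; because it appears inside an attribute value.

```xml
<!-- Keep only the person elements whose position is less than 3 -->
<xsl:for-each select="person[position( ) &lt; 3]">
    <div>
        <xsl:text>Name: </xsl:text>
        <strong><xsl:value-of select="name" /></strong>
    </div>
</xsl:for-each>
```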

<xsl:if>

When you need a conditional test with an element’s value in an XML file, you use the <xsl:if> element in the XSLT document. This element takes a test attribute (which is required) to execute an expression against an XML element’s value and contains a template to be used when the expression evaluates to true. Here is an example using the <xsl:if> element:

<xsl:for-each select="person">
    <xsl:if test="name='Anthony Holdener'">
        <h3><xsl:text>The author of this book!</xsl:text></h3>
    </xsl:if>
    <div>
        <xsl:text>Name: </xsl:text>
        <strong><xsl:value-of select="name" /></strong>
    </div>
    <div>
        <xsl:text>Number: </xsl:text>
        <strong><xsl:value-of select="number" /></strong>
    </div>
</xsl:for-each>
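The <xsl:if> element has no else branch. When you need one, a sketch along the following lines uses <xsl:choose> with <xsl:when> and <xsl:otherwise> instead; the heading text in the <xsl:otherwise> branch is made up for illustration.

```xml
<xsl:for-each select="person">
    <xsl:choose>
        <!-- Used when the test expression evaluates to true -->
        <xsl:when test="name='Anthony Holdener'">
            <h3><xsl:text>The author of this book!</xsl:text></h3>
        </xsl:when>
        <!-- Used when no xsl:when test succeeded -->
        <xsl:otherwise>
            <h3><xsl:text>A phonebook entry</xsl:text></h3>
        </xsl:otherwise>
    </xsl:choose>
    <div>
        <xsl:text>Name: </xsl:text>
        <strong><xsl:value-of select="name" /></strong>
    </div>
</xsl:for-each>
```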

<xsl:apply-templates>

When you have created templates and you need to apply them to the nodes of the source XML document, you use <xsl:apply-templates>. This element must be found within an <xsl:template> element to be valid and function correctly. For example:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes" />
    <xsl:template match="/phonebook">
        <xsl:apply-templates select="person" />
    </xsl:template>

    <xsl:template match="person">
        <xsl:if test="name='Anthony Holdener'">
            <h3><xsl:text>The author of this book!</xsl:text></h3>
        </xsl:if>
        <div>
            <xsl:text>Name: </xsl:text>
            <strong><xsl:value-of select="name" /></strong>
        </div>
        <div>
            <xsl:text>Number: </xsl:text>
            <strong><xsl:value-of select="number" /></strong>
        </div>
    </xsl:template>
</xsl:stylesheet>

The standard elements

Table A-7 provides a complete list of all the standard elements that you can use in XSLT stylesheets.

Table A-7. The standard elements for XSLT

Element

Description

xsl:apply-imports

This element applies a template from an imported stylesheet.

xsl:apply-templates

This element applies a template to the current element or to the current element’s child nodes.

xsl:attribute

This element adds an attribute.

xsl:attribute-set

This element defines a specified set of attributes.

xsl:call-template

This element calls a specified template.

xsl:choose

This element is used with xsl:when and xsl:otherwise to create a multiple-conditional test.

xsl:comment

This element creates a comment node.

xsl:copy

This element creates a copy of the current node, but doesn’t copy child nodes or attributes.

xsl:copy-of

This element creates a copy of the selected nodes, including their child nodes and attributes.

xsl:decimal-format

This element defines the characters and symbols to be used when the format-number( ) function (see Table A-8) is executed.

xsl:element

This element creates an element node.

xsl:fallback

This element defines an alternative to use if the program processing the XSLT does not support a given XSLT element.

xsl:for-each

This element loops through each node in a defined set of nodes.

xsl:function

This element defines a function for use within a stylesheet. The function is written in XSLT, but it may be called from any XPath expression in the stylesheet. It is available only in XSLT 2.0.

xsl:if

This element holds a template to be applied to the output when a specified condition is true.

xsl:import

This element imports the structure of one stylesheet into another, but sets the precedence of the imported structure lower than the importing stylesheet’s structure.

xsl:include

This element includes the structure of one stylesheet in another, giving the included structure the same precedence as the including stylesheet’s structure.

xsl:key

This element defines a specified key that is used with the key( ) function.

xsl:message

This element writes a message to the output when reporting errors.

xsl:namespace-alias

This element replaces a namespace used in the stylesheet with a different namespace in the output.

xsl:number

This element determines the integer position of the current node and writes it to the output as a formatted number.

xsl:otherwise

This element defines a default action for the xsl:choose element.

xsl:output

This element defines the format for the document’s output.

xsl:param

This element declares a parameter.

xsl:preserve-space

This element tells the processor which elements should have their whitespace preserved.

xsl:processing-instruction

This element writes a processing instruction to the document’s output.

xsl:sort

This element sorts the output.

xsl:strip-space

This element tells the processor which elements should have their whitespace removed.

xsl:stylesheet

This element defines the root element for the stylesheet. It is synonymous with the xsl:transform element.

xsl:template

This element creates a structure to apply when a specified node is matched.

xsl:text

This element writes literal text to the output.

xsl:transform

This element defines the root element for the stylesheet. It is synonymous with the xsl:stylesheet element.

xsl:value-of

This element gets the value of a selected node.

xsl:variable

This element declares a variable.

xsl:when

This element specifies an action for the xsl:choose element.

xsl:with-param

This element specifies the value of a parameter to be passed into an xsl:template element.

Using functions

XSLT functions are used as part of the XPath expressions in an XSLT stylesheet. XSLT has built-in functions, as I will show in Table A-8, as well as functions that it inherits from XPath. Using functions in an XPath expression is simple. For example:

<xsl:apply-templates select="book[@title=current( )/@ref]" />

Assuming that you have an XML document of books, this example will process all book elements that have a title attribute with a value equal to the current node’s ref attribute.

Table A-8. The built-in functions in XSLT

Function

Description

current( )

This function returns the current node.

document(object, node-set)

This function is used to access the nodes in an external XML document.

element-available(string)

This function tests whether the XSLT processor supports the specified element.

format-number(number, format[,decimalFormat])

This function converts a number into a formatted string.

function-available(string)

This function tests whether the XSLT processor supports the function specified.

generate-id(node-set)

This function returns a string value that uniquely identifies a specified node or node set.

key(string, object)

This function returns a node set using the index created by an <xsl:key> element.

system-property(string)

This function returns the value of the system properties specified.

unparsed-entity-uri(string)

This function returns the URI of the unparsed entity specified.
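As a rough sketch of two of these functions working together, the stylesheet below assumes the phonebook structure from the earlier examples. The key name person-by-name is hypothetical. The <xsl:key> element builds an index of person elements by name, key( ) looks an entry up in that index, and format-number( ) pads the entry count to two digits (for example, 03 for three entries).

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes" />

    <!-- Index every person element by the value of its name child -->
    <xsl:key name="person-by-name" match="person" use="name" />

    <xsl:template match="/phonebook">
        <!-- Look up the indexed person element(s) with this name -->
        <xsl:for-each select="key('person-by-name', 'Anthony Holdener')">
            <div>
                <xsl:text>Number: </xsl:text>
                <strong><xsl:value-of select="number" /></strong>
            </div>
        </xsl:for-each>
        <!-- Count the entries and pad the result to two digits -->
        <div>
            <xsl:text>Entries: </xsl:text>
            <xsl:value-of select="format-number(count(person), '00')" />
        </div>
    </xsl:template>
</xsl:stylesheet>
```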

XSLT 2.0 also inherits many functions through XPath 2.0, though no browsers currently support either technology. Until browsers begin to implement XPath 2.0 functions, those listed in Table A-8, together with the functions inherited from XPath 1.0, are the only functions available in an XSLT document. Once browsers implement XPath 2.0 and XSLT 2.0, XSLT will become considerably more capable.
