13.2. Saxon

Michael Kay has contributed what might be considered one of the most robust and versatile XSLT processors with his Saxon product. It has one of the largest sets of built-in extension top-level elements, instruction elements, and functions. It also runs on Java and is regularly updated at the source Web site, http://users.iclway.co.uk/mhkay/saxon.

Saxon includes a servlet that allows it to be invoked directly from a URL entered into a browser. You might think of Saxon as the “programmer's XSLT processor,” due to its extended documentation for adding extensions, event handlers, and so forth (see the api-guide.html file in the Saxon user documentation).

The Saxon XSLT processor is available in two forms, a “complete” Saxon API for Java, and a simple command-line version of the processor, called Instant Saxon.

The complete Saxon API contains a Java library, which supports a similar processing model to XSL, but allows full programming capability, which you need if you want to perform complex processing of the data or to access external services such as a relational database. It includes a typical set of .jar files that are added to the CLASSPATH environment, and also contains utilities such as a DTD generator and other goodies, including documentation.

The simple version, Instant Saxon, runs straight from a Windows command-line. The Microsoft JVM must be installed on the system prior to using Instant Saxon. However, if you use Internet Explorer 4 or later, the JVM will already be on your system. The Instant Saxon installation does not emphasize the extras or the documentation, so it may be worth downloading both versions just to get these. The Instant Saxon installation comes bundled with the AElfred XML parser from Microstar.[5]

[5] See www.microstar.com.

13.2.1. Installing Full Saxon on Solaris/UNIX or Windows Java

If you are installing the full Saxon product, you will need the JDK 1.2 (1.1.6+ will do, but is not recommended). Kay notes that the current version is compiled with Java 2 and will run with 1.1, but will not compile under 1.1. If you do not use the default AElfred parser included with Saxon, you will also need a SAX1 or SAX2 parser, such as XP.

The core program for working with objects is a JAR file, saxon.jar, which you must include on your CLASSPATH. We will continue to work with the model introduced above, which assumes you will put this in a directory under /usr/bin, likely called /usr/bin/saxon.

You can find additional user documentation, covering both the XSLT and Java interfaces, included in the Saxon package as JAVADOC specifications. These package summaries give an overview in the form of a user guide. In addition, there is an introductory overview, included with the documentation provided with Saxon.

Saxon comes with a bundled XML parser, a modified copy of the AElfred parser, adapted to notify comments to the application. Saxon has been tested successfully in the past with Lark, MSXML, SUN Project X, Oracle XML, Xerces, xml4j, and XP. Use of a SAX2-compliant parser is preferred, as SAX1 does not allow XML comments to be passed to the application. However, Saxon works with either. All the relevant classes must be installed on your Java CLASSPATH. The following examples assume that you will use the default xp.jar XML processor and that you have put it in your directory with Saxon.

At the very least, you must include saxon.jar and xp.jar on the system CLASSPATH. Thus, where you had a basic “.”, you would modify it as follows:

setenv CLASSPATH /usr/bin/saxon/xp.jar:/usr/bin/saxon/saxon.jar:.

Use the above for your .cshrc on Solaris/UNIX. For an autoexec.bat file on Windows, use the set command with the same list of files, but remember to use the semicolon (;) as the separator and the reverse slash (\) in paths:

set CLASSPATH=\usr\bin\saxon\xp.jar;\usr\bin\saxon\saxon.jar;.

To run full Saxon, unless you've attached some applet wrapper or invoked it from a URL in a browser (in which case, you should review the Saxon documentation index.html file), open a command line window and run it with the following syntax:

saxon [options] source.xml stylesheet.xsl [params . . .]

13.2.2. Installing Instant Saxon on Windows

All you need to install Instant Saxon is the download zip file located at http://users.iclway.co.uk/mhkay/saxon. You do not need to add any extra parsers or to modify PATH or CLASSPATH environment variables, provided you have IE 4+ (IE 5 recommended) on your Windows 95, 98, or NT/2000 machine. Unzip the file in the directory where you plan to use Saxon and you are ready to go.

To run Instant Saxon on Windows, use a command-line or DOS window (select Start, Run, and type cmd) and run it with the following syntax:

saxon [options] input.xml stylesheet.xsl [params . . .]

Options and parameters for Instant Saxon are described in the following sections. The input.xml and stylesheet.xsl represent filenames for the input XML document and the XSL stylesheet being used, respectively.

13.2.3. Saxon Options

Saxon has a number of command-line options that are used when invoking Saxon with an XSLT stylesheet (see Table 13-3). The options must precede the input.xml and the stylesheet.xsl filenames on the command-line:

saxon [options] input.xml stylesheet.xsl [params . . .]

Table 13-3. Command-line options for Saxon
Argument Action/Effect
-a Used with XML documents that directly contain a stylesheet. This means that the filename for the stylesheet on the command line is not required. See Chapter 2, Section 2.7 for more information on including XSLT stylesheets in an XML document.
-ds | -dt Selects which internal tree model is to be used. -dt (which is the default) selects the "tinytree" model, and -ds selects the traditional tree model.
-l Saxon implements a line-numbering function, saxon:line-number(), that gives access to line numbers in the input document. This option enables (turns on) the line numbering for the source document.
-m classname Used with the <xsl:message> element to control where the output of messages is sent. The classname must name a class that implements com.icl.saxon.output.Emitter.
-r classname Used to resolve URIs into a source document for the document() function and for the <xsl:include> and <xsl:import> elements. Also used with the -u option to process the URIs of the input file and stylesheet file provided on the command-line.
-o filename Used to provide a filename for the output from the processor. This option checks the extension of the filename provided to determine the output file type if one is not explicitly specified with the method attribute of <xsl:output>.
-t Displays the version and timing information.
-T Displays stylesheet tracing information. Also enables (turns on) the line numbering for the source document.
-TL classname Signals the processor to use a TraceListener. The name of a user-defined class, which must implement com.icl.saxon.trace.TraceListener, is specified with the classname.
-u Provides the ability to use URLs for the input and stylesheet filenames on the command line. If the filenames start with “http:” or “file:” they are assumed to be URLs, and this option is not required.
-w0, -w1, or -w2 Saxon implements three levels of recovery when an error occurs. The level can be specified on the command-line as: -w0, recover silently; -w1, recover after writing out a warning message; -w2, signal the error and do not attempt recovery. The default is -w1.
-x classname The SAX parser used to process the XML files can be specified using this option. The classname specifies a Java class that implements the org.xml.sax.Parser or org.xml.sax.XMLReader interface.
-y classname The SAX parser used to process the XSLT files can be specified using this option. The classname specifies a Java class that implements the org.xml.sax.Parser or org.xml.sax.XMLReader interface.
-? Displays the help for Saxon's command-line syntax.

13.2.4. Saxon Command-line Parameters

Saxon provides the ability to submit parameter values through the command-line at run-time to update global parameters defined in the stylesheet with the <xsl:param> top-level element. The parameters must follow the filenames for the input XML document and the XSLT stylesheet on the command-line as follows:

saxon [options] input.xml stylesheet.xsl [params . . .]
					

A parameter value is passed to the stylesheet in the form name=value, where name is the name of the parameter defined in the stylesheet with <xsl:param>, and value is the new value for the parameter. If the parameter is not declared in the stylesheet, the parameter from the command-line is ignored. Parameter values that contain spaces should be surrounded with double quotes on the command-line.
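As a minimal sketch of how the two sides fit together, the stylesheet below declares a global parameter with a default value that a command-line value would replace. The parameter name title, its default, and the invocation shown in the comment are illustrative assumptions, not taken from the Saxon documentation.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Global parameter; the default applies when no value is
       supplied on the command line. Hypothetical invocation:
       saxon input.xml stylesheet.xsl title="Quarterly Report" -->
  <xsl:param name="title" select="'Untitled'"/>

  <xsl:template match="/">
    <html>
      <head><title><xsl:value-of select="$title"/></title></head>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Because the value contains a space, it is wrapped in double quotes on the command line, as described above.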

13.2.5. Saxon Extensions

Saxon includes what is one of the largest collections of built-in extensions. They include extension top-level elements, extension functions, extension attributes, and extension instruction elements. The following material is excerpted and annotated from the material included from the current download of Saxon (this is from the extensions.html file in the Saxon documentation). The most up-to-date documentation is available at http://users.iclway.co.uk/mhkay/saxon/. Kay provides the following preface to users of the Saxon extensions:

These extension functions and elements have been provided because there are things that are difficult to achieve, or inefficient, using standard XSLT facilities alone. As always, it is best to stick to standard if you possibly can: and most things are possible, even if it's not obvious at first sight.

13.2.5.1. Saxon Attribute Extensions

Saxon implements the following extension attributes: trace, allow-avt, disable-output-escaping,[6] method,[7] indent-spaces, character-representation, omit-meta-tag, and next-in-chain.

[6] The disable-output-escaping attribute has been implemented in the XSLT specification and is no longer a Saxon extension.

[7] The method attribute is from the XSLT1.0 specification, but Saxon adds support for QName values.

The use of the Saxon extension attributes requires that the Saxon namespace be declared either in the document element, an element that uses the extension, or an ancestor of the element that uses the extension. The Saxon namespace is declared using the following format:

xmlns:saxon="http://icl.com/saxon"

The saxon:trace Extension Attribute

This attribute can be used on either the document element or an <xsl:template> element, and turns on echoing of the instantiation for each template rule. The reporting is sent to the standard error output, whether the command-line or a GUI window, as implemented by the application.

If you use this attribute on the document element, all the top-level elements are listed along with their import precedence. All contained template rules are then traced as well. The default value for saxon:trace is no, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:trace (yes|no) "no"
VALUE = (yes|no) "no"

Use this attribute on either the <xsl:stylesheet> or <xsl:transform> document elements, or a template rule as follows:

<xsl:template match="block" saxon:trace="yes">

The saxon:allow-avt Extension Attribute

This extension attribute is used with the <xsl:call-template> instruction element. It lets the value given for the name attribute of <xsl:call-template> be interpreted as an attribute value template when the value is surrounded by curly braces {} (see Chapter 6, Section 6.6.1). Since attribute value templates are not normally allowed as the value for the name in <xsl:call-template>, adding the extension attribute prevents a processor error. The default value for saxon:allow-avt is no, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:allow-avt (yes|no) "no"
VALUE = (yes|no) "no"

Use the saxon:allow-avt attribute as follows to permit AVTs in the value for name:

<xsl:call-template name="{$some_variable}"
     saxon:allow-avt="yes" >

The saxon:disable-output-escaping Extension Attribute

The disable-output-escaping attribute has been implemented in the XSLT specification and is no longer a Saxon extension. Its use can be found in Chapter 3 in conjunction with the <xsl:value-of> element, and Chapter 6 in conjunction with the <xsl:text> element.

The method Attribute with Saxon

The method attribute of <xsl:output> and <xsl:document> is not an extension attribute, but its value can contain a QName that is interpreted by the processor. The prefix of the QName must be a valid namespace prefix. We use saxon as the prefix in the following examples; however, it can be any valid prefix. Saxon implements the method attribute with the values shown in Table 13-4.

Table 13-4. Values of QNames implemented by the Saxon processor[a]
QName Action
saxon:fop Directs output to Apache's FOP processor (which must be installed separately from www.apache.org), which implements the developing W3C formatting objects, or FO, portion of XSL.
saxon:xhtml Outputs the result tree in XHTML format. This follows the same rules as method="xml", except that it follows the guidelines for making the XML acceptable to legacy HTML browsers. Specifically (a) empty elements such as <br/> are output as <br />, and (b) empty elements such as <p/> are output as <p></p>. The indent attribute defaults to "yes", and indenting follows the HTML rather than XML rules. Other attributes may be specified as for XML output, e.g. cdata-section-elements and omit-xml-declaration.
saxon:classname The fully qualified class name of a class that implements either the SAX org.xml.sax.DocumentHandler interface, or the SAX2 org.xml.sax.ContentHandler interface, or that is a subclass of the com.icl.saxon.output.Emitter class. If such a value is specified, output is directed to the user-supplied class.

[a] The information for this table comes directly from the Saxon 6.2.2 documentation.

Use the method attribute as follows:

<xsl:output method="saxon:fop"/>

The saxon:indent-spaces Extension Attribute

The saxon:indent-spaces attribute controls the amount of indentation that is generated when the output method is XML or HTML and indent is set to yes on either the <xsl:output> or <xsl:document> element. The value of the attribute must be an integer.

The value for saxon:indent-spaces is a number, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:indent-spaces NMTOKEN #IMPLIED
VALUE = Number

Use the saxon:indent-spaces attribute as follows:

<xsl:output indent="yes" saxon:indent-spaces="10"/>

The saxon:character-representation Extension Attribute

This attribute is used with <xsl:output> or <xsl:document>, and controls how non-ASCII characters are represented in the output. It works with the two method values, xml and html.

When used with the xml method, its value can be either decimal or hex.

When used with the html method, the value has two strings, separated by a semicolon. The first string controls how non-ASCII characters within the character encoding are represented, the values being native, entity, decimal, or hex. The second string controls how characters outside the encoding are represented, the values being entity, decimal, or hex.

The value for saxon:character-representation is a string, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:character-representation CDATA #IMPLIED
VALUE = String

Use the saxon:character-representation attribute as follows:

<xsl:output method="xml" saxon:character-representation="hex"/>

The saxon:omit-meta-tag Extension Attribute

This attribute is used with <xsl:output> and the html method. The normal action of the html output method is to generate a <META> tag immediately after the <HEAD> tag, containing details of the media type and character encoding. Setting this attribute to “yes” causes this output to be suppressed.

The values for saxon:omit-meta-tag are yes or no, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:omit-meta-tag (yes|no) "no"
VALUE = (yes|no) "no"

Use the saxon:omit-meta-tag attribute as follows:

<xsl:output method="html" saxon:omit-meta-tag="yes"/>

The saxon:next-in-chain Extension Attribute

The saxon:next-in-chain attribute is used with either <xsl:output> or <xsl:document> to direct the output to another stylesheet. The output is then used as the input for the new stylesheet. The value of the attribute is the URL of the new stylesheet. The output stream must always be pure XML, and attributes that control the format of the output (e.g., method, cdata-section-elements, etc.) will be ignored. The output of the second stylesheet will be directed to the destination that would have been used for the first stylesheet if no saxon:next-in-chain attribute were present. When used with <xsl:output>, the original transformation result destination is used. When used with <xsl:document>, the file specified by the href attribute is used. The value for saxon:next-in-chain is a URL, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:next-in-chain CDATA #IMPLIED
VALUE = URL

Use the saxon:next-in-chain attribute as follows:

<xsl:output saxon:next-in-chain="http://mystyles/newstyle.xsl"/>

13.2.5.2. Saxon Extension Elements

Saxon adds four top-level extension elements: <saxon:handler>, <saxon:preview>, <saxon:function>, and <saxon:script>, as well as eight instruction extension elements: <saxon:assign>, <saxon:doctype>, <saxon:entity-ref>, <saxon:group>, <saxon:item>, <saxon:output>, <saxon:return>, and <saxon:while>.

To use Saxon extension elements, their namespace must be declared and the extension-element-prefixes attribute on the document element must include the saxon value.
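A minimal sketch of a document element that satisfies both requirements might look like the following. The namespace URI is the one given earlier in this section; the template is only a placeholder.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:saxon="http://icl.com/saxon"
     extension-element-prefixes="saxon">

  <!-- Elements in the saxon namespace are now treated as extension
       instructions rather than as literal result elements. -->
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```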

All these extensions are available to either full Saxon or Instant Saxon. However, to use the external Java calls as with <saxon:output> you may need the accompanying documentation, which comes with full Saxon.

The <saxon:handler> Top-Level Extension Element

The <saxon:handler> top-level extension element is similar to <xsl:template>, and has the same uses for the match, mode, name, and priority attributes, as shown in the following element model definition:

<!-- Category: top-level-extension-element -->
<saxon:handler

  handler = classname
  match = pattern
  name = qname
  priority = number
  mode = qname
/>

This element is sorted for precedence of instantiation in equal standing with any other <xsl:template> element. Its function is to call a user-written Java NodeHandler, named in the mandatory handler attribute. The NodeHandler and the <saxon:handler> element are explained in detail in the Saxon documentation (begin with the extensions.html file in the Saxon documentation).

The <saxon:preview> Top-Level Extension Element

This top-level extension element is designed to facilitate more efficient handling of large documents. In the traditional XSLT stylesheet processing model, the entire input XML document instance is parsed into a tree and held in memory before any template rule is instantiated, which for large documents is very time- and system-resource consuming.

With <saxon:preview>, the relevant parts of the input source, those which find the template match, are processed as soon as they are parsed, then removed from the virtual document tree, saving on memory resources. In effect, it is possible to break the transformation of the document source into a series of separate smaller transformations. The elements listed in the mandatory elements attribute are “disregarded” by the Saxon processor after they have been treated according to whatever mode has been stipulated in the mandatory mode attribute. The results are written to the output result tree, but those elements in the input XML document instance are ignored in subsequent evaluation of other templates in the XSLT stylesheet. The following element model definition shows the structure of <saxon:preview>.

<!-- Category: top-level-extension-element -->
<saxon:preview

  mode = qname
  elements = qnames >

  <!-- Content: (xsl:param*, template) -->
</saxon:preview>

The <saxon:preview> element can be used to simply weed out undesired input elements by using it as a template that does nothing. In other words, give it no child instruction elements, only the list of elements to be ignored for that mode.

The <saxon:function> Top-Level Extension Element

The top-level <saxon:function> extension element is used to declare an extension function. It contains a template, preceded by zero or more <xsl:param> elements. It has a required name attribute whose value is a QName, evaluating to a URI, as shown in the following element model definition:

<!-- Category: top-level-extension-element -->
<saxon:function

  name = qname >
  <!-- Content: (xsl:param*, template?, saxon:return*, xsl:fallback?) -->
</saxon:function>

The function definition contains zero or more <saxon:return> instructions to define the return value. The Saxon documentation provides additional information for defining functions using the <saxon:function> element.

An example of using <saxon:function> from the Saxon Documentation is as follows:

<saxon:function name="my:initial">
    <xsl:param name="size"/>
    <saxon:return select="substring(.,1,$size)"/>
</saxon:function>
<xsl:template match="text()">
    <xsl:value-of select="my:initial(3)"/>
</xsl:template>

The <saxon:script> Top-Level Extension Element

The <saxon:script> element is a top-level element that is equivalent to <xsl:script>, defined in the XSLT 1.1 Working Draft (WD). Saxon provides this element so that it can be used in stylesheets that are shared among different processors. Any processor other than Saxon will ignore this element.

For example, to use an extension function like xx:intersection(), you can define the Saxon implementation as follows:

<saxon:script implements-prefix="xx" language="java"
      src="java:com.icl.saxon.functions.Extensions"/>

The following element model definition shows the structure of the <saxon:script> element:

<!-- Category: top-level-extension-element -->
<saxon:script

  implements-prefix = ncname
  language = "ecmascript" | "javascript" | "java" | qname-but-not-ncname
  src = uri-reference
  archive = uri-references >
  <!-- Content: #PCDATA -->
</saxon:script>

The <saxon:assign> Extension Element

This element provides a very useful feature that allows XSLT variables and parameters to be dynamically updated in the context of a template rule. Currently, XSLT variables and parameters, as codified in the W3C specification, cannot be updated other than in the case of a parameter with the use of <xsl:with-param>, which has limited uses. For example, you might have a variable named birthday that has been declared as follows:

<xsl:variable name="birthday" select="@date" saxon:assignable="yes" />

You can then update it to make an employee password a combination of start date, Social Security number, and birthdate.

<xsl:template match="password">
      <!-- xsl:attribute requires a name; "value" is illustrative -->
      <xsl:attribute name="value">
            <xsl:value-of select="@ssn" />
            <saxon:assign name="birthday"
            select="concat($birthday, @start-date)" />
      </xsl:attribute>
</xsl:template>

This extension instruction element can also contain a template, as shown in the element model definition below:

<!-- Category: instruction-extension-element -->
<saxon:assign

  name = qname
  select = node-set-expression >

  <!-- Content: (template) -->
</saxon:assign>

The variable being updated must have been defined using the extension attribute saxon:assignable="yes". The value of the variable is determined either using the select attribute or by instantiating the template it contains.
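The following is a minimal sketch of the pattern, keeping a running total in an assignable global variable. The element and attribute names (order, item, price, total) are hypothetical, and the fragment assumes the saxon prefix is declared and listed in extension-element-prefixes on the document element.

```xml
<!-- Declared assignable so that saxon:assign may update it -->
<xsl:variable name="total" select="0" saxon:assignable="yes"/>

<xsl:template match="order">
  <xsl:for-each select="item">
    <!-- Add this item's price to the running total -->
    <saxon:assign name="total" select="$total + @price"/>
  </xsl:for-each>
  <total><xsl:value-of select="$total"/></total>
</xsl:template>
```

Because the variable is global, the total carries over from one order element to the next. In standard XSLT the same per-order result can usually be had with sum(item/@price), which is the preferable form when portability matters.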

The <saxon:doctype> Extension Element

The <saxon:doctype> instruction element is used to insert a document type declaration into the current output file. It has no attributes, and its content is a template, as shown in the element model definition below. The template is instantiated to create an XML document that represents the DTD to be generated.

<!-- Category: instruction-extension-element -->
<saxon:doctype>

  <!-- Content: (template) -->
</saxon:doctype>

Note

If this element is present the doctype-system and doctype-public attributes of <xsl:output> are ignored.

The Saxon documentation provides detailed information on the output format and usage of the <saxon:doctype> element. An example of using <saxon:doctype> from the Saxon documentation is as follows:

<xsl:template match="/">
  <saxon:doctype xsl:extension-element-prefixes="saxon">
  <dtd:doctype name="booklist"
       xmlns:dtd="http://icl.com/saxon/dtd" xsl:exclude-result-prefixes="dtd">
    <dtd:element name="booklist" content="(book)*"/>
    <dtd:element name="book" content="EMPTY"/>
    <dtd:attlist element="book">
      <dtd:attribute name="isbn" type="ID" value="#REQUIRED"/>
      <dtd:attribute name="title" type="CDATA" value="#IMPLIED"/>
    </dtd:attlist>
    <dtd:entity name="blurb">'A <i>cool</i> book with &gt; 200 pictures!'</dtd:entity>
    <dtd:entity name="cover" system="cover.gif" notation="GIF"/>
    <dtd:notation name="GIF" system="http://gif.org/"/>
  </dtd:doctype>
  </saxon:doctype>
  <xsl:apply-templates/>
</xsl:template>

The <saxon:entity-ref> Extension Element

This instruction element allows HTML entities such as &nbsp; to be generated in HTML output when the <xsl:output> top-level element has a method attribute of html. Use the element as follows:

<saxon:entity-ref name="nbsp" />

This empty element has one required attribute, name, as shown in the element model definition below:

<!-- Category: instruction-extension-element -->
<saxon:entity-ref

 name = qname
/>

The <saxon:group> Extension Element

The grouping mechanism provided by <saxon:group> allows iteration over nodes selected in an expression returning a node-set. The required select attribute is used to define the nodes which will be used for the iteration, as shown in the following element model definition:

<!-- Category: instruction-extension-element -->
<saxon:group

  select = node-set-expression
  group-by = string >

  <!-- Content: (xsl:sort*, template?, saxon:item, template?) -->
</saxon:group>

This instruction element is similar in function to <xsl:for-each>. It also requires a group-by attribute, which determines how the grouping is done; its value is a string expression that is applied to each item selected by the select attribute. This element can have <xsl:sort> children and must have a <saxon:item> child (see the section immediately below). The other instructions contained in <saxon:group> are performed once for each group of items selected by the select attribute of the parent <saxon:group>.

The <saxon:item> Extension Element

This element is the required child of the <saxon:group> element, and stipulates the items within a group. XSLT instructions outside of <saxon:item> are executed once for each group that qualifies in the group-by attribute of the <saxon:group>. The XSLT instructions that are children of <saxon:item> are executed once per item. This element has no attributes, and contains a template, as shown in the element model definition below:

<!-- Category: extension-element -->
<saxon:item>

  <!-- Content: (template) -->
</saxon:item>
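A minimal sketch of the two elements working together is shown below, listing book titles under a heading for each author. The input element names (booklist, book, author, title) are hypothetical, and the fragment assumes the saxon prefix is declared and listed in extension-element-prefixes.

```xml
<xsl:template match="booklist">
  <saxon:group select="book" group-by="author">
    <!-- Sort on the grouping key so that books by one author sit together -->
    <xsl:sort select="author"/>
    <!-- Outside saxon:item: executed once per group -->
    <h2><xsl:value-of select="author"/></h2>
    <saxon:item>
      <!-- Inside saxon:item: executed once per book in the group -->
      <p><xsl:value-of select="title"/></p>
    </saxon:item>
  </saxon:group>
</xsl:template>
```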

The <saxon:output> Extension Element

This element redirects all result tree nodes produced within the <saxon:output> tags to a different file. After its contents have been executed and placed in the respective file, the output destination reverts to the previous output destination stipulated when the XSLT stylesheet was invoked. This element is equivalent to the <xsl:document> element that is specified in the XSLT 1.1 WD, which is implemented by many processors. Note that in previous versions of Saxon, <saxon:output> had additional functionality that has been removed. The <saxon:output> element is shown in the following element model definition:

<!-- Category: instruction-element -->
<saxon:output
  href = { uri-reference }
  method = { "xml" | "html" | "text" | qname-but-not-ncname }
  version = { nmtoken }
  encoding = { string }
  omit-xml-declaration = { "yes" | "no" }
  standalone = { "yes" | "no" }
  doctype-public = { string }
  doctype-system = { string }
  cdata-section-elements = { qnames }
  indent = { "yes" | "no" }
  media-type = { string } >

  <!-- Content: (template) -->
</saxon:output>
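As a minimal sketch, the template below writes each chapter of a hypothetical input document to its own HTML file. The chapter element and its id attribute are assumptions made for illustration; the braces in the model definition above indicate that href accepts an attribute value template, which is what makes a per-chapter filename possible.

```xml
<xsl:template match="chapter">
  <!-- href is an attribute value template: one output file per chapter -->
  <saxon:output href="{@id}.html" method="html">
    <html>
      <body><xsl:apply-templates/></body>
    </html>
  </saxon:output>
</xsl:template>
```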

The <saxon:return> Extension Element

The <saxon:return> element is used to exit from a function, and provides a return value. It is only used within a <saxon:function> element, and it must not have any following sibling instructions other than <xsl:fallback>. However, there can be more than one <saxon:return> instruction in a function, for example, one in each branch of an <xsl:choose>.

The <saxon:return> element has one optional select attribute, whose value is an expression. The expression is evaluated and its value is sent as the return value of the function. If the select attribute is not used, the template in the <saxon:return> element is instantiated and the result is returned as a result tree fragment. The following element model definition shows the structure of the <saxon:return> element.

<!-- Category: extension-element -->
<saxon:return

  select = expression >
  <!-- Content: (template) -->
</saxon:return>
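The sketch below shows one <saxon:return> in each branch of an <xsl:choose>, as described above. The function name my:sign and its parameter n are hypothetical, and the fragment assumes the my prefix is bound to a namespace in the stylesheet.

```xml
<saxon:function name="my:sign">
  <xsl:param name="n"/>
  <xsl:choose>
    <xsl:when test="$n &lt; 0">
      <saxon:return select="-1"/>
    </xsl:when>
    <xsl:when test="$n &gt; 0">
      <saxon:return select="1"/>
    </xsl:when>
    <xsl:otherwise>
      <saxon:return select="0"/>
    </xsl:otherwise>
  </xsl:choose>
</saxon:function>
```

A call such as my:sign(@balance) in a select attribute would then yield -1, 0, or 1 according to which branch is taken.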

The <saxon:while> Extension Element

This element adds an iteration feature that processes as long as some given condition is true. The condition is a Boolean expression in the mandatory test attribute. To prevent endless looping, the <saxon:assign> element is required as a child to <saxon:while> and sets a variable that is updated at some point in the loop in order to terminate it.

<!-- Category: instruction-extension-element -->
<saxon:while

  test = expression >
  <!-- Content: (template?, saxon:assign, template?) -->
</saxon:while>

An example of using <saxon:while>, from the Saxon documentation, is as follows:

<xsl:variable name="i" select="0" saxon:assignable="yes"/>
<saxon:while test="$i &lt; 10">
    The value of i is <xsl:value-of select="$i"/>
    <saxon:assign name="i" select="$i+1"/>
</saxon:while>

13.2.5.3. Saxon Extension Functions

Saxon implements twenty-seven extension functions, ranging in application from basic existence to conditional functions. These functions include: saxon:after(), saxon:before(), saxon:difference(), saxon:distinct(), saxon:evaluate(), saxon:eval(), saxon:exists(), saxon:expression(), saxon:forAll(), saxon:getUserData(), saxon:hasSameNodes(), saxon:highest(), saxon:if(), saxon:ifNull(), saxon:intersection(), saxon:leading(), saxon:lineNumber(), saxon:lowest(), saxon:max(), saxon:min(), saxon:nodeSet(), saxon:path(), saxon:range(), saxon:setUserData(), saxon:sum(), saxon:systemId(), and saxon:tokenize().

To invoke a Saxon function, the Saxon namespace must be declared at or above the element calling the function. A typical use of a Saxon extension function is shown below:

<xsl:template match="something">
      <xsl:apply-templates
             select="saxon:distinct($some_nodeset)" />
</xsl:template>

More details about these functions and updates for newly added functions are available at http://users.iclway.co.uk/mhkay/saxon.

The documentation notes that most of these extension functions have very simple source code, which users can take as templates, or models, for writing their own extensions.

The after() Extension Function
Function: node-set after(node-set-1, node-set-2)

The after() function returns a node-set with all the nodes in node-set-2 that follow (in document order) at least one node of node-set-1. Its return type is node-set, and it takes two node-set arguments.

The before() Extension Function
Function: node-set before(node-set-1, node-set-2)

The before() function returns a node-set with all the nodes in node-set-2 that precede (in document order) at least one node of node-set-1. Its return type is node-set, and it takes two required node-set arguments.

The difference() Extension Function
Function: node-set difference(node-set-1, node-set-2)

The difference() function compares the two arguments and returns a node-set of those nodes in node-set-1 that are not in node-set-2. Its function return type is node-set, and it takes two required node-set arguments.

The distinct() Extension Function
Function: node-set distinct(node-set-1, stored-expression)

This function returns the nodes of the node-set in the first argument with duplicates removed: when the stored-expression in the second argument yields the same string value for several nodes, only the first of them is kept. Its function return type is node-set, and it takes two arguments, the first a required node-set and the second an optional string.

If the second argument is omitted, the value used for comparison is the string value of each node itself, so any later node whose string value duplicates that of an earlier node is removed.

An example from the Saxon documentation is as follows:

<xsl:for-each select="saxon:distinct(surname,
saxon:expression('substring(.,1,1)'))">

This instruction processes the first surname starting with each letter of the alphabet in turn.
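
The "keep the first node for each distinct value" behavior can be modeled outside XSLT. The following Python sketch is only an illustration of the semantics described above, not Saxon code; the distinct() helper and the sample surnames are hypothetical, and the lambda stands in for the stored expression substring(.,1,1).

```python
def distinct(nodes, key=lambda node: node):
    """Keep the first node for each distinct key value, in document order."""
    seen = set()
    kept = []
    for node in nodes:
        value = key(node)
        if value not in seen:
            seen.add(value)
            kept.append(node)
    return kept


# Hypothetical string values of <surname> elements, in document order.
surnames = ["Adams", "Archer", "Baker", "Brown", "Clark", "Adams"]

# Counterpart of saxon:expression('substring(.,1,1)'): the first letter.
print(distinct(surnames, key=lambda s: s[:1]))  # ['Adams', 'Baker', 'Clark']

# Without a key, the string value of each node is compared directly.
print(distinct(surnames))  # ['Adams', 'Archer', 'Baker', 'Brown', 'Clark']
```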

The eval() Extension Function
Function: string eval(stored-expression)

The eval() function evaluates the stored expression passed as its argument and returns the string value of that expression. See the saxon:expression() function for information about generating stored expressions. The function return type is string, and it takes one string argument, which is a stored expression. The following example comes from the Saxon documentation:

saxon:eval(saxon:expression(concat(2, $op, 2)))

The evaluate() Extension Function
Function: string evaluate(string)

This function evaluates the expression that is passed in as a string argument and returns its value as a string. This allows the calculation of a variable, for instance, at runtime, based on the evaluation of this expression. One use might be to dynamically determine a sort key for <xsl:sort> based on different contingencies for various input XML document instances. The function saxon:evaluate(string) is shorthand for saxon:eval(saxon:expression(string)).

The exists() Extension Function
Function: boolean exists(node-set-1, stored-expression)

The exists() function is used to test whether the value of the stored-expression in the second argument is true for any node in the node-set supplied in the first argument. The function return type is boolean, and it has two required arguments, a node-set and a string (expression).

The expression() Extension Function
Function: string expression(string)

This function is used to create a stored expression that can be used in other Saxon extension functions. It takes one required argument, a string, which must be an expression. Its function return type is string.

The forAll() Extension Function
Function: boolean forAll(node-set-1, stored-expression)

This function tests each node in the node-set provided in the first argument against the expression in the second argument. If the expression evaluates to true for every node in the node-set, the function returns true. Otherwise it returns false. It has two required arguments, a node-set and a string (expression). Its function return type is Boolean.

An example of using this function, from the Saxon documentation, is as follows:

saxon:forAll(sale, saxon:expression('@price * @qty &gt; 1000'))

This will return true if for every child <sale> element of the context node, the product of price and qty exceeds 1000.
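
The relationship between forAll() and exists() is the familiar one between "every" and "some." As a rough model in Python (an illustration only, not Saxon code), the stored expression becomes a predicate applied to each node. The Sale class, the helper names, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sale:
    """Hypothetical stand-in for a <sale> element with price and qty attributes."""
    price: float
    qty: int


def for_all(nodes, predicate):
    """Model of saxon:forAll(): true only if the predicate holds for every node."""
    return all(predicate(node) for node in nodes)


def exists(nodes, predicate):
    """Model of saxon:exists(): true if the predicate holds for at least one node."""
    return any(predicate(node) for node in nodes)


sales = [Sale(600.0, 2), Sale(150.0, 10), Sale(50.0, 4)]


def is_large(sale):
    # Counterpart of saxon:expression('@price * @qty &gt; 1000').
    return sale.price * sale.qty > 1000


print(for_all(sales, is_large))  # False, because 50 * 4 = 200
print(exists(sales, is_large))   # True, because 600 * 2 = 1200
```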

The getUserData() Extension Function
Function: string getUserData(string)

This function returns a string value of the predefined user data associated with the context node. The user data is predefined using the saxon:setUserData() function. It has one required argument, a string, and its function return type is a string.

The hasSameNodes() Extension Function
Function: boolean hasSameNodes(node-set-1, node-set-2)

The hasSameNodes() function returns a Boolean true if node-set-1 and node-set-2 contain exactly the same nodes (not merely a non-empty intersection). This is different from the XPath = operator, which only compares the string values of nodes. The function has two required arguments, both node-sets, and its function return type is Boolean.
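
The difference between comparing node identity and comparing string values can be sketched in Python. This is only a model of the two behaviors, not Saxon code: the Node class and both helper functions are hypothetical, and xpath_equals() follows the XPath 1.0 rule that two node-sets are equal if some node in one has the same string value as some node in the other.

```python
class Node:
    """Hypothetical node: identity is the object itself, text is its string value."""

    def __init__(self, text):
        self.text = text


def has_same_nodes(first_set, second_set):
    """Model of saxon:hasSameNodes(): both sets hold exactly the same nodes."""
    return {id(node) for node in first_set} == {id(node) for node in second_set}


def xpath_equals(first_set, second_set):
    """Model of the XPath = operator on node-sets: some pair of string values match."""
    return any(a.text == b.text for a in first_set for b in second_set)


red_one = Node("red")
red_two = Node("red")   # a different node with the same string value
blue = Node("blue")

print(has_same_nodes([red_one, blue], [blue, red_one]))  # True
print(has_same_nodes([red_one], [red_two]))              # False
print(xpath_equals([red_one], [red_two]))                # True
```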

The highest() Extension Function
Function: node-set highest(node-set-1, stored-expression)

This function returns a node-set of the one node that has the highest numerical value, evaluated as if using the number() function. If the second argument is used, the expression is evaluated and the node that is returned is the one that has the highest value according to that expression. NaN values are ignored. The function has one required argument, a node-set, and one optional argument, a string (expression). Its function return type is node-set.

An example of using this function, from the Saxon documentation, is as follows:

saxon:highest(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the node for which this has the highest value.
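
Note that highest() returns the node itself, not the number. A small Python model (an illustration only, not Saxon code) makes the distinction and the NaN rule concrete. The Sale class and the sample data are hypothetical, and keeping the first node when two nodes tie is an assumption of this sketch, since the text above does not say how ties are resolved.

```python
import math
from dataclasses import dataclass


@dataclass
class Sale:
    """Hypothetical stand-in for a <sale> element with price and qty attributes."""
    price: float
    qty: float


def highest(nodes, expr):
    """Model of saxon:highest(): a list holding the node with the largest value."""
    best_node = None
    best_value = None
    for node in nodes:
        value = expr(node)
        if math.isnan(value):
            continue  # NaN values are ignored
        if best_value is None or value > best_value:
            best_node, best_value = node, value  # assumption: first node wins ties
    return [] if best_node is None else [best_node]


def amount(sale):
    # Counterpart of saxon:expression('@price * @qty').
    return sale.price * sale.qty


sales = [Sale(10.0, 3), Sale(float("nan"), 5), Sale(8.0, 5)]
print(highest(sales, amount))  # [Sale(price=8.0, qty=5)]
```

The same loop with the comparison reversed models lowest().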

The if() Extension Function
Function: object if(condition, value1, value2)

This function allows conditionals as part of an XPath expression. The first argument must be a Boolean expression, such as a call to contains() or a similar test. If it is true, the function returns the value of the second argument (value1); if it is false, it returns the value of the third argument (value2). The function has three arguments: the first is a Boolean, and the second and third are of type object (they can be of any type: node-set, string, number, or Boolean). Its function return type is object, the same type as the value of the argument being returned.

The ifNull() Extension Function
Function: boolean ifNull(java-object)

This function returns true if the java-object provided as the required argument is null. Its function return type is Boolean, and its one required argument is of type string (java-object).

The intersection() Extension Function
Function: node-set intersection(node-set-1, node-set-2)

This function returns a node-set containing only those nodes common to both node-set-1 and node-set-2, discarding all others. The function has two required arguments, both node-sets, and its function return type is node-set.

An added convenience is that the arguments can be a union of tests with the | operator to test one of several node-sets. This is very handy to use, for instance, with keys, as it can determine what both arguments have in common.

The leading() Extension Function
Function: node-set leading(node-set-1, stored-expression)

This function evaluates the expression in the second argument for each node of the node-set in the first argument, taken in document order, and returns the nodes for which it is true, up to, but not including, the first node for which it is false. The function has two required arguments, a node-set and a string (expression), and its function return type is node-set.

An example of using this function, from the Saxon documentation, is as follows:

saxon:leading(following-sibling::*, saxon:expression('self::para'))

This will return the <para> elements following the current node, stopping at the first element that is not a <para>.
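
The important point is that leading() stops at the first failure rather than filtering the whole node-set. In Python terms this is the behavior of itertools.takewhile, as opposed to filter. The sketch below is only a model, not Saxon code; the list of element names is hypothetical.

```python
from itertools import takewhile


def leading(nodes, predicate):
    """Model of saxon:leading(): nodes up to, but not including, the first failure."""
    return list(takewhile(predicate, nodes))


# Hypothetical names of the following-sibling elements, in document order.
siblings = ["para", "para", "note", "para"]


def is_para(name):
    # Counterpart of saxon:expression('self::para').
    return name == "para"


print(leading(siblings, is_para))       # ['para', 'para']
print(list(filter(is_para, siblings)))  # ['para', 'para', 'para']
```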

The lineNumber() Extension Function
Function: number lineNumber()

This function is used to determine the line number, in document order, of the input XML document at the point where it is used. It can be used with <xsl:message>, for instance, to diagnose where a match is or is not happening for a given template rule. The function has no arguments, and its function return type is a number.

Make sure line numbering is turned on by adding the -l option on the command-line.

The lowest() Extension Function
Function: node-set lowest(node-set-1, stored-expression)

This function returns a node-set of the one node that has the lowest numerical value, evaluated as if using the number() function. If the second argument is used, the expression is evaluated and the node that is returned is the one that has the lowest value according to that expression. NaN values are ignored. The function has one required argument, a node-set, and one optional argument, a string (expression). Its function return type is node-set.

An example of using this function, from the Saxon documentation, is as follows:

saxon:lowest(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the node for which this has the lowest value.

The max() Extension Function
Function: number max(node-set-1, stored-expression)

This function returns a number: the highest value obtained by evaluating the expression in the second argument for each node in the node-set of the first argument. If there is no second argument, the string value of each node is converted implicitly with the number() function, and the highest of those values is returned. This function has one required argument, a node-set, and one optional argument, a string (expression). Its function return type is number.

An example of using this function, from the Saxon documentation, is as follows:

saxon:max(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the maximum amount.

The min() Extension Function
Function: number min(node-set-1, stored-expression)

This function returns a number: the lowest value obtained by evaluating the expression in the second argument for each node in the node-set of the first argument. If there is no second argument, the string value of each node is converted implicitly with the number() function, and the lowest of those values is returned. This function has one required argument, a node-set, and one optional argument, a string (expression). Its function return type is number.

An example of using this function, from the Saxon documentation, is as follows:

saxon:min(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the minimum amount.

The nodeSet() Extension Function (obsolete)
Function: node-set nodeSet($fragment)

This function is now obsolete: a result-tree-fragment is now converted implicitly to a node-set if it is used in a context where a node-set is required.

The path() Extension Function
Function: string path()

The path() function returns the string value of the path (XPath pattern expression) of the context node. It has no arguments, and its function return type is string.

The range() Extension Function
Function: node-set range(number-1, number-2)

The range() function converts its two arguments to numbers, as with the XSLT number() function, and rounds each to the nearest integer. It then builds a new node-set containing one node for each integer in the range, from the first number up to and including the second. Each integer is converted to a string and stored as the value of its node. Its two required arguments are both numbers, and its function return type is node-set.

For example, range(2, 5) creates a node-set with four nodes with string values 2, 3, 4, and 5.

The main intended usage, as stated in the Saxon documentation, is <xsl:for-each select="range($from, $to)"> which simulates a conventional for-loop in other programming languages.
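
The rounding and the inclusive upper bound are easy to overlook, so here is a small Python model of the behavior described above. It is an illustration only, not Saxon code. The function name saxon_range() is hypothetical, and rounding halves upward (as the XPath round() function does) is an assumption of this sketch.

```python
import math


def saxon_range(start, end):
    """Model of saxon:range(): string values of each integer from start to end."""
    # Assumption: halves round upward, as with the XPath round() function.
    first = math.floor(float(start) + 0.5)
    last = math.floor(float(end) + 0.5)
    # Python's range() excludes its upper bound, so add 1 to include it.
    return [str(number) for number in range(first, last + 1)]


print(saxon_range(2, 5))      # ['2', '3', '4', '5']
print(saxon_range(1.6, 3.2))  # ['2', '3']
```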

The setUserData() Extension Function
Function: string setUserData(string, value)

This function associates property information with the context node that can then be accessed with the getUserData() extension function (within the same stylesheet). It has two arguments, both strings, although the second string contains an expression. The string value of the first argument is used as the name for the property. The value of the property is assigned using the second argument, which is an expression. The setUserData() function itself returns an empty string, because the values are retrieved using the getUserData() function.

The sum() Extension Function
Function: number sum(node-set-1, stored-expression)

This function evaluates the expression in the second argument for each node in the node-set of the first argument and returns the total of the resulting numbers. If the result for any node is NaN, the total will be NaN.

An example of using this function, from the Saxon documentation, is as follows:

saxon:sum(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the total amount.
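
Unlike highest() and lowest(), which ignore NaN values, sum() lets a single NaN poison the whole total. The following Python sketch models that behavior; it is an illustration only, not Saxon code, and the function name saxon_sum() and the sample sales are hypothetical.

```python
import math


def saxon_sum(nodes, expr):
    """Model of saxon:sum(): total of expr(node) over all nodes."""
    total = 0.0
    for node in nodes:
        total += expr(node)  # adding NaN makes the running total NaN
    return total


def amount(sale):
    # Counterpart of saxon:expression('@price * @qty').
    return sale["price"] * sale["qty"]


# Hypothetical <sale> elements, modeled as dictionaries of attribute values.
sales = [{"price": 10.0, "qty": 3}, {"price": 2.5, "qty": 4}]
print(saxon_sum(sales, amount))  # 40.0

# A sale whose price is not a number turns the whole total into NaN.
bad_sales = sales + [{"price": float("nan"), "qty": 1}]
print(math.isnan(saxon_sum(bad_sales, amount)))  # True
```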

The systemId() Extension Function
Function: string systemId()

This function returns the system identifier or URI of the XML entity that contains the context node. Its function return type is a string, and it has no arguments.

The tokenize() Extension Function
Function: node-set tokenize(string-1, string-2?)

This function builds a new node-set containing a node for each token in the first argument. The first argument is converted to a string, as with the XSLT string() function (see Chapter 5). This string is then treated as a whitespace-separated list of tokens. The second argument can set a delimiter other than whitespace, such as a comma. The function can be used, for example, to break a sentence into its individual words. It takes one required argument and one optional argument, both of type string, and its function return type is node-set.
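
The splitting behavior can be modeled in a few lines of Python. This is only a sketch of the description above, not Saxon code. Two points are assumptions of the sketch, because the text does not settle them: the delimiter is treated as a single literal string, and empty tokens between adjacent delimiters are dropped.

```python
def tokenize(text, delimiter=None):
    """Model of saxon:tokenize(): a list holding the string value of each token."""
    text = str(text)  # the first argument is converted to a string
    if delimiter is None:
        # Default: split on runs of whitespace.
        return text.split()
    # Assumption: a literal delimiter string, with empty tokens dropped.
    return [token for token in text.split(delimiter) if token]


print(tokenize("the quick  brown fox"))  # ['the', 'quick', 'brown', 'fox']
print(tokenize("red,green,blue", ","))   # ['red', 'green', 'blue']
```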
