XSL Transformations

Figure: XSL transformation processing

Extensible Stylesheet Language Transformations, or XSLT, is an XML-based language used for the transformation of XML documents. The original document is not changed; rather, a new document is created based on the content of an existing one. The new document may be serialized (output) by the processor in standard XML syntax or in another format, such as HTML or plain text. XSLT is most often used to convert data between different XML schemas or to convert XML data into web pages or PDF documents.

XSLT was produced as a result of the Extensible Stylesheet Language (XSL) development effort within W3C during 1998–1999, which also produced XSL Formatting Objects (XSL-FO) and the XML Path Language, XPath. The editor of the first version (and in effect the chief designer of the language) was James Clark. The version most widely used today is XSLT 1.0, which was published as a Recommendation by the W3C on 16 November 1999. A greatly expanded version 2.0, under the editorship of Michael Kay, reached the status of a Candidate Recommendation from W3C on 3 November 2005.

Overview

The XSLT language is declarative: rather than listing an imperative sequence of actions to perform in a stateful environment, an XSLT stylesheet consists of a collection of template rules, each of which specifies what to add to the result tree when the XSLT processor, scanning the source tree according to a fixed algorithm, finds a node that meets certain conditions. Instructions within template rules are processed as if they were sequential instructions, but in fact they are functional expressions that represent their evaluated results: ultimately, nodes to be added to the result tree.
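
As an illustration of this rule-based style, here is a minimal sketch of an XSLT 1.0 stylesheet. The source vocabulary (article, title, para) is hypothetical and chosen only for the example. Note that nothing in the stylesheet loops over the document: the processor walks the source tree and applies whichever rule matches each node.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the source element names (article, title, para) are
     assumptions made for this illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Rule: when an article element is found, add an HTML skeleton to the
       result tree and continue with the article's children. -->
  <xsl:template match="article">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Rule: a title becomes a heading holding the title's text. -->
  <xsl:template match="title">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>

  <!-- Rule: a para becomes a paragraph; its children are processed in turn. -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

The three rules could be listed in any order without changing the result, which is one practical consequence of the declarative design.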

The XSLT specification defines a transformation in terms of source and result trees to avoid locking implementations into system-specific APIs and memory, network and file I/O issues. For example, the specification does not mandate that a source tree always be derived from an XML file, since it may be more efficient for the processor to read from an in-memory DOM object or some other implementation-specific representation. Output may be in a format not envisioned by the XSLT language's designers. However, XSLT processing often begins by reading a serialized XML input document into the source tree and ends by writing the result tree to an output document. The output document may be XML, but can be HTML, RTF, TeX, delimited files, plain text or any other format that the XSLT processor is capable of producing.

XSLT relies upon the W3C's XPath language for identifying subsets of the source document tree, as well as for performing calculations. XPath also provides a range of functions, which XSLT itself further augments. This reliance upon XPath adds a great deal of power and flexibility to XSLT.
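
The following sketch suggests how XPath is typically used inside a stylesheet, both to select nodes and to calculate values. The source vocabulary (orders, order, a status attribute, amount) is hypothetical. The functions count() and sum() belong to the XPath 1.0 core library, while format-number() is one of the functions that XSLT adds.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the source names (orders, order, @status, amount) are
     assumptions made for this illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/orders">
    <!-- A location path with a predicate selects a subset of the tree. -->
    <xsl:variable name="open" select="order[@status = 'open']"/>

    <xsl:text>Open orders: </xsl:text>
    <!-- count() is an XPath core function. -->
    <xsl:value-of select="count($open)"/>

    <xsl:text>&#10;Total: </xsl:text>
    <!-- sum() is XPath; format-number() is an XSLT addition. -->
    <xsl:value-of select="format-number(sum($open/amount), '#,##0.00')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```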

Most current operating systems have an XSLT processor installed. For example, Windows XP comes with the MSXML3 library, which includes an XSLT processor. Earlier versions may be upgraded, and there are many alternatives; see the External links section.

The W3C finalized the XSLT 1.0 specification in 1999. The XSLT 2.0 specification is currently a Candidate Recommendation.

Example 1 (transforming XML to XML)

Transforming the XML document

John Smith
Morka Ismincius

by applying the XSLT transform:

We obtain a new XML document, having another structure:

MP123456 John
PK123456 Morka
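
A stylesheet for a transformation of this kind might look like the sketch below. Every element and attribute name in it is an assumption: it supposes a source of the form persons/person, where each person carries a username attribute plus name and family_name children, and a result built from transform, record, username and name elements. It is offered as one plausible shape for such a transform, not as the stylesheet used in the example.

```xml
<?xml version="1.0"?>
<!-- Sketch only: all source and result names (persons, person, @username,
     name, family_name, transform, record) are assumptions. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Root node: wrap the whole result in a single container element. -->
  <xsl:template match="/">
    <transform>
      <xsl:apply-templates select="persons/person"/>
    </transform>
  </xsl:template>

  <!-- One record per person: the username attribute becomes an element,
       the given name is kept, and the family name is simply not copied. -->
  <xsl:template match="person">
    <record>
      <username><xsl:value-of select="@username"/></username>
      <name><xsl:value-of select="name"/></name>
    </record>
  </xsl:template>
</xsl:stylesheet>
```

Under those assumptions, each person in the source yields one record holding the username and the given name, which matches the pairs of values listed above.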

Example 2 (transforming XML to XHTML)

Example of incoming XML document:

www        World Wide Web site
java       Java info
www        World Wide Web site
validator  web developers who want to get it right

Example XSLT Stylesheet:

xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">

test1

The following host names are currently in use at

Host name URL Used by




Output XHTML that this would produce (whitespace has been adjusted here for clarity):


"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">


test1



Sun Microsystems Inc.

The following host names are currently in use at sun.com

Host name  URL                  Used by
www        http://www.sun.com   World Wide Web site
java       http://java.sun.com  Java info

The World Wide Web Consortium

The following host names are currently in use at w3.org

Host name  URL                      Used by
www        http://www.w3.org        World Wide Web site
validator  http://validator.w3.org  web developers who want to get it right

In a web browser, this XHTML appears as:

Figure: Rendered XHTML

Note that while this generates well-formed XML, it will not necessarily generate valid XHTML. Some common browsers are very particular about empty elements (for example div, script and textarea elements) and may render HTML pages containing any of these incorrectly.

XSLT 2.0 addresses this by adding an 'xhtml' output method, alongside the 'html' method that XSLT 1.0 already provided.
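
The output method is chosen with the xsl:output instruction. The sketch below is an XSLT 1.0 stylesheet that emits XHTML through the 'xml' method; it is a minimal illustration rather than the stylesheet of the example above, and the page content is a placeholder. The comments mark where the empty-element problem can arise and what an XSLT 2.0 stylesheet could declare instead.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a minimal stylesheet written for this illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- XSLT 1.0 defines the methods 'xml', 'html' and 'text'. An XSLT 2.0
       stylesheet could declare version="2.0" and use method="xhtml". -->
  <xsl:output method="xml" indent="yes"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>

  <xsl:template match="/">
    <html>
      <head><title>test1</title></head>
      <body>
        <!-- With the xml method, an empty element may be serialized in the
             minimized form <div/>. That is well-formed XML, but some
             browsers mishandle it when the page is parsed as HTML. -->
        <div/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```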

Template rule processing

XSLT stylesheets are declarative, not procedural; rather than defining a sequence of operations to execute, they define rules and other hints applied during processing, according to a fixed algorithm. The algorithm, which is somewhat complicated, is described below, although many of its esoteric details have been omitted.

Every XSLT processor is required to behave as if it had carried out the following steps to prepare for the transform:

  1. Read the XSLT stylesheet with an XML parser and represent its content as a tree of nodes (the stylesheet tree), according to the XPath data model. "Compile-time" stylesheet syntax errors are detected at this stage. Stylesheets can be modular, so any inclusions (xsl:include and xsl:import instructions) are also handled at this stage, bringing template rules and other top-level stylesheet elements from other XSLT documents into the stylesheet tree.
  2. Read the input XML with an XML parser and convert its content to a tree of nodes (the source tree), according to the XPath data model. The stylesheet may reference other XML sources via document() function calls. These are, typically, evaluated at run-time, since their locations may have to be calculated and the function calls may not even be reachable. (The example above does not reference any other source documents.)
  3. Strip whitespace-only text nodes from the stylesheet tree, except those that are descendants of xsl:text elements. This allows nested elements in template rules to be on separate ('pretty') lines in the original XSLT without resulting in unintended whitespace being added to the result tree.
  4. Strip whitespace-only text nodes from the source tree, if xsl:strip-space instructions are present in the stylesheet. This allows 'pretty' input XML to be processed in a manner that ignores extraneous whitespace. (The example above does not use this feature.)
  5. Supplement the stylesheet tree with a trio of built-in template rules that provide default behavior for any node type that might be encountered during processing. One template rule is provided for the root node and any element node; it directs the processor to continue and process each child node. Another is provided for any text node or attribute node; it directs the processor to copy that node's text value to the result tree. A third is provided for any comment node or processing instruction node; it is a no-op. Templates explicitly provided in the stylesheet override some or all of these. If the stylesheet contains no explicit template rules, the built-in rules result in a recursive descent of the source tree in which only text nodes are copied to the result tree (attribute nodes are not reached because they are not "children" of their parent elements). This result is rarely desirable, as it tends to be just a concatenation of the non-markup character data from the XML source.
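
The preparation steps above can be made concrete with a stylesheet that spells out the defaults. The sketch below uses xsl:strip-space for step 4 and writes the three built-in rules of step 5 as explicit templates, using the match patterns given for them in the XSLT 1.0 Recommendation. A stylesheet containing only these rules should behave like an empty stylesheet and output the concatenated text of the source document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Step 4: strip whitespace-only text nodes from every source element. -->
  <xsl:strip-space elements="*"/>

  <!-- Step 5, first rule: for the root node and element nodes,
       continue with the child nodes. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Second rule: for text and attribute nodes, copy the string value
       to the result tree. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Third rule: comments and processing instructions produce nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Because xsl:apply-templates without a select attribute selects only child nodes, the attribute branch of the second rule is never reached here, as noted in step 5.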
Then, the processor performs the following steps to produce and serialize the result tree:
  1. Create the root node of the result tree.
  2. Process the root node of the source tree. The procedure for node processing is described below.
  3. Serialize the result tree, if desired, according to hints provided in the xsl:output instruction.
When processing a node, the following steps are undertaken:
  1. The best-matching template rule for that node is located. This is facilitated by each template rule's "match" pattern (an XPath-like expression), indicating the nodes to which it can be applied. Each template is assigned a relative priority and import precedence by the processor to help ease conflict resolution. The order of template rules in the stylesheet can also help resolve conflicts between templates which match the same nodes, but it does not affect the order in which nodes are processed.
  2. Template rule contents are instantiated. Elements in the XSLT namespace (prefixed with xsl:, typically; it is the namespace identifier bound to the prefix — not the prefix, itself — that matters) are treated as instructions and have special semantics that guide how they are interpreted. Some result in nodes being added to the result tree; others are control oriented. Non-XSLT elements and text nodes encountered in the template rule are copied, verbatim (namespaces and all) to the result tree. Comments and processing instructions are ignored.
The XSLT instruction xsl:apply-templates, when processed, results in a new set of nodes being selected for processing. The nodes are identified via an XPath expression. By default, the nodes are processed in document order (the relative order in which they appear in the original document); an xsl:sort instruction can specify a different order.
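
Node selection and conflict resolution can be sketched as follows. The source vocabulary (catalog, item, a type attribute, title) is hypothetical. Two rules match the same item elements; under the default priorities of XSLT 1.0, a pattern consisting of a bare element name has priority 0 and a pattern with a predicate has priority 0.5, so the more specific rule is chosen for the nodes it matches.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the source names (catalog, item, @type, title) are
     assumptions made for this illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Selects a new node set; each item is matched against the rules
         below and processed in document order. -->
    <xsl:apply-templates select="catalog/item"/>
  </xsl:template>

  <!-- Matches every item (default priority 0). -->
  <xsl:template match="item">
    <xsl:value-of select="title"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Matches featured items only (default priority 0.5), so it wins over
       the rule above for those nodes, wherever it appears in the file. -->
  <xsl:template match="item[@type='featured']">
    <xsl:text>* </xsl:text>
    <xsl:value-of select="title"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The output lists the titles in source order, with featured items marked by an asterisk; swapping the two item rules in the file would not change it.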

XSLT extends XPath's function library and allows XPath variables to be defined. These variables have different scopes in the stylesheet, depending on where they are defined, and their values can originate outside the stylesheet. A variable's value cannot be changed during processing.
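
A brief sketch of variable scoping follows, assuming a hypothetical source containing price elements. The xsl:param at the top level can receive its value from outside the stylesheet (how it is passed depends on the processor), the top-level xsl:variable is visible throughout, and the variable inside the template is visible only within that template.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the price element, the parameter name and the rate value
     are assumptions made for this illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Global parameter: the caller may supply a value; 'EUR' is the default. -->
  <xsl:param name="currency" select="'EUR'"/>

  <!-- Global variable: bound once, visible in every template. -->
  <xsl:variable name="rate" select="0.2"/>

  <xsl:template match="price">
    <!-- Local variable: visible only to the instructions that follow it
         inside this template. It cannot be reassigned. -->
    <xsl:variable name="gross" select="number(.) * (1 + $rate)"/>
    <xsl:value-of select="format-number($gross, '0.00')"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="$currency"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```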

Although this procedure may sound complicated, it has the net effect of making XSLT much like other web templating languages. If the stylesheet consists only of a single template rule that matches the root node, everything in the template is essentially copied to the output, except for the XSLT instructions (the 'xsl:…' elements), which are replaced by computed content. XSLT even offers an abbreviated stylesheet format ("literal result element as stylesheet") for these simple, single-template transformations. However, the ability to define separate template rules greatly increases XSLT's versatility and efficiency, especially when producing output that is very similar to the input.
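
The abbreviated format mentioned above might look like the following sketch. The whole document is a literal result element carrying an xsl:version attribute, and it is treated as if it were the body of a single template matching the root node. The host element selected in the source is hypothetical.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the source element name (host) and the page title are
     assumptions made for this illustration. -->
<html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hosts</title></head>
  <body>
    <ul>
      <!-- The only XSLT instructions: everything else is copied as is. -->
      <xsl:for-each select="//host">
        <li><xsl:value-of select="normalize-space(.)"/></li>
      </xsl:for-each>
    </ul>
  </body>
</html>
```

This form reads much like a page template with placeholders, at the cost of giving up separate template rules, top-level instructions such as xsl:output, and imports.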

External links

Implementations
; Implementations for Java
: [Xalan-Java]
: [SAXON] by Michael Kay
: [XT] originally by James Clark
: Oracle XSLT, in the [Oracle XDK]
; Implementations for C or C++:
: [Xalan-C++]
: [libxslt] the XSLT C library for GNOME
: [Sablotron], which is integrated into PHP4
; Implementations for Perl:
: [XML::LibXSLT] is a Perl interface to the libxslt C library
; Implementations for PHP:
: [XSLT] is the PHP4 interface to the [Sablotron] processor
: [XSL] is the new interface to XSL introduced in PHP5. The extension uses [libxslt].
; Implementations for Python:
: 4XSLT, in the [4Suite] toolkit by Fourthought, Inc.
: [lxml] by Martijn Faassen is a Pythonic wrapper of the libxslt C library
; Implementations for Ruby
: [Ruby/XSLT] is a simple XSLT class based on libxml and libxslt
: [Sablotron module for Ruby] is a ruby interface to Sablotron
; Implementations for JavaScript:
: [Google AJAXSLT] AjaXSLT is an implementation of XSL-T in JavaScript, intended for use in Ajax applications. Because XSL-T uses XPath, it is also an implementation of XPath that can be used independently of XSL-T.
; Implementation for Database Engines
: OpenLink Virtuoso
; Implementation for command line or shell scripts
: [XMLStarlet] Command Line XML Toolkit (MIT License, for Windows and POSIX operating systems)
; Implementations for specific operating systems:
: Microsoft's [MSXML] library may be used in various Microsoft Windows application development environments and languages, such as .NET, Visual Basic, C, and JScript.
: Saxon .NET [Project Weblog], an IKVM.NET-based port of Dr. Michael Kay's and Saxonica's Saxon Processor provides XSLT 2.0, XPath 2.0, and XQuery 1.0 support on the .NET platform.
; Implementations integrated into web browsers: (Comparison of layout engines (XML))
: Mozilla has [native XSLT support] based on TransforMiiX.
: Safari 1.3+ has [native XSLT support]. Unfortunately, a major drawback is that Safari is [unable to perform XSL transformations via JavaScript], a limitation that does not occur in Mozilla or Internet Explorer. This limits the capabilities of Ajax applications that would run in Safari. Safari's XML-parser is also not standards-compliant; it will parse XML strings according to HTML rules. Therefore, under certain circumstances, it will omit data from the DOM tree if it encounters malformed "HTML" -- even though it actually encountered valid XML. These errors will propagate to XSL-processed DOM trees.
: X-Smiles has native XSLT support.
: Opera has native XSLT support since Version 9.
: Internet Explorer 6 supports XSLT 1.0 via the MSXML library (described above). IE5 and IE5.5 came with an earlier MSXML component that only supported an older, nonrecommended dialect of XSLT. A newer version of MSXML can be downloaded and installed separately to enable IE5 and IE5.5 to support XSLT 1.0 through scripting, and if certain Windows Registry keys are modified, the newer library will replace the older version as the default used by IE.
Documentation
[XSLT 1.0 W3C Recommendation]
[XSLT 2.0 W3C Candidate Recommendation]
[Zvon XSLT 1.0 Reference]
[XSL Concepts and Practical Use] by Norman Walsh
[Tutorial from developerWorks] (1 hour)
[Zvon XSLT Tutorial]
[XSLT Tutorial]
[Quick tutorial]
[What kind of language is XSLT?]
[XSLT and Scripting Languages]
[XSLT Community Wiki] (down?)
Mailing lists
[The XSLT list hosted by Mulberrytech]
Blogs
[A commentary, news, and evangelism weblog devoted to XSLT]
Books
[XSLT] by Doug Tidwell, published by O’Reilly (ISBN 0-59-600053-7)
[XSLT Cookbook] by Sal Mangano, published by O’Reilly (ISBN 0-596-00974-7)
[XSLT Programmer's Reference] by Michael Kay (ISBN 1-86-100312-9)
[XSLT 2.0 Web Development] by Dmitry Kirsanov (ISBN 0-13-140635-3)
[XSL Companion, 2nd Edition] by Neil Bradley, published by Addison-Wesley (ISBN 0-20-177083-0)
[XSLT and XPath on the Edge (Unlimited Edition)] by Jeni Tennison, published by Hungry Minds Inc, U.S. (ISBN 0-76-454776-3)
[XSLT & XPath, A Guide to XML Transformations] by John Robert Gardner and Zarella Rendon, published by Prentice-Hall (ISBN 0-13-040446-2)
Libraries
[EXSLT] is a widespread community initiative to provide extensions to XSLT.
[FXSL] is a library implementing support for Higher-order functions in XSLT. FXSL is written in XSLT itself.

 


From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License. See Wikipedia Copyrights for details.
