an XML Collection...

Rescuing XSLT from Niche Status

A Gentle Introduction to XSLT through HTML Templates

David Jacobs

XSLT is one of the most exciting technologies to come out of the XML family. Unfortunately, its incredible power and associated complexity can be overwhelming to new users preventing many from experimenting with it or causing them to quickly give up in disgust. In fact, unless the method of teaching and the common style of use for XSLT is radically changed to make it more accessible, XSLT will be relegated to niche status like SGML and other powerful technologies.

The Problem

The 1990’s saw an incredible proliferation of new web related languages. Looking back we can see what features separated the winners and losers. The biggest key has been having a very low barrier to entry. Many languages accomplished this by following the following rules.

From these rules we can see why embedded web scripting languages like Active Server Pages (ASPs), Cold Fusion, PHP and Java Server Pages (JSPs) are so popular. They all leverage a user’s knowledge of HTML. They also allow the minimum amount of scripting to be added to accomplish the dynamic feature a developer is looking for. This has allowed numerous web developers to start off with very small projects and then through continuos enhancement and learning, find themselves using the full power of a complex programming language. Furthermore, because of the very incremental nature of that learning the developer was never scared off.

HTML has also fostered the technique of learning by example. When a web author sees another site with a feature they like, they immediately bring up the source to see how it was implemented. In this way many web authors were able to learn complex HTML tricks with no formal training. While server-side scripts are not as easy to come by, there are still numerous sites that house thousands of example scripts for a blossoming developer to examine.

Traditionally XSLT has been presented as a programming language for translating XML documents into another format, often for presentation. This frames the problem, such that for each element, the programmer has the task of figuring out how that element needs to be translated. As long as there are one to one mappings or one to zero mappings this is straightforward. For example, if every occurrence of a <name> element is going to become an HTML header. It is a simple matter to write a matching template to accomplish this.

<xsl:template match="name">
  <h1><xsl:apply-templates/></h1>
</xsl:template>

However when adding one to many mappings (i.e. when an element’s contents will appear multiple times in the target document with different formatting), keeping track of all the relationships quickly grows in complexity and becomes confusing. For example, if, after writing the previous template, the programmer discovers that the name also needs to be placed in the title the programmer might add the template

<xsl:template match="/">
  <title><xsl:value-of select="name"/></title>
</xsl:template>

Notice the use of the <xsl:value-of> function in this template because using <apply-templates select="name"> would have caused a triggering of the previous template adding undesired header tags to my content. This means before adding a translation to an element the programmer must first be aware of all the existing translations (ugh!). Of course if the programmer became aware of the <title> requirement first, the contents of these templates could have been reversed. One can quickly see how the arbitrary decisions of development and discovery of requirements can lead to a set of templates that are no longer intuitive.

As a programmer with more than 20 years experience with over a dozen languages, XSLT templates and default rules were not obvious to me. Over the past year or two I had looked at numerous examples trying to discern how they worked. While I could understand the general gist of what was occurring, there was too much implied behavior that I did not pick up. It was not until going through formal XSLT training that I fully understood how XSLT worked. Clearly, if the barrier to entry is that high for an experienced programmer, the average web developer was not going to find this technology very useful.

The Solution

So how do we solve this problem and help deliver XSLT’s promise to the masses?  For XSLT to be successful it must be presented and used in a way that adopts those attributes discussed earlier (reuse of knowledge, fast start, and gradualism). This tutorial will attempt to ease XSLT’s introduction by focusing on these attributes. First, it is only going to focus on the generation of HTML documents and users who are familiar with HTML. If your goal is to immediately start transforming one XML document into another XML document this tutorial is not for you.

The second is to reframe the problem so the XSLT solutions programmers write are more naturally extensible and intuitive. Instead of trying to translate an XML source document into an HTML presentation document, the programmer should see the XML document as a complex data structure with XSLT providing powerful tools for extracting that information into their HTML documents. This allows us to leverage the experience most people have with using an HTML templating language (e.g. ASP, PHP, JSP, Cold Fusion, Web Macro, etc). These templating languages are all based on the basic premise that HTML comes first and all enhancements are then embedded in special tags.

With some caveats, this tutorial will show how XSLT can be used in this same way. The benefit of this approach is it allows the quick use of many of XSLT’s powerful functions while letting you learn its more esoteric capabilities as the need arises. In addition the resulting XSLT files are more intuitive and maintainable.

<xsl:value-of> and {}

On to an example. Here is a very simple welcome page.

<html>
  <head>
    <title>Welcome</title>
  </head>
  <body>
    Welcome!
  </body>
</html>

And here is an XML document with information on member.

<?xml version="1.0"?>
<member level="platinum">
  <name>Jeff</name>
  <phone type="home">555-1234</phone>
  <phone type="work">555-4321</phone>
  <favoriteColor>blue</favoriteColor>
</member>

The following XSLT stylesheet will customize the welcome page to use the member’s name from the XML document.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">
    <html>
      <head>
        <title>Welcome</title>
      </head>
      <body>
        Welcome <xsl:value-of select="member/name"/>!
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

There are a couple of things that need to be pointed out right away. First this is a well-formed XML document. This mean all HTML used must conform to the XHTML specification (i.e. all tags must be closed and lowercase).

The lines before the <html> tag and after the </html> tag will be seen in all the examples. For now, other than realizing that they are required in any style sheet created, just go ahead and forget they are there. You don’t NEED to understand them right now to get useful work out of XSLT.

Notice the HTML is identical to the original except for the introduction of a new tag <xsl:value-of>. This tag is the key to extracting any piece of information out an XML document. It has a "select" attribute that provides the path through the XML document to the information we seek. In this case <member> is the outer most tag and <name> is the tag underneath it. Slash characters ("/") are used to designate parent/child relationships between tags. If you are used to navigating around a Unix file system this should feel familiar.

Now let’s consider further customizing this page by making the welcome in the person’s favorite color using the <font> tag with the "bgcolor" attribute. Because <xsl:value-of> is an XML tag it is not valid to insert it in an HTML attribute value. So another mechanism is needed to insert information from our XML file there.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">
    <html>
      <head>
        <title>Welcome</title>
      </head>
      <body>
        <font bgcolor="{member/favoriteColor}">
          Welcome <xsl:value-of select="member/name"/>!
        </font>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Notice the use of the curly brackets ("{}"). When used within an attribute assignment "{path}" has the exact same effect as <xsl:value-of select="path"/> used outside of attribute assignments.

Queries

Not all paths lead to a single node. For example, what if we wanted to put a person’s home phone number on the page? Notice that the XML document contains two phone entries. If we simply used <xsl:value-of select="member/phone"/> both entries would be returned. We obviously need a way to be more specific. Luckily, XSLT allows the full power of XPath to describe the value(s) of interest. XPath allows conditions on any attribute or tag to be placed in square brackets ("[]") which are then used to restrict the values returned.

So to retrieve the home phone number we would use the path "member/phone[@type=’home’]". Notice the "@" symbol in front of "type". The "@" symbol signifies that we are referring to an attribute. So our new HTML template looks like:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">
    <html>
      <head>
        <title>Welcome</title>
      </head>
      <body>
        <font bgcolor="{member/favoriteColor}">
          Welcome <xsl:value-of select="member/name"/>!
          <br/>
          Your home phone number is:
          <xsl:value-of select="member/phone[@type=’home’]"/>
        </font>
      </body
    </html
  </xsl:template>
</xsl:stylesheet>

<xsl:for-each>

The previous example brings up another issue. What if this <member> entry had numerous phone numbers and we wanted to print them all on the web page. We could simply use <xsl:value-of select="member/phone"/> but this would not enable us to format the phone number into a nice list that describes the type of each number.

To accomplish this requires the introduction of the <xsl:for-each> tag which allows us to loop through each of the elements that match a given path. So to create a table that contains the phone number type in the first column and the phone number in the second column, the following stylesheet could be used.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">
    <html>
      <head>
        <title>Welcome</title>
      </head>
      <body>
        <font bgcolor="{member/favoriteColor}">
          Welcome <xsl:value-of select="member/name"/>!
        </font>
        <table>
          <tr><th>Type</th><th>Number</th></tr>
          <xsl:for-each select="member/phone">
            <tr>
              <td><xsl:value-of select="@type"/></td>
              <td><xsl:value-of select="."/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

This example brings up a number of issues. First, while in the loop, all <xsl:value-of/> accesses are relative to the current element being iterated over (in this case <phone>). Notice the use of the period ("."), which like in a Unix file system means the current element. So in this case the period (".") refers to each phone element as the loop iterates. Also like in a file system you can address a parent element using a double period ("..") and can access any element in the document by starting over at the root element using a slash ("/").

<xsl:if>

As a last enhancement to our page let’s add a special offer to "platinum" level members. To do this requires the use of a new tag <xsl:if> which allows us to insert content based on a condition of the data in the XML document.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">
    <html>
      <head>
        <title>Welcome</title>
      </head>
      <body>
        <font bgcolor="{member/favoriteColor}">
          Welcome <xsl:value-of select="member/name"/>!
        </font>
        <xsl:if test="member[@level='platinum']">
          Our special offer to platinum members today is something great
        </xsl:if>
        <table>
          <tr><th>Type</th><th>Number</th></tr>
          <xsl:for-each select="member/phone">
            <tr>
              <td><xsl:value-of select="@type"/></td>
              <td><xsl:value-of select="."/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Within the "test" attribute the full array of Boolean and relative operators are available. The only caveat is that since this is an XML document less than and greater than ("<", ">") signs must be escaped as "&lt;" and "&gt;".

<xsl:choose>

One nuance of the <xsl:if> tag that is not always obvious at first glance is the lack of an "else" statement. This means to have an else statement requires two ifs. The first one saying “if condition” and the second one saying “if not condition”.  This scheme quickly becomes unworkable with embedded if then else logic.  Luckily XSLT supports an additional test operator called <xsl:choose> which works like a switch/case statement.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="/">
    <html>
      <head>
        <title>Welcome</title>
      </head>
      <body>
        <font bgcolor="{member/favoriteColor}">
          Welcome <xsl:value-of select="member/name"/>!
        </font>
<xsl:choose>
<xsl:when test="member[@level='platinum']">
Our special offer to platinum members today is something great
</xsl:when>
<xsl:otherwise>

Become a platinum member today!
</xsl:otherwise>
</xsl:choose>

        <table>
          <tr><th>Type</th><th>Number</th></tr>
          <xsl:for-each select="member/phone">
            <tr>
              <td><xsl:value-of select="@type"/></td>
              <td><xsl:value-of select="."/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
 

The “test” attribute has the same capabilities/ constraints as the “test” attribute in the <xsl:if> tag.  Multiple <xsl:when> blocks are allowed.  As soon as one “when test” is mached, it will not evaluate any further “xsl:when tests” in the <xsl:choose> block.

Conclusion

With just these few commands (an admittedly small subset of XSLT) and a strong background in HTML (DHTML and JavaScript included), I believe web developers could meet the majority of their presentation needs. Obviously there will be cases where greater flexibility is required, but the advantage of this technique is that only then, does the developer need to learn those constructs.

As a further benefit, this technique minimized the interdependence of one XLST construct on another. Local changes stay local thereby reducing the brittleness of solutions. The developer also no longer has to remember and account for XSLT’s default behaviors.

I hope from these few examples, I have opened a few eyes to the power of XSLT and how a small change in how XSLT is framed can make a huge difference in its understandability and accessibility to web developers. If nothing else, I hope to encourage some good discussions.