Learning XML

XSL and XSL Transformations (XSLT)

Getting Started

by Richard G. Baldwin
baldwin.richard@iname.com

File Xml000550.htm

June 10, 2000


Preface

I have authored numerous online articles on XML.  These articles cover the waterfront from introductory topics to advanced topics.

I maintain a consolidated index of hyperlinks to all of my XML articles at my personal website so that you can access earlier articles from there.

Rendering XML documents

As of this writing, Microsoft IE5 is the only widely used web browser that can render XML documents.

IE5 can render XML documents using either CSS (see my personal website) or XSL.  This is one in a series of articles that discuss the use of XSL for the rendering of XML documents, with particular emphasis on the use of IE5 for that purpose.

Introduction

What is the W3C?

According to one well-known author, "The W3C is a consortium, a gathering place where organizations can meet and work together without the appearance of antitrust problems."

I wrote an article addressing this question late in 1999.  You can view that article at my personal website if you would like to know more about the W3C.

For purposes of this article, the W3C is a governing body that has published many important documents on XSL and XSLT, two of which will be referred to shortly.

What is XSL?

XSL is an acronym for Extensible Stylesheet Language.

According to the W3C,
 
XSL is a language for expressing stylesheets. It consists of two parts:

1. A language for transforming XML documents, and

2. An XML vocabulary for specifying formatting semantics.

An XSL stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary.

A working draft

As of March 2000, the XSL document is classified as a working draft, so we need to be aware that things could change in the future.

Why Do We Need XSL?

As you are probably aware by now, one of the primary virtues of XML is the ability to separate content from presentation.

Separating content from presentation

In other words, an XML document contains structured information, but does not provide any hints as to how that information should be rendered for the benefit of a consumer.

An example

For example, if the information happens to be the daily news, there are at least two desirable ways to render the information: as an online version and as a printed version.

Same news, different renderings

The two different renderings of the same information are likely to be very different.  For example, the online version is likely to make extensive use of color.  However, due to the cost of printing color, the printed version is likely to make minimal use of color.

I wish printed newspapers could have hyperlinks

The printed version is likely to have statements such as "continued on page 5" (don't you just hate that?) at the bottom of the columns.  (Then you have an opportunity to drag the bottom of the newspaper through your oatmeal as you manually turn it to a different page.)

Online newspapers do have hyperlinks

The online version for a particular news story may not be broken into different sections.  If it is broken into different sections, they will likely be connected with hyperlinks making navigation easy.

Actually, the online newspapers that I read do break a story into different sections.  One is usually a summary and the other is usually the full story, and the two are connected using hyperlinks.

What is XSLT?

XSLT is an acronym for XSL Transformations.

According to the W3C,
 
This specification defines the syntax and semantics of XSLT, which is a language for transforming XML documents into other XML documents.

XSLT is designed for use as part of XSL, which is a stylesheet language for XML. In addition to XSLT, XSL includes an XML vocabulary for specifying formatting. XSL specifies the styling of an XML document by using XSLT to describe how the document is transformed into another XML document that uses the formatting vocabulary.

XSLT is also designed to be used independently of XSL. However, XSLT is not intended as a completely general-purpose XML transformation language. Rather it is designed primarily for the kinds of transformations that are needed when XSLT is used as part of XSL.

This is a recommendation from W3C

As of November, 1999, this document is classified as a recommendation by the W3C.  This means that "It is a stable document and may be used as reference material or cited as a normative reference from other documents."

Transforming XML to other formats

Because an HTML document can be represented as an XML document, XSLT can be used to transform XML documents into HTML documents.

Suitable for rendering in a Web browser

This makes it possible to render the information contained in an XML document using a common HTML Web browser.

Because of the usefulness of this approach, transforming XML documents into HTML documents will be a primary emphasis in the next several articles.
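
To make the idea concrete, here is a minimal sketch of an XML-to-HTML transformation.  It is written in Java against the standard javax.xml.transform API found in present-day JDKs, which is an assumption of this sketch and not one of the tools discussed in this article.  The sample document, the stylesheet, and the class and method names are all hypothetical and were invented purely for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToHtmlSketch {

    // Hypothetical sample document; the element names are invented.
    static final String NEWS_XML =
        "<news><story><headline>XSLT becomes a Recommendation</headline></story></news>";

    // Hypothetical XSLT 1.0 stylesheet: each headline becomes an HTML list item.
    static final String NEWS_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/news\">"
      + "<html><body><ul>"
      + "<xsl:for-each select=\"story\">"
      + "<li><xsl:value-of select=\"headline\"/></li>"
      + "</xsl:for-each>"
      + "</ul></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Covers both a bad stylesheet and a failed transformation.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints the generated HTML so that it can be inspected directly.
        System.out.println(transform(NEWS_XML, NEWS_XSL));
    }
}
```

Because the result is returned as an ordinary string, a program like this also lets you look at the HTML that was produced, and not just at its rendering in a browser.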

Where does the transformation take place?

When transforming information from an XML document for rendering on an HTML browser, the transformation can take place anywhere between the XML document and the browser.

Transforming on the server

For example, an XSLT engine could be written in Java and run as a servlet (see the discussion of XT later), or it could be written as a JavaBean component and accessed from a scriptlet in a JavaServer page.  (I have written several articles on Java servlets, JavaBean components, and JavaServer pages, which you will find indexed at my personal website.)
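
As a rough sketch of the server-side approach, the class below compiles a stylesheet once and then reuses it for each incoming document, which is roughly what a servlet or a JavaBean component would need to do for each request.  It assumes the standard javax.xml.transform API in present-day JDKs and deliberately leaves out the servlet plumbing.  The class name ReusableStylesheet and its methods are hypothetical, and this is not how XT or any other particular engine is implemented.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ReusableStylesheet {

    // A compiled stylesheet.  Templates objects are designed to be shared
    // across threads, so one instance can serve many requests.
    private final Templates compiled;

    ReusableStylesheet(String xsl) {
        try {
            compiled = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalArgumentException("Stylesheet could not be compiled", e);
        }
    }

    // Called once per request: a fresh Transformer is created each time
    // because Transformer objects themselves are not thread-safe.
    String render(String xml) {
        try {
            Transformer transformer = compiled.newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stylesheet that wraps the greeting text in a paragraph.
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\"/>"
          + "<xsl:template match=\"/\">"
          + "<p><xsl:value-of select=\"/greeting\"/></p>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        ReusableStylesheet sheet = new ReusableStylesheet(xsl);
        System.out.println(sheet.render("<greeting>hello</greeting>"));
        System.out.println(sheet.render("<greeting>goodbye</greeting>"));
    }
}
```

In a real servlet, the render method would be called from the request-handling method and its result written to the response.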

Transforming at the browser

Or, the transformation could be performed at the browser.  For example, Microsoft IE5 can be used for this purpose.

According to Microsoft,
 
The World Wide Web Consortium (W3C) Working Draft for XSL divides the language into two main parts: a transformation language for XML documents, and an XML vocabulary for formatting semantics. 

Microsoft® Internet Explorer 5 supports a subset of the transformation part of the Extensible Stylesheet Language (December 18th Working Draft).

Microsoft plans to update this technology to match the final W3C recommendation for XSL. XSL Working Draft Conformance Notes details the differences between the Internet Explorer 5 implementation and the December draft.

Testing Your Transformation Code

One of the problems with XSL and XSLT is that in this early stage of development, it is not easy to find vehicles suitable for testing your XSL code or your XSLT code.

Sparse

One convenient online resource for testing XSL code is the Sparse website.  This site contains two text boxes that allow you to enter XML text in one and XSL text in the other.

When you press a Process button, the XSL is applied to the XML to convert the XML into HTML.

You can see the HTML that is produced

A major advantage of this approach is that you are able to see the HTML that is produced.

Not current

Unfortunately, as of this writing, the site contains the following warning:
 
"Sparse is not current with the latest XSL draft released from the W3C! I am currently looking at the latest draft to determine if it's  worth working on this any more. 

For now, enter some XML and OLD XSL rules and hit process, then you'll see an alert box showing the HTML, and then a window displaying it."

I personally hope that the author decides to continue updating Sparse as a service to the rest of us who are struggling with XSL and XSLT.  Maybe if enough of us send him messages of encouragement, he will decide to do so.

IE5

Within its current limitations, IE5 also provides a convenient way to test XSLT code.

Intermediate HTML is not immediately available with IE5

Unfortunately, if IE5 by itself can show an intermediate copy of the HTML that it produces, I haven't figured out how to make it do so (although the XSL Debugger described below provides a workaround).

When you apply XSLT code to XML using IE5, the resulting HTML is rendered in the browser window.  If you then view the source, what you see is the original XML, not the resulting HTML that is actually rendered.

A lot better than nothing

That's a lot better than nothing.  At least Microsoft seems to be serious about supporting XML in its browser.  But, for testing purposes, it would be even better if we could see the HTML that is produced.

The XSL Debugger

Another free product from Microsoft, called the XSL Debugger, makes it possible to test and debug XSL code, and also to see the HTML that is produced.  As of this writing, you can download the XSL Debugger from this URL.  While not as convenient to use as IE5, the Debugger is very useful for testing your XSLT.

HTML output is a single line of text

Unfortunately, the entire output HTML file is presented as a single line of text, which is not very suitable for reading.

Copy to clipboard and manually reformat

You can copy the output HTML to the clipboard, paste it into a text editor, and manually insert some line breaks to get a better look at it.

That is fairly labor intensive for a large HTML file, so if I find myself using it very much, I will probably write a short Java or Python program to automatically reformat it into a more readable form.
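
A minimal sketch of such a reformatting program in Java might look like the following.  It simply inserts a line break between every pair of adjacent tags and makes no attempt at indentation.  The class and method names are hypothetical, and the approach assumes that the character sequence >< only occurs between tags, so it could also alter whitespace-sensitive content such as the body of a pre element.

```java
class HtmlLineBreaker {

    // Inserts a newline between every adjacent pair of tags,
    // so that "><" becomes ">" + newline + "<".
    static String breakBetweenTags(String html) {
        return html.replace("><", ">\n<");
    }

    public static void main(String[] args) {
        // A short stand-in for the single-line output described above.
        String singleLine = "<html><body><ul><li>a</li><li>b</li></ul></body></html>";
        System.out.println(breakBetweenTags(singleLine));
    }
}
```

Text that sits between an opening and a closing tag, such as the list items in the sample, stays on the same line as its tags.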

Use as a debugger

As a debugger, the program provides the ability to single-step through the code, set breakpoints, etc.  Here is the documentation that is provided with the program:
 
"Description:
The XSL Debugger is a simple HTML-based debugger that allows single-stepping through the execution of an XSL stylesheet.  The current positions in both the stylesheet and the XML document are indicated, and breakpoints can be set and cleared.

Compatibility/Platform Compatibility:
You must be running Microsoft Internet Explorer 5 or greater on Win32 or Unix platforms to view this demo."

IE5 or later is required

When you download and install the XSL Debugger, you will end up with an HTML file named xsl-debugger.htm.  (It wants to install itself into a directory named Workshop.)

Usage instructions

To run the debugger,

At that point, you are given a couple of fairly intuitive choices (go, step, etc.).

The output

If the transformation runs successfully, the same output will be produced that you would normally see in your IE5 window.

At that point, you can check a box and cause the HTML that produced that output to be displayed as described above.

Won't work with Netscape 4.7

If you try using the debugger with Netscape 4.7, it won't work.

Errors

If either the XML file or the XSLT file is not well-formed, the debugger will refuse to load the file and will give you some diagnostic information.
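
If you would like to check a file for well-formedness before handing it to any tool, the following sketch shows one way to do it in Java.  It assumes the standard javax.xml.parsers API in present-day JDKs.  The class and method names are hypothetical, and the diagnostic text comes from the JDK's parser, so it will not match what the XSL Debugger reports.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class WellFormedCheck {

    // Returns null if the text is well-formed XML; otherwise returns a short
    // diagnostic that includes the position reported by the parser.
    static String diagnose(String xmlText) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xmlText)));
            return null;
        } catch (SAXParseException e) {
            // Thrown for fatal errors such as a missing or mismatched end tag.
            return "line " + e.getLineNumber() + ", column "
                + e.getColumnNumber() + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return "could not parse: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The first document is well-formed; the second is missing an end tag.
        System.out.println(diagnose("<a><b/></a>"));
        System.out.println(diagnose("<a><b></a>"));
    }
}
```

Because an XSLT stylesheet is itself an XML document, the same check can be applied to the stylesheet as well as to the XML file.  Note that the JDK's default parser may also print its own error message to the standard error stream.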

XT

Another possibility for testing your XSLT code is XT, written by James Clark, editor of the XSLT specification mentioned earlier.

Here is what Clark has to say about XT:
 
XT is an implementation in Java of XSL Transformations. This version of XT implements the PR-xslt-19991008 version of XSLT.  Stylesheets written for earlier versions of the XSLT WD must be converted before they can be used with this version of XT.
...
XT can be used as a servlet. This requires a servlet engine that implements at least version 2.1 of the Java Servlet API. 

Hooray for servlets

Because of my strong interest in Java (see my personal website), I am particularly interested in using an XSLT engine as either a servlet or as a JavaBean component for transformation of XML data to some other form on the server.

Although I haven't tried XT yet, I plan to do so in the near future.

Future Plans

XSL/XSLT is a very complex topic.  In future articles on this topic, I plan to dig very deep into both aspects: I will provide detailed discussions of the various features of both, along with numerous examples for you to run.

Until next time ... enjoy your online learning experience.

Copyright 2000, Richard G. Baldwin.  Reproduction in whole or in part in any form or medium without  express written permission from Richard Baldwin is prohibited.

About the author

Richard Baldwin is a college professor and private consultant whose primary focus is a combination of Java and XML. In addition to the many platform-independent benefits of Java applications, he believes that a combination of Java and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two.  He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas.  He is the author of Baldwin's Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.

baldwin.richard@iname.com

-end-