A Layman's View of XML, Part 3

by Richard G. Baldwin
baldwin@austin.cc.tx.us
Baldwin's Home Page

Dateline: 11/26/99

Prolog

This is one of a series of articles that explain XML in layman's language, taking particular care to avoid technical jargon.

The first article in the series provided the following brief definition of XML:

XML gives us a way to create and maintain structured documents in plain text that can be rendered in a variety of different ways.

The article then broke that definition down into plain English and provided some examples of structured documents.

The previous article ended with the following XML statements, which describe an example of a structured document (a simple book) given earlier in that article.

<book>
<chap number="1">
Text for Chapter 1
</chap>
<chap number="2">
Text for Chapter 2
</chap>
</book>

I stated that the example includes an attribute (that contains the chapter number) in each of the chapter elements.

Then, realizing that I was using technical jargon again (element and attribute), I promised that this article would provide a layman's description of the meaning of element and attribute. As it turns out, this article discusses attribute in some detail, and defers element until the next article.

Where to begin?

It is always a little difficult to know just where to begin when writing the next article in a series. Let’s begin with a new jargon word: tag.

What is a tag?

I am going to refer to items (such as the following) enclosed in angle brackets as tags.

<book>

The tag shown above is often referred to as a start tag. The tag shown below is often referred to as an end tag.

</book>

Note that in this case, the start tag and the end tag differ only in that the end tag contains a slash character. However, the start tag can also contain optional attributes as discussed below.

What are elements, content, and attributes?

To begin with, I need to introduce you to another new word: content. I will call the following set of characters an element and will call the characters in between the tags (Text for Chapter 1) the content.

<chap number="1">
Text for Chapter 1
</chap>

I will refer to the characters number="1", which appear inside the start tag, as an attribute. Note that an element consists of a start tag and an end tag, with the content sandwiched in between the two tags. The start tag may contain optional attributes. In this case, a single attribute provides the number value for the chapter.
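For readers who like to see these terms in action, here is a minimal sketch in Python. Python is my choice for illustration only, and the variable names are made up for this example. The sketch uses the standard library's xml.etree.ElementTree module to pick the chapter element apart into its tag name, its attributes, and its content.

```python
import xml.etree.ElementTree as ET

# The chapter element from the article, as plain text.
chapter_xml = '<chap number="1">\nText for Chapter 1\n</chap>'

# Parse the text into an element object.
chapter = ET.fromstring(chapter_xml)

tag_name = chapter.tag          # the name used in the start tag and end tag
attributes = chapter.attrib     # dictionary of attribute names and values
content = chapter.text.strip()  # the characters between the two tags

print("tag:", tag_name)
print("attributes:", attributes)
print("content:", content)
```

Note that the attribute value comes back as the text "1" rather than as a number. In plain XML, attribute values are just characters; it is up to the program reading them to decide what they mean.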

Tell me more about attributes

The term attribute is commonly used in computer science and usually has about the same meaning, regardless of whether the discussion revolves around XML, Java programming, or database management.

Things have attributes

A chapter in a book is a thing. A person is also a thing. Therefore, a person has attributes. Each attribute has a value. Here is a list of some of the attributes (along with their values) that might be used to describe a person:

name="Joe"
height="84"
weight="176"
complexion="pale"
sex="male"
training="Java programmer"
degree="Masters"

Obviously, there are many more attributes that could be used to describe a person.
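To tie this back to XML, the sketch below shows how Joe's attributes might look inside a start tag, and how a program could read them back. The tag name person and the content text are my own inventions for this illustration; nothing in the series defines them.

```python
import xml.etree.ElementTree as ET

# A hypothetical person element. The tag name "person" and the content
# are assumptions made for this example only.
person_xml = (
    '<person name="Joe" height="84" weight="176" complexion="pale" '
    'sex="male" training="Java programmer" degree="Masters">'
    'Content describing Joe'
    '</person>'
)

person = ET.fromstring(person_xml)

# Each attribute is a name paired with a value.
for attr_name, attr_value in person.attrib.items():
    print(attr_name, "=", attr_value)
```

Every attribute value is wrapped in quotation marks inside the start tag, just as the chapter number was in the book example.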

The importance of an attribute depends on the context

Which attributes are important depends on the context in which the person is being considered. For example, if the person is being considered in the context of being a candidate for a basketball team, the height, weight, and sex attributes will probably be important.

On the other hand, if the person is being considered in the context of being a candidate for employment, the height, weight, and sex attributes should not be important at all, but the training and degree attributes might be very important.
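The idea that context decides which attributes matter can be sketched in a few lines of Python. The context names and the function attributes_for are hypothetical, chosen only to mirror the basketball and employment examples above.

```python
# Joe's attributes, copied from the list earlier in this article.
person = {
    "name": "Joe",
    "height": "84",
    "weight": "176",
    "complexion": "pale",
    "sex": "male",
    "training": "Java programmer",
    "degree": "Masters",
}

# Hypothetical contexts and the attributes that each one cares about.
context_attributes = {
    "basketball": ["height", "weight", "sex"],
    "employment": ["training", "degree"],
}

def attributes_for(context, person):
    """Return only the attributes that matter in the given context."""
    wanted = context_attributes[context]
    return {name: person[name] for name in wanted}

basketball_view = attributes_for("basketball", person)
employment_view = attributes_for("employment", person)

print(basketball_view)
print(employment_view)
```

The person does not change from one context to the next. Only the selection of attributes that the program pays attention to changes.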

Why does XML use attributes?

The definition of XML given earlier is repeated here for convenience:

XML gives us a way to create and maintain structured documents in plain text that can be rendered in a variety of different ways.

In an earlier article, I suggested that in its most common modern use, the word rendering probably means presenting something for human consumption. I gave an example of a newspaper that can be rendered either on newsprint or on a computer screen.

If the newspaper (structured document) is created and maintained as an XML document, then some sort of computer program (often referred to as a rendering engine) will probably be used to render it into the desired presentation format.

Take our book, for example. It could also be rendered in a variety of different ways. However it is rendered, it would probably be useful to separate and/or number the chapters. Thus, the value of the number attribute could be used by the rendering engine to present the chapter number for a specific rendering. In some renderings, the number might appear on an otherwise blank page that begins a new chapter. In a different rendering, the chapter number might appear in the upper right or left-hand corner of each page.
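A real rendering engine is far more elaborate than anything I can show here, but the toy sketch below gives the flavor. It reads the book from the beginning of this article and produces two different plain-text renderings, each using the value of the number attribute in its own way. The function names and the two output styles are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The simple book used throughout this article.
book_xml = """<book>
<chap number="1">
Text for Chapter 1
</chap>
<chap number="2">
Text for Chapter 2
</chap>
</book>"""

book = ET.fromstring(book_xml)

def render_with_title_lines(book):
    """Rendering 1: put the chapter number on its own line before the text."""
    lines = []
    for chap in book.findall("chap"):
        lines.append("Chapter " + chap.get("number"))
        lines.append(chap.text.strip())
    return "\n".join(lines)

def render_with_bracket_numbers(book):
    """Rendering 2: prefix each chapter's text with its number in brackets."""
    return "\n".join(
        "[" + chap.get("number") + "] " + chap.text.strip()
        for chap in book.findall("chap")
    )

print(render_with_title_lines(book))
print(render_with_bracket_numbers(book))
```

Both functions read exactly the same XML document. The document says nothing about where the chapter number should appear; each rendering makes that decision for itself.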

Separation of content from presentation

One of the most important characteristics of XML (as opposed to HTML) is that XML separates content from presentation. The XML document contains information about structure and content. It does not contain presentation information (as does HTML). Presentation of an XML document requires the use of a rendering engine of some sort to render the XML document in a particular presentation style.

Attributes provide information about XML elements that may be useful to the rendering engine. If the attribute values for an element are not important in a particular presentation context, the rendering engine for that context can ignore them. If they are important in a particular context, the rendering engine can use them.

What about elements and content?

I will have more to say about elements and content in the next article in this series.

Coming attractions...

In my next article I will continue the discussion, including a layman's description of elements and content.

The XML octopus

Trying to wrap your brain around XML is sort of like trying to put an octopus in a bottle. Every time you think you have it under control, a new tentacle shows up. XML has many tentacles, reaching out in all directions. But that's what makes it fun. As your XML host, I will do my best to lead you to the information that you need to keep the XML octopus under control.

Credits

This HTML page was produced using the WYSIWYG features of Microsoft Office 2000. The images on this page were used with permission from the Microsoft Clipart Gallery.


About the author

Richard Baldwin is a college professor and private consultant whose primary focus is a combination of Java and XML. In addition to the many platform-independent benefits of Java applications, he believes that a combination of Java and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two. He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas. He is the author of Baldwin's Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.

