XML and Java Objects, Part 2

by Richard G. Baldwin
Baldwin's Home Page

Dateline: 07/11/99


This is the second article in a series involving SAX and Java. In this series, I will show you how to use SAX to convert an XML document into a set of Java objects, and how to convert a set of Java objects into an XML document.

You might wonder why anyone would want to do this. Well, I do it every day for the purpose of maintaining this site. I maintain all of the links in an XML database and have written several Java programs to manipulate and maintain that database. As of this writing, the database contains more than five hundred links and their associated comments.

As you are aware, links on the web come and go, so maintaining more than five hundred links spread out over a couple dozen HTML files could be a formidable task. However, by consolidating all of the links into a single XML document and using a Java program to automatically generate the required HTML files, I have converted an otherwise formidable task into one that is very manageable.

Whenever I need to modify or add to the database, rather than manipulating the XML file directly, I simply convert the XML file to a set of Java objects, manipulate those objects, and then convert the objects to a new XML file.

I wrapped up the previous article by telling you "My plan is to continue this discussion in the next article, showing you some of the Java code that can be used to convert the XML file into a set of Java objects using SAX."

So, without further discussion, let's continue down that path. You may find it helpful to review the previous article before diving into the details of this one.

handling parser events and errors

This program uses the IBM parser (XML4J) along with the XML file from the previous article.

purpose of the program

The program consists of the following four source files: Sax02.java, Sax02B.java, Sax02C.java, and Sax02D.java.

I will discuss all four files during this series of articles. Taken together, they illustrate the conversion of an XML file to a set of Java objects, and the conversion of those objects back to a new XML file. Along the way, the program displays the contents of the objects.

converting XML to Java objects

This program reads an XML file with a very specific format as described in the previous article. It creates a Java object for each record in the file. You will recall that the XML file contains the basis for a computerized exam, and each record represents an exam problem. The program stores the objects in a Vector container. Classes in the file named Sax02B.java are used for this purpose.
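To give a rough feel for the idea before we get to the real Sax02B code, here is a minimal sketch of turning SAX events into objects. It is not the code from this program. It assumes the SAX parser that ships with a current JDK (through javax.xml.parsers) rather than XML4J, and the names ProblemSketch and ProblemHandlerSketch, along with the exam root element used in the usage note, are hypothetical names chosen for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical, cut-down stand-in for the Sax02D record class.
class ProblemSketch {
  int problemNumber;
  String question;
  List<String> items = new ArrayList<>();
}

// Hypothetical handler: builds one ProblemSketch per <problem> element.
class ProblemHandlerSketch extends DefaultHandler {
  final List<ProblemSketch> problems = new ArrayList<>();
  private ProblemSketch current;
  private final StringBuilder text = new StringBuilder();

  @Override
  public void startElement(String uri, String localName,
                           String qName, Attributes atts) {
    text.setLength(0); // discard whitespace seen between elements
    if (qName.equals("problem")) {
      current = new ProblemSketch();
      current.problemNumber =
          Integer.parseInt(atts.getValue("problemNumber"));
    }
  }

  @Override
  public void characters(char[] ch, int start, int length) {
    // The parser may deliver text in several chunks, so accumulate it.
    text.append(ch, start, length);
  }

  @Override
  public void endElement(String uri, String localName, String qName) {
    if (current == null) return;
    if (qName.equals("question")) {
      current.question = text.toString();
    } else if (qName.equals("item")) {
      current.items.add(text.toString());
    } else if (qName.equals("problem")) {
      problems.add(current); // the record is complete
      current = null;
    }
    text.setLength(0);
  }

  // Parses an XML string and returns the objects that were built.
  static List<ProblemSketch> parse(String xml) throws Exception {
    ProblemHandlerSketch handler = new ProblemHandlerSketch();
    SAXParserFactory.newInstance().newSAXParser()
        .parse(new InputSource(new StringReader(xml)), handler);
    return handler.problems;
  }
}
```

Calling ProblemHandlerSketch.parse() on a string containing one or more problem elements wrapped in a single root element (here assumed to be named exam) returns one object per problem. Note that the parser resolves entities such as &amp;amp; before the text reaches characters(), so the objects hold the plain characters.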

displaying the data

After generating the Vector object containing the XML records, the program displays each of the instance variables in each object whose reference is stored in the Vector. This is a very simple illustration of how the XML data can be processed after first having converted it into object format. In a real program, something much more significant than simply displaying the data (such as editing the data) would occur at this point.

converting Java objects into XML

Then the program generates an XML file containing a record for each element in the Vector, and writes it out to a disk file named junk.xml. Classes in the file named Sax02C.java are used for this purpose.

The conversion from objects to XML is very specific and fairly brute force. A more generalized approach using the Document Object Model will be illustrated later.
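The general shape of a brute-force conversion can be sketched as follows. This is not the Sax02C code, which I will discuss later. It is a minimal illustration under my own assumptions: the class name XmlWriterSketch and its two methods are hypothetical, the record is cut down to a problem number, a question, and a list of items, and the result is returned as a String rather than written to a disk file. The important point is that reserved characters must be replaced by entities on the way out.

```java
import java.util.List;

// Hypothetical helper class, for illustration only.
class XmlWriterSketch {

  // Replace characters that are reserved in XML with predefined entities.
  static String escape(String s) {
    StringBuilder out = new StringBuilder();
    for (char c : s.toCharArray()) {
      switch (c) {
        case '&': out.append("&amp;"); break;
        case '<': out.append("&lt;"); break;
        case '>': out.append("&gt;"); break;
        case '"': out.append("&quot;"); break;
        default: out.append(c);
      }
    }
    return out.toString();
  }

  // Build the text of one cut-down <problem> record by concatenation.
  static String toXml(int problemNumber, String question,
                      List<String> items) {
    StringBuilder xml = new StringBuilder();
    xml.append("<problem problemNumber=\"")
       .append(problemNumber).append("\">\n");
    xml.append("<question>").append(escape(question))
       .append("</question>\n");
    for (String item : items) {
      xml.append("<item>").append(escape(item)).append("</item>\n");
    }
    xml.append("</problem>\n");
    return xml.toString();
  }
}
```

Because the element names are hard-coded, a writer like this works only for one document format, which is what I mean by brute force.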

the object class

A class definition in the file named Sax02D.java is used by all of the classes that make up this program as a common class for representing an XML record as an object.

miscellaneous comments

No particular effort was expended to make this program robust. In particular, if it encounters an XML file in the wrong format, it may throw an exception, or it may continue to run, but it probably will not work properly.
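If you want at least a clean report when the input is not well-formed, one possible approach is sketched below. This is not part of the program discussed in this article. It assumes the SAX parser bundled with a current JDK rather than XML4J, and the class name WellFormedCheckSketch and its check() method are hypothetical. Note that it detects only XML that is not well-formed. It does not detect a well-formed file that uses a different set of elements.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class WellFormedCheckSketch {

  // Returns "ok" if the text parses, otherwise a short error report.
  static String check(String xml) {
    try {
      // DefaultHandler ignores all content. Its fatalError() method
      // rethrows the exception, so a malformed document ends the parse.
      SAXParserFactory.newInstance().newSAXParser()
          .parse(new InputSource(new StringReader(xml)),
                 new DefaultHandler());
      return "ok";
    } catch (SAXParseException e) {
      return "parse error at line " + e.getLineNumber()
          + ": " + e.getMessage();
    } catch (Exception e) {
      return "error: " + e;
    }
  }
}
```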

The program was tested using JDK 1.2 under Win95. It also requires the IBM XML4J parser, or some suitable substitute.

The program was tested with the XML file named Sax02.xml, which is listed in the previous article. (Note that line breaks were manually inserted in that listing to force the text to fit in a narrow page format.)

processing results

The program produced the following output on the screen. (Line breaks were manually inserted here to force the text to fit in this format):

Problem Number: 1
Question: Which of the following are grown 
in a vegetable garden?
Type: multiple
Number choices: 4
Valid: 0,1,3
Item number 0: Carrots
Item number 1: Cabbage
Item number 2: Apples
Item number 3: Lettuce
Explanation: Carrots, cabbage, and lettuce are 
grown in a vegetable garden. Apples are grown 
in an orchard.
Problem Number: 2
Question: Which one of the following requires 
XML entities?
Type: single
Number choices: 3
Valid: 1
Item number 0: Tom
Item number 1: "<Mary & Sue>"
Item number 2: Dick
Explanation: Left and right angle brackets, 
ampersands, and quotation marks must be 
represented in XML by entities

program output

The program produced an output file named junk.xml. Since the program did not modify the data after reading the original XML file and before creating the new XML file, this output should be a replica of the original XML file. The file contained the following. (Line breaks were manually inserted here to force the text to fit in this format):

<?xml version="1.0"?>
<problem problemNumber="1">
<question>Which of the following are grown 
in a vegetable garden?</question>
<answer type="multiple" numberChoices="4" 
valid="0,1,3">
<item>Carrots</item>
<item>Cabbage</item>
<item>Apples</item>
<item>Lettuce</item>
</answer>
<explanation>Carrots, cabbage, and lettuce are 
grown in a vegetable garden. Apples are grown 
in an orchard.</explanation>
</problem>
<problem problemNumber="2">
<question>Which one of the following requires 
XML entities?</question>
<answer type="single" numberChoices="3" 
valid="1">
<item>Tom</item>
<item>&quot;&lt;Mary &amp; 
Sue&gt;&quot;</item>
<item>Dick</item>
</answer>
<explanation>Left and right angle brackets, 
ampersands, and quotation marks must be 
represented in XML by entities</explanation>
</problem>

interesting code fragments

The entire program consists of a driver file named Sax02.java and several helper files as listed earlier. I'm going to begin with the file containing the class definition used to instantiate objects to contain the XML data.


The class definition in this file provides a common object format for storage of the data from the XML file. The class is designed to contain an instance variable for each piece of data stored in a single exam problem in the XML file. This class has no methods. It is simply a container for the data extracted from the XML file.

This is a very specific class designed for a very specific XML format.

class Sax02D{
  int problemNumber;//an attribute of <problem>
  String question;//the content of <question>
  String type;//an attribute of <answer>
  int numberChoices;//an attribute of <answer>
  String valid;//an attribute of <answer>

  //Each populated element in the following 
  // array contains the content of one 
  // <item> element in the XML file
  // with an arbitrary limit of five such 
  // elements.
  String[] item = new String[5];
  String explanation;//content of <explanation>
}//end Sax02D


Next I will discuss the driver file named Sax02.java by breaking it up into fragments.

The first fragment shows the beginning of the controlling class along with the declaration of three class variables.

import java.util.Enumeration;
import java.util.Vector;

class Sax02 {
  static Vector theExam = new Vector();
  static Thread mainThread =
                        Thread.currentThread();
  static String XMLfile = "Sax02.xml";


The next fragment shows the beginning of the main() method. In this method, I spawn a new thread of type Sax02B that will parse the incoming XML file, creating an object for each problem specification in the exam, and storing a reference to each of those objects in the Vector mentioned above and referred to by theExam.


Then I invoke the start() method on the thread to start it running.

  public static void main (String args[]){
    try{
      Sax02B fetchItObj = 
        new Sax02B(XMLfile,theExam,mainThread);
      fetchItObj.start();//start the thread

If the XML file is a long one, some time will pass before theExam has been populated and is ready for use.

producer/consumer scenario

This is a typical producer/consumer scenario for which there are several control solutions. In this case, the Sax02B thread is the producer and the main thread is the consumer.

go to sleep

Because of the simplicity of this particular situation, I chose simply to put the main thread to sleep and let it sleep until the thread that is parsing the XML file awakens it. That thread will interrupt the main thread when it finishes parsing the XML file, which will cause the main thread to wake up and process theExam.

Thus, the main thread will sleep until the parse is completed. It will wake up when interrupted by the parser thread and will then process the data in the Vector.

If parsing is not completed during the first 100000 milliseconds, the main thread will wake up and then go back to sleep. (However, 100 seconds would be an awfully long time to complete the parse, so it might be better to throw an exception under those conditions.)

      try{
        while(true){//sleep until interrupted
          Thread.sleep(100000);
        }//end while
      }catch(InterruptedException e){
        //Wake up and do something
      }//end catch
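If you would like to experiment with this sleep-until-interrupted pattern in isolation, the following is a minimal, self-contained sketch. It is not the code from Sax02 or Sax02B. The class name SleepUntilInterruptedSketch and its run() method are hypothetical, and a trivial producer thread that appends a string stands in for the parser thread.

```java
// Hypothetical demonstration class, for illustration only.
class SleepUntilInterruptedSketch {

  static String run() {
    final Thread consumer = Thread.currentThread();
    final StringBuilder shared = new StringBuilder();

    // The producer stands in for the parser thread. It does its work
    // and then interrupts the consumer to signal that it is finished.
    Thread producer = new Thread(() -> {
      shared.append("parsed");
      consumer.interrupt();
    });
    producer.start();

    try {
      while (true) {
        // If the interrupt arrived before this call, sleep() throws
        // immediately, so the wake-up signal is not lost.
        Thread.sleep(100000);
      }
    } catch (InterruptedException e) {
      // Awakened by the producer. The data is ready for use.
    }
    return shared.toString();
  }
}
```

For a simple case like this one, where the consumer only needs to wait for the producer to finish, calling join() on the producer thread is a common alternative to using sleep() and interrupt().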

XML file has been converted to objects

At this point, each of the exam problems in the XML file has been converted into a Java object. References to those objects have been stored in a Vector object named theExam. This Vector object can be used for any number of useful purposes, such as editing the contents of the individual objects, sorting the problems, administering the test, etc.

display the object contents

In this sample program, I simply display all of the data in each object before converting the objects back to XML format and writing the XML data into a new disk file named junk.xml. This is accomplished in the next fragment.

Everything in this fragment is simple Java programming using the Enumeration interface, so I won't bother to provide an explanation. If this is new to you, you should review the appropriate lessons in my Java tutorials.

Enumeration theEnum = theExam.elements();
while(theEnum.hasMoreElements()){
  Sax02D theDataObj =
    (Sax02D)theEnum.nextElement();
  System.out.println(
    "Problem Number: " 
      + theDataObj.problemNumber);
  System.out.println(
    "Question: " 
      + theDataObj.question);
  System.out.println(
    "Type: " 
      + theDataObj.type);
  System.out.println(
    "Number choices: " 
      + theDataObj.numberChoices);
  System.out.println(
    "Valid: " 
      + theDataObj.valid);
  //The XML file contains a 
  // field that specifies the
  // number of choices that 
  // will be presented for the
  // problem on a multiple-choice
  // test.  That value
  // should specify the number
  // of data values in the
  // array.  Use that value
  // to extract the data
  // values from the array.
  for(int cnt = 0; 
      cnt < theDataObj.numberChoices; 
      cnt++){
    System.out.println("Item number " 
      + cnt + ": " 
      + theDataObj.item[cnt]);
  }//end for loop
  System.out.println(
    "Explanation: " 
      + theDataObj.explanation);
}//end while(theEnum.hasMoreElements())

convert objects back to XML

The next fragment instantiates an object of type Sax02C and invokes the writeTheFile() method on that object to convert the data stored in the Vector object to XML format and write it into an output file named junk.xml. I will discuss the particulars of the Sax02C class used to accomplish this in a subsequent article.

      Sax02C xmlWriter = 
        new Sax02C(theExam,"junk.xml");
      xmlWriter.writeTheFile();
    }catch(Exception e){System.out.println(e);}
  }//end main
}//end class Sax02 

That completes the main() method and also completes the class definition for Sax02. I will discuss the process of converting the XML file to a set of objects in the next article. 

coming attractions...

My plan is to continue this discussion in the next article, showing you more of the Java code that can be used to convert the XML file into a set of Java objects using SAX.

the XML octopus

Trying to wrap your brain around XML is sort of like trying to put an octopus in a bottle. Every time you think you have it under control, a new tentacle shows up. XML has many tentacles, reaching out in all directions. But, that's what makes it fun. As your XML host, I will do my best to lead you to the information that you need to keep the XML octopus under control.


This HTML page was produced using the WYSIWYG features of Microsoft Word 97. The images on this page were used with permission from the Microsoft Word 97 Clipart Gallery.


Copyright 2000, Richard G. Baldwin

About the author

Richard Baldwin is a college professor and private consultant whose primary focus is a combination of Java and XML. In addition to the many platform-independent benefits of Java applications, he believes that a combination of Java and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two.  He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas.  He is the author of Baldwin's Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.
