Data Structures in Java: Part 10

Baldwin shows you how to use a Comparator object to achieve a natural ordering on a set of names added to a TreeSet collection while ignoring the case used to write the names.

The Comparator Interface, Part 2

Published:  August 13, 2001
By Richard G. Baldwin

Java Programming, Lecture Notes # 1368


Preface

A miniseries

This is the tenth lesson in a miniseries on Java data structures and the Java Collections Framework.  The first lesson in the miniseries was entitled Data Structures in Java: Part 1, Getting Started.  The previous lesson was entitled Data Structures in Java: Part 9, The Comparator Interface, Part 1.

The purpose of this miniseries is to help you learn the essential features of Object-Oriented data structures in Java using the Collections Framework.

A sub-series

This is also the second lesson in a sub-series on the Comparator interface.  The purpose of the lessons in this sub-series is to teach you about the interactions between the Comparator interface and the Collections Framework, particularly with respect to the Set, SortedSet, and SortedMap interfaces of the Collections Framework.  This lesson discusses Set and SortedSet.  A discussion of SortedMap will be deferred for inclusion in a subsequent lesson.

Viewing tip

You may find it useful to open another copy of this lesson in a separate browser window.  That will make it easier for you to scroll back and forth among the different listings while you are reading about them.

Supplementary material

I recommend that you also study the other lessons in my extensive collection of online Java tutorials.  You will find those lessons published at Gamelan.com.  However, as of the date of this writing, Gamelan doesn't maintain a consolidated index of my Java tutorial lessons, and sometimes they are difficult to locate there.  You will find a consolidated index at Baldwin's Java Programming Tutorials.

Preview

In this lesson, I will show you how to use a Comparator object to achieve a natural ordering of a set of names added to a TreeSet collection while ignoring the case used to write the names.

Discussion and Sample Program

Beginning with a quiz

Let's begin with a little quiz to test your prior knowledge of the Collections Framework.

What output is produced by the program shown in Listing 1?

//File Comparator03.java
//Copyright 2001 R.G.Baldwin
import java.util.*;
import java.io.Serializable;

public class Comparator03{
  public static void main(
                        String args[]){
    new Worker().doIt();
  }//end main()
}//end class Comparator03

class Worker{
  public void doIt(){
    Iterator iter;
    Collection ref;
    System.out.println(
                      "Natural ordering");
    ref = new TreeSet();
    Populator.fillIt(ref);
    iter = ref.iterator();
    while(iter.hasNext()){
      System.out.print(iter.next() + " ");
    }//end while loop
    System.out.println();    
    
    System.out.println(
                     "Comparator in use");
    ref = new TreeSet(
                  new TheComparator());
    Populator.fillIt(ref);
    iter = ref.iterator();
    while(iter.hasNext()){
      System.out.print(iter.next() + " ");
    }//end while loop
    System.out.println();

  }//end doIt()
}// end class Worker

class Populator{
  public static void fillIt(
                       Collection ref){
    ref.add("Joe");
    ref.add("Bill");
    ref.add("Tom");
    ref.add("JOE");
    ref.add("BILL");
    ref.add("TOM");
  }//end fillIt()
}//end class Populator

class TheComparator  
    implements Comparator,Serializable{
      
  public int compare(
                  Object o1,Object o2){
    if(!(o1 instanceof String))
        throw new ClassCastException();
    if(!(o2 instanceof String))
        throw new ClassCastException();

    //Do an upper-case comparison
    int result = 
            ((String)o1).toUpperCase().
                compareTo(((String)o2).
                        toUpperCase());
    return result;
  }//end compare()
    
  public boolean equals(Object o){
    if(!(o instanceof TheComparator))
        return false;
    else return true;
  }//end overridden equals()
}//end class TheComparator

Listing 1

If your answer was None of the above, you are correct.

The program output

The output produced by the program shown in Listing 1 is four lines long as shown below:

Natural ordering
BILL Bill JOE Joe TOM Tom
Comparator in use
Bill Joe Tom

From the previous lesson

In the previous lesson, I introduced you to the essentials of defining and using a Comparator for controlling the sort order of the elements contained in a TreeSet collection.

In that lesson, I explained the difference between natural ordering and the sort ordering produced through the use of a Comparator object.

However, what I showed you generally replicated the natural ordering, and therefore, wasn't too exciting.

Doing more with a Comparator

In this and several subsequent lessons, I am going to show you some of the things that you can do with a Comparator object.  By using a Comparator object, you can achieve comparisons and sort orders that are different from the natural ordering for a given element type.

Two steps in the program

The program shown in Listing 1 goes through two major steps.  First it populates a TreeSet collection with the names of six people without using a Comparator.  Then it displays the contents of that collection using an iterator.  That produces the following output:

Natural ordering
BILL Bill JOE Joe TOM Tom

In this step, each of the six names that were added to the collection were displayed after they were arranged into their natural ordering.

In case you are unfamiliar with this fact, upper-case characters appear before lower-case characters in the natural ordering of characters in the Unicode character set.  Therefore, the names consisting of all upper-case characters appear in the output ahead of the same names consisting of a mixture of upper-case and lower-case characters.

The second step

Then the program shown in Listing 1 instantiates a new TreeSet object, providing a Comparator for use in comparing and managing the sort order of the elements.

The program populates the new TreeSet collection with the same set of six names in the same order as before.  After the collection is populated, its contents are displayed producing the following output:

Comparator in use
Bill Joe Tom

Duplicate names eliminated from the set

Three of the names appear in the output in the same order as the natural ordering shown earlier.  However, the duplicate names are eliminated and only three names appear.

This is because a Comparator was used by the TreeSet object to compare the elements as they were added.  The Comparator was designed to eliminate the distinction between upper-case and lower-case characters.

Does Joe equal JOE?

For the earlier case that didn't use a Comparator, the names Joe and JOE were considered to be different elements.  Therefore, after population, both names appeared in the collection.

When the Comparator was used to eliminate the distinction between upper-case and lower-case characters, the names Joe and JOE were considered to be duplicates.  As a result, only the first of the two was allowed into the collection and the second of the two was rejected.

Let's see some code

The code shown in Listing 2 is the code that was used

    ref = new TreeSet();
    Populator.fillIt(ref);

    iter = ref.iterator();
    while(iter.hasNext()){
      System.out.print(iter.next() + " ");
    }//end while loop

Listing 2

The actual populator code

The code in Listing 3 was actually used to populate the collection in both cases, both with, and without a Comparator (to be discussed later).
 
class Populator{
  public static void fillIt(
                       Collection ref){
    ref.add("Joe");
    ref.add("Bill");
    ref.add("Tom");
    ref.add("JOE");
    ref.add("BILL");
    ref.add("TOM");
  }//end fillIt()
}//end class Populator

Listing 3

Populating the collection with String objects

Note that in Listing 3, I did not use a class of my own design from which to instantiate the objects used to populate the collection.  Rather, I used the String class from the standard library.

The String class implements the Comparable interface.  Therefore, objects instantiated from the String class have a natural ordering when placed in a collection.

Because the compareTo method of the String class, (which implements the Comparable interface) considers upper-case and lower-case characters to be different, there were no duplicate elements added to the collection when the compareTo method was used to compare elements.  The six String objects were simply arranged so that the iterator would return references to those objects in sorted order.  This produced the output shown below:

BILL Bill JOE Joe TOM Tom

A TreeSet with a Comparator

The code shown in Listing 4 was used to instantiate a new TreeSet object.  A Comparator object was provided to the TreeSet constructor.  The Comparator object was subsequently used for comparing and controlling the sorting order of the elements in the TreeSet collection.
 
    ref = new TreeSet(
                     new TheComparator());
    Populator.fillIt(ref);
    iter = ref.iterator();
    while(iter.hasNext()){
      System.out.print(iter.next() + " ");
    }//end while loop

Listing 4

The code in Listing 4 was also used to populate the collection, and to display the contents of the collection after it was populated.

Populating the TreeSet collection

As before, the fillIt method shown in Listing 5 was actually used to populate the collection.  The same six names as before were added to the TreeSet collection.  However, the result of adding those six names was determined by the behavior of the compare method in the Comparator object used by the TreeSet object for managing the collection (three of the names were rejected as duplicates).
 
  public static void fillIt(
                       Collection ref){
    ref.add("Joe");
    ref.add("Bill");
    ref.add("Tom");
    ref.add("JOE");
    ref.add("BILL");
    ref.add("TOM");
  }//end fillIt()

Listing 5

The Comparator

The code in Listing 6 shows the beginning of the class from which the Comparator object was instantiated.  Note that this class implements the Comparator interface.
 
class TheComparator  
    implements Comparator,Serializable{
      
  public int compare(
                  Object o1,Object o2){
    if(!(o1 instanceof String))
        throw new ClassCastException();
    if(!(o2 instanceof String))
        throw new ClassCastException();

Listing 6

Listing 6 doesn't contain the interesting part of this class.  The code in Listing 6 simply throws an exception if the compare method receives any incoming parameters of types other than String.

The interesting code

The interesting code is shown in Listing 7.
 
  int result = 
    ((String)o1).toUpperCase().
          compareTo(((String)o2).
                        toUpperCase());
    return result;
}//end compare()

Listing 7

The code in Listing 7 makes use of two methods of the String class to compare the two incoming objects.

Convert to upper-case

The method named toUpperCase is used to produce a version of each of the incoming strings that consists of upper-case characters only.  In other words, lower-case characters in each of the two strings are replaced by the corresponding upper-case characters.  This conversion occurs before the strings are compared.

For example, the string Joe is converted to JOE inside the compare method, before the actual comparison is made.  This results in the two strings containing Joe and JOE being considered to be duplicates.  If one of them is already in the collection when an attempt is made to add the other, the second will be rejected as a duplicate.

Making the comparison

Then the compareTo method of the string class is used to make the actual comparison.  (Note that this is the same method that is used to make the comparison in the absence of a Comparator object.  However, in the case of the Comparator object, the case of the strings is modified before they are passed to the compareTo method.)

This code invokes the compareTo method on the upper-case version of the string represented by o1, passing the upper-case version of the string represented by o2 as a parameter.  Here is part of what Sun has to say about the behavior of the compareTo method.

"Returns:  the value 0 if the argument is a string lexicographically equal to this string; a value less than 0 if the argument is a string lexicographically greater than this string; and a value greater than 0 if the argument is a string lexicographically less than this string."
Just what I was looking for

That is exactly the behavior that I was looking for, so all that I needed to do after invoking the compareTo method on the upper-case versions of the two strings was to return the value that was returned by the compareTo method.

(Note, while writing this lesson and explaining the behavior of this program, I discovered that I could have used a method of the String class named compareToIgnoreCase to accomplish the same thing with a little less work.)

The results

When the TreeSet object used the Comparator object to compare and arrange the elements in the collection, the three duplicate names were eliminated and the iterator delivered references to the remaining three names in the following order:

Bill Joe Tom
 

Summary

In this lesson, I showed you how to use a Comparator object to achieve a natural ordering of a set of names added to a TreeSet collection while ignoring the case used to write the names.

What's Next?

In the next lesson, I will show you how to use a Comparator to cause a TreeSet collection to be sorted in descending order while preserving differences in case.


Copyright 2001, Richard G. Baldwin.  Reproduction in whole or in part in any form or medium without express written permission from Richard Baldwin is prohibited.

About the author

Richard Baldwin is a college professor and private consultant whose primary focus is a combination of Java and XML. In addition to the many platform-independent benefits of Java applications, he believes that a combination of Java and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two.  He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas.  He is the author of Baldwin's Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.

baldwin.richard@iname.com

-end-