Data Structures in Java: Part 6, Duplicate Elements, Ordered Collections, Sorted Collections, and Interface Specialization

In this lesson Baldwin shows you that all concrete implementations in the Java Collections Framework (JDK 1.3) implement a subinterface of the Collection interface.  A Set object cannot contain duplicate elements, but a List object can contain duplicate elements.  Ordered collections are not the same as sorted collections.  Specialized stipulations are placed on interfaces as you progress down the interface inheritance hierarchy of the Java Collections Framework.

Published:  July 9, 2001
By Richard G. Baldwin

Java Programming, Lecture Notes # 1360


Preface

This is the sixth lesson in a miniseries on Java data structures and the Java Collections Framework.  The first lesson in the miniseries was entitled Data Structures in Java: Part 1, Getting Started.  The previous lesson was entitled Data Structures in Java: Part 5, The Core Collection Interfaces.

The purpose of this miniseries is to help you learn the essential features of Object-Oriented data structures in Java using the Collections Framework.

Viewing tip

You may find it useful to open another copy of this lesson in a separate browser window.  That will make it easier for you to scroll back and forth among the different figures and listings while you are reading about them.

Supplementary material

I recommend that you also study the other lessons in my extensive collection of online Java tutorials.  You will find those lessons published at Gamelan.com.  However, as of the date of this writing, Gamelan doesn't maintain a consolidated index of my Java tutorial lessons, and sometimes they are difficult to locate there.  You will find a consolidated index at Baldwin's Java Programming Tutorials.

The index on my site provides links to the lessons at Gamelan.com.

Preview

In the previous lesson, entitled Data Structures in Java: Part 5, The Core Collection Interfaces, you learned that the Java Collections Framework defines six core interfaces, in two distinct trees.  You learned the names and the inheritance structure of those interfaces.  You also learned about their purpose.  You saw how the interfaces declare polymorphic methods that apply to implementations of the interfaces, and you learned about the optional methods of the Collection interface.

In this lesson you will learn that all of the implementations of the interfaces in the Java Collections Framework implement one of the subinterfaces of the Collection interface.  You will learn that a Set object cannot contain duplicate elements, but a List object can contain duplicate elements.

You will learn about the difference between ordered collections and sorted collections.  You will also learn about ascending order and the natural ordering of objects.  In addition, you will learn how more specialized stipulations are placed on interfaces as you progress down the interface inheritance hierarchy of the Java Collections Framework.

Discussion

Let's start with a quiz

As is often the case, I am going to begin this lesson with a little quiz just to make sure that you are awake.  Is the following statement True or False?

The TreeSet class is a direct implementation of the Collection interface.
The answer is False.  The TreeSet class is not a direct implementation of the Collection interface.  Rather, the TreeSet class is a direct implementation of the SortedSet interface.  The SortedSet interface extends the Set interface, and the Set interface extends the Collection interface.

The root of the collection hierarchy

The Collection interface is the root of the collection hierarchy.  JDK 1.3 doesn't provide any direct implementations of the Collection interface.  All of the implementations of the interfaces in the Java Collections Framework implement one of the subinterfaces of the Collection interface.

What does Sun say about this?

Here is what the Sun documentation has to say on the topic of the Collection interface:

"The SDK does not provide any direct implementations of this interface: it provides implementations of more specific subinterfaces like Set and List. This interface is typically used to pass collections around and manipulate them where maximum generality is desired."

The Sun documentation also states:

"Bags or multisets (unordered collections that may contain duplicate elements) should implement this interface directly."

However, JDK 1.3 does not provide any implementations for bags or multisets.  If you need collections of these types, you will need to define the implementation classes yourself.

What about duplicate elements?

Some implementations of Collection allow duplicate elements, and others do not.  Implementations of the List interface (such as ArrayList) allow duplicate elements.  Implements of Set and SortedSet (such as TreeSet) do not allow duplicate elements.  This was illustrated in a previous lesson entitled Data Structures in Java: Part 5, The Core Collection Interfaces.

A sample program in that lesson created two collection objects and applied the polymorphic add() method to add the same elements to each collection.  One of the collection objects was of type ArrayList, and the other collection object was of type TreeSet.  The elements added to each collection contained one pair of duplicate elements.  The duplicate element was automatically excluded from the TreeSet object, but was retained in the ArrayList object.

So, what is a set?

According to Sun, a Set is a "collection that contains no duplicate elements ... this interface models the mathematical set abstraction." An object of type Set is typically used to model collections such as Social Security numbers where duplicates are not allowed.

And, what is a list?

Also according to Sun, a List is "An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list."

Ordered is not the same as sorted

Note that an ordered collection is not the same as a sorted collection.  The fact that the collection is ordered derives from the fact that each element in the collection has a specific position specified by an index.  In a sorted collection, the position of each element is determined by its value relative to the values of its predecessors and successors.

Sun goes on to say, "Unlike sets, lists typically allow duplicate elements. More formally, lists typically allow pairs of elements e1 and e2 such that e1.equals(e2), and they typically allow multiple null elements if they allow null elements at all."

Is ascending sort order always required?

Not all implementations of the Collection interface maintain the elements in ascending sort order.  Some may, and others do not.  For example, as discussed above, implementations of the List interface (such as ArrayList) do not maintain their elements in sorted order at all.  In other words, the position of an element in an ArrayList does not depend on the value of the element.

On the other hand, implementations of the interfaces named SortedSet (such as TreeSet) and SortedMap do maintain their elements in sorted order.  However, that order is not necessarily ascending.  When an object is instantiated from a class that implements the SortedSet interface, the sorting order for that object can be established by providing an object instantiated from a class that implements the Comparator interface.  In that case, the author of the implementing class determines the order imposed on the elements in the collection.

Does case matter in String objects?

For example, if your SortedSet object contains references to String objects, the natural ascending sort would take the difference between upper case and lower case characters into account.  However, you might prefer that case be ignored when establishing the sorted order.  You can accomplish this by providing an object of a class that implements the Comparator interface and which defines the compare() and equals() methods in such a way as to eliminate case considerations for comparisons of String objects.  (The lesson entitled Swing from A to Z:  Analyzing Swing Components, Part 1, Concepts contains a sample program named Introspect03 that implements the Comparator interface for exactly this purpose.)

Subinterfaces have more stipulations

As you progress down the inheritance hierarchy, you find that additional stipulations apply at each level of inheritance.  As an example, according to Sun, "The Set interface places additional stipulations, beyond those inherited from the Collection interface, on the contracts of all constructors and on the contracts of the add, equals and hashCode methods."

The important point is that specific subinterfaces of the Collection interface can define requirements that do not apply to all subinterfaces of the Collection interface.  For example, the add() method of the Set interface stipulates the following:

"Adds the specified element to this set if it is not already present."
On the other hand, the add() method of the Collection interface simply states:
"Ensures that this collection contains the specified element."
Thus, the contract for the add() method of an object of a class that implements the Set interface is more specialized than the contract for the add() method of an object of a class that implements Collection interface.

An additional stipulation on the constructor for a Set object is that all constructors must create a set that contains no duplicate elements.

Stipulations on SortedSet

The SortedSet interface extends the Set interface. The SortedSet interface contains the following stipulation that makes it more specialized than a Set.

"A set that further guarantees that its iterator will traverse the set in ascending element order, sorted according to the natural ordering of its elements (see Comparable), or by a Comparator provided at sorted set creation time."

Let's end with a quiz

I'm going to finish this lesson with several questions in the form of a quiz.  To ensure that this is a learning experience, I will provide an explanation in addition to the answer for each question.

Q1  True or False?  A collection that implements the List interface maintains its elements in ascending alphanumeric order.

The answer to question 1 is false.  Unlike collections that implement the SortedSet interface, the order of the elements in a collection that implements the List interface is not based on the values of the objects referred to by the elements in the list.

Q2  True or False?  A collection that implements the List interface is an unordered collection.

The answer to question 2 is also false.  A collection that implements the List interface is an ordered collection (also known as a sequence).  According to Sun, "The user of the interface has precise control over where in the list each element is inserted." Elements can be inserted and retrieved on the basis of their integer index (position in the list) using the following methods:

add(int index, Object element)
get(int index)

Valid index values are positive integers that begin with zero.  When the add method is used to insert an element at a specific position in the sequence, the element currently at that position (if any) and any subsequent elements are shifted toward higher index values to make room for the new element.

Another version of the add method takes a reference to an object as an incoming parameter and appends the specified element to the end of the collection.

The get method simply returns the element at the specified position in the collection.

The List interface also declares various other methods that can be used to manipulate the contents of the collection.

Q3  True or False?  A collection that implements the List interface is allowed to contain duplicate values.

The answer to question 3 is true.  Unlike a collection that implements the Set interface, a collection that implements the List interface is typically allowed to contain duplicate values.  More formally, according to Sun, "lists typically allow pairs of elements e1 and e2 such that e1.equals(e2), and they typically allow multiple null elements if they allow null elements at all."

Q4  True or False?  The contracts of the methods in the List interface are the same as the contracts of the methods inherited from the Collection interface.

The answer to question 4 is false.  According to Sun, "The List interface places additional stipulations, beyond those specified in the Collection interface, on the contracts of the iterator, add, remove, equals, and hashCode methods."

For example, the iterator() method (for both the List and Collection interfaces) returns an iterator over the elements in the collection.  For the Collection interface, there are no guarantees concerning the order in which the elements are returned by the methods of the Iterator object.

On the other hand, the iterator() method for the List interface returns an iterator over the elements in the collection in proper sequence, where the sequence is determined by the numeric index.  In other words, when you invoke the methods of the Iterator object on a List, the elements will be returned in the proper sequence as determined by a numeric index.

Similarly, according to Sun, the SortedSet interface "guarantees that its iterator will traverse the set in ascending element order, sorted according to the natural ordering of its elements (see Comparable), or by a Comparator provided at sorted set creation time."

Summary

In this lesson you learned that all of the implementations of the interfaces in the Java Collections Framework implement one of the subinterfaces of the Collection interface.  You learned that a Set object cannot contain duplicate elements, but a List object can contain duplicate elements.

You learned about the difference between ordered collections and sorted collections.  You also learned about ascending order and the natural ordering of objects.  In addition, you learned how more specialized stipulations are placed on interfaces as you progress down the interface inheritance hierarchy of the Java Collections Framework.

What's Next?

The SortedSet interface "guarantees that its iterator will traverse the set in ascending element order, sorted according to the natural ordering of its elements (see Comparable), or by a Comparator provided at sorted set creation time."  In the next lesson, I will show you how to use the Comparator interface to control the sorted order of your collections.


Copyright 2000, Richard G. Baldwin.  Reproduction in whole or in part in any form or medium without express written permission from Richard Baldwin is prohibited.

About the author

Richard Baldwin is a college professor and private consultant whose primary focus is a combination of Java and XML. In addition to the many platform-independent benefits of Java applications, he believes that a combination of Java and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two.  He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas.  He is the author of Baldwin's Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.

baldwin.richard@iname.com

-end-