A look at the Collections Framework of Java which provides a large set of popular data structures such as stacks, linked lists and dynamic arrays.
The Collections Framework is arguably the most powerful subsystem in the Java API because it supplies ready-to-use versions of programming’s most widely used data structures. For example, it provides support for stacks, queues, dynamic arrays, and linked lists. It also defines trees, hash tables, and maps. These “data engines” make it easy to work with groups (i.e., collections) of data. It is not necessary, for example, to write your own linked-list routines. You can simply use the LinkedList class provided by the Collections Framework.
The Collections Framework has had a profound effect on the way that Java programs are written. Because of the ease with which one of the collection classes can be employed, there is seldom the need to create your own custom solution. By using a standard collection, you gain three major advantages:
- Development time is reduced because the collections are thoroughly tested and ready to use. You don’t need to spend time developing your own custom implementations.
- The standard collections are efficiently implemented. As a general rule, there is little (if any) benefit in creating a custom implementation.
- Your code is easier to maintain. Because the standard collections are part of the Java API, most programmers are familiar with them, and with the way that they operate. Thus, anyone working on your code will understand a collection-based approach.
Collections Overview
Collections have not always been a part of Java. Originally, Java relied on classes such as Dictionary, HashTable, Vector, Stack, and Properties to provide the basic data structures. Although these classes were quite useful, they were not part of a unified whole. To remedy this situation, Java 1.2 added the Collections Framework, which standardized the way in which groups of objects are handled by your programs. Each subsequent Java release has added to and improved this important API.
The core of the Collections Framework is packaged in java.util. It contains the interfaces that define collections and provides several concrete implementations of these interfaces. These are the collections that programmers normally think of when they use the term“Collections Framework,” and they are the focus of this chapter.
In addition to the collections defined in java.util, several other collections are contained in java.util.concurrent. These collections are relatively new, being added by Java 5. They support concurrent programming, but otherwise operate like the other collections. Because concurrent programming is a topic unto itself, the concurrent collections are not included in this chapter.
The Collections Framework is defined by three main features:
- A set of standard interfaces that define the functionality of a collection
- Concrete implementations of the interfaces
- Algorithms that operate on collections
The collection interfaces determine the characteristics of a collection. At the top of the interface hierarchy is Collection, which defines the features common to all collections. Subinterfaces add the ttributes related to specific types of collections. For example, the Set interface specifies the functionality of a set, which is a collection of unique (i.e., non-duplicate) elements. Several classes, such as ArrayList, HashSet, and LinkedList, provide concrete implementations of the collection interfaces. These concrete implementations provide “offthe- shelf” solutions to most data storage and retrieval tasks Algorithms are static methods defined within the Collections class that operate on collections. For example, there are algorithms that search, sort, or reverse a collection. In essence, algorithms provide a standard means of manipulating collections.
When working with a collection, you will often want to cycle through its elements. One way to do this is with an iterator, which is defined by the Iterator interface. An iterator offers a general-purpose, standardized way of accessing the elements within a collection, one at a time. In other words, an iterator provides a means of enumerating the contents of a collection. Because each collection implements Iterator, an iterator can be used to cycle through the elements of any collection class.
Another feature defined by the Collections Framework is the map. A map stores key / value pairs. Although maps are part of the Collections Framework, they are not “collections” in the strict use of the term because they do not implement the Collection interface. You can, however, obtain a collection-view of a map. Such a view contains the elements from the map stored in a collection. Thus, you can process the contents of a map as a collection, if you choose.
Three Recent Changes
As you may know, the Java language underwent a substantial change when several new features were added by Java 5. Three of these features had a profound effect on the Collections Framework: generics, autoboxing, and the for-each style for loop. Although these features are now a well-established part of Java programming, not all programmers are aware of how much they have impacted collections. Because the recipes in this chapter make use of these features, a brief discussion is warranted.
With the release of Java 5, the entire Java API, including the Collections Framework, was reengineered for generics. As a result, today all collections are generic, and many of the methods that operate on collections take generic type parameters. Generics improve collections by adding type safety. Prior to generics, all collections stored Object references, which meant that any collection could store any type of object. Thus, it was possible to accidentally store incompatible types in a collection. Doing so could result in runtime type mismatch errors. With generics, the type of data being stored is explicitly specified, and runtime type mismatches can be avoided.
Autoboxing/unboxing facilitates the storing of primitive types in collections. A collection can store only references, not primitive types. In the past, if you wanted to store a primitive type in a collection, you had to manually box it into its type wrapper. For example, to store an int value, you needed to create an Integer object that contained that value. When the value was retrieved, it needed to be manually unboxed (by using an explicit cast) into its proper primitive type. Because of autoboxing/unboxing, Java can now automatically perform the proper boxing and unboxing needed when storing or retrieving primitive types. There is no need to manually perform these operations. This makes it much easier to store primitive types in a collection.
All collection classes now implement the Iterable interface. This enables a collection to be cycled through by use of the for-each style for loop. In the past, cycling through a collection required the use of an iterator. Although iterators are still needed for some uses, in many cases, iterator-based loops can be replaced by for loops.
The Collection Interfaces
The Collections Framework is defined by a set of interfaces, which are shown in Table 5-1. At the top of the interface hierarchy is Collection. It must be implemented by all collections. From Collection are derived several subinterfaces, such as List and Set, which define specific types of collections, such as lists and sets.
In addition to the collection interfaces, collections also use the Comparator, RandomAccess, Iterator, and ListIterator interfaces. Comparator defines how two objects are compared. Iterator and ListIterator enumerate the objects within a collection. By implementing RandomAccess, a list indicates that it supports efficient, random access to its elements. The Collections Framework supports both modifiable and unmodifiable collections.
To enable this, the collection interfaces allow methods that modify a collection to be optional. If an attempt is made to use one of these methods on an unmodifiable collection, an UnsupportedOperationException is thrown. All the built-in collections are modifiable, but it is possible to obtain an immutable view of a collection. (Obtaining an immutable collection is described in Create an Immutable Collection.)
The List Interface
The List interface extends Collection and declares the behavior of a collection that stores a sequence of elements. Elements can be inserted or accessed by their position in the list, using a zero-based index. A list may contain duplicate elements. List is a generic interface that has this declaration:
1. interface List<E>
Here, E specifies the type of objects that the list will hold.
In addition to the methods defined by Collection, List defines some of its own, which are summarized in Table 5-3. Pay special attention to the get( ) and set( ) methods. They provide access to the elements in the list through an index. The get( ) method obtains the object stored at a specific location, and set( ) assigns a value to the specified element in the list.
List specifies that the equals( ) method must compare the contents of two lists, returning true only if they are exactly the same. (In other words, they must contain the same elements, in the same sequence.) If not, then equals( ) returns false. Therefore, any collection that implements List, implements equals( ) in this way.
Several of these methods will throw an UnsupportedOperationException if an attempt is made to modify an immutable collection, and a ClassCastException is generated when one object is incompatible with another, such as when an attempt is made to add an incompatible object to a collection. A NullPointerException is thrown if an attempt is made to store a null object and null elements are not allowed in the list. An IllegalArgumentException is thrown if an invalid argument is used.
The Set Interface
The Set interface defines a set. It extends Collection and declares the behavior of a collection that does not allow duplicate elements. Therefore, the add( ) method returns false if an attempt is made to add duplicate elements to a set. It does not define any additional methods of its own. Set is a generic interface that has this declaration:
1. interface Set<E>
Here, E specifies the type of objects that the set will hold.
Set specifies that the equals( ) method must compare the contents of two sets, returning true only if they contain the same elements. If not, then equals( ) returns false. Therefore, any collection that implements Set, implements equals( ) in this manner.
The SortedSet Interface
The SortedSet interface extends Set and declares the behavior of a set sorted in ascending order. SortedSet is a generic interface that has this declaration:
1. interface SortedSet<E>
Here, E specifies the type of objects that the set will hold.
In addition to those methods defined by Set, the SortedSet interface declares the methods summarized in Table 5-4. These methods make set processing more convenient. Several methods throw a NoSuchElementException when no items are contained in the invoking set. AClassCastException is thrown when an object is incompatible with the elements in a set. A NullPointerException is thrown if an attempt is made to use a null object and null is not allowed in the set. An IllegalArgumentException is thrown if an invalid argument is used.
The NavigableSet Interface
A recent addition to the Collections Framework is NavigableSet. It was added by Java 6 and extends SortedSet. NavigableSet declares the behavior of a collection that supports the retrieval of elements based on the closest match to a given value or values. NavigableSet is a generic interface that has this declaration:
1. interface NavigableSet<E>
Here, E specifies the type of objects that the set will hold. In addition to the methods that it inherits from SortedSet, NavigableSet adds those summarized in Table 5-5. A ClassCastException is thrown when an object is incompatible with the elements in the set. A NullPointerException is thrown if an attempt is made to use a null object and null is not allowed in the set. An IllegalArgumentException is thrown if an invalid argument is used.