The bubble sort is notoriously slow, but it's conceptually the simplest of the sorting algorithms and for that reason is a good beginning for our exploration of sorting techniques.
Bubble Sort on the Baseball Players
Imagine that you’re near-sighted (like a computer program) so that you can see only two of the baseball players at the same time, if they’re next to each other and if you stand very close to them. Given this impediment, how would you sort them? Let’s assume there are N players, and the positions they’re standing in are numbered from 0 on the left to N-1 on the right.
The bubble sort routine works like this: You start at the left end of the line and compare the two kids in positions 0 and 1. If the one on the left (in 0) is taller, you swap them. If the one on the right is taller, you don’t do anything. Then you move over one position and compare the kids in positions 1 and 2. Again, if the one on the left is taller, you swap them.
Here are the rules you’re following:
- Compare two players.
- If the one on the left is taller, swap them.
- Move one position right.
You continue down the line this way until you reach the right end. You have by no means finished sorting the kids, but you do know that the tallest kid is on the right. This must be true because, as soon as you encounter the tallest kid, you’ll end up swapping him (or her) every time you compare two kids, until eventually he (or she) will reach the right end of the line. This is why it’s called the bubble sort: As the algorithm progresses, the biggest items “bubble up” to the top end of the array.
After this first pass through all the data, you’ve made N-1 comparisons and somewhere between 0 and N-1 swaps, depending on the initial arrangement of the players. The item at the end of the array is sorted and won’t be moved again. Now you go back and start another pass from the left end of the line. Again, you go toward the right, comparing and swapping when appropriate. However, this time you can stop one player short of the end of the line, at position N-2, because you know the last position, at N-1, already contains the tallest player. This rule could be stated as:
- When you reach the first sorted player, start over at the left end of the line.
You continue this process until all the players are in order.
Java Code for a Bubble Sort
// bubbleSort.java
// demonstrates bubble sort
// to run this program: C>java BubbleSortApp
////////////////////////////////////////////////////////////////
class ArrayBub
{
private long[] a; // ref to array a
private int nElems; // number of data items
//--------------------------------------------------------------
public ArrayBub(int max) // constructor
{
a = new long[max]; // create the array
nElems = 0; // no items yet
}
//--------------------------------------------------------------
public void insert(long value) // put element into array
{
a[nElems] = value; // insert it
nElems++; // increment size
}
//--------------------------------------------------------------
public void display() // displays array contents
{
for(int j=0; j<nElems; j++) // for each element,
System.out.print(a[j] + “ “); // display it
System.out.println(“”);
}
//--------------------------------------------------------------
public void bubbleSort()
{
int out, in;
for(out=nElems-1; out>1; out--) // outer loop (backward)
for(in=0; in<out; in++) // inner loop (forward)
if( a[in] > a[in+1] ) // out of order?
swap(in, in+1); // swap them
} // end bubbleSort()
//--------------------------------------------------------------
private void swap(int one, int two)
{
long temp = a[one];
a[one] = a[two];
a[two] = temp;
}
//--------------------------------------------------------------
} // end class ArrayBub
////////////////////////////////////////////////////////////////
class BubbleSortApp
{
public static void main(String[] args)
{
int maxSize = 100; // array size
ArrayBub arr; // reference to array
arr = new ArrayBub(maxSize); // create the array
arr.insert(77); // insert 10 items
arr.insert(99);
arr.insert(44);
arr.insert(55);
arr.insert(22);
arr.insert(88);
arr.insert(11);
arr.insert(00);
arr.insert(66);
arr.insert(33);
arr.display(); // display items
arr.bubbleSort(); // bubble sort them
arr.display(); // display them again
} // end main()
} // end class BubbleSortApp
////////////////////////////////////////////////////////////////
The constructor and the insert() and display() methods of this class are similar to those we’ve seen before. However, there’s a new method: bubbleSort(). When this method is invoked from main(), the contents of the array are rearranged into sorted order.
The main() routine inserts 10 items into the array in random order, displays the array, calls bubbleSort() to sort it, and then displays it again. Here’s the output:
77 99 44 55 22 88 11 0 66 33
0 11 22 33 44 55 66 77 88 99
The bubbleSort() method is only four lines long. Here it is, extracted from the listing:
public void bubbleSort()
{
int out, in;
for(out=nElems-1; out>1; out--) // outer loop (backward)
for(in=0; in<out; in++) // inner loop (forward)
if( a[in] > a[in+1] ) // out of order?
swap(in, in+1); // swap them
} // end bubbleSort()
The idea is to put the smallest item at the beginning of the array (index 0) and the largest item at the end (index nElems-1). The loop counter out in the outer for loop starts at the end of the array, at nElems-1, and decrements itself each time through the loop. The items at indices greater than out are always completely sorted. The out variable moves left after each pass by in so that items that are already sorted are no longer involved in the algorithm.
The inner loop counter in starts at the beginning of the array and increments itself each cycle of the inner loop, exiting when it reaches out. Within the inner loop, the two array cells pointed to by in and in+1 are compared, and swapped if the one in in is larger than the one in in+1.
For clarity, we use a separate swap() method to carry out the swap. It simply exchanges the two values in the two array cells, using a temporary variable to hold the value of the first cell while the first cell takes on the value in the second and then setting the second cell to the temporary value. Actually, using a separate swap() method may not be a good idea in practice because the function call adds a small amount of overhead. If you’re writing your own sorting routine, you may prefer to put the swap instructions in line to gain a slight increase in speed.
Invariants
In many algorithms there are conditions that remain unchanged as the algorithm proceeds. These conditions are called invariants. Recognizing invariants can be useful in understanding the algorithm. In certain situations they may also be helpful in debugging; you can repeatedly check that the invariant is true, and signal an error if it isn’t.
In the bubbleSort.java program, the invariant is that the data items to the right of out are sorted. This remains true throughout the running of the algorithm. (On the first pass, nothing has been sorted yet, and there are no items to the right of out because it starts on the rightmost element.)
Efficiency of the Bubble Sort
As you can see by watching the BubbleSort Workshop applet with 10 bars, the inner and inner+1 arrows make nine comparisons on the first pass, eight on the second, and so on, down to one comparison on the last pass. For 10 items, this is:
9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = 45
In general, where N is the number of items in the array, there are N-1 comparisons on the first pass, N-2 on the second, and so on. The formula for the sum of such a series is:
(N–1) + (N–2) + (N–3) + … + 1 = N*(N–1)/2
N*(N–1)/2 is 45 (10*9/2) when N is 10.
Thus, the algorithm makes about N2⁄2 comparisons (ignoring the –1, which doesn’t make much difference, especially if N is large).
There are fewer swaps than there are comparisons because two bars are swapped only if they need to be. If the data is random, a swap is necessary about half the time, so there will be about N2⁄4 swaps. (Although in the worst case, with the initial data inversely sorted, a swap is necessary with every comparison.)
Both swaps and comparisons are proportional to N2. Because constants don’t count in Big O notation, we can ignore the 2 and the 4 and say that the bubble sort runs in O(N2) time.
Whenever you see one loop nested within another, such as those in the bubble sort and the other sorting algorithms in this chapter, you can suspect that an algorithm runs in O(N2) time. The outer loop executes N times, and the inner loop executes N (or perhaps N divided by some constant) times for each cycle of the outer loop. This means you’re doing something approximately N*N or N2 times.