CSC 150 : Sorting Algorithms
primary information resource:
An Invitation to Computer Science,
G. Michael Schneider and Judith L. Gersting,
Brooks/Cole, 2000.
Topic Index: [ sorting | selection sort | bubble sort | insertion sort | quick sort ]
Sorting algorithms
No class of algorithms has been more closely studied and refined with
respect to order of magnitude runtimes than algorithms for sorting data
(arranging data in ascending or descending order).
The textbook explains one sorting algorithm, the selection
sort. In Lab 4, you will explore this plus two additional algorithms,
the bubble sort and the quick sort.
In Lab 6, you will explore all three of these plus a fourth, the insertion
sort.
See http://www.brian-borowski.com/Software/Sorting/
for very nice animated demonstrations of a variety of sorting algorithms
(all four of the above plus mergesort, heapsort, and shellsort).
Selection Sort
Study the selection sort algorithm to understand it, determine which steps
are worth counting, and do some runtime analysis.
Essence of selection sort for a list of n values:
- Locate the largest of the n values and swap it with whatever is in position n.
- Locate the largest of the first n-1 values and swap it with whatever is in position n-1.
- Locate the largest of the first n-2 values and swap it with whatever is in position n-2.
- Locate the largest of the first n-3 values and swap it with whatever is in position n-3.
- Repeat until you have found the larger of the first 2 values and swapped.
At the end of any pass through the list, the list is partitioned into
two parts: the unsorted part followed by the sorted part.
The sorted part starts at the end and grows toward the beginning.
You are finished when it reaches the beginning.
Note this uses the “search for the largest” algorithm from lab 3.
What operations appear to be most significant?
Comparisons (to find the largest) and exchanges.
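To make this concrete, here is a minimal selection sort sketch in Python (a sketch, not the lab’s code; the comparison and exchange counters are added only so you can check the analysis that follows):

    def selection_sort(values):
        # Sorts the list in place; returns (comparisons, exchanges).
        comparisons = 0
        exchanges = 0
        # Shrink the unsorted part from the right: positions n-1, n-2, ..., 1
        for end in range(len(values) - 1, 0, -1):
            largest = 0                       # index of largest value so far
            for i in range(1, end + 1):
                comparisons += 1
                if values[i] > values[largest]:
                    largest = i
            # Swap the largest of the first end+1 values into position end
            values[largest], values[end] = values[end], values[largest]
            exchanges += 1
        return comparisons, exchanges

For example, selection_sort([9, 3, 7, 1]) sorts the list and returns (6, 3): 6 = 4 * 3 / 2 comparisons and 3 = n - 1 exchanges, matching the analysis below.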
Runtime analysis of selection sort:
- How many comparisons are done?
- Any difference between worst and best case for comparisons? No.
- Finding the largest of n values requires n-1 comparisons, but that covers only the first pass.
- The total number of comparisons is (n-1)+(n-2)+(n-3)+…+3+2+1, which is n * (n-1) / 2. (For example, with n = 5: 4+3+2+1 = 10 = 5 * 4 / 2.)
- How many exchanges are done?
- Any difference between worst and best case? No.
- One exchange is done per pass through the list, for a total of n-1.
- Total comparisons and exchanges: n * (n-1) / 2 + (n-1) = O(n^2)
Bubble sort
Essence of bubble sort for a list of n values V[1] through V[n]:
- Make a beginning-to-end pass through the list from V[1] through V[n]. For each pair of values V[i] and V[i+1], compare them and exchange them if out of order. The largest value ends up in position n.
- Make a second pass through the list from V[1] through V[n-1]. For each pair of values V[i] and V[i+1], compare them and exchange them if out of order. The second-largest value ends up in position n-1.
- Make a third pass through the list from V[1] through V[n-2]. For each pair of values V[i] and V[i+1], compare them and exchange them if out of order. The third-largest value ends up in position n-2.
- Continue until n-1 passes have been made.
See where it gets its name? In the first pass, the largest
value “bubbles” its way to its correct position.
At the end of any given pass, the list is partitioned into the unsorted
part followed by the sorted part.
Runtime analysis of bubble sort:
- How many comparisons are done?
- Any difference between worst and best case for comparisons? No.
- The first pass requires n-1 comparisons, the second pass n-2, the third pass n-3, and so on, for a total of n-1 passes.
- The total number of comparisons is (n-1)+(n-2)+(n-3)+…+3+2+1, which is n * (n-1) / 2.
- How many exchanges are done?
- Any difference between worst and best case? Yes.
- Best case exchanges: values originally ordered, so no exchanges.
- Worst case exchanges: values in reverse order, so an exchange for every comparison.
- Best case total comparisons and exchanges: n * (n-1) / 2 + 0 = O(n^2)
- Worst case total comparisons and exchanges: 2 * (n * (n-1) / 2) = O(n^2)
NOTE: The best case can be reduced to n-1 comparisons and 0 exchanges with
this simple optimization: set an “exchanged” flag to false at the beginning of
each pass, and set it to true whenever an exchange is made. If the flag is still
false at the end of a pass, the list is ordered, so stop. In the best case the
list is originally ordered, so the algorithm stops after the first pass.
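In Python, the flagged version might look like this (again a sketch, not the lab’s code):

    def bubble_sort(values):
        # Sorts the list in place, stopping early if a pass makes no exchange.
        n = len(values)
        for pass_num in range(1, n):          # at most n-1 passes
            exchanged = False
            for i in range(n - pass_num):     # pass k compares n-k pairs
                if values[i] > values[i + 1]:
                    values[i], values[i + 1] = values[i + 1], values[i]
                    exchanged = True
            if not exchanged:                 # no exchange: list is ordered
                break

If the list is already ordered, the first pass makes its n-1 comparisons, never sets the flag, and the loop stops: n-1 comparisons and 0 exchanges, as claimed.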
Insertion Sort
Informal description:
- Make n-1 passes through the list.
- At the start of pass i, the list is partitioned into a sorted part (the first i values) and an unsorted part (the remaining n-i values).
- In pass i, “pick up” the first value of the unsorted part (position i+1) and work backwards into the sorted part, exchanging it with each value along the way until its correct position is found.
Example of how a given pass works:
Consider the algorithm in progress just before starting pass 4 with this list
(the vertical bar separates the sorted and unsorted parts):
7 12 21 24 | 15 5 32
Select the value 15 and work backwards, comparing and exchanging as
you go, until the correct position is found (between the 12 and 21).
The 24 is compared to 15. 15 is smaller, so exchange them.
7 12 21 15 24 | 5 32
The 21 is compared to 15. 15 is smaller, so exchange them.
7 12 15 21 24 | 5 32
The 12 is compared to 15. 15 is larger, so no exchange; pass 4 is complete.
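Here is a short Python sketch of the whole algorithm (a sketch, not the lab’s code):

    def insertion_sort(values):
        # Pass i picks up values[i] and exchanges it backwards into place.
        for i in range(1, len(values)):       # passes 1 through n-1
            j = i
            # Keep exchanging while the picked-up value is out of order
            while j > 0 and values[j - 1] > values[j]:
                values[j - 1], values[j] = values[j], values[j - 1]
                j -= 1

On the example list [7, 12, 21, 24, 15, 5, 32], the iteration with i = 4 (0-based indexing) performs exactly the exchanges shown above for pass 4.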
Informal worst case analysis:
- O(n) comparisons and O(n) exchanges per pass
- O(n) passes required (n-1 to be exact)
- Total complexity: O(n) passes * [ O(n) comparisons + O(n) exchanges ] = O(n^2)
Breaking the O(n^2) barrier: Quick Sort
Note: many sorting algorithms are fundamentally faster than O(n^2),
but we choose Quick Sort because you will use it in labs 5 and 6.
Breaking the barrier usually requires a “divide and conquer” strategy.
Informal description of Quick Sort operating under ideal conditions:
- Select the median value from the list (not the middle position, but the middle value). This is called the pivot.
- Rearrange the list, positioning the pivot so that all values less than it are to its left and all values greater than it are to its right. This forms a partial ordering and can be done quite quickly. Note that the pivot is now in its correct and final position!
- Do a Quick Sort on the left sublist (if it has more than one value).
- Do a Quick Sort on the right sublist (if it has more than one value).
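Here is a compact Python sketch of the idea. It builds new sublists instead of rearranging in place (the lab versions may differ), and it uses the middle element as the pivot rather than the true median, a common practical compromise:

    def quick_sort(values):
        # Returns a new sorted list; recursion stops at 0- or 1-value lists.
        if len(values) <= 1:
            return values
        pivot = values[len(values) // 2]           # middle element, not true median
        left  = [v for v in values if v < pivot]   # values less than the pivot
        equal = [v for v in values if v == pivot]  # the pivot (and any duplicates)
        right = [v for v in values if v > pivot]   # values greater than the pivot
        return quick_sort(left) + equal + quick_sort(right)

The two recursive calls on left and right are the “divide and conquer” part whose cost is analyzed next.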
The runtime analysis is complicated, because the Quick Sort algorithm actually
calls itself twice! We call this recursion.
Ideally, each sublist will be about half the size of the original.
This is the “divide and conquer” part. With each recursive call,
the sublists get smaller and smaller until they are a lot of little
one- and two-value sublists, each of which can be sorted trivially! Then it
all gets put back together again.
Complicated? You bet. The results are well worth it.
For each level of recursion, O(n) work is required (the rearranging-around-the-pivot
part). The number of levels of recursion is approximately the logarithm
base 2 of n. The total work is thus O(n * log2 n).
So compare selection sort with Quick Sort for n = 1000: selection sort
requires O(n^2), or about 1,000,000 operations, whereas Quick Sort
requires only about 10,000 (log2 of 1000 is about 10).
That is 100 times faster!
For n = 1,000,000 the savings are even better: Quick Sort is about 50,000
times faster than selection sort (n^2 is 10^12 operations, while n * log2 n is
about 2 * 10^7, since log2 of 1,000,000 is about 20). No wonder they call it “quick”!
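You can check these figures with a few lines of Python:

    from math import log2

    for n in (1_000, 1_000_000):
        quadratic = n * n                  # selection sort operations, O(n^2)
        linearithmic = n * log2(n)         # Quick Sort operations, O(n log2 n)
        print(n, round(quadratic / linearithmic))   # speedups: ~100 and ~50,000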
Peter Sanderson (PSanderson@otterbein.edu)