CSC 150 Chapter 3: Efficiency of Algorithms
primary information resource: An Invitation to Computer Science (Java), Third Edition, G. Michael Schneider and Judith L. Gersting, Course Technology, 2007.


Topic Index
[ characteristics | efficiency | sequential search | binary search | data cleanup problem | intractable problems ]
[ sorting | selection sort | bubble sort | insertion sort | quick sort ]



One theme consistent across the algorithms we've developed and studied so far is this: problems have multiple algorithmic solutions.

On what basis do we compare and contrast different algorithms?

(the "-ability" characteristics mainly contribute to ease of translation into code and ease of maintenance)

Beware that tradeoffs are often necessary.

Why would one design for understandability over efficiency?
Why would one design for efficiency over understandability?


Let's take a closer look at efficiency, the "judicious use of resources".

What resources are we talking about?  Memory space and execution time.

Why be concerned about this?  Doesn't Moore's Law take care of this?

Which is more critical, space or time?

Usually we are more concerned with time.
Practical reason: memory modules can be added for additional capacity more easily than CPUs can.  In addition, you cannot take advantage of additional CPUs unless the algorithm can be restructured for parallelism.

How can an algorithm's time efficiency be measured?

Can we use this as a basis for comparing it to a different algorithm for the same problem?

To judge based on running programs requires the elimination of non-algorithmic factors, such as:
- the speed of the hardware (processor and memory) it runs on
- the programming language, compiler, and optimizations used to build it
- other programs running on the machine at the same time

Instead, judge an algorithm on the basis of properties of the algorithm itself, such as the number of steps it performs as a function of the input size n.

Here are some examples of step counting for Java statements. Go through at least the first 7.

Sketch graphs of the running times (n is x-axis, steps is y-axis) for the 7 exercises.

Exaggerate somewhat to illustrate the insignificance of the constant factors when comparing algorithms as n grows large.
e.g. Even if you have 1000n versus n², at some large value of n the former will become faster (in this case n > 1000).
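As a rough illustration (not from the text), here is a small Java sketch that tabulates 1000n versus n² for a few sizes; the class name and the chosen sizes are arbitrary.  The crossover near n = 1000 is what makes the constant factor irrelevant for large n.

// Illustrative sketch: tabulate 1000n versus n^2 to show the crossover point.
public class GrowthCrossover {
    public static void main(String[] args) {
        long[] sizes = {10, 100, 1000, 10000, 100000};
        System.out.printf("%10s %15s %15s%n", "n", "1000n", "n^2");
        for (long n : sizes) {
            System.out.printf("%10d %15d %15d%n", n, 1000 * n, n * n);
        }
    }
}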

When analyzing the runtime of code/algorithms we normally ignore the constant multiplier and concentrate on the “order of magnitude”.

The term “order of magnitude” is normally written as Θ (theta) or O (big-O).
 


Three algorithms for the "data cleanup" problem

The problem, simply stated, is: given a list of integers, eliminate the list members which have a value of 0 while keeping the list structure intact (no gaps).

Note: The descriptions given here are intended merely to convey the essence of the algorithms and not to fully and unambiguously describe the algorithms in the way that a pseudocode specification does.

Approach 1:  “shuffle-left” (see the Java sketch after the steps)
1. Scan the list from left to right.
2. When the first “0” entry is reached, shift all remaining values one position left.  (The first value shifted overwrites the “0” entry, and each subsequent value overwrites the position to its left.  This leaves an extra copy of the last entry at the end of the list; we ignore all but the leftmost copy, so the effective list size shrinks by one.)
3. Resume the scan at the same position (in case a “0” was shuffled into it) and repeat step 2.
4. When the scan reaches the end of the list (ignoring any and all extra copies of the last entry), decrement the list size if the last entry is "0"
5. Stop
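Here is a minimal Java sketch of shuffle-left, assuming the list is stored in an int array with an explicit count of legitimate entries.  The class name, method name, and sample data are illustrative, not from the text.

// Hedged sketch of shuffle-left: remove zeros in place by shifting values left.
public class DataCleanup {

    // Returns the new effective size of the list after cleanup.
    static int shuffleLeft(int[] list, int size) {
        int pos = 0;                                  // current scan position
        while (pos < size) {
            if (list[pos] == 0) {
                // Shift everything after pos one slot left, overwriting the zero.
                for (int i = pos; i < size - 1; i++) {
                    list[i] = list[i + 1];
                }
                size--;                               // trailing duplicate is no longer part of the list
                // Do not advance pos: a zero may have been shuffled into it.
            } else {
                pos++;                                // advance the scan
            }
        }
        return size;
    }

    public static void main(String[] args) {
        int[] data = {3, 0, 7, 0, 0, 5};              // sample data (chosen arbitrarily)
        int newSize = shuffleLeft(data, data.length);
        for (int i = 0; i < newSize; i++) {
            System.out.print(data[i] + " ");
        }
        System.out.println("(size " + newSize + ")"); // prints: 3 7 5 (size 3)
    }
}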

Approach 2: "copy-over" (see the Java sketch after the steps)
1. Create a second, initially empty, list
2. Scan the first list from left to right.
3. At each entry during the scan of the first list, check to see if it is "0".  If not, copy it to the end of the second list.
4. Stop when the scan reaches the end of the first list.
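A corresponding sketch of copy-over, written as another method for the same hypothetical DataCleanup class.  It makes a single left-to-right pass but needs a second array.

// Hedged sketch of copy-over: build a second list containing only the nonzero values.
static int[] copyOver(int[] list, int size) {
    int[] newList = new int[size];          // second, initially "empty" list (worst-case capacity)
    int count = 0;                          // number of entries copied so far
    for (int pos = 0; pos < size; pos++) {
        if (list[pos] != 0) {
            newList[count] = list[pos];     // copy the nonzero value to the end of the new list
            count++;
        }
    }
    // Trim to the entries actually copied; keeping a separate count would also work.
    return java.util.Arrays.copyOf(newList, count);
}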

Approach 3: "converging-pointers" (see the Java sketch below)
1. Scan the list from left to right.
2. When the first "0" entry is reached, copy the last value of the list into this position and decrement the list size.
3. Resume the scan at the same position (in case a “0” was copied into it) and repeat step 2.
4. When the scan reaches the end of the list, decrement the list size if the last entry is "0"
5. Stop

This is called converging-pointers because one pointer is the main scanner that moves from left to right, while the other pointer moves from right to left because it always tracks the last legitimate position in the list.
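A corresponding sketch of converging-pointers for the same hypothetical DataCleanup class.

// Hedged sketch of converging-pointers: the right pointer always marks the last
// legitimate entry; a zero found by the left pointer is overwritten with that value.
static int convergingPointers(int[] list, int size) {
    int left = 0;                           // main scan, moving left to right
    int right = size - 1;                   // tracks the last position, moving right to left
    while (left < right) {
        if (list[left] == 0) {
            list[left] = list[right];       // copy the last value over the zero
            right--;                        // the list shrinks from the right
            // Do not advance left: a zero may just have been copied into it.
        } else {
            left++;
        }
    }
    // The pointers have met; check the single remaining entry.
    if (right >= 0 && list[right] == 0) {
        right--;
    }
    return right + 1;                       // new effective list size
}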

Do you have an intuitive feeling for which of the three is fastest?


Runtime analysis of the three data cleanup algorithms

Our goal is order of magnitude running time.
We don’t want to bother counting every single step.
What steps are significant?
- comparing a list value to 0
- copying a number

Shuffle-left
What is best case?  No zeroes in list.
What is worst case?  All zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n²)  <explanation below>
Worst case total: O(n²) + O(n) = O(n²)   <only dominant term remains>
Worst case # copy operations explained:

[Note the inefficiency of left-shifting all remaining values, even those beyond the legit marker (the count of entries still considered part of the list).  Tweaking the algorithm to ignore positions beyond legit will speed it up but does not change the worst case order of magnitude.  The worst case number of copy operations becomes (n-1)+(n-2)+(n-3)+…+3+2+1, which is n(n-1)/2.  This is half of what it was before, but 1/2 is simply a constant multiplier, which is ignored for order of magnitude.]
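For reference, the closed form behind that count is the standard arithmetic-series identity:

\[
(n-1) + (n-2) + \dots + 2 + 1 \;=\; \sum_{i=1}^{n-1} i \;=\; \frac{n(n-1)}{2} \;=\; \tfrac{1}{2}n^2 - \tfrac{1}{2}n \;=\; O(n^2)
\]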

Conclusion?  Slow but space-efficient.

Copy-Over
What is best case?  All zeroes in list.
What is worst case?  No zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n)
Worst case total: O(n) + O(n) = O(n)

Conclusion?  Fast but space-inefficient.

Converging-Pointers
What is best case?  No zeroes in list.
What is worst case?  All zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n)
Worst case total: O(n) + O(n) = O(n)

Conclusion?  Fast AND space-efficient!  We’ve beaten the time-space tradeoff.


Problems for which there is no computationally feasible solution

Consider the class of problems for which no solution of O(n²), or even of any fixed power of n, is known to exist.  Here are some examples:

Traveling Salesman problem: a salesman has N cities to visit and is constrained by the existing network of roads.  In what order should he visit them to minimize the total mileage?

A related and seemingly easier problem is the Hamiltonian Circuit problem: given a network of N cities (not all directly connected to each other) and starting at city A, is it possible to travel through all the other cities exactly once and finally arrive back at A?

Bin packing problem, stated in terms of a contemporary example: you have a spindle of 80-minute CD-Rs and a list of N songs you want to burn onto them.  What arrangement of songs will minimize the number of CD-Rs required?  (Think very large N.)

The difficulty in all of these lies in the number of possible combinations that have to be explored to determine the answer.

The number of combinations increases very rapidly (exponentially) as N increases.  In general, their order of magnitude is O(2^N) or worse!
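To make the combinatorial explosion concrete, here is a hedged Java sketch of a brute-force (backtracking) search for a Hamiltonian circuit.  The class name, the adjacency-matrix representation, and the tiny 4-city example network are assumptions for illustration, not from the text; in the worst case the search examines on the order of (N-1)! orderings of the cities.

// Illustrative brute-force search for a Hamiltonian circuit starting at city 0.
// adj[i][j] is true when a road directly connects city i and city j.
public class HamiltonianBruteForce {

    static boolean[][] adj;

    // Try to extend a partial tour; positions 0..pos-1 of order[] are already fixed.
    static boolean search(int[] order, int pos, boolean[] used) {
        int n = order.length;
        if (pos == n) {
            // All cities placed: it is a circuit if the last city connects back to the start.
            return adj[order[n - 1]][order[0]];
        }
        for (int city = 1; city < n; city++) {        // city 0 is fixed as the starting city A
            if (!used[city] && adj[order[pos - 1]][city]) {
                used[city] = true;
                order[pos] = city;
                if (search(order, pos + 1, used)) {
                    return true;
                }
                used[city] = false;                   // backtrack and try another city
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Example network (assumed for illustration): 4 cities connected in a ring.
        adj = new boolean[][] {
            {false, true,  false, true },
            {true,  false, true,  false},
            {false, true,  false, true },
            {true,  false, true,  false}
        };
        int[] order = new int[4];
        boolean[] used = new boolean[4];
        order[0] = 0;                                 // start (and end) at city 0
        used[0] = true;
        System.out.println("Circuit exists: " + search(order, 1, used));
    }
}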

Compare 2^N to N² to N to log₂N for N = 32:

log₂N = 5
N = 32
N² = 1,024
2^N = 4,294,967,296
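A small Java sketch (illustrative only) that prints these growth rates side by side for a few values of N:

// Illustrative sketch: compare log2(N), N, N^2, and 2^N as N grows.
public class GrowthRates {
    public static void main(String[] args) {
        int[] ns = {8, 16, 32, 64};
        System.out.printf("%6s %10s %12s %22s%n", "N", "log2(N)", "N^2", "2^N");
        for (int n : ns) {
            double log2 = Math.log(n) / Math.log(2);
            System.out.printf("%6d %10.0f %12d %22.0f%n",
                    n, log2, (long) n * n, Math.pow(2, n));
        }
    }
}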

The best we can hope for is a good heuristic (approximate) solution.

