CSC 150 Chapter 3: Efficiency of Algorithms
primary information resource:
An Invitation to Computer Science (Java), Third Edition,
G. Michael Schneider and Judith L. Gersting,
Course Technology, 2007.
On what basis do we compare and contrast different algorithms?
Beware that tradeoffs are often necessary.
Let's take a closer look at efficiency: the "judicious use of resources."
What resources are we talking about? Memory space and execution time.
Why be concerned about this? Doesn't Moore's Law take care of this?
Which is more critical, space or time?
Usually more concerned with time.
Practical reason: memory modules can more easily be added for additional
capacity than CPUs can. In addition, you cannot take advantage of
additional CPUs unless the algorithm can be restructured for parallelism.
How can an algorithm's time efficiency be measured?
Can we use this as a basis for comparing it to a different algorithm for the same problem?
To judge based on running programs requires the elimination of non-algorithmic factors, such as:
- the speed of the hardware the program runs on
- the programming language, compiler, or interpreter used
- whatever else the machine is doing at the same time
- the particular input data chosen
To judge an algorithm itself, do so on the basis of the algorithm's properties: the amount of work it does as a function of the size of the input, n.
Sketch graphs of the running times (n is x-axis, steps is y-axis) for the 7 exercises.
Exaggerate somewhat to illustrate the insignificance of the constant
factors when comparing algorithms as n grows large.
e.g., even comparing 1000n to n², at some large value
of n the former becomes faster (in this case n > 1000).
When analyzing the runtime of code/algorithms we normally ignore the constant multiplier and concentrate on the “order of magnitude”.
The term “order of magnitude” is normally written as Θ (theta) or
O (big-O).
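To make the crossover concrete, here is a small Java sketch (not from the text; the class name and the chosen values of n are just illustrative, mirroring the 1000n versus n² example above) that prints both running-time functions side by side:

    // Evaluates the running-time functions 1000n and n^2 at increasing n
    // to show where the n^2 curve overtakes 1000n despite its smaller constant.
    public class GrowthCrossover {
        public static void main(String[] args) {
            long[] sizes = {10, 100, 1000, 10000, 100000};
            System.out.println("n          1000n           n^2");
            for (long n : sizes) {
                System.out.printf("%-10d %-15d %-15d%n", n, 1000 * n, n * n);
            }
        }
    }

For n beyond 1000 the n² column pulls ahead and keeps growing faster, which is exactly why constant factors are ignored in order-of-magnitude comparisons.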
Three algorithms for the "data cleanup" problem
The problem, simply stated, is: given a list of integers, eliminate the list members which have a value of 0 while keeping the list structure intact (no gaps).
Note: The descriptions given here are intended merely to convey the essence of the algorithms and not to fully and unambiguously describe the algorithms in the way that a pseudocode specification does.
Approach 1: “shuffle-left”
1. Scan the list from left to right.
2. When the first “0” entry is reached, shift all remaining values
one position left. (The first such shift overwrites the “0” entry, and
each subsequent shift overwrites the entry before it. An extra copy of
the last entry is created, but we ignore all except the latest, leftmost,
copy.)
3. Resume the scan at the same position (in case a “0” was shuffled
into it) and repeat step 2.
4. When the scan reaches the end of the list (ignoring any and all
extra copies of the last entry), decrement the list size if the last entry
is "0"
5. Stop
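A minimal Java sketch of shuffle-left follows (my own rendering of the steps above, not the textbook's pseudocode); the class name, the sample data, and the convention of returning the new logical size are all illustrative choices:

    // Shuffle-left data cleanup: whenever a 0 is found, shift every value to
    // its right one position left, shrinking the logical list size by one.
    public class ShuffleLeft {
        // Cleans list[0..size-1] in place; returns the new logical size.
        public static int cleanup(int[] list, int size) {
            int i = 0;
            while (i < size) {
                if (list[i] == 0) {
                    // Shift the remaining entries one position to the left.
                    for (int j = i; j < size - 1; j++) {
                        list[j] = list[j + 1];
                    }
                    size--;   // the list is now one entry shorter
                    // Do not advance i: a 0 may have been shuffled into position i.
                } else {
                    i++;      // keep scanning to the right
                }
            }
            return size;
        }

        public static void main(String[] args) {
            int[] data = {0, 24, 16, 0, 36, 42, 0, 0};
            int newSize = cleanup(data, data.length);
            for (int k = 0; k < newSize; k++) {
                System.out.print(data[k] + " ");   // prints: 24 16 36 42
            }
            System.out.println();
        }
    }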
Approach 2: "copy-over"
1. Create a second, initially empty, list.
2. Scan the first list from left to right.
3. At each entry during the scan of the first list, check to see if
it is "0". If not, copy it to the end of the second list.
4. Stop when the scan reaches the end of the first list.
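A corresponding Java sketch of copy-over (again my own rendering, with illustrative names and data); note that it needs a second array as large as the original:

    import java.util.Arrays;

    // Copy-over data cleanup: build a second list containing only the
    // nonzero entries of the original, in their original order.
    public class CopyOver {
        // Returns a new array holding the nonzero entries of list[0..size-1].
        public static int[] cleanup(int[] list, int size) {
            int[] result = new int[size];   // second list, big enough for the worst case
            int count = 0;                  // number of entries copied so far
            for (int i = 0; i < size; i++) {
                if (list[i] != 0) {
                    result[count] = list[i];   // copy the nonzero value over
                    count++;
                }
            }
            return Arrays.copyOf(result, count);   // trim to the entries actually used
        }

        public static void main(String[] args) {
            int[] data = {0, 24, 16, 0, 36, 42, 0, 0};
            System.out.println(Arrays.toString(cleanup(data, data.length)));   // [24, 16, 36, 42]
        }
    }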
Approach 3: "converging-pointers"
1. Scan the list from left to right.
2. When the first "0" entry is reached, copy the last value of the
list into this position and decrement the list size.
3. Resume the scan at the same position (in case a “0” was copied into
it) and repeat step 2.
4. When the scan reaches the end of the list, decrement the list size
if the last entry is "0"
5. Stop
This is called converging-pointers because one pointer is the main scanner moving from left to right, while the other pointer moves from right to left: it always tracks the last position in the list.
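A Java sketch of converging-pointers, in the same illustrative style as the two sketches above (here the right "pointer" is simply the current list size):

    // Converging-pointers data cleanup: a left pointer scans for zeroes while
    // the list size acts as the right pointer; each 0 found is overwritten by
    // the current last value, and the list shrinks by one.
    public class ConvergingPointers {
        // Cleans list[0..size-1] in place; returns the new logical size.
        // Note: the surviving values may end up in a different order.
        public static int cleanup(int[] list, int size) {
            int left = 0;
            while (left < size) {
                if (list[left] == 0) {
                    list[left] = list[size - 1];   // copy the last value over the 0
                    size--;                        // the list is one entry shorter
                    // Do not advance left: the copied value might itself be 0.
                } else {
                    left++;
                }
            }
            return size;
        }

        public static void main(String[] args) {
            int[] data = {0, 24, 16, 0, 36, 42, 0, 0};
            int newSize = cleanup(data, data.length);
            for (int k = 0; k < newSize; k++) {
                System.out.print(data[k] + " ");   // prints: 42 24 16 36
            }
            System.out.println();
        }
    }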
Do you have an intuitive feeling for which of the three is fastest?
Runtime analysis of the three data cleanup algorithms
Our goal is order of magnitude running time.
We don’t want to bother counting every single step.
What steps are significant?
- comparing a list value to 0
- copying a number
Shuffle-left
What is best case? No zeroes in list.
What is worst case? All zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n²) <explanation below>
Worst case total: O(n²) + O(n) = O(n²) <only dominant term remains>
Worst case # copy operations explained: with all zeroes, every comparison finds a "0", so the remaining entries must be shifted left each time: roughly n-1 copies the first time, then n-2, and so on. The total is (n-1) + (n-2) + ... + 1 = n(n-1)/2, which is O(n²).
Conclusion? Slow but space-efficient.
Copy-Over
What is best case? All zeroes in list.
What is worst case? No zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n)
Worst case total: O(n) + O(n) = O(n)
Conclusion? Fast but space-inefficient.
Converging-Pointers
What is best case? No zeroes in list.
What is worst case? All zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n)
Worst case total: O(n) + O(n) = O(n)
Conclusion? Fast AND space-efficient! We’ve beaten the time-space tradeoff.
Problems for which there is no computationally feasible solution
Consider the class of problems for which no O(n²) solution, or even one bounded by any fixed power of n, is known. Here are some examples:
Traveling Salesman problem: a salesman has N cities to visit and is constrained by the existing network of roads. In what order should he visit them to minimize the total mileage?
A related and seemingly easier problem is the Hamiltonian Circuit: given a network of N cities (not all directly connected to each other) and starting at city A, is it possible to travel through each of the other cities exactly once and finally arrive back at A?
Bin packing problem, stated in terms of a contemporary example: you have a spindle of 80-minute CD-Rs and a list of N songs you want to burn onto them. What arrangement of songs will minimize the number of CD-Rs required? (Think very large N.)
The difficulty in all of these lies in the number of possible combinations that have to be explored to determine the answer.
The number of combinations increases very rapidly (exponentially) as N increases. In general, their order of magnitude is O(2^N).
Compare 2^N to N² to N to log₂N for N=32
log₂N = 5
N = 32
N² = 1024
2^N = 4,294,967,296
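To see why brute force is hopeless for these problems, here is a small Java sketch (my own, with a made-up five-city mileage table) that tries every possible visiting order for the Traveling Salesman problem; with N cities it examines (N-1)! routes, which grows even faster than 2^N.

    // Brute-force Traveling Salesman: try every ordering of the cities and keep
    // the cheapest complete tour.  With N cities there are (N-1)! orderings,
    // so this is only feasible for very small N.
    public class BruteForceTSP {
        // dist[i][j] = road mileage between city i and city j (made-up numbers)
        static final int[][] dist = {
            {0, 3, 7, 4, 6},
            {3, 0, 2, 5, 8},
            {7, 2, 0, 3, 4},
            {4, 5, 3, 0, 2},
            {6, 8, 4, 2, 0},
        };
        static int bestCost = Integer.MAX_VALUE;
        static long routesExamined = 0;

        // Extends a partial tour; visited[i] marks cities already on the route.
        static void search(int city, boolean[] visited, int visitedCount, int costSoFar) {
            if (visitedCount == dist.length) {
                routesExamined++;
                int total = costSoFar + dist[city][0];   // return to the start city
                if (total < bestCost) bestCost = total;
                return;
            }
            for (int next = 1; next < dist.length; next++) {
                if (!visited[next]) {
                    visited[next] = true;
                    search(next, visited, visitedCount + 1, costSoFar + dist[city][next]);
                    visited[next] = false;
                }
            }
        }

        public static void main(String[] args) {
            boolean[] visited = new boolean[dist.length];
            visited[0] = true;               // start (and end) at city 0
            search(0, visited, 1, 0);
            System.out.println("Routes examined: " + routesExamined);   // 4! = 24 for 5 cities
            System.out.println("Best tour cost:  " + bestCost);
        }
    }

Adding a sixth city multiplies the number of routes by 5, a seventh by 6 more, and so on; it is that growth in the number of combinations, not the cost of checking any single route, that makes such problems infeasible.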
The best we can hope for are good heuristic (approximate) solutions.