CSC 150 Chapter 3: Efficiency of Algorithms
primary information resource:
An Invitation to Computer Science (Java), Third Edition,
G. Michael Schneider and Judith L. Gersting,
Course Technology, 2007.
On what basis do we compare and contrast different algorithms?
Beware that tradeoffs are often necessary.
Let's take a closer look at efficiency: the "judicious use of resources."
What resources are we talking about? Memory space and execution time.
Why be concerned about this? Doesn't Moore's Law take care of this?
Which is more critical, space or time?
Usually more concerned with time.
Practical reason: memory modules can more easily be added for additional
capacity than CPUs can. In addition, you cannot take advantage of
additional CPUs unless the algorithm can be restructured for parallelism.
How can an algorithm's time efficiency be measured?
Can we use this as a basis for comparing it to a different algorithm for the same problem?
To judge based on running programs requires the elimination of non-algorithmic factors, such as:
- the speed of the hardware the program runs on
- the programming language, compiler, or interpreter used
- whatever else the machine is doing at the same time
- the particular input data chosen
To judge an algorithm itself, do so on the basis of the algorithm's properties: the amount of work it does as a function of the size of the input, n.
Sketch graphs of the running times (n is x-axis, steps is y-axis) for the 7 exercises.
Exaggerate somewhat to illustrate the insignificance of the constant
factors when comparing algorithms as n grows large.
e.g., even comparing 1000n to n², at some large value
of n the former becomes faster (in this case n > 1000).
When analyzing the runtime of code/algorithms we normally ignore the constant multiplier and concentrate on the “order of magnitude”.
The term “order of magnitude” is normally written as Θ (theta) or
O (big-O).
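To make the crossover concrete, here is a small Java sketch (not from the text; the class name and the chosen values of n are just illustrative, mirroring the 1000n versus n² example above) that prints both running-time functions side by side:

    // Evaluates the running-time functions 1000n and n^2 at increasing n
    // to show where the n^2 curve overtakes 1000n despite its smaller constant.
    public class GrowthCrossover {
        public static void main(String[] args) {
            long[] sizes = {10, 100, 1000, 10000, 100000};
            System.out.println("n          1000n           n^2");
            for (long n : sizes) {
                System.out.printf("%-10d %-15d %-15d%n", n, 1000 * n, n * n);
            }
        }
    }

For n beyond 1000 the n² column pulls ahead and keeps growing faster, which is exactly why constant factors are ignored in order-of-magnitude comparisons.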
Three algorithms for the "data cleanup" problem
The problem, simply stated, is: given a list of integers, eliminate the list members which have a value of 0 while keeping the list structure intact (no gaps).
Note: The descriptions given here are intended merely to convey the essence of the algorithms and not to fully and unambiguously describe the algorithms in the way that a pseudocode specification does.
Approach 1: “shuffle-left”
1. Scan the list from left to right.
2. When the first “0” entry is reached, shift all remaining values
one position left. (The first such shift overwrites the “0” entry, and
each subsequent shift overwrites the entry before it. An extra copy of
the last entry is created, but we ignore all except the latest, leftmost,
copy.)
3. Resume the scan at the same position (in case a “0” was shuffled
into it) and repeat step 2.
4. When the scan reaches the end of the list (ignoring any and all
extra copies of the last entry), decrement the list size if the last entry
is "0"
5. Stop
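A minimal Java sketch of shuffle-left follows (my own rendering of the steps above, not the textbook's pseudocode); the class name, the sample data, and the convention of returning the new logical size are all illustrative choices:

    // Shuffle-left data cleanup: whenever a 0 is found, shift every value to
    // its right one position left, shrinking the logical list size by one.
    public class ShuffleLeft {
        // Cleans list[0..size-1] in place; returns the new logical size.
        public static int cleanup(int[] list, int size) {
            int i = 0;
            while (i < size) {
                if (list[i] == 0) {
                    // Shift the remaining entries one position to the left.
                    for (int j = i; j < size - 1; j++) {
                        list[j] = list[j + 1];
                    }
                    size--;   // the list is now one entry shorter
                    // Do not advance i: a 0 may have been shuffled into position i.
                } else {
                    i++;      // keep scanning to the right
                }
            }
            return size;
        }

        public static void main(String[] args) {
            int[] data = {0, 24, 16, 0, 36, 42, 0, 0};
            int newSize = cleanup(data, data.length);
            for (int k = 0; k < newSize; k++) {
                System.out.print(data[k] + " ");   // prints: 24 16 36 42
            }
            System.out.println();
        }
    }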
Approach 2: "copy-over"
1. Create a second, initially empty, list.
2. Scan the first list from left to right.
3. At each entry during the scan of the first list, check to see if
it is "0". If not, copy it to the end of the second list.
4. Stop when the scan reaches the end of the first list.
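A corresponding Java sketch of copy-over (again my own rendering, with illustrative names and data); note that it needs a second array as large as the original:

    import java.util.Arrays;

    // Copy-over data cleanup: build a second list containing only the
    // nonzero entries of the original, in their original order.
    public class CopyOver {
        // Returns a new array holding the nonzero entries of list[0..size-1].
        public static int[] cleanup(int[] list, int size) {
            int[] result = new int[size];   // second list, big enough for the worst case
            int count = 0;                  // number of entries copied so far
            for (int i = 0; i < size; i++) {
                if (list[i] != 0) {
                    result[count] = list[i];   // copy the nonzero value over
                    count++;
                }
            }
            return Arrays.copyOf(result, count);   // trim to the entries actually used
        }

        public static void main(String[] args) {
            int[] data = {0, 24, 16, 0, 36, 42, 0, 0};
            System.out.println(Arrays.toString(cleanup(data, data.length)));   // [24, 16, 36, 42]
        }
    }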
Approach 3: "converging-pointers"
1. Scan the list from left to right.
2. When the first "0" entry is reached, copy the last value of the
list into this position and decrement the list size.
3. Resume the scan at the same position (in case a “0” was copied into
it) and repeat step 2.
4. When the scan reaches the end of the list, decrement the list size
if the last entry is "0"
5. Stop
This is called converging-pointers because one pointer is the main scanner moving from left to right, while the other pointer moves from right to left: it always tracks the last position in the list.
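A Java sketch of converging-pointers, in the same illustrative style as the two sketches above (here the right "pointer" is simply the current list size):

    // Converging-pointers data cleanup: a left pointer scans for zeroes while
    // the list size acts as the right pointer; each 0 found is overwritten by
    // the current last value, and the list shrinks by one.
    public class ConvergingPointers {
        // Cleans list[0..size-1] in place; returns the new logical size.
        // Note: the surviving values may end up in a different order.
        public static int cleanup(int[] list, int size) {
            int left = 0;
            while (left < size) {
                if (list[left] == 0) {
                    list[left] = list[size - 1];   // copy the last value over the 0
                    size--;                        // the list is one entry shorter
                    // Do not advance left: the copied value might itself be 0.
                } else {
                    left++;
                }
            }
            return size;
        }

        public static void main(String[] args) {
            int[] data = {0, 24, 16, 0, 36, 42, 0, 0};
            int newSize = cleanup(data, data.length);
            for (int k = 0; k < newSize; k++) {
                System.out.print(data[k] + " ");   // prints: 42 24 16 36
            }
            System.out.println();
        }
    }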
Do you have an intuitive feeling for which of the three is fastest?
Runtime analysis of the three data cleanup algorithms
Our goal is order of magnitude running time.
We don’t want to bother counting every single step.
What steps are significant?
- comparing a list value to 0
- copying a number
Shuffle-left
What is best case? No zeroes in list.
What is worst case? All zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n²) <explanation below>
Worst case total: O(n²) + O(n) = O(n²) <only dominant term remains>
Worst case # copy operations explained: with all zeroes, every comparison finds a "0", so the remaining entries must be shifted left each time: roughly n-1 copies the first time, then n-2, and so on. The total is (n-1) + (n-2) + ... + 1 = n(n-1)/2, which is O(n²).
Conclusion? Slow but space-efficient.
Copy-Over
What is best case? All zeroes in list.
What is worst case? No zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n)
Worst case total: O(n) + O(n) = O(n)
Conclusion? Fast but space-inefficient.
Converging-Pointers
What is best case? No zeroes in list.
What is worst case? All zeroes in list.
Best case number of comparisons: O(n)
Best case number of copy operations: none
Best case total: O(n) + none = O(n)
Worst case number of comparisons: O(n)
Worst case number of copy operations: O(n)
Worst case total: O(n) + O(n) = O(n)
Conclusion? Fast AND space-efficient! We’ve beaten the time-space tradeoff.
Problems for which there is no computationally feasible solution
Consider the class of problems for which no O(n²) solution, or even one bounded by any fixed power of n, is known. Here are some examples:
Traveling Salesman problem: a salesman has N cities to visit and is constrained by the existing network of roads. In what order should he visit them to minimize the total mileage?
A related and seemingly easier problem is the Hamiltonian Circuit: given a network of N cities (not all directly connected to each other) and starting at city A, is it possible to travel through each of the other cities exactly once and finally arrive back at A?
Bin packing problem, stated in terms of a contemporary example: you have a spindle of 80-minute CD-Rs and a list of N songs you want to burn onto them. What arrangement of songs will minimize the number of CD-Rs required? (Think very large N.)
The difficulty in all of these lies in the number of possible combinations that have to be explored to determine the answer.
The number of combinations increases very rapidly (exponentially) as N increases. In general, their order of magnitude is O(2^N).
Compare 2^N to N² to N to log₂N for N=32
log₂N = 5
N = 32
N² = 1024
2^N = 4,294,967,296
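To see why brute force is hopeless for these problems, here is a small Java sketch (my own, with a made-up five-city mileage table) that tries every possible visiting order for the Traveling Salesman problem; with N cities it examines (N-1)! routes, which grows even faster than 2^N.

    // Brute-force Traveling Salesman: try every ordering of the cities and keep
    // the cheapest complete tour.  With N cities there are (N-1)! orderings,
    // so this is only feasible for very small N.
    public class BruteForceTSP {
        // dist[i][j] = road mileage between city i and city j (made-up numbers)
        static final int[][] dist = {
            {0, 3, 7, 4, 6},
            {3, 0, 2, 5, 8},
            {7, 2, 0, 3, 4},
            {4, 5, 3, 0, 2},
            {6, 8, 4, 2, 0},
        };
        static int bestCost = Integer.MAX_VALUE;
        static long routesExamined = 0;

        // Extends a partial tour; visited[i] marks cities already on the route.
        static void search(int city, boolean[] visited, int visitedCount, int costSoFar) {
            if (visitedCount == dist.length) {
                routesExamined++;
                int total = costSoFar + dist[city][0];   // return to the start city
                if (total < bestCost) bestCost = total;
                return;
            }
            for (int next = 1; next < dist.length; next++) {
                if (!visited[next]) {
                    visited[next] = true;
                    search(next, visited, visitedCount + 1, costSoFar + dist[city][next]);
                    visited[next] = false;
                }
            }
        }

        public static void main(String[] args) {
            boolean[] visited = new boolean[dist.length];
            visited[0] = true;               // start (and end) at city 0
            search(0, visited, 1, 0);
            System.out.println("Routes examined: " + routesExamined);   // 4! = 24 for 5 cities
            System.out.println("Best tour cost:  " + bestCost);
        }
    }

Adding a sixth city multiplies the number of routes by 5, a seventh by 6 more, and so on; it is that growth in the number of combinations, not the cost of checking any single route, that makes such problems infeasible.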
The best we can hope for are good heuristic (approximate) solutions.