CE 221
Data Structures and
Algorithms
Chapter 2: Algorithm Analysis - I
Text: Read Weiss, §2.1 – 2.4.2
Izmir University of Economics
Definition
• An Algorithm is a clearly specified set of
simple instructions to be followed to solve
a problem.
• Once an algorithm is given and decided
somehow to be correct, an important step
is to determine how much in the way of
resources, such as time or space, the
algorithm will require.
Definition of Asymptotic
(of a function, series, formula, etc)
approaching a given value or
condition, as a variable or an
expression containing a variable
approaches a limit, usually
infinity.
Mathematical Background
• Definition 2.1. T(N) = O(f(N)) if there are positive
constants c and n0 such that T(N) ≤ c·f(N) when N ≥ n0.
• Definition 2.2. T(N) = Ω(g(N)) if there are positive
constants c and n0 such that T(N) ≥ c·g(N) when N ≥ n0.
• Definition 2.3. T(N) = Θ(h(N)) if and only if
T(N) = O(h(N)) and T(N) = Ω(h(N)).
• Definition 2.4. T(N) = o(p(N)) if for all positive
constants c there exists an n0 such that T(N) < c·p(N)
when N > n0. (Less formally, T(N) = o(p(N)) if
T(N) = O(p(N)) and T(N) ≠ Θ(p(N)).)
O (Big-Oh) Notation
• Definitions establish a relative order among
functions. We compare their relative rates of
growth.
• Example: For small values of N, T(N) = 1000N
is larger than f(N) = N^2. For N ≥ n0 = 1000 and
c = 1, T(N) ≤ c·f(N). Therefore, 1000N = O(N^2)
(Big-Oh notation).
• Big-Oh notation says that the growth rate of
T(N) is less than or equal to that of f(N).
T(N)=O(f(N)) means f(N) is an upper bound
on T(N).
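The example's witnesses c = 1 and n0 = 1000 can be spot-checked numerically. The sketch below is only a finite check over a range, not a proof, and the helper name `bound_holds` and the `limit` parameter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* T(N) = 1000N and f(N) = N^2, the two functions from the example. */
static unsigned long long T(unsigned long long n) { return 1000ULL * n; }
static unsigned long long f(unsigned long long n) { return n * n; }

/* Returns true if T(N) <= c * f(N) for every N in [n0, limit].
 * This is a finite spot check of the Big-Oh witnesses, not a proof. */
bool bound_holds(unsigned long long c, unsigned long long n0,
                 unsigned long long limit)
{
    assert(n0 <= limit);
    for (unsigned long long n = n0; n <= limit; n++) {
        if (T(n) > c * f(n))
            return false;   /* the inequality fails at this N */
    }
    return true;
}
```

Calling `bound_holds(1, 1000, 100000)` checks the pair from the example, while `bound_holds(1, 1, 100000)` fails because 1000N exceeds N^2 for small N. The witnesses are not unique: c = 1000 with n0 = 1 also works.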
Ω and Θ notations
• T(N) = Ω(g(N)) (pronounced “omega”) says that
the growth rate of T(N) is greater than or equal
to that of g(N). g(N) is a lower bound on T(N).
• T(N) = Θ(h(N)) (pronounced “theta”) says that
the growth rate of T(N) equals the growth rate
of h(N).
• T(N) = o(p(N)) (pronounced “little-oh”) says that
the growth rate of T(N) is less than that of p(N).
• Example: N^2 = O(N^3), N^3 = Ω(N^2)
Intuition for Asymptotic
Notation
Big-Oh
– f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
big-Omega
– f(n) is (g(n)) if f(n) is asymptotically greater than or equal to g(n)
big-Theta
– f(n) is (g(n)) if f(n) is asymptotically equal to g(n)
little-oh
– f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
little-omega
– f(n) is (g(n)) if is asymptotically strictly greater than g(n)
Rules for Asymptotic Analysis-I
• Rule 1: If T1(N) = O(f(N)) and T2(N) = O(g(N)), then
a) T1(N) + T2(N) = O(f(N) + g(N))
(intuitively, O(max(f(N), g(N))))
b) T1(N) * T2(N) = O(f(N) * g(N))
• Rule 2: If T(N) is a polynomial
of degree k, then T(N) = Θ(N^k).
• Rule 3: log^k N = O(N) for any
constant k.
Rules for Asymptotic Analysis-II
• It is not desirable to include constants or low-order terms
inside a Big-Oh. Do not say T(N) = O(2N^2) or
T(N) = O(N^2 + N). The correct form is T(N) = O(N^2).
• The relative growth rates of f(N) and g(N) can always be
determined by computing $\lim_{N \to \infty} f(N)/g(N)$, using
L'Hôpital's rule if necessary: if $\lim_{N \to \infty} f(N) = \infty$
and $\lim_{N \to \infty} g(N) = \infty$, then
$\lim_{N \to \infty} f(N)/g(N) = \lim_{N \to \infty} f'(N)/g'(N)$.
1. The limit is 0: f(N) = o(g(N))
2. The limit is c ≠ 0: f(N) = Θ(g(N))
3. The limit is ∞: g(N) = o(f(N))
4. The limit oscillates: no relation
Rules for Asymptotic Analysis-III
• Sometimes simple algebra is just sufficient.
• Example: f(N) = N log N and g(N) = N^1.5 are given.
Decide which of the two functions grows faster!
• Dividing both by N, this amounts to comparing log N and
N^0.5. Squaring both, this is in turn equivalent to comparing
log^2 N and N. But we already know that N grows faster than
any power of a log.
• It is bad style to say f(N) ≤ O(g(N)), because the
inequality is implied by the definition.
Writing f(N) ≥ O(g(N)) is incorrect, since it does not make
sense.
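The same comparison can be cross-checked with the limit test and L'Hôpital's rule. The derivation below is a sketch that assumes natural logarithms; a different base only changes a constant factor and does not affect the result.

```latex
\lim_{N \to \infty} \frac{N \log N}{N^{1.5}}
  = \lim_{N \to \infty} \frac{\log N}{N^{0.5}}
  = \lim_{N \to \infty} \frac{1/N}{0.5\, N^{-0.5}}
  = \lim_{N \to \infty} \frac{2}{N^{0.5}}
  = 0
```

Since the limit is 0, N log N = o(N^1.5), which agrees with the algebraic argument.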
Model of Computation
• Our model is a normal computer.
• Instructions are executed sequentially.
• It has a repertoire of simple instructions (addition,
multiplication, comparison, assignment).
• It takes one (1) time unit to execute these simple
instructions (assume our model has fixed size
integers and no fancy operations like matrix
inversion or sorting).
• It has infinite memory.
• Weaknesses: not all operations cost the same (a disk read
takes far longer than an addition), and real memory is not
infinite, so page faults can occur.
What to Analyze
• The most important resource to analyze is generally
the running time of a program.
• Compiler and computer used affect it but are not
taken into consideration.
• The algorithm used and the input to it will be
considered.
• Two running times on input of size N are considered:
Tavg(N), the average running time (typical behavior), and
Tworst(N), the worst-case running time. The worst case is
generally the required quantity, because it is a performance
guarantee: a bound that holds for all input.
• The details of the programming language do not
affect a Big-Oh answer. It is the algorithm that is
analyzed not the program (implementation).
Counting Primitive Operations
Computing the time complexity: By inspecting the
pseudocode, we can determine the maximum number of
primitive operations executed by an algorithm, as a
function of the input size.

Algorithm arrayMax(A, n)              # operations
  currentMax ← A[0]                   2
  for i ← 1 to n − 1 do               2 + n
    if A[i] > currentMax then         2(n − 1)
      currentMax ← A[i]               2(n − 1)
    { increment counter i }           2(n − 1)
  return currentMax                   1
                               Total  7n − 1
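The operation count can be reproduced with an instrumented C version of arrayMax. This is a sketch: the function name `array_max_counted` is hypothetical, and how each line's cost is split into individual operations (for example, one index plus one assignment) is an assumption; only the per-line totals come from the table.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the largest element of a[0..n-1] and stores in *ops the number of
 * primitive operations, charged line by line as in the arrayMax table. */
int array_max_counted(const int a[], size_t n, unsigned long *ops)
{
    assert(n > 0 && ops != NULL);
    unsigned long count = 0;

    int current_max = a[0];
    count += 2;                  /* currentMax <- A[0]: index + assign (assumed split) */
    count += 2 + n;              /* loop header, total charged by the table */
    for (size_t i = 1; i < n; i++) {
        count += 2;              /* if A[i] > currentMax: index + compare */
        if (a[i] > current_max) {
            current_max = a[i];
            count += 2;          /* currentMax <- A[i]: charged only when executed */
        }
        count += 2;              /* increment counter i */
    }
    count += 1;                  /* return currentMax */

    *ops = count;
    return current_max;
}
```

On a strictly increasing array the assignment runs on every iteration, so the count reaches the maximum 7n − 1. When the first element is already the largest, the assignment never runs and the count drops to 5n + 1.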
Maximum Subsequence Sum
Problem - I
• Definition: Given (possibly negative) integers
A1, A2, ..., AN, find the maximum value of $\sum_{k=i}^{j} A_k$.
(For convenience, the maximum subsequence
sum is 0 if all integers are negative.)
• Example: For input -2, 11, -4, 13, -5, -2, the
answer is 20 (A2 through A4).
• There are many algorithms to solve this
problem. We will discuss 4 of these.
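To make the problem statement concrete, here is a minimal sketch of one straightforward solution: try every start index i and extend the end index j one element at a time, which takes O(N^2) time. It illustrates the definition and is not necessarily one of the four algorithms these slides refer to; the function name `max_subsequence_sum` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Maximum subsequence sum of a[0..n-1], computed by trying every start
 * index i and growing the end index j. Returns 0 if all elements are
 * negative (or n is 0), following the convention in the definition. */
long max_subsequence_sum(const int a[], size_t n)
{
    assert(a != NULL || n == 0);
    long max_sum = 0;
    for (size_t i = 0; i < n; i++) {
        long this_sum = 0;                /* sum of a[i..j] */
        for (size_t j = i; j < n; j++) {
            this_sum += a[j];
            if (this_sum > max_sum)
                max_sum = this_sum;
        }
    }
    return max_sum;
}
```

For the example input -2, 11, -4, 13, -5, -2 the function returns 20, the sum of the second through fourth elements.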
Maximum Subsequence Sum
Problem - II
• For a small amount of input, they all run in a
blink of the eye.
• Algorithms should not form bottlenecks. Times do
not include the time to read the input.
Maximum Subsequence Sum
Problem - III
- The O(N log N) algorithm is not linear. Verify it with a
straight-edge.
- Relative growth rates are evident.
Maximum Subsequence Sum
Problem - IV
• Illustrates how useless inefficient algorithms are for even
moderately large amounts of input.
Running Time Calculations
• Several ways to estimate the running time of a program (empirical vs
analytical)
• Big-Oh running times. Here is a simple program fragment to
calculate $\sum_{i=1}^{N} i^3$:

unsigned int sum( int n ) {
      unsigned int i, partial_sum;
/*1*/ partial_sum = 0;
/*2*/ for( i=1; i<=n; i++ )
/*3*/     partial_sum += i*i*i;
/*4*/ return( partial_sum );
}

- The declarations count for no time.
- Lines 1 and 4 count for 1 unit each.
- Line 3 counts for 4 units per time executed (2 multiplications, 1 addition,
1 assignment) and is executed N times for a total of 4N units.
- Line 2 costs 2N+2 units (1 unit for initial assignment, N+1 units for
comparison tests, N units for all increments).
- Total time T(N) is 6N+4 which is O(N).
- Some shortcuts could be taken without affecting the final answer (Line 3 is
an O(1) statement, Line 1 is insignificant compared to the for loop).
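The total follows from adding the per-line costs listed for Lines 1 through 4:

```latex
T(N) = \underbrace{1}_{\text{Line 1}}
     + \underbrace{(2N + 2)}_{\text{Line 2}}
     + \underbrace{4N}_{\text{Line 3}}
     + \underbrace{1}_{\text{Line 4}}
     = 6N + 4
```

Dropping the constant factor and the low-order term leaves O(N).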
General Rules - I
• RULE 1-FOR LOOPS: The running time of a for loop is at
most the running time of the statements inside the for
loop (including tests) times the number of iterations.
• RULE 2-NESTED FOR LOOPS: Analyze these inside
out. The total running time of a statement inside a group
of nested for loops is the running time of the statement
multiplied by the product of the sizes of all the for loops.
Example: the following program fragment is O(n^2):

for( i = 0; i < n; i++ )
    for( j = 0; j < n; j++ )
        k++;
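The nested-loop rule can be checked by wrapping the fragment in a function and returning the counter. This is a sketch; the function name `count_nested` is hypothetical, and k is started at 0 so that its final value equals the number of times the innermost statement ran.

```c
#include <assert.h>

/* Runs the doubly nested loop from the example and returns the final value
 * of k, i.e. how many times the innermost statement k++ executed. */
unsigned long count_nested(unsigned long n)
{
    unsigned long k = 0;
    for (unsigned long i = 0; i < n; i++)
        for (unsigned long j = 0; j < n; j++)
            k++;
    assert(k == n * n);   /* product of the sizes of the two loops */
    return k;
}
```

The innermost statement is O(1) and runs n * n times, which is where the O(n^2) bound comes from.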
General Rules - II
• RULE 3-CONSECUTIVE STATEMENTS: These just
add (which means that the maximum is the one that
counts; see Rule 1(a) of Rules for Asymptotic Analysis-I).
Example: the following program fragment, which has O(n)
work followed by O(n^2) work, is also O(n^2):

for( i = 0; i < n; i++ )
    a[i] = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < n; j++ )
        a[i] += a[j] + i + j;
General Rules - III
• RULE 4-IF/ELSE: For the fragment

if( condition )
    S1
else
    S2

• the running time of an if/else statement is never more
than the running time of the test plus the larger of the
running times of S1 and S2.
• Clearly, this can be an over-estimate in some cases, but
it is never an under-estimate.
General Rules - IV
• Other rules are obvious, but a basic strategy of analyzing
from the inside (or deepest part) out works. If there are
function calls, obviously these must be analyzed first.
• If there are recursive procedures, there are several
options. If it is really just a thinly veiled for loop, the
analysis is usually trivial. Example: The following function
is really just a simple loop and is obviously O(n):

unsigned int factorial( unsigned int n ) {
    if( n <= 1 )
        return 1;
    else
        return( n * factorial(n-1) );
}
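The loop hiding inside this recursion can be written out directly. This is a sketch; the function name `factorial_iter` is hypothetical, and the bound n <= 12 assumes a 32-bit unsigned int, for which 12! is the largest factorial that fits.

```c
#include <assert.h>

/* Iterative equivalent of the recursive factorial: one multiplication per
 * loop iteration, so the running time is O(n). */
unsigned int factorial_iter(unsigned int n)
{
    assert(n <= 12);             /* 13! overflows a 32-bit unsigned int */
    unsigned int result = 1;
    for (unsigned int i = 2; i <= n; i++)
        result *= i;
    return result;
}
```

Each recursive call in the original does O(1) work and makes exactly one further call, so it corresponds to a single iteration of this loop.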
General Rules - V
• When recursion is properly used, it is difficult to convert the
recursion into a simple loop structure. In this case, the analysis
will involve a recurrence relation that needs to be solved.
• Example: Consider the following program:

unsigned int fib( unsigned int n ) {
/*1*/ if( n <= 1 )
/*2*/     return 1;
      else
/*3*/     return( fib(n-1) + fib(n-2) );
}

- If the program is run for values of n around 40, it becomes terribly
inefficient. Let T(n) be the running time for the function fib(n).
T(0) = T(1) = 1 (time to do the test at Line 1 and return). For n ≥ 2,
the total time required is then
T(n) = T(n-1) + T(n-2) + 2
(where the 2 accounts for the work at Line 1 plus the addition at
Line 3).
T(n) ≥ fib(n) = fib(n-1) + fib(n-2) ≥ (3/2)^n (for n > 4, by induction)
• Huge amount of redundant work (violates the 4th rule of recursion, the
compound interest rule): fib(n-1) has already computed fib(n-2).
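The redundant work can be made visible by counting how many calls the recursive fib makes and comparing that with a loop that computes each value once. This is a sketch: the names `fib_calls` and `fib_iter` are hypothetical, both use the same convention fib(0) = fib(1) = 1 as the program above, and the bound n <= 46 assumes that unsigned long has at least 32 bits.

```c
#include <assert.h>

/* Number of calls made when evaluating the recursive fib(n): one for the
 * call itself plus the calls made by fib(n-1) and fib(n-2). */
unsigned long fib_calls(unsigned int n)
{
    if (n <= 1)
        return 1;
    return 1 + fib_calls(n - 1) + fib_calls(n - 2);
}

/* Linear-time alternative: keeps only the last two values and computes
 * each Fibonacci number exactly once. */
unsigned long fib_iter(unsigned int n)
{
    assert(n <= 46);             /* result still fits in 32 bits */
    unsigned long prev = 1, curr = 1;
    for (unsigned int i = 2; i <= n; i++) {
        unsigned long next = prev + curr;
        prev = curr;
        curr = next;
    }
    return curr;
}
```

The call count satisfies the same kind of recurrence as T(n) and works out to 2·fib(n) − 1, so it grows exponentially, while `fib_iter` does O(n) work.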
Homework Assignments
• 2.1, 2.2, 2.3, 2.4, 2.5, 2.7, 2.11, 2.12
• You are requested to study and solve the
exercises. Note that these are for you to
practice only. You are not to deliver the
results to me.