
Algorithms
Algorithm Analysis
Theoretical Analysis
• Uses a high-level description of the algorithm instead of an implementation
• Characterizes running time as a function of the input size, n
• Takes into account all possible inputs
• Allows us to evaluate the speed of an algorithm independent of the hardware/software environment

Analysis of Algorithms
• What is the goal of analysis of algorithms?
  • To compare algorithms, mainly in terms of running time but also in terms of other factors (e.g., memory requirements, programmer's effort, etc.)
• What do we mean by running time analysis?
  • Determine how running time increases as the size of the problem increases.

Input Size
• Input size (number of elements in the input)
  • size of an array
  • polynomial degree
  • # of elements in a matrix
  • # of bits in the binary representation of the input
  • # of vertices and edges in a graph

Kinds of analyses
Worst-case: (usually)
• T(n) = maximum time of algorithm on any input of size n.
Average-case: (sometimes)
• T(n) = expected time of algorithm over all inputs of size n.
• Needs an assumption about the statistical distribution of inputs.
Best-case: (NEVER)
• Cheat with a slow algorithm that works fast on some input.

Types of Analysis
Lower Bound ≤ Running Time ≤ Upper Bound
• Worst case
  • Provides an upper bound on running time
  • An absolute guarantee that the algorithm will not run longer, no matter what the input is
• Best case
  • Provides a lower bound on running time
  • Input is the one for which the algorithm runs the fastest
• Average case
  • Provides a prediction about the running time
  • Assumes that the input is random

How do we compare algorithms?
• Express running time as a function of the input size n (i.e., f(n)).
• Compare the different functions corresponding to running times.
• Such an analysis is independent of machine time, programming style, etc.

Machine model: Generic Random Access Machine (RAM)
• Executes operations sequentially
• Set of primitive operations:
  • Arithmetic, logical, comparisons, function calls
• Simplifying assumption: all ops cost 1 unit
  • Eliminates dependence on the speed of our computer, which would otherwise be impossible to verify and compare

Some function categories
• The constant function: f(n) = c
• The linear function: f(n) = n
• The quadratic function: f(n) = n²
• The cubic function: f(n) = n³
• The exponential function: f(n) = bⁿ
• The logarithm function: f(n) = log n
• The n log n function: f(n) = n log n

Rate of Growth
• Consider the example of buying elephants and goldfish:
  Cost: cost_of_elephants + cost_of_goldfish
  Cost ~ cost_of_elephants (approximation)
• The low-order terms in a function are relatively insignificant for large n:
  n⁴ + 100n² + 10n + 50 ~ n⁴
  i.e., we say that n⁴ + 100n² + 10n + 50 and n⁴ have the same rate of growth

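To see this numerically: the ratio f(n)/n⁴ approaches 1 as n grows. A minimal C sketch (the sample values of n are our choice, for illustration only):

    #include <stdio.h>

    int main(void) {
        /* The ratio of f(n) = n^4 + 100n^2 + 10n + 50 to n^4 tends to 1,
           so the low-order terms become insignificant for large n. */
        for (double n = 10; n <= 1e6; n *= 100) {
            double f = n*n*n*n + 100*n*n + 10*n + 50;
            printf("n = %8.0f   f(n)/n^4 = %.8f\n", n, f / (n*n*n*n));
        }
        return 0;
    }
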
Constant Factors
• The growth rate is not affected by
  • constant factors or
  • lower-order terms
• Examples
  • 10²n + 10⁵ is a linear function
  • 10⁵n² + 10⁸n is a quadratic function

Primitive Operations
• Basic computations performed by an algorithm
• Identifiable in pseudocode
• Largely independent from the programming language
• Exact definition not important (we will see why later)
• Assumed to take a constant amount of time in the RAM model

Counting Primitive Operations
• By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size

Primitive Operations
• Examples:
  • Assigning a value to a variable
  • Calling a method
  • Performing an arithmetic operation such as addition
  • Comparing two numbers
  • Indexing into an array
  • Following an object reference
  • Returning from a method

What to count
currentMax ← A[0]:            assignment, array access
for i ← 1 to n - 1 do:        assignment, comparison, subtraction, increment (2 ops)
  if currentMax < A[i] then:  comparison, array access
    currentMax ← A[i]:        assignment, array access
return currentMax:            return

Measuring running time
currentMax ← A[0]              2
for i ← 1 to n - 1 do          1 + 2n (comparison & subtraction) + 2(n-1) (increment)
  if currentMax < A[i] then    2(n-1)
    currentMax ← A[i]          0 .. 2(n-1)
return currentMax              1
RUNNING TIME (worst-case): 8n - 2

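As a cross-check, here is a minimal C sketch of arrayMax instrumented with the same charging scheme (2 ops per header test, 2 per increment, and so on). On an ascending array, the worst case, the counter lands exactly on 8n - 2; this charging scheme is one convention among several, not the only valid one:

    #include <stdio.h>

    static long ops;  /* primitive-operation counter */

    int arrayMax(const int A[], int n) {
        ops += 2;                       /* A[0] access + assignment */
        int currentMax = A[0];
        ops += 1;                       /* i <- 1 (initialization) */
        for (int i = 1; i <= n - 1; i++) {
            ops += 2;                   /* header: subtraction + comparison */
            ops += 2;                   /* A[i] access + comparison */
            if (currentMax < A[i]) {
                ops += 2;               /* A[i] access + assignment */
                currentMax = A[i];
            }
            ops += 2;                   /* i++: add + assign */
        }
        ops += 2;                       /* final header test that exits the loop */
        ops += 1;                       /* return */
        return currentMax;
    }

    int main(void) {
        int A[] = {1, 2, 3, 4, 5, 6, 7, 8};   /* ascending = worst case */
        int n = 8;
        printf("max = %d\n", arrayMax(A, n));
        printf("ops = %ld, 8n - 2 = %d\n", ops, 8 * n - 2);
        return 0;
    }
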
Nested loops
// prints all pairs from the set {1, 2, ..., n}
for i ← 1 to n do
  for j ← 1 to n do
    print i, j        ← executed n² times
How about the operations implied in the for statement headers?

Nested loops running time
for i ← 1 to n do          3n + 2
  for j ← 1 to n do        n(3n + 2)
    print i, j             n²
(Each "for" header contributes 1 initialization, n + 1 comparisons, and 2n increment operations, i.e., 3n + 2; the inner header runs once per outer iteration.)
TOTAL: 4n² + 5n + 2

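A corresponding instrumented C sketch (same charging scheme as before, printing suppressed; illustrative only) reproduces the total, e.g., 452 operations for n = 10, matching 4n² + 5n + 2:

    #include <stdio.h>

    static long ops;  /* primitive-operation counter */

    void printPairs(int n) {
        ops += 1;                           /* i <- 1 */
        for (int i = 1; i <= n; i++) {
            ops += 1;                       /* header comparison i <= n */
            ops += 1;                       /* j <- 1 */
            for (int j = 1; j <= n; j++) {
                ops += 1;                   /* header comparison j <= n */
                ops += 1;                   /* print i, j (suppressed here) */
                ops += 2;                   /* j++: add + assign */
            }
            ops += 1;                       /* failing comparison exits inner loop */
            ops += 2;                       /* i++: add + assign */
        }
        ops += 1;                           /* failing comparison exits outer loop */
    }

    int main(void) {
        int n = 10;
        printPairs(n);
        printf("n = %d, ops = %ld, 4n^2 + 5n + 2 = %d\n", n, ops, 4*n*n + 5*n + 2);
        return 0;
    }
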
Example
• Associate a "cost" with each statement.
• Find the "total cost" by finding the total number of times each statement is executed.

Algorithm 1                 Cost
arr[0] = 0;                 c1
arr[1] = 0;                 c1
arr[2] = 0;                 c1
...
arr[N-1] = 0;               c1
-----------
c1 + c1 + ... + c1 = c1 x N

Algorithm 2                 Cost
for(i=0; i<N; i++)          c2
    arr[i] = 0;             c1
-----------
(N+1) x c2 + N x c1 = (c2 + c1) x N + c2

Another Example

Algorithm 3                 Cost
sum = 0;                    c1
for(i=0; i<N; i++)          c2
    for(j=0; j<N; j++)      c2
        sum += arr[i][j];   c3
-----------
c1 + c2 x (N+1) + c2 x N x (N+1) + c3 x N²

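A runnable C sketch of Algorithm 3, counting how many times each cost class executes (N = 10 is an arbitrary choice for illustration):

    #include <stdio.h>

    int main(void) {
        enum { N = 10 };
        int arr[N][N] = {{0}};
        long c1 = 0, c2 = 0, c3 = 0;   /* per-cost-class execution counters */
        long sum = 0;
        c1++;                          /* sum = 0 executes once */
        for (int i = 0; ; i++) {
            c2++;                      /* outer header test: runs N+1 times */
            if (i >= N) break;
            for (int j = 0; ; j++) {
                c2++;                  /* inner header test: N+1 times per i */
                if (j >= N) break;
                sum += arr[i][j];
                c3++;                  /* body: N^2 times in total */
            }
        }
        printf("sum = %ld, c1 = %ld, c2 = %ld (expected (N+1) + N(N+1) = %d), "
               "c3 = %ld (N^2 = %d)\n",
               sum, c1, c2, (N + 1) + N * (N + 1), c3, N * N);
        return 0;
    }
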
Counting Primitive Operations

Algorithm arrayMax(A, n):                  # operations
currentMax ← A[0]                          2          [index into array; assign value]
for i ← 1 to n - 1 do                      1 + n      [initialization; n comparisons]
  if A[i] > currentMax then                2(n - 1)   [n-1 loops: index; compare]
    currentMax ← A[i]                      2(n - 1)   [n-1 loops: index; assign]
  { increment counter i }                  2(n - 1)   [n-1 loops: add; assign]
return currentMax                          1
Total: 5n (at least), 7n - 2 (worst case)

Estimating Running Time
• Algorithm arrayMax executes 7n - 2 primitive operations in the worst case. Define:
  a = time taken by the fastest primitive operation
  b = time taken by the slowest primitive operation
• Let T(n) be the worst-case time of arrayMax. Then
  a(7n - 2) ≤ T(n) ≤ b(7n - 2)
• Hence, the running time T(n) is bounded by two linear functions

Big-Oh Notation
• Important definition: Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that
  f(n) ≤ c·g(n) for n ≥ n0
• O(n) is read as "big-oh of n".

Examples
• Show that 30n + 8 is O(n).
  • Must show c, n0 such that 30n + 8 ≤ cn for all n > n0.
  • Let c = 31, n0 = 8. Assume n > n0 = 8. Then
    cn = 31n = 30n + n > 30n + 8, so 30n + 8 < cn.

Examples
• 2n² = O(n³): 2n² ≤ cn³ ⟺ 2 ≤ cn; take c = 1 and n0 = 2
• n² = O(n²): n² ≤ cn² ⟺ c ≥ 1; take c = 1 and n0 = 1
• 1000n² + 1000n = O(n²): 1000n² + 1000n ≤ 1000n² + n² = 1001n² for n ≥ 1000; take c = 1001 and n0 = 1000
• n = O(n²): n ≤ cn² ⟺ cn ≥ 1; take c = 1 and n0 = 1

No Uniqueness
• There is no unique set of values for n0 and c in proving asymptotic bounds
• Prove that 100n + 5 = O(n²):
  • 100n + 5 ≤ 100n + n = 101n ≤ 101n² for all n ≥ 5
    n0 = 5 and c = 101 is a solution
  • 100n + 5 ≤ 100n + 5n = 105n ≤ 105n² for all n ≥ 1
    n0 = 1 and c = 105 is also a solution
• Must find SOME constants c and n0 that satisfy the asymptotic notation relation

Big-Oh Notation
• Example: Prove that 2n + 10 is O(n)
  • We want to find c and n0 such that 2n + 10 ≤ cn for n ≥ n0
  • Realize the values found will not be unique!

Prove: 2n + 10 is O(n)
• We want to find c and n0 such that 2n + 10 ≤ cn for n ≥ n0
  • Start with 2n + 10 ≤ cn and try to solve for n:
    (c - 2)n ≥ 10
    n ≥ 10/(c - 2)
• So, one possible choice (but not the only one) would be to let c = 3 and n0 = 10.
• This scratchwork is not handed in when a proof is requested. Now we can construct a proof!

Big-Oh Example
• Example: the function n² is not O(n)
  • We want n² ≤ cn, i.e., n ≤ c
  • The above inequality cannot be satisfied, since c must be a constant

More Big-Oh Examples
• 7n - 2
  Claim: 7n - 2 is O(n)
  Scratchwork: need c > 0 and n0 ≥ 1 such that 7n - 2 ≤ cn for n ≥ n0
  7n - 2 ≤ cn ⟺ (7 - c)n ≤ 2
  This is true for c = 7 and n0 = 1

More Big-Oh Examples
• 3n³ + 20n² + 5
  3n³ + 20n² + 5 is O(n³)
  Need c > 0 and n0 ≥ 1 such that 3n³ + 20n² + 5 ≤ cn³ for n ≥ n0
  This is true for c = 4 and n0 = 21
• 3 log n + log log n
  3 log n + log log n is O(log n)
  Need c > 0 and n0 ≥ 1 such that 3 log n + log log n ≤ c log n for n ≥ n0
  This is true for c = 4 and n0 = 2

Big-Oh Rules
• If f(n) is a polynomial of degree d, then f(n) is O(nᵈ), i.e.,
  1. Drop lower-order terms
  2. Drop constant factors
• Use the smallest possible class of functions
  • Say “2n is O(n)” instead of “2n is O(n²)”
• Use the simplest expression of the class
  • Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”

Big-O Notation
• We say fA(n) = 30n + 8 is order n, or O(n). It is, at most, roughly proportional to n.
• fB(n) = n² + 1 is order n², or O(n²). It is, at most, roughly proportional to n².

Back to Our Example

Algorithm 1                 Cost
arr[0] = 0;                 c1
arr[1] = 0;                 c1
arr[2] = 0;                 c1
...
arr[N-1] = 0;               c1
-----------
c1 + c1 + ... + c1 = c1 x N

Algorithm 2                 Cost
for(i=0; i<N; i++)          c2
    arr[i] = 0;             c1
-----------
(N+1) x c2 + N x c1 = (c2 + c1) x N + c2

• Both algorithms are of the same order: O(N)

Example (cont’d)

Algorithm 3                 Cost
sum = 0;                    c1
for(i=0; i<N; i++)          c2
    for(j=0; j<N; j++)      c2
        sum += arr[i][j];   c3
-----------
c1 + c2 x (N+1) + c2 x N x (N+1) + c3 x N² = O(N²)

Big-O Visualization
• O(g(n)) is the set of functions with smaller or same order of growth as g(n)

Logarithms and Exponents
• Definition: logb a = c if and only if a = b^c
• Properties of logarithms:
  logb(xy) = logb x + logb y
  logb(x/y) = logb x - logb y
  logb x^a = a logb x
  logb a = logx a / logx b
• Properties of exponentials:
  a^(b+c) = a^b · a^c
  a^(bc) = (a^b)^c
  a^b / a^c = a^(b-c)
  b = a^(loga b)
  b^c = a^(c · loga b)

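A quick C check of two of these identities, change of base and b^c = a^(c · loga b), using the standard math library (values arbitrary; compile with -lm):

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double a = 1024.0;
        /* change of base: log_b a = log_x a / log_x b, here b = 2, x = e */
        printf("log2(%g) = %g, log(a)/log(2) = %g\n",
               a, log2(a), log(a) / log(2.0));
        /* b^c = a^(c * log_a b) with a = e: compute 2^10 through base e */
        printf("2^10 = %g, exp(10 * log(2)) = %g\n",
               pow(2.0, 10.0), exp(10.0 * log(2.0)));
        return 0;
    }
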
Asymptotic Notation
• O notation: asymptotic “less than”:
  f(n) = O(g(n)) implies f(n) “≤” g(n)
• Ω notation: asymptotic “greater than”:
  f(n) = Ω(g(n)) implies f(n) “≥” g(n)
• Θ notation: asymptotic “equality”:
  f(n) = Θ(g(n)) implies f(n) “=” g(n)

Relatives of Big-Oh
• big-Omega
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  Intuitively, this says that, up to a constant factor, f(n) is asymptotically greater than or equal to g(n)

Asymptotic notations
• Θ-notation
  Θ(g(n)) is the set of functions with the same order of growth as g(n)

Examples
• n²/2 - n/2 = Θ(n²)
  • ½n² - ½n ≤ ½n² for all n ≥ 0  ⟹  c2 = ½
  • ½n² - ½n ≥ ½n² - (½n)(½n) = ¼n² for n ≥ 2  ⟹  c1 = ¼
• n ≠ Θ(n²): c1·n² ≤ n ≤ c2·n² only holds for n ≤ 1/c1

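A small C sketch checking the sandwich ¼n² ≤ n²/2 - n/2 ≤ ½n² numerically (the bound holds from n = 2 on; sample values of n are arbitrary):

    #include <stdio.h>

    int main(void) {
        /* Verify c1*n^2 <= n^2/2 - n/2 <= c2*n^2 with c1 = 1/4, c2 = 1/2. */
        for (long n = 2; n <= 200000; n *= 10) {
            double f = 0.5 * n * n - 0.5 * n;
            printf("n = %7ld   0.25n^2 = %14.1f <= f(n) = %14.1f <= 0.5n^2 = %14.1f\n",
                   n, 0.25 * n * n, f, 0.5 * n * n);
        }
        return 0;
    }
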
Relatives of Big-Oh
• little-oh
  f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≤ c·g(n) for n ≥ n0
  Intuitively, this says f(n) is, up to a constant, asymptotically strictly less than g(n).

Relatives of Big-Oh
• little-omega
  f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n0
  Intuitively, this says f(n) is, up to a constant, asymptotically strictly greater than g(n).
  These are not used as much as the earlier definitions, but they round out the picture.

Summary of Intuition for Asymptotic Notation
• Big-Oh: f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
• big-Omega: f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
• big-Theta: f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
• little-oh: f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
• little-omega: f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)

Example Uses of the Relatives of Big-Oh
• 5n² is Ω(n²)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  Let c = 5 and n0 = 1
• 5n² is Ω(n)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  Let c = 1 and n0 = 1
• 5n² is ω(n)
  f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n0
  Need 5n0² ≥ c·n0; given c, the n0 that satisfies this is n0 ≥ c/5 ≥ 0

More Examples
• f(n) = log n²,        g(n) = log n + 5     →  f(n) = Θ(g(n))
• f(n) = n,             g(n) = log n²        →  f(n) = Ω(g(n))
• f(n) = log log n,     g(n) = log n         →  f(n) = O(g(n))
• f(n) = n,             g(n) = log² n        →  f(n) = Ω(g(n))
• f(n) = n log n + n,   g(n) = log n         →  f(n) = Ω(g(n))
• f(n) = 10,            g(n) = log 10        →  f(n) = Θ(g(n))
• f(n) = 2ⁿ,            g(n) = 10n²          →  f(n) = Ω(g(n))
• f(n) = 2ⁿ,            g(n) = 3ⁿ            →  f(n) = O(g(n))

A More Precise Definition of O, Θ (only for those with calculus backgrounds)
• Definition: Let f and g be functions defined on the positive real numbers with real values.
• We say g is in O(f) if and only if
  lim (n → ∞) g(n)/f(n) = c
  for some nonnegative real number c, i.e., the limit exists and is not infinite.
• We say f is in Θ(g) if and only if f is in O(g) and g is in O(f).
• Note: Often to calculate these limits you need L'Hôpital's rule.

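The limit test can be illustrated numerically. A minimal C sketch (the function choices are ours, for illustration): for g(n) = 3n³ + 20n² + 5 and f(n) = n³, the ratio g(n)/f(n) settles toward the finite constant 3, so g is in O(f); the reciprocal ratio tends to 1/3, also finite, so f is in O(g); hence g is in Θ(f) by the definition above:

    #include <stdio.h>

    int main(void) {
        /* g(n)/f(n) -> 3, a finite constant, so g is in O(f). */
        for (double n = 10; n <= 1e7; n *= 100) {
            double g = 3*n*n*n + 20*n*n + 5;
            double f = n*n*n;
            printf("n = %9.0f   g(n)/f(n) = %.6f\n", n, g / f);
        }
        return 0;
    }
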
Relations Between Different Sets
• Subset relations between order-of-growth sets (functions from R to R).
[Figure: Venn diagram in which O(f) and Ω(f) overlap; their intersection is Θ(f), and the function f itself lies inside Θ(f).]

Why Asymptotic Behavior is Important
1. Allows us to compare two algorithms with respect to running time on large data sets.
2. Helps us understand the maximum size of input that can be handled in a given time, provided we know the environment in which we are running.
3. Stresses the fact that even dramatic speedups in hardware do not overcome the handicap of an asymptotically slow algorithm.

Common Summations
• Arithmetic series:
  sum_{k=1..n} k = 1 + 2 + ... + n = n(n + 1)/2
• Geometric series:
  sum_{k=0..n} x^k = 1 + x + x² + ... + x^n = (x^(n+1) - 1)/(x - 1),  x ≠ 1
  • Special case |x| < 1:
    sum_{k=0..∞} x^k = 1/(1 - x)
• Harmonic series:
  sum_{k=1..n} 1/k = 1 + 1/2 + ... + 1/n ≈ ln n
• Other important formulas:
  sum_{k=1..n} lg k ≈ n lg n
  sum_{k=1..n} k^p = 1^p + 2^p + ... + n^p ≈ n^(p+1)/(p + 1)

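A short C sketch checking two of these formulas numerically (the values n = 100 and x = 0.5 are arbitrary choices for illustration):

    #include <stdio.h>

    int main(void) {
        /* Arithmetic series vs. closed form n(n+1)/2 */
        int n = 100;
        long sum = 0;
        for (int k = 1; k <= n; k++) sum += k;
        printf("sum_{k=1..%d} k = %ld, n(n+1)/2 = %d\n", n, sum, n * (n + 1) / 2);

        /* Geometric series with |x| < 1 vs. its limit 1/(1-x) */
        double x = 0.5, geo = 0.0, term = 1.0;
        for (int k = 0; k <= 50; k++) { geo += term; term *= x; }
        printf("sum_{k=0..50} x^k (x = 0.5) = %.10f, 1/(1-x) = %.10f\n",
               geo, 1.0 / (1.0 - x));
        return 0;
    }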