Analysis of Algorithms CS 465/665

Analysis of Algorithms
Asymptotic Analysis
Prepared By:
Dr. Eng. Moustafa Reda AbdALLAH
Acknowledgement
• These presentation notes were summarized from presentation notes on
Data Structures and Algorithms and on the Design and Analysis of Computer
Algorithms from all over the world. I made use of all of these
presentations, and I would like to thank all the professors who created
such good work. Without those presentations, these slides could not
have been finished.
2
Objectives
1. Details of classic algorithms
2. Methods for designing algorithms
3. Validate/verify algorithm correctness
4. Analyze algorithm efficiency
5. Prove (or at least indicate) no correct, efficient
algorithm exists for solving a given problem
6. Writing clear algorithms and proofs
3
Course Objectives
• This course introduces students to the analysis
and design of computer algorithms. Upon
completion of this course, students will be able
to do the following:
– Analyze the asymptotic performance of algorithms.
– Demonstrate a familiarity with major algorithms and
data structures.
– Apply important algorithmic design paradigms and
methods of analysis.
– Synthesize efficient algorithms in common
engineering design situations.
4
What is an Algorithm?
• Algorithm
– is any well-defined computational procedure that takes some value, or set
of values, as input and produces some value, or set of values, as output.
– is thus a sequence of computational steps that transform the input into the
output.
– is a tool for solving a well-specified computational problem.
– is any special method of solving a certain kind of problem (Webster
Dictionary).
– is a finite set of precise instructions for performing a
computation or for solving a problem.
– is a clearly specified set of simple instructions to be followed to solve a
problem.
• Takes a value, or set of values, as input and
• produces a value, or set of values, as output
– May be specified
• in English, as a computer program, or as pseudo-code
5
Computer Algorithms
An algorithm is a procedure (a finite set of
well-defined instructions) for accomplishing
some task which,
• given an initial state,
• terminates in a defined end-state.
The computational complexity and
efficient implementation of the algorithm
are important in computing, and both
depend on suitable data structures.
6
Algorithm Description
• How to describe algorithms independent of a
programming language
• Pseudo-Code = a description of an algorithm that is
– more structured than usual prose but
– less formal than a programming language
• (Or diagrams)
• Example: find the maximum element of an array.
Algorithm arrayMax(A, n):
Input: An array A storing n integers.
Output: The maximum element in A.
currentMax ← A[0]
for i ← 1 to n − 1 do
    if currentMax < A[i] then currentMax ← A[i]
return currentMax
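The pseudo-code above translates almost line for line into a real language. The following is a minimal Python sketch of it; the function name `array_max` and the explicit check for an empty list are assumptions added for illustration, not part of the slide.

```python
def array_max(a):
    """Return the maximum element of a non-empty list, mirroring arrayMax(A, n)."""
    if not a:
        # Assumption: the pseudo-code takes n >= 1 for granted; we make that explicit.
        raise ValueError("array_max needs at least one element")
    current_max = a[0]              # currentMax <- A[0]
    for i in range(1, len(a)):      # for i <- 1 to n - 1 do
        if current_max < a[i]:      # if currentMax < A[i] then
            current_max = a[i]      #     currentMax <- A[i]
    return current_max


print(array_max([25, 90, 53, 23, 11, 34]))
```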
7
The study of Algorithm
• How to devise algorithms
• How to express algorithms
• How to validate algorithms
• How to analyze algorithms
• How to test a program
8
Clear Writing
• Methods for Expressing Algorithms
– Implementations
– Pseudo-code
– English
• Writing clear and understandable proofs
• My main concern is not the specific language
used but the clarity of your algorithm/proof
9
Pseudo Code
• Control flow
– if … then … [else …]
– while … do …
– repeat … until …
– for … do …
– Indentation replaces braces
• Method declaration
Algorithm method (arg [, arg…])
Input …
Output …
• Method call
var.method (arg [, arg…])
• Return value
return expression
• Expressions
← Assignment (equivalent to = in C/C++)
= Equality testing (equivalent to == in C/C++)
n² Superscripts and other mathematical formatting allowed
10
Pseudo Code
• High-level description of an
algorithm.
• More structured than plain
English.
• Less detailed than a
program.
• Preferred notation for
describing algorithms.
• Hides program design
issues.
Example: find the max
element of an array
Algorithm arrayMax(A, n)
Input array A of n integers
Output maximum element of A
currentMax ← A[0]
for i ← 1 to n − 1 do
    if A[i] > currentMax then
        currentMax ← A[i]
return currentMax
11
Pseudo Code
• Expressions: use standard mathematical symbols
– use ← for assignment ( ? in C/C++)
– use = for the equality relationship (? in C/C++)
• Method Declarations:
– Algorithm name(param1, param2)
• Programming Constructs:
– decision structures: if ... then ... [else ...]
– while-loops: while ... do
– repeat-loops: repeat ... until ...
– for-loop: for ... do
– array indexing: A[i]
• Methods
– calls: object method(args)
– returns: return value
• Use comments
• Instructions have to be basic enough and feasible!
12
Algorithm Expressed in pseudo-Language
• Example: Pseudo-language description for
sorting an array A of n integers in ascending
order.
• Let us use the notation A[i:n] to denote the set of
array elements A[i],A[i+1],...,A[n].
• We first find the minimum integer in the array
A[1:n] and swap it with the number in A[1].
• Then we find the minimum in the array A[2:n] and
swap it with the number in A[2] and so on.
13
Some Application
• Study problems these techniques can be applied to
– sorting
– data retrieval
– network routing
– Games
– etc
14
What is a Program?
• A program is the expression of an algorithm in a programming
language
• a set of instructions which the computer will follow to solve a
problem
• Program = algorithms + data structures
• Data structures
– Methods of organizing data
• To be interesting, an algorithm has to solve a general, specified
problem.
15
Problem Solving: Main Steps
1. Problem definition
2. Algorithm design/Algorithm specification
3. Algorithm analysis
4. Implementation
5. Testing
6. [Maintenance]
16
What is a Problem?
• Definition
– A mapping/relation between a set of input instances
(domain) and an output set (range)
• Problem Specification
– Specify what a typical input instance is
– Specify what the output should be in terms of the
input instance
• Example: Sorting
– Input: A sequence of n numbers a1, …, an
– Output: the permutation (reordering) of the input
sequence such that a1 ≤ a2 ≤ … ≤ an.
17
Define Problem
• Problem:
– Description of Input-Output relationship
• Algorithm:
– A sequence of computational steps that transforms the input into the
output. A finite set of instructions that, if followed, accomplishes a
particular task. It is described in natural language / pseudo-code /
diagrams / etc.
• Data Structure:
– An organized method of storing and retrieving data.
• Our task:
– Given a problem, design a correct and good algorithm that
solves it.
18
Problem Definition
• What is the task to be accomplished?
– Calculate the average of the grades for a given
student
– Understand the speeches given by politicians and
translate them into Chinese
• What are the time / space / speed /
performance requirements ?
19
A Problem
Input is a sequence of integers stored in an array.
Output the minimum.
INPUT instance: 25, 90, 53, 23, 11, 34
Data-Structure: the array a holding the input instance
Algorithm:
m := a[1];
for I := 2 to size of input
    if m > a[I] then
        m := a[I];
return m
OUTPUT: m = 11
20
Types of Problems
Search: Find X in the input satisfying property Y
Structuring: Transform input X to satisfy property Y
Construction: Build X satisfying Y
Optimization: Find the best X satisfying property Y
Decision: Does X satisfy Y?
Adaptive: Maintain property Y over time.
21
Algorithm Design / Specifications
• An algorithm consists of a finite set of steps satisfying the following
conditions:
– Input: The number and type of input values must be made clear; zero or more
quantities (externally produced).
– Output: One or more quantities.
– Precise specification of each step: Each step or instruction must be feasible
and unambiguously defined.
– Finiteness: For all input possibilities, the algorithm must terminate in finite time.
The algorithm has to stop after a finite (possibly very large) number of steps.
– Result: The objective of the algorithm must be made clear. There may be an
output that spells out the execution of the algorithm.
– Definiteness: Clarity and precision of each instruction.
– Effectiveness: Each instruction has to be basic enough and feasible.
• Algorithms are the ideas behind computer programs.
• An algorithm is the thing that stays the same whether the program is in C++
running on a Cray in New York or is in BASIC running on a Macintosh in
Alaska!
22
Simple Sort Algorithm
Algorithm Simplesort(A, n)
Input : An array A of n integers.
Output : A sorted array A in ascending order of the numbers it
holds.
FOR i ← 1 to (n-1) do
    Find the minimum integer in the array A[i:n]:
    Let j be such that A[j] = min A[i:n]
    Swap A[i] with A[j]
ENDFOR
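The Simplesort pseudo-code above is what is usually called selection sort. A minimal Python sketch follows; note that it uses 0-based indexing (the slide uses A[1:n]), and the name `simple_sort` and the choice to return a sorted copy rather than sort in place are assumptions made for illustration.

```python
def simple_sort(a):
    """Return a copy of `a` sorted in ascending order using the Simplesort idea."""
    a = list(a)                      # work on a copy so the caller's list is untouched
    n = len(a)
    for i in range(n - 1):           # FOR i <- 1 to (n-1), shifted to 0-based indices
        j = i                        # j will index the minimum of a[i:n]
        for k in range(i + 1, n):
            if a[k] < a[j]:
                j = k
        a[i], a[j] = a[j], a[i]      # swap A[i] with A[j]
    return a


print(simple_sort([25, 90, 53, 23, 11, 34]))
```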
23
Algorithm Analysis
• We only analyze correct algorithms
• An algorithm is correct
– If, for every input instance, it halts with the correct output. Always provides
correct output when presented with legal input.
• Incorrect algorithms
– Might not halt at all on some input instances
– Might halt with other than the desired answer
• Analyzing an algorithm
– Predicting the resources that the algorithm requires
– Resources include
• Memory
• Communication bandwidth
• Computational time (usually most important)
• What is the goal of analysis of algorithms?
– To compare algorithms mainly in terms of running time but also in terms of
other factors (e.g., memory requirements, programmer's effort etc.)
24
Algorithm Analysis
• The process of determining how much of each resource
(time, space) is used by a given algorithm, i.e. how to
estimate the time required for an algorithm
• Techniques that drastically reduce the running time of
an algorithm, together with a mathematical framework that more
rigorously describes that running time
• We want to be able to make quantitative assessments
about the value (goodness) of one algorithm compared
to another
• We want to do this WITHOUT implementing and
running an executable version of an algorithm
25
Algorithm Analysis…
• Factors affecting the running time
– computer
– compiler
– algorithm used
– input to the algorithm
• The content of the input affects the running time
• typically, the input size (number of items in the input) is the
main consideration
– E.g. sorting problem → the number of items to be sorted
– E.g. multiply two matrices together → the total number of
elements in the two matrices
• Machine model assumed
– Instructions are executed one after another, with no
concurrent operations → not parallel computers
26
Algorithm Analysis
• Many criteria affect the running time of an
algorithm, including
– speed of CPU, bus and peripheral hardware
– design think time, programming time and debugging
time
– language used and coding efficiency of the programmer
– quality of input (good, bad or average)
27
Algorithm Analysis
• Programs derived from two algorithms for
solving the same problem should both be
– Machine independent
– Language independent
– Environment independent (load on the system,...)
– Amenable to mathematical study
– Realistic
28
Algorithm Analysis
• In lieu of some standard benchmark conditions
under which two programs can be run, we
estimate the algorithm's performance based on
the number of key and basic operations it
requires to process an input of a given size
• For a given input size n we express the time T to
run the algorithm as a function T(n)
• Concept of growth rate allows us to compare
running time of two algorithms without writing
two programs and running them on the same
computer
29
Algorithm Analysis
• Given an algorithm a mathematical estimate is determined
that reflects the time and space complexity of an algorithm.
• This estimate should closely reflect the experimental results to
the extent possible.
• Any code or pseudo-language describing an algorithm
consists of certain number of primitive operations or
instructions.
• Primitive operations:
– Assignment of a value to a variable.
– Comparing two numbers.
– Arithmetic operations such as addition, subtraction,
multiplication etc. between two numbers.
30
Algorithm Analysis
• Primitive operations (cont.):
– Invoking a function or a method and returning from a
function or method.
– Indexing into an array.
– Unconditional jumps from one step of the code to
another.
• The execution times of these operations
depend on the architecture of the machine
that implements the algorithms.
31
What do we analyze about them?
• Correctness
– Does the input/output relation match algorithm requirement?
– Loop invariant by mathematical induction (Loop invariants are
conditions and relationships that are satisfied by the variables and
data structures at the end of each iteration of the loop.)
• Amount of work done (also known as complexity)
– Basic operations to do task
• Amount of space used
– Memory used
• Simplicity, clarity
– Verification and implementation.
• Optimality
– Is it impossible to do better?
32
Types of Analysis
• Worst case
– Provides an upper bound on running time
– An absolute guarantee that the algorithm would not run longer,
no matter what the inputs are
• Best case
– Provides a lower bound on running time
– Input is the one for which the algorithm runs the fastest
Lower Bound ≤ Running Time ≤ Upper Bound
• Average case
– Provides a prediction about the running time
– Assumes that the input is random
33
Worst / Average / Best-case
• Worst-case running time of an algorithm
– The longest running time for any input of size n
– An upper bound on the running time for any input
→ a guarantee that the algorithm will never take longer
– Example: Sort a set of numbers in increasing order; and the data
is in decreasing order
– The worst case can occur fairly often
• E.g. in searching a database for a particular piece of information
• Best-case running time
– sort a set of numbers in increasing order; and the data is already
in increasing order
• Average-case running time
– May be difficult to define what “average” means
34
Low Level Algorithm Analysis
• Based on primitive operations (low-level computations
independent from the programming language)
• E.g.:
– Make an addition = 1 operation
– Calling a method or returning from a method = 1 operation
– Index in an array = 1 operation
– Comparison = 1 operation etc.
•
Method: Inspect the pseudo-code and count the number
of primitive operations executed by the algorithm
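As a rough illustration of this counting method, the sketch below instruments the arrayMax algorithm and tallies one unit per assignment, comparison, array index, addition, and return. The exact bookkeeping (for example, charging the loop test once per check) is an assumption chosen for this example; other reasonable conventions change the constants but not the linear growth.

```python
def array_max_op_count(a):
    """Return (maximum, number of primitive operations) for a non-empty list."""
    ops = 0
    current_max = a[0]
    ops += 2                    # one array index + one assignment
    i = 1
    ops += 1                    # loop counter initialisation
    n = len(a)
    while True:
        ops += 1                # comparison i < n
        if not (i < n):
            break
        ops += 2                # array index a[i] + comparison with current_max
        if current_max < a[i]:
            current_max = a[i]
            ops += 2            # array index + assignment
        i += 1
        ops += 2                # addition + assignment
    ops += 1                    # returning from the method
    return current_max, ops


# Best case (maximum comes first): 5n operations under this convention.
print(array_max_op_count([5, 4, 3, 2, 1]))
# Worst case (ascending input, every element updates the maximum): 7n - 2 operations.
print(array_max_op_count([1, 2, 3, 4, 5]))
```

Under this convention the count always lies between 5n and 7n − 2, so the running time grows linearly in n whichever input is supplied.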
35
Algorithm/Program Performance
• Program performance is the amount of
computer memory and time needed to run a
program.
• How is it determined?
1. Analytically
• performance analysis
2. Experimentally
• performance measurement
36
What’s more important than performance?
• Modularity
• Correctness
• Maintainability
• Functionality
• Robustness
• User-friendliness
• Programmer time
• Simplicity
• Extensibility
• Reliability
37
Why study algorithms and performance?
• Algorithms help us to understand scalability.
• Performance often draws the line between what is
feasible and what is impossible.
• Algorithmic mathematics provides a language for talking
about program behavior.
• Performance is the currency of computing.
• The lessons of program performance generalize to other
computing resources.
• Speed is fun!
38
What do we need?
Correctness: Whether the algorithm computes
the correct solution for all instances
Efficiency: Resources needed by the algorithm
1. Time: Number of steps.
2. Space: amount of memory used.
Measurement “model”:
Worst case, Average case and Best case.
39
Correctness
• Example: Traveling Salesperson Problem (TSP)
• Input: A sequence of n cities with the distances d_ij
between each pair of cities
• Output: a permutation (ordering) of the cities <c1’, …, cn’>
that minimizes the expression
$\sum_{j=1}^{n-1} d_{j',(j+1)'} + d_{n',1'}$
• Which of the following algorithms is correct?
– Nearest neighbor: Initialize tour to city 1. Extend tour by visiting
nearest unvisited city. Finally return to city 1.
– All tours: Try all possible orderings of the points selecting the
ordering that minimizes the total length:
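To make the correctness question concrete, the sketch below runs both candidate algorithms on one small instance. The instance (four cities placed on a line at positions 0, 1, −2 and 5) and all function names are assumptions chosen for illustration; the point is only that nearest neighbor can return a tour longer than the one found by trying all orderings.

```python
from itertools import permutations


def tour_length(order, dist):
    """Length of the closed tour that visits cities in `order` and returns to the start."""
    n = len(order)
    return sum(dist[order[k]][order[(k + 1) % n]] for k in range(n))


def nearest_neighbor_tour(dist):
    """Start at city 0 and repeatedly move to the nearest unvisited city."""
    tour = [0]
    unvisited = set(range(1, len(dist)))
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited, key=lambda c: dist[here][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def all_tours_best(dist):
    """Try every ordering that starts at city 0 and keep the shortest tour."""
    candidates = ([0] + list(p) for p in permutations(range(1, len(dist))))
    return min(candidates, key=lambda order: tour_length(order, dist))


positions = [0, 1, -2, 5]                      # hypothetical cities on a line
dist = [[abs(p - q) for q in positions] for p in positions]

nn = nearest_neighbor_tour(dist)
best = all_tours_best(dist)
print(tour_length(nn, dist), tour_length(best, dist))
```

On this instance nearest neighbor produces a tour of length 16 while the exhaustive search finds one of length 14, so nearest neighbor is not a correct algorithm for TSP. The exhaustive search is correct but examines (n−1)! orderings.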
40
Algorithm Correctness
• Proving an algorithm generates correct output
for all inputs
• One technique covered in textbook
– Loop invariants
• We will do some of this in the course, but it is
not emphasized as much as other objectives
41
Efficiency
• Efficiency
– Computes correct output quickly given input
• Example: Odd Number Problem
• Input: A number n
• Output: Yes if n is odd, no if n is even
• Which of the following algorithms is most efficient?
– Count up to that number from one and alternate naming each
number as odd or even.
– Factor the number and see if there are any twos in the factorization.
– Keep a lookup table of all numbers from 0 to the maximum integer.
– Look at the last bit (or digit) of the number.
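Two of the candidates above can be sketched side by side. This is a minimal illustration that assumes n is a non-negative integer; the function names are made up for the example.

```python
def is_odd_by_counting(n):
    """Count up to n, alternately naming each number odd or even: about n steps."""
    odd = False                  # 0 is even
    for _ in range(n):
        odd = not odd
    return odd


def is_odd_by_last_bit(n):
    """Look only at the last bit of n: a single operation regardless of n."""
    return (n & 1) == 1


print(is_odd_by_counting(7), is_odd_by_last_bit(7))
```

Both give the same answers, but the first does work proportional to the value of n, while the second does a constant amount of work.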
42
Study of Effectiveness of an Algorithm
• An approach to evaluate the effectiveness of an
algorithm is to perform empirical studies.
• Limitations of empirical studies:
– The algorithm has to be implemented, and comparing two
algorithms implies that both of them have to be implemented
for the same machine using the same language.
– It is impossible to test all possible input permutations of the
data.
• General approach for Performance Analysis:
– Comparison of different algorithms for the same problem
has to be done in the same computing environment.
43
Study of Effectiveness of an Algorithm
• Measures the efficiency of an algorithm or its
implementation as a program as the input size
becomes very large
• We evaluate a new algorithm by comparing its
performance with that of previous approaches
– Comparisons are asymptotic analyses of classes of
algorithms
• We usually analyze the time required for an
algorithm and the space required for a
data structure
44
Criteria for Measurement
• Space
– amount of memory program occupies
– usually measured in bytes, KB or MB
• Time
– execution time
– usually measured by the number of executions
45
Space Complexity
• Space complexity is defined as the amount of
memory a program needs to run to completion.
• Why is this of concern?
– We could be running on a multi-user system where
programs are allocated a specific amount of space.
– We may not have sufficient memory on our computer.
– There may be multiple solutions, each having
different space requirements.
– The space complexity may define an upper bound on
the data that the program can handle.
46
Space Complexity
• Space complexity = The amount of memory required by
an algorithm to run to completion
– [Core dumps: the most often encountered cause is a “memory
leak”, where the amount of memory required is larger than the memory
available on a given system]
• Some algorithms may be more efficient if the data is
completely loaded into memory
– Need to look also at system limitations
– E.g. Classify 2GB of text in various categories [politics, tourism,
sport, natural disasters, etc.] – can I afford to load the entire
collection?
47
Space Complexity
1. Fixed part: The size required to store certain
data/variables, that is independent of the size of the
problem:
- e.g. name of the data collection
- same size for classifying 2GB or 1MB of texts
2. Variable part: Space needed by variables, whose size is
dependent on the size of the problem:
- e.g. actual text
- load 2GB of text VS. load 1MB of text
49
Components of Program Space
• Program space =
Instruction space
+ data space
+ stack space
• The instruction space is dependent on several
factors.
– the compiler that generates the machine code
– the compiler options that were set at compilation time
– the target computer
50
Components of Program Space
• Data space
– very much dependent on the computer architecture
and compiler
– The magnitude of the data that a program works with
is another factor
Unit: bytes
Type          Size    Type          Size
char          1       float         4
short         2       double        8
int           2       long double   10
long          4       pointer       2
51
Components of Program Space
• Data space
– Choosing a “smaller” data type has an effect on the
overall space usage of the program.
– Choosing the correct type is especially important
when working with arrays.
– How many bytes of memory are allocated with each
of the following declarations?
double a[100];
int maze[rows][cols];
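One way to work through the question above is to multiply the number of elements by the element size. The sketch below assumes the type sizes listed in the earlier table on this topic (for example, 2-byte int and 8-byte double); real sizes depend on the compiler and architecture. The values chosen for `rows` and `cols` are hypothetical.

```python
# Sizes in bytes, taken from the slide's table; these are platform-dependent assumptions.
SIZE_IN_BYTES = {
    "char": 1, "short": 2, "int": 2, "long": 4,
    "float": 4, "double": 8, "long double": 10, "pointer": 2,
}


def array_bytes(element_type, *dimensions):
    """Bytes occupied by an array with the given element type and dimensions."""
    count = 1
    for d in dimensions:
        count *= d
    return count * SIZE_IN_BYTES[element_type]


rows, cols = 10, 20                              # hypothetical maze dimensions
print(array_bytes("double", 100))                # double a[100];
print(array_bytes("int", rows, cols))            # int maze[rows][cols];
```

With these assumed sizes, `double a[100]` occupies 800 bytes and `int maze[rows][cols]` occupies 2 · rows · cols bytes.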
52
Components of Program Space
• Environment Stack Space
– Every time a function is called, the following data are
saved on the stack.
1. the return address
2. the values of all local variables and value formal parameters
3. the binding of all reference and const reference parameters
– What is the impact of recursive function calls on the
environment stack space?
53
Space Complexity Summary
• Given what you now know about space
complexity, what can you do differently to make
your programs more space efficient?
– Always choose the optimal (smallest necessary) data
type
– Study the compiler.
– Learn about the effects of different compilation
settings.
– Choose non-recursive algorithms when appropriate.
54
Time Complexity
• Time complexity is the amount of computer time
a program needs to run.
• Why do we care about time complexity?
– Some computers require upper limits for program
execution times.
– Some programs require a real-time response.
– If there are many solutions to a problem, typically
we’d like to choose the quickest.
55
Time Complexity
• Often more important than space complexity
– space available (for computer programs!) tends to be larger and
larger
– time is still a problem for all of us
• 3-4GHz processors on the market
– still …
– researchers estimate that computing the various
transformations of a single DNA chain for a single protein on a
1 THz computer would take about 1 year to run to completion
• The running time of algorithms is an important issue
56
Time Complexity
• How do we measure?
1. Count a particular operation (operation counts)
2. Count the number of steps (step counts)
3. Asymptotic complexity
57
Running-time of algorithms
• Bounds are for the algorithms, rather than
programs
– programs are just implementations of an algorithm,
and almost always the details of the program do not
affect the bounds
• Bounds are for algorithms, rather than problems
– A problem can be solved with several algorithms,
some are more efficient than others
58
Running Time
• Number of primitive steps that are executed
– Except for time of executing a function call most
statements roughly require the same amount of time
• y=m*x+b
• c = 5 / 9 * (t - 32 )
• z = f(x) + g(y)
• We can be more exact if need be
59
Running Time
• Problem: prefix averages
– Given an array X
– Compute the array A such that A[i] is the average of
elements X[0] … X[i], for i =0..n-1
• Sol 1
– At each step i, compute the element A[i] by traversing
X[0] … X[i], summing those elements, and dividing the
sum by i+1
• Sol 2
– At each step i, update a running sum of the elements of the array X
– Compute the element A[i] as sum/(i+1)
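The two solutions can be sketched as follows. This is a minimal Python illustration; the function names are made up for the example, and the step counts in the comments are the usual ones for these loops rather than figures from the slide.

```python
def prefix_averages_quadratic(x):
    """Sol 1: recompute the sum of x[0..i] from scratch for every i (about n^2/2 additions)."""
    a = []
    for i in range(len(x)):
        total = 0
        for j in range(i + 1):
            total += x[j]
        a.append(total / (i + 1))
    return a


def prefix_averages_linear(x):
    """Sol 2: keep a running sum, so each A[i] costs one addition and one division."""
    a = []
    running_sum = 0
    for i, value in enumerate(x):
        running_sum += value
        a.append(running_sum / (i + 1))
    return a


data = [2, 4, 6, 8]
print(prefix_averages_quadratic(data))
print(prefix_averages_linear(data))
```

Both produce the same array, but the second does work proportional to n instead of n².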
Big question: Which solution to choose?
60
Running time for small inputs
61
Running time for moderate inputs
62
Worst Case Operation Count
for (j = i - 1; j >= 0 && t < a[j]; j--)
    a[j + 1] = a[j];
a = [1,2,3,4] and t = 0
⇒ 4 compares
a = [1,2,3,4,…,i] and t = 0
⇒ i compares
for (int i = 1; i < n; i++)
    for (j = i - 1; j >= 0 && t < a[j]; j--)
        a[j + 1] = a[j];
total compares = 1+2+3+…+(n-1) = (n-1)n/2
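The total above can be checked by instrumenting the loops. The sketch below assumes the fragment is the inner part of a standard insertion sort, so it adds the `t = a[i]` assignment and the final placement of `t`, which the slide leaves out; only the `t < a[j]` comparisons are counted.

```python
def insertion_sort_compares(a):
    """Return (sorted copy, number of `t < a[j]` comparisons performed)."""
    a = list(a)
    compares = 0
    for i in range(1, len(a)):
        t = a[i]                 # assumption: the element being inserted
        j = i - 1
        while j >= 0:
            compares += 1        # one evaluation of t < a[j]
            if not (t < a[j]):
                break
            a[j + 1] = a[j]      # shift the larger element one slot right
            j -= 1
        a[j + 1] = t
    return a, compares


n = 6
print(insertion_sort_compares(list(range(n, 0, -1))))   # descending input: worst case
print(insertion_sort_compares(list(range(1, n + 1))))   # ascending input: best case
```

For n = 6 the descending input needs 15 = (n−1)n/2 comparisons, while the already sorted input needs only n−1 = 5.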
63
Input Size
• In general, larger input instances require more resources
to process correctly; input size greatly affects time and space
complexity, as in sorting and multiplication
• We standardize by defining a notion of size for an input instance
• Examples
– What is the size of a sorting input instance?
– What is the size of an “Odd number” input instance?
• Input size (number of elements in the input) is characterized by:
– size of an array
– polynomial degree
– # of elements in a matrix
– Sorting: number of input items
– Multiplication: total number of bits
– # of vertices and edges in a graph
– # of bits in the binary representation of the input
64
Example: Selection Problem
• Given a list of N numbers, determine the kth
largest, where k ≤ N.
• Algorithm 1:
(1) Read N numbers into an array
(2) Sort the array in decreasing order by some simple
algorithm
(3) Return the element in position k
65
Example: Selection Problem…
• Algorithm 2:
(1) Read the first k elements into an array and sort
them in decreasing order
(2) Each remaining element is read one by one
• If smaller than the kth element, then it is ignored
• Otherwise, it is placed in its correct spot in the array,
bumping one element out of the array.
(3) The element in the kth position is returned as the
answer.
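Both selection algorithms can be sketched as below. This is a minimal illustration that assumes 1 ≤ k ≤ N and an in-memory list instead of reading input; Python's built-in `sorted` stands in for the "simple" sorting algorithm, and the function names are made up for the example.

```python
def kth_largest_sort_all(numbers, k):
    """Algorithm 1: sort everything in decreasing order and return position k."""
    return sorted(numbers, reverse=True)[k - 1]


def kth_largest_keep_k(numbers, k):
    """Algorithm 2: keep only the k largest elements seen so far, in decreasing order."""
    top = sorted(numbers[:k], reverse=True)
    for value in numbers[k:]:
        if value < top[-1]:
            continue                         # smaller than the kth element: ignore
        pos = 0
        while pos < k and top[pos] >= value:
            pos += 1                         # find the correct spot in the array
        top.insert(pos, value)
        top.pop()                            # bump one element out of the array
    return top[-1]


data = [25, 90, 53, 23, 11, 34]
print(kth_largest_sort_all(data, 2), kth_largest_keep_k(data, 2))
```

Algorithm 2 only ever stores k elements, which matters when k is much smaller than N.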
66
Example: Selection Problem…
• Which algorithm is better when
– N =100 and k = 100?
– N =100 and k = 1?
• What happens when N = 1,000,000 and
k = 500,000?
• There exist better algorithms
67
Important Question
• Is it always important to be on the most
preferred curve?
• How much better is one curve than another?
• How do we decide which curve a particular
algorithm lies on?
• How do we design algorithms that avoid being
on the bad curves?
68
How do we compare algorithms?
• We need to define a number of objective
measures.
(1) Compare execution times?
Not good: times are specific to a particular
computer!
(2) Count the number of statements
executed?
Not good: the number of statements varies with
the programming language as well as the
style of the individual programmer.
69
Ideal Solution
• Express running time as a function of the
input size n (i.e., f(n)).
• Compare different functions corresponding
to running times.
• Such an analysis is independent of
machine type, programming style, etc.
70
Measuring Complexity
• The running time of an algorithm is the function
defined by the number of steps (or amount of
memory) required to solve input instances of
size n
– F(1) = 3
– F(2) = 5
– F(3) = 7
– …
– F(n) = 2n+1
• Problem: Inputs of the same size may require
different numbers of steps to solve
71
Asymptotic Analysis
• To compare two algorithms with running times f(n) and g(n), we need
a rough measure that characterizes how fast each function grows.
• Hint: use rate of growth
• Compare functions in the limit, that is, asymptotically! (i.e., for large
values of n)
• Goal: to simplify analysis by getting rid of unneeded information (like
“rounding” 1,000,001≈1,000,000)
• We want to say in a formal way 3n² ≈ n²
• asymptotic performance
– How does the algorithm behave as the problem size gets very large?
• Running time
• Memory/storage requirements
• Bandwidth/power requirements/logic gates/etc.
72
Asymptotic Complexity
• Two important reasons to determine operation and
step counts
1. To compare the time complexities of two programs
that compute the same function
2. To predict the growth in run time as the instance
characteristic changes
• Neither of the two yields a very accurate measure
– Operation counts: focus on “key” operations and
ignore all others
– Step counts: the notion of a step is itself inexact
• Asymptotic complexity provides meaningful statements
about the time and space complexities of a program
73
Complexity Example
• Two programs have complexities c1·n² + c2·n and c3·n,
respectively
• The program with complexity c3·n will be faster than the
one with complexity c1·n² + c2·n for sufficiently large
values of n
• For small values of n, either program could be faster; this
depends on the values of c1, c2 and c3
• If c1 = 1, c2 = 2, c3 = 100, then c1·n² + c2·n ≤ c3·n for n ≤
98 and c1·n² + c2·n > c3·n for n > 98
• What if c1 = 1, c2 = 2, and c3 = 3?
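The crossover point can be found by direct evaluation. The sketch below is a brute-force check over a finite range, not a proof; the function name and the search limit `n_max` are assumptions made for illustration.

```python
def last_n_where_quadratic_wins(c1, c2, c3, n_max=10_000):
    """Largest n in 1..n_max with c1*n^2 + c2*n <= c3*n, or None if there is none."""
    winners = [n for n in range(1, n_max + 1)
               if c1 * n * n + c2 * n <= c3 * n]
    return max(winners) if winners else None


print(last_n_where_quadratic_wins(1, 2, 100))
print(last_n_where_quadratic_wins(1, 2, 3))
```

With c3 = 100 the quadratic program is no slower up to n = 98, matching the slide. With c3 = 3 that only holds for n = 1, so the linear program is faster for every n ≥ 2.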
74
3 different analyses
The worst case running time of an algorithm is the function
defined by the maximum number of steps taken on any instance
of size n. Provides an upper bound on running time. An absolute
guarantee
The best case running time of an algorithm is the function
defined by the minimum number of steps taken on any instance
of size n.
The average-case running time of an algorithm is the function
defined by the average number of steps taken over all instances of
size n. Provides the expected running time. Very useful, but treat
with care: what is “average”? (Random, equally likely inputs, or
real-life inputs?)
Which of these is the best to use?
75
Average case analysis
• Drawbacks
– Based on a probability distribution of input instances
• The distribution may not be appropriate
• Provides little consolation if we have a worst-case input
– More complicated to compute than worst case
running time
• Worst case running time is often comparable
to average case running time (see next graph)
– Counterexamples to above point:
• Quicksort
• simplex method for linear programming
76
Best, Worst, and Average Case
77
Best, Worst, and Average Case
[Figure: running time (1 ms to 5 ms) for inputs A to G, marking the worst-case, average-case and best-case levels]
Suppose the program includes an if-then statement that
may execute or not → variable running time
Typically algorithms are measured by their worst case
78
Experimental Approach
• Write a program that implements the algorithm
• Run the program with data sets of varying size.
• Determine the actual running time using a system
call to measure time (e.g. system (date) );
• Problems?
79
Experimental Approach
• It is necessary to implement and test the
algorithm in order to determine its running time.
• Experiments can be done only on a limited set of
inputs, and may not be indicative of the running
time for other inputs.
• The same hardware and software should be
used in order to compare two algorithms, a
condition that is very hard to achieve!
80
Use a Theoretical Approach
• Based on high-level description of the
algorithms, rather than language dependent
implementations
• Makes possible an evaluation of the algorithms
that is independent of the hardware and
software environments
→ Generality
81
Worst case analysis
• Typically much simpler to compute as we do
not need to “average” performance on many
inputs
– Instead, we need to find and understand an input
that causes worst case performance
• Provides guarantee that is independent of any
assumptions about the input
• Often reasonably close to average case
running time
• The standard analysis performed
82
Upper Bounds
• Time complexity T(n) is a function of the problem size n. The value
of T(n) is the running time of the algorithm in the worst case, i.e.,
the number of steps it requires at most with an arbitrary input.
• Average case - the mean number of steps required with a large
number of random inputs.
• Example: the sorting algorithm bubblesort has a time complexity of
T(n) = n·(n-1)/2 comparison-exchange steps to sort a sequence of n
data elements.
• Often, it is not necessary to know the exact value of T(n), but only
an upper bound as an estimate.
• e.g., an upper bound for the time complexity T(n) of bubblesort is the
function f(n) = n²/2, since T(n) ≤ f(n) for all n.
83
Rate of Growth
• Consider the example of buying elephants and
goldfish:
Cost: cost_of_elephants + cost_of_goldfish
Cost ~ cost_of_elephants (approximation)
• The low order terms in a function are relatively
insignificant for large n
n⁴ + 100n² + 10n + 50 ~ n⁴
i.e., we say that n⁴ + 100n² + 10n + 50 and n⁴ have
the same rate of growth
84
Growth Rate
• The idea is to establish a relative order among functions
for large n
• ∃ c, n0 > 0 such that f(N) ≤ c g(N) when N ≥ n0
• f(N) grows no faster than g(N) for “large” N
85
Asymptotic Notation
• Describes the behavior of the time or space complexity for
large instance characteristics
• O notation / Big Oh (O): asymptotic “less than”:
provides an upper bound for the function f
– f(n) = O(g(n)) implies: f(n) “≤” g(n)
• Ω notation / Omega (Ω): asymptotic “greater than”:
provides a lower bound
– f(n) = Ω(g(n)) implies: f(n) “≥” g(n)
• Θ notation / Theta (Θ): asymptotic “equality”:
is used when an algorithm can be bounded both from above and
below by the same function
– f(n) = Θ(g(n)) implies: f(n) “=” g(n)
• Little oh (o) defines a loose upper bound.
86
Examples
• What about f(n) = 4n²? Is it O(n)?
– Try to find a c such that 4n² < cn for any n > n0; none exists, because 4n² < cn only holds while n < c/4
• 50n³ + 20n + 4 is O(n³)
– It would be correct to say it is O(n³ + n)
• Not useful, as n³ far exceeds n for large values
– It would be correct to say it is O(n⁵)
• OK, but g(n) should be as close as possible to f(n)
• 3log(n) + log(log(n)) = O( ? )
• Simple Rule: Drop lower order
terms and constant factors
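The examples above can be checked numerically against the definition f(n) ≤ c·g(n) for all n ≥ n0. The sketch below only tests a finite range, so a `True` result is evidence rather than a proof; the helper name, the witnesses c = 74 and n0 = 1, and the range limit are assumptions chosen for illustration.

```python
def holds_up_to(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n in n0..n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


def cubic(n):
    return 50 * n**3 + 20 * n + 4


# 50n^3 + 20n + 4 <= 74 n^3 for n >= 1, because 20n + 4 <= 24n^3 there.
print(holds_up_to(cubic, lambda n: n**3, c=74, n0=1))

# 4n^2 is not O(n): even a large constant such as c = 1000 fails once n > 250.
print(holds_up_to(lambda n: 4 * n**2, lambda n: n, c=1000, n0=1))
```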
87
Three Common Sets
• f(n) = O(g(n)) means c · g(n) is an Upper Bound on
f(n)
• f(n) = Ω(g(n)) means c · g(n) is a Lower Bound on
f(n)
• f(n) = Θ(g(n)) means c1 · g(n) is an Upper Bound on
f(n) and c2 · g(n) is a Lower Bound on f(n)
• These bounds hold for all inputs beyond some
threshold n0.
88
Motivation for Asymptotic Analysis
• An exact computation of worst-case running time
can be difficult
– Function may have many terms:
• 4n² - 3n log n + 17.5n - 43n^(2/3) + 75
• An exact computation of worst-case running time
is unnecessary
– Remember that we are already approximating running
time by using RAM model
89
Simplifications
• Ignore constants
– 4n² - 3n log n + 17.5n - 43n^(2/3) + 75 becomes
– n² – n log n + n - n^(2/3) + 1
• Asymptotic Efficiency
– n² – n log n + n - n^(2/3) + 1 becomes n²
• End Result: Θ(n²)
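One way to justify this end result is to divide the original function by n² and take the limit. The derivation below is a sketch of that argument and is not taken from the slides; it uses the standard fact that a finite, positive limit of f(n)/g(n) implies f(n) = Θ(g(n)).

```latex
\lim_{n \to \infty}
\frac{4n^2 - 3n\log n + 17.5\,n - 43\,n^{2/3} + 75}{n^2}
= \lim_{n \to \infty}
\left( 4 - \frac{3\log n}{n} + \frac{17.5}{n} - \frac{43}{n^{4/3}} + \frac{75}{n^2} \right)
= 4
```

Every term except the first vanishes as n grows, so the ratio tends to the constant 4 and the function is Θ(n²).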
90