Analysis and Design of
Algorithms
According to historians of mathematics, the word "algorithm" (earlier "algorism") derives from the name of the famous Persian author al-Khwārizmī (c. 780-850 A.D.).
[Image: a stamp issued September 6, 1983 in the Soviet Union, commemorating al-Khwārizmī's 1200th birthday.]
[Image: a page from his book.]
[Image: statue of al-Khwārizmī in front of the Faculty of Mathematics, Amirkabir University of Technology, Tehran, Iran. Courtesy of Wikipedia.]
Computational Landscape
Design Methods:
• Iteration & Recursion (pre/post condition, loop invariant)
• Incremental
• Divide-&-Conquer
• Prune-&-Search
• Greedy
• Dynamic programming
• Randomization
• Reduction …
Data Structures:
• List, array, stack, queue
• Hash table
• Dictionary
• Priority Queue
• Disjoint Set Union
• Graph
• …
Analysis Methods:
• Mathematical Induction (pre/post condition, loop invariant)
• Asymptotic Notation
• Summation
• Recurrence Relation
• Lower and Upper Bounds
• Adversarial argument
• Decision tree
• Recursion tree
• Reduction …
Computational Models:
• Random Access Machine (RAM)
• Turing Machine
• Parallel Computation
• Distributed Computation
• Quantum Computation
• …
Algorithm
• An algorithm is a sequence of
unambiguous instructions for solving a
problem, i.e., for obtaining a required
output for any legitimate input in a finite
amount of time.
Analysis of Algorithms
• How good is the algorithm?
– Correctness
– Time efficiency
– Space efficiency
• Does there exist a better algorithm?
– Lower bounds
– Optimality
Example
Time complexity shows the dependence of an algorithm's running time on input size.
Let's assume: computer speed = 10^6 instructions per second (IPS), input: a database of size n = 10^6.

Time Complexity    Execution Time
n                  1 second
n log n            20 seconds
n^2                12 days
2^n                40 quadrillion (10^15) years
Machine Model
Algorithm analysis:
• should reveal intrinsic properties of the algorithm itself;
• should not depend on any computing platform, programming language, compiler, computer speed, etc.
Elementary steps:
• arithmetic: +, −, …
• logic: and, or, not
• comparison: <, =, >, …
• assigning a value to a scalar variable
• …
Complexity
• Space complexity
• Time complexity
• For iterative algorithms: sums
• For recursive algorithms: recurrence relations
Time Complexity
Time complexity shows the dependence of an algorithm's running time on input size.
• Worst-case
• Average (expected) case
What is it good for?
• Tells us how efficient our design is before a costly implementation.
• Reveals inefficiency bottlenecks in the algorithm.
• Lets us compare the efficiency of different algorithms that solve the same problem.
• Is a tool for figuring out the true complexity of the problem itself: how fast is the "fastest" algorithm for the problem?
• Helps us classify problems by their time complexity.
T(n) = Θ( f(n) )
Example: T(n) = 23n^3 + 5n^2 log n + 7n log^2 n + 4 log n + 6.
Drop the constant multiplicative factor and the lower-order terms:
T(n) = Θ(n^3)
Why do we want to do this?
1. Asymptotically (at very large values of n) the leading term largely determines the function's behaviour.
2. With new computer technology (say, 10 times faster) the leading coefficient changes (is divided by 10), so that coefficient is technology-dependent anyway.
3. This simplification is still capable of distinguishing between important but distinct complexity classes, e.g., linear vs. quadratic, or polynomial vs. exponential.
Asymptotic Notations: Θ, O, Ω, o, ω
Rough, intuitive meaning worth remembering:

Theta          f(n) = Θ(g(n))    f(n) ≈ c·g(n)
Big Oh         f(n) = O(g(n))    f(n) ≤ c·g(n)
Big Omega      f(n) = Ω(g(n))    f(n) ≥ c·g(n)
Little Oh      f(n) = o(g(n))    f(n) ≪ c·g(n)
Little Omega   f(n) = ω(g(n))    f(n) ≫ c·g(n)

In terms of lim_{n→∞} f(n)/g(n):
• limit = 0: order of growth of f(n) < order of growth of g(n); f(n) ∈ o(g(n)), f(n) ∈ O(g(n))
• limit = c > 0: order of growth of f(n) = order of growth of g(n); f(n) ∈ Θ(g(n)), f(n) ∈ O(g(n)), f(n) ∈ Ω(g(n))
• limit = ∞: order of growth of f(n) > order of growth of g(n); f(n) ∈ ω(g(n)), f(n) ∈ Ω(g(n))
Asymptotics by ratio limit
Let L = lim_{n→∞} f(n)/g(n). If L exists, then:

Theta          f(n) = Θ(g(n))    0 < L < ∞
Big Oh         f(n) = O(g(n))    0 ≤ L < ∞
Big Omega      f(n) = Ω(g(n))    0 < L ≤ ∞
Little Oh      f(n) = o(g(n))    L = 0
Little Omega   f(n) = ω(g(n))    L = ∞
Examples:
• log_b n vs. log_c n:
  log_b n = (log_b c)(log_c n)
  lim_{n→∞} (log_b n / log_c n) = lim_{n→∞} (log_b c) = log_b c
  Therefore log_b n ∈ Θ(log_c n).
L'Hôpital's rule
If lim_{n→∞} t(n) = lim_{n→∞} g(n) = ∞ and the derivatives t′, g′ exist, then
lim_{n→∞} t(n)/g(n) = lim_{n→∞} t′(n)/g′(n).
• Example: log_2 n vs. n
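As a worked instance of the rule, here is the limit for the example above written out in LaTeX notation (my own derivation of a standard fact):

\[
\lim_{n\to\infty}\frac{\log_2 n}{n}
  = \lim_{n\to\infty}\frac{(\log_2 n)'}{(n)'}
  = \lim_{n\to\infty}\frac{1/(n\ln 2)}{1} = 0,
\]

so log_2 n ∈ o(n): the logarithm grows strictly more slowly than n.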
Theta: Asymptotic Tight Bound
f(n) = Θ(g(n)) ⟺ ∃ c1, c2, n0 > 0 : ∀ n ≥ n0, c1·g(n) ≤ f(n) ≤ c2·g(n).
[Figure: for n ≥ n0, the curve f(n) stays between c1·g(n) and c2·g(n).]
Big Oh: Asymptotic Upper Bound
f(n) = O(g(n)) ⟺ ∃ c, n0 > 0 : ∀ n ≥ n0, f(n) ≤ c·g(n).
[Figure: for n ≥ n0, the curve f(n) stays below c·g(n).]
Big Omega: Asymptotic Lower Bound
f(n) = Ω(g(n)) ⟺ ∃ c, n0 > 0 : ∀ n ≥ n0, c·g(n) ≤ f(n).
[Figure: for n ≥ n0, the curve f(n) stays above c·g(n).]
Little oh: Non-tight Asymptotic Upper Bound
f(n) = o(g(n)) ⟺ ∀ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, f(n) < c·g(n).
No matter how small c is, c·g(n) eventually exceeds f(n).
[Figure: for n ≥ n0, f(n) lies strictly below c·g(n).]
Little omega: Non-tight Asymptotic Lower Bound
f(n) = ω(g(n)) ⟺ ∀ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, f(n) > c·g(n).
No matter how large c is, f(n) eventually exceeds c·g(n).
[Figure: for n ≥ n0, f(n) lies strictly above c·g(n).]
Definitions of Asymptotic Notations
f(n) = Θ(g(n)) ⟺ ∃ c1, c2 > 0, ∃ n0 > 0 : ∀ n ≥ n0, c1·g(n) ≤ f(n) ≤ c2·g(n)
f(n) = O(g(n)) ⟺ ∃ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, f(n) ≤ c·g(n)
f(n) = Ω(g(n)) ⟺ ∃ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, c·g(n) ≤ f(n)
f(n) = o(g(n)) ⟺ ∀ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, f(n) < c·g(n)
f(n) = ω(g(n)) ⟺ ∀ c > 0, ∃ n0 > 0 : ∀ n ≥ n0, c·g(n) < f(n)
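To see how the definitions are used, here is a minimal worked proof (my own illustration, not from the slides) of a Θ bound directly from the constants:

\[
3n^2 \;\le\; 3n^2 + 2n \;\le\; 3n^2 + 2n^2 \;=\; 5n^2 \quad\text{for all } n \ge 1,
\]

so 3n^2 + 2n = Θ(n^2), with witnesses c1 = 3, c2 = 5, n0 = 1.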
Ordering Functions
Constant << Logarithmic << Poly-logarithmic << Polynomial << Exponential << Factorial
For example:
5 << 5 log n << (log n)^5 << n^5 << 2^{5n} << n!
Classifying Functions
Polynomial:
• Linear: 5n
• Quadratic: 5n^2
• Cubic: 5n^3
• ?: 5n^4
Example Problem: Sorting
Some sorting algorithms and their worst-case time complexities:
• Quick-Sort: Θ(n^2)
• Insertion-Sort: Θ(n^2)
• Selection-Sort: Θ(n^2)
• Merge-Sort: Θ(n log n)
• Heap-Sort: Θ(n log n)
… and there are infinitely many sorting algorithms!
So, Merge-Sort and Heap-Sort are worst-case optimal, and SORTING complexity is Θ(n log n).
Theoretical analysis of time efficiency
Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size.
Basic operation: the operation that contributes most towards the running time of the algorithm.
T(n) ≈ c_op · C(n)
where T(n) is the running time, c_op is the execution time of the basic operation, C(n) is the number of times the basic operation is executed, and n is the input size.
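A quick numeric illustration of the formula (the numbers are hypothetical, chosen only to show how the estimate works): suppose the basic operation takes c_op = 10^-9 seconds and the algorithm executes C(n) = n^2 basic operations. Then for n = 10^5,
T(n) ≈ c_op · C(n) = 10^-9 · (10^5)^2 = 10 seconds,
and doubling n quadruples C(n), so the estimate also predicts how the running time scales.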
Input size and basic operation examples

Problem: search for a key in a list of n items
  Input size measure: number of items in the list, n
  Basic operation: key comparison

Problem: multiply two matrices of floating-point numbers
  Input size measure: dimensions of the matrices
  Basic operation: floating-point multiplication

Problem: compute a^n
  Input size measure: n
  Basic operation: floating-point multiplication

Problem: a graph problem
  Input size measure: number of vertices and/or edges
  Basic operation: visiting a vertex or traversing an edge
Best-case, average-case, worst-case
• Worst case: W(n), the maximum over inputs of size n
• Best case: B(n), the minimum over inputs of size n
• Average case: A(n), the "average" over inputs of size n
  – NOT the average of the worst and best cases.
  – Under some assumption about the probability distribution of all possible inputs of size n, it is the expected value (weighted sum) of C(n), the number of basic-operation repetitions, over all possible inputs of size n.
Time efficiency of nonrecursive algorithms
Steps in the mathematical analysis of nonrecursive algorithms:
• Decide on a parameter n indicating the input's size
• Identify the algorithm's basic operation
• Determine the worst, average, and best cases for inputs of size n
• Set up a summation for C(n) reflecting the algorithm's loop structure
• Simplify the summation using standard formulas
Series
S = Σ_{i=1}^{N} i = N(N+1)/2
Proof by Gauss when 9 years old (!):
  S = 1 + 2 + 3 + … + (N−2) + (N−1) + N
  S = N + (N−1) + (N−2) + … + 3 + 2 + 1
Adding the two lines term by term: 2S = N(N+1).
General rules for sums
• Σ_{i=m}^{n} c = c(n − m + 1)
• Σ_i (a_i + b_i) = Σ_i a_i + Σ_i b_i
• Σ_i c·a_i = c · Σ_i a_i
• Σ_{i=m}^{n} a_i = Σ_{i=m−k}^{n−k} a_{i+k}   (index shift)
• Σ_i a_i x^i = x · Σ_i a_i x^{i−1}   (factoring out x)
Some Mathematical Facts
• Some mathematical equalities are:
  Σ_{i=1}^{n} i = 1 + 2 + … + n = n(n+1)/2
  Σ_{i=1}^{n} i^2 = 1 + 4 + … + n^2 = n(n+1)(2n+1)/6
  Σ_{i=0}^{n} 2^i = 1 + 2 + 4 + … + 2^n = 2^{n+1} − 1
The Execution Time of Algorithms
• Each operation in an algorithm (or a program) has a cost; each operation takes a certain amount of time.

  count = count + 1;      takes a certain amount of time, but it is constant

• A sequence of operations:

  count = count + 1;      Cost: c1
  sum = sum + count;      Cost: c2

  Total Cost = c1 + c2
The Execution Time of Algorithms (cont.)
Example: Simple If-Statement
                        Cost   Times
  if (n < 0)            c1     1
    absval = -n;        c2     1
  else
    absval = n;         c3     1

Total Cost <= c1 + max(c2, c3)
The Execution Time of Algorithms (cont.)
Example: Simple Loop
                        Cost   Times
  i = 1;                c1     1
  sum = 0;              c2     1
  while (i <= n) {      c3     n+1
    i = i + 1;          c4     n
    sum = sum + i;      c5     n
  }

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*c5
The time required for this algorithm is proportional to n.
The Execution Time of Algorithms (cont.)
Example: Nested Loop
                          Cost   Times
  i = 1;                  c1     1
  sum = 0;                c2     1
  while (i <= n) {        c3     n+1
    j = 1;                c4     n
    while (j <= n) {      c5     n*(n+1)
      sum = sum + i;      c6     n*n
      j = j + 1;          c7     n*n
    }
    i = i + 1;            c8     n
  }

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5 + n*n*c6 + n*n*c7 + n*c8
The time required for this algorithm is proportional to n^2.
Growth-Rate Functions – Example 1
                        Cost   Times
  i = 1;                c1     1
  sum = 0;              c2     1
  while (i <= n) {      c3     n+1
    i = i + 1;          c4     n
    sum = sum + i;      c5     n
  }

T(n) = c1 + c2 + (n+1)*c3 + n*c4 + n*c5
     = (c3+c4+c5)*n + (c1+c2+c3)
     = a*n + b
So, the growth-rate function for this algorithm is O(n).
Growth-Rate Functions – Example 2
                          Cost   Times
  i = 1;                  c1     1
  sum = 0;                c2     1
  while (i <= n) {        c3     n+1
    j = 1;                c4     n
    while (j <= n) {      c5     n*(n+1)
      sum = sum + i;      c6     n*n
      j = j + 1;          c7     n*n
    }
    i = i + 1;            c8     n
  }

T(n) = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5 + n*n*c6 + n*n*c7 + n*c8
     = (c5+c6+c7)*n^2 + (c3+c4+c5+c8)*n + (c1+c2+c3)
     = a*n^2 + b*n + c
So, the growth-rate function for this algorithm is O(n^2).
Growth-Rate Functions – Example 3
                              Cost   Times
  for (i=1; i<=n; i++)        c1     n+1
    for (j=1; j<=i; j++)      c2     Σ_{j=1}^{n} (j+1)
      for (k=1; k<=j; k++)    c3     Σ_{j=1}^{n} Σ_{k=1}^{j} (k+1)
        x = x + 1;            c4     Σ_{j=1}^{n} Σ_{k=1}^{j} k

T(n) = c1*(n+1) + c2*( Σ_{j=1}^{n} (j+1) ) + c3*( Σ_{j=1}^{n} Σ_{k=1}^{j} (k+1) ) + c4*( Σ_{j=1}^{n} Σ_{k=1}^{j} k )
     = a*n^3 + b*n^2 + c*n + d
So, the growth-rate function for this algorithm is O(n^3).
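A quick way to sanity-check such hand counts is to instrument the loops with a counter. The sketch below (my own addition, not from the slides) mirrors the triple loop of Example 3 in C and verifies that the innermost statement runs exactly Σ_{j=1}^{n} Σ_{k=1}^{j} k = n(n+1)(n+2)/6 times:

#include <stdio.h>

int main(void) {
    for (long n = 1; n <= 100; n++) {
        long count = 0;                  /* how often x = x + 1 runs */
        for (long i = 1; i <= n; i++)
            for (long j = 1; j <= i; j++)
                for (long k = 1; k <= j; k++)
                    count++;             /* stands in for x = x + 1 */
        long closed = n * (n + 1) * (n + 2) / 6;   /* closed form of the double sum */
        if (count != closed) {
            printf("mismatch at n=%ld\n", n);
            return 1;
        }
    }
    printf("inner statement runs n(n+1)(n+2)/6 times for every n = 1..100\n");
    return 0;
}

The count grows as n^3/6, consistent with the O(n^3) growth rate derived above.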
Sequential Search

int sequentialSearch(const int a[], int item, int n) {
    int i;
    for (i = 0; i < n && a[i] != item; i++)
        ;                       /* scan until item is found or array is exhausted */
    if (i == n)
        return -1;              /* unsuccessful search */
    return i;
}

Unsuccessful search: O(n)
Successful search:
• Best case: the item is in the first location of the array: O(1)
• Worst case: the item is in the last location of the array: O(n)
• Average case: the number of key comparisons is 1, 2, ..., n, each equally likely:
  ( Σ_{i=1}^{n} i ) / n = (n^2 + n)/(2n) = (n+1)/2 ⇒ O(n)
Insertion Sort
an incremental algorithm

11  4  5  2 15  7
 4 11  5  2 15  7
 4  5 11  2 15  7
 2  4  5 11 15  7
 2  4  5 11 15  7
 2  4  5  7 11 15
Insertion Sort: Time Complexity

Algorithm InsertionSort(A[1..n])
1. for i ← 2 .. n do
     LI: A[1..i−1] is sorted, A[i..n] is untouched.
     § insert A[i] into the sorted prefix A[1..i−1] by right-cyclic-shift:
2.   key ← A[i]
3.   j ← i − 1
4.   while j > 0 and A[j] > key do
5.     A[j+1] ← A[j]
6.     j ← j − 1
7.   end-while
8.   A[j+1] ← key
9. end-for
end

T(n) = Σ_{i=2}^{n} Θ(1 + t_i) = Θ( n + Σ_{i=2}^{n} t_i ),
where t_i is the number of iterations of the while-loop for index i.
Worst case: t_i = i iterations (reverse-sorted input), so
Σ_{i=2}^{n} i = n(n+1)/2 − 1, and T(n) = Θ(n + n^2) = Θ(n^2).
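For concreteness, here is a direct C rendering of the pseudocode above (a sketch; the indices are shifted to C's 0-based convention, and the driver in main is my own test, not from the slides):

#include <stdio.h>

/* Sort A[0..n-1] in place by insertion sort. */
void insertion_sort(int A[], int n) {
    for (int i = 1; i < n; i++) {       /* invariant: A[0..i-1] is sorted */
        int key = A[i];                 /* element to insert */
        int j = i - 1;
        while (j >= 0 && A[j] > key) {  /* shift larger elements right */
            A[j + 1] = A[j];
            j--;
        }
        A[j + 1] = key;                 /* drop key into its slot */
    }
}

int main(void) {
    int A[] = {11, 4, 5, 2, 15, 7};     /* the example array from the slides */
    int n = sizeof A / sizeof A[0];
    insertion_sort(A, n);
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");
    return 0;
}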
Master theorem
• If T(n) = a·T(n/b) + O(n^d) for some constants a > 0, b > 1, and d ≥ 0, then
1. a < b^d ⇒ T(n) ∈ Θ(n^d)
2. a = b^d ⇒ T(n) ∈ Θ(n^d lg n)
3. a > b^d ⇒ T(n) ∈ Θ(n^{log_b a})
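Three quick applications of the theorem (standard examples, added here for practice; each checks against the cases above):
• T(n) = 4T(n/2) + O(n): a = 4, b = 2, d = 1, so a > b^d (case 3) and T(n) ∈ Θ(n^{log_2 4}) = Θ(n^2).
• T(n) = 2T(n/2) + O(n): a = 2, b = 2, d = 1, so a = b^d (case 2) and T(n) ∈ Θ(n lg n), as for Merge-Sort below.
• T(n) = T(n/2) + O(1): a = 1, b = 2, d = 0, so a = b^d (case 2) and T(n) ∈ Θ(lg n), as for binary search below.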
The Divide-and-Conquer Design Paradigm
• Divide the problem into subproblems.
• Conquer the subproblems by solving them
recursively.
• Combine subproblem solutions.
• Many algorithms use this paradigm.
Divide-and-conquer Technique
[Diagram: a problem of size n is divided into subproblem 1 of size n/2 and subproblem 2 of size n/2; a solution to each subproblem is obtained, and the two are combined into a solution to the original problem.]
Divide and Conquer Examples
• Sorting: mergesort and quicksort
• Matrix multiplication: Strassen's algorithm
• Binary search
• Powering a number
• Closest pair problem
• … etc.
Binary search
• Find an element in a sorted array:
  – Divide: check the middle element.
  – Conquer: recursively search one subarray.
  – Combine: trivial.
• Example: find 9 in
  3 5 7 8 9 12 15
Binary Search

int binarySearch(int a[], int size, int x) {
    int low = 0;
    int high = size - 1;
    int mid;                    /* mid will be the index of the target when it's found */
    while (low <= high) {
        mid = (low + high) / 2;
        if (a[mid] < x)
            low = mid + 1;
        else if (a[mid] > x)
            high = mid - 1;
        else
            return mid;
    }
    return -1;
}
Recurrence for binary search
• T(n) = 1·T(n/2) + Θ(1)
  – 1 = number of subproblems
  – n/2 = subproblem size
  – Θ(1) = cost of dividing and combining
• By case 2 of the Master theorem (a = 1, b = 2, d = 0): T(n) = Θ(lg n).
How much better is O(log2 n)?

n                        log2 n
16                       4
64                       6
256                      8
1,024 (1KB)              10
16,384                   14
131,072                  17
262,144                  18
524,288                  19
1,048,576 (1MB)          20
1,073,741,824 (1GB)      30
Powering a Number
• Problem: compute X^n, where n ∈ N.
• Naive algorithm:
  – Multiply n copies of X: X · X · X ··· X · X · X.
  – Complexity? Θ(n)
• Spot the creativity:
  – Is this the only, and the best, algorithm?
  – Any suggestions for using the divide-and-conquer strategy?
Powering a Number
• Starting with X, repeatedly multiply the result by itself. We get:
  X, X^2, X^4, X^8, X^16, X^32, …
• Suppose we want to compute X^13.
  13 = 8 + 4 + 1, so X^13 = X^8 · X^4 · X.
Powering a Number
• Divide-and-conquer: X^n = (X^{n/2})^2 if n is even, and X^n = X · (X^{(n−1)/2})^2 if n is odd.
• Complexity:
  T(n) = T(n/2) + Θ(1) ⇒ T(n) = Θ(lg n)
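A minimal recursive C sketch of this idea (my own rendering; it uses long long and ignores overflow, which a real implementation would need to address):

/* Compute x^n by repeated squaring: T(n) = T(n/2) + Θ(1) = Θ(lg n) multiplications. */
long long power(long long x, unsigned n) {
    if (n == 0) return 1;               /* x^0 = 1 */
    long long half = power(x, n / 2);   /* one subproblem of size n/2 */
    if (n % 2 == 0)
        return half * half;             /* n even: x^n = (x^(n/2))^2 */
    else
        return x * half * half;         /* n odd: x^n = x * (x^((n-1)/2))^2 */
}

Usage: power(2, 13) returns 8192 using O(lg n) multiplications rather than n − 1.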
Matrix Multiplication of n×n Matrices
• Input: two n×n matrices A = (a_ij) and B = (b_ij).
• Output: C = A·B, where c_ij = Σ_{k=1}^{n} a_ik · b_kj.

Code for Matrix Multiplication

  for i = 1 to n
      for j = 1 to n
          c_ij = 0
          for k = 1 to n
              c_ij = c_ij + a_ik * b_kj

• Running time = ? Three nested loops over n values each: Θ(n^3).
Matrix Multiplication by Divide-&-Conquer
Partition each matrix into four n/2 × n/2 blocks:

  [ C11 C12 ]   [ A11 A12 ]   [ B11 B12 ]
  [ C21 C22 ] = [ A21 A22 ] · [ B21 B22 ]

  C11 = A11·B11 + A12·B21
  C12 = A11·B12 + A12·B22
  C21 = A21·B11 + A22·B21
  C22 = A21·B12 + A22·B22

T(n) = 8·T(n/2) + Θ(n^2) ⇒ T(n) = Θ(n^{log_2 8}) = Θ(n^3)
Strassen's Idea
• How did Strassen come up with his magic idea?
  – We should try to get rid of as many multiplications as possible; we could afford a hundred additions instead.
  – We want to reduce the number of subproblems in T(n) = 8·T(n/2) + Θ(n^2).
  – He must have been very clever.
Strassen's Idea
• Multiply 2×2 matrices with only 7 recursive multiplications.
• Note: matrix addition is commutative, but matrix multiplication is not.
Strassen's Algorithm
1. Divide: partition A and B into (n/2)×(n/2) submatrices, and form the terms to be multiplied using + and −.
2. Conquer: perform the 7 multiplications (P1 to P7) of (n/2)×(n/2) submatrices recursively.
3. Combine: form the blocks of C (r, s, t, u) using + and − on (n/2)×(n/2) submatrices.
• Writing down the cost of each step, we get T(n) = 7·T(n/2) + Θ(n^2).
Cost of Strassen's Algorithm
T(n) = 7·T(n/2) + Θ(n^2)
• a = 7, b = 2, d = 2, so b^d = 4 < a
• n^{log_b a} = n^{lg 7}
• Case 3 of the Master theorem applies:
• T(n) = Θ(n^{lg 7}) = O(n^{2.81})
• Not so surprising?
• Strassen's algorithm is not the best known. The best has O(n^{2.376}) (of theoretical interest only).
• But it is simple and efficient enough compared with the naïve one when n ≥ 32.
Mergesort
Algorithm:
• Split array A[1..n] in two and make copies of each half in arrays B[1..⌈n/2⌉] and C[1..⌊n/2⌋]
• Sort arrays B and C
• Merge sorted arrays B and C into array A
Using Divide and Conquer: Mergesort
• Mergesort strategy:
[Diagram: the array from first to last is split at mid = ⌊(first + last)/2⌋; each half is sorted recursively by Mergesort; the two sorted halves are then merged into a sorted whole.]
Mergesort
Algorithm:
• Split array A[1..n] in two and make copies of each half in arrays B[1..⌈n/2⌉] and C[1..⌊n/2⌋]
• Sort arrays B and C
• Merge sorted arrays B and C into array A as follows:
  • Repeat the following until no elements remain in one of the arrays:
    – compare the first elements in the remaining unprocessed portions of the arrays
    – copy the smaller of the two into A, while incrementing the index indicating the unprocessed portion of that array
  • Once all elements in one of the arrays are processed, copy the remaining unprocessed elements from the other array into A.
Algorithm: Mergesort
Input: array E and indices first and last, such that the elements E[i] are defined for first ≤ i ≤ last.
Output: E[first], …, E[last] is a sorted rearrangement of the same elements.

void mergeSort(Element[] E, int first, int last)
    if (first < last)
        int mid = (first + last) / 2;
        mergeSort(E, first, mid);
        mergeSort(E, mid + 1, last);
        merge(E, first, mid, last);
    return;
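The slides leave merge unspecified; below is one common way to realize it, sketched in C with a scratch buffer (the 0-based indexing and int element type are my choices, not from the slides):

#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs E[first..mid] and E[mid+1..last] back into E. */
void merge(int E[], int first, int mid, int last) {
    int n = last - first + 1;
    int *tmp = malloc(n * sizeof *tmp);    /* scratch space: Θ(n) extra memory */
    int i = first, j = mid + 1, k = 0;
    while (i <= mid && j <= last)           /* copy the smaller front element */
        tmp[k++] = (E[i] <= E[j]) ? E[i++] : E[j++];
    while (i <= mid)  tmp[k++] = E[i++];    /* flush leftovers from the left run */
    while (j <= last) tmp[k++] = E[j++];    /* flush leftovers from the right run */
    memcpy(E + first, tmp, n * sizeof *tmp);
    free(tmp);
}

Each element is copied a constant number of times, so merging is Θ(n), which is exactly what the recurrence below assumes.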
Merge Sort
• How do we express the cost of merge sort?
  T(n) = 2·T(n/2) + Θ(n) for n > 1, T(1) = 0 ⇒ T(n) = Θ(n lg n)
  – 2 = number of subproblems
  – n/2 = subproblem size
  – Θ(n) = cost of dividing and combining
Merge Sort
1. Divide: trivial.
2. Conquer: recursively sort the subarrays.
3. Combine: linear-time merge.
Efficiency of Mergesort
• All cases have the same efficiency: Θ(n log n)
• The number of comparisons is close to the theoretical minimum for comparison-based sorting:
  – ⌈lg n!⌉ ≈ n lg n − 1.44n
• Space requirement: Θ(n) (NOT in-place)
• Can be implemented without recursion (bottom-up)
Quicksort by Hoare (1962)
• Select a pivot (partitioning element)
• Rearrange the list so that all the elements in the positions before the pivot are smaller than or equal to the pivot, and those after the pivot are larger than or equal to the pivot
• Exchange the pivot with the last element in the first (i.e., ≤) sublist; the pivot is now in its final position
• Sort the two sublists recursively

  [ A[i] ≤ p | p | A[i] ≥ p ]

• Exercise: apply quicksort to sort the list 7 2 9 10 5 4
Quicksort Example
• Recursive implementation with the leftmost array entry selected as the pivot element.
Quicksort Algorithm
• Input: array E and indices first and last, such that elements E[i] are defined for first ≤ i ≤ last.
• Output: E[first], …, E[last] is a sorted rearrangement of the array.

void quickSort(Element[] E, int first, int last)
    if (first < last)
        Element pivotElement = E[first];
        int splitPoint = partition(E, pivotElement, first, last);
        quickSort(E, first, splitPoint - 1);
        quickSort(E, splitPoint + 1, last);
    return;
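partition is called but not shown in the slides. Here is one common way to realize it, sketched in C and matching the contract used above: it places the pivot at its final index and returns that index (the scan-and-swap scheme and int element type are my choices):

/* Partition E[first..last] around pivot = E[first].
   On return, the pivot sits at the returned split point:
   elements < pivot are to its left, elements >= pivot to its right. */
int partition(int E[], int pivot, int first, int last) {
    int split = first;                    /* boundary of the "< pivot" region */
    for (int i = first + 1; i <= last; i++) {
        if (E[i] < pivot) {               /* grow the region and move E[i] into it */
            split++;
            int t = E[split]; E[split] = E[i]; E[i] = t;
        }
    }
    /* put the pivot between the two regions */
    int t = E[first]; E[first] = E[split]; E[split] = t;
    return split;
}

This performs n − 1 comparisons on a subarray of n elements, so partitioning is O(n), as the analysis below assumes.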
Quicksort Analysis
• Partition can be done in O(n) time, where n is the size of the array
• Let T(n) be the number of comparisons required by Quicksort
• If the pivot ends up at position k, then we have
  T(n) = T(n − k) + T(k − 1) + n
• To determine best-, worst-, and average-case complexity we need to determine the values of k that correspond to these cases.
Best-Case Complexity
• The best case is clearly when the pivot always partitions the array equally.
• Intuitively, this would lead to a recursive depth of at most lg n calls.
• We can actually prove this. In this case,
  T(n) ≤ T(n/2) + T(n/2) + n ∈ Θ(n lg n)
Worst-Case and Average-Case Complexity
• The worst case is when the pivot always ends up in the first or last position, i.e., it partitions the array as unequally as possible.
• In this case
  T(n) = T(n−1) + T(1−1) + n = T(n−1) + n
       = n + (n−1) + … + 1 = n(n+1)/2 ∈ O(n^2)
• The average case is rather complex to analyze, but it is where the algorithm earns its name. The bottom line is:
  A(n) ≈ 1.386 n lg n ∈ Θ(n lg n)
QuickSort Average-Case
[Diagram: after partitioning n elements, S< = { x : x < p } has i elements, the pivot p is in its final place, and S> = { x : x > p } has n − i − 1 elements.]
WLOG assume |S=| = 1; if it is larger, it can only help!
T(n) = T(i) + T(n−i−1) + Θ(n), with T(n) = Θ(1) for n = 0, 1.
Expected case:
T(n) = ave_i { T(i) + T(n−i−1) + Θ(n) : i = 0 .. n−1 }
     = (1/n) Σ_{i=0}^{n−1} [ T(i) + T(n−i−1) ] + Θ(n)
     = (2/n) Σ_{i=0}^{n−1} T(i) + Θ(n)
     = Θ(n log n)
QuickSort Average-Case
Example 2: T(n) = (2/n) Σ_{i=0}^{n−1} T(i) + n for n > 0, with T(0) = 0.
1. Multiply across by n (so that we can subsequently cancel out the summation):
   n·T(n) = 2 Σ_{i=0}^{n−1} T(i) + n^2, n > 0
2. Substitute n−1 for n:
   (n−1)·T(n−1) = 2 Σ_{i=0}^{n−2} T(i) + (n−1)^2, n > 1
3. Subtract (2) from (1):
   n·T(n) − (n−1)·T(n−1) = 2·T(n−1) + 2n − 1, n > 1
4. Rearrange:
   n·T(n) = (n+1)·T(n−1) + 2n − 1, n > 1
5. Divide by n(n+1) to make the LHS and RHS look alike:
   T(n)/(n+1) = T(n−1)/n + (2n−1)/(n(n+1)), n > 1
6. Rename: Q(n) = T(n)/(n+1) and Q(n−1) = T(n−1)/n.
7. Simplified recurrence: Q(n) = Q(n−1) + (2n−1)/(n(n+1)), Q(0) = 0.
8. Iterate, using the partial fractions (2n−1)/(n(n+1)) = 3/(n+1) − 1/n:
   Q(n) = Σ_{i=1}^{n} ( 3/(i+1) − 1/i ) = 2H(n) − 3n/(n+1),
   where H(n) = Σ_{i=1}^{n} 1/i is the nth Harmonic number.
9. Finally:
   T(n) = (n+1)·Q(n) = 2(n+1)·H(n) − 3n = Θ(n log n).
Summary of Quicksort
• Best case (split in the middle): Θ(n log n)
• Worst case (sorted array!): Θ(n^2)
• Average case (random arrays): Θ(n log n)
• Considered the method of choice for internal sorting of large files (n ≥ 10000)
• Improvements:
  – better pivot selection: median-of-three partitioning avoids the worst case on sorted files
  – switch to insertion sort on small subfiles
  – elimination of recursion
  These combine to a 20-25% improvement.
Binary Heap
A = a binary tree with one key per node.
Max-heap order: A satisfies the following partial order: for every node x ≠ root[A], key[x] ≤ key[parent(x)].
Full-tree node allocation scheme: nodes of A are allocated in increasing order of level, and left-to-right within the same level.
This allows an array implementation, where array indices simulate tree pointers.
[Figure: an example max-heap with nodes numbered 1..n in level order; every node's key is ≤ its parent's key.]
Array as Binary Heap
A[1..max size] stores the heap in level order; n = size[A], and positions size[A]+1 .. max size are currently unused.
For the node stored at A[t]:
• parent: A[⌊t/2⌋]
• left child: A[2t]
• right child: A[2t+1]
Tree height: h ≈ log n.
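These index rules translate directly into one-line helpers; a minimal C sketch (1-based indexing as in the slides, so A[0] is left unused):

/* Navigation in a 1-based array-backed binary heap. */
static inline int parent(int t) { return t / 2; }      /* ⌊t/2⌋ */
static inline int left(int t)   { return 2 * t; }
static inline int right(int t)  { return 2 * t + 1; }

A child index is valid only while it is ≤ size[A]; that bound check is what UpHeap and DownHeap below rely on.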
Some MAX Heap Properties
• Root A[1] contains the maximum item.
• Every root-to-leaf path appears in non-increasing order.
• Every subtree is also max-heap ordered. [Recursive structure]
• The key at any node is the largest among all its descendants:
  ∀(x, y) ∈ AncestorOf : A[x] ≥ A[y],
  where AncestorOf = { (x, y) : node x is an ancestor of node y }.
[Figure: an example max-heap with root A[1], subtrees L and R, and an ancestor x of a node y.]
UpHeap
A[1..n] = a max-heap. Suddenly, item A[t] increases in value.
Now A is a "t upward-corrupted heap":
∀(x, y) ∈ AncestorOf with y ≠ t : A[x] ≥ A[y].
Question: how would you rearrange A to make it a max-heap again?
Answer: percolate A[t] up its ancestral path.

procedure UpHeap(A, t)  § O(log n) time
  Pre-Cond: A is a t upward-corrupted heap
  Post-Cond: A is rearranged into a max-heap
  p ← ⌊t/2⌋  § parent of t
  if p = 0 or A[p] ≥ A[t] then return
  A[t] ↔ A[p]
  UpHeap(A, p)
end
UpHeap Example
[Figures: an increased key percolates up, swapping with its parent at each step, until its parent's key is at least as large; then it stops.]
DownHeap (or Heapify)
A[1..n] = a max-heap. Suddenly, item A[t] decreases in value.
Now A is a "t downward-corrupted heap":
∀(x, y) ∈ AncestorOf with x ≠ t : A[x] ≥ A[y].
Question: how would you rearrange A to make it a max-heap again?
Answer: demote A[t] down along the largest-child path.

procedure DownHeap(A, t)  § O(log n) time
  Pre-Cond: A is a t downward-corrupted heap
  Post-Cond: A is rearranged into a max-heap
  c ← 2t  § left child of t
  if c > size[A] then return  § c not part of heap
  if c < size[A] and A[c] < A[c+1] then c ← c + 1
  § now c is the largest child of t
  if A[t] < A[c] then
    A[t] ↔ A[c]
    DownHeap(A, c)
end
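A compact C rendering of both procedures (a sketch; the heap is 1-based in the array A, as in the pseudocode, and the function names are my own):

/* A 1-based max-heap: A[1..size] holds the keys, A[0] is unused. */

void up_heap(int A[], int t) {
    int p = t / 2;                        /* parent of t */
    if (p == 0 || A[p] >= A[t]) return;   /* heap order restored */
    int tmp = A[t]; A[t] = A[p]; A[p] = tmp;
    up_heap(A, p);                        /* continue up the ancestral path */
}

void down_heap(int A[], int size, int t) {
    int c = 2 * t;                        /* left child of t */
    if (c > size) return;                 /* t is a leaf */
    if (c < size && A[c] < A[c + 1]) c++; /* pick the larger child */
    if (A[t] < A[c]) {
        int tmp = A[t]; A[t] = A[c]; A[c] = tmp;
        down_heap(A, size, c);            /* continue down the largest-child path */
    }
}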
DownHeap Example
[Figures: a decreased key is swapped with its larger child at each step, moving down the largest-child path until both children are no larger; then it stops.]
Construct Heap
One application of heaps is sorting. But how do we build a heap in the first place?
Problem: given array A[1..n], rearrange its items to form a heap.
Solution: build incrementally. That is, make A[1..t] a max-heap while incrementing t ← 1 .. n:
  for t ← 1 .. n do UpHeap(A, t) end
Time = Σ_{i=0}^{h} Θ(i · 2^i) = Θ(h · 2^h) = Θ(n log n), where h = log n.
[Figure: level i of the tree contains up to 2^i nodes at depth i.]
Most nodes are concentrated near the bottom, with larger depths but smaller heights.
Idea: DownHeap is better!
Heap Construction Algorithm
Solution 3: build backwards on t by DownHeap(A, t).

procedure ConstructHeap(A[1..n])  § O(n) time
  Pre-Cond: input is array A[1..n] of arbitrary numbers
  Post-Cond: A is rearranged into a max-heap
  size[A] ← n  § establish last node barrier
  LastNonleafNode ← ⌊n/2⌋
  for t ← LastNonleafNode downto 1 do DownHeap(A, t)
end

Time = Σ_{i=0}^{h} Θ((h − i) · 2^i)      [a node at depth i has height ≤ h − i]
     = Σ_{j=0}^{h} Θ(j · 2^{h−j})        [substituting j = h − i]
     = 2^h · Σ_{j=0}^{h} Θ(j / 2^j)
     = 2^h · Θ(1) = Θ(n), where h = log n.
Alternative recursive view: T(n) = T(|L|) + T(|R|) + O(log n) ⇒ T(n) = O(n),
where L and R are the left and right subtrees of the root A[1].
Construct Heap Example
[Figures: bottom-up heap construction on a 12-node tree; DownHeap(A, t) is applied for t = 6, 5, 4, 3, 2, 1, after which the array is a MAX HEAP.]
Heap as a Priority Queue
A priority queue (usually implemented with some "heap" structure) is an abstract data structure that maintains a set S of items and supports the following operations on it:
• MakeEmptyHeap(S): make an empty priority queue and call it S.
• ConstructHeap(S): construct a priority queue containing the set S of items.
• Insert(x, S): insert new item x into S (duplicate values allowed).
• DeleteMax(S): remove and return the maximum item from S.
Note: a min-heap is used if we intend to do DeleteMin instead of DeleteMax.
Priority Queue Operations
An array A as a binary heap is a suitable implementation. For a heap of size n, it has the following time complexities:
• O(1): MakeEmptyHeap(A): size[A] ← 0
• O(n): ConstructHeap(A[1..n]): discussed already
• O(log n): Insert(x, A) and DeleteMax(A): see below

procedure Insert(x, A)
  size[A] ← size[A] + 1
  A[size[A]] ← x
  UpHeap(A, size[A])
end

procedure DeleteMax(A)
  if size[A] = 0 then return error
  MaxItem ← A[1]
  A[1] ← A[size[A]]
  size[A] ← size[A] − 1
  DownHeap(A, 1)
  return MaxItem
end
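Putting the pieces together, the earlier remark that "one application of heaps is sorting" becomes a few lines of C on top of these operations (a sketch reusing the down_heap helper from the DownHeap section: build in Θ(n), then n DeleteMax steps of O(log n) each, for Θ(n log n) total):

/* Heap-sort A[1..n] in place: build a max-heap, then repeatedly
   move the maximum to the end and shrink the heap. */
void heap_sort(int A[], int n) {
    for (int t = n / 2; t >= 1; t--)     /* ConstructHeap: Θ(n) */
        down_heap(A, n, t);
    for (int size = n; size > 1; size--) {
        int tmp = A[1]; A[1] = A[size]; A[size] = tmp;  /* DeleteMax */
        down_heap(A, size - 1, 1);       /* restore heap order: O(log n) */
    }
}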