Lower Bound for Comparison Sorting Algorithms

CPSC 320: Intermediate Algorithm
Design and Analysis
July 4, 2014
Course Outline
• Introduction and basic concepts
• Asymptotic notation
• Greedy algorithms
• Graph theory
• Amortized analysis
• Recursion
• Divide-and-conquer algorithms
• Randomized algorithms
• Dynamic programming algorithms
• NP-completeness
Asymptotic Notation
Machine Model
• In all algorithms we will assume:
• Sequential execution
• One processor
• Each memory position can hold an arbitrarily large integer
• Unless otherwise specified, we will discuss the worst case
• Best case: usually not very useful
• Average case: depends on input distribution assumptions
Upper Bound
O(g(n)) = { f(n) : ∃ c > 0, ∃ n₀, ∀ n ≥ n₀, f(n) ≤ c·g(n) }
• Describes an upper bound on the growth rate of the function

Example: 7n^2 + 5 ∈ O(n^3/6)
Assume c = 72, n₀ = 1:
7n^2 + 5 ≤ 7n^2 + 5n^2   (since n ≥ 1)
         = 12n^2
         ≤ 12n^3   (since n ≥ 1)
         = 72 · (n^3/6) ∎
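As a quick sanity check, the inequality can be verified numerically (a minimal Python sketch; the constants c = 72 and n₀ = 1 are the ones chosen in the proof above):

c, n0 = 72, 1

def f(n):
    return 7 * n**2 + 5     # f(n) = 7n^2 + 5

def g(n):
    return n**3 / 6         # g(n) = n^3 / 6

# f(n) <= c*g(n) should hold for every n >= n0
assert all(f(n) <= c * g(n) for n in range(n0, 10_000))
print("7n^2 + 5 <= 72 * (n^3/6) for all tested n >= 1")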
Lower Bound
Ω(g(n)) = { f(n) : ∃ c > 0, ∃ n₀, ∀ n ≥ n₀, f(n) ≥ c·g(n) }
• A lower bound is not the “best case”; it is a lower limit on the growth rate
• Usually used for:
• Identifying lower bounds for every possible algorithm to solve a problem
• Identifying tight bounds (Θ(g(n)))
Tight Bound
Θ(g(n)) = { f(n) : ∃ c₁, c₂ > 0, ∃ n₀, ∀ n ≥ n₀, c₁·g(n) ≤ f(n) ≤ c₂·g(n) }
Θ(g(n)) = O(g(n)) ∩ Ω(g(n))
• Describes a tight bound on the growth rate of the function
Strict bounds
o(g(n)) = { f(n) : ∀ c > 0, ∃ n₀, ∀ n ≥ n₀, f(n) < c·g(n) }
ω(g(n)) = { f(n) : ∀ c > 0, ∃ n₀, ∀ n ≥ n₀, f(n) > c·g(n) }
• Strict bounds indicate a growth rate that is strictly smaller (o) or strictly larger (ω) than that of g(n)
• Example: n ∈ o(n^2), since for any c > 0 we can choose n₀ > 1/c, and then n < c·n^2 for all n ≥ n₀
Exercise
• Prove that:
n^3 ∈ Θ(n^3)
3n^3 − n^2 + 10 ∈ Θ(n^3)
n^3/2 + 100n^2 ∈ Θ(n^3)
log₂ n ∈ Θ(log₁₀ n)
Solutions
Prove that n^3 ≤ c·n^3:
  n^3 = 1·n^3
  So, for c = 1, n₀ = 0, n^3 ≤ c·n^3. Thus, n^3 ∈ O(n^3).
Prove that n^3 ≥ c·n^3:
  n^3 = 1·n^3
  So, for c = 1, n₀ = 0, n^3 ≥ c·n^3. Thus, n^3 ∈ Ω(n^3).
Finally, n^3 ∈ Θ(n^3).
Solutions
Prove that 3n^3 − n^2 + 10 ≤ c·n^3:
  3n^3 − n^2 + 10 = 3n^3 − (n^2 − 10)
                  ≤ 3n^3   (for n^2 > 10)
  So, for c = 3, n₀ = 4, 3n^3 − n^2 + 10 ≤ c·n^3. Thus, 3n^3 − n^2 + 10 ∈ O(n^3).
Prove that 3n^3 − n^2 + 10 ≥ c·n^3:
  3n^3 − n^2 + 10 ≥ 3n^3 − n^3 + 10   (since n^2 ≤ n^3 for n ≥ 1)
                  = 2n^3 + 10
                  ≥ 2n^3
  So, for c = 2, n₀ = 1, 3n^3 − n^2 + 10 ≥ c·n^3. Thus, 3n^3 − n^2 + 10 ∈ Ω(n^3).
Finally, 3n^3 − n^2 + 10 ∈ Θ(n^3).
Solutions
Prove that n^3/2 + 100n^2 ≤ c·n^3:
  n^3/2 + 100n^2 ≤ n^3/2 + 100n^3   (since n ≥ 1)
                 ≤ 101n^3
  So, for c = 101, n₀ = 1, n^3/2 + 100n^2 ≤ c·n^3. Thus, n^3/2 + 100n^2 ∈ O(n^3).
Prove that n^3/2 + 100n^2 ≥ c·n^3:
  n^3/2 + 100n^2 ≥ n^3/2 = (1/2)·n^3
  So, for c = 1/2, n₀ = 1, n^3/2 + 100n^2 ≥ c·n^3. Thus, n^3/2 + 100n^2 ∈ Ω(n^3).
Finally, n^3/2 + 100n^2 ∈ Θ(n^3).
Solutions
Prove that log₂ n ≤ c₁·log₁₀ n and log₂ n ≥ c₂·log₁₀ n:
  log₂ n = log₁₀ n / log₁₀ 2 = (1/log₁₀ 2)·log₁₀ n
  So, for c₁ = c₂ = 1/log₁₀ 2 and n₀ = 1, log₂ n = c₁·log₁₀ n.
  As such, we have both log₂ n ≤ c₁·log₁₀ n and log₂ n ≥ c₂·log₁₀ n.
  Thus, log₂ n ∈ Θ(log₁₀ n).
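The constant factor can also be observed numerically (a minimal Python check; the ratio is always 1/log₁₀ 2 ≈ 3.32):

from math import log2, log10

# log2(n) / log10(n) equals the constant 1 / log10(2) for every n > 1
for n in (10, 1000, 10**6):
    print(n, log2(n) / log10(n))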
Exercise
for i = 1 to N
    print i
    j = i
    while j > 1
        j = ceil(j / 2)
        print "," j
    print "\n"
• Find (and prove) a tight bound for the complexity of the algorithm above
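For reference, a direct Python transliteration of the pseudocode above (the function name is chosen here for illustration):

from math import ceil

def print_halving_sequences(n):
    # Mirrors the pseudocode: for each i, print i followed by the
    # sequence obtained by repeatedly halving (rounding up) until 1
    for i in range(1, n + 1):
        print(i, end="")
        j = i
        while j > 1:
            j = ceil(j / 2)
            print(f",{j}", end="")
        print()  # the trailing "\n"

print_halving_sequences(8)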
Solution
• The inner loop runs ⌈lg i⌉ times for the i-th iteration of the outer loop
• The total number of iterations of the inner loop is:
  ⌈lg 1⌉ + ⌈lg 2⌉ + ⌈lg 3⌉ + ⋯ + ⌈lg n⌉
Upper bound:
  lg 1 + lg 2 + lg 3 + ⋯ + lg n ≤ lg n + lg n + lg n + ⋯ + lg n = n lg n
  (each ⌈lg i⌉ ≤ lg i + 1, so the ceilings add at most n in total, which does not change the asymptotic bound)
Lower bound:
  lg 1 + ⋯ + lg(n/2 − 1) + lg(n/2) + ⋯ + lg n
    ≥ 0 + ⋯ + 0 + lg(n/2) + ⋯ + lg(n/2)
    ≥ ((n − 1)/2) lg(n/2) = ((n − 1)(lg n − 1))/2 = (n lg n − n − lg n + 1)/2 ∈ Ω(n lg n)
• Therefore the total number of inner-loop iterations, and hence the running time, is Θ(n lg n)
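The count can be checked empirically with a minimal Python sketch (the helper name is illustrative) that totals the inner-loop iterations and compares them with n lg n:

from math import ceil, log2

def inner_iterations(n):
    # Total number of times the inner while-loop body executes
    total = 0
    for i in range(1, n + 1):
        j = i
        while j > 1:
            j = ceil(j / 2)
            total += 1
    return total

for n in (10, 100, 1000, 10000):
    print(n, inner_iterations(n), round(n * log2(n)))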
Exercise
• Assume an algorithm that tries to guess an alphanumeric password of length n by brute force
• The algorithm tests all possible combinations of letters (26 uppercase, 26 lowercase) and digits (10) until a correct combination is found
• For simplicity, assume that the length of the password is known
• Find a tight bound for the complexity of the algorithm above (a brute-force sketch follows the solution below)
• Solution: Θ(62^n)
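A minimal brute-force sketch in Python (the secret password and the success check below are hypothetical placeholders; itertools.product enumerates all 62^n length-n candidates):

from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits

ALPHABET = ascii_lowercase + ascii_uppercase + digits  # 26 + 26 + 10 = 62 symbols

def brute_force(is_correct, n):
    # Tries every length-n combination; in the worst case all 62^n are tested
    for candidate in product(ALPHABET, repeat=n):
        guess = "".join(candidate)
        if is_correct(guess):
            return guess
    return None

secret = "aB3"  # hypothetical password, for illustration only
print(brute_force(lambda guess: guess == secret, len(secret)))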
Lower Bound for Comparison Sorting Algorithms
Comparison Sort
• Comparison sort algorithms: final sorted order is determined only by comparisons
between pairs of elements
• Examples:
• Bubble Sort
• Insertion Sort
• Selection Sort
• Merge Sort
• Heap Sort
• Quick Sort
• Most comparison sort algorithms are 𝑂(𝑛 log 𝑛) or worse
• Can we do better?
Decision Tree
• Assume a comparison sort algorithm
• Input: (a₁, a₂, a₃, …, aₙ)
• Output: a permutation (b₁, b₂, b₃, …, bₙ) of the input where bᵢ ≤ bᵢ₊₁ for all i ∈ {1, …, n − 1}
• Decision tree: a tree that captures every possible execution; each internal node is a comparison of two elements and each leaf is one possible output permutation
Decision Tree – Insertion Sort with 𝑁 = 3
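Each of the 3! = 6 input orders follows its own root-to-leaf path in the decision tree for n = 3. A minimal Python sketch (the helper name is illustrative) that records the comparisons insertion sort performs on each input order:

from itertools import permutations

def insertion_sort_trace(a):
    # Sorts a copy of a, recording every comparison between a pair of elements
    a = list(a)
    comparisons = []
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons.append((a[j], key))  # one comparison: a[j] vs key
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comparisons

for p in permutations((1, 2, 3)):
    sorted_p, comps = insertion_sort_trace(p)
    print(p, "->", sorted_p, "via comparisons", comps)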
Decision Tree
• For every input of size 𝑛, there are 𝑛! (𝑛 factorial) possible permutations
• Each of these permutations is represented by one leaf
• The path from the root to a node corresponds to the sequence of comparisons to
reach that node
• Length of this path: number of comparisons
• Height of the tree: worst-case number of comparisons
Lower bound for comparison sort
Theorem: Every comparison sort requires Ω(n log n) comparisons in the worst case.
Proof: The worst-case number of comparisons of any comparison sort is the height of its decision tree. The tree has n! leaves, and a binary tree of height h has at most 2^h leaves. Therefore:
  n! ≤ 2^h
  h ≥ log₂(n!)
    = log₂(n · (n − 1) · ⋯ · 3 · 2 · 1)
    = log₂ n + log₂(n − 1) + ⋯ + log₂ 3 + log₂ 2 + log₂ 1
    ≥ (n/2) log₂(n/2)   (each of the first n/2 terms is at least log₂(n/2))
  h ∈ Ω(n log n) ∎
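To see the bound concretely, a small Python sketch (a comparison-counting merge sort written here for illustration) compares the number of comparisons actually performed against the ⌈log₂(n!)⌉ lower bound:

import random
from math import ceil, lgamma, log

def merge_sort_count(a):
    # Returns (sorted copy of a, number of element comparisons performed)
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, cl = merge_sort_count(a[:mid])
    right, cr = merge_sort_count(a[mid:])
    merged, comps = [], cl + cr
    i = j = 0
    while i < len(left) and j < len(right):
        comps += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comps

def log2_factorial(n):
    # log2(n!) computed via the log-gamma function to avoid huge integers
    return lgamma(n + 1) / log(2)

for n in (8, 64, 1024):
    _, comps = merge_sort_count(random.sample(range(10 * n), n))
    print(n, comps, ceil(log2_factorial(n)))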
Asymptotic Notation using Limits
Using limits
• Suppose that lim_{n→∞} f(n)/g(n) exists (possibly +∞). Then:
  • If lim_{n→∞} f(n)/g(n) = 0, then f(n) ∈ o(g(n))
  • If lim_{n→∞} f(n)/g(n) = +∞, then f(n) ∈ ω(g(n))
  • If lim_{n→∞} f(n)/g(n) ∈ ℝ⁺ (a positive constant), then f(n) ∈ Θ(g(n))
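These limits can be checked mechanically; a minimal sketch using SymPy (assuming SymPy is available) applied to two functions from the earlier examples:

import sympy as sp

n = sp.symbols("n", positive=True)

# 3n^3 - n^2 + 10 vs n^3: the limit is the positive constant 3, so Theta(n^3)
print(sp.limit((3 * n**3 - n**2 + 10) / n**3, n, sp.oo))

# 7n^2 + 5 vs n^3/6: the limit is 0, so 7n^2 + 5 is in o(n^3/6)
print(sp.limit((7 * n**2 + 5) / (n**3 / 6), n, sp.oo))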
Exercise
• Compare asymptotically: √n and logₑ n
• Compare asymptotically: 2^(3^n) and 3^(2^n)
Exercise - solution
lim_{n→+∞} √n / logₑ n = lim_{n→+∞} ((1/2)·n^(−1/2)) / (1/n)   (L'Hôpital's rule)
                       = lim_{n→+∞} (1/2)·(n/√n)
                       = lim_{n→+∞} √n/2
                       = +∞
Therefore √n ∈ ω(log n)
Note: L'Hôpital's rule can be applied because both √n and logₑ n tend to +∞.
Exercise - solution
lim_{n→+∞} 2^(3^n) / 3^(2^n) = lim_{n→+∞} 2^(3^n) / 2^((log₂ 3)·2^n)
                             = lim_{n→+∞} 2^(3^n − (log₂ 3)·2^n)
The exponent diverges:
lim_{n→+∞} (3^n − (log₂ 3)·2^n) = lim_{n→+∞} 2^n·((3/2)^n − log₂ 3)
                                = (lim_{n→+∞} 2^n)·(lim_{n→+∞} ((3/2)^n − log₂ 3)) = +∞
Therefore lim_{n→+∞} 2^(3^n) / 3^(2^n) = +∞, and so 2^(3^n) ∈ ω(3^(2^n))
Greedy Algorithms – Interval Scheduling
Greedy Algorithms
• An algorithm is greedy if it builds the output by repeatedly making a locally optimal choice
• Not all problems can be solved optimally with greedy algorithms
• Based on a heuristic: it assumes that local optimality will produce global optimality
Interval Scheduling Problem
• Consider the following conference schedule:
• 1:00-4:00 Very cool topic
• 1:30-2:00 Another cool topic
• 4:00-5:00 Interesting topic
• 4:30-5:30 Social interaction you don’t want to miss
• What is the maximum number of activities you can participate in?
• What (greedy) algorithm can be used to find an optimal solution?
Definitions
• Given a set S = { (s₁, e₁), (s₂, e₂), (s₃, e₃), …, (sₙ, eₙ) } of intervals
• Start and end times (sᵢ and eᵢ) are represented as integers for simplicity
• A subset T of S is compatible if no two distinct elements of T overlap (see the sketch below):
  ∀ (s, e) ∈ T, ∄ (s′, e′) ∈ T ∖ {(s, e)} such that s′ < e ∧ e′ > s
• A compatible subset T is optimal if there is no other compatible subset with more elements
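A direct Python rendering of the compatibility test above (quadratic for clarity; the function name is illustrative, and times are given as minutes after noon so that they are integers):

def is_compatible(T):
    # T is a collection of (start, end) intervals; returns True if no two
    # distinct intervals (s, e) and (s', e') satisfy s' < e and e' > s
    intervals = list(T)
    for i, (s, e) in enumerate(intervals):
        for k, (s2, e2) in enumerate(intervals):
            if i != k and s2 < e and e2 > s:
                return False
    return True

# Two pairs from the conference schedule above
print(is_compatible([(60, 240), (240, 300)]))  # 1:00-4:00 and 4:00-5:00: True (back to back)
print(is_compatible([(60, 240), (90, 120)]))   # 1:00-4:00 and 1:30-2:00: False (overlap)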