
Divide and Conquer
• Divide the problem into a number of subproblems
– There must be base case (to stop recursion).
• Conquer (solve) each subproblem recursively
• Combine (merge) solutions to subproblems into a
solution to the original problem
Example: Find the MAX and MIN
• Obvious strategy (Needs 2n - 3 compares)
– Find MAX (n - 1 compares)
– Find MIN of the remaining elements (n - 2)
• Nontrivial strategy
– Split the array in half
– Find the MAX and MIN of both halves
– Compare the 2 MAXes and compare the 2 MINs to get
overall MAX and MIN.
• In this example, we only consider the number of
comparisons and ignore all other operations.
procedure mm(i, j: integer; var lo, hi: integer);
{ finds the MIN (lo) and MAX (hi) of a[i..j] }
var m, min1, max1, min2, max2: integer;
begin
  if i = j then
    begin lo := a[i]; hi := a[i] end
  else if i = j - 1 then
    begin
      if a[i] < a[j] then
        begin lo := a[i]; hi := a[j] end
      else
        begin lo := a[j]; hi := a[i] end
    end
  else
    begin
      m := (i + j) div 2;
      mm(i, m, min1, max1);
      mm(m + 1, j, min2, max2);
      lo := MIN(min1, min2);
      hi := MAX(max1, max2)
    end
end;
Analysis
• Counting comparisons in mm gives T(1) = 0, T(2) = 1, and T(n) = 2T(n/2) + 2 for n > 2.
• Solving the above, we have T(n) = 3n/2 – 2 if n is a power of 2.
• This can be shown to be a lower bound.
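The comparison count can be checked with a small Python sketch of the mm procedure. This is an illustrative translation, not the slides' own code: the name max_min and the returned comparison counter are assumptions added here.

```python
def max_min(a, i, j):
    """Return (lo, hi, comparisons) for a[i..j], with inclusive bounds."""
    if i == j:
        return a[i], a[i], 0              # one element: no comparison
    if i == j - 1:
        if a[i] < a[j]:                   # two elements: one comparison
            return a[i], a[j], 1
        return a[j], a[i], 1
    m = (i + j) // 2
    lo1, hi1, c1 = max_min(a, i, m)       # conquer left half
    lo2, hi2, c2 = max_min(a, m + 1, j)   # conquer right half
    lo = lo1 if lo1 < lo2 else lo2        # combine: compare the 2 MINs
    hi = hi1 if hi1 > hi2 else hi2        # combine: compare the 2 MAXes
    return lo, hi, c1 + c2 + 2


data = [7, 2, 9, 4, 1, 8, 3, 6]
print(max_min(data, 0, len(data) - 1))    # (1, 9, 10); 3*8/2 - 2 = 10
```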
Note
• More accurately, it should be ⌈3n/2⌉ – 2.
Balancing
• It is generally best to divide problems into subproblems
of roughly EQUAL size. Very roughly, you want binary
search instead of linear search. Less roughly, if in the
MAX-MIN problem, we divide the set by putting one
element in subset 1 and the rest in subset 2, the execution
tree would be maximally unbalanced, and
2(n – 2) + 1 = 2n – 3
comparisons would be needed, the same as the obvious strategy.
Mergesort
• Obvious way to sort n numbers (selection sort):
– Find the minimum (or maximum)
– Sort the remaining numbers recursively
• Analysis: Requires (n-1) + (n-2) + ... + 1
= n(n-1)/2 = Θ(n^2) comparisons.
• Clearly this method does not use the balancing
heuristic.
A Divide and Conquer Solution
• The following requires only Θ(n log n) comparisons.
Mergesort
1. Divide: divide the given n-element array A into 2
subarrays of n/2 elements each
2. Conquer: recursively sort the two subarrays
3. Combine: merge 2 sorted subarrays into 1 sorted
array
• Analysis: T(n) = Θ(1) + 2T(n/2) + Θ(n) = Θ(n log n)
The Pseudocode
procedure mergesort(i, j: integer);
var m: integer;
begin
  if i < j then
    begin
      m := (i + j) div 2;
      mergesort(i, m);
      mergesort(m + 1, j);
      merge(i, m, j)
    end
end;
Illustration (figure not reproduced)
The Merging Process
• merge(x, y, z: integer) uses
another array for temporary storage.
• Merging segments of sizes m and n takes m + n – 1
compares in the worst case.
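A minimal Python sketch of mergesort and the merging step follows. It is a simplified version: it returns new lists instead of merging array segments in place through a temporary array, and the comparison counter is an assumption added to illustrate the m + n – 1 bound.

```python
def merge(left, right):
    """Merge two sorted lists; return (merged list, number of comparisons)."""
    merged, comparisons = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged, comparisons


def merge_sort(a):
    """Return (sorted copy of a, total number of comparisons)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, c_left = merge_sort(a[:mid])      # conquer
    right, c_right = merge_sort(a[mid:])
    merged, c_merge = merge(left, right)    # combine
    return merged, c_left + c_right + c_merge


print(merge([1, 3, 5], [2, 4, 6]))   # ([1, 2, 3, 4, 5, 6], 5); worst case 3 + 3 - 1
```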
Summary
function D&C(P: problem): solution;    { input: P; output: solution }
  if size(P) is small enough then S = solve(P)
  else
    divide P into P1, P2, P3, …, Pk;
    S1 = D&C(P1); S2 = D&C(P2); …; Sk = D&C(Pk);
    S = merge(S1, S2, S3, …, Sk);
  return(S);
Summary
• Divide and Conquer with Balancing is a useful
technique if the subproblems to be solved are
independent (no redundant computations).
• Also, the dividing and merging phases must be efficient.
1. The Fibonacci problem was an example where the
subproblems were not independent.
2. Usually, either dividing or merging the subproblems
will be trivial.
3. The problem is usually, but not always, divided into two
parts.
• Divide into ONE part: Binary search.
• Divide into > 2 parts: Critical path problem.
Two Dimensional Search
• You are given an m × n matrix of numbers A, sorted in
increasing order within rows and within columns.
Assume m = O(n). Design an algorithm that finds the
location of an arbitrary value x, in the matrix or report
that the item is not present. Is your algorithm optimal?
• How about probing the middle of the matrix?
– It seems we can eliminate 1/4 of the data with one
comparison, which yields 3 subproblems of size about
1/4 of the original problem.
– Is this approach optimal? What is the recurrence in this
case?
T(n) = 3T(n/4) + O(1) ?
Well, It is Wrong!
• T(n) = 3T(n/4) + O(1) is not correct, because the
subproblems are of size n/2.
• The correct recurrence for the solution is
T(n) = 3T(n/2) + O(1)  ⇒  T(n) = O((3n^(log2 3) – 1)/2) = O(n^(log2 3))
Illustration of Idea
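[Figure: x is compared with a matrix entry; the outcomes >, <, and = decide which part of the matrix remains to be searched]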
A Θ(n) algorithm
1. c = n, r = 1
2. if c = 1 or r = m then use binary search to locate x.
3. compare x and A[r, c]:
• x = A[r, c] -- report item found in position (r, c).
• x > A[r, c] -- r = r + 1; goto step 2
• x < A[r, c] -- c = c - 1; goto step 2.
At most m + n comparisons are required.
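The walk from the top-right corner can be sketched in Python as follows. This is a simplified variant under stated assumptions: indices are 0-based, the matrix is a list of rows, and the walk simply continues to the border instead of switching to binary search in the last row or column. The name staircase_search is hypothetical.

```python
def staircase_search(A, x):
    """Return (row, col) of x in the sorted matrix A, or None if x is absent."""
    if not A or not A[0]:
        return None
    m, n = len(A), len(A[0])
    r, c = 0, n - 1                # start at the top-right corner
    while r < m and c >= 0:
        if A[r][c] == x:
            return (r, c)
        if x > A[r][c]:
            r += 1                 # everything left of A[r][c] in row r is too small
        else:
            c -= 1                 # everything below A[r][c] in column c is too large
    return None


A = [[1, 4, 7],
     [2, 5, 8],
     [3, 6, 9]]
print(staircase_search(A, 6))      # (2, 1)
print(staircase_search(A, 10))     # None
```

Each step discards one row or one column, which is where the m + n bound comes from.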
Selection
• The Problem: Given a sequence S of n elements
and an integer k, determine the kth smallest
element in S.
• Special cases:
– k = 1, or k = n: Θ(n) time needed.
– k = n/2: trivial method -- O(n^2) steps
– sort then select -- O(n log n) steps
• Lower bound: Ω(n).
A Linear Time Algorithm
procedure SELECT(S, k)
1. if |S| < Q then sort S and return the kth element directly
else subdivide S into |S|/Q subsequences of Q elements
(with up to Q-1 leftover elements).
2. Sort each subsequence and determine its median.
3. Call SELECT recursively to find m, the median of the |S|/Q
medians found in step 2.
4. Create three subsequences L, E, and G of elements of
S smaller than, equal to, and larger than m, respectively.
5. if |L| ≥ k then call SELECT(L, k)
else if |L| + |E| ≥ k then return(m)
else SELECT(G, k - |L| - |E|).
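The five steps of SELECT can be sketched in Python roughly as below. Assumptions made for this sketch: k is 1-based with 1 ≤ k ≤ |S|, Q defaults to 5, leftover elements form one shorter final group, and the lower median is used for groups of even size.

```python
def select(S, k, Q=5):
    """Return the k-th smallest element of the list S (k is 1-based)."""
    if len(S) < Q:
        return sorted(S)[k - 1]                 # step 1: small input, sort directly
    # steps 1-2: split into groups of Q elements and take each group's median
    groups = [S[i:i + Q] for i in range(0, len(S), Q)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # step 3: m is the median of the medians, found recursively
    m = select(medians, (len(medians) + 1) // 2, Q)
    # step 4: three-way split around m
    L = [x for x in S if x < m]
    E = [x for x in S if x == m]
    G = [x for x in S if x > m]
    # step 5: recurse only into the part that contains the answer
    if len(L) >= k:
        return select(L, k, Q)
    if len(L) + len(E) >= k:
        return m
    return select(G, k - len(L) - len(E), Q)


data = list(range(100, 0, -1))
print(select(data, 50))    # 50
```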
Analysis of Selection
• Let t(n) be the running time of SELECT.
Step 1: O(n)
Step 2: O(n)
Step 3: t(n/Q)
Step 4: O(n)
Step 5: t(3n/4)
The Complexity
t(n) = t(n/Q) + t(3n/4) + O(n)
     = t(n/5) + t(3n/4) + O(n)    (taking Q = 5)
Since 1/5 + 3/4 < 1, we have t(n) = Θ(n).
• Recall that the solution of the recurrence relation
t(n) = t(pn) + t(qn) + cn, when 0 < p + q < 1, is Θ(n).
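One way to see the linear upper bound is the substitution method, sketched here. The constant d is introduced only for this argument, and the inductive hypothesis is that t(k) ≤ dk for all k < n.

```latex
\begin{align*}
t(n) &= t(pn) + t(qn) + cn \\
     &\le d\,pn + d\,qn + cn && \text{(inductive hypothesis)} \\
     &= d\,(p+q)\,n + cn \\
     &\le d\,n && \text{whenever } d \ge \frac{c}{1-(p+q)} .
\end{align*}
```

For SELECT with p = 1/5 and q = 3/4 we get 1 – (p + q) = 1/20, so any d ≥ 20c works. The matching lower bound is immediate from the cn term.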
Multiplying Two n Bit Numbers
• Here we are using the log-cost model, counting bits.
• The naive pencil-and-paper algorithm:
• This uses n^2 multiplications, (n-1)^2 additions (+ carries).
In fact, this is also divide and conquer.
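Karatsuba's algorithm, 1962: O(n^1.59)
• Let X and Y each contain n bits. Write
X = a b and Y = c d
(a and c are the high halves, b and d the low halves),
where a, b, c, d are n/2-bit numbers. Then
XY = (a·2^(n/2) + b)(c·2^(n/2) + d)
= ac·2^n + (ad + bc)·2^(n/2) + bd
• This breaks the problem up into 4 subproblems of size
n/2, which doesn't do us any good. Instead, Karatsuba
observed that
XY = (2^n + 2^(n/2))·ac + 2^(n/2)·(a-b)(d-c) + (2^(n/2) + 1)·bd
= ac·2^n + ac·2^(n/2) + 2^(n/2)·(a-b)(d-c) + bd·2^(n/2) + bd
= ac·2^n + (ad + bc)·2^(n/2) + bd
• Only 3 multiplications of n/2-bit numbers are needed (ac, bd,
and (a-b)(d-c)), so T(n) = 3T(n/2) + O(n) = O(n^(log2 3)).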
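The identity can be turned into a short Python sketch. Assumptions made here: inputs are non-negative Python integers, the cutoff of 16 for falling back to the built-in product is arbitrary, and the sign of (a-b)(d-c) is handled separately so that every recursive call receives non-negative arguments.

```python
def karatsuba(x, y):
    """Multiply non-negative integers x and y with 3 recursive products."""
    if x < 16 or y < 16:                 # small operands: multiply directly
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    a, b = x >> half, x & mask           # x = a * 2^half + b
    c, d = y >> half, y & mask           # y = c * 2^half + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    s, t = a - b, d - c                  # either difference may be negative
    sign = -1 if (s < 0) != (t < 0) else 1
    cross = sign * karatsuba(abs(s), abs(t))      # (a - b)(d - c)
    # ad + bc = (a - b)(d - c) + ac + bd
    return (ac << (2 * half)) + ((cross + ac + bd) << half) + bd


x, y = 12345678901234567890, 98765432109876543210
print(karatsuba(x, y) == x * y)          # True
```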
Polynomial multiplication
• Straightforward multiplication: O(n^2).
• Using D&C approach: O(n^1.59)
• Using FFT technique: O(n log n)
A D&C Approach
A Modified D&C Solution: O(n^1.59)
• Any idea for further improvement?
Matrix Multiplication
Complexity (on uniprocessor)
• Best known lower bound: Ω(n^2)
(assume m = Θ(n) and k = Θ(n))
• Straightforward algorithm: O(n^3).
• Strassen's algorithm: O(n^(log2 7)) = O(n^2.81).
• Best known sequential algorithm: O(n^2.376) ?
• The best algorithm for this problem is still open.
The Straightforward Method
• It takes O(mnk) = O(n3) time.
A D&P approach
Strassen's algorithm
•
• T(n) = 7T(n/2) +O(n2)
= O(nlog 7) = O(n2.81)
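The recurrence T(n) = 7T(n/2) + O(n^2) comes from replacing the 8 block products of the straightforward D&C scheme with 7. A minimal Python sketch using the standard Strassen products M1..M7 is given below. It assumes square matrices whose size is a power of 2, stored as lists of rows; the helper names add, sub, and split are illustrative.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(X):
    """Return the four quadrants X11, X12, X21, X22."""
    h = len(X) // 2
    return ([r[:h] for r in X[:h]], [r[h:] for r in X[:h]],
            [r[:h] for r in X[h:]], [r[h:] for r in X[h:]])


def strassen(A, B):
    """Multiply n x n matrices A and B, n a power of 2, with 7 recursive products."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [l + r for l, r in zip(C11, C12)]       # glue quadrants back together
    bottom = [l + r for l, r in zip(C21, C22)]
    return top + bottom


print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]
```

The additions and subtractions of quadrants account for the O(n^2) term.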
Quicksort
• Quicksort is a simple divide-and-conquer sorting
algorithm that outperforms Heapsort in practice.
• In order to sort A[p..r] do the following:
– Divide: rearrange the elements and generate two
subarrays A[p..q] and A[q+1..r] so that every element
in A[p..q] is at most every element in A[q+1..r];
– Conquer: recursively sort the two subarrays;
– Combine: nothing special is necessary.
• In order to partition, choose u = A[p] as a pivot, and
move everything < u to the left and everything > u to the
right.
Quicksort
• Although mergesort is O(n log n), it is quite inconvenient
for implementation with arrays, since we need space to
merge.
• In practice, the fastest sorting algorithm is Quicksort,
which uses partitioning as its main idea.
Partitioning in Quicksort
– How do we partition the array efficiently?
• choose partition element to be rightmost element
• scan from right for smaller element
• scan from left for larger element
• exchange
• repeat until pointers cross
Initial array (the partition element is the rightmost L):
Q U I C K S O R T I S C O O L
After the first exchange (Q and C swapped):
C U I C K S O R T I S Q O O L
After the second exchange (U and I swapped):
C I I C K S O R T U S Q O O L
Once the pointers cross, swap with the partitioning element. Array just before this final swap:
C I I C K S O R T U S Q O O L
Array after the final swap (the partition element L is now in its final position):
C I I C K L O R T U S Q O O S
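The partitioning steps traced above can be sketched in Python as follows. This is one possible reading of the steps rather than a definitive implementation: it assumes the rightmost element is the partition element and that both scans stop on keys equal to it. The names partition and quicksort are chosen for illustration.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] in place around a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i, j = lo, hi - 1
    while True:
        while a[i] < pivot:               # scan from left for a larger (or equal) element
            i += 1                        # stops at hi at the latest, since a[hi] == pivot
        while j > lo and a[j] > pivot:    # scan from right for a smaller (or equal) element
            j -= 1
        if i >= j:                        # pointers crossed
            break
        a[i], a[j] = a[j], a[i]           # exchange
        i += 1
        j -= 1
    a[i], a[hi] = a[hi], a[i]             # swap with the partitioning element
    return i


def quicksort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quicksort(a, lo, p - 1)
        quicksort(a, p + 1, hi)


letters = list("QUICKSORTISCOOL")
partition(letters, 0, len(letters) - 1)
print(" ".join(letters))                  # C I I C K L O R T U S Q O O S
```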
Analysis
• Worst-case: If A[1..n] is already sorted, then Partition
splits A[1..n] into A[1] and A[2..n] without changing the
order. If that happens, the running time C(n) satisfies:
C(n) = C(1) + C(n – 1) + Θ(n) = Θ(n^2)
• Best case: Partition keeps splitting the subarrays into
halves. If that happens, the running time C(n) satisfies:
C(n) ≈ 2 C(n/2) + Θ(n) = Θ(n log n)
Analysis
• Average case (for random permutation of n elements):
C(n) ≈ 1.38 n log n, which is about 38% higher than the
best case.
Comments
• Sorting the smaller subfile first keeps the stack size
at most O(log n). Do not stack right
subfiles of size < 2 in the recursive algorithm -- saves a factor
of 4.
• Use a different pivot selection, e.g. choose the pivot to be
the median of the first, last, and middle elements.
• Randomized-Quicksort: turn bad instances into good
instances by picking the pivot at random.
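Two of these comments, random pivot selection and handling the smaller subfile first, can be combined in a short Python sketch. The details are assumptions made for illustration: the partition is a simple single-scan variant rather than the two-pointer scheme, and the larger subfile is handled by looping instead of recursing, which keeps the recursion depth at O(log n).

```python
import random


def random_partition(a, lo, hi):
    """Partition a[lo..hi] around a randomly chosen pivot; return its final index."""
    r = random.randint(lo, hi)        # pick the pivot position uniformly at random
    a[r], a[hi] = a[hi], a[r]         # move the pivot to the right end
    pivot = a[hi]
    i = lo
    for k in range(lo, hi):
        if a[k] < pivot:
            a[i], a[k] = a[k], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def randomized_quicksort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        p = random_partition(a, lo, hi)
        if p - lo < hi - p:
            randomized_quicksort(a, lo, p - 1)   # recurse into the smaller left part
            lo = p + 1                           # continue with the larger right part
        else:
            randomized_quicksort(a, p + 1, hi)   # recurse into the smaller right part
            hi = p - 1                           # continue with the larger left part


data = list(range(20))                # already sorted: a bad case for a fixed pivot
randomized_quicksort(data)
print(data == list(range(20)))        # True
```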