Lecture 5:
Linear Time Sorting
Shang-Hua Teng
Sorting
• Input: Array A[1...n] of elements in arbitrary order; array size n
• Output: Array A[1...n] of the same elements, in non-decreasing order
How fast can we sort?
• Can we do better than O(n log n)?
• Well… it depends?
• It depends on the model of computation!
Comparison Sorting
• Only comparisons are used to determine the
order of elements
• Examples
– Insertion sort
– Merge sort
– Quick sort
– Heapsort
Decision Trees
• Example: 3 element sorting (a1, a2, a3)
a1? a2
“Less than” leads to the left branch
“Greater than” leads to the right branch
Decision Trees
• Example: 3 element sort (a1, a2, a3)
a1 ? a2
  <: a2 ? a3
    <: a1, a2, a3
    >: a1 ? a3
      <: a1, a3, a2
      >: a3, a1, a2
  >: a1 ? a3
    <: a2, a1, a3
    >: a2 ? a3
      <: a2, a3, a1
      >: a3, a2, a1
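The decision tree for three elements can be read as nested comparisons. The sketch below is one way to write it in Python; the function name `sort3` and the choice to send ties down the "less than" branch are assumptions made for illustration, not part of the slides.

```python
from itertools import permutations

def sort3(a1, a2, a3):
    """Follow the 3-element decision tree.

    Returns (sorted tuple, number of comparisons used).
    Ties are sent down the "less than" branch (an assumption).
    """
    if a1 <= a2:                      # root: a1 ? a2
        if a2 <= a3:                  # a2 ? a3
            return (a1, a2, a3), 2
        if a1 <= a3:                  # a1 ? a3
            return (a1, a3, a2), 3
        return (a3, a1, a2), 3
    if a1 <= a3:                      # a1 ? a3
        return (a2, a1, a3), 2
    if a2 <= a3:                      # a2 ? a3
        return (a2, a3, a1), 3
    return (a3, a2, a1), 3

# Each of the 3! = 6 input orders reaches its own leaf.
results = [sort3(*p) for p in permutations((1, 2, 3))]
worst = max(count for _, count in results)   # height of the tree
```

Here `worst` is the length of the longest root-to-leaf path, which is the worst-case number of comparisons for this tree.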
Decision Trees
• Each leaf contains a permutation, indicating that this ordering of the elements has been established
• A decision tree can model a comparison sort
• It is the tree of all possible instruction traces
• Running time = length of the path taken
• Worst-case running time = length of the longest path (height of the tree)
Lower Bound for Comparison Sorting
• Theorem: Any decision tree that sorts n elements has height Ω(n lg n)
• Proof: There are at least n! leaves, one for each permutation of the input
• A binary tree of height h has no more than 2^h leaves, so 2^h ≥ n!
• By Stirling's formula, n! ≥ (n/e)^n, hence
$$h \ge \lg(n!) \ge \lg\left(\frac{n}{e}\right)^{n} = n \lg n - n \lg e = \Omega(n \lg n)$$
Optimal Comparison Sorting Algorithms
• Merge sort
• Quick sort (in expectation; its worst case is quadratic)
• Heapsort
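The Ω(n lg n) bound that these algorithms match can be checked numerically. The following is a small sketch, with hypothetical helper names `comparison_lower_bound` and `stirling_estimate`, that compares the exact value ⌈lg(n!)⌉ with the Stirling estimate n lg n − n lg e.

```python
import math

def comparison_lower_bound(n):
    """Smallest h with 2**h >= n!, i.e. ceil(lg(n!)), computed exactly."""
    # For an integer m >= 1, ceil(log2(m)) equals (m - 1).bit_length().
    return (math.factorial(n) - 1).bit_length()

def stirling_estimate(n):
    """n lg n - n lg e, the lower estimate of lg(n!) from n! >= (n/e)^n."""
    return n * math.log2(n) - n * math.log2(math.e)

# Exact bound next to the estimate for a few input sizes.
table = [(n, comparison_lower_bound(n), stirling_estimate(n))
         for n in (3, 10, 100)]
```

For n = 3 the exact bound is 3 comparisons, which is the height of the three-element decision tree.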
Sorting in Linear Time
• Counting sort
– Suppose we only have k keys {1,2,…,k}
– Θ(n+k) time
• Radix sort
– Can use counting sort
– Place by place (digit by digit) sorting
– IBM Human machine interface
– If elements have d places (digits), then we need d calls to counting sort
Counting Sort
• Count how many occurrences there are of each key
• Allocate the proper amount of space for each key, from small to large
• For each element A[j] in the input, count the number of elements that are less than it
• A[i] is “less than” A[j] if
– A[i] < A[j] or
– A[i] = A[j] and i < j
Example of Counting Sort
• A = [4 1 3 4 3], we have four keys
• C = [1 0 2 2]
• Prefix sum of C
– Prefix_sum(C) = [1 1 3 5], with C[0] = 0
– First key goes from 1 to 1
– Second key goes from 2 to 1 (empty)
– Third key goes from 2 to 3
– Fourth key goes from 4 to 5
• Sorted array is [1 3 3 4 4]
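The counts, prefix sums, and output ranges of this example can be reproduced with a few lines of Python. This is a sketch only; the variable names (`counts`, `prefix`, `ranges`) are illustrative, and index 0 of the arrays is kept as an unused slot so that keys 1 to 4 index directly.

```python
from itertools import accumulate

A = [4, 1, 3, 4, 3]
k = 4                                  # keys are 1..k

counts = [0] * (k + 1)                 # counts[0] stays 0 (no key 0)
for x in A:
    counts[x] += 1                     # counts[1:] becomes [1, 0, 2, 2]

prefix = list(accumulate(counts))      # prefix[1:] becomes [1, 1, 3, 5]

# Key i occupies output positions prefix[i-1] + 1 .. prefix[i] (1-based).
# A range whose start exceeds its end is empty.
ranges = {i: (prefix[i - 1] + 1, prefix[i]) for i in range(1, k + 1)}
```

The entry `ranges[2] == (2, 1)` is the empty range for the second key, since key 2 does not occur in A.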
Counting Sort
• Counting_sort (A, B, k), where B is the output array
– for i = 0 to k
• do C[i] ← 0 (zero out the counting array)
– for j = 1 to length(A)
• do C[A[j]] ← C[A[j]] + 1
(C[i] now contains the number of elements equal to i)
– for i = 1 to k
• do C[i] ← C[i] + C[i − 1]
(C[i] now contains the number of elements less than or equal to i)
– for j = length(A) downto 1
• do B[C[A[j]]] ← A[j]
C[A[j]] ← C[A[j]] − 1
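The pseudocode above translates almost line by line into Python. The version below is a minimal sketch that assumes integer keys in 0..k and uses 0-based indexing, so the counter is decremented before the element is placed rather than after.

```python
def counting_sort(A, k):
    """Sort a list of integers in the range 0..k; returns a new list."""
    C = [0] * (k + 1)                  # zero out the counting array
    for x in A:
        C[x] += 1                      # C[i] = number of elements equal to i
    for i in range(1, k + 1):
        C[i] += C[i - 1]               # C[i] = number of elements <= i
    B = [None] * len(A)                # output array
    for x in reversed(A):              # scan from the last element down
        C[x] -= 1                      # 0-based: decrement, then place
        B[C[x]] = x
    return B

sorted_example = counting_sort([4, 1, 3, 4, 3], k=4)
```

With the example array from the earlier slide, `sorted_example` is `[1, 3, 3, 4, 4]`.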
Stable sorting
• Stable: after sorting, elements with the same value appear in the output array in the same order as they do in the input array
• An important property of counting sort
– It is stable
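Stability is easiest to see when each key carries a tag. The sketch below sorts (key, tag) pairs with counting sort and exposes a hypothetical `backward` flag: scanning the input from the end keeps equal keys in input order, while scanning from the front with the same placement rule reverses them.

```python
def counting_sort_pairs(records, k, backward=True):
    """Counting sort of (key, tag) pairs with keys in 0..k.

    backward=True scans the input from the end (stable);
    backward=False scans from the front, which reverses equal keys.
    """
    C = [0] * (k + 1)
    for key, _ in records:
        C[key] += 1
    for i in range(1, k + 1):
        C[i] += C[i - 1]
    B = [None] * len(records)
    order = reversed(records) if backward else records
    for rec in order:
        C[rec[0]] -= 1
        B[C[rec[0]]] = rec
    return B

records = [(4, "a"), (1, "b"), (3, "c"), (4, "d"), (3, "e")]
stable = counting_sort_pairs(records, k=4)
unstable = counting_sort_pairs(records, k=4, backward=False)
```

In `stable`, the pair tagged "c" precedes "e" and "a" precedes "d", as in the input; in `unstable` both pairs of equal keys come out swapped.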
Complexity of Counting Sort
• Θ(n+k): Θ(k) to zero and prefix-sum the counting array, Θ(n) to count and place the elements
Radix Sort
Input   After 1s digit   After 10s digit   After 100s digit
329     720              720               329
457     355              329               355
657     436              436               436
839     457              839               457
436     657              355               657
720     329              457               720
355     839              657               839
Correctness
• Induction on digit position
– Assume numbers are sorted by low-order t-1 digits
– Sort on digit t
• Two numbers that differ in digit t are correctly sorted
• Two numbers that are equal in digit t are put in the same order as in the input, which is the correct order by the induction hypothesis
• Spreadsheets use stable sort, so you can try this by hand if you have a spreadsheet
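The induction can be followed pass by pass in code. The sketch below is one possible least-significant-digit radix sort built on a stable counting sort; the function names and the `base` parameter are illustrative assumptions, and it handles non-negative integers only.

```python
def sort_by_digit(nums, t, base=10):
    """Stable counting sort on digit t (t = 0 is the lowest-order digit)."""
    def digit(x):
        return (x // base**t) % base
    C = [0] * base
    for x in nums:
        C[digit(x)] += 1
    for i in range(1, base):
        C[i] += C[i - 1]
    B = [None] * len(nums)
    for x in reversed(nums):           # backward scan keeps the sort stable
        C[digit(x)] -= 1
        B[C[digit(x)]] = x
    return B

def radix_sort(nums, d, base=10):
    """Sort numbers with d digits; returns the list after each pass."""
    passes = []
    for t in range(d):                 # low-order digit first
        nums = sort_by_digit(nums, t, base)
        passes.append(nums)
    return passes

passes = radix_sort([329, 457, 657, 839, 436, 720, 355], d=3)
```

The three entries of `passes` are the orderings after sorting on the 1s, 10s, and 100s digits of the slide example; the last entry is the fully sorted list.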
Time Complexity of Radix sort
• Sort n words with b bits each
– b passes of counting sort on 1-bit digits
– Or b/2 passes … 2-bit digits
– Or b/r passes … r-bit digits
$$T(n, b) = \Theta\left(\frac{b}{r}\,\left(n + 2^{r}\right)\right)$$
• r = log n (so that 2^r = n)
$$T(n, b) = \Theta\left(\frac{bn}{\log n}\right)$$
• Keys from {1,…,n^d} need Θ(dn) time, since then b = d log n
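The trade-off in the choice of r can be made concrete by evaluating the cost expression (b/r)(n + 2^r) directly. The sketch below uses a hypothetical helper `radix_cost` that rounds the number of passes up and ignores constant factors; the values n = 2^16 and b = 32 are chosen only for illustration.

```python
import math

def radix_cost(n, b, r):
    """Cost model: ceil(b / r) passes, each costing n + 2**r."""
    return math.ceil(b / r) * (n + 2**r)

n, b = 2**16, 32                       # illustrative sizes, lg n = 16
costs = {r: radix_cost(n, b, r) for r in (1, 2, 8, 16, 32)}
```

Very small digits need many passes, and very large digits make the 2^r term dominate; r = lg n = 16 gives 2 passes of cost 2n each in this model.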