Department of Computer and Information Science, School of Science, IUPUI

Mergesort

Dale Roberts, Lecturer
Computer Science, IUPUI
E-mail: [email protected]


Why Does It Matter?

Run time (nanoseconds) as a function of input size N:

  Time to solve a problem of size      1.3 N^3        10 N^2        47 N log2 N     48 N
    1,000                              1.3 seconds    10 msec       0.4 msec        0.048 msec
    10,000                             22 minutes     1 second      6 msec          0.48 msec
    100,000                            15 days        1.7 minutes   78 msec         4.8 msec
    million                            41 years       2.8 hours     0.94 seconds    48 msec
    10 million                         41 millennia   1.7 weeks     11 seconds      0.48 seconds

  Max size problem solved in one
    second                             920            10,000        1 million       21 million
    minute                             3,600          77,000        49 million      1.3 billion
    hour                               14,000         600,000       2.4 billion     76 billion
    day                                41,000         2.9 million   50 billion      1,800 billion

  N multiplied by 10,
  time multiplied by                   1,000          100           10+             10


Orders of Magnitude

  Seconds    Equivalent
  1          1 second
  10         10 seconds
  10^2       1.7 minutes
  10^3       17 minutes
  10^4       2.8 hours
  10^5       1.1 days
  10^6       1.6 weeks
  10^7       3.8 months
  10^8       3.1 years
  10^9       3.1 decades
  10^10      3.1 centuries
  ...        forever
  10^21      age of universe

  Meters Per Second    Imperial Units     Example
  10^-10               1.2 in / decade    Continental drift
  10^-8                1 ft / year        Hair growing
  10^-6                3.4 in / day       Glacier
  10^-4                1.2 ft / hour      Gastro-intestinal tract
  10^-2                2 ft / minute      Ant
  1                    2.2 mi / hour      Human walk
  10^2                 220 mi / hour      Propeller airplane
  10^4                 370 mi / min       Space shuttle
  10^6                 620 mi / sec       Earth in galactic orbit
  10^8                 62,000 mi / sec    1/3 speed of light

  Powers of 2:  2^10 ~ thousand,  2^20 ~ million,  2^30 ~ billion


Impact of Better Algorithms

Example 1: N-body simulation.
  Simulate gravitational interactions among N bodies (physicists want N = # atoms in universe).
  Brute force method: N^2 steps.
  Appel (1981): N log N steps, enables new research.

Example 2: Discrete Fourier Transform (DFT).
  Breaks down waveforms (sound) into periodic components; foundation of signal processing
  (CD players, JPEG, analyzing astronomical data, etc.).
  Grade school method: N^2 steps.
  Runge-König (1924), Cooley-Tukey (1965): FFT algorithm, N log N steps, enables new technology.


Mergesort

Mergesort (divide-and-conquer):
  Divide the array into two halves.
  Recursively sort each half.
  Merge the two halves to make a sorted whole.

  Example:  A L G O R I T H M S
  divide:   A L G O R | I T H M S
  sort:     A G L O R | H I M S T
  merge:    A G H I L M O R S T


Merging

Merge:
  Keep track of the smallest remaining element in each sorted half.
  Insert the smaller of the two elements into the auxiliary array.
  Repeat until done.

  [Animation frames: the sorted halves A G L O R and H I M S T are merged one element at a
  time into the auxiliary array, producing A G H I L M O R S T; once the first half is
  exhausted, the remaining elements of the second half (S, T) are copied straight over.]
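The steps above fit in a few lines of C. The sketch below is an illustration only, not Sedgewick's implementation (his Programs 8.2 and 8.3 appear on the next slides): it sorts the example letters A L G O R I T H M S with a plain two-pointer merge into an auxiliary array, and the array size and names are chosen just for this example.

  /* Minimal mergesort sketch for the example above (illustration only). */
  #include <stdio.h>
  #include <string.h>

  static char aux[64];                               /* scratch array for merging */

  static void merge(char a[], int left, int mid, int right)
  {
      int i = left, j = mid + 1, k;
      for (k = left; k <= right; k++) {              /* take the smaller front element */
          if      (i > mid)      aux[k] = a[j++];    /* first half exhausted  */
          else if (j > right)    aux[k] = a[i++];    /* second half exhausted */
          else if (a[j] < a[i])  aux[k] = a[j++];
          else                   aux[k] = a[i++];
      }
      for (k = left; k <= right; k++)                /* copy merged result back */
          a[k] = aux[k];
  }

  static void mergesort(char a[], int left, int right)
  {
      if (right <= left) return;                     /* 0 or 1 element: already sorted */
      int mid = (left + right) / 2;
      mergesort(a, left, mid);                       /* sort left half  */
      mergesort(a, mid + 1, right);                  /* sort right half */
      merge(a, left, mid, right);                    /* merge the two sorted halves */
  }

  int main(void)
  {
      char a[] = "ALGORITHMS";
      mergesort(a, 0, (int)strlen(a) - 1);
      printf("%s\n", a);                             /* prints AGHILMORST */
      return 0;
  }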
Implementing Mergesort

mergesort (see Sedgewick Program 8.3):

  Item aux[MAXN];                        /* uses scratch array */

  void mergesort(Item a[], int left, int right)
  {
    int mid = (right + left) / 2;
    if (right <= left) return;
    mergesort(a, left, mid);
    mergesort(a, mid + 1, right);
    merge(a, left, mid, right);
  }


Implementing Merge (Idea 0)

  void mergeAB(Item c[], Item a[], int N, Item b[], int M)
  {
    int i, j, k;
    for (i = 0, j = 0, k = 0; k < N+M; k++)
    {
      if (i == N) { c[k] = b[j++]; continue; }   /* a[] exhausted: take from b[] */
      if (j == M) { c[k] = a[i++]; continue; }   /* b[] exhausted: take from a[] */
      c[k] = (less(a[i], b[j])) ? a[i++] : b[j++];
    }
  }


Implementing Mergesort

merge (see Sedgewick Program 8.2):

  void merge(Item a[], int left, int mid, int right)
  {
    int i, j, k;
    /* copy to temporary array: left half in order, right half in reverse order */
    for (i = mid+1; i > left; i--) aux[i-1] = a[i-1];
    for (j = mid; j < right; j++) aux[right+mid-j] = a[j+1];
    /* merge the two sorted sequences back into a[] */
    for (k = left; k <= right; k++)
      if (ITEMless(aux[i], aux[j])) a[k] = aux[i++];
      else                          a[k] = aux[j--];
  }


Mergesort Demo

The demo is a dynamic representation of the algorithm in action, sorting an array a containing a permutation of the integers 1 through N. For each i, the array element a[i] is depicted as a black dot plotted at position (i, a[i]). Thus, the end result of each sort is a diagonal of black dots going from (1, 1) at the bottom left to (N, N) at the top right. Each time an element is moved, a green dot is left at its old position, so the moving black dots give a dynamic representation of the progress of the sort and the green dots give a history of the data-movement cost. The auxiliary array used in the merging operation is shown to the right of the array a[], going from (N+1, 1) to (2N, 2N).


Computational Complexity

Framework to study the efficiency of algorithms. Example: sorting.

  MACHINE MODEL = count fundamental operations (here, the number of comparisons).
  UPPER BOUND = cost of some algorithm that solves the problem (worst case): N log2 N, from mergesort.
  LOWER BOUND = proof that no algorithm can do better: N log2 N - N log2 e.
  OPTIMAL ALGORITHM: lower bound ~ upper bound. Mergesort is optimal.
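As a quick numeric sanity check on the two bounds just quoted (this snippet is not part of the original slides), the program below computes log2(N!) exactly for N = 1,000 and compares it with the Stirling estimate N log2 N - N log2 e used in the lower-bound argument on the following slides, as well as with mergesort's N log2 N upper bound.

  /* Illustration only: compare the exact information-theoretic bound log2(N!)
     with the Stirling estimate N log2 N - N log2 e and with N log2 N. */
  #include <stdio.h>
  #include <math.h>

  int main(void)
  {
      int N = 1000;
      double lg_factorial = 0.0;
      for (int k = 2; k <= N; k++)
          lg_factorial += log2((double)k);              /* log2(N!) = sum of log2 k */

      double stirling = N * log2((double)N) - N * log2(exp(1.0));
      printf("log2(N!)           = %.1f\n", lg_factorial);            /* roughly 8530 */
      printf("N lg N - N lg e    = %.1f\n", stirling);                /* roughly 8523 */
      printf("N lg N (mergesort) = %.1f\n", N * log2((double)N));     /* roughly 9966 */
      return 0;
  }

So for N = 1,000 any comparison sort needs roughly 8,500 comparisons in the worst case, while mergesort uses about 9,966; the two bounds differ only by the lower-order term N log2 e.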
Decision Tree

Decision tree for sorting three elements a1, a2, a3 (each internal node is a comparison, each leaf is an output ordering):

  a1 < a2 ?
    YES:  a2 < a3 ?
            YES:  print a1, a2, a3
            NO:   a1 < a3 ?
                    YES:  print a1, a3, a2
                    NO:   print a3, a1, a2
    NO:   a1 < a3 ?
            YES:  print a2, a1, a3
            NO:   a2 < a3 ?
                    YES:  print a2, a3, a1
                    NO:   print a3, a2, a1


Comparison Based Sorting Lower Bound

Theorem. Any comparison based sorting algorithm must use Ω(N log2 N) comparisons.

Proof. The worst case is dictated by the tree height h.
  There are N! different orderings, and one (or more) leaves corresponding to each ordering.
  A binary tree with N! leaves must have height

    h >= log2(N!)
      >= log2((N/e)^N)          (Stirling's formula)
      =  N log2 N - N log2 e

Food for thought. What if we don't use comparisons? Stay tuned for radix sort.


Mergesort Analysis

How long does mergesort take?
  Bottleneck = merging (and copying): merging two files of size N/2 requires N comparisons.
  T(N) = number of comparisons to mergesort N elements (to make the analysis cleaner, assume N is a power of 2).

    T(N) = 0                  if N = 1
    T(N) = 2 T(N/2) + N       otherwise
           (2 T(N/2) = sorting both halves, N = merging)

Claim. T(N) = N log2 N.
Note: the same number of comparisons for ANY file, even one that is already sorted.
We'll prove this several different ways to illustrate standard techniques.


Profiling Mergesort Empirically

prof.out (execution counts from sorting N = 1,000 elements; counts shown in angle brackets):

  void merge(Item a[], int left, int mid, int right)
  <999>{
    int i, j, k;
    for (<999>i = mid+1; <6043>i > left; <5044>i--)
      <5044>aux[i-1] = a[i-1];
    for (<999>j = mid; <5931>j < right; <4932>j++)
      <4932>aux[right+mid-j] = a[j+1];
    for (<999>k = left; <10975>k <= right; <9976>k++)
      if (<9976>ITEMless(aux[i], aux[j]))
        <4543>a[k] = aux[i++];
      else
        <5433>a[k] = aux[j--];
  <999>}

  void mergesort(Item a[], int left, int right)
  <1999>{
    int mid = <1999>(right + left) / 2;
    if (<1999>right <= left) return<1000>;
    <999>mergesort(a, left, mid);
    <999>mergesort(a, mid+1, right);
    <999>merge(a, left, mid, right);
  <1999>}

Striking feature: all the numbers are SMALL!
Number of comparisons: theory ~ N log2 N = 9,966; actual = 9,976.


Sorting Analysis Summary

Running time estimates (home PC executes 10^8 comparisons/second; supercomputer executes 10^12 comparisons/second):

  Insertion Sort (N^2)
    computer    thousand    million      billion
    home        instant     2.8 hours    317 years
    super       instant     1 second     1.6 weeks

  Mergesort (N log N)
    computer    thousand    million      billion
    home        instant     1 sec        18 min
    super       instant     instant      instant

  Quicksort (N log N)
    computer    thousand    million      billion
    home        instant     0.3 sec      6 min
    super       instant     instant      instant


Acknowledgements

Sorting methods are discussed in our Sedgewick text. Slides and demos are from our text's website at princeton.edu. Special thanks to Kevin Wayne for helping to prepare this material.


Extra Slides


Proof by Picture of Recursion Tree

    T(N) = 0                  if N = 1
    T(N) = 2 T(N/2) + N       otherwise
           (2 T(N/2) = sorting both halves, N = merging)

  Recursion tree: each level does N work in total.

    T(N)                                              N
    T(N/2)   T(N/2)                                   2 (N/2)     = N
    T(N/4)   T(N/4)   T(N/4)   T(N/4)                 4 (N/4)     = N
    ...                                               ...
    2^k nodes of T(N/2^k)                             2^k (N/2^k) = N
    ...                                               ...
    T(2) T(2) T(2)  ...  T(2) T(2)                    (N/2) (2)   = N

  The tree has log2 N levels, so the total is N log2 N.


Proof by Telescoping

Claim. T(N) = N log2 N (when N is a power of 2).

    T(N) = 0                  if N = 1
    T(N) = 2 T(N/2) + N       otherwise
           (2 T(N/2) = sorting both halves, N = merging)

Proof. For N > 1, divide both sides of the recurrence by N:

    T(N)/N = 2 T(N/2)/N + 1
           = T(N/2)/(N/2) + 1
           = T(N/4)/(N/4) + 1 + 1
             ...
           = T(N/N)/(N/N) + 1 + 1 + ... + 1      (log2 N ones)
           = log2 N

  Hence T(N) = N log2 N.
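A small program makes the telescoping result concrete (this check is not part of the original slides): iterating the recurrence T(N) = 2 T(N/2) + N over successive powers of two reproduces N log2 N exactly.

  /* Illustration only: iterate T(N) = 2 T(N/2) + N for N = 2, 4, 8, ...
     and compare against N * log2(N). */
  #include <stdio.h>

  int main(void)
  {
      long T = 0;      /* T(1) = 0 */
      int  lg = 0;     /* log2 of the current N */
      for (long N = 2; N <= (1L << 20); N *= 2) {
          T = 2 * T + N;                 /* T(N) = 2 T(N/2) + N */
          lg++;
          printf("N = %8ld   T(N) = %9ld   N*log2(N) = %9ld\n", N, T, (long)lg * N);
      }
      return 0;
  }

Every row prints the same value in both columns, matching the claim T(N) = N log2 N for powers of two.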
Mathematical Induction

Mathematical induction is a powerful and general proof technique in discrete mathematics. To prove a theorem true for all integers N >= 0:
  Base case: prove it is true for N = 0.
  Induction hypothesis: assume it is true for an arbitrary N.
  Induction step: show it is true for N + 1.

Claim: 0 + 1 + 2 + 3 + ... + N = N(N+1)/2 for all N >= 0.

Proof (by mathematical induction):
  Base case (N = 0): 0 = 0(0+1)/2.
  Induction hypothesis: assume 0 + 1 + 2 + ... + N = N(N+1)/2.
  Induction step:
    0 + 1 + ... + N + (N+1) = (0 + 1 + ... + N) + (N+1)
                            = N(N+1)/2 + (N+1)
                            = (N+2)(N+1)/2


Proof by Induction

Claim. T(N) = N log2 N (when N is a power of 2).

    T(N) = 0                  if N = 1
    T(N) = 2 T(N/2) + N       otherwise
           (2 T(N/2) = sorting both halves, N = merging)

Proof (by induction on N).
  Base case: N = 1.
  Inductive hypothesis: T(N) = N log2 N.
  Goal: show that T(2N) = 2N log2 (2N).

    T(2N) = 2 T(N) + 2N
          = 2 N log2 N + 2N
          = 2 N (log2 (2N) - 1) + 2N
          = 2 N log2 (2N)


Proof by Induction

What if N is not a power of 2? T(N) satisfies the following recurrence:

    T(N) = 0                              if N = 1
    T(N) = T(⌈N/2⌉) + T(⌊N/2⌋) + N        otherwise
           (T(⌈N/2⌉) = solve left half, T(⌊N/2⌋) = solve right half, N = merging)

Claim. T(N) <= N ⌈log2 N⌉.
Proof. See supplemental slides.


Proof by Induction

Claim. T(N) <= N ⌈log2 N⌉.

Proof (by induction on N).
  Base case: N = 1.
  Define n1 = ⌊N/2⌋ and n2 = ⌈N/2⌉.
  Induction step: assume true for 1, 2, ..., N - 1.

    T(N) <= T(n1) + T(n2) + N
         <= n1 ⌈log2 n1⌉ + n2 ⌈log2 n2⌉ + N
         <= n1 ⌈log2 n2⌉ + n2 ⌈log2 n2⌉ + N
         =  N ⌈log2 n2⌉ + N
         <= N ( ⌈log2 N⌉ - 1 ) + N
         =  N ⌈log2 N⌉

  using  n2 = ⌈N/2⌉ <= ⌈2^⌈log2 N⌉ / 2⌉ = 2^(⌈log2 N⌉ - 1),  so  ⌈log2 n2⌉ <= ⌈log2 N⌉ - 1.
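To see the ceiling bound cover the non-power-of-two case, here is a short check (not part of the original slides) that tabulates the general recurrence bottom-up and verifies T(N) <= N ⌈log2 N⌉ for every N up to 10,000.

  /* Illustration only: evaluate T(N) = T(ceil(N/2)) + T(floor(N/2)) + N
     bottom-up and verify the bound T(N) <= N * ceil(log2 N). */
  #include <stdio.h>

  #define MAXN 10001

  static long T[MAXN];

  /* smallest k with 2^k >= N, i.e. ceil(log2 N) for N >= 1 */
  static int ceil_log2(int N)
  {
      int k = 0;
      long p = 1;
      while (p < N) { p *= 2; k++; }
      return k;
  }

  int main(void)
  {
      T[1] = 0;
      for (int N = 2; N < MAXN; N++)
          T[N] = T[(N + 1) / 2] + T[N / 2] + N;   /* ceiling half + floor half + merge */

      for (int N = 1; N < MAXN; N++)
          if (T[N] > (long)N * ceil_log2(N)) {
              printf("bound fails at N = %d\n", N);
              return 1;
          }
      printf("T(N) <= N * ceil(log2 N) holds for all N < %d\n", MAXN);
      return 0;
  }

The check never reports a failure, consistent with the induction proof above.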