Data Structures and Algorithm Analysis
Algorithm Analysis
Lecturer: Ligang Dong
Email: [email protected]
Tel: 28877721, 13306517055
Office: SIEE Building 305

Analysis of Algorithms
Analysis of an algorithm gives insight into how long the program runs and how much memory it uses:
- time complexity
- space complexity
Why is this useful? It shows the efficiency of an algorithm.

Machine-Independent Analysis
We assume that every basic operation takes constant time.
- Example basic operations: addition, subtraction, multiplication, memory access
- Non-basic operations: sorting, searching
The efficiency of an algorithm is the number of basic operations it performs. We do not distinguish between the basic operations.

Machine-Independent Analysis
Input size is indicated by a number n; sometimes there are multiple inputs, e.g. m and n.
Running time is a function of n, e.g. n, n^2, n log n, 18 + 3n(log n^2) + 5n^3.

Simplifying the Analysis
Eliminate low-order terms:
- 4n + 5 → 4n
- 0.5 n log n - 2n + 7 → 0.5 n log n
- 2^n + n^3 + 3n → 2^n
Eliminate constant coefficients:
- 4n → n
- 0.5 n log n → n log n
- log n^2 = 2 log n → log n
- log_3 n = (log_3 2) log n → log n

Order Notation
BIG-O: T(n) = O(f(n)), an upper bound. There exist constants c and n0 such that T(n) ≤ c·f(n) for all n ≥ n0.
OMEGA: T(n) = Ω(f(n)), a lower bound. There exist constants c and n0 such that T(n) ≥ c·f(n) for all n ≥ n0.
THETA: T(n) = Θ(f(n)), a tight bound: both T(n) = O(f(n)) and T(n) = Ω(f(n)). For example, Θ(n) means both O(n) and Ω(n).

Examples
n^2 + 100n = O(n^2) = Ω(n^2) = Θ(n^2), because
- (n^2 + 100n) ≤ 2·n^2 for n ≥ 100
- (n^2 + 100n) ≥ 1·n^2 for n ≥ 0
n log n = O(n^2); n log n = Θ(n log n); n log n = Ω(n)

More on Order Notation
Order notation is not symmetric; write 2n^2 + 4n = O(n^2) but never O(n^2) = 2n^2 + 4n: the right-hand side is a crudification of the left. Likewise, O and Ω bounds need not be tight: 2n^2 + 4n = O(n^3) and 2n^2 + 4n = Ω(n^2) are also true statements.

A Few Comparisons
Which function in each pair grows more slowly, i.e. is the better running time?
- Race I: n^3 + 2n^2 vs. 100n^2 + 1000
- Race II: n^0.1 vs. log n
- Race III: n + 100n^0.1 vs. 2n + 10 log n
- Race IV: 5n^5 vs. n!
- Race V: n^-15 · 2^(n/100) vs. 1000n^15
- Race VI: 8^(2 log n) vs.
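The Big-O witnesses in the example above can be checked numerically. A minimal sketch (the helper name `holds` and the test range are mine, not the slides'): it evaluates the defining inequality T(n) ≤ c·f(n) for T(n) = n^2 + 100n, f(n) = n^2, with the witness constants c = 2 and n0 = 100.

```c
#include <assert.h>

/* Check the Big-O definition for n^2 + 100n = O(n^2) with the
   witnesses c = 2, n0 = 100. holds(n) tests T(n) <= c * f(n). */
static int holds(long long n) {
    long long t = n * n + 100 * n;  /* T(n) = n^2 + 100n */
    long long f = n * n;            /* f(n) = n^2 */
    return t <= 2 * f;              /* c = 2 */
}
```

Note that the inequality may fail below n0 (e.g. at n = 50, where 100n dominates); the definition only requires it for all n ≥ n0.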
3n^7 + 7n

The Losers Win
Function #1 vs. Function #2, and the better (slower-growing) algorithm:
- n^3 + 2n^2 vs. 100n^2 + 1000: #2 wins, O(n^2)
- n^0.1 vs. log n: #2 wins, O(log n)
- n + 100n^0.1 vs. 2n + 10 log n: tie, O(n)
- 5n^5 vs. n!: #1 wins, O(n^5)
- n^-15 · 2^(n/100) vs. 1000n^15: #2 wins, O(n^15)
- 8^(2 log n) vs. 3n^7 + 7n: #1 wins, O(n^6), since 8^(2 log n) = n^6

Common Names
- constant: O(1)
- logarithmic: O(log n)
- linear: O(n)
- log-linear: O(n log n)
- superlinear: O(n^(1+c)) (c is a constant > 0)
- quadratic: O(n^2)
- polynomial: O(n^k) (k is a constant)
- exponential: O(c^n) (c is a constant > 1)

Kinds of Analysis
Running time may depend on the actual data input, not just the length of the input. Distinguish:
- worst case: your worst enemy is choosing the input
- best case
- average case: assumes some probabilistic distribution of inputs
- amortized: average time over many operations

Analyzing Code
- C operations: constant time
- consecutive statements: sum of times
- conditionals: condition plus the sum (at worst, the larger) of the branches
- loops: sum over the iterations
- function calls: cost of the function body
- recursive functions: solve a recursive equation
Above all, use your head!

Nested Loops
for i = 1 to n do
    for j = 1 to n do
        sum = sum + 1
Σ_{i=1..n} Σ_{j=1..n} 1 = Σ_{i=1..n} n = n^2

Nested Dependent Loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1
Σ_{i=1..n} Σ_{j=i..n} 1 = Σ_{i=1..n} (n - i + 1) = Σ_{i=1..n} i = n(n+1)/2 = O(n^2)

Conditionals
if C then S1 else S2
time ≤ time(C) + max(time(S1), time(S2))

Recursion vs. Iteration
A recursive subroutine calls itself, with different parameters.
Example (recursion): factorial(n) = n · factorial(n - 1)
Example (iteration):
prod = 1
for j = 1 to m
    prod = prod * j
In general, iteration is more efficient than recursion, because recursion must maintain state information for each pending call.
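The dependent-loops sum above can be verified empirically. A small sketch (the function name is mine): it runs the slide's two loops, counts how many times the inner statement executes, and lets us compare against the closed form n(n+1)/2.

```c
#include <assert.h>

/* Counts executions of the inner statement in the dependent loops
   for i = 1..n { for j = i..n { sum = sum + 1 } }.
   The closed form derived above is n(n+1)/2. */
static long dependent_loop_count(long n) {
    long sum = 0;
    for (long i = 1; i <= n; i++)
        for (long j = i; j <= n; j++)
            sum = sum + 1;      /* one basic operation */
    return sum;
}
```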
Recursion
A recursive procedure can often be analyzed by solving a recursive equation. Basic form:
T(n) = if (base case) then some constant
       else (time to solve subproblems + time to combine solutions)
The result depends upon:
- how many subproblems there are
- how costly it is to combine their solutions

Example: Sum of Integer Queue
sum_queue(Q){
    if (Q.length == 0) return 0;
    else return Q.dequeue() + sum_queue(Q);
}
One subproblem; linear reduction in size (decrease by 1); combining costs a constant c (one addition).
Equation: T(0) = b
          T(n) = c + T(n - 1) for n > 0
Solution:
T(n) = c + T(n - 1)
     = c + c + T(n - 2)
     = c + c + c + T(n - 3)
     = kc + T(n - k) for all k
     = nc + T(0) for k = n
     = cn + b = O(n)

Example: Binary Search
Sorted array: 7 12 30 35 75 83 87 90 97 99
One subproblem, half as large.
Equation: T(1) = b
          T(n) = T(n/2) + c for n > 1
Solution:
T(n) = T(n/2) + c
     = T(n/4) + c + c
     = T(n/8) + c + c + c
     = T(n/2^k) + kc
     = T(1) + c log n where k = log n
     = b + c log n = O(log n)

Example: MergeSort
Split the array in half, sort each half, merge together: 2 subproblems, each half as large, plus a linear amount of work to combine.
T(1) = b
T(n) = 2T(n/2) + cn for n > 1
T(n) = 2T(n/2) + cn
     = 2(2T(n/4) + cn/2) + cn = 4T(n/4) + cn + cn
     = 4(2T(n/8) + c(n/4)) + cn + cn = 8T(n/8) + cn + cn + cn
     = 2^k T(n/2^k) + kcn
     = 2^k T(1) + cn log n where k = log n
     = bn + cn log n = O(n log n)

Example: Recursive Fibonacci
int Fib(n){
    if (n == 0 or n == 1) return 1;
    else return Fib(n - 1) + Fib(n - 2);
}
Running time: lower bound analysis.
T(0), T(1) ≥ 1
T(n) ≥ T(n - 1) + T(n - 2) + c for n > 1
Note: T(n) ≥ Fib(n)
Fact: Fib(n) ≥ (3/2)^n, so T(n) = Ω((3/2)^n). Why?
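The binary-search recurrence T(n) ≤ T(n/2) + c can be observed directly by counting halving steps. A sketch (the function and wrapper names are mine) over the slide's sorted array; the step count should stay within roughly log2(n) + 1, here 4 for n = 10.

```c
#include <assert.h>

/* Binary search over the sorted array from the slide, counting how
   many times the search range is halved; the recurrence bounds this
   count by about log2(n) + 1. */
static const int SORTED[10] = {7, 12, 30, 35, 75, 83, 87, 90, 97, 99};

static int bsearch_steps(const int *a, int n, int key, int *steps) {
    int lo = 0, hi = n - 1;
    *steps = 0;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        (*steps)++;                     /* one constant-time halving step */
        if (a[mid] == key) return mid;  /* found: return its index */
        if (a[mid] < key) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;                          /* key not present */
}

/* Convenience wrappers fixed to the 10-element array above. */
static int index_for(int key) { int s; return bsearch_steps(SORTED, 10, key, &s); }
static int steps_for(int key) { int s; bsearch_steps(SORTED, 10, key, &s); return s; }
```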
Proof of Recursive Fibonacci
Recursive Fibonacci:
int Fib(n){
    if (n == 0 or n == 1) return 1;
    else return Fib(n - 1) + Fib(n - 2);
}
Lower bound analysis:
T(0), T(1) ≥ b
T(n) ≥ T(n - 1) + T(n - 2) + c if n > 1
Analysis: let φ be (1 + √5)/2, which satisfies φ^2 = φ + 1. Show by induction on n that T(n) ≥ b·φ^(n-1).

Direct Proof Continued
Basis: T(0) ≥ b > b·φ^(-1) and T(1) ≥ b = b·φ^0
Inductive step: assume T(m) ≥ b·φ^(m-1) for all m < n. Then
T(n) = T(n - 1) + T(n - 2) + c
     ≥ b·φ^(n-2) + b·φ^(n-3) + c
     = b·φ^(n-3)(φ + 1) + c
     = b·φ^(n-3)·φ^2 + c
     ≥ b·φ^(n-1)

Fibonacci Call Tree
[Figure: the call tree for Fib(5); subtrees such as Fib(3), Fib(2), and Fib(1) appear repeatedly.]

Learning from Analysis
To avoid the repeated recursive calls:
- store all base-case values in a table
- each time you calculate an answer, store it in the table
- before performing any calculation for a value n, check whether a valid answer for n is already in the table; if so, return it
This is memoization, a form of dynamic programming. How much time does the memoized version take?

Kinds of Analysis
So far we have considered worst-case analysis. We may want to know how an algorithm performs "on average". There are several distinct senses of "on average":
- amortized: average time per operation over a sequence of operations
- average case: average time over a random distribution of inputs
- expected case: average time for a randomized algorithm over different random seeds, for any input

Amortized Analysis
Consider any sequence of n operations applied to a data structure; your worst enemy could choose the sequence! Some operations may be fast, others slow. Goal: show that the average time per operation, i.e. (total time for n operations)/n, is still good.

Stack ADT
Stack operations: push, pop, is_empty.
[Figure: pushing F onto a stack already holding A, B, C, D, E.]
Stack property: if x is on the stack before y is pushed, then x will be popped after y is popped.
What is the biggest problem with an array implementation?
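The memoization recipe above can be sketched for the slides' Fib (which uses the convention Fib(0) = Fib(1) = 1). The table size MAXN and the use of 0 as the "not yet computed" marker are my choices; each value is computed once, so the running time drops from Ω((3/2)^n) to O(n).

```c
#include <assert.h>

/* Memoized Fibonacci: store each computed value in a table and check
   the table before recursing, so every value is computed only once. */
#define MAXN 90
static long long memo[MAXN + 1];        /* 0 marks "not yet computed" */

static long long fib(int n) {
    if (n == 0 || n == 1) return 1;     /* base cases from the slide */
    if (memo[n] != 0) return memo[n];   /* valid answer already stored */
    memo[n] = fib(n - 1) + fib(n - 2);  /* compute once, remember */
    return memo[n];
}
```

With memoization, a call like fib(45) returns immediately, whereas the naive version would make billions of calls.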
Stretchy Stack Implementation
int data[];
int maxsize;
int top;

Push(e){
    if (top == maxsize - 1){        // array is full: stretch it
        temp = new int[2*maxsize];
        copy data into temp;
        deallocate data;
        data = temp;
        maxsize = 2*maxsize;
    }
    data[++top] = e;                // store e on every push, stretched or not
}

Best case Push = O( )
Worst case Push = O( )

Stretchy Stack Amortized Analysis
Consider a sequence of n operations: push(3); push(19); push(2); ...
What is the max number of stretches? log n.
What is the total time? Say a regular push takes time a, and stretching an array containing k elements takes time kb, for some constants a and b. Then the total time is at most
an + b(1 + 2 + 4 + 8 + ... + n) = an + b · Σ_{i=0..log n} 2^i = an + b(2^(1 + log n) - 1) = an + b(2n - 1)
Amortized time = (an + b(2n - 1))/n = O(1)

Homework #3
3.1 Design an algorithm for the Towers of Hanoi.
- There are a source peg, a destination peg, and an auxiliary peg.
- k disks start on the source peg; a bigger disk can never be on top of a smaller disk.
- Move all k disks to the destination peg using the auxiliary peg, without ever placing a bigger disk on a smaller one.
3.2 Analyze the time complexity of your algorithm in 3.1.
3.3 Write a program to solve the Towers of Hanoi.
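The amortized O(1) claim for the stretchy stack can be checked by counting copies. This is a sketch, not the slides' exact implementation: the struct layout, field names, and the starting capacity of 1 are my assumptions. It counts how many elements the stretches move over n pushes; the geometric sum 1 + 2 + 4 + ... stays below 2n, matching the an + b(2n - 1) bound.

```c
#include <assert.h>
#include <stdlib.h>

/* A doubling ("stretchy") stack that counts element copies made by
   stretches, to check the amortized analysis empirically. */
typedef struct {
    int *data;
    int maxsize, top;
    long copies;                    /* total elements moved by stretches */
} Stack;

static void init(Stack *s) {
    s->data = malloc(sizeof(int)); /* assumed starting capacity: 1 */
    s->maxsize = 1; s->top = -1; s->copies = 0;
}

static void push(Stack *s, int e) {
    if (s->top == s->maxsize - 1) {             /* full: stretch to 2x */
        int *temp = malloc(2 * s->maxsize * sizeof(int));
        for (int i = 0; i < s->maxsize; i++)    /* copy old contents */
            temp[i] = s->data[i];
        s->copies += s->maxsize;
        free(s->data);
        s->data = temp;
        s->maxsize *= 2;
    }
    s->data[++s->top] = e;                      /* the regular O(1) part */
}

/* Total copy work done by a sequence of n pushes. */
static long copies_after(int n) {
    Stack s; init(&s);
    for (int i = 0; i < n; i++) push(&s, i);
    long c = s.copies;
    free(s.data);
    return c;
}
```

For n = 1000 pushes the stretches copy 1 + 2 + 4 + ... + 512 = 1023 elements in total, i.e. fewer than 2n, so the average cost per push is bounded by a constant.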