Analysis of Algorithms
CS 477/677
Instructor: Monica Nicolescu
Lecture 10

Bucket Sort
• Assumption:
  – The input is generated by a random process that distributes elements uniformly over [0, 1)
• Idea:
  – Divide [0, 1) into n equal-sized buckets
  – Distribute the n input values into the buckets
  – Sort each bucket
  – Go through the buckets in order, listing the elements in each one
• Input: A[1 . . n], where 0 ≤ A[i] < 1 for all i
• Output: the elements of A in sorted order
• Auxiliary array: B[0 . . n - 1] of linked lists, each list initially empty

BUCKET-SORT
Alg.: BUCKET-SORT(A, n)
    for i ← 1 to n
        do insert A[i] into list B[⌊nA[i]⌋]
    for i ← 0 to n - 1
        do sort list B[i] with insertion sort
    concatenate lists B[0], B[1], . . . , B[n - 1] together in order
    return the concatenated lists

Correctness of Bucket Sort
• Consider two elements A[i], A[j]
• Assume without loss of generality that A[i] ≤ A[j]
• Then ⌊nA[i]⌋ ≤ ⌊nA[j]⌋
  – A[i] belongs to the same bucket as A[j] or to a bucket with a lower index than that of A[j]
• If A[i], A[j] belong to the same bucket:
  – insertion sort puts them in the proper order
• If A[i], A[j] are put in different buckets:
  – concatenation of the lists puts them in the proper order

Analysis of Bucket Sort
Alg.: BUCKET-SORT(A, n)
    for i ← 1 to n                                                      O(n)
        do insert A[i] into list B[⌊nA[i]⌋]
    for i ← 0 to n - 1
        do sort list B[i] with insertion sort                           Θ(n) expected, over all buckets
    concatenate lists B[0], B[1], . . . , B[n - 1] together in order    O(n)
    return the concatenated lists
Total running time: Θ(n) expected, under the uniform-input assumption

Conclusion
• Any comparison sort takes Ω(n lg n) comparisons in the worst case to sort an array of n numbers
• We can achieve a better running time for sorting if we can make certain assumptions on the input data:
  – Counting sort: each of the n input elements is an integer in the range 0 to k
  – Radix sort: the elements in the input are integers represented with d digits
  – Bucket sort: the numbers in the input are uniformly distributed over the interval [0, 1)

A Job Scheduling Application
• Job scheduling
  – The key is the priority of the jobs in the queue
  – The job with the highest priority needs to be executed next
• Operations
  – Insert, remove maximum
• Data structures
  – Priority queues
  – Ordered array/list, unordered array/list

PQ Implementations & Cost
Worst-case asymptotic costs for a PQ with N items

                     Insert    Remove max
  ordered array        N           1
  ordered list         N           1
  unordered array      1           N
  unordered list       1           N

Can we implement both operations efficiently?

Background on Trees
• Def: Binary tree = a structure defined on a finite set of nodes that either:
  – Contains no nodes, or
  – Is composed of three disjoint sets of nodes: a root node, a left subtree and a right subtree
[Figure: a binary tree with root 4, with its left subtree and right subtree marked]

Special Types of Trees
• Def: Full binary tree = a binary tree in which each node is either a leaf or has degree (number of children) exactly 2.
• Def: Complete binary tree = a binary tree in which all leaves have the same depth and all internal nodes have degree 2.
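
The two definitions above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the `Node` class and the function names `is_full`, `leaf_depths` and `is_complete` are hypothetical, and "complete" follows the slide's definition (all leaves at the same depth, all internal nodes of degree 2).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type for illustration; None stands for a missing child.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_full(t: Optional[Node]) -> bool:
    """Full: every node is a leaf or has exactly two children."""
    if t is None:
        return True
    if (t.left is None) != (t.right is None):
        return False  # exactly one child, so the degree is 1
    return is_full(t.left) and is_full(t.right)


def leaf_depths(t: Optional[Node], depth: int = 0) -> set:
    """Return the set of depths at which leaves occur."""
    if t is None:
        return set()
    if t.left is None and t.right is None:
        return {depth}
    return leaf_depths(t.left, depth + 1) | leaf_depths(t.right, depth + 1)


def is_complete(t: Optional[Node]) -> bool:
    """Complete (slide definition): full, and all leaves share one depth."""
    return is_full(t) and len(leaf_depths(t)) <= 1


# Three small trees built from arbitrary keys.
complete_tree = Node(4, Node(1), Node(3))
full_only_tree = Node(4, Node(1, Node(2), Node(14)), Node(3))
neither_tree = Node(4, Node(1))
```

Under these definitions every complete binary tree is also full, but not the other way around: `full_only_tree` has leaves at depths 1 and 2.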
[Figures: an example full binary tree and an example complete binary tree]

The Heap Data Structure
• Def: A heap is a nearly complete binary tree with the following two properties:
  – Structural property: all levels are full, except possibly the last one, which is filled from left to right
  – Order (heap) property: for any node x other than the root, Parent(x) ≥ x
[Figure: an example heap containing 8, 7, 4, 5, 2. It doesn’t matter that 4 in level 1 is smaller than 5 in level 2]

Definitions
• Height of a node = the number of edges on a longest simple path from the node down to a leaf
• Depth of a node = the length of the path from the root to the node
• Height of tree = height of root node = ⌊lg n⌋, for a heap of n elements
[Figure: a tree with root 4, annotated with Height of root = 3, Height of (2) = 1, Depth of (10) = 2]

Array Representation of Heaps
• A heap can be stored as an array A.
  – Root of tree is A[1]
  – Left child of A[i] = A[2i]
  – Right child of A[i] = A[2i + 1]
  – Parent of A[i] = A[⌊i/2⌋]
  – Heapsize[A] ≤ length[A]
• The elements in the subarray A[(⌊n/2⌋ + 1) .. n] are leaves
• The root is the maximum element of the heap (for a max-heap)

Heap Types
• Max-heaps (largest element at root) have the max-heap property:
  – for all nodes i, excluding the root: A[PARENT(i)] ≥ A[i]
• Min-heaps (smallest element at root) have the min-heap property:
  – for all nodes i, excluding the root: A[PARENT(i)] ≤ A[i]

Operations on Heaps
• Maintain the max-heap property
  – MAX-HEAPIFY
• Create a max-heap from an unordered array
  – BUILD-MAX-HEAP
• Sort an array in place
  – HEAPSORT
• Priority queue operations

Operations on Priority Queues
• Max-priority queues support the following operations:
  – INSERT(S, x): inserts element x into set S
  – EXTRACT-MAX(S): removes and returns the element of S with the largest key
  – MAXIMUM(S): returns the element of S with the largest key
  – INCREASE-KEY(S, x, k): increases the value of element x’s key to k (assume k ≥ current key value at x)

Building a Heap
• Convert an array A[1 … n] into a max-heap (n = length[A])
• The elements in the subarray A[(⌊n/2⌋ + 1) .. n] are leaves
• Apply MAX-HEAPIFY on elements between 1 and ⌊n/2⌋

Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1
3.     do MAX-HEAPIFY(A, i, n)

[Figure: the array A = 4 1 3 2 16 9 10 14 8 7 drawn as a binary tree with nodes numbered 1 to 10]

Example
[Figure: BUILD-MAX-HEAP run on A, showing the tree and array after each call to MAX-HEAPIFY for i = 5, 4, 3, 2, 1]

Correctness of BUILD-MAX-HEAP
• Loop invariant:
  – At the start of each iteration of the for loop, each node i + 1, i + 2, …, n is the root of a max-heap
• Initialization:
  – i = ⌊n/2⌋: nodes ⌊n/2⌋ + 1, ⌊n/2⌋ + 2, …, n are leaves ⇒ they are the roots of trivial max-heaps
• Maintenance:
  – MAX-HEAPIFY makes node i a max-heap root and preserves the property that nodes i + 1, i + 2, …, n are roots of max-heaps
  – Decrementing i in the for loop reestablishes the loop invariant
• Termination:
  – i = 0 ⇒ each node 1, 2, …, n is the root of a max-heap (by the loop invariant)
[Figure: the example tree for A with nodes numbered 1 to 10]

Running Time of BUILD-MAX-HEAP
Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1          O(n) iterations
3.     do MAX-HEAPIFY(A, i, n)      O(lg n) per call
⇒ It would seem that the running time is O(n lg n)
• This is not an asymptotically tight upper bound

Running Time of BUILD-MAX-HEAP
• HEAPIFY takes O(h) ⇒ the cost of HEAPIFY on a node i is proportional to the height of node i in the tree

\[ T(n) = \sum_{i=0}^{h} n_i h_i = \sum_{i=0}^{h} 2^i (h - i) = O(n) \]

  Height                 Level                  No. of nodes
  h_0 = 3 (⌊lg n⌋)       i = 0                  2^0
  h_1 = 2                i = 1                  2^1
  h_2 = 1                i = 2                  2^2
  h_3 = 0                i = 3 (⌊lg n⌋)         2^3

  h_i = h – i: height of the heap rooted at level i
  n_i = 2^i: number of nodes at level i

Running Time of BUILD-MAX-HEAP
\begin{align*}
T(n) &= \sum_{i=0}^{h} n_i h_i
  && \text{cost of HEAPIFY at level } i \times \text{number of nodes at that level} \\
&= \sum_{i=0}^{h} 2^i (h - i)
  && \text{replace the values of } n_i \text{ and } h_i \text{ computed before} \\
&= \sum_{i=0}^{h} \frac{h - i}{2^{h-i}} \, 2^h
  && \text{multiply numerator and denominator by } 2^h \text{ and write } 2^i \text{ as } 1/2^{-i} \\
&= 2^h \sum_{k=0}^{h} \frac{k}{2^k}
  && \text{change variables: } k = h - i \\
&\le n \sum_{k=0}^{\infty} \frac{k}{2^k}
  && \text{the finite sum is at most the infinite sum, and } h = \lfloor \lg n \rfloor \Rightarrow 2^h \le n \\
&= 2n = O(n)
  && \text{the infinite sum equals } 2
\end{align*}
Running time of BUILD-MAX-HEAP: T(n) = O(n)

Binary Search Trees
• Tree representation:
  – A linked data structure in which each node is an object
• Node representation:
  – Key field
  – Satellite data
  – Left: pointer to left child
  – Right: pointer to right child
  – p: pointer to parent (p[root[T]] = NIL)
• Satisfies the binary search tree property
[Figure: a node with fields key, data, L and R, linked to its parent, left child and right child]

Binary Search Tree Example
• Binary search tree property:
  – If y is in the left subtree of x, then key[y] ≤ key[x]
  – If y is in the right subtree of x, then key[y] ≥ key[x]
[Figure: an example binary search tree with keys 5, 3, 7, 2, 5, 9]

Binary Search Trees
• Support many dynamic set operations
  – SEARCH, MINIMUM, MAXIMUM, PREDECESSOR, SUCCESSOR, INSERT, DELETE
• Running time of basic operations on binary search trees
  – On average: Θ(lg n)
    • The expected height of a randomly built tree is O(lg n)
  – In the worst case: Θ(n)
    • The tree is a linear chain of n nodes

Red-Black Trees
• “Balanced” binary trees guarantee an O(lg n) running time on the basic dynamic-set operations
• Red-black tree
  – Binary tree with an additional attribute for its nodes: color, which can be red or black
  – Constrains the way nodes can be colored on any path from the root to a leaf
    • Ensures that no path is more than twice as long as another ⇒ the tree is approximately balanced
  – The nodes inherit all the other attributes from the binary search trees: key, left, right, p

Red-Black Trees Properties
1. Every node is either red or black
2. The root is black
3. Every leaf (NIL) is black
4. If a node is red, then both its children are black
   • No two red nodes in a row on a simple path from the root to a leaf
5. For each node, all paths from the node to descendant leaves contain the same number of black nodes

Example: RED-BLACK TREE
[Figure: a red-black tree with keys 26, 17, 41, 30, 47, 38, 50 and NIL leaves]
• For convenience we use a sentinel NIL[T] to represent all the NIL nodes at the leaves
  – NIL[T] has the same fields as an ordinary node
  – Color[NIL[T]] = BLACK
  – The other fields may be set to arbitrary values

Black-Height of a Node
[Figure: the same tree with each node annotated by its height h and black-height bh. 26: h = 4, bh = 2; 17: h = 1, bh = 1; 41: h = 3, bh = 2; 30: h = 2, bh = 1; 47: h = 2, bh = 1; 38: h = 1, bh = 1; 50: h = 1, bh = 1]
• Height of a node: the number of edges in a longest path to a leaf
• Black-height of a node x: bh(x) is the number of black nodes (including NIL) on a path from x to a leaf, not counting x

Readings
• Chapters 6, 7, 8
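
The black-height definition and red-black properties 2, 4 and 5 from the slides above can be checked with a short recursive routine. This is a minimal Python sketch under stated assumptions, not the course's implementation: `RBNode`, `black_height` and `is_red_black` are hypothetical names, `None` stands in for the black NIL sentinel, and the node colors are an assumption inferred from the black-heights annotated in the example figure, because the extracted slide text does not preserve the colors.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"


@dataclass
class RBNode:
    # Hypothetical node type; None plays the role of the black NIL leaf.
    key: int
    color: str
    left: Optional["RBNode"] = None
    right: Optional["RBNode"] = None


def is_black(x: Optional[RBNode]) -> bool:
    return x is None or x.color == BLACK


def black_height(x: Optional[RBNode]) -> int:
    """Return bh(x): black nodes (including NIL) below x, not counting x.

    Raises ValueError if property 4 or property 5 fails in x's subtree.
    """
    if x is None:
        return 0  # nothing lies below a NIL leaf
    if x.color == RED and not (is_black(x.left) and is_black(x.right)):
        raise ValueError(f"red node {x.key} has a red child")
    # A child contributes its own black-height, plus 1 if the child is black.
    left = black_height(x.left) + (1 if is_black(x.left) else 0)
    right = black_height(x.right) + (1 if is_black(x.right) else 0)
    if left != right:
        raise ValueError(f"unequal black counts below node {x.key}")
    return left


def is_red_black(root: Optional[RBNode]) -> bool:
    """Check properties 2, 4 and 5 (1 and 3 hold by construction here)."""
    if not is_black(root):
        return False
    try:
        black_height(root)
    except ValueError:
        return False
    return True


# Example tree from the slides; colors are inferred, not given in the text.
n38 = RBNode(38, RED)
n50 = RBNode(50, RED)
n30 = RBNode(30, BLACK, right=n38)
n47 = RBNode(47, BLACK, right=n50)
n41 = RBNode(41, RED, n30, n47)
n17 = RBNode(17, BLACK)
root = RBNode(26, BLACK, n17, n41)
```

With this coloring, `black_height(root)` and `black_height(n41)` both come out as 2 and the remaining nodes as 1, which agrees with the bh values annotated in the figure. Recoloring 38 black would break property 5 at node 30.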