Data Structures and Algorithm Analysis
Algorithm Design Techniques

Lecturer: Ligang Dong
Email: [email protected]
Tel: 28877721, 13306517055
Office: SIEE Building 305

The Greedy Approach

Greedy algorithms are usually designed to solve optimization problems in which a quantity is to be minimized or maximized. Greedy algorithms typically consist of an iterative procedure that tries to find a local optimal solution. In some instances, these local optimal solutions translate to global optimal solutions; in others, they fail to give optimal solutions.

A greedy algorithm makes a correct guess on the basis of little calculation, without worrying about the future. Thus, it builds a solution step by step. Each step increases the size of the partial solution and is based on local optimization: the choice made is the one that produces the largest immediate gain while maintaining feasibility. Since each step consists of little work based on a small amount of information, the resulting algorithms are typically efficient.

The Fractional Knapsack Problem

Given n items of sizes s1, s2, ..., sn, values v1, v2, ..., vn, and a knapsack capacity C, the objective is to find nonnegative real numbers x1, x2, ..., xn, each at most 1 (an item can be taken at most once), that

maximize $\sum_{i=1}^{n} x_i v_i$ subject to the constraint $\sum_{i=1}^{n} x_i s_i \le C$.

This problem can easily be solved using the following greedy strategy: For each item compute yi = vi/si, the ratio of its value to its size. Sort the items by decreasing ratio, and fill the knapsack with as much as possible of the first item, then the second, and so forth. This problem reveals many of the characteristics of a greedy algorithm discussed above: the algorithm consists of a simple iterative procedure that selects the item producing the largest immediate gain while maintaining feasibility.
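To make the strategy concrete, here is a minimal Python sketch of this greedy procedure (the function name fractional_knapsack, its interface, and the example data are illustrative assumptions, not part of the lecture):

```python
def fractional_knapsack(sizes, values, capacity):
    """Greedy fractional knapsack: returns (total value, fractions x_i)."""
    n = len(sizes)
    # Sort item indices by value-to-size ratio, largest ratio first.
    order = sorted(range(n), key=lambda i: values[i] / sizes[i], reverse=True)
    x = [0.0] * n
    total = 0.0
    remaining = capacity
    for i in order:
        if remaining <= 0:
            break
        # Take as much of item i as still fits (at most the whole item).
        take = min(sizes[i], remaining)
        x[i] = take / sizes[i]
        total += x[i] * values[i]
        remaining -= take
    return total, x

# Example: three items, capacity 50; the last item is taken fractionally.
print(fractional_knapsack([10, 20, 30], [60, 100, 120], 50))
# -> (240.0, [1.0, 1.0, 0.666...])
```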
File Compression

Suppose we are given a file, which is a string of characters. We wish to compress the file as much as possible in such a way that the original file can easily be reconstructed. Let the set of characters in the file be C = {c1, c2, ..., cn}, and let f(ci), 1 ≤ i ≤ n, be the frequency of character ci in the file, i.e., the number of times ci appears in the file.

Using a fixed number of bits to represent each character, called the encoding of the character, the size of the file depends only on the number of characters in the file. Since the frequency of some characters may be much larger than others, it is reasonable to use variable-length encodings: intuitively, characters with large frequencies should be assigned short encodings, whereas long encodings may be assigned to characters with small frequencies.

When the encodings vary in length, we stipulate that the encoding of one character must not be a prefix of the encoding of another character; such codes are called prefix codes. For instance, if we assign the encodings 10 and 101 to the letters "a" and "b", there will be an ambiguity as to whether 10 is the encoding of "a" or is the prefix of the encoding of the letter "b".

Once the prefix constraint is satisfied, decoding becomes unambiguous: the sequence of bits is scanned until an encoding of some character is found. One way to "parse" a given sequence of bits is to use a full binary tree, in which each internal node has exactly two branches, labeled 0 and 1. The leaves of this tree correspond to the characters, and each sequence of 0's and 1's on a path from the root to a leaf corresponds to a character encoding.

The algorithm presented is due to Huffman. It consists of repeating the following procedure until C consists of only one character: Let ci and cj be two characters with minimum frequencies. Create a new node c whose frequency is the sum of the frequencies of ci and cj, and make ci and cj the children of c. Let C = (C - {ci, cj}) ∪ {c}.

Example: frequencies = {0.4, 0.3, 0.15, 0.05, 0.04, 0.03, 0.03}. (The original slides show the growing tree at each step.) The algorithm performs the following merges:

  0.03 + 0.03 = 0.06
  0.04 + 0.05 = 0.09
  0.06 + 0.09 = 0.15
  0.15 + 0.15 = 0.3
  0.3  + 0.3  = 0.6
  0.4  + 0.6  = 1

The tree can be stored in an array of nodes, with leaves numbered 1 through 7 and internal nodes 8 through 13 in order of creation:

  node  frequency  parent  LChild  RChild
   1    0.40       13      0       0
   2    0.30       12      0       0
   3    0.15       11      0       0
   4    0.05        9      0       0
   5    0.04        9      0       0
   6    0.03        8      0       0
   7    0.03        8      0       0
   8    0.06       10      6       7
   9    0.09       10      4       5
  10    0.15       11      8       9
  11    0.30       12      3      10
  12    0.60       13      2      11
  13    1.00        0      1      12

Labeling each left branch 0 and each right branch 1 gives the following encodings (the slide labels the seven characters as instructions I1 through I7):

  Instruction:  I1   I2   I3    I4     I5     I6     I7
  Frequency:    0.4  0.3  0.15  0.05   0.04   0.03   0.03
  Encoding:     0    10   110   11100  11101  11110  11111
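The repeated extraction of the two minimum-frequency nodes suggests a min-heap. The following Python sketch is one possible implementation (the function name huffman_code and the tuple-based tree representation are assumptions for illustration):

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build a prefix code from {symbol: frequency} using Huffman's algorithm."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Repeatedly merge the two subtrees of minimum total frequency.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, tree = heap[0]

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):   # internal node: branch on 0 / 1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                         # leaf: record the symbol's encoding
            codes[node] = prefix or "0"
    walk(tree, "")
    return codes

freq = {"I1": 0.4, "I2": 0.3, "I3": 0.15, "I4": 0.05,
        "I5": 0.04, "I6": 0.03, "I7": 0.03}
print(huffman_code(freq))  # one optimal prefix code for the example above
```

When frequencies tie, the exact bit patterns may differ from the table above, but the weighted code length of the result is the same.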
The Greedy Approach

Other examples: Dijkstra's algorithm for the shortest path problem, and Kruskal's algorithm for the minimum cost spanning tree problem.

Divide and Conquer

A divide-and-conquer algorithm divides the problem instance into a number of subinstances (in most cases 2), recursively solves each subinstance separately, and then combines the solutions to the subinstances to obtain the solution to the original problem instance.

Consider the problem of finding both the minimum and maximum in an array of integers A[1..n], and assume for simplicity that n is a power of 2. Divide the input array into two halves, A[1..n/2] and A[(n/2)+1..n], find the minimum and maximum in each half, and return the minimum of the two minima and the maximum of the two maxima.

Input: An array A[1..n] of n integers, where n is a power of 2.
Output: (x, y), the minimum and maximum integers in A.

  return minmax(1, n)

minmax(low, high)
  if high - low = 1 then
    if A[low] < A[high] then return (A[low], A[high])
    else return (A[high], A[low])
    end if
  else
    mid ← ⌊(low + high)/2⌋
    (x1, y1) ← minmax(low, mid)
    (x2, y2) ← minmax(mid + 1, high)
    x ← min{x1, x2}
    y ← max{y1, y2}
    return (x, y)
  end if

The Divide and Conquer Paradigm

The divide step: the input is partitioned into p ≥ 1 parts, each of size strictly less than n.
The conquer step: performing p recursive call(s) if the problem size is greater than some predefined threshold n0.
The combine step: the solutions to the p recursive call(s) are combined to obtain the desired output.

In general:
(1) If the size of the instance I is "small", then solve the problem using a straightforward method and return the answer. Otherwise, continue to the next step.
(2) Divide the instance I into p subinstances I1, I2, ..., Ip of approximately the same size.
(3) Recursively call the algorithm on each subinstance Ij, 1 ≤ j ≤ p, to obtain p partial solutions.
(4) Combine the results of the p partial solutions to obtain the solution to the original instance I. Return the solution of instance I.

Another divide-and-conquer example: Quicksort.

Dynamic Programming

Dynamic programming resorts to evaluating the recurrence in a bottom-up manner, saving intermediate results that are used later on to compute the desired solution. This technique applies to many combinatorial optimization problems to derive efficient algorithms.

The idea of saving solutions to subproblems in order to avoid their recomputation is the basis of this powerful method. This is usually the case in many combinatorial optimization problems in which the solution can be expressed in the form of a recurrence whose direct solution causes subinstances to be computed more than once.

An important observation about the working of dynamic programming is that the algorithm computes an optimal solution to every subinstance of the original instance considered by the algorithm. This illustrates an important principle in algorithm design called the principle of optimality: given an optimal sequence of decisions, each subsequence must be an optimal sequence of decisions by itself.

Dynamic programming example: Floyd's algorithm for the all-pairs shortest path problem.
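As an illustration of bottom-up evaluation, here is a minimal sketch of the Floyd-Warshall recurrence in Python (the function name floyd, the adjacency-matrix convention, and the example graph are assumptions for illustration):

```python
INF = float("inf")

def floyd(dist):
    """All-pairs shortest paths. dist[i][j] is the edge weight (INF if absent).
    The table is filled bottom-up: after iteration k, d[i][j] is the shortest
    i -> j distance using only intermediate vertices 0..k."""
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy of the input matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Example: a small directed graph on 4 vertices.
graph = [
    [0,   2,   INF, 4],
    [INF, 0,   3,   INF],
    [INF, INF, 0,   1],
    [INF, INF, INF, 0],
]
print(floyd(graph))  # row i gives shortest distances from vertex i
```

Each entry d[i][j] is computed once per value of k and reused in later iterations, which is exactly the saving of intermediate results described above.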
Backtracking

In many real-world problems, a solution can be obtained by exhaustively searching through a large but finite number of possibilities. Hence the need arose for developing systematic techniques of searching, with the hope of cutting the search space down to a much smaller one. Here we present a general technique for organizing the search known as backtracking. This algorithm design technique can be described as an organized exhaustive search which often avoids searching all possibilities.

Backtracking examples: depth-first search, breadth-first search.

Homework #11

11.1 A file contains only colons, spaces, newlines, commas, and digits with the following frequencies: colon (100), space (605), newline (100), comma (705), 0 (431), 1 (242), 2 (176), 3 (59), 4 (185), 5 (250), 6 (174), 7 (199), 8 (205), 9 (217). Construct the Huffman code.
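As a self-check for this exercise, the total encoded file length of an optimal code can be computed without drawing the tree: it equals the sum of the merged weights over all Huffman merge steps. The sketch below is a hypothetical helper, not part of the assignment:

```python
import heapq

def huffman_cost(freqs):
    """Total weighted code length of an optimal prefix code:
    the sum of the merged weights over all Huffman merge steps."""
    heap = list(freqs)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # two smallest remaining frequencies
        b = heapq.heappop(heap)
        cost += a + b            # every bit of both subtrees gets one deeper
        heapq.heappush(heap, a + b)
    return cost

# Character counts from Exercise 11.1:
counts = [100, 605, 100, 705, 431, 242, 176, 59, 185, 250, 174, 199, 205, 217]
print(huffman_cost(counts))  # total number of bits in the encoded file
```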