SHEN’S CLASS NOTES

Chapter 16 Greedy Algorithms

Like dynamic programming, greedy algorithms are used to solve optimization problems. Examples of optimization problems:

(1) Find the largest number among n numbers.
(2) Find the minimum spanning tree (MST) of a given graph.
(3) Find the shortest path from vertex a to vertex z in a graph.

A greedy algorithm also works in stages:

(1) Initially, the greedy algorithm provides a simple partial solution (or a feasible solution) to the problem. For example, it can start with a single vertex for the MST problem.
(2) At each stage, the greedy algorithm grows the current feasible or partial solution from the previous stage into a larger, better, or more complete solution.

After a number of stages, the algorithm stops when:

(1) An optimal solution is obtained in O(f(n)) time. In this case, we say that the algorithm solves the problem in O(f(n)) time.
(2) A good but sub-optimal solution is obtained. In this case, we say that the algorithm is an approximation algorithm.

Note that greedy algorithms differ from dynamic programming in that a greedy algorithm usually grows only one partial solution at each stage, whereas dynamic programming develops a set of solutions at each stage and dynamically determines which solutions from previous stages should be used. Moreover, dynamic programming must solve all smaller problems optimally. However, dynamic programming and greedy algorithms do share a common idea: solve a large problem by solving smaller problems. This idea is also shared by divide-and-conquer algorithms, which take a top-down approach.

16.1 An Activity-Selection Problem

Suppose we have a set of n proposed activities: S = {a1, a2, …, an}. Each activity ai has a start time si and a finish time fi, where 0 ≤ si < fi < ∞. Moreover, we assume that all these activities share a common resource.
Therefore, if ai is selected, then this activity will take place and occupy the resource during the time interval [si, fi).

Definition 16.1 Two activities ai and aj are compatible if [si, fi) and [sj, fj) do not overlap, that is, fi ≤ sj or fj ≤ si.

Definition 16.2 The activity-selection problem is to select a maximum-size subset of mutually compatible activities.

Example. Consider the following set S of activities.

     i :   1   2   3   4   5   6   7   8   9  10  11
    si :   1   3   0   5   3   5   6   8   8   2  12
    fi :   4   5   6   7   8   9  10  11  12  13  14

We notice that {a3, a9, a11} are three mutually compatible activities.

Fig. 16-1 The compatible activities a3 (0–6), a9 (8–12), and a11 (12–14) shown on a time line.

However, {a3, a9, a11} is not the largest such set. An optimal one is {a1, a4, a8, a11}.

Fig. 16-2 The optimal solution a1 (1–4), a4 (5–7), a8 (8–11), and a11 (12–14) shown on a time line.

So, how do we solve the problem? There are many ways. For example, we can use dynamic programming.

Dynamic programming approach

Let Si,j = { ak ∈ S | fi ≤ sk < fk ≤ sj }. Define f0 = 0 and sn+1 = ∞. Then S0,n+1 = S.

We assume the activities are sorted by their finish times:

    f0 ≤ f1 ≤ f2 ≤ f3 ≤ … ≤ fn < fn+1.

Obviously, Si,j = ∅ if i ≥ j.

Let c[i, j] be the number of activities in the best solution for Si,j. We have

    c[i, j] = 0                                                    if Si,j = ∅,
    c[i, j] = max { c[i, k] + c[k, j] + 1 | i < k < j, ak ∈ Si,j } if Si,j ≠ ∅.

This relation is illustrated by Fig. 16-3.

Fig. 16-3 Choosing activity ak splits Si,j into Si,k (counted by c[i, k]) and Sk,j (counted by c[k, j]).

We leave the details to the reader.

Recursive Approach

We need a theorem first.

Theorem 16.1 Let am ∈ Si,j be such that fm = min { fk | ak ∈ Si,j }. Then:
(1) am is used in some maximum-size subset of mutually compatible activities of Si,j.
(2) Si,m = ∅.

Proof We prove (2) first. Assume, for the sake of contradiction, that Si,m ≠ ∅. Then there is an activity ak such that fi ≤ sk < fk ≤ sm < fm, which implies fk < fm. This contradicts the assumption that fm is the minimum.

Now we prove (1). Let Ai,j be an optimal solution, that is, a maximum-size subset of mutually compatible activities of Si,j. If Ai,j contains am, we are done.
Otherwise, let us order the activities in Ai,j by their finish times in increasing order, and let ak be the first one, ak ≠ am. It is clear that for any other activity ax in Ai,j, we have fk ≤ sx. Now look at the set A′i,j obtained from Ai,j by replacing ak with am:

    A′i,j = (Ai,j − { ak }) ∪ { am }.

Because fm ≤ fk, we have fm ≤ sx for any other activity ax in A′i,j. Therefore, all activities in A′i,j are mutually compatible. Since |A′i,j| = |Ai,j|, A′i,j is also an optimal solution, and it contains am.

From Theorem 16.1, we can take the following approach to the activity-selection problem: sort the activities by their finish times, and select the activity am whose finish time fm is the smallest. Because Si,m = ∅, we only need to select more activities from Sm,j. So the solution is

    Ai,j = { am } ∪ Am,j.

The set Sm,j can be obtained from Si,j by deleting every activity ax with sx < fm. Then the same approach is repeated on Sm,j. Continuing this way, we obtain Ai,j. The following recursive algorithm reflects this method. Note that we have assumed the activities are sorted by their finish times: f0 ≤ f1 ≤ f2 ≤ f3 ≤ … ≤ fn < fn+1.

Recursive-Activity-Selector (s, f, i, j)
1   m ← i + 1
2   while m < j and sm < fi        // find the first activity in Si,j
3       do m ← m + 1
4   if m < j
5       then return {am} ∪ Recursive-Activity-Selector(s, f, m, j)
6       else return ∅
7   End

Greedy Algorithm

The above recursive approach can easily be converted to a greedy algorithm.

Greedy-Activity-Selector (s, f)
1   n ← length[s]
2   A ← { a1 }
3   i ← 1
4   for m ← 2 to n
5       do if sm ≥ fi
6           then { A ← A ∪ { am }
7                  i ← m
8                }
9   return A
10  End

The example in Fig. 16-4 illustrates the greedy algorithm.

Fig. 16-4 A trace of the greedy algorithm on the example set: starting with a1 (m = 1), activities a2 and a3 are excluded; a4 is selected (m = 4), excluding a5, a6, and a7; then a8 is selected (m = 8), excluding a9 and a10; finally a11 is selected (m = 11). The result is A = {a1, a4, a8, a11}.
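As a runnable companion to the pseudocode above, here is a minimal Python sketch; the function name, the 0-based indexing, and the plain-list input format are our own illustrative choices.

```python
def greedy_activity_selector(s, f):
    """Greedy activity selection.

    s[m], f[m] are the start and finish times of activity m (0-indexed),
    with activities already sorted by finish time: f[0] <= f[1] <= ...
    Returns the indices of a maximum-size set of compatible activities.
    """
    A = [0]                 # the earliest-finishing activity is always safe
    i = 0                   # index of the most recently selected activity
    for m in range(1, len(s)):
        if s[m] >= f[i]:    # a_m starts after the last selection finishes
            A.append(m)
            i = m
    return A

# The 11-activity example from the text (activity ai is at index i-1):
s = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(greedy_activity_selector(s, f))   # [0, 3, 7, 10], i.e. {a1, a4, a8, a11}
```

If the input is not already sorted, sorting by finish time first adds an O(n log n) step; the scan itself is O(n).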
There are many other greedy algorithms, such as the minimum spanning tree (MST) algorithm and the single-source shortest path (SSP) algorithm. They are so important that we will discuss them in separate chapters.

Additional Example – The Gas Scheduling Problem

Suppose we plan to drive a car from city A to city B. Along the highway, we will pass n gas stations, labeled 1, 2, …, n. For convenience, let station 0 be located in city A and station n+1 in city B. The distance from station i−1 to station i is given in an array D[i], which is known in advance. Let the gas price at station i be P[i], 0 ≤ i ≤ n+1, which is also known. For simplicity, we assume P[i] has been converted to the dollar amount per mile for the car we use. You may assume P[n+1] = 0. An example is given below.

Fig. 16-E1 An illustration of the gas scheduling problem, from A (station 0) to B (station 6):
    P[0] = 5, P[1] = 6, P[2] = 2, P[3] = 4, P[4] = 3, P[5] = 7, P[6] = 0;
    D[0] = 0, D[1] = 4, D[2] = 3, D[3] = 5, D[4] = 2, D[5] = 3, D[6] = 1.

Let L (miles) be the distance the car can run on a full tank of gas. It is assumed that within any L miles there is at least one station and at most k stations, where k is a constant. Now, we want to compute the amount of gas (in terms of miles) that needs to be added at each station such that the total cost of driving the car from city A to city B is minimized, assuming the gas tank is empty initially.

Solution

We will design a greedy algorithm. Suppose we have arrived at station i, 0 ≤ i ≤ n. Let R[i], 0 ≤ i ≤ n, be the number of miles the remaining gas in the tank can run (R[i] ≤ L); R[0] = 0. We will decide how much gas (in miles) to add at station i and which station is the next stop. We distinguish two cases:

(1) Within L miles, station u is the first station such that P[u] ≤ P[i].
(2) Within L miles, station u has the lowest price, but P[u] > P[i].

Let G[i] be the amount of gas we need to add at station i.

Case 1: Within L miles, u is the first station such that P[u] ≤ P[i].
In this case, we add gas

    G[i] = (D[i+1] + D[i+2] + … + D[u]) − R[i],

which is just enough to arrive at station u. The next stop is station u. Our algorithm guarantees that D[i+1] + … + D[u] ≥ R[i]. (Even if D[i+1] + … + D[u] < R[i], the strategy is still correct.) The correctness in this case is easy to see: any optimal algorithm needs to add at least G[i] = (D[i+1] + … + D[u]) − R[i] miles of gas somewhere between station i and station u−1 in order to arrive at station u, but it should not add more than G[i] miles, because the excess amount can be obtained at station u at a lower (or equal) price. Moreover, since u is the first station with P[u] ≤ P[i], every station strictly between i and u is more expensive than station i, so the cheapest way to add the G[i] miles is to add all of them at station i.

Case 2: Within L miles, station u has the lowest price, but P[u] > P[i].

In this case, we add as much gas as possible, making the tank full. That is, we add G[i] = L − R[i] miles of gas, and the next stop is station u. The correctness can be argued as follows. Obviously, we need to stop at some station within L miles. Suppose an optimal solution adds Q[i] miles at station i and Q more miles at other stations within L miles. Then Q[i] + Q ≥ G[i] = L − R[i] in order to go beyond L miles. We argue that Q[i] = G[i]: otherwise, if Q[i] < G[i], then some amount of the gas in Q could have been added at station i at a lower price, namely the amount G[i] − Q[i] ≤ Q. Now, given Q[i] = G[i], where should we stop next? Since station u has the lowest price within L miles, we should stop at station u. If we stop earlier than u, we will pay a higher price for some amount of gas; if we stop later than u, some amount of gas could have been added at station u at a lower price.

Based on the above discussion, the pseudocode is given below.
Min-Trip (P, D, n, cost)
1   for k ← 0 to n
2       do G[k] ← 0                      // initialize in O(n) time
3   R[0] ← 0                             // initially the gas tank is empty
4   cost ← 0
5   i ← 0
6   while i ≤ n
7       do { d ← 0
8            v ← u ← i + 1
9            while d + D[v] ≤ L and v ≤ n + 1
10               do { d ← d + D[v]
11                    if P[v] ≤ P[u]
12                        then u ← v          // keep track of the lowest price
13                    v ← v + 1
14                    if P[u] ≤ P[i]          // u is the first station with price ≤ P[i]
15                        then exit the while loop
16                  }
17           if P[u] ≤ P[i]                   // Case 1: add just enough gas to reach u
18               then { G[i] ← d − R[i]
19                      R[u] ← 0
20                    }
21               else { G[i] ← L − R[i]       // Case 2: P[u] is the lowest within L miles; fill the tank
22                      R[u] ← L − (D[i+1] + D[i+2] + … + D[u])
23                    }
24           cost ← cost + P[i]·G[i]          // pay at station i before moving on
25           i ← u                            // next stop
26         }
27  End

The complexity is T(n) = O(n). This is because, at station i, we look at most k stations ahead, so the total work is O(kn) = O(n).

Example. Let L = 8. For the example in Fig. 16-E1, the answer is as follows:

    R[0] = 0, R[2] = 0, R[4] = 1;
    G[0] = 7, G[1] = 0, G[2] = 8, G[3] = 0, G[4] = 3, G[5] = 0.

The total cost is 7·5 + 8·2 + 3·3 = 60.

Note. If k is not a constant, the algorithm can be modified to keep the O(n) complexity. The new algorithm is explained as follows. The main idea is that if we check each station only once, then we get an O(n) complexity. This is true when we handle the first case. However, it is not true when we check all the stations within L miles and do not find a station u such that P[u] ≤ P[i]: we may have checked k stations, from i to v, and found a station u (u < v) such that P[u] is the lowest among the k stations but P[u] > P[i]. Then, when we stop at station u, we will check stations u+1, u+2, …, v again. We want to get rid of this redundant work. We take the following approach:

(1) Pre-compute d[i] = D[1] + D[2] + … + D[i], the distance from station 0 to station i. Then we can get the distance between any two stations in O(1) time.
(2) When we check stations from the current station i, we use a stack S to keep those stations that may be the next stop. Specifically, suppose the current station is i and it is at the top of stack S. We check stations i+1, i+2, …, until we reach a station v such that P[v] < P[i] or d[v] − d[i] > L. Before we reach v, we do the following for each station u being checked:

    while P[u] < P[Top(S)]
        do Pop(S)
    Push(u, S)

We pop because the popped Top(S) cannot be the next stop: station u is cheaper and lies beyond it. We push u because u may be the next stop if it is still the cheapest when we reach station v; even if it is not the next stop, it could be the stop after next. When we reach a station v with P[v] < P[i], then v is the next stop, and v is on the top of the stack. If instead we reach a station v with d[v] − d[i] > L, then v is at the top of stack S, but the next stop is the station in the stack immediately above station i.

We use an array S[0..h] to implement the stack, where S[h] is the top element. The pseudocode is given below.

Modified-Min-Trip (P, D, n, cost)
1   d[0] ← 0
2   for k ← 1 to n
3       do d[k] ← d[k−1] + D[k]          // prefix sums of the distances
4   for k ← 0 to n
5       do { G[k] ← 0                    // initialize in O(n) time
6            R[k] ← 0 }                  // in particular, the gas tank starts empty: R[0] = 0
7   cost ← 0
8   S[0] ← 0                             // S[0..h] holds candidate stops; S[i] is the current station
9   i ← 0
10  h ← 0
11  j ← 0                                // the last checked station
12  while j + 1 ≤ n
13      do if d[j+1] − d[S[i]] > L       // station j+1 is out of range: fill the tank and move
14          then { G[S[i]] ← L − R[S[i]]
15                 R[S[i+1]] ← L − (d[S[i+1]] − d[S[i]])
16                 i ← i + 1             // next stop: the station just above S[i] in the stack
17               }
18          else if P[j+1] < P[S[i]]     // a cheaper station: add just enough gas to reach it
19              then { G[S[i]] ← (d[j+1] − d[S[i]]) − R[S[i]]
20                     S[h] ← j + 1      // station j+1 becomes the current station
21                     i ← h             // R[S[h]] = 0, as initialized
22                     j ← j + 1
23                   }
24              else { while P[j+1] < P[S[h]]
25                         do h ← h − 1  // pop stations that can no longer be the next stop
26                     h ← h + 1
27                     S[h] ← j + 1      // push station j+1
28                     j ← j + 1
29                   }
30  for k ← 0 to n
31      do cost ← cost + P[k]·G[k]
32  End

The correctness of the above algorithm has been explained. The time complexity is O(n), because the dominating part is the while loop at line 12. For each iteration, we have three cases.
(1) d[j+1] − d[S[i]] > L. This case increases index i, and i is never decreased, so it occurs at most O(n) times.

(2) d[j+1] − d[S[i]] ≤ L and P[j+1] < P[S[i]]. This case increases index j, and j is never decreased, so it occurs at most O(n) times.

(3) d[j+1] − d[S[i]] ≤ L and P[j+1] ≥ P[S[i]]. This case also increases index j, so it occurs at most O(n) times. Moreover, each occurrence increases h by one, so h is increased at most O(n) times in total; therefore the number of h ← h − 1 (pop) operations cannot exceed n, and the total work for this case is also bounded by O(n).
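To check the two-case strategy end to end, here is a minimal runnable Python sketch of the basic O(kn) greedy. The function name min_trip, the 0-based lists, and returning the pair (cost, G) are our own choices, not from the notes.

```python
def min_trip(P, D, L):
    """Greedy gas scheduling from station 0 (city A) to the last station (city B).

    P[i]: price per mile at station i (the last entry, the destination, is 0).
    D[i]: distance from station i-1 to station i (D[0] = 0).
    L:    miles the car can run on a full tank.
    Returns (total_cost, G), where G[i] is the gas (in miles) added at station i.
    """
    dest = len(P) - 1                # index of station n+1 (city B)
    G = [0.0] * (dest + 1)
    R = 0.0                          # miles the remaining gas can run; tank starts empty
    cost = 0.0
    i = 0
    while i < dest:
        # Scan the stations reachable on a full tank from station i.
        d = 0.0
        best, best_dist = None, 0.0  # lowest-priced reachable station
        stop, stop_dist = None, 0.0  # first reachable station with P <= P[i]
        v = i + 1
        while v <= dest and d + D[v] <= L:
            d += D[v]
            if best is None or P[v] < P[best]:
                best, best_dist = v, d
            if P[v] <= P[i]:         # Case 1 applies
                stop, stop_dist = v, d
                break
            v += 1
        if stop is not None:         # Case 1: buy just enough to reach `stop`
            G[i] = max(0.0, stop_dist - R)
            R = R + G[i] - stop_dist # gas left on arrival (normally 0)
            nxt = stop
        else:                        # Case 2: fill the tank; stop at the cheapest station
            G[i] = L - R
            R = L - best_dist        # gas left on arriving at `best`
            nxt = best
        cost += P[i] * G[i]
        i = nxt
    return cost, G

# The example from Fig. 16-E1 with L = 8:
P = [5, 6, 2, 4, 3, 7, 0]
D = [0, 4, 3, 5, 2, 3, 1]
print(min_trip(P, D, 8))   # (60.0, [7.0, 0.0, 8.0, 0.0, 3.0, 0.0, 0.0])
```

The inner scan looks at most k stations ahead, so with k constant the whole run is O(kn) = O(n), matching the analysis above.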