Primal-dual RNC approximation algorithms for (multi)-set (multi)-cover and covering integer programs

Vijay V. Vazirani†
Indian Institute of Technology, Delhi.

Sridhar Rajagopalan*
University of California, Berkeley.

*Supported by NSF PYI Award CCR 88-96202 and NSF grant IRI 91-20074. Part of this work was done when visiting IIT, Delhi.
†Partial support provided by DIMACS.

Abstract

We build on the classical greedy sequential set cover algorithm, in the spirit of the primal-dual schema, to obtain simple parallel approximation algorithms for the set cover problem and its generalizations. Our algorithms use randomization, and our randomized voting lemmas may be of independent interest. Fast parallel approximation algorithms were known before for set cover [BRS89] [LN93], though not for any of its generalizations.

1 Introduction

Given a collection of sets over a universe containing n elements, and a cost for each set, the set cover problem asks for a minimum cost sub-collection that covers all the n elements. Set multi-cover and multi-set multi-cover are natural generalizations in which elements have specified coverage requirements, and the instance consists of multi-sets, i.e., elements may occur multiple times in a set. Because of its generality, wide applicability and good combinatorial structure, the set cover problem occupies a central place in the theory of approximation algorithms. Set cover was one of the first problems shown to be NP-complete, in Karp's seminal paper [Ka72]. Soon after this, the natural greedy algorithm was shown to be an H_n factor approximation algorithm for this problem (H_n = 1 + 1/2 + ... + 1/n), by Johnson [Jo74] and Lovasz [Lo75] for the unit cost case, and by Chvatal [Ch79] for the arbitrary cost case. This approximation factor was recently shown to be essentially tight by Lund and Yannakakis [LY92]. The greedy algorithm is inherently sequential, however, and can be made to take Ω(n) iterations on suitably constructed examples.

The first parallel algorithm for approximating set cover was obtained by Berger, Rompel and Shor [BRS89]. They gave an RNC^5 algorithm, and they derandomized it to obtain an NC^7 algorithm. More recently, Luby and Nisan [LN93] obtained a (1 + ε) factor (for any constant ε > 0) NC^3 algorithm for positive linear programs. Since the integrality gap for set cover is O(log n), this approximates the cost of the optimal set cover to within an O(log n) factor. Furthermore, using a randomized rounding technique [Ra88], they can obtain an integral solution achieving the same approximation factor, thus getting an RNC^3 set cover algorithm. For special set systems, where each element occurs in a constant number of sets (this includes the vertex cover problem), Khuller, Vishkin and Young [KVY93] parallelize the approximation algorithm of Hochbaum [Ho82].

In this paper, we present a new RNC^3 approximation algorithm for set cover, and RNC^4 approximation algorithms for set multi-cover, multi-set multi-cover and covering integer programs (i.e., integer programs of the form min c·x, s.t. Ax ≥ b, where x is a vector of non-negative integers and the entries in the vectors c and b and the matrix A are all non-negative), each achieving an approximation guarantee of O(log n). In addition, our algorithm for covering integer programs achieves a better approximation factor than the best known sequential guarantee [Do82]. Our algorithms utilize several ideas from the previous works.
As described below, the basis of our work is the sequential greedy algorithm. We use the general idea from [BRS89] of relaxing the greedy set picking condition so that rapid progress is guaranteed; however, our relaxation criterion is new, and our algorithm is quite different. Also, our rounding lemma 4.1.1 is similar to that in [LN93], except that it is carried out in the integer programming framework. The greedy set cover algorithm can be viewed as a primal-dual algorithm, and can be analyzed within the powerful framework of LP-duality; this is the starting point of our work. Primal-dual considerations and randomized voting lemmas enable us to build on the sequential algorithm to obtain the parallelization. For details of the primal-dual schema in the exact and approximate settings, see [PS82] and [WGMV93].

Since the greedy sequential algorithm is based on very general principles, i.e., the primal-dual framework, and not on properties that are specific to set cover, it extends easily to solving the generalizations of set cover mentioned above. In turn, our parallel set cover algorithm also extends to these problems. The main new ideas needed for parallelizing multi-set multi-cover are in the analysis of the randomized step. These are best shown in the simpler setting of set multi-cover. It turns out that for both these problems, it is easier to parallelize the version in which each set can be picked at most once; the other version, in which sets can be picked arbitrarily many times, is an instance of covering integer programs. Finally, we give an approximation preserving NC^1 reduction from covering integer programs to multi-set multi-cover.

2 Parallel set cover algorithm

Our parallel set cover algorithm builds on the classical sequential greedy algorithm. This (sequential) algorithm is iterative; in each iteration it picks a set that covers new elements at the minimum average cost, i.e., a set that minimizes c_S/|S|, and removes these elements from consideration (c_S is the cost of set S). The algorithm terminates when all elements are covered. In an iteration of this algorithm, for an uncovered element, define its current-cost to be the minimum average cost at which it can be covered in the current iteration, i.e.,

    current-cost(e) = min_{S: e ∈ S} c_S/|S|.

Thus, in an iteration, this algorithm covers elements having least current-cost.

Our parallelization involves two central ideas. The first is a way of relaxing the criterion for choosing cost-effective sets so that rapid progress is guaranteed, and at the same time, the approximation guarantee does not go up by more than a constant factor. Our parallel algorithm is also iterative; in an iteration, the algorithm first identifies all sets that are cost-effective by the relaxed criterion (for this, the current-cost of an element is as defined above). However, there may be too many such (overlapping) sets, and a good set cover cannot contain them all. The second idea is the use of randomization for picking a sub-collection of these sets such that, on average, elements are not covered too many times. This is done in a series of phases.

PARALLEL SETCOV
Iteration:
  For each element e, compute current-cost(e).
  For each set S: include S in the collection C if
    (*)  Σ_{e ∈ S} current-cost(e) ≥ c_S/2.
  Phase:
    (a) Permute C at random.
    (b) Each element e votes for the first set S (in the random order) such that e ∈ S.
    (c) If Σ_{e: e votes for S} current-cost(e) ≥ c_S/16, then S is added to the set cover.
    (d) Delete covered elements from sets. If any set fails to satisfy (*), it is deleted from C.
  Repeat the phase until C is empty.
Iterate until all elements are covered.

Remark: Notice that current-cost(e) is computed only at the beginning of an iteration, and is not updated as the iteration proceeds. □

We will prove that our algorithm requires at most O(log n) iterations, and O(log n) phases in each iteration, and achieves an approximation guarantee of 16H_n.
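To make the flow of an iteration concrete, the following is a minimal sequential simulation of PARALLEL SETCOV; it is a sketch, not a parallel implementation. The function name, the encoding of the instance as dictionaries of sets and costs, and the use of Python's random module are our own choices rather than the paper's; the thresholds c_S/2 and c_S/16 follow the description above.

```python
import random

def parallel_setcov_sim(sets, costs):
    """Sequential simulation of one run of PARALLEL SETCOV (a sketch).

    sets: dict name -> set of elements; costs: dict name -> positive cost.
    Returns the list of picked set names."""
    uncovered = set().union(*sets.values())
    cover = []
    while uncovered:
        # current-cost(e) = min over sets S containing e of c_S / |S ∩ uncovered|
        cur = {e: min(costs[S] / len(sets[S] & uncovered)
                      for S in sets if e in sets[S])
               for e in uncovered}
        # relaxed criterion (*): keep sets whose uncovered elements' current-costs
        # add up to at least half the set's cost
        live = {S: set(sets[S] & uncovered) for S in sets
                if sum(cur[e] for e in sets[S] & uncovered) >= costs[S] / 2}
        while live:
            order = list(live)
            random.shuffle(order)                      # step (a): random permutation
            votes = {S: 0.0 for S in live}
            for e in set().union(*live.values()):
                first = next(S for S in order if e in live[S])
                votes[first] += cur[e]                 # step (b): vote for the first set
            picked = [S for S in order if votes[S] >= costs[S] / 16]   # step (c)
            cover.extend(picked)
            newly = set().union(*(live[S] for S in picked)) if picked else set()
            uncovered -= newly
            # step (d): remove covered elements, drop picked sets and sets violating (*)
            live = {S: live[S] - newly for S in live if S not in picked}
            live = {S: R for S, R in live.items()
                    if R and sum(cur[e] for e in R) >= costs[S] / 2}
    return cover
```

For instance, parallel_setcov_sim({'A': {1, 2}, 'B': {2, 3}, 'C': {3, 4}}, {'A': 1.0, 'B': 1.0, 'C': 1.0}) returns some cover of {1, 2, 3, 4}; within each phase the first set in the random order always passes the test in (c), so every phase makes progress.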
3 Key ideas behind our algorithm

As stated in the introduction, our parallel set cover algorithm emerged from LP-duality principles. In this section, we will first state the LP-relaxation of the set cover problem and recall a simple proof of the sequential algorithm. This will enable us to point out the ideas that lead to the parallelization.

3.1 The set cover problem and its LP relaxation

We can express the set cover problem as an integer program as follows. Here x_S is the 0/1 variable indicating whether set S is picked in the set cover.

IP:  min  Σ_S c_S x_S
     s.t. Σ_{S: e ∈ S} x_S ≥ 1   for each element e,
          x_S ∈ {0,1}.

The LP-relaxation and its dual are as follows.

LP:  min  Σ_S c_S x_S
     s.t. Σ_{S: e ∈ S} x_S ≥ 1   for each element e,
          x_S ≥ 0.

DP:  max  Σ_e y_e
     s.t. Σ_{e ∈ S} y_e ≤ c_S   for each set S,
          y_e ≥ 0.

The primal problem is a covering problem, and the dual is a packing problem. Henceforth we will use IP, LP and DP to refer both to the problems and to the value of the objective achieved by the specific solution being considered; the context will resolve any ambiguity. The superscript * will be used for the optimal solution. Hence, we want an approximation to IP*.

3.2 Proof of approximation guarantee for the classical greedy algorithm

As mentioned above, the sequential greedy algorithm picks a most cost-effective set in each iteration. When a set S is picked, its cost, c_S, is ascribed equally to each of the new elements that S covers. Let cost(e) be the cost ascribed to e. Notice that cost(e) is the current-cost of e in the iteration in which e was covered, and the cost of the set cover is the sum of the costs ascribed to all the elements:

    Σ_e cost(e) = cost of set cover = IP.

For each element e, let y_e = cost(e)/H_n, and let y be the vector of all y_e's.

Lemma 3.2.1  y is dual feasible.

Proof: Consider an arbitrary set S. Let a_1, a_2, ... be the reverse order in which the elements in S were covered. When a_i was covered, S had at least i uncovered elements; therefore, it was covering elements at a cost not larger than c_S/i. Since a_i was covered at the least possible cost, cost(a_i) ≤ c_S/i. Therefore,

    Σ_{e ∈ S} cost(e) ≤ c_S · H_{|S|} ≤ c_S · H_n.

Since y_e = cost(e)/H_n, we get Σ_{e ∈ S} y_e ≤ c_S, and y is dual feasible. □

The value of the dual objective is

    DP = Σ_e y_e = IP/H_n.

Hence, the greedy algorithm finds an integral primal feasible solution and a dual feasible solution such that IP = H_n · DP. By the LP-duality theorem, DP ≤ DP* = LP* ≤ IP*. Therefore, IP ≤ H_n · IP*. This completes the proof of the following theorem.

Theorem 3.1 [Ch79],[Jo74],[Lo75]  The greedy algorithm is an H_n factor approximation algorithm for set cover.

Remark: Notice that theorem 3.1 and its proof will hold even if H_n is replaced by H_k, where k is the cardinality of the largest set in the instance. This is true for our parallel algorithm as well. □
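The dual fitting in Lemma 3.2.1 is easy to check mechanically. The sketch below runs the sequential greedy algorithm, records the charge cost(e), and asserts that y_e = cost(e)/H_n is feasible for the dual packing program; the function name and the instance encoding are ours, and the instance is assumed to cover every element.

```python
from math import fsum

def greedy_cover_with_dual(sets, costs):
    """Sequential greedy set cover plus the dual-fitting certificate of
    Lemma 3.2.1 (a sketch; assumes every element appears in some set).

    Returns (cover, cover_cost, y) with y[e] = cost(e)/H_n, and asserts that
    sum_{e in S} y[e] <= c_S for every set S (dual feasibility)."""
    elements = set().union(*sets.values())
    H_n = fsum(1.0 / i for i in range(1, len(elements) + 1))
    uncovered, cover, charge = set(elements), [], {}
    while uncovered:
        # pick a set minimizing average cost over the elements it newly covers
        best = min((S for S in sets if sets[S] & uncovered),
                   key=lambda S: costs[S] / len(sets[S] & uncovered))
        newly = sets[best] & uncovered
        for e in newly:                      # ascribe c_S equally to the new elements
            charge[e] = costs[best] / len(newly)
        cover.append(best)
        uncovered -= newly
    y = {e: charge[e] / H_n for e in elements}
    assert all(fsum(y[e] for e in sets[S]) <= costs[S] + 1e-9 for S in sets)
    return cover, fsum(charge.values()), y
```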
3.3 Relaxing the set picking criterion, and the primal-dual framework

The greedy algorithm is inherently sequential, and it is easy to construct examples on which it takes Ω(n) iterations. Notice that in an iteration, this algorithm considers a set S for inclusion in the set cover only if Σ_{e ∈ S} current-cost(e) = c_S, and it picks any one of these sets. Our central idea for achieving parallelism is to relax appropriately this criterion for choosing cost-effective sets so that rapid progress is guaranteed. Furthermore, we still want to attribute the cost of the picked sets to elements in such a way that the amount charged to an element, called cost(e), is at most α · current-cost(e) in the iteration in which the element was covered, where α is a constant. Then, by essentially the proof of lemma 3.2.1, y_e = cost(e)/(α H_n) will again be a dual feasible solution, and our algorithm would achieve a factor of α H_n.

We first give a straightforward way of relaxing the criterion for picking sets, together with an example to show that the resulting algorithm can be forced to take Θ(n) iterations.

Example: Number the sets arbitrarily. In each iteration, each element votes for the lowest-numbered set covering it at the least cost, and any set receiving votes from sufficiently many of its elements adds itself to the set cover. This process is repeated until all elements are covered. (Requiring votes from all of a set's elements essentially gives us the greedy algorithm.) This algorithm takes Ω(n) iterations on the following input. Let the universal set be U = {u_i, v_i : i = 1 ... n}. Let the sets be U_i = {u_i, u_{i+1}, v_{i+1}} and V_i = {v_i, u_{i+1}, v_{i+1}}, for i = 1 ... (n−1). Finally, choose the costs of the sets so that U_i and V_i have the same cost, and the cost of U_i is greater than the cost of U_{i+1}. □

Our criterion for identifying cost-effective sets in an iteration is Σ_{e ∈ S} current-cost(e) ≥ c_S/2. These sets are included in the collection C. As pointed out in Section 2, these sets may have too many overlaps. For example, if the sets were all possible pairs of elements, each with cost 1, then all sets would be identified as cost-effective. Clearly, we will not be able to attribute the cost of all these sets to elements so that elements are not over-charged (i.e., so that cost(e) ≤ α · current-cost(e) for some constant α). Our algorithm gets around this problem as follows: we will pick an ordering of the sets in C, and then each element will give a vote of value α · current-cost(e) (for a suitable choice of α) to the first set in this ordering that contains this element. A set gets picked in the set cover if it gets a substantial fraction of its cost as votes, and it attributes its cost to the elements that voted for it. Clearly the first set in the ordering will get picked; however, that may not be sufficient to guarantee rapid progress. We do not know of any deterministic way of picking the ordering so that votes get concentrated on several sets. Instead, we show that picking a random ordering works. This is done in phases, until each set in C is either picked, or is identified as not cost-effective any more (since elements that get covered are removed from all sets). This is the only use of randomness in our algorithm.

Our criterion for picking sets can now be stated in terms of the element dual variables, the y_e's. The main difference between our criterion and the one given in the above-stated example is that elements of different current-costs can contribute to a set's being picked, and their contribution is proportional to their current-costs. This weighted mixing of diverse elements, and the corresponding better use of dual variables, appears to lend power to our criterion.

It will be useful to view the greedy algorithm and our parallelization in the framework of the primal-dual schema. Such an algorithm starts with a primal infeasible solution and a dual feasible solution, and it iteratively improves the feasibility of the primal solution and the optimality of the dual solution, terminating when it obtains a primal feasible solution. This framework is particularly useful for obtaining approximation algorithms for NP-hard combinatorial optimization problems that can be stated as integer programming problems: the schema works with the LP-relaxation and its dual, and always seeks an integral extension to the primal solution. On termination, the primal integral feasible solution can be compared directly with the dual feasible solution obtained, since the latter is a lower bound on the fractional, and hence integral, OPT (assuming we started with a minimization problem). The framework leaves sufficient scope for effectively using the combinatorial structure of the problem at hand: in designing the algorithm for the improvement steps, and in carrying out the proof of the approximation guarantee.

The greedy algorithm can be viewed as a primal-dual algorithm as follows. At any point in the algorithm, for element e, let y_e be cost(e)/H_n if e is already covered, and current-cost(e)/H_n if e is not yet covered. Then, by essentially the proof of lemma 3.2.1, it can be shown that y is a dual feasible solution. The currently picked sets constitute the primal solution. By replacing H_n by α · H_n in setting the y_e's, our algorithm can also be viewed as a primal-dual algorithm.
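For reference, the accounting that any such relaxation must preserve can be written in one line; this is only a restatement of the argument above, in the paper's own notation, with α the constant from the charging scheme.

```latex
% If every element is charged cost(e) <= alpha * current-cost(e) at the moment it
% is covered, and y_e := cost(e) / (alpha * H_n) is dual feasible, then
\[
  \text{cost of cover}
  \;=\; \sum_e \mathrm{cost}(e)
  \;=\; \alpha H_n \sum_e y_e
  \;\le\; \alpha H_n \cdot \mathrm{DP}^{*}
  \;=\; \alpha H_n \cdot \mathrm{LP}^{*}
  \;\le\; \alpha H_n \cdot \mathrm{IP}^{*}.
\]
```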
4 Approximation guarantee and running time

Lemma 4.0.1  PARALLEL SETCOV is a 16H_n factor approximation algorithm for set cover.

Proof: Each set that is included in the set cover ascribes its cost to the elements that voted for it. Because of step (c) in the algorithm, this can be done without charging an element e more than 16 · current-cost(e) at the moment e was covered. Since Σ_e cost(e) = cost of the set cover, and y_e = cost(e)/(16 H_n) is dual feasible, the lemma follows. □

4.1 Bounding the number of iterations

We will first show that the set costs can be modified so that they lie in a polynomial range.

Lemma 4.1.1  It suffices to consider the case c_S ∈ [1, n²].

Proof: Define β = max_e min_{S: e ∈ S} c_S. Then it is easy to verify that β ≤ IP* ≤ nβ. Any set with cost more than nβ could not possibly be in the optimum set cover; hence, we can omit all such sets from consideration. Further, if there are e and S such that e ∈ S and c_S ≤ β/n, we cover e immediately using S. Since there are at most n elements, the total cost incurred is at most an additional β. Since β is a lower bound on IP*, this cost is subsumed in the approximation. The remaining costs lie in (β/n, nβ], and rescaling by β/n puts them in the range [1, n²]. □

Lemma 4.1.2  In an iteration, define α = min_e current-cost(e). Then α increases by at least a factor of 2 after each iteration.

Proof: At any point in the current iteration, consider a set S that has not yet been included in the set cover and such that c_S ≤ 2α|S|, where |S| is the number of uncovered elements in S. Then, by the definition of α,

    Σ_{e ∈ S} current-cost(e) ≥ |S| · α ≥ c_S/2.

Thus S satisfies (*), and must have satisfied it at the inception of the iteration; thus S ∈ C. However, at the end of an iteration, C is empty. Thus, for every remaining set S, c_S > 2α|S|. This ensures that in the next iteration, α increases by a factor of 2. □

Since α is restricted to [1/n, n²] as an immediate consequence of lemma 4.1.1 above, and it doubles with each iteration, we obtain the following lemma.

Lemma 4.1.3  PARALLEL SETCOV requires at most O(log n) iterations.
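A sketch of this preprocessing in code follows. The function name, the dictionary encoding, and the decision to return the forced cheap sets separately are our own; the thresholds β/n and nβ and the target range [1, n²] are the ones from Lemma 4.1.1.

```python
def normalize_costs(sets, costs):
    """Preprocessing behind Lemma 4.1.1 (a sketch): drop sets that are too
    expensive, cover elements that are almost free, and rescale the remaining
    costs so that they lie in a polynomial range.

    Returns (kept_costs, forced_sets); forced_sets can be picked outright."""
    elements = set().union(*sets.values())
    n = len(elements)
    # beta = max over elements of the cheapest set covering that element
    beta = max(min(costs[S] for S in sets if e in sets[S]) for e in elements)
    forced = {S for S in sets if costs[S] <= beta / n}       # almost-free coverage
    kept = {S: costs[S] for S in sets
            if beta / n < costs[S] <= n * beta}              # beta <= OPT <= n*beta
    # rescale so the kept costs lie in [1, n^2]
    kept = {S: c / (beta / n) for S, c in kept.items()}
    return kept, forced
```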
4.2 Bounding the number of phases in each iteration

An easy argument for bounding the number of phases in an iteration by O(log² n) can be based on the following intuition. Let the degree of an element be the number of sets in C in which it occurs. Elements with large degree will occur early in the random ordering of C, and these early sets stand a good chance of being picked. Hence such elements get covered with constant probability, and the maximum degree decreases by a constant factor after O(log n) phases. Below we show that the number of phases in an iteration is bounded by O(log nm), by considering the relative degrees of elements in each set and using a potential function that weights each element to the extent of its degree.

Call a set-element pair (S,e), e ∈ S, good if deg(e) ≥ deg(f) for at least 3/4 of the elements f ∈ S. For such a pair, we will show that if e votes for S then S gets picked with constant probability. This is so because the low degree elements in S are unlikely to occur in a set before S in the random ordering of C.

Lemma 4.2.1  Let e, f ∈ S be such that deg(e) ≥ deg(f). Then

    prob[f votes for S | e votes for S] ≥ 1/2.

Proof: Let N_e, N_f and N_b be the number of sets that contain e but not f, f but not e, and both e and f, respectively. deg(e) ≥ deg(f) implies that N_e ≥ N_f. Hence,

    prob[f votes for S | e votes for S] = (N_e + N_b)/(N_e + N_b + N_f) ≥ 1/2. □

Lemma 4.2.2  If (S,e) is good, then

    prob(S is picked | e votes for S) ≥ 1/15.

Proof: For any element f ∈ S, current-cost(f) ≤ c_S/|S|, since S covers f at this average cost. So, the total current-cost of elements f such that (S,f) is good is at most c_S/4. Furthermore, since the total current-cost of all elements in S is at least c_S/2, the total current-cost of elements f such that (S,f) is not good is at least c_S/4. By lemma 4.2.1, each of the latter elements votes for S with probability at least 1/2, conditioned on the event that e votes for S, and so the expected vote that S gets is at least c_S/8. Since the total vote to S is bounded by c_S, the lemma follows from Markov's inequality. □

Lemma 4.2.3  O(log nm) phases suffice to complete any iteration.

Proof: (Sketch) We will consider the following potential function:

    Φ = Σ_e deg(e),

where the sum is over the elements not yet covered, so that Φ is the number of set-element pairs currently in C. We will ascribe the decrease in Φ, when an element e votes for a set S and S is subsequently added to the set cover, to the set-element pair (S,e). The decrease in Φ that is then ascribed to (S,e) is deg(e), since Φ decreases by 1 for each set containing e. Since e voted for only one set, any decrease in Φ is ascribed to only one (S,e) pair. Thus the expected decrease in Φ is at least

    E(ΔΦ) ≥ Σ_{(S,e) good} prob(e voted for S) · prob(S was picked | e voted for S) · deg(e)
          ≥ (1/15) · Σ_{(S,e) good} prob(e voted for S) · deg(e)
          = (1/15) · (number of good (S,e) pairs),

since prob(e votes for S) = 1/deg(e). Finally, since at least a 1/4 fraction of all (S,e) pairs are good, we observe that E(ΔΦ) ≥ Φ/60. Thus, Φ decreases by a constant factor in each phase. Since Φ starts at a value of at most nm, in O(log nm) phases it will be down to zero and the iteration will be over. □

Clearly, each phase can be implemented in O(log n) time on a linear number of processors (linear in the size of the instance, i.e., the sum of the cardinalities of all sets). Hence we get:

Theorem 4.1  PARALLEL SETCOV is an RNC^3 algorithm that approximates set cover to within 16H_n, using a linear number of processors.

Comment: The constant 16 can be reduced to 2 + ε for any ε > 0. The factor 2 is inherent in the probabilistic analysis of lemma 4.2.1. □
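Lemma 4.2.1 is easy to sanity-check numerically. The sketch below estimates the conditional probability on an abstract instance described only by deg(e), deg(f) and the number of common sets; the function name and its parameters are our own, and the exact value (N_e + N_b)/(N_e + N_b + N_f) from the proof is noted in the comment.

```python
import random

def conditional_vote_probability(deg_e, deg_f, common, trials=100_000):
    """Monte Carlo check of Lemma 4.2.1 on an abstract instance: e lies in
    deg_e sets, f in deg_f sets, `common` of which (including S) contain both.
    Estimates prob[f votes for S | e votes for S] under a random ordering."""
    n_e, n_f, n_b = deg_e - common, deg_f - common, common   # e-only, f-only, both
    labels = ['S'] + ['b'] * (n_b - 1) + ['e'] * n_e + ['f'] * n_f
    cond = both = 0
    for _ in range(trials):
        random.shuffle(labels)
        first_e = next(x for x in labels if x in ('S', 'b', 'e'))  # e's first set
        if first_e == 'S':
            cond += 1
            first_f = next(x for x in labels if x in ('S', 'b', 'f'))
            both += first_f == 'S'
    return both / cond if cond else float('nan')

# e.g. deg(e)=6 >= deg(f)=4 with 2 common sets: the estimate should be >= 1/2,
# close to (N_e + N_b)/(N_e + N_b + N_f) = 6/8.
# print(conditional_vote_probability(6, 4, 2))
```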
5 Set multi-cover

The version of set multi-cover that we will first parallelize is the one in which each set can be picked at most once. Dobson [Do82] shows how the greedy set cover algorithm extends to this problem; however, he compares the solution obtained to the integral optimum IP*. For our purpose, we want to compare the solution obtained to the fractional optimum LP*, and we will first present this argument.

5.1 Sequential analysis

Assume each element e has a coverage requirement r_e. The greedy set cover algorithm extends as follows: say that an element is alive until it is covered to the extent of its requirement. In each iteration, pick a set that covers alive elements at the least average cost, and remove all elements that are not alive from consideration. The LP-relaxation and its dual for this problem are:

LP:  min  Σ_S c_S x_S
     s.t. Σ_{S: e ∈ S} x_S ≥ r_e   for each element e,
          x_S ≤ 1   for each set S,
          x_S ≥ 0.

DP:  max  Σ_e r_e y_e − Σ_S z_S
     s.t. Σ_{e ∈ S} y_e − z_S ≤ c_S   for each set S,
          y_e ≥ 0,  z_S ≥ 0.

When a set S is picked, its cost c_S is ascribed to the (S,e) pairs for each alive element e it covers, i.e., for each such pair, cost(S,e) = c_S/|S|, where |S| is the number of alive elements in S. For each element e, define max-cost(e) = max_S {cost(S,e)}, and let y_e = max-cost(e)/H_n. If a set S is not picked, let z_S = 0, and otherwise let

    z_S = Σ_{e covered by S} (max-cost(e) − cost(S,e)) / H_n.

Lemma 5.1.1  (y, z) is dual feasible.

Proof: Suppose S is not picked in the set cover. Then, by ordering the elements in the reverse order in which their requirement was satisfied, and using the proof of lemma 3.2.1, it is easy to check that Σ_{e ∈ S} y_e ≤ c_S. Next assume S is picked in the set cover. Then,

    Σ_{e ∈ S} y_e − z_S = (1/H_n) [ Σ_{e ∈ S, e not covered by S} max-cost(e) + Σ_{e covered by S} cost(S,e) ].

Now, by ordering the elements covered by S first, and the remaining elements in the reverse order in which their requirement was satisfied, the right hand side can be shown to be bounded by c_S. □

Lemma 5.1.2  The cost of the set multi-cover satisfies IP = Σ_{(S,e)} cost(S,e) = DP · H_n.

Proof: The first equality follows from the manner in which the cost of the picked sets is attributed to the set-element pairs. For the second equality, notice that

    DP · H_n = Σ_e r_e · max-cost(e) − Σ_S Σ_{e covered by S} (max-cost(e) − cost(S,e)) = Σ_{(S,e)} cost(S,e) = IP. □

Theorem 5.1  The extended greedy algorithm finds a set multi-cover within an H_n factor of LP*.

5.2 Parallel algorithm

PARALLEL SETMULTCOV
Iteration:
  For each element e, compute current-cost(e).
  For each set S: include S in the collection C if
    (*)  Σ_{e ∈ S} current-cost(e) ≥ c_S/2.
  Phase:
    Initialization: Let r(e) be the dynamic requirement, i.e., the requirement that is yet to be satisfied. Furthermore, r(e) is restricted to be at most deg(e) in C; if not, set r(e) = deg(e).
    (a) Permute C at random.
    (b) Each element e votes for the first r(e) sets it belongs to in the random ordering of C.
    (c) If Σ_{e: e votes for S} current-cost(e) ≥ c_S/32, then S is added to the set cover.
    (d) Delete from C the sets that are picked or that fail to satisfy (*). Decrement r(e) to the smaller of the dynamic requirement and deg(e) in C.
  Repeat the phase until C is empty.
Iterate until all elements are covered.

5.3 Analysis

The main difference in the analysis is in bounding the number of phases in an iteration. The other differences are small, and we will sketch them first. When a set is picked, it assigns to each set-element pair (S,e) that voted for it a cost proportional to current-cost(e), which by step (c) is at most 32 · current-cost(e). It is easy to see that if the y_e's and z_S's are scaled down by a factor of 32, (y, z) is dual feasible (by essentially the proof of lemma 5.1.1). Hence the algorithm achieves an approximation factor of 32H_n.

The number of iterations is bounded by O(log nm). The only change required for proving this is in the definition of β: let S_1, S_2, ..., S_m be the sets arranged in increasing order of cost, let β_e be the cost of the r_e-th set in this sequence containing e, and let β = max_e β_e. Then β ≤ IP* ≤ mnβ. The number of phases required in an iteration is at most O(log² nm).
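One voting phase of PARALLEL SETMULTCOV, written out as a sequential sketch. The function name, the argument encoding, and the name current_cost are our own; the c_S/32 threshold follows the description above, and the dynamic requirements r[e] are assumed to be already capped at deg(e).

```python
import random

def multicover_phase_votes(live_sets, costs, current_cost, r):
    """One voting phase of PARALLEL SETMULTCOV (a sequential sketch).

    live_sets: dict S -> set of alive elements of S currently in the collection C;
    r[e] is the dynamic requirement, already capped at deg(e) in C.
    Returns the sets passing the c_S/32 test for this random ordering."""
    order = list(live_sets)
    random.shuffle(order)                                   # step (a)
    votes = {S: 0.0 for S in live_sets}
    for e in set().union(*live_sets.values()):
        containing = [S for S in order if e in live_sets[S]]
        for S in containing[:r[e]]:                         # step (b): first r(e) sets
            votes[S] += current_cost[e]
    return [S for S in order if votes[S] >= costs[S] / 32]  # step (c)
```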
The main differences in the proof are in proving a lemma analogous to lemma 4.2.1, and in extending the potential function to take care of multiple requirements.

Lemma 5.3.1  Let e, f ∈ S. Then

    prob(e and f both vote for S) ≥ 1 / ( 2 [ (deg(e) − 1)/r(e) + (deg(f) − 1)/r(f) ] ).

Proof: Consider picking the random permutation of C by first choosing a real number X_S ∈ [0,1] independently for each S, and then sorting the sets by X_S. Define an indicator variable I(e,T) ∈ {0,1}, with I(e,T) = 1 iff e ∈ T. We begin by conditioning on the value of X_S:

    prob(e and f vote for S) = ∫_0^1 prob(e and f vote for S | X_S = x) dx
        ≥ ∫_0^θ [ 1 − prob(e does not vote for S | X_S = x) − prob(f does not vote for S | X_S = x) ] dx.    (1)

Here θ ∈ [0,1] is a parameter that we will choose later to maximize the right hand side. We now focus on the term prob(e does not vote for S | X_S = x). Define Y_e(x) to be the number of occurrences of e in sets earlier than S, i.e., Y_e(x) = Σ_{T ≠ S: X_T < x} I(e,T). It is easy to see that the expectation of Y_e(x) is E(Y_e(x)) = x · (deg(e) − 1). Finally, we notice that e does not vote for S iff Y_e(x) ≥ r(e). Applying Markov's inequality to Y_e(x), we get

    prob(e does not vote for S | X_S = x) ≤ x · (deg(e) − 1)/r(e).

Substituting this value back in (1), we obtain

    prob(e and f vote for S) ≥ θ − (θ²/2) · [ (deg(e) − 1)/r(e) + (deg(f) − 1)/r(f) ].

Choosing the value of θ to maximize the right hand side, we get the claimed bound. □

Remark: This lemma has interpretations in other situations, such as permutations of n white and m black balls, and up-right walks in Manhattan grids. □

Corollary 5.3.2  Let e, f ∈ S be such that (deg(e) − 1)/r(e) ≥ (deg(f) − 1)/r(f). Then

    prob(f votes for S | e votes for S) ≥ 1/4.

Say that a set-element pair (S,e) is good if (deg(e) − 1)/r(e) ≥ (deg(f) − 1)/r(f) for at least 3/4 of the elements f ∈ S. By essentially the proof of lemma 4.2.2, one can show that if (S,e) is good, then

    prob(S is picked | e votes for S) ≥ 1/31.

The potential function we use is

    Φ = Σ_e deg(e) · H_{r(e)},

where the sum is over the alive elements. The initial value of Φ is at most mn · log r, where r is the largest requirement. The expected decrease in Φ ascribable to a good set-element pair (S,e) is at least

    prob(e voted for S) · prob(S was picked | e voted for S) · deg(e)/r(e) ≥ (r(e)/deg(e)) · (1/31) · (deg(e)/r(e)) = 1/31.

Since at least a 1/4 fraction of all (S,e) pairs are good, and Φ is at most H_r times the number of (S,e) pairs, the expected decrease in Φ after a phase is at least Ω(Φ/log r). Hence, the number of phases required in an iteration is O(log(nm) · log r) in expectation. Since each phase can be implemented in O(log n) time on a linear number of processors, we get:

Theorem 5.2  PARALLEL SETMULTCOV is an RNC^4 algorithm that approximates set multi-cover to within 32H_n, using a linear number of processors.
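The conditional voting probability of Corollary 5.3.2 can also be estimated numerically. The sketch below uses the X_S ∈ [0,1] construction from the proof of Lemma 5.3.1; the function name and parameters are our own, and the constant 1/4 in the comment follows the reconstruction of the corollary given above.

```python
import random

def multicover_conditional_vote(deg_e, r_e, deg_f, r_f, common, trials=200_000):
    """Monte Carlo companion to Corollary 5.3.2 (a sketch).  Every set T gets an
    independent X_T ~ U[0,1] and the collection is sorted by X_T, as in the proof
    of Lemma 5.3.1.  An element votes for S if fewer than r of the other sets
    containing it land before S.  `common` counts the sets containing both e and
    f, including S itself."""
    n_e, n_f, n_b = deg_e - common, deg_f - common, common - 1
    cond = both = 0
    for _ in range(trials):
        x_s = random.random()
        before_common = sum(random.random() < x_s for _ in range(n_b))
        before_e = before_common + sum(random.random() < x_s for _ in range(n_e))
        before_f = before_common + sum(random.random() < x_s for _ in range(n_f))
        if before_e < r_e:                      # e votes for S
            cond += 1
            both += before_f < r_f              # f votes for S as well
    return both / cond if cond else float('nan')

# With (deg(e)-1)/r(e) = 3.5 >= (deg(f)-1)/r(f) = 2.0, the estimate should be
# at least 1/4 (it is typically much larger):
# print(multicover_conditional_vote(deg_e=8, r_e=2, deg_f=5, r_f=2, common=3))
```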
6 Multi-set multi-cover

In the multi-set multi-cover problem, the sets can have multiple copies of an element; clearly, we may assume that the multiplicity of an element is bounded by its requirement. We will consider the version of this problem in which a set can be picked at most once. For technical reasons pointed out later, we will assume that the requirement of each element is bounded by some fixed polynomial in n. The IP formulation is as follows:

IP:  min  Σ_S c_S x_S
     s.t. Σ_S M(S,e) · x_S ≥ r_e   for each element e,
          x_S ∈ {0,1},

where M(S,e) denotes the multiplicity of e in S.

The only change needed in the sequential algorithm for this problem, compared to the set multi-cover case, is that multiplicities have to be taken into consideration for evaluating the cost-effectiveness of a set. For the parallel algorithm, in step (b), an element votes for the first r(e) copies of itself, and the constant 32 in step (c) changes to 128. For the analysis, we will assume that the copies of an element in a set are numbered, and the votes are given in this order. If (e,k) is the kth copy of element e in set S, define r'(e,k) = r(e) − k + 1. Define deg'(e,k) to be the degree of e in a set system in which each set has been truncated to contain at most r'(e,k) copies of e. Consider two copies of two elements (not necessarily different) in set S. The statement of lemma 5.3.1 changes only in that r(e) and deg(e) are replaced by r'(e,k) and deg'(e,k), and the proof is also identical. We need the following additional lemma.

Lemma 6.0.3  Consider the kth copy of e, (e,k), in S. Then prob((e,k) votes for S) is bounded above and below by constant multiples of r'(e,k)/deg'(e,k).

Proof: (Sketch) The lower bound follows from considerations similar to those in the proof of lemma 5.3.1. In particular, we need to bound

    prob((e,k) votes for S) = ∫_0^1 [ 1 − prob((e,k) does not vote for S | X_S = x) ] dx
                            ≥ ∫_0^θ [ 1 − x · (deg'(e,k) − 1)/r'(e,k) ] dx.

Choosing θ appropriately, we get the lower bound. For the upper bound, we notice that with each permutation of the sets such that (e,k) votes for S, we can associate Ω(deg'(e,k)/r'(e,k)) permutations such that (e,k) does not vote for S. These permutations are constructed by repeatedly performing 'rotations' as shown in figure 1. A rotation is performed by choosing a chunk of sets containing at least r'(e,k) copies of e, enumerated from the rear of the permutation, and putting them in the front. Since, in the truncated system, any set has at most r'(e,k) copies of e, each chunk has between r'(e,k) and 2r'(e,k) copies of e, so we can rotate at least deg'(e,k)/(2r'(e,k)) − 1 times. This yields the claimed permutations, and the lemma follows, using the fact that r'(e,k) ≤ deg'(e,k). □

Figure 1: A rotation. (Figure not reproduced.)

Corollary 6.0.4  Let (e,k) and (f,l) be the kth and lth copies of e and f in S, with (deg'(f,l) − 1)/r'(f,l) ≥ (deg'(e,k) − 1)/r'(e,k). Then prob((e,k) votes for S | (f,l) votes for S) is bounded below by a constant.

The rest of the proof is identical to the set multi-cover case, yielding an O(log n) factor RNC^4 approximation algorithm for multi-set multi-cover. The number of processors required is linear in the sum of the cardinalities of all sets, taking multiplicities into consideration.
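The bookkeeping for copies is mechanical; the sketch below computes r'(e,k) and deg'(e,k) for a toy instance, under the reading that deg' counts the occurrences of e that survive the truncation, as the definitions above suggest. The function name and the dict-of-dicts encoding are our own.

```python
def truncated_degrees(multisets, e, r_e):
    """Sketch of the bookkeeping used in Section 6: for the k-th copy of element
    e, r'(e,k) = r(e) - k + 1, and deg'(e,k) is the degree of e after every set
    has been truncated to contain at most r'(e,k) copies of e.

    multisets: dict S -> dict element -> multiplicity."""
    out = {}
    for k in range(1, r_e + 1):
        r_prime = r_e - k + 1
        # truncation caps the number of copies of e that each set contributes
        deg_prime = sum(min(mult.get(e, 0), r_prime) for mult in multisets.values())
        out[k] = (r_prime, deg_prime)
    return out

# e.g. three sets containing e with multiplicities 3, 1, 2 and r(e) = 3:
# print(truncated_degrees({'A': {'e': 3}, 'B': {'e': 1}, 'C': {'e': 2}}, 'e', 3))
```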
7 Covering integer programs

Covering integer programs are integer programs of the following form:

CIP:  min  Σ_S c_S x_S
      s.t. Σ_S M(S,e) · x_S ≥ R_e   for each element e,
           x_S ∈ Z_+.

Here, Z_+ denotes the non-negative integers. Without loss of generality, we may assume that M(S,e) ≤ R_e. These programs differ from multi-set multi-cover in that the multiplicities and requirements are arbitrary non-negative real numbers, and sets can be picked any integral number of times. Notice that the multi-set multi-cover problem with the restriction that each set can be picked at most once is not a covering integer program, since we will have negative coefficients. This raises the question whether there is a larger class than covering integer programs for which we can achieve (even sequentially) an O(log n) approximation factor.

Lemma 7.0.5  There is an approximation preserving NC^1 reduction from covering integer programs to multi-set multi-cover.

Proof: Essentially, we need to reduce the element requirements to a polynomial range, and ensure that the requirements and multiplicities are integral. Then, replicating sets to the extent of the largest element requirement, we would get an instance of multi-set multi-cover. We will achieve the first requirement as follows: we will obtain a crude approximation to OPT within a polynomial factor, and then we will ensure that the costs of sets are in a polynomial range around this estimate. Once this is done, we will show that rounding down multiplicities and rounding up requirements will not cause much loss in the approximation guarantee.

Let β_e = R_e · min_S (c_S/M(S,e)), and let β = max_e β_e. Clearly, β ≤ CIP*. On the other hand, it is possible to cover e at cost 2β_e by the following reasoning. Let S_e be the set minimizing c_S/M(S,e). Then, we can cover e by picking ⌈R_e/M(S_e,e)⌉ copies of S_e. The cost incurred is at most ⌈R_e/M(S_e,e)⌉ · c_{S_e}, which in turn is at most 2β_e, since R_e ≥ M(S_e,e). Therefore, all elements can be covered at cost at most 2 Σ_e β_e, which is at most 2nβ. Hence, CIP* ≤ 2nβ.

If a set S has large cost, i.e., c_S > 2nβ, then S will not be used by OPT, and so we can eliminate it from consideration. So, for the remaining sets, c_S ≤ 2nβ. Define a_S = ⌈β/(n·c_S)⌉. We will clump together a_S copies of S; the cost of the set so created is at least β/n. If any element is covered to its requirement by a set of cost at most β/n, we cover the element using that set. The cost incurred in the process is at most β for all elements so covered, and this is subsumed in the approximation factor. The reason we require a_S to be integral is that we need to map back solutions from the reduced problem to the original problem. Henceforth, we can assume that the costs satisfy c_S ∈ [β/n, 2nβ]; normalizing by β/n, the cost of each set is at least 1 and the optimum value is at most 2n².

Since the cost of each set is at least 1 and the optimum value is at most 2n², we can deduce that the number of sets S with x*_S > 0 is at most 2n². We set M'(S,e) = ⌊4n³ · M(S,e)/R_e⌋, and we also set r'_e = 4n³. Now, it is easily seen that the total rounding error introduced is at most 1/(2n) per element. Since the value of the optimum is at least 1, this amount is subsumed in the approximation factor. This completes the reduction. □

Theorem 7.1  There is an O(log n) factor RNC^4 approximation algorithm for covering integer programs, requiring O(n² · #(A)) processors, where #(A) is the number of non-zero entries in A.

The previous best approximation factor for covering integer programs was H_{n·max(A)}, where max(A) is the largest entry in A, assuming that the smallest one is 1 [Do82].
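A sketch of the preprocessing steps in this reduction is given below. The clumping of very cheap sets and the final replication into a multi-set multi-cover instance are omitted; the function name, the encoding, and the concrete parameters 2nβ, β/n and 4n³ follow the reconstruction of the proof above and should be read as illustrative.

```python
from math import floor

def reduce_cip_instance(mult, costs, req):
    """Sketch of the first steps of the reduction in Lemma 7.0.5: compute the
    crude estimate beta, discard sets that are too expensive, rescale costs,
    and round multiplicities/requirements to integers in a polynomial range.

    mult: dict S -> dict e -> M(S,e) > 0; costs: dict S -> c_S; req: dict e -> R_e."""
    n = len(req)
    # beta_e = R_e * min_S c_S / M(S,e);  beta <= OPT <= 2n*beta
    beta = max(req[e] * min(costs[S] / mult[S][e] for S in mult if e in mult[S])
               for e in req)
    kept = {S for S in mult if costs[S] <= 2 * n * beta}   # expensive sets are useless
    scale = beta / n
    new_costs = {S: costs[S] / scale for S in kept}        # now roughly in [1, 2n^2]
    # integral, polynomially bounded requirements and multiplicities
    new_req = {e: 4 * n ** 3 for e in req}
    new_mult = {S: {e: floor(mult[S][e] * 4 * n ** 3 / req[e]) for e in mult[S]}
                for S in kept}
    return new_mult, new_costs, new_req
```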
References

[BRS89] Berger, B., Rompel, J., and Shor, P. Efficient NC algorithms for set cover with applications to learning and geometry. 30th IEEE Symposium on Foundations of Computer Science (1989), proceedings, pp. 54-59.

[Ch79] Chvatal, V. A greedy heuristic for the set covering problem. Math. Oper. Res. 4, pp. 233-235.

[Do82] Dobson, G. Worst-case analysis of greedy heuristics for integer programming with non-negative data. Math. Oper. Res. 7, pp. 515-531.

[Ho82] Hochbaum, D.S. Approximation algorithms for the set covering and vertex cover problems. SIAM J. Comput. 11(3), pp. 555-556.

[Jo74] Johnson, D.S. Approximation algorithms for combinatorial problems. J. Comput. Sys. Sci. 9, pp. 256-278.

[Ka72] Karp, R.M. Reducibility among combinatorial problems. In Complexity of Computer Computations, R.E. Miller and J.W. Thatcher (eds.), Plenum Press, New York, pp. 85-103.

[LN93] Luby, M., and Nisan, N. A parallel approximation algorithm for positive linear programming. 25th ACM Symposium on Theory of Computing (1993), proceedings, pp. 448-457.

[KVY93] Khuller, S., Vishkin, U., and Young, N. A primal-dual parallel approximation technique applied to weighted set and vertex cover. Third IPCO Conference (1993), proceedings, pp. 333-341.

[Lo75] Lovasz, L. On the ratio of optimal integral and fractional covers. Discrete Math. 13, pp. 383-390.

[LY92] Lund, C., and Yannakakis, M. On the hardness of approximating minimization problems. 25th ACM Symposium on Theory of Computing (1993), proceedings, pp. 286-293.

[PS82] Papadimitriou, C.H., and Steiglitz, K. Combinatorial Optimization: Algorithms and Complexity. Prentice Hall, New Jersey, 1982.

[Ra88] Raghavan, P. Probabilistic construction of deterministic algorithms: approximating packing integer programs. J. Comput. Sys. Sci. 37, pp. 130-143.

[WGMV93] Williamson, D.P., Goemans, M.X., Mihail, M., and Vazirani, V.V. A primal-dual approximation algorithm for generalized Steiner network problems. 25th ACM Symposium on Theory of Computing (1993), proceedings, pp. 708-717.