Algorithms XI
Rounding
Guoqiang Li
School of Software, Shanghai Jiao Tong University
Vertex Cover, Revisited
VC in ILP
Vertex cover
Given an undirected graph G = (V, E) and a cost function on vertices c : V → Q⁺, find a minimum cost vertex cover, i.e., a set V′ ⊆ V such that every edge has at least one endpoint in V′. The special case in which all vertices have unit cost is called the cardinality vertex cover problem.
minimize Σ_{vi∈V} c(vi)·xi
subject to xi + xj ≥ 1, (i, j) ∈ E
xi ∈ {0, 1}, vi ∈ V
A Natural Bound
Relaxing xi ∈ {0, 1} to 0 ≤ xi ≤ 1 gives the LP relaxation, whose optimal value OPTf satisfies OPTf ≤ OPT.
Half-Integral
Lemma:
Every vertex (extreme point solution) of the polytope defined by the LP relaxation of VC is half-integral, i.e., all its coordinates have value 0, 1/2, or 1.
Proof
Let x be a feasible solution that contains values other than 0, 1/2, and 1. We claim that x is not an extreme point of the polytope. First note that there can be no (i, j) ∈ E with both xi < 1/2 and xj < 1/2, since the constraint xi + xj ≥ 1 would be violated. Hence, whenever xi ∈ (0, 1/2) for some edge (i, j) ∈ E, the other endpoint satisfies xj > 1/2.
Proof (Cont.)
For some small but positive ε, consider the two solutions
x⁺: xi⁺ = xi + ε if xi ∈ (1/2, 1); xi⁺ = xi − ε if xi ∈ (0, 1/2); xi⁺ = xi otherwise.
x⁻: xi⁻ = xi − ε if xi ∈ (1/2, 1); xi⁻ = xi + ε if xi ∈ (0, 1/2); xi⁻ = xi otherwise.
We can choose ε small enough that no value leaves the intervals (0, 1/2) and (1/2, 1). Then x = (x⁺ + x⁻)/2. We claim both x⁺ and x⁻ are still feasible. Consider any (i, j) ∈ E with xi ∈ (0, 1/2); then xj > 1/2. If xj = 1, then xi⁺ + xj⁺ = (xi − ε) + 1 ≥ 1. Otherwise, xj ∈ (1/2, 1) and xi⁺ + xj⁺ = (xi − ε) + (xj + ε) = xi + xj ≥ 1. A similar argument shows that x⁻ is feasible. Since x is a proper convex combination of two distinct feasible points, it is not an extreme point.
Algorithm
Let x ∗ denote an optimal fractional solution. The rounding is to set
xi = 0 if xi∗ = 0, and xi = 1 if xi∗ ∈ {1/2, 1}.
To show that we get a 2-approximation, we need to show that
• x is still feasible, and
• the cost grows by at most a factor of 2 in the rounding.
For feasibility, assume some constraint xi + xj ≥ 1 with (i, j) ∈ E is now unsatisfied. Then xi = xj = 0, so by the rounding rule xi∗ = xj∗ = 0, contradicting the feasibility of x∗.
For the cost, Σ xi ≤ Σ 2·xi∗ = 2·OPTf ≤ 2·OPT.
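The rounding step above can be sketched in a few lines of Python (a minimal sketch: the helper names, graph, and fractional solution are illustrative, and the optimal fractional solution is assumed to be given rather than computed by an LP solver):

```python
def round_half_integral(x_star):
    """Round a half-integral fractional cover: keep every vertex with x* >= 1/2."""
    return {v for v, val in x_star.items() if val >= 0.5}

def is_vertex_cover(edges, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)

# Unit-cost triangle: the optimal fractional solution puts 1/2 on every vertex
# (cost 3/2); rounding picks all three vertices (cost 3 <= 2 * OPT_f).
edges = [(1, 2), (2, 3), (1, 3)]
x_star = {1: 0.5, 2: 0.5, 3: 0.5}
cover = round_half_integral(x_star)
```

On this instance the rounded cover costs exactly twice the fractional optimum, so the factor-2 analysis is tight.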
Set Cover
Set Cover of LP-Rounding
• We will design two approximation algorithms for set cover
problem using the method of LP-rounding.
• The first one is a simple rounding algorithm achieving a guarantee of f, the maximum frequency of any element.
• The second one is based on randomized rounding and achieves a
guarantee of O(log n).
The Set Cover LP-Relaxation
minimize Σ_{S∈S} c(S)·xS
subject to Σ_{S : e∈S} xS ≥ 1, e ∈ U
xS ≥ 0, S ∈ S
A Simple Rounding Algorithm
Algorithm
1. Find an optimal solution to the LP-relaxation.
2. Pick all sets S for which xS ≥ 1/f in this solution.
Analysis
Theorem
This algorithm achieves an approximation factor of f for the set cover
problem.
Proof
Let C be the collection of picked sets. We first show that C is indeed a
set cover.
• Consider an element e. Since e is in at most f sets and the fractions of those sets sum to at least 1, one of them must receive value at least 1/f in the fractional cover, by the pigeonhole principle, and hence is picked.
• Thus, e is covered by C, and hence C is a valid set cover.
Analysis
• The rounding process increases xS , for each set S ∈ C, by a factor
of at most f .
• Therefore, the cost of C is at most f times the cost of the
fractional cover, thereby proving the desired approximation
guarantee.
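The f-rounding rule can be sketched as follows (a minimal sketch; the instance and the fractional solution below are made up for illustration):

```python
def f_round(sets, x_lp):
    """Pick every set whose LP value is at least 1/f, where f is the max frequency."""
    universe = set().union(*sets.values())
    f = max(sum(1 for S in sets if e in sets[S]) for e in universe)
    return [S for S in sets if x_lp[S] >= 1.0 / f], f

# Every element below occurs in exactly two sets, so f = 2, and the fractional
# cover assigning 1/2 to each set is feasible. The rule then picks all three sets.
sets = {"A": {1, 2}, "B": {2, 3}, "C": {1, 3}}
x_lp = {"A": 0.5, "B": 0.5, "C": 0.5}
picked, f = f_round(sets, x_lp)
```

Here the picked collection costs twice the fractional cover, matching the factor-f guarantee with f = 2.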
Randomized Rounding
Why Randomization
• The performance guarantee of a randomized approximation algorithm holds with high probability.
• In most cases a randomized approximation algorithm can be derandomized, and
• randomization gains simplicity in the algorithm design and analysis.
Randomized Rounding for Set Cover
• Intuitively, a set with larger value is more likely to be chosen in
the optimal solution.
• This motivates us to view the fractions as probabilities and round accordingly.
Analysis (I)
• Let x = p be an optimal solution to the linear program.
• For each set S ∈ S, our algorithm picks S with probability pS.
• The expected cost of C is
E[cost(C)] = Σ_{S∈S} Pr[S is picked]·cS = Σ_{S∈S} pS·cS = OPTf.
• Next, let us compute the probability that an element a ∈ U is
covered by C.
Analysis (II)
• Suppose that a occurs in k sets of S.
• Let the probabilities associated with these sets be p1 , . . . , pk .
Then p1 + p2 + · · · + pk ≥ 1.
• Then we have
Pr[a is covered by C] ≥ 1 − (1 − 1/k)^k ≥ 1 − 1/e.
• To get a complete set cover, independently pick c log n such subcollections, and let C′ be their union, where c is a constant such that (1/e)^{c log n} ≤ 1/(4n).
Analysis (III)
• Now we have
Pr[a is not covered by C′] ≤ (1/e)^{c log n} ≤ 1/(4n).
• Summing over all elements a ∈ U, we get
Pr[C′ is not a valid set cover] ≤ n · 1/(4n) = 1/4.
Analysis (IV)
• Clearly E[cost(C′)] ≤ OPTf · c log n.
• Applying Markov’s Inequality, we get
Pr[cost(C′) ≥ OPTf · 4c log n] ≤ 1/4.
• Thus
Pr[C′ is a valid set cover and has cost ≤ OPTf · 4c log n] ≥ 1/2.
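The whole randomized procedure can be sketched as below (a minimal sketch: the function names and the choice c = 2 are illustrative, and the toy fractional solution puts probability 1 on each set, so the union is deterministically a cover):

```python
import math
import random

def randomized_round(sets, p, universe, c=2, rng=random):
    """Union of c*log(n) independent rounds; each round picks S with probability p[S]."""
    rounds = max(1, math.ceil(c * math.log(len(universe))))
    picked = set()
    for _ in range(rounds):
        for S in sets:
            if rng.random() < p[S]:   # pick S with probability p[S]
                picked.add(S)
    return picked

# With p_S = 1 every set is picked in every round, so the union surely covers U.
sets = {"A": {1, 2}, "B": {3}}
picked = randomized_round(sets, {"A": 1.0, "B": 1.0}, universe={1, 2, 3})
```

With genuinely fractional p the output is a valid cover only with the probability bounded in the analysis, so in practice one would repeat or verify the result.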
MAX-SAT
Given n boolean variables x1, . . . , xn, a CNF formula
ϕ(x1, . . . , xn) = ⋀_{j=1}^m Cj
and a nonnegative weight wj for each clause Cj, find an assignment to the xi that maximizes the total weight of the satisfied clauses.
Flipping a Coin
A very straightforward randomized approximation algorithm is to set
each xi to true independently with probability 1/2.
Setting each xi to true with probability 1/2 independently gives a randomized 1/2-approximation algorithm for weighted MAX-SAT.
Proof
Let W be a random variable equal to the total weight of the satisfied clauses. Define an indicator random variable Yj for each clause Cj such that Yj = 1 if and only if Cj is satisfied. Then
W = Σ_{j=1}^m wj Yj.
Let OPT denote the value of an optimum solution; then
E[W] = Σ_{j=1}^m wj E[Yj] = Σ_{j=1}^m wj · Pr[clause Cj satisfied].
Proof (cont’d)
Since each variable is set to true independently, we have
Pr[clause Cj satisfied] = 1 − (1/2)^{lj} ≥ 1/2,
where lj is the number of literals in clause Cj. Hence,
E[W] ≥ (1/2) Σ_{j=1}^m wj ≥ (1/2)·OPT.
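The bound E[W] = Σj wj(1 − 2^{−lj}) ≥ (1/2)·Σj wj can be checked numerically; the clause weights and lengths below are made up for illustration:

```python
# (weight w_j, length l_j) pairs for a hypothetical instance.
clauses = [(3.0, 1), (5.0, 2), (2.0, 3)]

expected_w = sum(w * (1 - 2.0 ** -l) for w, l in clauses)   # E[W]
half_total = 0.5 * sum(w for w, _ in clauses)               # (1/2) * total weight
```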
A Finer Analysis
Observe that if lj ≥ k for each clause j, then the analysis above shows that the algorithm is a (1 − (1/2)^k)-approximation algorithm for such instances. For instance, the performance guarantee for MAX E3SAT is 7/8.
From the analysis, we can see that the performance of the algorithm is
better on instances consisting of long clauses.
Theorem
If there is a (7/8 + ε)-approximation algorithm for MAX E3SAT for any constant ε > 0, then P = NP.
Derandomization
The previous randomized algorithm can be derandomized. Note that
E[W] = E[W | x1 ← true] · Pr[x1 ← true] + E[W | x1 ← false] · Pr[x1 ← false]
= (1/2)(E[W | x1 ← true] + E[W | x1 ← false])
We set b1 to true if E[W | x1 ← true] ≥ E[W | x1 ← false] and to false otherwise, and let the value of x1 be b1.
Continue this process until all bi are found, i.e., all n variables have been set.
An Example
x3 ∨ x5 ∨ x7
• Pr[clause satisfied | x1 ← true, x2 ← false, x3 ← true] = 1
• Pr[clause satisfied | x1 ← true, x2 ← false, x3 ← false] = 1 − (1/2)² = 3/4
Derandomization
This is a deterministic 1/2-approximation algorithm because of the following two facts:
1. E[W | x1 ← b1, . . . , xi ← bi] can be computed in polynomial time for fixed b1, . . . , bi.
2. E[W | x1 ← b1, . . . , xi ← bi, xi+1 ← bi+1] ≥ E[W | x1 ← b1, . . . , xi ← bi] for all i, and hence by induction, E[W | x1 ← b1, . . . , xn ← bn] ≥ E[W].
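The method of conditional expectations can be implemented directly (a minimal sketch: the clause encoding with signed literals ±i for xi and x̄i, and the helper names, are my own, not from the slides):

```python
def cond_expectation(clauses, weights, fixed):
    """E[W | vars in `fixed` set as given]; the remaining variables are fair coins."""
    total = 0.0
    for clause, w in zip(clauses, weights):
        unset, satisfied = 0, False
        for lit in clause:
            var = abs(lit)
            if var in fixed:
                if fixed[var] == (lit > 0):
                    satisfied = True
            else:
                unset += 1
        # An unsatisfied clause with u unset literals is satisfied w.p. 1 - 2^-u.
        total += w if satisfied else w * (1 - 2.0 ** -unset)
    return total

def derandomize(clauses, weights, n):
    """Fix x_1, ..., x_n one by one, never letting the conditional expectation drop."""
    fixed = {}
    for i in range(1, n + 1):
        if cond_expectation(clauses, weights, {**fixed, i: True}) >= \
           cond_expectation(clauses, weights, {**fixed, i: False}):
            fixed[i] = True
        else:
            fixed[i] = False
    return fixed

# Clauses: (x1 or x2), (not x1), (x2 or x3); all weights 1.
clauses = [[1, 2], [-1], [2, 3]]
weights = [1.0, 1.0, 1.0]
assignment = derandomize(clauses, weights, n=3)
```

On this toy instance the greedy choices set x1 false and x2, x3 true, satisfying all three clauses.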
Flipping Biased Coins
• Previously, we set each xi to true or false with probability 1/2 independently.
• The value 1/2 is nothing special here.
• In the following, we set each xi to true with probability p ≥ 1/2.
• We first consider the case in which no clause is of the form Cj = x̄i.
Lemma
If each xi is set to true with probability p ≥ 1/2 independently, then
the probability that any given clause is satisfied is at least
min(p, 1 − p2 ) for instances with no negated unit clauses.
Proof
• If the clause is a unit clause, then the probability the clause is
satisfied is p.
• If the clause has length at least two, then the probability that the
clause is satisfied is 1 − pa (1 − p)b , where a is the number of
negated variables and b is the number of unnegated variables.
Since p ≥ 1/2 ≥ 1 − p, each of the a + b ≥ 2 factors is at most p, so this probability is at least 1 − p².
Flipping Biased Coins
Armed with the previous lemma, we then maximize min(p, 1 − p²), which is achieved when p = 1 − p², namely p = (√5 − 1)/2 ≈ 0.618.
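A quick numerical check of this choice of p (the tolerance is arbitrary):

```python
# p = (sqrt(5) - 1) / 2 equalizes the two terms of min(p, 1 - p^2).
p = (5 ** 0.5 - 1) / 2
gap = abs(p - (1 - p * p))
```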
We need more effort to deal with negated unit clauses, i.e., Cj = x̄i for
some j.
We distinguish between two cases:
1. Assume Cj = x̄i and there is no clause such that C = xi . In this
case, we can introduce a new variable y and replace the
appearance of x̄i in ϕ by y and the appearance of xi by ȳ.
Flipping Biased Coins
2. Cj = x̄i and some clause Ck = xi. W.l.o.g. we assume w(Cj) ≤ w(Ck). Note that for any assignment, Cj and Ck cannot both be satisfied. Let vi be the weight of the unit clause x̄i if it exists in the instance, and let vi be zero otherwise. We have
OPT ≤ Σ_{j=1}^m wj − Σ_{i=1}^n vi.
We set each xi to true with probability p = (√5 − 1)/2; then
E[W] = Σ_{j=1}^m wj E[Yj] ≥ p · (Σ_{j=1}^m wj − Σ_{i=1}^n vi) ≥ p · OPT.
Rounding by Linear Programming
The Use of Linear Program
Integer Program Characterization and Linear Program Relaxation:
max Σ_{j=1}^m wj zj
subject to Σ_{i∈Pj} yi + Σ_{i∈Nj} (1 − yi) ≥ zj, ∀Cj = ⋁_{i∈Pj} xi ∨ ⋁_{i∈Nj} x̄i,
yi ∈ {0, 1} (relaxed to 0 ≤ yi ≤ 1), i = 1, . . . , n,
zj ∈ {0, 1} (relaxed to 0 ≤ zj ≤ 1), j = 1, . . . , m,
where yi indicates the assignment of variable xi and zj indicates whether clause Cj is satisfied.
Flipping Different Coins
• Let (y∗ , z∗ ) be an optimal solution of the linear program.
• We set xi to true with probability yi∗ .
• This can be viewed as flipping different coins for every variable.
Randomized rounding gives a randomized (1 − 1/e)-approximation algorithm for MAX-SAT.
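The rounding step itself is one line (a minimal sketch; the helper name and the toy fractional values are illustrative):

```python
import random

def lp_round(y_star, rng=random):
    """Set x_i to true with probability y*_i, independently for each variable."""
    return {i: rng.random() < y_star[i] for i in y_star}

# Fractional values 0 and 1 are rounded deterministically (random() is in [0, 1)).
assignment = lp_round({1: 1.0, 2: 0.0})
```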
Analysis
Pr[clause Cj not satisfied]
= ∏_{i∈Pj} (1 − yi∗) · ∏_{i∈Nj} yi∗
≤ [ (1/lj) ( Σ_{i∈Pj} (1 − yi∗) + Σ_{i∈Nj} yi∗ ) ]^{lj}    (Arithmetic-Geometric Mean Inequality)
= [ 1 − (1/lj) ( Σ_{i∈Pj} yi∗ + Σ_{i∈Nj} (1 − yi∗) ) ]^{lj}
≤ (1 − zj∗/lj)^{lj}.
Analysis
Pr[clause Cj satisfied] ≥ 1 − (1 − zj∗/lj)^{lj} ≥ [1 − (1 − 1/lj)^{lj}] · zj∗,
where the last step follows from the concavity of z ↦ 1 − (1 − z/lj)^{lj} on [0, 1] (Jensen’s Inequality). Therefore, we have
E[W] = Σ_{j=1}^m wj Pr[clause Cj satisfied] ≥ Σ_{j=1}^m wj zj∗ [1 − (1 − 1/lj)^{lj}] ≥ (1 − 1/e) · OPT.
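Both inequalities used in this derivation can be checked numerically over a grid (a sanity check over arbitrary ranges and tolerances, not a proof):

```python
import math

# Check: 1 - (1 - z/l)^l >= [1 - (1 - 1/l)^l] * z >= (1 - 1/e) * z, for z in [0, 1].
ok = True
for l in range(1, 30):
    factor = 1 - (1 - 1 / l) ** l            # per-clause guarantee for length l
    ok &= factor >= 1 - 1 / math.e - 1e-12   # second inequality
    for k in range(101):
        z = k / 100
        ok &= 1 - (1 - z / l) ** l >= factor * z - 1e-12  # first inequality
```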
The Combined Algorithm
Choosing the Better of Two
• The randomized rounding algorithm performs better when the lj are small ((1 − 1/k)^k is nondecreasing in k).
• The unbiased randomized algorithm performs better when the lj are large.
• We will combine the two.
Choosing the better of the two solutions given by the two algorithms yields a randomized 3/4-approximation algorithm for MAX-SAT.
Analysis
Let W1 and W2 be the random variables giving the value of the solution of the randomized rounding algorithm and the unbiased randomized algorithm, respectively. Then
E[max(W1, W2)] ≥ E[(1/2)W1 + (1/2)W2]
≥ (1/2) Σ_{j=1}^m wj zj∗ [1 − (1 − 1/lj)^{lj}] + (1/2) Σ_{j=1}^m wj (1 − 2^{−lj})
≥ Σ_{j=1}^m wj zj∗ [ (1/2)(1 − (1 − 1/lj)^{lj}) + (1/2)(1 − 2^{−lj}) ]
≥ (3/4) Σ_{j=1}^m wj zj∗ ≥ (3/4) · OPT,
where the second-to-last step uses zj∗ ≤ 1, and the last uses the fact that the bracketed average is at least 3/4 for every lj ≥ 1.
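The key fact that the average of the two per-clause guarantees is at least 3/4 for every clause length can be verified numerically (the range checked is arbitrary):

```python
# avg(l) = (1/2)[1 - (1 - 1/l)^l] + (1/2)(1 - 2^-l); its minimum over l >= 1
# is exactly 3/4, attained at l = 1 and l = 2.
min_avg = min(
    0.5 * (1 - (1 - 1 / l) ** l) + 0.5 * (1 - 2.0 ** -l)
    for l in range(1, 1000)
)
```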
Referred Materials
• The content of this lecture comes from Chapters 14 and 16 in [Vaz04], and Sections 5.1–5.5 in [WS11].