Approximation - Decision vs. Optimization
The lecture is taken from:
Introduction to Algorithms, Second Edition
By: T.H. Cormen, C.E. Leiserson, R.L. Rivest and C. Stein
Chapter 35 and material from some previous chapters.
Decision vs. Optimization Problems. It is customary to state many problems as optimization problems: find the "best" solution according to some set of criteria (e.g., find the shortest, or longest, path between a and b). It turns out that NP-completeness can be properly expressed only in terms of "yes/no" (or decision) problems, just as for language acceptance. We can show that this is not a significant restriction by changing the optimization problem into one in which a yes/no question is asked (e.g., does there exist a solution to the path problem between a and b with a path of length at most, or at least, k?). Solving the optimization problem would then immediately provide a solution for the decision problem.
Thus, an easy optimization problem leads to an easy decision
problem.
Example: The Vertex-Cover Problem.
Def.: a vertex cover of an undirected graph G = (V, E) is a
subset V’ ⊆ V such that, if {u, v} ∈ E, then u ∈ V’ or v ∈ V’, or
both. (The vertices of V’ cover all the edges.)
Def.: the size of a vertex cover is the number of
vertices in it.
Def.: vertex-cover problem: find a cover of minimum
size in a given graph. As a decision problem: can you
find a vertex cover of size k? As a language:
VERTEX-COVER = {<G, k>: a graph G has a vertex cover of size k}.
Contrapositively, showing that the decision problem is provably hard (well… such that there is a polynomial reduction from an NP-complete problem to it) can be used as clear evidence that the corresponding optimization problem is also hard.
Note: finding a vertex cover is easy… finding one with the
desired properties is not…
Approximation - Decision vs. Optimization
Examples: Hamiltonian Cycle and Traveling Salesman Problem.
HC: Given an undirected graph G = (V, E), does it have a simple
cycle that contains each vertex in V?
TSP: Given a complete undirected graph G = (V, E), with a cost
function on E, find a Hamiltonian cycle of minimum cost.
Note: the formulation of TSP as an optimization problem does not quite fit our presentation of NP and NP-complete problems, which are stated as decision problems (answer: yes or no). We can restate TSP as a decision problem:
Decision-TSP: Given a complete undirected graph G = (V, E), with a cost function on E, and a number k, is there a Hamiltonian cycle of cost ≤ k? This problem is in NP: it has a certificate (a permutation of the vertices V) which can be checked, in polynomial time, both for being a tour and for satisfying the cost inequality.
Note: HC is NP-complete (finding even one such cycle is hard), while in TSP finding a Hamiltonian cycle is the easy part, since the graph is complete; the problem comes from having to test all possible Hamiltonian cycles for cost. Note further that TSP does not have an obvious "certificate": there is no way to check that a given Hamiltonian cycle is one of minimum cost without testing it against all the others (modulo special circumstances).
Approximation
We have seen that, regardless of the herculean efforts made up to now, a VERY large number of problems belong to the NP-complete class, or to the NP-hard class: we have no efficient (= polynomial-time) algorithms for producing an (optimal) solution and, furthermore, no realistic hope of finding any such algorithms.
What do we do?
Try to find a "solution" that is near enough to being optimal (as measured by the cost function to be minimized or maximized) to be usable, at a computational cost that is no worse than "low polynomial" (maybe even just on average).
How? Be clever…
Def.: we say that an algorithm for a problem has an
approximation ratio of ρ(n) if, for any input of size n, the cost
C of the solution produced by the algorithm is within a factor of
ρ(n) of the cost C* of an optimal solution: max(C/C*, C*/C) ≤ ρ(n).
For maximization problems we will have 0 ≤ C ≤ C*, while
minimization ones will have 0 ≤ C* ≤ C.
What you would ideally look for is a class of algorithms that lets you trade the quality of the approximation for computation time: some problems admit such algorithms, while we don't have anything comparable for some other problems…
Def.: an algorithm that achieves an approximation ratio of ρ(n) (i.e., max(C/C*, C*/C) = ρ(n)) is called a ρ(n)-approximation algorithm.
Note: a 1-approximation algorithm produces an optimal solution.
Def.: an approximation scheme for an optimization problem is an approximation algorithm that takes two inputs, an instance of the problem and a positive number ε, such that the scheme is a (1 + ε)-approximation algorithm.
Def.: an approximation scheme is a polynomial-time approximation scheme if, for any fixed ε > 0, the scheme runs in time polynomial in the size n of the input instance.
Def.: an approximation scheme is a fully polynomial-time approximation scheme if it is an approximation scheme and its running time is polynomial in both 1/ε and the size n of the input instance (e.g., O((1/ε)^j n^k) with j, k positive integers).
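As a concrete illustration (the running times here are standard examples, not taken from these slides): a scheme running in time O(n^(2/ε)) is a polynomial-time approximation scheme, since for each fixed ε the exponent 2/ε is a constant; it is not fully polynomial, since the running time explodes as ε shrinks. A scheme running in time O((1/ε)^2 n^3) is a fully polynomial-time approximation scheme.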
Approximation - Some Examples
The Vertex-Cover Problem (again).
Def.: a vertex cover of an undirected graph G = (V, E) is a
subset V’ ⊆ V such that if (u, v) ∈ E, then either u ∈ V’ or v ∈
V’ or both. The size of a vertex cover is the number of vertices
in it.
Def.: the minimum-vertex-cover problem is to find a vertex cover of minimum size, an optimal vertex cover.
The Algorithm. We will show that the fairly simple-minded algorithm below produces, in polynomial time, a vertex cover which is a 2-approximation of (so no worse than twice as large as) an optimal cover.
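The procedure, as given in CLRS (Section 35.1); line 4 below is the arbitrary edge pick referenced in the analysis that follows:

Approx-Vertex-Cover(G)
1. C ← ∅
2. E′ ← E[G]
3. while E′ ≠ ∅
4.   let (u, v) be an arbitrary edge of E′
5.   C ← C ∪ {u, v}
6.   remove from E′ every edge incident on either u or v
7. return C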
This shows the execution of the algorithm on an example graph (CLRS Figure 35.1: vertices a through g, with edges {a,b}, {b,c}, {c,d}, {c,e}, {d,e}, {d,f}, {d,g}, {e,f}). Start, arbitrarily, with edge (b, c): add b and c, and remove all edges incident to b or c. Next, pick (e, f) and remove the edges incident to e or f. Now (d, g) is the last remaining edge; add d and g to the cover:
V’ = {b, c, d, e, f, g}.
An optimal cover is {b, d, e}.
Theorem 35.1. APPROX-VERTEX-COVER is a polynomial-time 2-approximation algorithm.
Proof.
a) The algorithm is polynomial-time. This is not hard: the loop repeatedly picks an edge (u, v) from E′, adds its endpoints to C, and deletes all the edges covered by either u or v. Time: O(|V| + |E|) if we use an adjacency-list representation.
b) The approximation ratio. Let A denote the set of edges picked in line 4 of the algorithm. In order to cover all the edges in E, an optimal cover C* must include at least one endpoint of each edge in A. By construction of A, no two edges in A share an endpoint.
Thus no two edges in A are covered by the same vertex in C*, and we have the lower bound |C*| ≥ |A|. Each execution of line 4 picks an edge neither of whose endpoints is yet in C, so |C| = 2|A|. Combining the two expressions:
|C| = 2|A| ≤ 2|C*|.
QED.
The Traveling Salesman Problem. We are given a complete undirected graph G = (V, E) and a non-negative integer-valued cost function c : V × V → Z+. We must find a minimum-cost tour. Let c(A) denote the total cost of the edges in a subset A ⊆ E.
Def.: a function c satisfies the triangle inequality if, for each
triple of vertices u, v, w ∈ V,
c(u, w) ≤ c(u, v) + c(v, w).
Meaning: a direct path is always no longer than an indirect
one…
We will now assume that the cost function satisfies the triangle
inequality.
Before going on, we should prove that TSP remains in NPC
even with the restriction that the cost function satisfies the
triangle inequality (CLRS - Ex. 35.2-2).
TSP with triangle inequality.
Approx-TSP-Tour(G, c)
1. select r ∈ V[G] as a "root"
2. compute an MST T for G from root r, via MST-Prim(G, c, r)
3. let L be the list of vertices visited in a preorder walk of T
4. return the hamiltonian cycle H that visits the vertices in order L
The TSP using Approx-TSP-Tour(G, c): let a be the root and let c be the ordinary euclidean distance. In the textbook's figure (CLRS Fig. 35.2), (d) gives the approximate tour and (e) gives an optimal one.
Theorem 35.2. Approx-TSP-Tour is a polynomial-time
2-approximation algorithm for the TSP with triangle
inequality.
Proof.
a) The algorithm is polynomial-time. The dominant operation is the computation of the MST; MST-Prim can be shown to be Θ(|V|²).
b) The approximation ratio. Let H* be an optimal tour and T an MST. Since removing any edge from a tour gives a spanning tree, we must have c(T) ≤ c(H*). Listing the vertices along the walk of T will actually give a list with repetitions: a vertex is added to the list when first encountered and also whenever re-encountered after
finishing a visit to a non-empty sub-tree. Note that each edge of T is traversed exactly twice, so if we denote the full walk by W, we have c(W) = 2c(T). Combining, we have
c(W) = 2c(T) ≤ 2c(H*),
and this gives that the cost of W is within the correct
factor of the cost of the optimal tour. The only problem
is that W is not a tour, since it may visit some vertices
more than once. The triangle inequality assumption
allows us to remove vertices from W without increasing
the cost of the path: say that vertex w1 is followed by a
vertex w2 that has been already visited. Say that the
next vertex in W is w3. We can remove the edges (w1, w2)
and (w2, w3), replacing them by (w1, w3), without
increasing the length of the path. If w3 has already been visited, continue along the list W until you find an unvisited vertex, or until you reach the start vertex as the last vertex in W; then add the appropriate edge. After we are done, we have a hamiltonian cycle H. By construction, and by the triangle-inequality condition, c(H) ≤ c(W) = 2c(T) ≤ 2c(H*).
QED.
What happens if we drop the triangle inequality
assumption? Do we need this extra constraint on the
problem, reasonable as it may seem?
Theorem 35.3. If P ≠ NP, then for any constant ρ ≥ 1, there is no polynomial-time approximation algorithm with approximation ratio ρ for the general traveling-salesman problem.
Proof. By contradiction: assume that for some ρ ≥ 1 there is a polynomial-time approximation algorithm A with approximation ratio ρ; w.l.o.g., assume ρ is an integer. We show how to use A to solve the hamiltonian-cycle problem in polynomial time, which would then imply that P = NP.
Let G = (V, E) be an instance of the hamiltonian-cycle problem.
We turn G into an instance of TSP as follows: let G’ = (V, E’) be
the complete graph on V:
E’ = {(u, v) : u, v ∈ V and u ≠ v}.
Assign an integer cost to each edge in E’:
c(u, v) = 1 if (u, v) ∈ E, and c(u, v) = ρ|V| + 1 otherwise.
We can clearly go from G to G’ in polynomial time (in |V| and |E|). Now consider the instance <G’, c> of TSP.
Assume G has a hamiltonian cycle H: since c assigns a cost of 1 to each edge of E, H is a tour with c(H) = |V|. Assume instead that G does not contain a hamiltonian cycle: then any tour T in G’ must contain at least one edge not in E, so
c(T) ≥ (ρ|V| + 1) + (|V| - 1) = ρ|V| + |V| > ρ|V|.
Because of the gap in cost between tours that use only edges of G and tours that use at least one edge not in G, the two cases are separated by more than a factor of ρ. Now apply the approximation algorithm A to the instance <G’, c> of TSP. Since A is guaranteed to return a tour of cost no more than ρ times the optimal: if G contains a hamiltonian cycle, then A must return a tour of cost at most ρ|V|, which can only use edges of E and is therefore a hamiltonian cycle of G; if G does not contain one, then A returns a tour (in G’) of cost more than ρ|V|. In either case we can decide the hamiltonian-cycle problem in polynomial time, which, because HAM-CYCLE ∈ NPC, would imply P = NP. QED.
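The reduction, restated as pseudocode (a sketch; the procedure name and packaging are ours, the logic is exactly the proof's):

HC-Via-Approx(G = (V, E), A, ρ)
1. build the complete graph G’ = (V, E’)
2. for each (u, v) ∈ E’: c(u, v) ← 1 if (u, v) ∈ E, else ρ|V| + 1
3. T ← A(G’, c)   // a tour of cost at most ρ times optimal
4. if c(T) ≤ ρ|V|, answer "G has a hamiltonian cycle"; else answer "G has none"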
The Set-Covering Problem. This is a generalization of the vertex-cover problem: let X be a finite set and F a family of subsets of X such that every element of X belongs to at least one subset of F: X = ∪S ∈ F S. We say that S ∈ F covers its elements.
The problem: find a minimum-size subset C ⊆ F whose members cover all of X: X = ∪S ∈ C S. We say that any C satisfying this equation covers X.
The decision version asks about the existence of a cover of size at most k. One can show (Ex. 35.3-2) that the decision version of the set-covering problem is in NPC. We propose the following as a polynomial-time approximation algorithm:
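The greedy procedure, as given in CLRS (Section 35.3); lines 3-6 form the loop analyzed below:

Greedy-Set-Cover(X, F)
1. U ← X
2. C ← ∅
3. while U ≠ ∅
4.   select an S ∈ F that maximizes |S ∩ U|
5.   U ← U - S
6.   C ← C ∪ {S}
7. return C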
Theorem 35.4. Greedy-Set-Cover is a polynomial-time ρ(n)-approximation algorithm, where
ρ(n) = H(max{|S| : S ∈ F}),
with H(d) = Σi=1..d 1/i, the dth Harmonic Number (H(0) ≡ 0).
Proof.
a) The algorithm is polynomial-time in |X| and |F|. The number of iterations of the while loop in lines 3-6 is bounded above by min(|X|, |F|), and the loop body can be implemented to run in time O(|X||F|). Thus the whole algorithm can be implemented in time O(|X||F|·min(|X|, |F|)). A linear-time implementation is possible (Ex. 35.3-3).
12/8/08
The decision version asks about the existence of a cover of size
at most k. One can show (35.3.-2) that the decision version of
the set-covering problem is in NPC. We propose the following
as a polynomial-time approximation algorithm:
27
b) The approximation ratio. We will assign a cost of 1 to each
set selected by the algorithm, distributing this cost uniformly
over all the elements of this set that are being covered for the
first time. If we let Si denote the ith subset added by the
algorithm, we have a cost of 1 added when Si is selected; if cx
denotes the cost of an element x ∈ X (in particular x ∈ Si)
covered for the first time by Si, we have the formula
cx = 1 / |Si - (S1 ∪ S2 ∪ … ∪ Si-1)|.
Since we assign one unit of cost at each step of the algorithm,
|C| = Σx ∈ X cx.
Furthermore, the cost assigned to an optimal cover is given by ΣS ∈ C* Σx ∈ S cx. Since each x must be in at least one set of any optimal cover,
ΣS ∈ C* Σx ∈ S cx ≥ Σx ∈ X cx,
where we are counting each x only once in the second sum.
Combining the inequalities, we have:
|C| = Σx ∈ X cx ≤ ΣS ∈ C* Σx ∈ S cx.
Claim: For any set S ∈ F, Σx ∈ S cx ≤ H(|S|).
Pf. See textbook (CLRS, pp. 1036-37).
Combining these results:
|C| = Σx ∈ X cx ≤ ΣS ∈ C* Σx ∈ S cx ≤ ΣS ∈ C* H(|S|) ≤ |C*| · H(max{|S| : S ∈ F}),
giving the desired conclusion. QED.
Randomized Algorithms.
Def.: we say that a randomized algorithm for a problem has an
approximation ratio of ρ(n) if, for any input of size n, the
expected cost C of a solution produced by the randomized
algorithm is within a factor of ρ(n) of the cost C* of an optimal
solution: max(C/C*, C*/C) ≤ ρ(n).
The basic difference between deterministic and randomized
algorithms is that the randomized algorithms deal with
expected costs, and expected bounds, rather than
deterministic ones: any particular solution could be much worse
or much better than the expected value.
MAX-3-CNF-SAT. Given any instance of 3-CNF-SAT, this
instance may or may not be satisfiable. For it to be satisfiable,
every clause must evaluate to 1 under some variable
assignment. We would like to know how close to satisfiable an
instance is: what is the maximum number of clauses that can
be satisfied?
A randomized algorithm would set each variable to true with
probability p, and false with probability q = 1 - p.
Assumption: no clause contains both a variable and its negation. (This can be removed w.l.o.g.; Ex. 35.4-1.)
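Written out as pseudocode (the procedure name is ours; for Theorem 35.6 below, take p = 1/2):

Random-Assignment(x1, …, xn, p)
1. for i ← 1 to n
2.   set xi ← 1 with probability p, and xi ← 0 with probability 1 - p
3. return the assignment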
Theorem 35.6. Given an instance of MAX-3-CNF-SAT with n
variables x1, x2, …, xn and m clauses, the randomized algorithm
that independently sets each variable to 1 with probability 1/2
and to 0 with probability 1/2 is a randomized 8/7-approximation
algorithm.
Proof. For i = 1, …, m, let Yi denote the indicator random variable that takes the value 1 if the ith clause is satisfied and 0 if not: Yi = I(clause i is satisfied). By assumption, no literal appears more than once in a clause, and a variable cannot appear together with its negation: this means that the settings of the three literals in a clause are independent.
A clause fails to be satisfied only if all three of its (independent) literals evaluate to 0:
Pr(clause i is not satisfied) = Pr(l1^i = 0)·Pr(l2^i = 0)·Pr(l3^i = 0) = 1/2·1/2·1/2 = 1/8.
Pr(clause i is satisfied) = 1 - Pr(clause i is not satisfied) = 7/8.
By definition of expected value:
E[Yi] = 1·Pr(clause i is satisfied) + 0·Pr(clause i is not satisfied) = 7/8.
If we let Y denote the number of satisfied clauses, we can write Y = Y1 + Y2 + … + Ym. Applying the expectation operator to both sides and using its linearity (which holds even without independence), E[Y] = Σi=1..m E[Yi] = 7m/8. Since m is an upper bound on the number of satisfiable clauses, the approximation ratio is m/(7m/8) = 8/7. QED.
Linear programming: the Algorithm. Skip.
The Subset-Sum Problem.
The language (= decision problem) SUBSET-SUM is defined as
SUBSET-SUM = {< S, t >: S is a finite subset of N, t ∈ N, and there
exists S’ ⊆ S such that t = Σs∈S’ s }.
The optimization problem asks us to find the largest such sum ≤ t. We will first give an exponential-time exact algorithm, and then modify it so that it becomes a fully polynomial approximation scheme (= polynomial in the size n = |S| of the input and in 1/ε, where ε > 0 is a parameter).
An exponential-time exact algorithm. Assume S = {x1, x2, …, xn}. If L = {l1, l2, …, lk} is a list of integers, define L + xi ≡ {l1 + xi, l2 + xi, …, lk + xi}. The procedure Merge-Lists(L, L’) returns the sorted merge of the two sorted lists with duplicates removed; it can be implemented, obviously, in O(|L| + |L’|) time.
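The exact procedure, as given in CLRS (Section 35.5):

Exact-Subset-Sum(S, t)
1. n ← |S|
2. L0 ← ⟨0⟩
3. for i ← 1 to n
4.   Li ← Merge-Lists(Li-1, Li-1 + xi)
5.   remove from Li every element greater than t
6. return the largest element of Ln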
Since the lists can double in length on each pass through the loop, the total time could be exponential. It could be better (e.g., polynomial) if t is polynomial in |S|, or if all the numbers in S are bounded by a polynomial in |S|. (Why? The first should be obvious, since the number of values under consideration can never be more than polynomial in |S|; the second?)
To show the algorithm correct, let Pi denote the set of all values that can be obtained by selecting a (possibly empty) subset of {x1, x2, …, xi} and summing its members. One shows by induction that Pi = Pi-1 ∪ (Pi-1 + xi), and then, again by induction on i, that Li is a sorted list containing every element of Pi that is ≤ t.
We will now modify this exact algorithm so that it loses exactness but gains polynomiality, a tradeoff. The only way to guarantee polynomiality is to ensure that the sizes of the lists Li do not grow exponentially: we need to "trim" the lists after each merge operation.
How? We pick a "trimming parameter" δ ∈ (0, 1). Trimming a list L by δ means removing as many elements as possible, in such a way that for every y removed from L the resulting list L’ contains an element z with y/(1 + δ) ≤ z ≤ y: z "represents" y in the new list.
We introduce the algorithm (L = {y1, …, ym}):
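Following CLRS (Section 35.5); L is assumed sorted in increasing order:

Trim(L, δ)
1. m ← |L|
2. L’ ← ⟨y1⟩
3. last ← y1
4. for i ← 2 to m
5.   if yi > last · (1 + δ)   // yi ≥ last, since L is sorted
6.     append yi onto the end of L’
7.     last ← yi
8. return L’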
Ex.: δ = 0.1, L = ⟨10, 11, 12, 15, 20, 21, 22, 23, 24, 29⟩. Trim L:
last = y1 = 10; last · (1 + δ) < y2 = 11? No. Skip.
last = y1 = 10; last · (1 + δ) < y3 = 12? Yes. last = 12.
last = y3 = 12; last · (1 + δ) < y4 = 15? Yes. last = 15. Etc.
The result is L’ = ⟨10, 12, 15, 20, 23, 29⟩.
It is clear that the running time of Trim is Θ(m); it should also be clear that the "gaps" between successive remaining members grow exponentially: you can think of this algorithm as laying down a "near-geometric series" with ratio (1 + δ) that acts as a sieve, keeping only those elements that "jump" past the next element of the series and dropping all the intermediate ones.
The next step involves introducing the "approximation parameter" ε ∈ (0, 1). The procedure must return a number z whose value is within a (1 + ε) factor of the optimal solution.
The algorithm. What we must show is that the Trim procedure keeps the lists Li from growing faster than polynomially in n, and that the remaining elements still allow us to approximate the optimal solution within the claimed multiplicative factor.
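The full scheme, as given in CLRS (Section 35.5); note the trimming parameter δ = ε/2n:

Approx-Subset-Sum(S, t, ε)
1. n ← |S|
2. L0 ← ⟨0⟩
3. for i ← 1 to n
4.   Li ← Merge-Lists(Li-1, Li-1 + xi)
5.   Li ← Trim(Li, ε/2n)
6.   remove from Li every element greater than t
7. let z* be the largest value in Ln
8. return z*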
Theorem 35.8. Approx-Subset-Sum is a fully polynomial-time approximation scheme for the Subset-Sum problem.
Proof. Trimming Li, and removing from Li every element greater than t, maintain the invariant that every element of Li is also an element of Pi. Thus z* is the sum of some subset of S. If y* ∈ Pn is an optimal solution of the subset-sum problem, we must have z* ≤ y*. We need to show two things:
1. y*/z* ≤ 1 + ε.
2. The running time of the algorithm is polynomial in both 1/ε and the size of the input.
Proof of 1.
Claim: for every element y ∈ Pi with y ≤ t, there is a z ∈ Li such that y/(1 + ε/(2n))^i ≤ z ≤ y.
Pf. By induction on i.
If y* ∈ Pn is an optimal solution, the inequality above must hold for y*, and thus there is a z ∈ Ln such that
y*/(1 + ε/(2n))^n ≤ z ≤ y*.
The left-hand inequality becomes y*/z ≤ (1 + ε/(2n))^n. Since z* is, by definition, the largest element of Ln, we must have y*/z* ≤ (1 + ε/(2n))^n. We still need to prove (1 + ε/(2n))^n ≤ 1 + ε.
The first observation comes from calculus:
lim(n→∞) (1 + ε/(2n))^n = e^(ε/2).
If we think of n as a continuous variable, consider the function f(n) = ln((1 + ε/(2n))^n) = n·ln(1 + ε/(2n)). Differentiating w.r.t. n:
f’(n) = ln(1 + ε/(2n)) + n·(1/(1 + ε/(2n)))·(-ε/(2n²))
= ln(1 + ε/(2n)) - ε/(2n + ε)   // expand ln in a power series
= -ε/(2n + ε) + [ε/(2n) - (ε/(2n))²/2 + (ε/(2n))³/3 - (ε/(2n))⁴/4 + …]
= -ε/(2n + ε) + ε/(2n) - (ε/(2n))²/2 + g(n)   // g(n) > 0
= ε²(2n - ε)/(8n²(2n + ε)) + g(n) > 0.
Since (1 + ε/(2n))^n = exp(f(n)), this inequality gives that (1 + ε/(2n))^n is monotonically increasing in n. (Ex. 35.5-3)
Since the function increases monotonically to its limit, for every n:
(1 + ε/(2n))^n ≤ e^(ε/2) ≤ 1 + ε/2 + (ε/2)² ≤ 1 + ε.
The second inequality follows from an examination of the power-series expansion
e^x = 1 + x + x²/2! + x³/3! + … = 1 + x + x² + (-x²/2! + x³/3! + …),
where we use 0 < x < 1 to prove that the parenthesized tail is negative, via
-x^n/n! + x^(n+1)/(n+1)! < -n·x^(n+1)/(n+1)! for n > 1.
The third inequality follows from the fact that 0 < ε < 1. This completes the proof of 1.
Proof of 2. We derive a bound on the length of Li. After trimming, successive elements z and z’ of Li must differ by at least a factor of (1 + ε/(2n)): z’/z > 1 + ε/(2n). Each list therefore contains the value 0, possibly the value 1, and up to log(1+ε/(2n)) t additional values. The number of elements in each list Li is thus at most (converting to natural logarithms)
log(1+ε/(2n)) t + 2 = ln(t)/ln(1 + ε/(2n)) + 2
≤ 2n(1 + ε/(2n))·ln(t)/ε + 2   // x/(1 + x) ≤ ln(1 + x) ≤ x for x > -1 (proof: compare derivatives)
≤ 4n·ln(t)/ε + 2.   // 0 < ε < 1
This bound is polynomial in the size of the input,
which is the number of bits lg(t) needed to represent t plus the number of bits needed to represent S (polynomial in n), and in 1/ε. Since the running time of Approx-Subset-Sum is polynomial in the lengths of the Li, Approx-Subset-Sum is a fully polynomial approximation scheme. QED.