Theoretical Computer Science 2013/2014
Homework 2
Problem 1 (Greedy algorithm for partial cover). In the set cover problem we
are given a universe E of elements and a family {S1, S2, ..., Sm} of E's subsets,
where each subset Sj has a weight wj ≥ 0. The goal is to find a collection of
subsets indexed by I ⊆ {1, ..., m} that minimizes Σ_{j∈I} wj such that
∪_{j∈I} Sj = E. Consider the partial cover problem, in which one finds a
collection of subsets indexed by I that minimizes Σ_{j∈I} wj such that
|∪_{j∈I} Sj| ≥ p · |E|, where 0 < p < 1 is some constant.
Consider the greedy algorithm for partial cover.

1. Prove that the greedy algorithm for partial cover constructs a solution of
value no more than (1 + ln(1/(1 − p))) · OPT_SC, where OPT_SC is the value
of the optimal solution to the set cover problem (note: real set cover, not
partial set cover).

2. Prove that the greedy algorithm for the partial cover problem gives an
H_⌈|E|·p⌉-approximation algorithm, where H_n = 1 + 1/2 + 1/3 + ... + 1/n.
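The greedy rule itself is not spelled out above; a minimal sketch, assuming the standard rule (repeatedly take the set minimizing weight per newly covered element, stopping once a p-fraction of E is covered), could look like this. All names are illustrative:

```python
def greedy_partial_cover(universe, sets, weights, p):
    # Standard greedy rule, stopped early: pick the set minimizing
    # weight / (number of newly covered elements) until at least
    # a p-fraction of the universe is covered.
    covered = set()
    chosen = []
    target = p * len(universe)
    while len(covered) < target:
        best = min(
            (i for i in range(len(sets)) if sets[i] - covered),
            key=lambda i: weights[i] / len(sets[i] - covered),
        )
        covered |= sets[best]
        chosen.append(best)
    return chosen, covered
```

Note the only difference from greedy set cover is the stopping condition, which is what the ln(1/(1 − p)) term in part 1 accounts for.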
Problem 2 (Linear program for partial cover). Consider the following linear
program for the set cover problem:

    min  Σ_S wS · xS
    s.t. Σ_{S : e ∈ S} xS ≥ 1    ∀e ∈ E                      (1)
         xS ≥ 0                  ∀S ∈ {S1, ..., Sm}.
An analog of the above LP for the partial cover problem would be

    min  Σ_S wS · xS
    s.t. Σ_{S : e ∈ S} xS ≥ ce   ∀e ∈ E
         Σ_{e ∈ E} ce ≥ p · |E|                              (2)
         1 ≥ ce ≥ 0              ∀e ∈ E
         xS ≥ 0                  ∀S ∈ {S1, ..., Sm},

where the variable ce measures the coverage of element e.
1. Consider an algorithm for partial cover which randomly rounds the solution
of linear program (1) (the one for real set cover). Prove that this algorithm
constructs a solution of value no more than (1 + ln(1/(1 − p))) · OPT_LP^SC,
where OPT_LP^SC is the value of the optimal solution to the set cover linear
program.

2. Consider an algorithm for partial cover which randomly rounds the solution
of linear program (2). Prove that this algorithm constructs a solution of
value no more than O(ln(|E| · p)) · OPT_LP^PC, where OPT_LP^PC is the value of
the optimal solution to the partial cover linear program.

Hint: the inequality 1 − exp(−x) ≥ x(1 − 1/e) for x ∈ [0, 1] might be handy.
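The exercise does not fix the rounding scheme. One standard scheme, sketched below under the assumption that a fractional optimum x has already been computed, runs ⌈ln(1/(1 − p))⌉ independent rounds and includes each set with probability xS in every round; this is where the 1 + ln(1/(1 − p)) factor in part 1 comes from:

```python
import math
import random

def round_lp(x, p, rng=random.random):
    # x: dict mapping each set's name to its fractional LP value x_S.
    # Run T = ceil(ln(1/(1-p))) rounds; in each round include every set
    # independently with probability x_S. An element with sum of x_S >= 1
    # stays uncovered after T rounds with probability at most
    # exp(-T) <= 1 - p, so the expected coverage is at least p|E|,
    # while the expected cost is T times the LP value.
    rounds = math.ceil(math.log(1.0 / (1.0 - p)))
    picked = set()
    for _ in range(rounds):
        for s, frac in x.items():
            if rng() < frac:
                picked.add(s)
    return picked
```

The `rng` hook is an illustrative convenience so the sketch can be exercised deterministically.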
Problem 3 (Approximation algorithm for multiway cut). We are given an
undirected graph G = (V, E), costs ce ≥ 0 for all edges e ∈ E, and k
distinguished vertices s1, ..., sk. The goal is to remove a minimum-cost set
of edges F such that no pair of distinguished vertices si and sj for i ≠ j
is in the same connected component of (V, E \ F).

We know that when k = 2 the problem is just the min-cut problem, and it can
be solved in polynomial time via max-flow. Consider the following algorithm
for the problem with k vertices. For i = 1, ..., k let Fi be the min-cut that
separates vertex si from the vertices s1, s2, ..., s_{i−1}, s_{i+1}, ..., sk.
The solution ∪_{i=1,...,k} Fi is obviously feasible, i.e. it separates all of
s1, ..., sk. Show that it is also a 2-approximation of the optimal solution.
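The isolating-cut step above can be sketched with a plain Edmonds-Karp max-flow, using the standard reduction of attaching a super-sink to all terminals other than si. This is an illustrative sketch, not a tuned implementation; the sentinel vertex name `__super_sink__` is an assumption:

```python
from collections import deque, defaultdict

INF = float("inf")

def min_isolating_cut(edges, source, others):
    # Min cut separating `source` from every vertex in `others`.
    # Undirected edges become two opposite arcs of the same capacity.
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v, c in edges:
        cap[(u, v)] += c
        cap[(v, u)] += c
        adj[u].add(v)
        adj[v].add(u)
    t = "__super_sink__"          # assumed not to clash with a vertex name
    for s in others:
        cap[(s, t)] = INF
        adj[s].add(t)
        adj[t].add(s)
    while True:                   # Edmonds-Karp: BFS augmenting paths
        parent = {source: None}
        queue = deque([source])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:       # no augmenting path: parent = source side
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= bottleneck
            cap[(v, u)] += bottleneck
    reachable = set(parent)
    # Original edges crossing the residual source side form the min cut.
    return [(u, v, c) for u, v, c in edges
            if (u in reachable) != (v in reachable)]

def multiway_cut(edges, terminals):
    # Union of the k isolating cuts F_1, ..., F_k.
    F = set()
    for i, s in enumerate(terminals):
        F.update(min_isolating_cut(edges, s, terminals[:i] + terminals[i + 1:]))
    return F
```

On a unit-cost triangle with all three vertices as terminals the union costs 3 while the optimum is 2, which is consistent with (but below) the factor-2 bound you are asked to prove.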
Problem 4 (Greedy Algorithm for set cover). In this problem you need to
implement a greedy algorithm for set cover. Your program reads from standard
input. In the first line of input you are given numbers n and m, where n is
the number of elements in the universe and m is the number of subsets. In the
next m lines you are given the descriptions of the subsets; in line i + 1 you
are given numbers wi, si, a_{i,1}, a_{i,2}, ..., a_{i,si}, where wi is the cost
of the set Si, si is the number of elements in Si, and a_{i,1}, ..., a_{i,si}
are the elements of Si, each from {1, ..., n}. On the standard output you need
to write one number: the cost of a feasible set cover found by your algorithm.
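One possible shape of such a program is sketched below (any correct greedy implementation is acceptable; the helper names are illustrative). The greedy rule is the classical one: repeatedly take the set minimizing cost per newly covered element until the universe {1, ..., n} is covered:

```python
import sys

def greedy_set_cover_cost(n, sets):
    # sets: list of (weight, elements) pairs; returns the cost of the
    # greedy cover of the universe {1, ..., n}.
    uncovered = set(range(1, n + 1))
    total = 0
    while uncovered:
        w, elems = min(
            ((w, s) for w, s in sets if s & uncovered),
            key=lambda ws: ws[0] / len(ws[1] & uncovered),
        )
        uncovered -= elems
        total += w
    return total

def main():
    # Parse the input format described above: n, m, then m lines of
    # w_i, s_i followed by s_i element labels.
    data = sys.stdin.read().split()
    pos = 0
    n, m = int(data[pos]), int(data[pos + 1])
    pos += 2
    sets = []
    for _ in range(m):
        w, s = int(data[pos]), int(data[pos + 1])
        pos += 2
        elems = {int(x) for x in data[pos:pos + s]}
        pos += s
        sets.append((w, elems))
    print(greedy_set_cover_cost(n, sets))

# Call main() when running this as a standalone program reading stdin.
```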
Problem 5 (Optional “hard” exercise: New Year’s Eve in the lab). It is the
last day of the year and, in the lab, there are some tasks T1, . . . , Tm that
should be performed before everyone heads home for the New Year's Eve party.
Each task
can be performed only by people with specific skills. Thus, for each worker i in
the lab there is a subset Si ⊆ {T1 , . . . , Tm } of tasks that she can perform. Let
A = (A1 , . . . , An ) ∈ S1 × · · · × Sn be an assignment of tasks to the n workers of
the lab. Each worker i would like to leave the lab as soon as possible in order
to join the New Year’s Eve party. Thus each worker i has a cost ci (A) for the
assignment A, representing the time that i takes to perform the assigned job,
where the time depends on the number of people assigned to the same task and
their skills. The aim of the worker is to minimize this cost.
The workers discuss a lot about how to assign the tasks to themselves, but
they are unable to find a solution that satisfies each of them. The head of
the lab then proposes the following solution: he will announce a probability
distribution
π on the possible assignments. Then he will draw a specific assignment A*
according to this distribution and suggest to each worker i the task A*_i
that he must perform (but worker i will not know which tasks have been
suggested to the other workers). Worker i will then perform the task that
minimizes the expected cost, where the expectation is taken with respect to
the announced distribution π conditioned on the task assigned to i being A*_i.
Example. Suppose n = m = 2 and Si = {T1 , T2 } for each worker i. Then
there are four possible assignments: (T1 , T1 ), (T1 , T2 ), (T2 , T1 ) and (T2 , T2 ).
The head of the lab assigns probability π(T1 , T1 ) = π(T2 , T2 ) = 1/6 and probability π(T1 , T2 ) = π(T2 , T1 ) = 1/3. This probability distribution is known to each
worker. Then the head of the lab selects an assignment according to this probability distribution, say (T1 , T2 ). The workers do not know which assignment
has been selected. Finally, the head of the lab signals to worker 1 that,
according to the selected assignment, he should perform task T1. Similarly,
worker 2 is signaled that he should perform task T2. With this signal, worker
1 knows that the selected assignment is (T1, T2) with probability
π(T1, T2) / (π(T1, T2) + π(T1, T1)) = 2/3, and that it is (T1, T1) with the
remaining probability. Similarly, worker 2 knows that the selected assignment
is (T1, T2) with probability 2/3 and (T2, T2) with probability 1/3. Then,
worker 1 will perform the suggested task (T1) only if

    c1(T1, T2) · 2/3 + c1(T1, T1) · 1/3 ≤ c1(T2, T2) · 2/3 + c1(T2, T1) · 1/3.
Similarly, worker 2 will perform the suggested task (T2 ) only if
    c2(T1, T2) · 2/3 + c2(T2, T2) · 1/3 ≤ c2(T1, T1) · 2/3 + c2(T2, T1) · 1/3.
In the above example, the probability distribution π has no particular property
and hence it may be the case that some workers deviate from the suggested
task. Prove that the head of the lab can always compute in polynomial time a
distribution π such that each worker i will perform the suggested task.
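The posterior computation in the example can be checked mechanically. The sketch below reproduces it for the announced π; the congestion-style cost (the number of workers assigned to the same task) is a hypothetical choice, used only to illustrate the obedience test:

```python
from fractions import Fraction as Fr

# The distribution announced by the head of the lab in the example.
pi = {("T1", "T1"): Fr(1, 6), ("T2", "T2"): Fr(1, 6),
      ("T1", "T2"): Fr(1, 3), ("T2", "T1"): Fr(1, 3)}

def posterior(worker, signal):
    # Distribution over full assignments, conditioned on the worker's signal.
    consistent = {a: p for a, p in pi.items() if a[worker] == signal}
    z = sum(consistent.values())
    return {a: p / z for a, p in consistent.items()}

def obedient(worker, signal, cost):
    # True iff following the signal is no worse in expectation than any
    # unilateral deviation, given the posterior induced by the signal.
    post = posterior(worker, signal)
    def exp_cost(action):
        total = Fr(0)
        for a, p in post.items():
            b = list(a)
            b[worker] = action          # worker deviates to `action`
            total += p * cost[tuple(b)]
        return total
    follow = exp_cost(signal)
    return all(follow <= exp_cost(t) for t in ("T1", "T2"))
```

With the hypothetical congestion cost, both workers satisfy their obedience inequality, i.e. this π already has the property the problem asks you to establish in general.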
Problem 6 (Optional "hard" exercise: Hyper-graph spanning trees). Consider
the Steiner tree problem. We are given a graph G = (V, E) with edge costs
ce ≥ 0, and a subset R ⊆ V of required vertices (called terminals). A Steiner
tree is a subset of edges which connects all the terminals in R. We assume
that the cost function c satisfies the triangle inequality, i.e.
c_uw + c_wv ≥ c_uv. The optimal Steiner tree on G shall be denoted by OPT(G).
Let us consider the complete graph G2 on the vertex set R only, where there
is an edge {u, v} for each pair of terminals u, v ∈ R. The cost of an edge
{u, v} in G2 is equal to the length of a shortest path between u and v in G.
In the class you have seen (you can also find the proof in Vazirani, Ch. 3.1)
that the minimum spanning tree in G2 costs at most 2 · OPT(G), i.e.
MST(G2) ≤ 2 · OPT(G). The factor 2 is tight, as shown in Figure 1.
Figure 1: Tight factor 2 for the minimum spanning tree in G2. Graph G consists
of vertices 0, 1, 2, ..., 16 and edges (0, 1), (0, 2), . . . , (0, 16), all of
cost 1; the terminals are 1, 2, ..., 16. The optimal Steiner tree in G is the
graph itself. The minimum spanning tree in G2 consists of the edges {1, 2},
{2, 3}, {3, 4}, . . . , {15, 16}. We can see that the edges (0, 2), (0, 3),
..., (0, 15) are counted twice in the cost of MST(G2). Of course, G2 contains
an edge for every pair of terminals, for example {7, 13} and {2, 5}, but we do
not use them to build MST(G2).
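The figure's accounting can be verified numerically. The sketch below builds the metric closure of the star (every terminal pair is at distance 2) and runs Prim's algorithm on it; all names are illustrative:

```python
import heapq

def mst_cost(vertices, dist):
    # Prim's algorithm on the complete graph with edge weights dist(u, v).
    start = vertices[0]
    in_tree, total = {start}, 0
    heap = [(dist(start, v), v) for v in vertices[1:]]
    heapq.heapify(heap)
    while len(in_tree) < len(vertices):
        d, v = heapq.heappop(heap)
        if v in in_tree:
            continue
        in_tree.add(v)
        total += d
        for u in vertices:
            if u not in in_tree:
                heapq.heappush(heap, (dist(v, u), u))
    return total

terminals = list(range(1, 17))
star_dist = lambda u, v: 2      # every terminal pair is joined via vertex 0
mst_g2 = mst_cost(terminals, star_dist)   # 15 edges of cost 2
opt_cost = 16                   # the star itself is the optimal Steiner tree
```

Here MST(G2) = 30 against OPT(G) = 16, a ratio of 1.875 that approaches 2 as the number of leaves grows.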
We can view an MST in G2 as building a Steiner tree in G out of small
components, where these small components are just shortest paths between two
terminals. One can ask: what if the small components from which we build the
Steiner tree were not shortest paths between two terminals, but optimal
Steiner trees on three terminals?
So suppose now that we consider a hyper-graph G3 where the set of vertices is
still R, but the set of edges is the set of all triples of terminals
{{u, v, w} | u, v, w ∈ R}, and the cost of hyper-edge {u, v, w} is the cost of
an optimal Steiner tree that spans u, v, w. Note here that the hyper-edges
also include pairs, since the elements of a triple might not all be distinct.
For any three terminals u, v, w an optimal Steiner tree spanning them can be
found in polynomial time: if the triple consists of only two distinct
vertices, then the cheapest way to connect them is via a shortest path; if
u, v, w are all distinct, then the optimal Steiner tree consists either of
u, v, w alone, or of u, v, w plus one additional vertex s (there is no need to
add more than one non-required vertex).
In a hyper-graph, the notion of connectivity remains the same: if two vertices
are in the same hyper-edge then they are directly connected, and two vertices
u, u′ are connected if there exists a path u = u0, u1, ..., ui = u′ between
them such that uj and uj+1 are directly connected. Thus the concept of a
spanning tree in G3 should also be clear: a spanning tree is a subset of
hyper-edges that connects all the terminals. Since a hyper-edge represents a
tree on three nodes, it is clear that a spanning tree in G3 represents a
feasible Steiner tree in G. Now, it can be shown that the minimum spanning
tree MST(G3) gives a better approximation than MST(G2) does.
Example. Graph G consists of nodes 0, 1, 2, ..., 16 and edges (0, 1), (0, 2),
. . . , (0, 16), all of cost 1; the terminals are 1, 2, ..., 16. The optimal
Steiner tree in G is the graph G itself, and its cost is 16. The hyper-graph
G3 is constructed on nodes 1, 2, ..., 16 and consists of all possible triples
of elements from 1, 2, ..., 16. Consider the triple {1, 2, 3}: the optimal
Steiner tree that spans {1, 2, 3} in G is the tree on nodes 0, 1, 2, 3 with
edges (0, 1), (0, 2), (0, 3). From the tree OPT(G), we can construct in the
hyper-graph a spanning subgraph by taking the following hyper-edges:
{1, 2, 3}, {3, 4, 5}, {5, 6, 7}, {7, 8, 9}, {9, 10, 11}, {11, 12, 13},
{13, 14, 15}, {15, 16, 1}^1. The cost of any hyper-edge {i, i + 1, i + 2} is
equal to 3, thus the total cost of the spanning subgraph we constructed is
8 · 3 = 24, and hence MST(G3) ≤ 24 = (3/2) · 16 = (3/2) · OPT(G). In general,
if instead of 16 we had 2n vertices, then the cost of OPT(G) would be 2n, and
the cost of MST(G3) would be at most 3n, so the inequality
MST(G3) ≤ (3/2) · OPT(G) would still hold. Note that the graph G3 contains
many more triples, e.g. {2, 4, 6} is also a hyper-edge of G3, but we do not
use it to construct the spanning tree.
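The example's accounting can be checked mechanically: the eight triples cover all sixteen terminals, form a connected hyper-graph (each triple acts as a clique), and cost 8 · 3 = 24 in total:

```python
# The chain of hyper-edges built from OPT(G) in the example above.
triples = [(1, 2, 3), (3, 4, 5), (5, 6, 7), (7, 8, 9),
           (9, 10, 11), (11, 12, 13), (13, 14, 15), (15, 16, 1)]

covered = set()
for t in triples:
    covered.update(t)

# In the star graph each distinct triple costs 3 (three unit edges to 0).
total_cost = 3 * len(triples)

# Connectivity check via union-find: union the members of each triple.
parent = {i: i for i in range(1, 17)}
def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x
for a, b, c in triples:
    parent[find(b)] = find(a)
    parent[find(c)] = find(a)
connected = len({find(i) for i in range(1, 17)}) == 1
```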
The constant 3/2 worked only in the above example, and it does not hold in
general. For G3 the correct constant is 5/3, i.e. it can be proven that for
any graph G

    MST(G3) ≤ (5/3) · OPT(G).

However, the constant 3/2 does hold in general if we consider G4 instead of
G3, and a variant of this inequality is the problem you have to solve.
Suppose you are given a graph G in which the optimal Steiner tree OPT(G) is a
complete binary tree in which all terminals are leaves and all leaves are
terminals. Consider the hyper-graph G4 constructed as described earlier, but
on quadruples of terminals^2. Show that you can construct a spanning tree T
in G4 of cost at most (3/2) · OPT(G); this will imply that
MST(G4) ≤ c(T) ≤ (3/2) · OPT(G). Hint: the Steiner tree OPT(G) might have
different costs on its edges, but first assume that they are all equal. Show
the inequality in this case, and then argue why, using randomization, we can
conclude that the inequality MST(G4) ≤ (3/2) · OPT(G) holds even for
different edge costs.
^1 Note that we would also be fine with taking {15, 16} instead of {15, 16, 1}
as the last edge, but we want to keep things simple here.
^2 Quadruples are actually much easier to handle in the binary-tree case than
triples.
Figure 2: Example of a spanning tree in a hyper-graph.