Chapter 11 11.1 Load Balancing

Approximation Algorithms
Q. Suppose I need to solve an NP-hard problem. What should I do?
A. Theory says you're unlikely to find a poly-time algorithm.
Chapter 11
Must sacrifice one of three desired features.
Solve problem to optimality.
Solve problem in poly-time.
Solve arbitrary instances of the problem.
Approximation
Algorithms



-approximation algorithm.
Guaranteed to run in poly-time.
Guaranteed to solve arbitrary instance of the problem
Guaranteed to find solution within ratio  of true optimum.



Challenge. Need to prove a solution's value is close to optimum, without
even knowing what optimum value is!
1
2
Load Balancing
11.1 Load Balancing
Input. m identical machines; n jobs; job j has processing time tj.
Job j must run contiguously on one machine.
A machine can process at most one job at a time.


Def. Let J(i) be the subset of jobs assigned to machine i. The
load of machine i is Li = j  J(i) tj.
Def. The makespan is the maximum load on any machine L = maxi Li.
Load balancing. Assign each job to a machine to minimize the makespan.
Decision Version. Is the makespan bound by a number K?
4
Load Balancing on 2 Machines
Load Balancing: Greedy Scheduling
Claim. Load balancing is hard even if only 2 machines.
Pf. NUMBER-PARTITION  P LOAD-BALANCE.
Greedy-scheduling algorithm.
Consider n jobs in some fixed order.
Assign job j to machine whose load is smallest so far.


NP-complete
a
e
b
c
Greedy-Scheduling(m, n, t1,t2,…,tn) {
for i = 1 to m {
load on machine i
Li  0
jobs assigned to machine i
J(i)  
}
d
f
g
length of job f
a
machine 1
b
machine 2
0
d Machine 1
f
c Machine e
2
for j = 1 to n {
i = argmink { Lk }
J(i)  J(i)  {j}
Li  Li + tj
}
return J(1), …, J(m)
yes
assign job j to machine i
update load of machine i
}
g
Time
machine i has smallest load
Implementation: O(n log m) using a priority queue.
L
5
6
1
Load Balancing: Greedy Scheduling Analysis
-approximation
An algorithm for an optimization problem is a -approximation if the
solution found by the algorithm is always within a factor  of the
optimal solution.
Theorem. [Graham, 1966] Greedy algorithm is a 2-approximation.
First worst-case analysis of an approximation algorithm.
Need to compare resulting solution with optimal makespan L*.


Minimization Problem:  = approximate-solution/optimal-solution
Lemma 1. The optimal makespan L*  maxj tj.
Pf. Some machine must process the most time-consuming job. ▪
Maximization Problem:  = optimal-solution/approximate-solution
In general, 1  . If  = 1, then the solution is optimal.
Lemma 2. The optimal makespan L *  1  t .
j j
m
Pf.
The total processing time is j tj .
One of m machines must do at least a 1/m fraction of total work. ▪


7
8
Load Balancing: Greedy Scheduling Analysis
Load Balancing: Greedy Scheduling Analysis
Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider max load Li of bottleneck machine i.
Let j be the last job scheduled on machine i.
When job j assigned to machine i, i had smallest load. Its load
before assignment is Li - tj  Li - tj  Lk for all 1  k  m.
Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider max load Li of bottleneck machine i.
Let j be the last job scheduled on machine i.
When job j assigned to machine i, i had smallest load. Its load
before assignment is Li - tj  Li - tj  Lk for all 1  k  m.
Sum inequalities over all k and divide by m:





Li  tj
blue jobs scheduled before j
Lemma 2

machine i
Now
j

1
L
m k k
 m1  k t k
 L*
Li  (Li  t j )  t j  2L *.

 
 
 L*
 L*

▪
Lemma 1
0
Li - tj
L = Li
9
10
Load Balancing: Greedy Scheduling Analysis
Load Balancing: List Scheduling Analysis
Q. Is our analysis tight?
A. Essentially yes.
Q. Is our analysis tight?
A. Essentially yes.
Ex: m machines, m(m-1) jobs length 1 jobs, one job of length m
Greedy solution = 2m-1;
Ex: m machines, m(m-1) jobs length 1 jobs, one job of length m
Greedy solution = 2m-1; Optimal makespan = m;  = 2-1/m.

machine 2 idle
machine 3 idle
machine 4 idle
m = 10
machine 5 idle
m = 10
machine 6 idle
machine 7 idle
machine 8 idle
machine 9 idle
machine 10 idle
list scheduling makespan = 19
11
optimal makespan = 10
12
2
Load Balancing: LPT Rule
Load Balancing: LPT Rule
Longest processing time (LPT). Sort n jobs in descending order of
processing time, and then run Greedy scheduling algorithm.
Observation. If at most m jobs, then greedy-scheduling is optimal.
Pf. Each job put on its own machine. ▪
Lemma 3. If there are more than m jobs, L*  2 tm+1.
Pf.
Consider first m+1 jobs t1, …, tm+1.
Since the ti's are in descending order, each takes at least tm+1 time.
There are m+1 jobs and m machines, so by pigeonhole principle, at
least one machine gets two jobs. ▪
LPT-Greedy-Scheduling(m, n, t1,t2,…,tn) {
Sort jobs so that t1 ≥ t2 ≥ … ≥ tn


for i = 1 to m {
Li  0
J(i)  
}

load on machine i
jobs assigned to machine i
for j = 1 to n {
i = argmink Lk
J(i)  J(i)  {j}
Li  Li + tj
}
return J(1), …, J(m)
Theorem. LPT rule is a 3/2 approximation algorithm.
Pf. Same basic approach as for the first greedy scheduling.
machine i has smallest load
assign job j to machine i
L i  (Li  t j )  t j 


 

 12 L*
 L*

update load of machine i
}
3 L *.
2
▪
Lemma 3
( by observation, can assume number of jobs > m )
13
14
Load Balancing: LPT Rule
11.2 Center Selection
Q. Is our 3/2 analysis tight?
A. No.
Theorem. [Graham, 1969] LPT rule is a 4/3-approximation.
Pf. More sophisticated analysis of the same algorithm.
Q. Is Graham's 4/3 analysis tight?
A. Essentially yes.
15
Center Selection Problem
Center Selection Problem
Input. Set of n sites s1, …, sn and integer k > 0.
Input. Set of n sites s1, …, sn and integer k > 0.
Center selection problem. Select k centers C so that maximum
distance from a site to nearest center is minimized.
Center selection problem. Select k centers C so that maximum
distance from a site to nearest center is minimized.
Notation.
dist(x, y) = distance between x and y.
dist(si, C) = min c  C dist(si, c) = distance from si to closest center.
r(C) = maxi dist(si, C) = smallest covering radius.
k=4



r(C)
Goal. Find set of centers C that minimizes r(C), subject to |C| = k.
Distance function properties.
dist(x, x) = 0
dist(x, y) = dist(y, x)
dist(x, y)  dist(x, z) + dist(z, y)
center
site



17
(identity)
(symmetry)
(triangle inequality)
18
3
Center Selection Example
Greedy Algorithm: A False Start
Ex: each site is a point in the plane, a center can be any point in the
plane, dist(x, y) = Euclidean distance.
Greedy algorithm. Put the first center at the best possible location
for a single center, and then keep adding centers so as to reduce the
covering radius each time by as much as possible.
Remark: search can be infinite!
Remark: arbitrarily bad!
r(C)
greedy center 1
center
k = 2 centers
site
center
site
19
20
Center Selection: Greedy Algorithm
Center Selection: Analysis of Greedy Algorithm
Greedy algorithm. Repeatedly choose the next center to be the site
farthest from any existing center.
Theorem. Let C* = {ci* } be an optimal set of centers. Then r(C)  2r(C*).
Pf. (by contradiction) Assume r(C*) < ½ r(C).
For each site ci in C, consider ball of radius ½ r(C) around it.
Exactly one ci* in each ball; let ci be the site paired with ci*.
Consider any site s and its closest center ci* in C*.
dist(s, C) = dist(s, ci)  dist(s, ci*) + dist(ci*, ci)  2r(C*).
Thus r(C)  2r(C*). ▪


Greedy-Center-Selection(k, n, s1,s2,…,sn) {

C = { s1 }
repeat k-1 times {
Select a site si with maximum dist(si, C)
Add si to C
site farthest from any center
}
return C


 r(C*) since ci* is closest center
}
½ r(C)
½ r(C)
Observation. Upon termination all centers in C are pairwise at least r(C)
apart.
Pf. By construction of algorithm.
ci
½ r(C)
C*
sites
s
ci*
21
22
Center Selection
Center Selection: Hardness of Approximation
Theorem. Let C* be an optimal set of centers. Then r(C)  2r(C*).
Theorem. Unless P = NP, there is no -approximation algorithm for
metric k-center problem for any  < 2.
Theorem. Greedy algorithm is a 2-approximation for center selection
problem.
Pf. We show how we could use a (2 - ) approximation algorithm for kcenter to solve DOMINATING-SET in poly-time.
see Exercise 8.29
Let G = (V, E), k be an instance of DOMINATING-SET.
Construct instance G' of k-center with sites V and distances
– d(u, v) = 1 if (u, v)  E
– d(u, v) = 2 if (u, v)  E
Note that G' satisfies the triangle inequality.
Claim: G has dominating set of size k iff there exists k centers C*
with r(C*) = 1.
Thus, if G has a dominating set of size k, a (2 - )-approximation
algorithm on G' must find a solution C* with r(C*) = 1 since it cannot
use any edge of distance 2.
Remark. Greedy algorithm always places centers at sites, but is still
within a factor of 2 of best solution that is allowed to place centers
anywhere.


e.g., points in the plane


Question. Is there hope of a 3/2-approximation? 4/3?

Theorem. Unless P = NP, there no -approximation for center-selection
problem for any  < 2.
23
24
4
Weighted Vertex Cover
11.4 The Pricing Method:
Weighted Vertex Cover
Weighted vertex cover. Given a graph G with vertex weights, find a
vertex cover of minimum weight.
2
4
2
4
2
9
2
9
weight = 11
weight = 2 + 2 + 4
26
Pricing Method
Pricing Method
Pricing method. Each edge must be covered by some vertex.
Edge e = (i, j) pays price pe  0 to use vertex i and j.
Pricing method. Set prices and find vertex cover simultaneously.
Fairness. Edges incident to vertex i should pay  wi in total.
2
Weighted-Vertex-Cover-Approx(G, w) {
foreach e in E
pe = 0
4
for each vertex i :  pe  wi
e ( i , j )
2
9
Lemma. For any vertex cover S and any fair prices pe: e pe  w(S).
Pf.
S  set of all tight nodes
return S
▪
 pe  
e E
each edge e covered by
at least one node in S
}
 pe   wi  w( S ).
i  S e  (i , j )
 pe  wi
e  (i , j )
while ( edge i-j such that neither i nor j are tight)
select such an edge e
increase pe as much as possible until i or j tight
}
iS
sum fairness inequalities
for each node in S
27
28
Pricing Method
Pricing Method: Analysis
Theorem. Pricing method is a 2-approximation.
Pf.
Algorithm terminates since at least one new node becomes tight
after each iteration of while loop.



price of edge a-b
vertex weight
Let S = set of all tight nodes upon termination of algorithm. S is a
vertex cover: if some edge i-j is uncovered, then neither i nor j is
tight. But then while loop would not terminate.
Let S* be optimal vertex cover. We show w(S)  2w(S*).
w(S)   wi  
i S
 pe 
i S e(i, j)
all nodes in S are tight
Figure 11.8
29

 pe  2  pe  2w(S*).
iV e(i, j)
S  V,
prices  0
e E
each edge counted twice
fairness lemma
30
5
Weighted Vertex Cover
11.6 LP Rounding: Weighted Vertex Cover
Weighted vertex cover. Given an undirected graph G = (V, E) with
vertex weights wi  0, find a minimum weight subset of nodes S such
that every edge is incident to at least one vertex in S.
10
A
6F
9
16
B
G
7
10
6
3C
H
9
23
D
I
33
7
E
J
10
32
total weight = 55
32
Weighted Vertex Cover: IP Formulation
Weighted Vertex Cover: IP Formulation
Weighted vertex cover. Given an undirected graph G = (V, E) with
vertex weights wi  0, find a minimum weight subset of nodes S such
that every edge is incident to at least one vertex in S.
Weighted vertex cover. Integer programming formulation.
( ILP) min
 wi xi
i  V
Integer programming formulation.
Model inclusion of each vertex i using a 0/1 variable xi.
s. t. xi  x j
xi

 1
(i, j)  E
 {0,1} i  V
 0 if vertex i is not in vertex cover
xi  
 1 if vertex i is in vertex cover
Observation. If x* is optimal solution to (ILP), then S = {i  V : x*i = 1}
is a minimum weight vertex cover.
Vertex covers in 1-1 correspondence with 0/1 assignments:
S = {i  V : xi = 1}


Objective function: minimize i wi xi.
For each edge (i, j), must take either i or j: xi + xj  1.
33
34
Integer Programming
Linear Programming
INTEGER-PROGRAMMING. Given integers aij and bi, find integers xj that
Linear programming. Max/min linear objective function subject to
linear inequalities.
Input: integers cj, bi, aij .
Output: real numbers xj.
satisfy:

n
 aij x j
max c t x
s. t. Ax  b
x
j1
integral
xj
xj
 bi
1 i  m

n
 0
1 j  n
integral 1  j  n
(P) max c t x
s. t. Ax  b
x  0
Observation. Vertex cover formulation proves that integer
programming is NP-hard.
(P) max
 cj xj
j1
n
s. t.  aij x j
j1
xj

bi 1  i  m
 0
1 j  n
Linear. No x2, xy, arccos(x), x(1-x), etc.
even if all coefficients are 0/1 and
at most two variables per inequality
Simplex algorithm. [Dantzig 1947] Can solve LP in practice.
Ellipsoid algorithm. [Khachian 1979] Can solve LP in poly-time.
35
36
6
LP Feasible Region
Weighted Vertex Cover: LP Relaxation
LP geometry in 2D.
Weighted vertex cover. Linear programming formulation.
x1 = 0
( LP) min
 wi xi
i  V
s. t. xi  x j
xi
 1
(i, j)  E
 0 i V
Observation. Optimal value of (LP) is  optimal value of (ILP).
Pf. LP has fewer constraints.
Note. LP is not equivalent to vertex cover.
x2 = 0
Q. How can solving LP help us find a small vertex cover?
A. Solve LP and round fractional values.
x1 + 2x2 = 6
2x1 + x2 = 6
½
½
½
37
38
Weighted Vertex Cover
Weighted Vertex Cover
Theorem. If x* is optimal solution to (LP), then S = {i  V : x*i  ½} is a
vertex cover whose weight is at most twice the min possible weight.
Theorem. 2-approximation algorithm for weighted vertex cover.
Pf. [S is a vertex cover]
Consider an edge (i, j)  E.
Since x*i + x*j  1, either x*i  ½ or x*j  ½  (i, j) covered.
Theorem. [Dinur-Safra 2001] If P  NP, then no -approximation
for  < 1.3607, even with unit weights.


10 5 - 21
Pf. [S has desired cost]
Let S* be optimal vertex cover. Then
Open research problem. Close the gap.

 wi 
i  S*

iS
wi xi*
LP is a relaxation

1
2
 wi
iS
x*i  ½
39
40
Polynomial Time Approximation Scheme
11.8 Knapsack Problem
PTAS. (1 + )-approximation algorithm for any constant  > 0.
Load balancing. [Hochbaum-Shmoys 1987]
Euclidean TSP. [Arora 1996]


Consequence. PTAS produces arbitrarily high quality solution, but trades
off time for accuracy.
This section. PTAS for knapsack problem via rounding and scaling.
42
7
Knapsack Problem
Knapsack is NP-Complete
KNAPSACK: Given a finite set X, nonnegative weights wi, nonnegative
values vi, a weight limit W, and a target value V, is there a subset S  X
such that:
 wi  W
Knapsack problem.
Given n objects and a "knapsack."
we'll assume wi  W
Item i has value vi > 0 and weighs wi > 0.
Knapsack can carry weight up to W.
Goal: fill knapsack so as to maximize total value.



iS
 vi

Item
Ex: { 3, 4 } has value 40.
W = 11
Value
1
1
1
2
6
2
3
18
5
4
22
6
5
28
7
 V
iS
Weight
SUBSET-SUM: Given a finite set X, nonnegative values ui, and an integer
U, is there a subset S  X whose elements sum to exactly U?
Claim. SUBSET-SUM  P KNAPSACK.
Pf. Given instance (u1, …, un, U) of SUBSET-SUM, create KNAPSACK
instance:
vi  wi  ui
 ui
 U
iS
V  W U
 ui
 U
iS
43
44
Knapsack Problem: Dynamic Programming 1
Knapsack Problem: Dynamic Programming II
Def. OPT(i, w) = max value subset of items 1,..., i with weight limit w.
Case 1: OPT does not select item i.
– OPT selects best of 1, …, i–1 using up to weight limit w
Case 2: OPT selects item i.
– new weight limit = w – wi
– OPT selects best of 1, …, i–1 using up to weight limit w – wi
Def. OPT(i, v) = min weight subset of items 1, …, i that yields value
exactly v.
Case 1: OPT does not select item i.
– OPT selects best of 1, …, i-1 that achieves exactly value v
Case 2: OPT selects item i.
– consumes weight wi, new value needed = v – vi
– OPT selects best of 1, …, i-1 that achieves exactly value v




 0
if i  0

if w i  w
OPT (i, w)   OPT (i  1, w)
 max OPT (i  1, w), v  OPT (i  1, w  w ) otherwise


i
i 
 0
if v  0

if i  0, v > 0
 
OPT (i, v)  
if vi  v
 OPT(i 1, v)

 min  OPT (i 1, v), wi  OPT (i 1, v  vi )  otherwise
Running time. O(n W).
W = weight limit.
Not polynomial in input size!


45
46
Knapsack: PTAS
 0

 
OPT (i, v)  
 OPT(i 1, v)

 min  OPT (i 1, v), wi  OPT (i 1, v  vi ) 
Knapsack: PTAS
 0

 
OPT (i, v)  
 OPT(i 1, v)

 min  OPT (i 1, v), wi  OPT (i 1, v  vi ) 
if v  0
if i  0, v > 0
if vi  v
otherwise
Item
Value
Weight
1
1
1
0 1 2 3 4 5 6 7 8 9 1 1 1 1 1 1
0 1 2 3 4 5
2
1
2
3
3
0 0 x x x x x x x x x x x x x x x
4
4
1
5
6
i = 0 or v = 0
0
Item
Value
1
1
1
2
1
2
5
0 1 2 3 4 5 6 7 8 9 1 1 1 1 1 1
0 1 2 3 4 5
3
3
5
6
0 0 x x x x x x x x x x x x x x x
4
4
6
7
1
5
6
7
2 0
3 0
if v  0
if i  0, v > 0
if vi  v
otherwise
i=1,v=…
0
1
x x x x x x x x x x x x x x
Weight
2 0
3 0
W = 11
4 0
W = 11
4 0
5 0
5 0
47
48
8
Knapsack: PTAS
 0

 
OPT (i, v)  
 OPT(i 1, v)

 min  OPT (i 1, v), wi  OPT (i 1, v  vi ) 
Knapsack: PTAS
 0

 
OPT (i, v)  
 OPT(i 1, v)

 min  OPT (i 1, v), wi  OPT (i 1, v  vi ) 
if v  0
if i  0, v > 0
if vi  v
otherwise
Item
Value
Weight
1
1
1
0 1 2 3 4 5 6 7 8 9 1 1 1 1 1 1
0 1 2 3 4 5
2
1
2
3
3
0 0 x x x x x x x x x x x x x x x
4
4
1
5
6
i=2,v=…
0
1
x x x x x x x x x x x x x x
2 0
1
3 x x x x x x x x x x x x x
3 0
if v  0
if i  0, v > 0
if vi  v
otherwise
Item
Value
1
1
1
2
1
2
5
0 1 2 3 4 5 6 7 8 9 1 1 1 1 1 1
0 1 2 3 4 5
3
3
5
6
0 0 x x x x x x x x x x x x x x x
4
4
6
7
1
5
6
7
i=3,v=…
W = 11
4 0
0
1
x x x x x x x x x x x x x x
2 0
1
3 x x x x x x x x x x x x x
3 0
1
3 5 6 8 x x x x x x x x x x
Weight
W = 11
4 0
5 0
5 0
49
50
Knapsack: PTAS
 0

 
OPT (i, v)  
 OPT(i 1, v)

 min  OPT (i 1, v), wi  OPT (i 1, v  vi ) 
Knapsack: PTAS
 0

 
OPT (i, v)  
 OPT(i 1, v)

 min  OPT (i 1, v), wi  OPT (i 1, v  vi ) 
if v  0
if i  0, v > 0
if vi  v
otherwise
Item
Value
Weight
1
1
1
0 1 2 3 4 5 6 7 8 9 1 1 1 1 1 1
0 1 2 3 4 5
2
1
2
3
3
0 0 x x x x x x x x x x x x x x x
4
4
1
5
6
i = 4, v = …
0
1
x x x x x x x x x x x x x x
2 0
1
3 x x x x x x x x x x x x x
3 0
1
3 5 6 8 x x x x x x x x x x
4 0
1
3 5 6 7 9
11
12
14
if v  0
if i  0, v > 0
if vi  v
otherwise
Item
Value
1
1
1
2
1
2
5
0 1 2 3 4 5 6 7 8 9 1 1 1 1 1 1
0 1 2 3 4 5
3
3
5
6
0 0 x x x x x x x x x x x x x x x
4
4
6
7
1
5
6
7
i = 5, v = …
W = 11
x x x x x x
5 0
0
1
x x x x x x x x x x x x x x
2 0
1
3 x x x x x x x x x x x x x
3 0
1
3 5 6 8 x x x x x x x x x x
4 0
1
3 5 6 7 9
11
12
14
x x x x x x
5 0
1
3 5 6 7 7 8
10
12
13
14
16
18
19
Weight
W = 11
21
51
52
Knapsack Problem: Dynamic Programming II
Knapsack: PTAS
Def. OPT(i, v) = min weight subset of items 1, …, i that yields value
exactly v.
Case 1: OPT does not select item i.
– OPT selects best of 1, …, i-1 that achieves exactly value v
Case 2: OPT selects item i.
– consumes weight wi, new value needed = v – vi
– OPT selects best of 1, …, i-1 that achieves exactly value v
Intuition for approximation algorithm.
Round all values up to lie in smaller range.
Run dynamic programming algorithm on rounded instance.
Return optimal items in rounded instance.





 0

 
OPT (i, v)  
 OPT(i 1, v)

 min  OPT (i 1, v), wi  OPT (i 1, v  vi ) 
if v  0
if i  0, v > 0
if vi  v
otherwise
Item
Value
Weight
Item
Value
1
934,221
1
1
1
1
2
5,956,342
2
2
6
2
3
17,810,013
5
3
18
5
4
21,217,800
6
4
22
6
5
27,343,199
7
5
28
7
V*  n vmax
W = 11
Running time. O(n V*) = O(n2 vmax).
V* = optimal value = maximum v such that OPT(n, v)  W.
Not polynomial in input size!
original instance

Weight
W = 11
rounded instance

53
54
9
Knapsack: PTAS
Knapsack PTAS. Round up all values:
Knapsack: PTAS
v 
v 
vi   i   , vˆi   i 
 
 
vmax = largest value in original instance
 = precision parameter
– 
= scaling factor =  vmax / n
Theorem. If S is solution found by our algorithm and S* is an optimal
solution of the original problem, then (1  )  vi   vi
–
–
Observation. Optimal solution to problems with
v 
vi   i  
 
Knapsack PTAS. Round up all values:
iS
i  S*
Pf. Let S* be an optimal solution satisfying weight constraint.
v or vˆ are equivalent.
 vi

i  S*
Intuition. v close to v so optimal solution using v is nearly optimal;
vˆ small and integral so dynamic programming algorithm is fast.
 vi
always round up
 vi
solve rounded instance optimally
i  S*

iS

Running time. O(n3 / ).
Dynamic program II running time is O(n2 vˆmax ) , where
 (vi  )
never round up by more than 
 vi  n
|S|  n
iS


v   n 
vˆmax  max   
    
i S
DP alg can take vmax
 (1 )  vi
n  =  vmax, vmax  iS vi
i S
55
56
10