Lecture 11: Load Balancing: Balls and Bins

CSCI-B609: A Theorist’s Toolkit, Fall 2016
Sept 27
Lecturer: Yuan Zhou
Scribe: Chao Tao

1 Balls and Bins
Moving on from the last lecture, this lecture is about load balancing. Suppose we have $n$ balls and $n$ bins, and each ball is placed into a bin chosen independently and uniformly at random. We are interested in the number of balls in the max-loaded bin. This problem has an application in load balancing: suppose we want to assign $n$ tasks to $n$ servers and, prior to making the assignment, we do not know anything about the servers. The goal is to ensure that every server gets a roughly even load of tasks, and an easy randomized algorithm is to assign each task to a server chosen uniformly at random.
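The random assignment described above is easy to simulate. The following is a minimal sketch, not part of the original notes; the function name `max_load_one_choice` and the fixed seed are illustrative assumptions.

```python
import random
from collections import Counter


def max_load_one_choice(n, rng):
    """Throw n balls into n bins uniformly at random; return the fullest bin's load."""
    # Each ball independently picks one of the n bins (bins are labelled 0..n-1).
    counts = Counter(rng.randrange(n) for _ in range(n))
    return max(counts.values())


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, only for reproducibility
    for n in (100, 10_000, 100_000):
        print(n, max_load_one_choice(n, rng))
```

Running it for growing $n$ gives a feel for how slowly the maximum load increases compared with $n$.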
Theorem 1. The max-loaded bin has $O\left(\frac{\log n}{\log\log n}\right)$ balls w.p. $\ge 1 - \frac{1}{n}$.
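Proof. First, by a union bound over the $\binom{n}{k}$ sets of $k$ balls, we have

\[
\Pr[\text{Bin } \#i \text{ has } \ge k \text{ balls}] \le \binom{n}{k} \cdot \left(\frac{1}{n}\right)^{k} \le \frac{n^k}{k!} \cdot \frac{1}{n^k} = \frac{1}{k!} \le \frac{1}{k^{k/2}}.
\]

When $k = k^* = \frac{8 \log n}{\log\log n}$, we have $k^{k/2} \ge \left(\sqrt{\log n}\right)^{4\log n/\log\log n} = 2^{2\log n} = n^2$. By a union bound, we have

\[
\begin{aligned}
\Pr[\forall i, \text{Bin } \#i \text{ has } < k^* \text{ balls}] &= 1 - \Pr[\exists i, \text{Bin } \#i \text{ has } \ge k^* \text{ balls}] \\
&\ge 1 - \sum_{i=1}^{n} \Pr[\text{Bin } \#i \text{ has } \ge k^* \text{ balls}] \\
&\ge 1 - n \cdot \frac{1}{n^2} = 1 - \frac{1}{n}.
\end{aligned}
\]

In fact, we also have the following lower bound.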
Theorem 2. The max-loaded bin has $\Omega\left(\frac{\log n}{\log\log n}\right)$ balls w.p. $\ge 1 - \frac{e^2}{n^{1/3}}$.
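Proof. First, note that

\[
\Pr[\text{Bin } \#i \text{ has } \ge k \text{ balls}] \ge \binom{n}{k} \left(\frac{1}{n}\right)^{k} \left(1 - \frac{1}{n}\right)^{n-k} \ge \left(\frac{n}{k}\right)^{k} \left(\frac{1}{n}\right)^{k} \frac{1}{e} = \frac{1}{e k^k}.
\]

When $k = k^{**} = \frac{\log n}{3 \log\log n}$,

\[
e k^k \le e (\log n)^{\log n/(3\log\log n)} = e n^{1/3}.
\]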
Let $X_i = \mathbf{1}[\text{Bin } \#i \text{ has } \ge k^{**} \text{ balls}]$. We have $E[X_i] \ge \frac{1}{e n^{1/3}}$. Let $X = \sum_{i=1}^{n} X_i$. We have $EX \ge \frac{n^{2/3}}{e}$. Note that this alone does not mean $\Pr[X \ge 1] \to 1$ as $n \to +\infty$; we need more information about $X$. Recall Chebyshev’s inequality:

\[
\Pr[X = 0] \le \Pr[|X - EX| \ge EX] \le \frac{\mathrm{Var}[X]}{(EX)^2}.
\]
Here $\mathrm{Var}[X] = \sum_i \mathrm{Var}[X_i] + \sum_{i \ne j} \mathrm{Cov}[X_i, X_j]$, where $\mathrm{Cov}[X_i, X_j] = E[(X_i - EX_i)(X_j - EX_j)]$ is the covariance between $X_i$ and $X_j$. Since $X_i$ and $X_j$ are negatively correlated (some bin having more balls makes it less likely for another bin to do so), we have $\mathrm{Cov}[X_i, X_j] \le 0$ for $i \ne j$.

For $\mathrm{Var}[X_i]$, we have $\mathrm{Var}[X_i] = E(X_i - EX_i)^2 \le 1$ since $X_i \in \{0, 1\}$. Therefore $\sum_i \mathrm{Var}[X_i] \le n$. To summarize, we get
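
\[
\Pr[X = 0] \le \frac{\mathrm{Var}[X]}{(EX)^2} \le \frac{n}{n^{4/3}/e^2} = \frac{e^2}{n^{1/3}}. \tag{1}
\]

By (1), we have $\Pr[X \ge 1] \ge 1 - \frac{e^2}{n^{1/3}}$, which is what the theorem claims, since $X \ge 1$ means that some bin has at least $k^{**}$ balls.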
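The per-bin tail bounds used in the proofs of Theorems 1 and 2, namely $\frac{1}{ek^k} \le \Pr[\text{Bin } \#i \text{ has } \ge k \text{ balls}] \le \frac{1}{k!}$, can be checked numerically for small parameters. The sketch below is an illustration rather than part of the original notes; the helper names `prob_bin_at_least` and `check_bounds` are hypothetical, and the exact probability is computed as a Binomial$(n, 1/n)$ upper tail.

```python
from fractions import Fraction
from math import comb, e, factorial


def prob_bin_at_least(n, k):
    """Exact Pr[a fixed bin receives >= k of the n balls] (Binomial(n, 1/n) upper tail)."""
    # Pr[exactly j balls] = C(n, j) * (n - 1)^(n - j) / n^n, summed over j >= k.
    numerator = sum(comb(n, j) * (n - 1) ** (n - j) for j in range(k, n + 1))
    return Fraction(numerator, n ** n)


def check_bounds(n, k):
    """Return (lower bound 1/(e k^k), exact probability, upper bound 1/k!) as floats."""
    return 1 / (e * k ** k), float(prob_bin_at_least(n, k)), 1 / factorial(k)


if __name__ == "__main__":
    for k in range(1, 7):
        low, exact, high = check_bounds(1000, k)
        print(k, low, exact, high)
```

Exact rational arithmetic is used so that the comparison is not affected by floating-point cancellation in the tail.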
Remark 2.1 (Threshold phenomenon). Here, $\tau = \Theta\left(\frac{\log n}{\log\log n}\right)$ is a threshold: the probability that the max-loaded bin has far fewer than $\tau$ balls and the probability that it has far more than $\tau$ balls are both $o(1)$. We can observe such threshold phenomena in many random variables. For example, in PSET 1, we have seen
a) $\Pr[\exists \text{ 4-clique in } G(n, p)] = o(1)$ when $p = o(n^{-2/3})$,
b) and $\Pr[\nexists \text{ 4-clique in } G(n, p)] = o(1)$ when $p = \omega(n^{-2/3})$.

2 The Power of Two Choices
Now let us consider a slightly different strategy: for each ball, choose 2 bins independently at random, and add the ball to the less-loaded of the two. Then, we have the following theorem.

Theorem 3. The max-loaded bin has $O(\log\log n)$ balls w.p. $\ge 1 - O\left(\frac{\log^2 n}{n}\right)$.
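The two-choice strategy can be simulated in a few lines. This is a minimal sketch under assumptions the notes do not fix: the two bins are drawn uniformly with replacement, and ties go to the first choice. The function name `two_choice_loads` is illustrative.

```python
import random


def two_choice_loads(n, rng):
    """Place n balls into n bins, each ball going to the less-loaded of two random bins."""
    loads = [0] * n
    for _ in range(n):
        a = rng.randrange(n)
        b = rng.randrange(n)  # assumption: drawn with replacement, so b may equal a
        target = a if loads[a] <= loads[b] else b  # assumption: ties go to the first choice
        loads[target] += 1
    return loads


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, only for reproducibility
    for n in (100, 10_000, 100_000):
        print(n, max(two_choice_loads(n, rng)))
```

Comparing `max(two_choice_loads(n, rng))` with the maximum load of the single-choice strategy for the same $n$ illustrates the gap between $\log n/\log\log n$ and $\log\log n$.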

2.1 Intuition
Let $N_i$ be the number of bins loaded with $\ge i$ balls. A ball has height $i$ if there were $i - 1$ balls in its bin just before the ball was added. Let $B_i$ be the number of balls with height $\ge i$. Then, we have the following two facts:
a) $N_i \le B_i$,
b) and $N_3 \le \frac{n}{3}$.
Note that a bin loaded with $\ge i$ balls contains at least one ball with height $\ge i$, which gives $N_i \le B_i$. We also know that $N_3 \le \frac{n}{3}$, since there are only $n$ balls in total. What about $N_4$? Intuitively, a ball with height $\ge 4$ needs both of its chosen bins to be loaded with $\ge 3$ balls, which happens with chance $\left(\frac{N_3}{n}\right)^2 \le \frac{1}{9}$. Therefore, “in expectation”, $N_4 \le B_4 \le \frac{n}{9}$. Similarly, a ball with height $\ge 5$ needs to choose two bins loaded with $\ge 4$ balls, which happens with chance $\left(\frac{N_4}{n}\right)^2 \le \frac{1}{81}$. Therefore, we expect $N_5 \le B_5 \le \frac{n}{81}$. In general, we expect $N_i \le B_i \le n \cdot 3^{-2^{i-3}}$. When $i = \Theta(\log\log n)$, this number becomes $< 1$.
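To see how quickly the heuristic bound collapses, one can iterate it directly: the bounding fraction starts at $1/3$ for level 3 and is squared at each subsequent level ($1/3, 1/9, 1/81, \ldots$). The sketch below is an illustration only; the function name `heuristic_height` is hypothetical.

```python
def heuristic_height(n):
    """Smallest level i >= 3 at which the heuristic bound n * 3**(-2**(i - 3)) drops below 1."""
    i = 3
    # n * 3^(-2^(i-3)) < 1 is equivalent to 3^(2^(i-3)) > n; exact integers avoid rounding.
    while 3 ** (2 ** (i - 3)) <= n:
        i += 1
    return i


if __name__ == "__main__":
    for n in (10, 10**6, 10**12):
        print(n, heuristic_height(n))
```

Squaring $n$ (from $10^6$ to $10^{12}$) raises the level by only one, which is the doubly logarithmic growth that the intuition predicts.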

2.2 Full proof
Let $E_i$ be the event that $N_i \le \beta_i n$, where $\beta_3 = 1/3$ and $\beta_{i+1} = e \beta_i^2$ for $i \ge 3$. We already have $\Pr[E_3] = 1$. Then, we have the following claim.
Claim 3.1. Consider the following process: as we put balls in bins, we mark bins. A ball is called “marked” if both of its associated bins are marked. If throughout the process we mark $\le \alpha n$ bins, then $\Pr[\#\text{ of marked balls} > e\alpha^2 n] \le \frac{1}{n^2}$ when $\alpha^2 n \ge 2 \ln n$.
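Consider $\Pr[\neg E_{i+1} \wedge E_i]$ when $\beta_i^2 \ge 2 \ln n / n$. Whenever a bin is loaded with $\ge i$ balls, we mark it. Then a ball is marked if and only if the ball has height $\ge i + 1$. Therefore,

\[
\begin{aligned}
\Pr[\neg E_{i+1} \wedge E_i] &\le \Pr[B_{i+1} > \beta_{i+1} n \wedge N_i \le \beta_i n] \\
&= \Pr[\#\text{ of marked balls} > e\beta_i^2 n \wedge \#\text{ of marked bins} \le \beta_i n] \\
&\le \frac{1}{n^2} \quad \text{(by Claim 3.1)}.
\end{aligned}
\]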
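Therefore, as long as $\beta_i^2 \ge 2 \ln n / n$, we have

\[
\begin{aligned}
\Pr[\neg E_{i+1}] &\le \Pr[\neg E_i] + \Pr[\neg E_{i+1} \wedge E_i] \\
&\le \Pr[\neg E_i] + \frac{1}{n^2} \\
&\le \frac{i}{n^2} \quad \text{(by induction)}.
\end{aligned}
\]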
What is the largest $i^*$ such that $\beta_{i^*}^2 \ge \frac{2 \ln n}{n}$? From the following claim, we have $i^* = O(\ln\ln n)$.

Claim 3.2. The largest $i^*$ such that $\beta_{i^*}^2 \ge \frac{2 \ln n}{n}$ satisfies $i^* = O(\ln\ln n)$.
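Therefore,

\[
\Pr[\neg E_{i^*+1}] \le O\left(\frac{\ln\ln n}{n^2}\right),
\]

which means that w.p. $\ge 1 - O\left(\frac{\ln\ln n}{n^2}\right)$, at most a $\beta_{i^*+1} < \frac{2 \ln n}{n}$ fraction of the bins are loaded with $\ge i^* + 1 = O(\ln\ln n)$ balls. We also have the following claim.

Claim 3.3. $\Pr[B_{i^*+2} \ge 1] \le O\left(\frac{\ln^2 n}{n}\right)$.

2.2.1 Proof of Claim 3.1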
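Each ball is marked w.p. $\le \alpha^2$. Therefore the expected number of marked balls is $\mu \le \alpha^2 n$. Let $X_j$ indicate that ball $j$ is marked. By the Chernoff bound (the worst case happens when $\mu = \alpha^2 n$), we have

\[
\Pr[X_1 + \dots + X_n \ge e\mu] \le \left(\frac{e^{e-1}}{e^{e}}\right)^{\mu} = e^{-\mu} \le \frac{1}{n^2}.
\]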
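As a numerical sanity check of the Chernoff step, one can compare the bound $e^{-\mu}$ with an exact binomial tail in the worst case $\mu = \alpha^2 n = 2\ln n$. The sketch below is illustrative and rests on a simplifying assumption: each ball is marked independently with probability exactly $\alpha^2$. The helper names `binomial_upper_tail` and `chernoff_check` are hypothetical.

```python
from math import ceil, e, exp, lgamma, log, log1p


def binomial_upper_tail(n, p, threshold):
    """Pr[Binomial(n, p) >= threshold], summing each term in log space to avoid overflow."""
    total = 0.0
    for j in range(max(0, ceil(threshold)), n + 1):
        log_term = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                    + j * log(p) + (n - j) * log1p(-p))
        total += exp(log_term)
    return total


def chernoff_check(n):
    """Return (exact tail Pr[X >= e*mu], Chernoff bound e^{-mu}) for mu = 2 ln n."""
    mu = 2 * log(n)
    p = mu / n  # plays the role of alpha^2; assumes independent marking (simplification)
    return binomial_upper_tail(n, p, e * mu), exp(-mu)


if __name__ == "__main__":
    for n in (100, 1000, 10_000):
        print(n, chernoff_check(n))
```

With $\mu = 2\ln n$ the bound $e^{-\mu}$ equals $1/n^2$, which is where the condition $\alpha^2 n \ge 2\ln n$ in Claim 3.1 comes from.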

2.2.2 Proof of Claim 3.2
Let $\gamma_i = \ln \beta_i$. We have $\gamma_3 = -\ln 3$ and $\gamma_{i+1} = 2\gamma_i + 1$ for $i \ge 3$. Then we have $(\gamma_{i+1} + 1) = 2(\gamma_i + 1)$. From this equation, we further get $\gamma_i + 1 = 2^{i-3} \cdot (1 - \ln 3)$ for $i \ge 3$. Then, $\gamma_i = 2^{i-3} \cdot (1 - \ln 3) - 1$.
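In order to have $\beta_i^2 \ge \frac{2 \ln n}{n}$, we need $2\gamma_i \ge \ln\ln n - \ln\frac{n}{2}$. Then, we have

\[
\begin{aligned}
\Rightarrow\ & 2^{i-2}(1 - \ln 3) - 2 \ge \ln\ln n - \ln\frac{n}{2} \\
\Rightarrow\ & 2^{i-2} \le \frac{\ln\ln n - \ln n + \ln 2 + 2}{1 - \ln 3} = \frac{\ln n - \ln\ln n - \ln 2 - 2}{\ln 3 - 1} \\
\Rightarrow\ & i - 2 \le \log_2 \frac{\ln n - \ln\ln n - \ln 2 - 2}{\ln 3 - 1}.
\end{aligned}
\]

Therefore $i^* = O(\ln\ln n)$.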
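The recursion $\beta_3 = 1/3$, $\beta_{i+1} = e\beta_i^2$ can also be iterated directly to find the largest level $i^*$ with $\beta_{i^*}^2 \ge \frac{2\ln n}{n}$ for a concrete $n$. This is a small sketch for illustration; the function name `largest_valid_level` is hypothetical, and it returns `None` when even $\beta_3$ fails the condition (which happens for very small $n$).

```python
from math import e, log


def largest_valid_level(n):
    """Largest i >= 3 with beta_i**2 >= 2*ln(n)/n, or None if beta_3 already fails."""
    threshold = 2 * log(n) / n
    beta, i = 1 / 3, 3  # beta_3 = 1/3
    if beta ** 2 < threshold:
        return None
    # Advance while the next level beta_{i+1} = e * beta_i^2 still satisfies the condition.
    while (e * beta ** 2) ** 2 >= threshold:
        beta = e * beta ** 2
        i += 1
    return i


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, largest_valid_level(n))
```

Because $\beta_i$ is squared at every step, multiplying $n$ by a thousand moves $i^*$ by about one level, in line with $i^* = O(\ln\ln n)$.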

2.2.3 Proof of Claim 3.3
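Note the fact that

\[
\begin{aligned}
\Pr[B_{i^*+2} \ge 1] &\le \Pr[\neg E_{i^*+1}] + \Pr[B_{i^*+2} \ge 1 \wedge E_{i^*+1}] \\
&\le O\left(\frac{\ln\ln n}{n^2}\right) + n \cdot \beta_{i^*+1}^2 \\
&= O\left(\frac{\ln\ln n}{n^2}\right) + O\left(\frac{\ln^2 n}{n}\right) \\
&= O\left(\frac{\ln^2 n}{n}\right).
\end{aligned}
\]

Therefore, we have $\Pr[N_{i^*+2} \ge 1] \le \Pr[B_{i^*+2} \ge 1] = O\left(\frac{\ln^2 n}{n}\right)$.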