Variable Length Path Coupling

Thomas P. Hayes∗        Eric Vigoda∗

∗Department of Computer Science, University of Chicago, Chicago, IL 60637, {hayest,vigoda}@cs.uchicago.edu. Part of this work was done while the second author was visiting the Isaac Newton Institute for Mathematical Sciences, Cambridge, UK. This work was partially supported by NSF Grant CCR-0237834.
Abstract
We present a new technique for constructing and analyzing
couplings to bound the convergence rate of finite Markov
chains. Our main theorem is a generalization of the path
coupling theorem of Bubley and Dyer, allowing the defining
partial couplings to have length determined by a random
stopping time. Unlike the original path coupling theorem,
our version can produce multi-step (non-Markovian) couplings. Using our variable length path coupling theorem, we
improve the upper bound on the mixing time of the Glauber
dynamics for randomly sampling colorings.
1 Introduction
Overview Analysis of the convergence rate of finite
Markov chains has applications in a variety of fields,
including Theoretical Computer Science, Statistical
Physics, and Probability Theory. The Coupling Method
for proving upper bounds on convergence rate dates
back to the seminal work of Doeblin [6]. In computer
science, coupling has seen many recent applications in
the analysis of Markov Chain Monte Carlo algorithms
(e.g., [4, 15]).
An important new tool for simplifying and extending the coupling method is the Path Coupling approach
of Bubley and Dyer [3] (e.g., see [17]). Briefly, they
reduce the problem of proving an upper bound on coupling rate to proving a contraction condition for one
step of the evolution of a “partial coupling,” defined on
a much smaller subset of pairs of configurations. The
simple proof constructs a full coupling via iterated composition of partial couplings.
We present a natural generalization of this technique, which constructs a full coupling via iterated composition of partial couplings whose lengths may themselves be random variables. (See Section 3 for a precise
statement of our theorem.) The potential usefulness of
our result lies in providing a technique for simpler construction and analysis of multi-step (non-Markovian)
couplings, which are known to be more powerful than
one-step (Markovian) couplings [11, 14, 13]. Moreover,
it enables the partial couplings to be variable-length,
which may be more natural in some settings. There
are several previous works which use a multi-step coupling, see [5, 13], or analyze a single-step coupling over
many steps, see [8, 9]. Our main theorem is a general
technique which, in many cases, will simplify and improve applications of multi-step couplings and analyses
of single-step couplings over many steps.
We use our new technique to analyze a simple
Markov chain for randomly sampling k-colorings of a
graph on n vertices with maximum degree ∆. Roughly
speaking, we can prove the chain has mixing time
O(n log n) when k ≥ 1.953∆. This improves a result
of Dyer et al. [8] which holds when k ≥ (2 − ε)∆ where
ε = 8 × 10⁻⁷.
Variable Length Path Coupling Consider a finite
ergodic Markov chain with state space Ω, transition
matrix P and stationary distribution π. A coupling is a
joint stochastic process (Xt , Yt ) on Ω×Ω such that each
process viewed individually, in isolation from the other
chain, evolves according to P . However, the transitions
of the two processes can be highly correlated; this is
wherein the power of the method lies.
The probability of non-coalescence, i.e., of the two
chains failing to reach the same state from an arbitrary
pair of initial states, bounds the distance from stationarity. This is easy
to prove by choosing the initial state for one of the
chains from the stationary distribution. Typically, the
probability of coalescence is analyzed by defining a
metric ρ : Ω × Ω → {0, 1, 2, . . . , D}. For α < 1, defining
a one-step coupling (X_t, Y_t) → (X_{t+1}, Y_{t+1}) such that

(1.1)    E(ρ(X_{t+1}, Y_{t+1}) | X_t, Y_t) < α ρ(X_t, Y_t),

for all (X_t, Y_t) ∈ Ω × Ω where X_t ≠ Y_t, immediately
implies an arbitrary pair of states coalesces with probability at least 3/4 after ⌈ln(4D)/(1 − α)⌉ steps. This
is called a one-step (or Markovian) coupling since the
one-step coupling forms a Markov chain on Ω × Ω with
stationary distribution π × π.
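For orientation, the standard calculation behind this claim is as follows. Iterating (1.1) gives E(ρ(X_t, Y_t)) < α^t ρ(X_0, Y_0) ≤ α^t D, so by Markov's inequality

    Pr(X_t ≠ Y_t) = Pr(ρ(X_t, Y_t) ≥ 1) ≤ α^t D ≤ e^{−(1−α)t} D,

which is at most 1/4 once t ≥ ln(4D)/(1 − α).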
There always exists a coupling whose probability
of non-coalescence equals the distance from stationarity
(c.f. [11]). However, this coupling is often a multi-step
(or non-Markovian) coupling. Kumar and Ramesh [14]
give an interesting example where one-step couplings
are insufficient, and Hayes and Vigoda [13] recently
illustrated the power of multi-step couplings.
Defining and analyzing a coupling for all pairs
Xt , Yt ∈ Ω is often a difficult task. The path coupling
technique simplifies the approach by restricting attention to pairs in a subset S ⊆ Ω × Ω (assuming the graph
(Ω, S) is connected). It then suffices to define a one-step coupling such that (1.1) holds for all (X_t, Y_t) ∈ S.
Then, the path coupling theorem constructs, via simple
compositions, a one-step coupling satisfying (1.1) for all
Xt , Yt ∈ Ω.
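In outline, when ρ is the path metric generated by S (as in [3]), the extension follows from linearity of expectation: given X_t, Y_t ∈ Ω, fix a shortest path X_t = z_0, z_1, . . . , z_r = Y_t with each (z_i, z_{i+1}) ∈ S, couple each adjacent pair by the partial coupling, and compose. Writing z′_i for the coupled successor of z_i,

    E(ρ(X_{t+1}, Y_{t+1})) ≤ Σ_i E(ρ(z′_i, z′_{i+1})) ≤ α Σ_i ρ(z_i, z_{i+1}) = α ρ(X_t, Y_t).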
In many settings, it is natural to expect improvements by considering multiple transitions simultaneously. For a fixed length ℓ, the path coupling theorem
still applies to the Markov chain defined by P^ℓ (i.e., the
ℓ-step evolution of the original chain).
However, we would often like to consider variable
length ℓ. More precisely, ℓ may be a random stopping
time (cf. [8]) which depends on the evolution and the
initial pair (Xt , Yt ). Such an analysis was used by Dyer
et al. [8] to prove improved convergence rates of a
Markov chain for generating a random k-coloring.
Our main theorem generalizes the path coupling
technique by allowing partial couplings whose length
is a random stopping time. Thus, our main result is
a variable length path coupling theorem. This work can
be viewed as an improvement (and simplification) of a
result of Dyer et al. [8, Theorem 2.2].
The proof of our theorem relies on a novel method
for composing variable length couplings. Our composition technique can produce multi-step couplings, unlike
the methods of Dyer et al. [8], and Bubley and Dyer
[3].
For proving upper bounds on the coalescence time,
one can always avoid the use of a variable length
coupling by analyzing a sufficiently large fixed length
coupling. However, our approach is often simpler and
more natural. This is illustrated in the proof of the
following result on randomly sampling graph colorings.
Applications to Graph Colorings We apply our variable length path coupling theorem to improve convergence results of a simple Markov chain, known as the Glauber dynamics, for randomly sampling k-colorings of a graph with n vertices and maximum degree ∆. Transitions of the Glauber dynamics randomly recolor a randomly chosen vertex at each step (see Section 5 for a precise definition). The stationary distribution of the chain is uniformly distributed over (proper) k-colorings of the input graph whenever k > ∆ + 1. We are interested in upper bounding the mixing time, which is the number of transitions until the chain is close to stationarity (see Section 2 for a formal definition).
The first significant result was by Jerrum [15], who
proved the mixing time is O(n log n) whenever k > 2∆.
Vigoda [17] later proved the mixing time is O(n2 )
whenever k > 11∆/6, via analysis of a different Markov
chain. Dyer et al. [8] subsequently proved O(n log n)
mixing time of the Glauber dynamics for k ≥ (2 − ε)∆
where ε = 8 × 10⁻⁷, assuming the input graph is ∆-regular with girth g ≥ 4 and ∆ ≥ 14. Their result
relied on a variable length path coupling theorem which
is significantly weaker than our theorem.
Using our new technique, we considerably improve
the result of Dyer et al. For any graph with girth g ≥ 5
and ∆ ≥ ∆0 , where ∆0 is a sufficiently large constant,
we prove the mixing time of the Glauber dynamics is
O(n log n) whenever k ≥ 1.953∆.
There are significantly stronger results known when
∆ = Ω(log n). Using a multi-step coupling, Hayes and
Vigoda [13] recently proved the following result. For all
ε > 0, all graphs with girth g ≥ 11 and ∆ = Ω(log n), all
k ≥ (1 + ε)∆, the mixing time of the Glauber dynamics
is O(n log n). That result builds upon earlier work of
Dyer and Frieze [7], Molloy [16], and Hayes [12].
2 Preliminaries
Throughout the text, for a finite ergodic Markov chain,
we use Ω to denote the set of states, P to denote the
transition matrix, and π the stationary distribution.
The total variation distance for a pair of distributions µ and ν on Ω is defined as

    d_TV(µ, ν) = (1/2) Σ_{x∈Ω} |µ(x) − ν(x)|.

Our interest is the mixing time τ_mix of the chain:

    τ_mix = max_{x∈Ω} min{t : d_TV(P^t(x, ·), π) ≤ 1/4}.
We use the coupling method to bound the mixing
time. A t-step coupling is defined as follows. For every
(x_0, y_0) ∈ Ω², let (X, Y) = (X^{(x_0,y_0)}, Y^{(x_0,y_0)}) be a
random variable taking values in Ω^t × Ω^t. We say (X, Y)
is a valid t-step coupling if for all (x_0, y_0) ∈ Ω², the
distribution of X_t is P^t(x_0, ·) and the distribution of Y_t
is P^t(y_0, ·). In other words, viewed individually the t-step
evolutions of the chains are faithful copies of the original Markov chain.
A valid coupling gives the following bound, known as the Coupling Inequality, on the convergence rate [6] (or e.g. [1]). For all x_0 ∈ Ω,

    d_TV(P^t(x_0, ·), π) ≤ max_{y_0∈Ω} Pr(X_t ≠ Y_t).

Therefore, by defining a valid t-step coupling where all initial pairs have coalesced (i.e., are at the same state) with probability at least 3/4, we have proved the mixing time is at most t.
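To make the Coupling Inequality concrete, the following self-contained Python sketch (a toy example, not taken from the paper; all names are ours) couples two lazy random walks on an n-cycle so that, before meeting, exactly one walker moves at each step, and estimates max_{y_0} Pr(X_t ≠ Y_t) by simulation; by the Coupling Inequality this estimate upper-bounds d_TV(P^t(x_0, ·), π).

```python
import random

def lazy_cycle_step(x, n, rng):
    """One step of the lazy random walk on the n-cycle: hold with prob. 1/2,
    otherwise move to a uniformly random neighbor."""
    if rng.random() < 0.5:
        return x
    return (x + rng.choice((-1, 1))) % n

def coupled_step(x, y, n, rng):
    """A simple coupling: after the chains meet they move together; before
    that, exactly one of the two chains moves at each step, so each chain,
    viewed on its own, still performs the lazy walk."""
    if x == y:
        z = lazy_cycle_step(x, n, rng)
        return z, z
    d = rng.choice((-1, 1))
    if rng.random() < 0.5:
        return (x + d) % n, y
    return x, (y + d) % n

def run_pair(x, y, n, t, rng):
    for _ in range(t):
        x, y = coupled_step(x, y, n, rng)
    return x != y

def estimate_noncoalescence(n=20, t=300, trials=500, seed=1):
    """Estimate the maximum over starting pairs (0, y0) of Pr(X_t != Y_t)."""
    rng = random.Random(seed)
    worst = 0.0
    for y0 in range(n):
        bad = sum(1 for _ in range(trials) if run_pair(0, y0, n, t, rng))
        worst = max(worst, bad / trials)
    return worst

if __name__ == "__main__":
    print(estimate_noncoalescence())
```

The printed value decreases as t grows, illustrating how a bound on the probability of non-coalescence translates into a mixing-time bound.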
3 Variable-length Partial Coupling
Statement of Results We call S ⊆ Ω² a path-generating set if the graph (Ω, S) is connected. We assume S has an associated function d : S → N⁺. We extend this to a metric on Ω² by setting d(x, y) as the weight of the shortest path between x and y in the weighted graph (Ω, S). Also, let d_max denote the maximum over all (x, y) ∈ Ω² of d(x, y).

Definition 1. Let S be a path-generating set. For every (x_0, y_0) ∈ S, let (X, Y, T) = (X^{(x_0,y_0)}, Y^{(x_0,y_0)}, T_{x_0,y_0}) be a random variable taking values in Ω* × Ω* × N. When the subscript (x_0, y_0) is clear from context, we will omit it. We say (X, Y, T) is a variable length partial coupling when the following hold.

• Length preservation. |X| = |Y| = T holds with probability one, for every (x_0, y_0) ∈ S.

• Faithful copies. For all t ≥ 0, for all (x_0, y_0) ∈ S define a new random variable X̄_t = X̄_t^{x_0,y_0} ∈ Ω by the following experiment. Sample (X, Y) = ((X_1, . . . , X_T), (Y_1, . . . , Y_T)). If T ≥ t, then "truncate" by setting X̄_t = X_t. Otherwise, "extend" by choosing X̄_t from the distribution P^{t−T}(X_T, ·). We say X is a faithful copy if the distribution of X̄_t is P^t(x_0, ·). Define "Y is a faithful copy" analogously. We require that, for every (x_0, y_0) ∈ S, X and Y, considered separately, are each faithful copies.

Remark 2. A. Since our proofs will only examine the distance from stationarity at one time, we do not require that the sequence (X_0, X_1, . . . , X_t) be a Markov chain, only that the distribution of X_t be correct for every t. We note however, that when a partial coupling does possess this additional structure, then so will the full coupling which we construct in Section 4.1.

B. In the above definition, the random variable T is a function of the evolution of the chains up to time T, and thus is a stopping time. Therefore, we are defining couplings with respect to a stopping time; most previous works only consider couplings of fixed length.

C. The definition allows the partial couplings to be multi-step. Specifically, we can allow T to be a function of the initial states (x_0, y_0) and the first T steps of only one of the chains (X_1, . . . , X_T). The evolution of the other chain (Y_1, . . . , Y_T) is decided upon reaching the stopping time. In particular, the stopping time is a set of conditions on the evolution of X_t. As soon as one of these conditions is met, we stop. The specific stopping condition which is satisfied determines the precise coupled sequence for Y. In this way, we have no knowledge of Y_1 until observing the evolution of X up to the stopping time T.

We can now state our main result.

Theorem 3. For a variable length partial coupling (X, Y, T), let

    α := max_{(x_0,y_0)∈S} E(d(X_T, Y_T))   and   M := max_{(x_0,y_0)∈S} T,

where M is infinite if the stopping time is unbounded. If α < 1, then the mixing time satisfies

    τ_mix ≤ M ⌈ln(4 d_max)/(1 − α)⌉.

The path coupling theorem of Bubley and Dyer [3] corresponds to the special case when T is always 1, i.e., M = 1.

Even when there is no good natural upper bound on the stopping time, we can always define a truncated version of the given partial coupling, which leads to the following corollary (proved in Section 4.3).

Corollary 4. With the notation in Theorem 3, let W denote the maximum of d(X_t, Y_t) over all (x_0, y_0) ∈ S and t ≤ T. If α < 1, then the mixing time satisfies

    τ_mix ≤ 2M′ ⌈ln(4 d_max)/(1 − α)⌉,

where M′ satisfies Pr(T > M′) ≤ (1 − α)/2W. In particular, if β = E(T) is finite, then

    τ_mix ≤ 2 ⌈2βW/(1 − α)⌉ ⌈ln(4 d_max)/(1 − α)⌉.

4 Proof of Theorem 3

Our proof is via the coupling method. Let

    N := ⌈ln(4 d_max)/(1 − α)⌉.

Observe

(4.2)    N > log_α(1/(4 d_max)).

In Section 4.1, we construct an MN-step coupling, denoted g^N, and prove that it is valid. Then in Section 4.2 we prove that g^N coalesces with high probability. The Coupling Inequality establishes the MN upper bound on mixing time.
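To see why (4.2) holds, note that 1 − α < ln(1/α) for all α ∈ (0, 1); hence N ≥ ln(4 d_max)/(1 − α) > ln(4 d_max)/ln(1/α) = log_α(1/(4 d_max)), and consequently α^N < (4 d_max)^{−1}, which is the form used in Section 4.2.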
4.1 Construction of full coupling Denote the variable-length coupling from the hypothesis of the theorem by µ_{x_0,y_0}, (x_0, y_0) ∈ S. Using this variable-length coupling, we will construct a sequence of fixed-length couplings. For all (x_0, y_0) ∈ Ω², all 0 ≤ i ≤ N, we define an (iM)-step coupling denoted by ν^{iM}_{x_0,y_0}.

We will convert variable-length couplings into fixed-length couplings by adding a trivial coupling at the end. From (X_t, Y_t) ∈ Ω², the trivial one-step coupling (X_t, Y_t) → (X_{t+1}, Y_{t+1}) is defined as follows. Evolve X_t → X_{t+1} according to the Markov chain of interest. If X_t = Y_t, set Y_{t+1} = X_{t+1}; otherwise independently evolve Y_t → Y_{t+1}. We call t steps of this process the t-step trivial coupling, denoted, for initial states (x_0, y_0) ∈ Ω², as ω^t_{x_0,y_0}.

By converting the variable-length couplings into fixed-length couplings, we can compose the couplings along a path and define a coupling for all (x_0, y_0) ∈ Ω². We construct our sequence of fixed-length couplings in an inductive manner which keeps the trivial coupling at the end. The goal is to coalesce (with sufficiently large probability) before we use the trivial coupling.

The coupling ν^0_{x_0,y_0} is a 0-step coupling, and thus trivial. For i > 0, we construct ν^{iM}_{x_0,y_0} inductively, using ν^{(i−1)M}, µ, and ω. Suppose ν^{(i−1)M}_{x_0,y_0} has been defined for all pairs (x_0, y_0) ∈ Ω².

For (x_0, y_0) ∈ S, we form ν^{iM}_{x_0,y_0} by running the variable-length coupling µ_{x_0,y_0} until the stopping time T (recall T ≤ M). From the resulting states (X_T, Y_T), we then run the coupling ν^{(i−1)M}_{X_T,Y_T} for a further (i − 1)M steps. Finally, from (X_ℓ, Y_ℓ), where ℓ = T + (i − 1)M, we evolve the chains using the trivial coupling ω^t_{X_ℓ,Y_ℓ} for the final t = M − T steps. This defines the coupling ν^{iM}_{x_0,y_0} for all initial pairs (x_0, y_0) ∈ S.

We extend the coupling from pairs in S to all pairs in Ω² by compositions. More precisely, for (x_0, y_0) ∈ Ω² \ S, the coupling ν^{iM}_{x_0,y_0} is constructed by composing the couplings ν^{iM} along a shortest (x_0, y_0) path in the graph (Ω, S). Since the couplings on the path are of identical lengths such a composition is standard (e.g., see [3, 2]). This completes the definition of the couplings ν^{iM}.

We prove by induction on i that for all 0 < i ≤ N, ν^{iM} is a valid coupling. For ν^0, this is trivial. Fix 0 < i ≤ N. Assume by induction that ν^{(i−1)M} is a valid coupling from any pair of states in Ω². Now, fix any (x_0, y_0) ∈ S. It will suffice to show that ν^{iM} is a valid coupling for this initial pair, since the composition of couplings is a well-defined coupling for the same Markov chain (e.g., see [3, 2]).

Focus attention on the random sequence (X_1, . . . , X_t) where t = iM. The sequence is generated via the variable-length coupling up to the stopping time and then is being extended via the original Markov chain (since ν^{(i−1)M} is valid and so is the trivial coupling). Therefore, by the definition of the validity of the variable-length coupling, the distribution of X_t is identical to the distribution of our Markov chain. The same argument holds for (Y_1, . . . , Y_t) as well. This proves that ν^{iM} is a valid coupling for (x_0, y_0), which completes the proof.

4.2 Analysis of full coupling It remains to bound the probability of coalescence by time t = NM. For (x_0, y_0) ∈ Ω², let α_i(x_0, y_0) denote the probability that ν^{iM}_{x_0,y_0} does not coalesce. Also, let

    α_i(S) = max_{(x_0,y_0)∈S} α_i(x_0, y_0).

Note, the maximum is over initial pairs whose partial coupling is defined. For any (x_0, y_0) ∈ Ω², we have

(4.3)    α_i(x_0, y_0) ≤ d(x_0, y_0) α_i(S),

by a union bound. Our goal is to bound α_N(S).

We inductively bound α_{i+1}(S) by conditioning on the distance of (X_T, Y_T). Observe α_0(S) = 1. For (x_0, y_0) ∈ S,

    α_{i+1}(x_0, y_0) = Σ_{X_T,Y_T} α_i(X_T, Y_T) Pr(X_T, Y_T | x_0, y_0),

by the definition of ν^{iM},

    ≤ Σ_{X_T,Y_T} d(X_T, Y_T) α_i(S) Pr(X_T, Y_T | x_0, y_0),

by inequality (4.3),

    = α_i(S) E(d(X_T, Y_T) | x_0, y_0)
    ≤ α_i(S) α,

by the definition of α.

By induction, α_N(S) ≤ α^N. Hence by (4.2),

    α_N(S) ≤ α^N < (4 d_max)^{−1}.

For a pair (x_0, y_0) ∈ Ω² and for t = NM,

    Pr(X_t ≠ Y_t) ≤ d_max α_N(S) ≤ 1/4.

Thus the coupling time is at most t, and by the Coupling Inequality, this bound also applies to the mixing time.
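Purely as a reading aid for the construction in Section 4.1, here is a small Python sketch of ν^{iM} for a pair on which the partial coupling is defined. It makes one simplifying assumption that is ours, not the paper's: the partial coupling is assumed to end at a pair that is either coalesced or again lies in S, so that the recursive call is well defined without composing couplings along paths in (Ω, S).

```python
def nu(i, M, x0, y0, mu, step, rng):
    """Sketch of the (i*M)-step coupling nu^{iM} from Section 4.1.
    mu(x0, y0, rng) -> coupled trajectories (xs, ys), each of length T <= M,
                       not including the starting states (the partial coupling).
    step(x, rng)    -> one step of the underlying Markov chain.
    Returns coupled trajectories (X, Y), each of length i*M + 1."""
    if i == 0:
        return [x0], [y0]
    if x0 == y0:
        # Already coalesced: run a single chain and mirror it.
        X = [x0]
        for _ in range(i * M):
            X.append(step(X[-1], rng))
        return X, list(X)
    xs, ys = mu(x0, y0, rng)                   # run mu up to its stopping time T
    X, Y = [x0] + xs, [y0] + ys
    tail_x, tail_y = nu(i - 1, M, X[-1], Y[-1], mu, step, rng)  # recurse from (X_T, Y_T)
    X, Y = X + tail_x[1:], Y + tail_y[1:]
    while len(X) < i * M + 1:                  # trivial coupling pads the final M - T steps
        x_next = step(X[-1], rng)
        y_next = x_next if X[-1] == Y[-1] else step(Y[-1], rng)
        X.append(x_next)
        Y.append(y_next)
    return X, Y
```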
4.3 Proof of Corollary 4 Suppose we are given a partial coupling (X, Y, T), and an integer M′. Define a "truncated" partial coupling (X′, Y′, T′) by

    (X′, Y′, T′) = (X, Y, T)                                            when T ≤ M′,
    (X′, Y′, T′) = ((X_1, . . . , X_{M′}), (Y_1, . . . , Y_{M′}), M′)    otherwise.

Clearly, (X′, Y′, T′) inherits the property of being a partial coupling from (X, Y, T).

Recall that by definition,

    α := max_{(x_0,y_0)∈S} E(d(X_T, Y_T)),

and

    W := max d(X_T, Y_T).

Hence the corresponding quantities α′, W′ for (X′, Y′, T′) satisfy W′ ≤ W and

    α′ := max_{(x_0,y_0)∈S} E(d(X′_{T′}, Y′_{T′})) ≤ α + W Pr(T > M′).

Assuming Pr(T > M′) ≤ (1 − α)/2W, we have α′ ≤ (1 + α)/2. Applying Theorem 3 for the partial coupling (X′, Y′, T′) proves the first half of Corollary 4.

The second half of the corollary is now easy. For β = E(T), define

    M′ := ⌈2βW/(1 − α)⌉.

By Markov's inequality,

    Pr(T > M′) ≤ (1 − α)/2W.

The result follows by the first half of the corollary.

5 Sampling Colorings

For a graph G = (V, E) with maximum degree ∆, let Ω be the set of all assignments σ : V → K where K = [k] = {1, . . . , k}. The Glauber dynamics (heat-bath version) has state space Ω and transitions defined as follows. From X_t ∈ Ω,

• Choose vertex z uniformly at random from V.
• Choose c uniformly at random from [k] \ X_t(N(z)), i.e., c is a random color not appearing in the neighborhood of z.
• Set X_{t+1}(w) = X_t(w) for all w ≠ z.
• Set X_{t+1}(z) = c.

It is straightforward to verify the stationary distribution is uniform over the proper k-colorings whenever k > ∆ + 1 (e.g., see Jerrum [15]).

In practice, we can consider the chain defined only on proper k-colorings. The extension to assignments is for technical reasons, and its mixing time upper bounds the mixing time of the chain defined only on proper colorings. In the remainder of the section, we use the term colorings to refer to assignments in Ω.

We can now state our theorem on the Glauber dynamics.

Theorem 5. There exists ∆_0 such that, for every graph G = (V, E) on n vertices having maximum degree ∆ ≥ ∆_0 and girth g ≥ 5, and for every k ≥ 1.953∆, the Glauber dynamics on k-colorings of G has mixing time O(n log n).

We begin with a few definitions. Let S denote pairs X, Y ∈ Ω which differ at exactly one vertex. For X_t, Y_t ∈ Ω, denote their Hamming distance by

    H(X_t, Y_t) = |{z : X_t(z) ≠ Y_t(z)}|.

Let A_{X_t}(z) = k − |X_t(N(z))| denote the number of available colors for vertex z in coloring X_t.

We use the maximal one-step coupling, originally used by Jerrum [15]. More precisely, consider a pair of colorings (X_0, Y_0) ∈ S. Let v be the single vertex of disagreement (X_0(v) ≠ Y_0(v)).

At every step, both chains update the same vertex z. The coupling at time t is as follows. If z is not a neighbor of v, then both chains choose the same new color X_{t+1}(z) = Y_{t+1}(z).

Suppose z is a neighbor of v. Without loss of generality, assume A_{X_t}(z) ≥ A_{Y_t}(z). Thus, if z ∈ N(v) and X_0(v) ∉ Y_t(N(z)), then Y_0(v) ∉ X_t(N(z)). First, choose the random new color X_{t+1}(z) = c_X ∈ [k] \ X_t(N(z)). Choose a color c′ uniformly at random from [k] \ Y_t(N(z)). The new color Y_{t+1}(z) = c_Y is defined as follows,

    c_Y = X_0(v)   if z ∈ N(v), c_X = Y_0(v), X_0(v) ∉ Y_t(N(z)),
    c_Y = c′       if z ∈ N(v), c_X = Y_0(v), X_0(v) ∈ Y_t(N(z)),
    c_Y = c_X      otherwise.
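The following Python sketch implements one Glauber update and one step of the coupling just described, for a pair of colorings stored as lists indexed by vertex and differing only at v. The reading of the case analysis (in particular, that the conditions are on the color c_X drawn for the first chain) is ours, and the code assumes k > ∆ so that every vertex always has an available color.

```python
import random

def glauber_step(X, adj, k, rng):
    """One heat-bath update of an assignment X : V -> {0, ..., k-1};
    adj[v] lists the neighbors of v."""
    z = rng.randrange(len(adj))
    blocked = {X[u] for u in adj[z]}
    X[z] = rng.choice([c for c in range(k) if c not in blocked])
    return z

def coupled_step(X, Y, v, adj, k, rng):
    """One step of the maximal one-step coupling for colorings X, Y that differ
    only at vertex v. Both chains update the same random vertex z; X and Y are
    modified in place. Returns z."""
    z = rng.randrange(len(adj))
    if v not in adj[z]:
        # X and Y agree on N(z), so the same heat-bath choice is valid in both
        # chains (this includes z == v, in which case the chains coalesce).
        blocked = {X[u] for u in adj[z]}
        c = rng.choice([col for col in range(k) if col not in blocked])
        X[z] = Y[z] = c
        return z
    # z is a neighbor of v: swap roles if necessary so that "X" is the chain
    # with at least as many available colors at z, as in the text.
    AX = [c for c in range(k) if c not in {X[u] for u in adj[z]}]
    AY = [c for c in range(k) if c not in {Y[u] for u in adj[z]}]
    if len(AX) < len(AY):
        X, Y, AX, AY = Y, X, AY, AX
    cX = rng.choice(AX)
    c_prime = rng.choice(AY)
    if cX == Y[v] and X[v] in AY:
        cY = X[v]
    elif cX == Y[v]:
        cY = c_prime
    else:
        cY = cX
    X[z], Y[z] = cX, cY
    return z
```

Iterating coupled_step while checking the stopping conditions below (v selected, or a neighbor of v selected with c_X = Y_0(v) or c_Y = X_0(v)) is what one would do to simulate the variable-length partial coupling used in the proof.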
We define T to be the first time the Hamming distance changes; i.e., the least t > 0 at which either of the following occurs:

• Vertex v is selected for recoloring.
• A neighbor z of v is selected for recoloring and (c_X = Y_0(v) or c_Y = X_0(v)).

Clearly T is a stopping time, so Corollary 4 applies. A simpler version of our stopping time was analyzed by Dyer et al. [8].

It remains to prove E(H(X_T, Y_T)) < 1 − δ for some fixed δ > 0.

Our proof will rely on a high-probability lower bound for the number of available colors at a vertex w after the Glauber dynamics has run for a certain amount of time, an idea first exploited by Dyer and Frieze [7]. We will require the following time-dependent version, whose proof is a straightforward extension of methods in [12].
Lemma 6. For every δ > 0, for all sufficiently large ∆ ≥ ∆_0(δ), for every graph G = (V, E) having girth ≥ 5 and maximum degree ∆, for k > (1 + δ)∆, for every t > 0, w ∈ V,

    Pr(A < (1 − δ) k′ exp(−∆′/k′)) ≤ exp(−δ²∆/100),

where

    A = A_{X_t}(w) = k − |X_t(N(w))|,
    ∆′ = ∆ − exp(−t/n)∆,
    k′ = k − exp(−t/n)∆.

Lemma 6 is a slight generalization of an earlier result of Hayes [12, Lemmas 11, 12], which assumed t = Ω(n) and ∆ = Ω(log n). Although the proof is essentially the same, we include it in Section 5.1 for completeness. We now continue with the proof of Theorem 5.

Definition 7. With the notation of Lemma 6, we define the "bad" event,

    B := {(∃t ≤ T, w ∈ N(v)) A < (1 − δ) k′ exp(−∆′/k′)}.

Denote the complementary "good" event by B̄.

Corollary 8. For every δ > 0, for all sufficiently large ∆ ≥ ∆_1(δ),

    Pr(B) ≤ exp(−δ²∆/200).

Proof. Since B can only occur when the updated vertex is within distance 2 of v, we can restrict our attention to such times. With probability ≥ 1 − δ, T ≤ n/δ and at most O(∆²/δ) such times occur before T. By Lemma 6, the probability that the bad event occurs at a particular time t and vertex w is at most exp(−δ²∆/100), assuming ∆ is sufficiently large with respect to δ. Taking a union bound over the times of interest and over w ∈ N(v) gives the desired result.

Proof of Theorem 5. Recall that the stopping time T occurs when either v is recolored, which always reduces the Hamming distance to zero, or when w ∈ N(v) is selected, and either Y_{T−1}(v) is chosen for X_T(w) or X_{T−1}(v) is chosen for Y_T(w), which increases the Hamming distance to two. It follows that

    E(H(X_T, Y_T)) = 2 Pr(H(X_T, Y_T) = 2).

Decomposing into the "good" and "bad" cases, we have

    Pr(H(X_T, Y_T) = 2) ≤ Pr(B) + Pr(H(X_T, Y_T) = 2, B̄).

By Corollary 8, Pr(B) ≤ exp(−δ²∆/200).
We can expand

    Pr(H(X_T, Y_T) = 2, B̄) = Σ_{t=0}^{∞} Pr(H(X_t, Y_t) = 2 | T ≥ t, B̄) Pr(T ≥ t, B̄).

Observe that

    Pr(H(X_t, Y_t) = 0 | X_{t−1}, Y_{t−1}) = 1/n.

Since Jerrum's coupling pairs the color choices Y_{T−1}(v) and X_{T−1}(v) as often as possible on N(v), we have

    Pr(H(X_t, Y_t) = 2 | X_{t−1}, Y_{t−1}) = Σ_{w∈N(v)} 1/(n A(w)),

where

    A(w) := k − max{|X_{t−1}(N(w))|, |Y_{t−1}(N(w))|}.

Noting that T ≥ t if and only if H(X_s, Y_s) = 1 for all s < t, it follows that

    Pr(T ≥ t, B̄) = Pr(B̄) Π_{s=1}^{t} Pr(H(X_s, Y_s) = 1 | T ≥ s, B̄)
                 ≤ Π_{s=1}^{t} (1 − 1/n + g(s)/((1 − δ)n)),

where

    g(s) = ∆ / (k′(s) exp(−∆′(s)/k′(s))).

Putting this all together, and applying the definition of B̄ we have

    Pr(H(X_T, Y_T) = 2, B̄) ≤ Σ_{t=0}^{∞} (g(t)/((1 − δ)n)) Π_{s=1}^{t} (1 − 1/n + g(s)/((1 − δ)n)).

Taking limits as δ tends to 0 and ∆ tends to infinity, the right-hand side tends to

    ∫_0^∞ f(x) exp(−∫_0^x (1 + f(y)) dy) dx,

where

    f(x) = (1/(γ − exp(−x))) exp((1 − exp(−x))/(γ − exp(−x))),

and γ = k/∆. Using Maple, we find that this integral is strictly less than 1/2 when γ > 1.953. Hence for any γ ≥ 1.953, there exist suitably small values of δ such that for all sufficiently large values of ∆ and k > γ∆, we have shown E(H(X_T, Y_T)) < 1 − δ. Applying Corollary 4 to this partial coupling completes the proof. Note that in the notation of Corollary 4 we have α = 1 − δ, β ≤ n, W = 2, and d_max = n.
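The threshold computation above can also be checked numerically without Maple; the short script below (ours, for verification only) evaluates the limiting integral for a few values of γ. The computation above reports a value strictly below 1/2 whenever γ > 1.953.

```python
import numpy as np

def coupling_integral(gamma, x_max=60.0, num=200001):
    """Evaluate  I(gamma) = int_0^inf f(x) exp(-int_0^x (1 + f(y)) dy) dx
    with  f(x) = exp((1 - e^{-x}) / (gamma - e^{-x})) / (gamma - e^{-x}),
    truncated at x_max (the integrand decays at least like e^{-x})."""
    x = np.linspace(0.0, x_max, num)
    f = np.exp((1.0 - np.exp(-x)) / (gamma - np.exp(-x))) / (gamma - np.exp(-x))
    g = 1.0 + f
    # inner(x) = int_0^x (1 + f(y)) dy, by the cumulative trapezoid rule
    inner = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) * np.diff(x) / 2.0)))
    integrand = f * np.exp(-inner)
    return float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(x) / 2.0))

if __name__ == "__main__":
    for gamma in (1.90, 1.953, 2.00):
        print(f"gamma = {gamma:5.3f}   integral ~= {coupling_integral(gamma):.4f}")
```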
5.1 Proof of Lemma 6 Following [12], we define the following experiment.

Definition 9. Let F denote the experiment which reveals the entire sequence of selected vertices, as well as all selected colors except those on neighbors of w. Let c = c(δ) be a constant to be specified later. Let

    𝒯 = [max{1, t − c(n − 1)}, t].

Let M denote the set of neighbors of w that are selected for recoloring at least once during 𝒯. For every x ∈ M, let r(x) denote the size of the range of X_{t−1}(x), conditioned on the outcome of F. In other words, r(x) is the number of colors missed by the sets X_{t′}(N(x)), t′ ∈ [t_0, t], where t_0 is the time of the last recoloring of x.

We will use the following lemma of Hayes [12, Lemma 21], which we have restated for our application.

Lemma 10. Using the notation of Lemma 6,

    Pr(A < L(1 − p)^{|M|/Lp} − δ∆ | F) < exp(R∆/(k − ∆) − δ²∆/2),

where

    L := k − (∆ − |M|),
    p := max_{z∈M} (1/r(z)) exp(I(z)/(k − ∆)),
    I(z) := # recolorings of vertices of N(z) in 𝒯,
    R := # recolorings of w in 𝒯.

For almost all outcomes of F, we will show that, conditioned on that outcome, the variables X_t(z), z ∈ N(w) become nearly independent, and for most z ∈ N(w), the range of X_t(z) remains large and its distribution approximately uniform. Then we will apply Lemma 10.

Proof of Lemma 6. Let c = ln 1/δ.
Since the events "vertex z is not selected at any time in 𝒯" are negatively associated, and the probability of each is (1 − 1/n)^t < exp(−min{t/n, c}), Chernoff's bound applies (cf. [10, 12]), implying

    Pr(∆ − |M| > (exp(−min{t/n, c}) + δ)∆) < exp(−δ²∆/2).

Since there are always at least k − ∆ colors available for any vertex being recolored, it follows that the probability any particular color is missed by all neighbors of x during the time interval [t_0 + 1, t] is at least

    (1 − ∆/((n − 1)(k − ∆)))^{t−t_0} ≈ exp(−c∆/(k − ∆)).

Since these events are negatively associated, Chernoff's bound implies

    Pr(r(x) < exp(−c∆/(k − ∆))(k − ∆) − δ∆) < exp(−δ²∆/2).

An easy counting argument shows, for every ℓ,

    Pr(R ≥ ℓ) ≤ (cn choose ℓ) n^{−ℓ} ≤ (ec/ℓ)^ℓ.

For ℓ = δ²(k − ∆)/4 and ∆ sufficiently large as a function of δ, this probability is < exp(−δ²∆).

Similarly, for every x ∈ N(w),

    Pr(I(x) ≥ ℓ) ≤ (cn choose ℓ)(∆/n)^ℓ ≤ (ec∆/ℓ)^ℓ.

For ℓ = e²c∆ and ∆ sufficiently large, this probability is < exp(−c∆).

Taking a union bound over all x ∈ N(w), we find that with small probability of error ≤ 4∆ exp(−δ²∆/2), the outcome of F satisfies

    |M| ≥ (1 − exp(−t/n) − 2δ)∆,
    (∀x ∈ N(w)) 1/r(x) ≤ exp(1/δ)/∆,
    (∀x ∈ N(w)) I(x) ≤ e²c∆,
    p ≤ exp(1/δ²)/∆,
    R ≤ δ²(k − ∆).

Plugging these into Lemma 10 and simplifying gives the claimed result.
References
[1] D. J. Aldous. Random walks on finite groups and
rapidly mixing Markov chains. In Séminaire de Probabilités XVII, 243–297. Springer-Verlag, 1983. Lecture
Notes in Mathematics 986.
[2] J. van den Berg and R. Brouwer. Random sampling
for the monomer-dimer model on a lattice. J. Math.
Phys. 41(3):1585–1597, 2000.
[3] R. Bubley and M. Dyer. Path Coupling: A Technique
for Proving Rapid Mixing in Markov Chains. In
38th Annual Symposium on Foundations of Computer
Science, 223–231, 1997.
[4] R. Bubley, M. E. Dyer, and M. R. Jerrum. An
elementary analysis of a procedure for sampling points
in a convex body. Random Structures & Algorithms,
12(3):213–235, 1998.
[5] A. Czumaj and M. Kutylowski, Delayed path coupling
and generating random permutations. Random Structures & Algorithms, 17(3-4):238–259, 2000.
[6] W. Doeblin. Exposé de la théorie des chaînes simples
constantes de Markov à un nombre fini d’états. Rev.
Math. Union Interbalkanique, 2:77–105, 1938.
[7] M. Dyer and A. Frieze. Randomly colouring graphs
with lower bounds on girth and maximum degree.
In Proceedings of the 42nd Annual Symposium on
Foundations of Computer Science, 579-587, 2001.
[8] M. Dyer, L. Goldberg, C. Greenhill, M. Jerrum, and
M. Mitzenmacher. An extension of path coupling
and its application to the Glauber dynamics for graph
colorings. SIAM J. Comput. 30(6):1962–1975, 2001.
[9] M. Dyer, C. Greenhill, and M. Molloy. Very rapid
mixing of the Glauber dynamics for proper colorings
on bounded-degree graphs. Random Structures & Algorithms, 20(1):98–114, 2002.
[10] D. Dubhashi and D. Ranjan. Balls and bins: a
study in negative dependence. Random Structures and
Algorithms, 13(2):99–124, 1998.
[11] D. Griffeath, A maximal coupling for Markov chains, Z.
Wahrscheinlichkeitstheorie und Verw. Gebiete, 31:95–
106, 1974/75.
[12] T. P. Hayes. Randomly coloring graphs with girth at
least five. In Proceedings of the 35th Annual ACM
Symposium on Theory of Computing, 2003.
[13] T. P. Hayes and E. Vigoda. A non-Markovian coupling for randomly sampling colorings. To appear in
44th Annual Symposium on Foundations of Computer
Science, 2003.
[14] V. S. Anil Kumar and H. Ramesh. Markovian coupling
vs. conductance for the Jerrum-Sinclair chain. In
40th Annual Symposium on Foundations of Computer
Science, 241–252, 1999.
[15] M. R. Jerrum. A very simple algorithm for estimating
the number of k-colorings of a low-degree graph. Random Structures and Algorithms, 7(2):157–165, 1995.
[16] M. Molloy. The Glauber dynamics on colorings of
a graph with high girth and maximum degree. In
Proceedings of the 34th Annual ACM Symposium on
Theory of Computing, 91–98, 2002.
[17] E. Vigoda. Improved bounds for sampling colorings. In
40th Annual Symposium on Foundations of Computer
Science, 51-59, 1999.