x - MSRI

Counting and Sampling in Lattices:
The Computer Science Perspective
Dana Randall
Advance Professor of Computing
Georgia Institute of Technology
Counting and Random Sampling
E.g., Perfect Matchings
Question: In polynomial time, can we:
                              In general:
• decide if there is one?         Y
• find one?                       Y
• count them?                     N
• sample one at random?           ?/Y
What about matchings on lattices?
On the lattice: “Domino tilings”
Q: How many?
≥ 2^(Area/4) total tilings.
“Lozenge tilings”
Q2: What does a typical one look like?
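As a small illustration of the "How many?" question, here is a sketch that counts domino tilings of a rectangle exactly with a cell-by-cell dynamic program and compares the count with the 2^(Area/4) lower bound. The function name `count_domino_tilings` and the bitmask encoding are assumptions made for this example; they are not taken from the slides.

```python
from functools import lru_cache


def count_domino_tilings(rows: int, cols: int) -> int:
    """Count domino tilings of a rows x cols rectangle (illustrative sketch)."""

    @lru_cache(maxsize=None)
    def fill(cell: int, mask: int) -> int:
        # Bit j of mask is set when cell + j is already covered by a domino
        # placed at an earlier cell (cells are numbered row by row).
        if cell == rows * cols:
            return 1 if mask == 0 else 0
        if mask & 1:
            # Current cell is already covered: move on.
            return fill(cell + 1, mask >> 1)
        total = 0
        r, c = divmod(cell, cols)
        # Horizontal domino covering cell and cell + 1.
        if c + 1 < cols and not (mask & 2):
            total += fill(cell + 1, (mask >> 1) | 1)
        # Vertical domino covering cell and cell + cols.
        if r + 1 < rows:
            total += fill(cell + 1, (mask >> 1) | (1 << (cols - 1)))
        return total

    return fill(0, 0)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, count_domino_tilings(n, n), 2 ** (n * n // 4))
```

The lower bound comes from cutting the region into 2 x 2 blocks, each of which can be tiled in two ways independently.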
Counting and Random Sampling
E.g., Perfect Matchings
Question: In polynomial time, can we:
                              In general:    On lattices:
• decide if there is one?         Y               Y
• find one?                       Y               Y
• count them?                     N               Y
• sample one at random?           ?/Y             Y
Other Computational Problems in the Sciences
Some (unrelated?) problems:
• Nanotechnology
- self-assembly
• Computer sciences - sampling/counting
(e.g., image segmentation)
• Physics
- phase transitions
• Chemistry
- colloids
Nanotechnology
• Universal model of computation using “Wang tiles”
• DNA-Based self-assembly
[Seeman, Winfree,…….]
Construction based on Watson-Crick complementarity
[Figure: DNA tiles A and B whose single-stranded ends bind by Watson-Crick base pairing]
Physics
Phase transitions:
Macroscopic changes to the system due to a microscopic
change to some parameter.
e.g.:
gas/liquid/solid, spontaneous magnetization
Simulations of the Ising model
High temperature
Criticality
Low temperature
Chemistry
• Colloids: mixtures of two types of molecules.
• Must not overlap.
[Figure: Low density → ? → High density]
(See poster by Sarah Miracle and Amanda Streib!)
Computational Problems in the Sciences
Some (unrelated?) problems:
• Nanotechnology
- self-assembly
• Computer sciences - sampling/counting
(e.g., image segmentation)
• Physics
- phase transitions
• Chemistry
- colloids
… Simulating the Ising Model (and other spin systems)
Lattice models
Independent Sets
The Ising Model
Matchings
Colorings (Potts Model)
Goals: Efficiently sample and approximately count.
Counting and Sampling
An FPRAS* for f takes input x, ε, δ, and produces A s.t.
Pr [ (1−ε) f(x) ≤ A ≤ (1+ε) f(x) ] ≥ 1−δ
and runs in time polynomial in |x|, ε^-1 and log(1/δ).
(*Fully Polynomial Randomized Approximation Scheme)
An FPAUS** generates samples from some distribution μ that is close to the
target distribution π, s.t.
||μ, π|| ≤ δ,
and runs in time polynomial in |x| and log(1/δ).
(**Fully Polynomial Almost Uniform Sampler)
Exact Counting          ⇒      Exact Sampling
      ⇓                              ⇓
Approximate Counting   ⇐⇒     Approximate Sampling
     (FPRAS)       “self-reducible”      (FPAUS)
Main Questions
• Is the problem efficiently computable (in polynomial time)?
Which problems are “intractable”?
• Does the “natural” sampling method work?
Give me *any*
fast solution!
Is *this* sampling
algorithm efficient?
Markov chains
State space Ω   ( |Ω| ~ c^n )
Step 1. Connect the state space.
E.g., if Ω = indep. sets of a graph G, connect I and I’ iff |I ⊕ I’| = 1.
Basics of Markov chains
Transitions P: Random walk on H
Starting at x:
- Pick a neighbor y.
- Move to y with prob. P(x,y) = 1/∆.   (∆ = max deg in H)
- With all remaining prob. stay at x.
Def’n: A MC is ergodic if it is:
• Irreducible - for all x,y ∈ Ω, ∃ t: P^t(x,y) > 0;   (connected)
• Aperiodic - g.c.d. {t: P^t(x,y) > 0} = 1.   (not bipartite)
(P^t is the “t step” transition prob.)
The stationary distribution π
Thm: Any finite, ergodic MC converges to a unique
stationary distribution π.
Thm: The stationary distribution π of a reversible chain
satisfies the detailed balance condition:
π(x) P(x,y) = π(y) P(y,x).
P symmetric implies π is uniform.
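The claim that a symmetric P has a uniform stationary distribution can be checked numerically on a toy example. The sketch below builds the random walk on the independent sets of a 3-vertex path (an example graph chosen here, not one from the slides), with P(x,y) = 1/∆ for neighboring states, and iterates the distribution. The helper names are hypothetical.

```python
from itertools import combinations


def independent_sets(n, edges):
    """All independent sets of a graph on vertices 0..n-1, as frozensets."""
    sets = []
    for r in range(n + 1):
        for combo in combinations(range(n), r):
            s = frozenset(combo)
            if all(not (u in s and v in s) for u, v in edges):
                sets.append(s)
    return sets


def walk_matrix(states):
    """Random walk on H: states are adjacent iff they differ in one vertex."""
    idx = {s: i for i, s in enumerate(states)}
    nbrs = [[idx[t] for t in states if len(s ^ t) == 1] for s in states]
    delta = max(len(nb) for nb in nbrs)  # max degree in H
    size = len(states)
    P = [[0.0] * size for _ in range(size)]
    for i, nb in enumerate(nbrs):
        for j in nb:
            P[i][j] = 1.0 / delta
        P[i][i] = 1.0 - len(nb) / delta  # remaining probability: stay put
    return P


def distribution_after(P, start, steps):
    """Distribution of the walk after `steps` steps from state `start`."""
    size = len(P)
    dist = [0.0] * size
    dist[start] = 1.0
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]
    return dist


if __name__ == "__main__":
    states = independent_sets(3, [(0, 1), (1, 2)])
    P = walk_matrix(states)
    print(distribution_after(P, 0, 500))  # close to uniform over 5 states
```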
Sampling from non-uniform distributions
Q: What if we want to sample from some other distribution?
E.g., For λ > 0, sample independent set I with prob.:
π(I) = λ^|I| / Z,
where Z = ∑_J λ^|J|.
[Figure: independent sets arranged by size, with weights λ^0, λ^1, λ^2, ...]
Step 2. Carefully define the transition probabilities.
The Metropolis Algorithm
[MRRTT ’53]
Propose a move from x to y as before, but accept with prob.
min (1, π(y)/π(x)),
(and with all remaining probability stay at x).
Detailed balance: π(x) P(x,y) = π(y) P(y,x).
If π(x) ≥ π(y):  P(x,y) = π(y) / (∆ π(x)),  P(y,x) = 1/∆.
For independent sets:
I → I ∪ {v} w.p. min(1, λ);   I ∪ {v} → I w.p. min(1, λ^-1);
π(y)/π(x) = (λ^(|I|+1)/Z) / (λ^|I|/Z) = λ.
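A quick way to see that the Metropolis rule gives the desired π is to build the full transition matrix for a tiny graph and check detailed balance entry by entry. The sketch below does this for independent sets weighted by λ^|I|. It assumes the proposal picks one of the n vertices uniformly (so ∆ = n here); the function names are chosen for this example only.

```python
from itertools import combinations


def independent_sets(n, edges):
    """All independent sets of a graph on vertices 0..n-1, as frozensets."""
    sets = []
    for r in range(n + 1):
        for combo in combinations(range(n), r):
            s = frozenset(combo)
            if all(not (u in s and v in s) for u, v in edges):
                sets.append(s)
    return sets


def metropolis_matrix(n, edges, lam):
    """Metropolis chain on independent sets with target pi(I) = lam^|I| / Z."""
    states = independent_sets(n, edges)
    idx = {s: i for i, s in enumerate(states)}
    size = len(states)
    P = [[0.0] * size for _ in range(size)]
    for s in states:
        i = idx[s]
        for v in range(n):
            t = s ^ frozenset([v])  # toggle vertex v
            if t in idx:  # proposal is a valid independent set
                ratio = lam ** (len(t) - len(s))  # pi(t) / pi(s)
                P[i][idx[t]] = (1.0 / n) * min(1.0, ratio)
        P[i][i] = 1.0 - sum(P[i])  # rejected or invalid proposals: stay
    weights = [lam ** len(s) for s in states]
    Z = sum(weights)
    pi = [w / Z for w in weights]
    return states, P, pi


if __name__ == "__main__":
    cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
    states, P, pi = metropolis_matrix(4, cycle4, lam=2.0)
    worst = max(abs(pi[i] * P[i][j] - pi[j] * P[j][i])
                for i in range(len(P)) for j in range(len(P)))
    print("largest detailed-balance violation:", worst)
```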
Basics continued…
Step 1. Connect the state space.
Step 2. Carefully define the transition probabilities.
Starting at any state x0, take a random walk for some
number of steps . . . and output the final state (from π?).
Q: But for how long do we walk?
Step 3. Bound the mixing time.
This tells us the number of steps to take.
The mixing rate
Def’n: The total variation distance is
||P^t, π|| = max_{x ∈ Ω} (1/2) ∑_{y ∈ Ω} |P^t(x,y) − π(y)|.
Def’n: Given ε, the mixing time is
τ(ε) = min {t: ||P^t’, π|| < ε, ∀ t’ ≥ t}.
A Markov chain is rapidly mixing (or polynomially mixing) if τ(ε) is poly(n, log(ε^-1)).
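For a chain small enough to write down as a matrix, both definitions can be evaluated directly. The sketch below uses a lazy random walk on a cycle as a stand-in chain (an assumption for illustration) and finds the first t at which the worst-case total variation distance drops below ε. Because that distance never increases with t, the first crossing agrees with the "for all t’ ≥ t" definition.

```python
def lazy_cycle(n):
    """Lazy random walk on an n-cycle: stay w.p. 1/2, else step left or right."""
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        P[x][x] += 0.5
        P[x][(x + 1) % n] += 0.25
        P[x][(x - 1) % n] += 0.25
    return P


def mat_mul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def tv_distance(Pt, pi):
    """max over start states x of (1/2) * sum_y |P^t(x,y) - pi(y)|."""
    return max(0.5 * sum(abs(row[y] - pi[y]) for y in range(len(pi)))
               for row in Pt)


def mixing_time(P, pi, eps, t_max=10_000):
    """Smallest t <= t_max with TV distance below eps, or None if not reached."""
    Pt = P
    for t in range(1, t_max + 1):
        if tv_distance(Pt, pi) < eps:
            return t
        Pt = mat_mul(Pt, P)
    return None


if __name__ == "__main__":
    n = 6
    P, pi = lazy_cycle(n), [1.0 / n] * n
    for eps in (0.25, 0.01):
        print(eps, mixing_time(P, pi, eps))
```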
Spectral gap
Let 1 = λ1 > |λ2| ≥ … ≥ |λ_|Ω|| be the eigenvalues of P.
Gap(P) = 1 − |λ2| is the spectral gap.
Thm: (Alon, Alon-Milman, Sinclair)
τ(ε) ≤ (1 / Gap(P)) · log( 1 / (π* ε) ),   where π* = min_x π(x),
τ(ε) ≥ (|λ2| / (2 Gap(P))) · log( 1 / (2ε) ).
Bounding Convergence Time
Techniques:
• Coupling
• Flows and paths
• Indirect methods
• Insights from physics
Problems:
• Colorings
• Matchings
• Independent sets
• Ising model
Ex 1: Colorings
Given: A graph G (max deg d), k > 1.
Goal: Find a random k-coloring of G.
MCCOL: (Single point replacement)
• Starting at some k-coloring C0
• Repeat:
- With prob 1/2 do nothing.   (The “lazy” chain)
- Pick v ∈ V, c ∈ [k];
- Recolor v with c, if possible.
Note: k ≥ d + 1 ⇒ colorings exist. (Greedy)
If k ≥ d + 2, then the state space is connected.
(Therefore π is uniform.)
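The MCCOL steps translate almost line for line into code. The following is a minimal sketch, assuming the graph is given as an adjacency list and colors are the integers 0..k-1; `greedy_coloring`, `mccol_step` and `is_proper` are names invented for this example.

```python
import random


def greedy_coloring(adj, k):
    """Starting coloring C0: works whenever k >= (max degree) + 1."""
    coloring = [None] * len(adj)
    for v in range(len(adj)):
        used = {coloring[u] for u in adj[v]}
        coloring[v] = next(c for c in range(k) if c not in used)
    return coloring


def is_proper(coloring, adj):
    """True if no edge has both endpoints the same color."""
    return all(coloring[v] != coloring[u]
               for v in range(len(adj)) for u in adj[v])


def mccol_step(coloring, adj, k, rng):
    """One step of the lazy single-site recoloring chain (in place)."""
    if rng.random() < 0.5:
        return  # lazy step: do nothing
    v = rng.randrange(len(adj))
    c = rng.randrange(k)
    if all(coloring[u] != c for u in adj[v]):
        coloring[v] = c  # recolor only if the result is still proper


if __name__ == "__main__":
    n, k = 6, 4
    cycle = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
    rng = random.Random(0)
    coloring = greedy_coloring(cycle, k)
    for _ in range(1000):
        mccol_step(coloring, cycle, k, rng)
    print(coloring, is_proper(coloring, cycle))
```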
Coupling
Simulate 2 processes:
• Start at any x0 and y0
• Couple moves, but each simulates the MC
• Once they agree, they move in sync (xt = yt ⇒ xt+1 = yt+1)
Coupling
Def’n: A coupling is a MC on Ω × Ω:
1) Each process {Xt}, {Yt} is a faithful copy of the original MC,
2) If Xt = Yt, then Xt+1 = Yt+1.
The coupling time T is:
T = max_{x,y} ( E [ T^{x,y} ] ),
where T^{x,y} = min {t: Xt = Yt | X0 = x, Y0 = y}.
Thm: τ(ε) ≤ T · e · ln(ε^-1).   (e = Euler’s constant 2.718…)
[Aldous ’81]
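One simple coupling for the lazy recoloring chain is to feed both copies the same random choices (lazy bit, vertex, color). Each copy on its own is then a faithful run of the chain, and once the copies agree they stay together. The sketch below measures a single coalescence time on a cycle; the graph, the value of k and the function names are assumptions made for illustration, and a single run is not an estimate of T.

```python
import random


def coupled_step(x, y, adj, k, rng):
    """Advance both colorings using the same lazy bit, vertex and color."""
    if rng.random() < 0.5:
        return
    v = rng.randrange(len(adj))
    c = rng.randrange(k)
    for col in (x, y):
        if all(col[u] != c for u in adj[v]):
            col[v] = c


def coupling_time(x0, y0, adj, k, rng, max_steps=100_000):
    """First time the two copies agree, or None if not within max_steps."""
    x, y = list(x0), list(y0)
    for t in range(max_steps + 1):
        if x == y:
            return t
        coupled_step(x, y, adj, k, rng)
    return None


if __name__ == "__main__":
    n, k = 8, 7  # cycle has max degree d = 2, so k > 3d
    cycle = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
    x0 = [v % 2 for v in range(n)]        # colors 0,1 alternating
    y0 = [2 + v % 2 for v in range(n)]    # colors 2,3 alternating
    print(coupling_time(x0, y0, cycle, k, random.Random(0)))
```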
Path Coupling
[Bubley, Dyer, Greenhill’97-8]
Coupling: Show for all x,y ∈ Ω,
E[ ∆(dist(x,y)) ] ≤ 0.
Path coupling: Show for all u,v s.t. dist(u,v)=1, that
E[ ∆(dist(u,v)) ] ≤ 0.
Consider a shortest path:
x = z0, z1, z2, . . . , zr = y,
dist(zi, zi+1) = 1,
dist(x,y) = r.
Path coupling for MCCOL
Thm: MCCOL is rapidly mixing if k ≥ 3d (improved to 2d).
[Jerrum ’95]
Pf: Use path coupling: dist(x,y) = 1, where x and y differ only at w.
Cases:
• v = w, c ∈ C \ {colors on N(w)}: ∆dist = −1,
• v ∈ N(w), c ∈ {x(w), y(w)}: ∆dist = +1 (or 0),
• o.w.: ∆dist = 0.
E[∆dist] ≤ (1/(2nk)) ( (k−d)(−1) + 2d(+1) )
         = (1/(2nk)) (3d − k)
         ≤ 0.
k-colorings
On Z^2, MCCOL is fast when:
• k ≥ 8 [Jerrum]
• k ≥ 6 [BDG, AMMV]
• k = 3 [LRS]
On Z^d, MCCOL is slow for k = 3 for large d. [GKRS, Peled]
Ex 2: Sampling matchings
MCMATCH:
Starting at M0, repeat:
• Pick e = (u,v) ∈ E
- If e ∈ M, remove e;
- If u and v unmatched in M, add e;
- If u matched (by e’) and v unmatched (or vice versa), add e and remove e’;
- Otherwise do nothing.
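The three MCMATCH moves (remove, add, slide) can be sketched as follows. The matching is stored as a set of two-element frozensets; that representation and the helper names are choices made for this example rather than anything specified on the slide.

```python
import random


def is_matching(matching):
    """True if no vertex is covered by more than one edge."""
    seen = set()
    for e in matching:
        if seen & e:
            return False
        seen |= e
    return True


def mcmatch_step(matching, edges, rng):
    """One MCMATCH move on `matching` (modified in place)."""
    u, v = rng.choice(edges)
    e = frozenset((u, v))
    if e in matching:
        matching.remove(e)  # remove move
        return
    covering = {w: f for f in matching for w in f}  # vertex -> its edge in M
    eu, ev = covering.get(u), covering.get(v)
    if eu is None and ev is None:
        matching.add(e)  # add move
    elif (eu is None) != (ev is None):
        # Exactly one endpoint is matched: slide e' over to e.
        matching.remove(eu if eu is not None else ev)
        matching.add(e)
    # Both endpoints matched by other edges: do nothing.


if __name__ == "__main__":
    n = 6
    edges = [(v, (v + 1) % n) for v in range(n)]  # a 6-cycle
    rng = random.Random(0)
    M = set()
    for _ in range(2000):
        mcmatch_step(M, edges, rng)
    print(sorted(tuple(sorted(e)) for e in M))
```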
Sampling Matchings
MCMATCH is rapidly mixing if #NPM ≤ poly(n) · #PM
(NPM = near-perfect matchings, PM = perfect matchings).
[Jerrum, Sinclair]
There is an FPRAS (and FPAUG) for matchings on any
bipartite graph.
[Jerrum, Sinclair, Vigoda]
Conductance and flows
[Jerrum-Sinclair, Lawler-Sokal]
Φ(S) = ( ∑_{s ∈ S, s’ ∈ S^C} π(s) P(s,s’) ) / ( ∑_{s ∈ S} π(s) )
[Figure: Ω partitioned into S and S^C]
Φ = min_{S ⊂ Ω, π(S) ≤ 1/2} Φ(S).
Thm: Φ²/2 ≤ Gap(P) ≤ 2Φ.
(Thm: Coupling won’t work! [Kumar-Ramesh ’99])
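For a very small chain the conductance can be found by brute force over all cuts, which makes the two-sided bound on the gap easy to check. The sketch below uses a lazy random walk on a 6-cycle as a stand-in chain (an assumption for illustration); for that chain the eigenvalues are 1/2 + (1/2)cos(2πj/n), so the gap is available in closed form without an eigenvalue solver.

```python
import math
from itertools import combinations


def lazy_cycle(n):
    """Lazy random walk on an n-cycle: stay w.p. 1/2, else step left or right."""
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        P[x][x] += 0.5
        P[x][(x + 1) % n] += 0.25
        P[x][(x - 1) % n] += 0.25
    return P


def conductance(P, pi):
    """Minimum over cuts S with pi(S) <= 1/2 of flow(S -> S^C) / pi(S)."""
    n = len(P)
    best = math.inf
    for r in range(1, n):
        for S in combinations(range(n), r):
            mass = sum(pi[s] for s in S)
            if mass > 0.5 + 1e-12:
                continue
            inside = set(S)
            flow = sum(pi[s] * P[s][t]
                       for s in S for t in range(n) if t not in inside)
            best = min(best, flow / mass)
    return best


if __name__ == "__main__":
    n = 6
    P, pi = lazy_cycle(n), [1.0 / n] * n
    phi = conductance(P, pi)
    gap = 0.5 * (1.0 - math.cos(2.0 * math.pi / n))  # closed form for this chain
    print(phi ** 2 / 2, "<=", gap, "<=", 2 * phi)
```

The exhaustive search is exponential in |Ω|, so this is only a sanity check for toy chains, not a way to bound conductance in general.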
Matchings on Lattices
Markov chain for Lozenge Tilings
Repeat:
• Pick v in the lattice region;
• Add / remove the “cube” at v w.p. ½, if possible.
The state space is connected.
The stationary distribution is uniform over tilings.
Thm: The lozenge Markov chain is rapidly mixing.
[Luby, R., Sinclair], [Wilson], [R.,Tetali]
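The cube move is easiest to code through the standard correspondence between lozenge tilings of a hexagon and stacks of unit cubes in an a x b x c box, stored as a table of column heights that is non-increasing along rows and columns. The sketch below makes that assumption and picks a column (i, j) rather than a lattice point v, so it is an illustration of the add/remove move, not a transcription of the chain on the slide.

```python
import random


def is_plane_partition(h, c):
    """Heights in [0, c], non-increasing along each row and each column."""
    a, b = len(h), len(h[0])
    for i in range(a):
        for j in range(b):
            if not 0 <= h[i][j] <= c:
                return False
            if i + 1 < a and h[i + 1][j] > h[i][j]:
                return False
            if j + 1 < b and h[i][j + 1] > h[i][j]:
                return False
    return True


def lozenge_step(h, c, rng):
    """Pick a column; w.p. 1/2 try to add a cube, else try to remove one."""
    a, b = len(h), len(h[0])
    i, j = rng.randrange(a), rng.randrange(b)
    delta = 1 if rng.random() < 0.5 else -1
    new = h[i][j] + delta
    if not 0 <= new <= c:
        return
    if delta == 1:
        # Adding: the columns behind and to the left must be at least as tall.
        ok = (i == 0 or h[i - 1][j] >= new) and (j == 0 or h[i][j - 1] >= new)
    else:
        # Removing: the columns in front and to the right must be no taller.
        ok = (i + 1 == a or h[i + 1][j] <= new) and \
             (j + 1 == b or h[i][j + 1] <= new)
    if ok:
        h[i][j] = new


if __name__ == "__main__":
    a = b = c = 3
    rng = random.Random(0)
    h = [[0] * b for _ in range(a)]
    for _ in range(5000):
        lozenge_step(h, c, rng)
    print(h)
```

Each add move is undone by the matching remove move with the same probability, so the transition matrix is symmetric and the stationary distribution is uniform.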
Tower chain for Lozenge Tilings
Repeat:
• Pick v in the lattice region;
• Add/remove the “tower of height h” at v w.p. 1/(2h), if possible.
To couple: Choose corresponding points and the same direction.
[Figure: coupled configurations with towers of height 2 and 1; labels “Move w/ prob 1/4” and “Do nothing”]
Comparison
P: unknown;   P̄: known.   [Diaconis, Saloff-Coste ’93]
For each edge (x,y) ∈ P̄, make a path γ_{x,y} using edges in P.
Let Γ(z,w) be the set of paths γ_{x,y} using (z,w).
A = max_e { (1/Q(e)) ∑_{γ_{x,y} ∋ e} |γ_{x,y}| π(x) P̄(x,y) },   where Q(e) = π(z) P(z,w) for e = (z,w).
Thm: Gap(P) ≥ (1/A) Gap(P̄).
What about other models?
Dimer model (domino tilings):
• Pick a 2 x 2 square;
• Rotate, if possible;
• Otherwise do nothing.
Potts model (3-colorings):
• Pick a vtx and a color;
• Recolor, if possible;
• Otherwise do nothing.
These local chains are also rapidly mixing on domino tilings and 3-colorings.
Ex 3: Independent Sets
Goal: Given λ, sample ind. set I with prob:
π(I) = λ^|I| / Z,
where Z = ∑_J λ^|J|.
MCIND:
Starting at I0, Repeat:
- Pick v ∈ V and b ∈ {0,1};
- If v ∈ I, b=0, remove v w.p. min (1, λ^-1);
- If v ∉ I, b=1, add v w.p. min (1, λ), if possible;
- O.w. do nothing.
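A minimal sketch of MCIND on a small torus follows, assuming the graph is given as an adjacency list and the independent set is a Python set of vertex indices. The names `torus_adjacency`, `mcind_step` and `is_independent` are hypothetical helpers for this example.

```python
import random


def torus_adjacency(n):
    """Adjacency lists of the n x n grid with toroidal boundary."""
    adj = []
    for r in range(n):
        for c in range(n):
            adj.append([((r + 1) % n) * n + c, ((r - 1) % n) * n + c,
                        r * n + (c + 1) % n, r * n + (c - 1) % n])
    return adj


def is_independent(ind, adj):
    """True if no two vertices of `ind` are adjacent."""
    return all(u not in ind for v in ind for u in adj[v])


def mcind_step(ind, adj, lam, rng):
    """One MCIND move on the independent set `ind` (modified in place)."""
    v = rng.randrange(len(adj))
    b = rng.randrange(2)
    if v in ind and b == 0:
        if rng.random() < min(1.0, 1.0 / lam):
            ind.remove(v)
    elif v not in ind and b == 1:
        # "If possible": v may be added only if none of its neighbors is in I.
        if all(u not in ind for u in adj[v]) and rng.random() < min(1.0, lam):
            ind.add(v)
    # Otherwise do nothing.


if __name__ == "__main__":
    adj = torus_adjacency(4)
    rng = random.Random(0)
    I = set()
    for _ in range(3000):
        mcind_step(I, adj, lam=1.0, rng=rng)
    print(sorted(I))
```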
When λ is small (sparse case)
G = Z^2 (fixed or toroidal boundary).
MCIND is fast on Z^2 when:
• λ ≤ 1 [Luby, Vigoda]
• λ ≤ 1.24 [van den Berg, Steif]
• λ ≤ 1.68 [Weitz]
• λ ≤ 2.38 [RSTVY] (strong spatial mixing)
Conjecture: Fast for λ < 3.79
When λ is large (dense case)
G = Z^2
MCIND is slow on Z^2 when:
• λ > 80 [BCFKTVV]
• λ > 6.19 … [R]
Conjecture: Slow for λ > 3.79
(with toroidal boundary).
Slow mixing of MCIND on Z^2 (large λ)
[Figure: state space split into S and S^C along the axis #R/#B (0, 1, ∞); weight λ^(n²/2) at the (Even) and (Odd) ends and λ^(n²/2 − n/2) at the cut]
λ large ⇒ there is a bad cut,
. . . so MCIND is slowly mixing.
Ex 4: Ising Configurations
The local chain: Pick a site and a spin and update with the
appropriate Metropolis probability.
[Diagram: λ axis with critical point λc; regions labeled Fast and Slow on either side of λc; at λc: ? → Fast [Lubetzky, Sly]]
Alternative: Simulated Tempering?
Ω̂ = Ω × [M+1]
β_M = β,   β_i = β · i/M,   β_0 = 0
π̂((σ,i)) = π_i(σ) / (M+1)
[Figure: levels i = M, M−1, M−2, ..., 0]
Tempering:
w.p. 1/2: Do a LEVEL move: Fix i; update σ
w.p. 1/2: Do a TEMP move: Fix σ; update i
Thm: Tempering is fast for Ising on Kn, for all β. [Madras/Zheng]
… But not for Potts. [Bhatnagar, R.]
Other approaches…
Thm: There is an FPRAS for the Ising model that estimates Z
(for all β, all G).
[Jerrum, Sinclair]
based on the “high temperature expansion”
Z = ∑_τ e^(−β H(τ)) = ∑_{H ⊆ G} …
Thm: There is an FPAUS for the Ising model to
sample from π (all β, all G).
[R., Wilson]
based on the “random cluster representation” + JS
Conclusions
Techniques:
• Coupling: can be easy when it works
• Flows: requires global knowledge of chain;
very useful for slow mixing
• Indirect methods: top down approach;
often increases complexity
• Connection to physics: can offer
tremendous insights
Open problems: . . .
Open Problems
3-Colorings: MCcol is fast on Z2 when k=3 or k ≥ 6.
What about k=4 or 5?
6vtx / 8vtx: Consider MC8vtx where π(x) = λ^(#sources+sinks) / Z.
Fast: λ = 0, .9 < λ < 1.1;
Slow: large λ;
In between: ?
SAWs: Is there an FPRAS / FPAUG?
There is an efficient “testable algorithm.”
[R., Sinclair]
Matchings: FPRAS / FPAUG on non-bipartite graphs?
Thank you!