
Simple Instances of Swendsen-Wang & RJMCMC
Daniel Eaton
CPSC 540
December 13, 2005
A Pedagogical Project (1/1)

Two algorithms implemented:

1. SW for image denoising
2. RJMCMC for model selection in a regression problem (in real-time!)
   Similar to Frank Dellaert's demo at ICCV 2005
Swendsen-Wang for Denoising (1/8)

Denoising problem input:

[Figure: lettera.bmp: original image, thresholded (binarized), plus zero-mean noise (noisy, σ = 1.00)]
Swendsen-Wang for Denoising (2/8)

Model (sketched in code below):

Ising prior
  Smoothness: a priori we expect neighbouring pixels to be the same
Likelihood
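The equations on this slide did not survive extraction, so here is a minimal Python sketch of the Ising-prior-plus-Gaussian-likelihood model the slide describes. The ±1 pixel coding and the names `beta` (smoothness) and `sigma` (noise level) are assumptions for illustration, not taken from the original demo code.

```python
import numpy as np

def ising_log_prior(x, beta=1.0):
    """Ising smoothness prior on a +/-1 image: reward agreeing neighbours.

    Returns the log prior up to an additive constant (the log partition
    function), which is all an MCMC sampler needs.
    """
    agree = np.sum(x[:, :-1] == x[:, 1:]) + np.sum(x[:-1, :] == x[1:, :])
    return beta * agree

def gaussian_log_likelihood(y, x, sigma=1.0):
    """Likelihood for y = x + zero-mean Gaussian noise with std sigma."""
    return -0.5 * np.sum((y - x) ** 2) / sigma ** 2

def log_posterior(x, y, beta=1.0, sigma=1.0):
    """Unnormalized log posterior over the clean image x given the noisy y."""
    return ising_log_prior(x, beta) + gaussian_log_likelihood(y, x, sigma)
```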
Swendsen-Wang for Denoising (3/8)

Sampling from the posterior

Easy: use the Metropolis-Hastings algorithm
Propose to flip the value at a single pixel (i,j)
Accept with the Hastings acceptance probability
The ratio has a simple expression involving the number of disagreeing edges before/after the flip (see the sketch below)
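A hedged sketch of the single-site Metropolis-Hastings update described here, under the assumed model above. Only the local change in agreeing edges and the local likelihood term enter the acceptance ratio, and the flip proposal is symmetric, so the Hastings correction is 1.

```python
import numpy as np

def single_site_mh_step(x, y, beta=1.0, sigma=1.0, rng=np.random.default_rng()):
    """Propose flipping one uniformly chosen pixel; accept with the
    Metropolis-Hastings probability min(1, posterior ratio)."""
    h, w = x.shape
    i, j = rng.integers(h), rng.integers(w)
    old, new = x[i, j], -x[i, j]

    # Change in log prior: beta * (# agreeing neighbour edges after - before).
    neighbours = [x[ni, nj] for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                  if 0 <= ni < h and 0 <= nj < w]
    d_prior = beta * sum(int(new == n) - int(old == n) for n in neighbours)

    # Change in log likelihood under the Gaussian observation model.
    d_lik = ((y[i, j] - old) ** 2 - (y[i, j] - new) ** 2) / (2 * sigma ** 2)

    # Symmetric proposal, so accept with probability min(1, exp(d_prior + d_lik)).
    if np.log(rng.random()) < d_prior + d_lik:
        x[i, j] = new
    return x
```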
Swendsen-Wang for Denoising (4/8)

Problem: convergence is slow

[Figure: partially denoised image with a large hole remaining]
Many steps are needed to fill this hole, since update locations are chosen uniformly at random!
Swendsen-Wang for Denoising (5/8)

SW to the rescue

Flip whole chunks of the image at once
[Figure: two image states, one SW step apart]
BUT make this a reversible Metropolis-Hastings step so that we are still sampling from the right posterior (a sketch of such a cluster move follows)
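A hedged sketch of one SW-style cluster flip treated as a reversible Metropolis-Hastings move, under the same assumed model as above (not the demo's code). It grows a cluster from a random seed pixel over agreeing edges with bond probability q = 1 - exp(-beta); with that choice the prior and proposal boundary terms cancel, so the cluster flip is accepted using only the likelihood ratio over the cluster.

```python
import numpy as np

def sw_cluster_step(x, y, beta=1.0, sigma=1.0, rng=np.random.default_rng()):
    """Flip a whole cluster of the +/-1 image at once, keeping the move reversible."""
    h, w = x.shape
    q = 1.0 - np.exp(-beta)  # bond probability on agreeing edges

    # Grow the cluster containing a random seed pixel (BFS over sampled bonds).
    seed = (int(rng.integers(h)), int(rng.integers(w)))
    cluster, frontier = {seed}, [seed]
    while frontier:
        i, j = frontier.pop()
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if (0 <= ni < h and 0 <= nj < w and (ni, nj) not in cluster
                    and x[ni, nj] == x[i, j] and rng.random() < q):
                cluster.add((ni, nj))
                frontier.append((ni, nj))

    # Propose flipping every pixel in the cluster; with q = 1 - exp(-beta) the
    # prior and proposal terms cancel, leaving only the likelihood ratio.
    idx = tuple(np.array(sorted(cluster)).T)
    old, new = x[idx], -x[idx]
    d_lik = np.sum((y[idx] - old) ** 2 - (y[idx] - new) ** 2) / (2 * sigma ** 2)
    if np.log(rng.random()) < d_lik:
        x[idx] = new
    return x
```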
Swendsen-Wang for Denoising (6/8)

[Figure: denoising results, Hastings vs. SW]
Swendsen-Wang for Denoising (7/8)

Demo: Hastings vs. SW

Hastings is allowed to iterate more often than SW, to account for the difference in computational cost per step
Swendsen-Wang for Denoising (8/8)

Conclusion: SW is ill-suited for this task

Discriminative edge probabilities are very important to convergence
It makes large steps at the start (if the initialization is uniform) but slows near convergence (in the presence of small disconnected regions), ultimately degenerating into a single-site update algorithm
It adds an extra parameter to tune/anneal
It does what it claims to: energy is minimized faster than with Hastings alone
RJMCMC for Regression (1/5)

Data randomly generated from a line on [-1,1] with zero-mean Gaussian noise (see the snippet below)

y = 0.50*x + 0.00 + N(0, 0.05), nSamples = 100

[Figure: scatter plot of the generated data on [-1,1] x [-1,1]]
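A minimal snippet that regenerates data of this form. Reading the slide's N(0, 0.05) as a noise standard deviation of 0.05 is an assumption (it could also denote the variance).

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100
x = rng.uniform(-1.0, 1.0, size=n_samples)               # inputs on [-1, 1]
y = 0.50 * x + 0.00 + rng.normal(0.0, 0.05, n_samples)   # line plus zero-mean noise
```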
RJMCMC for Regression (2/5)

Allow two models to compete at explaining this data (uniform prior over models):

1. Linear (parameters: slope & y-intercept)
2. Constant (parameter: offset)
RJMCMC for Regression (3/5)

Heavy-handedly solve a simple model selection problem with RJMCMC

Recall: ensuring reversibility is one (convenient) way of constructing a Markov chain whose invariant distribution is the posterior we want to sample from
RJMCMC/TDMCMC is just a formalism for ensuring reversibility for proposals that jump between models of varying dimension (constant: 1 parameter, linear: 2 parameters)
RJMCMC for Regression (4/5)

Given initial starting conditions (model type, parameters), the chain has two types of moves (a sketch follows this list):

1. Parameter update (probability 0.6)
   Within the current model, propose new parameters (Gaussian proposal) and accept with the ordinary Hastings ratio
2. Model change (probability 0.4)
   Propose a model swap
   If going from constant -> linear, uniformly sample a new slope
   Accept with the special RJ ratio (see me/my tutorial for details)
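A hedged sketch of this two-move chain. The following choices are assumptions made here for illustration, not necessarily the demo's: flat parameter priors on [-1, 1], known noise std sigma = 0.05, a Gaussian random-walk proposal with std 0.05, and a Uniform(-1, 1) slope draw for the constant -> linear jump. Because the slope prior and slope proposal are both Uniform(-1, 1) and the Jacobian of the dimension-matching map is 1, the RJ ratio reduces to a likelihood ratio.

```python
import numpy as np

def log_lik(x, y, slope, offset, sigma=0.05):
    """Gaussian log likelihood of the data under y = slope*x + offset + noise."""
    r = y - (slope * x + offset)
    return -0.5 * np.sum(r ** 2) / sigma ** 2

def rjmcmc(x, y, n_iters=20000, step=0.05, sigma=0.05, seed=1):
    rng = np.random.default_rng(seed)
    model, slope, offset = "constant", 0.0, 0.0
    counts = {"constant": 0, "linear": 0}

    for _ in range(n_iters):
        if rng.random() < 0.6:
            # Move 1: parameter update within the current model (Gaussian proposal).
            new_offset = offset + step * rng.standard_normal()
            new_slope = slope + step * rng.standard_normal() if model == "linear" else 0.0
            inside = abs(new_offset) <= 1.0 and abs(new_slope) <= 1.0  # flat prior support
            log_a = (log_lik(x, y, new_slope, new_offset, sigma)
                     - log_lik(x, y, slope, offset, sigma)) if inside else -np.inf
            if np.log(rng.random()) < log_a:
                slope, offset = new_slope, new_offset
        else:
            # Move 2: model change (dimension jump between the two models).
            if model == "constant":
                new_slope = rng.uniform(-1.0, 1.0)  # dimension-matching draw
                # Slope prior and slope proposal are both Uniform(-1, 1), and the
                # Jacobian is 1, so the RJ ratio reduces to a likelihood ratio.
                log_a = log_lik(x, y, new_slope, offset, sigma) - log_lik(x, y, 0.0, offset, sigma)
                if np.log(rng.random()) < log_a:
                    model, slope = "linear", new_slope
            else:
                log_a = log_lik(x, y, 0.0, offset, sigma) - log_lik(x, y, slope, offset, sigma)
                if np.log(rng.random()) < log_a:
                    model, slope = "constant", 0.0
        counts[model] += 1
    return counts
```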
RJMCMC for Regression (5/5)

Model selection: approximate the posterior model probability (proportional to the marginal likelihood under the uniform model prior) by the fraction of samples from each model (see the counting snippet below)

If a model is better at explaining the data, the chain will spend more time using it
Demo:
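Using the rjmcmc sketch and the generated (x, y) from the earlier snippets, the model visit fractions approximate the posterior model probabilities:

```python
counts = rjmcmc(x, y)
total = sum(counts.values())
print({m: c / total for m, c in counts.items()})
# With data generated from the line, the "linear" fraction should dominate.
```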
Questions?


Thanks for listening
I'll be happy to answer any questions over a beer later tonight!