Simple Instances of Swendsen-Wang & RJMCMC
Daniel Eaton
CPSC 540, December 13, 2005

A Pedagogical Project (1/1)
- Two algorithms implemented:
  1. Swendsen-Wang (SW) for image denoising
  2. RJMCMC for model selection in a regression problem (in real time!)
- Similar to Frank Dellaert's demo at ICCV 2005

Swendsen-Wang for Denoising (1/8)
- Denoising problem input: lettera.bmp, binarized
- [Figure: original image, thresholded image, and noisy image (zero-mean noise, σ = 1.00)]

Swendsen-Wang for Denoising (2/8)
- Model: Ising prior
  - Smoothness: a priori, we expect neighbouring pixels to be the same
- Likelihood: observed pixel = true pixel + zero-mean noise

Swendsen-Wang for Denoising (3/8)
- Sampling from the posterior
- Easy: use the Hastings algorithm
  - Propose to flip the value at a single pixel (i, j)
  - Accept with the Metropolis-Hastings probability; the ratio has a simple expression involving the number of disagreeing edges before/after the flip (see the sketch at the end of this section)

Swendsen-Wang for Denoising (4/8)
- Problem: convergence is slow
- [Figure: intermediate sample with a large unfilled hole]
- Many steps are needed to fill this hole, since update locations are chosen uniformly at random!

Swendsen-Wang for Denoising (5/8)
- SW to the rescue: flip whole chunks of the image at once
- [Figure: two samples, before and after one SW step]
- BUT: make this a reversible Metropolis-Hastings step, so that we are still sampling from the right posterior

Swendsen-Wang for Denoising (6/8)
- [Figure: side-by-side comparison of Hastings and SW samples]

Swendsen-Wang for Denoising (7/8)
- Demo: Hastings vs. SW
- Hastings is allowed to iterate more often than SW, to account for the difference in computational cost

Swendsen-Wang for Denoising (8/8)
- Conclusion: SW is ill-suited for this task
  - Discriminative edge probabilities are very important to convergence
  - Makes large steps at the start (if the initialization is uniform) but slows near convergence (in the presence of small disconnected regions); ultimately, it becomes a single-site update algorithm
  - Extra parameter to tune/anneal
- It does what it claims to: energy is minimized faster than with Hastings alone
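The prior, likelihood, and acceptance-ratio formulas on the slides above appear only as images in the original deck. As a rough illustration (not the original demo code), the Python sketch below implements the single-site Hastings update for a standard Ising prior with coupling J and a zero-mean Gaussian noise model with standard deviation sigma; the parameter names and values are assumptions, not taken from the slides.

```python
import numpy as np

def single_site_mh_sweep(x, y, J=1.0, sigma=1.0, rng=None):
    """One sweep of single-site Metropolis-Hastings for an Ising denoising model.

    x     : current label image, entries in {-1, +1}
    y     : observed noisy image (real-valued)
    J     : Ising coupling strength (smoothness prior); assumed value
    sigma : assumed standard deviation of the zero-mean Gaussian noise
    """
    rng = np.random.default_rng() if rng is None else rng
    H, W = x.shape
    for _ in range(H * W):
        # Pick an update location uniformly at random (as on the slide).
        i, j = rng.integers(H), rng.integers(W)
        # Sum of the 4-connected neighbours (free boundary conditions).
        nb = 0.0
        if i > 0:     nb += x[i - 1, j]
        if i < H - 1: nb += x[i + 1, j]
        if j > 0:     nb += x[i, j - 1]
        if j < W - 1: nb += x[i, j + 1]
        # Change in posterior energy if x[i, j] is flipped to -x[i, j]:
        #   the prior term depends only on the disagreeing edges at (i, j),
        #   the likelihood term compares the Gaussian fit before/after the flip.
        d_prior = 2.0 * J * x[i, j] * nb
        d_lik = ((-x[i, j] - y[i, j]) ** 2
                 - (x[i, j] - y[i, j]) ** 2) / (2.0 * sigma ** 2)
        # The single-pixel flip proposal is symmetric, so the acceptance
        # probability reduces to exp(-(change in energy)).
        if rng.random() < np.exp(-(d_prior + d_lik)):
            x[i, j] = -x[i, j]
    return x
```

Because the flip proposal is symmetric, the Hastings ratio collapses to the exponentiated change in posterior energy, which is exactly the "number of disagreeing edges before/after the flip" plus the local likelihood term mentioned on slide (3/8).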
RJMCMC for Regression (1/5)
- Data randomly generated from a line on [-1, 1] with zero-mean Gaussian noise:
  y = 0.50*x + 0.00 + N(0, 0.05), nSamples = 100
- [Figure: scatter plot of the 100 samples on [-1, 1]]

RJMCMC for Regression (2/5)
- Allow two models to compete at explaining this data (uniform prior over models):
  1. Linear (parameters: slope & y-intercept)
  2. Constant (parameters: offset)

RJMCMC for Regression (3/5)
- Heavy-handedly solve this simple model-selection problem with RJMCMC
- Recall: ensuring reversibility is one (convenient) way of constructing a Markov chain whose invariant distribution is the posterior we want to sample from
- RJMCMC/TDMCMC is just a formalism for ensuring reversibility for proposals that jump between models of varying dimension (constant: 1 parameter, linear: 2 parameters)

RJMCMC for Regression (4/5)
- Given initial starting conditions (model type, parameters), the chain has two types of moves:
  1. Parameter update (probability 0.6): within the current model, propose new parameters (Gaussian proposal) and accept with the ordinary Hastings ratio
  2. Model change (probability 0.4): propose a model swap; if going from constant to linear, uniformly sample a new slope; accept with the special RJ ratio (see me/my tutorial for details)

RJMCMC for Regression (5/5)
- Model selection: approximate the marginal likelihood by the number of samples drawn from each model
- If a model is better at explaining the data, the chain will spend more time using it
- Demo (a rough sketch of the two-move scheme follows below)
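The slides defer the reversible-jump acceptance ratio to the accompanying tutorial. As a self-contained sketch of the two-move scheme (not the original real-time demo), the Python fragment below assumes Uniform(-1, 1) priors on every parameter, draws the new slope from that prior when jumping constant -> linear, and treats the 0.05 in N(0, 0.05) as the noise standard deviation. Under those assumptions the dimension-matching Jacobian is 1 and the RJ ratio reduces to a likelihood ratio; the proposal width 0.05 and the 20,000 iterations are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data, matching the slide: y = 0.50*x + 0.00 + N(0, 0.05), nSamples = 100.
SIGMA = 0.05  # assumed: 0.05 interpreted as the noise standard deviation
x_data = rng.uniform(-1.0, 1.0, size=100)
y_data = 0.50 * x_data + SIGMA * rng.standard_normal(100)

def log_lik(theta):
    """Gaussian log-likelihood; theta = [offset] (constant) or [slope, intercept] (linear)."""
    pred = theta[0] if len(theta) == 1 else theta[0] * x_data + theta[1]
    return -0.5 * np.sum((y_data - pred) ** 2) / SIGMA ** 2

def in_support(theta):
    """Assumed Uniform(-1, 1) prior on every parameter (not specified on the slides)."""
    return np.all(np.abs(theta) <= 1.0)

# RJMCMC chain: model 0 = constant, model 1 = linear.
model, theta = 0, np.array([0.0])
counts = [0, 0]
for _ in range(20000):
    if rng.random() < 0.6:
        # Parameter update: Gaussian random-walk proposal, ordinary Hastings ratio
        # (uniform priors cancel as long as the proposal stays in their support).
        prop = theta + 0.05 * rng.standard_normal(theta.shape)
        if in_support(prop) and np.log(rng.random()) < log_lik(prop) - log_lik(theta):
            theta = prop
    else:
        # Model change. Drawing the new slope from its Uniform(-1, 1) prior makes the
        # prior and proposal densities cancel, so the RJ ratio is a likelihood ratio
        # with unit Jacobian.
        if model == 0:
            prop_model, prop = 1, np.array([rng.uniform(-1.0, 1.0), theta[0]])
        else:
            prop_model, prop = 0, np.array([theta[1]])
        if in_support(prop) and np.log(rng.random()) < log_lik(prop) - log_lik(theta):
            model, theta = prop_model, prop
    counts[model] += 1  # burn-in ignored for brevity

print("P(constant) ~= %.3f, P(linear) ~= %.3f"
      % (counts[0] / sum(counts), counts[1] / sum(counts)))
```

Counting how often the chain sits in each model approximates the posterior model probabilities, which is exactly the model-selection recipe on slide (5/5); with a true slope of 0.5 and small noise, the chain should spend nearly all of its time in the linear model.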
Questions?
- Thanks for listening
- I'll be happy to answer any questions over a beer later tonight!