Bayesian Methods
Will Penny and Guillaume Flandin
Wellcome Department of Imaging Neuroscience,
University College London, UK
SPM Course, London, May 12th 2006
Overview
Bayes rule and model comparison
ANOVAs
Normalisation
Segmentation
fMRI stats
Hemodynamic models
Connectivity models (DCM)
Bayes rule
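Given $p(Y)$, $p(\theta)$ and $p(Y,\theta)$, the conditional densities are given by
$$p(\theta \mid Y) = \frac{p(Y,\theta)}{p(Y)}, \qquad p(Y \mid \theta) = \frac{p(Y,\theta)}{p(\theta)}$$
Eliminating $p(Y,\theta)$ gives Bayes rule
$$p(\theta \mid Y) = \frac{p(Y \mid \theta)\, p(\theta)}{p(Y)}$$
Posterior: $p(\theta \mid Y)$
Likelihood: $p(Y \mid \theta)$
Prior: $p(\theta)$
Evidence: $p(Y)$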
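As a minimal sketch of Bayes rule over a discrete set of hypotheses, the following computes the posterior and the evidence from a prior and a likelihood. The function name and the numbers are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood):
    """Apply Bayes rule over a discrete set of hypotheses.

    prior[i]      = p(theta_i)
    likelihood[i] = p(Y | theta_i)
    Returns the posterior p(theta_i | Y) and the evidence p(Y).
    """
    joint = [l * p for l, p in zip(likelihood, prior)]  # p(Y, theta_i)
    evidence = sum(joint)                               # p(Y)
    return [j / evidence for j in joint], evidence


# Hypothetical numbers: two hypotheses that are equally likely a priori.
post, evidence = posterior(prior=[0.5, 0.5], likelihood=[0.8, 0.2])
print(post, evidence)
```

With a flat prior the posterior is simply the normalised likelihood; an informative prior would pull it towards the hypothesis favoured beforehand.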
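Gaussian
Likelihood and Prior
$$p(y \mid \theta^{(1)}) = N\big(\theta^{(1)}, (\lambda^{(1)})^{-1}\big)$$
$$p(\theta^{(1)}) = N\big(\theta^{(2)}, (\lambda^{(2)})^{-1}\big)$$
Posterior
$$p(\theta^{(1)} \mid y) = N\big(m, p^{-1}\big)$$
$$p = \lambda^{(1)} + \lambda^{(2)}$$
$$m = \frac{\lambda^{(1)}}{p}\, y + \frac{\lambda^{(2)}}{p}\, \theta^{(2)}$$
Relative Precision Weighting: the posterior mean $m$ lies between the prior mean $\theta^{(2)}$ and the observed value $y$ (where the likelihood peaks as a function of $\theta^{(1)}$), each weighted by its precision relative to the posterior precision $p$.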
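Relative precision weighting can be sketched in a few lines. This assumes a single observation y with likelihood precision lam1 and a Gaussian prior with mean theta2 and precision lam2; the function name and the numbers are hypothetical.

```python
def gaussian_posterior(y, lam1, theta2, lam2):
    """Posterior mean and precision for a Gaussian likelihood and Gaussian prior."""
    p = lam1 + lam2                      # precisions add
    m = (lam1 * y + lam2 * theta2) / p   # precision-weighted average
    return m, p


# Hypothetical numbers: the data are four times as precise as the prior,
# so the posterior mean sits much closer to y than to the prior mean.
m, p = gaussian_posterior(y=2.0, lam1=4.0, theta2=0.0, lam2=1.0)
print(m, p)
```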
Bivariate Gaussian
Model Comparison
Select the model m with the highest probability given the data:
$$p(m \mid Y) = \frac{p(Y \mid m)\, p(m)}{p(Y)}$$
Model evidence (marginal likelihood):
$$p(Y \mid m) = \int p(Y \mid \theta_m, m)\, p(\theta_m \mid m)\, d\theta_m$$
The evidence balances accuracy against complexity.
Model comparison and Bayes factor:
$$B_{12} = \frac{p(Y \mid m_1)}{p(Y \mid m_2)}$$
B12          p(m1|Y) (%)    Evidence
1 to 3       50-75          Weak
3 to 20      75-95          Positive
20 to 150    95-99          Strong
>= 150       >= 99          Very strong
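The correspondence between Bayes factors and posterior model probabilities in the table can be checked with a short sketch. It assumes only two models are compared and, by default, that they have equal prior probability; the function name is hypothetical.

```python
def posterior_model_prob(b12, prior_m1=0.5):
    """p(m1 | Y) from the Bayes factor B12 when only m1 and m2 are compared."""
    prior_odds = prior_m1 / (1.0 - prior_m1)
    post_odds = b12 * prior_odds          # posterior odds = Bayes factor x prior odds
    return post_odds / (1.0 + post_odds)


for b12 in (1, 3, 20, 150):
    print(b12, round(100 * posterior_model_prob(b12), 1))
```

With equal priors, Bayes factors of 3, 20 and 150 map to roughly 75%, 95% and 99%, which are the band edges in the table.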
ANOVA
Four conditions
Model 0 (dotted lines): no effect
Model 1 (solid lines): an effect
Compare inferences on 100 data sets:
Classical: p=0.05
Bayesian: BF=3
Spatial Normalisation
The log posterior over the deformation parameters $\theta$ is
$$\log p(\theta \mid y) = \log p(y \mid \theta) + \log p(\theta) + \text{const}$$
Likelihood: mean square difference between template and source image
Prior: squared distance between parameters and their expected values
[Figure: Template, Max Likelihood, Max Posterior]
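The difference between the maximum likelihood and maximum posterior solutions can be illustrated with a toy problem. This is not SPM's normalisation algorithm: it assumes a single hypothetical scaling parameter theta that maps template values x onto source values y, a squared-difference likelihood term, and a squared-distance prior term around an expected value.

```python
def ml_and_map(x, y, noise_prec, prior_mean, prior_prec):
    """Estimate theta in y ~ theta * x by maximum likelihood and maximum posterior.

    ML  minimises  noise_prec/2 * sum((y - theta*x)^2)
    MAP minimises  the same term + prior_prec/2 * (theta - prior_mean)^2
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    theta_ml = sxy / sxx
    theta_map = (noise_prec * sxy + prior_prec * prior_mean) / (noise_prec * sxx + prior_prec)
    return theta_ml, theta_map


# Hypothetical data: the best-fitting scale is 1.5, the expected value is 1.0.
theta_ml, theta_map = ml_and_map(
    x=[1.0, 2.0, 3.0], y=[1.5, 3.0, 4.5],
    noise_prec=1.0, prior_mean=1.0, prior_prec=14.0,
)
print(theta_ml, theta_map)
```

The prior term pulls the MAP estimate away from the pure data fit and towards the expected parameter value; the stronger the prior precision, the stronger the pull.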
Segmentation
• Intensities are modelled by a mixture of K Gaussian distributions.
• Prior probability maps of class membership are overlaid to assist the segmentation.
• The prior probability of each voxel being of a particular type is derived from segmented images of 151 subjects.
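How a spatial prior combines with a Gaussian mixture at a single voxel can be sketched as follows. The class means, standard deviations and prior values are hypothetical, and this is only the per-voxel classification step, not a full segmentation routine.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def class_posterior(intensity, means, sds, priors):
    """Posterior probability of each of K classes at one voxel.

    priors[k] is the prior map value for class k at this voxel.
    """
    weighted = [pr * normal_pdf(intensity, mu, sd)
                for mu, sd, pr in zip(means, sds, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical K=3 mixture; intensity 75 lies midway between classes 2 and 3.
means, sds = [30.0, 60.0, 90.0], [10.0, 10.0, 10.0]
flat = class_posterior(75.0, means, sds, priors=[1 / 3, 1 / 3, 1 / 3])
informed = class_posterior(75.0, means, sds, priors=[0.1, 0.2, 0.7])
print(flat, informed)
```

When the intensity alone is ambiguous, as in this example, the prior map decides which class receives the higher posterior probability.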
fMRI stats
Even without applied spatial smoothing, activation maps
(and maps of e.g. AR coefficients) have spatial structure.
[Figure: maps of a contrast and of AR(1) coefficients]
• Definition of a spatial prior via a Gaussian Markov Random Field
• Automatic spatial regularisation of regression coefficients and AR coefficients
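Generative Model
General Linear Model with Auto-Regressive error terms (GLM-AR):
$Y = X\beta + E$, where $E$ is an AR(p) process
$$y_t = X_t \beta + \sum_{i=1}^{p} a_i e_{t-i} + z_t$$
where $z_t$ is the white-noise innovation at time $t$.
Prior on the regression coefficients:
$$p(\beta_k) = N\big(0, \alpha_k^{-1} D^{-1}\big)$$
The AR coefficients $a_p$ receive a prior of the same form: zero mean, with covariance $D^{-1}$ scaled by the inverse of a per-coefficient precision.
[Figure: graphical model in which $\alpha$ and $D$ govern $\beta$, a corresponding precision and $D$ govern the AR coefficients $A$, and $\beta$ and $A$ together generate $Y$]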
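The GLM-AR generative model can be sketched for a single voxel as below. The function names, the design matrix and the values are hypothetical; the innovations are passed in explicitly so that the example is deterministic, and errors before the first scan are assumed to be zero.

```python
def ar_errors(innovations, coeffs):
    """AR(p) errors: e_t = sum_i coeffs[i-1] * e_{t-i} + z_t (zero before t=0)."""
    errors = []
    for t, z in enumerate(innovations):
        past = sum(a * errors[t - i]
                   for i, a in enumerate(coeffs, start=1) if t - i >= 0)
        errors.append(past + z)
    return errors


def glm_ar(design, beta, innovations, coeffs):
    """Generate y_t = X_t . beta + e_t with AR(p) errors e_t."""
    errors = ar_errors(innovations, coeffs)
    return [sum(x * b for x, b in zip(row, beta)) + e
            for row, e in zip(design, errors)]


# Hypothetical example: constant term plus an on/off regressor, AR(1) errors.
design = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = glm_ar(design, beta=[10.0, 2.0],
           innovations=[1.0, 0.0, 0.0, 0.0], coeffs=[0.5])
print(y)
```

A single innovation at the first scan decays geometrically through the later scans, which is the temporal autocorrelation the AR terms are there to capture.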
Spatial prior
Over the regression coefficients:
$$p(\beta_k) = N\big(0, \alpha_k^{-1} D^{-1}\big)$$
• Zero mean: shrinkage prior
• $\alpha_k$: spatial precision, determines the amount of smoothness
• $D$: spatial kernel matrix
Gaussian Markov Random Field priors D
$$D = \begin{pmatrix} 1 & & d_{ij} \\ & \ddots & \\ d_{ji} & & 1 \end{pmatrix}$$
• 1 on diagonal elements $d_{ii}$
• $d_{ij} > 0$ if voxels i and j are neighbors
• 0 elsewhere
Same prior on the AR coefficients.
Convergence & Sensitivity
[Figure: Convergence, F against iteration number. ROC curve, sensitivity against 1-specificity, for the global prior, the spatial prior and smoothing.]
Event related fMRI: familiar versus unfamiliar faces
[Figure: maps obtained with smoothing, with the global prior and with the spatial prior]
Posterior Probability Maps
Posterior distribution: probability of getting an effect, given the data
$$p(\beta \mid y)$$
mean: size of effect
precision: variability
Posterior probability map: images of the probability or confidence
that an activation exceeds some specified threshold, given the data
$$p(\beta > \gamma \mid y) \ge \alpha$$
Two thresholds:
• activation threshold $\gamma$: percentage of whole brain mean signal
(physiologically relevant size of effect)
• probability $\alpha$ that voxels must exceed to be displayed (e.g. 95%)
Posterior Probability Maps
Posterior probability distribution $p(\beta \mid Y)$:
• Mean (Cbeta_*.img)
• Std dev (SDbeta_*.img)
Applying the activation threshold $\gamma$ and the probability $\alpha$, $p(\beta > \gamma \mid y) \ge \alpha$, gives:
• PPM (spmP_*.img)
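The thresholding behind a posterior probability map can be sketched for one voxel. This assumes the posterior over the effect is Gaussian with the given mean and standard deviation; the function name and the numbers are hypothetical and this is not SPM code.

```python
import math


def prob_exceeds(mean, sd, gamma):
    """p(beta > gamma | y) for a Gaussian posterior with the given mean and sd."""
    z = (gamma - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of the standard normal


# Hypothetical voxel: posterior mean 1.0, posterior sd 0.5.
alpha = 0.95                      # probability threshold for display
p_effect = prob_exceeds(mean=1.0, sd=0.5, gamma=0.0)
print(p_effect, p_effect >= alpha)
```

A voxel is displayed only when this probability reaches the chosen probability threshold, so both the size of the effect and its variability matter.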
SPM5 Interface
Hemodynamic basis sets
Fourier
Gamma
Informed
FIR models
[Figure: size of signal against time after event, 5s marked]
Inf2: Canonical + temporal deriv
SPM5: from spm_vb_roi_basis.m
Dynamic Causal Models
[Figure: three candidate models (m=1, m=2, m=3) connecting V1, V5 and SPC, with Photic, Motion and Attention inputs and estimated connection strengths, compared by Bayesian evidence and Bayes factors]
Summary
Bayes rule and model comparison
ANOVAs
Normalisation
Segmentation
fMRI stats
Hemodynamic models
Connectivity models (DCM)