11_DCM_Basics_FIL2011 - Wellcome Trust Centre for Neuroimaging

Dynamic Causal Modelling (DCM) for fMRI
Klaas Enno Stephan
Laboratory for Social & Neural Systems
Research (SNS)
University of Zurich
Wellcome Trust Centre for Neuroimaging
University College London
SPM Course, FIL
13 May 2011
Structural, functional & effective connectivity
• anatomical/structural connectivity
= presence of axonal connections
Sporns 2007, Scholarpedia
• functional connectivity
= statistical dependencies between regional time series
• effective connectivity
= directed influences between neurons or neuronal
populations
Some models of effective connectivity for fMRI data
• Structural Equation Modelling (SEM)
McIntosh et al. 1991, 1994; Büchel & Friston 1997; Bullmore et al. 2000
• regression models
(e.g. psycho-physiological interactions, PPIs)
Friston et al. 1997
• Volterra kernels
Friston & Büchel 2000
• Time series models (e.g. MAR/VAR, Granger causality)
Harrison et al. 2003, Goebel et al. 2003
• Dynamic Causal Modelling (DCM)
bilinear: Friston et al. 2003; nonlinear: Stephan et al. 2008
Dynamic causal modelling (DCM)
• DCM framework was introduced in 2003 for fMRI by Karl Friston, Lee Harrison
and Will Penny (NeuroImage 19:1273-1302)
• part of the SPM software package
• currently more than 160 published papers on DCM
Dynamic Causal Modeling (DCM)
Hemodynamic forward model: neural activity → BOLD
Electromagnetic forward model: neural activity → EEG, MEG, LFP
Neural state equation (u: inputs):
$$\frac{dx}{dt} = F(x, u, \theta)$$
fMRI: simple neuronal model, complicated forward model
EEG/MEG: complicated neuronal model, simple forward model
Example: a linear model of interacting visual regions
LG = lingual gyrus
FG = fusiform gyrus
Visual input in the left (LVF) and right (RVF) visual field.
[Figure: network of LG left, LG right, FG left and FG right with states x1 to x4 and visual inputs u1 and u2 (LVF, RVF)]
$$\begin{aligned}
\dot{x}_1 &= a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + c_{12} u_2 \\
\dot{x}_2 &= a_{21} x_1 + a_{22} x_2 + a_{24} x_4 + c_{21} u_1 \\
\dot{x}_3 &= a_{31} x_1 + a_{33} x_3 + a_{34} x_4 \\
\dot{x}_4 &= a_{42} x_2 + a_{43} x_3 + a_{44} x_4
\end{aligned}$$
In matrix form, the linear model of interacting visual regions reads
$$\dot{x} = Ax + Cu, \qquad \theta = \{A, C\}$$
$$\underbrace{\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix}}_{\text{state changes}}
=
\underbrace{\begin{bmatrix} a_{11} & a_{12} & a_{13} & 0 \\ a_{21} & a_{22} & 0 & a_{24} \\ a_{31} & 0 & a_{33} & a_{34} \\ 0 & a_{42} & a_{43} & a_{44} \end{bmatrix}}_{\text{effective connectivity}}
\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}}_{\text{system state}}
+
\underbrace{\begin{bmatrix} 0 & c_{12} \\ c_{21} & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}}_{\text{input parameters}}
\underbrace{\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}}_{\text{external inputs}}$$
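To make the matrix form concrete, here is a minimal sketch that integrates dx/dt = Ax + Cu with forward Euler. Only the sparsity pattern of A and C follows the equations above; the numerical values, the step size and the names `matvec` and `simulate_linear` are illustrative assumptions, not taken from the slides or from SPM.

```python
# Sparsity pattern as in the slide; the numbers are made up for illustration.
A = [
    [-1.0, 0.2, 0.3, 0.0],   # a11 a12 a13 0
    [0.2, -1.0, 0.0, 0.3],   # a21 a22 0   a24
    [0.4, 0.0, -1.0, 0.2],   # a31 0   a33 a34
    [0.0, 0.4, 0.2, -1.0],   # 0   a42 a43 a44
]
C = [
    [0.0, 1.0],  # x1 is driven by u2 (c12)
    [1.0, 0.0],  # x2 is driven by u1 (c21)
    [0.0, 0.0],
    [0.0, 0.0],
]


def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def simulate_linear(A, C, inputs, dt=0.01):
    """Forward-Euler integration of dx/dt = A x + C u, starting from x = 0.

    `inputs` holds one input vector u(t) per time step.
    """
    x = [0.0] * len(A)
    trajectory = [x]
    for u in inputs:
        dx = [ax + cu for ax, cu in zip(matvec(A, x), matvec(C, u))]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(x)
    return trajectory


# 2 s of input u2 only, followed by 2 s without input
inputs = [[0.0, 1.0]] * 200 + [[0.0, 0.0]] * 200
traj = simulate_linear(A, C, inputs)
```

In this sketch, activity induced in x1 by u2 spreads to the other regions through the off-diagonal entries of A and decays once the input is switched off, because the self-connections are negative.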
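Extension: bilinear model
$$\dot{x} = \Big(A + \sum_{j=1}^{m} u_j B^{(j)}\Big) x + Cu$$
[Figure: the four-region network (LG left, LG right, FG left, FG right; states x1 to x4) with inputs u1 and u2 (LVF, RVF) and an additional input u3 (CONTEXT)]
$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix}
=
\left(
\begin{bmatrix} a_{11} & a_{12} & a_{13} & 0 \\ a_{21} & a_{22} & 0 & a_{24} \\ a_{31} & 0 & a_{33} & a_{34} \\ 0 & a_{42} & a_{43} & a_{44} \end{bmatrix}
+ u_3
\begin{bmatrix} 0 & b_{12}^{(3)} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & b_{34}^{(3)} \\ 0 & 0 & 0 & 0 \end{bmatrix}
\right)
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
+
\begin{bmatrix} 0 & c_{12} & 0 \\ c_{21} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$$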
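The effect of the bilinear term can be sketched in a few lines: the coupling matrix seen by the system is A plus the input-weighted B matrices, so switching the context input u3 on changes the strength of the connections flagged in B^(3). All numerical values and the function names below are illustrative assumptions; only the positions of the non-zero entries follow the slide.

```python
def effective_connectivity(A, B, u):
    """Return A + sum_j u_j * B[j], the input-dependent coupling matrix."""
    n = len(A)
    return [
        [A[r][c] + sum(u_j * B_j[r][c] for u_j, B_j in zip(u, B)) for c in range(n)]
        for r in range(n)
    ]


def bilinear_dxdt(x, u, A, B, C):
    """Right-hand side of dx/dt = (A + sum_j u_j B^(j)) x + C u."""
    J = effective_connectivity(A, B, u)
    return [
        sum(j * xi for j, xi in zip(J_row, x)) + sum(c * ui for c, ui in zip(C_row, u))
        for J_row, C_row in zip(J, C)
    ]


zeros4 = [[0.0] * 4 for _ in range(4)]
A = [
    [-1.0, 0.2, 0.3, 0.0],
    [0.2, -1.0, 0.0, 0.3],
    [0.4, 0.0, -1.0, 0.2],
    [0.0, 0.4, 0.2, -1.0],
]
# Only the context input u3 modulates connections: b12 and b34 (values assumed).
B3 = [
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0],
]
B = [zeros4, zeros4, B3]
C = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

J_context_off = effective_connectivity(A, B, [0.0, 0.0, 0.0])
J_context_on = effective_connectivity(A, B, [0.0, 0.0, 1.0])
```

With the context off, the coupling matrix equals A; with the context on, only the two modulated connections change.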
[Figure: a driving input u1(t) and a modulatory input u2(t) act on the neuronal states x1(t), x2(t) and x3(t); integration of the neuronal states x and the hemodynamic model λ give the BOLD signal y for each region]
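Neural state equation
$$\dot{x} = \Big(A + \sum_j u_j B^{(j)}\Big) x + Cu$$
endogenous connectivity: $A = \dfrac{\partial \dot{x}}{\partial x}$
modulation of connectivity: $B^{(j)} = \dfrac{\partial}{\partial u_j} \dfrac{\partial \dot{x}}{\partial x}$
direct inputs: $C = \dfrac{\partial \dot{x}}{\partial u}$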
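Bilinear DCM
[Figure: a region receiving a driving input, and a connection subject to modulation]
Two-dimensional Taylor series (around x0=0, u0=0):
$$\frac{dx}{dt} = f(x, u) \approx f(x_0, 0) + \frac{\partial f}{\partial x} x + \frac{\partial f}{\partial u} u + \frac{\partial^2 f}{\partial x \partial u} ux + \dots$$
$$A = \left.\frac{\partial f}{\partial x}\right|_{u=0}, \qquad B = \frac{\partial^2 f}{\partial x \partial u}, \qquad C = \left.\frac{\partial f}{\partial u}\right|_{x=0}$$
Bilinear state equation:
$$\frac{dx}{dt} = \Big(A + \sum_{i=1}^{m} u_i B^{(i)}\Big) x + Cu$$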
DCM parameters = rate constants
Integration of a first-order linear differential equation gives an exponential function:
$$\frac{dx}{dt} = ax \quad\Rightarrow\quad x(t) = x_0 \exp(at)$$
Coupling parameter a is inversely proportional to the half-life τ of x(t):
$$x(\tau) = 0.5\,x_0 = x_0 \exp(a\tau) \quad\Rightarrow\quad a = -\ln 2 / \tau, \qquad \tau = -\ln 2 / a$$
The coupling parameter a thus describes the speed of the exponential change in x(t).
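A quick numerical check of the half-life relation, as a minimal sketch: the rate constant of -1 Hz is an arbitrary illustrative value and `half_life` is a helper name introduced here.

```python
import math


def half_life(a):
    """Half-life tau of x(t) = x0 * exp(a * t) for a negative rate constant a."""
    if a >= 0:
        raise ValueError("a must be negative for x(t) to decay")
    return -math.log(2) / a


a = -1.0                          # illustrative rate constant (Hz)
tau = half_life(a)                # ln(2) seconds, roughly 0.69 s
x0 = 1.0
x_tau = x0 * math.exp(a * tau)    # activity remaining after one half-life
```

Doubling the magnitude of a halves the half-life, which is the sense in which a coupling parameter is a rate constant.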
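Example: context-dependent decay
[Figure: stimuli u1 drive x1; context u2 modulates the self-connections of x1 and x2; time courses of u1, u2, x1 and x2]
$$\dot{x} = Ax + u_2 B^{(2)} x + Cu$$
$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix}
=
\begin{bmatrix} \sigma & a_{12} \\ a_{21} & \sigma \end{bmatrix} x
+ u_2 \begin{bmatrix} b_{11}^{(2)} & 0 \\ 0 & b_{22}^{(2)} \end{bmatrix} x
+ \begin{bmatrix} c_1 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$$
Penny et al. 2004, NeuroImage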
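The following sketch simulates this two-region example with forward Euler and compares the response to a brief stimulus with and without the context input. All parameter values are illustrative assumptions (in particular, negative b11 and b22 are chosen so that the context speeds up decay), and `simulate_decay` is a name introduced here.

```python
def simulate_decay(stimuli, context, sigma=-1.0, a12=0.0, a21=0.5,
                   b11=-1.0, b22=-1.0, c1=1.0, dt=0.01):
    """Forward-Euler simulation of dx/dt = A x + u2 B x + C u for two regions.

    `stimuli` is u1(t) and `context` is u2(t), one value per time step.
    """
    x1, x2 = 0.0, 0.0
    trajectory = []
    for u1, u2 in zip(stimuli, context):
        dx1 = (sigma + u2 * b11) * x1 + a12 * x2 + c1 * u1
        dx2 = a21 * x1 + (sigma + u2 * b22) * x2
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
        trajectory.append((x1, x2))
    return trajectory


# A 0.1 s stimulus followed by 1.9 s of free decay
stimuli = [1.0] * 10 + [0.0] * 190
without_context = simulate_decay(stimuli, [0.0] * 200)
with_context = simulate_decay(stimuli, [1.0] * 200)
```

Under these assumed values the self-connections change from -1 to -2 while the context is on, so the activity in both regions has decayed further by the end of the simulation.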
The problem of hemodynamic convolution
Goebel et al. 2003, Magn. Res. Med.
Hemodynamic forward models
are important for connectivity
analyses of fMRI data
Granger
causality
DCM
David et al. 2008, PLoS Biol.
The hemodynamic model in DCM
stimulus functions u(t)
neural state equation:
$$\frac{dx}{dt} = \Big(A + \sum_{j=1}^{m} u_j B^{(j)}\Big) x + Cu$$
hemodynamic state equations:
vasodilatory signal: $\dot{s} = x - \kappa s - \gamma (f - 1)$
flow induction (rCBF): $\dot{f} = s$
Balloon model:
changes in volume: $\tau \dot{v} = f - v^{1/\alpha}$
changes in dHb: $\tau \dot{q} = f\,E(f, E_0)/E_0 - v^{1/\alpha}\, q/v$
BOLD signal change equation:
$$\lambda(q, v) = \frac{\Delta S}{S_0} \approx V_0 \left[ k_1 (1 - q) + k_2 \left(1 - \frac{q}{v}\right) + k_3 (1 - v) \right]$$
$$k_1 = 4.3\,\vartheta_0 E_0 TE, \qquad k_2 = \varepsilon r_0 E_0 TE, \qquad k_3 = 1 - \varepsilon$$
[Figure: time courses (0 to 14 s) of s, f, v, q and q/v for the model variants RBM and CBM with ε = 0.5, 1 and 2]
Stephan et al. 2007, NeuroImage
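The hemodynamic state equations and the BOLD signal equation can be integrated together in a short sketch. The parameter values below are assumptions chosen for illustration (typical orders of magnitude, not the priors of any particular SPM version), the oxygen extraction function E(f, E0) = 1 - (1 - E0)^(1/f) is the standard form and is not spelled out on the slide, and forward Euler replaces the integration scheme used in practice.

```python
def simulate_bold(neural, dt=0.01, kappa=0.65, gamma=0.41, tau=0.98, alpha=0.32,
                  E0=0.34, V0=0.04, theta0=40.3, r0=25.0, TE=0.04, eps=1.0):
    """Integrate the hemodynamic states for a neural time series x(t).

    Returns the BOLD signal change, one value per time step.
    All default parameter values are illustrative assumptions.
    """
    k1 = 4.3 * theta0 * E0 * TE
    k2 = eps * r0 * E0 * TE
    k3 = 1.0 - eps
    s, f, v, q = 0.0, 1.0, 1.0, 1.0        # resting state
    bold = []
    for x in neural:
        extraction = 1.0 - (1.0 - E0) ** (1.0 / f)    # E(f, E0), assumed form
        outflow = v ** (1.0 / alpha)
        ds = x - kappa * s - gamma * (f - 1.0)        # vasodilatory signal
        df = s                                        # flow induction
        dv = (f - outflow) / tau                      # changes in volume
        dq = (f * extraction / E0 - outflow * q / v) / tau   # changes in dHb
        s, f, v, q = s + dt * ds, f + dt * df, v + dt * dv, q + dt * dq
        bold.append(V0 * (k1 * (1.0 - q) + k2 * (1.0 - q / v) + k3 * (1.0 - v)))
    return bold


# 1 s of neural activity followed by 29 s of rest
bold = simulate_bold([1.0] * 100 + [0.0] * 2900)
peak = max(bold)
```

Without neural input the states stay at rest (s = 0, f = v = q = 1) and the predicted signal change is zero; a brief burst of activity produces a delayed and dispersed BOLD response that returns to baseline.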
How interdependent are neural and hemodynamic
parameter estimates?
[Figure: matrix of correlations among 40 parameter estimates, grouped into A, B, C, h and ε; colour scale from -1 to 1]
Stephan et al. 2007, NeuroImage
DCM is a Bayesian approach
new data: $p(y \mid \theta)$
prior knowledge: $p(\theta)$
$$p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)$$
posterior ∝ likelihood ∙ prior
Bayes theorem allows one to formally
incorporate prior knowledge into
computing statistical probabilities.
In DCM:
empirical, principled & shrinkage priors.
The “posterior” probability of the
parameters given the data is an
optimal combination of prior knowledge
and new data, weighted by their
relative precision.
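The precision weighting described above can be illustrated for the simplest case, a single parameter with a Gaussian prior and a Gaussian likelihood. This is a textbook conjugate update, not the variational scheme that DCM actually uses; the numbers and the function name are illustrative assumptions.

```python
def gaussian_posterior(prior_mean, prior_var, data_mean, data_var):
    """Combine a Gaussian prior and a Gaussian likelihood for one parameter.

    Precisions (inverse variances) add; the posterior mean is the
    precision-weighted average of prior mean and data mean.
    """
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / data_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * data_mean)
    return post_mean, post_var


# Shrinkage prior centred on zero; the data suggest a value of 0.8
post_mean, post_var = gaussian_posterior(prior_mean=0.0, prior_var=1.0,
                                         data_mean=0.8, data_var=0.25)
```

Here the data are four times as precise as the prior, so the posterior mean (0.64) lies much closer to the data than to the prior mean, and the posterior variance (0.2) is smaller than either input variance.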
Overview: parameter estimation
• Combining the neural and hemodynamic states gives the complete forward model.
• An observation model includes measurement error e and confounds X (e.g. drift).
• Bayesian inversion: parameter estimation by means of variational EM under Laplace approximation
• Result: Gaussian a posteriori parameter distributions, characterised by mean ηθ|y and covariance Cθ|y.
stimulus function u
neural state equation: $\dot{x} = \big(A + \sum_j u_j B^{(j)}\big) x + Cu$
activity-dependent vasodilatory signal: $\dot{s} = x - \kappa s - \gamma (f - 1)$
flow induction (rCBF): $\dot{f} = s$
changes in volume: $\tau \dot{v} = f - v^{1/\alpha}$
changes in dHb: $\tau \dot{q} = f\,E(f, \rho)/\rho - v^{1/\alpha}\, q/v$
hidden states: $z = \{x, s, f, v, q\}$
state equation: $\dot{z} = F(x, u, \theta)$
parameters:
$$\theta^h = \{\kappa, \gamma, \tau, \alpha, \rho\}, \qquad \theta^n = \{A, B^{(1)} \dots B^{(m)}, C\}, \qquad \theta = \{\theta^h, \theta^n\}$$
modelled BOLD response: $y = \lambda(x)$
observation model: $y = h(u, \theta) + X\beta + e$
Inference about DCM parameters:
Bayesian single-subject analysis
• Gaussian assumptions about the posterior distributions of the
parameters
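• posterior probability that a certain parameter (or contrast of parameters $c^T \eta_{\theta|y}$) is above a chosen threshold γ:
$$p = \phi_N\!\left( \frac{c^T \eta_{\theta|y} - \gamma}{\sqrt{c^T C_{\theta|y}\, c}} \right)$$
where $\phi_N$ is the cumulative distribution function of the standard normal distribution.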
• By default, γ is chosen as zero ("does the effect exist?").
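The posterior probability above is straightforward to compute once the posterior mean and covariance are available. The sketch below uses only the standard library; the posterior mean, covariance and contrast are made-up numbers, and `posterior_prob_above` is a helper name introduced here, not an SPM function.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def posterior_prob_above(c, mean, cov, gamma=0.0):
    """P(c^T theta > gamma | y) for a Gaussian posterior N(mean, cov)."""
    n = len(c)
    contrast_mean = sum(ci * mi for ci, mi in zip(c, mean))
    contrast_var = sum(c[i] * cov[i][j] * c[j] for i in range(n) for j in range(n))
    return normal_cdf((contrast_mean - gamma) / math.sqrt(contrast_var))


# Hypothetical posterior over two coupling parameters
mean = [0.5, 0.1]
cov = [[0.04, 0.0],
       [0.0, 0.04]]

p_first = posterior_prob_above([1.0, 0.0], mean, cov)    # is parameter 1 > 0?
p_diff = posterior_prob_above([1.0, -1.0], mean, cov)    # is parameter 1 > parameter 2?
```

The contrast vector c selects a single parameter or a difference between parameters, and the covariance term accounts for the uncertainty of both.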
Bayesian single subject inference
[Figure: model of LG left, LG right, FG left and FG right with RVF and LVF stimulus inputs and modulation by LD, LD|LVF and LD|RVF. Parameter estimates shown in the figure: 0.13 ± 0.19, 0.34 ± 0.14, 0.44 ± 0.14, 0.29 ± 0.14, 0.01 ± 0.17, -0.08 ± 0.16]
Contrast:
Modulation LG right → LG left by LD|LVF
vs.
modulation LG left → LG right by LD|RVF
$p(c^T \eta_{\theta|y} > 0 \mid y) = 98.7\%$
Stephan et al. 2005, Ann. N.Y. Acad. Sci.
Inference about DCM parameters:
Bayesian parameter averaging (FFX group analysis)
Likelihood distributions from different subjects are independent
→ one can use the posterior from one subject as the prior for the next:
$$\begin{aligned}
p(\theta \mid y_1, \dots, y_N) &\propto p(y_1, \dots, y_N \mid \theta)\, p(\theta) \\
&= p(\theta) \prod_{i=1}^{N} p(y_i \mid \theta) \\
&\propto p(\theta \mid y_1) \prod_{i=2}^{N} p(y_i \mid \theta) \\
&\propto p(\theta \mid y_1, y_2) \prod_{i=3}^{N} p(y_i \mid \theta) \\
&\;\;\vdots \\
&\propto p(\theta \mid y_1, \dots, y_{N-1})\, p(y_N \mid \theta)
\end{aligned}$$
“Today’s posterior is tomorrow’s prior”
Under Gaussian assumptions this is easy to compute.
Group posterior covariance, from the individual posterior covariances:
$$C^{-1}_{\theta|y_1,\dots,y_N} = \sum_{i=1}^{N} C^{-1}_{\theta|y_i}$$
Group posterior mean, from the individual posterior covariances and means:
$$\eta_{\theta|y_1,\dots,y_N} = C_{\theta|y_1,\dots,y_N} \left( \sum_{i=1}^{N} C^{-1}_{\theta|y_i}\, \eta_{\theta|y_i} \right)$$
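The two Gaussian formulas can be sketched as follows. To stay within the standard library, this version assumes diagonal posterior covariances, so each parameter is averaged independently; with full covariance matrices the same formulas apply with matrix inverses. The subject-level numbers and the function name `bpa_diagonal` are illustrative assumptions.

```python
def bpa_diagonal(means, variances):
    """Bayesian parameter averaging under diagonal posterior covariances.

    means[i][k] and variances[i][k] are the posterior mean and variance
    of parameter k in subject i. Returns group means and group variances.
    """
    n_params = len(means[0])
    group_mean, group_var = [], []
    for k in range(n_params):
        precisions = [1.0 / v[k] for v in variances]
        total_precision = sum(precisions)        # group precision = sum of precisions
        group_var.append(1.0 / total_precision)
        weighted = sum(p * m[k] for p, m in zip(precisions, means))
        group_mean.append(weighted / total_precision)
    return group_mean, group_var


# Three hypothetical subjects, one parameter each
subject_means = [[0.2], [0.4], [0.9]]
subject_vars = [[0.01], [0.01], [0.04]]
group_mean, group_var = bpa_diagonal(subject_means, subject_vars)
```

The third subject has the largest estimate but also the largest posterior variance, so it contributes least to the group mean; this precision weighting is what distinguishes this fixed-effects average from a simple arithmetic mean.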
Inference about DCM parameters:
RFX group analysis (frequentist)
• In analogy to “random effects” analyses in SPM, 2nd level analyses
can be applied to DCM parameters:
Separate fitting of identical models
for each subject
Selection of (bilinear) parameters
of interest
one-sample t-test:
parameter > 0 ?
paired t-test:
parameter 1 >
parameter 2 ?
rmANOVA:
e.g. in case of multiple
sessions per subject
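For the one-sample case, the second-level test takes one parameter estimate per subject and asks whether its mean differs from zero. The sketch below computes the t statistic with the standard library only; the subject-wise estimates are invented for illustration, and the p-value would be obtained from the t distribution with n - 1 degrees of freedom (for example with `scipy.stats.ttest_1samp`, which is not used here).

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """t statistic and degrees of freedom for H0: mean(values) == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)               # sample SD (n - 1 denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical estimates of one modulatory (bilinear) parameter in five subjects
estimates = [0.1, 0.3, 0.2, 0.4, 0.0]
t_value, dof = one_sample_t(estimates)
```

Unlike Bayesian parameter averaging, this test ignores the subject-wise posterior variances and treats the posterior means as summary statistics.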
definition of model space
→ inference on model structure or inference on model parameters?
• inference on model structure: inference on individual models or model space partition?
– individual models: optimal model structure assumed to be identical across subjects?
yes → FFX BMS
no → RFX BMS
– model space partition: comparison of model families using FFX or RFX BMS
• inference on model parameters: inference on parameters of an optimal model or parameters of all models?
– parameters of an optimal model: optimal model structure assumed to be identical across subjects?
yes → FFX BMS → FFX analysis of parameter estimates (e.g. BPA)
no → RFX BMS → RFX analysis of parameter estimates (e.g. t-test, ANOVA)
– parameters of all models: BMA
Stephan et al. 2010, NeuroImage
What type of design is good for DCM?
Any design that is good for a GLM of fMRI data.
GLM vs. DCM
DCM tries to model the same phenomena (i.e. local BOLD responses) as a
GLM, just in a different way (via connectivity and its modulation).
No activation detected by a GLM
→ no motivation to include this region in a deterministic DCM.
However, a stochastic DCM could be applied despite the absence of a local
activation.
Stephan 2004, J. Anat.
Multifactorial design: explaining interactions with DCM
[Figure: 2 x 2 design crossing a stimulus factor (Stim 1, Stim 2) with a task factor (Task A, Task B), giving the conditions Stim 1/Task A, Stim 2/Task A, Stim 1/Task B and Stim 2/Task B; responses of regions X1 and X2 shown for TA/S1, TB/S1, TA/S2 and TB/S2]
Let’s assume that an SPM analysis shows a main effect of stimulus in X1 and a stimulus × task interaction in X2.
How do we model this using DCM?
[Figure: GLM vs. DCM; in the DCM, the inputs Stim1, Stim2, Task A and Task B act on a network of X1 and X2]
Simulated data
[Figure: DCM of X1 and X2 with inputs Stimulus 1, Stimulus 2, Task A and Task B, effect strengths marked qualitatively (–, +, +++); simulated responses of X1 and X2 in the four conditions, plus added noise (SNR=1)]
Stephan et al. 2007, J. Biosci.
DCM10 in SPM8
• DCM10 was released as part of SPM8 in July 2010 (version 4010).
• Introduced many new features, incl. two-state DCMs and stochastic DCMs
• This led to various changes in model defaults, e.g.
– inputs mean-centred
– changes in coupling priors
– self-connections: separately estimated for each area
• For details, see:
www.fil.ion.ucl.ac.uk/spm/software/spm8/SPM8_Release_Notes_r4010.pdf
• Further changes in version 4290 (released April 2011) to accommodate new
developments and give users more choice (e.g. whether or not to mean-centre inputs).
The evolution of DCM in SPM
• DCM is not one specific model, but a framework for Bayesian inversion of
dynamic system models
• The default implementation in SPM is evolving over time
– better numerical routines for inversion
– change in priors to cover new variants (e.g., stochastic DCMs,
endogenous DCMs etc.)
To enable replication of your results, you should ideally state
which SPM version you are using when publishing papers.
Factorial structure of model specification in DCM10
• Three dimensions of model specification:
– bilinear vs. nonlinear
– single-state vs. two-state (per region)
– deterministic vs. stochastic
• Specification via GUI.
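bilinear DCM vs. non-linear DCM
[Figure: bilinear DCM and non-linear DCM, each with a driving input and a modulation of a connection]
Two-dimensional Taylor series (around x0=0, u0=0):
$$\frac{dx}{dt} = f(x, u) \approx f(x_0, 0) + \frac{\partial f}{\partial x} x + \frac{\partial f}{\partial u} u + \frac{\partial^2 f}{\partial x \partial u} ux + \frac{\partial^2 f}{\partial x^2} \frac{x^2}{2} + \dots$$
Bilinear state equation:
$$\frac{dx}{dt} = \Big(A + \sum_{i=1}^{m} u_i B^{(i)}\Big) x + Cu$$
Nonlinear state equation:
$$\frac{dx}{dt} = \Big(A + \sum_{i=1}^{m} u_i B^{(i)} + \sum_{j=1}^{n} x_j D^{(j)}\Big) x + Cu$$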
Nonlinear dynamic causal model (DCM)
$$\frac{dx}{dt} = \Big(A + \sum_{i=1}^{m} u_i B^{(i)} + \sum_{j=1}^{n} x_j D^{(j)}\Big) x + Cu$$
[Figure: simulation over 100 time units with panels for the inputs u1 and u2, the neural population activity of x1, x2 and x3, and the fMRI signal change (%)]
Stephan et al. 2008, NeuroImage
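A minimal sketch of the nonlinear state equation: activity in a third region gates the connection from region 1 to region 2 through a D matrix. The three-region layout, all numerical values and the function names are illustrative assumptions and do not reproduce the simulation shown in the slides.

```python
def nonlinear_dxdt(x, u, A, B, D, C):
    """Right-hand side of dx/dt = (A + sum_i u_i B^(i) + sum_j x_j D^(j)) x + C u."""
    n = len(x)
    J = [row[:] for row in A]
    for u_i, B_i in zip(u, B):          # modulation by external inputs
        for r in range(n):
            for c in range(n):
                J[r][c] += u_i * B_i[r][c]
    for x_j, D_j in zip(x, D):          # modulation by neural activity
        for r in range(n):
            for c in range(n):
                J[r][c] += x_j * D_j[r][c]
    return [
        sum(J[r][c] * x[c] for c in range(n)) + sum(C[r][k] * u[k] for k in range(len(u)))
        for r in range(n)
    ]


def simulate(u, A, B, D, C, seconds=10.0, dt=0.01):
    """Forward-Euler integration under constant input u; returns the final state."""
    x = [0.0] * len(A)
    for _ in range(int(round(seconds / dt))):
        dx = nonlinear_dxdt(x, u, A, B, D, C)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


zeros3 = [[0.0] * 3 for _ in range(3)]
A = [[-1.0, 0.0, 0.0],
     [0.3, -1.0, 0.0],
     [0.0, 0.0, -1.0]]
B = [zeros3, zeros3]                    # no input-dependent modulation here
D3 = [[0.0, 0.0, 0.0],
      [1.0, 0.0, 0.0],                  # x3 gates the connection x1 -> x2
      [0.0, 0.0, 0.0]]
D = [zeros3, zeros3, D3]
C = [[1.0, 0.0],                        # u1 drives x1
     [0.0, 0.0],
     [0.0, 1.0]]                        # u2 drives x3

x_gate_off = simulate([1.0, 0.0], A, B, D, C)
x_gate_on = simulate([1.0, 1.0], A, B, D, C)
```

With these assumed values, region 2 settles near 0.3 when the gating region is silent and near 1.3 when it is active, although the input to region 1 is the same in both runs.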
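[Figure: nonlinear DCM with regions V1, V5 and PPC and inputs stim, motion and attention, annotated with connection strength estimates; posterior density of a parameter estimate with MAP = 1.25, plotted over the range -2 to 5]
$$p(D^{PPC}_{V5,V1} > 0 \mid y) = 99.1\%$$
[Figure: observed and fitted responses in V1, V5 and PPC for the conditions "motion & attention", "motion & no attention" and "static dots"]
Stephan et al. 2008, NeuroImage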
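Two-state DCM
Single-state DCM (one state $x_i$ per region, input u):
$$\dot{x} = \mathcal{J} x + Cu, \qquad \mathcal{J}_{ij} = A_{ij} + u B_{ij}$$
$$\mathcal{J} = \begin{bmatrix} \mathcal{J}_{11} & \cdots & \mathcal{J}_{1N} \\ \vdots & \ddots & \vdots \\ \mathcal{J}_{N1} & \cdots & \mathcal{J}_{NN} \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix}$$
Two-state DCM (an excitatory state $x_i^E$ and an inhibitory state $x_i^I$ per region):
$$\dot{x} = \mathcal{J} x + Cu, \qquad \mathcal{J}_{ij} = \mu_{ij} \exp(A_{ij} + u B_{ij})$$
$$\mathcal{J} = \begin{bmatrix}
\mathcal{J}_{11}^{EE} & \mathcal{J}_{11}^{EI} & \cdots & \mathcal{J}_{1N}^{EE} & 0 \\
\mathcal{J}_{11}^{IE} & \mathcal{J}_{11}^{II} & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\mathcal{J}_{N1}^{EE} & 0 & \cdots & \mathcal{J}_{NN}^{EE} & \mathcal{J}_{NN}^{EI} \\
0 & 0 & \cdots & \mathcal{J}_{NN}^{IE} & \mathcal{J}_{NN}^{II}
\end{bmatrix}, \qquad
x = \begin{bmatrix} x_1^E \\ x_1^I \\ \vdots \\ x_N^E \\ x_N^I \end{bmatrix}$$
Extrinsic (between-region) coupling: the off-diagonal EE entries.
Intrinsic (within-region) coupling: the EE, EI, IE and II entries of each region.
Marreiros et al. 2008, NeuroImage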
Stochastic DCM
$$\frac{dx}{dt} = f(x, u, \theta) + \omega$$
• accounts for stochastic neural fluctuations
• can be fitted to resting state data
• ω has unknown precision and smoothness
→ additional hyperparameters
[Figure: time courses over 0 to 1200 seconds with panels "hidden states - neuronal" (excitatory signal), "hidden states - hemodynamic" (flow, volume, dHb) and "predicted BOLD signal" (observed vs. predicted)]
Friston et al. (2008, 2011) NeuroImage
Daunizeau et al. (2009) Physica D
Li et al. (2011) NeuroImage
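The role of the fluctuation term can be sketched with an Euler-Maruyama simulation of a single region without external input, as in resting state data. This is a strong simplification: the noise is assumed to be white with a known standard deviation, whereas the slide stresses that its precision and smoothness are unknown and must be estimated. The parameter values and the function name are illustrative assumptions.

```python
import math
import random


def simulate_stochastic(a=-0.5, noise_sd=0.1, dt=0.1, n_steps=1000, seed=0):
    """Euler-Maruyama simulation of dx/dt = a * x + omega with white noise omega.

    Returns the neuronal state at each time step. No external input is used.
    """
    rng = random.Random(seed)
    x = 0.0
    states = []
    for _ in range(n_steps):
        # deterministic flow plus a random increment scaled by sqrt(dt)
        x += dt * a * x + noise_sd * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        states.append(x)
    return states


fluctuating = simulate_stochastic()
silent = simulate_stochastic(noise_sd=0.0)
```

Without the fluctuation term the state never leaves zero, because there is no input to drive it; with the term, the region shows ongoing endogenous activity that a deterministic DCM could not explain.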
Thank you