Hierarchical Dynamic Models

Will Penny

26th May 2011

Outline: OU Processes · Embedding · OU(2) process · Dynamic Models · Generalised coordinates · Joint Likelihood · Filtering · Mode Following · Dynamic Expectation Maximisation · Generative Model · State Equation · Observation Equation · Energies and Actions · Linear Convolution Model · Generated Data · Triple Estimation · Hierarchical Dynamic Models · References
AR Processes
Discrete time autoregressive processes of order p can be described by

x_t = Σ_{k=1}^{p} a_k x_{t−k} + b_0 z_t
They can also be written as

Σ_{k=0}^{p} a_k x_{t−k} = b_0 z_t
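As a concrete illustration, the following minimal Python sketch (not from the lecture; the AR(2) coefficients are arbitrary examples) simulates the first form above:

import numpy as np

def simulate_ar(a, b0, T, seed=0):
    # x_t = sum_{k=1}^p a_k x_{t-k} + b0 z_t, with z_t ~ N(0, 1)
    rng = np.random.default_rng(seed)
    p = len(a)
    x = np.zeros(T)
    for t in range(p, T):
        x[t] = np.dot(a, x[t - p:t][::-1]) + b0 * rng.standard_normal()
    return x

x = simulate_ar(a=[1.5, -0.9], b0=1.0, T=500)  # a stable AR(2) example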
OU Processes
One can define the analogous continuous time process as

Σ_{k=0}^{p} a_k x_t^(k) = b_0 z_t

where x_t^(k) denotes the kth order derivative of x_t.
This is referred to as an OU(p) process (Gardiner, 1983).
For an OU(1) process we have
a_0 x_t + a_1 ẋ_t = b_0 z_t
We can rewrite this as

ẋ_t = −(a_0/a_1) x_t + (b_0/a_1) z_t
    = −a x_t + b z_t
In last week's lecture we wrote this as

dx_t = −a x_t dt + b dW_t
where W_t is a Wiener process.
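A minimal Euler-Maruyama sketch of this OU(1) SDE, with illustrative values of a and b:

import numpy as np

def simulate_ou1(a, b, dt, T, x0=0.0, seed=0):
    # Euler-Maruyama for dx = -a x dt + b dW, with dW ~ N(0, dt)
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    x = np.zeros(n)
    x[0] = x0
    for t in range(1, n):
        x[t] = x[t-1] - a * x[t-1] * dt + b * np.sqrt(dt) * rng.standard_normal()
    return x

x = simulate_ou1(a=1.0, b=0.5, dt=0.01, T=20.0)  # illustrative parameters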
Embedding
A pth order differential equation describing x can be expressed as an ordinary differential equation comprising the p variables in u, which is defined via an embedding. Consider, for example, an OU(4) process:

a_0 x_t + a_1 ẋ_t + a_2 ẍ_t + a_3 x_t^(3) + a_4 x_t^(4) = b_0 z_t

We can divide through by a_4 and redefine coefficients, giving
d(x_t)/dt     = ẋ_t
d(ẋ_t)/dt     = ẍ_t
d(ẍ_t)/dt     = x_t^(3)
d(x_t^(3))/dt = −a_0 x_t − a_1 ẋ_t − a_2 ẍ_t − a_3 x_t^(3) + b_0 z_t
If we define u_t = [x_t, ẋ_t, ẍ_t, x_t^(3)]^T we can express the process in vector form
u̇_t = A u_t + D u_t + B z_t
where
A = [  0    0    0    0
       0    0    0    0
       0    0    0    0
     −a_0 −a_1 −a_2 −a_3 ]

D = [ 0 1 0 0
      0 0 1 0
      0 0 0 1
      0 0 0 0 ]

B = [ 0 0 0 0
      0 0 0 0
      0 0 0 0
      0 0 0 b_0 ]

where D is a derivative operator.
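As an illustration, a sketch constructing A, D and B for this embedding; the coefficients passed in are arbitrary examples:

import numpy as np

def ou_embedding_matrices(a, b0):
    # a = [a_0, ..., a_{p-1}], after dividing through by the leading coefficient
    p = len(a)
    A = np.zeros((p, p)); A[-1, :] = -np.asarray(a)  # feedback in the last row
    D = np.diag(np.ones(p - 1), k=1)                 # derivative (shift) operator
    B = np.zeros((p, p)); B[-1, -1] = b0             # noise drives the top derivative
    return A, D, B

A, D, B = ou_embedding_matrices(a=[1.0, 2.0, 3.0, 4.0], b0=1.0)  # arbitrary a_k
F = A + D  # as used on the next slide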
OU(2) process
[Figure] Single realisation of an OU(2) process with parameters a_0 = 100, a_1 = 2, b_0 = 1.
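A sketch that could produce such a realisation, integrating the embedded system u̇ = Fu + Bz by Euler-Maruyama with a scalar Wiener increment driving the highest derivative (step size chosen arbitrarily):

import numpy as np

# OU(2): a_0 x + a_1 x' + x'' = b_0 z, embedded as u = [x, x']
a0, a1, b0 = 100.0, 2.0, 1.0
F = np.array([[0.0, 1.0], [-a0, -a1]])
B = np.array([0.0, b0])

dt, T = 1e-3, 10.0
rng = np.random.default_rng(0)
u = np.zeros((int(T / dt), 2))
for t in range(1, len(u)):
    dW = np.sqrt(dt) * rng.standard_normal()
    u[t] = u[t-1] + F @ u[t-1] * dt + B * dW

x = u[:, 0]  # oscillates at roughly sqrt(a0 - (a1/2)**2)/(2*pi) ~ 1.6 Hz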
OU process
Defining F = A + D we can write the OU process as
u̇_t = F u_t + B z_t
This is equivalently written
du_t = F u_t dt + B dw_t
This has a closed form solution

u_t = exp(Ft) u_0 + ∫_0^t exp(F(t − t′)) B dw_{t′}
This follows from the result in the last lecture (stochastic processes), applying Itô's formula and integrating.
Covariance functions
Williams (2006) (see also Gardiner 1983, section 4.4.6) shows that, if a_1² < 4a_0, then the covariance function for an OU(2) process is given by

C(τ) = (b_0² / (2 a_0 a_1)) exp(−ατ) (cos[βτ] + (α/β) sin[βτ])

where α = a_1/2 and α² + β² = a_0. For the parameters used to generate the sample path in the previous figure, this corresponds to a frequency of f = β/(2π) = 1.6 Hz, which seems about right.
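A quick numerical check of the quoted frequency, using the parameters of the sample path above:

import numpy as np

a0, a1, b0 = 100.0, 2.0, 1.0           # parameters of the sample path
alpha = a1 / 2.0
beta = np.sqrt(a0 - alpha**2)          # valid since a1**2 < 4*a0

def C(tau):
    # OU(2) covariance function (Williams, 2006)
    return (b0**2 / (2*a0*a1)) * np.exp(-alpha*tau) * \
           (np.cos(beta*tau) + (alpha/beta)*np.sin(beta*tau))

f = beta / (2*np.pi)                   # ~1.58 Hz, matching the stated 1.6 Hz
C0 = C(0.0)                            # stationary variance b0**2/(2*a0*a1)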
Similar results exist for OU(p) processes (Gardiner,
1983).
Resting state MEG
Hindriks et al. (2011) have fitted such an equation to
resting state MEG data (showing alpha rhythm
dynamics).
By matching the observed and model-based covariance functions they estimated the parameters to be a_0 = 65.2², a_1 = 3.2, b_0 = 224.3. This corresponds to a dominant frequency of f = 10.4 Hz.
Dynamic models
We now add an observation equation, implying that the dynamic processes of interest are not directly observed. Friston et al. (2008) developed a triple estimation DEM framework for estimating hidden states, parameters and hyperparameters. We first focus on filtering.

y = g(x) + z
ẋ = f(x) + w

Here y is the data to be modelled and x is a hidden state. The second equation embodies a dynamical prior. The noise terms z and w are stochastic innovations such that the covariance of the vector [z, ż, z̈, ...] is well defined. Similarly for w.
In Friston et al.'s DEM formulation the noise processes are smooth Gaussian processes with non-zero correlation times. They are not Wiener processes.
Other Approaches
Before getting into the DEM formalism we note that the
majority of work in filtering focuses on discrete time
models
y_t = g(x_t) + z_t
x_t = f(x_{t−1}) + w_t
Estimation of states then takes place using discrete time Bayesian filtering. Methods include (extended) Kalman filtering and particle filtering; there is a very large literature on this, see e.g. Bishop (2006). Daunizeau et al. (2009) have developed a triple estimation discrete time framework using variational methods. For continuous time formulations (i.e. inference on SDEs) other than DEM see e.g. Archambeau (2011) and Calderhead et al. (2009).
Generalised coordinates
The hallmark of the DEM approach is its use of
generalised coordinates.
Under local linearity assumptions the state equation can be repeatedly differentiated with respect to time to give

ẋ = f(x) + w
ẍ = f_x ẋ + ẇ
x^(3) = f_x ẍ + ẅ
...
where f_x denotes the derivative df/dx.
Instead, however, we write

x′ = f(x) + w
x″ = f_x x′ + ẇ
x‴ = f_x x″ + ẅ
...
where x̃ = [x, x′, x″, ...] are the 'generalised coordinates' (GCs) of x.

Filtering will take place in the space of GCs, and we will estimate x̃. Because we are estimating the states it is unlikely that x′ = ẋ exactly. However, this will be the case when tracking is good.
The embedded state transition can be written as f̃ = [f, f′, f″, ...] where

f = f(v, x, θ)
f′ = f_x x′
f″ = f_x x″
...
and we write

w̃ = [w, ẇ, ẅ, ...]

Hence

D x̃ = f̃ + w̃

where D is the derivative operator.
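A minimal sketch, for a scalar state with embedding order p, of D as a shift matrix and of the locally linear prediction f̃; the flow f(x) = −0.5x and the generalised state values are hypothetical examples:

import numpy as np

def D_operator(p):
    # D shifts generalised coordinates: D [x, x', x'', ...] = [x', x'', ..., 0]
    return np.diag(np.ones(p - 1), k=1)

def f_tilde(x_tilde, f, fx):
    # f~ = [f(x), fx x', fx x'', ...] under local linearity (scalar state)
    out = fx * x_tilde
    out[0] = f(x_tilde[0])
    return out

p = 4
D = D_operator(p)
x_tilde = np.array([0.5, 0.1, 0.0, 0.0])              # hypothetical x~
ft = f_tilde(x_tilde, f=lambda x: -0.5 * x, fx=-0.5)  # linear example flow
flow_error = D @ x_tilde - ft                         # D x~ - f~ (= w~ in the model)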
Observation equation
Similarly the observation equation can be repeatedly differentiated to give

y = g(x) + z
ẏ = g_x x′ + ż
ÿ = g_x x″ + z̈
...

where g_x denotes the derivative dg/dx.
The embedded predicted response can be written as g̃ = [g, g′, g″, ...] where

g = g(v)
g′ = g_x x′
g″ = g_x x″
...
and we also write
ỹ = [y, ẏ, ÿ, ...]
z̃ = [z, ż, z̈, ...]
This means we can write the observation equation in generalised coordinates as

ỹ = g̃ + z̃
Compact expression
Hence, for locally linear systems
ỹ = g̃ + z̃
D x̃ = f̃ + w̃
Joint Likelihood
The joint log likelihood of observations and states is

L(x̃, t) = log[p(ỹ|x̃) p(x̃)]
Given Gaussian densities

p(ỹ|x̃) = N(ỹ; g̃, (Π̃^z)^−1)
p(x̃) = N(D x̃; f̃, (Π̃^w)^−1)

where Π̃^z and Π̃^w are the precision matrices for the observations and states (more on this later). As a function of x̃ we therefore have

L(x̃, t) = −(1/2) ẽ_y^T Π̃^z ẽ_y − (1/2) ẽ_x^T Π̃^w ẽ_x
where the observation and flow error terms are

ẽ_y = ỹ − g̃
ẽ_x = D x̃ − f̃
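A direct transcription of these two quadratic terms; the precision matrices and generalised vectors below are illustrative placeholders:

import numpy as np

def joint_loglik(y_tilde, g_tilde, Dx_tilde, f_tilde, Pz, Pw):
    # L = -1/2 e_y' Pz e_y - 1/2 e_x' Pw e_x (up to additive constants)
    e_y = y_tilde - g_tilde   # observation error
    e_x = Dx_tilde - f_tilde  # flow error
    return -0.5 * e_y @ Pz @ e_y - 0.5 * e_x @ Pw @ e_x

n = 3  # hypothetical embedding dimension
L = joint_loglik(np.ones(n), np.zeros(n), np.ones(n), np.zeros(n),
                 Pz=np.eye(n), Pw=np.eye(n))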
Filtering
Filtering refers to the estimation of the hidden states x̃.
Hidden states can be estimated by following the gradient
of L(x̃, t) with respect to x̃
j(x̃) = dL(x̃, t)/dx̃
     = (dL_1(x̃, t)/dẽ_y)(dẽ_y/dx̃) + (dL_2(x̃, t)/dẽ_x)(dẽ_x/dx̃)
     = (I_p ⊗ g_x)^T Π̃^z (ỹ − g̃) + (I_p ⊗ f_x − D)^T Π̃^w (D x̃ − f̃)
where p is the embedding dimension. The hidden states
could then be updated by following this gradient
x̃˙ = j(x̃)
This optimisation would be sufficient for a stationary
system where L(x̃, t) = L(x̃).
Mode Following
However, because the likelihood itself is changing we
must add another term
x̃˙ = j(x̃) + D x̃
We can see that this makes sense because at the
maximum likelihood value (or the ‘mode’) the gradient is
zero, j(x̃) = 0. We then have
x̃˙ = D x̃
x̃ = [x, x′, x″, ...]
D x̃ = [x′, x″, x‴, ...]
x̃˙ = [ẋ, ẋ′, ẋ″, ...]
So at the mode, the generalised coordinates are equal to the time derivatives, e.g. x′ = ẋ.
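A sketch of this mode-following update for a scalar linear model (f(x) = f_x x, g(x) = g_x x), in which the Kronecker products reduce to f_x I and g_x I; the step size, precisions and data are illustrative:

import numpy as np

p, fx, gx, dt = 3, -0.5, 1.0, 0.01           # illustrative choices
D = np.diag(np.ones(p - 1), k=1)             # derivative operator
Pz, Pw = np.eye(p), np.eye(p)                # illustrative precisions

def filter_step(x_t, y_t):
    e_y = y_t - gx * x_t                     # observation error
    e_x = D @ x_t - fx * x_t                 # flow error
    j = gx * (Pz @ e_y) + (fx * np.eye(p) - D).T @ (Pw @ e_x)
    return x_t + dt * (j + D @ x_t)          # gradient ascent plus mode following

x_tilde = np.zeros(p)
y_tilde = np.array([1.0, 0.0, 0.0])          # hypothetical generalised data
for _ in range(100):
    x_tilde = filter_step(x_tilde, y_tilde)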
Dynamic Model
We consider dynamic models of the form

ẋ = f(x, v, θ) + w
y = g(x, v, θ) + z
where x is a hidden state, v is a hidden cause, θ are
model parameters, and y are observed time series.
We have now made explicit the dependence on
parameters θ. The state and observation noise also
depend on hyperparameters λ.
The DEM framework provides a method for estimating
states, causes, parameters and hyperparameters.
Generative Model

[Figure: the generative model]
Causes and Parameters
There is a Gaussian prior over the hidden causes
p(ṽ) = N(ṽ; η^v, C^v)
where η^v is the mean and C^v is the covariance.

The prior over the parameters is
p(θ) = N(θ; η^θ, C^θ)

where η^θ is the mean and C^θ is the covariance.
State Equation
The distribution governing the evolution of the states is
p(x̃|ṽ, θ, λ) = N(D x̃; f̃, Σ̃^w)
The state noise covariance Σ̃^w is parameterised by a hyperparameter, λ^w, and a parameter γ that governs the smoothness of the Gaussian innovations (Friston et al., 2008).
Previously we worked with the state noise precision Π̃^w; the covariance is its inverse, Σ̃^w = (Π̃^w)^−1.
Observation Equation
The distribution governing the observations is
p(ỹ|x̃, ṽ, θ, λ) = N(ỹ; g̃, Σ̃^z)
The observation noise covariance Σ̃^z is the inverse of the observation noise precision Π̃^z.
It is parameterised by a hyperparameter, λ^z, and a parameter γ that governs the smoothness of the Gaussian innovations.
Hyperparameters
For the hyperparameters λ = {λ^w, λ^z} we have a prior

p(λ) = N(λ; η^λ, C^λ)
Joint Likelihood and Posterior
The joint log-likelihood of the data, hidden states, causes, parameters and hyperparameters is

L(x̃, ṽ, t, θ, λ) = log[p(ỹ|x̃) p(x̃|ṽ) p(ṽ) p(θ) p(λ)]
We assume an approximate posterior that factorises over
states, parameters and hyperparameters
q(x̃, ṽ, t, θ, λ) = q(x̃, ṽ, t) q(θ) q(λ)
Energies and Actions
The variational energies are

I(x̃, ṽ, t) = ∫∫ L(x̃, ṽ, t, θ, λ) q(θ) q(λ) dθ dλ

I(θ) = ∫∫∫∫ L(x̃, ṽ, t, θ, λ) q(x̃, ṽ, t) q(λ) dx̃ dṽ dλ dt

I(λ) = ∫∫∫∫ L(x̃, ṽ, t, θ, λ) q(x̃, ṽ, t) q(θ) dx̃ dṽ dθ dt
The latter two are known as 'actions', as the integral is also over time.
Dynamic Expectation Maximisation
The Dynamic Expectation Maximisation (DEM) algorithm
implements a triple estimation of states (D-step),
parameters (E-step) and hyperparameters (M-step).
- The D-step updates q(x̃, ṽ, t) so as to maximise the variational energy I(x̃, ṽ, t).
- The E-step updates q(θ) so as to maximise the variational action I(θ).
- The M-step updates q(λ) so as to maximise the variational action I(λ).

(A schematic sketch of this loop is given below.)
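The sketch below shows the overall loop, with hypothetical placeholder routines standing in for the actual variational updates (this is not SPM's implementation):

# Schematic DEM loop; the three routines are hypothetical placeholders.
def d_step(q_u, q_theta, q_lam, data):
    return q_u       # would integrate the mode-following filter through time

def e_step(q_u, q_theta, q_lam, data):
    return q_theta   # would ascend the variational action I(theta)

def m_step(q_u, q_theta, q_lam, data):
    return q_lam     # would ascend the variational action I(lambda)

def dem(data, q_u, q_theta, q_lam, n_iter=8):
    for _ in range(n_iter):
        q_u = d_step(q_u, q_theta, q_lam, data)      # states and causes
        q_theta = e_step(q_u, q_theta, q_lam, data)  # parameters
        q_lam = m_step(q_u, q_theta, q_lam, data)    # hyperparameters
    return q_u, q_theta, q_lam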
D-Step
The state estimates are updated (approximately) using
x̃˙ = j(x̃) + D x̃
Previously (when just filtering) the gradient was of the joint log likelihood

j(x̃) = dL(x̃, t)/dx̃
But for the D-step we use the gradient of the variational energy

j(x̃) = dI(x̃, t)/dx̃
Linear Convolution model
We generate data from a single input, multiple output linear dynamical model with input

v = exp(−(1/4)(t − 12)²)
State variables were then generated according to the equations

ẋ_1 = −0.25 x_1 + x_2 + v + w_1
ẋ_2 = −0.50 x_1 − 0.25 x_2 + w_2
Generated Data
Observations were then created from the state variables
y_1 = 0.125 x_1 + 0.1633 x_2 + z_1
y_2 = 0.125 x_1 + 0.0676 x_2 + z_2
y_3 = 0.125 x_1 − 0.0676 x_2 + z_3
y_4 = 0.125 x_1 − 0.1633 x_2 + z_4
This conforms to a dynamical model
ẋ = f(x, v, θ) + w
y = g(x, v, θ) + z
where
f(x, v) = Fx + hv
g(x, v) = Gx
The parameters θ = {F, h, G} are given by

F = [ −0.25   1
      −0.50  −0.25 ]

h = [ 1
      0 ]

G = [ 0.125   0.1633
      0.125   0.0676
      0.125  −0.0676
      0.125  −0.1633 ]
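A minimal sketch generating data from this model; the slide does not give the integration step, series length or noise levels, so the values below are arbitrary:

import numpy as np

F = np.array([[-0.25, 1.0], [-0.50, -0.25]])
h = np.array([1.0, 0.0])
G = np.array([[0.125, 0.1633],
              [0.125, 0.0676],
              [0.125, -0.0676],
              [0.125, -0.1633]])

dt, n = 1.0, 32                      # assumed step and length
rng = np.random.default_rng(0)
x = np.zeros(2)
Y = np.zeros((n, 4))
for t in range(n):
    v = np.exp(-0.25 * (t - 12)**2)  # Gaussian bump input
    x = x + dt * (F @ x + h * v) + 0.01 * rng.standard_normal(2)  # crude step; arbitrary noise
    Y[t] = G @ x + 0.01 * rng.standard_normal(4)                  # arbitrary noise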
Filtering
We now run the DEM algorithm with (infinitely) tight priors on the parameters and hyperparameters. Effectively, DEM then just implements the D-step, i.e. filtering.
The figure shows the true (solid) and estimated (dotted) causes, v. The implementation is in 'DEM_demo_convolution.m' from the DEM toolbox of SPM.
The figure shows the true (solid) and estimated (dotted)
hidden variables, x.
Importantly, DEM also provides an approximate posterior density over the hidden states, q(x̃, t), not just the posterior mean (dotted line).
Extended Kalman Filtering
State estimation is more accurate using DEM as the innovations can be smooth Gaussian processes. With the (Extended) Kalman Filter (EKF) the Gaussian innovations are assumed IID (i.e. rough) (Bishop, 2006).
Additionally, DEM can invert models with hidden causes and
hierarchical structure (see later).
Triple Estimation
The DEM algorithm was then run with uninformative priors over
two of the parameters.
These were (1) the parameter coupling the first hidden state to
the second (true value −0.5) and (2) the parameter coupling
the first hidden state to the output (true value 0.125).
State Posterior
Posterior state distribution, q(x̃, t), from DEM filtering (left), where the parameters are assumed known, versus DEM triple estimation (right), where the parameters and hyperparameters are estimated.
The gray tubes mark 90 percent confidence intervals.
Cause Posterior
Posterior cause distribution, q(ṽ, t), from DEM filtering (left), where the parameters are assumed known, versus DEM triple estimation (right), where the parameters and hyperparameters are estimated.
The gray tubes mark 90 percent confidence intervals.
Hierarchical Dynamic Models
We now make the causes dependent on higher order
dynamics and causes. This continues in hierarchical
fashion up to the nth level cause
y = g(x_1, v_1, θ_1) + z_1
v_1 = g(x_2, v_2, θ_2) + z_2
...
v_{n−1} = g(x_{n−1}, v_n, θ_{n−1}) + z_{n−1}
These hierarchical relations embody what might be termed structural priors. The generative models are identical to those in the hierarchy lecture (number 4), with the addition of hidden variables x_i that allow the model to remember the causes v_i of previous data y.
The hidden variables evolve dynamically and are also constrained by higher level causes (and parameters at the same level).

ẋ_1 = f(x_1, v_1, θ_1)
ẋ_2 = f(x_2, v_2, θ_2)
...
ẋ_{n−1} = f(x_{n−1}, v_n, θ_{n−1})
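As an illustration, a two-level sketch in which the output of level 2 becomes the cause for level 1; all functions, coefficients and noise levels are arbitrary choices, not from the lecture:

import numpy as np

dt, n = 0.1, 200
rng = np.random.default_rng(0)
x1, x2 = 0.0, 0.0
y = np.zeros(n)
for t in range(n):
    v2 = np.sin(0.1 * t)                      # top-level cause (assumed input)
    x2 += dt * (-0.5 * x2 + v2)               # level-2 hidden dynamics
    v1 = x2 + 0.01 * rng.standard_normal()    # level-2 output = level-1 cause
    x1 += dt * (-0.5 * x1 + v1)               # level-1 hidden dynamics
    y[t] = x1 + 0.01 * rng.standard_normal()  # observation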
Speech Perception
Kiebel et al. (2009) proposed that speech perception might correspond to Bayesian inference in dynamical systems comprising a hierarchy of stable heteroclinic channels (SHCs).
References
C. Archambeau (2011) Approximate inference for continuous-time Markov processes. In D. Barber, A. T. Cemgil and S. Chiappa (eds), Inference and Learning in Dynamic Models. Cambridge University Press.

C. Bishop (2006) Pattern Recognition and Machine Learning. Springer.

B. Calderhead et al. (2009) Accelerating Bayesian inference over nonlinear differential equations with Gaussian processes. Advances in Neural Information Processing Systems, Vol. 21, 217-224. MIT Press.

J. Daunizeau et al. (2009) Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models. Physica D: Nonlinear Phenomena, 238:2089-2118.

K. Friston et al. (2008) DEM: A variational treatment of dynamic systems. Neuroimage, 41:849-885.

K. Friston (2008) Variational filtering in generalised coordinates of motion. Conference on Approximate Inference in Stochastic Processes and Dynamic Systems. http://videolectures.net.

C. Gardiner (1983) Handbook of Stochastic Methods. Springer-Verlag.

R. Hindriks et al. (2011) Dynamics underlying spontaneous human alpha oscillations: a data-driven approach. Neuroimage, in press.

S. Kiebel et al. (2009) Recognizing sequences of sequences. PLoS Computational Biology, e1000464.

C. Williams (2006) A tutorial introduction to stochastic differential equations. Invited talk at the NIPS workshop on Dynamical Systems, Stochastic Processes and Bayesian Inference, Whistler, BC, Canada. http://videolectures.net.