
State-Space Models
Mei-Yuan Chen
Department of Finance
National Chung Hsing University
February 25, 2013
State-Space Models
State-space models are widely applied in econometrics to deal
with dynamic time series models involving unobserved
variables. Unobserved variables are common in economic
theory, for example, permanent income, expectations, the ex
ante real rate of interest, and the reservation wage. The basic
tool used to deal with the standard state-space model is the
Kalman filter, which is a recursive procedure for computing
the estimator of the unobservable component or the state
vector, based on available information at time t.
Time-Varying-Parameter Models and the Kalman Filter
Consider the following regression model:
y_t = x_t′ β_t + e_t,   t = 1, . . . , T,                      (1)
β_t = µ̃ + F β_{t−1} + v_t,                                    (2)
e_t ∼ i.i.d. N(0, R),                                          (3)
v_t ∼ i.i.d. N(0, Q),                                          (4)
where x_t′ is a 1 × k vector of exogenous or predetermined variables, e_t and v_t are independent, F is k × k, and Q is also k × k.
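
For concreteness, the following short sketch simulates data from the model (1)–(4). The dimensions, parameter values, and the function name simulate_tvp are illustrative assumptions, not part of the original notes; the simulated (y, x) pair is reused in the later filter and likelihood sketches.

```python
import numpy as np

def simulate_tvp(T=200, k=2, seed=0):
    """Simulate y_t = x_t' beta_t + e_t with beta_t = mu + F beta_{t-1} + v_t.
    All parameter values below are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(k)                      # mu_tilde = 0 for simplicity
    F = 0.9 * np.eye(k)                   # stationary transition matrix
    Q = 0.1 * np.eye(k)                   # var(v_t)
    R = 1.0                               # var(e_t)
    x = rng.normal(size=(T, k))           # exogenous regressors
    beta = np.zeros((T, k))
    y = np.zeros(T)
    b = np.zeros(k)                       # beta_0
    for t in range(T):
        b = mu + F @ b + rng.multivariate_normal(np.zeros(k), Q)
        beta[t] = b
        y[t] = x[t] @ b + np.sqrt(R) * rng.normal()
    return y, x, beta

y, x, beta = simulate_tvp()
```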
Assume, for simplicity, µ̃ = 0. Then
β_t = F β_{t−1} + v_t
    = F(F β_{t−2} + v_{t−1}) + v_t
    = F^2(F β_{t−3} + v_{t−2}) + v_t + F v_{t−1}
      ⋮
    = F^{t−2}(F β_1 + v_2) + v_t + F v_{t−1} + ⋯ + F^{t−3} v_3
    = F^{t−1} β_1 + v_t + F v_{t−1} + ⋯ + F^{t−3} v_3 + F^{t−2} v_2.
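
This closed form can be checked numerically; the values of F, β_1 and the shocks below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
k, t = 2, 6
F = rng.normal(scale=0.5, size=(k, k))
beta1 = rng.normal(size=k)
v = {s: rng.normal(size=k) for s in range(2, t + 1)}   # v_2, ..., v_t

# recursion: beta_s = F beta_{s-1} + v_s
beta_t = beta1.copy()
for s in range(2, t + 1):
    beta_t = F @ beta_t + v[s]

# closed form: beta_t = F^{t-1} beta_1 + sum_{j=0}^{t-2} F^j v_{t-j}
closed = np.linalg.matrix_power(F, t - 1) @ beta1
closed += sum(np.linalg.matrix_power(F, j) @ v[t - j] for j in range(t - 1))

print(np.allclose(beta_t, closed))   # True
```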
Using the above equation, β_1, β_2, . . . , β_{t−1} can be solved as
β_1     = F^{−(t−1)}[β_t − (F^{t−2} v_2 + F^{t−3} v_3 + ⋯ + F v_{t−1} + v_t)]
        = F^{−t+1} β_t − (F^{−1} v_2 + F^{−2} v_3 + ⋯ + F^{−t+2} v_{t−1} + F^{−t+1} v_t),
β_2     = F^{−(t−2)}[β_t − (F^{t−3} v_3 + F^{t−4} v_4 + ⋯ + F v_{t−1} + v_t)]
        = F^{−t+2} β_t − (F^{−1} v_3 + F^{−2} v_4 + ⋯ + F^{−t+3} v_{t−1} + F^{−t+2} v_t),
          ⋮
β_{t−1} = F^{−1} β_t − F^{−1} v_t,
β_t     = β_t.
Therefore, the above equations can be written in matrix form as
\[
\begin{bmatrix} β_1 \\ β_2 \\ \vdots \\ β_{t-1} \\ β_t \end{bmatrix}
=
\begin{bmatrix} F^{-t+1} β_t \\ F^{-t+2} β_t \\ \vdots \\ F^{-1} β_t \\ β_t \end{bmatrix}
-
\begin{bmatrix}
F^{-1} v_2 + F^{-2} v_3 + \cdots + F^{-t+2} v_{t-1} + F^{-t+1} v_t \\
F^{-1} v_3 + F^{-2} v_4 + \cdots + F^{-t+3} v_{t-1} + F^{-t+2} v_t \\
\vdots \\
F^{-1} v_t \\
0
\end{bmatrix}
=
\begin{bmatrix} F^{-t+1} β_t \\ F^{-t+2} β_t \\ \vdots \\ F^{-1} β_t \\ β_t \end{bmatrix}
-
\begin{bmatrix}
F^{-1}  & F^{-2}  & \cdots & F^{-t+1} \\
0       & F^{-1}  & \cdots & F^{-t+2} \\
\vdots  & \vdots  & \ddots & \vdots   \\
0       & 0       & \cdots & F^{-1}   \\
0       & 0       & \cdots & 0
\end{bmatrix}
\begin{bmatrix} v_2 \\ v_3 \\ \vdots \\ v_{t-1} \\ v_t \end{bmatrix}.
\]
Then equation (1) can be written as
\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{t-1} \\ y_t \end{bmatrix}
=
\begin{bmatrix} x_1′ F^{-t+1} \\ x_2′ F^{-t+2} \\ \vdots \\ x_{t-1}′ F^{-1} \\ x_t′ \end{bmatrix} β_t
-
\begin{bmatrix}
x_1′ F^{-1} & x_1′ F^{-2} & \cdots & x_1′ F^{-t+1}   \\
0           & x_2′ F^{-1} & \cdots & x_2′ F^{-t+2}   \\
\vdots      & \vdots      & \ddots & \vdots          \\
0           & 0           & \cdots & x_{t-1}′ F^{-1} \\
0           & 0           & \cdots & 0
\end{bmatrix}
\begin{bmatrix} v_2 \\ v_3 \\ \vdots \\ v_{t-1} \\ v_t \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_{t-1} \\ e_t \end{bmatrix},
\]
or, compactly,
ỹ_t = X̃_t β_t + ε̃_t.                                             (5)
The covariance matrix of the stacked error is
E[ε̃_t ε̃_t′] = A_t (I_{t−1} ⊗ Q) A_t′ + σ^2 I_t = Ω_t,
where
\[
A_t =
\begin{bmatrix}
x_1′ F^{-1} & x_1′ F^{-2} & \cdots & x_1′ F^{-t+1}   \\
0           & x_2′ F^{-1} & \cdots & x_2′ F^{-t+2}   \\
\vdots      & \vdots      & \ddots & \vdots          \\
0           & 0           & \cdots & x_{t-1}′ F^{-1} \\
0           & 0           & \cdots & 0
\end{bmatrix},
\]
and
\[
\mathrm{var}\begin{bmatrix} v_2 \\ v_3 \\ \vdots \\ v_{t-1} \\ v_t \end{bmatrix}
=
\begin{bmatrix}
\mathrm{var}(v_2)      & \mathrm{cov}(v_2, v_3) & \cdots & \mathrm{cov}(v_2, v_t) \\
\mathrm{cov}(v_3, v_2) & \mathrm{var}(v_3)      & \cdots & \mathrm{cov}(v_3, v_t) \\
\vdots                 & \vdots                 & \ddots & \vdots                 \\
\mathrm{cov}(v_t, v_2) & \mathrm{cov}(v_t, v_3) & \cdots & \mathrm{var}(v_t)
\end{bmatrix}
=
\begin{bmatrix}
\mathrm{var}(v_2) & 0                 & \cdots & 0                 \\
0                 & \mathrm{var}(v_3) & \cdots & 0                 \\
\vdots            & \vdots            & \ddots & \vdots            \\
0                 & 0                 & \cdots & \mathrm{var}(v_t)
\end{bmatrix}
= I_{t-1} \otimes Q.                                              (6)
\]
Since the e's and v's are independent, the variance of the stacked error ε̃_t is
\[
\mathrm{var}(ε̃_t)
= \mathrm{var}\left(
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_{t-1} \\ e_t \end{bmatrix}
- A_t \begin{bmatrix} v_2 \\ v_3 \\ \vdots \\ v_{t-1} \\ v_t \end{bmatrix}
\right)
= \mathrm{var}\begin{bmatrix} e_1 \\ \vdots \\ e_t \end{bmatrix}
+ A_t\, \mathrm{var}\begin{bmatrix} v_2 \\ \vdots \\ v_t \end{bmatrix} A_t′
= σ^2 I_t + A_t (I_{t-1} \otimes Q) A_t′ = Ω_t,
\]
using (6).
Therefore, for t = k + 1, . . . , T, (5) can be estimated by GLS:
β_{t|t} = (X̃_t′ Ω_t^{−1} X̃_t)^{−1} X̃_t′ Ω_t^{−1} ỹ_t,
where β_{t|t} refers to an estimate of β_t conditional on information up to time t. As t grows, however, Ω_t is a t × t matrix. The Kalman filter approach can be implemented without having to invert such a large matrix.
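
To see what this GLS estimator involves, here is a brute-force sketch for a small t; the function name gls_tvp and the use of a scalar sigma2 = R = var(e_t) are my own illustrative choices. It builds X̃_t, A_t and Ω_t explicitly, which is precisely the growing t × t inversion that the Kalman filter avoids.

```python
import numpy as np

def gls_tvp(y, x, F, Q, sigma2):
    """Brute-force GLS estimate of beta_t from y_1, ..., y_t (illustrative sketch).
    y: (t,) observations, x: (t, k) regressors, F, Q: k x k, sigma2: var(e_t)."""
    t, k = x.shape
    Finv = np.linalg.inv(F)
    # X_tilde: row s (time s+1) is x_{s+1}' F^{-(t-s-1)}; the last row is x_t'
    Xt = np.vstack([x[s] @ np.linalg.matrix_power(Finv, t - 1 - s) for s in range(t)])
    # A_t: block row s, block column j (for v_{j+2}, j = 0, ..., t-2) is x_{s+1}' F^{-(j+1-s)}
    A = np.zeros((t, k * (t - 1)))
    for s in range(t - 1):
        for j in range(s, t - 1):
            A[s, j * k:(j + 1) * k] = x[s] @ np.linalg.matrix_power(Finv, j + 1 - s)
    Omega = sigma2 * np.eye(t) + A @ np.kron(np.eye(t - 1), Q) @ A.T
    Oi = np.linalg.inv(Omega)
    return np.linalg.solve(Xt.T @ Oi @ Xt, Xt.T @ Oi @ y)
```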
Kalman Filter
The Kalman filter is a recursive procedure for computing the
optimal estimate of the unobserved-state vector
β t , t = 1, 2, . . . , T , based on the appropriate information set,
assuming that µ̃, F, R and Q are known. It provides a minimum mean squared error estimate of β_t given the appropriate information set. Assuming that x_t is available at the beginning of time t and a new observation of y_t is made at the end of time t, the Kalman filter consists of two steps: prediction and updating.
Prediction: At the beginning of time t, we may want to form an optimal predictor of y_t based on all the available information up to time t − 1, y_{t|t−1}. That is, y_{t|t−1} = x_t′ β_{t|t−1}. To do this, calculation of β_{t|t−1} is needed.
Updating: Once y_t is realized at the end of time t, the prediction error can be calculated: η_{t|t−1} = y_t − y_{t|t−1}. This prediction error contains new information about β_t beyond that contained in β_{t|t−1}. Thus, after observing y_t, a more accurate inference can be made of β_t. β_{t|t}, an inference of β_t based on information up to time t, may be of the following form: β_{t|t} = β_{t|t−1} + K_t η_{t|t−1}, where K_t is the weight assigned to the new information about β_t contained in the prediction error.
To be more specific, the basic filter is described by the
following six equations:
Prediction Steps:
1. β t|t−1 : expectation (estimate) of β t conditional on
information up to time t − 1. By (2):
β t = µ̃ + F β t−1 + v t , we have
β_{t|t−1} = E[β_t | F_{t−1}]
          = E[µ̃ + F β_{t−1} + v_t | F_{t−1}]
          = E[µ̃ | F_{t−1}] + F E[β_{t−1} | F_{t−1}] + E[v_t | F_{t−1}]
          = µ̃ + F β_{t−1|t−1}.                                  (10)
2. P_{t|t−1}: covariance matrix of β_t conditional on information up to time t − 1;
P_{t|t−1} = E[(β_t − β_{t|t−1})(β_t − β_{t|t−1})′ | F_{t−1}]
          = E[(µ̃ + F β_{t−1} + v_t − β_{t|t−1})(µ̃ + F β_{t−1} + v_t − β_{t|t−1})′ | F_{t−1}]
          = E[(F β_{t−1} + v_t − F β_{t−1|t−1})(F β_{t−1} + v_t − F β_{t−1|t−1})′ | F_{t−1}]
          = E[(F(β_{t−1} − β_{t−1|t−1}))(F(β_{t−1} − β_{t−1|t−1}))′ | F_{t−1}] + E[v_t v_t′ | F_{t−1}]
            − E[(F(β_{t−1} − β_{t−1|t−1})) v_t′ | F_{t−1}] − E[v_t (F(β_{t−1} − β_{t−1|t−1}))′ | F_{t−1}]
          = F E[(β_{t−1} − β_{t−1|t−1})(β_{t−1} − β_{t−1|t−1})′ | F_{t−1}] F′ + Q − 0 − 0
          = F P_{t−1|t−1} F′ + Q.                                (11)
3. η_{t|t−1}: prediction error;
η_{t|t−1} = y_t − y_{t|t−1} = y_t − x_t′ β_{t|t−1}.              (12)
4. f_{t|t−1}: conditional variance of the prediction error;
f_{t|t−1} = E[η²_{t|t−1}] = var[η_{t|t−1}]
          = E[(y_t − x_t′ β_{t|t−1})²]
          = E[(x_t′ β_t + e_t − x_t′ β_{t|t−1})²]
          = E[(x_t′(β_t − β_{t|t−1}) + e_t)²]
          = E[x_t′(β_t − β_{t|t−1})(β_t − β_{t|t−1})′ x_t] + E[e_t²]
          = x_t′ E[(β_t − β_{t|t−1})(β_t − β_{t|t−1})′] x_t + R
          = x_t′ P_{t|t−1} x_t + R.                              (13)
Updating Steps: Let Z_1 and Z_2, conditional on F_{t−1}, be jointly normally distributed,
\[
\begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} \Big| F_{t-1}
\sim N\left( \begin{bmatrix} µ_1 \\ µ_2 \end{bmatrix},
\begin{bmatrix} Σ_{11} & Σ_{12} \\ Σ_{21} & Σ_{22} \end{bmatrix} \right).
\]
Then,
Z_1 | Z_2, F_{t−1} ∼ N(µ_{1|2}, Σ_{11|2}),
where
µ_{1|2} = µ_1 + Σ_{12} Σ_{22}^{−1} (Z_2 − µ_2),
Σ_{11|2} = Σ_{11} − Σ_{12} Σ_{22}^{−1} Σ_{21}.
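
A quick numerical illustration of these two conditioning formulas, with purely illustrative moments:

```python
import numpy as np

# Joint normal (Z1, Z2) with illustrative moments
mu1, mu2 = np.array([1.0, 0.5]), np.array([2.0])
S11 = np.array([[1.0, 0.3], [0.3, 2.0]])
S12 = np.array([[0.4], [0.1]])
S22 = np.array([[1.5]])

z2 = np.array([2.8])                                   # an observed value of Z2
mu_1_2 = mu1 + S12 @ np.linalg.solve(S22, z2 - mu2)    # conditional mean mu_{1|2}
S_11_2 = S11 - S12 @ np.linalg.solve(S22, S12.T)       # conditional variance Sigma_{11|2}
print(mu_1_2, S_11_2)
```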
Thus, let Z_1 = β_t and Z_2 = η_{t|t−1}. Then
µ_1    = E[Z_1 | F_{t−1}] = E[β_t | F_{t−1}] = β_{t|t−1},
Σ_{11} = var(Z_1 | F_{t−1}) = var(β_t | F_{t−1}) = P_{t|t−1},
Σ_{22} = var(Z_2 | F_{t−1}) = var(η_{t|t−1} | F_{t−1}) = f_{t|t−1},
Σ_{12} = cov(Z_1, Z_2 | F_{t−1}) = cov(β_t, η_{t|t−1} | F_{t−1})
       = E[β_t (y_t − x_t′ β_{t|t−1}) | F_{t−1}]
       = E[β_t (x_t′ β_t + e_t − x_t′ β_{t|t−1}) | F_{t−1}]
       = E[β_t x_t′(β_t − β_{t|t−1}) + β_t e_t | F_{t−1}]
       = E[β_t (β_t − β_{t|t−1})′ | F_{t−1}] x_t + E[β_t e_t | F_{t−1}]
       = E[(β_t − β_{t|t−1})(β_t − β_{t|t−1})′ | F_{t−1}] x_t + E[β_{t|t−1}(β_t − β_{t|t−1})′ | F_{t−1}] x_t
       = E[(β_t − β_{t|t−1})(β_t − β_{t|t−1})′ | F_{t−1}] x_t
       = P_{t|t−1} x_t,
since E[β_t e_t | F_{t−1}] = 0 and β_{t|t−1} E[(β_t − β_{t|t−1})′ | F_{t−1}] = 0.
Therefore, since conditioning on {η_{t|t−1}, F_{t−1}} is equivalent to conditioning on F_t, Z_1 | Z_2, F_{t−1} = β_t | η_{t|t−1}, F_{t−1} = β_t | F_t, and
β_{t|t} = E[β_t | F_t] = µ_{1|2} = µ_1 + Σ_{12} Σ_{22}^{−1} (Z_2 − µ_2)
        = β_{t|t−1} + P_{t|t−1} x_t f_{t|t−1}^{−1} (η_{t|t−1} − 0)   (as E[η_{t|t−1} | F_{t−1}] = 0)
        = β_{t|t−1} + P_{t|t−1} x_t f_{t|t−1}^{−1} η_{t|t−1}                        (14)
        = β_{t|t−1} + K_t η_{t|t−1},                                               (15)
where K_t = P_{t|t−1} x_t f_{t|t−1}^{−1} is called the Kalman gain. Besides,
P_{t|t} = var(β_t | F_t) = var(Z_1 | Z_2, F_{t−1}) = Σ_{11} − Σ_{12} Σ_{22}^{−1} Σ_{21}
        = P_{t|t−1} − P_{t|t−1} x_t f_{t|t−1}^{−1} x_t′ P_{t|t−1}
        = P_{t|t−1} − K_t x_t′ P_{t|t−1}.                                          (16)
Given the initial values β_{0|0} and P_{0|0}, the six equations in the basic filter can be iterated for t = 1, 2, . . . , T. A flowchart of the iterations is presented below.
β 0|0 , P 0|0 , l(θ) = 0
⇓
β 1|0 = µ̃ + F β 0|0
P 1|0 = F P 0|0 F ′ + Q
η1|0 = y1 − y1|0 = y1 − x′1 β 1|0
f1|0 = x′1 P 1|0 x1 + R
⇓
K_1 = P_{1|0} x_1 f_{1|0}^{−1}
β 1|1 = β 1|0 + K 1 η1|0
P 1|1 = P 1|0 − K 1 x′1 P 1|0
⇓
β 2|1 = µ̃ + F β1|1
P 2|1 = F P 1|1 F ′ + Q
η2|1 = y2 − y2|1 = y2 − x′2 β 2|1
f2|1 = x′2 P 2|1 x2 + R
⇓
K_2 = P_{2|1} x_2 f_{2|1}^{−1}
β2|2 = β 2|1 + K 2 η2|1
P 2|2 = P 2|1 − K 2 x′2 P 2|1
⇓
⋮
⇓
β t|t−1 = µ̃ + F β t−1|t−1
P t|t−1 = F P t−1|t−1 F ′ + Q
ηt|t−1 = yt − yt|t−1 = yt − x′t βt|t−1
ft|t−1 = x′t P t|t−1 xt + R
⇓
K_t = P_{t|t−1} x_t f_{t|t−1}^{−1}
β_{t|t} = β_{t|t−1} + K_t η_{t|t−1}
P_{t|t} = P_{t|t−1} − K_t x_t′ P_{t|t−1}
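
The flowchart translates directly into code. Below is a minimal sketch of the basic filter for the TVP model (the function and variable names are my own); it assumes µ̃, F, Q and R are known and returns the filtered estimates together with the prediction errors η_{t|t−1} and their variances f_{t|t−1}, which are reused later for the likelihood.

```python
import numpy as np

def kalman_filter_tvp(y, x, mu, F, Q, R, beta0, P0):
    """Basic Kalman filter for y_t = x_t' beta_t + e_t, beta_t = mu + F beta_{t-1} + v_t.
    y: (T,), x: (T, k), mu: (k,), F, Q: (k, k), R: scalar var(e_t)."""
    T, k = x.shape
    beta_filt = np.zeros((T, k))
    P_filt = np.zeros((T, k, k))
    eta = np.zeros(T)          # prediction errors eta_{t|t-1}
    f = np.zeros(T)            # their variances f_{t|t-1}
    b, P = beta0, P0
    for t in range(T):
        # prediction step
        b_pred = mu + F @ b                       # beta_{t|t-1}
        P_pred = F @ P @ F.T + Q                  # P_{t|t-1}
        eta[t] = y[t] - x[t] @ b_pred             # eta_{t|t-1}
        f[t] = x[t] @ P_pred @ x[t] + R           # f_{t|t-1}
        # updating step
        K = P_pred @ x[t] / f[t]                  # Kalman gain K_t
        b = b_pred + K * eta[t]                   # beta_{t|t}
        P = P_pred - np.outer(K, x[t]) @ P_pred   # P_{t|t}
        beta_filt[t], P_filt[t] = b, P
    return beta_filt, P_filt, eta, f
```

With the simulated data from simulate_tvp above one could run, for example, kalman_filter_tvp(y, x, np.zeros(2), 0.9*np.eye(2), 0.1*np.eye(2), 1.0, np.zeros(2), 10*np.eye(2)); the initial values are discussed next.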
Note that the Kalman filter provides a minimum mean squared
error estimate of β t , t = 1, . . . , T , given information up to
time t − 1 or t. For stationary β t in (2), the unconditional
mean and variance matrix of β t may be employed as the initial
values, β 0|0 and P 0|0 . The unconditional mean of stationary
β t is derived as
E[β_t] = µ̃ + F E[β_{t−1}] + E[v_t].
At steady state, E[β_t] = E[β_{t−1}] = β_{0|0} and E[v_t] = 0, so
β_{0|0} = µ̃ + F β_{0|0},
β_{0|0} = (I_k − F)^{−1} µ̃.
The unconditional variance matrix of stationary β t is derived
as
cov(β_t) = F cov(β_{t−1}) F′ + cov(v_t).
At steady state, cov(β_t) = cov(β_{t−1}) = P_{0|0}, so
P_{0|0} = F P_{0|0} F′ + Q,
vec(P_{0|0}) = vec(F P_{0|0} F′) + vec(Q)
             = (F ⊗ F) vec(P_{0|0}) + vec(Q),
vec(P_{0|0}) = (I − F ⊗ F)^{−1} vec(Q),
as vec(ABC) = (C′ ⊗ A) vec(B).
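
In code, these steady-state initial values can be computed directly (a sketch under the same stationarity assumption; the function name is mine):

```python
import numpy as np

def stationary_init(mu, F, Q):
    """Unconditional mean and variance of a stationary beta_t, used as beta_{0|0}, P_{0|0}."""
    k = F.shape[0]
    beta0 = np.linalg.solve(np.eye(k) - F, mu)                       # (I - F)^{-1} mu
    vecP0 = np.linalg.solve(np.eye(k * k) - np.kron(F, F), Q.ravel(order="F"))
    return beta0, vecP0.reshape((k, k), order="F")                   # un-vec P_{0|0}
```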
For nonstationary βt in (2), the unconditional mean and covariance do
not exist. In this case, β 0|0 may be set at any arbitrary k × 1 vector (wild
guessing). But in order to assign very large uncertainty to this wild guess,
very large positive values have to be assigned to the diagonal elements of
P_{0|0}. This is because, if P_{t−1|t−1} is a very large positive definite matrix, most of the weight in the updating equation (15) is assigned to the new information contained in the forecast error of y_t, and the information content in β_{t|t−1} is treated as negligible. Alternatively, we may treat β_{0|0} as additional parameters to be estimated. In this case, P_{0|0} should be set equal to a k × k matrix of 0s when estimating the hyperparameters of the model via MLE, because β_{0|0} is then not a random variable. Once these parameters are estimated along with the other hyperparameters of the model, we can run the Kalman filter again by setting β_{0|0} = β̂_{0|0,MLE} and P_{0|0} = cov(β̂_{0|0,MLE}) for inferences on β_t, t = 1, . . . , T.
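
In code, the "wild guess" initialization described above amounts to something like the following (the value 10^6 and the dimension are illustrative):

```python
import numpy as np

k = 2                        # illustrative state dimension
beta0 = np.zeros(k)          # arbitrary guess for beta_{0|0}
P0 = 1e6 * np.eye(k)         # huge diagonal: the guess gets almost no weight in updating
```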
Smoothing (β_{t|T}) provides us with a more accurate inference on β_t, since it uses more information than the basic filter. The smoothing recursions are
β_{t|T} = β_{t|t} + P_{t|t} F′ P_{t+1|t}^{−1} (β_{t+1|T} − F β_{t|t} − µ̃),
P_{t|T} = P_{t|t} + P_{t|t} F′ P_{t+1|t}^{−1} (P_{t+1|T} − P_{t+1|t}) P_{t+1|t}^{−1} F P_{t|t}.
The initial values for smoothing, β_{T|T} and P_{T|T}, are obtained from the last iteration of the basic filter. For t = T − 1,
β_{T−1|T} = β_{T−1|T−1} + P_{T−1|T−1} F′ P_{T|T−1}^{−1} (β_{T|T} − F β_{T−1|T−1} − µ̃),
P_{T−1|T} = P_{T−1|T−1} + P_{T−1|T−1} F′ P_{T|T−1}^{−1} (P_{T|T} − P_{T|T−1}) P_{T|T−1}^{−1} F P_{T−1|T−1},
where β_{T−1|T−1}, P_{T−1|T−1}, and P_{T|T−1} are also obtained from the basic filter.
For t = T − 2,
β_{T−2|T} = β_{T−2|T−2} + P_{T−2|T−2} F′ P_{T−1|T−2}^{−1} (β_{T−1|T} − F β_{T−2|T−2} − µ̃),
P_{T−2|T} = P_{T−2|T−2} + P_{T−2|T−2} F′ P_{T−1|T−2}^{−1} (P_{T−1|T} − P_{T−1|T−2}) P_{T−1|T−2}^{−1} F P_{T−2|T−2},
where β T −2|T −2 , P T −2|T −2 , and P T −1|T −2 are obtained from
the basic filter and β T −1|T and P T −1|T are obtained from the
first smoothing at t = T − 1.
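
A minimal sketch of these smoothing recursions, run backwards from the last filtered values (it consumes the output of a forward filter such as the kalman_filter_tvp sketch above; the function name is mine):

```python
import numpy as np

def kalman_smoother(beta_filt, P_filt, mu, F, Q):
    """Fixed-interval smoothing: beta_{t|T} and P_{t|T} from the filtered output."""
    T, k = beta_filt.shape
    beta_sm = beta_filt.copy()
    P_sm = P_filt.copy()
    for t in range(T - 2, -1, -1):
        P_pred = F @ P_filt[t] @ F.T + Q                  # P_{t+1|t}
        J = P_filt[t] @ F.T @ np.linalg.inv(P_pred)       # P_{t|t} F' P_{t+1|t}^{-1}
        beta_sm[t] = beta_filt[t] + J @ (beta_sm[t + 1] - F @ beta_filt[t] - mu)
        P_sm[t] = P_filt[t] + J @ (P_sm[t + 1] - P_pred) @ J.T
    return beta_sm, P_sm
```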
Note that the algorithms of the basic filter and smoothing
discussed previously are conditional on the assumption that
the model’s parameters, µ̃ and F , are known. However, some
of these parameters are usually unknown. In this case, the
parameters need to be estimated first; then the estimate of
β_t, t = 1, . . . , T, is conditional on these estimated parameters. If β_0 and {e_t, v_t}_{t=1}^T are Gaussian, the distribution of y_t conditional on F_{t−1} is also Gaussian:
y_t | F_{t−1} ∼ N(y_{t|t−1}, f_{t|t−1}),
and the sample log likelihood function is
ln L = − (1/2) Σ_{t=1}^T ln(2π f_{t|t−1}) − (1/2) Σ_{t=1}^T η_{t|t−1}′ f_{t|t−1}^{−1} η_{t|t−1},
which can be maximized with respect to the unknown parameters of the model.
For nonstationary β_t in (2), the log likelihood is evaluated from observation τ + 1 (τ ≫ 1):
ln L = − (1/2) Σ_{t=τ+1}^T ln(2π f_{t|t−1}) − (1/2) Σ_{t=τ+1}^T η_{t|t−1}′ f_{t|t−1}^{−1} η_{t|t−1}.
Notice that we start the Kalman filter with arbitrary initial values β_{0|0} and P_{0|0}, with large diagonal elements of P_{0|0} for nonstationary β_t. Iterating the filter from t = 1 but evaluating the log likelihood only from t = τ + 1 minimizes the effect of the arbitrary initial values on the log likelihood value.
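
Given the prediction errors and variances returned by the filter, this log likelihood is a one-line computation, and the unknown hyperparameters can be estimated by handing its negative to a numerical optimizer. The sketch below assumes the simulate_tvp and kalman_filter_tvp sketches above and an illustrative parameterization F = φI_k, Q = qI_k, R = r.

```python
import numpy as np
from scipy.optimize import minimize

def log_likelihood(eta, f, tau=0):
    """Gaussian log likelihood from prediction errors eta_{t|t-1} and variances f_{t|t-1},
    evaluated from observation tau+1 onward (tau > 0 for nonstationary beta_t)."""
    eta, f = eta[tau:], f[tau:]
    return -0.5 * np.sum(np.log(2 * np.pi * f)) - 0.5 * np.sum(eta**2 / f)

def negloglik(theta, y, x):
    """Negative log likelihood under the illustrative parameterization (phi, q, r)."""
    phi, q, r = theta
    k = x.shape[1]
    _, _, eta, f = kalman_filter_tvp(y, x, np.zeros(k), phi * np.eye(k),
                                     q * np.eye(k), r, np.zeros(k), 10 * np.eye(k))
    return -log_likelihood(eta, f, tau=5)

# Illustrative MLE call (assuming y, x from simulate_tvp above):
# res = minimize(negloglik, x0=[0.5, 0.5, 0.5], args=(y, x),
#                bounds=[(-0.99, 0.99), (1e-6, None), (1e-6, None)], method="L-BFGS-B")
```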
State-Space Models and the Kalman Filter
State-space models, which were originally developed by control
engineers (Kalman, 1960), are useful tools for expressing
dynamic systems involving unobserved state variables. A
state-space model consists of two equations: a transition
equation (sometimes called a state equation) and a
measurement equation. The measurement equation describes
the relation between observed variables (data) and unobserved
state variables. The transition equation describes the dynamics
of the state variables. The transition equation has the form of
a first-order difference equation in the state vector.
A representative state-space model is:
Measurement Equation:
y_t = H_t β_t + A z_t + e_t,                                   (17)
Transition Equation:
β_t = µ̃ + F β_{t−1} + v_t,                                    (18)
e_t ∼ i.i.d. N(0, R),                                          (19)
v_t ∼ i.i.d. N(0, Q),                                          (20)
E(e_t v_s′) = 0,                                               (21)
where z t is an r × 1 vector of exogenous or predetermined
observed variables.
It is clear that the differences between the time-varying-parameter (TVP) model and the state-space model are: (1) the regressor vector x_t′ in the TVP model is replaced by H_t in the state-space model; (2) exogenous variables z_t with time-invariant parameters A are introduced in the state-space model. The Kalman filter for the TVP model can be easily modified to estimate β_t, t = 1, . . . , T, for the state-space model.
Basic Filtering
Prediction Steps:
β_{t|t−1} = µ̃ + F β_{t−1|t−1},                                 (22)
P_{t|t−1} = F P_{t−1|t−1} F′ + Q,                               (23)
η_{t|t−1} = y_t − y_{t|t−1} = y_t − H_t β_{t|t−1} − A z_t,      (24)
f_{t|t−1} = H_t P_{t|t−1} H_t′ + R.                             (25)
Updating Steps:
β_{t|t} = β_{t|t−1} + K_t η_{t|t−1},                            (27)
P_{t|t} = P_{t|t−1} − K_t H_t P_{t|t−1},                        (28)
where K_t = P_{t|t−1} H_t′ f_{t|t−1}^{−1} is the Kalman gain.
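
Relative to the TVP filter sketched earlier, the only changes are the measurement matrix H_t, the exogenous term A z_t, and the fact that y_t may now be a vector. A minimal sketch (all names are mine):

```python
import numpy as np

def kalman_filter_ss(y, H, z, A, mu, F, Q, R, beta0, P0):
    """Basic filter for y_t = H_t beta_t + A z_t + e_t, beta_t = mu + F beta_{t-1} + v_t.
    y: (T, n), H: (T, n, k), z: (T, r), A: (n, r), R: (n, n)."""
    T, n, k = H.shape
    beta_filt, P_filt = np.zeros((T, k)), np.zeros((T, k, k))
    b, P = beta0, P0
    for t in range(T):
        b_pred = mu + F @ b                               # beta_{t|t-1}
        P_pred = F @ P @ F.T + Q                          # P_{t|t-1}
        eta = y[t] - H[t] @ b_pred - A @ z[t]             # eta_{t|t-1}
        f = H[t] @ P_pred @ H[t].T + R                    # f_{t|t-1}
        K = P_pred @ H[t].T @ np.linalg.inv(f)            # Kalman gain K_t
        b = b_pred + K @ eta                              # beta_{t|t}
        P = P_pred - K @ H[t] @ P_pred                    # P_{t|t}
        beta_filt[t], P_filt[t] = b, P
    return beta_filt, P_filt
```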
Smoothing:
β_{t|T} = β_{t|t} + P_{t|t} F′ P_{t+1|t}^{−1} (β_{t+1|T} − F β_{t|t} − µ̃),
P_{t|T} = P_{t|t} + P_{t|t} F′ P_{t+1|t}^{−1} (P_{t+1|T} − P_{t+1|t}) P_{t+1|t}^{−1} F P_{t|t}.