Nonlinear and non-Gaussian state-space

State-space models
Application
Summary
Nonlinear and non-Gaussian state-space modelling
by means of hidden Markov models
Roland Langrock
University of Göttingen
St Andrews, 13 December 2010
Roland Langrock
Fitting SSMs using HMMs
1 State-space models
    Basics
    Approximation via hidden Markov models
2 Application
    Glacial varve thickness
(General) state-space model (SSM):
(observable)      ... $y_{t-1}$, $y_t$, $y_{t+1}$, ...
(non-observable)  ... $g_{t-1}$, $g_t$, $g_{t+1}$, ...
$y_t = a(g_t, \varepsilon_t)$
$g_t = b(g_{t-1}, \eta_t)$
$a$, $b$: known functions (not necessarily linear)
$\varepsilon_t$, $\eta_t$ iid (not necessarily $\sim N$)
Example 1. Stochastic volatility model:
$y_t = \varepsilon_t \beta \exp(g_t / 2)$
$g_t = \phi g_{t-1} + \sigma \eta_t$
$\varepsilon_t \overset{iid}{\sim} t_\nu$ or $N(0, 1)$, $\eta_t \overset{iid}{\sim} N(0, 1)$
$g_t$ determines variance (volatility) of $y_t$
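To make the two equations concrete, the following is a minimal simulation sketch in plain Python. The function name `simulate_sv`, the parameter values, the Gaussian choice for the observation noise and the stationary start for the state are assumptions made for illustration; they are not taken from the talk.

```python
import math
import random

def simulate_sv(n, phi, sigma, beta, seed=0):
    """Simulate y_t = eps_t * beta * exp(g_t / 2), g_t = phi * g_{t-1} + sigma * eta_t."""
    rng = random.Random(seed)
    # Assumption: start g_1 in the stationary distribution N(0, sigma^2 / (1 - phi^2)),
    # which requires |phi| < 1.
    g = rng.gauss(0.0, sigma / math.sqrt(1.0 - phi ** 2))
    states, obs = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)  # Gaussian eps_t (a t-distribution is the other option)
        states.append(g)
        obs.append(eps * beta * math.exp(g / 2.0))
        g = phi * g + sigma * rng.gauss(0.0, 1.0)  # AR(1) step for the log-volatility
    return states, obs

# Illustrative parameter values (assumptions).
states, obs = simulate_sv(n=1000, phi=0.95, sigma=0.2, beta=1.0)
print(obs[:5])
```

Only `obs` would be available in practice; `states` is returned here so that decoded states can be compared with the truth in simulation experiments.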
Example 2. Poisson autoregression:
$y_t \sim \text{Poisson}(\beta \exp(g_t))$
$g_t = \phi g_{t-1} + \sigma \eta_t$
$\eta_t \overset{iid}{\sim} N(0, 1)$
$g_t$ determines mean (and variance) of $y_t$
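The conditional distribution of $y_t$ given $g_t$ can be written down directly. The sketch below evaluates it and checks numerically that, for a fixed state, mean and variance both equal $\beta \exp(g_t)$. The function name `poisson_logpmf` and the values of `beta` and `g` are illustrative assumptions.

```python
import math

def poisson_logpmf(y, g, beta):
    """log P(y_t = y | g_t = g) for y_t ~ Poisson(beta * exp(g))."""
    lam = beta * math.exp(g)
    return y * math.log(lam) - lam - math.lgamma(y + 1)

# Illustrative values (assumptions, not from the talk).
beta, g = 2.0, 0.5
lam = beta * math.exp(g)

# Truncate the support at 200; the remaining tail mass is negligible for this lam.
pmf = [math.exp(poisson_logpmf(y, g, beta)) for y in range(200)]
total = sum(pmf)
mean = sum(y * p for y, p in enumerate(pmf))
var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
print(total, mean, var, lam)
```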
Desired:
parameter estimation
state decoding
forecasts
model checking
SSM likelihood:
$L(y) = \underbrace{\int \cdots \int}_{n\text{-fold}} f(y, g) \, dg$
cannot be evaluated directly...
(SSM linear & Gaussian → Kalman filter optimal)
Parameter estimation in case of nonlinearity/non-Gaussianity:
Extended Kalman filter
+ simple implementation
− in general poor approximation
(Generalized) method of moments
+ simple implementation
− low efficiency, no state decoding
Monte Carlo methods
+ high efficiency
− computer-intensive
− nonstandard models require nontrivial modifications
Hidden Markov model:
(observable)      ... $y_{t-1}$, $y_t$, $y_{t+1}$, ...
(non-observable)  ... $g_{t-1}$, $g_t$, $g_{t+1}$, ...
Non-observable process: $N$-state Markov chain $g_t$
initial distribution $\delta_i = P(g_1 = i)$
transition probabilities $\gamma_{ij} = P(g_t = j \mid g_{t-1} = i)$
Observable process: $y_t$
state-dependent density $f(y_t \mid g_t)$
Key idea:
HMMs have the same two-process structure as SSMs
in SSMs: $g_t$ continuous-valued
discretizing $g_t$ yields approximation by HMM
benefit: HMM methodology becomes applicable
split essential range of $g_t$ into $m$ equidistant intervals $B_i := [b_{i-1}, b_i]$
$b_i^*$: midpoint of $B_i$
$$
\begin{aligned}
\Rightarrow L(y) &= \int \cdots \int f(y, g) \, dg \\
&= \int \cdots \int f(g_1) f(y_1 \mid g_1) \prod_{t=2}^{n} f(g_t \mid g_{t-1}) f(y_t \mid g_t) \, dg_n \ldots dg_1 \\
&\approx \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} P(g_1 \in B_{i_1}) f(y_1 \mid g_1 = b_{i_1}^*) \prod_{t=2}^{n} P(g_t \in B_{i_t} \mid g_{t-1} = b_{i_{t-1}}^*) f(y_t \mid g_t = b_{i_t}^*) \\
&=: L_{\text{approx}}(y)
\end{aligned}
$$
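The approximation step replaces each integral over $g_t$ by a sum over intervals, with the integrand evaluated at the interval midpoint. A one-dimensional version of that step can be checked against a known answer. The sketch below approximates $E[\exp(g/2)]$ for $g \sim N(0, s^2)$, whose exact value is $\exp(s^2/8)$. The function names, the value $s = 0.5$ and the range $[-3, 3]$ are illustrative assumptions.

```python
import math

def norm_cdf(x, sd=1.0):
    """CDF of N(0, sd^2)."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))

def midpoint_expectation(h_fun, sd, m, b0=-3.0, bm=3.0):
    """Approximate E[h(g)], g ~ N(0, sd^2), by sum_i P(g in B_i) * h(b_i*)."""
    width = (bm - b0) / m
    total = 0.0
    for i in range(m):
        lo, hi = b0 + i * width, b0 + (i + 1) * width
        prob = norm_cdf(hi, sd) - norm_cdf(lo, sd)  # P(g in B_i)
        total += prob * h_fun(0.5 * (lo + hi))      # h evaluated at the midpoint b_i*
    return total

sd = 0.5
exact = math.exp(sd ** 2 / 8.0)  # E[exp(g / 2)] for g ~ N(0, sd^2)
for m in (10, 50, 200):
    approx = midpoint_expectation(lambda g: math.exp(g / 2.0), sd, m)
    print(m, approx, abs(approx - exact))
```

The error shrinks as $m$ grows, which is the behaviour the $n$-fold approximation relies on at every time step.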
Consider HMM with
$m$-state MC (possible outcomes: midpoints $b_i^*$)
transition probabilities: $\gamma_{ij} := P(g_t \in B_j \mid g_{t-1} = b_i^*)$
transition probability matrix: $\Gamma = (\gamma_{ij})$
initial distribution: $\delta_i := P(g_1 \in B_i)$
observable process:
state-dependent density: $f(y_t \mid g_t = b_i^*)$
$P(y_t)$: diag. matrix with $i$th entry $f(y_t \mid g_t = b_i^*)$
$\Rightarrow L_{\text{approx}}(y) = \delta P(y_1) \Gamma P(y_2) \Gamma \cdots \Gamma P(y_{n-1}) \Gamma P(y_n) \mathbf{1}^t$
→ the HMM $(\delta, \Gamma, f(y_t \mid \cdot))$ approximates the SSM
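The matrix product can be evaluated recursively from left to right, rescaling the running vector at each step to avoid numerical underflow. The following is a minimal sketch for the stochastic volatility model, assuming Gaussian $\varepsilon_t$ and a stationary initial distribution for $g_1$. The function names and default grid settings are hypothetical, and plain Python lists are used only to keep the example self-contained (cost is of order $n m^2$).

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x, sd):
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def sv_approx_loglik(y, phi, sigma, beta, m=50, b0=-3.0, bm=3.0):
    """log L_approx(y) = log(delta P(y_1) Gamma P(y_2) ... Gamma P(y_n) 1^t)."""
    width = (bm - b0) / m
    edges = [b0 + i * width for i in range(m + 1)]
    mids = [0.5 * (edges[i] + edges[i + 1]) for i in range(m)]
    # gamma_ij = P(g_t in B_j | g_{t-1} = b_i*), with g_t | g_{t-1} ~ N(phi * g_{t-1}, sigma^2).
    # Rows sum to slightly less than 1 because mass outside [b0, bm] is ignored.
    Gamma = [[norm_cdf((edges[j + 1] - phi * b) / sigma)
              - norm_cdf((edges[j] - phi * b) / sigma) for j in range(m)]
             for b in mids]
    # Assumption: delta_i = P(g_1 in B_i) under the stationary AR(1) distribution.
    sd0 = sigma / math.sqrt(1.0 - phi ** 2)
    delta = [norm_cdf(edges[i + 1] / sd0) - norm_cdf(edges[i] / sd0) for i in range(m)]

    def dens(yt):
        # Diagonal of P(y_t): y_t | g_t = b_i* ~ N(0, (beta * exp(b_i* / 2))^2).
        return [norm_pdf(yt, beta * math.exp(b / 2.0)) for b in mids]

    alpha = [d * p for d, p in zip(delta, dens(y[0]))]  # delta P(y_1)
    loglik = 0.0
    for t in range(len(y)):
        if t > 0:
            p = dens(y[t])
            alpha = [sum(alpha[i] * Gamma[i][j] for i in range(m)) * p[j]
                     for j in range(m)]                 # alpha Gamma P(y_t)
        scale = sum(alpha)                              # multiplication by 1^t
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]              # rescale to avoid underflow
    return loglik

# Illustrative data and parameter values (assumptions).
y = [0.4, -1.1, 0.2, 2.3, -0.7]
print(sv_approx_loglik(y, phi=0.95, sigma=0.2, beta=1.0, m=50))
```

Passing the negative of this function to a general-purpose numerical optimiser would give approximate maximum likelihood estimates; parameter constraints such as $|\phi| < 1$ and $\sigma > 0$ would need to be handled by reparametrisation.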
Pros and Cons (HMM method):
+ likelihood directly available
+ extensions straightforward
+ simple formulae for residuals, forecasts, decoding
− $m$ and range of $g_t$ have to be chosen
− only feasible for one-dimensional state spaces
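As one illustration of the decoding that the HMM formulation makes available, a minimal Viterbi (global decoding) sketch follows. The talk does not state whether global or local decoding was used; the function name `viterbi`, the input layout `dens[t][i]` and the toy two-state inputs are assumptions. For the SSM approximation, state `i` would correspond to the midpoint $b_i^*$.

```python
import math

def viterbi(delta, Gamma, dens):
    """Most likely state sequence; dens[t][i] = f(y_t | state i)."""
    def safe_log(x):
        return math.log(x) if x > 0 else -math.inf

    n, m = len(dens), len(delta)
    score = [safe_log(delta[i]) + safe_log(dens[0][i]) for i in range(m)]
    back = []  # back[t-1][j]: best predecessor of state j at time t
    for t in range(1, n):
        prev, ptr, score = score, [], []
        for j in range(m):
            best = max(range(m), key=lambda i: prev[i] + safe_log(Gamma[i][j]))
            ptr.append(best)
            score.append(prev[best] + safe_log(Gamma[best][j]) + safe_log(dens[t][j]))
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(range(m), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Toy two-state example (assumed values).
delta = [0.5, 0.5]
Gamma = [[0.9, 0.1], [0.1, 0.9]]
dens = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
print(viterbi(delta, Gamma, dens))
```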
Applications considered in Langrock (2010):
stochastic volatility
earthquake counts
polio counts (seasonal)
daily rainfall occurrence (seasonal)
glacial varve thickness
varves: layers of sediment deposited by melting glaciers
can be useful for long-term climate research
source: Shumway and Stoffer (Time Series Analysis and Its
Applications, 2006)
Figure: Series of glacial varve thicknesses for a location in Massachusetts (vertical axis: varve thickness in mm; horizontal axis: years).
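$y_t = \varepsilon_t \beta \exp(g_t)$
$g_t = \phi g_{t-1} + \sigma \eta_t$
$\varepsilon_t \sim \text{Gamma}(\text{shape} = cv^{-2}, \text{scale} = cv^{2})$
Properties:
$E(y_t \mid g_t) = \beta \exp(g_t)$
(Conditional) coefficient of variation: $\text{sd}(y_t \mid g_t) / E(y_t \mid g_t) = cv$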
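Both properties follow from $E(\varepsilon_t) = cv^{-2} \cdot cv^{2} = 1$ and $\text{Var}(\varepsilon_t) = cv^{-2} \cdot cv^{4} = cv^{2}$. The sketch below checks them by simulation for one fixed state and also gives the conditional log-density, a gamma density with shape $cv^{-2}$ and scale $cv^{2} \beta \exp(g_t)$. The function names and the values of `g`, `beta` and `cv` are illustrative assumptions.

```python
import math
import random

def gamma_ssm_logdensity(y, g, beta, cv):
    """log f(y_t = y | g_t = g): gamma with shape cv^-2 and scale cv^2 * beta * exp(g)."""
    shape = cv ** -2
    scale = cv ** 2 * beta * math.exp(g)
    return ((shape - 1.0) * math.log(y) - y / scale
            - math.lgamma(shape) - shape * math.log(scale))

def simulate_obs(g, beta, cv, size, seed=1):
    """Draw y_t = eps_t * beta * exp(g) with eps_t ~ Gamma(shape=cv^-2, scale=cv^2)."""
    rng = random.Random(seed)
    shape, scale = cv ** -2, cv ** 2
    return [rng.gammavariate(shape, scale) * beta * math.exp(g) for _ in range(size)]

# Illustrative values (assumptions).
g, beta, cv = 0.3, 25.0, 0.4
draws = simulate_obs(g, beta, cv, size=50000)
mean = sum(draws) / len(draws)
sd = math.sqrt(sum((d - mean) ** 2 for d in draws) / (len(draws) - 1))
print(mean, beta * math.exp(g))  # sample mean vs. beta * exp(g)
print(sd / mean, cv)             # sample coefficient of variation vs. cv
```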
Table: Estimated model parameters and bootstrap 95% confidence intervals (400 replications).
para.     estimate   c.i.
$\phi$    0.95       [0.90, 0.97]
$\sigma$  0.15       [0.11, 0.19]
$\beta$   24.42      [19.1, 31.1]
$cv$      0.40       [0.37, 0.42]
resolution: $m = 200$
$g_t$-range: $b_0 = -3$, $b_m = 3$
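A rough plausibility check of the chosen range and resolution can be done with the point estimates from the table. The sketch assumes that $g_t$ is a stationary Gaussian AR(1) process, so that its standard deviation is $\sigma / \sqrt{1 - \phi^2}$; this check is an illustration and not a step described in the talk.

```python
import math

phi, sigma = 0.95, 0.15      # point estimates from the table
m, b0, bm = 200, -3.0, 3.0   # resolution and range from the table

# Assumption: stationary Gaussian AR(1), so sd(g_t) = sigma / sqrt(1 - phi^2).
stat_sd = sigma / math.sqrt(1.0 - phi ** 2)
width = (bm - b0) / m
# Stationary probability mass outside the symmetric range [b0, bm].
outside = math.erfc(bm / (stat_sd * math.sqrt(2.0)))

print(f"stationary sd of g_t: {stat_sd:.3f}")
print(f"range covers +/- {bm / stat_sd:.1f} stationary sd")
print(f"interval width: {width:.3f}")
print(f"mass outside range: {outside:.2e}")
```

Under this assumption the range extends to roughly six stationary standard deviations on either side, and each interval is much narrower than the innovation standard deviation $\sigma$.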
Figure: Series of glacial varve thicknesses (solid grey line) and decoded mean sequence of the fitted gamma SSM (crosses); vertical axis: varve thickness, horizontal axis: years.
Summary
HMM approximation convenient in SSM context
whole HMM methodology applicable
simple implementation of standard and nonstandard models
Langrock, R., MacDonald, I. M., Zucchini, W., 2010
Estimating standard and nonstandard stochastic volatility models using
structured hidden Markov models. (submitted)
Langrock, R., 2010
Some applications of nonlinear and non-Gaussian state-space modeling by
means of hidden Markov models. (submitted)