Niemi Stochastic Dynamic Models

Stochastic dynamic models for low count observations
(and forecasting from them)
Dr. Jarad Niemi
Iowa State University
April 19, 2016
Overview
1. Poisson-binomial state-space model
2. Inference and forecasting
3. Forecasting with noisy observations
4. Forecasting with inference on parameters
Poisson-binomial state-space model
S→I→R stochastic compartment model

Focus on the Susceptible (S) - Infectious (I) - Recovered (R) model with Poisson transitions, i.e.

$\overrightarrow{SI} \sim \mathrm{Po}(\lambda_{S\to I}\, S I / N), \qquad \overrightarrow{IR} \sim \mathrm{Po}(\lambda_{I\to R}\, I),$

updating the states, i.e.

$S_{t+1} = S_t - \overrightarrow{SI}, \qquad I_{t+1} = I_t + \overrightarrow{SI} - \overrightarrow{IR}, \qquad R_{t+1} = R_t + \overrightarrow{IR},$

and binomial observations

$Y_{S\to I} \sim \mathrm{Bin}(\overrightarrow{SI}, \theta_{S\to I}), \qquad Y_{I\to R} \sim \mathrm{Bin}(\overrightarrow{IR}, \theta_{I\to R}).$

A more general stochastic dynamic modeling structure can be used to extend the model to geographical regions, subpopulations, etc.
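The following is a minimal simulation sketch of the model above (not from the slides; it assumes Python/NumPy and illustrative parameter values, and it caps each Poisson draw at the available compartment size, a choice the slides do not specify):

import numpy as np

rng = np.random.default_rng(20160419)

def simulate_sir(S0, I0, R0, lam_SI, lam_IR, theta_SI, theta_IR, T):
    """Forward-simulate the Poisson-binomial SIR state-space model."""
    N = S0 + I0 + R0
    S, I, R = S0, I0, R0
    states, obs = [(S, I, R)], []
    for _ in range(T):
        # Poisson transition counts, capped so compartments stay non-negative
        dSI = min(rng.poisson(lam_SI * S * I / N), S)
        dIR = min(rng.poisson(lam_IR * I), I)
        # deterministic state update driven by the transition counts
        S, I, R = S - dSI, I + dSI - dIR, R + dIR
        states.append((S, I, R))
        # binomial observations of the (unobserved) transition counts
        obs.append((rng.binomial(dSI, theta_SI), rng.binomial(dIR, theta_IR)))
    return np.array(states), np.array(obs)

# one hypothetical realization: N = 1000, 50 initially infectious
states, obs = simulate_sir(950, 50, 0, lam_SI=0.5, lam_IR=0.25,
                           theta_SI=0.1, theta_IR=0.1, T=50)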
Poisson-binomial state-space model
SIR modeling simulations

[Figures (two slides): simulated S, I, and R counts (0-1000) versus time (0-50) from the Poisson-binomial SIR model.]
Inference and forecasting
Forecasting with perfect information

Suppose you know the transition rates λ, the observation probabilities θ = 1, and the states X_{0:t}, and your only goal is to forecast the future states X_{t+1:T}, i.e.

$p(X_{t+1:T} \mid \theta, \lambda, X_{0:t}) = p(X_{t+1:T} \mid \theta, \lambda, X_t).$

This distribution is estimated via Monte Carlo simulation.
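As a sketch of this Monte Carlo forecast (building on the simulate_sir function above, with θ = 1 so every transition is observed), repeatedly forward-simulate from the known state X_t and summarize the draws:

# Monte Carlo approximation of p(X_{t+1:T} | theta, lambda, X_t):
# forward-simulate many trajectories from the known current state.
def forecast(S, I, R, lam_SI, lam_IR, horizon, n_sims=1000):
    draws = np.stack([
        simulate_sir(S, I, R, lam_SI, lam_IR, 1.0, 1.0, horizon)[0]
        for _ in range(n_sims)
    ])  # shape: (n_sims, horizon + 1, 3)
    # pointwise 2.5%, 50%, 97.5% forecast quantiles for (S, I, R)
    return np.percentile(draws, [2.5, 50, 97.5], axis=0)

# e.g., forecast 40 steps ahead from the simulated state at t = 10
lower, median, upper = forecast(*states[10], lam_SI=0.5, lam_IR=0.25, horizon=40)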
Inference and forecasting

[Figure: forecasts of S, I, and R (counts 0-1000) over time 0-50, one panel per forecast origin t = 0, 10, 20, 30, 40, 50.]
Inference and forecasting
Delay in data analysis

Suppose you have a one- or two-week delay in collecting, processing, and analyzing the data, so that when trying to forecast future states you are using "old" data, i.e.

$p(X_{t+1:T} \mid \theta, \lambda, X_{t-L})$

where
- L = 0 indicates up-to-date data,
- L = 1 indicates one-week-old data, and
- L = 2 indicates two-week-old data.
Inference and forecasting

[Figure: forecasts of S, I, and R (counts 0-1000) over time 0-50, one panel per data lag L = 0, 1, 2.]
Inference and forecasting

[Figure: forecasts of the infectious count I (0-150) over time 0-50, one panel per data lag L = 0, 1, 2.]
Forecasting with noisy observations
Forecasting with noisy observations

Suppose we know the transition rates (λ) and the observation probabilities (θ), but we only observe a noisy version of the state transitions, i.e.

$Y_{S\to I} \sim \mathrm{Bin}(\overrightarrow{SI}, \theta_{S\to I}), \qquad Y_{I\to R} \sim \mathrm{Bin}(\overrightarrow{IR}, \theta_{I\to R}).$

Now the forecast distribution we need is

$p(X_{t+1:T} \mid \lambda, \theta, y_{0:t}) = \int p(X_{t+1:T} \mid \lambda, \theta, X_t)\, p(X_t \mid \lambda, \theta, y_{0:t})\, dX_t.$
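One way to approximate this integral is a bootstrap-style particle filter; the sketch below (an assumption about the approach, not the implementation behind the slides; it reuses rng and simulate_sir from above and assumes SciPy) propagates particles through the Poisson transitions, weights them by the binomial likelihood of the observed transition counts, resamples, and then forward-simulates each filtered particle:

from scipy.stats import binom

def particle_filter(y, lam_SI, lam_IR, theta_SI, theta_IR, x0, n_particles=1000):
    """Particles approximating p(X_t | lambda, theta, y_{0:t})."""
    N = sum(x0)
    particles = np.tile(np.array(x0), (n_particles, 1))  # columns: S, I, R
    for y_SI, y_IR in y:  # observed transition counts at each time step
        S, I, R = particles[:, 0], particles[:, 1], particles[:, 2]
        dSI = np.minimum(rng.poisson(lam_SI * S * I / N), S)
        dIR = np.minimum(rng.poisson(lam_IR * I), I)
        particles = np.column_stack([S - dSI, I + dSI - dIR, R + dIR])
        # weight by the binomial observation model, then resample
        # (weight degeneracy is not handled in this sketch)
        w = binom.pmf(y_SI, dSI, theta_SI) * binom.pmf(y_IR, dIR, theta_IR)
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        particles = particles[idx]
    return particles

# forecast: forward-simulate each filtered particle
filtered = particle_filter(obs, 0.5, 0.25, 0.1, 0.1, x0=(950, 50, 0))
forecasts = np.stack([simulate_sir(*p, 0.5, 0.25, 0.1, 0.1, T=10)[0]
                      for p in filtered[:200]])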
Forecasting with noisy observations

[Figure: observed transition counts (SI and IR, 0-9) over time 0-50, faceted by observation probabilities θ_{S→I} and θ_{I→R} in {0, 0.001, 0.01, 0.1}.]
Forecasting with noisy observations
Noisily observed state

[Figure: forecasts of S, I, and R (counts up to 750) over time 20-50, faceted by observation probabilities θ_{S→I} and θ_{I→R} in {0, 0.001, 0.01, 0.1}.]
Forecasting with inference on parameters
Forecasting with inference on parameters

In reality, we don't know the transition rates (λ) or the observation probabilities (θ), and we only observe a noisy version of the state transitions.

Now the forecast distribution we need is

$p(X_{t+1:T} \mid y_{0:t}) = \iiint p(X_{t+1:T} \mid \lambda, \theta, X_t)\, p(X_t, \lambda, \theta \mid y_{0:t})\, d\lambda\, d\theta\, dX_t.$
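A brief sketch of how this triple integral is handled by Monte Carlo (assuming some posterior sampler, unspecified in the slides, has already produced joint draws of (X_t, λ, θ) given y_{0:t}):

# hypothetical posterior draws: tuples of ((S, I, R), lam_SI, lam_IR, theta_SI, theta_IR)
def forecast_with_parameter_uncertainty(posterior_draws, horizon):
    paths = []
    for (S, I, R), lam_SI, lam_IR, theta_SI, theta_IR in posterior_draws:
        # each path uses its own parameter draw, so parameter uncertainty
        # propagates into the forecast distribution
        paths.append(simulate_sir(S, I, R, lam_SI, lam_IR,
                                  theta_SI, theta_IR, horizon)[0])
    return np.stack(paths)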
Forecasting with inference on parameters
Prior distributions

In order to calculate (or approximate) the integral

$\iiint p(X_{t+1:T} \mid \lambda, \theta, X_t)\, p(X_t, \lambda, \theta \mid y_{0:t})\, d\lambda\, d\theta\, dX_t$

we need to assign priors to λ, θ, and X_0. Suppose we assume

$\theta_k \overset{ind}{\sim} \mathrm{Be}(n_\theta p_k, n_\theta [1 - p_k]), \qquad \lambda_k \overset{ind}{\sim} \mathrm{Ga}(n_\lambda c_k, n_\lambda), \qquad X_0 \sim \mathrm{Mult}(N; z_1, \ldots, z_S).$

We can control how informative the priors are with n_θ and n_λ.
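A sketch of drawing from these priors (NumPy again; the values of p_k, c_k, and z below are illustrative, not from the slides) shows how n_θ and n_λ control the prior spread:

n_theta, n_lambda, N = 100, 100, 1000
p = {"SI": 0.10, "IR": 0.10}   # prior means p_k for theta_k (illustrative)
c = {"SI": 0.50, "IR": 0.25}   # prior means c_k for lambda_k (illustrative)
z = [0.95, 0.05, 0.00]         # prior proportions for X_0 = (S, I, R)

theta = {k: rng.beta(n_theta * pk, n_theta * (1 - pk)) for k, pk in p.items()}
lam   = {k: rng.gamma(n_lambda * ck, 1 / n_lambda) for k, ck in c.items()}
X0    = rng.multinomial(N, z)

# Larger n_theta and n_lambda concentrate the priors around p_k and c_k:
# for example, the sd of Ga(n*c, n) is sqrt(c/n), which shrinks as n grows.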
Forecasting with inference on parameters
Informative priors

[Figure: forecasts of S, I, and R (counts 0-1000) over time 20-50, faceted by prior sample sizes n_lambda in {1, 100, 10000} (columns) and n_theta in {1, 100, 10000} (rows).]
Forecasting with inference on parameters
Balance priors and data

[Figure: forecasts of S, I, and R (counts 0-1000) over time 20-50, faceted by prior sample size n_lambda in {1, 100, 10000} (columns) and observation probability θ_{S→I} in {0, 0.001, 0.01, 0.1} (rows).]
Forecasting with inference on parameters
We need information

Information can come from
- Data
- Priors

We can quantitatively assess the impact of better information, i.e.
- increasing prior information, e.g. $\lambda_k \sim \mathrm{Ga}(n_\lambda c_k, n_\lambda)$, by increasing $n_\lambda$,
- increasing surveillance, e.g. $Y_{S\to I} \sim \mathrm{Bin}(\overrightarrow{SI}, \theta_{S\to I})$, by increasing $\theta$, and
- increasing timeliness, e.g. $p(X_{t+1:T} \mid y_{1:t-L})$, by decreasing $L$.

Then we can discuss how to allocate resources depending on the cost associated with each improvement above.