Numerical approximation methods for stochastically
modeled biochemical reaction networks
David F. Anderson ([email protected])
Department of Mathematics
University of Wisconsin - Madison
Illinois Math Bio Seminar
February 24th, 2011
Outline
1. Discuss Poisson process and time changes.
2. Describe/develop model – stochastically modeled reaction networks.
3. Numerical methods and error approximation.
4. Mathematical question for today: how do we effectively quantify how well different methods approximate the continuous time Markov chain model?
5. Discuss/use multi-scale nature of biochemical reaction networks.
The Poisson process

- The models I am going to discuss are a subset of the class of models termed continuous time Markov chains.
- The simplest continuous time Markov chain is the Poisson process.
The Poisson process

A Poisson process, Y(·), is a model for a series of random observations occurring in time.

(a) Let {ξ_i} be i.i.d. exponential random variables with parameter one.
(b) Now, put points down on a line with spacing equal to the ξ_i.

[Figure: points on a time line separated by gaps of lengths ξ_1, ξ_2, ξ_3, ...]

- Let Y(t) denote the number of points hit by time t.
- In the figure above, Y(t) = 6.

Intuition: The unit-rate Poisson process is simply the number of points hit when we run along the time frame at rate one.
[Figure: sample path of a unit-rate (λ = 1) Poisson process on [0, 20].]
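This construction translates directly into code. A minimal Python sketch, assuming only the description above (the function and variable names are mine, not from the talk):

```python
import random

def poisson_jump_times(t_max, rng):
    """Lay down points with i.i.d. Exp(1) gaps xi_i; return the jump times in [0, t_max]."""
    times = []
    t = rng.expovariate(1.0)
    while t <= t_max:
        times.append(t)
        t += rng.expovariate(1.0)
    return times

def Y(times, t):
    """Unit-rate Poisson process: the number of points hit by time t."""
    return sum(1 for s in times if s <= t)

rng = random.Random(1)
jumps = poisson_jump_times(20.0, rng)
# Y is a nondecreasing step process; on average about one jump per unit time.
print(Y(jumps, 10.0), Y(jumps, 20.0))
```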
The Poisson process

Let
- Y be a unit-rate Poisson process, and
- define Y_λ(t) ≡ Y(λt).

Then Y_λ is a Poisson process with parameter λ.

Intuition: The Poisson process with rate λ is simply the number of points hit (of the unit-rate point process) when we run along the time frame at rate λ.

Thus, we have "changed time" to convert a unit-rate Poisson process to one which has rate λ.
[Figure: sample path of a rate λ = 3 Poisson process on [0, 20].]
The Poisson process

There is no reason λ needs to be constant in time, in which case

    Y_λ(t) ≡ Y( ∫_0^t λ(s) ds )

is an inhomogeneous Poisson process.

Key property: Y(T + ∆) − Y(T) ~ Poisson(∆).

Thus,

    P{ Y_λ(t + ∆t) − Y_λ(t) > 0 | F_t } = 1 − exp( −∫_t^{t+∆t} λ(s) ds ) ≈ λ(t)∆t.

Points:
1. We have "changed time" to convert a unit-rate Poisson process to one which has rate, or intensity, λ(t).
2. We will use more complicated time changes of unit-rate processes to build models of interest.
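The time change can be carried out concretely: the jumps of Y_λ are the images of the unit-rate jumps under the inverse of the cumulative intensity. A sketch with an arbitrarily chosen intensity λ(s) = 1 + s (my own choice, not from the talk):

```python
import math
import random

def unit_poisson_jumps(t_max, rng):
    """Jump times of a unit-rate Poisson process on [0, t_max]."""
    jumps, t = [], rng.expovariate(1.0)
    while t <= t_max:
        jumps.append(t)
        t += rng.expovariate(1.0)
    return jumps

# For lambda(s) = 1 + s, the cumulative intensity Lam(t) = int_0^t lambda(s) ds
# is t + t^2/2, with inverse Lam_inv(u) = sqrt(1 + 2u) - 1.
def Lam(t):
    return t + 0.5 * t * t

def Lam_inv(u):
    return math.sqrt(1.0 + 2.0 * u) - 1.0

# Y_lambda(t) = Y(Lam(t)): its jump times are Lam_inv applied to the unit-rate jump times.
rng = random.Random(7)
t_end = 10.0
inhom_jumps = [Lam_inv(s) for s in unit_poisson_jumps(Lam(t_end), rng)]
print(len(inhom_jumps))
```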
Outline
1. Discuss Poisson process and time changes.
2. → Describe/develop model – stochastically modeled reaction networks.
3. Numerical methods and error approximation.
4. Mathematical question for today: how do we effectively quantify how well different methods approximate the continuous time Markov chain model?
5. Discuss/use multi-scale nature of biochemical reaction networks.
Models of interest

Consider the simple system

    A + B → C,

where one molecule each of A and B is being converted to one of C.

Intuition for the standard stochastic model:

    P{ reaction occurs in (t, t + ∆t] | F_t } ≈ κ X_A(t) X_B(t) ∆t,

where
- F_t represents the information about the system available at time t, and
- κ is a positive constant, the reaction rate constant.
Models of interest

    A + B → C

Simple book-keeping: if X(t) = (X_A(t), X_B(t), X_C(t)) gives the state at time t,

    X(t) = X(0) + R(t) (−1, −1, 1)^T,

where
- R(t) is the # of times the reaction has occurred by time t, and
- X(0) is the initial condition.

R is a counting process:
- R(0) = 0, and
- R is constant except for jumps of plus one.

Goal: represent R(t) in terms of a Poisson process.
Models of interest

Recall that for A + B → C our intuition was

    P{ reaction occurs in (t, t + ∆t] | F_t } ≈ κ X_A(t) X_B(t) ∆t,

and that for an inhomogeneous Poisson process with rate λ(t) we have

    P{ Y_λ(t + ∆t) − Y_λ(t) > 0 | F_t } = 1 − exp( −∫_t^{t+∆t} λ(s) ds ) ≈ λ(t)∆t.

This suggests we can model

    R(t) = Y( ∫_0^t κ X_A(s) X_B(s) ds ),

where Y is a unit-rate Poisson process. Hence

    (X_A(t), X_B(t), X_C(t))^T ≡ X(t) = X(0) + (−1, −1, 1)^T Y( ∫_0^t κ X_A(s) X_B(s) ds ).

This equation uniquely determines X for all t > 0.
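Because the intensity κ X_A X_B is constant between jumps, this representation can be simulated exactly: wait an exponential time with the current rate, then fire the reaction. A minimal sketch (function name and parameter values are mine):

```python
import random

def simulate_abc(xa, xb, xc, kappa, t_max, rng):
    """Exact simulation of A + B -> C: between reactions the intensity
    kappa * XA * XB is constant, so the waiting time is Exp(kappa * XA * XB)."""
    t, path = 0.0, [(0.0, xa, xb, xc)]
    while xa > 0 and xb > 0:
        t += rng.expovariate(kappa * xa * xb)
        if t > t_max:
            break
        xa, xb, xc = xa - 1, xb - 1, xc + 1
        path.append((t, xa, xb, xc))
    return path

rng = random.Random(0)
path = simulate_abc(50, 40, 0, 0.01, 100.0, rng)
print(path[-1])
```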
General stochastic models of biochemical reactions

• We consider a network of reactions involving d chemical species, S_1, ..., S_d:

    Σ_{i=1}^d ν_ik S_i → Σ_{i=1}^d ν′_ik S_i

• ν_k ∈ Z^d_{≥0}: number of molecules of each chemical species consumed in the kth reaction.
• ν′_k ∈ Z^d_{≥0}: number of molecules of each chemical species created in the kth reaction.

Denote the reaction vector as ζ_k = ν′_k − ν_k, so that if the kth reaction occurs at time t, the new state becomes

    X(t) = X(t−) + ν′_k − ν_k = X(t−) + ζ_k.

• The intensity (or propensity) of the kth reaction is λ_k : Z^d_{≥0} → R.
• By analogy with before,

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k,

where the Y_k are independent, unit-rate Poisson processes.
Mass-action kinetics

The standard intensity function chosen is mass-action kinetics:

    λ_k(x) = κ_k ( Π_i ν_ik! ) Π_i ( x_i choose ν_ik ) = κ_k Π_i x_i! / (x_i − ν_ik)!.

1. The rate is proportional to the number of distinct subsets of the molecules present that can form inputs for the reaction.
2. This assumes the vessel is "well-stirred".

Example: If S_1 → anything, then λ_k(x) = κ_k x_1.
Example: If S_1 + S_2 → anything, then λ_k(x) = κ_k x_1 x_2.
Example: If 2S_2 → anything, then λ_k(x) = κ_k x_2(x_2 − 1).
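The two equivalent forms of the intensity are easy to check numerically; a small Python sketch (the helper name is mine):

```python
from math import comb, factorial

def mass_action(kappa, nu, x):
    """Mass-action intensity: kappa * prod_i x_i!/(x_i - nu_i)!,
    i.e. kappa * (prod_i nu_i!) * (number of distinct input subsets)."""
    if any(xi < ni for xi, ni in zip(x, nu)):
        return 0
    out = kappa
    for xi, ni in zip(x, nu):
        out *= factorial(ni) * comb(xi, ni)  # equals x_i!/(x_i - nu_i)!
    return out

# The three examples from the slide, with kappa_k = 2 and x = (5, 7):
print(mass_action(2, (1, 0), (5, 7)))  # S1 -> *:      2 * 5     = 10
print(mass_action(2, (1, 1), (5, 7)))  # S1 + S2 -> *: 2 * 5 * 7 = 70
print(mass_action(2, (0, 2), (5, 7)))  # 2 S2 -> *:    2 * 7 * 6 = 84
```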
Some examples

E. coli Heat Shock Response Model: 9 species, 18 reactions.¹

Lotka-Volterra predator-prey model: think of A as prey and B as a predator,

    A →(κ_1) 2A,    A + B →(κ_2) 2B,    B →(κ_3) ∅.

¹ Hye Won Kang, presentation at SPA in 2007.
Markov chain models

Path-wise representation:

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k.

Kolmogorov's forward equation (the "chemical master equation") describes the evolution of the distribution of the state of the system:

    (d/dt) P_x(t) = Σ_k λ_k(x − ζ_k) P_{x−ζ_k}(t) − Σ_k λ_k(x) P_x(t),

where P_x(t) is the probability that X(t) = x.

The model I've described is equivalent to this "chemical master equation" model.
Population Example: Lotka-Volterra predator-prey model

Think of A as prey and B as a predator:

    A →(κ_1) 2A,    A + B →(κ_2) 2B,    B →(κ_3) ∅,

with A(0) = B(0) = 1000 and κ_1 = 2, κ_2 = 0.002, κ_3 = 2.

Deterministic model. Let x(t) = [A(t), B(t)]^T. Then

    x(t) = x(0) + κ_1 ∫_0^t x_1(s) ds [1, 0]^T + κ_2 ∫_0^t x_1(s)x_2(s) ds [−1, 1]^T + κ_3 ∫_0^t x_2(s) ds [0, −1]^T.

Stochastic model. Let X(t) = [A(t), B(t)]^T. Then

    X(t) = X(0) + Y_1( κ_1 ∫_0^t X_1(s) ds ) [1, 0]^T + Y_2( κ_2 ∫_0^t X_1(s)X_2(s) ds ) [−1, 1]^T + Y_3( κ_3 ∫_0^t X_2(s) ds ) [0, −1]^T.
Lotka-Volterra

Think of A as prey and B as a predator:

    A →(κ_1) 2A,    A + B →(κ_2) 2B,    B →(κ_3) ∅,

with A(0) = B(0) = 1000 and κ_1 = 2, κ_2 = 0.002, κ_3 = 2.

[Figures: prey and predator sample paths for the stochastic model and the ODE model, together with the corresponding prey-predator phase portraits.]
Pathwise Representations – Random time changes

A representation for path-wise solutions of our model is given by random time changes of Poisson processes:

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k,

where the Y_k are independent, unit-rate Poisson processes.

Random time changes have an interesting history (Wolfgang Doeblin).
Outline
1. Discuss Poisson process and time changes.
2. Describe/develop model – stochastically modeled reaction networks.
3. → Numerical methods and error approximation.
4. Mathematical question for today: how do we effectively quantify how well different methods approximate the continuous time Markov chain model?
5. Discuss/use multi-scale nature of biochemical reaction networks.
Numerical methods

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k.

Good news. There are a number of numerical methods that produce statistically exact sample paths:
1. Gillespie's algorithm.
2. The first reaction method.
3. The next reaction method.

For each step of these methods one must find:
(i) the amount of time that passes until the next reaction takes place,

    ∆_n ~ Exp( Σ_k λ_k(X(t)) )

(the minimum of exponential RVs), and
(ii) which reaction takes place at that time.

Bad news. If Σ_k λ_k(X(t)) ≫ 1, then ∆_n ≈ 1 / Σ_k λ_k(X(t)), so the time to produce a single path over an interval [0, T] can be prohibitive.
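Steps (i) and (ii) together give Gillespie's algorithm (the direct method). A sketch in Python, applied to the Lotka-Volterra network from the slides (function names are mine):

```python
import random

def gillespie(x0, zetas, propensities, t_max, rng):
    """Gillespie's direct method: draw Delta_n ~ Exp(sum_k lambda_k(x)),
    then pick reaction k with probability lambda_k(x) / sum_k lambda_k(x)."""
    t, x = 0.0, list(x0)
    path = [(t, tuple(x))]
    while True:
        lams = propensities(x)
        total = sum(lams)
        if total == 0.0:          # absorbing state
            break
        t += rng.expovariate(total)
        if t > t_max:
            break
        u = rng.random() * total  # which reaction fires?
        k = 0
        while k < len(lams) - 1 and u > lams[k]:
            u -= lams[k]
            k += 1
        x = [xi + zi for xi, zi in zip(x, zetas[k])]
        path.append((t, tuple(x)))
    return path

# Lotka-Volterra: A -> 2A, A + B -> 2B, B -> 0.
K1, K2, K3 = 2.0, 0.002, 2.0
zetas = [(1, 0), (-1, 1), (0, -1)]
lv = lambda x: [K1 * x[0], K2 * x[0] * x[1], K3 * x[1]]
path = gillespie((1000, 1000), zetas, lv, 1.0, random.Random(42))
print(len(path), path[-1])
```

Note how the cost scales: at total intensity ~6000 the method takes ~6000 steps per unit of simulated time, which is exactly the "bad news" above.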
Tau-leaping

Explicit "τ-leaping"² was developed by Dan Gillespie in an effort to overcome the problem that ∆_n may be prohibitively small.

Tau-leaping is essentially an Euler approximation of ∫_0^t λ_k(X(s)) ds:

    Z(τ) = Z(0) + Σ_k Y_k( ∫_0^τ λ_k(Z(s)) ds ) ζ_k
         ≈ Z(0) + Σ_k Y_k( λ_k(Z(0)) τ ) ζ_k
         =_d Z(0) + Σ_k Poisson( λ_k(Z(0)) τ ) ζ_k.

² D. T. Gillespie, J. Chem. Phys., 115, 1716–1733.
Tau-leaping

One "intuitive" representation for Z(t) generated by τ-leaping is

    Z(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(Z ∘ η(s)) ds ) ζ_k,

where

    η(s) = t_n,    if t_n ≤ s < t_{n+1} = t_n + τ,

is a step function giving the left endpoints of the time discretization.
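A sketch of explicit Euler τ-leaping in Python (the Poisson sampler uses Knuth's method, adequate for the moderate means here; all names are mine):

```python
import math
import random

def poisson_sample(mean, rng):
    """Knuth's method: count uniforms until their product drops below e^{-mean}."""
    if mean <= 0.0:
        return 0
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def tau_leap(x0, zetas, propensities, tau, n_steps, rng):
    """Explicit Euler tau-leaping: freeze the intensities over each step of
    length tau and fire Poisson(lambda_k(Z) * tau) copies of reaction k.
    (Note: states can go negative; the basic method does not prevent this.)"""
    z, path = list(x0), [tuple(x0)]
    for _ in range(n_steps):
        lams = propensities(z)
        for k, lam in enumerate(lams):
            n = poisson_sample(lam * tau, rng)
            z = [zi + n * d for zi, d in zip(z, zetas[k])]
        path.append(tuple(z))
    return path

# Lotka-Volterra again: A -> 2A, A + B -> 2B, B -> 0.
K1, K2, K3 = 2.0, 0.002, 2.0
zetas = [(1, 0), (-1, 1), (0, -1)]
lv = lambda x: [K1 * x[0], K2 * x[0] * x[1], K3 * x[1]]
path = tau_leap((1000, 1000), zetas, lv, 0.05, 20, rng=random.Random(3))
print(path[-1])
```

Each step now costs one Poisson draw per reaction channel, regardless of how large the intensities are.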
Another algorithm: A midpoint method

For a time discretization 0 = t_0 < t_1 < · · · < t_N = T, with τ = t_n − t_{n−1}, let

    ρ(z) = z + (τ/2) Σ_k λ_k(z) ζ_k

be a "deterministic" midpoint approximation, and let Z(t) solve

    Z(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k ∘ ρ(Z ∘ η(s)) ds ) ζ_k,

where η(s) = t_n if t_n ≤ s < t_{n+1} = t_n + τ.
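The midpoint variant differs from Euler τ-leaping only in where the intensities are evaluated. A sketch (the Poisson sampler is repeated so the snippet stands alone; names are mine):

```python
import math
import random

def poisson_sample(mean, rng):
    """Knuth's method; adequate for moderate means."""
    if mean <= 0.0:
        return 0
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def midpoint_leap(x0, zetas, propensities, tau, n_steps, rng):
    """Midpoint tau-leaping: fire Poisson(lambda_k(rho(z)) * tau) copies of
    reaction k, where rho(z) = z + (tau/2) * sum_k lambda_k(z) zeta_k is the
    deterministic midpoint approximation."""
    z, path = list(x0), [tuple(x0)]
    for _ in range(n_steps):
        lams = propensities(z)
        rho = [zi + 0.5 * tau * sum(l * zeta[i] for l, zeta in zip(lams, zetas))
               for i, zi in enumerate(z)]
        for k, lam in enumerate(propensities(rho)):
            n = poisson_sample(lam * tau, rng)
            z = [zi + n * d for zi, d in zip(z, zetas[k])]
        path.append(tuple(z))
    return path

K1, K2, K3 = 2.0, 0.002, 2.0
zetas = [(1, 0), (-1, 1), (0, -1)]
lv = lambda x: [K1 * x[0], K2 * x[0] * x[1], K3 * x[1]]
path = midpoint_leap((1000, 1000), zetas, lv, 0.05, 20, rng=random.Random(5))
print(path[-1])
```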
A new algorithm: Weak trapezoidal method

Fixing a θ ∈ (0, 1), we define

    ξ_1 := 1 / ( 2θ(1 − θ) )    and    ξ_2 := ( (1 − θ)² + θ² ) / ( 2θ(1 − θ) ).

Repeat the following steps until t_{n+1} > T; in each step we first compute a θ-midpoint y*, and then the new value Z(t_{n+1}):

1. Set y* = Z(t_n) + Σ_k Y_{k,n,1}( λ_k(Z(t_n)) θh ) ζ_k.
2. Set Z(t_{n+1}) = y* + Σ_k Y_{k,n,2}( [ξ_1 λ_k(y*) − ξ_2 λ_k(Z(t_n))]_+ (1 − θ)h ) ζ_k.
3. Set t_{n+1} = t_n + h.

Notes:
1. If θ = 1/2, then ξ_1 = 2 and ξ_2 = 1.
2. The first step is just Euler's method on a partial interval.
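The two stages can be sketched directly, with the fresh Poisson processes Y_{k,n,1}, Y_{k,n,2} becoming independent Poisson draws (the sampler is repeated for self-containment; names are mine):

```python
import math
import random

def poisson_sample(mean, rng):
    """Knuth's method; adequate for moderate means."""
    if mean <= 0.0:
        return 0
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def trap_weights(theta):
    """xi_1 and xi_2 from the definition; theta = 1/2 gives (2, 1)."""
    xi1 = 1.0 / (2.0 * theta * (1.0 - theta))
    xi2 = ((1.0 - theta) ** 2 + theta ** 2) / (2.0 * theta * (1.0 - theta))
    return xi1, xi2

def weak_trap_step(z, zetas, propensities, h, theta, rng):
    """One step of the weak trapezoidal method."""
    xi1, xi2 = trap_weights(theta)
    lam_z = propensities(z)
    # Step 1: Euler's method on the partial interval of length theta*h.
    y = list(z)
    for k, lam in enumerate(lam_z):
        n = poisson_sample(lam * theta * h, rng)
        y = [yi + n * d for yi, d in zip(y, zetas[k])]
    # Step 2: fire with intensity [xi1*lam_k(y*) - xi2*lam_k(z)]_+ over (1-theta)*h.
    lam_y = propensities(y)
    for k in range(len(zetas)):
        rate = max(xi1 * lam_y[k] - xi2 * lam_z[k], 0.0)
        n = poisson_sample(rate * (1.0 - theta) * h, rng)
        y = [yi + n * d for yi, d in zip(y, zetas[k])]
    return y

K1, K2, K3 = 2.0, 0.002, 2.0
zetas = [(1, 0), (-1, 1), (0, -1)]
lv = lambda x: [K1 * x[0], K2 * x[0] * x[1], K3 * x[1]]
z1 = weak_trap_step([1000, 1000], zetas, lv, 0.05, 0.5, random.Random(9))
print(z1)
```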
Weak trapezoidal method: intuition

An alternative representation for

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k

is

    X(t) = X(0) + Σ_k ζ_k ∫_0^t ∫_0^∞ 1_{[0, λ_k(X(s)))}(u) μ_k(du × ds),

where the μ_k are independent Poisson random measures with Lebesgue mean measure (area):

1. For disjoint A, B ⊂ R², μ_k(A) and μ_k(B) are independent, and
2. μ_k(A) and μ_k(B) are Poisson with parameters Area(A) and Area(B), respectively.
Why this works: θ = 1/2

[Figures: the rectangle picture for one step, with heights λ_k(X(0)) and λ_k(y*), partitioned into Regions 1–5; the Poisson-measure areas of these regions show how the two half-steps couple to the exact process.]
Error analysis

- Let X(t) denote some process and Z_h(t) be an approximation.
- There are multiple ways to understand how good an approximation Z_h is to X.

1. Strong error approximation. Bound

    sup_{t≤T} ( E_{x_0} |X(t) − Z_h(t)|^p )^{1/p} ≤ C h^q    or    ( E_{x_0} sup_{t≤T} |X(t) − Z_h(t)|^p )^{1/p} ≤ C h^q.

Common choices are p = 1, 2.

2. Weak error approximation. For a large class of real-valued test functions f, bound

    sup_{t≤T} | E_{x_0} f(X(t)) − E_{x_0} f(Z_h(t)) | ≤ C h^q.
Error analysis

Under the scaling h → 0:

1. Li³ and also Rathinam, Petzold, Cao, and Gillespie⁴ showed, among other things (implicit methods, etc.), that explicit Euler tau-leaping has a strong error (in the L² norm) of order 1/2 and a weak error of order one:

    sup_{n≤N} ( E_{x_0} |Z(t_n) − X(t_n)|² )^{1/2} ≤ C h^{1/2},
    | E_{x_0} f(Z(T)) − E_{x_0} f(X(T)) | ≤ C h,

where 0 = t_0 < t_1 < · · · < t_N = T is a partition of [0, T].

2. The midpoint method has the same order of accuracy as explicit Euler tau-leaping as h → 0.

3. Many examples show the midpoint method is more accurate than Euler: this led to work in the "classical scaling."⁵

4. Nothing is known about the weak trapezoidal algorithm.

³ T. Li, SIAM Multi. Model. Simul., 6, 2007, 417–436.
⁴ M. Rathinam et al., SIAM Multi. Model. Simul., 4, 2005, 867–895.
⁵ A., Ganguly, Kurtz, Ann. Applied Prob., 2011.
Example

Again think of A as prey and B as a predator:

    A →(κ_1) 2A,    A + B →(κ_2) 2B,    B →(κ_3) ∅,

with A(0) = B(0) = 1000 and κ_1 = 2, κ_2 = 0.002, κ_3 = 2.

Letting τ = 1/20 and simulating 30,000 sample paths with each method yields the following approximate distributions for B(10):

[Figure: approximate distributions of B(10) under the Gillespie algorithm, Euler tau-leaping, midpoint tau-leaping, and the weak trapezoidal method.]
What's happening?

Recall that tau-leaping methods are used when h ≫ E∆_n, for otherwise an exact method would be performed. Therefore, we should require that

    h ≫ 1 / Σ_k λ_k(X(t)) = E∆_n,

while

    Σ_k λ_k(X(t)) ≫ 1.

So h → 0 does not tell the whole story ... what to do?

Explicitly take the multi-scale nature of the systems into account.
Multi-scale nature of biochemical reaction networks

Let N ≫ 1. Assume that we are given a model of the form

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ′_k(X(s)) ds ) ζ_k,

where the λ′_k are of the mass-action form

    λ′_k(x) = κ′_k ( Π_i ν_ik! ) Π_i ( x_i choose ν_ik ).

For each species i, define the normalized abundance

    X_i^N(t) = X_i(t) / N^{α_i},

where α_i ≥ 0 should be selected so that X_i^N = O(1).

Here X_i^N may be the species number (α_i = 0), the species concentration, or something else.
Multi-scale nature of biochemical reaction networks

Rate constants κ′_k may also vary over several orders of magnitude. We write

    κ′_k = κ_k N^{β_k},

where the β_k are selected so that κ_k = O(1).

Thus:

1. For a binary reaction S_i + S_j → ∗,

    κ′_k x_i x_j = N^{β_k + α_i + α_j} κ_k z_i z_j,

and we can write β_k + α_i + α_j = β_k + ν_k · α.

2. We also have, for S_i → ∗ and 2S_i → ∗ respectively,

    κ′_k x_i = N^{β_k + ν_k·α} κ_k z_i,
    κ′_k x_i(x_i − 1) = N^{β_k + ν_k·α} κ_k z_i( z_i − N^{−α_i} ),

with similar expressions for intensities involving higher order reactions.
Multi-scale nature of biochemical reaction networks

Our model has become

    X_i^N(t) = X_i^N(0) + Σ_k N^{−α_i} Y_k( N^{β_k + ν_k·α} ∫_0^t λ_k(X^N(s)) ds ) ζ_ik,    i ∈ {1, ..., d}.

Remark: If β_k + ν_k · α = α_i = 1 for all i, k, then the model is

    X^N(t) = X^N(0) + Σ_k (1/N) Y_k( N ∫_0^t λ_k(X^N(s)) ds ) ζ_k,

which is typically called the classical scaling.

- In this case it is natural to consider X^N as a vector whose ith component gives the concentration, in moles per unit volume, of the ith species.
- For large N, one can use a Langevin/diffusion approximation.
- As N → ∞, we converge to the ODE model with mass action kinetics.
- It was specifically this scaling that was used in the analyses of Euler and midpoint τ-leaping found in A., Ganguly, Kurtz, Ann. Appl. Prob., 2011.
A case for simulation

Our model has become

    X_i^N(t) = X_i^N(0) + Σ_k N^{−α_i} Y_k( N^{β_k + ν_k·α} ∫_0^t λ_k(X^N(s)) ds ) ζ_ik,    i ∈ {1, ..., d}.

In this general setting, methods of model simplification/reduction:

1. Langevin/diffusion approximations, or
2. linear noise/Van Kampen approximations, or
3. averaging ideas to reduce the system, or
4. ODE/law of large numbers types of approximations

are exceedingly difficult. Very active line of current research: Kurtz, Popovic, Kang, etc.
Weak error analysis on normalized processes

    X_i^N(t) = X_i^N(0) + Σ_k N^{−α_i} Y_k( N^{β_k + ν_k·α} ∫_0^t λ_k(X^N(s)) ds ) ζ_ik,    i ∈ {1, ..., d}.

- Denote the exact process by X^N and the approximate process by Z^N.
- For f ∈ C_0(R^d, R), define the operators P_t (for the exact process) and 𝒫_t (for the approximate one) by

    P_t f(x) := E_x f(X^N(t)),    𝒫_t f(x) := E_x f(Z^N(t)).

We want to understand

    P_t f(x) − 𝒫_t f(x) = E_x f(X^N(t)) − E_x f(Z^N(t)).
Weak error analysis on normalized processes

    X_i^N(t) = X_i^N(0) + Σ_k N^{−α_i} Y_k( N^{β_k + ν_k·α} ∫_0^t λ_k(X^N(s)) ds ) ζ_ik,    i ∈ {1, ..., d}.    (1)

Define

    γ := max_{ {i,k : ζ_ik ≠ 0} } { β_k + ν_k · α − α_i }.

Then N^γ should be interpreted as the "time-scale" of the system.

Example. If the system is

    S →(100) ∅,    X(0) = 10,000,

then N = 10,000 and κ′ = 100 = N^{1/2}. The normalized system is

    X^N(t) = 1 − (1/N) Y( N^{1/2} N ∫_0^t X^N(s) ds ).

Here, γ = 1/2 and N^γ = 100.
Weak error analysis on normalized processes

Define ζ_k^N component-wise:

    ζ_ik^N := N^{−α_i} ζ_ik.

The model is

    X^N(t) = X^N(0) + Σ_k Y_k( N^γ N^{β_k + ν_k·α − γ} ∫_0^t λ_k(X^N(s)) ds ) ζ_k^N.

Define a "directional derivative" in the direction ζ_k^N:

    ∇_k^N f(x) := N^{β_k + ν_k·α − γ} ( f(x + ζ_k^N) − f(x) ).

Now define a "j-norm":

    ||f||_j^{∇N} = sup{ || Π_{i=1}^p ∇_{ℓ_i}^N f ||_∞ : 1 ≤ ℓ_i ≤ m, p ≤ j }.

Finally, define, for any operator Q : C_0(R^d, R) → C_0(R^d, R),

    ||Q||_{j→ℓ}^{∇N} = sup_{f ∈ C_0} ||Qf||_ℓ^{∇N} / ||f||_j^{∇N}.
Weak error analysis on normalized processes

We want

    |E_x f(X^N(t)) − E_x f(Z^N(t))| ≤ ||P_t f − 𝒫_t f||_∞ = ||(P_t − 𝒫_t) f||_0^{∇N} ≤ ||P_t − 𝒫_t||_{m→0}^{∇N} ||f||_m^{∇N}.

Want: for t ≤ T,

    ||P_t − 𝒫_t||_{m→0}^{∇N} = O(h^q)

for the different methods.
Weak error analysis

    P_t f(x) := E_x f(X^N(t)),    𝒫_t f(x) := E_x f(Z^N(t)).

Theorem (A., Koyama – 2011). For any n, m ≥ 0 and h > 0 (think: T = nh),

    ||P_{nh} − 𝒫_h^n||_{m→0}^{∇N} = O( n ||P_h − 𝒫_h||_{m→0}^{∇N} max_{ℓ ∈ {1,...,n}} ||P_{ℓh}||_{m→m}^{∇N} ).

Points:
1. The last part is independent of the method; it depends only on the behavior of the exact process.
2. The rest says: a one-step (local) error of O(h^{q+1}) implies a global error of O(h^q). So we can focus on the local, one-step error.
Weak error analysis

Proof. Note that

    ||𝒫_h f||_0 = sup_x |E_x f(Z^N(h))| ≤ ||f||_0.

Thus, 𝒫_h is a contraction, i.e. ||𝒫_h||_{0→0}^{∇N} ≤ 1. With this in mind,

    ||(P_{nh} − 𝒫_h^n) f||_0 = || Σ_{j=1}^n ( 𝒫_h^{j−1} P_{h(n−j+1)} − 𝒫_h^j P_{h(n−j)} ) f ||_0    (2)
        ≤ Σ_{j=1}^n || 𝒫_h^{j−1} ( P_h − 𝒫_h ) P_{h(n−j)} f ||_0
        ≤ Σ_{j=1}^n ||𝒫_h^{j−1}||_{0→0}^{∇N} ||P_h − 𝒫_h||_{m→0}^{∇N} ||P_{h(n−j)}||_{m→m}^{∇N} ||f||_m^{∇N}.

Since 𝒫_h is a contraction, i.e. ||𝒫_h||_{0→0}^{∇N} ≤ 1, the result is shown.
One step error analyses

Theorem (A., Koyama – 2011). For forward Euler, if h < N^{−γ},

    ||P_h − 𝒫_h||_{2→0}^{∇N} = O(N^{2γ} h²).

Corollary. If Z^N is generated via forward Euler, then

    sup_{t≤T} |E_{x_0} f(X^N(t)) − E_{x_0} f(Z^N(t))| = O(N^{2γ} h).
One step error analyses

Define

    η_k := min{ α_i : ζ_ik^N ≠ 0 }.

Example: if ζ_k^N = ( N^{−2}, 0, −N^{−1}, N^{−1/2} )^T, then η_k = 1/2. Thus, |ζ_k^N| = O(N^{−η_k}).

Theorem (A., Koyama – 2011). For the midpoint method, if h < N^{−γ},

    ||P_h − 𝒫_h||_{3→0}^{∇N} = O( N^{3γ} h³ + N^{2γ − min{η_k}} h² ).

Corollary. If Z^N is generated via the midpoint method, then

    sup_{t≤T} |E_{x_0} f(X^N(t)) − E_{x_0} f(Z^N(t))| = O( N^{3γ} h² + N^{2γ − min{η_k}} h ).

Notes:
1. In the classical scaling, η_k ≡ 1 and γ = 0. Further, h ≫ 1/N. So the error is O(h²).
2. In other scalings, the error can be O(h).
3. The error is O(h²) if h ≫ N^{−γ − min{η_k}}.
One step error analyses

Theorem (A., Koyama – 2011). For the weak trapezoidal method, if h < N^{−γ},

    ||P_h − 𝒫_h||_{3→0}^{∇N} = O(N^{3γ} h³).

Corollary. If Z^N is generated via the weak trapezoidal method, then

    sup_{t≤T} |E_{x_0} f(X^N(t)) − E_{x_0} f(Z^N(t))| = O(N^{3γ} h²).
Flavor of proof: Euler's method

The generator for any Markov process X is defined to be the operator A satisfying

    A f(x) = lim_{h→0} ( E_x f(X(h)) − f(x) ) / h.

Our model is

    X^N(t) = X^N(0) + Σ_k Y_k( N^γ N^{β_k + ν_k·α − γ} ∫_0^t λ_k(X^N(s)) ds ) ζ_k^N.

The generator for our process is the operator A^N:

    A^N f(x) = N^γ Σ_k N^{β_k + ν_k·α − γ} λ_k(x) ( f(x + ζ_k^N) − f(x) ).

Recalling that we defined

    ∇_k^N f(x) := N^{β_k + ν_k·α − γ} ( f(x + ζ_k^N) − f(x) ),

this is

    A^N f(x) = N^γ Σ_k λ_k(x) ∇_k^N f(x) = N^γ ( λ · ∇^N ) f(x).
Flavor of proof: Euler's method

Generators are useful because

    E_x f(X^N(t)) = f(x) + E_x ∫_0^t A^N f(X^N(s)) ds.

Thus, they are the right analytical tool for us to understand E_x f(X^N(t)).
Flavor of proof: Euler's method

Over one step, Euler's method solves

    Z^N(h) = x_0 + Σ_k Y_k( N^γ N^{β_k + ν_k·α − γ} ∫_0^h λ_k(x_0) ds ) ζ_k^N.

The generator of this process is

    B^N f(x) := N^γ Σ_k N^{β_k + ν_k·α − γ} λ_k(x_0) ( f(x + ζ_k^N) − f(x) ) = N^γ ( λ(x_0) · ∇^N ) f(x),

and

    E_{x_0} f(Z^N(h)) = f(x_0) + E_{x_0} ∫_0^h B^N f(Z^N(s)) ds.
Flavor of proof: Euler's method

    E_{x_0} f(X^N(h)) = f(x_0) + E_{x_0} ∫_0^h A^N f(X^N(s)) ds
        = f(x_0) + h A^N f(x_0) + E_{x_0} ∫_0^h ∫_0^s (A^N)² f(X^N(r)) dr ds
        = f(x_0) + h A^N f(x_0) + (h²/2) (A^N)² f(x_0) + O( N^{3γ} ||f||_3^{∇N} h³ ),

and

    E_{x_0} f(Z^N(h)) = f(x_0) + h B^N f(x_0) + (h²/2) (B^N)² f(x_0) + O( N^{3γ} ||f||_3^{∇N} h³ ).

Now do lots of computations and compare.
Flavor of proof: other methods

1. The midpoint method simply has a different generator B^N. Repeat the basic analysis.

2. The weak trapezoidal method is harder because the generator changes half-way through the time step. Instead use

    E_{x_0} f(Z(h)) = E_{x_0}[ E_{x_0}[ f(Z(h)) | y* ] ],

   that is:
   i. work on the inner expectation with one operator, and then
   ii. work on the outer expectation with a different operator.
Example

Consider

    A ⇌ B ⇌ C,

with each κ_k ≡ 1 and X(0) = [1000, 1000, 4000]. Log-log plot of |E Z_C(1) − E X_C(1)|:

[Figure: log-log error plot for the Euler, midpoint, and weak trapezoidal methods.]

Slopes: Euler = 0.9957, Midpoint = 2.7, Weak Trap = 2.32.
Example

Consider

    A ⇌ B ⇌ C,

with each κ_k ≡ 1 and X(0) = [1000, 1000, 4000]. Log-log plot of second moments:

[Figure: log-log plot of the second moments for the Euler, midpoint, and weak trapezoidal methods.]
Careful!

Consider

    A ⇌ B ⇌ C,

with each κ_k ≡ 1 and X(0) = [1000, 1000, 4000]. Log-log plot of |Var(Z_C(1)) − Var(X_C(1))|:

[Figure: log-log variance-error plot for the Euler, midpoint, and weak trapezoidal methods.]

Slopes: Euler = 0.9831, Midpoint = 1.0525, Weak Trap = 2.2291.
Careful!

For the midpoint method in the classical scaling:

    E(Z/N)² − E(X/N)² ≈ c_1 h² + c_2 h/N,
    E(Z/N) − E(X/N) ≈ c_3 h² + c_4 h/N.

Thus,

    Var(X) − Var(Z) ≈ (c_1 h² N² − c_2 hN) − 2(c_3 h² N + c_4 h) E[Z],

which could be (and was in the previous example) O(hN).
Take home messages
1. Simulation of continuous time Markov chains is easy.....unless it’s not.
2. Calculating error estimates must be done with care.
3. Incorporating the natural scales of the process plays an important role.
Thank you!