Numerical approximation methods for stochastically modeled biochemical reaction networks

David F. Anderson ([email protected])
Department of Mathematics, University of Wisconsin - Madison
Illinois Math Bio Seminar, February 24th, 2011

Outline
1. Discuss the Poisson process and time changes.
2. Describe/develop the model: stochastically modeled reaction networks.
3. Numerical methods and error approximation.
4. Mathematical question for today: how do we effectively quantify how well different methods approximate the continuous time Markov chain model?
5. Discuss/use the multi-scale nature of biochemical reaction networks.

The Poisson process
• The models I am going to discuss are a subset of the class of models termed continuous time Markov chains.
• The simplest continuous time Markov chain is the Poisson process.

A Poisson process, Y(·), is a model for a series of random observations occurring in time.
(a) Let {ξ_i} be i.i.d. exponential random variables with parameter one.
(b) Now, put points down on a line with spacing equal to the ξ_i.

[Figure: points laid down on a time axis with successive gaps ξ_1, ξ_2, ξ_3, ...]

• Let Y(t) denote the number of points hit by time t.
• In the figure above, Y(t) = 6.

Intuition: the unit-rate Poisson process is simply the number of points hit when we run along the time frame at rate one.
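The construction in (a) and (b) can be checked directly in code: lay down i.i.d. Exp(1) gaps and count how many partial sums fall in [0, t]. A minimal sketch (the helper name `poisson_count` and the NumPy dependency are my choices, not from the talk):

```python
import numpy as np

def poisson_count(t, rng, chunk=64):
    """Number of points of a unit-rate Poisson process hit by time t.

    Points are laid down with i.i.d. Exp(1) spacings xi_i; Y(t) counts
    how many partial sums xi_1 + ... + xi_n land in [0, t].
    """
    total, count = 0.0, 0
    while True:
        gaps = rng.exponential(1.0, size=chunk)  # draw spacings in batches
        for g in gaps:
            total += g
            if total > t:
                return count
            count += 1

rng = np.random.default_rng(0)
samples = [poisson_count(10.0, rng) for _ in range(2000)]
# By the law of large numbers, the sample mean of Y(10) should sit near 10.
print(sum(samples) / len(samples))
```

Averaging many samples of Y(10) recovers the mean 10, consistent with Y(t) being Poisson(t).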
[Figure: a sample path of the unit-rate Poisson process (λ = 1) on [0, 20].]

The Poisson process
Let
• Y be a unit-rate Poisson process, and
• define Y_λ(t) ≡ Y(λt).
Then Y_λ is a Poisson process with parameter λ.

Intuition: the Poisson process with rate λ is simply the number of points of the unit-rate point process hit when we run along the time frame at rate λ. Thus, we have "changed time" to convert a unit-rate Poisson process to one which has rate λ.

[Figure: a sample path of a rate-λ Poisson process with λ = 3 on [0, 20].]

There is no reason λ needs to be constant in time, in which case

    Y_λ(t) ≡ Y( ∫_0^t λ(s) ds )

is an inhomogeneous Poisson process.

Key property: Y(T + Δ) − Y(T) ~ Poisson(Δ). Thus

    P{ Y_λ(t + Δt) − Y_λ(t) > 0 | F_t } = 1 − exp( −∫_t^{t+Δt} λ(s) ds ) ≈ λ(t) Δt.

Points:
1. We have "changed time" to convert a unit-rate Poisson process to one which has rate, or intensity, λ(t).
2. We will use more complicated time changes of unit-rate processes to build the models of interest.
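The time-change recipe Y_λ(t) = Y(∫_0^t λ(s) ds) translates directly into a simulation: compute the internal clock ∫λ on a grid, then invert it at the jump points of a unit-rate process. A sketch, with the rate λ(s) = 2 + sin s chosen purely for illustration:

```python
import numpy as np

def inhomogeneous_poisson(lam, t_end, rng):
    """Jump times of Y(int_0^t lam(s) ds) via a time change of a
    unit-rate Poisson process Y; the internal clock is inverted
    numerically on a fine grid (trapezoid rule)."""
    grid = np.linspace(0.0, t_end, 10001)
    internal = np.concatenate(([0.0], np.cumsum(
        0.5 * (lam(grid[1:]) + lam(grid[:-1])) * np.diff(grid))))
    jumps = []
    clock = rng.exponential(1.0)   # first point of the unit-rate process
    while clock <= internal[-1]:
        jumps.append(np.interp(clock, internal, grid))  # invert the clock
        clock += rng.exponential(1.0)                   # next Exp(1) gap
    return np.array(jumps)

rng = np.random.default_rng(1)
jumps = inhomogeneous_poisson(lambda s: 2.0 + np.sin(s), 20.0, rng)
print(len(jumps))   # roughly int_0^20 (2 + sin s) ds ≈ 41 points on average
```

The expected number of jumps equals the integral of the rate, here about 41 on [0, 20].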
Models of interest
Consider the simple system

    A + B → C,

where one molecule each of A and B is being converted to one of C.

Intuition for the standard stochastic model:

    P{ reaction occurs in (t, t + Δt] | F_t } ≈ κ X_A(t) X_B(t) Δt,

where
• F_t represents the information about the system available at time t, and
• κ is a positive constant, the reaction rate constant.

Simple book-keeping: if X(t) = (X_A(t), X_B(t), X_C(t)) gives the state at time t, then

    X(t) = X(0) + R(t) (−1, −1, 1)^T,

where
• R(t) is the number of times the reaction has occurred by time t, and
• X(0) is the initial condition.

R is a counting process:
• R(0) = 0, and
• R is constant except for jumps of plus one.

Goal: represent R(t) in terms of a Poisson process.

Recall that for A + B → C our intuition was

    P{ reaction occurs in (t, t + Δt] | F_t } ≈ κ X_A(t) X_B(t) Δt,

and that for an inhomogeneous Poisson process with rate λ(t) we have

    P{ Y_λ(t + Δt) − Y_λ(t) > 0 | F_t } = 1 − exp( −∫_t^{t+Δt} λ(s) ds ) ≈ λ(t) Δt.
This suggests we can model

    R(t) = Y( ∫_0^t κ X_A(s) X_B(s) ds ),

where Y is a unit-rate Poisson process. Hence

    X(t) = (X_A(t), X_B(t), X_C(t))^T = X(0) + Y( ∫_0^t κ X_A(s) X_B(s) ds ) (−1, −1, 1)^T.

This equation uniquely determines X for all t > 0.

General stochastic models of biochemical reactions
• We consider a network of reactions involving d chemical species, S_1, ..., S_d:

    Σ_{i=1}^d ν_ik S_i → Σ_{i=1}^d ν'_ik S_i.

• ν_k ∈ Z^d_{≥0}: the number of molecules of each chemical species consumed in the kth reaction.
• ν'_k ∈ Z^d_{≥0}: the number of molecules of each chemical species created in the kth reaction.

Denote the reaction vector by ζ_k = ν'_k − ν_k, so that if the kth reaction occurs at time t, the new state becomes

    X(t) = X(t−) + ν'_k − ν_k = X(t−) + ζ_k.

• The intensity (or propensity) of the kth reaction is λ_k : Z^d_{≥0} → R.
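For the single reaction A + B → C, the time-changed-Poisson representation above can be simulated exactly: between jumps the intensity κ X_A X_B is constant, so each waiting time is exponential with that rate. A sketch with illustrative parameter values (the function name and numbers are mine):

```python
import numpy as np

def simulate_AB_to_C(xA, xB, kappa, t_end, rng):
    """Exact path of A + B -> C.  Between jumps the intensity
    kappa*XA*XB is constant, so the time-changed Poisson representation
    reduces to drawing an Exp(kappa*XA*XB) waiting time per reaction."""
    t, xC, path = 0.0, 0, [(0.0, xA, xB, 0)]
    while True:
        lam = kappa * xA * xB
        if lam == 0.0:              # one reactant exhausted: no more jumps
            break
        t += rng.exponential(1.0 / lam)
        if t > t_end:
            break
        xA, xB, xC = xA - 1, xB - 1, xC + 1
        path.append((t, xA, xB, xC))
    return path

rng = np.random.default_rng(2)
path = simulate_AB_to_C(50, 40, 0.01, 10.0, rng)
t_last, a, b, c = path[-1]
print(a, b, c)
```

Note the built-in conservation: every firing removes one A and one B and adds one C, so X_A + X_C and X_B + X_C are constant along the path.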
• By analogy with before,

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k,

where the Y_k are independent, unit-rate Poisson processes.

Mass-action kinetics
The standard intensity function chosen is mass-action kinetics:

    λ_k(x) = κ_k Π_i ν_ik! (x_i choose ν_ik) = κ_k Π_i x_i! / (x_i − ν_ik)!.

1. The rate is proportional to the number of distinct subsets of the molecules present that can form inputs for the reaction.
2. This assumes the vessel is "well-stirred".

Example: If S_1 → anything, then λ_k(x) = κ_k x_1.
Example: If S_1 + S_2 → anything, then λ_k(x) = κ_k x_1 x_2.
Example: If 2S_2 → anything, then λ_k(x) = κ_k x_2 (x_2 − 1).

Some examples
E. coli heat shock response model: 9 species, 18 reactions.¹

Lotka-Volterra predator-prey model: think of A as a prey and B as a predator.

    A →(κ_1) 2A,    A + B →(κ_2) 2B,    B →(κ_3) ∅.

¹ Hye Won Kang, presentation at SPA in 2007.

Markov chain models
Path-wise representation:

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k.

Kolmogorov's forward equation (the "chemical master equation") describes the evolution of the distribution of the state of the system:

    (d/dt) P_x(t) = Σ_k λ_k(x − ζ_k) P_{x−ζ_k}(t) − Σ_k λ_k(x) P_x(t),

where P_x(t) is the probability that X(t) = x. The model I've described is equivalent to this "chemical master equation" model.

Population example: Lotka-Volterra predator-prey model
Think of A as a prey and B as a predator:

    A →(κ_1) 2A,    A + B →(κ_2) 2B,    B →(κ_3) ∅,

with A(0) = B(0) = 1000 and κ_1 = 2, κ_2 = 0.002, κ_3 = 2.

Deterministic model. Let x(t) = [A(t), B(t)]^T. Then

    x(t) = x(0) + ∫_0^t κ_1 x_1(s) ds [1, 0]^T + ∫_0^t κ_2 x_1(s) x_2(s) ds [−1, 1]^T + ∫_0^t κ_3 x_2(s) ds [0, −1]^T.
Stochastic model. Let X(t) = [A(t), B(t)]^T. Then

    X(t) = X(0) + Y_1( ∫_0^t κ_1 X_1(s) ds ) [1, 0]^T + Y_2( ∫_0^t κ_2 X_1(s) X_2(s) ds ) [−1, 1]^T + Y_3( ∫_0^t κ_3 X_2(s) ds ) [0, −1]^T.

[Figure: stochastic and ODE trajectories of the predator-prey model (prey and predator abundances versus time, and the predator-prey phase plane) with A(0) = B(0) = 1000, κ_1 = 2, κ_2 = 0.002, κ_3 = 2.]

Pathwise representations - random time changes
A representation for path-wise solutions of our model is given by random time changes of Poisson processes:

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k,

where the Y_k are independent, unit-rate Poisson processes. Random time changes have an interesting history (Wolfgang Doeblin).

Numerical methods
Good news. There are a number of numerical methods that produce statistically exact sample paths:
1. Gillespie's algorithm.
2. The first reaction method.
3. The next reaction method.
For each step of these methods one must find:
(i) the amount of time that passes until the next reaction takes place,

    Δ_n ~ Exp( Σ_k λ_k(X(t)) )

(the minimum of exponential random variables), and
(ii) which reaction takes place at that time.

Bad news. If Σ_k λ_k(X(t)) ≫ 1, then

    Δ_n ≈ 1 / Σ_k λ_k(X(t)) ≪ 1,

and the time needed to produce a single path over an interval [0, T] can be prohibitive.

Tau-leaping
Explicit "τ-leaping"² was developed by Dan Gillespie in an effort to overcome the problem that Δ_n may be prohibitively small. Tau-leaping is essentially an Euler approximation of ∫_0^t λ_k(X(s)) ds:

    Z(τ) = Z(0) + Σ_k Y_k( ∫_0^τ λ_k(Z(s)) ds ) ζ_k
         ≈ Z(0) + Σ_k Y_k( λ_k(Z(0)) τ ) ζ_k
         = Z(0) + Σ_k Poisson( λ_k(Z(0)) τ ) ζ_k.

² D. T. Gillespie, J. Chem. Phys., 115, 1716-1733.

One "intuitive" representation for the Z(t) generated by τ-leaping is

    Z(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(Z ∘ η(s)) ds ) ζ_k,

where η(s) = t_n if t_n ≤ s < t_{n+1} = t_n + τ is the step function giving the left endpoints of the time discretization.

Another algorithm: a midpoint method
For a time discretization 0 = t_0 < t_1 < ... < t_N = T with τ = t_n − t_{n−1}, let

    ρ(z) = z + (τ/2) Σ_k λ_k(z) ζ_k

be a "deterministic" midpoint approximation, and let Z(t) solve

    Z(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k( ρ(Z ∘ η(s)) ) ds ) ζ_k,

where η(s) = t_n if t_n ≤ s < t_{n+1} = t_n + τ.

A new algorithm: the weak trapezoidal method
Fixing θ ∈ (0, 1), define

    ξ_1 = 1 / (2θ(1 − θ))   and   ξ_2 = ((1 − θ)² + θ²) / (2θ(1 − θ)).

Repeat the following steps until t_{n+1} > T, in which we first compute a θ-midpoint y* and then the new value Z(t_{n+1}):
1. Set y* = Z(t_n) + Σ_k Y_{k,n,1}( λ_k(Z(t_n)) θh ) ζ_k.
2. Set Z(t_{n+1}) = y* + Σ_k Y_{k,n,2}( [ξ_1 λ_k(y*) − ξ_2 λ_k(Z(t_n))]_+ (1 − θ)h ) ζ_k.
3. Set t_{n+1} = t_n + h.

Notes:
1. If θ = 1/2, then ξ_1 = 2 and ξ_2 = 1.
2. The first step is just Euler's method on a partial interval.

Weak trapezoidal method: intuition
An alternative representation for

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ_k(X(s)) ds ) ζ_k

is

    X(t) = X(0) + Σ_k ζ_k ∫_0^t ∫_0^∞ 1_{[0, λ_k(X(s)))}(u) μ_k(du × ds),

where the μ_k are independent Poisson random measures with Lebesgue mean measure (area):
1. for disjoint A, B ⊂ R², μ_k(A) and μ_k(B) are independent, and
2. μ_k(A) and μ_k(B) are Poisson with parameters Area(A) and Area(B), respectively.

[Figure: "Why this works" for θ = 1/2: the strips below λ_k(X(0)) and λ_k(y*) are split into regions whose areas match the intensities used in the two half-steps.]

Error analysis
• Let X(t) denote some process and Z_h(t) an approximation.
• There are multiple ways to understand how good an approximation Z_h is to X.

1. Strong error approximation. Bound

    sup_{t≤T} ( E_{x_0} |X(t) − Z_h(t)|^p )^{1/p} ≤ C h^q    or    ( E_{x_0} sup_{t≤T} |X(t) − Z_h(t)|^p )^{1/p} ≤ C h^q.

Common choices are p = 1, 2.
2. Weak error approximation. For a large class of real-valued test functions f, bound

    sup_{t≤T} | E_{x_0} f(X(t)) − E_{x_0} f(Z_h(t)) | ≤ C h^q.

Error analysis under the scaling h → 0:
1. Li³ and also Rathinam, Petzold, Cao, and Gillespie⁴ showed, among other things (implicit methods, etc.), that explicit Euler tau-leaping has a strong error (in the L² norm) of order 1/2 and a weak error of order one:

    sup_{n≤N} ( E_{x_0} |Z(t_n) − X(t_n)|² )^{1/2} ≤ C h^{1/2},
    | E_{x_0} f(Z(T)) − E_{x_0} f(X(T)) | ≤ C h,

where 0 = t_0 < t_1 < ... < t_N = T is a partition of [0, T].
2. The midpoint method has the same order of accuracy as explicit Euler tau-leaping as h → 0.
3. Many examples show the midpoint method is more accurate than Euler; this led to work in the "classical scaling".⁵
4. Nothing was known about the weak trapezoidal algorithm.

³ T. Li, SIAM Multi. Model. Simul., 6, 2007, 417-436.
⁴ M. Rathinam et al., SIAM Multi. Model. Simul., 4, 2005, 867-895.
⁵ A., Ganguly, Kurtz, Ann. Applied Prob., 2011.

Example
Again think of A as a prey and B as a predator:

    A →(κ_1) 2A,    A + B →(κ_2) 2B,    B →(κ_3) ∅,

with A(0) = B(0) = 1000 and κ_1 = 2, κ_2 = 0.002, κ_3 = 2.
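The explicit Euler τ-leap applied to this predator-prey system can be sketched in a few lines: freeze the intensities at the left endpoint of each step and fire Poisson(λ_k τ) copies of each reaction. (Clipping at zero is a pragmatic guard against the well-known negativity issue of the basic method, not part of the method as stated.)

```python
import numpy as np

def tau_leap_lv(x0, kappas, tau, t_end, rng):
    """Explicit Euler tau-leaping for A -> 2A, A + B -> 2B, B -> 0."""
    zeta = np.array([[1, 0], [-1, 1], [0, -1]])   # state-change vectors
    x = np.array(x0, dtype=float)
    for _ in range(int(round(t_end / tau))):
        lam = np.array([kappas[0] * x[0],          # intensities frozen at
                        kappas[1] * x[0] * x[1],   # the left endpoint
                        kappas[2] * x[1]])
        counts = rng.poisson(lam * tau)            # jump counts per reaction
        x = np.maximum(x + counts @ zeta, 0.0)     # clip to avoid negatives
    return x

rng = np.random.default_rng(3)
final = tau_leap_lv([1000, 1000], (2.0, 0.002, 2.0), 1 / 20, 10.0, rng)
print(final)
```

Each step costs three Poisson draws regardless of how many reactions fire, which is exactly the saving over exact simulation when the total intensity is large.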
Letting τ = 1/20 and simulating 30,000 sample paths with each method yields approximate distributions for B(10).

[Figure: approximate distributions of B(10) under Gillespie's algorithm, Euler tau-leaping, midpoint tau-leaping, and the weak trapezoidal method.]

What's happening?
Recall that tau-leaping methods are used when h ≫ E Δ_n, for otherwise an exact method would be performed. Therefore, we should require that

    h ≫ E Δ_n = 1 / Σ_k λ_k(X(t))    while    Σ_k λ_k(X(t)) ≫ 1.

So h → 0 does not tell the whole story. What to do? Explicitly take the multi-scale nature of the systems into account.

Multi-scale nature of biochemical reaction networks
Let N ≫ 1. Assume that we are given a model of the form

    X(t) = X(0) + Σ_k Y_k( ∫_0^t λ'_k(X(s)) ds ) ζ_k,

where the λ'_k are of mass-action form,

    λ'_k(x) = κ'_k Π_i ν_ik! (x_i choose ν_ik).

For each species i, define the normalized abundance

    X_i^N(t) = X_i(t) / N^{α_i},

where α_i ≥ 0 should be selected so that X_i^N = O(1). Here X_i^N may be the species number (α_i = 0), or the species concentration, or something else.

Rate constants κ'_k may also vary over several orders of magnitude.
We write κ'_k = κ_k N^{β_k}, where the β_k are selected so that κ_k = O(1). Thus:
1. For a binary reaction S_i + S_j → *,

    κ'_k x_i x_j = N^{β_k + α_i + α_j} κ_k z_i z_j,

and we can write β_k + α_i + α_j = β_k + ν_k · α.
2. We also have, for S_i → * and 2S_i → * respectively,

    κ'_k x_i = N^{β_k + ν_k · α} κ_k z_i,
    κ'_k x_i (x_i − 1) = N^{β_k + ν_k · α} κ_k z_i (z_i − N^{−α_i}),

with similar expressions for intensities involving higher-order reactions.

Our model has become

    X_i^N(t) = X_i^N(0) + Σ_k N^{−α_i} Y_k( N^{β_k + ν_k · α} ∫_0^t λ_k(X^N(s)) ds ) ζ_{ik},    i ∈ {1, ..., d}.
Remark: If β_k + ν_k · α = α_i = 1 for all i, k, then the model is

    X^N(t) = X^N(0) + Σ_k (1/N) Y_k( N ∫_0^t λ_k(X^N(s)) ds ) ζ_k,

which is typically called the classical scaling.
• In this case it is natural to consider X^N as a vector whose ith component gives the concentration, in moles per unit volume, of the ith species.
• For large N, one can use a Langevin/diffusion approximation.
• As N → ∞, the process converges to the ODE model with mass-action kinetics.
• It was specifically this scaling that was used in the analyses of Euler and midpoint τ-leaping found in A., Ganguly, Kurtz, Ann. Appl. Prob., 2011.

A case for simulation
In this general multi-scale setting, methods of model simplification/reduction:
1. Langevin/diffusion approximations, or
2. linear noise/van Kampen approximations, or
3. averaging ideas to reduce the system, or
4. ODE/law of large numbers types of approximations
are exceedingly difficult. This is a very active line of current research: Kurtz, Popovic, Kang, etc.

Weak error analysis on normalized processes

    X_i^N(t) = X_i^N(0) + Σ_k N^{−α_i} Y_k( N^{β_k + ν_k · α} ∫_0^t λ_k(X^N(s)) ds ) ζ_{ik},    i ∈ {1, ..., d}.

• Denote the exact process by X^N and the approximation by Z^N.
• For f ∈ C_0(R^d, R), define the operators P_t and 𝒫_t by

    P_t f(x) = E_x f(X^N(t)),    𝒫_t f(x) = E_x f(Z^N(t)).

We want to understand

    P_t f(x) − 𝒫_t f(x) = E_x f(X^N(t)) − E_x f(Z^N(t)).

Define

    γ = max{ β_k + ν_k · α − α_i : i, k with ζ_ik ≠ 0 }.    (1)

Then N^γ should be interpreted as the "time-scale" of the system.

Example. Suppose the system is S → ∅ with rate constant 100 and X(0) = 10,000. Then N = 10,000 and κ' = 100 = N^{1/2}. The normalized system is

    X^N(t) = 1 − (1/N) Y( N^{1/2} N ∫_0^t X^N(s) ds ).

Here γ = 1/2 and N^γ = 100.

Define ζ_k^N component-wise by ζ_{ik}^N = N^{−α_i} ζ_ik. The model is

    X^N(t) = X^N(0) + Σ_k Y_k( N^γ N^{β_k + ν_k · α − γ} ∫_0^t λ_k(X^N(s)) ds ) ζ_k^N.

Define a "directional derivative" in the direction ζ_k^N:

    ∇_k^N f(x) = N^{β_k + ν_k · α − γ} ( f(x + ζ_k^N) − f(x) ).
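The exponent γ in (1) is a simple maximum over reaction/species pairs, and the S → ∅ example can be checked mechanically. A sketch (the helper `gamma` is mine, not from the talk):

```python
def gamma(betas, nus, alphas, zetas):
    """gamma = max over pairs (i, k) with zeta_ik != 0 of
    beta_k + nu_k . alpha - alpha_i, per definition (1)."""
    vals = []
    for beta, nu, zeta in zip(betas, nus, zetas):
        nu_dot_alpha = sum(n * a for n, a in zip(nu, alphas))
        for i, z in enumerate(zeta):
            if z != 0:
                vals.append(beta + nu_dot_alpha - alphas[i])
    return max(vals)

# Single reaction S -> 0 with kappa' = 100 = N^{1/2} and X(0) = 10^4 = N:
# alpha = (1,), beta = (1/2,), nu = (1,), zeta = (-1,)
g = gamma([0.5], [(1,)], (1.0,), [(-1,)])
print(g, 10_000 ** g)   # gamma = 1/2, so the time scale is N^gamma = 100
```

This matches the example: the normalized death process runs on the fast scale N^{1/2} = 100.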
Now define a "j-norm",

    ||f||_j^{∇N} = sup{ || ∇_{ℓ_1}^N · · · ∇_{ℓ_p}^N f ||_∞ : 1 ≤ ℓ_i ≤ m, p ≤ j },

and, for any operator Q : C_0(R^d, R) → C_0(R^d, R),

    ||Q||_{j→ℓ}^{∇N} = sup_{f ∈ C_0} ||Qf||_ℓ^{∇N} / ||f||_j^{∇N}.

We want

    |E_x f(X^N(t)) − E_x f(Z^N(t))| ≤ ||P_t f − 𝒫_t f||_∞ = ||(P_t − 𝒫_t) f||_0^{∇N} ≤ ||P_t − 𝒫_t||_{m→0}^{∇N} ||f||_m^{∇N}.

Want: for t ≤ T,

    ||P_t − 𝒫_t||_{m→0}^{∇N} = O(h^q)

for the different methods.

Weak error analysis
Recall P_t f(x) = E_x f(X^N(t)) and 𝒫_t f(x) = E_x f(Z^N(t)).

Theorem (A., Koyama - 2011). For any n, m ≥ 0 and h > 0 (think: T = nh),

    ||𝒫_h^n − P_{nh}||_{m→0}^{∇N} = O( n ||𝒫_h − P_h||_{m→0}^{∇N} max_{ℓ ∈ {1,...,n}} ||P_{ℓh}||_{m→m}^{∇N} ).

Points:
1. The last factor is independent of the method; it depends only on the behavior of the exact process.
2. The rest says: a one-step (local) error of O(h^{q+1}) implies a global error of O(h^q). So we can focus on the local, one-step error.

Proof. Note that

    ||𝒫_h f||_0 = sup_x |E_x f(Z^N(h))| ≤ ||f||_0.

Thus 𝒫_h is a contraction, i.e. ||𝒫_h||_{0→0}^{∇N} ≤ 1.
With this in mind,

    ||(𝒫_h^n − P_{nh}) f||_0 = || Σ_{j=1}^n ( 𝒫_h^j P_{h(n−j)} − 𝒫_h^{j−1} P_{h(n−j+1)} ) f ||_0    (2)
        ≤ Σ_{j=1}^n || 𝒫_h^{j−1} (𝒫_h − P_h) P_{h(n−j)} f ||_0
        ≤ Σ_{j=1}^n ||𝒫_h^{j−1}||_{0→0}^{∇N} ||𝒫_h − P_h||_{m→0}^{∇N} ||P_{h(n−j)}||_{m→m}^{∇N} ||f||_m^{∇N}.

Since 𝒫_h is a contraction, i.e. ||𝒫_h||_{0→0} ≤ 1, the result is shown.

One-step error analyses
Theorem (A., Koyama - 2011). For forward Euler, if h < N^{−γ}, then

    ||P_h − 𝒫_h||_{2→0}^{∇N} = O(N^{2γ} h²).

Corollary. If Z^N is generated via forward Euler, then

    sup_{t≤T} |E_x f(X^N(t)) − E_x f(Z^N(t))| = O(N^{2γ} h).

Define

    η_k = min{ α_i : ζ_{ik}^N ≠ 0 }.

Example: if ζ_k^N = (−2N^{−1}, 0, −N^{−1/2})^T, then η_k = 1/2. Thus |ζ_k^N| = O(N^{−η_k}).

Theorem (A., Koyama - 2011). For the midpoint method, if h < N^{−γ}, then

    ||P_h − 𝒫_h||_{3→0}^{∇N} = O( N^{3γ} h³ + N^{2γ − min_k{η_k}} h² ).

Corollary. If Z^N is generated via the midpoint method, then

    sup_{t≤T} |E_x f(X^N(t)) − E_x f(Z^N(t))| = O( N^{3γ} h² + N^{2γ − min_k{η_k}} h ).

Notes:
1. In the classical scaling, η_k ≡ 1 and γ = 0. Further, h ≫ 1/N, so the error is O(h²).
2. In other scalings, the error can be O(h).
3. The error is O(h²) if h ≫ N^{−γ − min_k{η_k}}.

Theorem (A., Koyama - 2011). For the weak trapezoidal method, if h < N^{−γ}, then

    ||P_h − 𝒫_h||_{3→0}^{∇N} = O(N^{3γ} h³).

Corollary. If Z^N is generated via the weak trapezoidal method, then

    sup_{t≤T} |E_x f(X^N(t)) − E_x f(Z^N(t))| = O(N^{3γ} h²).
Flavor of proof: Euler's method
The generator of a Markov process X is the operator A satisfying

    A f(x) = lim_{h→0} ( E_x f(X(h)) − f(x) ) / h.

Our model is

    X^N(t) = X^N(0) + Σ_k Y_k( N^γ N^{β_k + ν_k · α − γ} ∫_0^t λ_k(X^N(s)) ds ) ζ_k^N,

so the generator of our process is the operator A^N given by

    A^N f(x) = N^γ Σ_k N^{β_k + ν_k · α − γ} λ_k(x) ( f(x + ζ_k^N) − f(x) ).

Recalling that we defined ∇_k^N f(x) = N^{β_k + ν_k · α − γ} ( f(x + ζ_k^N) − f(x) ), this is

    A^N f(x) = N^γ Σ_k λ_k(x) ∇_k^N f(x) = N^γ (λ · ∇^N) f(x).

Generators are useful because

    E_x f(X^N(t)) = f(x) + E_x ∫_0^t A^N f(X^N(s)) ds.

Thus, they are the right analytical tool for understanding E_x f(X^N(t)).

Euler's method solves

    Z^N(h) = x_0 + Σ_k Y_k( N^γ N^{β_k + ν_k · α − γ} λ_k(x_0) ∫_0^h ds ) ζ_k^N.

The generator of this process is

    B^N f(x) = N^γ Σ_k N^{β_k + ν_k · α − γ} λ_k(x_0) ( f(x + ζ_k^N) − f(x) ) = N^γ ( λ(x_0) · ∇^N ) f(x),

and

    E_{x_0} f(Z^N(h)) = f(x_0) + E_{x_0} ∫_0^h B^N f(Z^N(s)) ds.
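The generator identity E_x f(X(h)) ≈ f(x) + h A f(x) can be sanity-checked by simulation on a one-species example: for the pure-death chain S → ∅ with intensity κx, A f(x) = κx ( f(x−1) − f(x) ). A Monte Carlo sketch under illustrative parameters (κ = 1, x_0 = 50, my choices):

```python
import numpy as np

def death_step(x, kappa, h, rng):
    """Simulate the pure-death chain with intensity kappa*x exactly up to time h."""
    t = 0.0
    while x > 0:
        t += rng.exponential(1.0 / (kappa * x))  # next exponential waiting time
        if t > h:
            break
        x -= 1
    return x

rng = np.random.default_rng(4)
kappa, x0, h = 1.0, 50, 1e-3
f = lambda x: x * x
Af = kappa * x0 * (f(x0 - 1) - f(x0))            # generator applied at x0: -4950
est = np.mean([(f(death_step(x0, kappa, h, rng)) - f(x0)) / h
               for _ in range(100_000)])
print(est, Af)   # the Monte Carlo estimate should sit near Af
```

As h shrinks, the difference quotient concentrates around A f(x_0), which is the content of the limit defining the generator.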
Expanding both,

    E_{x_0} f(X^N(h)) = f(x_0) + E_{x_0} ∫_0^h A^N f(X^N(s)) ds
        = f(x_0) + h A^N f(x_0) + E_{x_0} ∫_0^h ∫_0^s (A^N)² f(X^N(r)) dr ds
        = f(x_0) + h A^N f(x_0) + (h²/2) (A^N)² f(x_0) + O( N^{3γ} ||f||_3^{∇N} h³ ),

and

    E_{x_0} f(Z^N(h)) = f(x_0) + h B^N f(x_0) + (h²/2) (B^N)² f(x_0) + O( N^{3γ} ||f||_3^{∇N} h³ ).

Now do lots of computations and compare.

Flavor of proof: other methods
1. The midpoint method simply has a different generator B^N: repeat the basic analysis.
2. The weak trapezoidal method is harder because the generator changes half-way through the time step. Instead use that E_{x_0} f(Z(h)) = E_{x_0}[ E_{x_0}[ f(Z(h)) | y* ] ]:
   i. work on the inner expectation with one operator, and then
   ii. work on the outer expectation with a different operator.

Example
Consider A ⇌ B ⇌ C with each κ_k ≡ 1 and X(0) = [1000, 1000, 4000].

[Figure: log-log plot of |E Z_C(1) − E X_C(1)| versus h for Euler, midpoint, and weak trapezoidal. Slopes: Euler = 0.9957, midpoint = 2.7, weak trapezoidal = 2.32.]

[Figure: log-log plot of the corresponding errors in second moments.]

Careful!
Consider A ⇌ B ⇌ C with each κ_k ≡ 1 and X(0) = [1000, 1000, 4000].

[Figure: log-log plot of |Var(Z_C(1)) − Var(X_C(1))| versus h. Slopes: Euler = 0.9831, midpoint = 1.0525, weak trapezoidal = 2.2291.]

For the midpoint method in the classical scaling,

    E(Z/N)² − E(X/N)² ≈ c_1 h² + c_2 (h/N),
    E(Z/N) − E(X/N) ≈ c_3 h² + c_4 (h/N).

Thus,

    Var(X) − Var(Z) ≈ ( c_1 h² N² − c_2 h N ) − 2 ( c_3 h² N + c_4 h ) E[Z],

which could be (and, in the previous example, was) O(hN).
Take home messages
1. Simulation of continuous time Markov chains is easy... unless it's not.
2. Calculating error estimates must be done with care.
3. Incorporating the natural scales of the process plays an important role.

Thank you!