CONFIDENTIAL. Limited circulation. For review only.
Computing mRNA and protein statistical moments for a renewal model of stochastic
gene-expression
Duarte Antunes, Abhyudai Singh
Abstract— The level of a given mRNA or protein exhibits significant variations from cell-to-cell across a homogeneous population of living cells. Much work has focused on understanding the different sources of noise in the gene-expression process that drive this stochastic variability in gene-expression. Recent experiments tracking growth and division of individual cells reveal that cell division times have considerable intercellular heterogeneity. Here we investigate how randomness in the cell division times can create variability in population counts. We consider a model where mRNA/protein levels evolve according to a linear differential equation, with cell division times spaced by independent and identically distributed random intervals. Whenever the cell divides, the population of mRNA and protein is halved. For this model, we provide a method for computing the mean and variance in mRNA and protein levels. In fact, the time evolution of statistical moments can be obtained from the solution to Volterra equations. Our analysis shows that these Volterra equations reduce to linear differential equations when the cell division times are gamma distributed. For this latter case, we provide exact analytical formulas for the asymptotic moments of the mRNA and protein population counts. Computation of the statistical moments for physiologically relevant parameter values shows that randomness in the cell division process can be a major factor in driving differences in protein levels across a population of cells.
I. INTRODUCTION
Gene-expression is the process by which genes produce
mRNA and protein molecules through transcription and
translation, respectively. Single-cell measurements reveal that
the level of a protein or mRNA inside an individual cell can
vary significantly across a genetically-identical population
of cells exposed to the same environment [1]–[7]. This
stochastic variability has been shown to play a key role
in cellular decision-making [8]–[11], information processing [12], and buffering populations from hostile changes
in the environment [13]–[16]. Previous experimental and
computational work has investigated how stochastic birth-death of individual molecules drives intercellular differences
in population counts [17]–[23]. Here we focus on an alternative mechanism for explaining gene-expression variability:
randomness in cell division times. Living cells grow and divide, at which point the quantities of mRNAs and proteins are approximately divided equally between daughter cells (assuming symmetric cell division). Since cell division times can vary from cell-to-cell [24]–[26], we investigate their role in driving variability in the level of a given mRNA or protein.
We model the time evolution of mRNA and protein levels in a cell by a set of linear differential equations. Cell divisions occur at times spaced by independent and identically distributed random intervals, and when the cell divides both mRNA and protein quantities are halved. Our goal is to obtain explicit expressions for the statistical moments (mean, variance, correlation) of the mRNA/protein population counts in terms of model parameters and the cell division time distribution. To this effect, we start by showing that the time evolution of the statistical moments of the mRNA and protein levels can be captured by an impulsive renewal system, a model recently proposed in the literature [27]. Borrowing ideas from [27] we show that the statistical moments can be explicitly obtained from the solution to Volterra equations. These equations admit a closed-form expression, although it is more expedient to obtain their solution through efficient numerical methods.

D. Antunes is with the Hybrid and Networked Systems Group, Department of Mechanical Engineering, Eindhoven University of Technology, the Netherlands. D. [email protected]. Abhyudai Singh is with the Department of Electrical and Computer Engineering, Biomedical Engineering and Mathematical Sciences, University of Delaware, Newark, DE, U.S.A. [email protected]
This work is supported by the Dutch Science Foundation (STW) and the Dutch Organization for Scientific Research (NWO) under the VICI grant “Wireless controls systems: A new frontier in Automation” (No. 11382), and by the European 7th Framework Network of Excellence under the project “Highly-complex and networked control systems (HYCON2-257462)”.

For the special case in which the cell division intervals are
gamma distributed, we show that the statistical moments can
be obtained from the solution to a set of linear differential
equations. Using this fact, we provide expressions for the
asymptotic moments of mRNA and protein levels, unveiling
their dependence on model parameters. Analysis of these
expressions reveals that cell-to-cell variability in protein levels
monotonically decreases to a lower limit with decreasing
variability in the cell division times. Finally, we provide a
numerical example with physiologically relevant parameter
values and an experimentally obtained distribution for the cell
division times. Our results show that the randomness in
the cell division process can be a major factor in driving
intercellular differences in protein levels.
The remainder of the paper is organized as follows. The
renewal model of stochastic gene-expression is presented
in Section II. In Section III we show how to compute the
statistical moments for this model. Moment calculations for
the case of gamma distributed cell division times are presented
in Section IV. In Section V we provide a numerical example
and in Section VI we give possible directions for future work.
The proofs of the propositions stated in the paper are given
in the appendix.
Notation: We denote by $I_n$ the $n \times n$ identity matrix, by $0_{m \times n}$ the $m \times n$ zero matrix, and by $1_p$ the vector of ones with $p$ entries. Dimensions are sometimes omitted when they are clear from the context. We denote by $\mathrm{diag}([A_1 \dots A_n])$ a block diagonal matrix with blocks $A_i$. If $v = [v_1 \dots v_n]$ is a vector, $\mathrm{diag}(v)$ is a diagonal matrix with entries $v_i$. For a matrix $A$, $A^\top$ denotes its transpose. The Kronecker product is denoted by $\otimes$.
Preprint submitted to 52nd IEEE Conference on Decision and Control. Received March 11, 2013.
II. RENEWAL MODEL AND PROBLEM FORMULATION
In gene-expression, the mRNA count at time $t$ in a cell, denoted by $m(t)$, and the protein count at time $t$ in a cell, denoted by $p(t)$, can be described by the following linear system of differential equations
$$\begin{aligned} \dot m(t) &= k_m - \gamma_m m(t), & m(T) &= m_0,\\ \dot p(t) &= k_p m(t) - \gamma_p p(t), & p(T) &= p_0, \end{aligned} \tag{1}$$
for a time interval $[T, T+\epsilon)$, $\epsilon > 0$, in which there are no cell divisions. The constant $k_m$ is the mRNA production rate (also called transcription rate) and $\gamma_m$ is the mRNA degradation (or death) rate. Each mRNA produces proteins at a rate $k_p$ and these molecules degrade at a constant rate $\gamma_p$.
Let the times at which there exist cell divisions be denoted by $\{t_k\}_{k \in \mathbb{N}}$. Then (1) holds for $t \in \mathbb{R}_{\geq 0} \setminus \{t_k\}_{k \in \mathbb{N}}$ for an initial time $T = 0$. Let also $t_0 := 0$. At division times $\{t_k\}_{k \in \mathbb{N}}$ the mRNA and protein counts are halved, i.e.,
$$m(t_k) = \tfrac{1}{2} m(t_k^-), \qquad p(t_k) = \tfrac{1}{2} p(t_k^-), \qquad k \in \mathbb{N}, \tag{2}$$
where we use $u(t_k^-)$ to denote the limit from the left of a function $u$ at $t_k$. The time intervals between these cell division times, $h_k := t_{k+1} - t_k$, $k \in \mathbb{N}_0$, are assumed to be independent and identically distributed, and described by a probability density $f$, i.e., $\mathrm{Prob}[h_k \in (a, b)] = \int_a^b f(s)\,ds$ for every $k \in \mathbb{N}_0$ and given positive constants $a$ and $b$.
The goal of the present work is to obtain explicit expressions for the first and second centered moments of the mRNA and protein counts, i.e.,
$$E[m(t)],\quad E[(m(t) - E[m(t)])^2],\quad E[p(t)],\quad E[(p(t) - E[p(t)])^2],$$
and their respective steady-state values in terms of the distribution $f$. As done in previous studies [19], we will quantify variability using the dimensionless measure coefficient of variation squared, defined as
$$CV_m^2(t) := \frac{E[(m(t) - E[m(t)])^2]}{E[m(t)]^2}, \tag{3}$$
$$CV_p^2(t) := \frac{E[(p(t) - E[p(t)])^2]}{E[p(t)]^2}. \tag{4}$$
Note that to obtain the second order centered moments it suffices to obtain the first and second order uncentered moments
$$E[m(t)],\quad E[m(t)^2],\quad E[p(t)],\quad E[p(t)^2],$$
since
$$\begin{aligned} E[(m(t) - E[m(t)])^2] &= E[m(t)^2] - E[m(t)]^2,\\ E[(p(t) - E[p(t)])^2] &= E[p(t)^2] - E[p(t)]^2. \end{aligned} \tag{5}$$

III. STATISTICAL MOMENTS FOR ARBITRARY CELL DIVISION DISTRIBUTIONS
An impulsive renewal system [27] is described by
$$\begin{aligned} \dot x(t) &= A x(t), & t &\neq t_k,\ t \geq 0,\ k \in \mathbb{Z}_{>0},\\ x(t_k) &= J x(t_k^-), & t_0 &= 0,\ x(t_0) = x_0, \end{aligned} \tag{6}$$
where the intervals between consecutive transition times $\{h_k := t_{k+1} - t_k\}_{k \in \mathbb{N}_0}$ are assumed to be independent and identically distributed. The value at time $t$ of a sample path of (6) is given by
$$x(t) = S(t) x_0, \tag{7}$$
where
$$S(t) = e^{A(t - t_r)} J e^{A h_{r-1}} \cdots J e^{A h_0}, \tag{8}$$
for $r = \max\{k \in \mathbb{Z}_{\geq 0} : t_k \leq t\}$, is the transition matrix.
We show next that (6) can capture the gene expression model (1), (2) and in the sequel we provide a method for computing statistical moments of (6).
A. Gene expression as a renewal model
Let $a(t)$ be an auxiliary variable set to one for every time $t \in \mathbb{R}_{\geq 0}$, which for convenience we write in terms of the following simple impulsive renewal system
$$\dot a(t) = 0,\quad a(0) = 1,\qquad a(t_k) = a(t_k^-), \tag{9}$$
where as before $t_k$ are the cell division times. Let also
$$x(t) := \begin{bmatrix} m(t) \\ p(t) \\ a(t) \end{bmatrix}. \tag{10}$$
Then, if we make the probability density function of the random variables $\{h_k\}_{k \in \mathbb{N}_0}$ in (6) equal to $f$,
$$A = \begin{bmatrix} -\gamma_m & 0 & k_m \\ k_p & -\gamma_p & 0 \\ 0 & 0 & 0 \end{bmatrix}, \tag{11}$$
and
$$J = \begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{12}$$
we have that (6) describes (1), (2), and (9).
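As an illustration of how (6)–(12) encode the gene-expression model, the following Python sketch builds $A$ and $J$ as in (11), (12) and evaluates the transition matrix (8) for a prescribed list of division intervals. It is a minimal sketch and not part of the original analysis: the function names, the truncated-Taylor `expm` helper, and the numerical values (rates expressed in units of the mean division time, with $m(0) = 1$, $p(0) = 100$) are assumptions chosen for illustration.

```python
import numpy as np

def expm(M, terms=20):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    k = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    X = M / 2.0**k
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for i in range(1, terms + 1):
        term = term @ X / i
        E = E + term
    for _ in range(k):
        E = E @ E
    return E

def model_matrices(k_m, gamma_m, k_p, gamma_p):
    """A and J for the state x = [m, p, a], following (11) and (12)."""
    A = np.array([[-gamma_m, 0.0, k_m],
                  [k_p, -gamma_p, 0.0],
                  [0.0, 0.0, 0.0]])
    J = np.diag([0.5, 0.5, 1.0])
    return A, J

def transition_matrix(t, intervals, A, J):
    """S(t) from (8), given division intervals h_0, h_1, ... that cover time t."""
    S = np.eye(A.shape[0])
    t_r = 0.0  # time of the most recent division
    for h in intervals:
        if t_r + h > t:
            # no further division before t: flow for the remaining time
            return expm(A * (t - t_r)) @ S
        S = J @ expm(A * h) @ S  # flow for h, then halve m and p
        t_r += h
    raise ValueError("the supplied intervals do not cover time t")

# Hypothetical usage: one division at t = 0.3, state evaluated at t = 0.5.
A, J = model_matrices(k_m=20.0, gamma_m=5.0, k_p=100.0, gamma_p=1.0)
x0 = np.array([1.0, 100.0, 1.0])
x_half = transition_matrix(0.5, [0.3, 10.0], A, J) @ x0
```

The third state stays equal to one along every path, which is what allows the constant production term $k_m$ to be written as part of a purely linear system.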
B. Expected values
We start by noticing that, taking into account (7), we can write
$$E[x(t)^\top] = x(0)^\top \Phi(t), \tag{13}$$
where
$$\Phi(t) := E[S(t)^\top]. \tag{14}$$
We show next that $\Phi(t)$ can be described by a Volterra equation. The result builds upon similar ideas to the ones presented in [27], [28]. We include the proof in the Appendix.
Proposition 1: $\Phi(t)$ satisfies the following Volterra equation
$$\Phi(t) = \int_0^t (J e^{A\tau})^\top \Phi(t - \tau) f(\tau)\,d\tau + e^{A^\top t} s(t), \tag{15}$$
where $s(t) := \int_t^\infty f(s)\,ds$ is the survivor function of $f$. $\square$
Note that if $x(t)$ takes the form (10), we can obtain the desired expected values
$$E[x(t)^\top] = [E[m(t)]\ \ E[p(t)]\ \ 1]$$
by solving (15) for matrices (11) and (12) and using (13).
Equation (15) can be solved analytically (cf. [27]) but it is more expedient to use a numerical method to solve it (see [29]). A simple numerical method is to approximate the integral by a quadrature formula (e.g., a simple trapezoidal rule) at equally spaced points $jh \in [0, t]$, where $h$ is the discretization step, i.e.,
$$\Phi(jh) = \sum_{r \in [0, j]} q_r (J e^{A r h})^\top \Phi(h(j - r)) f(rh) + e^{A^\top jh} s(jh), \qquad j \in \mathbb{N}_0 \cap [0, t/h], \tag{16}$$
where the quadrature weights are denoted by $q_r$. Then $\Phi(jh)$ can be obtained iteratively from (16). Note that this numerical method can easily incorporate distributions of the cell division intervals described by histograms, which is typically the case when these distributions are obtained experimentally (see, e.g., [26]). In fact, experimentally obtaining the percentage of cell division intervals among samples that fall in the interval $[rh, (r+1)h)$, for a given $h > 0$, gives $h f(rh)$ in (16), where $s(jh)$ can be estimated by $s(jh) = h \sum_{r \geq j} f(rh)$.
C. Covariances
Recall that for dimensionally compatible matrices
$$(AB) \otimes (CD) = (A \otimes C)(B \otimes D), \tag{17}$$
and
$$(A \otimes B)^\top = A^\top \otimes B^\top \tag{18}$$
(cf. [30]). Using (17) and (18) we obtain that
$$E[x(t)^\top \otimes x(t)^\top] = (x(0)^\top \otimes x(0)^\top) \Psi(t), \tag{19}$$
where
$$\Psi(t) := E[S(t)^\top \otimes S(t)^\top]. \tag{20}$$
Using similar arguments to the ones used to prove Proposition 1, we can obtain the following result.
Proposition 2: $\Psi(t)$ satisfies the following Volterra equation
$$\Psi(t) = \int_0^t \big((J e^{A\tau})^\top \otimes (J e^{A\tau})^\top\big) \Psi(t - \tau) f(\tau)\,d\tau + \big(e^{A^\top t} \otimes e^{A^\top t}\big) s(t). \tag{21}$$
$\square$
Note that for $x(t)$ taking the form (10), we have that
$$E[x(t)^\top \otimes x(t)^\top] = [E[m(t)^2]\ \ E[m(t)p(t)]\ \ E[m(t)]\ \ E[m(t)p(t)]\ \ E[p(t)^2]\ \ E[p(t)]\ \ E[m(t)]\ \ E[p(t)]\ \ 1], \tag{22}$$
where we used the fact that $a(t) = 1$ for every $t \in \mathbb{R}_{\geq 0}$. Hence, to obtain $E[m(t)^2]$ and $E[p(t)^2]$ and obtain the desired centered moments through (5) it suffices to solve (21) for matrices (11) and (12) and use (19).

IV. MOMENT COMPUTATION AND ASYMPTOTIC ANALYSIS FOR GAMMA DISTRIBUTIONS
In this section we consider that the cell division intervals are described by gamma distributions, i.e.,
$$f(t) = \frac{1}{\theta (\kappa - 1)!} \Big(\frac{t}{\theta}\Big)^{\kappa - 1} e^{-\frac{t}{\theta}}, \tag{23}$$
where $\kappa \in \mathbb{N}$ is the shape parameter and $\theta \in \mathbb{R}_{>0}$ is the scale parameter. We start by considering $\kappa = 1$, in which case $f(t)$ is an exponential distribution. In this case we can directly differentiate the Volterra equations (15), (21) and establish that $\Psi$ and $\Phi$ can be obtained from the solution to linear differential equations. The result is stated next. Let
$$M := A \otimes I_n + I_n \otimes A$$
and
$$N := J \otimes J.$$
Proposition 3: Suppose that $f(t)$ is described by (23) with $\kappa = 1$. Then the solution to (15) satisfies
$$\frac{d}{dt} \Phi(t) = \Big(A^\top - \frac{1}{\theta} I_n + \frac{1}{\theta} J^\top\Big) \Phi(t), \qquad \Phi(0) = I_n, \tag{24}$$
and the solution to (21) satisfies
$$\frac{d}{dt} \Psi(t) = \Big(M^\top - \frac{1}{\theta} I_{n^2} + \frac{1}{\theta} N^\top\Big) \Psi(t), \qquad \Psi(0) = I_{n^2}. \tag{25}$$
$\square$
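For the exponential case, (24) and (25) imply that $\Phi(t)$ and $\Psi(t)$ are matrix exponentials, so that, after transposition, $E[x(t)]$ and $E[x(t) \otimes x(t)]$ obey linear differential equations with matrices $A + (J - I_n)/\theta$ and $M + (N - I_{n^2})/\theta$. The Python sketch below uses this observation to compute steady-state moments by solving for the null vector whose last entry (the moment of $a(t)$) equals one. It is only an illustration: the function names are hypothetical, and the parameter values (rates in units of the mean division time, $\theta = 1$) are assumptions rather than results taken from the analysis above.

```python
import numpy as np

def normalized_null_vector(L):
    """Solve L z = 0 with the last entry of z pinned to one.

    The last row of L is zero for the matrices used here (a(t) is constant),
    so it can be replaced by the normalization constraint.
    """
    Lc = L.copy()
    Lc[-1, :] = 0.0
    Lc[-1, -1] = 1.0
    rhs = np.zeros(L.shape[0])
    rhs[-1] = 1.0
    return np.linalg.solve(Lc, rhs)

def steady_state_moments_exponential(A, J, theta):
    """Steady-state E[x] and E[x (x) x] for exponential division intervals."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.kron(A, I) + np.kron(I, A)      # M := A (x) I_n + I_n (x) A
    N = np.kron(J, J)                      # N := J (x) J
    L1 = A + (J - I) / theta               # transpose of the matrix in (24)
    L2 = M + (N - np.eye(n * n)) / theta   # transpose of the matrix in (25)
    return normalized_null_vector(L1), normalized_null_vector(L2)

# Assumed illustrative parameters, state ordering x = [m, p, a].
A = np.array([[-5.0, 0.0, 20.0],
              [100.0, -1.0, 0.0],
              [0.0, 0.0, 0.0]])
J = np.diag([0.5, 0.5, 1.0])
mean, second = steady_state_moments_exponential(A, J, theta=1.0)

# x (x) x is ordered as in (22): index 0 is m^2 and index 4 is p^2.
cv2_m = second[0] / mean[0] ** 2 - 1.0
cv2_p = second[4] / mean[1] ** 2 - 1.0
```

Pinning the last entry works because the moment of $a(t)$ is identically one; the remaining rows then determine the other moments uniquely.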
In the case $\kappa \geq 2$ we use the fact that a gamma distributed random variable with shape parameter $\kappa$ and scale parameter $\theta/\kappa$ (and hence mean $\theta$) can be obtained by adding $\kappa$ independent exponential random variables with scale $\theta/\kappa$. Hence if we consider auxiliary times $\{s_\ell\}_{\ell \in \mathbb{N}_0}$, $s_0 := 0$, such that $\{s_{\ell+1} - s_\ell\}_{\ell \in \mathbb{N}_0}$ are exponentially distributed with scale $\theta/\kappa$ and make
$$t_k = s_{k\kappa - \iota}, \qquad k \in \mathbb{N}, \tag{26}$$
for some $\iota \in \{0, \dots, \kappa - 1\}$, introduced here for convenience, then $\{t_{k+1} - t_k\}_{k \in \mathbb{N}}$ are gamma distributed with shape $\kappa$ and scale $\theta/\kappa$, i.e., according to (23) with $\theta$ replaced by $\theta/\kappa$. The first transition interval $t_1 - t_0$, $t_0 = 0$, follows a gamma distribution with order $\kappa - \iota$ and scale $\theta/\kappa$. Hence this only conforms to the model presented in Section III if $\iota = 0$. We define an auxiliary process
$$\nu(t) \in \{0, \dots, \kappa - 1\}$$
started with initial condition $\nu(0) = \iota$, constant between times $s_\ell$, i.e., $\dot\nu(t) = 0$ if $t \in \mathbb{R} \setminus \{s_\ell\}_{\ell \in \mathbb{N}_0}$, and satisfying
$$\nu(s_\ell) = \nu(s_\ell^-) + 1, \quad \text{if } \nu(s_\ell^-) < \kappa - 1,$$
and
$$\nu(s_\ell) = 0, \quad \text{if } \nu(s_\ell^-) = \kappa - 1,$$
at times $s_\ell$.
Let $\Phi^\iota(t)$ denote the expected value of the transition function given that the process $\nu$ starts with initial value $\iota$, i.e.,
$$\Phi^\iota(t) := E[S(t)^\top \mid \nu(0) = \iota],$$
which, as explained above, is equivalent to assuming that $t_1 - t_0$ follows a gamma distribution with shape $\kappa - \iota$ and scale $\theta/\kappa$. Note that $\Phi^0(t) = \Phi(t)$. Likewise, let
$$\Psi^\iota(t) := E[S(t)^\top \otimes S(t)^\top \mid \nu(0) = \iota]$$
and note that $\Psi^0(t) = \Psi(t)$.
The next result states that $\Phi^\iota(t)$ and $\Psi^\iota(t)$ can be obtained from the solution to a set of linear differential equations. The proof, given in the appendix, relies on the fact that
$$\bar\Phi(t) := \begin{bmatrix} \Phi^0(t) \\ \Phi^1(t) \\ \vdots \\ \Phi^{\kappa-1}(t) \end{bmatrix}, \qquad \bar\Psi(t) := \begin{bmatrix} \Psi^0(t) \\ \Psi^1(t) \\ \vdots \\ \Psi^{\kappa-1}(t) \end{bmatrix}$$
can be described by a set of Volterra equations with exponential $f$ with scale parameter $\theta/\kappa$, which is established using similar arguments to the ones used to prove Propositions 1 and 2. For matrices $D, E \in \mathbb{R}^{r \times r}$, let $P(D, E)$ be a matrix in $\mathbb{R}^{\kappa r \times \kappa r}$ described by
$$P(D, E) := \begin{bmatrix} D^\top - \bar\lambda I_r & \bar\lambda I_r & 0 & \dots & 0 \\ 0 & D^\top - \bar\lambda I_r & \bar\lambda I_r & \dots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \dots & D^\top - \bar\lambda I_r & \bar\lambda I_r \\ \bar\lambda E^\top & 0 & \dots & 0 & D^\top - \bar\lambda I_r \end{bmatrix},$$
where $\bar\lambda := \frac{\kappa}{\theta}$.
Proposition 4: Suppose that the cell division intervals are gamma distributed with shape $\kappa \in \mathbb{N}_{\geq 2}$ as described above. Then $\bar\Phi(t)$ satisfies
$$\frac{d}{dt} \bar\Phi(t) = M_1 \bar\Phi(t), \qquad \bar\Phi(0) = [I_n\ I_n\ \dots\ I_n]^\top, \tag{27}$$
where $M_1 := P(A, J)$, and $\bar\Psi(t)$ satisfies
$$\frac{d}{dt} \bar\Psi(t) = M_2 \bar\Psi(t), \qquad \bar\Psi(0) = [I_{n^2}\ I_{n^2}\ \dots\ I_{n^2}]^\top, \tag{28}$$
where $M_2 := P(A \otimes I_n + I_n \otimes A, J \otimes J)$. $\square$
Thus, to obtain the statistical moments (13), (19) it suffices to solve (27), (28) and use the fact that $\Phi(t) = \Phi^0(t)$ and $\Psi(t) = \Psi^0(t)$. The fact that $\Phi(t)$ and $\Psi(t)$ can be obtained by the solution to linear differential equations enables us to compute the asymptotic behavior of the statistical moments, as we describe next.
A. Asymptotic moments
We first provide a result characterizing the asymptotic behavior of a linear differential equation with the same structure as (27), (28). Let
$$\hat s_\kappa(a) := \frac{1 - \hat f_\kappa(a)}{a},$$
where
$$\hat f_\kappa(a) := \Big(\frac{\kappa}{\kappa + a}\Big)^\kappa$$
and $a \in \mathbb{R}$. Moreover, for a vector $a = [a_1\ a_2\ \dots\ a_n]$, let
$$\hat S_\kappa(a) := \mathrm{diag}([\hat s_\kappa(a_1)\ \hat s_\kappa(a_2)\ \dots\ \hat s_\kappa(a_n)])$$
and
$$\hat F_\kappa(a) := \mathrm{diag}([\hat f_\kappa(a_1)\ \hat f_\kappa(a_2)\ \dots\ \hat f_\kappa(a_n)]).$$
Proposition 5: Consider a $\kappa \in \mathbb{N}_{\geq 2}$ and a $\theta \in \mathbb{R}_{>0}$ and a differential equation taking the form
$$\frac{d}{dt} Z(t) = P(D, E) Z(t), \qquad Z(0) = [I_p\ I_p\ \dots\ I_p]^\top, \tag{29}$$
where $Z(t) = [Z^0(t)^\top\ Z^1(t)^\top\ \dots\ Z^{\kappa-1}(t)^\top]^\top$, $Z^i(t) \in \mathbb{R}^{p \times p}$; the matrix $D$ takes the form
$$D = \begin{bmatrix} -U \Gamma U^{-1} & B \\ 0 & 0 \end{bmatrix} \tag{30}$$
for a diagonal matrix $\Gamma = \mathrm{diag}(\gamma)$, $\gamma = [\gamma_1\ \gamma_2\ \dots\ \gamma_{p-1}]$, with diagonal entries $\gamma_i > 0$, and some $B \in \mathbb{R}^{(p-1) \times 1}$, $U \in \mathbb{R}^{(p-1) \times (p-1)}$; the matrix $E$ takes the form
$$E = \mathrm{diag}(G, 1), \tag{31}$$
for a diagonal matrix $G = \mathrm{diag}([g_1\ g_2\ \dots\ g_{p-1}])$, with diagonal entries $0 < g_i \leq 1$. Then the following holds
$$\lim_{t \to \infty} Z^0(t) = \begin{bmatrix} 0_{(p-1) \times (p-1)} & 0_{(p-1) \times 1} \\ v^\top & 1 \end{bmatrix}, \tag{32}$$
where $v \in \mathbb{R}^{p-1}$ is given by
$$v = U\Big(I - \frac{1}{\kappa\theta} \hat S_\kappa(\theta\gamma)\Big) \Gamma^{-1} U^{-1} B + \frac{1}{\kappa\theta} U \hat S_\kappa(\theta\gamma) U^{-1} G \big(I - U \hat F_\kappa(\theta\gamma) U^{-1} G\big)^{-1} U \hat S_\kappa(\theta\gamma) U^{-1} B. \tag{33}$$
$\square$
Since (27), (28) take the form (29) we can use this result to provide explicit expressions for the statistical moments (13), (19). In fact, using (32) we have that
$$\lim_{t \to \infty} [E[m(t)]\ \ E[p(t)]] = u^\top,$$
where $u$ is described by (33) when $D = A$ and $E = J$ and $A$, $J$ take the form (11), (12), and
$$\lim_{t \to \infty} [E[m(t)^2]\ \ E[m(t)p(t)]\ \ E[m(t)]\ \ E[m(t)p(t)]\ \ E[p(t)^2]\ \ E[p(t)]\ \ E[m(t)]\ \ E[p(t)]] = w^\top,$$
where $w$ is described by (33) when $D = A \otimes I_n + I_n \otimes A$ and $E = J \otimes J$. Proposition 5 allows us then to infer how the statistical moments vary with the model parameters. The expression for the steady-state expected value of the mRNA count, obtained as a direct application of Proposition 5, is given in the next result.
Proposition 6: The steady-state value of the expected value of the mRNA count $m(t)$ is as follows
$$\lim_{t \to \infty} E[m(t)] = \frac{k_m}{\gamma_m} \Big[1 - \frac{1}{\kappa\theta} \frac{\hat s_\kappa(\theta\gamma_m)}{2 - \hat f_\kappa(\theta\gamma_m)}\Big].$$
$\square$
Expressions for the asymptotic values of the covariance of mRNA, and expected values and covariance of protein can be obtained in a similar way, although the expressions are naturally much longer. We will show in the next section these expressions as a function of $\kappa$, by fixing the remaining parameters.

V. NUMERICAL RESULTS
For a particular cell type, recent experiments have approximated the distribution for the time intervals between cell divisions as a lognormal distribution with mean $\mu_s = 9.3$ hours and standard deviation 2.54 hours [26, Fig. 1D]. The coefficient of variation squared ($CV^2$) of the distribution is given by $(2.54/9.3)^2 = 1/13.406$. Since the $CV^2$ of the gamma distribution is $1/\kappa$, we approximate this lognormal distribution by a gamma distribution with $\kappa \approx 13$. This approximation is illustrated in Figure 1.

[Figure 1: probability density functions (pdf versus $x/\mu_s$) of the lognormal distribution with $\mu = -\sigma^2/2$, $\sigma = 0.268$, and of the gamma distribution with shape 13 and scale 1/13.]
Fig. 1. Gamma distribution with shape parameter 13 approximating a log-normal distribution with variance $\frac{1}{13.406}$. Both distributions are normalized to have a unitary mean.

For the gene-expression model (1) we consider the following parameters normalized by the mean cell division time $\mu_s = 9.3$ hours:
$$\gamma_m = \frac{5}{\mu_s}, \qquad \gamma_p = \frac{1}{\mu_s}, \qquad k_m = 20, \qquad k_p = 100. \tag{34}$$
These values imply a 10-hour and a 2-hour protein and mRNA half-life, respectively. Next we use results from Section III to compute the steady-state coefficients of variation squared for the mRNA and protein levels (defined in (3) and (4)). These values are obtained by numerically solving the Volterra equations (21) and (15) as described before and obtaining the steady-state values. For the lognormal distribution considered in [26] this gives
$$\lim_{t \to \infty} CV_m^2(t) = 0.0191, \qquad \lim_{t \to \infty} CV_p^2(t) = 0.0560.$$
For the approximating gamma distribution we can use directly Propositions 5, 6, yielding the very similar values (as expected from the accurate fitting depicted in Figure 1)
$$\lim_{t \to \infty} CV_m^2(t) = 0.0191, \qquad \lim_{t \to \infty} CV_p^2(t) = 0.0563.$$
A $CV^2$ of 0.056 corresponds to a standard deviation that is 24% of the mean, showing that significant heterogeneity in protein levels can be generated by randomness in the cell division process.
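The numerical solution of the Volterra equation (15) referred to above can be sketched in Python with the trapezoidal version of (16). This is a minimal sketch under stated assumptions, not the code used to produce the reported values: the function names and the Taylor-series `expm` helper are hypothetical, time is measured in units of the mean division time, the rates are taken as $\gamma_m = 5$, $\gamma_p = 1$, $k_m = 20$, $k_p = 100$ in those units, and the division intervals are assumed gamma distributed with integer shape $\kappa$ and unit mean (scale $1/\kappa$, as in the legend of Figure 1).

```python
import math
import numpy as np

def expm(M, terms=20):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    k = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    X = M / 2.0**k
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for i in range(1, terms + 1):
        term = term @ X / i
        E = E + term
    for _ in range(k):
        E = E @ E
    return E

def gamma_pdf_and_survivor(kappa, mean=1.0):
    """Density and survivor function of a gamma law with integer shape and given mean."""
    lam = kappa / mean
    def pdf(t):
        if t == 0.0:
            return lam if kappa == 1 else 0.0
        return math.exp(kappa * math.log(lam) + (kappa - 1) * math.log(t)
                        - lam * t - math.lgamma(kappa))
    def surv(t):
        return math.exp(-lam * t) * sum((lam * t) ** i / math.factorial(i)
                                        for i in range(kappa))
    return pdf, surv

def solve_volterra(A, J, pdf, surv, t_end, h):
    """Trapezoidal scheme for (15); returns Phi(jh) for j = 0, ..., t_end/h."""
    n = A.shape[0]
    steps = int(round(t_end / h))
    E1 = expm(A * h)
    expA = np.empty((steps + 1, n, n))
    expA[0] = np.eye(n)
    for r in range(1, steps + 1):
        expA[r] = E1 @ expA[r - 1]
    # K[r] = (J e^{A r h})^T f(r h)
    K = np.array([(J @ expA[r]).T * pdf(r * h) for r in range(steps + 1)])
    Phi = np.empty((steps + 1, n, n))
    Phi[0] = np.eye(n)
    lhs = np.eye(n) - 0.5 * h * K[0]  # the r = 0 node multiplies the unknown Phi(jh)
    for j in range(1, steps + 1):
        conv = np.einsum('rab,rbc->ac', K[1:j + 1], Phi[j - 1::-1])
        conv -= 0.5 * K[j] @ Phi[0]   # end node r = j carries weight h/2
        Phi[j] = np.linalg.solve(lhs, h * conv + expA[j].T * surv(j * h))
    return Phi

A = np.array([[-5.0, 0.0, 20.0], [100.0, -1.0, 0.0], [0.0, 0.0, 0.0]])
J = np.diag([0.5, 0.5, 1.0])
x0 = np.array([1.0, 100.0, 1.0])
kappa = round((9.3 / 2.54) ** 2)           # shape matched to the CV^2 of the data
pdf, surv = gamma_pdf_and_survivor(kappa)
Phi = solve_volterra(A, J, pdf, surv, t_end=8.0, h=0.02)
mean_at_end = Phi[-1].T @ x0               # [E[m], E[p], 1] at t = 8, from (13)
```

Because $e^{A\tau} \otimes e^{A\tau} = e^{(A \otimes I_n + I_n \otimes A)\tau}$, the same routine can be applied to (21) by passing `np.kron(A, I) + np.kron(I, A)` and `np.kron(J, J)` in place of `A` and `J`, with `np.kron(x0, x0)` as the initial vector. A histogram-based density, as discussed after (16), could be supplied through the `pdf` and `surv` arguments.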
Next, we investigate how moments vary with the shape of the distribution. In Figure 2 we plot mRNA/protein means and $CV^2$ for a gamma distributed cell division time with a unit mean and increasing shape parameter $\kappa$ (which corresponds to decreasing variance of the gamma distribution). The moments first sharply decrease with increasing $\kappa$ but then saturate to a lower limit for larger values of $\kappa$. The moment dynamics are illustrated in Figure 3; they become more and more oscillatory as the cell division times become more deterministic with increasing $\kappa$.
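The dependence of the steady-state moments on $\kappa$ can also be explored numerically by solving the left-eigenvector condition (39) with the normalization (40) given in the appendix, instead of evaluating closed-form expressions. The Python sketch below does this for a gamma distribution with integer shape $\kappa$ and mean $\theta$ (stage rate $\bar\lambda = \kappa/\theta$). It is an illustration only: the function names are hypothetical, a least-squares solver is used as a convenient way to impose the normalization, and the parameter values (rates in units of the mean division time) are assumptions.

```python
import numpy as np

def stationary_moment_vector(D, E, kappa, theta=1.0):
    """Sum over phases of the vector w solving w^T P(D, E) = 0, cf. (39)-(41).

    The system is written for G = P(D, E)^T, so that G w = 0, and the last
    entries of the kappa blocks of w are constrained to sum to one, cf. (40).
    """
    r = D.shape[0]
    lam = kappa / theta
    G = np.zeros((kappa * r, kappa * r))
    for i in range(kappa):
        rows = slice(i * r, (i + 1) * r)
        G[rows, rows] += D - lam * np.eye(r)
        prev = (i - 1) % kappa
        cols = slice(prev * r, (prev + 1) * r)
        # phase 0 is entered through a division (reset by E), the others are not
        G[rows, cols] += lam * (E if i == 0 else np.eye(r))
    c = np.zeros(kappa * r)
    c[r - 1::r] = 1.0
    lhs = np.vstack([G, c])
    rhs = np.zeros(kappa * r + 1)
    rhs[-1] = 1.0
    w = np.linalg.lstsq(lhs, rhs, rcond=None)[0]
    return w.reshape(kappa, r).sum(axis=0)

def stationary_moments(kappa, theta=1.0):
    """Steady-state E[x] and E[x (x) x] for x = [m, p, a] with assumed rates."""
    A = np.array([[-5.0, 0.0, 20.0], [100.0, -1.0, 0.0], [0.0, 0.0, 0.0]])
    J = np.diag([0.5, 0.5, 1.0])
    I = np.eye(3)
    mean = stationary_moment_vector(A, J, kappa, theta)
    second = stationary_moment_vector(np.kron(A, I) + np.kron(I, A),
                                      np.kron(J, J), kappa, theta)
    return mean, second

# Sweep over a few shape parameters with the mean division time fixed at one.
results = {}
for kappa in (1, 3, 13, 30):
    mean, second = stationary_moments(kappa)
    results[kappa] = {
        "mean_m": mean[0],
        "mean_p": mean[1],
        "cv2_m": second[0] / mean[0] ** 2 - 1.0,   # index 0 of x (x) x is m^2
        "cv2_p": second[4] / mean[1] ** 2 - 1.0,   # index 4 of x (x) x is p^2
    }
```

Writing the system for the transpose of $P(D, E)$ keeps the unknown a plain vector; the sum of its $\kappa$ blocks then plays the role of the bottom row of the limit in (32).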
VI. CONCLUSIONS AND FUTURE WORK
Most stochastic models of gene-expression incorporate noise in the production/degradation process of molecules and do not include the cell-division process. Here we have considered a renewal model of gene-expression where mRNA/protein levels evolve deterministically, and reduce by half every time the cell divides into daughter cells. Stochasticity enters our model through the time interval between cell divisions, which is assumed to be an arbitrary random variable. We provided a method to obtain both the time-evolution and steady-state statistical moments of the mRNA/protein levels.
For the special case of gamma distributed cell division times, explicit analytical expressions were provided for these steady-state moments. Analytical expressions were useful in understanding how stochastic variability is connected to underlying model parameters. In particular, our results show that as the time interval between cell divisions becomes more deterministic, the moment dynamics become more oscillatory and both means and $CV^2$ converge to lower values. Finally, calculations using distributions obtained from experiments reveal that, for a given parameter set, randomness in the cell division process can be a significant factor in creating intercellular variability in protein levels (the standard deviation in protein level was 24% of the mean level).
Future work will consider computing higher order moments (such as skewness and kurtosis) for the protein/mRNA levels since they will enable us to obtain the stationary probability distribution. As many proteins are present at low-copy numbers inside cells, two additional sources of noise come into play: i) stochastic birth-death of individual mRNA/protein molecules and ii) stochastic partitioning of molecules between daughter cells at the time of cell division, which could be modeled by a binomial distribution. An important direction of future work will be to consider more complex models of gene-expression that incorporate all these different sources of noise. Such models will enable a systematic understanding of their contributions to the observed variability in protein/mRNA levels.

[Figure 2: four panels (mRNA asymptotic expected value, protein asymptotic expected value, mRNA asymptotic $CV^2$, protein asymptotic $CV^2$), each plotted against the shape of the gamma distribution.]
Fig. 2. Steady-state mean and coefficient of variation of mRNA and protein population count as a function of $\kappa$ (shape of gamma distribution). The mean of the gamma distribution is fixed at one and therefore increasing $\kappa$ reduces the variance in the cell division time distributions and leads to lower variability in mRNA/protein levels.

[Figure 3: four panels (mRNA expected value, protein expected value, mRNA $CV^2$, protein $CV^2$), each plotted against time for the lognormal distribution and for $\kappa = 1, 3, 13, 30$.]
Fig. 3. Dynamics of mRNA/protein mean and coefficient of variation for different values of the shape parameter $\kappa$ and for a log-normal distribution with variance $\frac{1}{13.406}$ and unitary mean. The initial conditions are taken as $m(0) = 1$ and $p(0) = 100$.
APPENDIX
Proof: (of Proposition 1) Conditioning (14) on the time of the first jump $t_1$, we obtain
$$\Phi(t) = \int_0^\infty E[S(t)^\top \mid t_1 = s] f(s)\,ds, \tag{35}$$
where
$$E[S(t)^\top \mid t_1 = s] = \begin{cases} e^{A^\top t}, & \text{if } s > t, \\ E[(S_1(t - s) J e^{As})^\top], & \text{if } s \leq t, \end{cases} \tag{36}$$
and $S_1(t - s)$ is the transition matrix of (6) from $s = t_1$ to $t$, which depends on $\{h_k : k \geq 1\}$. Due to the i.i.d. assumption on the intervals between transitions, $E[S_1(t)^\top] = \Phi(t)$. Thus, partitioning (35) using (36) we obtain (15).
Proof: (of Proposition 2) The proof is similar to the proof of Proposition 1 and is obtained by conditioning (20) on the time of the first jump $t_1$ and noticing that
$$E[(S_1(t - s) J e^{As})^\top \otimes (S_1(t - s) J e^{As})^\top] = \big((J e^{As})^\top \otimes (J e^{As})^\top\big) \Psi(t - s).$$
Proof: (of Proposition 3) The fact that $f$ (and hence also the survivor function $s$) is differentiable implies that the solution to (15) is differentiable (the proof of this fact can be found in [31]). Differentiating (15) we obtain
$$\frac{d}{dt} \Phi(t) = (J e^{At})^\top f(t) + \int_0^t (J e^{As})^\top \frac{d}{dt} \Phi(t - s) f(s)\,ds + \Big(A^\top - \frac{1}{\theta} I\Big) e^{A^\top t} s(t), \tag{37}$$
where we used the fact that $\frac{d}{dt} s(t) = -f(t)$ and $f(t) = \frac{1}{\theta} s(t)$ for exponential distributions ($f(t) = \frac{1}{\theta} e^{-\frac{t}{\theta}}$) and that $\Phi(0) = I$. For the integral term on the right hand side we use the fact that $\frac{d}{dt} \Phi(t - s) = -\frac{d}{ds} \Phi(t - s)$, and use integration by parts to obtain
$$\int_0^t (J e^{As})^\top \Big(-\frac{d}{ds} \Phi(t - s)\Big) f(s)\,ds = \frac{1}{\theta} J^\top \Phi(t) - (J e^{At})^\top f(t) + \Big(A^\top - \frac{1}{\theta} I\Big) \int_0^t (J e^{As})^\top \Phi(t - s) f(s)\,ds. \tag{38}$$
Replacing (38) in (37) and using (15) we obtain (24). Likewise, to establish (25) it suffices to differentiate (21) and use similar arguments.
Proof: (of Proposition 4) From similar arguments to the ones used in the proof of Proposition 1 we obtain that
$$\Phi^\iota(t) = \int_0^t e^{A^\top \tau} \Phi^{\iota+1}(t - \tau) \hat f(\tau)\,d\tau + e^{A^\top t} \hat s(t)$$
if $\iota \in \{0, \dots, \kappa - 2\}$ and
$$\Phi^{\kappa-1}(t) = \int_0^t (J e^{A\tau})^\top \Phi^0(t - \tau) \hat f(\tau)\,d\tau + e^{A^\top t} \hat s(t)$$
for an exponential distribution $\hat f(t) := \frac{\kappa}{\theta} e^{-\frac{\kappa}{\theta} t}$ and corresponding survivor function $\hat s(t) := e^{-\frac{\kappa}{\theta} t}$. The proof of (27) then follows by differentiating these expressions and using similar arguments to the ones used in the proof of Proposition 3. The proof of (28) follows from analogous arguments.
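Proof: (of Proposition 5) It is easy to see that for matrices $D$ and $E$ taking the form (30) and (31), the matrix $P(D, E)$ has a unique zero eigenvalue with associated right eigenvector $1_\kappa \otimes [0_{1 \times (p-1)}\ 1]^\top$ and all the remaining eigenvalues have negative real part. Thus,
$$\lim_{t \to \infty} e^{P(D, E)t} = \Big(1_\kappa \otimes \begin{bmatrix} 0_{(p-1) \times 1} \\ 1 \end{bmatrix}\Big) w^\top,$$
where $w = (w_0, \dots, w_{\kappa-1})$, $w_i \in \mathbb{R}^p$, is the (normalized) left eigenvector associated with eigenvalue 0 characterized by
$$w^\top P(D, E) = 0, \tag{39}$$
and
$$w^\top \Big(1_\kappa \otimes \begin{bmatrix} 0_{(p-1) \times 1} \\ 1 \end{bmatrix}\Big) = 1 \ \Leftrightarrow\ [0_{1 \times (p-1)}\ 1] \Big(\sum_{\iota=0}^{\kappa-1} w_\iota\Big) = 1. \tag{40}$$
Taking into account that $Z(t) = e^{P(D, E)t} Z(0)$ and the characterization of the eigenvalues of $P(D, E)$, we have that
$$\lim_{t \to \infty} Z^\iota(t) = \begin{bmatrix} 0_{(p-1) \times 1} \\ 1 \end{bmatrix} \Big(\sum_{i=0}^{\kappa-1} w_i\Big)^\top \tag{41}$$
for every $\iota \in \{0, \dots, \kappa - 1\}$. From (39) we obtain that
$$w_i = \Big(-\frac{\kappa}{\theta} \Big(D - \frac{\kappa}{\theta} I_p\Big)^{-1}\Big)^i E w_\kappa$$
for $i \in \{1, \dots, \kappa - 1\}$ and
$$w_\kappa = \Big(-\frac{\kappa}{\theta} \Big(D - \frac{\kappa}{\theta} I_p\Big)^{-1}\Big)^\kappa E w_\kappa.$$
Now the eigenvalue decomposition of $D$ is as follows: $D = W \mathrm{diag}(-\Gamma, 0) W^{-1}$, where
$$W = \begin{bmatrix} U & U \Gamma^{-1} U^{-1} B \\ 0 & 1 \end{bmatrix}, \qquad W^{-1} = \begin{bmatrix} U^{-1} & -\Gamma^{-1} U^{-1} B \\ 0 & 1 \end{bmatrix}.$$
From this decomposition we can obtain that
$$\Big(-\frac{\kappa}{\theta} \Big(D - \frac{\kappa}{\theta} I_p\Big)^{-1}\Big)^\kappa = W \begin{bmatrix} \hat F_\kappa(\theta\gamma) & 0 \\ 0 & 1 \end{bmatrix} W^{-1}$$
and
$$\sum_{\iota=0}^{\kappa-1} \Big(-\frac{\kappa}{\theta} \Big(D - \frac{\kappa}{\theta} I_p\Big)^{-1}\Big)^\iota = W \begin{bmatrix} \frac{\kappa}{\theta} \hat S_\kappa(\theta\gamma) & 0 \\ 0 & \kappa \end{bmatrix} W^{-1},$$
from which we can conclude that
$$w_\kappa = \alpha \begin{bmatrix} (I - U \hat F_\kappa(\theta\gamma) U^{-1} G)^{-1} U \hat S_\kappa(\theta\gamma) U^{-1} B \\ 1 \end{bmatrix}$$
and
$$\sum_{i=1}^{\kappa} w_i = \alpha \begin{bmatrix} \kappa v \\ \kappa \end{bmatrix},$$
where $v$ is described in (33) and $\alpha$ is a normalization factor which equals $\frac{1}{\kappa}$ due to (40). Thus (32) follows from (41).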
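Proof: (of Proposition 6) Since the mRNA count $m(t)$ does not depend on the protein count $p(t)$, we can consider a simplified model with $x = (m, a)$,
$$A = \begin{bmatrix} -\gamma_m & k_m \\ 0 & 0 \end{bmatrix}, \qquad J = \begin{bmatrix} \tfrac{1}{2} & 0 \\ 0 & 1 \end{bmatrix}.$$
In this case $U = 1$, $\Gamma = \gamma_m$, $B = k_m$, $G = \tfrac{1}{2}$, which replaced in (33) yields
$$\begin{aligned} v &= \Big(1 - \frac{1}{\kappa\theta} \hat s_\kappa(\theta\gamma_m)\Big) \gamma_m^{-1} k_m + \frac{1}{\kappa\theta} \hat s_\kappa(\theta\gamma_m) \frac{1}{2} \Big(1 - \frac{1}{2} \hat f_\kappa(\theta\gamma_m)\Big)^{-1} \hat s_\kappa(\theta\gamma_m) k_m \\ &= \frac{k_m}{\gamma_m} \Big(1 - \frac{1}{\kappa\theta} \hat s_\kappa(\theta\gamma_m) \Big(1 - \frac{1 - \hat f_\kappa(\theta\gamma_m)}{2 - \hat f_\kappa(\theta\gamma_m)}\Big)\Big) \\ &= \frac{k_m}{\gamma_m} \Big(1 - \frac{1}{\kappa\theta} \frac{\hat s_\kappa(\theta\gamma_m)}{2 - \hat f_\kappa(\theta\gamma_m)}\Big). \end{aligned} \tag{42}$$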
REFERENCES
[1] A. Raj and A. van Oudenaarden, “Nature, nurture, or chance: stochastic gene expression and its consequences,” Cell, vol. 135, pp. 216–226,
2008.
[2] A. Raj, C. Peskin, D. Tranchina, D. Vargas, and S. Tyagi, “Stochastic
mRNA synthesis in mammalian cells,” PLoS Biology, vol. 4, p. e309,
2006.
[3] I. Golding, J. Paulsson, S. Zawilski, and E. Cox, “Real-time kinetics
of gene activity in individual bacteria,” Cell, vol. 123, pp. 1025–1036,
2005.
[4] A. Bar-Even, J. Paulsson, N. Maheshri, M. Carmi, E. O’Shea, Y. Pilpel,
and N. Barkai, “Noise in protein expression scales with natural protein
abundance,” Nature Genetics, vol. 38, pp. 636–643, 2006.
[5] J. R. S. Newman, S. Ghaemmaghami, J. Ihmels, D. K. Breslow,
M. Noble, J. L. DeRisi, and J. S. Weissman, “Single-cell proteomic
analysis of S. cerevisiae reveals the architecture of biological noise,”
Nature, vol. 441, pp. 840–846, 2006.
[6] M. Kaern, T. Elston, W. Blake, and J. Collins, “Stochasticity in gene
expression: from theories to phenotypes,” Nature Reviews Genetics,
vol. 6, pp. 451–464, 2005.
[7] M. B. Elowitz, A. J. Levine, E. D. Siggia, and P. S. Swain, “Stochastic
gene expression in a single cell,” Science, vol. 297, pp. 1183–1186,
2002.
[8] R. Losick and C. Desplan, “Stochasticity and cell fate,” Science, vol.
320, pp. 65–68, 2008.
[9] A. Singh and L. S. Weinberger, “Stochastic gene expression as a
molecular switch for viral latency,” Current Opinion in Microbiology,
vol. 12, pp. 460–466, 2009.
[10] L. S. Weinberger, J. C. Burnett, J. E. Toettcher, A. Arkin, and D. Schaffer, “Stochastic gene expression in a lentiviral positive-feedback loop:
HIV-1 Tat fluctuations drive phenotypic diversity,” Cell, vol. 122, pp.
169–182, 2005.
[11] A. Arkin, J. Ross, and H. H. McAdams, “Stochastic kinetic analysis
of developmental pathway bifurcation in phage λ-infected Escherichia
coli cells,” Genetics, vol. 149, pp. 1633–1648, 1998.
[12] E. Libby, T. J. Perkins, and P. S. Swain, “Noisy information processing through transcriptional regulation,” Proceedings of the National
Academy of Sciences, vol. 104, pp. 7151–7156, 2007.
[13] A. Eldar and M. B. Elowitz, “Functional roles for noise in genetic
circuits,” Nature, vol. 467, pp. 167–173, Sept. 2010.
[14] J. W. Veening, W. K. Smits, and O. P. Kuipers, “Bistability, epigenetics, and bet-hedging in bacteria,” Annual Review of Microbiology,
vol. 62, pp. 193–210, 2008.
[15] E. Kussell and S. Leibler, “Phenotypic diversity, population growth,
and information in fluctuating environments,” Science, vol. 309, pp.
2075–2078, 2005.
[16] N. Balaban, J. Merrin, R. Chait, L. Kowalik, and S. Leibler, “Bacterial
persistence as a phenotypic switch,” Science, vol. 305, pp. 1622–1625,
2004.
[17] A. Singh, B. S. Razooky, R. D. Dar, and L. S. Weinberger, “Dynamics
of protein noise can distinguish between alternate sources of gene-expression variability,” Molecular Systems Biology, vol. 8, p. 607,
2012.
[18] V. Shahrezaei and P. S. Swain, “Analytical distributions for stochastic
gene expression,” Proceedings of the National Academy of Sciences,
vol. 105, pp. 17 256–17 261, 2008.
[19] J. Paulsson, “Summing up the noise in gene networks,” Nature, vol.
427, pp. 415–418, 2004.
[20] N. Friedman, L. Cai, and X. Xie, “Linking stochastic dynamics to
population distribution: an analytical framework of gene expression,”
Phys. Rev. Lett., vol. 97, p. 168302, 2006.
[21] Y. Taniguchi, P. Choi, G. Li, H. Chen, M. Babu, J. Hearn, A. Emili, and
X. Xie, “Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells,” Science, vol. 329, pp. 533–538,
2010.
[22] B. Munsky, B. B. Trinh, and M. Khammash, “Listening to the noise:
random fluctuations reveal gene network parameters,” Molecular systems biology, vol. 5, p. 318, 2009.
[23] A. Singh, B. Razooky, C. D. Cox, M. L. Simpson, and L. S.
Weinberger, “Transcriptional bursting from the HIV-1 promoter is
a significant source of stochastic noise in HIV-1 gene expression,”
Biophysical Journal, vol. 98, pp. L32–L34, 2010.
[24] A. Roeder, V. Chickarmane, B. Obara, B. Manjunath, and E. M.
Meyerowitz, “Variability in the control of cell division underlies sepal
epidermal patterning in Arabidopsis thaliana,” PLoS biology, vol. 8,
p. e1000367, 2010.
[25] A. Zilman, V. Ganusov, and A. Perelson, “Stochastic models of
lymphocyte proliferation and death,” PloS One, vol. 5, p. e12775,
2010.
[26] E. D. Hawkins, J. F. Markham, L. P. McGuinness, and P. Hodgkin,
“A single-cell pedigree analysis of alternative stochastic lymphocyte
fates,” Proceedings of the National Academy of Sciences, vol. 106, pp.
13 457–13 462, 2009.
[27] D. Antunes, J. P. Hespanha, and C. Silvestre, “Volterra integral
approach to impulsive renewal systems: Application to networked
control,” IEEE Transactions on Automatic Control, vol. 57, no. 3,
pp. 607–619, Mar. 2012.
[28] ——, “Stochastic hybrid systems with renewal transitions: Moment
analysis with application to networked control systems with delays,”
May 2012, accepted for publication in the SIAM Journal on Control
and Optimization. Available at http://www.dct.tue.nl/New/Antunes/
publications/.
[29] P. Linz, Analytical and Numerical Methods for Volterra Equations.
SIAM Studies in Applied Mathematics 7, 1985.
[30] R. A. Horn and C. R. Johnson, Topics in matrix analysis. New York,
NY, USA: Cambridge University Press, 1994.
[31] G. Gripenberg, S. O. Londen, and O. Staffans, Volterra Integral and
Functional Equations. Cambridge University Press, 1990.