A methodology for global sensitivity analysis of time-dependent outputs in systems biology
modelling
T. Sumner, E. Shephard, I.D.L. Bogle
Electronic Supplementary Material
This electronic appendix provides mathematical details for the calculation of functional principal
components, describes the algorithm for generating the parameter sets for the Morris method and
provides mathematical details of the insulin signalling pathway model which is used as an example to
demonstrate the sensitivity analysis methodology presented in the accompanying article. A
comparison of the principal components generated from the Sobol and Morris methods is also
included.
A. Functional Principal Component Analysis
In standard multivariate PCA the aim is to transform our original data set (yij, i = 1,…,N, j = 1,…,p)
consisting of N observations of p variables into some new set of variables which most efficiently
explain the variance in the observations. The directions of these new variables are described by the
principal component weight vectors and the values of each observation in these new variables are
given by the principal component scores.


First we find the principal component weight vector, $\xi_1 = (\xi_{11},\dots,\xi_{p1})'$, for which the principal
component scores:
$$\eta_{i1} = \sum_{j=1}^{p} \xi_{j1}\, y_{ij}, \qquad i = 1,\dots,N \tag{A1}$$
maximize $\sum_{i=1}^{N} \eta_{i1}^2$ subject to the constraint:
$$\sum_{j=1}^{p} \xi_{j1}^2 = 1 \tag{A2}$$
Then compute the weight vector $\xi_2 = (\xi_{12},\dots,\xi_{p2})'$ in the same manner subject to the additional
constraint:
$$\sum_{j=1}^{p} \xi_{j2}\,\xi_{j1} = 0 \tag{A3}$$
and so on up to a maximum number of principal components dictated by the number of variables, p.
In functional principal component analysis (fPCA) our data consists of N observations of some
function y(t) and the aim is to find some new set of functions which most efficiently capture the
variation in the data.
We begin by finding the principal component weight function $\xi_1(t)$ for which the principal
component scores
$$\eta_{i1} = \int \xi_1(t)\, y_i(t)\, dt, \qquad i = 1,\dots,N \tag{A4}$$
maximize $\sum_{i=1}^{N} \eta_{i1}^2$ subject to the constraint:
$$\int \xi_1(t)^2\, dt = 1 \tag{A5}$$
Then we compute the next weight function $\xi_2(t)$ in the same manner subject to the additional
constraint:
$$\int \xi_2(t)\,\xi_1(t)\, dt = 0 \tag{A6}$$
In the functional case, the maximum number of components which can be calculated is dictated by the
number of observations or functions N. In reality only n << N principal components need to be
considered because higher components describe very small amounts of the variance in the model
outputs.
Functional principal components can be calculated in a number of ways. The simplest conceptual
approach is to discretize the N functions on some regular grid of time points. This discretized data can
be considered as a set of N observations of p variables (where p is the number of time points). The
principal components can then be calculated by solving the eigenequation:
$$V\xi = \rho\,\xi \tag{A7}$$
where $V = N^{-1}Y'Y$ is the sample variance-covariance matrix, in which Y is the N × p matrix whose columns
represent the p variables and whose rows represent the N observations, ξ is an eigenvector of V and ρ is the
corresponding eigenvalue.
This approach is only suitable if the time points are evenly spaced, a condition which may not be met
if the model is solved using a numerical method with an adaptive step size. Adaptive step solvers can
be more computationally efficient because, unlike fixed-step solvers where the step-size throughout
the whole solution must be small enough to capture the fastest variation in the model output, a larger
step can be used for those regions where the model output varies more slowly.
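As a rough illustration of the discretized route through equation A7, the following Python sketch builds V from curves sampled on a regular grid and extracts the leading principal component. The function name `leading_component`, the synthetic sine curves and the use of power iteration are assumptions made here to keep the example dependency-free; they are not taken from the article. The sketch also assumes that the mean curve has already been subtracted from every observation.

```python
import math

def leading_component(Y, iterations=200):
    """Leading eigenvalue and eigenvector of V = Y'Y / N (equation A7).

    Y: list of N rows (observed curves), each sampled at the same p time points.
    Assumes the mean curve has already been subtracted from every row.
    """
    N, p = len(Y), len(Y[0])
    # V[j][k] = (1/N) * sum over observations of Y[i][j] * Y[i][k]
    V = [[sum(row[j] * row[k] for row in Y) / N for k in range(p)]
         for j in range(p)]
    xi = [1.0 / math.sqrt(p)] * p          # arbitrary unit-length start vector
    for _ in range(iterations):            # power iteration: xi <- V xi / |V xi|
        w = [sum(V[j][k] * xi[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        xi = [v / norm for v in w]
    V_xi = [sum(V[j][k] * xi[k] for k in range(p)) for j in range(p)]
    rho = sum(a * b for a, b in zip(xi, V_xi))   # Rayleigh quotient xi' V xi
    return rho, xi

# Synthetic, already-centred curves y_i(t) = a_i sin(pi t) on a regular grid.
grid = [j / 10 for j in range(11)]
amplitudes = [-2.0, -1.0, 0.0, 1.0, 2.0]
Y = [[a * math.sin(math.pi * t) for t in grid] for a in amplitudes]
rho, xi = leading_component(Y)
```

Power iteration only recovers the first component and fails if the start vector happens to be orthogonal to it; in practice a full symmetric eigen-solver from a numerical library would be the more usual choice.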
The second method for calculating the principal components of functional data requires the data to
first be expanded using some pre-defined set of basis functions. The principal component analysis can
then be defined as an eigen-analysis problem in terms of the covariance of the coefficients of the
expansion as outlined below.
$$v(s,t) = N^{-1}\sum_{i=1}^{N} y_i(s)\, y_i(t) \tag{A8}$$
is the variance-covariance function of the sample of observed functions. The functional eigenequation
is:
$$\int v(s,t)\,\xi(t)\, dt = \rho\,\xi(s) \tag{A9}$$
where ρ is an eigenvalue and ξ(t) is an eigenfunction of the variance-covariance function.
If the observed functions are expanded in terms of some set of basis functions $\phi(t)$:
$$y(t) = C\,\phi(t) \tag{A10}$$
and the jth eigenfunction by the expansion:
$$\xi_j(s) = \phi(s)'\, b_j \tag{A11}$$
Then the variance-covariance function can be rewritten:
$$v(s,t) = N^{-1}\,\phi(s)'\, C'C\, \phi(t) \tag{A12}$$
And the eigenequation becomes:
$$N^{-1}\,\phi(s)'\, C'C\, J\, b_j = \rho\,\phi(s)'\, b_j \tag{A13}$$
where $J = \int \phi(t)\,\phi(t)'\, dt$.
Equation A13 must be true for all values of s so:
$$N^{-1} J^{1/2}\, C'C\, J^{1/2}\, u_j = \rho\, u_j \tag{A14}$$
where $u_j = J^{1/2} b_j$.
Equation A14 can be solved to find $u_j$, from which we can calculate the coefficients $b_j$ of the
expansion of the eigenfunctions $\xi_j(s)$ in equation A11.
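The step from equation A13 to equation A14 can be spelled out as follows. This is a sketch which assumes that the basis functions are linearly independent, so that J is symmetric positive definite and both $J^{1/2}$ and $J^{-1/2}$ exist; that assumption is not stated explicitly above.

$$
\begin{aligned}
N^{-1}\, C'C\, J\, b_j &= \rho\, b_j && \text{(A13 holds for all } s\text{)}\\
N^{-1}\, C'C\, J^{1/2} u_j &= \rho\, J^{-1/2} u_j && \text{(substituting } b_j = J^{-1/2} u_j\text{)}\\
N^{-1} J^{1/2}\, C'C\, J^{1/2} u_j &= \rho\, u_j && \text{(premultiplying by } J^{1/2}\text{)}
\end{aligned}
$$

The matrix on the left of the final line is symmetric, so A14 is a standard symmetric eigenproblem. Under the same assumption the normalisation of the eigenfunctions, $\int \xi_j(s)^2\, ds = b_j' J b_j = u_j' u_j = 1$, reduces to requiring unit-length eigenvectors $u_j$.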
Further discussion of functional data analysis techniques including fPCA can be found in [1]. The
computation of functional principal components using the basis function approach was implemented
in this research using the “fda” package [2] for the statistical programming language R [3].
B. Morris Method Algorithm
The simplest way to generate r elementary effects for k parameters requires 2rk runs. The model must
be run twice for each elementary effect, once at P and once at P + Δ. The key to the Morris method is
a more efficient design which requires r(k + 1) model runs to generate the necessary samples. Each
parameter may take one of q values, V=(v1,…,vq), equally spaced between its minimum and maximum
value.
The method proceeds as follows:

1. Randomly select a base value P* for P, with each parameter being sampled from the subset of
   possible values v1,…,vq−1.
2. Increase one or more of the parameters in P* by Δ such that the resulting vector P(1) is still in
   the set of possible values.
3. Generate the second sampling point P(2) from P* with the property that it differs from P(1) in
   the randomly selected ith parameter by ±Δ.
4. Select P(3) such that it differs from P(2) for only one parameter j ≠ i by ±Δ.
The last step is repeated to produce a succession of k + 1 parameter vectors P(1),…,P(k+1) in which two
consecutive vectors differ in only one parameter and any parameter i of the base vector has been
selected once to be increased by Δ. These k + 1 vectors form a trajectory in the parameter space and
define a (k+1) × k matrix B* whose rows are the parameter vectors. If the model is then evaluated for
each vector (note that P* is not used to evaluate the model), an elementary effect can be calculated for
each factor as:
$$d_i\left(P^{(l)}\right) = \frac{y\left(P^{(l+1)}\right) - y\left(P^{(l)}\right)}{\Delta} \tag{B1}$$
By generating r such “design” matrices B* we can produce a sample of elementary effects of size
r for each factor. B* can be constructed as follows:
$$B^* = \left(J_{k+1,1}P^* + \frac{\Delta}{2}\left[(2B - J_{k+1,k})D^* + J_{k+1,k}\right]\right)M^* \tag{B2}$$
where B is a (k+1) × k matrix with elements that are 0s and 1s such that for every column index j there are
two rows of B that differ only in their jth element (a convenient choice is a strictly lower triangular matrix
of 1s), $J_{k+1,k}$ is a (k+1) × k matrix of 1s, D* is a k-dimensional diagonal matrix whose diagonal elements are
either +1 or −1 with equal probability and M* is a k × k random permutation matrix in which each column
contains one element equal to 1 and all others equal to 0 and no two columns have 1s in the same position.
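The construction in equation B2 can be sketched in a few lines of Python. The sketch makes several assumptions that are not in the text: parameters are rescaled to the unit interval so that the q levels are 0, 1/(q−1), …, 1; Δ is taken as one grid step, which is consistent with sampling P* from v1,…,vq−1; B is the strictly lower triangular matrix of 1s; and the function names are illustrative. The elementary effect is divided by the signed change in the parameter, which coincides with equation B1 whenever the step is +Δ.

```python
import random

def morris_trajectory(k, q, rng):
    """One (k+1) x k trajectory B* built as in equation B2 (unit-scaled levels)."""
    delta = 1.0 / (q - 1)                                     # assumed: one grid step
    base = [rng.randrange(q - 1) * delta for _ in range(k)]   # P*, from v1..v(q-1)
    signs = [rng.choice((-1, 1)) for _ in range(k)]           # diagonal of D*
    perm = list(range(k))
    rng.shuffle(perm)                                         # column permutation M*
    traj = []
    for r in range(k + 1):
        point = [0.0] * k
        for c in range(k):
            b = 1 if c < r else 0                 # strictly lower triangular B
            step = (delta / 2) * ((2 * b - 1) * signs[c] + 1)   # either 0 or delta
            point[perm[c]] = base[c] + step
        traj.append(point)
    return traj, delta

def elementary_effects(model, traj):
    """One elementary effect per parameter from consecutive trajectory points."""
    effects = {}
    for a, b in zip(traj, traj[1:]):
        # index of the single parameter that changed between the two points
        i = max(range(len(a)), key=lambda j: abs(b[j] - a[j]))
        effects[i] = (model(b) - model(a)) / (b[i] - a[i])    # signed step
    return effects

rng = random.Random(1)
traj, delta = morris_trajectory(k=3, q=4, rng=rng)
model = lambda p: 2.0 * p[0] - 3.0 * p[1] + 0.5 * p[2]       # toy linear model
effects = elementary_effects(model, traj)
```

For a linear toy model each elementary effect equals the corresponding coefficient, which gives a quick sanity check; repeating the call r times yields the r(k + 1) model runs described above.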
C. Mathematical Details of the Insulin Signalling Model
The insulin signalling model consists of 18 differential equations describing the dynamics of the
model variables (labelled x1-x18). The model is based on [4] and extended to describe the dynamics of
GSK3. The external input to the model is u(t), the amount of insulin at time t. In the results presented
in the article u is assumed to be a step function of magnitude u = 1×10^-6 M from t = 0 to t = 30 minutes
and u = 0 for t > 30 minutes.
C.1 Receptor Binding Subsystem
The receptor binding subsystem represents the association and dissociation of insulin and the
phosphorylation and dephosphorylation of the receptor. Free receptors (x1) can bind a single insulin
molecule (u). The ligand-receptor complex (x2) then undergoes phosphorylation. The phosphorylated,
once-bound receptor (x4) can bind a second insulin molecule (which has no effect on the
phosphorylation state) resulting in a twice-bound phosphorylated receptor (x3). The dissociation of the
first insulin molecule leads to rapid dephosphorylation of the receptor.
C.2 Receptor Recycling Subsystem
The second subsystem describes the synthesis, degradation, exocytosis (transfer to cell membrane)
and endocytosis (internalization) of receptors. Free receptors are recycled directly into the internal
pool (x5) which undergoes constant turnover via receptor synthesis and degradation. Internalized
phosphorylated receptors (x6 (twice bound) and x7 (once bound)) undergo an additional step in which
they are dephosphorylated before they are added to the intracellular pool.
C.3 Post Receptor Signalling Pathway
IRS (x8) is activated (x9) by the phosphorylated receptors and deactivated by PTP. The rate of IRS
activation is modelled as a linear function of the phosphorylated receptor concentration (x3 + x4).
Activated IRS binds with and activates free PI3K (x10) in a 1:1 stoichiometry. This complex (x11)
converts PI(4,5)P2 (x13) to PI(3,4,5)P3 (x12). This phosphoinositol lipid is also generated from
PI(3,4)P2 (x14). The lipid phosphatases, SHIP2 and PTEN convert PI(3,4,5)P3 back to PI(3,4)P2 and
PI(4,5)P2 respectively. The activation of Akt (x15 → x16) is taken to be dependent on the level of
PI(3,4,5)P3 and any intermediate steps (e.g. the action of PDK1/2) are not modelled. Active GSK3
(x17) is inactivated (x18) by active Akt.
C.4 Model Equations
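$$\frac{dx_1}{dt} = k_{-1}x_2 + k_{-3}x_4 - k_1 u x_1 + k_{-4}x_5 - k_4 x_1 \tag{C1}$$
$$\frac{dx_2}{dt} = k_1 u x_1 - k_{-1}x_2 - k_3 x_2 \tag{C2}$$
$$\frac{dx_3}{dt} = k_2 u x_4 - k_{-2}x_3 + k_{-4'}x_6 - k_{4'}x_3 \tag{C3}$$
$$\frac{dx_4}{dt} = k_3 x_2 + k_{-2}x_3 - k_2 u x_4 - k_{-3}x_4 + k_{-4'}x_7 - k_{4'}x_4 \tag{C4}$$
$$\frac{dx_5}{dt} = k_5 - k_{-5}x_5 + k_6(x_6 + x_7) + k_4 x_1 - k_{-4}x_5 \tag{C5}$$
$$\frac{dx_6}{dt} = k_{4'}x_3 - k_{-4'}x_6 - k_6 x_6 \tag{C6}$$
$$\frac{dx_7}{dt} = k_{4'}x_4 - k_{-4'}x_7 - k_6 x_7 \tag{C7}$$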
The receptor synthesis rate $k_5$ is defined so that the net synthesis and degradation of receptors is zero
under basal conditions; therefore $k_5 = k_{-5}x_5(0)$. If the intracellular receptor concentration falls below its
basal level, an accelerated synthesis rate $k_{5acc} = 6k_5$ is used.
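The receptor equations C1 to C7 can be sketched as a right-hand-side function in Python. The signs below are read in the usual mass-action form, in which every flux that leaves one receptor pool enters another, so the total receptor count changes only through synthesis and degradation; this reading is an assumption, as are the dictionary keys, the function names and the fixed-step RK4 integrator, none of which come from the article. Rate constants and initial conditions are the nominal values from Tables C1 and C2.

```python
def insulin_input(t, magnitude=1e-6, duration=30.0):
    """Insulin step u(t) in M: `magnitude` for 0 <= t <= duration (min), else 0."""
    return magnitude if 0.0 <= t <= duration else 0.0

# Nominal rate constants (Table C2); "m" marks a negative subscript, "p" a prime.
K = {"k1": 6e7, "km1": 0.20, "k2": 6e7, "km2": 20.0, "k3": 2500.0,
     "km3": 0.20, "k4": 0.00033, "km4": 0.003, "k4p": 2.1e-3,
     "km4p": 2.1e-4, "km5": 1.67e-18, "k6": 0.461}
X0 = [9e-13, 0.0, 0.0, 0.0, 1e-13, 0.0, 0.0]   # x1..x7 at t = 0 (Table C1), in M

def receptor_rhs(t, x):
    """Right-hand sides of C1-C7 for the state x = [x1, ..., x7]."""
    x1, x2, x3, x4, x5, x6, x7 = x
    u = insulin_input(t)
    k5 = K["km5"] * X0[4]                 # basal synthesis, k5 = k-5 * x5(0)
    if x5 < X0[4]:
        k5 *= 6.0                         # accelerated synthesis, k5acc = 6 k5
    return [
        K["km1"]*x2 + K["km3"]*x4 - K["k1"]*u*x1 + K["km4"]*x5 - K["k4"]*x1,
        K["k1"]*u*x1 - K["km1"]*x2 - K["k3"]*x2,
        K["k2"]*u*x4 - K["km2"]*x3 + K["km4p"]*x6 - K["k4p"]*x3,
        K["k3"]*x2 + K["km2"]*x3 - K["k2"]*u*x4 - K["km3"]*x4
            + K["km4p"]*x7 - K["k4p"]*x4,
        k5 - K["km5"]*x5 + K["k6"]*(x6 + x7) + K["k4"]*x1 - K["km4"]*x5,
        K["k4p"]*x3 - K["km4p"]*x6 - K["k6"]*x6,
        K["k4p"]*x4 - K["km4p"]*x7 - K["k6"]*x7,
    ]

def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    s1 = f(t, x)
    s2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, s1)])
    s3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, s2)])
    s4 = f(t + h, [a + h * b for a, b in zip(x, s3)])
    return [a + h / 6 * (p + 2 * q + 2 * r + s)
            for a, p, q, r, s in zip(x, s1, s2, s3, s4)]

# Integrate the first 0.1 min of stimulation with a small fixed step.
h, x = 1e-4, list(X0)
for n in range(1000):
    x = rk4_step(receptor_rhs, n * h, x, h)
```

The step size is kept well below 1/k3 because the phosphorylation rate of 2500 min^-1 makes the system stiff for an explicit method; an adaptive or implicit solver would be the more practical choice for the full 18-equation model.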
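$$\frac{dx_8}{dt} = k_{-7}x_9 - \frac{k_7 x_8 (x_3 + x_4)}{IR_P} \tag{C8}$$
$$\frac{dx_9}{dt} = \frac{k_7 x_8 (x_3 + x_4)}{IR_P} + k_{-8}x_{11} - (k_{-7} + k_8 x_{10})\,x_9 \tag{C9}$$
$$\frac{dx_{10}}{dt} = k_{-8}x_{11} - k_8 x_9 x_{10} \tag{C10}$$
$$\frac{dx_{11}}{dt} = k_8 x_9 x_{10} - k_{-8}x_{11} \tag{C11}$$
$$\frac{dx_{12}}{dt} = k_9 x_{13} + k_{10}x_{14} - (k_{-9} + k_{-10})\,x_{12} \tag{C12}$$
$$\frac{dx_{13}}{dt} = k_{-9}x_{12} - k_9 x_{13} \tag{C13}$$
$$\frac{dx_{14}}{dt} = k_{-10}x_{12} - k_{10}x_{14} \tag{C14}$$
$$\frac{dx_{15}}{dt} = k_{-11}x_{16} - k_{11}x_{15} \tag{C15}$$
$$\frac{dx_{16}}{dt} = k_{11}x_{15} - k_{-11}x_{16} \tag{C16}$$
$$\frac{dx_{17}}{dt} = k_{-15}x_{18} - k_{15}x_{17} \tag{C17}$$
$$\frac{dx_{18}}{dt} = k_{15}x_{17} - k_{-15}x_{18} \tag{C18}$$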
The rate at which PI(4,5)P2 is converted to PI(3,4,5)P3, k9, is taken to be a linear function of active
PI3K, (x11), increasing from some basal value in the absence of insulin to k9st at maximal stimulation.
k-9 and k9basal are also defined in terms of k9st.
$$k_9 = (k_{9st} - k_{9basal})\,\frac{x_{11}}{PI3K_{max}} + k_{9basal} \tag{C19}$$
The rate of activation of Akt, k11, is taken to be a function of PI(3,4,5)P3, (x12), increasing from zero
to its maximal value as PI(3,4,5)P3 increases from its basal value, x12(0) to its maximal value PIP3max.
$$k_{11} = k_{11d}\,\frac{x_{12} - x_{12}(0)}{PIP3_{max} - x_{12}(0)} \tag{C20}$$
The rate at which GSK3 is inactivated, k15, increases from 0 to k15d = ln(2)/2 as a linear function of the
amount of activated Akt.
$$k_{15} = k_{15d}\,\frac{x_{16}}{Akt^{p}_{max}} \tag{C21}$$
where $Akt^{p}_{max}$ is the percentage of phosphorylated Akt following maximal insulin stimulation.
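The three state-dependent rates in equations C19 to C21 are simple linear interpolations and can be written as small helper functions. In the sketch below the function and constant names are illustrative, and the values of k9basal, PI3Kmax, PIP3max and Aktpmax are hypothetical placeholders, since they are not given in this appendix; only k9st, k11d, k15d and x12(0) are taken from the text and tables.

```python
import math

def k9_rate(x11, k9_st, k9_basal, pi3k_max):
    """Equation C19: linear in active PI3K (x11), from k9_basal up to k9_st."""
    return (k9_st - k9_basal) * x11 / pi3k_max + k9_basal

def k11_rate(x12, k11d, x12_basal, pip3_max):
    """Equation C20: zero at basal PI(3,4,5)P3, k11d when x12 equals pip3_max."""
    return k11d * (x12 - x12_basal) / (pip3_max - x12_basal)

def k15_rate(x16, k15d, akt_p_max):
    """Equation C21: linear in activated Akt (x16), k15d when x16 equals akt_p_max."""
    return k15d * x16 / akt_p_max

# Values stated in the text and tables.
K9_ST = 1.39                  # min^-1
K11D = math.log(2)            # min^-1
K15D = math.log(2) / 2        # min^-1
X12_BASAL = 0.31              # % of total lipid

# Hypothetical placeholders, NOT taken from the source.
K9_BASAL = 0.1
PI3K_MAX = 1e-13
PIP3_MAX = 3.0
AKT_P_MAX = 10.0

basal_k11 = k11_rate(X12_BASAL, K11D, X12_BASAL, PIP3_MAX)   # no Akt activation
maximal_k15 = k15_rate(AKT_P_MAX, K15D, AKT_P_MAX)           # full GSK3 inactivation rate
```

Passing the maxima as arguments rather than hard-coding them keeps the placeholders visible and makes it straightforward to substitute the values used in [4].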
C.5 Initial Conditions and Parameter Values
The initial conditions and parameter values for the model are listed in tables C1 and C2. Unless
otherwise indicated values are taken from [4].
Initial Conditions  Description                                  Value     Units
x1(0)               Unbound surface IR                           9×10^-13  M
x2(0)               Unphosphorylated once-bound surface IR       0         M
x3(0)               Phosphorylated twice-bound surface IR        0         M
x4(0)               Phosphorylated once-bound surface IR         0         M
x5(0)               Unphosphorylated unbound intracellular IR    1×10^-13  M
x6(0)               Phosphorylated twice-bound intracellular IR  0         M
x7(0)               Phosphorylated once-bound intracellular IR   0         M
x8(0)               Unphosphorylated IRS                         1×10^-12  M
x9(0)               Tyrosine-phosphorylated IRS                  0         M
x10(0)              Inactivated PI3K                             1×10^-13  M
x11(0)              Active IRS/PI3K complex                      0         M
x12(0)              PI(3,4,5)P3 in total lipid population        0.31      % of total lipid
x13(0)              PI(4,5)P2 in total lipid population          99.4      % of total lipid
x14(0)              PI(3,4)P2 in total lipid population          0.29      % of total lipid
x15(0)              Inactivated Akt                              100       % of total Akt
x16(0)              Activated Akt                                0         % of total Akt
x17(0)              Active GSK3                                  100†      % of total GSK3
x18(0)              Inactive GSK3                                0         % of total GSK3
† We assume that under basal conditions (no insulin) all GSK3 is active. This follows from the assumption in [4] that under basal
conditions no Akt is in the phosphorylated state.
Table C1: Initial conditions used in the insulin model. Abbreviations: IR=insulin receptor.
Parameter  Reaction                                              Value        Units
k1         Association rate of first insulin molecule to IR      6×10^7       M^-1 min^-1
k-1        Dissociation rate of first insulin molecule from IR   0.20         min^-1
k2         Association rate of second insulin molecule to IR     6×10^7       M^-1 min^-1
k-2        Dissociation rate of second insulin molecule from IR  20           min^-1
k3         Phosphorylation rate of surface IR                    2500         min^-1
k-3        Dephosphorylation rate of surface IR                  0.20         min^-1
k4         Endocytosis of free IR                                0.00033      min^-1
k-4        Exocytosis of free IR                                 0.003        min^-1
k4'        Endocytosis of bound IR                               2.1×10^-3    min^-1
k-4'       Exocytosis of bound IR                                2.1×10^-4    min^-1
k-5        IR degradation                                        1.67×10^-18  min^-1
k6         Dephosphorylation of intracellular IR                 0.461        min^-1
k7         Phosphorylation of IRS                                4.16         min^-1
k-7        Dephosphorylation of IRS                              1.396        min^-1
k8         Formation of IRS/PI3K complex                         0.706×10^12  min^-1
k-8        Separation of IRS/PI3K complex                        10           min^-1
k9st       Maximal conversion of PI(4,5)P2 to PI(3,4,5)P3        1.39         min^-1
k11d       Maximal phosphorylation of Akt                        ln(2)        min^-1
k-11       Dephosphorylation of Akt                              10 ln(2)     min^-1
k15d       Maximal phosphorylation of GSK3                       ln(2)/2‡     min^-1
k-15       Dephosphorylation of GSK3                             ln(2)/3*     min^-1
‡ The half-time, t1/2, for inhibition of GSK3 by insulin is approximately 2 minutes [5, 6] and for a first order rate constant k = ln(2)/t1/2.
* Maximal insulin stimulation produces a 60:40 ratio of inactive to active GSK3 [6], hence at equilibrium k-15 = k15/1.5 = ln(2)/3.
Table C2: Nominal parameter values for the insulin model.
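The arithmetic behind the two footnoted GSK3 rate constants can be checked directly. The working below is a sketch that assumes first-order kinetics and that k15 takes its maximal value k15d under maximal insulin stimulation, in line with equations C17, C18 and C21.

$$k_{15d} = \frac{\ln 2}{t_{1/2}} = \frac{\ln 2}{2} \approx 0.347\ \mathrm{min}^{-1}$$

$$k_{15d}\,x_{17} = k_{-15}\,x_{18} \;\Rightarrow\; \frac{x_{18}}{x_{17}} = \frac{k_{15d}}{k_{-15}} = \frac{60}{40} = 1.5 \;\Rightarrow\; k_{-15} = \frac{k_{15d}}{1.5} = \frac{\ln 2}{3} \approx 0.231\ \mathrm{min}^{-1}$$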
D. Principal components of the GSK3 time-course
Because the parameter sampling is different for the Sobol and Morris methods the set of model
outputs and hence the principal components (PC) may in theory differ. Figure D1 shows the first 3
principal components calculated via each method. While there are quantitative differences in the
values of the PC curves the two methods clearly capture the same types of qualitative variation in the
model outputs and sensitivity indices calculated via the two methods can be directly compared. In
addition, the proportions of the variance captured by each PC are consistent between the two sampling
methods (see table D1).
Figure D1: The first three principal components (PCs) of the GSK3 time-course simulated by the
insulin signalling model. Panels a, c, and e show the results based on the Sobol method. Panels b, d,
and f show the results based on the Morris method.
            Sobol Method  Morris Method
PC1         90.6%         89.1%
PC2         8.4%          9.9%
PC3         0.7%          0.6%
Cumulative  99.7%         99.6%
Table D1: The proportion of variance captured in each of the first three PCs based on the Sobol and
Morris methods.
References
1. Ramsay, J.O. and B.W. Silverman, Functional Data Analysis. 1997, New York: Springer.
2. Ramsay, J.O., et al., fda: Functional Data Analysis. 2008. R package.
3. R Development Core Team. R: A Language and Environment for Statistical Computing.
   2010; Available from: http://www.R-project.org.
4. Sedaghat, A.R., A. Sherman, and M.J. Quon, A mathematical model of metabolic insulin
   signalling pathways. American Journal of Physiology, Endocrinology and Metabolism, 2002.
   283(5): p. 1084-1101.
5. Hurel, S.J., et al., Insulin action in cultured human myoblasts: contribution of different
   signalling pathways to regulation of glycogen synthesis. Biochemistry Journal, 1996. 320(3):
   p. 871-877.
6. Cross, D.A., et al., Insulin activates protein kinase B, inhibits glycogen synthase kinase-3 and
   activates glycogen synthase by rapamycin-insensitive pathways in skeletal muscle and adipose
   tissue. FEBS Letters, 1997. 406(1-2): p. 211-215.