Mathematical models of cell cycle regulation

Hendrik Fuß
is a PhD student of the
Bioinformatics Research Group
at the University of Ulster. His
research project focuses on the
analysis and modelling of
protein kinase and phosphatase
systems involved in the cell
cycle.
Werner Dubitzky
holds a Chair in Bioinformatics
and is Head of the
Bioinformatics Research Group
at the University of Ulster. His
research interests include
bioinformatics, systems biology,
data and text mining, artificial
intelligence and grid
technology.
C. Stephen Downes
is Director of the Centre for
Molecular Biosciences at the
University of Ulster, and is
Head of the Cancer and Ageing
Research Group. His research
interests include cell cycle
checkpoint controls.
Mary Jo Kurth
is a Research Associate in the
Cancer and Ageing Research
Group at the University of
Ulster. Her research interests
include cell cycle checkpoint
control, proteomics and
systems biology.
Keywords: cell cycle,
computer simulation, systems
biology, protein–protein
interaction, regulatory
networks
Hendrik Fuß,
School of Biomedical Sciences,
University of Ulster,
Cromore Road,
Coleraine, BT52 1SA, UK
Tel.: +44 (0) 28 7032 3536
E-mail: [email protected]
Mathematical models of
cell cycle regulation
Hendrik Fuß, Werner Dubitzky, C. Stephen Downes and Mary Jo Kurth
Received in revised form: 7th April 2005
Abstract
The cell division cycle is a fundamental process of cell biology and a detailed understanding of
its function, regulation and other underlying mechanisms is critical to many applications in
biotechnology and medicine. Since a comprehensive analysis of the molecular mechanisms
involved is too complex to be performed intuitively, mathematical and computational
modelling techniques are essential. This paper is a review and analysis of recent approaches
attempting to model cell cycle regulation by means of protein–protein interaction networks.
INTRODUCTION
Proliferating cells perform a series of
coordinated actions collectively referred
to as the cell cycle. The complex network
of regulatory enzymes and cellular
components that controls these processes
enables cells to grow and divide, to
control or prevent growth when
appropriate, to carry out the different
stages of growth and division in the
correct order, and to respond to DNA
damage by arresting progression through
the cycle so as to allow time for repair to
occur before more DNA is replicated or
chromosomes are condensed.
Because the development of cancer is
associated with loss of control over this
regulatory system, the study of the
mechanisms and functions of the
eukaryotic cell cycle has gained increased
attention in the past decades.1 A detailed
understanding of the mechanisms
underlying tumour growth, DNA
damage repair, intercellular signalling and
other cell cycle related processes is
therefore of paramount importance to
diagnosis, treatment and prognosis of
cancer.
The underlying regulatory system is
often described as complex. In fact, three
different aspects contribute to the
complexity of the cell cycle biochemistry:
• a large number of enzymes and
proteins are involved in the cell cycle;
• a large number of interactions exist
between the proteins and enzymes;
• the proteins and enzymes interact in
different ways, performing
stimulatory, inhibitory or other
modulating functions.
Complexity makes the characterisation of
this system difficult. A computational
approach towards understanding cell cycle
regulation therefore has two major
objectives. The first objective is to
provide a qualitative and quantitative
explanatory model describing the inner
workings of cell cycle regulation.
Computer simulations can help linking
observations to hypotheses by
reproducing the expected behaviour from
a theoretical model (see Figure 1A). The
second objective is concerned with
predictive models capable of extrapolating
from a given state of the cell or its
components to subsequent states (see
Figure 1B). The prediction of anti-cancer
drug response, for example, has
applications in tumour therapy. Targeted
tumour therapy uses computational model
predictions to select the most effective
medication and to reduce the overall costs
of a therapy.2
Expensive experimental techniques
such as chip technologies are involved in
both medical applications and in cell cycle
research. Both fields can benefit from the
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
163
Fuß et al.
A
B
Experiments
Observations
behaviour in vivo
Hypotheses
on mechanisms
and constants
Knowledge
from literature, databases
or previous models
Knowledge
Mathematical
Model
Mathematical
Model
compare
Predictions
behaviour in silico
Predictions
behaviour in silico
Accept or reject
hypotheses
Hypotheses
on mechanisms
and constants
Experiment
suggestions
Treatment
suggestions
Figure 1: Workflows of two mathematical modelling strategies. (A) Explanatory modelling
using inductive reasoning. The experimental observations and the model’s predictions (possibly
those of several alternative models) are compared in order to confirm or reject the
hypotheses encoded in the mathematical model. The term ‘behaviour’ can refer to features of
physiological parameters such as enzyme concentrations or macroscopic events, for example,
the growth of a cell population. (B) Predictive modelling using deductive reasoning. A
computational implementation of the mathematical model may be used to conduct artificial (in
silico) experiments that can suggest novel hypotheses or support decisions on therapeutic
strategies. Observations from the real system may be used to validate the new hypotheses or
to provide parameters or initial conditions to the mathematical model
Oridinary differential
equations as
mathematical
representation of
cellular processes
Gene-regulatory
networks v. proteinprotein interaction
networks
164
predictive capabilities of a mathematical
model.
In order to simulate the molecular
processes inside a cell, a mathematical
model captures the processes and its
components by representing the system
state x(t) at time t and by computing its
subsequent state x(t + 1) based on the
state x(t). The variable x denotes the
concentrations of all molecular species
captured by the model. In chemical
kinetic theory3 the interactions between
species are expressed using ordinary
differential equations (ODEs).
ODEs are equations in which the
unknown element is a function, rather
than a number, and in which the
known information relates that function
to its ordinary (as opposed to partial)
derivatives. Many phenomena in
physics, chemistry and biology with a
temporal dimension involve relationships
between the values of variables at a
given point in time and the changes in
these values over time. Generally, an
ODE takes the form:
:::
F[t, x(t), x_(t), x€(t), x(t), . . .] ¼ 0, for all t
where t is a scalar variable (normally
interpreted as time), F is a known
function, x is an unknown function, x_(t)
is the derivative of x with respect to t, x€(t)
is the second derivative of x with respect
to t, and so on.
The following simple example, adapted
from supplementary material in
Pomerening et al.,4 illustrates a system of
two mutually activated enzymes A and B
represented by two ODEs.
x_ ¼ k1 x þ (a x)G( y)
y_ ¼ k2 y þ (b y)F(x)
where a and b represent the total
concentrations of the two proteins and x
and y represent the concentrations of the
active form of the proteins. The terms
F(x) and G(x) describe the mutual
interactions between the two proteins.
In the majority of cell cycle models
these mathematical principles are applied
to one of two biochemical domains: gene
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
Mathematical models of cell cycle regulation
The cell cycle consists of
four morphological
phases
regulatory networks and protein–protein
interaction networks. In gene regulatory
networks interactions are modelled purely
on a genetic level. Interactions between
genes and proteins are implicit in these
models. Cell cycle regulation depends
heavily on interactions of proteins and
enzymes and their modification through
phosphorylation and dephosphorylation
by kinases and phosphatases, respectively.
These interactions and enzymatic
reactions are represented in protein–
protein interaction or regulatory
networks. In contrast to gene regulatory
networks such models explicitly represent
known or hypothetical molecular
interactions on the protein level.
Macroscopic effects, such as the
migration of a cell within a tissue or the
formation of buds in yeast cells, can be an
explicit or implicit part of such a model.
Patterns and shapes of higher-level
structures sometimes arise purely from
reaction dynamics. In many cases it is not
clear how these macroscopic cellular
processes connect to the cell’s physiology.
Theoretical models that reproduce such
effects can therefore make an important
contribution to our understanding of the
cell cycle.
For the successful application of
mathematical modelling to cell cycle
problems one has to answer a number of
questions: What is the system in question?
What is the environment? How can we
model it and what information do we use
as input? What do we expect to observe
and what insights do we expect from the
application of the model? What methods
should we use for modelling, analysis and
validation? How can we most effectively
obtain experimental evidence to support
new hypotheses?
Generally, the process of modelling the
cell cycle and its regulatory mechanisms
involves three distinct phases:
• Model design phase. This phase is
concerned with the definition and
formulation of a mathematical model
of the biological process, system or
problem under investigation.
• Model application/analysis phase. Here
the computational implementation of
the model is used to simulate and
study the dynamic behaviour of the
model under different conditions in
relation to the studied biological
system or process.
• Model validation phase. Here the
behaviour and data generated by the
computational simulation of the
model are either compared against
data obtained from analogous
experiments on real biological systems
or assessed epistemologically on the
basis of existing knowledge.
Before we revisit each of the modelling
phases outlined above in detail, we first
review the relevant background of
biology.
THE CELL CYCLE –
BIOLOGICAL
BACKGROUND
The eukaryotic cell cycle is divided into
four main phases, the synthesis (S) phase,
the mitosis (M) phase and two growth
phases in between (G1 and G2, see Figure
2). A newborn cell resides either in G0,
the quiescent non-dividing state, or in G1
until physiological parameters allow it to
enter the S phase and to start replicating
its genetic material. An exact copy of each
DNA strand is created, and by the end of
the subsequent G2 phase the
chromosomes are aligned on the mitotic
spindle. In the M phase the paired
chromatids are separated and transported
to the two opposite spindle ends, ready to
form two new nuclei with identical
genetic information. The cell cycle ends
with cytokinesis, the physical division into
two new cells. The period of this circular
process varies between species and tissues
from several minutes to several days.5 It
should be noted that the distinction of
these four phases is based on
morphological observations and does not
relate to the underlying biochemical
mechanisms. Therefore not all of these
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
165
Fuß et al.
Figure 2: Schematic representation of cell
cycle phases and checkpoints. Proliferating
cells proceed through a cycle of DNA
synthesis (S phase) and mitosis (M phase)
separated by two growth phases (G1 and
G2). Quiescent cells remain in G0 state until
conditions are favourable for cell division
again. Checkpoints (represented by halt
signs) prevent progression in response to
delays or defects
Cell cycle checkpoints
respond to
perturbations
phases are sharply defined on a molecular
basis.
Surveillance systems called checkpoints
prevent progression of the cell cycle in
response to perturbations, such as DNA
damage, nucleotide depletion or other
defects (see Figure 2).6 Several classes of
protein kinases and other enzymes perform
essential functions in the implementation
of the cell cycle and the checkpoint
system. Protein kinases alter the activity of
their enzyme substrates by attaching a
phosphate group. In 2001, the Nobel
Prize in medicine was awarded for the
discovery of how cyclin-dependent kinases
(CDKs), an essential class of cell cycle
enzyme, control progression through the
cell cycle. Different CDKs are activated in
characteristic patterns in the distinct
phases of the mammalian cell cycle (see
Figure 3). Their activity is dependent on
the availability of specific cyclin subunits,
the phosphorylation state of the CDK
itself and the presence of CDK inhibitors.
These interactions form the biochemical
manifestation of the cell cycle phases and
checkpoints.
The term ‘checkpoint’ suggests that
there exist certain entities in a cell whose
purpose is to check whether a condition
(eg complete DNA replication) is fulfilled
and to emit signals to other active entities
within the cell cycle. While this notion is
part of the current consensus model of the
cell cycle, such external monitoring
entities have been subject to critique.7
The signal-emitting entity may well be an
inherent part of the metabolism, such as
the nucleotides used to replicate the DNA
or the cytoskeleton that separates the
chromatids rather than an external
‘checking’ instance. In this paper we will
use the word checkpoint to describe any
kind of regulatory mechanism that will
halt the cell cycle in response to certain
Figure 3: Each cell cycle phase (bottom) is characterised by the activity of cyclin-dependent
kinases (CDKs), which are activated by association with an appropriate cyclin subunit. While
cyclins undergo specific patterns of synthesis and degradation, CDKs are present throughout
the cell cycle (adapted from McBride8 )
166
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
Mathematical models of cell cycle regulation
intracellular or extracellular conditions,
whether it is evolved or inherent to the
system.
MODEL DESIGN
Mathematical
simulations require
more detail than
contained in most
pathway charts and
databases
Deriving a formal abstraction (ie a
mathematical description) of the biological
system or process under investigation is a
crucial step in modelling. This step
involves decisions on what precisely is
understood by ‘the system’ and requires
the definition of system boundaries and the
components of the system. Furthermore, it
is necessary to determine what types of
interaction with the environment should
be included and which simplifying
assumptions should be made.
Detailed and comprehensive
information about interactions between
cell cycle-related proteins is available in
the form of molecular interaction maps,9
pathway charts10 and public databases.11
However, mathematical models of
biochemical and cellular processes require
a precise and unambiguous formulation of
the underlying reactions and their
quantitative parameters in order to predict
the real system’s behaviour. Resources
such as the above-mentioned molecular
interaction maps leave a large amount of
uncertainty, as there are numerous ways
to mathematically model interactions
between two partners, especially when
enzymes, inhibitors or other modulators
are involved. This requirement forces us
to systematically generate hypotheses and
gather all the available information in
order to mathematically describe the
problem.12
Based on the mathematical
formulation of the problem, the
underlying quantitative information is
finally encoded as an executable
computational model, which can
simulate the behaviour of the studied
biological system or process by numerical
integration. Thus, the computational
model can be used to test hypotheses and
to compare modelled system responses
and behaviour to experimental
observations (see Figure 1A).
Scope and complexity
Cell cycle modules
Mathematical modelling has been applied
to numerous problems related to the cell
cycle. Some modelling approaches focus
on the basic function of the cell cycle
‘engine’ that is responsible for the
different cell cycle phases and transitions
between them. Modelling of protein
interactions in well-studied species such as
budding yeast has led to highly developed
and complex models that reproduce and
explain a major part of the physiological
phenomena that are currently known to
occur within the cell cycle.13
However, the complexity of the
underlying biochemistry often makes it
necessary to subdivide the problem into
several parts, which can be studied and
optimised individually.14 Complex
mathematical models can be composed of
smaller, well-studied modules. A
systematic study of elementary proteinprotein interaction modules with
applications to cell cycle physiology was
presented by Tyson et al.15 Biochemical
switches, signal amplifiers, chemical
oscillators and other modules resemble the
components of an electronic circuit. They
can provide a high-level structure to
networks as well as inspire hypotheses on
novel pathways to achieve certain signalresponse properties.
Since many comprehensive and wellcharacterised cell cycle models are
available, it is tempting to use these as a
basis for larger models. Such an approach
could be useful for determining the roles
of novel regulatory proteins in the cell
cycle, but also to extend the scope from
molecular to cellular or multicellular level
or to add a spatial component.
Cell cycle phase transitions
Modelling specific phase transitions often
allows for a more detailed study of the
features and roles of the key enzymes
involved. Qu et al.16 discuss the
importance of multisite phosphorylation
for bistability and oscillation within the
G1/S transition. Qualitatively different
dynamics arise from the number of
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
167
Fuß et al.
Multi-cellular systems
are simulated through
agent-based models
phosphorylation sites on kinases and
transcription factors. A larger number of
sites increases the activation threshold and
leads to ‘steeper’ response characteristics.
Their model suggests that at least two
phosphorylation sites are required for the
regulation of this phase transition.
A recent study17 addressed the
characterisation of the restriction point, an
early cell cycle checkpoint first proposed
by Pardee.18 Once the cell has passed this
‘point of no return’ it does not respond to
certain cell cycle halting agents or
deprivation of growth factors. The
mathematical model quantitatively
reproduces the effects of such
interventions. The transient deprivation
of growth factors may lead to an increased
cell cycle length depending on the current
position in the cycle. Restriction point
control is believed to be an important
oncological factor, although there are
controversial views on the interpretation
of Pardee’s experiments (see for example
Cooper19 ). In the light of this controversy
mathematical models of these phenomena
might serve as a basis for a profound
discussion of the biological mechanisms
underlying the observations, to eventually
yield a theory that withstands critique.
Beyond the cell cycle
Developmental biology is the study of
processes that give rise to tissues, organs
and organisms in specific shapes and
patterns. These processes, too, are
controlled by biochemical networks,
which are closely related to the cell cycle.
Fertilised eggs of Xenopus laevis, for
example, undergo a series of 12 rapid,
synchronous cell divisions lacking growth
phases in between. The embryo then
enters the mid-blastula transition, after
which its cells grow asynchronously and
regulated by the full, somatic cell cycle
system. In the model presented by
Ciliberto et al.20 these features all emerge
from a set of differential equations with
appropriate parameters. The model
thereby suggests two new biochemical
mechanisms, introducing a hypothetical
kinase required for oscillatory behaviour
168
and a mechanism by which cyclins are
removed or inactivated.
Models of social behaviour and
intercellular messaging take whole
populations of cells into account, each
one proceeding through its own cell
cycle, influenced by extracellular
parameters such as growth factors and
bonding with neighbouring cells. It seems
reasonable to assume that models at this
level do not necessarily require molecular
details to provide interesting results.
Walker et al.21 developed a simple rulebased computational model that is
sufficient to reproduce cell migration
phenomena under varying conditions.
The model does not include any
molecular interactions, but instead keeps
track of the cell cycle phase and the cell
size for each individual cell. Decisions are
based on these two parameters and on the
position of neighbouring cells. Such
models offer interesting applications with
regard to wound healing, tumour growth
or apoptosis.22
An illustrative simulation that integrates
several of these levels is an agent-based
system of cells simulating the growth
dynamics of brain tumours by Athale et
al.23 Each cell decides whether it should
proliferate or migrate based on simulated
molecular parameters. The cells can
migrate within an extracellular matrix
featuring a glucose gradient. The uptake
of glucose then feeds back into the cell’s
physiology. Using this approach, the
expansion of tumours can be studied
using a computational approach on both
molecular and macroscopic levels.
Representing biochemical
networks
There are many different possible
mathematical representations that describe
the same biochemical reactions in a cell.
Some authors prefer to model protein
interactions and enzymatic reactions using
only basic mass-action kinetics,16 which
leads to a high number of reactions (or
variables in the model), while maintaining
relatively simple individual equations.
Using more abstract, higher-level kinetics,
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
Mathematical models of cell cycle regulation
XML-based modelling
languages and model
repositories
Explicit v. heuristic
diagrams
such as Michaelis–Menten equations, Hill
functions17 or threshold functions,20 it is
possible to reduce the number of
variables, but at the same time,
complexity rises owing to the various
types of reaction representations involved.
This approach leads to fewer, but
mathematically more complex, equations.
Few mathematical models work with
actual physiological concentrations,23,24
since dimensionless terms greatly simplify
the resulting equations by eliminating
constants. Such a model’s variables
represent fractions rather than absolute
concentrations of proteins. Dimensionless
terms are therefore useful, when
quantitative predictions are not required.
Graphical representations
Graphical notations of biochemical
networks are useful for two purposes:
they are a compact form for displaying
complex interaction networks and also
allow scientists without a mathematical
background to participate in the
modelling process. However, many
common forms of biochemical diagrams
cannot be directly transformed into a
mathematical model. Such diagrams do
not contain the mechanistic details that
are needed for simulating. Interactions
such as the stimulatory influence denoted
by the plus arrows in Figure 4A do not
have a single, defined mathematical
representation. The modeller must decide
whether a Michaelis–Menten, basic massaction or other kinetic law is appropriate
for each interaction.
Kohn25 therefore distinguishes two
types of diagram: explicit diagrams (as in
Figure 4C), which contain all information
required for simulation, and heuristic
diagrams (as in Figure 4B), which are not
sufficient, but possibly helpful for model
creation. While heuristic diagrams
provide a compact notation for depicting
large networks of biochemical processes,
explicit diagrams provide a graphical
representation of a mathematical model.
This means that an explicit diagram can
be unambiguously converted into a set of
ODEs, which, given appropriate
parameters, is ready for numerical
integration.
Computer document formats
To permit computational simulation it is
necessary to transfer mathematical models
to computational representations, ie
programming languages, data structures
and algorithms that efficiently manipulate
those structures. Software for model
building and simulation, as well as
libraries and services, must agree on a
common format or language to exchange
the computational representations of
biological models. In the context of cell
cycle modelling, two XML-based
document formats for this purpose exist:
the Systems Biology Markup Language
(SMBL, see Figure 4E)27 and CellML.28
They both allow quantitative specification
of reaction kinetics in multiple
compartments.
Repositories containing cell cycle
models in SBML and CellML format
exist, but neither of these repositories nor
the respective formats seem to have
attracted great attention in the relevant
modelling literature.
MODEL ANALYSIS
A central objective of mathematical cell
cycle models is to facilitate understanding
of cell cycle regulation. However,
currently there does not seem to be a
consensus on what ‘understanding’
precisely means. Merely reproducing the
known biochemistry and the observed
behaviour in a computer obviously does
not further our knowledge about the cell
cycle. Various analysis methods do
actually provide insights into real
biological systems that cannot be gained
otherwise.
One important aspect of understanding
is the derivation of principles and
architectural or topological features.
Scientists seek to understand how a
system is constructed to be able to carry
out its work and why it has been
constructed that way. Owing to recent
advances in graph theory, the notion of
scale-free networks 29 and the identification
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
169
Fuß et al.
Figure 4: Examples of
various representations
of the same core cyclin/
CDK subsystem
(adapted from Qu et
al.26 ). (A) A typical
reaction scheme
diagram. The double
arrows denote the
possible reactions
between species, which
in this diagram consist of
phosphorylations,
dephosphorylations and
associations. Plus arrows
indicate the stimulatory
influence of an enzyme
on another reaction. (B)
and (C) Molecular
interaction maps, a very
compact display
formalism that allows
depiction of large
interaction networks.9
The explicit diagram (C)
contains nodes for
enzyme substrate
complexes to represent
enzymatic activity, such
as phosphorylation and
dephosphorylation by
Cdc25 and Wee1. The
heuristic diagram (B)
uses designated symbols
(circle ended and Tshaped line) to represent
these features. (D) Two
equations from a system
of ordinary differential
equations (ODE)
describing the change in
concentration of each
individual element over
time. (E) Excerpt from
an SBML model of this
system
170
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
Mathematical models of cell cycle regulation
Emergent properties
correspond to biological
features
Bifurcation analysis
of network hubs have triggered a number of
studies on topological features of
biochemical networks. Reoccurring
modular substructures have been
identified in various types of biological
networks.30 These modules reveal
evolutionary principles in the architecture
of regulatory networks, similar to motifs
in protein sequences.
So-called emergent properties provide
insights gained from a systems-level view
and analysis of a biological process or
entity. These properties arise from the
qualitative and quantitative structures of
the system and its components and their
interactions, similar to how pressure
emerges as a global property from large
numbers of interacting gas molecules.
Even simple biochemical networks
featuring a small number of interactions
and feedback loops can exhibit
multistability,31 oscillatory behaviour and can
undergo different kinds of bifurcations (see
below). These emergent properties
directly correspond to biological features
such as cell cycle phase transitions and
checkpoint responses.
Furthermore, mathematical models and
their computational implementations
provide a framework for in silico or
‘artificial’ experiments. Experimentalists
can execute and explore such models to
decide whether an experiment is
promising or not and thus reduce
experimental time and costs. Cross et al.
performed a series of experiments based
on the prediction of a mathematical yeast
cell cycle model.32 Since these studies
often involved laborious genetic
manipulations, computational modelling
allowed them to specifically select the
most insightful experiments.
Parameter space and
behaviour space
Parameterisation and initialisation of a
mathematical model is a prerequisite for
model execution. Once all kinetic
parameters and initial conditions have
been specified, time-series data can be
obtained from the simulation by
numerical integration. Hakenberg et al.
demonstrated how text mining can be
used to determine kinetic model
parameters from large volumes of
scientific articles.33 However, often
investigators are not interested in studying
systems based on particular parameters but
in how the system responds to
perturbations of these parameters.
Perturbations lead to qualitatively
different results that correspond to
different behaviours or phenotypes of the
biological system. In their comprehensive
yeast cell cycle model, Chen et al.
analysed simulation data for over 100
parameter sets.13 Each set corresponds to a
certain genotype (eg deletion or overexpression mutants) and leads to an
observed in silico behaviour (eg arrest in a
particular cell cycle phase, production of
abnormally large cells or wild-type
behaviour), that can be compared to
experimental observations.
The study of qualitative changes in
dynamical behaviour of a system in
response to parameter variation is called
bifurcation analysis34 and several aspects
of this theory have interesting
implications for cell cycle modelling.
Two of the two main implications,
multistability and sensitivity analysis, are
discussed below.
Multistability
Multistability refers to systems that have
more than one stable state within a range
of parameters. Such systems can alternate
between two or more mutually exclusive
states over time. It is widely believed that
multistability is a critical property in the
control of the cell cycle. Bistability of the
G1/S transition has been demonstrated
experimentally in yeast32 and mammalian
cells.4 It involves a hysteretic switch that
prevents the system from returning to G1
state once it has entered S phase.
The point at which the system
undergoes a transition between two stable
states usually depends on certain kinetic
parameters. Novák and Tyson argued that
this sensitivity to parameter variation plays
a major role in the mechanisms
underlying the cell cycle checkpoints.35
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
171
Fuß et al.
Robustness as a
measure of plausibility
Whenever conditions are unfavourable
for progression into the next cell cycle
phase, an allosteric modification
(modification of a regulatory enzyme
activity’s by the non-covalent binding of a
particular metabolite at a site (the
allosteric site) other than the active site) of
an enzyme may change its kinetic
behaviour, shifting its activation threshold
to a virtually unreachable level.
Other mechanisms have been found to
play a role in cell cycle transitions. For
example, insoluble aggregates of cyclins
can provide a kind of buffer for soluble
cyclin in the cytoplasm. Since solvatation
of cyclin is faster than cyclin synthesis, this
allows for a rapid and sensitive response in
the activation of the G1/M transition.
The mathematical model proposed by
Slepchenko and Terasaki simulates this
phase transition.36 With cyclin
aggregation the model displays bistable
behaviour over a wider range of
parameter values. In conclusion this
mechanism increases the robustness of the
system.
Sensitivity analysis
Characterisation of a mathematical model
often involves parameter sensitivity
analysis, which is used to determine the
roles and importance of individual
parameters. A kinetic parameter may, for
example, directly affect major factors such
as cell cycle period or mass at division,
while others may raise or lower the
threshold concentration level of a
particular protein that is needed to
activate an enzyme complex. The
methods used for sensitivity analysis range
from single parameter variation13 to
complete randomisation tests.37 Clearly,
changing only single parameters cannot
provide a comprehensive overview over
the robustness of the system. Cancerous
cells normally have multiple defects that,
taken together, cause the regulatory
system to malfunction.1 Randomisation
tests by von Dassow et al., on the other
hand, have revealed a remarkable level of
robustness in a developmental module in
Drosophila.37 Out of their simulations with
172
random parameters, 80 per cent
reproduce the normal, healthy behaviour.
This finding can also be taken as a
measure of plausibility for the model,
since in any biological system slight
perturbations should not disrupt essential
cellular functions. This is particularly
important to systems exhibiting features
that are well conserved over a wide range
of species. While enzymes have changed
in the course of evolution and exhibit
altered kinetic constants, the basic
function of the system remains intact.
Robustness can be used as a criterion to
discriminate between several hypothetical
models for a biological system. A
quantitative analysis of robustness,38
comparing two published Xenopus cell
cycle models, clearly demonstrates both
the predictive and explanatory advantage
of the more recent model, which differs
from the earlier one only by the addition
of two feedback regulation loops. Some
of its basic properties, like the cell division
frequency, are more robust to parameter
variation, which makes the latter model
more plausible. Furthermore, the earlier
model required some kinetic parameters
to assume physiologically unacceptable
values. The latter model, which allows for
a broader range of valid parameter values,
eliminates this problem.
MODEL VALIDATION
In cell cycle simulations, experimental
data are mainly used for three tasks:
• reverse-engineering of networks to
construct a computer model from
experimental data;
• estimating parameters of an existing
mathematical model; or
• testing and discriminating between
several alternative models.
Pure data-driven approaches are rare in
protein–protein interaction modelling.
Reverse-engineering networks from
measured data using machine-learning,
evolutionary or artificial intelligence
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
Mathematical models of cell cycle regulation
Hypothesis-driven
approaches
methods are common in gene regulation
studies, but have not been applied to
protein-interaction networks. The main
reason for this is that regulatory networks
involve non-linear dynamics, which
makes the reverse-engineering problem
extremely complex. It is questionable
whether a reverse-engineered regulatory
network that was fitted to a set of training
data has any explanatory capabilities, even
when it has demonstrated predictive
qualities. Another reason for the lack of
data-driven approaches may be that
available experimental data are too sparse
to construct a network without prior
knowledge. While large-scale time-series
data can be obtained today using
microarray techniques, protein chip
technology is still in its infancy.
Most cell cycle simulations employ a
hypothesis-driven approach.
Experimental observations are then used
as evidence to decide whether the
hypotheses are plausible (see Figure 1A).
It should also be noted that there are
probably more insights to be gained from
the finding that a model does not comply
with a set of data. Ciliberto et al. found
that it was necessary to add a hypothetical
kinase to their model in order to
reproduce the experimentally observed
behaviour.20 The resulting model thereby
suggests a novel pathway, involving a yet
unknown protein, which remains to be
discovered.
Experimental methods
Experimental data can be obtained on
various levels of biological organisation.
In cell cycle scenarios this involves data
from the following groups of dynamic
and static data. Dynamic data describe
the temporal evolution of a system or its
components:
• Concentration time-series data obtained
using proteomic methods. This most
direct and most common form of
validation involves the detection of
relevant proteins with specific
antibodies or by using tagged or
fluorescent labelled proteins. These
techniques are often used in
conjunction with polyacrylamide gel
electrophoresis (PAGE) to separate the
proteins in a sample. The obtained
concentration data directly
corresponds to the model variables.20
When combined with genetic
engineering techniques, these methods
allow for sophisticated experiments,
which can give evidence for complex
phenomena such as bistability, as
demonstrated by Cross et al.32
However, the requirement of a
specific detection method makes these
procedures difficult to scale.
• Morphological and phenotypic observations.
Much of what we know about the cell
cycle was inferred from the behaviour
of genetically altered strains. These
genetic modifications are easy to
reproduce in the computer and
phenotypic features such as relative
cell cycle times, cell size at division,
resistance to perturbations such as
DNA damage etc can be compared
with experiments.13
• Observations at population or
developmental level. Automated
cytometry methods such as flowcytometry measure the distribution of
a feature (cell size, DNA or protein
concentration) across a population.
Cross et al. have employed this
technique to measure the change of
protein concentrations during cell
growth, which roughly correlates with
progression through the cell cycle.32
Microscopic images, too, have been
used as experimental evidence for cell
cycle simulations to show that the
model correctly reproduces cell
migration or growth patterns.22,39
Static data describe time-independent
features of a system or its components:
• DNA and protein sequence data. Large
amounts of data are available and
accessible through public databases.
For example, molecular dynamics
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
173
Fuß et al.
Optimisation of
experimental strategies
simulations with existing protein
structural data have been used in other
mathematical modelling fields to
estimate rate constants for kinetic
equations.40 Exploiting this type of
data for use in cell cycle models is an
open challenge.
• Protein interaction studies. Such
experiments are mostly based on yeast
two-hybrid systems or phage display
libraries and contain qualitative
information about cell physiology.
While these types of experiments
cannot reveal the function of protein
interactions or even determine
quantitative kinetic parameters, they
show which proteins can associate due
to their biochemical nature. The
integration of such large-scale
interaction data in actual cell cycle
models has so far not been exploited,
but numerous studies on topological
system features rely on the availability
of public interaction databases.30,41,42
It is often noted that experimental data
for regulatory networks are difficult to
obtain.12 It is true that most variables in a
mathematical cell cycle model are not
easily observable experimentally. On the
other hand, indirect evidence for a model
can be found on various levels. A key
challenge for future research in cell cycle
modelling lies in the correlation of these
indirect measurements to the model.
Parameter estimation
The danger of
over-fitting
174
It is often argued that as a mathematical
model becomes more complex and more
hypothetical model parameters are
included in the model, the more the
model’s ability to adjust to any arbitrary
data set increases, and thus, the
information content of the model
decreases.12 Using the information
encoded in experimental data to estimate
model parameters can limit the model’s
flexibility and thus makes it more likely to
reflect the biological processes accurately.
Common maximum likelihood methods,
however, require a large number of data
points to avoid over-fitting, but
measurements are often time-consuming
and expensive and may even disturb the
system itself. In order to use laboratory
resources most efficiently, experimental
strategies need to be optimised, so that
they can provide most useful information
to the mathematical methods employed.
Kutalik et al. have presented a
mathematical method for optimising
sampling times and the number of
replicates required when collecting data
for parameter estimation.43 This
optimisation method can either achieve a
given maximal tolerance level for the
estimates or it can minimise the error for a
given number of measurements. They
illustrate their method using a simple
mathematical model, which demonstrates
that careful selection of sampling times
can significantly increase estimation
accuracy.
Parameter estimation is currently not
widely used for cell cycle modelling.
Models aiming at high predictive accuracy
would certainly need this kind of datadriven support, but it may not be crucial
to many cell cycle simulations that seek to
qualitatively explain the observed
behaviour.
CONCLUSIONS
A number of models for diverse cell cycle
regulation problems have been proposed
by the research community. They
simulate physiological or developmental
effects in a wide range of species. Most of
the models considered in this paper
pursue the all-important goal in the postgenomic era: characterisation of the roles
and functions of genes and proteins. Put
in the context of a particular system, the
knowledge of these individual functions is
expected to improve our understanding of
the cell cycle as a whole and its
constituent mechanisms.
Many of the molecular processes that
govern the basic cell cycle ‘engine’ are
known today, but this knowledge does
not explain the highly developed
organisation of the cell cycle. Events such
as nuclear envelope breakdown,
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
Mathematical models of cell cycle regulation
Need for
standardisation of
graphical and
computational model
representations
chromosome condensation and alignment
or cytokinesis have so far not been
described mathematically. Also, while
molecular targets involved in checkpoint
responses have been identified, the
concept of checkpoints in the cell cycle is
not clearly defined and may need reexamination. It is currently not clear how
checkpoints detect and respond to the
various perturbations.44 For efficient use
of the available knowledge, it is vital that
model developers can both publish their
work in a usable way and access existing
models to apply analysis methods and
experimental data or to build new models.
A consensus on a graphical formalism
for protein interaction models is yet to
emerge. Some promising approaches are
underway.45 While differential equations
allow detailed mathematical analyses and
graphical displays, they also obscure
certain features of a system, such as flow
relationships between elements.46 The
community should therefore agree on the
use of standardised model formalisms as
well as computer document formats that
facilitate exchanging models between
researchers and between software
applications. Curated model repositories
with standardised interfaces might
comprise platforms as powerful as
SwissProt for protein sequences.
Existing software tools, both
commercial and open, support the
systems biologist in some of the steps
involved. The use and integration of
experimental data for the purpose of
parameter estimation or model
discrimination are still open issues.
SBML27 and CellML28 were not intended
for storage of experimental data or
resulting analyses. Emerging integration
infrastructures like the Systems Biology
Workbench47 are a vital contribution to
efficient model development, because
software solutions for common tasks such
as numerical simulation and graphical
visualisation can be reused to build upon.
Still, the learning curve is very steep and
building computer simulations typically
requires advanced programming skills.
This hinders scientists from various
disciplines from ‘deep’ participation in the
modelling process.
Current cell cycle models make little
use of high-throughput methods such as
mass spectrometry, hybridisation-based
assays and 2D gel electrophoresis. While
this is probably because of the difficulties
associated with data-driven approaches in
this field, overcoming these hurdles might
dramatically change the ‘lack of data’
problem. Methods for obtaining
dynamical data for mathematical models
should be both specific to the system’s
components and scalable to larger systems
with more parameters.
References
1.
Hanahan, D. and Weinberg, R. (2000), ‘The
hallmarks of cancer’, Cell, Vol. 100(1), pp.
57–70.
2.
Robert, J., Vekris, A., Pourquier, P. and
Bonnet, J. (2004), ‘Predicting drug response
based on gene expression’, Crit. Rev. Oncology/
Hematology, Vol. 51, pp. 205–227.
3.
Tyson, J. J., Novák, B., Odell, G. M. et al.
(1996), ‘Chemical kinetic theory:
understanding cell-cycle regulation’, Trends
Biochem. Sci., Vol. 21(3), pp. 89–96.
4.
Pomerening, J. R., Sontag, E. D. and Ferrell
J. E. (2003), ‘Building a cell cycle oscillator:
Hysteresis and bistability in the activation of
Cdc2’, Nature Cell. Biol., Vol. 5(4), pp.
346–351.
5.
Murray, A. and Hunt, T. (1993), ‘The Cell
Cycle: An Introduction’, Oxford University
Press, New York, Oxford.
6.
Hartwell, L. H. and Weinert, T. A. (1989),
‘Checkpoints: Controls that ensure the order
of cell cycle events’, Science, Vol. 246(4930),
pp. 629–634.
7.
Cooper, S. (2000), ‘The continuum model and
G1-control of the mammalian cell cycle’, in
Meijer, L. et al., Eds, ‘Progress in Cell Cycle
Research. Vol. 4’, Kluwer Academic/Plenum
Publishers, New York, Chapter 3, pp. 27–39.
8.
McBride, C. M. (2001), ‘Characterisation of
Human Cell Cycle Control Systems’, PhD
Thesis, University of Ulster.
9.
Kohn, K. W. (1999), ‘Molecular interaction
map of the mammalian cell cycle control and
DNA repair systems’, Mol. Biol. Cell, Vol.
10(8), pp. 1065–1075.
10. Biocarta Pathway Charts (URL: http://
www.biocarta.com/genes/[cited 5th April,
2005]).
11. Kanehisa, M. (1996), ‘Toward pathway
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
175
Fuß et al.
engineering: a new database of genetic and
molecular pathways’, Sci. Technol. Japan, Vol.
59, pp. 34–38 (URL: http://
www.genome.jp/kegg/).
12. Ingolia, N. T. and Murray, A. W. (2004), ‘The
ups and downs of modelling the cell cycle’,
Current Biol., Vol. 14, pp. R771–R777.
13. Chen, K. C., Calzone, L. and Csikasz-Nagy,
A. et al. (2004), ‘Integrative analysis of cell
cycle control in budding yeast’, Mol. Biol. Cell,
Vol. 15(8), pp. 3841–3862.
14. Hartwell, L. H., Hopfield, J. J., Leibler, S. and
Murray, A. W. (1999), ‘From molecular to
modular cell biology’, Nature, Vol. 402(6761
Suppl), pp. C47–52.
15. Tyson, J. J., Chen, K. C. and Novák, B.
(2003), ‘Sniffers, buzzers, toggles and blinkers:
Dynamics of regulatory and signaling pathways
in the cell’, Curr. Opin. Cell Biol., Vol. 15(2),
pp. 221–231.
16. Qu, Z., Weiss, J. N. and MacLellan, W. R.
(2003), ‘Regulation of the mammalian cell
cycle: A model of the G1-to-S transition’,
Amer. J. Physiol. Cell Physiol., Vol. 284,
pp. C349–C364.
17. Novák, B. and Tyson, J. J. (2004), ‘A model
for restriction point control of the mammalian
cell cycle’, J. Theor. Biol., Vol. 230(4),
pp. 563–579.
18. Pardee, A. B. (1974), ‘A restriction point for
control of normal animal cell proliferation’,
Proc. Natl Acad. Sci. USA, Vol. 71(4),
pp. 1286–1290.
19. Cooper, S. (2003), ‘Reappraisal of serum
starvation, the restriction point, G0, and G1
phase arrest points’, FASEB J., Vol. 17,
pp. 333–340.
20. Ciliberto, A., Petrus, M. J., Tyson, J. J. and
Sible, J. C. (2003), ‘A kinetic model of the
cyclin E/Cdk2 developmental timer in
Xenopus laevis embryos’, Biophys. Chem., Vol.
104(3), pp. 573–589.
21. Walker, D., Southgate, J., Hill, G. et al. (2004),
‘The epitheliome: Agent-based modelling of
the social behaviour of cells’, Biosystems, Vol.
76(1–3), pp. 89–100.
22. Walker, D. C., Hill, G., Wood, S. M. et al.
(2004), ‘Agent-based computational modelling
of wounded epithelial cell monolayers’, IEEE
Trans Nanobiosci., Vol. 3(3), pp. 153–163.
23. Athale, C., Mansury, Y. and Deisboeck, T. S.
(2005), ‘Simulating the impact of a molecular
‘‘decision-process’’ on cellular phenotype and
multicellular patterns in brain tumors’,
J. Theor. Biol., Vol. 233(4), pp. 469–481.
24. Marlovits, G., Tyson, C. J., Novák, B. and
Tyson, J. J. (1998), ‘Modelling M-phase
control in Xenopus oocyte extracts: The
surveillance mechanism for unreplicated
176
DNA’, Biophys. Chem., Vol. 72(1–2),
pp. 169–184.
25. Kohn, K. W. (2001), ‘Molecular interaction
maps as information organizers and simulation
guides’, Chaos, Vol. 11(1), pp. 84–97.
26. Qu, Z., MacLellan, W. R. and Weiss, J. N.
(2003), ‘Dynamics of the cell cycle:
Checkpoints, sizers, and timers’, Biophys. J.,
Vol. 85(6), pp. 3600–3611.
27. Finney, A. and Hucka, M., ‘Systems Biology
Markup Language (SBML) Level 2: Structures
and facilities for model definitions’ (URL:
http://www.sbml.org [cited 6th April, 2005]).
28. Cuellar, A., Nielsen, P, Halstead, M. et al.
(2003), ‘CellML 1.1 draft specification’ (URL:
http://www.cellml.org/public/specification/
20030930/cellml_specification.html).
29. Barabási, A.-L. and Albert, R. (1999),
‘Emergence of scaling in random networks’,
Science, Vol. 286(5439), pp. 509–512.
30. Milo, R., Shen-Orr, S., Itzkovitz, S. et al.
(2002), ‘Simple building blocks of complex
networks’, Science, Vol. 298, pp. 824–827.
31. Angeli, D., Ferrell, J. E. and Sontag, E. D.
(2004), ‘Detection of multistability,
bifurcations, and hysteresis in a large class of
biological positive-feedback systems’, Proc.
Natl Acad. Sci. USA, Vol. 101(7),
pp. 1822–1827.
32. Cross, F. R., Archambault, V., Miller, M. and
Klovstad, M. (2002), ‘Testing a mathematical
model of the yeast cell cycle’, Mol. Biol. Cell,
Vol. 13(1), pp. 52–70.
33. Hakenberg, J., Schmeier, S., Kowald, A. et al.
(2004), ‘Finding kinetic parameters using text
mining’, OMICS: J. Integrative Biol., Vol. 8(2),
pp. 131–152.
34. Strogatz, S. H. (2000), ‘Non-linear Dynamics
and Chaos’, Perseus Books, Reading, MA.
35. Novák, B. and Tyson, J. J. (2003), ‘Modelling
the controls of the eukaryotic cell cycle’,
Biohem. Soc. Trans, Vol. 31 Pt 6,
pp. 1526–1529.
36. Slepchenko, B. M. and Terasaki, M. (2003),
‘Cyclin aggregation and robustness of
bio-switching’, Mol. Biol. Cell, Vol. 14(11),
pp. 4695–4706.
37. von Dassow, G., Meir, E., Munro, E. M. and
Odell, G. M. (2000), ‘The segment polarity
network is a robust developmental module’,
Nature, Vol. 406, pp. 188–192.
38. Morohashi, M., Winn, A. E., Borisuk, M.
et al. (2002), ‘Robustness as a measure of
plausibility in models of biochemical
networks’, J. Theor. Biol., Vol. 216(1),
pp. 19–30.
39. Anderson, A. R. A. and Chaplain, M. A. J.
(1998), ‘Continuous and discrete mathematical
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
Mathematical models of cell cycle regulation
models of tumor-induced angiogenesis’, Bull.
Math. Biol., Vol. 60, pp. 857–900.
40. Gabdoulline, R. R., Kummer, U., Olsen,
L. F. and Wade, R. C. (2003), ‘Concerted
simulations reveal how peroxidase compound
III formation results in cellular oscillations’,
Biophys. J., Vol. 85(3), pp. 1421–1428.
41. Xenarios, I., Salwinski, L., Duan, X. J. et al.
(2002), ‘DIP, the Database of Interacting
Proteins: A research tool for studying cellular
networks of protein interactions’, Nucleic Acids
Res., Vol. 30(1), pp. 303–305.
42. Han, J.-D. J., Bertin, N., Hao, T. et al. (2004),
‘Evidence for dynamically organized
modularity in the yeast protein–protein
interaction network’, Nature, Vol. 430(6995),
pp. 88–93.
43. Kutalik, Z., Cho, K.-H. and Wolkenhauer, O.
(2004), ‘Optimal sampling time selection for
parameter estimation in dynamic pathway
modelling’, Biosystems, Vol. 74(1–3),
pp. 43–55.
44. Nurse, P. (2000), ‘A long twentieth century of
the cell cycle and beyond’, Cell, Vol. 100(1),
pp. 71–78.
45. Kitano, H., ‘A Standard Graphical Notation
for Biological Networks’ (URL: http://
www.sbml.org/workshops/sixth/[cited 6th
April, 2005]).
46. Mandel, J., Palfreyman, N. M., Lopez, J. A.
et al. (2004), ‘Representing bioinformatic
causality’, Brief. Bioinformatics, Vol. 5(3),
pp. 270–283.
47. The Systems Biology Workbench, http://
sbw.sourceforge.net/[cited 6th April, 2005].
& HENRY STEWART PUBLICATIONS 1467-5463. B R I E F I N G S I N B I O I N F O R M A T I C S . VOL 6. NO 2. 163–177. JUNE 2005
177