BIOINFORMATICS
ORIGINAL PAPER
Vol. 22 no. 24 2006, pages 3075–3081
doi:10.1093/bioinformatics/btl516
Systems biology
Design and implementation of a tool for translating SBML into the
biochemical stochastic π-calculus
Claudio Eccher1,2 and Corrado Priami1,3
1DIT, University of Trento, 2ITC-irst, Centre for Scientific and Technological Research, Trento and
3The Microsoft Research - University of Trento Centre for Computational and Systems Biology, Trento, Italy
Received on June 14, 2006; revised on September 8, 2006; accepted on October 5, 2006
Advance Access publication October 17, 2006
Associate Editor: Alvis Brazma
ABSTRACT
Motivation: SBML is becoming a de facto standard for representing and
storing biological models. Although SBML is very useful for defining
ways of exchanging and storing biological information, it is not formal
enough to allow direct translation into unambiguous formal representation languages for the analysis and simulation of models.
We propose here to map SBML models into process calculi representations.
Results: We implemented and validated a tool that translates SBML
descriptions into stochastic π-calculus specifications.
Availability: Source code is freely available for academic use by
contacting the authors.
Contact: [email protected]
1 INTRODUCTION
Biological systems can be considered ‘complex software systems’:
layer upon layer of complex control mechanisms that in essence do
a lot of very sophisticated information processing. To conduct a
system-level analysis, the integration of systems biology (Kitano,
2001) with Information Technology (IT) is needed to predict
and explain the behavior of possibly large and complex reaction
networks. A variety of mathematical formalisms equipped with
simulation techniques have been proposed in mathematical biology
and bioinformatics to model the dynamic causal interactions of
biochemical entities (e.g. Mestl et al., 1995; Matsuno et al.,
2003; Kam et al., 2001; Kahn et al., 2003).
Biological systems can be considered as distributed systems
composed of a huge number of patterns that interact and compete
with decentralized control and strong localization of interactions. In
computer science, concurrent systems are defined as groups
of co-existing computational processes that can communicate with each
other in a synchronous or asynchronous way (Milner, 1989). The
similarity between the abstract view of biological systems outlined above
and concurrent systems led us to apply the specification
techniques of concurrent systems to biological systems as well.
Our metaphor considers biological components as concurrent
processes and their interaction as process communication or process
movement. Among the process algebras, the π-calculus (Regev
et al., 2001) and its stochastic variant, the biochemical stochastic
π-calculus (Priami et al., 2001), have been proposed as
appropriate formalisms to model a system of interacting molecules
in a network of biochemical reactions. The biochemical stochastic
π-calculus can express many qualitative features of molecular
pathways (concurrency, compositionality, mobility, i.e. change in
network structure as a result of interaction, and a modular and
hierarchical structure), as well as describe their quantitative behavior. One of the main benefits of the calculus is that the emergent
behavior of a complex system can be predicted by modeling and
composing independent system components. Tools for performing
dynamic and quantitative simulations are available, such as BioSpi
(The BioSpi project, 2002, http://www.wisdom.weizmann.ac.il/biospi) and the more recent Stochastic Pi Machine (SPiM) (Phillips
and Cardelli, 2004). The recent literature reports examples of π-calculus modeling and simulation of cell cycle control (Lecca and
Priami, 2003), the λ-phage switch (Kuttler et al., 2006) and lymphocyte recruitment in inflamed brain micro-vessels (Lecca et al.,
2004). These results emphasize the suitability of the new formalism.
On the other hand, modeling large systems as π-calculus
specifications is a hard task for biologists, and automatic translators
are needed that hide as many formal details as possible
from the user.
Several exchange languages have been recently developed to
overcome problems of integration and reuse of biological models
(Liao and Ghanadan, 2002; Spellman et al., 2002; Taylor et al.,
2003; Hanisch et al., 2002; Waugh et al., 2002; Cuellar et al.,
2003). Most of them are based on the eXtensible Markup
Language (XML), whose use has been spreading widely in bioinformatics (Achard et al., 2001). Among them, the Systems Biology Markup Language (SBML) (Hucka et al., 2003) is becoming
a de facto standard for a common representation supporting
basic biochemical models. SBML is supported by more than
90 software packages and it is the standard model definition
language used by several consortia (Kumar and Feidler, 2003;
Holden, 2002). As a consequence, hundreds of SBML models of
gene regulatory networks and metabolic pathways that code a considerable body of biological knowledge have been accumulated in
repositories.
To convert the existing SBML models into the biochemical stochastic π-calculus and so exploit verification, analysis and simulation,
we present here the design and development of the software translator
SBML2PI. The tool produces stochastic π-calculus models according to the SPiM syntax, which is a slight variant of the biochemical
stochastic π-calculus, modulo the choice of ASCII characters. The
conversion process is completely automatic and needs minimal
intervention by the end user, so as to hide the complexity
of both formalisms from biologists. The tool provides a user-friendly
graphical interface through which users can select and
display the SBML model, in both text and graphic format, automatically translate it into the SPiM specification, set the model’s
parameters and save the model for subsequent simulation. The
biochemical stochastic π-calculus and SBML express biological
knowledge at considerably different levels of abstraction; hence, to
perform an automatic translation, we limited the translation to a
well-defined subset of SBML structures.
The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]
We validated the translation tool by running simulations with
SPiM on available SBML models. We translated models without
manual annotations, taken from the SBML repository formerly
available at http://sbml.org/models; hence we performed the translation exploiting only the information provided by the SBML
biological structures in the considered model. We present here the
results obtained on an acetylcholine receptor model and compare
them with those presented in the literature.
2 METHODS
2.1 Assumptions
Since not all the SBML Level 2 components can be translated into the
stochastic π-calculus, we made some assumptions, discussed in the following
paragraphs. In the π-calculus the syntactical structure of a model encodes the
information on the structure of molecules and complexes along with their
interaction capabilities. On the other hand, SBML is an ODE-based formalism that allows systems to be described at a high level of abstraction, even in
the presence of partial and incomplete biological knowledge. The information
about the detailed molecular structure cannot be retrieved from the biological components constituting the model alone.¹ As a consequence, the level
of detail that the π-calculus allows one to specify may be unexpressed in SBML,
and some characteristics that make the π-calculus well suited for modeling
bio-molecular systems cannot be fully exploited or may make the
translation difficult.
2.1.1 Species Species are simple, indivisible biochemical entities of
the same type located in a specific compartment and have only one possible
state (Hucka et al., 2004). Biological entities with several possible internal
states, such as the states of a protein phosphorylated at different locations,
have to be represented as separately named chemical species. As an example, compare the biological model of the Tyson cell cycle in Figure 1 (Tyson,
1991) with the reactions and species in the corresponding SBML model,
given in Table 1. In the SBML model cdc2 is rendered with the species C2
and its phosphorylated form with CP. Similarly, the complex P-cyclin-cdc2-P
and its form with the cdc2 subunit dephosphorylated are rendered with
the species pM and M, respectively. As a consequence, the link between
the real biochemical entity and the corresponding SBML species is lost;
therefore, one cannot know from the model whether different species actually
represent the same biological entity. Moreover, species lack the property
of compositionality, which, on the contrary, is the main feature of the
π-calculus process algebra. The behavior of a molecule in a biochemical
interaction cannot be expressed in terms of the behavior of its subcomponents or functional parts.
Assumption 1: We adopt a literal translation of the SBML model. We
consider each species as a monolithic entity with neither internal structure
nor internal states, and abstract this monolithic entity with a π-calculus
process.
¹ We did not use manually annotated models, since they were not available
at the time the model analysis was made; hence this first version of the tool
does not parse the additional information provided by annotated models.
Fig. 1. The cell cycle regulatory network model in the paper of Tyson.
Table 1. The set of reactions in the SBML model of the cell cycle

Id          Reactants   Products   Reaction
Reaction1   M           C2, YP     M → C2 + YP
Reaction2   C2          CP         C2 → CP
Reaction3   CP          C2         CP → C2
Reaction4   CP, Y       pM         CP + Y → pM
Reaction5   M           pM         M → pM
Reaction6   EmptySet    Y          EmptySet → Y
Reaction7   Y           EmptySet   Y → EmptySet
Reaction8   YP          EmptySet   YP → EmptySet
Reaction9   pM          M          pM → M
2.1.2 Reactions The information provided by the reaction structures in
the SBML model does not make it possible to classify reactions. For example,
we cannot definitely state that Reaction4 in Table 1 is the formation of
a complex between phosphorylated cdc2 and cyclin. Hence, we are forced
to treat reactions at a high level of abstraction, namely as generic interactions
between species in which some entities disappear (reactant species) and some
(possibly different) entities are created (product species). At this level of
abstraction, we do not need two important features of the π-calculus
language: the name-passing mechanism that implements mobility and the
scope restriction that implements the isolation of an interaction through
the use of a private channel. As for mobility, the interaction potential
of a monolithic species cannot be changed by the interactions it participates in;
therefore, the reaction network does not need dynamic reconfiguration. For
the same reason it is not necessary to isolate the interactions between
sub-components of an entity (e.g. the dephosphorylation of the cdc2 subunit
of the ‘preMPF’ complex in the example model).
Assumption 2: A species in a reaction is abstracted by a π-calculus
process that can perform an action. After performing the action (firing of the
reaction) the process continues either as the null process, or as a species
process, or as a parallel composition of species processes.
For example in the reaction:

\[ S_1 + S_2 \rightarrow S_3 \]

we abstract the species $S_1$ and $S_2$ with two processes defined as:

\[ P_1 = (a, r).P_3, \qquad P_2 = (\bar{a}, r).0, \]

where $P_3$ is the process abstracting $S_3$.²
SBML specifications do not impose any restriction on the number of
reactants and modifiers of a reaction. Let us consider reactants first. Reactions with more than two reactants cannot be modeled in the π-calculus for
two reasons:
(1) communications in the π-calculus are pairwise: only transitions between
pairs of processes sharing a common channel name are allowed.
Consequently, the π-calculus model can accommodate only reactions
with no reactants, or with one or two reacting molecules (zeroth, first and
second order reactions, respectively);³
(2) the Gillespie algorithm used for the stochastic simulation (Gillespie,
1977) is based on a theoretical framework that can deal with at
most two-body collisions.
Assumption 3: We limit the automatic translation to reactions with at
most two reactants.
By the very nature of the π-calculus, only reactions whose reactants
have integer stoichiometries can be translated. From Assumption 3 it follows
that the stoichiometry of a reactant species can only be one (first order
reactions and second order reactions with different reactants) or two (homodimerizations).
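The admissibility test implied by Assumption 3 and by the integer-stoichiometry requirement can be sketched as follows. This is only an illustration, not the SBML2PI code: the function name `reaction_order` and the dictionary input format (species id mapped to stoichiometry) are hypothetical choices made for the example.

```python
def reaction_order(reactants):
    """Return the order (0, 1 or 2) of a reaction, or None if it is not translatable.

    `reactants` maps a species id to its stoichiometry (hypothetical input format).
    A reaction is rejected when a reactant stoichiometry is not a positive
    integer or when the total number of reacting molecules exceeds two.
    """
    total = 0
    for stoichiometry in reactants.values():
        if stoichiometry <= 0 or float(stoichiometry) != int(stoichiometry):
            return None  # non-integer stoichiometry cannot be translated
        total += int(stoichiometry)
    return total if total <= 2 else None  # Assumption 3: at most two reactants


# Reaction4 of Table 1 (CP + Y -> pM) is second order; a homodimerization
# 2S -> SD is also second order, with a single reactant of stoichiometry two.
print(reaction_order({"CP": 1, "Y": 1}), reaction_order({"S": 2}))
```

Under these assumptions a reaction without reactants is classified as zeroth order, and anything with three or more reacting molecules is left to the user.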
In SBML the kinetic law can be a complex mathematical function of time
and of the concentrations of species (modifiers, reactants and products, in general).
A species acting as catalyst or inhibitor is modeled as a modifier that is
neither created nor destroyed in the reaction, and whose effect is taken into
account by using the modifier concentration in the expression of the kinetic
law. The stochastic rate constants can be computed by using the Gillespie
relations only when the kinetic laws are zeroth, first or second order mass
action rate laws. Therefore, the translation of reactions with catalysts is still
an open problem.
Assumption 4: In this version, the translation does not take into account
the presence of modifiers.
SBML Level 2 fully supports the representation of stochastic kinetic
models if one defines the amount of species in terms of number of molecules
and uses the law of mass action as rate law. However, in the formal part of
the SBML model there is no indication whether the model is stochastic or
deterministic and, consequently, whether the deterministic rate constants or
the stochastic ones are given. This lack of information prevents the automatic
application of the Gillespie relations to the rate constants.
Assumption 5: We do not automatically perform the conversion from
the deterministic to the stochastic rates, not even when the kinetic laws in
the model are laws of mass action.
2.1.3 Compartments In the π-calculus the restriction of name scopes can
be used to confine interactions between processes so that they cannot
interfere. On the other hand, in SBML Level 2 a compartment, defined as
a bounded space in which a species is located, is an attribute of the species,
not of the reaction. Therefore, a species must be assigned to only one compartment. The same biological entity in two different compartments has to
be modeled by two different species. Actually, the concept of compartment
is only used to represent simple topological relationships between bounded
spaces and to provide the size of the space in which a species is located.
Assumption 6: We do not need to use scope restriction in the π-calculus
model because the compartmental localization is ‘built-in’ in the concept
of species.
In SBML the volume of a compartment can be determined by rules and
can vary during the simulation. The compartment volume appears in the
Gillespie relation for the second order stochastic constant. However, the
Gillespie algorithm cannot accommodate dynamically varying basal rates;
therefore:
Assumption 7: We do not consider variable size compartments in the
translation.
2.1.4 Rules, events and function definitions Rules are mathematical expressions used to set parameters, establish constraints between quantities,
etc.; events are mathematical formulas evaluated at specified moments in
the time evolution of the system; and function definitions allow the definition of any
mathematical function that can be used throughout the model. In stochastic
process algebra models it is not obvious how to force the number of
processes or the stochastic rates to depend dynamically on trigger events,
constraints or functions that cannot be expressed using reactions.
Assumption 8: If present in an SBML model, the structures Rule, Event
and FunctionDefinition are not automatically translated.
2.2 Translation
2.2.1 General process form We abstract a reactant species S with a
process P defined as a summation over the reactions in which the
species participates as reactant; formally:⁴

\[ P = \sum_{i=1}^{m} P_i, \qquad P_i = (p_i, r_i).Q_i, \tag{1} \]

where m is the number of reactions in which S takes part as reactant, $p_i$
is an input or output globally fresh name and $r_i$ is the stochastic
communication rate of the channel. A communication along this channel abstracts
reaction i. $Q_i$ can be either the null process or a suitable combination of
reactant and product processes composed in parallel.
SBML allows one to indicate whether the concentration (or amount) of a
species has to remain constant, or can be changed either by the set of
reactions or by the rules.⁵ Since the number of molecules of a constant
species can be neither decreased nor increased by a reaction, the intuition
is that the definition of $Q_i$ must include both the processes abstracting non-constant products and those abstracting constant reactants.
In defining the formal translation rules we distinguish among zeroth,
first and second order reactions, and homodimerizations.
2.2.2 Zeroth order reactions Zeroth order reactions, characterized
by the absence of reactants in the SBML description, are translated by
transforming them into first order reactions through the definition of a fictitious constant reactant species. This allows the first order reaction
translation rules defined in the next section to be applied.
2.2.3 First order reactions Let us define the ith first order reaction
with the reactant species S as:

\[ R_i : S \rightarrow \sum_{j} n_j S_j, \qquad j = 1, \ldots, l, \]

where l is the number of product species, $S_j$ is the jth product (which
may be the reactant itself) and $n_j$ is its stoichiometry. The reactant species
S is abstracted by the process P defined in Equation (1). We define the
ith member of the summation P, determined by $R_i$, as:

\[ P_i = (a_i, r_i).Q_i, \tag{2} \]

where $a_i$ is an input channel with rate $r_i$.
² In this schema, the variants that lead to the same behavior of the system
$P_1 \mid P_2$ are also valid choices, e.g. $P_1 = (a, r).0$, $P_2 = (\bar{a}, r).P_3$.
³ Zeroth and first order reactions need the definition of suitable processes
allowing the reaction to proceed.
⁴ We represent the general form of the Summation as $\sum_{i \in I} P_i$ and of Parallel
composition as $\prod_{i \in I} P_i$, where the index set I is finite.
⁵ When the concentration of a species can be changed only by the set of rules
we treat it as constant, since we do not consider rules in the translation
process.
If the reactant concentration cannot change, we define the process $Q_i$ as:

\[ Q_i = P \mid \prod_{j=1}^{l'} V_j, \tag{3} \]

where $V_j$ is given by:

\[ V_j = \overbrace{W_j \mid \cdots \mid W_j}^{n_j \text{ times}}. \tag{4} \]

The set of the $W_j$, indexed by $\{1, \ldots, l'\} \subseteq \{1, \ldots, l\}$, contains the processes
abstracting the product species $S_j$ (with stoichiometry $n_j$) whose concentration can be changed by the set of reactions, but does not contain the
reactant process P.
If the reactant concentration can be changed by the set of reactions,
we define the process $Q_i$ as:

\[ Q_i = \prod_{j} V_j, \tag{5} \]

where $V_j$ is as in Equation (4), but now the set of the $W_j$ contains P if
the reactant appears as a product as well. If the set of the $W_j$ is empty then:

\[ Q_i = 0. \tag{6} \]

To allow the reaction to proceed, we pair the reactant of the ith first
order reaction with a special process $\mathit{CLOCK}_i$ with a complementary
output channel $\bar{a}_i$:

\[ \mathit{CLOCK}_i = (\bar{a}_i, r_i).\mathit{CLOCK}, \tag{7} \]

where CLOCK is the summation of the components $\mathit{CLOCK}_i$ over the
first order reactions.
2.2.4 Second order reactions Let us define the ith second order
reaction as:

\[ R_i : S_1 + S_2 \rightarrow \sum_{j} n_j S_j, \qquad j = 1, \ldots, l, \]

where the reactant species $S_1$ and $S_2$ are different and are abstracted by
the processes:

\[ P_1 = \sum_{i} (a_i, r_i).Q_{1i}, \qquad P_2 = \sum_{i} (\bar{a}_i, r_i).Q_{2i}. \]

If the concentration of the species $S_1$ cannot be changed by the set of
reactions, we define $Q_{1i}$ as:

\[ Q_{1i} = P_1 \mid \prod_{j=1}^{l'} V_j, \tag{8} \]

where the set of the $V_j$ is as in Equation (4) and $P_1 \notin \{W_j\}$, otherwise:

\[ Q_{1i} = \prod_{j} V_j \tag{9} \]

and $P_1$ can be in $\{W_j\}$. As before, if the set $\{W_j\}$ is empty $Q_{1i}$ is set equal
to the null process.
The process $Q_{2i}$ is defined by:

\[ Q_{2i} = P_2 \mid 0, \tag{10} \]

if the concentration of the species $S_2$ cannot be changed by the set of
reactions, otherwise:

\[ Q_{2i} = 0. \tag{11} \]

2.2.5 Homodimerizations A special case is represented by homodimerizations:

\[ R_i : 2S \rightarrow S_D \]

which we treat as a second order reaction where the two reactant processes are the same component of the summation P with complementary
channels, i.e.:

\[ P_i = (a_i, r_i).Q_{1i} + (\bar{a}_i, r_i).Q_{2i}, \tag{12} \]

where $Q_{1i}$ and $Q_{2i}$ are defined as in the case of second order reactions.

Table 2. The processes abstracting the SBML species of the Tyson model

Species    Reactant of            Process
EmptySet   Reaction6              PES = PESR6
C2         Reaction2              PC2 = PC2R2
CP         Reaction3, Reaction4   PCP = PCPR3 + PCPR4
M          Reaction1, Reaction5   PM = PMR1 + PMR5
pM         Reaction9              PpM = PpMR9
Y          Reaction4, Reaction7   PY = PYR4 + PYR7
YP         Reaction8              PYP = PYPR8

2.2.6 An example We show the step-by-step application of the translation rules to the set of reactions of the cell cycle SBML model reported
in Table 1. The species processes, defined according to Equation (1), are
given in Table 2. Applying the rules to each reaction in turn we obtain the
following specifications of the components in Table 2:

\[
\begin{aligned}
P_{MR1} &= (ch_{R1}, r_{R1}).(P_{C2} \mid P_{YP}) && \text{using Equation (5)} \\
P_{C2R2} &= (ch_{R2}, r_{R2}).P_{CP} && \text{using Equation (5)} \\
P_{CPR3} &= (ch_{R3}, r_{R3}).P_{C2} && \text{using Equation (5)} \\
P_{CPR4} &= (ch_{R4}, r_{R4}).P_{pM} && \text{using Equation (9)} \\
P_{YR4} &= (\overline{ch}_{R4}, r_{R4}).0 && \text{using Equation (11)} \\
P_{MR5} &= (ch_{R5}, r_{R5}).P_{pM} && \text{using Equation (5)} \\
P_{ESR6} &= (ch_{R6}, r_{R6}).(P_{ES} \mid P_{Y}) && \text{using Equation (3)} \\
P_{YR7} &= (ch_{R7}, r_{R7}).0 && \text{using Equation (6)} \\
P_{YPR8} &= (ch_{R8}, r_{R8}).0 && \text{using Equation (6)} \\
P_{pMR9} &= (ch_{R9}, r_{R9}).P_{M} && \text{using Equation (5)}
\end{aligned}
\]

Since there are eight first order reactions in the model, using Equation (7)
we define a clock process CLK as:

\[
\begin{aligned}
\mathit{CLK} = {} & (\overline{ch}_{R1}, r_{R1}).\mathit{CLK} + (\overline{ch}_{R2}, r_{R2}).\mathit{CLK} + (\overline{ch}_{R3}, r_{R3}).\mathit{CLK} + (\overline{ch}_{R5}, r_{R5}).\mathit{CLK} \\
& + (\overline{ch}_{R6}, r_{R6}).\mathit{CLK} + (\overline{ch}_{R7}, r_{R7}).\mathit{CLK} + (\overline{ch}_{R8}, r_{R8}).\mathit{CLK} + (\overline{ch}_{R9}, r_{R9}).\mathit{CLK}
\end{aligned}
\]

The set of processes so defined are concurrently composed in the whole
system by using parallel composition.
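The translation rules of Section 2.2 can be sketched in a few lines of Python and checked against the cell cycle example. This is a minimal illustration under stated assumptions, not the SBML2PI implementation (which is written in Java): the function names, the `Source` species used for zeroth order reactions, the input format and the textual notation (`~ch` for the complementary output channel, `P_X` for the process of species X) are all hypothetical and do not follow the SPiM syntax.

```python
SOURCE = "Source"  # hypothetical name of the fictitious constant reactant (Section 2.2.2)


def par(names):
    """Parallel composition of process names; the null process if there are none."""
    if not names:
        return "0"
    return names[0] if len(names) == 1 else "(" + " | ".join(names) + ")"


def translate(reactions, constant=()):
    """Map (id, reactants, products) triples to process summands and clock summands.

    `reactants` and `products` are dicts species -> integer stoichiometry;
    `constant` lists the species whose amount cannot be changed by reactions.
    """
    const = set(constant) | {SOURCE}
    summands, clock = {}, []
    for rid, reactants, products in reactions:
        ch, rate = f"ch{rid}", f"r{rid}"
        # Non-constant products, each repeated according to its stoichiometry (Eq. 4).
        prods = [f"P_{s}" for s, n in products.items() if s not in const for _ in range(n)]
        species = [s for s, n in reactants.items() for _ in range(n)] or [SOURCE]
        if len(species) > 2:
            raise ValueError(f"{rid}: more than two reactants (Assumption 3)")
        first = species[0]
        keep = [f"P_{first}"] if first in const else []  # Eqs (3)/(8) versus (5)/(9)
        summands.setdefault(first, []).append(f"({ch}, {rate}).{par(keep + prods)}")
        if len(species) == 1:
            clock.append(f"(~{ch}, {rate}).CLK")  # Eq. (7)
        else:
            second = species[1]  # equal to `first` for a homodimerization (Eq. 12)
            keep2 = [f"P_{second}"] if second in const else []  # Eqs (10)/(11)
            summands.setdefault(second, []).append(f"(~{ch}, {rate}).{par(keep2)}")
    return summands, clock


# The reactions of Table 1, with EmptySet treated as a constant species.
TYSON = [
    ("R1", {"M": 1}, {"C2": 1, "YP": 1}),
    ("R2", {"C2": 1}, {"CP": 1}),
    ("R3", {"CP": 1}, {"C2": 1}),
    ("R4", {"CP": 1, "Y": 1}, {"pM": 1}),
    ("R5", {"M": 1}, {"pM": 1}),
    ("R6", {"EmptySet": 1}, {"Y": 1}),
    ("R7", {"Y": 1}, {"EmptySet": 1}),
    ("R8", {"YP": 1}, {"EmptySet": 1}),
    ("R9", {"pM": 1}, {"M": 1}),
]

summands, clock = translate(TYSON, constant={"EmptySet"})
for name, terms in summands.items():
    print(f"P_{name} = " + " + ".join(terms))
print("CLK = " + " + ".join(clock))
```

In this sketch the first reactant listed in a second order reaction receives the input channel and carries the products, which is one of the equivalent choices allowed by the schema.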
3 RESULTS
3.1 Implementation
In this section we present SBML2PI, a tool written in Java that
implements the translation rules and performs the automatic
translation into the biochemical stochastic π-calculus. SBML2PI
produces output for SPiM version 0.04 (http://research.microsoft.com/~aphillip/spim). The user interface of the tool is
shown in Figure 2.
Fig. 3. Complete network of conformational interconversions and agonist
binding at two sites of each receptor. Conformational interconversion
reactions are represented horizontally and ligand (agonist) reactions are
represented vertically. B is the basal activatable state, A is the active state,
I is the inactivatable state and D the fully desensitized state. X represents the
ligand. The state subscript indicates the number of ligand molecules bound.
Fig. 2. Screenshot of the tool user interface with text (upper left panel) and
graphical (lower left panel) representation of the reaction network. Boxes
represent reactions, filled dark and light gray circles represent reactants
and products, respectively. Constant species are represented with outlined
circles. The right panel displays the processes generated from the translation
and composed in the process algebra model according to the SPiM syntax
version 0.04.
The three main steps performed by SBML2PI are:
(1) The SBML file is read from the disk and parsed to retrieve
the SBML structures. The set of reactions is shown both in text
and graphical form in the tool user interface;
(2) The reaction network is solved by applying to each reaction in
turn the rules in the preceding section to define the π-calculus
processes abstracting the SBML species;
(3) The whole system is composed according to the SPiM syntax
and shown in the user interface. Before saving the SPiM
model two input forms allow the user to set the stochastic rate
constants and the initial number of process copies for the
simulation.
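Step (1) above can be illustrated with a short sketch that reads the reactions of an SBML Level 2 document using only the Python standard library. It is not the parser used by SBML2PI; the function names and the returned format are hypothetical, and the sketch relies only on the standard SBML elements `reaction`, `listOfReactants`, `listOfProducts` and `speciesReference` (whose `stoichiometry` attribute defaults to 1).

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace so the sketch works for any SBML Level 2 version."""
    return tag.rsplit("}", 1)[-1]


def read_reactions(sbml_text):
    """Return a list of (reaction id, reactants, products) from an SBML string.

    Reactants and products are dicts mapping species id -> stoichiometry.
    Modifiers, kinetic laws, rules and events are ignored here.
    """
    root = ET.fromstring(sbml_text)
    reactions = []
    for elem in root.iter():
        if local(elem.tag) != "reaction":
            continue
        sides = {"listOfReactants": {}, "listOfProducts": {}}
        for child in elem:
            name = local(child.tag)
            if name in sides:
                for ref in child:
                    species = ref.get("species")
                    amount = float(ref.get("stoichiometry", "1"))
                    sides[name][species] = sides[name].get(species, 0.0) + amount
        reactions.append((elem.get("id"), sides["listOfReactants"], sides["listOfProducts"]))
    return reactions


# A hand-written fragment containing only Reaction4 of Table 1.
SAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="example">
    <listOfReactions>
      <reaction id="Reaction4">
        <listOfReactants>
          <speciesReference species="CP"/>
          <speciesReference species="Y"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="pM"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""

print(read_reactions(SAMPLE))
```

A production tool would more likely rely on a dedicated SBML library and would also read the species attributes needed to recognize constant species.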
3.2 Validation
We show the results of the simulation of an SBML model of nicotinic
acetylcholine receptors (nAChR) involved in the mediation of interconversion between open and closed channel states under the
control of a neurotransmitter, translated into the stochastic π-calculus
by using SBML2PI. The biological model, originally described by
Edelstein et al. (1996), is depicted in Figure 3. The receptor
molecules are present in an equilibrium between at least four
distinct conformational states, each of which can bind up to two
molecules of agonist. The allosteric states differ by their affinity for
agonists and the interconversion rates. The binding of agonists
changes the rates of interconversion of the allosteric states.
Figure 4, reproduced from the original paper, shows the results
of kinetic simulation on the model. The progression through states
following activation upon the application of a strong and prolonged
pulse of agonist is displayed. Because the data occur over several
time regimes a log10 time axis scale was used.
Fig. 4. Graph taken from the Edelstein paper showing the fractional
population of the four states B, A, I and D in the time range $10^{-8}$–$10^{2}$ s
during a strong agonist pulse ($10^{-5}$ M), on a logarithmic scale.
The SBML model defines one three-dimensional compartment, 13 species (one for
each state plus the ligand species), 17 reversible reactions and 34
kinetic constants. To test the translation we set in the π-calculus
model 1000 copies of the process abstracting the state B0 and 2000
copies of the process abstracting the ligand. The stochastic rates
were calculated by applying the Gillespie relations to the deterministic constants, using the compartment volume given in the SBML model.
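The conversion mentioned above can be sketched as follows, using the Gillespie relations in the form in which they are commonly stated for mass action kinetics. The sketch assumes that deterministic constants are expressed in molar units and the volume in litres, and that the homodimerization constant refers to the rate law k[S]²; these conventions, like the function name, are assumptions of the example and are not taken from the SBML model.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def stochastic_rate(k, order, volume_litres=None, homodimer=False):
    """Convert a deterministic mass action constant into a stochastic one.

    order 0: k in M/s      ->  c = k * N_A * V
    order 1: k in 1/s      ->  c = k
    order 2: k in 1/(M s)  ->  c = k / (N_A * V), doubled for 2S -> SD
    """
    if order == 1:
        return k  # first order constants do not depend on the volume
    if volume_litres is None or volume_litres <= 0:
        raise ValueError("a positive compartment volume is required")
    scale = AVOGADRO * volume_litres
    if order == 0:
        return k * scale
    if order == 2:
        return (2.0 if homodimer else 1.0) * k / scale
    raise ValueError("only zeroth, first and second order reactions are supported")


# Illustrative values only: a binding constant of 1e6 1/(M s) in a volume of 1e-15 l.
print(stochastic_rate(1.0e6, 2, volume_litres=1.0e-15))
```

Because the volume enters the second order relation, a compartment whose size changes during the simulation would require the constant to be recomputed, which is the situation excluded by Assumption 7.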
We ran SPiM for a total simulation time of 100 s and obtained
the graph in Figure 5, which shows the time evolution of the number
of processes corresponding to B0, B1, A2, D2 and I2.
The comparison of the two graphs shows an excellent agreement,
both qualitative (shape of the curves) and quantitative (fractional population values at each time), between our results and
those given in the original paper.
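For readers unfamiliar with the simulation scheme, the following is a minimal sketch of Gillespie's direct method applied to a toy reversible binding step. It is not the SPiM implementation and does not reproduce the nAChR model: the function names, the species names and the rate values are illustrative assumptions.

```python
import random


def reactant_combinations(state, reactants):
    """Number of distinct reactant combinations for a first or second order reaction."""
    if len(reactants) == 1:
        return state[reactants[0]]
    a, b = reactants
    if a != b:
        return state[a] * state[b]
    return state[a] * (state[a] - 1) // 2  # homodimerization


def gillespie(state, reactions, t_end, seed=0):
    """Direct method; `reactions` is a list of (stochastic rate, reactants, products)."""
    rng = random.Random(seed)
    state = dict(state)
    t = 0.0
    trace = [(t, dict(state))]
    while True:
        props = [c * reactant_combinations(state, rs) for c, rs, _ in reactions]
        total = sum(props)
        if total <= 0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its propensity.
        x, acc, chosen = rng.random() * total, 0.0, None
        for idx, a in enumerate(props):
            if a <= 0:
                continue
            chosen = idx
            acc += a
            if x < acc:
                break
        _, reactants, products = reactions[chosen]
        for s in reactants:
            state[s] -= 1
        for s in products:
            state[s] = state.get(s, 0) + 1
        trace.append((t, dict(state)))
    return trace


# Toy example: B0 + X <-> B1 with arbitrary stochastic rates.
toy = [(0.01, ["B0", "X"], ["B1"]), (1.0, ["B1"], ["B0", "X"])]
trace = gillespie({"B0": 100, "X": 200, "B1": 0}, toy, t_end=1.0)
print(len(trace), trace[-1])
```

The restriction to one- and two-molecule reactant lists in `reactant_combinations` mirrors the two-body limit discussed in Section 2.1.2.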
4 CONCLUSION AND FURTHER WORK
We addressed the problem of translating SBML models into the
biochemical stochastic π-calculus for subsequent simulation. We
developed SBML2PI, a prototype tool for working biologists, implemented in Java, that performs the automatic translation of SBML
models. We assessed the validity of our approach by obtaining
simulation results on the translated models in agreement with the
literature. To our knowledge only one recent paper in the
literature (Dong et al., 2005) deals with the implementation of a
mapping from SBML into the π-calculus. However, it is
clear neither how the translation is performed nor what limitations
the process suffers from and what consequent assumptions the authors had
to make to perform the translation.
This is a first step toward the development of a fully working tool
able to translate all the SBML structures to produce π-calculus
models. The next version of SBML could help improve our
approach. In fact, it will be able to represent multi-component
species through the definition of the concept of species type. Moreover, the concept of reaction will be generalized to allow reactions to
occur in any compartment by referring to participating compounds
by species type rather than by compartment-specific species. These
new concepts could make it possible to solve the problem of dealing with
monolithic species (Assumption 1), to classify a reaction into a known
type (Assumption 2) and to use compartments (Assumption 6).
A graphical instrument for manually factorizing higher order
reactions into a set of first and second order steps can be easily
integrated in the tool, although performing a biologically meaningful
factorization may be a hard task even for an expert biologist. Such
a tool could also make it possible to factorize reactions with complex kinetic
laws (Assumptions 3 and 4), which often result from approximations that aggregate binary interactions. When the original set of
reactions is known (e.g. Michaelis–Menten kinetics) this factorization could be automatic.
Some kinds of events in SBML (e.g. events that constrain
parameters to assume new values after a certain time) could be
translated by producing several π-calculus models, one per event,
with non-overlapping temporal validity intervals. The simulator
could run a model for the time in which it is valid, setting the initial
number of processes to that obtained from the simulation of the
preceding model.
Since the conclusion of this work, all the available SBML models
have been annotated with links to relevant biological data sources
and made available in the BioModels Database (Le Novère et al.,
2006). The annotations could be parsed and the related information on species and reactions could be retrieved and displayed
during the translation to help users enhance the model when
the automatic translation cannot be performed. Moreover, the
translator could automatically access the formal description of
biological entities in these knowledge bases to produce more
detailed process algebra models fully exploiting the expressivity
of the language.
Fig. 5. The graph of the number of processes over the logarithm (to the
base 10) of the time (expressed in seconds) obtained from the simulation
of the π-calculus model by setting 1000 copies of the process abstracting the
state B0 and 2000 copies of the process abstracting the agonist.
Conflict of Interest: None declared.
REFERENCES
Achard,F. et al. (2001) XML, bioinformatics and data integration. Bioinformatics, 17,
115–125.
Cuellar,A.A. et al. (2003) An overview of CELLML 1.1, a biological model description
language. Simulation, 79, 740–747.
Dong,Z. et al. (2005) An implementation for mapping SBML to BioSPI. In Wang,L.
and Jin,Y. (eds), Proceedings of the Second International Conference on Fuzzy
Systems and Knowledge Discovery FSKD 2005, Lecture Notes in Artificial
Intelligence (LNAI) 3614, Springer, pp. 1128–1131.
Edelstein,S.J. et al. (1996) A kinetic mechanism for nicotinic acetylcholine receptors
based on multiple allosteric transitions. Biol. Cybern., 75, 361–379.
Finney,A. and Hucka,M. (2003) Systems Biology Markup Language (SBML) Level 2:
structures and facilities for model definitions.
Gillespie,D.T. (1977) Exact stochastic simulation of coupled chemical reactions.
J. Phys. Chem., 81, 2340–2361.
Hanisch,D. et al. (2002) ProML—the Protein Markup Language for specification of
protein sequences, structures and families. In Silico Biol, 2, 313–324.
Holden,C. (2002) Alliance launched to model E. coli. Science, 297, 1459–1460.
Hucka,M. et al. (2003) The Systems Biology Markup Language (SBML): a medium for
representation and exchange of biochemical network models. Bioinformatics, 19,
524–531.
Hucka,M. et al. (2004) Evolving a lingua franca and associated software infrastructure
for computational systems biology: the System Biology Markup Language
(SBML) project. Syst. Biol., 1, 41–53.
Kam,N. et al. (2001) The immune system as a reactive system: modeling T cell activation with statecharts. In Proceedings of Symposia on Human-Centric Computing
Languages and Environments, IEEE Computer Society Press, pp. 15–22.
Kahn,S. et al. (2003) A multi-agent system for the quantitative simulation of biological networks. In Proceedings of the AAMAS ’03, Melbourne, Australia,
pp. 385–392.
Kitano,H. (2001) Foundation of Systems Biology. MIT Press, Cambridge,
Massachusetts.
Kumar,S. and Feidler,J.C. (2003) BioSPICE: a computational infrastructure for
integrative biology. Omics: a J. Int. Biol., 7, 225.
Kuttler,C. et al. (2006) Gene regulation in the Pi calculus: simulating cooperativity at
the lambda switch. In Priami,C., Ingolfsdottir,A., Mishra,B., Nielson,H.R. (eds),
TCSB VII, Lecture Notes in Computer Science (LNCS) 4230, Springer, pp. 29–59.
Lecca,P. and Priami,C. (2003) Cell cycle control in eukaryotes: a BioSpi model. In
Proceedings of 1st International Workshop on Concurrent Models in Molecular
Biology (BioConcur ’03), Electronic Notes in Theoretical Computer Science
(ENTCS), Elsevier, in press; Technical Report DIT-03-045, University of Trento.
Lecca,P. et al. (2004) A stochastic process algebra approach to simulation of autoreactive lymphocyte recruitment. Simulation, 80, 273–288.
Le Novère,N. et al. (2006) BioModels Database: a free, centralized database of curated,
published, quantitative kinetic models of biochemical and cellular systems.
Nucleic Acids Res., 34, D689–D691.
Liao,Y.M. and Ghanadan,H. (2002) The chemical markup language. Anal. Chem., 74,
389A–390A.
Matsuno,H. et al. (2003) Biopathways representation and simulation on hybrid
functional Petri net. In Silico Biol, 3, 389–404.
Mestl,T. et al. (1995) A mathematical framework for describing and analyzing gene
regulatory networks. J. Theor. Biol., 176, 291–300.
McAdams,H.H. and Arkin,A. (1997) Stochastic mechanism in gene expression.
Proc. Natl Acad. Sci. USA, 94, 814–819.
Milner,R. (1989) Communication and Concurrency. Prentice Hall International,
Englewood Cliffs.
Milner,R. (1999) Communicating and Mobile Systems: The π-Calculus. Cambridge
University Press.
Phillips,A. and Cardelli,L. (2004) A correct abstract machine for the stochastic
Pi-calculus. In Proceedings of the 2nd International Workshop on Concurrent
Models in Molecular Biology (BioConcur ’04), Electronic Notes in Theoretical
Computer Science (ENTCS), Elsevier, in press.
Priami,C. et al. (2001) Application of a stochastic name-passing calculus to representation and simulation of molecular processes. Inf. Proc. Lett, 80, 25–31.
Regev,A. et al. (2001) Representation and simulation of biological processes using
the π-calculus process algebra. In Proceedings of the 6th Pacific Symposium on
Biocomputing, pp. 459–470.
Spellman,P.T. et al. (2002) Design and implementation of Microarray Gene Expression
Markup Language. Genome Biol, 3, 0046.0041–0046.0049.
Taylor,C.F. et al. (2003) A systematic approach to modeling, capturing, and
disseminating proteomics experimental data. Nat. Biotechnol., 21, 247–254.
Tyson,J.J. (1991) Modeling the cell division cycle: cdc2 and cyclin interactions.
Proc. Natl Acad. Sci. USA, 88, 7328–7332.
The BioSpi project (2002) Web site.
Waugh,A. et al. (2002) RNAML: A standard syntax for exchanging RNA information.
RNA, 8, 707–717.