Genome-Scale Metabolic Networks: reconstruction - EMBL-EBI

Genome-Scale Metabolic Networks: reconstruction, properties, and applications
Mathias Ganter
Microme Workshop on Microbial Metabolism, EMBL-EBI, October 9, 2013
csb: computational systems biology
Metabolic network models
http://www.cs.cmu.edu/~blmt/Seminar/SeminarMaterials/IntroMolBasDisease.html
Motivation:
• knowledge repository
• phenotype growth simulations
• model-driven discovery
Model sizes (reactions / metabolites / genes):
E. coli: ~ 2400 / ~ 1600 / ~ 1400
A. thaliana (unpublished): ~ 4200 / ~ 3700 / ~ 2400
Metabolic reactions
gene-protein-reaction relation
[Figure: gene-protein-reaction relation for glycolysis, linking genes (for example b1676, b2779, b1854, b1416, b1417, b1779) via gene products (Eno, PykF, PykA, GapA, GapC) to the reactions HEX1, PGI, PFK, FBA, TPI, GAPD, PGK, PGM, ENO and PYK, shown next to a stoichiometric-matrix column with entries 0, 1 and –1.]
Box 1 | Defining metabolic reactions
Different levels of information are needed to obtain a detailed description of a biochemical transformation. Biochemical accuracy is especially important if the mathematical representation of the reconstruction is to be used for subsequent computations, otherwise the calculated network properties are likely to be incorrect. The first level defines the metabolite specificity of a gene product. Although primary metabolites are often the same for homologous enzymes across organisms, the use of coenzymes might vary. In the case of lactate dehydrogenase in Escherichia coli (see figure), NAD serves as an electron acceptor for lactate (LAC) resulting in the formation of pyruvate (PYR) and NADH. The second level of detail accounts for the charged molecular formula of each metabolite at a physiological pH. The knowledge of the chemical formula leads to the third level of detail, the stoichiometric coefficients of the reaction. By balancing out the elements and charge in the reaction, the overall stoichiometry of the reaction can be defined. It is here that protons and water molecules are often added to balance the chemical equation. The directionality of the reaction represents the fourth level, at which biochemical studies and thermodynamic properties define the in vivo reaction directionality. At the fifth level, the cellular compartment in which the reaction takes place has to be determined. See supplementary information S1 (box) for more details.
Level 1: Metabolite specificity
Primary metabolites: LAC, PYR
Coenzymes: NAD, NADH
Level 2: Metabolite formulae
Neutral formulae: C3H6O3 (LAC), C3H4O3 (PYR), C21H28N7O14P2 (NAD), C21H29N7O14P2 (NADH)
Charged formulae: C3H5O3– (LAC), C3H3O3– (PYR), C21H26N7O14P2– (NAD), C21H27N7O14P2 2– (NADH)
Level 3: Stoichiometry
1 LAC + 1 NAD ? 1 PYR + 1 NADH + 1 H
Level 4: Thermodynamic considerations and/or directionality
1 LAC + 1 NAD ↔ 1 PYR + 1 NADH + 1 H
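The element and charge balancing described for Level 3 can be checked mechanically. The following is a minimal sketch, assuming the charged formulae listed for Level 2 and a proton written as `H` with charge +1; the function names `parse_formula` and `imbalance` are hypothetical helpers, not part of any toolbox.

```python
import re
from collections import Counter

# Charged formulae and charges as listed for Level 2; "h" is a proton.
METABOLITES = {
    "lac": ("C3H5O3", -1),
    "pyr": ("C3H3O3", -1),
    "nad": ("C21H26N7O14P2", -1),
    "nadh": ("C21H27N7O14P2", -2),
    "h": ("H", +1),
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C3H5O3' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(stoichiometry):
    """Return (unbalanced elements, net charge); ({}, 0) means balanced.

    stoichiometry maps metabolite id -> coefficient
    (negative = consumed, positive = produced).
    """
    elements = Counter()
    charge = 0
    for met, coeff in stoichiometry.items():
        formula, met_charge = METABOLITES[met]
        for element, n in parse_formula(formula).items():
            elements[element] += coeff * n
        charge += coeff * met_charge
    unbalanced = {el: n for el, n in elements.items() if n != 0}
    return unbalanced, charge


# Lactate dehydrogenase with and without the balancing proton.
ldh = {"lac": -1, "nad": -1, "pyr": 1, "nadh": 1, "h": 1}
ldh_without_proton = {"lac": -1, "nad": -1, "pyr": 1, "nadh": 1}

print(imbalance(ldh))
print(imbalance(ldh_without_proton))
```

Leaving out the proton leaves one hydrogen and one unit of charge unaccounted for, which is why protons are often added at this level.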
Step-wise incorporation of information
Reed et al., Nature Reviews Genetics 7, 130-141 (February 2006)
Genes
glk
pgi
pfkA, pfkB
fbaA, fbaB
tpiA
gapA, gapC1, gapC2
pgk
gpmA, gpmB
eno
pykA, pykF
metabolic reaction
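A gene-protein-reaction relation is commonly encoded as a boolean rule, with "or" for isozymes and "and" for subunits of a complex. The sketch below uses gene names from the list above, but the exact rules (in particular the "and" between gapC1 and gapC2) are assumptions made for illustration, and `reaction_active` and `lost_reactions` are hypothetical helper names.

```python
# Hypothetical GPR rules: gene names come from the slide, the boolean
# structure of each rule is an assumption made for this sketch.
GPR = {
    "PFK": "pfkA or pfkB",
    "ENO": "eno",
    "PYK": "pykA or pykF",
    "GAPD": "gapA or (gapC1 and gapC2)",
}


def reaction_active(rule, deleted_genes):
    """Evaluate a GPR rule with every deleted gene set to False."""
    tokens = rule.replace("(", " ( ").replace(")", " ) ").split()
    expr = []
    for tok in tokens:
        if tok in ("and", "or", "(", ")"):
            expr.append(tok)
        else:
            # A gene is "present" unless it was deleted.
            expr.append(str(tok not in deleted_genes))
    # The expression now contains only True/False, and/or and parentheses.
    return eval(" ".join(expr), {"__builtins__": {}}, {})


def lost_reactions(deleted_genes):
    """Reactions that can no longer be catalysed after the deletions."""
    return sorted(r for r, rule in GPR.items()
                  if not reaction_active(rule, deleted_genes))


print(lost_reactions({"pykA"}))
print(lost_reactions({"pykA", "pykF"}))
print(lost_reactions({"gapA", "gapC1"}))
```

Deleting a single isozyme gene leaves the reaction available; only when every alternative is removed does the reaction drop out of the network.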
Level 5: Localization
Prokaryotes: [c]: cytoplasm, [e]: extracellular, [p]: periplasm
Eukaryotes: [n]: nucleus, [g]: golgi apparatus, [v]: vacuole, [l]: lysosome, [m]: mitochondria, [x]: peroxisome, [h]: chloroplast, [r]: endoplasmic reticulum
1 LAC [c] + 1 NAD [c] ↔ 1 PYR [c] + 1 NADH [c] + 1 H [c]
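Once every metabolite carries a compartment tag, the reactions can be collected into a stoichiometric matrix with one row per compartmentalized metabolite and one column per reaction. This is a minimal sketch: LDH is the example used above, while the lactate transporter `LACt` is a hypothetical reaction added only so that two compartments appear; `build_s_matrix` is an illustrative helper name.

```python
# Reactions as {metabolite[compartment]: coefficient}.
# "LDH" follows the slide; "LACt" is a hypothetical transport reaction.
REACTIONS = {
    "LDH": {"lac[c]": -1, "nad[c]": -1, "pyr[c]": 1, "nadh[c]": 1, "h[c]": 1},
    "LACt": {"lac[e]": -1, "lac[c]": 1},
}


def build_s_matrix(reactions):
    """Return (metabolite ids, reaction ids, matrix as list of rows)."""
    mets = sorted({m for stoich in reactions.values() for m in stoich})
    rxns = list(reactions)
    matrix = [[reactions[r].get(m, 0) for r in rxns] for m in mets]
    return mets, rxns, matrix


mets, rxns, S = build_s_matrix(REACTIONS)
for met, row in zip(mets, S):
    print(f"{met:8s}", row)
```

Because `lac[c]` and `lac[e]` are separate rows, the same chemical species in two compartments is treated as two distinct metabolites, which is what the compartment tags are meant to express.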
From genomes to models
Annotated genome
External knowledge
Metabolic model
http://www.cs.cmu.edu/~blmt/Seminar/SeminarMaterials/IntroMolBasDisease.html
Problems
• common namespace (glucose, GLC, met1, C00031, CHEBI:4167)
• directions
• missing reactions
• missing pathways
• dead-end
• erroneous annotations
• localization
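The namespace problem can be illustrated with a small lookup table that maps every synonym of a metabolite to one canonical identifier. This is only a sketch: the synonyms are the ones listed on the slide, while the canonical id `glc`, the case-insensitive matching, and the helper names `build_lookup` and `normalise` are assumptions.

```python
# Synonyms listed on the slide for one metabolite; the canonical id "glc"
# and the table layout are assumptions made for this sketch.
SYNONYMS = {
    "glc": {"glucose", "GLC", "met1", "C00031", "CHEBI:4167"},
}


def build_lookup(synonyms):
    """Map every (lower-cased) synonym to its canonical id."""
    lookup = {}
    for canonical, names in synonyms.items():
        for name in names | {canonical}:
            key = name.lower()
            if key in lookup and lookup[key] != canonical:
                raise ValueError(f"ambiguous identifier: {name}")
            lookup[key] = canonical
    return lookup


def normalise(name, lookup):
    """Return the canonical id, or None if the name is unknown."""
    return lookup.get(name.strip().lower())


LOOKUP = build_lookup(SYNONYMS)
print(normalise("C00031", LOOKUP))
print(normalise("CHEBI:4167", LOOKUP))
print(normalise("fructose", LOOKUP))
```

Model-specific identifiers such as `met1` are only meaningful within one model, so a real table would also need to record which source each synonym came from.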
Special reactions for modeling
• Uptake and secretion reactions
• Biomass reaction: conversion of amino acids, lipids, nucleotides, cofactors, ... into 1 gram of biomass
• ATP maintenance reaction (all energy demands not related to growth)
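One way to read "1 gram of biomass" is that the biomass reaction's coefficients, taken in mmol per gram dry weight and multiplied by molar masses, should sum to one gram. The sketch below checks and rescales a draft biomass reaction under that assumption; every number in it is a hypothetical placeholder, and `biomass_mass` and `rescale_to_one_gram` are illustrative names.

```python
# All numbers below are hypothetical placeholders, not measured compositions.
# component -> (coefficient in mmol per gram dry weight, molar mass in g/mol)
DRAFT_BIOMASS = {
    "amino_acids": (5.0, 110.0),
    "lipids": (0.1, 700.0),
    "nucleotides": (0.6, 330.0),
    "cofactors": (0.05, 500.0),
}


def biomass_mass(components):
    """Mass in grams represented by the reactant side of the reaction."""
    return sum(mmol * g_per_mol / 1000.0
               for mmol, g_per_mol in components.values())


def rescale_to_one_gram(components):
    """Scale all coefficients so that the reactants sum to exactly 1 g."""
    factor = 1.0 / biomass_mass(components)
    return {name: (mmol * factor, mw)
            for name, (mmol, mw) in components.items()}


print(round(biomass_mass(DRAFT_BIOMASS), 3))
scaled = rescale_to_one_gram(DRAFT_BIOMASS)
print(round(biomass_mass(scaled), 3))
```

With this normalisation the flux through the biomass reaction can be interpreted directly as a specific growth rate.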
Current reconstruction protocol
PROTOCOL
NATURE PROTOCOLS | VOL.4 NO.12 | 2009

[...] tool for studying the systems biology of metabolism [1–7]. The number of organisms for which metabolic reconstructions have been created is increasing at a pace similar to whole genome sequencing. However, the quality of metabolic reconstructions differ considerably, which is partially caused by varying amounts of available data for the target organisms and also by a missing standard operating procedure that describes the reconstruction process in detail. This protocol details a procedure by which a quality-controlled, quality-assured reconstruction can be built to ensure high quality and comparability between reconstructions. In particular, the protocol points out data that are necessary for the reconstruction process and that should accompany reconstructions. Moreover, standard tests are presented, which are necessary to verify functionality and applicability of reconstruction-derived metabolic models. Finally, this protocol presents strategies to debug non- or malfunctioning models.

Although the reconstruction process has been reviewed conceptually by numerous groups [8–11] and a good general overview of the necessary data and steps is available, no detailed description of the reconstruction, debugging and iterative validation process has been published. This protocol seeks to make this process explicit and generally available.

The presented protocol describes the procedure necessary to reconstruct metabolic networks intended to be used for computational modeling, including the constraint-based reconstruction and analysis (COBRA) approach [11,12]. These network reconstructions, and in silico models, are created in a bottom–up manner based on genomic and bibliomic data, and thus represent a biochemical, genetic and genomic (BiGG) knowledge-base for the target organism [9]. These BiGG reconstructions can be converted into mathematical models and their systems and physiological properties can be determined. For example, they can be used to simulate the maximal growth of a cell in a given environmental condition using flux-balance analysis (FBA) [13,14]. In contrast, the generation of networks derived from top–down approaches (high-throughput [...] and metabolomic data), and converted into a mathematical format [...] here, as they do not generally result in functional, mathematical models.

The metabolic reconstruction process described herein is usually very labor and time intensive, spanning from 6 months for well-studied, medium-sized bacterial genome, to 2 years (and six people) for the metabolic reconstruction of human metabolism [15]. Often, the reconstruction process is iterative, as demonstrated by the metabolic network of Escherichia coli, whose reconstruction has been expanded and refined over the last 19 years [7]. As the number of reconstructed organisms increases, the need to find automated, or at least semi-automated, ways to reconstruct metabolic networks straight from the genome annotation is growing. Despite the growing experience and knowledge, to date, we are still not able to completely automatically reconstruct high-quality metabolic networks that can be used as predictive models. Recent reviews highlight current problems with genome annotations and databases, which make automated reconstructions challenging and thus, require manual evaluation [8,9]. Organism-specific features, such as substrate and cofactor utilization of enzymes, intracellular pH and reaction directionality remain problematic, and thus, requiring manual evaluation. However, some organism-specific databases and approaches exist, which can be used for automation. We describe here the manual reconstruction process in detail.

A limited number of software tools and packages are available (freely and commercially), which aim at assisting and facilitating the reconstruction process (Table 1). This protocol can, in principle, be combined with those reconstruction tools. For generality, we present the entire procedure using a spreadsheet, namely Excel workbook (Microsoft), and a numeric computation and visualization software package, namely Matlab (Mathwork). Free spreadsheets (e.g., Open office and Google Docs) could be used instead of the listed spreadsheet. Alternatively, MySQL databases may be used, as they are very helpful to structure and track data. Matlab was also used to encode the COBRA Toolbox, which is a suite of COBRA [...]

[...] to investigate metabolic capabilities and generate new biological hypotheses. The multitude of possible applications of BiGG [...]

Figure 1 | Overview of the procedure to iteratively reconstruct metabolic networks. In particular, Stages 2–4 are continuously iterated until model predictions are similar to the phenotypic characteristics of the target organism and/or all experimental data for comparison are exhausted.

1. Draft reconstruction (days to weeks)
1| Obtain genome annotation.
2| Identify candidate metabolic functions.
3| Obtain candidate metabolic reactions.
4| Assembly of draft reconstruction.
5| Collect experimental data.

2. Refinement of reconstruction (month to a year)
6| Determine and verify substrate and cofactor usage.
7| Obtain neutral formula for each metabolite.
8| Determine the charged formula.
9| Calculate reaction stoichiometry.
10| Determine reaction directionality.
11| Add information for gene and reaction localization.
12| Add subsystems information.
13| Verify gene−protein-reaction association.
14| Add metabolite identifier.
15| Determine and add confidence score.
16| Add references and notes.
17| Flag information from other organisms.
18| Repeat Steps 6 to 17 for all genes.
19| Add spontaneous reactions to the reconstruction.
20| Add extracellular and periplasmic transport reactions.
21| Add exchange reactions.
22| Add intracellular transport reactions.
23| Draw metabolic map (optional).
24−32| Determine biomass composition.
33| Add biomass reaction.
34| Add ATP-maintenance reaction (ATPM).
35| Add demand reactions.
36| Add sink reactions.
37| Determine growth medium requirements.

3. Conversion into computable format (days to a week)
38| Initialize the COBRA toolbox.
39| Load reconstruction into Matlab.
40| Verify S matrix.
41| Set objective function.
42| Set simulation constraints.

4. Network evaluation (week to months)
43−44| Test if network is mass- and charge-balanced.
45| Identify metabolic dead-ends.
46−48| Gap analysis.
49| Add missing exchange reactions to model.
50| Set exchange constraints for a simulation condition.
51−58| Test for stoichiometrically balanced cycles.
59| Re-compute gap list.
60−65| Test if biomass precursors can be produced in standard medium.
66| Test if biomass precursors can be produced in other growth media.
67−75| Test if the model can produce known secretion products.
76−78| Check for blocked reactions.
79−80| Compute single gene deletion phenotypes.
81−82| Test for known incapabilities of the organism.
83| Compare predicted physiological properties with known properties.
84−87| Test if the model can grow fast enough.
88−94| Test if the model grows too fast.

Data assembly and dissemination (days to a week)
95| Print Matlab model content.
96| Add gap information to the reconstruction output.

total: up to 2 years and 6 people (e.g. human)
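The dead-end check listed under network evaluation (step 45) can be sketched without any toolbox: a metabolite is a dead end if the network can only produce it or only consume it. The toy network below is hypothetical and the function name `dead_end_metabolites` is an illustrative choice; the protocol itself performs this step with the COBRA Toolbox in Matlab, which is not reproduced here.

```python
# Toy network (hypothetical): name -> (stoichiometry, reversible)
REACTIONS = {
    "EX_a": ({"a": 1}, True),           # exchange: a can enter or leave
    "R1": ({"a": -1, "b": 1}, False),
    "R2": ({"b": -1, "c": 1}, False),
    "R3": ({"b": -1, "d": 1}, False),
    "EX_d": ({"d": -1}, False),         # secretion of d
}


def dead_end_metabolites(reactions):
    """Metabolites that are only produced or only consumed."""
    producers, consumers = {}, {}
    for rxn, (stoich, reversible) in reactions.items():
        for met, coeff in stoich.items():
            # A reversible reaction can both produce and consume.
            if coeff > 0 or reversible:
                producers.setdefault(met, set()).add(rxn)
            if coeff < 0 or reversible:
                consumers.setdefault(met, set()).add(rxn)
    mets = set(producers) | set(consumers)
    return sorted(m for m in mets
                  if not producers.get(m) or not consumers.get(m))


print(dead_end_metabolites(REACTIONS))

# Adding a (hypothetical) secretion reaction for c closes the gap.
fixed = dict(REACTIONS, EX_c=({"c": -1}, False))
print(dead_end_metabolites(fixed))
```

In this toy case the dead end is resolved by adding an exchange reaction, which mirrors the "add missing exchange reactions" step; in a real reconstruction each gap needs manual evaluation before anything is added.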
Goal: automatic model reconstruction
Model coverage
Recent advances in reconstruction and applications of
genome-scale metabolic models
Tae Yong Kim1,2, Seung Bum Sohn1,2, Yu Bin Kim1,2,
Won Jun Kim1,2 and Sang Yup Lee1,2,3
In the last decade, reconstruction and applications of genome-scale metabolic models have greatly influenced the field of
systems biology by providing a platform on which high-throughput computational analysis of metabolic networks can
be performed. The last two years have seen an increase in
volume of more than 33% in the number of published genome-scale metabolic models, signifying a high demand for these
metabolic models in studying specific organisms. The diversity
in modeling different types of cells, from photosynthetic
microorganisms to human cell types, also demonstrates their
growing influence in biology. Here we review the recent
advances and current state of genome-scale metabolic
models, the methods employed towards ensuring high quality
models, their biotechnological applications, and the progress
towards the automated reconstruction of genome-scale
metabolic models.
Addresses
1 Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 program), Center for Systems and Synthetic Biotechnology, Institute for the BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 305-701, Republic of Korea
2 BioInformatics Research Center, KAIST, Daejeon 305-701, Republic of Korea
3 Department of Bio and Brain Engineering, BioProcess Engineering Research Center, KAIST, Daejeon 305-701, Republic of Korea
Corresponding author: Lee, Sang Yup ([email protected])
Current Opinion in Biotechnology 2012, 23:617–623
This review comes from a themed issue on Systems biology
Edited by Jens Nielsen and Sang Yup Lee
Available online 4th November 2011
0958-1669/$ – see front matter, © 2011 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.copbio.2011.10.007
Introduction
Genome-scale metabolic models have become an important tool in the study of metabolic networks in biotechnology. The explosion in the number of new genome-scale
metabolic models reconstructed over the last decade, and
in particular in last several years, is a proof of its great
usefulness in the study and applications of biological
systems (Figure 1). It also highlights the increasing import-
metabolic models were employed towards understanding
the characteristics of microbial pathogens at genome-scale,
which was followed by developing strategies for metabolically engineering microbial hosts for enhanced production of various bioproducts. As developing superior
microorganisms for biorefinery applications have become
increasingly important, these metabolic models are widely
used in metabolic engineering studies to overcome the
limitations of established knowledge on the metabolic
network and to identify new non-intuitive metabolic reactions to be engineered for further improvement of strains.
Availability of ever increasing number of genome-scale
metabolic models is of course important, but the quality of
these metabolic models is more important. Validation of
these metabolic models ensures the quality of the metabolic model and their ability to correctly predict the
physiological characteristics of the organism. This entails
the use of experimental data that can be compared against
the predicted physiological characteristics of the metabolic
model. By comparing the simulated physiological characteristics with the observed experimental results, the
accuracy of the metabolic model can be improved. Furthermore, algorithms have been developed to incorporate other
aspects of cellular characteristics, other than metabolic
functions, to increase the accuracy of the model.
Recently, the genome-scale metabolic models have
become more refined and complex, allowing for the
expanded scopes in their applications. Algorithms have
been developed to examine metabolic models from various
angles; for instance, calculating the redistribution of the
metabolic flux in response to genetic or environmental
perturbation [1,2]. Reconstruction of metabolic models of
yeast species has been employed to investigate the production of heterologous therapeutic proteins that are unsuitable for production in bacterial hosts owing to the absence
of eukaryotic post-translational modification mechanisms
[3]. Pathogenic metabolic models allow for the development of novel drugs to combat infection with minimal side
effect to the host [4•,5]. The metabolic models of mammals,
such as Homo sapiens, have been employed to study various
human diseases and develop strategies for potential treatments [6,7••].
The advantages acquired by employing genome-scale
metabolic models have consequently driven the develop-
Automatic model reconstruction
Gap-filling by genome annotation
Database and information integration
Automated reconstruction & prediction of novel network components
Computational complexity
Manual (assisted) network reconstruction
Computational challenge: combinatorial optimization
Iterative model updates: E. coli
© 2008 Nature Publishing Group

[...] strated that model-directed strain design can lead to increased metabolite production. In these studies, the E. coli GEM is principally used to analyze the metabolite production potential of E. coli and identify metabolic interventions needed to produce the metabolite of interest. Thus, E. coli strains have been systematically designed through in silico analysis to overproduce target metabolites such as lycopene, lactic acid (our group), ethanol, succinic acid, L-valine, L-threonine, additional amino acids, as well as diverse products from hydrogen to vanillin.

[...] of bacterial evolution. The in silico methods used to probe the E. coli GEM in each study are summarized in Figure [...]. These methods perform an assessment of the solution spaces associated with the mathematical representation of a reconstruction; they are categorized as either unbiased or biased. The latter category relies on an observer bias that is stated through an objective function (that is now beginning to be experimentally examined) that has been utilized in most of the studies reviewed here for the general application of flux balance analysis (FBA). Each category [...]

[Figure: number of reactions, genes and metabolites (y-axis: Number) for each reconstruction over time (x-axis: Year). Milestones: Majewski & Domach; Varma & Palsson; Pramanik & Keasling; Edwards & Palsson; Reed; Feist. Expansions listed in the figure:]
• Compartmentalized reconstruction (distinct periplasm)
• Extensive cell wall metabolism (phospholipids, murein)
• Reaction thermodynamics
• Alternate carbon utilization
• Quinone characterization
• Elemental and charge balancing
• Fatty acid metabolism
• Expanded cellular transport systems
• Used genome as a scaffold
• Cell wall constituent biosynthesis
• Cofactor biosynthesis
• Growth-dependent biomass objective function
• Amino acid and nucleotide biosynthesis

Figure [...]: The iterative reconstruction and history of the E. coli metabolic network. Six milestone efforts are shown that contributed to the reconstruction of the E. coli metabolic network. For each of the six reconstructions, the number of included reactions (blue diamonds), genes (green triangles) and metabolites (purple squares) are displayed. Also listed are noteworthy expansions that each successive reconstruction provided over previous efforts. For example, Varma & Palsson included amino acid and nucleotide biosynthesis pathways in addition to the content that Majewski & Domach characterized. The start of the genomic era marked a significant increase in included reconstruction components for each successive iteration. The reaction, gene and metabolite values for pre-genomic-era reconstructions were estimated from the content outlined in each publication and in some cases [...]
linear updates: hardly any problems
Biotechnology Advances 30 (2012) 979–988
Iterative model updates: Yeast
Research review paper
Fifteen years of large scale metabolic modeling of yeast: Developments and impacts
Tobias Österlund, Intawat Nookaew, Jens Nielsen ⁎
Department of Chemical and Biological Engineering, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden
Article info
Available online 6 August 2011
Keywords:
Genome-scale metabolic model
Systems biology
Metabolic engineering
Computational algorithms
Evolution
Abstract
Since the first large-scale reconstruction of the Saccharomyces cerevisiae metabolic network 15 years ago the
development of yeast metabolic models has progressed rapidly, resulting in no less than nine different yeast
genome-scale metabolic models. Here we review the historical development of large-scale mathematical
modeling of yeast metabolism and the growing scope and impact of applications of these models in four
different areas: as guide for metabolic engineering and strain improvement, as a tool for biological
interpretation and discovery, applications of novel computational framework and for evolutionary studies.
© 2011 Elsevier Inc. All rights reserved.
Contents
1. Introduction
2. Framework for reconstructing genome-scale metabolic models
2.1. Metabolic network reconstruction
2.2. Mathematical formulation and debugging
2.3. Validation with experimental data
3. Development of yeast metabolic models
3.1. Models of central carbon metabolism
3.2. Genome-scale metabolic models
4. Applications
4.1. Guidance for metabolic engineering and strain improvement
4.2. Biological interpretation and discovery
4.3. Applications of novel computational framework
4.4. Evolutionary elucidation
5. Conclusions
Acknowledgment
References
1. Introduction
The yeast Saccharomyces cerevisiae serves as an important cell factory
in biotech production of food, beer, wine, nutraceuticals, pharmaceuticals,
chemicals and fuels. It is also a very important model organism for
eukaryal biology as it has a number of features that are conserved with
higher eukaryotes, including humans. Its genome was among the first to
be completely sequenced (Cherry et al., 1997; Goffeau et al., 1996) and
many functional genomics tools have been pioneered using this yeast as a
model organism (Chien et al., 1991; Winzeler et al., 1999; Wodicka et al.,
1997). Thus there are comprehensive databases available, including the
highly structured Saccharomyces Genome Database (SGD) (www.yeastgenome.org)
(Weng et al., 2003). Many different yeast strains have
been sequenced with the objective to understand evolution towards
different kinds of applications (Borneman et al., 2008; Legras et al., 2007;
Rainieri et al., 2006), e.g. wine, bread and beer production, and for
providing a basis for advancing metabolic engineering (Otero et al., 2011).
With the advancement of systems biology, in particular for gaining new
insight from high-throughput experimental data, S. cerevisiae has also
played an important role (Mustacchi et al., 2006; Nielsen and Jewett,
2008). In this interface between experiments and mathematical modeling
the concept of genome-scale metabolic models (Covert et al., 2001a) plays
an important role, as it allows for direct integration of high-throughput
experimental data with mathematical modeling, and hence advance our
⁎ Corresponding author. Tel.: +46 31 772 3804; fax: +46 31 772 3801.
E-mail address: [email protected] (J. Nielsen).
0734-9750/$ – see front matter © 2011 Elsevier Inc. All rights reserved.
doi:10.1016/j.biotechadv.2011.07.021
many different branches: many problems
Application: Drug discovery
motivation: resistance against existing drugs
Application: Drug discovery - malaria
• metabolic model
• essential genes
• non-human homologs
• check for reported drugs in other pathogens
• biol. experiment (no growth / growth)
• targeted experiments

Reconstruction and flux-balance analysis of the Plasmodium falciparum metabolic network
Germán Plata1,2,6, Tzu-Lin Hsiao1,3,6, Kellen L Olszewski4,5, Manuel Llinás4,5,* and Dennis Vitkup1,3,*
1 Center for Computational Biology and Bioinformatics, Columbia University, New York City, NY, USA, 2 Integrated Program in Cellular, Molecular, S[...] Genetic Studies, Columbia University, New York City, NY, USA, 3 Department of Biomedical Informatics, Columbia University, New York City, NY, USA, 4 [...] of Molecular Biology, Princeton University, Princeton, NJ, USA and 5 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, [...]
6 These authors contributed equally to this work
* Corresponding authors. D Vitkup, Department of Biomedical Informatics, Center for Computational Biology and Bioinformatics, Columbia University, 11[...] Avenue [...] 803A, New York City, NY 10032, USA. Tel.: +1 212 851 5151; Fax: +1 212 851 5149; E-mail: [email protected] or M Llinás, Departmen[...] Biology, Lewis-Sigler Institute for Integrative Genomics, Princeton University, 246 Carl Icahn Lab, Princeton, NJ 08544, USA. Tel.: +1 609 258 9391 [...] Fax: +1 609 258 3565. E-mail: [email protected]
Molecular Systems Biology 6; Article number 408; doi:10.1038/msb.2010.60
Citation: Molecular Systems Biology 6:408
© 2010 EMBO and Macmillan Publishers Limited. All rights reserved 1744-4292/10
www.molecularsystemsbiology.com

[Figure, panel A: P. falciparum NAD recycling pathway with Host, NM, NA, NMase, NPRT, NaMN, NMNAT, NaAD, NADS, NAD+, NADK, NADP+, histone acetylation, and the structure of Compound 1_03. Panel B: Control versus 100 µM cpd 1_03 at Ring (12 h), Troph (24 h), Late troph (36 h), Schizont (48 h) and Ring (66 h, reinvasion); treated parasites remain at the Troph stage.]
Figure 2 Small-molecule inhibition of the parasite nicotinate mononucleotide adenylyltransferase (NMNAT). (A) Schematic of the P. falciparum NAD recycling pathway determined from the genome sequence. Nicotinamide (NM) and nicotinic acid (NA) can be scavenged from the host. Compound 1_[...] targeting NMNAT. (B) Compound 1_03 causes growth arrest of intraerythrocytic P. falciparum. Cultures were resuspended in niacin-free medium conta[...] of compound 1_03 at early ring stage and observed for 66 h (see Materials and methods). Untreated parasites undergo normal development and re[...] drug-treated parasites arrest at the trophozoite ('troph') stage and do not reinvade. NM, nicotinamide; NA, nicotinic acid; NaMN, nicotinate monon[...] nicotinate adenine dinucleotide; NAD(P)+, nicotinamide adenine dinucleotide (phosphate), reduced; NMase, nicotinamidase; NPRT, nicotinate [...] transferase; NMNAT, nicotinate mononucleotide adenylyltransferase; NADS, NAD synthase; NADK, NAD kinase.

[...] schizont developmental stages (see Materials and methods). Following Colijn et al (2009), the maximum flux allowed through enzymes was constrained proportionally to the relative expression level of the corresponding genes. We compared the accuracy of our predictions to the experimentally measured metabolic changes in Plasmodium-infected RBCs (Olszewski et al, 2009). In Figure 3, we show the predicted and experimentally measured changes, indicating [...]

[...] made with the shuffled data were higher than th[...] using the original expression values (Supplem[...] S3). To explore the effects of multiple optimal [...] (Mahadevan and Schilling, 2003) on the predict[...] we used the centering hit-and-run algorithm ([...] Smith, 1998), implemented in the COBRA too[...] et al, 2007), to randomly sample the solution spa[...] with the expression constraints. The 70% ac[...]
Difficult problems/challenges
• combinatorial explosion
• reaction direction assignment
• compartmentalization of eukaryotic models
• automation of reconstruction protocol for eukaryotes
• integration of experimental data and predictions
• reconstruction of models with few experimental data
• tissue-specific models
• multicellular models
• whole organism models