Unraveling Bacterial Growth Laws - Dipartimento di Fisica

Unraveling Bacterial Growth Laws:
Coupling Energetics and Regulation in Cell Metabolism
Scuola di dottorato Vito Volterra
Dottorato di Ricerca in Fisica – XXVII Ciclo
Candidate
Matteo Mori
ID number 1178192
Thesis Advisors
Prof. Enzo Marinari
Prof. Andrea De Martino
A thesis submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy in Physics
December 2014
Thesis not yet defended
Unraveling Bacterial Growth Laws: Coupling Energetics and Regulation in Cell
Metabolism
Ph.D. thesis. Sapienza – University of Rome
© 2014 Matteo Mori. All rights reserved
This thesis has been typeset by LaTeX and the Sapthesis class.
Author’s email: [email protected]
To all those who have put up with me during these years.
Don't make me thank you one by one,
since it would all take too long,
and enough writing already, come on.
Abstract
In this dissertation I study genome–scale modelling of metabolic networks, coarse
grained proteome allocation models, and their integration. Constraint–based modelling
is nowadays widely used in molecular biology and bioengineering due to its simplicity.
I underline the importance of thermodynamic constraints in such models of metabolism
and describe a new technique for identifying flux cycles in metabolic networks.
The main result of my Ph.D. studies is a new framework, called Constrained
Allocation Flux Balance Analysis (CAFBA). CAFBA integrates constraint–based
modelling with a simple model of proteome allocation developed by T. Hwa and
coworkers since 2010, based on empirical growth laws; the resulting algorithm is a
simple linear programming problem in the case of biomass production rate optimization.
CAFBA allows for accurate modelling of overflow metabolism in Escherichia
coli, and a few parameters can be adjusted to fit strain–specific differences, such as
the maximum growth rate. Our results show that overflow metabolism is the result
of the tradeoff between high–yield metabolism and its cost in terms of protein production,
and suggest some unexplored correlations among reaction fluxes, enzyme
levels, kinetic constants, and metabolite levels.
Despite the success of the proteome growth laws in describing, in a simple and
effective way, how the proteome adjusts to cope with different sources of growth
limitation, some aspects remain puzzling. In particular, the proteome appears to be
inefficiently used. I explored the role of such proteome excess during nutritional
shifts, that is, sudden variations in nutrient availability. The results suggest that part
of the proteome is purposely allocated to speed up growth during transitions from
poor to rich substrates. This is confirmed by the study of the optimality of such excess
in upshift scenarios.
Contents
List of Figures
Introduction
1 Modeling bacterial growth: an overview
1.1 Bacterial physiology
1.1.1 Metabolism
1.1.2 Growth laws
1.2 Genome scale modeling of metabolic networks
1.2.1 Fluxes normalization and dilution terms
1.2.2 Fundamental subspaces of the stoichiometric matrix
1.2.3 Flux Balance Analysis
1.2.4 Other frameworks
2 Thermodynamics in metabolic networks
2.1 Thermodynamics in a chemical network
2.1.1 Gibbs ensemble and chemical reactions
2.1.2 Chemical potentials and concentrations
2.1.3 Second principle of thermodynamics, directionality of fluxes and non–equilibrium steady state
2.1.4 Duality theorems
2.2 Algorithm for identifying and correcting infeasible loops
2.2.1 Structure of the Algorithm
2.2.2 Checking Thermodynamic Viability by Relaxation
2.2.3 Identifying Loops by Monte Carlo
2.2.4 Correcting the Flux Configuration: a Toy Model
2.2.5 Correcting the Flux Configuration: Local Strategy
2.2.6 Correcting the Flux Configuration: Global Strategy
2.3 Loops in the E. coli Network iAF1260
2.4 Correcting infeasible flux configurations in the Recon-2 human networks
2.4.1 Inconsistencies in the FBA Solution for the Overall Human Reactome
2.4.2 Correcting Infeasible Loops in FBA Solutions for Cell-Type Specific Human Metabolic Networks
2.5 Discussions
3 Constrained Allocation Flux Balance Analysis
3.1 Introduction
3.2 Proteome allocation constraint
3.2.1 Enzyme–limited kinetics
3.2.2 Thermodynamics and estimates of enzyme levels
3.2.3 Interpretation of the constitutive relation as lower bound on the enzyme proteome fractions
3.3 Constrained Allocation FBA
3.3.1 Details on CAFBA implementation
3.4 Results: Homogeneous case
3.4.1 Choice of the control parameter: flux versus weight
3.4.2 CAFBA and the growth laws
3.5 CAFBA as tradeoff between two optimal solutions
3.6 Inhomogeneous case
3.6.1 Results
3.6.2 Fixing the maximum growth rate for a synthetic strain
3.6.3 Optimal regulation for different carbon sources
3.7 Growth rate-dependent biomass composition
3.7.1 RNA to protein ratio
3.7.2 Implementation of a variable biomass in (CA)FBA
3.7.3 Energy requirements
3.8 Discussion
3.8.1 Conclusions
4 Beyond steady state
4.1 What are the proteome offsets for?
4.2 Coarse–grained model of protein synthesis and allocation
4.3 Upshifts
4.3.1 Estimate for the R–sector offset
4.3.2 Efficiencies and the other offsets
4.3.3 Upshifts: Lower bounds to other offsets
4.4 Experimental results
4.5 Upshift fitness landscape
4.5.1 Analytical solution in a simple case
4.5.2 Comparing different environments
4.6 Discussion
5 Statics and dynamics in a small model
5.1 Steady state
5.1.1 Randomization
5.2 Kinetic model
5.2.1 Optimal proteome allocation in a dynamical environment
5.2.2 Results for constant G/T ratio
5.2.3 Optimal G/T tradeoff exploitation
5.3 Discussion
Conclusion
List of Figures
1.1 Schematic representation of a prokaryote bacterium.
1.2 Central dogma of molecular biology.
1.3 Scheme of intermediate metabolism.
1.4 Growth laws, RNA/protein ratio.
1.5 Illustration of Flux Balance Analysis.
2.1 Flowchart of the algorithm for counting and removing cycles from a flux configuration.
2.2 Toy reaction network.
2.3 Loops in E. coli iAF1260 model as a function of the number of random configurations tested.
2.4 Lengths of loops in E. coli iAF1260 model.
2.5 Example of loop created by two transport reactions and ATP hydrolysis.
3.1 Molecular masses and turnover numbers of enzymes in E. coli.
3.2 CAFBA results using E. coli enzyme molecular masses and turnover numbers.
3.3 CAFBA main results in the homogeneous (no randomization) case.
3.4 Illustration of CAFBA optimization and glucose uptake flux.
3.5 Comparison between different procedures to obtain growth limitation reducing carbon intake. Growth rate and acetate excretion.
3.6 Comparison between different procedures to obtain growth limitation reducing carbon intake. Fluxes.
3.7 Linear fits used to determine the analytical form of growth rate with respect to wC and wR.
3.8 Surface plot of growth rate and acetate excretion as a function of kC = 1/wC and kR = 1/wR.
3.9 Optimal proteome allocation in C– and R–limitation.
3.10 R–sector and fluxes in R–limitation.
3.11 Growth rate and fluxes in Q–limitation.
3.12 Overlap between FBA and CAFBA solutions.
3.13 Feasibility region in the λ–φE plane.
3.14 Optimal fluxes in the λ–φE plane. CAFBA solution is also shown.
3.15 CAFBA main results in the randomized case. CAFBA fits acetate excretion and growth yield data from Basan et al. (2014).
3.16 CAFBA fits acetate excretion and growth yield data from Vemuri et al. (2006).
3.17 Comparison between different distribution functions for the weights in the proteome allocation constraint.
3.18 CAFBA shows that NADH/NADPH transhydrogenase fluxes reverse their overall direction in carbon limitation.
3.19 Flux differences for all reactions in the main catabolic pathways in carbon (glucose) limitation.
3.20 Scatter plots of CAFBA solutions.
3.21 Histograms of CAFBA solutions obtained at high and medium growth rates.
3.22 Exploring different fluctuation ranges: fluxes.
3.23 Exploring different fluctuation ranges: L1 norm and growth yield.
3.24 CAFBA fluxes for different carbon sources.
3.25 Acetate excretion in C– and Q–limitation, using four different carbon sources.
3.26 Growth rate dependent biomass: Convergence of the iterative procedure.
3.27 Growth rate dependent biomass: Biomass prescription functions.
3.28 Growth rate dependent biomass: RNA/proteome ratio.
3.29 Growth rate dependent biomass: CAFBA solutions for three different values of growth–rate dependent ATP hydrolysis flux.
4.1 A: Proteome partition model. B: Experimental data and fit. C: Fitness landscape.
4.2 Single upshift experiment.
4.3 Analytical results for offsets optimality.
4.4 Fitness landscape as a function of normalized offset and upshift time.
4.5 Fitness landscape as a function of upshift time and initial carbon source quality.
4.6 Fitness landscape as a function of initial and final carbon source quality.
5.1 Coarse grained model of metabolism.
5.2 FBA and CAFBA solutions of the coarse grained model.
5.3 Numerical solutions of the kinetic model.
5.4 Results of growth laws optimization as a function of period T and nutrient availability s.
5.5 Optimal proteome partitioning for different T and s.
5.6 Optimal G/T protein partitioning in the kinetic model.
Introduction
Twentieth century biology has been strongly influenced by the reductionist approach
of molecular biology [130, 134]. Starting from the cell hypothesis at the beginning
of the XIX century, the microscopic constituents of living organisms have been unveiled,
and their function progressively clarified. Even though this approach has proved
extremely useful for understanding the basic mechanisms of life, some properties
of living organisms can be studied only by considering the living system as a whole.
Systems biology [63, 95, 18, 80] is a recent branch of biology which endorses a
holistic approach to the study of processes in living organisms. Its aim is the study
of cell–scale phenomena and their relations with the constituents of the system;
in particular, systems biology studies networks (e.g. metabolic, signalling and
regulation networks, and protein–protein interaction networks) rather than the single
network constituents, hence the term network biology [6]. The availability of
high–throughput data (e.g. shotgun genome sequencing, transcriptome profiling with
DNA microarrays or proteome profiling using mass spectrometry) has made it possible
in recent years to build detailed models of metabolism, regulation and proteome
expression at the whole cell level. On the one hand, this allows one to understand how
functional properties and the behaviour of living organisms are brought about by
interactions of their constituents (bottom–up approach), and on the other hand to
unravel the molecular components and logic that underlie cellular processes
(top–down).
A central issue is the formulation of system–scale models which are able to bridge
the gap between networks as a whole and the details of their constituents. Such
effective models are often very hard to formulate in a meaningful manner, because
biological phenomena involve many components and many timescales,
making it hard to disentangle the whole cell into separate networks. Thus, the
problem is to find the description that best isolates the phenomenon
of interest from the rest of the system and the environment. Furthermore, the
resulting model can be overwhelmingly complex, offering little insight into
the phenomena and behaviours associated with the system under study.
Simple laws can sometimes emerge from complexity, often due to a combination
of physical constraints and evolutionary pressure [15]. An example of such simple
laws is the characterization of protein expression in terms of
a coarse grained proteome partitioning model. This model is described in detail in
Chapter 1, and provides the starting point for the work illustrated in Chapters 3 to 5.
This Thesis is organized as follows. Chapter 1 provides a twofold introduction
to the topics treated in the rest of the Thesis. Its first part provides an introduction
to the model bacterium Escherichia coli and the basic concepts of gene expression
and regulation. We also discuss bacterial metabolism and some quantitative studies
of bacterial growth, starting from J. Monod [85] and continuing to the work
by T. Hwa and collaborators [115, 137]. In particular, the bacterial proteome can be
divided into proteome sectors, each one with a distinct response to environmental
perturbations; notably, these proteome shares appear to be simply related to the growth
rate for each different kind of growth limitation, pointing to the existence of a few
global regulation circuits.
Genome scale modelling of metabolic networks is discussed in the second half of
the Chapter, with a detailed exposition of constraint–based modelling of metabolic
networks. This framework, whose most famous application is the so–called
Flux Balance Analysis [95, 94], is extremely popular because its main
input is the topology of the reaction network, and it does not rely on metabolite
concentrations or reaction kinetics. Possible steady states can be studied either
via sampling or by picking a particular solution using some optimization principle.
In the case of bacterial cells, the most common optimization principle is the
maximization of the biomass production rate, which gives good results in predicting
the growth rate of real cells in exponential growth. On the other hand, real cells
can exhibit apparently suboptimal properties, such as overflow metabolism, and
Flux Balance Analysis is not able to model such behaviours without introducing
some ad hoc constraints.
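As a compact reminder, the optimization just described is usually written as a linear program. The notation below is an assumption made for illustration and is not taken from this Introduction: S denotes the stoichiometric matrix, v the vector of reaction fluxes, c a vector selecting the biomass production flux, and l, u the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0 , \\
& l_i \le v_i \le u_i \quad \text{for every reaction } i .
\end{aligned}
```

The equality constraint expresses the steady state of the metabolite pool, while the bounds encode reversibility and uptake limits; the ad hoc constraints mentioned above would enter as additional bounds or inequalities.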
Chapter 2 deals with thermodynamic constraints in constraint–based models
of metabolism. There is a twofold relationship between thermodynamics and reaction
networks. On the one hand, knowing the chemical potentials of the metabolites
fixes the direction of reaction fluxes without the need to know the kinetic
mechanism underlying each reaction; on the other, a flux configuration constrains the
chemical potentials. In particular, some flux configurations are not compatible with
any set of chemical potentials for the metabolites, and are therefore forbidden. One
can show that such thermodynamically infeasible flux configurations must contain
flux loops, which provide a signature of inconsistency. After a general introduction
to thermodynamics and reaction networks, we will focus on the identification of
loops using a mixed relaxation/Monte Carlo approach, and on the removal of such
loops from flux patterns.
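The link between loops and infeasibility can be illustrated in a deliberately simplified setting. The sketch below assumes that every reaction converts exactly one metabolite into one other, so that a net flux from A to B requires the chemical potential of A to exceed that of B; compatible potentials then exist only if the graph of flux directions has no directed cycle. This is not the relaxation/Monte Carlo algorithm of Chapter 2, and the function name and toy networks are hypothetical.

```python
def find_flux_loop(fluxes):
    """Return a list of metabolites forming a directed flux loop, or None.

    `fluxes` maps (substrate, product) pairs to a net flux. Assumption: each
    reaction has one substrate and one product, so a positive flux requires
    mu[substrate] > mu[product]; a loop makes these inequalities unsatisfiable.
    """
    # Orient every reaction along its net flux; zero fluxes impose no constraint.
    graph = {}
    for (a, b), v in fluxes.items():
        if v > 0:
            graph.setdefault(a, []).append(b)
        elif v < 0:
            graph.setdefault(b, []).append(a)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # back edge: closed a directed cycle
                return path[path.index(nxt):]
            if state == WHITE:
                loop = visit(nxt)
                if loop:
                    return loop
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour.get(node, WHITE) == WHITE:
            loop = visit(node)
            if loop:
                return loop
    return None


# Hypothetical toy networks: the first cycles A -> B -> C -> A, the second does not.
infeasible = {("A", "B"): 1.0, ("B", "C"): 2.0, ("C", "A"): 0.5}
feasible = {("A", "B"): 1.0, ("B", "C"): 2.0, ("C", "A"): -0.5}
print(find_flux_loop(infeasible))
print(find_flux_loop(feasible))
```

For reactions with several substrates and products a plain graph search is no longer sufficient, which is one reason a more general method is needed at genome scale.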
Chapter 3 describes the integration of the coarse grained model of proteome
allocation described in the first Chapter within a genome scale, constraint–based
model of Escherichia coli metabolism. The resulting framework, Constrained
Allocation Flux Balance Analysis (CAFBA), is discussed in detail, and its predictions
are compared to experimental data. Our implementation of CAFBA differs from
standard Flux Balance Analysis in three fundamental aspects: 1) the presence
of a proteome allocation constraint, based on a fourfold partitioning of the
proteome; 2) the control variable, which is the amount of proteins dedicated to carbon
(e.g. glucose) transport and metabolism, instead of the carbon flux itself; and 3) a
randomization procedure for the protein costs, which is used to assess the
sensitivity of the model with respect to parameter fluctuations. In particular, the average
values of the fluxes agree well with experimental data on acetate excretion and
growth yield. CAFBA allows for a detailed description of the switch from overflow
metabolism at high growth rates to a more efficient (respiratory) metabolism
when nutrient quality is poor, a feature which is missing in standard Flux Balance
Analysis.
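The tradeoff behind this switch can be illustrated with a two-pathway toy calculation. The sketch below is not the CAFBA algorithm: it assumes a single linear proteome budget, a high-yield but protein-expensive "respiration" pathway and a low-yield but cheap "fermentation" pathway, with all yields and costs chosen arbitrarily. The parameter w_c plays the role of the protein cost of carbon uptake, so a large w_c mimics a poor nutrient.

```python
def best_growth(w_c, pathways, w_ribo=0.2, phi_max=1.0):
    """Maximize growth under one proteome budget (toy model, hypothetical values).

    For a pathway with yield Y and protein cost w_path per unit carbon flux v:
        growth = Y * v
        budget = (w_c + w_path) * v + w_ribo * growth <= phi_max
    With a single linear budget, the optimum puts all flux into the pathway
    giving the largest growth per unit of proteome.
    """
    best_rate, best_name = 0.0, None
    for name, (yield_, w_path) in pathways.items():
        cost_per_flux = w_c + w_path + w_ribo * yield_
        rate = yield_ * phi_max / cost_per_flux
        if rate > best_rate:
            best_rate, best_name = rate, name
    return best_rate, best_name


# (yield, protein cost per unit flux); illustrative numbers only.
PATHWAYS = {"respiration": (1.0, 1.0), "fermentation": (0.5, 0.1)}

for w_c in (0.05, 0.5, 5.0):  # rich -> poor carbon source
    rate, name = best_growth(w_c, PATHWAYS)
    print(f"w_c={w_c}: growth={rate:.3f} via {name}")
```

In this toy the switch is abrupt because only one constraint is active; a gradual onset of acetate excretion requires the richer structure of the genome scale model.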
Proteome allocation in E. coli appears to be suboptimal in the light of the
proteome partitioning model. Some baseline proteome fractions (“offsets”) are present
for each proteome group, reducing the steady state growth rate. In Chapter 4 we
investigate this puzzle by connecting the growth laws to dynamics, and investigating
the role of the offsets in the cellular response to nutritional shifts, that is, sudden
changes of the medium the cells are growing in. The connection is provided by the
“instantaneous” model formulated by Dennis and Bremer [38] to model proteome production
during nutritional shifts. Experimental upshift measurements (an upshift being a
nutritional shift from poor to high–quality carbon nutrients) provide estimates of the offsets
for various sectors which agree well with data obtained from steady state
measurements. Moreover, the optimality of the growth laws is discussed using the instantaneous
model; the analysis suggests that such offsets may be optimized to maximize growth
during upshifts.
Chapter 5 is dedicated to the application of the ideas discussed in Chapters 3
and 4 to a minimal model of metabolism, including carbon transport across the
cell membrane, respiration and fermentation metabolism, and biomass production.
Applying CAFBA to this small network makes it possible to carry out some analytical
calculations, and to gain insight into the behaviour of CAFBA when applied to the genome
scale model. Kinetics for both reactions and proteome allocation can be introduced,
and the resulting kinetic model can be used to test how the growth laws for the
various proteome sectors fit in a given environment. Preliminary results show
substantial agreement between the empirical growth laws, CAFBA results and the
growth laws obtained by optimizing bacterial mass production in an environment in
which nutrient availability varies periodically.
Publications
Chapter 1 does not contain original material, since it only serves as introduction and
review of the state of the art in bacterial physiology and constraint–based modelling.
Chapter 2 contains material from Ref. [35]. Chapter 3 contains material from Refs. [86]
and [87]. Chapter 4 contains material from Ref. [43]. Chapter 5 mostly contains
work–in–progress material, part of which will be included in Ref. [87].
Ref. [86] is to be submitted to the journal Molecular Systems Biology along with
an experimental work by Basan et al. [8], focusing on the respiration/fermentation
switch.
During my Ph.D. studies I also studied uniform sampling of steady state fluxes
using Hit–and–Run Markov Chain Monte Carlo dynamics. Sampling flux space is
usually very difficult due to the ill–conditioning of the flux space (a convex
polytope). In this work the ill–conditioning is removed by using three different rounding
methods, making it possible to sample genome–scale flux spaces effectively. Hence,
our implementation of the Hit–and–Run algorithm can be used as a benchmark to
evaluate the performance of other samplers, such as gpSampler or optGpSampler. This
topic will not be covered in this Thesis, so we refer the interested reader to the
article [37].
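For readers unfamiliar with the method, a minimal Hit–and–Run sketch is given below. It is an illustration under simplifying assumptions and not the implementation of Ref. [37]: the polytope is written as {x : A x <= b}, it is assumed to be bounded with a known interior starting point, and no rounding step is performed. All names are hypothetical.

```python
import math
import random


def hit_and_run(A, b, x0, n_samples, rng=None):
    """Sample points from the bounded polytope {x : A x <= b}.

    x0 must lie strictly inside the polytope. No rounding (preconditioning)
    is applied, so mixing can be slow for elongated, ill-conditioned polytopes.
    """
    rng = rng or random.Random(0)
    x = list(x0)
    samples = []
    for _ in range(n_samples):
        # Draw a direction uniformly on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(di * di for di in d))
        d = [di / norm for di in d]
        # Find the segment {x + t d : t_min <= t <= t_max} inside the polytope.
        t_min, t_max = -math.inf, math.inf
        for row, bi in zip(A, b):
            slope = sum(a * di for a, di in zip(row, d))
            slack = bi - sum(a * xi for a, xi in zip(row, x))
            if slope > 1e-12:
                t_max = min(t_max, slack / slope)
            elif slope < -1e-12:
                t_min = max(t_min, slack / slope)
        # Move to a uniformly chosen point on that segment.
        t = rng.uniform(t_min, t_max)
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(x)
    return samples


# Usage on the unit square 0 <= x, y <= 1 (toy example).
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]
points = hit_and_run(A, b, [0.5, 0.5], 1000)
print(len(points), points[-1])
```

On a flux polytope that is very thin along some directions, most random chords are short and the chain moves slowly, which is the ill–conditioning that rounding is meant to remove.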
Chapter 1
Modeling bacterial growth: an overview
Escherichia coli is the prototypical prokaryote, having been extensively studied for more
than 60 years. Nowadays, it has an important role in biotechnology, as it is used as
a host for recombinant DNA sequences in order to produce recombinant proteins
[5, 22], or to produce metabolites on an industrial scale (e.g. biofuel [17]), often
when chemical production is not possible (e.g. human insulin [53]).
Since this is not the place for a detailed introduction to cell biology, we will only
sketch some relevant facts, referring the interested reader to textbook references.
A comprehensive review of bacterial physiology and metabolism can be found in
Ref. [62]. A brief but complete historical review of the first quantitative studies on
microbial growth can be found in Ref. [96]. Classical textbooks on constraint–based
modelling are Refs. [12, 95].
1.1 Bacterial physiology
The fundamental unit of life is the cell. All living cells (not viruses, for instance)
share some basic components:
• The cell membrane separates the interior of the cell (the cytoplasm) from the
outside environment. Cell membranes are double layers of phospholipids and
are selectively permeable, allowing molecules to flow between the cytoplasm and
the environment. They serve as support for a variety of other structures, such
as the cell wall and membrane proteins.
• DNA stores the genetic information used in the development and functioning
of all organisms. It contains genes, which are DNA regions associated to the
basic information units.
• RNA performs a variety of functions, ranging from protein synthesis
(messenger, transfer and ribosomal RNA) to regulation (e.g. antisense RNA).
• Proteins are formed by polypeptides, that is, amino acid monomers linked by
peptide bonds, which participate in virtually any biological function, ranging
from catalysis of chemical reactions to regulation and duplication. The term
“protein” actually refers to the folded polypeptides only, when they are able
to perform their functions properly.
[Figure 1.1: diagram with labels plasma membrane, capsule, cell wall, plasmid, pili, nucleoid (DNA), ribosomes, flagellum, cytoplasm]
Figure 1.1. Schematic representation of a prokaryote bacterium. The cell (plasma)
membrane is surrounded by a peptidoglycan cell wall and, possibly, by a capsule. E. coli has
more than one flagellum, projecting in all directions.
Prokaryote cells have a very simple structure, as shown in Fig. 1.1. All intracellular
water soluble constituents (proteins, DNA, RNA, metabolites, etc.) are located
together in the same volume enclosed by the cell membrane, rather than in separate
cell compartments. In fact, they do not contain mitochondria or chloroplasts, with
oxidative phosphorylation and/or photosynthesis taking place at the cell membrane;
nor do they have a nucleus, with DNA arranged in a structure called the nucleoid, which
lacks any nuclear envelope.
The information contained in the DNA is used in a process called gene
expression, which is the synthesis of a functional gene product. Such gene products are
proteins and functional RNA (which is the product of non–coding genes). The
typical information flow is described by the so–called central dogma of molecular
biology, see Fig. 1.2. The information stored in DNA genes is first copied
(transcribed) into single–stranded messenger RNA (mRNA) molecules by the enzyme
RNA polymerase. Then, mRNA strands are translated into peptide chains by
dedicated organelles called ribosomes. Ribosomes are large macromolecular complexes (weighing
around 2700 kDa) formed by proteins and ribosomal RNA (rRNA), which may
occupy a large share (50%) of the cell’s proteome mass.
Information flow can be altered in a number of ways through regulation.
Dedicated proteins (e.g. transcription factors) and RNA molecules (e.g. antisense RNA)
can affect the expression of a gene. Therefore, proteome production has some
degree of flexibility: regulatory circuits can express specific genes in response to precise
stimuli (e.g. starvation, heat, ...).
[Figure 1.2: diagram titled “Gene expression and regulation”, showing DNA, RNA and proteins connected by transcription, translation, reverse transcription and RNA replication]
Figure 1.2. The central dogma of molecular biology is a framework for understanding the
residue–by–residue transfer of sequence information in living organisms. At the most
basic level, information is stored within DNA, then transferred to RNA molecules (through
transcription) and to proteins (through translation). Other information flows (most
notably RNA replication and reverse transcription) only occur in special cases, for example
when retroviruses infect a host cell. Gene expression can be altered (regulated) in
different ways, ranging from modification of DNA (e.g. methylation) to transcriptional
regulation (e.g. transcription factors and repressors), post–transcriptional regulation
(only in eukaryotes) and regulation of translation.
1.1.1 Metabolism
Life is, at its essence, an out–of–equilibrium process which requires energy. Cells
must be able to gather energy from the environment and use it to perform activities
such as replication, movement, and so on. The set of chemical reactions needed
to sustain the life of a cell is named metabolism. Metabolism is a network of mainly
enzyme–catalyzed reactions, which perform a variety of functions, ranging from
nutrient breakdown to the polymerization of macromolecules.
Metabolism is conventionally divided into catabolism and anabolism, see the scheme
in Fig. 1.3. Catabolism is the set of chemical reactions devoted to the production
of energy and smaller molecules from the breakdown of larger molecules. On the
other hand, anabolism is the part of metabolism which uses some of the byproducts
of catabolism, the so–called building blocks, to synthesize all macromolecules which
are needed to build a copy of the cell.
Both catabolic and anabolic reactions can differ among living organisms. In
fact, organisms can be classified with respect to their capacity to build complex
molecules starting from simple ones, like carbon dioxide and water (autotrophic
organisms as opposed to heterotrophic). Further divisions are possible, for instance
according to whether the organism can use sunlight to drive processes inside the cell (e.g.
fixation of CO2) or must use the chemical energy originating from the breakdown
and/or oxidation of chemical species.
Escherichia coli metabolism can be well represented as in Fig. 1.3. E. coli can
grow in a minimal medium with some sugar (e.g. glucose) as sole carbon source, plus
ammonia and other salts, or in richer media (e.g. amino acids as carbon/nitrogen
sources). For the sake of simplicity, let us focus on E. coli cells growing in aerobic
conditions with glucose as the only carbon source. Glucose (C6H12O6) is catabolized
to produce energy and carbon precursors. At low growth rates, bacteria optimally
use glucose, with carbon dioxide (CO2) being the only carbon waste. On the
other hand, fast growing E. coli cells excrete large amounts of acetate (C2H3O2−),
which is a byproduct of glycolysis [135]. This is analogous to the Crabtree effect [31],
first observed in yeast [30], and to the Warburg effect observed in cancer metabolism
[48, 40]. The Crabtree effect is a short–term, reversible switch of cellular metabolism
between respiration and fermentation; instead, the Warburg effect is a long–term
metabolic reprogramming, typical of cancer cells [40].
[Figure 1.3: diagram titled “Intermediate metabolism”, with labels cell wall, waste, catabolism, anabolism]
Figure 1.3. Intermediate metabolism. Nutrients are transported across the cellular
membrane, and broken down to simpler molecules, called “precursors”. These are later
assembled to form the building blocks (amino acids, lipids, nitrogenous bases, ...) needed
to build all macromolecules (proteins, the cell wall, DNA and RNA). This process also
produces energy (ATP, NADH or other reducing molecules).
1.1.2 Growth laws
The start of quantitative studies on bacterial growth dates back to Louis Pasteur,
whose paper Mémoire sur la fermentation appelée lactique was published in 1857.
By combining growth dynamics data and chemical assays, he was able to show that
fermentation was due to microbial growth, thus providing strong evidence against
the theory of spontaneous generation.
Bacterial cultures may be grown in different conditions. Agar plates can be
used to grow bacterial colonies, but most measurements are much easier if bacterial
cultures are grown in suspension, as in batch cultures. In the latter case, cells are
grown in suspension in agitated flasks at controlled temperature; bacteria are
allowed to duplicate until they run out of nutrients. A fed–batch culture is similar
to the batch culture, but nutrients are continuously added to the vessel; similarly,
flow cells allow continuous cultures with a constant flux of nutrients.
Growth phases
Bacterial growth in batch cultures can be modeled in a series of distinct phases.
Let us suppose, for instance, E. coli cells are inoculated in a vessel with sugar (e.g.
glucose) and minimal salts (to provide ammonia, sodium, potassium, iron, etc.).
Several phases can be distinguished:
• Lag phase. Cells do not immediately start to grow, as they have to adapt to
the new environment, synthesizing RNA and enzymes and preparing for cell
division.
• Exponential phase (also called log phase or logarithmic phase). In this phase
cells are duplicating at a constant rate, so that their number grows exponentially as $N = N_0 2^{t/\tau}$, with τ being the (average) doubling time, or $N = N_0 e^{\lambda t}$,
where λ = ln 2/τ is the growth rate. Cells usually do not duplicate exactly at the
same moment, even if synchronous cultures may be obtained with proper
techniques.
• Stationary phase. In this phase an essential nutrient has been completely used,
so that cells cannot grow any more. The number of cells is therefore constant
during this phase.
• Death phase. Cells die, resulting in a decreasing cell population.
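The two parametrizations of the exponential phase are linked by λ = ln 2/τ. A minimal sketch of the conversion follows; the function names and the numerical values (doubling time, initial cell number) are illustrative assumptions, not data from the text.

```python
import math

def growth_rate_from_doubling_time(tau):
    """Growth rate lambda (1/h) from doubling time tau (h): lambda = ln 2 / tau."""
    return math.log(2.0) / tau

def population(n0, t, tau):
    """Cell number after time t, N = N0 * 2**(t / tau)."""
    return n0 * 2.0 ** (t / tau)

tau = 0.5  # illustrative doubling time in hours (assumed)
lam = growth_rate_from_doubling_time(tau)

# Both forms of the exponential law give the same cell number after 2 hours.
n_base2 = population(1000.0, 2.0, tau)
n_basee = 1000.0 * math.exp(lam * 2.0)
```

With a doubling time of half an hour, two hours correspond to four doublings, so both expressions return sixteen times the initial cell number.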
In particular, many measurements focus on the exponential phase, since the average
properties of the cells can be considered constant (the so–called balanced growth
regime). We have to distinguish between cell concentration (number of cells per
unit volume) and bacterial density (dry weight mass per unit volume). In fact, the
density of single cells is roughly constant, but their volume can change by a factor 2
or 3 at different growth rates. Keeping in mind this difference, we will always refer to
growth as “bacterial mass production”, and not as “increase in the number of cells”.
A remark on the estimation of bacterial density is necessary. Direct measurements
of growth – that is, of the mass of the bacterial culture – are performed by filtering
and drying the solution and weighing the desiccated cells. This procedure, although
simple, may not be convenient in practice. Instead, spectrophotometers can be used
to measure the light scattered by the culture solution, which is a proxy for bacterial
density [71]. The conversion factor between optical density (OD) and bacterial
density has to be determined experimentally.
Monod’s growth kinetics
Jacques Monod was able to model growth dynamics by studying growth kinetics
[85]. The core of Monod’s theory is the link between the instantaneous growth rate
λ(t) = d log M(t)/dt (with M(t) being the total bacterial mass, or density, at time
t) and the limiting substrate concentration, s(t). First, he found that the growth yield
Y, that is, the bacterial mass produced per unit of substrate consumed, is quite independent
of growth conditions, initial substrate concentration or the chemical form of the
substrate. The maximum content of biomass Mf in a batch culture is reached at the
moment of complete utilization of the substrate, and it depends linearly on
the initial substrate concentration s0:
\[ M_f = M_0 + Y s_0 . \qquad (1.1) \]
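Using this relation, Monod was then able to infer the following relationship between
the instantaneous growth rate and the substrate concentration:
\[ \lambda = \lambda_m \frac{s}{s + K_s} \qquad (1.2) \]
which is known as Monod’s law; here λm is the maximum growth rate, attained at
saturating substrate, and Ks is the substrate concentration at which growth proceeds
at half that rate. Similarly to the Michaelis–Menten kinetics of
enzyme–catalyzed reactions, this is a kinetic law which can be used to formulate a
closed dynamical system. In fact, using Eqs. (1.1) and (1.2) we can write:
\[ \frac{1}{M(t)} \frac{d}{dt} M(t) = \lambda_m \frac{s(t)}{s(t) + K_s} , \qquad s(t) = s_0 - \frac{1}{Y} \left( M(t) - M_0 \right) \qquad (1.3) \]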
The resulting differential equation for M (t) can be easily solved and used to predict growth dynamics in exponential phase, with initial values M0 and s0 , and the
parameters Y , λm and Ks being determined for each bacterium in given growth
conditions. For instance, exponential growth of E. coli cells in glucose at 30 ℃ is
well described by the parameters¹ Y = 0.23, λm = 0.93 h⁻¹ and Ks = 4 mg/L.
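The closed system can also be integrated numerically. The sketch below is only an illustration: it assumes that the substrate is depleted as biomass is produced, s = s0 − (M − M0)/Y, as implied by the yield relation (1.1), and it uses a plain forward Euler scheme. The function name, the initial values M0 and s0 and the time step are hypothetical choices; Y, λm and Ks are the values quoted above.

```python
def simulate_monod(M0, s0, Y=0.23, lam_m=0.93, Ks=4.0, t_end=24.0, dt=0.001):
    """Integrate dM/dt = lam_m * s / (s + Ks) * M with s = s0 - (M - M0) / Y.

    M, s and Ks are in mg/L, times in hours. Forward Euler is used for brevity.
    """
    M, t = M0, 0.0
    trajectory = [(t, M)]
    while t < t_end:
        s = max(s0 - (M - M0) / Y, 0.0)   # substrate left, never negative
        M += dt * lam_m * s / (s + Ks) * M
        t += dt
        trajectory.append((t, M))
    return trajectory

# Assumed initial conditions: 1 mg/L of biomass and 1000 mg/L of glucose.
traj = simulate_monod(M0=1.0, s0=1000.0)
M_final = traj[-1][1]   # approaches M0 + Y * s0 once the substrate is exhausted
```

Growth is nearly exponential at rate λm while s is much larger than Ks, and it stops when the substrate is used up, so the final biomass recovers Eq. (1.1).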
Catabolite repression
Monod eventually won the Nobel Prize in Physiology or Medicine for his studies
on carbon catabolite repression (CCR) and the lac operon in E. coli. He showed
that cells grown in batch culture using a mixture of glucose and another sugar, e.g.
lactose, display a biphasic (diauxic) growth. After an initial lag phase, cells grow exponentially until glucose runs out. Then, after a second lag phase, cells start again
to grow exponentially using lactose as the carbon source. The explanation of such
behaviour is that the enzymes needed for lactose transport and metabolism, encoded
in the lac operon, start being expressed only when glucose runs out. The preference
for glucose over different carbon sources, and the regulatory phenomena behind
such behaviour, are called carbon catabolite repression. The mechanism behind
CCR in E. coli is relatively well understood [54]: transport of glucose through the
phosphoenolpyruvate–carbohydrate phosphotransferase system (PTS) inhibits the
enzyme adenylate cyclase, which catalyzes the production of the signalling molecule
cyclic adenosine monophosphate (cAMP). When glucose runs out, adenylate cyclase
is no longer inhibited, and the newly produced cAMP binds to a protein called cyclic AMP
receptor protein (CRP; also known as catabolite activator protein, CAP). The CRP–
cAMP complex is a transcription factor which regulates hundreds of genes, either
directly or indirectly. Most of the genes regulated through the CRP–cAMP complex (the
Crp regulon) are related to carbohydrate transport and metabolism, energy production, amino acid metabolism, nucleotide metabolism and ion transport systems
[139].
¹ Monod used base 2 logarithms, so that λm = 1.35 divisions per hour. Curiously, the growth
yield reported by Monod is quite low, with more recent measurements giving Y between 0.4 and
0.6. See, for instance, Chapter 3 and the experimental fits.
Gene regulation of the lac operon was the first genetic regulatory mechanism to
be understood clearly. Expression of the operon is regulated in two different ways.
An intracellular regulatory protein called lac repressor is constitutively expressed,
and inhibits transcription of the lac operon by binding a DNA region just downstream of the lac promoter. When lactose is present in the cell, it can bind to the
lac repressor, causing its detachment from DNA and allowing for the lac operon to
be expressed. The second regulatory mechanism is the binding of the CRP–cAMP
complex to a specific DNA site upstream of the promoter; the CRP–cAMP complex
facilitates RNA polymerase binding to the promoter.
Growth laws: the SMK papers
Monod’s law describes how growth rate is affected by variations in substrate levels.
Two papers [109, 66] (see also Ref. [25]) from Schaechter, Maaløe and Kjeldgaard
(SMK), published in 1958, describe how cellular composition varies with
growth conditions. The first paper [109] focused on steady state (exponential phase)
conditions, and showed that many aspects of cell composition depend exclusively
on growth rate, and not on the medium the cells are growing in. In particular, the RNA
and ribosome content of the cell was found to be linearly correlated with growth
rate, irrespective of environmental (e.g. nutrient) conditions (see also subsequent
work by Dennis and Bremer, Ref. [39]). If cells are grown in constant conditions
(in exponential phase) for long enough, the fractions of cellular components in the
culture do not change, a condition called balanced growth. It may seem hard
to believe that predictions about the average bacterial cell of a given species are possible without fully specifying the environmental conditions; the importance of this
work resides in the discovery that some descriptions may actually be formulated by
knowing the growth rate alone [89].
The second paper [66] studied how biomass composition changes after sudden
variations in growth media. The authors studied both shifts from poor to higher
quality nutrients (shift–ups, or upshifts), or from rich to poor substrates (shift–downs,
or downshifts). These experiments showed the “rate maintenance” phenomenon: mass production quickly (almost instantaneously) adjusts to the new
environment, but the duplication rate has some lag time. The result is that, after a
shift–up, cell size grows before the duplication rate adjusts. In fact, the growth
phases we described before (lag, exponential, stationary, and death phases) can be
thought of as a succession of upshifts and downshifts. A model in which production
rates stabilize instantaneously compares well with experiments [19].
More on the RNA/protein ratio growth laws
Starting from 2010, the group led by Terry Hwa pursued a systematic study of
how the mass composition of the cell responds to environmental conditions. Very much in the
spirit of the first SMK paper, they uncovered a series of simple relations between
growth rate λ and the RNA/protein ratio r. They studied the behaviour of the
Schaechter line by limiting growth in two different ways, either reducing the quality
of the carbon nutrient (C–limitation) or introducing translation–limiting antibiotics,
such as chloramphenicol (R–limitation). In the first case the RNA/protein ratio
is well described by a linear relationship r = r0 + λ/kr, where kr is reduced when
antibiotics are present. In fact, the maximum value of kr, ranging between 5.4 and
5.9/h in absence of antibiotics, is related to the maximal translation rate of the
ribosomes (around 20 aa/s), and is therefore called translational capacity. In the
case of R–limitation the slope changes sign, with r = rmax − λ/kc, with kc depending
on the quality of the carbon source. This behaviour is illustrated in Fig. 1.4, panel
(a). If we suppose that kc and kr are state variables, encoding all information on
growth and RNA/protein ratio, we can invert the two linear relations to obtain:
\[ \lambda(k_c, k_r) = (r_{\max} - r_0) \frac{k_c k_r}{k_c + k_r} \qquad (1.4) \]
\[ r(k_c, k_r) = r_0 \frac{k_r}{k_c + k_r} + r_{\max} \frac{k_c}{k_c + k_r} \qquad (1.5) \]
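The step from the two linear laws to Eqs. (1.4) and (1.5) can be made explicit. The short derivation below assumes only that both laws, r = r0 + λ/kr and r = rmax − λ/kc, hold simultaneously for the same pair (λ, r):

```latex
\begin{align*}
r_0 + \frac{\lambda}{k_r} &= r_{\max} - \frac{\lambda}{k_c}
\;\Rightarrow\;
\lambda \left( \frac{1}{k_r} + \frac{1}{k_c} \right) = r_{\max} - r_0
\;\Rightarrow\;
\lambda = (r_{\max} - r_0)\, \frac{k_c k_r}{k_c + k_r} , \\
r &= r_0 + \frac{\lambda}{k_r}
= r_0 + (r_{\max} - r_0)\, \frac{k_c}{k_c + k_r}
= r_0\, \frac{k_r}{k_c + k_r} + r_{\max}\, \frac{k_c}{k_c + k_r} .
\end{align*}
```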
These relationships have been verified for many different carbon sources and for
slow–translating mutants. The similarity between Eq. (1.4) and Eq. (1.2) suggests
that kc can be thought of as a proxy for external carbon substrate concentration.
One can associate the growth rate dependence of the RNA/protein ratio with the
ribosome–affiliated proteins (extended ribosome, or R–sector). This sector contains all ribosomal proteins, plus all
other ribosome–affiliated proteins (e.g. elongation factors). The RNA/protein ratio is roughly proportional to
the mass fraction of this sector, since (1) roughly 85% of RNA is rRNA, and
the fraction is roughly constant from moderate to fast growth rates, (2) rRNA
and ribosomal proteins come in fixed proportions and (3) all other R–affiliated
proteins are usually co–expressed along with ribosomal proteins. Scott et al. [114]
estimate the ratio of R–sector proteome mass to RNA mass to be ρ = 0.76. Therefore, Eq. (1.5)
suggests a coarse grained model of proteome allocation, with φR = ρr and the
rest of the proteome minimally divided into a growth–independent sector φQ and
a growth–dependent sector φP satisfying φP(λ) = λ/kC. Given the constraint
φP + φR + φQ = 1, the model is easily solved as:
\[ \phi_P = \lambda / k_C , \qquad (1.6) \]
\[ \phi_R = \phi_{R,0} + \lambda / k_R , \qquad (1.7) \]
\[ \lambda = (1 - \phi_Q - \phi_{R,0}) \frac{k_C k_R}{k_C + k_R} , \qquad (1.8) \]
with kC = kc/ρ, kR = kr/ρ and φR,0 = ρr0. The y–intercept of the R–sector is estimated to be
φR,0 ∼ 6.6%, while the fixed φQ is estimated around 45% of total proteome.
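As a numerical illustration of this three–sector model, the sketch below evaluates the growth rate and the sector fractions. It assumes the forms φP = λ/kC and φR = φR,0 + λ/kR, which together with Eq. (1.8) make the three fractions sum to one; the function name and the values of kC and kR are hypothetical, while φQ and φR,0 are the estimates quoted above.

```python
def allocate_proteome(k_C, k_R, phi_Q=0.45, phi_R0=0.066):
    """Three-sector allocation: growth rate and proteome fractions.

    k_C and k_R are the rescaled nutritional and translational capacities (1/h).
    """
    lam = (1.0 - phi_Q - phi_R0) * k_C * k_R / (k_C + k_R)   # Eq. (1.8)
    phi_P = lam / k_C              # growth-dependent P-sector
    phi_R = phi_R0 + lam / k_R     # ribosome-affiliated R-sector
    return lam, phi_P, phi_R, phi_Q

# Placeholder capacities, chosen only to exercise the formulas.
lam, phi_P, phi_R, phi_Q = allocate_proteome(k_C=5.0, k_R=7.8)
total = phi_P + phi_R + phi_Q
```

Lowering k_C (a poorer carbon source) lowers the growth rate and shifts proteome from the R–sector to the P–sector, which is the trend shown by the continuous line of Fig. 1.4.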
More insights into regulation
The P–sector appears to be upregulated in carbon limitation, so that it is likely
to contain catabolic proteins. You et al. [137] provided more insights by directly
measuring lacZ expression. lacZ is a gene in the lac operon, coding for the enzyme
β–galactosidase (β–gal), whose function is cleaving lactose to glucose and galactose.
A non–metabolizable inducer of the lac operon, IPTG, can be used to unbind the
lac repressor from the operon, so that β–galactosidase levels can be considered a
measure of the CRP–cAMP activity.
Figure 1.4. (a) Behaviour of the RNA/protein ratio upon nutritional (carbon) limitation
(continuous line) and translational limitation (dashed lines) obtained growing cells in
media with increasing quantities of antibiotics, as modeled by Eq. (1.4) and Eq. (1.5).
(b) Given the constraint φP + φR = constant, reciprocal relations must be obeyed by
the φP sector. (c) Pie chart showing the three–fold partition of the proteome introduced
in Scott et al. [114]: an R–sector of ribosome–affiliated proteins, a fixed Q–sector, and
a growth–dependent P–sector. Image from Ref. [115].
A third growth limitation mode is studied, namely nitrogen (A–) limitation; a
corresponding proteome sector φA is upregulated in response to decreased nitrogen
sources. Nitrogen uptake can be tuned by using an E. coli strain with a titratable
nitrogen uptake system. Expression of glnA, which encodes the major ammonia
assimilating protein glutamine synthetase, is taken as a proxy for the A–sector.
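All observations can be consistently described using a five–fold proteome partition model with three state variables kC, kA and kR:
\[ \phi_C = \phi_{C,0} + \lambda / k_C \qquad (\propto \text{lacZ expression}) \qquad (1.9) \]
\[ \phi_R = \phi_{R,0} + \lambda / k_R \qquad (\propto \text{RNA/protein ratio}) \qquad (1.10) \]
\[ \phi_A = \phi_{A,0} + \lambda / k_A \qquad (\propto \text{glnA expression}) \qquad (1.11) \]
\[ \phi_{O+U} = \phi_O + \phi_{U,0} + \lambda / k_U \qquad (\text{“core” + uninduced}) \qquad (1.12) \]
\[ \lambda = \Big( 1 - \phi_Q - \sum_X \phi_{X,0} \Big) \left( \frac{1}{k_C} + \frac{1}{k_A} + \frac{1}{k_R} + \frac{1}{k_U} \right)^{-1} \qquad (1.13) \]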
In order to obtain the correct maximum growth rate in carbon limitation (around
1.2/h), an “uninduced sector” φU must be introduced. This U–sector is not upregulated by any of the three growth limitations, and satisfies φU = φU,0 + λ/kU .
Sulphur assimilation and nucleotide synthesis could be examples of proteins which
are not targeted by either C–, R–, or A– limitation. The very existence of such
laws for the different proteome sectors requires their coordination. In particular,
the C–sector of catabolic proteins and the A–sector of anabolic proteins have opposing behaviour with respect to C– and A– limitation, suggesting the existence of
a common regulation system. The authors propose inhibition of adenylate cyclase
by α–keto acids as a way to balance the flux of carbon precursors from glycolysis
and the production of amino acids.
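The growth rate formula of the multi–sector model, Eq. (1.13), is simple enough to be evaluated directly. The sketch below is only an illustration: the function name is hypothetical, every offset and capacity is a placeholder rather than a fitted value, and the growth–independent core fraction is passed as a single number.

```python
def growth_rate(phi_core, offsets, capacities):
    """Eq. (1.13): lambda = (1 - phi_core - sum of offsets) / sum of 1/k_X."""
    free_fraction = 1.0 - phi_core - sum(offsets.values())
    return free_fraction / sum(1.0 / k for k in capacities.values())

# All numbers below are placeholders chosen for illustration only.
offsets = {"C": 0.05, "R": 0.066, "A": 0.05, "U": 0.05}
capacities = {"C": 10.0, "R": 7.8, "A": 10.0, "U": 10.0}   # in 1/h
phi_core = 0.40

lam = growth_rate(phi_core, offsets, capacities)
# Each sector follows phi_X = phi_X0 + lambda / k_X.
fractions = {X: offsets[X] + lam / capacities[X] for X in offsets}
```

By construction the four growth–dependent fractions and the core fraction add up to one. Dropping the U entries from both dictionaries raises the computed growth rate, which shows why a growth–dependent uninduced sector lowers the predicted maximum.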
Very recently, a direct determination of the proteome sectors using genome–scale
mass spectrometry has been carried out by Hui et al. [58]. Six different proteome
sectors have been defined by clustering of protein mass variations upon C–, A– and
R– limitation, suggesting the presence of a few global regulators. A theory for this
six–component proteome partitioning can be formulated and cross–checked with
methods similar to those used in [114, 137].
1.2 Genome scale modeling of metabolic networks
Chemical networks are bipartite networks involving chemical species (named metabolites in metabolic networks) and chemical reactions. Following Palsson [95], we can
define certain key properties of chemical reactions:
1. Stoichiometry. Chemical species and reactions are connected by the stoichiometric coefficients, which describe the variations in the number of chemical
species due to an elementary step of a chemical reaction. For example, in an
isomerization A → B the product B has a coefficient +1, while the reactant A
has a coefficient −1. Of course, a conventional directionality for each reaction
has to be introduced in order to define “products” and “reactants”.
2. Rates. Reaction rates are fixed by a combination of factors, such as substrate
concentrations, kinetic constants, presence or absence of catalysts, temperature, pressure, ionic strength of the solution, etc. The cell is able to manipulate
reaction rates thanks to the fact that most reactions are enzyme–catalyzed,
and many regulation circuits allow the enzymes to be produced only when
they are needed.
Chemical networks can be modeled at different levels of detail. For instance, chemical reactions are, at their essence, stochastic phenomena; on the other hand, when
the number of molecules is high enough, the rates follow deterministic laws with
good approximation. The knowledge of intracellular concentrations and kinetic parameters is essential to build a dynamical model of cellular metabolism; on the
contrary, steady state models can be formulated without any knowledge about such
quantities.
For a given metabolic network, all information about stoichiometry is encoded
in the stoichiometric matrix. The metabolites are first labeled with an index
µ = 1, . . . , Nmets, and the reactions with an index i = 1, . . . , Nrxns. For each reaction
a conventional directionality is also assumed. Then, the stoichiometric matrix S is
constructed² so that its entries Sµi are equal to the stoichiometric coefficient of
metabolite µ in reaction i, with the plus sign if µ labels a product and the minus
sign if it is a reactant. If metabolite µ appears both as a reactant and as a product, the
difference between the two stoichiometric coefficients is used. Therefore, Sµi equals
the variation in the number of molecules of species µ given a single elementary step
of reaction i. If a metabolite is involved in the reaction, but only acts as a catalyst
(i.e. it is neither produced nor consumed by the reaction), the corresponding entry
of the S matrix is zero.
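The construction rules can be sketched in a few lines. The toy network below is hypothetical and was chosen to include a metabolite appearing on both sides of a reaction (C in R3) and a catalyst (E in R4); the data layout and the function name are assumptions made for this illustration.

```python
# Hypothetical toy network: each reaction maps metabolite -> (consumed, produced).
reactions = {
    "R1": {"A": (1, 0), "B": (0, 1)},               # A -> B
    "R2": {"B": (2, 0), "C": (0, 1)},               # 2 B -> C
    "R3": {"A": (1, 0), "C": (1, 2)},               # A + C -> 2 C (C on both sides)
    "R4": {"E": (1, 1), "C": (1, 0), "A": (0, 1)},  # C -> A, with E acting as catalyst
}
metabolites = ["A", "B", "C", "E"]

def stoichiometric_matrix(metabolites, reactions):
    """Rows are metabolites, columns are reactions; entry = produced - consumed."""
    return [[reactions[r].get(m, (0, 0))[1] - reactions[r].get(m, (0, 0))[0]
             for r in reactions]
            for m in metabolites]

S = stoichiometric_matrix(metabolites, reactions)
```

The row of the catalyst E is identically zero, and the entry of C in R3 is the difference 2 − 1 = +1, as prescribed by the rules above.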
By definition, the time evolution of the number of molecules Nµ(t) of each metabolite is governed by the reaction rates φi(t) (the number of elementary reactions per unit time)
through the stoichiometric matrix, as follows:
\[ \frac{dN}{dt} = S\phi \qquad \text{i.e.} \qquad \frac{d}{dt} N_\mu = \sum_i S_{\mu i} \phi_i \quad \forall \mu \qquad (1.14) \]
A normalization is usually introduced dividing both members by the cell volume
or mass. Defining the concentrations [cµ] = Nµ/V and the fluxes vi = φi/V with
constant V, we have:
\[ \frac{d}{dt}[c] = Sv \qquad \text{i.e.} \qquad \frac{d}{dt}[c_\mu] = \sum_i S_{\mu i} v_i \quad \forall \mu \qquad (1.15) \]

² The stoichiometric matrix is also indicated in literature with the letter N. We will always use
the letter S, hoping the reader will not be confused with the entropy.
In turn, reaction rates, or fluxes, depend on concentrations, usually in a nonlinear
way. At the elementary level, the mass–action law can be used as kinetic law. Mass
action kinetics simply states that the rate of a chemical process is proportional to
the product of the concentrations of the reactants. For example, given the reaction
A + B ⇌ C + D, the forward and backward fluxes are computed as:
\[ v^+ = k_+ [a][b] , \qquad v^- = k_- [c][d] , \qquad (1.16) \]
with [a], [b], [c] and [d] being the concentrations of the four chemical species. The
net flux is thus given by $v = v^+ - v^-$. The kinetic constants k+ and k− are related
by thermodynamics, as we shall see in Chapter 2.
Mass action kinetics does not take into account the action of enzymes, whereas
most biological reactions are enzyme–catalyzed. The textbook example of a
rate expression for such reactions is given by the Michaelis–Menten kinetics, which models a simple
isomerization A → B catalyzed by an enzyme E:
\[ v = k_{cat} [e] \frac{[a]}{[a] + K_M} \qquad (1.17) \]
where [a] and [e] are the substrate and enzyme concentrations, respectively, kcat is
the turnover number of the reaction (which fixes the maximum speed of the reaction
for a given enzyme concentration), and KM is the Michaelis constant (which fixes
the affinity of the enzyme to the substrate).
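Both rate laws are straightforward to evaluate. The sketch below implements Eqs. (1.16) and (1.17) with hypothetical function names; all numerical values are illustrative and expressed in arbitrary but consistent units.

```python
def mass_action_net_flux(k_plus, k_minus, a, b, c, d):
    """Net flux v = v+ - v- for A + B <-> C + D, Eq. (1.16)."""
    return k_plus * a * b - k_minus * c * d

def michaelis_menten(kcat, e, a, K_M):
    """Eq. (1.17): v = kcat * [e] * [a] / ([a] + K_M)."""
    return kcat * e * a / (a + K_M)

# Illustrative values (assumed).
v_half = michaelis_menten(kcat=100.0, e=0.01, a=2.0, K_M=2.0)     # [a] = K_M
v_sat = michaelis_menten(kcat=100.0, e=0.01, a=2.0e6, K_M=2.0)    # [a] >> K_M
v_eq = mass_action_net_flux(2.0, 1.0, 1.0, 1.0, 2.0, 1.0)         # balanced fluxes
```

At [a] = KM the flux is half of its maximum kcat[e], while for [a] much larger than KM it saturates; the mass–action example is chosen so that forward and backward fluxes cancel.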
1.2.1 Fluxes normalization and dilution terms
Equation (1.15) can hardly be applied directly to a growing cell. In fact, bacteria are usually studied during exponential growth. Individual cells also appear to
grow exponentially [24], with their density being roughly constant; furthermore, intracellular fluxes may depend on the phase of the cell cycle the bacterium is in. As
we consider an exponentially growing bacterial culture, we may instead consider the
total fluxes and total cell volume. This is a convenient choice since they are macroscopic quantities (as long as the bacterial population is large enough), so they show much
smaller fluctuations. Furthermore, the dry weight MDW of the cellular culture can
be used to normalize the fluxes instead of the volume, the former being much easier
to measure. As we consider exponentially growing cells we have:
\[ \frac{d}{dt} \frac{N_\mu}{M_{DW}} = \sum_i S_{\mu i} \frac{\phi_i}{M_{DW}} + N_\mu \frac{d}{dt} \frac{1}{M_{DW}} \qquad (1.18) \]
Defining dry–weight normalized concentrations [cµ] = Nµ/MDW and fluxes vi =
φi/MDW, and the growth/dilution rate λ = d log MDW/dt, we get:
\[ \frac{d}{dt}[c_\mu] = \sum_i S_{\mu i} v_i - [c_\mu] \lambda \qquad (1.19) \]
where the last term −[cµ]λ is a dilution term, due to the fact that the underlying
extensive quantities (Nµ, φi and MDW) are all growing as $e^{\lambda t}$. It can be neglected
only if the average lifespan of a metabolite is much less than the doubling time.
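Whether the dilution term matters can be checked with a back–of–the–envelope estimate. The sketch below compares the dilution term [c]λ with the flux through the pool, and the average metabolite lifetime [c]/v with the doubling time; the function names and all numbers are placeholders, not measured values.

```python
import math

def dilution_to_turnover_ratio(concentration, throughput_flux, growth_rate):
    """Ratio between the dilution term [c] * lambda and the flux through the pool.

    concentration   -- pool size in mmol/gDW
    throughput_flux -- total flux producing the metabolite, in mmol/(gDW h)
    growth_rate     -- lambda, in 1/h
    """
    return concentration * growth_rate / throughput_flux

def lifetime_over_doubling_time(concentration, throughput_flux, growth_rate):
    """Average metabolite lifetime [c]/v in units of the doubling time ln 2 / lambda."""
    lifetime = concentration / throughput_flux
    return lifetime / (math.log(2.0) / growth_rate)

# Placeholder numbers: a 0.1 mmol/gDW pool crossed by 10 mmol/(gDW h) at lambda = 1/h.
ratio = dilution_to_turnover_ratio(0.1, 10.0, 1.0)
relative_lifetime = lifetime_over_doubling_time(0.1, 10.0, 1.0)
```

With these numbers dilution is a one percent correction to the mass balance, so dropping it is harmless; for a large pool crossed by a small flux the same estimate would show that it cannot be neglected.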
1.2.2 Fundamental subspaces of the stoichiometric matrix
The stoichiometric matrix encodes much information about the chemical network
in a compact fashion [44]. In particular, the four fundamental subspaces of S (kernel,
image, cokernel and coimage) have a very clear physical interpretation.
Kernel
The kernel of S is the vector space spanned by all vectors which are annihilated by
the matrix S:
\[ \ker(S) = \mathrm{span}\{ v : Sv = 0 \} \qquad (1.20) \]
where 0 is the null (column) vector. Any flux configuration in ker(S) does not
produce any variations in the concentrations of the metabolites. This is clearly
seen from (1.15), as the substitution v → v + k with k ∈ ker(S) can be performed
without affecting the left hand side of the equation. The simplest example of such
degeneracy is given by a reversible reaction which is split into two irreversible
reactions with opposite directions. In this case there is an obvious degeneracy, due
to the fact that only the net flux of the overall reaction affects the concentrations.
In general, such loops are governed by thermodynamics: in this particular case,
the forward and the backward fluxes are set by the free energy difference between
products and reactants. The relations between thermodynamics and fluxes will be
described in detail in Chapter 2.
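The degeneracy produced by splitting a reversible reaction can be checked on a small example. The network below is hypothetical (uptake of A, forward and backward conversion between A and B, drain of B), and the helper name is an assumption of this sketch.

```python
def matvec(S, v):
    """Plain matrix-vector product S v for lists of lists."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

# Columns: uptake, forward (A -> B), backward (B -> A), drain. Rows: A, B.
S = [[1, -1, 1, 0],
     [0, 1, -1, -1]]

v = [1.0, 3.0, 2.0, 1.0]      # net conversion A -> B is 3 - 2 = 1
k = [0.0, 1.0, 1.0, 0.0]      # raises forward and backward fluxes together
v_shifted = [vi + 5.0 * ki for vi, ki in zip(v, k)]

rhs = matvec(S, v)
rhs_shifted = matvec(S, v_shifted)
```

Since S k = 0, the two flux vectors give the same right hand side in Eq. (1.15): only the net flux of the split reaction is fixed by mass balance.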
Cokernel
The cokernel is defined as $\ker(S^T)$, that is:
\[ \mathrm{coker}(S) = \mathrm{span}\{ w : wS = 0 \} \qquad (1.21) \]
where 0 is the null (row) vector. Therefore, vectors in coker(S) have dimension
equal to the number of metabolites in the network. Vectors belonging to the cokernel
of S define conservation laws in the metabolite concentrations. In fact, taking the
scalar product between a vector w ∈ coker(S) and Eq. (1.15) we have:
\[ w \cdot \frac{\partial}{\partial t}[c] = \sum_\mu w_\mu \frac{\partial}{\partial t}[c_\mu] = \sum_{\mu,i} w_\mu S_{\mu i} v_i = \sum_i (wS)_i v_i = 0 \quad \Rightarrow \quad \frac{\partial}{\partial t} (w \cdot [c]) = 0 \qquad (1.22) \]
Therefore, linear combinations of concentrations exist such that their values are
not affected by the fluxes. The values of these conserved metabolic pools must be
provided as initial data in kinetic models.
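A conserved pool can be illustrated with a two–metabolite cycle. In the sketch below the network, the mass–action rate constants and the Euler time step are all assumptions made for the example; the point is only that w · [c] stays constant whatever the fluxes are.

```python
# Hypothetical cycle: R1 converts ATP -> ADP, R2 converts ADP -> ATP.
S = [[-1, 1],    # ATP
     [1, -1]]    # ADP
w = [1, 1]       # candidate conservation law: ATP + ADP

# w S = 0 means that w belongs to the cokernel of S.
wS = [sum(w[mu] * S[mu][i] for mu in range(len(S))) for i in range(len(S[0]))]

def fluxes(c, k1=2.0, k2=1.0):
    """Mass-action rates with assumed kinetic constants k1 and k2."""
    return [k1 * c[0], k2 * c[1]]

def euler_step(c, dt):
    """One forward Euler step of d[c]/dt = S v."""
    v = fluxes(c)
    return [c[mu] + dt * sum(S[mu][i] * v[i] for i in range(len(v)))
            for mu in range(len(c))]

c = [3.0, 1.0]
pool_start = sum(wi * ci for wi, ci in zip(w, c))
for _ in range(1000):
    c = euler_step(c, dt=0.01)
pool_end = sum(wi * ci for wi, ci in zip(w, c))
```

The individual concentrations relax to their steady state values, but the pool ATP + ADP keeps the value fixed by the initial condition, which is why it has to be supplied as initial data.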
Image and coimage
Image, im(S), and coimage, coim(S), are the orthogonal complements of coker(S)
and ker(S), respectively. The equality sign in Eq. (1.15) only relates vectors in
im(S) and coim(S). To make this statement clearer, we can use the Singular
Value Decomposition (SVD) of the stoichiometric matrix. SVD is a powerful tool
in linear algebra which allows one to write any matrix in a diagonal form using two
orthogonal transformations. In particular:
\[ S = U \Sigma R^T \quad \text{with} \quad (U U^T)_{\mu\nu} = \delta_{\mu\nu} , \quad (R R^T)_{ij} = \delta_{ij} , \quad \Sigma_{\mu i} = \sigma_i \delta_{\mu i} \qquad (1.23) \]
The matrix Σ has the same dimensions as S (say, M rows and N columns), but has
non–zero entries only on the diagonal, which are called singular values. The number
of singular values is evidently equal to min(N, M). By definition, the singular values
are non–negative and arranged in non–increasing order, σ1 ≥ σ2 ≥ . . . . The number q
of nonzero singular values is the rank of the matrix S. The columns of U and R,
which we call $u^{(\mu)}$, µ = 1, . . . , M, and $r^{(i)}$, i = 1, . . . , N, form orthonormal bases for $R^M$ and
$R^N$, respectively. We can write:
\[ \sum_\nu (U^T)_{\mu\nu} \frac{\partial [c_\nu]}{\partial t} = \frac{\partial}{\partial t} (u^{(\mu)} \cdot [c]) \qquad (1.24) \]
\[ \sum_j (R^T)_{ij} v_j = r^{(i)} \cdot v \qquad (1.25) \]
Then, we can recast Eq. (1.15) as:
\[ \frac{\partial}{\partial t} (u^{(a)} \cdot [c]) = \sigma_a (r^{(a)} \cdot v) , \qquad a = 1, \dots, q \qquad (1.26) \]
\[ \frac{\partial}{\partial t} (u^{(a)} \cdot [c]) = 0 , \qquad a = q + 1, \dots, M \qquad (1.27) \]
The vectors $u^{(a)}$ and $r^{(a)}$ with a ≤ q span, respectively, im(S) and coim(S). Of
course, dim(im(S)) = dim(coim(S)) = rank(Σ) = rank(S) = q. Eq. (1.26) clearly shows
a 1:1 relation between vectors in coim(S) and im(S), with $\sigma_a u^{(a)} = S r^{(a)}$. On the
other hand, Eq. (1.27) is equivalent to the conservation laws in Eq. (1.22), with w
being any linear combination of the $u^{(a)}$ vectors with a > q.
1.2.3 Flux Balance Analysis
Building a detailed model of metabolism presupposes knowledge of the kinetic parameters and reaction mechanisms [21, 26], and should possibly take into account
stochasticity [52] and spatial diffusion [50, 13]. Unfortunately, the knowledge of
many biochemical details of genome–scale metabolic networks is quite poor.
Constraint-based models [95, 12] are widely employed in the literature to describe the operation of a biochemical reaction network at steady state. The main
advantage of this approach over dynamic models is that kinetic rates and concentrations are not required. Instead, these models focus on the fluxes v, which are considered
independent variables. Dry weight mass normalization is the standard choice, and
the dilution term is usually neglected. Steady state is then imposed in Eq. (1.19)
to obtain:
\[ Sv = 0 \qquad \text{i.e.} \qquad \sum_i S_{\mu i} v_i = 0 \quad \forall \mu \qquad (1.28) \]
These equations enforce mass–balance constraints among the reactions, so that no
metabolite concentration varies with time. Clearly, neglecting the dilution terms
allows one to remove concentrations from the constraints. In usual applications, physiological aspects constrain fluxes to vary within certain ranges, so that bounds of the
type vi ∈ [li, ui] are normally prescribed for every reaction i. Such bounds may
reflect, for instance, the fact that certain processes are known to be physiologically
irreversible (e.g., vi ≥ 0), have an upper bound due to limited enzyme availability, or are required to occur at precise rates (as can be the case for maintenance
reactions). The usual choice for most reactions is to set very high (non–physiological)
upper bounds, e.g. 1000 mmol/(gDW h).
A nontrivial solution of Eq. (1.28) is a possible non–equilibrium steady state
(NESS) for the reaction network. From a geometric point of view, under Eq. (1.28)
and the bounds on fluxes, the space of possible NESSs is represented by a convex
polytope. If all flux configurations inside this volume could be considered as physically realizable solutions, one might assess the ‘typical’ productive capabilities of
the network by sampling them using a controlled algorithm [33]. Unfortunately, this
route often turns out to be computationally too expensive for large enough systems.
Alternatively, one may search for the state(s) that maximize the value of certain
biologically motivated objective functions, which can usually be cast in the form
of a linear combination of fluxes that represents the selective production of a given
set of metabolites. Such a framework, known as Flux Balance Analysis (FBA) [94],
has been shown to be predictive in many instances, even under genetic and/or environmental perturbations (possibly with small modifications, see Section 1.2.4). The
standard form of an FBA problem is the following:
\[ \arg\max_v \; c \cdot v \qquad \text{subject to} \qquad Sv = 0 , \quad l \le v \le u \qquad (1.29) \]
where c is a vector with Nrxns components which defines the function to be optimized. The flux configurations that maximize such a linear functional can be
retrieved with the methods of linear programming [113], the textbook case being
biomass production maximization [46]. The biomass reaction joins the metabolic
network and macromolecular composition of the cells: it is a sink for biomass precursors (amino acids, fatty acids, nucleobases), at the same time draining the energy,
i.e. hydrolyzing ATP, required for the formation of such macromolecules. Biomass
production flux is usually normalized so that its value equals the growth rate: in
fact, fluxes are usually expressed in mmol/gDW h, while growth rate is expressed
in 1/h. Therefore, the stoichiometric coefficients of the biomass reaction have dimensions mmol/gDW , and they can be empirically determined by measuring the
average biomass composition of the cells. Of course, biomass composition may well
depend on the environment the cells grow in; we will study in detail how a growth
rate–dependent biomass can be introduced in FBA in Section 3.7.
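The structure of problem (1.29) can be illustrated on a toy network. The sketch below is not how genome–scale FBA is solved in practice: it assumes a hypothetical four–reaction network, eliminates the steady state constraint by hand and finds the optimum by enumerating the vertices of the resulting two–dimensional feasible polygon, whereas real applications rely on a linear programming solver. All names and bounds are assumptions.

```python
from itertools import combinations

# Toy network (hypothetical, for illustration only):
#   R1: -> A      (nutrient uptake, 0 <= v1 <= UPTAKE_MAX)
#   R2: A -> B    (conversion)
#   R3: A ->      (by-product excretion)
#   R4: B ->      (biomass drain, the objective)
S = [[1, -1, -1, 0],   # metabolite A
     [0, 1, 0, -1]]    # metabolite B
UPTAKE_MAX = 10.0

def fluxes(x2, x3):
    """Map the two free fluxes (v2, v3) to a full flux vector with S v = 0."""
    return [x2 + x3, x2, x3, x2]

# Inequalities a . (x2, x3) <= b left over once steady state is imposed.
constraints = [((-1.0, 0.0), 0.0),          # v2 >= 0
               ((0.0, -1.0), 0.0),          # v3 >= 0
               ((1.0, 1.0), UPTAKE_MAX)]    # v1 = v2 + v3 <= UPTAKE_MAX

def vertices(cons, tol=1e-9):
    """Intersect every pair of constraint lines and keep the feasible points."""
    points = []
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel lines do not intersect
        x = (b1 * a2[1] - a1[1] * b2) / det
        y = (a1[0] * b2 - b1 * a2[0]) / det
        if all(a[0] * x + a[1] * y <= b + tol for a, b in cons):
            points.append((x, y))
    return points

def fba(c=(0, 0, 0, 1)):
    """Return the vertex flux vector maximizing the linear objective c . v."""
    candidates = [fluxes(*p) for p in vertices(constraints)]
    return max(candidates, key=lambda v: sum(ci * vi for ci, vi in zip(c, v)))

v_opt = fba()
```

The optimum routes the whole uptake into biomass and leaves the excretion flux at zero, the maximum–yield behaviour typical of plain FBA solutions when only the uptake is bounded.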
Figure 1.5. With no constraints, the flux distribution of a biological network may lie at any
point in a solution space. When mass balance constraints imposed by the stoichiometric
matrix S (labeled 1) and capacity constraints imposed by the lower and upper bounds
(ai and bi ) (labeled 2) are applied to a network, it defines an allowable solution space.
The network may acquire any flux distribution within this space, but points outside
this space are denied by the constraints. Through optimization of an objective function,
FBA can identify a single optimal flux distribution that lies on the edge of the allowable
solution space. (Image and caption from [94].)
1.2.4 Other frameworks
A number of predictions obtained using Flux Balance Analysis are verified by experiments. An E. coli K–12 MG1655 strain growing in glycerol as sole carbon source
for about 700 generations was shown [59] to evolve, increasing its maximum growth
rate to the optimal value predicted by FBA. Regulation at the single–gene level can
be included using a set of Boolean constraints to model transcriptional regulation, as
done in regulatory Flux Balance Analysis (rFBA) [28, 27].
Knockout lethality can be assessed [41] removing from the metabolic network
the reactions catalyzed by enzymes corresponding to deleted genes. Furthermore,
phenotypes of the surviving strains can be modeled with good precision using tools
like Minimization of Metabolic Adjustment (MOMA) [116] and Regulatory On/Off
Minimization of metabolic changes (ROOM) [119]; both methods find flux configurations which are “close” to the FBA wild–type fluxes.
One aspect in which FBA fails is the description of apparently suboptimal behaviours of the cells, such as the Crabtree effect. Variations in glucose availability
are usually modeled in FBA by changing the upper bound to glucose uptake. With
such boundary conditions, FBA solutions have the maximum growth yield (growth
rate per glucose flux): glucose is always fully broken down to CO2 , irrespective
of growth rate. One possible way to induce fermentation in FBA solutions is setting bounds on particular fluxes [79] or global bounds on the total flux, as in Flux
Balance Analysis with Molecular Crowding (FBAwMC) [13]. Chapter 3 describes a
new framework, Constrained Allocation Flux Balance Analysis (CAFBA), which describes the switch to acetate overflow with high precision, using a global constraint
on fluxes inspired by the proteome partitioning models described in Section 1.1.2.
Thermodynamics affects possible flux configurations, as we shall see in detail in
Chapter 2. In fact, the second law of thermodynamics does not allow closed
loops of fluxes. Enforcing such a constraint may be a challenging task, but it may
allow one to estimate intracellular metabolite concentrations [57, 56].
Finally, some approximate dynamics can be formulated by coupling FBA to metabolite concentrations, as in Dynamic Flux Balance Analysis [77]. Sequential use of
different carbon sources (the so–called carbon catabolite repression [54]) has also
been studied with good results [13].
Conclusion
The research work presented in this Thesis focused on the study of metabolism from
many different perspectives, which we will summarize using three dichotomies.
• Constraints as opposed to optimization in Flux Balance Analysis. Mass balance constraints alone can only define a set of possible flux configurations. A
biologically motivated optimization principle can be used to extract predictions, but two kinds of constraints must be included: 1) Physical constraints
such as the ones provided by thermodynamics, studied in Chapter 2. These
constraints are mandatory, that is, flux configurations violating these constraints cannot be taken into consideration. 2) Biological constraints which
emerge from cell biology (e.g. ATP maintenance flux, molecular crowding
or proteome allocation constraints). They are needed to refine FBA predictions in order to reconcile “naïve” results produced by FBA and experimental
data. In particular, protein production is expensive; hence, a proteome allocation constraint allows one to improve FBA predictions considerably, naturally
describing the emergence of overflow metabolism and the use of the Entner–Doudoroff
(ED) pathway instead of the more canonical Embden–Meyerhof–Parnas
(EMP) glycolytic pathway as a protein–saving strategy. This is confirmed by the study of a coarse grained model, in which the optimality of a
high yield/high proteome cost pathway is studied in a dynamic environment.
• Genome scale as opposed to coarse grained modelling. Pros and cons exist for
both approaches. Complex models usually describe richer phenomena
than coarse grained models, but they may lack the ability to provide insights
into emergent properties. One can get the best of both worlds
by combining them, as done in Chapter 3, where we show how CAFBA refines
phenotypic predictions and naturally displays the tradeoff between an efficient
(large growth yield) metabolism and the cost of allocating proteins. CAFBA
can also be applied to a much smaller model of metabolism, where one can
actually perform analytical calculations.
• Statics versus dynamics. Bacterial growth laws are usually studied using exponentially growing bacterial cultures, which allow one to study the properties of
the cells at constant growth rates. In particular, proteome shares are found to
be linearly related to growth rate. Nonetheless, dynamics can provide complementary information about the shape of such laws. The offsets of the proteome growth laws,
whose existence seems puzzling from the steady state point of view,
appear to have a clear role in boosting growth during nutritional upshifts, as
shown in Chapter 4. In Chapter 5 we applied CAFBA and the growth laws to
a small model of metabolism, which can be studied both from the static and
the dynamic point of view, thus complementing the analysis in the previous
Chapters (which focused on genome scale models and upshift scenarios).
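In formulas, and in a generic notation that is an assumption of this summary (the symbols may differ from those used in the previous Chapters), a linear growth law with offset for a proteome sector X can be written as:

```latex
\phi_X(\lambda) = \phi_{X,0} + s_X \, \lambda ,
\qquad
\sum_X \phi_X(\lambda) = 1 ,
```

where \phi_X is the proteome share of sector X, \lambda the growth rate, \phi_{X,0} the offset and s_X the slope, which can be positive or negative. If all sectors are linear over a common range of growth rates, normalization implies \sum_X \phi_{X,0} = 1 and \sum_X s_X = 0: the offsets are the shares extrapolated to zero growth, and any increase in one sector must be compensated by a decrease in the others.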
We will now discuss in more detail the main findings and future directions for
the main topics addressed in the Thesis.
Thermodynamic constraints in metabolic networks
Thermodynamics provides fundamental constraints which have to be satisfied by
constraint–based models of metabolism; in fact, the presence of flux loops implies
that there is no set of chemical potentials which can be assigned to the metabolites.
Flux loops in FBA solutions are often harmless, since they can be removed using
a simple flux minimization without altering the objective function (e.g. growth
rate) and the exchange fluxes. In other cases, flux loops cannot be removed without
altering the value of the objective function or modifying some other constraint. This is the
case of the transporters loop in the Recon–2 human cell metabolism model, shown in
Eq. (2.49). Two transport reactions are combined such that ATP is synthesized from
ADP, which is absurd since it is ATP hydrolysis that should be spontaneous. This loop cannot
be removed by a global procedure such as flux minimization if the ATP maintenance
hydrolysis flux or the growth rate are fixed, thereby signalling an inconsistency of
the model. Identification of infeasible cycles is in principle a difficult task, but
the problem can be tackled using a combination of relaxation and Monte Carlo
algorithms. Once identified, loops can be used to fix the models by imposing
bounds on reactions.
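The harmless case can be sketched on a toy network. The example below assumes that a loop has already been identified and is given as a vector with zero entries on the exchange reactions; it then subtracts the multiple of the loop that minimizes the sum of absolute fluxes. The network, the flux values and the function names are illustrative assumptions, and this one-dimensional search is a simplified stand-in for a flux minimization over the whole network.

```python
# Toy network (an assumption for illustration): metabolites A, B, C and
# reactions EX_in: -> A, R1: A -> B, R2: B -> C, R3: C -> A, EX_out: B ->
S = [[1, -1, 0, 1, 0],   # A
     [0, 1, -1, 0, -1],  # B
     [0, 0, 1, -1, 0]]   # C


def mass_balance(stoichiometry, flux):
    """Net production rate of each metabolite (all zeros at steady state)."""
    return [sum(s * v for s, v in zip(row, flux)) for row in stoichiometry]


def remove_loop(flux, loop):
    """Subtract the multiple of `loop` that minimises the L1 norm of the flux.

    The L1 norm is piecewise linear and convex in the multiple t, so its
    minimum lies at a breakpoint t = v_i / c_i. Reaction bounds are not
    enforced here and should be checked on the result.
    """
    candidates = [0.0] + [v / c for v, c in zip(flux, loop) if c != 0]

    def l1_norm(t):
        return sum(abs(v - t * c) for v, c in zip(flux, loop))

    t_best = min(candidates, key=l1_norm)
    return [v - t_best * c for v, c in zip(flux, loop)]


flux = [1.0, 1.5, 0.5, 0.5, 1.0]   # steady state containing the loop R1-R2-R3
loop = [0.0, 1.0, 1.0, 1.0, 0.0]   # loop vector: zero on exchange reactions
cleaned = remove_loop(flux, loop)
print(cleaned, mass_balance(S, cleaned))
```

Because the loop vector vanishes on the exchange reactions and satisfies mass balance, the cleaned flux keeps the same uptake and secretion and remains a steady state. When an objective or a maintenance flux pins the loop in place, as in the Recon–2 example, no such subtraction is available.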
Uniform sampling of steady state fluxes in genome–scale models is technically
feasible [37], but loops are usually included. An open question is how to uniformly
sample thermodynamically feasible flux configurations; this is much harder, since
the solution space is non–convex. A simple projection (including minimization of
some norm, see Sect. 2.2.6) is able to remove the loops, but does not yield a
uniform distribution of the fluxes. Efficient uniform sampling is likely to require
preliminary knowledge of the possible loops, that is, dissecting the solution space
into loop–free convex subspaces.
Constrained Allocation Flux Balance Analysis
CAFBA shows that the integration of genome–scale models of metabolism and regulation may be possible using a few simple constraints, without using detailed information about concentrations and the kinetic parameters of the single reactions (which
are largely unknown). This remarkable finding is probably due to the existence of
a few main regulation systems, such as the one mediated by the Crp–cAMP complex.
We confirmed that the acetate switch and the reduced expression of TCA genes
at high growth rates are part of a deliberate strategy [127, 8] operated by the cell
to efficiently express its genes in rich media, when protein synthesis is the main
bottleneck to growth.
Sticking to the interpretation of the weights as inverse turnover numbers (see
Section 3.2.1), we observe that correct results are obtained for weight fluctuations
that are much smaller than the actual variations among turnover numbers from reaction to reaction. This suggests that some kind of cross–regulation among enzyme
levels, kinetic constants and concentrations [3, 91] should be reflected in the weights.
Further exploration is needed to confirm this hypothesis.
More work on CAFBA should be done to assess its range of applicability. For
instance, one should check whether CAFBA predicts overflow metabolism in organisms other than Escherichia coli, such as Lactococcus lactis (excreting lactate and
formate) and Saccharomyces cerevisiae (excreting ethanol) [129]. The randomization
procedure carried out in Sect. 3.6 can in principle be validated using single–cell measurements of growth and enzyme levels [124, 73, 64]. The ability of CAFBA to accurately
model fermentation can be used to model microbial communities [55], or to study
how evolutionary pressure shaped metabolic networks [7] by evolving cells
in different environments.
Growth laws, role of the offsets and dynamics
As noted before, one of the main take–home messages of this thesis work is that
statics and dynamics often give complementary information. This is particularly
true when we think about regulation, which ultimately is the way cells adapt to different
environmental conditions.
The work presented in Chapter 4 can pave the way to new experiments on
the growth laws. The validity of the concurrent use of the growth laws and Dennis
and Bremer’s instantaneous model can be tested by performing upshifts in different
conditions, e.g. varying the amount of translation–inhibiting antibiotics or reducing
nitrogen availability. The growth rate increase at the upshift (λ0 − λi) can be related
to the offsets and slopes of the various proteome sectors.
A realistic kinetic model of the various proteome sectors can also be formulated,
using what is known about the Crp regulon, ppGpp–mediated regulation of ribosome–affiliated proteins,
etc., and building on the work of You et al. [137].
Bibliography
[1] GLPK - GNU Linear Programming Kit. http://www.gnu.org/software/glpk.
[2] GLPKMEX - a Matlab MEX interface for the GLPK library.
http://sourceforge.net/projects/glpkmex/.
[3] Adadi, R., Volkmer, B., Milo, R., Heinemann, M., and Shlomi, T.
Prediction of microbial growth rate versus biomass yield by a metabolic
network with kinetic parameters. PLoS computational biology, 8 (2012),
e1002575.
[4] Balaban, N. Q., Merrin, J., Chait, R., Kowalik, L., and Leibler, S.
Bacterial persistence as a phenotypic switch. Science, 305 (2004), 1622.
[5] Baneyx, F. Recombinant protein expression in Escherichia coli. Current
opinion in biotechnology, 10 (1999), 411.
[6] Barabasi, A.-L. and Oltvai, Z. N. Network biology: understanding the
cell’s functional organization. Nature Reviews Genetics, 5 (2004), 101.
[7] Bardoscia, M., Marsili, M., and Samal, A. Phenotypic constraints
promote latent versatility and carbon efficiency in metabolic networks. arXiv
preprint arXiv:1408.4555, (2014).
[8] Basan, M., Hui, S., Zhang, Z., Shen, Y., Williamson, J. R., and Hwa,
T. Efficient allocation of proteomic resources for energy metabolism results
in acetate overflow. In preparation, (2014).
[9] Beard, D. A., Babson, E., Curtis, E., and Qian, H. Thermodynamic constraints for biochemical networks. Journal of theoretical biology,
228 (2004), 327.
[10] Beard, D. A., Liang, S.-d., and Qian, H. Energy balance for analysis of
complex metabolic networks. Biophysical journal, 83 (2002), 79.
[11] Beard, D. A. and Qian, H. Relationship between thermodynamic driving
force and one-way fluxes in reversible processes. PLoS One, 2 (2007), e144.
[12] Beard, D. A. and Qian, H. Chemical biophysics: Quantitative analysis of
cellular systems. Cambridge University Press (2008).
[13] Beg, Q., Vazquez, A., Ernst, J., De Menezes, M., Bar-Joseph,
Z., Barabási, A.-L., and Oltvai, Z. Intracellular crowding defines the
mode and sequence of substrate uptake by Escherichia coli and constrains
its metabolic activity. Proceedings of the National Academy of Sciences, 104
(2007), 12663.
[14] Benyamini, T., Folger, O., Ruppin, E., and Shlomi, T. Method flux
balance analysis accounting for metabolite dilution. Genome Biol., 11 (2010),
R43.
[15] Bialek, W. Biophysics: searching for principles. Princeton University Press
(2012).
[16] Binder, K. and Heermann, D. Monte Carlo simulation in statistical
physics: an introduction. Springer (2010).
[17] Bokinsky, G., et al. Synthesis of three advanced biofuels from ionic liquid-pretreated switchgrass using engineered Escherichia coli. Proceedings of the
National Academy of Sciences, 108 (2011), 19949.
[18] Boogerd, F., Bruggeman, F. J., Hofmeyr, J.-H. S., and Westerhoff, H. V. Systems biology: philosophical foundations. Elsevier (2007).
[19] Bremer, H. and Dennis, P. P. Transition period following a nutritional
shift-up in the bacterium Escherichia coli B/r: Stable RNA and protein synthesis. Journal of theoretical biology, 52 (1975), 365.
[20] Bremer, H., Dennis, P. P., et al. Modulation of chemical composition and other parameters of the cell by growth rate. Escherichia coli and
Salmonella: cellular and molecular biology, 2 (1996), 1553.
[21] Chassagnole, C., Noisommit-Rizzi, N., Schmid, J. W., Mauch, K.,
and Reuss, M. Dynamic modeling of the central carbon metabolism of
Escherichia coli. Biotechnology and bioengineering, 79 (2002), 53.
[22] Chen, R. Bacterial expression systems for recombinant protein production:
E. coli and beyond. Biotechnology advances, 30 (2012), 1102.
[23] Condon, C., Liveris, D., Squires, C., Schwartz, I., and Squires,
C. L. rRNA operon multiplicity in Escherichia coli and the physiological
implications of rrn inactivation. Journal of bacteriology, 177 (1995), 4152.
[24] Cooper, S. What is the bacterial growth law during the division cycle?
Journal of bacteriology, 170 (1988), 5001.
[25] Cooper, S. On the fiftieth anniversary of the Schaechter, Maaløe, Kjeldgaard
experiments: implications for cell-cycle and cell-growth control. BioEssays,
30 (2008), 1019.
[26] Cornish-Bowden, A. Fundamentals of enzyme kinetics. John Wiley & Sons
(2013).
[27] Covert, M. W. and Palsson, B. Ø. Transcriptional regulation in
constraints-based metabolic models of Escherichia coli. Journal of Biological Chemistry, 277 (2002), 28058.
[28] Covert, M. W., Schilling, C. H., and Palsson, B. Regulation of gene
expression in flux balance models of metabolism. Journal of theoretical biology,
213 (2001), 73.
[29] Csonka, L. N., Ikeda, T. P., Fletcher, S. A., and Kustu, S. The
accumulation of glutamate is necessary for optimal growth of Salmonella typhimurium in media of high osmolality but not induction of the proU operon.
Journal of bacteriology, 176 (1994), 6324.
[30] Dashko, S., Compagno, C., and Piškur, J. Why, when and how did yeast
evolve alcoholic fermentation? FEMS yeast research, (2014).
[31] De Deken, R. The Crabtree effect: a regulatory system in yeast. Journal of
General Microbiology, 44 (1966), 149.
[32] De Martino, A., De Martino, D., Mulet, R., and Uguzzoni, G. Reaction networks as systems for resource allocation: A variational principle for
their non-equilibrium steady states. PloS one, 7 (2012), e39849.
[33] De Martino, A. and Marinari, E. The solution space of metabolic networks: Producibility, robustness and fluctuations. In Journal of Physics: Conference Series, vol. 233, p. 012019. IOP Publishing (2010).
[34] De Martino, D. Thermodynamics of biochemical networks and duality
theorems. Physical Review E, 87 (2013), 052108.
[35] De Martino, D., Capuani, F., Mori, M., De Martino, A., and Marinari, E. Counting and correcting thermodynamically infeasible flux cycles
in genome-scale metabolic networks. Metabolites, 3 (2013), 946.
[36] De Martino, D., Figliuzzi, M., De Martino, A., and Marinari, E.
A scalable algorithm to explore the Gibbs energy landscape of genome–scale
metabolic networks. PLoS computational biology, 8 (2012), e1002562.
[37] De Martino, D., Mori, M., and Parisi, V. Uniform sampling of steady
states in metabolic networks: heterogeneous scales and rounding. Submitted
to PLoS ONE, (2014).
[38] Dennis, P. P. and Bremer, H. Differential rate of ribosomal protein synthesis in Escherichia coli B/r. Journal of Molecular Biology, 84 (1974), 407.
[39] Dennis, P. P. and Bremer, H. Macromolecular composition during steady-state growth of Escherichia coli B/r. Journal of bacteriology, 119 (1974), 270.
[40] Diaz-Ruiz, R., Rigoulet, M., and Devin, A. The Warburg and Crabtree
effects: on the origin of cancer cell energy metabolism and of yeast glucose
repression. Biochimica et Biophysica Acta (BBA)-Bioenergetics, 1807 (2011),
568.
[41] Edwards, J. and Palsson, B. The Escherichia coli MG1655 in silico
metabolic genotype: its definition, characteristics, and capabilities. Proceedings of the National Academy of Sciences, 97 (2000), 5528.
[42] Ehrenberg, M., Bremer, H., and Dennis, P. P. Medium-dependent
control of the bacterial growth rate. Biochimie, 95 (2013), 643.
[43] Erickson, D., Mori, M., and Schink, S. Investing in a proteome stock
as strategy to cope with environmental changes. In preparation, (2014).
[44] Famili, I. and Palsson, B. O. The convex basis of the left null space of
the stoichiometric matrix leads to the definition of metabolically meaningful
pools. Biophysical journal, 85 (2003), 16.
[45] Feist, A. M., Henry, C. S., Reed, J. L., Krummenacker, M., Joyce,
A. R., Karp, P. D., Broadbelt, L. J., Hatzimanikatis, V., and Palsson, B. Ø. A genome-scale metabolic reconstruction for Escherichia coli
K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information.
Molecular systems biology, 3 (2007).
[46] Feist, A. M. and Palsson, B. O. The biomass objective function. Current
opinion in microbiology, 13 (2010), 344.
[47] Fendt, S.-M., Buescher, J. M., Rudroff, F., Picotti, P., Zamboni,
N., and Sauer, U. Tradeoff between enzyme and metabolite efficiency maintains metabolic homeostasis upon perturbations in enzyme capacity. Molecular systems biology, 6 (2010).
[48] Ferreira, L. M. Cancer metabolism: the Warburg effect today. Experimental
and molecular pathology, 89 (2010), 372.
[49] Flamholz, A., Noor, E., Bar-Even, A., Liebermeister, W., and
Milo, R. Glycolytic strategy as a tradeoff between energy yield and protein
cost. Proceedings of the National Academy of Sciences, 110 (2013), 10039.
[50] Frey, E. and Kroy, K. Brownian motion: a paradigm of soft matter and
biological physics. Annalen der Physik, 14 (2005), 20.
[51] Gaspard, P. Fluctuation theorem for nonequilibrium reactions. The Journal
of chemical physics, 120 (2004), 8898.
[52] Ge, H., Qian, M., and Qian, H. Stochastic theory of nonequilibrium
steady states. Part II: Applications in chemical biophysics. Physics Reports,
510 (2012), 87.
[53] Goeddel, D. V., et al. Expression in Escherichia coli of chemically synthesized genes for human insulin. Proceedings of the National Academy of
Sciences, 76 (1979), 106.
[54] Görke, B. and Stülke, J. Carbon catabolite repression in bacteria: many
ways to make the most out of nutrients. Nature Reviews Microbiology, 6
(2008), 613.
[55] Harcombe, W. R., et al. Metabolic resource allocation in individual microbes determines ecosystem interactions and spatial dynamics. Cell reports,
7 (2014), 1104.
[56] Henry, C. S., Broadbelt, L. J., and Hatzimanikatis, V.
Thermodynamics-based metabolic flux analysis. Biophysical journal, 92
(2007), 1792.
[57] Hoppe, A., Hoffmann, S., and Holzhütter, H.-G. Including metabolite
concentrations into flux balance analysis: thermodynamic realizability as a
constraint on flux distributions in metabolic networks. BMC systems biology,
1 (2007), 23.
[58] Hui, T. et al. Quantitative mass spectrometry reveals simple proteome
partition in Escherichia coli. Submitted to Cell, (2014).
[59] Ibarra, R. U., Edwards, J. S., and Palsson, B. O. Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth.
Nature, 420 (2002), 186.
[60] Johnson, D. B. Finding all the elementary circuits of a directed graph.
SIAM Journal on Computing, 4 (1975), 77.
[61] Keseler, I. M., et al. EcoCyc: a comprehensive database of Escherichia
coli biology. Nucleic acids research, 39 (2011), D583.
[62] Kim, B. H. and Gadd, G. M. Bacterial physiology and metabolism. Cambridge University Press (2008).
[63] Kitano, H. Systems biology: a brief overview. Science, 295 (2002), 1662.
[64] Kiviet, D. J., Nghe, P., Walker, N., Boulineau, S., Sunderlikova,
V., and Tans, S. J. Stochasticity of metabolism and growth at the single-cell
level. Nature, (2014).
[65] Kjeldgaard, N. and Kurland, C. The distribution of soluble and ribosomal RNA as a function of growth rate. Journal of Molecular Biology, 6 (1963),
341.
[66] Kjeldgaard, N., Maaløe, O., and Schaechter, M. The transition
between different physiological states during balanced growth of Salmonella
typhimurium. Journal of general microbiology, 19 (1958), 607.
[67] Klappenbach, J. A., Dunbar, J. M., and Schmidt, T. M. rRNA operon
copy number reflects ecological strategies of bacteria. Applied and environmental microbiology, 66 (2000), 1328.
[68] Klumpp, S., Scott, M., Pedersen, S., and Hwa, T. Molecular crowding
limits translation and cell growth. Proceedings of the National Academy of
Sciences, 110 (2013), 16754.
[69] Koch, A. L. Overall controls on the biosynthesis of ribosomes in growing
bacteria. Journal of theoretical biology, 28 (1970), 203.
[70] Koch, A. L. The adaptive responses of Escherichia coli to a feast and famine
existence. Advances in microbial physiology, 6 (1971), 147.
[71] Koch, A. L. Growth measurement. Methods for general and molecular
bacteriology, (1994), 248.
[72] Krauth, W. and Mézard, M. Learning algorithms with optimal stability
in neural networks. Journal of Physics A: Mathematical and General, 20
(1987), L745.
[73] Labhsetwar, P., Cole, J. A., Roberts, E., Price, N. D., and Luthey-Schulten, Z. A. Heterogeneity in protein expression induces metabolic variability in a modeled Escherichia coli population. Proceedings of the National
Academy of Sciences, 110 (2013), 14006.
[74] Lemuth, K., Hardiman, T., Winter, S., Pfeiffer, D., Keller, M.,
Lange, S., Reuss, M., Schmid, R., and Siemann-Herzberg, M. Global
transcription and metabolic flux analysis of Escherichia coli in glucose-limited
fed-batch cultivations. Applied and environmental microbiology, 74 (2008),
7002.
[75] Lerman, J. A., et al. In silico method for modelling metabolism and gene
product expression at genome scale. Nature communications, 3 (2012), 929.
[76] Licht, T. R., Tolker-Nielsen, T., Holmstrøm, K., Krogfelt, K. A.,
and Molin, S. Inhibition of Escherichia coli precursor-16S rRNA processing by
mouse intestinal contents. Environmental microbiology, 1 (1999), 23.
[77] Mahadevan, R., Edwards, J. S., and Doyle III, F. J. Dynamic flux
balance analysis of diauxic growth in Escherichia coli. Biophysical journal, 83
(2002), 1331.
[78] Mahadevan, R. and Schilling, C. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metabolic engineering, 5 (2003), 264.
[79] Majewski, R. and Domach, M. Simple constrained-optimization view of
acetate overflow in E. coli. Biotechnology and bioengineering, 35 (1990), 732.
[80] Mast, F. D., Ratushny, A. V., and Aitchison, J. D. Systems cell
biology. The Journal of cell biology, 206 (2014), 695.
[81] Mezard, M. and Montanari, A. Information, physics, and computation.
Oxford University Press (2009).
[82] Mikkola, R. and Kurland, C. Is there a unique ribosome phenotype for
naturally occurring Escherichia coli? Biochimie, 73 (1991), 1061.
[83] Miller, S., Lesk, A. M., Janin, J., Chothia, C., et al. The accessible
surface area and stability of oligomeric proteins. Nature, 328 (1987), 834.
[84] Molenaar, D., van Berlo, R., de Ridder, D., and Teusink, B. Shifts
in growth strategies reflect tradeoffs in cellular economics. Molecular systems
biology, 5 (2009).
[85] Monod, J. The growth of bacterial cultures. Annual Reviews in Microbiology,
3 (1949), 371.
[86] Mori, M., De Martino, A., Marinari, E., and Hwa, T. Constrained
Allocation Flux Balance Analysis. In preparation for Molecular Systems Biology.
[87] Mori, M. et al. Pareto–optimality of Constrained Allocation Flux Balance Analysis predictions and comparison with other constraint based models.
Working title, in preparation, (2015).
[88] Nath, K. and Koch, A. L. Protein degradation in Escherichia coli I. Measurement of rapidly and slowly decaying components. Journal of Biological
Chemistry, 245 (1970), 2889.
[89] Neidhardt, F. C. Bacterial growth: Constant obsession with dn/dt. Journal
of bacteriology, 181 (1999), 7405.
[90] Neidhardt, F. C., Ingraham, J. L., and Schaechter, M. Physiology of
the bacterial cell: a molecular approach. Sinauer Associates Sunderland, MA
(1990).
[91] Noor, E., Bar-Even, A., Flamholz, A., Reznik, E., Liebermeister,
W., and Milo, R. Pathway thermodynamics highlights kinetic obstacles in
central metabolism. PLoS computational biology, 10 (2014), e1003483.
[92] O’Brien, E. J., Lerman, J. A., Chang, R. L., Hyduke, D. R., and
Palsson, B. Ø. Genome-scale models of metabolism and gene expression
extend and refine growth phenotype prediction. Molecular systems biology, 9
(2013).
[93] Orth, J. D., Conrad, T. M., Na, J., Lerman, J. A., Nam, H., Feist,
A. M., and Palsson, B. Ø. A comprehensive genome-scale reconstruction
of Escherichia coli metabolism. Molecular systems biology, 7 (2011).
[94] Orth, J. D., Thiele, I., and Palsson, B. Ø. What is flux balance analysis?
Nature biotechnology, 28 (2010), 245.
[95] Palsson, B. O. Systems biology. Cambridge University Press (2006).
[96] Panikov, N. S. Microbial growth kinetics. Springer (1995).
[97] Paul, B. J., Ross, W., Gaal, T., and Gourse, R. L. rRNA transcription
in Escherichia coli. Annu. Rev. Genet., 38 (2004), 749.
[98] Pedersen, S. Escherichia coli ribosomes translate in vivo with variable rate.
The EMBO journal, 3 (1984), 2895.
[99] Potrykus, K. and Cashel, M. (p)ppGpp: Still Magical? Annu. Rev.
Microbiol., 62 (2008), 35.
[100] Poulsen, L. K., Licht, T. R., Rang, C., Krogfelt, K. A., and Molin,
S. Physiological state of Escherichia coli BJ4 growing in the large intestines of
streptomycin-treated mice. Journal of bacteriology, 177 (1995), 5840.
[101] Pramanik, J. and Keasling, J. Stoichiometric model of Escherichia coli
metabolism: incorporation of growth-rate dependent biomass composition
and mechanistic energy requirements. Biotechnology and bioengineering, 56
(1997), 398.
[102] Price, N. D., Famili, I., Beard, D. A., and Palsson, B. Ø. Extreme
pathways and Kirchhoff’s second law. Biophysical journal, 83 (2002), 2879.
[103] Price, N. D., Schellenberger, J., and Palsson, B. O. Uniform sampling of steady-state flux spaces: means to design experiments and to interpret
enzymopathies. Biophysical journal, 87 (2004), 2172.
[104] Qian, H. and Beard, D. A. Thermodynamics of stoichiometric biochemical
networks in living systems far from equilibrium. Biophysical chemistry, 114
(2005), 213.
[105] Qian, H., Beard, D. A., and Liang, S.-d. Stoichiometric network theory
for nonequilibrium biochemical systems. European Journal of Biochemistry,
270 (2003), 415.
[106] Reed, J. L., Vo, T. D., Schilling, C. H., Palsson, B. O., et al. An
expanded genome–scale model of Escherichia coli K-12 (iJR904 GSM/GPR).
Genome Biol, 4 (2003), R54.
[107] Sauer, U., Canonaco, F., Heri, S., Perrenoud, A., and Fischer, E.
The soluble and membrane-bound transhydrogenases UdhA and PntAB have
divergent functions in NADPH metabolism of Escherichia coli. Journal of
Biological Chemistry, 279 (2004), 6613.
[108] Savageau, M. A. Escherichia coli habitats, cell types, and molecular mechanisms of gene control. American Naturalist, (1983), 732.
[109] Schaechter, M., Maaløe, O., and Kjeldgaard, N. Dependency on
medium and temperature of cell size and chemical composition during balanced growth of Salmonella typhimurium. Journal of General Microbiology,
19 (1958), 592.
[110] Schellenberger, J., et al. Quantitative prediction of cellular metabolism
with constraint-based models: the COBRA Toolbox v2.0. Nature protocols, 6
(2011), 1290.
[111] Schilling, C. H., Letscher, D., and Palsson, B. Ø. Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic
function from a pathway-oriented perspective. Journal of theoretical biology,
203 (2000), 229.
[112] Schomburg, I., et al. BRENDA in 2013: integrated reactions, kinetic data,
enzyme function data, improved disease classification: new options and contents in BRENDA. Nucleic acids research, 41 (2013), D764.
[113] Schrijver, A. Theory of linear and integer programming. John Wiley &
Sons (1998).
[114] Scott, M., Gunderson, C. W., Mateescu, E. M., Zhang, Z., and
Hwa, T. Interdependence of cell growth and gene expression: origins and
consequences. Science, 330 (2010), 1099.
[115] Scott, M. and Hwa, T. Bacterial growth laws and their applications.
Current opinion in biotechnology, 22 (2011), 559.
[116] Segre, D., Vitkup, D., and Church, G. M. Analysis of optimality in natural and perturbed metabolic networks. Proceedings of the National Academy
of Sciences, 99 (2002), 15112.
[117] Seifert, U. Entropy production along a stochastic trajectory and an integral
fluctuation theorem. Physical review letters, 95 (2005), 040602.
[118] Sevick, E., Prabhakar, R., Williams, S. R., and Searles, D. J. Fluctuation theorems. arXiv preprint arXiv:0709.3888, (2007).
[119] Shlomi, T., Berkman, O., and Ruppin, E. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proceedings of the
National Academy of Sciences of the United States of America, 102 (2005),
7695.
[120] Shlomi, T., Cabili, M. N., Herrgård, M. J., Palsson, B. Ø., and
Ruppin, E. Network-based prediction of human tissue-specific metabolism.
Nature biotechnology, 26 (2008), 1003.
[121] Soh, K. C. and Hatzimanikatis, V. Network thermodynamics in the post-genomic era. Current opinion in microbiology, 13 (2010), 350.
[122] Solopova, A., van Gestel, J., Weissing, F. J., Bachmann, H.,
Teusink, B., Kok, J., and Kuipers, O. P. Bet-hedging during bacterial diauxic shift. Proceedings of the National Academy of Sciences, (2014),
201320063.
[123] Steuer, R., Gross, T., Selbig, J., and Blasius, B. Structural kinetic
modeling of metabolic networks. Proceedings of the National Academy of
Sciences, 103 (2006), 11868.
[124] Taniguchi, Y., Choi, P. J., Li, G.-W., Chen, H., Babu, M., Hearn, J.,
Emili, A., and Xie, X. S. Quantifying E. coli proteome and transcriptome
with single-molecule sensitivity in single cells. Science, 329 (2010), 533.
[125] Taymaz-Nikerel, H., Borujeni, A. E., Verheijen, P. J., Heijnen,
J. J., and van Gulik, W. M. Genome–derived minimal metabolic models for Escherichia coli MG1655 with estimated in vivo respiratory ATP stoichiometry. Biotechnology and bioengineering, 107 (2010), 369.
[126] Thiele, I., et al. A community-driven global reconstruction of human
metabolism. Nature biotechnology, 31 (2013), 419.
[127] Valgepea, K., Adamberg, K., Nahku, R., Lahtvee, P.-J., Arike, L.,
and Vilu, R. Systems biology approach reveals that overflow metabolism
of acetate in Escherichia coli is triggered by carbon catabolite repression of
acetyl-CoA synthetase. BMC systems biology, 4 (2010), 166.
[128] Valgepea, K., Adamberg, K., Seiman, A., and Vilu, R. Escherichia
coli achieves faster growth by increasing catalytic and translation rates of
proteins. Molecular BioSystems, 9 (2013), 2344.
[129] van Hoek, M. J. and Merks, R. M. Redox balance is key to explaining
full vs. partial switching to low-yield metabolism. BMC systems biology, 6
(2012), 22.
[130] Van Regenmortel, M. H. Reductionism and complexity in molecular biology. EMBO reports, 5 (2004), 1016.
[131] Veening, J.-W., Smits, W. K., and Kuipers, O. P. Bistability, epigenetics, and bet-hedging in bacteria. Annu. Rev. Microbiol., 62 (2008), 193.
[132] Vemuri, G., Altman, E., Sangurdekar, D., Khodursky, A., and
Eiteman, M. Overflow metabolism in Escherichia coli during steady-state
growth: transcriptional regulation and effect of the redox ratio. Applied and
environmental microbiology, 72 (2006), 3653.
[133] Wiback, S. J., Famili, I., Greenberg, H. J., and Palsson, B. Ø. Monte
Carlo sampling can be used to determine the size and shape of the steady-state
flux space. Journal of theoretical biology, 228 (2004), 437.
[134] Woese, C. R. A new biology for a new century. Microbiology and Molecular
Biology Reviews, 68 (2004), 173.
[135] Wolfe, A. J. The acetate switch. Microbiology and Molecular Biology Reviews, 69 (2005), 12.
[136] Wright, J. and Wagner, A. Exhaustive identification of steady state
cycles in large stoichiometric networks. BMC systems biology, 2 (2008), 61.
[137] You, C., et al. Coordination of bacterial proteome with metabolism by
cyclic AMP signalling. Nature, 500 (2013), 301.
[138] Young, R. and Bremer, H. Polypeptide–chain–elongation rate in Escherichia coli B/r as a function of growth rate. Biochem. J, 160 (1976),
185.
[139] Zheng, D., Constantinidou, C., Hobman, J. L., and Minchin, S. D.
Identification of the CRP regulon using in vitro and in vivo transcriptional
profiling. Nucleic acids research, 32 (2004), 5874.
[140] Zhuang, K., Vemuri, G. N., and Mahadevan, R. Economics of membrane occupancy and respiro-fermentation. Molecular systems biology, 7
(2011).