Unraveling Bacterial Growth Laws: Coupling Energetics and Regulation in Cell Metabolism

Scuola di dottorato Vito Volterra
Dottorato di Ricerca in Fisica – XXVII Ciclo

Candidate: Matteo Mori (ID number 1178192)
Thesis Advisors: Prof. Enzo Marinari, Prof. Andrea De Martino

A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Physics
December 2014

Thesis not yet defended

Ph.D. thesis. Sapienza – University of Rome
© 2014 Matteo Mori. All rights reserved
This thesis has been typeset by LaTeX and the Sapthesis class.
Author’s email: [email protected]

To all those who have put up with me over these years. Do not expect to be thanked one by one: it would take far too long, and besides, enough writing already.

Abstract

In this dissertation I study genome–scale modelling of metabolic networks, coarse–grained proteome allocation models, and their integration. Constraint–based modelling is nowadays widely used in molecular biology and bioengineering due to its simplicity. I underline the importance of thermodynamic constraints in such models of metabolism, and I describe a new technique for identifying flux cycles in metabolic networks. The main result of my Ph.D. studies is a new framework, called Constrained Allocation Flux Balance Analysis (CAFBA). CAFBA integrates constraint–based modelling with a simple model of proteome allocation, based on empirical growth laws, developed by T. Hwa and coworkers since 2010; the resulting algorithm is a simple linear programming problem in the case of biomass production rate optimization. CAFBA allows for accurate modelling of overflow metabolism in Escherichia coli, and a few parameters can be adjusted to fit strain–specific differences, such as the maximum growth rate.
Our results show that overflow metabolism results from the tradeoff between high–yield metabolism and its cost in terms of protein production, and suggest some unexplored correlations among reaction fluxes, enzyme levels, kinetic constants, and metabolite levels. Although the proteome growth laws describe in a simple and effective way how the proteome adjusts to cope with different sources of growth limitation, some aspects remain puzzling. In particular, the proteome appears to be used inefficiently. I explored the role of this proteome excess during nutritional shifts, that is, sudden variations in nutrient availability. The results suggest that part of the proteome is purposely allocated to speed up growth during transitions from poor to rich substrates. This is confirmed by a study of the optimality of such excess in upshift scenarios.

Contents

List of Figures
Introduction
1 Modeling bacterial growth: an overview
  1.1 Bacterial physiology
    1.1.1 Metabolism
    1.1.2 Growth laws
  1.2 Genome scale modeling of metabolic networks
    1.2.1 Fluxes normalization and dilution terms
    1.2.2 Fundamental subspaces of the stoichiometric matrix
    1.2.3 Flux Balance Analysis
    1.2.4 Other frameworks
2 Thermodynamics in metabolic networks
  2.1 Thermodynamics in a chemical network
    2.1.1 Gibbs ensemble and chemical reactions
    2.1.2 Chemical potentials and concentrations
    2.1.3 Second principle of thermodynamics, directionality of fluxes and non–equilibrium steady state
    2.1.4 Duality theorems
  2.2 Algorithm for identifying and correcting infeasible loops
    2.2.1 Structure of the Algorithm
    2.2.2 Checking Thermodynamic Viability by Relaxation
    2.2.3 Identifying Loops by Monte Carlo
    2.2.4 Correcting the Flux Configuration: a Toy Model
    2.2.5 Correcting the Flux Configuration: Local Strategy
    2.2.6 Correcting the Flux Configuration: Global Strategy
  2.3 Loops in the E. coli Network iAF1260
  2.4 Correcting infeasible flux configurations in the Recon-2 human networks
    2.4.1 Inconsistencies in the FBA Solution for the Overall Human Reactome
    2.4.2 Correcting Infeasible Loops in FBA Solutions for Cell-Type Specific Human Metabolic Networks
  2.5 Discussions
3 Constrained Allocation Flux Balance Analysis
  3.1 Introduction
  3.2 Proteome allocation constraint
    3.2.1 Enzyme–limited kinetics
    3.2.2 Thermodynamics and estimates of enzyme levels
    3.2.3 Interpretation of the constitutive relation as lower bound on the enzyme proteome fractions
  3.3 Constrained Allocation FBA
    3.3.1 Details on CAFBA implementation
  3.4 Results: Homogeneous case
    3.4.1 Choice of the control parameter: flux versus weight
    3.4.2 CAFBA and the growth laws
  3.5 CAFBA as tradeoff between two optimal solutions
  3.6 Inhomogeneous case
    3.6.1 Results
    3.6.2 Fixing the maximum growth rate for a synthetic strain
    3.6.3 Optimal regulation for different carbon sources
  3.7 Growth rate-dependent biomass composition
    3.7.1 RNA to protein ratio
    3.7.2 Implementation of a variable biomass in (CA)FBA
    3.7.3 Energy requirements
  3.8 Discussion
    3.8.1 Conclusions
4 Beyond steady state
  4.1 What are the proteome offsets for?
  4.2 Coarse–grained model of protein synthesis and allocation
  4.3 Upshifts
    4.3.1 Estimate for the R–sector offset
    4.3.2 Efficiencies and the other offsets
    4.3.3 Upshifts: Lower bounds to other offsets
  4.4 Experimental results
  4.5 Upshift fitness landscape
    4.5.1 Analytical solution in a simple case
    4.5.2 Comparing different environments
  4.6 Discussion
5 Statics and dynamics in a small model
  5.1 Steady state
    5.1.1 Randomization
  5.2 Kinetic model
    5.2.1 Optimal proteome allocation in a dynamical environment
    5.2.2 Results for constant G/T ratio
    5.2.3 Optimal G/T tradeoff exploitation
  5.3 Discussion
Conclusion

List of Figures

1.1 Schematic representation of a prokaryote bacterium.
1.2 Central dogma of molecular biology.
1.3 Scheme of intermediate metabolism.
1.4 Growth laws, RNA/protein ratio.
1.5 Illustration of Flux Balance Analysis.
2.1 Flowchart of the algorithm for counting and removing cycles from a flux configuration.
2.2 Toy reaction network.
2.3 Loops in E. coli iAF1260 model as a function of the number of random configurations tested.
2.4 Lengths of loops in E. coli iAF1260 model.
2.5 Example of loop created by two transport reactions and ATP hydrolysis.
3.1 Molecular masses and turnover numbers of enzymes in E. coli.
3.2 CAFBA results using E. coli enzymes’ molecular masses and turnover numbers.
3.3 CAFBA main results in the homogeneous (no randomization) case.
3.4 Illustration of CAFBA optimization and glucose uptake flux.
3.5 Comparison between different procedures to obtain growth limitation reducing carbon intake. Growth rate and acetate excretion.
3.6 Comparison between different procedures to obtain growth limitation reducing carbon intake. Fluxes.
3.7 Linear fits used to determine the analytical form of growth rate with respect to w_C and w_R.
3.8 Surface plot of growth rate and acetate excretion as a function of k_C = 1/w_C and k_R = 1/w_R.
3.9 Optimal proteome allocation in C– and R–limitation.
3.10 R–sector and fluxes in R–limitation.
3.11 Growth rate and fluxes in Q–limitation.
3.12 Overlap between FBA and CAFBA solutions.
3.13 Feasibility region in the λ–φ_E plane.
3.14 Optimal fluxes in the λ–φ_E plane. CAFBA solution is also shown.
3.15 CAFBA main results in the randomized case. CAFBA fits acetate excretion and growth yield data from Basan et al. (2014).
3.16 CAFBA fits acetate excretion and growth yield data from Vemuri et al. (2006).
3.17 Comparison between different distribution functions for the weights in the proteome allocation constraint.
3.18 CAFBA shows that NADH/NADPH transhydrogenase fluxes reverse their overall direction in carbon limitation.
3.19 Flux differences for all reactions in the main catabolic pathways in carbon (glucose) limitation.
3.20 Scatter plots of CAFBA solutions.
3.21 Histograms of CAFBA solutions obtained at high and medium growth rates.
3.22 Exploring different fluctuation ranges: fluxes.
3.23 Exploring different fluctuation ranges: L1 norm and growth yield.
3.24 CAFBA fluxes for different carbon sources.
3.25 Acetate excretion in C– and Q–limitation, using four different carbon sources.
3.26 Growth rate dependent biomass: Convergence of the iterative procedure.
3.27 Growth rate dependent biomass: Biomass prescription functions.
3.28 Growth rate dependent biomass: RNA/proteome ratio.
3.29 Growth rate dependent biomass: CAFBA solutions for three different values of growth–rate dependent ATP hydrolysis flux.
4.1 A: Proteome partition model. B: Experimental data and fit. C: Fitness landscape.
4.2 Single upshift experiment.
4.3 Analytical results for offsets optimality.
4.4 Fitness landscape as a function of normalized offset and upshift time.
4.5 Fitness landscape as a function of upshift time and initial carbon source quality.
4.6 Fitness landscape as a function of initial and final carbon source quality.
5.1 Coarse grained model of metabolism.
5.2 FBA and CAFBA solutions of the coarse grained model.
5.3 Numerical solutions of the kinetic model.
5.4 Results of growth laws optimization as a function of period T and nutrient availability s.
5.5 Optimal proteome partitioning for different T and s.
5.6 Optimal G/T protein partitioning in the kinetic model.

Introduction

Twentieth century biology has been strongly influenced by the reductionist approach of molecular biology [130, 134]. Starting from the cell hypothesis at the beginning of the XIX century, the microscopic constituents of living organisms have been unveiled, and their function progressively clarified.
Even though this approach has proved extremely useful for understanding the basic mechanisms of life, some properties of living organisms can be studied only by considering the living system as a whole. Systems biology [63, 95, 18, 80] is a recent branch of biology which endorses a holistic approach to the study of processes in living organisms. Its aim is the study of cell–scale phenomena and their relations with the constituents of the system; in particular, systems biology studies networks (e.g. metabolic, signalling and regulation networks, and protein–protein interaction networks) rather than the single network constituents, hence the term network biology [6]. The availability of high–throughput data (e.g. shotgun genome sequencing, transcriptome profiling with DNA microarrays or proteome profiling using mass spectrometry) has made it possible in recent years to build detailed models of metabolism, regulation and proteome expression at the whole–cell level. On the one hand, this allows us to understand how the functional properties and behaviour of living organisms are brought about by the interactions of their constituents (bottom–up approach); on the other, it allows us to unravel the molecular components and logic that underlie cellular processes (top–down approach).

A central issue is the formulation of system–scale models which are able to bridge the gap between networks as a whole and the details of their constituents. Such effective models are often very hard to formulate in a meaningful manner, because biological phenomena involve many components and many timescales, making it hard to disentangle the whole cell into separate networks. Thus, the problem is to find the description that best isolates the phenomenon of interest from the rest of the system and the environment. Furthermore, the resulting model can be overwhelmingly complex, offering little insight into the phenomena and behaviours associated with the system under study.
Simple laws can sometimes emerge from complexity, often due to a combination of physical constraints and evolutionary pressure [15]. An example of such simple laws is the characterization of protein expression based on a coarse–grained proteome partitioning model. This model is described in detail in Chapter 1, and provides the starting point for the work illustrated in Chapters 3 to 5.

This Thesis is organized as follows. Chapter 1 provides a twofold introduction to the topics treated in the rest of the Thesis. Its first part introduces the model bacterium Escherichia coli and the basic concepts of gene expression and regulation. We also discuss bacterial metabolism and some quantitative studies of bacterial growth, starting from J. Monod [85] and continuing to the work by T. Hwa and collaborators [115, 137]. In particular, the bacterial proteome can be divided into proteome sectors, each one with a distinct response to environmental perturbations; notably, these proteome shares appear to be simply related to the growth rate for each different kind of growth limitation, pointing to the existence of a few global regulation circuits. Genome scale modelling of metabolic networks is discussed in the second half of the Chapter, with a detailed exposition of constraint–based modelling of metabolic networks. This framework, whose most famous application is the so–called Flux Balance Analysis [95, 94], is extremely popular because its main input is the topology of the reaction network, and it does not rely on metabolite concentrations or reaction kinetics. Possible steady states can be studied either via sampling or by picking a particular solution using some optimization principle. In the case of bacterial cells, the most common optimization principle is the maximization of the biomass production rate, which gives good results in predicting the growth rate of real cells in exponential growth.
On the other hand, real cells can exhibit apparently suboptimal properties, such as overflow metabolism, and Flux Balance Analysis is not able to model such behaviours without introducing some ad hoc constraints.

Chapter 2 deals with thermodynamic constraints in constraint–based models of metabolism. There is a twofold relationship between thermodynamics and reaction networks. On the one hand, knowing the chemical potentials of the metabolites allows one to fix the direction of reaction fluxes without knowing the kinetic mechanism underlying each reaction; on the other, a flux configuration constrains the chemical potentials. In particular, some flux configurations are not compatible with any set of chemical potentials for the metabolites, and are therefore forbidden. One can show that such thermodynamically infeasible flux configurations must contain flux loops, which can provide a signature of inconsistency. After a general introduction to thermodynamics and reaction networks, we will focus on the identification of loops using a mixed relaxation/Monte Carlo approach, and on the removal of such loops from flux patterns.

Chapter 3 describes the integration of the coarse–grained model of proteome allocation described in the first Chapter within a genome scale, constraint–based model of Escherichia coli metabolism. The resulting framework, Constrained Allocation Flux Balance Analysis (CAFBA), is discussed in detail, and its predictions are compared to experimental data. Our implementation of CAFBA differs from standard Flux Balance Analysis in three fundamental aspects: 1) the presence of a proteome allocation constraint, based on a fourfold partitioning of the proteome; 2) the control variable, which is the amount of proteins dedicated to carbon (e.g. glucose) transport and metabolism instead of the carbon flux itself; and 3) a randomization procedure for the protein costs, which is used to assess the sensitivity of the model with respect to parameter fluctuations. In particular, the average values of the fluxes agree well with experimental data on acetate excretion and growth yield. CAFBA allows for a detailed description of the switch from overflow metabolism at high growth rates to a more efficient (respiratory) metabolism when nutrient quality is poor, a feature which is missing in standard Flux Balance Analysis.

Proteome allocation in E. coli appears to be suboptimal in the light of the proteome partitioning model. Some baseline proteome fractions (“offsets”) are present for each proteome group, reducing the steady state growth rate. In Chapter 4 we investigate this puzzle by connecting the growth laws to dynamics, and investigating the role of the offsets in the cellular response to nutritional shifts, that is, sudden changes of the medium the cells are growing in. The connection is provided by the “instantaneous” model formulated by Dennis and Bremer [38] to describe proteome production during nutritional shifts. Experimental upshift measurements (an upshift being a nutritional shift from poor to high–quality carbon nutrients) provide estimates of the offsets for various sectors which agree well with data obtained from steady state measurements. Moreover, the optimality of the growth laws is discussed using the instantaneous model; the analysis suggests that such offsets may be optimized to maximize growth during upshifts.

Chapter 5 is dedicated to the application of the ideas discussed in Chapters 3 and 4 to a minimal model of metabolism, including carbon transport across the cell membrane, respiration and fermentation metabolism, and biomass production.
Application of CAFBA to this small network allows us to carry out some analytical calculations, and to gain insight into the behaviour of CAFBA when applied to the genome scale model. Kinetics for both reactions and proteome allocation can be introduced, and the resulting kinetic model can be used to test how the growth laws for the various proteome sectors fit in a given environment. Preliminary results show substantial agreement between the empirical growth laws, CAFBA results and the growth laws obtained by optimizing bacterial mass production in an environment in which nutrient availability varies periodically.

Publications

Chapter 1 does not contain original material, since it only serves as an introduction and a review of the state of the art in bacterial physiology and constraint–based modelling. Chapter 2 contains material from Ref. [35]. Chapter 3 contains material from Refs. [86] and [87]. Chapter 4 contains material from Ref. [43]. Chapter 5 mostly contains work–in–progress material, part of which will be included in Ref. [87]. Ref. [86] is to be submitted to the journal Molecular Systems Biology along with an experimental work by Basan et al. [8], focusing on the respiration/fermentation switch.

During my Ph.D. studies I also studied uniform sampling of steady state fluxes using Hit–and–Run Markov Chain Monte Carlo dynamics. Sampling the flux space (a convex polytope) is usually very difficult because it is ill–conditioned. In this work the ill–conditioning is removed by using three different rounding methods, which makes it possible to sample genome–scale flux spaces effectively. Hence, our implementation of the Hit–and–Run algorithm can be used as a benchmark to evaluate the performance of other samplers, such as gpSampler or optGpSampler. This topic will not be covered in this Thesis, so we refer the interested reader to the article [37].
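For readers unfamiliar with the Hit–and–Run dynamics mentioned above, the basic move can be sketched in a few lines. This is only an illustrative toy, not the implementation of Ref. [37]: it samples a small polytope written as A x <= b, it uses no rounding step, and it ignores the equality (steady state) constraints of a real flux space. The function names `chord_limits` and `hit_and_run` and the triangular test polytope are assumptions made for this example.

```python
import math
import random


def chord_limits(x, d, A, b):
    """Return (t_min, t_max) such that x + t*d satisfies A (x + t*d) <= b."""
    t_min, t_max = -math.inf, math.inf
    for row, b_i in zip(A, b):
        a_dot_d = sum(a * d_j for a, d_j in zip(row, d))
        slack = b_i - sum(a * x_j for a, x_j in zip(row, x))
        if abs(a_dot_d) < 1e-12:
            continue  # moving along d never reaches this face
        t = slack / a_dot_d
        if a_dot_d > 0:
            t_max = min(t_max, t)
        else:
            t_min = max(t_min, t)
    return t_min, t_max


def hit_and_run(A, b, x0, n_samples, rng):
    """Hit-and-Run chain inside the bounded polytope {x : A x <= b}."""
    x = list(x0)
    samples = []
    for _ in range(n_samples):
        # Random direction, uniform on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(d_j * d_j for d_j in d))
        d = [d_j / norm for d_j in d]
        # Uniform point on the chord through x along d.
        t_min, t_max = chord_limits(x, d, A, b)
        t = rng.uniform(t_min, t_max)
        x = [x_j + t * d_j for x_j, d_j in zip(x, d)]
        samples.append(x)
    return samples


# Hypothetical toy "flux space": the triangle v1 >= 0, v2 >= 0, v1 + v2 <= 1.
A = [[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]]
b = [0.0, 0.0, 1.0]
samples = hit_and_run(A, b, x0=[0.25, 0.25], n_samples=20000,
                      rng=random.Random(0))
mean = [sum(s[j] for s in samples) / len(samples) for j in range(2)]
print("sample mean:", mean)  # the exact centroid of the triangle is (1/3, 1/3)
```

On such a well–conditioned toy polytope the chain mixes quickly; the difficulty discussed in the text arises when the polytope is very elongated along some directions, so that most random chords are short, which is what a rounding step is meant to cure.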
Chapter 1
Modeling bacterial growth: an overview

Escherichia coli is the prototypical prokaryote, having been studied extensively for more than 60 years. Nowadays it plays an important role in biotechnology, as it is used as a host for recombinant DNA sequences in order to produce recombinant proteins [5, 22], or to produce on an industrial scale metabolites (e.g. biofuel [17]) for which chemical synthesis is often not possible (e.g. human insulin [53]). Since this is not the place for a detailed introduction to cell biology, we will only sketch some relevant facts, referring the interested reader to textbook references. A comprehensive review of bacterial physiology and metabolism can be found in Ref. [62]. A brief, but complete, historical review of the first quantitative studies on microbial growth can be found in Ref. [96]. Classical textbooks on constraint–based modelling are Refs. [12, 95].

1.1 Bacterial physiology

The fundamental unit of life is the cell. All living cells (not viruses, for instance) share some basic components:
• The cell membrane separates the interior of the cell (the cytoplasm) from the outside environment. Cell membranes are double layers of phospholipids and are selectively permeable, allowing molecules to flow between the cytoplasm and the environment. The membrane serves as support for a variety of other structures, such as the cell wall and membrane proteins.
• DNA stores the genetic information used in the development and functioning of all organisms. It contains genes, which are DNA regions associated with the basic information units.
• RNA performs a variety of functions, ranging from protein synthesis (messenger, transfer and ribosomal RNA) to regulation (e.g. antisense RNA).
• Proteins are formed by polypeptides, that is, amino acid monomers linked by peptide bonds, which participate in virtually any biological function, ranging from catalysis of chemical reactions to regulation and duplication. The term “protein” actually refers to the folded polypeptides only, when they are able to perform their functions properly.

Figure 1.1. Schematic representation of a prokaryote bacterium (labelled parts: plasma membrane, capsule, cell wall, plasmid, pili, nucleoid (DNA), ribosomes, flagellum, cytoplasm). The cell (plasma) membrane is surrounded by a peptidoglycan cell wall and, possibly, by a capsule. E. coli has more than one flagellum, projecting in all directions.

Prokaryote cells have a very simple structure, as shown in Fig. 1.1. All intracellular water–soluble constituents (proteins, DNA, RNA, metabolites, etc.) are located together in the same volume enclosed by the cell membrane, rather than in separate cell compartments. In fact, they do not contain mitochondria or chloroplasts, with oxidative phosphorylation and/or photosynthesis taking place on the cell membrane, nor do they have a nucleus: DNA is arranged in a structure called the nucleoid, which lacks any nuclear envelope.

The information contained in the DNA is used in a process called gene expression, which is the synthesis of a functional gene product. Such gene products are proteins and functional RNA (which is the product of non–coding genes). The typical information flow is described by the so–called central dogma of molecular biology, see Fig. 1.2. The information stored in DNA genes is first copied (transcribed) into single–stranded messenger RNA (mRNA) molecules by the enzyme RNA polymerase. Then, mRNA strands are translated into peptide chains by dedicated organelles called ribosomes. Ribosomes are large macromolecules (weighing around 2700 kDa) formed by proteins and ribosomal RNA (rRNA), which may occupy a large (50%) share of the cell’s proteome mass.

Information flow can be altered in a number of ways through regulation. Dedicated proteins (e.g. transcription factors) and RNA molecules (e.g.
antisense RNA) can affect the expression of a gene. Therefore, proteome production has some degree of flexibility: regulatory circuits can express specific genes in response to precise stimuli (e.g. starvation, heat, ...).

Figure 1.2. (Diagram “Gene expression and regulation”: DNA, RNA and proteins, connected by transcription, translation, inverse transcription and RNA replication.) The central dogma of molecular biology is a framework for understanding the residue–by–residue transfer of sequence information in living organisms. At the most basic level, information is stored within DNA, then transferred to RNA molecules (through transcription) and to proteins (through translation). Other information flows (most notably RNA replication and reverse transcription) only occur in special cases, for example when retroviruses infect a host cell. Gene expression can be altered (regulated) in different ways, ranging from modification of DNA (e.g. methylation) to transcriptional regulation (e.g. transcription factors and repressors), post–transcriptional regulation (only in eukaryotes) and regulation of translation.

1.1.1 Metabolism

Life is, at its essence, an out–of–equilibrium process which requires energy. Cells must be able to gather energy from the environment and use it to perform activities such as replication, movement, and so on. The set of chemical reactions needed to sustain the life of a cell is named metabolism. Metabolism is a network of mainly enzyme–catalyzed reactions, which perform a variety of functions, ranging from the breakdown of nutrients to the polymerization of macromolecules. Metabolism is conventionally divided into catabolism and anabolism, see the scheme in Fig. 1.3. Catabolism is the set of chemical reactions devoted to the production of energy and smaller molecules from the breakdown of larger molecules.
On the other hand, anabolism is the part of metabolism which uses some of the byproducts of catabolism, the so–called building blocks, to synthesize all the macromolecules which are needed to build a copy of the cell. Both catabolic and anabolic reactions can differ among living organisms. In fact, organisms can be classified with respect to their capacity to build complex molecules starting from simple ones, like carbon dioxide and water (autotrophic organisms as opposed to heterotrophic ones). Further divisions are possible, for instance according to whether the organism can use sunlight to activate processes inside the cell (e.g. fixation of CO2) or must use the chemical energy originating from the breakdown and/or oxidation of chemical species.

Figure 1.3. Intermediate metabolism (labelled parts: cell wall, waste, catabolism, anabolism). Nutrients are transported across the cellular membrane, and broken down to simpler molecules, called “precursors”. These are later assembled to form the building blocks (amino acids, lipids, nitrogenous bases, ...) needed to build all macromolecules (proteins, the cell wall, DNA and RNA). This process also produces energy (ATP, NADH or other reducing molecules).

Escherichia coli metabolism can be well represented as in Fig. 1.3. E. coli can grow in a minimal medium with some sugar (e.g. glucose) as sole carbon source, plus ammonia and other salts, or in richer media (e.g. with amino acids as carbon/nitrogen sources). For the sake of simplicity, let us focus on E. coli cells growing in aerobic conditions with glucose as the only carbon source. Glucose (C6H12O6) is catabolized to produce energy and carbon precursors. At low growth rates, bacteria use glucose optimally, with carbon dioxide (CO2) being the only carbon waste. On the other hand, fast growing E. coli cells excrete large amounts of acetate (C2H3O2−), which is a byproduct of glycolysis [135].
This is analogous to the Crabtree effect [31], first observed in yeast [30], and to the Warburg effect observed in cancer metabolism [48, 40]. The Crabtree effect is a short–term, reversible switch of cellular metabolism between respiration and fermentation; the Warburg effect, instead, is a long–term metabolic reprogramming, typical of cancer cells [40].

1.1.2 Growth laws

Quantitative studies of bacterial growth date back to Louis Pasteur, whose paper Mémoire sur la fermentation appelée lactique was published in 1857. Combining growth dynamics data and chemical assays, he was able to show that fermentation was due to microbial growth, thus providing strong evidence against the theory of spontaneous generation.

Bacterial cultures may be grown in different conditions. Agar plates can be used to grow bacterial colonies, but most measurements are much easier if bacterial cultures are grown in suspension, as in batch cultures. In the latter case, cells are grown in suspension in agitated flasks at controlled temperature; bacteria are allowed to duplicate until they run out of nutrients. A fed–batch culture is similar to a batch culture, but nutrients are continuously added to the vessel; similarly, flow cells allow continuous cultures with a constant flux of nutrients.

Growth phases

Bacterial growth in batch cultures can be modeled as a series of distinct phases. Let us suppose, for instance, that E. coli cells are inoculated in a vessel with sugar (e.g. glucose) and minimal salts (to provide ammonia, sodium, potassium, iron, etc.). Several phases can be distinguished:
• Lag phase. Cells do not immediately start to grow, as they have to adapt to the new environment, synthesizing RNA and enzymes and preparing for cell division.
• Exponential phase (also called log phase or logarithmic phase).
In this phase cells are duplicating at a constant rate, so that their number grows exponentially as $N = N_0\, 2^{t/\tau}$, with τ being the (average) doubling time, or $N = N_0\, e^{\lambda t}$, where λ = ln 2/τ is the growth rate. Cells usually do not duplicate exactly at the same moment, even if synchronous cultures may be obtained with proper techniques.
• Stationary phase. In this phase an essential nutrient has been completely used, so that cells cannot grow any more. The number of cells is therefore constant during this phase.
• Death phase. Cells die, resulting in a decreasing cell population.
In particular, many measurements focus on the exponential phase, since the average properties of the cells can be considered constant (the so–called balanced growth regime). We have to distinguish between cell concentration (number of cells per unit volume) and bacterial density (dry weight mass per unit volume). In fact, the density of single cells is roughly constant, but their volume can change by a factor 2 or 3 at different growth rates. Keeping in mind this difference, we will always refer to growth as “bacterial mass production”, and not as “increase in the number of cells”.
A remark on the estimation of bacterial density is necessary. Direct measurements of growth, that is, of the mass of the bacterial culture, are performed by filtering and drying the solution and weighing the desiccated cells. This procedure, although simple, may not be convenient in practice. Instead, spectrophotometers can be used to measure the light scattered by the culture solution, which is a proxy for bacterial density [71]. The conversion factor between optical density (OD) and bacterial density has to be determined experimentally.
Monod’s growth kinetics
Jacques Monod was able to model growth dynamics by studying growth kinetics [85].
The core of Monod’s theory is the link between the instantaneous growth rate λ(t) = d log M(t)/dt (with M(t) being the total bacterial mass, or density, at time t) and the limiting substrate concentration, s(t). First, he found that the growth yield Y, that is, the bacterial mass produced per unit of substrate consumed, is quite independent of growth conditions, initial substrate concentration or the chemical form of the substrate. The maximum content of biomass Mf in a batch culture occurs at the moment of complete utilization of the substrate, and it depends linearly on the initial substrate concentration s0:
\[ M_f = M_0 + Y s_0 . \tag{1.1} \]
Using this relation, Monod was then able to infer the following relationship between the instantaneous growth rate and the substrate concentration:
\[ \lambda = \lambda_m \frac{s}{s + K_s} , \tag{1.2} \]
which is known as Monod’s law. Similarly to the Michaelis–Menten kinetics of enzyme–catalyzed reactions, this is a kinetic law which can be used to formulate a closed dynamical system. In fact, using Eqs. (1.1) and (1.2) we can write:
\[ \frac{1}{M(t)} \frac{d M(t)}{dt} = \lambda_m \frac{s(t)}{s(t) + K_s} , \qquad s(t) = s_0 - \frac{1}{Y} \left( M(t) - M_0 \right) . \tag{1.3} \]
The resulting differential equation for M(t) can be easily solved and used to predict growth dynamics in exponential phase, with initial values M0 and s0, and the parameters Y, λm and Ks being determined for each bacterium in given growth conditions. For instance, exponential growth of E. coli cells in glucose at 30 ℃ is well described by the parameters¹ Y = 0.23, λm = 0.93 h−1 and Ks = 4 mg/L.
Catabolite repression
Monod eventually won the Nobel Prize in Physiology or Medicine for his studies on carbon catabolite repression (CCR) and the lac operon in E. coli. He showed that cells grown in batch culture using a mixture of glucose and another sugar, e.g. lactose, display a biphasic (diauxic) growth. After an initial lag phase, cells grow exponentially until glucose runs out.
Then, after a second lag phase, cells start to grow exponentially again, using lactose as the carbon source. The explanation of such behaviour is that the enzymes needed for lactose transport and metabolism, encoded in the lac operon, start being expressed only when glucose runs out. The preference for glucose over other carbon sources, and the regulatory phenomena behind such behaviour, are called carbon catabolite repression. The mechanism behind CCR in E. coli is relatively well understood [54]: transport of glucose through the phosphoenolpyruvate–carbohydrate phosphotransferase system (PTS) inhibits the enzyme adenylate cyclase, which catalyzes the production of the signalling molecule cyclic adenosine monophosphate (cAMP). When glucose runs out, adenylate cyclase is no longer inhibited, and the newly produced cAMP binds to a protein called cyclic AMP receptor protein (CRP; also known as catabolite activator protein, CAP). The CRP–cAMP complex is a transcription factor which regulates hundreds of genes, directly or indirectly. Most of the genes regulated through the CRP–cAMP complex (the Crp regulon) are related to carbohydrate transport and metabolism, energy production, amino acid metabolism, nucleotide metabolism and ion transport systems [139].
¹ Monod used base 2 logarithms, so that the maximum growth rate is 1.35 divisions per hour. Curiously, the growth yield reported by Monod is quite low, with more recent measurements giving Y between 0.4 and 0.6. See, for instance, Chapter 3 and the experimental fits.
Gene regulation of the lac operon was the first genetic regulatory mechanism to be understood clearly. Expression of the operon is regulated in two different ways. An intracellular regulatory protein called lac repressor is constitutively expressed, and inhibits transcription of the lac operon by binding a DNA region just downstream of the lac promoter.
When lactose is present in the cell, it can bind to the lac repressor, causing its detachment from DNA and allowing the lac operon to be expressed. The second regulatory mechanism is the binding of the CRP–cAMP complex to a specific DNA site upstream of the promoter; the CRP–cAMP complex facilitates RNA polymerase binding to the promoter.
Growth laws: the SMK papers
Monod’s law describes how growth rate is affected by variations in substrate levels. Two papers [109, 66] (see also Ref. [25]) by Schaechter, Maaløe and Kjeldgaard (SMK), published in 1958, described how cellular composition varies with growth conditions. The first paper [109] focused on steady state (exponential phase) conditions, and showed that many aspects of cell composition depend exclusively on growth rate, and not on the medium the cells are growing in. In particular, the RNA and ribosome content of the cell was found to be linearly correlated with growth rate, irrespective of environmental (e.g. nutrient) conditions (see also subsequent work by Dennis and Bremer, Ref. [39]). If cells are grown in constant conditions (in exponential phase) for long enough, the fractions of cellular components in the culture do not change, a condition called balanced growth. It may seem hard to believe that predictions on the average bacterial cell of a given species are possible without fully specifying the environmental conditions; the importance of this work resides in the discovery that some descriptions may actually be formulated by knowing growth rate alone [89].
The second paper [66] studied how biomass composition changes after sudden variations in growth media. The authors studied both shifts from poor to higher quality nutrients (shift–ups, or upshifts) and shifts from rich to poor substrates (shift–downs, or downshifts). These experiments showed the “rate maintenance” phenomenon: mass production quickly (almost instantaneously) adjusts to the new environment, but duplication rate has some lag time.
The result is that, after a shift–up, cell size grows before the duplication rate adjusts. In fact, the growth phases we described before (lag, exponential, stationary, and death phases) can be thought of as a succession of upshifts and downshifts. A model in which production rates stabilize instantaneously compares well with experiments [19].
More on the RNA/protein ratio growth laws
Starting from 2010, the group led by Terry Hwa pursued a systematic study of how the mass composition of the cell responds to environmental conditions. Very much in the spirit of the first SMK paper, they uncovered a series of simple relations between growth rate λ and the RNA/protein ratio r. They studied the behaviour of the Schaechter line by limiting growth in two different ways, either reducing the quality of the carbon nutrient (C–limitation) or introducing translation–limiting antibiotics, such as chloramphenicol (R–limitation). In the first case the RNA/protein ratio is well described by a linear relationship r = r0 + λ/kr, where kr is reduced when antibiotics are present. In fact, the maximum value of kr, ranging between 5.4 and 5.9/h in absence of antibiotics, is related to the maximal translation rate of the ribosomes (around 20 aa/s), and is therefore called translational capacity. In the case of R–limitation the slope changes sign, with r = rmax − λ/kc, with kc depending on the quality of the carbon source. This behaviour is illustrated in Fig. 1.4, panel (a). If we suppose that kc and kr are state variables, encoding all information on growth and RNA/protein ratio, we can invert the two linear relations to obtain:
\[ \lambda(k_c, k_r) = (r_{max} - r_0) \frac{k_c k_r}{k_c + k_r} , \tag{1.4} \]
\[ r(k_c, k_r) = r_0 \frac{k_r}{k_c + k_r} + r_{max} \frac{k_c}{k_c + k_r} . \tag{1.5} \]
These relationships have been verified for many different carbon sources and for slow–translating mutants. The similarity between Eq. (1.5) and Eq.
(1.2) confirms that kc can be thought of as a proxy for external carbon substrate concentration. One can associate the growth rate dependence of the RNA/protein ratio with a proteome sector of ribosome–affiliated proteins (extended ribosome). This sector contains all ribosomal proteins, plus all other ribosome–affiliated proteins (e.g. elongation factors). The RNA/protein ratio is roughly proportional to the ribosomal protein content of the cell, since (1) roughly 85% of RNA is rRNA, and this fraction is roughly constant from moderate to fast growth rates, (2) rRNA and ribosomal proteins come in fixed proportions and (3) all other R–affiliated proteins are usually co–expressed along with ribosomal proteins. Scott et al. [114] estimate the ratio of R–sector proteome mass to RNA mass to be ρ = 0.76. Therefore, Eq. (1.5) suggests a coarse grained model of proteome allocation, with φR = ρr and the rest of the proteome minimally divided into a growth–independent sector φQ and a growth–dependent sector φP satisfying φP(λ) = λ/kC. Given the constraint φP + φR + φQ = 1, the model is easily solved as:
\[ \phi_P = \lambda / k_C , \tag{1.6} \]
\[ \phi_R = \phi_{R,0} + \lambda / k_R , \tag{1.7} \]
\[ \lambda = (1 - \phi_Q - \phi_{R,0}) \frac{k_C k_R}{k_C + k_R} , \tag{1.8} \]
with kC = kc/ρ and kR = kr/ρ, as follows from substituting φR = ρr into the two linear relations for r and comparing with Eq. (1.4). The y–intercept of the R–sector is estimated to be φR,0 ∼ 6.6%, while the fixed φQ is estimated at around 45% of the total proteome.
More insights into regulation
The P–sector appears to be upregulated in carbon limitation, so that it is likely to contain catabolic proteins. You et al. [137] provided more insights by directly measuring lacZ expression. lacZ is a gene in the lac operon, coding for the enzyme β–galactosidase (β–gal), whose function is cleaving lactose to glucose and galactose. A non–metabolizable inducer of the lac operon, IPTG, can be used to unbind the lac repressor from the operon, so that β–galactosidase levels can be considered a measure of the CRP–cAMP activity.
Figure 1.4.
(a) Behaviour of the RNA/protein ratio upon nutritional (carbon) limitation (continuous line) and translational limitation (dashed lines), the latter obtained by growing cells in media with increasing quantities of antibiotics, as modeled by Eq. (1.4) and Eq. (1.5). (b) Given the constraint φP + φR = constant, reciprocal relations must be obeyed by the φP sector. (c) Pie chart showing the three–fold partition of the proteome introduced in Scott et al. [114]: an R–sector of ribosome–affiliated proteins, a fixed Q–sector, and a growth–dependent P–sector. Image from Ref. [115].
A third growth limitation mode was also studied, namely nitrogen (A–) limitation; a corresponding proteome sector φA is upregulated in response to decreased nitrogen sources. The nitrogen uptake can be tuned by using an E. coli strain with a titratable nitrogen uptake system. Expression of glnA, which encodes the major ammonia assimilating protein glutamine synthetase, is taken as a proxy for the A–sector. All observations can be consistently described using a five–fold proteome partition model with three state variables kC, kA and kR:
\[ \phi_C = \phi_{C,0} + \lambda / k_C \quad (\propto \text{lacZ expression}) \tag{1.9} \]
\[ \phi_R = \phi_{R,0} + \lambda / k_R \quad (\propto \text{RNA/protein ratio}) \tag{1.10} \]
\[ \phi_A = \phi_{A,0} + \lambda / k_A \quad (\propto \text{glnA expression}) \tag{1.11} \]
\[ \phi_{O+U} = \phi_O + \phi_{U,0} + \lambda / k_U \quad (\text{“core”} + \text{uninduced}) \tag{1.12} \]
\[ \lambda = \Big( 1 - \phi_Q - \sum_X \phi_{X,0} \Big) \left( \frac{1}{k_C} + \frac{1}{k_A} + \frac{1}{k_R} + \frac{1}{k_U} \right)^{-1} \tag{1.13} \]
In order to obtain the correct maximum growth rate in carbon limitation (around 1.2/h), an “uninduced sector” φU must be introduced. This U–sector is not upregulated by any of the three growth limitations, and satisfies φU = φU,0 + λ/kU. Proteins involved in sulphur assimilation and nucleotide synthesis could be examples of proteins which are not targeted by either C–, R– or A–limitation. The very existence of such laws for the different proteome sectors requires their coordination.
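As a quick numerical check of Eqs. (1.9)–(1.13), the sketch below computes the growth rate from a set of sector offsets and capacities and verifies that the resulting proteome fractions add up to one. All numerical values, as well as the function names `growth_rate` and `sector_fractions`, are assumptions introduced for illustration only; they are not fitted parameters.

```python
# Sketch of the linear proteome-allocation model, Eqs. (1.9)-(1.13).
# All parameter values below are illustrative assumptions, not measured data.

def growth_rate(phi_fixed, offsets, capacities):
    """Growth rate from Eq. (1.13).

    phi_fixed  : growth-independent proteome fraction (phi_Q in Eq. 1.13)
    offsets    : dict mapping sector name -> offset phi_{X,0}
    capacities : dict mapping sector name -> capacity k_X (1/h)
    """
    free_fraction = 1.0 - phi_fixed - sum(offsets.values())
    inverse_sum = sum(1.0 / k for k in capacities.values())
    return free_fraction / inverse_sum


def sector_fractions(lam, offsets, capacities):
    """Sector sizes phi_X = phi_{X,0} + lambda / k_X for every sector X."""
    return {x: offsets[x] + lam / capacities[x] for x in offsets}


phi_fixed = 0.45                                          # assumed fixed fraction
offsets = {"C": 0.02, "A": 0.02, "R": 0.066, "U": 0.05}   # assumed offsets
capacities = {"C": 10.0, "A": 20.0, "R": 6.0, "U": 15.0}  # assumed capacities (1/h)

lam = growth_rate(phi_fixed, offsets, capacities)
phi = sector_fractions(lam, offsets, capacities)
total = phi_fixed + sum(phi.values())
print(f"lambda = {lam:.3f} 1/h, total proteome fraction = {total:.6f}")

# Mimic a poorer carbon source by halving k_C: growth slows, the C-sector grows.
poor = dict(capacities, C=5.0)
lam_poor = growth_rate(phi_fixed, offsets, poor)
phi_poor = sector_fractions(lam_poor, offsets, poor)
```

In this toy setting, lowering a single capacity reduces λ and enlarges the corresponding sector at the expense of the other growth–dependent sectors, which is the qualitative behaviour expressed by the growth laws.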
In particular, the C–sector of catabolic proteins and the A–sector of anabolic proteins have opposing behaviour with respect to C– and A–limitation, suggesting the existence of a common regulation system. The authors propose inhibition of adenylate cyclase by α–keto acids as a way to balance the flux of carbon precursors from glycolysis and the production of amino acids.
Very recently, a direct determination of the proteome sectors using genome–scale mass spectrometry has been carried out by Hui et al. [58]. Six different proteome sectors have been defined by clustering protein mass variations upon C–, A– and R–limitation, suggesting the presence of a few global regulators. A theory for this six–component proteome partitioning can be formulated and cross–checked with methods similar to those used in [114, 137].
1.2 Genome scale modeling of metabolic networks
Chemical networks are bipartite networks involving chemical species (named metabolites in metabolic networks) and chemical reactions. Following Palsson [95], we can define certain key properties of chemical reactions:
1. Stoichiometry. Chemical species and reactions are connected by the stoichiometric coefficients, which describe the variations in the number of molecules of each chemical species due to an elementary step of a chemical reaction. For example, in an isomerization A → B the product B has a coefficient +1, while the reactant A has a coefficient −1. Of course, a conventional directionality has to be introduced for each reaction in order to define “products” and “reactants”.
2. Rates. Reaction rates are fixed by a combination of factors, such as substrate concentrations, kinetic constants, presence or absence of catalysts, temperature, pressure, ionic strength of the solution, etc.
The cell is able to manipulate reaction rates thanks to the fact that most reactions are enzyme–catalyzed, and many regulation circuits allow the enzymes to be produced only when they are needed.
Chemical networks can be modeled at different levels of detail. For instance, chemical reactions are, in essence, stochastic phenomena; on the other hand, when the number of molecules is high enough, the rates follow deterministic laws to a good approximation. The knowledge of intracellular concentrations and kinetic parameters is essential to build a dynamical model of cellular metabolism; on the contrary, steady state models can be formulated without any knowledge of such quantities.
For a given metabolic network, all information about stoichiometry is encoded in the stoichiometric matrix. The metabolites are first labeled with an index µ = 1, ..., Nmets, and the reactions with an index i = 1, ..., Nrxns. For each reaction a conventional directionality is also assumed. Then, the stoichiometric matrix S is constructed² so that its entries Sµi are equal to the stoichiometric coefficient of metabolite µ in reaction i, with the plus sign if µ labels a product and the minus sign if it is a reactant. If metabolite µ appears both as a reactant and as a product, the difference between the two stoichiometric coefficients is used. Therefore, Sµi equals the variation in the number of molecules of species µ given a single elementary step of reaction i. If a metabolite is involved in the reaction, but only acts as a catalyst (i.e. it is neither produced nor consumed by the reaction), the corresponding entry of the S matrix is zero.
² The stoichiometric matrix is also indicated in the literature with the letter N. We will always use the letter S, hoping the reader will not confuse it with the entropy.
By definition, the time evolution of the number of molecules Nµ(t) of each metabolite is governed by the reaction rates φi(t) (the number of elementary reactions per unit time)
through the stoichiometric matrix, as follows:
\[ \frac{dN}{dt} = S \phi , \quad \text{i.e.} \quad \frac{d N_\mu}{dt} = \sum_i S_{\mu i} \phi_i \quad \forall \mu . \tag{1.14} \]
A normalization is usually introduced by dividing both sides by the cell volume or mass. Defining the concentrations [cµ] = Nµ/V and the fluxes vi = φi/V with constant V, we have:
\[ \frac{d[c]}{dt} = S v , \quad \text{i.e.} \quad \frac{d [c_\mu]}{dt} = \sum_i S_{\mu i} v_i \quad \forall \mu . \tag{1.15} \]
In turn, reaction rates, or fluxes, depend on concentrations, usually in a nonlinear way. At the elementary level, the mass–action law can be used as kinetic law. Mass action kinetics simply states that the rate of a chemical process is proportional to the product of the concentrations of the reactants. For example, given the reaction A + B ⇌ C + D, the forward and backward fluxes are computed as:
\[ v^+ = k_+ [a][b] , \qquad v^- = k_- [c][d] , \tag{1.16} \]
with [a], [b], [c] and [d] being the concentrations of the four chemical species. The net flux is thus given by v = v+ − v−. The kinetic constants k+ and k− are related by thermodynamics, as we shall see in Chapter 2. Mass action kinetics does not take into account the action of enzymes, whereas most biological reactions are enzyme–catalyzed. The textbook example of a rate expression for such reactions is given by the Michaelis–Menten kinetics, which models a simple isomerization A → B catalyzed by an enzyme E:
\[ v = k_{cat} [e] \frac{[a]}{[a] + K_M} , \tag{1.17} \]
where [a] and [e] are the substrate and enzyme concentrations, respectively, kcat is the turnover number of the reaction (which fixes the maximum speed of the reaction for a given enzyme concentration), and KM is the Michaelis constant (which fixes the affinity of the enzyme for the substrate).
1.2.1 Fluxes normalization and dilution terms
Equation (1.15) can hardly be applied directly to a growing cell. In fact, bacteria are usually studied during exponential growth.
Individual cells also appear to grow exponentially [24], with their density being roughly constant; furthermore, intracellular fluxes may depend on the part of the cell cycle the bacterium is in. As we consider an exponentially growing bacterial culture, we may instead consider the total fluxes and the total cell volume. This is a convenient choice since they are macroscopic quantities (as long as the bacterial population is large enough), so they show much smaller fluctuations. Furthermore, the dry weight MDW of the cellular culture can be used to normalize the fluxes instead of the volume, the former being much easier to measure. As we consider exponentially growing cells we have:
\[ \frac{d}{dt} \frac{N_\mu}{M_{DW}} = \sum_i S_{\mu i} \frac{\phi_i}{M_{DW}} - \frac{N_\mu}{M_{DW}} \frac{1}{M_{DW}} \frac{d M_{DW}}{dt} . \tag{1.18} \]
Defining dry–weight normalized concentrations [cµ] = Nµ/MDW and fluxes vi = φi/MDW, and the growth/dilution rate λ = d log MDW/dt, we get:
\[ \frac{d [c_\mu]}{dt} = \sum_i S_{\mu i} v_i - [c_\mu] \lambda , \tag{1.19} \]
where the last term −[cµ]λ is a dilution term, due to the fact that the unnormalized quantities (Nµ, φi and MDW) are growing as $e^{\lambda t}$. It can be neglected only if the average lifespan of a metabolite is much shorter than the doubling time.
1.2.2 Fundamental subspaces of the stoichiometric matrix
The stoichiometric matrix encodes much information about the chemical network in a compact fashion [44]. In particular, the four fundamental subspaces of S (kernel, image, cokernel and coimage) have a very clear physical interpretation.
Kernel
The kernel of S is the vector space spanned by all vectors which are annihilated by the matrix S:
\[ \ker(S) = \mathrm{span}(v : S v = 0) , \tag{1.20} \]
where 0 is the null (column) vector. Any flux configuration in ker(S) does not produce any variations in the concentrations of the metabolites. This is clearly seen from (1.15), as the substitution v → v + k with k ∈ ker(S) can be performed without affecting the left hand side of the equation.
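To make this statement concrete, the following sketch checks it on a hypothetical toy network with two metabolites and four reactions. The network, the flux values and the helper `matvec` are assumptions introduced only for illustration.

```python
# Toy check that adding a kernel vector to the fluxes leaves S v unchanged.
# Hypothetical network. Metabolites (rows): A, B.
# Reactions (columns): uptake -> A, A -> B, B -> A, B -> export.
S = [
    [1, -1, 1, 0],    # metabolite A
    [0, 1, -1, -1],   # metabolite B
]


def matvec(matrix, vector):
    """Plain matrix-vector product, returned as a list."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


k = [0, 1, 1, 0]             # equal flux through A -> B and B -> A
v = [2.0, 1.5, 0.5, 0.5]     # arbitrary (non-steady) flux configuration
v_shifted = [vi + 3.0 * ki for vi, ki in zip(v, k)]

print(matvec(S, k))          # zero for both metabolites: k is in ker(S)
print(matvec(S, v))          # rate of change of A and B under v
print(matvec(S, v_shifted))  # identical to the previous line
```

The two configurations differ by the amount of flux circulating between A and B, yet they produce the same time derivatives of the concentrations.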
The simplest example of such degeneracy is given by a reversible reaction which is split into two irreversible reactions with opposite directions. In this case there is an obvious degeneracy, due to the fact that only the net flux of the overall reaction affects the concentrations. In general, such loops are governed by thermodynamics: in this particular case, the forward and the backward fluxes are set by the free energy difference between products and reactants. The relations between thermodynamics and fluxes will be described in detail in Chapter 2.
Cokernel
The cokernel is defined as ker(S^T), that is:
\[ \mathrm{coker}(S) = \mathrm{span}(w : w S = 0) , \tag{1.21} \]
where 0 is the null (row) vector. Therefore, vectors in coker(S) have dimension equal to the number of metabolites in the network. Vectors belonging to the cokernel of S define conservation laws for the metabolite concentrations. In fact, taking the scalar product between a vector w ∈ coker(S) and Eq. (1.15) we have:
\[ w \cdot \frac{\partial [c]}{\partial t} = \sum_\mu w_\mu \frac{\partial [c_\mu]}{\partial t} = \sum_{\mu, i} w_\mu S_{\mu i} v_i = \sum_i (w S)_i v_i = 0 \quad \Rightarrow \quad \frac{\partial}{\partial t} (w \cdot [c]) = 0 . \tag{1.22} \]
Therefore, linear combinations of concentrations exist such that their values are not affected by the fluxes. The values of these conserved metabolic pools must be provided as initial data in kinetic models.
Image and coimage
Image, im(S), and coimage, coim(S), are the vector spaces complementary to ker(S) and coker(S), respectively. The equality sign in Eq. (1.15) only relates vectors in im(S) and coim(S). To make this statement clearer, we can use the Singular Value Decomposition (SVD) of the stoichiometric matrix. SVD is a powerful tool in linear algebra which allows one to write any matrix in a diagonal form using two orthogonal transformations.
In particular:
\[ S = U \Sigma R^T \quad \text{with} \quad (U U^T)_{\mu\nu} = \delta_{\mu\nu} , \quad (R R^T)_{ij} = \delta_{ij} , \quad \Sigma_{\mu i} = \sigma_i \delta_{\mu i} . \tag{1.23} \]
The matrix Σ has the same dimensions as S (say, M rows and N columns), but has non–zero entries only on the diagonal, which are called singular values. The number of singular values is evidently equal to min(N, M). By definition, the singular values are non–negative and arranged in decreasing order, σ1 ≥ σ2 ≥ .... The number q of nonzero singular values is the rank of the matrix S. The columns of U and R, which we call u^(µ), µ = 1, ..., M, and r^(i), i = 1, ..., N, form a basis for ℝ^M and ℝ^N, respectively. We can write:
\[ \sum_\nu (U^T)_{\mu\nu} \frac{\partial [c_\nu]}{\partial t} = \frac{\partial}{\partial t} \left( u^{(\mu)} \cdot [c] \right) , \tag{1.24} \]
\[ \sum_j (R^T)_{ij} v_j = r^{(i)} \cdot v . \tag{1.25} \]
Then, we can recast Eq. (1.15) as:
\[ \frac{\partial}{\partial t} \left( u^{(a)} \cdot [c] \right) = \sigma_a \left( r^{(a)} \cdot v \right) , \qquad a = 1, \ldots, q , \tag{1.26} \]
\[ \frac{\partial}{\partial t} \left( u^{(a)} \cdot [c] \right) = 0 , \qquad a = q + 1, \ldots, M . \tag{1.27} \]
The vectors r^(a) and u^(a) with a ≤ q span, respectively, im(S) and coim(S). Of course, dim(im(S)) = dim(coim(S)) = rank(Σ) = rank(S) = q. Eq. (1.26) clearly shows a 1:1 relation between vectors in im(S) and coim(S), with σa u^(a) = S r^(a). On the other hand, Eq. (1.27) is equivalent to the conservation laws in Eq. (1.22), with w being any linear combination of the u^(a) vectors with a > q.
1.2.3 Flux Balance Analysis
Building a detailed model of metabolism presupposes knowledge of the kinetic parameters and reaction mechanisms [21, 26], and should possibly take into account stochasticity [52] and spatial diffusion [50, 13]. Unfortunately, the knowledge of many biochemical details of genome–scale metabolic networks is quite poor. Constraint–based models [95, 12] are widely employed in the literature to describe the operation of a biochemical reaction network at steady state. The main advantage of this approach over dynamic models is that kinetic rates and concentrations are not required. Instead, these models focus on the fluxes v, which are considered independent variables.
Dry weight mass normalization is the standard choice, and the dilution term is usually neglected. Steady state is then imposed in Eq. (1.19) to obtain:
\[ S v = 0 , \quad \text{i.e.} \quad \sum_i S_{\mu i} v_i = 0 \quad \forall \mu . \tag{1.28} \]
These equations enforce mass–balance constraints among the reactions, so that no metabolite concentration varies with time. Clearly, neglecting the dilution terms allows one to remove concentrations from the constraints. In usual applications, physiological aspects constrain fluxes to vary within certain ranges, so that bounds of the type vi ∈ [li, ui] are normally prescribed for every reaction i. Such bounds may reflect, for instance, the fact that certain processes are known to be physiologically irreversible (e.g., vi ≥ 0), that they have an upper bound due to limited enzyme availability, or that they are required to occur at precise rates (as can be the case for maintenance reactions). The usual choice for most reactions is to set very high (non physiological) upper bounds, e.g. 1000 mmol/(gDW h).
A nontrivial solution of Eq. (1.28) is a possible non–equilibrium steady state (NESS) for the reaction network. From a geometric point of view, under Eq. (1.28) and the bounds on fluxes, the space of possible NESSs is represented by a convex polytope. If all flux configurations inside this volume could be considered as physically realizable solutions, one might assess the ‘typical’ productive capabilities of the network by sampling them using a controlled algorithm [33]. Unfortunately, this route often turns out to be computationally too expensive for large enough systems. Alternatively, one may search for the state(s) that maximize the value of certain biologically motivated objective functions, which can usually be cast in the form of a linear combination of fluxes that represents the selective production of a given set of metabolites.
Such a framework, known as Flux Balance Analysis (FBA) [94], has been shown to be predictive in many instances, even under genetic and/or environmental perturbations (possibly with small modifications, see Section 1.2.4). The standard form of an FBA problem is the following:
\[ \arg\max_{v} \; c \cdot v \quad \text{subject to} \quad S v = 0 , \quad l \le v \le u , \tag{1.29} \]
where c is a vector with Nrxns components which defines the function to be optimized. The flux configurations that maximize such a linear functional can be retrieved with the methods of linear programming [113], the textbook case being biomass production maximization [46]. The biomass reaction links the metabolic network to the macromolecular composition of the cells: it is a sink for biomass precursors (amino acids, fatty acids, nucleobases), and at the same time it drains the energy, i.e. hydrolyzes the ATP, required for the formation of such macromolecules. The biomass production flux is usually normalized so that its value equals the growth rate: in fact, fluxes are usually expressed in mmol/(gDW h), while growth rate is expressed in 1/h. Therefore, the stoichiometric coefficients of the biomass reaction have dimensions mmol/gDW, and they can be empirically determined by measuring the average biomass composition of the cells. Of course, biomass composition may well depend on the environment the cells grow in; we will study in detail how a growth rate–dependent biomass can be introduced in FBA in Section 3.7.
Figure 1.5. With no constraints, the flux distribution of a biological network may lie at any point in a solution space. When mass balance constraints imposed by the stoichiometric matrix S (labeled 1) and capacity constraints imposed by the lower and upper bounds (ai and bi) (labeled 2) are applied to a network, it defines an allowable solution space. The network may acquire any flux distribution within this space, but points outside this space are denied by the constraints.
Through optimization of an objective function, FBA can identify a single optimal flux distribution that lies on the edge of the allowable solution space. (Image and caption from [94].)
1.2.4 Other frameworks
A number of predictions obtained using Flux Balance Analysis are verified by experiments. An E. coli K–12 MG1655 strain growing in glycerol as sole carbon source for about 700 generations was shown [59] to evolve, increasing its maximum growth rate to the optimal value predicted by FBA. Regulation at the single–gene level can be included using a set of Boolean constraints to model transcriptional regulation, as done in regulatory Flux Balance Analysis (rFBA) [28, 27]. Knockout lethality can be assessed [41] by removing from the metabolic network the reactions catalyzed by enzymes corresponding to deleted genes. Furthermore, phenotypes of the surviving strains can be modeled with good precision using tools like Minimization of Metabolic Adjustment (MOMA) [116] and Regulatory On/Off Minimization of metabolic changes (ROOM) [119]; both methods find flux configurations which are “close” to the FBA wild–type fluxes.
One aspect in which FBA fails is the description of apparently suboptimal behaviours of the cells, such as the Crabtree effect. Variations in glucose availability are usually modeled in FBA by changing the upper bound on glucose uptake. With such boundary conditions, FBA solutions have the maximum growth yield (growth rate per glucose flux): glucose is always fully broken down to CO2, irrespective of growth rate. One possible way to induce fermentation in FBA solutions is setting bounds on particular fluxes [79] or global bounds on the total flux, as in Flux Balance Analysis with Molecular Crowding (FBAwMC) [13].
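The effect of such a global bound can be illustrated with a deliberately minimal sketch, which is not a genome–scale computation and not the FBAwMC formulation itself. It assumes a hypothetical two–pathway model in which glucose is split between a high–yield “respiration” flux and a low–yield “fermentation” flux; the yields, weights and budget are invented for the example, and the tiny linear program is solved by brute–force vertex enumeration rather than by a proper LP solver.

```python
# Toy FBA-like linear program with two pathways (illustrative assumptions only).
from itertools import combinations


def solve_lp_2d(objective, constraints, tol=1e-9):
    """Maximize objective . x subject to a . x <= b for every (a, b).

    Brute-force vertex enumeration, adequate only for tiny 2-variable
    problems with a bounded feasible region.
    """
    best_x, best_val = None, float("-inf")
    for (a1, b1), (a2, b2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel constraints: no unique intersection point
        # Cramer's rule for the intersection of the two constraint lines
        x = ((b1 * a2[1] - a1[1] * b2) / det,
             (a1[0] * b2 - b1 * a2[0]) / det)
        if all(a[0] * x[0] + a[1] * x[1] <= b + tol for a, b in constraints):
            val = objective[0] * x[0] + objective[1] * x[1]
            if val > best_val:
                best_x, best_val = x, val
    return best_x, best_val


def toy_fba(uptake_max, cost_budget=None):
    """Return ((v_resp, v_ferm), growth) for the hypothetical two-pathway model."""
    yields = (1.0, 0.3)              # assumed biomass yield per unit flux
    constraints = [
        ((-1.0, 0.0), 0.0),          # v_resp >= 0
        ((0.0, -1.0), 0.0),          # v_ferm >= 0
        ((1.0, 1.0), uptake_max),    # total glucose uptake bound
    ]
    if cost_budget is not None:
        weights = (1.0, 0.1)         # assumed cost per unit flux
        constraints.append((weights, cost_budget))
    return solve_lp_2d(yields, constraints)


plain_v, plain_growth = toy_fba(10.0)
bounded_v, bounded_growth = toy_fba(10.0, cost_budget=5.0)
slow_v, slow_growth = toy_fba(3.0, cost_budget=5.0)
print("uptake bound only:   ", plain_v, plain_growth)
print("with global bound:   ", bounded_v, bounded_growth)
print("global bound, slow:  ", slow_v, slow_growth)
```

With the uptake bound alone, the optimum uses only the high–yield pathway. Once the weighted global bound becomes active at high uptake, the optimum mixes the two pathways, while at low uptake it stays purely respiratory.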
Chapter 3 describes a new framework, Constrained Allocation Flux Balance Analysis (CAFBA), which describes the switch to acetate overflow with high precision, using a global constraint on fluxes inspired by the proteome partitioning models described in Section 1.1.2.
Thermodynamics affects possible flux configurations, as we shall see in detail in Chapter 2. In fact, the second principle of thermodynamics does not allow closed loops of fluxes. Enforcing such a constraint may be a challenging task, but it may allow one to estimate intracellular metabolite concentrations [57, 56].
Finally, some approximate dynamics can be formulated by coupling FBA to metabolite concentrations, as in Dynamic Flux Balance Analysis [77]. Sequential use of different carbon sources (the so–called carbon catabolite repression [54]) has also been studied with good results [13].
Conclusion
The research work presented in this Thesis focused on the study of metabolism from many different perspectives, which we will summarize using three dichotomies.
• Constraints as opposed to optimization in Flux Balance Analysis. Mass balance constraints alone can only define a set of possible flux configurations. A biologically motivated optimization principle can be used to extract predictions, but two kinds of constraints must be included: 1) Physical constraints, such as the ones provided by thermodynamics, studied in Chapter 2. These constraints are mandatory, that is, flux configurations violating them cannot be taken into consideration. 2) Biological constraints, which emerge from cell biology (e.g. ATP maintenance flux, molecular crowding or proteome allocation constraints). They are needed to refine FBA predictions in order to reconcile the “naïve” results produced by FBA with experimental data.
In particular, protein production is expensive; hence, a proteome allocation constraint improves FBA predictions considerably, naturally describing the emergence of overflow metabolism and the use of the Entner–Doudoroff (ED) pathway instead of the more canonical Embden–Meyerhof–Parnas (EMP) glycolytic pathway as a protein–saving strategy. This is confirmed by the study of a coarse grained model, in which the optimality of a high yield/high proteome cost pathway is studied in a dynamic environment.
• Genome scale as opposed to coarse grained modelling. Pros and cons exist for both approaches. Complex models usually allow one to describe richer phenomena than coarse grained models, but they may lack the ability to provide insights into emergent properties. One can get the best of both worlds by combining them, as done in Chapter 3, where we show how CAFBA refines phenotypic predictions and naturally displays the tradeoff between an efficient (large growth yield) metabolism and the cost of allocating proteins. CAFBA can also be applied to a much smaller model of metabolism, where one can actually perform analytical calculations.
• Statics versus dynamics. Bacterial growth laws are usually studied using exponentially growing bacterial cultures, which allow one to study the properties of the cells at constant growth rates. In particular, proteome shares are found to be linearly related to growth rate. Nonetheless, dynamics can provide complementary information about the shape of such laws. The offsets of the proteome growth laws, whose existence seems puzzling from the steady state point of view, appear to have a clear role in boosting growth during nutritional upshifts, as shown in Chapter 4.
In Chapter 5 we applied CAFBA and the growth laws to a small model of metabolism, which can be studied both from the static and the dynamic point of view, thus complementing the analysis in the previous Chapters (which focused on genome scale models and upshift scenarios).

We will now discuss in more detail the main findings and future directions for the main topics discussed in the Thesis.

Thermodynamic constraints in metabolic networks

Thermodynamics provides fundamental constraints which have to be satisfied by constraint–based models of metabolism; in fact, the presence of flux loops implies that there is no set of chemical potentials which can be assigned to the metabolites. Flux loops in FBA solutions are often harmless, since they can be removed using a simple flux minimization without altering the objective function (e.g. growth rate) and the exchange fluxes. In other cases, flux loops cannot be removed without altering the value of the objective function or modifying some other constraint. This is the case of the transporters loop in Recon–2 human cell metabolism models, shown in Eq. (2.49). Two transport reactions are combined such that ATP is synthesized from ADP, which is absurd since ATP hydrolysis should be spontaneous. This loop cannot be removed by a global procedure such as flux minimization if the ATP maintenance hydrolysis flux or the growth rate are fixed, therefore signalling an inconsistency of the model. Identification of infeasible cycles is in principle a difficult task, but the problem can be tackled using a combination of relaxation and Monte Carlo algorithms. Once identified, loops can be used to fix the models by imposing bounds on reactions.

Uniform sampling of steady state fluxes in genome–scale models is technically feasible [37], but loops are usually included. An open question is how to uniformly sample thermodynamically feasible flux configurations; this is much harder, since the solution space is non–convex.
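The removal of a harmless loop by flux minimization can be illustrated on a toy example. The sketch below is not the relaxation/Monte Carlo procedure of Chapter 2: it assumes that the cycle is already known, and the five–reaction network, the names S, CYCLE, mass_balance and remove_cycle, and all numbers are hypothetical. It subtracts from a steady state the multiple of the cycle that minimizes the total absolute flux, which leaves mass balance and exchange fluxes untouched.

```python
# Toy network (hypothetical): uptake -> A, r1: A -> B, r2: B -> C, r3: C -> A, out: B ->
# Rows are metabolites A, B, C; columns are the five reactions in the order above.
S = [
    [1, -1,  0,  1,  0],   # A
    [0,  1, -1,  0, -1],   # B
    [0,  0,  1, -1,  0],   # C
]

# Internal cycle r1 -> r2 -> r3: it satisfies S c = 0 and involves no exchange reaction.
CYCLE = [0, 1, 1, 1, 0]


def mass_balance(S, v):
    """Return S v, which vanishes at steady state."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def remove_cycle(v, cycle):
    """Subtract the multiple t of `cycle` that minimizes sum_i |v_i - t * cycle_i|.

    The cost is convex and piecewise linear in t, so a minimum sits at one of
    the breakpoints t = v_i / cycle_i; we simply evaluate all of them.
    Flux bounds are not checked here (an assumption of this sketch).
    """
    def total_flux(t):
        return sum(abs(v_i - t * c_i) for v_i, c_i in zip(v, cycle))

    breakpoints = [v_i / c_i for v_i, c_i in zip(v, cycle) if c_i != 0]
    t_best = min(breakpoints, key=total_flux)
    return [v_i - t_best * c_i for v_i, c_i in zip(v, cycle)]


# A steady state carrying 2.5 units of futile flux around the cycle.
v_loopy = [1.0, 3.5, 2.5, 2.5, 1.0]
v_clean = remove_cycle(v_loopy, CYCLE)
print(v_clean)
```

In this example the uptake and output fluxes keep their value of 1 while the flux through r2 and r3 drops to zero. When the objective or a fixed flux (such as ATP maintenance) has a nonzero component along the cycle, no such subtraction is available, which is the inconsistent case discussed above.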
A simple projection (including minimization of some norm, see Sect. 2.2.6) is able to remove the loops, but does not yield a uniform distribution of the fluxes. Efficient uniform sampling is likely to require preliminary knowledge of the possible loops, that is, dissecting the solution space into loop–free convex subspaces.

Constrained Allocation Flux Balance Analysis

CAFBA shows that integration of genome–scale models of metabolism and regulation may be possible using some simple constraints, without using detailed information about concentrations and the kinetic parameters of the single reactions (which are largely unknown). This remarkable finding is probably due to the existence of a few main regulation systems, such as the one mediated by the Crp–cAMP complex. We confirmed that the acetate switch and the reduced expression of TCA genes at high growth rates are part of a deliberate strategy [127, 8] operated by the cell to efficiently express its genes in rich media, when protein synthesis is the main bottleneck to growth.

Sticking to the interpretation of the weights as inverse turnover numbers (see Section 3.2.1), we observe that correct results are obtained for weight fluctuations that are much smaller than the actual variations among turnover numbers from reaction to reaction. This suggests that some kind of cross–regulation among enzyme levels, kinetic constants and concentrations [3, 91] should be reflected in the weights. Further exploration is needed to confirm this hypothesis.

More work on CAFBA should be done to assess its applicability range. For instance, one should check whether CAFBA predicts overflow metabolism in organisms other than Escherichia coli, such as Lactococcus lactis (excreting lactate and formate) and Saccharomyces cerevisiae (excreting ethanol) [129]. The randomization procedure carried out in Sect. 3.6 can in principle be validated using single–cell measurements of growth and enzyme levels [124, 73, 64].
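The tradeoff between growth yield and protein cost that drives overflow metabolism can be made concrete with a two–flux caricature. The sketch below is not the genome–scale CAFBA formulation of Chapter 3: the two lumped pathways, the functions solve_2d_lp and toy_allocation, and every parameter value are assumptions chosen only for illustration. Growth is maximized under a carbon uptake limit and a single linear proteome constraint, in which each flux is weighted by a cost and the ribosomal share is taken proportional to the growth rate.

```python
from itertools import combinations


def solve_2d_lp(objective, constraints, tol=1e-9):
    """Maximize objective . (x, y) subject to a . (x, y) <= b for each (a, b).

    Brute-force vertex enumeration: adequate for a bounded two-variable
    problem, not a substitute for a real LP solver.
    """
    best = None
    for (a1, b1), (a2, b2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel constraints do not define a vertex
        x = (b1 * a2[1] - a1[1] * b2) / det
        y = (a1[0] * b2 - a2[0] * b1) / det
        if all(a[0] * x + a[1] * y <= b + tol for a, b in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


def toy_allocation(u_max, y_resp=1.0, y_ferm=0.4, w_resp=1.0, w_ferm=0.1,
                   w_ribo=0.5, phi_max=1.0):
    """Growth-maximizing split of carbon between respiration and fermentation.

    All parameter values are made up for illustration.
    """
    # Proteome cost per unit flux, including the ribosomal share w_ribo * growth.
    c_resp = w_resp + w_ribo * y_resp
    c_ferm = w_ferm + w_ribo * y_ferm
    constraints = [
        ((1.0, 1.0), u_max),          # carbon uptake limit
        ((c_resp, c_ferm), phi_max),  # global proteome allocation constraint
        ((-1.0, 0.0), 0.0),           # v_resp >= 0
        ((0.0, -1.0), 0.0),           # v_ferm >= 0
    ]
    growth, v_resp, v_ferm = solve_2d_lp((y_resp, y_ferm), constraints)
    return {"growth": growth, "v_resp": v_resp, "v_ferm": v_ferm}


poor = toy_allocation(u_max=0.5)   # carbon-limited regime
rich = toy_allocation(u_max=1.0)   # proteome-limited regime
print(poor)
print(rich)
```

With these made–up numbers the poor medium is handled by respiration alone, while in the rich medium the proteome constraint becomes active and part of the carbon is diverted to the cheaper, lower–yield pathway. The switch appears without any kinetic detail, which is the qualitative point made above.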
CAFBA's ability to accurately model fermentation can be used to model microbial communities [55], or to study how evolutionary pressure shaped metabolic networks [7] by evolving cells in different environments.

Growth laws, role of the offsets and dynamics

As noted before, one of the main take–home messages of this thesis work is that statics and dynamics often give complementary information. This is particularly true when we think about regulation, which ultimately is the way cells adapt to different environmental conditions. The work presented in Chapter 4 can pave the way to new experiments about the growth laws. The validity of the concurrent use of the growth laws and Dennis and Bremer's instantaneous model can be probed by performing upshifts in different conditions, e.g. varying the amount of translation–inhibiting antibiotics or reducing nitrogen availability. The growth rate increase at the upshift (λ0 − λi) can be related to the offsets and slopes of the various proteome sectors. A realistic kinetic model of the various proteome sectors, using what is known about the Crp regulon, ppGpp–mediated regulation of ribosome affiliated proteins etc., can also be formulated, based on the work of You et al. [137].

Bibliography

[1] GLPK - GNU Linear Programming Kit. http://www.gnu.org/software/glpk.
[2] GLPKMEX - a Matlab MEX interface for the GLPK library. http://sourceforge.net/projects/glpkmex/.
[3] Adadi, R., Volkmer, B., Milo, R., Heinemann, M., and Shlomi, T. Prediction of microbial growth rate versus biomass yield by a metabolic network with kinetic parameters. PLoS computational biology, 8 (2012), e1002575.
[4] Balaban, N. Q., Merrin, J., Chait, R., Kowalik, L., and Leibler, S. Bacterial persistence as a phenotypic switch. Science, 305 (2004), 1622.
[5] Baneyx, F. Recombinant protein expression in Escherichia coli. Current opinion in biotechnology, 10 (1999), 411.
[6] Barabasi, A.-L. and Oltvai, Z. N.
Network biology: understanding the cell’s functional organization. Nature Reviews Genetics, 5 (2004), 101.
[7] Bardoscia, M., Marsili, M., and Samal, A. Phenotypic constraints promote latent versatility and carbon efficiency in metabolic networks. arXiv preprint arXiv:1408.4555, (2014).
[8] Basan, M., Hui, S., Zhang, Z., Shen, Y., Williamson, J. R., and Hwa, T. Efficient allocation of proteomic resources for energy metabolism results in acetate overflow. In preparation, (2014).
[9] Beard, D. A., Babson, E., Curtis, E., and Qian, H. Thermodynamic constraints for biochemical networks. Journal of theoretical biology, 228 (2004), 327.
[10] Beard, D. A., Liang, S.-d., and Qian, H. Energy balance for analysis of complex metabolic networks. Biophysical journal, 83 (2002), 79.
[11] Beard, D. A. and Qian, H. Relationship between thermodynamic driving force and one-way fluxes in reversible processes. PLoS One, 2 (2007), e144.
[12] Beard, D. A. and Qian, H. Chemical biophysics: Quantitative analysis of cellular systems. Cambridge University Press (2008).
[13] Beg, Q., Vazquez, A., Ernst, J., De Menezes, M., Bar-Joseph, Z., Barabási, A.-L., and Oltvai, Z. Intracellular crowding defines the mode and sequence of substrate uptake by Escherichia coli and constrains its metabolic activity. Proceedings of the National Academy of Sciences, 104 (2007), 12663.
[14] Benyamini, T., Folger, O., Ruppin, E., and Shlomi, T. Method flux balance analysis accounting for metabolite dilution. Genome Biol., 11 (2010), R43.
[15] Bialek, W. Biophysics: searching for principles. Princeton University Press (2012).
[16] Binder, K. and Heermann, D. Monte Carlo simulation in statistical physics: an introduction. Springer (2010).
[17] Bokinsky, G., et al. Synthesis of three advanced biofuels from ionic liquid-pretreated switchgrass using engineered Escherichia coli. Proceedings of the National Academy of Sciences, 108 (2011), 19949.
[18] Boogerd, F., Bruggeman, F. J., Hofmeyr, J.-H.
S., and Westerhoff, H. V. Systems biology: philosophical foundations. Elsevier (2007).
[19] Bremer, H. and Dennis, P. P. Transition period following a nutritional shift-up in the bacterium Escherichia coli B/r: Stable RNA and protein synthesis. Journal of theoretical biology, 52 (1975), 365.
[20] Bremer, H., Dennis, P. P., et al. Modulation of chemical composition and other parameters of the cell by growth rate. Escherichia coli and Salmonella: cellular and molecular biology, 2 (1996), 1553.
[21] Chassagnole, C., Noisommit-Rizzi, N., Schmid, J. W., Mauch, K., and Reuss, M. Dynamic modeling of the central carbon metabolism of Escherichia coli. Biotechnology and bioengineering, 79 (2002), 53.
[22] Chen, R. Bacterial expression systems for recombinant protein production: E. coli and beyond. Biotechnology advances, 30 (2012), 1102.
[23] Condon, C., Liveris, D., Squires, C., Schwartz, I., and Squires, C. L. rRNA operon multiplicity in Escherichia coli and the physiological implications of rrn inactivation. Journal of bacteriology, 177 (1995), 4152.
[24] Cooper, S. What is the bacterial growth law during the division cycle? Journal of bacteriology, 170 (1988), 5001.
[25] Cooper, S. On the fiftieth anniversary of the Schaechter, Maaløe, Kjeldgaard experiments: implications for cell-cycle and cell-growth control. BioEssays, 30 (2008), 1019.
[26] Cornish-Bowden, A. Fundamentals of enzyme kinetics. John Wiley & Sons (2013).
[27] Covert, M. W. and Palsson, B. Ø. Transcriptional regulation in constraints-based metabolic models of Escherichia coli. Journal of Biological Chemistry, 277 (2002), 28058.
[28] Covert, M. W., Schilling, C. H., and Palsson, B. Regulation of gene expression in flux balance models of metabolism. Journal of theoretical biology, 213 (2001), 73.
[29] Csonka, L. N., Ikeda, T. P., Fletcher, S. A., and Kustu, S.
The accumulation of glutamate is necessary for optimal growth of Salmonella typhimurium in media of high osmolality but not induction of the proU operon. Journal of bacteriology, 176 (1994), 6324.
[30] Dashko, S., Compagno, C., and Piškur, J. Why, when and how did yeast evolve alcoholic fermentation? FEMS yeast research, (2014).
[31] De Deken, R. The Crabtree effect: a regulatory system in yeast. Journal of General Microbiology, 44 (1966), 149.
[32] De Martino, A., De Martino, D., Mulet, R., and Uguzzoni, G. Reaction networks as systems for resource allocation: A variational principle for their non-equilibrium steady states. PLoS One, 7 (2012), e39849.
[33] De Martino, A. and Marinari, E. The solution space of metabolic networks: Producibility, robustness and fluctuations. In Journal of Physics: Conference Series, vol. 233, p. 012019. IOP Publishing (2010).
[34] De Martino, D. Thermodynamics of biochemical networks and duality theorems. Physical Review E, 87 (2013), 052108.
[35] De Martino, D., Capuani, F., Mori, M., De Martino, A., and Marinari, E. Counting and correcting thermodynamically infeasible flux cycles in genome-scale metabolic networks. Metabolites, 3 (2013), 946.
[36] De Martino, D., Figliuzzi, M., De Martino, A., and Marinari, E. A scalable algorithm to explore the Gibbs energy landscape of genome–scale metabolic networks. PLoS computational biology, 8 (2012), e1002562.
[37] De Martino, D., Mori, M., and Parisi, V. Uniform sampling of steady states in metabolic networks: heterogeneous scales and rounding. Submitted to PLoS ONE, (2014).
[38] Dennis, P. P. and Bremer, H. Differential rate of ribosomal protein synthesis in Escherichia coli B/r. Journal of Molecular Biology, 84 (1974), 407.
[39] Dennis, P. P. and Bremer, H. Macromolecular composition during steady-state growth of Escherichia coli B/r. Journal of bacteriology, 119 (1974), 270.
[40] Diaz-Ruiz, R., Rigoulet, M., and Devin, A.
The Warburg and Crabtree effects: on the origin of cancer cell energy metabolism and of yeast glucose repression. Biochimica et Biophysica Acta (BBA)-Bioenergetics, 1807 (2011), 568.
[41] Edwards, J. and Palsson, B. The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proceedings of the National Academy of Sciences, 97 (2000), 5528.
[42] Ehrenberg, M., Bremer, H., and Dennis, P. P. Medium-dependent control of the bacterial growth rate. Biochimie, 95 (2013), 643.
[43] Erickson, D., Mori, M., and Schink, S. Investing in a proteome stock as strategy to cope with environmental changes. In preparation, (2014).
[44] Famili, I. and Palsson, B. O. The convex basis of the left null space of the stoichiometric matrix leads to the definition of metabolically meaningful pools. Biophysical journal, 85 (2003), 16.
[45] Feist, A. M., Henry, C. S., Reed, J. L., Krummenacker, M., Joyce, A. R., Karp, P. D., Broadbelt, L. J., Hatzimanikatis, V., and Palsson, B. Ø. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Molecular systems biology, 3 (2007).
[46] Feist, A. M. and Palsson, B. O. The biomass objective function. Current opinion in microbiology, 13 (2010), 344.
[47] Fendt, S.-M., Buescher, J. M., Rudroff, F., Picotti, P., Zamboni, N., and Sauer, U. Tradeoff between enzyme and metabolite efficiency maintains metabolic homeostasis upon perturbations in enzyme capacity. Molecular systems biology, 6 (2010).
[48] Ferreira, L. M. Cancer metabolism: the Warburg effect today. Experimental and molecular pathology, 89 (2010), 372.
[49] Flamholz, A., Noor, E., Bar-Even, A., Liebermeister, W., and Milo, R. Glycolytic strategy as a tradeoff between energy yield and protein cost. Proceedings of the National Academy of Sciences, 110 (2013), 10039.
[50] Frey, E. and Kroy, K. Brownian motion: a paradigm of soft matter and biological physics.
Annalen der Physik, 14 (2005), 20.
[51] Gaspard, P. Fluctuation theorem for nonequilibrium reactions. The Journal of chemical physics, 120 (2004), 8898.
[52] Ge, H., Qian, M., and Qian, H. Stochastic theory of nonequilibrium steady states. Part II: Applications in chemical biophysics. Physics Reports, 510 (2012), 87.
[53] Goeddel, D. V., et al. Expression in Escherichia coli of chemically synthesized genes for human insulin. Proceedings of the National Academy of Sciences, 76 (1979), 106.
[54] Görke, B. and Stülke, J. Carbon catabolite repression in bacteria: many ways to make the most out of nutrients. Nature Reviews Microbiology, 6 (2008), 613.
[55] Harcombe, W. R., et al. Metabolic resource allocation in individual microbes determines ecosystem interactions and spatial dynamics. Cell reports, 7 (2014), 1104.
[56] Henry, C. S., Broadbelt, L. J., and Hatzimanikatis, V. Thermodynamics-based metabolic flux analysis. Biophysical journal, 92 (2007), 1792.
[57] Hoppe, A., Hoffmann, S., and Holzhütter, H.-G. Including metabolite concentrations into flux balance analysis: thermodynamic realizability as a constraint on flux distributions in metabolic networks. BMC systems biology, 1 (2007), 23.
[58] Hui, T. et al. Quantitative mass spectrometry reveals simple proteome partition in Escherichia coli. Submitted to Cell, (2014).
[59] Ibarra, R. U., Edwards, J. S., and Palsson, B. O. Escherichia coli K12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature, 420 (2002), 186.
[60] Johnson, D. B. Finding all the elementary circuits of a directed graph. SIAM Journal on Computing, 4 (1975), 77.
[61] Keseler, I. M., et al. EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic acids research, 39 (2011), D583.
[62] Kim, B. H. and Gadd, G. M. Bacterial physiology and metabolism. Cambridge University Press (2008).
[63] Kitano, H. Systems biology: a brief overview. Science, 295 (2002), 1662.
[64] Kiviet, D.
J., Nghe, P., Walker, N., Boulineau, S., Sunderlikova, V., and Tans, S. J. Stochasticity of metabolism and growth at the single-cell level. Nature, (2014).
[65] Kjeldgaard, N. and Kurland, C. The distribution of soluble and ribosomal RNA as a function of growth rate. Journal of Molecular Biology, 6 (1963), 341.
[66] Kjeldgaard, N., Maaløe, O., and Schaechter, M. The transition between different physiological states during balanced growth of Salmonella typhimurium. Journal of General Microbiology, 19 (1958), 607.
[67] Klappenbach, J. A., Dunbar, J. M., and Schmidt, T. M. rRNA operon copy number reflects ecological strategies of bacteria. Applied and environmental microbiology, 66 (2000), 1328.
[68] Klumpp, S., Scott, M., Pedersen, S., and Hwa, T. Molecular crowding limits translation and cell growth. Proceedings of the National Academy of Sciences, 110 (2013), 16754.
[69] Koch, A. L. Overall controls on the biosynthesis of ribosomes in growing bacteria. Journal of theoretical biology, 28 (1970), 203.
[70] Koch, A. L. The adaptive responses of Escherichia coli to a feast and famine existence. Advances in microbial physiology, 6 (1971), 147.
[71] Koch, A. L. Growth measurement. Methods for general and molecular bacteriology, (1994), 248.
[72] Krauth, W. and Mézard, M. Learning algorithms with optimal stability in neural networks. Journal of Physics A: Mathematical and General, 20 (1987), L745.
[73] Labhsetwar, P., Cole, J. A., Roberts, E., Price, N. D., and Luthey-Schulten, Z. A. Heterogeneity in protein expression induces metabolic variability in a modeled Escherichia coli population. Proceedings of the National Academy of Sciences, 110 (2013), 14006.
[74] Lemuth, K., Hardiman, T., Winter, S., Pfeiffer, D., Keller, M., Lange, S., Reuss, M., Schmid, R., and Siemann-Herzberg, M. Global transcription and metabolic flux analysis of Escherichia coli in glucose-limited fed-batch cultivations. Applied and environmental microbiology, 74 (2008), 7002.
[75] Lerman, J. A., et al. In silico method for modelling metabolism and gene product expression at genome scale. Nature communications, 3 (2012), 929.
[76] Licht, T. R., Tolker-Nielsen, T., Holmstrøm, K., Krogfelt, K. A., and Molin, S. Inhibition of Escherichia coli precursor-16S rRNA processing by mouse intestinal contents. Environmental microbiology, 1 (1999), 23.
[77] Mahadevan, R., Edwards, J. S., and Doyle III, F. J. Dynamic flux balance analysis of diauxic growth in Escherichia coli. Biophysical journal, 83 (2002), 1331.
[78] Mahadevan, R. and Schilling, C. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metabolic engineering, 5 (2003), 264.
[79] Majewski, R. and Domach, M. Simple constrained-optimization view of acetate overflow in E. coli. Biotechnology and bioengineering, 35 (1990), 732.
[80] Mast, F. D., Ratushny, A. V., and Aitchison, J. D. Systems cell biology. The Journal of cell biology, 206 (2014), 695.
[81] Mezard, M. and Montanari, A. Information, physics, and computation. Oxford University Press (2009).
[82] Mikkola, R. and Kurland, C. Is there a unique ribosome phenotype for naturally occurring Escherichia coli? Biochimie, 73 (1991), 1061.
[83] Miller, S., Lesk, A. M., Janin, J., Chothia, C., et al. The accessible surface area and stability of oligomeric proteins. Nature, 328 (1987), 834.
[84] Molenaar, D., van Berlo, R., de Ridder, D., and Teusink, B. Shifts in growth strategies reflect tradeoffs in cellular economics. Molecular systems biology, 5 (2009).
[85] Monod, J. The growth of bacterial cultures. Annual Reviews in Microbiology, 3 (1949), 371.
[86] Mori, M., De Martino, A., Marinari, E., and Hwa, T. Constrained Allocation Flux Balance Analysis. In preparation for Molecular Systems Biology.
[87] Mori, M. et al. Pareto–optimality of Constrained Allocation Flux Balance Analysis predictions and comparison with other constraint based models. Working title, in preparation, (2015).
[88] Nath, K. and Koch, A. L. Protein degradation in Escherichia coli I. Measurement of rapidly and slowly decaying components. Journal of Biological Chemistry, 245 (1970), 2889.
[89] Neidhardt, F. C. Bacterial growth: Constant obsession with dn/dt. Journal of bacteriology, 181 (1999), 7405.
[90] Neidhardt, F. C., Ingraham, J. L., and Schaechter, M. Physiology of the bacterial cell: a molecular approach. Sinauer Associates Sunderland, MA (1990).
[91] Noor, E., Bar-Even, A., Flamholz, A., Reznik, E., Liebermeister, W., and Milo, R. Pathway thermodynamics highlights kinetic obstacles in central metabolism. PLoS computational biology, 10 (2014), e1003483.
[92] O’Brien, E. J., Lerman, J. A., Chang, R. L., Hyduke, D. R., and Palsson, B. Ø. Genome-scale models of metabolism and gene expression extend and refine growth phenotype prediction. Molecular systems biology, 9 (2013).
[93] Orth, J. D., Conrad, T. M., Na, J., Lerman, J. A., Nam, H., Feist, A. M., and Palsson, B. Ø. A comprehensive genome-scale reconstruction of Escherichia coli metabolism. Molecular systems biology, 7 (2011).
[94] Orth, J. D., Thiele, I., and Palsson, B. Ø. What is flux balance analysis? Nature biotechnology, 28 (2010), 245.
[95] Palsson, B. O. Systems biology. Cambridge University Press (2006).
[96] Panikov, N. S. Microbial growth kinetics. Springer (1995).
[97] Paul, B. J., Ross, W., Gaal, T., and Gourse, R. L. rRNA transcription in Escherichia coli. Annu. Rev. Genet., 38 (2004), 749.
[98] Pedersen, S. Escherichia coli ribosomes translate in vivo with variable rate. The EMBO journal, 3 (1984), 2895.
[99] Potrykus, K. and Cashel, M. (p)ppGpp: Still Magical? Annu. Rev. Microbiol., 62 (2008), 35.
[100] Poulsen, L. K., Licht, T. R., Rang, C., Krogfelt, K. A., and Molin, S. Physiological state of Escherichia coli BJ4 growing in the large intestines of streptomycin-treated mice. Journal of bacteriology, 177 (1995), 5840.
[101] Pramanik, J. and Keasling, J.
Stoichiometric model of Escherichia coli metabolism: incorporation of growth-rate dependent biomass composition and mechanistic energy requirements. Biotechnology and bioengineering, 56 (1997), 398.
[102] Price, N. D., Famili, I., Beard, D. A., and Palsson, B. Ø. Extreme pathways and Kirchhoff’s second law. Biophysical journal, 83 (2002), 2879.
[103] Price, N. D., Schellenberger, J., and Palsson, B. O. Uniform sampling of steady-state flux spaces: means to design experiments and to interpret enzymopathies. Biophysical journal, 87 (2004), 2172.
[104] Qian, H. and Beard, D. A. Thermodynamics of stoichiometric biochemical networks in living systems far from equilibrium. Biophysical chemistry, 114 (2005), 213.
[105] Qian, H., Beard, D. A., and Liang, S.-d. Stoichiometric network theory for nonequilibrium biochemical systems. European Journal of Biochemistry, 270 (2003), 415.
[106] Reed, J. L., Vo, T. D., Schilling, C. H., Palsson, B. O., et al. An expanded genome–scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol, 4 (2003), R54.
[107] Sauer, U., Canonaco, F., Heri, S., Perrenoud, A., and Fischer, E. The soluble and membrane-bound transhydrogenases UdhA and PntAB have divergent functions in NADPH metabolism of Escherichia coli. Journal of Biological Chemistry, 279 (2004), 6613.
[108] Savageau, M. A. Escherichia coli habitats, cell types, and molecular mechanisms of gene control. American Naturalist, (1983), 732.
[109] Schaechter, M., Maaløe, O., and Kjeldgaard, N. Dependency on medium and temperature of cell size and chemical composition during balanced growth of Salmonella typhimurium. Journal of General Microbiology, 19 (1958), 592.
[110] Schellenberger, J., et al. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nature protocols, 6 (2011), 1290.
[111] Schilling, C. H., Letscher, D., and Palsson, B. Ø.
Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective. Journal of theoretical biology, 203 (2000), 229.
[112] Schomburg, I., et al. BRENDA in 2013: integrated reactions, kinetic data, enzyme function data, improved disease classification: new options and contents in BRENDA. Nucleic acids research, 41 (2013), D764.
[113] Schrijver, A. Theory of linear and integer programming. John Wiley & Sons (1998).
[114] Scott, M., Gunderson, C. W., Mateescu, E. M., Zhang, Z., and Hwa, T. Interdependence of cell growth and gene expression: origins and consequences. Science, 330 (2010), 1099.
[115] Scott, M. and Hwa, T. Bacterial growth laws and their applications. Current opinion in biotechnology, 22 (2011), 559.
[116] Segre, D., Vitkup, D., and Church, G. M. Analysis of optimality in natural and perturbed metabolic networks. Proceedings of the National Academy of Sciences, 99 (2002), 15112.
[117] Seifert, U. Entropy production along a stochastic trajectory and an integral fluctuation theorem. Physical review letters, 95 (2005), 040602.
[118] Sevick, E., Prabhakar, R., Williams, S. R., and Searles, D. J. Fluctuation theorems. arXiv preprint arXiv:0709.3888, (2007).
[119] Shlomi, T., Berkman, O., and Ruppin, E. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proceedings of the National Academy of Sciences of the United States of America, 102 (2005), 7695.
[120] Shlomi, T., Cabili, M. N., Herrgård, M. J., Palsson, B. Ø., and Ruppin, E. Network-based prediction of human tissue-specific metabolism. Nature biotechnology, 26 (2008), 1003.
[121] Soh, K. C. and Hatzimanikatis, V. Network thermodynamics in the postgenomic era. Current opinion in microbiology, 13 (2010), 350.
[122] Solopova, A., van Gestel, J., Weissing, F. J., Bachmann, H., Teusink, B., Kok, J., and Kuipers, O. P. Bet-hedging during bacterial diauxic shift.
Proceedings of the National Academy of Sciences, (2014), 201320063.
[123] Steuer, R., Gross, T., Selbig, J., and Blasius, B. Structural kinetic modeling of metabolic networks. Proceedings of the National Academy of Sciences, 103 (2006), 11868.
[124] Taniguchi, Y., Choi, P. J., Li, G.-W., Chen, H., Babu, M., Hearn, J., Emili, A., and Xie, X. S. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science, 329 (2010), 533.
[125] Taymaz-Nikerel, H., Borujeni, A. E., Verheijen, P. J., Heijnen, J. J., and van Gulik, W. M. Genome–derived minimal metabolic models for Escherichia coli MG1655 with estimated in vivo respiratory ATP stoichiometry. Biotechnology and bioengineering, 107 (2010), 369.
[126] Thiele, I., et al. A community-driven global reconstruction of human metabolism. Nature biotechnology, 31 (2013), 419.
[127] Valgepea, K., Adamberg, K., Nahku, R., Lahtvee, P.-J., Arike, L., and Vilu, R. Systems biology approach reveals that overflow metabolism of acetate in Escherichia coli is triggered by carbon catabolite repression of acetyl-CoA synthetase. BMC systems biology, 4 (2010), 166.
[128] Valgepea, K., Adamberg, K., Seiman, A., and Vilu, R. Escherichia coli achieves faster growth by increasing catalytic and translation rates of proteins. Molecular BioSystems, 9 (2013), 2344.
[129] van Hoek, M. J. and Merks, R. M. Redox balance is key to explaining full vs. partial switching to low-yield metabolism. BMC systems biology, 6 (2012), 22.
[130] Van Regenmortel, M. H. Reductionism and complexity in molecular biology. EMBO reports, 5 (2004), 1016.
[131] Veening, J.-W., Smits, W. K., and Kuipers, O. P. Bistability, epigenetics, and bet-hedging in bacteria. Annu. Rev. Microbiol., 62 (2008), 193.
[132] Vemuri, G., Altman, E., Sangurdekar, D., Khodursky, A., and Eiteman, M. Overflow metabolism in Escherichia coli during steady-state growth: transcriptional regulation and effect of the redox ratio.
Applied and environmental microbiology, 72 (2006), 3653.
[133] Wiback, S. J., Famili, I., Greenberg, H. J., and Palsson, B. Ø. Monte Carlo sampling can be used to determine the size and shape of the steady-state flux space. Journal of theoretical biology, 228 (2004), 437.
[134] Woese, C. R. A new biology for a new century. Microbiology and Molecular Biology Reviews, 68 (2004), 173.
[135] Wolfe, A. J. The acetate switch. Microbiology and Molecular Biology Reviews, 69 (2005), 12.
[136] Wright, J. and Wagner, A. Exhaustive identification of steady state cycles in large stoichiometric networks. BMC systems biology, 2 (2008), 61.
[137] You, C., et al. Coordination of bacterial proteome with metabolism by cyclic AMP signalling. Nature, 500 (2013), 301.
[138] Young, R. and Bremer, H. Polypeptide–chain–elongation rate in Escherichia coli B/r as a function of growth rate. Biochem. J, 160 (1976), 185.
[139] Zheng, D., Constantinidou, C., Hobman, J. L., and Minchin, S. D. Identification of the CRP regulon using in vitro and in vivo transcriptional profiling. Nucleic acids research, 32 (2004), 5874.
[140] Zhuang, K., Vemuri, G. N., and Mahadevan, R. Economics of membrane occupancy and respiro-fermentation. Molecular systems biology, 7 (2011).