Genetic Control Systems

Diversity and Design in Cellular
Networks
Prediction, Control and Design
of and with Biology
Adam Arkin, University of California, Berkeley
http://genomics.lbl.gov
"Nothing in biology makes sense except in the light of evolution."
Theodosius Dobzhansky, The American Biology Teacher, March 1973
[Images: a scientist, Bacillus, yeast, Volvox, an egg, Humpty Dumpty]
The Advent of Molecular Biology
[Diagram: genome → (through RNA) → macromolecules → metabolites (biochemistry), with feedback & feedforward loops]
Myxococcus xanthus
• Even cells as “simple” as bacteria
are highly social, differentiating,
sensing/actuation systems
Images from Reichardt or D. Kaiser
Immune cells
• They perform amazing engineering
feats under the control of complex
cellular networks
Onsum, Arkin, UCB
Mione, Redd, UCL
1/50 of the known neutrophil chemotaxis network
[Diagram labels: C5a receptor, Fc receptor, calcium control, PIP3 control]
Systems and Synthetic Biology
• Systems biology seeks to uncover the design and
control principles of cellular systems through
– Biophysical characterization of macromolecules and other cellular
structures
– Comparative genomic analysis
– Functional genomic and high-throughput phenotyping of cellular
systems
– Mathematical modeling of regulatory networks and interacting cell
populations.
• Synthetic biology seeks to develop new designs in the
biological substrate for biotechnology, medicine, and
materials science.
– Founded on the understanding garnered from systems biology
– New modalities for genetic engineering and directed evolution
– Scaling towards programmable biomaterials.
Systems biology is necessary
• Because of the highly interconnected nature of
cellular networks
• Because it is the best way to understand what
is controllable and what is not in pathway
dynamics
• Because it discovers what designs evolution
has arrived at to solve cellular engineering
problems, which we can emulate in our own designs.
A broader overview
• Evolutionary Game Theory
• Ecological Modeling
• Population Biology
• Epidemiology
• Neuroscience
• Organ Physiology
• Immune Networks
• Cellular Networks
• Problems:
– Static and Dynamic Representations
– Physical Picture for Representation (e.g. deterministic vs. stochastic)
– Mathematical Description of Physics (e.g. Langevin vs. Master Equation)
– Levels of abstraction: formal and ad hoc
– Measurement: high-throughput/broad-brush/imprecise vs. low-throughput/targeted/precise
Chemical Kinetics: The short course I.
Consider a collision between two hard spheres (radii r1 and r2, relative speed v12).
In a small time interval, dt, sphere 1 will sweep out a small collision volume relative to sphere 2:
r12 = r1 + r2
V_col ≈ π·r12²·v12·dt
If the center of sphere 2 lies within this volume at time t, then in the small time interval the spheres will collide.
The probability that a given sphere of type 2 is in that volume is simply V_col/V (where V is the containing volume).
All that remains is to average this quantity over the velocity distributions of the spheres:
⟨V_col⟩/V = V⁻¹·π·r12²·⟨v12⟩·dt
Chemical Kinetics: The short course II.
Given that, at time t, there are X1 type-1 spheres and X2 type-2 spheres, the probability that a 1–2 collision will occur in V in the next time interval is:
X1·X2·V⁻¹·π·r12²·⟨v12⟩·dt
Now if each collision has a probability of causing a reaction, then in analogy to the last equation, all we can say is:
X1·X2·c1·dt = average probability that an R1 reaction will occur somewhere in V within the interval dt.
Chemical Kinetics: The Master Equation I.
If we wish to map trajectories of chemical concentration, we want to know the probability that there will be
X = {X1, X2, X3, ..., Xn}
molecules of each species in the chemical mechanism at time t in V. We call that probability:
P(X, t)
This function gives complete knowledge of the stochastic state of the system at time t.
The master equation is simply the time evolution of this probability. To derive it we need P(X, t + dt), which follows simply from our previous work.
It is the sum of two terms:
1. The probability that we were at X at time t and we stayed there.
2. The probability that a reaction of type m brought us to this state.
Chemical Kinetics: The Master Equation II
The first term is given by:
P_stay = P(X, t)·[1 − Σ_{m=1}^{M} a_m·dt]
where
a_m·dt = h_m·c_m·dt = the probability that a reaction of type m will occur given that the system is in a given state at time t,
and where h_m is a combinatorial function of the number of molecules of each chemical species in reaction type m.
Chemical Kinetics: The Master Equation III
The second term is given by:
P_enter = Σ_{m=1}^{M} B_m·dt
where B_m is the probability that the system is one reaction of type m away from the state at time t and then undergoes a reaction of type m.
Plugging these terms into the equation for P(X, t + dt) and rearranging, we arrive at the master equation:
∂P(X, t)/∂t = Σ_{m=1}^{M} [ B_m − a_m·P(X, t) ]
Deterministic Kinetics I.
∂P(X, t)/∂t = Σ_{m=1}^{M} [ B_m − a_m·P(X, t) ]
Deterministic kinetics may be derived with some assumptions from the master equation. The end result is simply a set of coupled ODEs:
dX/dt = S·v
where S is the stoichiometric matrix and v is a vector of rate laws.
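As a minimal sketch of this formulation, the following integrates dX/dt = S·v with explicit Euler for the enzyme mechanism of the example that follows (X + 2Y → 2Z; Z + E ⇌ EZ; EZ → E + P). All rate constants and initial concentrations here are hypothetical, chosen only for illustration:

```python
import numpy as np

# Enzyme mechanism: X + 2Y -> 2Z;  Z + E <-> EZ;  EZ -> E + P
# Species order: X, Y, Z, E, EZ, P; reactions: R1, R2, R-2, Rcat
S = np.array([
    [-1,  0,  0,  0],   # X
    [-2,  0,  0,  0],   # Y
    [ 2, -1,  1,  0],   # Z
    [ 0, -1,  1,  1],   # E
    [ 0,  1, -1, -1],   # EZ
    [ 0,  0,  0,  1],   # P
], dtype=float)

def flux(c, k1=1.0, k2=1.0, km2=0.5, kcat=2.0):
    """Mass-action flux vector v for the four reactions."""
    X, Y, Z, E, EZ, P = c
    return np.array([k1 * X * Y**2, k2 * E * Z, km2 * EZ, kcat * EZ])

def step(c, dt):
    """One explicit-Euler step of dX/dt = S @ v."""
    return c + dt * (S @ flux(c))

c = np.array([1.0, 1.0, 0.0, 0.5, 0.0, 0.0])
for _ in range(20000):          # integrate to t = 20
    c = step(c, 1e-3)
print(c)  # product P accumulates; total enzyme E + EZ is conserved
```

Note that the conservation of total enzyme (E + EZ) is built into the stoichiometric matrix: the E and EZ rows sum to zero in every column.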
Example: Enzyme kinetics
Mathematical Representation
X + 2Y → 2Z
Z + E ⇌ EZ
EZ → E + P   (enzymatic)
Very simplest "mass action representation":

d[X, Y, Z, E, EZ, P]ᵀ/dt = S·v

     X  [ -1   0   0   0 ]        [ k1·X·Y²  ]
     Y  [ -2   0   0   0 ]        [ k2·E·Z   ]
S =  Z  [  2  -1   1   0 ]   v =  [ k-2·EZ   ]
     E  [  0  -1   1   1 ]        [ kcat·EZ  ]
     EZ [  0   1  -1  -1 ]
     P  [  0   0   0   1 ]

Stoichiometric Matrix            Flux Vector
Mathematical Representation
Full mechanism:            Reduced:
X + 2Y → 2Z                X + 2Y → 2Z
Z + E ⇌ EZ → E + P         Z → P
Oftentimes the enzyme isn't represented:

d[X, Y, Z, P]ᵀ/dt = S·v

     X [ -1   0 ]        [ k1·X·Y²         ]
S =  Y [ -2   0 ]   v =  [ Vmax·Z/(Km + Z) ]
     Z [  2  -1 ]
     P [  0   1 ]
Enzyme Kinetics II.
For the mechanism E + S ⇌ E·S → E + P:
d[E]/dt = −k1·[E][S] + k2·[E·S] + k3·[E·S]
d[E·S]/dt = k1·[E][S] − k2·[E·S] − k3·[E·S]
d[P]/dt = k3·[E·S]
But oftentimes we make assumptions equivalent to a singular perturbation. E.g. we assume that E, S, and E·S are in rapid equilibrium:
Etot = [E] + [E·S]
[E][S]/[E·S] = KM
Etot = KM·[E·S]/[S] + [E·S]
[E·S] = Etot·[S]/(KM + [S])
dP/dt = k3·[E·S] = k3·Etot·[S]/(KM + [S]) = Vmax·[S]/(KM + [S])
These forms are the common forms used in basic analysis.
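A quick numeric check of the rapid-equilibrium algebra above, with hypothetical values Etot = 1, KM = 0.5, k3 = 2:

```python
def es_complex(S, Etot=1.0, KM=0.5):
    """Rapid-equilibrium complex: [E.S] = Etot*[S]/(KM + [S])."""
    return Etot * S / (KM + S)

def dP_dt(S, k3=2.0, Etot=1.0, KM=0.5):
    """dP/dt = k3*[E.S] = Vmax*[S]/(KM + [S]), with Vmax = k3*Etot."""
    return k3 * es_complex(S, Etot, KM)

# Check the derivation: with this [E.S], conservation and the
# equilibrium condition [E][S]/[E.S] = KM both hold.
S = 0.75
ES = es_complex(S)
E = 1.0 - ES                 # Etot = [E] + [E.S]
print(E * S / ES)            # = KM = 0.5
print(dP_dt(0.5))            # at [S] = KM the rate is Vmax/2 = 1.0
```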
Stationary State Analysis
At a stationary state dX/dt = 0, so

    [ -1   0   0   0 ]   [ k1·X·Y²  ]
    [ -2   0   0   0 ]   [ k2·E·Z   ]
    [  2  -1   1   0 ] · [ k-2·EZ   ]  =  0
    [  0  -1   1   1 ]   [ kcat·EZ  ]
    [  0   1  -1  -1 ]
    [  0   0   0   1 ]

Clearly, the steady-state fluxes are in the "null space" of the stoichiometric matrix.
But these are only unique if significant constraints are also applied (the system is underdetermined).
They are also highly dependent on "representation".
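One way to see the null-space statement concretely is to compute the null space of the example's stoichiometric matrix numerically, here sketched with NumPy's SVD:

```python
import numpy as np

# Stoichiometric matrix of the enzyme example (species X, Y, Z, E, EZ, P;
# reactions R1: X+2Y->2Z, R2: Z+E->EZ, R3: EZ->Z+E, R4: EZ->E+P).
S = np.array([
    [-1,  0,  0,  0],
    [-2,  0,  0,  0],
    [ 2, -1,  1,  0],
    [ 0, -1,  1,  1],
    [ 0,  1, -1, -1],
    [ 0,  0,  0,  1],
], dtype=float)

# Steady-state flux vectors v satisfy S @ v = 0: they span the null space,
# which the SVD exposes as right singular vectors with zero singular value.
_, sv, Vt = np.linalg.svd(S)
null_basis = Vt[sv < 1e-10].T
print(null_basis.shape)        # (4, 1): a single steady-state flux mode
v = null_basis[:, 0]
print(np.round(v / v[1], 6))   # proportional to [0, 1, 1, 0]
```

Here the only steady-state mode with no constraints applied is the futile R2/R3 binding cycle, illustrating why additional constraints (inputs, outputs, measured fluxes) are needed to make steady-state fluxes unique.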
The Stoichiometric Matrix
        R1  R2  R3  R4
X   [  -1   0   0   0 ]
Y   [  -2   0   0   0 ]
Z   [   2  -1   1   0 ]
E   [   0  -1   1   1 ]
EZ  [   0   1  -1  -1 ]
P   [   0   0   0   1 ]
• This matrix is a description of the
“topology” of the network.
• It is tricky to abstract into a simple
incidence matrix, for example.
• Most experimental measurements can
only capture a small fraction of the
interactions that make up a network.
• However, it does put some limits on
behavior…
Graph Theory: “Scale-Free” networks?
• Nodes are protein domains
• Edges are “interactions”
• Statements are made about
– Robustness
– Signal Propagation (small world properties)
– Evolution
Stability Analysis for Deterministic Systems
• ∅ → a          v = m
• a → ∅          v = k·a
• a + 2b → 3b    v = a·b²
• b → c          v = b
da/dt = m − k·a − a·b²
db/dt = k·a + a·b² − b
Stationary State
da/dt = m − k·a − a·b² = 0
db/dt = k·a + a·b² − b = 0
a_ss = m/(m² + k),  b_ss = m
So for any given value of the "parameters" m and k we can calculate the steady state.
Stability
• We calculate stability by figuring out if small perturbations around a stationary state grow away from the state or fall back towards the state.
• So we expand our differential equations around a steady state and ask how small perturbations in a and b grow.
Stability
da/dt = m − k1·a − a·b² = f(a, b | m, k1)
db/dt = k1·a + a·b² − b = g(a, b | m, k1)
Expanding around the steady state:
d(δa)/dt = f_ss + (∂f/∂a)_ss·δa + (∂f/∂b)_ss·δb
d(δb)/dt = g_ss + (∂g/∂a)_ss·δa + (∂g/∂b)_ss·δb
f_ss = g_ss = 0
e.g., (∂f/∂a)_ss = −(k1 + b_ss²) = −(k1 + m²)
Stability
da/dt = m − k1·a − a·b² = f(a, b | m, k1)
db/dt = k1·a + a·b² − b = g(a, b | m, k1)
e.g., (∂f/∂a)_ss = −(k1 + b_ss²) = −(k1 + m²)
Collecting all the partial derivatives gives the Jacobian:

    [ ∂f1/∂x1  …  ∂f1/∂xn ]
J = [    ⋮            ⋮    ]
    [ ∂fn/∂x1  …  ∂fn/∂xn ]

which, evaluated at the steady state (using a_ss = m/(k1 + m²), b_ss = m), is

J = [ −(k1 + m²)    −2m²/(k1 + m²)     ]
    [   k1 + m²      2m²/(k1 + m²) − 1 ]
Stability
d/dt [δa, δb]ᵀ = J·[δa, δb]ᵀ

[δa]   [ c1·exp(λ1·t) + c2·exp(λ2·t) ]
[δb] = [ c3·exp(λ1·t) + c4·exp(λ2·t) ]

Thus the λ are the eigenvalues of the perturbation matrix and will determine if the perturbations grow or diminish.
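This stability test is mechanical to automate. A sketch for the a, b model above (parameter values are illustrative):

```python
import numpy as np

def jacobian(m, k1):
    """Jacobian of da/dt = m - k1*a - a*b^2, db/dt = k1*a + a*b^2 - b,
    evaluated at the steady state a_ss = m/(k1 + m^2), b_ss = m."""
    a, b = m / (k1 + m**2), m
    return np.array([[-(k1 + b**2), -2*a*b],
                     [  k1 + b**2,  2*a*b - 1]])

def is_stable(m, k1):
    """Stable iff every eigenvalue of J has a negative real part."""
    return bool(np.all(np.linalg.eigvals(jacobian(m, k1)).real < 0))

print(is_stable(0.5, 1.0))   # True: perturbations decay back to the state
print(is_stable(0.5, 0.01))  # False: the steady state loses stability
```

For this model the determinant of J is always k1 + m² > 0, so stability is decided by the trace alone; lowering k1 far enough flips its sign and destabilizes the steady state.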
Why is quantitative analysis important?
[Figure: steady state [A]ss vs. [B-p] (each 0–20) for the phosphorylation scheme B-p → A ⇌ A-p; two variants of the scheme give qualitatively different response curves]
E.g. Focal Adhesion Kinase Alternative Splice
Quantitative Analysis
For the scheme B-p → A ⇌ A-p:
dA/dt = Dephosphorylation − (Phosphorylation_B + Phosphorylation_A-p)
Phosphorylation_B = kcat-f·[B-p]·[A]/(K_Af + [A])
Dephosphorylation = Vmax-r·[A-p]/(K_Ar + [A-p])
Phosphorylation_A-p = kcat-fA·[A-p]·[A]/(K_Af2 + [A])
Bistability
A simple model of the positive feedback
[Figure: stationary state [FAK-I] as a function of kc, with monostable, weakly bistable, and irreversibly bistable regimes (e.g. kC = 1.6)]
kc – catalytic constant for the trans-autophosphorylation.
Signal Filtering
Chemical frequency filtering can be achieved using simple mass-action kinetics. The filtering of an input chemical oscillation is dependent on the reaction rates in the network.
First Order Reaction
Input (1 + sin ωt) → A → ∅   (decay rate k)
|A| = |sin ωt| / [ k·(1 + ω²/k²)^(1/2) ]
[Figure: output amplitude (0–0.1) vs. frequency ω, rolling off above ω ≈ k]
Second Order Reaction
P + sin(ωt) → A, with branches A →(k1) C and A →(k2) B
The equation for the amplitude is very complicated.
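The first-order filter formula can be checked numerically. The sketch below integrates dA/dt = (1 + sin ωt) − kA and measures the output amplitude after the transient; the values of k and ω are illustrative:

```python
import math

def simulated_amplitude(k=2.0, w=5.0, dt=1e-4, t_end=40.0):
    """Integrate dA/dt = (1 + sin(w*t)) - k*A with explicit Euler and
    measure the oscillation amplitude after the transient has decayed."""
    A, t = 0.0, 0.0
    lo, hi = float("inf"), float("-inf")
    while t < t_end:
        A += dt * ((1.0 + math.sin(w * t)) - k * A)
        t += dt
        if t > t_end / 2.0:                 # discard the transient
            lo, hi = min(lo, A), max(hi, A)
    return (hi - lo) / 2.0

k, w = 2.0, 5.0
predicted = (1.0 / k) / math.sqrt(1.0 + (w / k) ** 2)  # the filter formula
print(simulated_amplitude(k, w), predicted)            # close agreement
```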
Brief Digression: Chemical Impedance
I → A*
dA/dt = k1·[I] − k2·[A]
A_ss(I) = (k1/k2)·[I]
So A is the signal inside the cell that I is outside the cell.
What if A signals to downstream targets by reacting with them?
A + B → C
dA/dt = k1·[I] − k2·[A] − k3·[A][B]
A_ss(I) = k1·[I] / (k2 + k3·[B])
The rates and concentrations of downstream processes degrade the signal from A.
Brief Digression: Chemical Impedance
I → A*
dA/dt = k1·[I] − k2·[A]
A_ss(I) = (k1/k2)·[I]
But what if reaction is by reversible binding?
A + B ⇌ C
dA/dt = k1·[I] − k2·[A] − k3·[A][B] + k4·[C]
A_ss(I) = k1·[I]/k2
The rates and concentrations of downstream processes don't affect the signal.
But… what about the ME?
∂P(X, t)/∂t = Σ_{m=1}^{M} [ B_m − a_m·P(X, t) ]
Error and Ordinary Differential Equations
Ordinary Differential
Equations
• A differential equation defines a
relationship between an unknown
function and one or more of its
derivatives
• Physical problems using differential
equations
– electrical circuits
– heat transfer
– motion
Ordinary Differential
Equations
• The derivatives are of the dependent variable with respect to the independent variable.
• A first-order differential equation with y as the dependent variable and x as the independent variable would be:
dy/dx = f(x, y)
Ordinary Differential Equations
A second-order differential equation would have the form:
d²y/dx² = f(x, y, dy/dx)
Ordinary Differential
Equations
• An ordinary differential equation is
one with a single independent
variable.
• Thus, the previous two equations
are ordinary differential equations
• The following is not:
dy/dx1 = f(x1, x2, y)
Partial Differential Equations
dy/dx1 = f(x1, x2, y)
Correct notation:
∂y/∂x1 = f(x1, x2, y)
Ordinary Differential Equations
• The analytical solution of an ordinary differential equation, as well as of a partial differential equation, is called the "closed form solution."
• This solution requires that the
constants of integration be
evaluated using prescribed values
of the independent variable(s).
Ordinary Differential Equations
• At best, only a few differential
equations can be solved
analytically in a closed form.
• Solutions of most practical
engineering problems involving
differential equations require the
use of numerical methods.
One Step Methods
• Focus is on solving ODEs of the form
dy/dx = f(x, y)
y_{i+1} = y_i + φ·h
[Figure: from (x_i, y_i), step a distance h along the slope φ = f to reach y_{i+1}]
This is the same as saying:
new value = old value + (slope) × (step size)
Euler’s Method
• The first derivative provides a
direct estimate of the slope at xi
• The equation is applied
iteratively, or one step at a time,
over small distance in order to
reduce the error
• Hence this is often referred to as
Euler’s One-Step Method
Taylor Series
y(x_i + h) = y(x_i) + h·y'(x_i) + (h²/2)·y''(x_i) + …
Keeping only the first-order term gives the Euler step:
y(x_i + h) ≈ y(x_i) + h·y'(x_i)
EXAMPLE
For the initial condition y(1) = 1, determine y(1.1) for h = 0.1 analytically and using Euler's method, given:
dy/dx = 4x²
Analytically:
y = (4/3)·x³ + C
I.C. y = 1 at x = 1  ⇒  C = −1/3
y = (4/3)·x³ − 1/3
y(1.1) = 1.44133
Euler's method:
dy/dx = 4x²
y_{i+1} = y_i + f·h
y(1.1) = y(1) + [4·(1)²]·(0.1) = 1.4
Note the structure:
y(1.1) = y(1) + [4·(1)²]·(0.1)
          I.C.    dy/dx     step size
Recall the analytical solution was 1.4413.
If we instead reduce the step size to 0.05 and apply Euler's method twice:
y(1.05) = y(1) + [4·(1)²]·(0.05) = 1 + 0.2 = 1.2

Wait, 4·(1)²·(0.05) = 0.2, so y(1.05) = 1.2; then

y(1.1) = y(1.05) + [4·(1.05)²]·(0.05) = 1.2 + 0.2205 = 1.4205
Recall the analytical solution was 1.4413: halving the step size roughly halves the error.
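The worked example above can be reproduced in a few lines; a sketch of Euler's one-step method:

```python
def euler(f, x0, y0, h, n):
    """Euler's one-step method: y_{i+1} = y_i + f(x_i, y_i) * h."""
    x, y = x0, y0
    for _ in range(n):
        y += f(x, y) * h
        x += h
    return y

f = lambda x, y: 4 * x**2
print(euler(f, 1.0, 1.0, 0.1, 1))    # 1.4    (one step, h = 0.1)
print(euler(f, 1.0, 1.0, 0.05, 2))   # 1.4205 (two steps, h = 0.05)
# Analytical: y = (4/3)x^3 - 1/3, so y(1.1) = 1.44133...
```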
Error Analysis of Euler’s
Method
• Truncation error - caused by the nature
of the techniques employed to
approximate values of y
– local truncation error (from Taylor Series)
– propagated truncation error
– sum of the two = global truncation error
• Round off error - caused by the limited
number of significant digits that can be
retained by a computer or calculator
Taylor Series
y(x_i + h) = y(x_i) + h·y'(x_i) + (h²/2)·y''(x_i) + (h³/6)·y'''(x_i) + …
Truncated after the second-order term:
y(x_i + h) ≈ y(x_i) + h·y'(x_i) + (h²/2)·y''(x_i)
Higher Order Taylor Series Methods
y(x_i + h) ≈ y(x_i) + h·y'(x_i) + (h²/2)·y''(x_i)
y'(x) = f(x, y)
y''(x) = df(x, y)/dx = ∂f/∂x + (∂f/∂y)·(dy/dx) = f_x + f·f_y
so
y_{i+1} = y_i + f(x_i, y_i)·h + (h²/2)·[ f_x(x_i, y_i) + f(x_i, y_i)·f_y(x_i, y_i) ]
where f_x = ∂f/∂x and f_y = ∂f/∂y.


Derivatives
y' = f(x, y)
y'' = f_x + f·f_y
y''' = f_xx + 2f·f_xy + f²·f_yy + f_x·f_y + f·f_y²
⋮
Modification of Euler’s
Methods
• A fundamental error in Euler’s
method is that the derivative at the
beginning of the interval is assumed
to apply across the entire interval
• Two simple modifications will be demonstrated
• These modifications actually belong to a larger class of solution techniques called Runge-Kutta methods, which we will explore later.
Heun’s Method
Consider our Taylor expansion. Approximate f' as a simple forward difference:
f'(x_i, y_i) ≈ [ f(x_{i+1}, y_{i+1}) − f(x_i, y_i) ] / h
Substituting into the expansion:
y_{i+1} = y_i + f_i·h + [ (f_{i+1} − f_i)/h ]·(h²/2) = y_i + [ (f_i + f_{i+1})/2 ]·h
Heun’s Method
• Determine the derivatives for the interval at:
– the initial point
– the end point (based on an Euler step from the initial point)
• Use the average to obtain an improved estimate of the slope for the entire interval
• We can think of the Euler step as a "test" step
[Figure sequence:
1. Take the slope at x_i and project over the step size h to get f(x_{i+1}).
2. Now determine the slope at x_{i+1}.
3. Take the average of these two slopes.
4. Use this "average" slope to predict y_{i+1}:]
y_{i+1} = y_i + { [ f(x_i, y_i) + f(x_{i+1}, y_{i+1}) ] / 2 }·h
(compare the Euler step y_{i+1} = y_i + f·h)
Improved Polygon Method
• Another modification of Euler's Method
• Uses Euler's to predict a value of y at the midpoint of the interval:
y_{i+1/2} = y_i + f(x_i, y_i)·(h/2)
• This predicted value is used to estimate the slope at the midpoint:
y'_{i+1/2} = f(x_{i+1/2}, y_{i+1/2})
Improved Polygon Method
• We then assume that this slope represents a valid approximation of the average slope for the entire interval
• Use this slope to extrapolate linearly from x_i to x_{i+1} using Euler's algorithm:
y_{i+1} = y_i + f(x_{i+1/2}, y_{i+1/2})·h
Improved Polygon Method
We could also get this algorithm by substituting a forward difference in f to i+½ into the Taylor expansion for f', i.e.
y_{i+1} = y_i + f_i·h + [ (f_{i+1/2} − f_i)/(h/2) ]·(h²/2) = y_i + f_{i+1/2}·h
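A sketch of the improved polygon (midpoint) method on the same example:

```python
def midpoint(f, x0, y0, h, n):
    """Improved polygon (midpoint) method: Euler half-step to x_{i+1/2},
    then use the midpoint slope across the whole interval."""
    x, y = x0, y0
    for _ in range(n):
        y_half = y + f(x, y) * h / 2.0          # predict y at the midpoint
        y += f(x + h / 2.0, y_half) * h         # extrapolate with that slope
        x += h
    return y

f = lambda x, y: 4 * x**2
print(midpoint(f, 1.0, 1.0, 0.1, 1))  # 1.441, vs analytical 1.44133...
```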
[Figure sequence: from (x_i, f(x_i)), take a half-step h/2 to x_{i+1/2}; evaluate f(x_{i+1/2}) and the slope f'(x_{i+1/2}) there; extend that slope over the full step h to get f(x_{i+1}).]
Conclusions
• Algorithms can be more or less
stable to truncation or round off
error.
• Algorithms can be better or worse
approximations to the math you
want to do.
• Algorithms can be more or less
complex
Master Equation Simulation I
(Based on Gillespie, D.T. (1977) JPC, 81(25): 2340)
We are given a system in the state (X1,...,XN) at time t. To move the system forward in time we must ask two questions:
• When will the next reaction occur?
• What kind of reaction will it be?
In order to answer these questions we introduce
P(τ, µ)dτ = probability that, given the state (X1,...,XN) at time t, the next reaction in V will occur in the infinitesimal time interval (t+τ, t+τ+dτ) and will be of type Rµ.
Master Equation Simulation II
Now we can define P(τ, µ) to be the probability that no reaction occurs in the interval (t, t+τ) (P0(τ)) times the probability that reaction Rµ will occur in the infinitesimal time dτ following this interval (aµ·dτ):
P(τ, µ)dτ = P0(τ)·aµ·dτ
Now aµ is simply a term related to the rate equation for a given reaction. In fact it is a transition probability, cµ, times a combinatorial term, hµ, which enumerates the number of ways the species can react in volume V given the configuration (X1,...,XN).
Therefore
[1 − Σ aµ·dτ'] = probability that no reaction will occur in time dτ' from the state (X1,...,XN)
and
P0(τ' + dτ') = P0(τ')·[1 − Σ aµ·dτ']
the solution of which is
P0(τ) = exp[−Σ aµ·τ]
Master Equation Simulation III
• The Algorithm
Step 0:
Choose Initial Conditions and Rates
Step 1:
Calculate aµ for each reaction as well as the
sum of all of them.
Step 2:
Generate a random time, τ, from P0(τ), and roulette-wheel select a reaction µ based on the aµ.
Step 3:
Increment time by τ and execute reaction µ. Go to Step 1.
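The algorithm above can be sketched directly. The example system here is a simple birth-death process, an illustrative choice rather than a model from the lecture:

```python
import math, random

def gillespie(x, stoich, propensity, t_end, seed=0):
    """Gillespie's direct method (SSA): draw the waiting time to the next
    reaction from an exponential with rate a0 = sum(a), then roulette-wheel
    select which reaction fires with probability a_m / a0."""
    rng = random.Random(seed)
    t, traj = 0.0, [(0.0, list(x))]
    while True:
        a = propensity(x)
        a0 = sum(a)
        if a0 == 0.0:
            break
        t += -math.log(1.0 - rng.random()) / a0   # Step 2a: when
        if t > t_end:
            break
        r, acc = rng.random() * a0, 0.0           # Step 2b: which
        for m, am in enumerate(a):
            acc += am
            if r < acc:
                x = [xi + si for xi, si in zip(x, stoich[m])]
                break
        traj.append((t, list(x)))
    return traj

# Illustrative birth-death gene expression (hypothetical rates):
# 0 -> A at rate kb;  A -> 0 at rate kd*[A].  Mean copy number is kb/kd.
kb, kd = 10.0, 1.0
traj = gillespie([0], [[+1], [-1]], lambda x: [kb, kd * x[0]], t_end=50.0)
samples = [n[0] for t, n in traj if t > 10.0]
print(sum(samples) / len(samples))   # fluctuates around kb/kd = 10
```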
Endogenous Noise
[Diagram: promoter PA drives gene A, producing signal protein A, which dimerizes (2A ⇌ A·A)]
• One gene
• Growing cell, 45-minute division time
• Average ~60 seconds between transcripts
• Average 10 proteins/transcript
[Figure: Monte Carlo simulation data, protein count (0–70 molecules) vs. time (0–45 minutes); trajectories fluctuate around means of about 50 and 25 molecules]
What happens when you have bistability and noise?
Langevin equation
[Figure: the bistable nullcline diagram for the scheme B-p → A ⇌ A-p, with external enzyme E+]
• But what if there is external noise on E?
• Let's start with…
E_free(t) + E_bound(t) = E(t) = Ē + Noise(t) = Ē + σ·f(Ē)·W_t,
E_free(t) + E_bound(t) = Ē = Const.
The compact Langevin
• Plug the conservation conditions into the equation for A-p (A*):

dA* = [ k₊·Ē·A/(K + A) − k₋·A*/(K + A*) ]·dt + σ·f(Ē)·[ k₊·A/(K + A) ]·dB_t
             Drift                                    Diffusion

Note that another term in 1/(K + A) has been introduced. There is now the possibility of a cubic nullcline.
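Numerically, a compact Langevin equation of this drift-plus-diffusion form is typically integrated with the Euler-Maruyama scheme. In the sketch below the Michaelis-Menten-type drift, the noise form, and all parameter values are hypothetical stand-ins, not the lecture's fitted model:

```python
import math, random

def euler_maruyama(drift, diffusion, a0, dt, n, seed=1):
    """Euler-Maruyama: A_{t+dt} = A_t + drift(A)*dt + diffusion(A)*sqrt(dt)*N(0,1)."""
    rng = random.Random(seed)
    a, path = a0, [a0]
    for _ in range(n):
        a += drift(a) * dt + diffusion(a) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        a = max(a, 0.0)                    # keep the concentration non-negative
        path.append(a)
    return path

# Hypothetical parameters: phosphorylation/dephosphorylation of A (total = 1),
# with multiplicative noise entering through the kinase term.
kp, km, K, E, sigma = 2.0, 1.0, 0.5, 1.0, 0.3
drift = lambda a: kp * E * (1.0 - a) / (K + (1.0 - a)) - km * a / (K + a)
diffusion = lambda a: sigma * (1.0 - a) / (K + (1.0 - a))
path = euler_maruyama(drift, diffusion, 0.1, 1e-3, 20000)
print(min(path), max(path))   # the trajectory fluctuates around a stable branch
```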
The Fokker-Planck equivalent.

∂p(A*; t)/∂t = −∂/∂A* { [ k₊·Ē·A/(K + A) − k₋·A*/(K + A*) ]·p(A*; t) }
               + (1/2)·∂²/∂A*² { σ²·f(Ē)²·[ k₊·A/(K + A) ]²·p(A*; t) }
Which yields the stationary nullcline: the deterministic balance plus a noise correction proportional to (σ²/2)·f(Ē)².
• Compared to the deterministic nullcline…
Ē − [ k₋·A_ss·(K + A₀ − A_ss) ] / [ k₊·(A₀ − A_ss)·(K + A_ss) ] = 0
Depending on the noise type
[Figure: stationary A_ss vs. Ē for the deterministic nullcline (det) and for noise exponents p = 0, p = ½, p = 1]
f(Ē) = Ē^p
p = 0: Normal noise
p = ½: Chi-square noise
p = 1: Log-normal noise
Validation by ME simulation
X + E⁺ ⇌ C → X* + E⁺    (k1, k−1, k3)
X* + E⁻ ⇌ C* → X + E⁻   (k2, k−2, k3)
N + E⁺ ⇌ N·E⁺            (k21, k−21)
N·E⁺ + N ⇌ N·N·E⁺        (k22, k−22)
It turns out this generates log-normal noise on E⁺.
ME Simulation
[Figure: master-equation simulation trajectories with noise on E and without noise, and the corresponding stationary distributions of X* vs. ⟨E⁺⟩ in each case]
Summary
• Adding noise to a system (in this case
external noise) can qualitatively change
its dynamics.
• Interestingly we can predict the effect
with a compact Langevin approach AND
a MM approximation pretty well
compared to what’s observed in a full
ME simulation.
• The implications for noise-induced
bistability and switching haven’t been
fully worked out.
But an ugly specter is raised….
B-p → A ⇌ A-p
Is this really a valid picture? Adding noise changes the nullcline!
Nonetheless: static noise can make things look bistable
[Figure: a distribution p(E) of input E mapped through (a) a linear response and (b) a switch response in X; the resulting stationary distribution p(x) is unimodal for the linear response and bimodal for the switch]
There is a relationship between the variance on E and the slope of the response that determines whether the stationary distribution will be bimodal.
Niches are Dynamic
[Figure: a life cycle moving among environments, including an abiotic reservoir]
• Characteristic times may be spent in each environment.
• Environments themselves are variable.
Adaptability vs. Evolvability
Life Cycle
New niches with
new lifecycles
• Adaptability: Adjustment on the time scale of the life cycle of the organism
• Evolvability: Capacity for genetic changes to invade new life cycles
Evolvability
• In a dynamic environment, the lineage that adapts first wins
• Fewer mutations means faster evolution
[Figure: a time-varying "Environment" of selection patterns mapped onto regions of parameter space]
• Are some biosystems constructed to minimize the mutations required to find improvements?
• Modularity
• Robustness / Neutral drift improves functional sampling
• Shape of functionality in parameter space
• Minimize null regions in parameter space (entropy of multiple mutations)
Chris Voigt
Logic of B. subtilis stress response
AbrB; SinR
ComA~P
DegU~P
Sporulation
Spo0A~P
AbrB
DegU~P ComK
AbrB
SinR
ComA~P
AbrB; SinR;SigH
• Network organization has a functional logic.
• There are different levels of abstraction to be
found.
ResD~P
PhoP~P
Clustered Phylogenetic Profiles
[Figure: clustered phylogenetic profile (species × genes) with conserved blocks 1–10 spanning chemotaxis, sporulation, and competence genes]
• Clustered phylogenetic profile shows blocks of conserved genes:
1. methyl-processing receptors and chemotaxis genes in motile bacteria
2. methyl-processing receptors and chemotaxis genes in motile Archaea
3. flagellar genes in motile bacteria
4. type III secretion system (virulence) in non-motile pathogenic bacteria
5. motility genes in spore-forming bacteria
6. late-stage sporulation genes in spore-forming bacteria
7. spore coat and germination response genes in spore-forming bacteria that are not competent
8. late-stage sporulation genes in spore-forming bacteria that are also competent
9. DNA uptake genes in Gram-positive bacteria
10. DNA uptake genes in Gram-negative bacteria
Consider Chemotaxis: E. coli
[Figure: the chemotaxis network spanning periplasm and cytoplasm]
Integral Feedback Controller
[Block diagram, adapted from Control Systems Engineering, N.S. Nise (2000): input → Sensor/Input Transducer (receptors) → signal proportional to input → error or actuating signal → Controller (CheAWYZ) → Actuator (Flagella) → output, with the output fed back through a Sensor/Output Transducer (cheB/cheR) as a signal proportional to output]
Clusters are functionally coherent
Receptors
Signal Transduction (che)
Hook and Flagellar Body
Flagellar export/Type III secretion
Flagellar length and motor control
Hypothetical receptors
Cross-Regulation with Sporulation/Cell Cycle
Endopathogens
Endopathogens
Plant pathogens
Sporulators
Animal pathogens
Archaeal Extremophiles
Different modules for different lives
What Ontology Recovers Modules?
Systems Ontology
Color legend:
■ sensor
■ controller
■ actuator
■ cross-talk between
networks
■ unknown
Comparative analysis is especially important
Rao, CV, Kirby, J, Arkin, AP (2004) PLOS Biology, 2(2), 239-252
These are the homologous chemotaxis pathways in E. coli and B. subtilis
They have the same wild-type behavior.
Different biochemical mechanisms.
Different robustnesses!
Chris Rao/John Kirby
Two important features
Adaptation
Time
Exact
Adaptation
Differences in robustness
Chris Rao/John Kirby
E. coli
B. subtilis
Do these differences lead to
differences in actual fitness?
Sporulation initiation
A Motif
The SIN Operon: A recurrent motif
Environmental &
Cellular Signals
Sporulation genes
(stage II)
Spo0A
spoIIG as model
Spo0A~P
P1
sinI
P3
sinR
SIN Operon
• Vegetative (healthy) growth: Constitutive SinR expression from P3
• Resource depletion and high cell density leads to the phosphorylation of Spo0A
Feedback provides filtering
[Figure: INPUT of Spo0A~P (0–200 nM) over 10,000 s, and two panels of the SinI response I (0–1500 nM) over the same interval]
Functional Regions in Parameter Space
[Figure: sensitivity of bistability (Type 1, Type 2) and of oscillations (Hopf points) to each parameter k1–k11 of the SIN operon model, spanning SinI activity (from P1) and SinR activity (from P3)]
Chris Voigt
Full Bifurcation Analyses: Evolvability?
• Tuning the expression of SinR (AR) with respect to SinI leads to dynamical plasticity
• Transcription from P3 (k3) strengthens bistability and damps oscillations
[Figure: phase diagram of [SinI] (nM, log scale) vs. AR (protein/mRNA·s) at 0A = 10 nM and 0A = 10,000 nM, showing graded, switch (single steady state), bistable (two steady states), oscillatory, and pulse regimes as k3 (mRNA/s) varies]
Examples of Protein-Antagonist Operons
• Bistable switch (sigX / rsiX): iron flux control, thermotolerance, bacitracin resistance; IN/OUT signals include iron concentration, growth phase, sporulation (Spo0A~P), and competence (ComA~P)
• Pulse generator (rapA / phrA): sporulation (Spo0A and stage II promoters)
• ? – spatial oscillations (soj / spo0J): chromosome organization
• How can complicated dynamical behavior arise from simple evolutionary events?
• What are the requirements to bias the operon to one function?
• Once established, can one function evolve into another?
Lisa Fontaine-Bodin, Keasling Lab
Chris Voigt
Comparative analysis of SinI/SinR
region affecting k1
KI
Comparison of five strains of Bacillus anthracis
In anthracis:
Mutations mostly affect KI and k1
Threshold of the switch is most affected.
Across ALL sporulators
Very variable.
Voigt, CA, Wolf, DM, Arkin, AP (2004) Genetics, in press. PMID: 15466432
Feedback induces stochastic bimodality
[Figure: histograms of [SinI] (log10 nM, 0–2.5) at increasing input levels [Spo0A~P] = 1, 4, and 100 nM, showing the count distribution shift from unimodal to bimodal as the input rises]
Though we must be careful since the addition of noise itself changes the qualitative dynamics.
Heterogeneity of Entry to Sporulation
A.
B.
Microscopic analysis of LF25 (amyE::PspoIIE cm). Observation by DIC ×60 (A.) and fluorescence (B.) of cells resuspended to induce sporulation and incubated 3 hours at 37°C. An example of cells not showing fluorescence is circled in figure A.
Lisa Fontaine-Bodin, Denise Wolf, Jay Keasling
Summary 1
So this motif:
• Has flexible function based on parameters
– Most parameters tune response
– A couple of parameters qualitatively change the
response
• Is an example of a possible Evolvable Motif
• Sometimes exhibits stochastic effects
– Are they adaptive?
Stochastic Effects Are Ubiquitous
Stochastic Gene Expression in HIV-1 Derived Lentiviruses
[Figure: fluorescence distributions (10⁻¹ to 10³) and images of stable clones: no positive feedback vs. Tat feedback (very bright sort and bright sort)]
Software
• MatLab
• Mathematica
• Berkeley Madonna
• GEPASI
• TerraNode
• JDesigner
The game of life
[Diagram: two organisms, each with sensors S1…SN reading a time-varying environment (E1…E4, with noise and quorum signals) and producing output signals with probabilities p_i]
Beginning to link Game Theory to Dynamical Cellular Strategies.
Formal Model
State vector of the two-organism population at time kΔt:
X_k = [x1(k), y1(k), x2(k), y2(k)]ᵀ
[Diagram: each generation, a fraction p_Obs of cells observes the environment; the environment E_i follows transition matrix T_{i,j}(k) (e.g. E1 ⇌ E2 with probabilities p_{1,2}, p_{2,1}), with growth advantage g_x > g_y in E1 and g_y > g_x in E2; sensors S_i report the correct environment with probability Ps_ii and an incorrect one with probability Ps_ij (observability p_Obs, accuracy S_i); a mixing matrix M and rate matrix R_i(k) then map X_k to X_{k+1} at time (k+1)Δt]
Example: two environments, two moves, no sensor
e.g. x = pili, y = no pili; E1 = in host, E2 = out
The environment alternates: ~m generations in E1, ~n generations in E2.
IF E1: selects for x, against y
E2: selects against x, for y
Denise Wolf, Vijay Vazirani
With no sensor, the options are…
1. ALL cells in state x (extinction when E2 arrives)
2. ALL cells in state y (extinction when E1 arrives)
3. Statically mixed population (some x, some y)
4. Phase variation of individual cells between x and y (proliferation!)
Denise Wolf, Vijay Vazirani
Rate of YX Switching
Phase variation for survival
Rate of XY Switching
This is a Devil’s compromise: Phase-variation behaviors is not optimal in any
one environment but necessary for survival with noisy sensors in a fluctuating
environment.
Denise Wolf, Vijay Vazirani
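The Devil's compromise shows up in a toy deterministic simulation of two phenotypes under alternating selection; all fitness values and switching rates below are hypothetical:

```python
def grow(x, y, fitness, switch, generations, env_period):
    """Track fractions of two phenotypes under alternating selection.
    Each generation: selection by the current environment, then phase
    variation (a fraction `switch` of each state flips), then renormalize."""
    for g in range(generations):
        env = (g // env_period) % 2              # E1, E2 alternate
        fx, fy = fitness[env]
        x, y = x * fx, y * fy                    # selection
        x, y = (x * (1 - switch) + y * switch,   # phase variation
                y * (1 - switch) + x * switch)
        total = x + y
        x, y = x / total, y / total              # keep fractions
    return x, y

fitness = [(1.5, 0.5), (0.5, 1.5)]  # E1 favors x, E2 favors y (hypothetical)
# Start with essentially no y-cells: without switching, y can never recover;
# with a small switching rate, both phenotypes persist through both regimes.
print(grow(1.0, 1e-9, fitness, switch=0.0,  generations=100, env_period=10))
print(grow(1.0, 1e-9, fitness, switch=0.05, generations=100, env_period=10))
```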
Learning Environment from Cell State
Strategy | Sensor profile | Environmental profile
• Random Phase Variation (RPV) | No sensors; O = low-prob. observable transitions over DC or extinction set; D = long delays relative to env. transition times | Devil's Compromise (DC) lifecycle: time-varying environment with different environmental states selecting for different cell states. Optimal switching rates a function of lifecycle asymmetries and environmental autocorrelation. Time variation required (spatial variation insufficient).
• Sensor Based Mixed | O = high-prob. observable transitions; A = poor accuracy | Devil's Compromise lifecycle; frequency-dependent growth curves with mixed ESS (perfect sensors).
• Sensor Based Mixed; LPF | O = high-prob. observable transitions; A = poor accuracy; N = high additive noise | Optimal mixing probabilities biased toward selected cell-states in dominant environmental states.
• Sensor Based Pure | O = high-prob. observable transitions; A = high accuracy, or moderate accuracy and low noise N | Temporally or spatially varying environment with each environmental state selecting for a single cell state.
• Sensor Based Pure; LPF | O = high-prob. observable transitions; A = moderate accuracy; N = high additive noise | Asymmetric lifecycle required.
Denise Wolf, Vijay Vazirani
Robustness and Fragility
• The stratagems of a cell evolve in a given
environment for robust survival.
• Evolution writes an internal model of the
environment into the genome.
• But the system is fragile both
– to certain changes in the environment (though there
are evolvable designs)
– And certain random changes in its process structure.
• One of the central questions has to be: Robust
on what time scale? Can evolution “design” for
the future by learning from the past?
Summary
• The availability of large numbers of
bacterial genomes and our ability to
measure their expression opens a new
field of “Evolutionary Systems Biology” or
“Regulatory Phylogenomics”.
• Comparative genomics identifies
particularly conserved motifs, parts of
which are evolutionarily variable and
select for different behaviors of the
network.
• By understanding what evolution selects in
a network context we better understand
what the engineerable aspects of the
network are.
Acknowledgements
• Comparative Stress Response: Amoolya Singh, Denise Wolf
• SinIR analysis: Chris Voigt, Denise Wolf
• Chemotaxis: Chris Rao, John Kirby
• HIV: Leor Weinberger, David Schaffer
• Games: Denise Wolf, Vijay V. Vazirani
• Funding:
– NIGMS/NIH
– DOE Office of Science
– DARPA BioCOMP
– HHMI