Lecture 22: Rules for gene regulation 2

Demand Rules for Gene Regulation II
and
Simplicity in Biology
04/12/2012
Error loads of positive and negative regulations
• Low-demand genes tend to use a repressor for regulation, and high-demand genes tend to use an activator for regulation.
• The choice of the mode of regulation minimizes the fraction of time that the cis-regulatory binding site is exposed to errors.
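[Figure: selection diagram of demand p against the ratio Δf1/Δf0. The boundary between the two modes of regulation is p = 1 / (1 + Δf1/Δf0).]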
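As a rough numerical illustration of the boundary p = 1 / (1 + Δf1/Δf0), the sketch below compares the two modes of regulation. It assumes (this is a reading of the slide, not something stated on it) that a repressor site is exposed while the gene is expressed, giving an error load p·Δf1, and that an activator site is exposed while the gene is off, giving (1 − p)·Δf0. The function names are hypothetical.

```python
def boundary_demand(df1, df0):
    """Demand p at which both modes have equal error load: p = 1 / (1 + df1/df0)."""
    return 1.0 / (1.0 + df1 / df0)


def preferred_mode(p, df1, df0):
    """Return the mode with the lower error load under the assumed model."""
    load_repressor = p * df1            # assumed: site is free while the gene is ON
    load_activator = (1.0 - p) * df0    # assumed: site is free while the gene is OFF
    return "repressor" if load_repressor < load_activator else "activator"


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, preferred_mode(p, df1=1.0, df0=1.0))
```

With equal fitness reductions the boundary sits at p = 0.5, so low-demand genes fall on the repressor side and high-demand genes on the activator side, in line with the demand rule.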
Demand rules for multi-regulatory systems
• If a gene is regulated by N regulators, then there are 2^N ways to implement the regulation.
• Let's use the lac system in E. coli for our analysis.
• When glucose is present in the environment, E. coli preferentially uses glucose as its carbon and energy source.
• This preference is implemented by a combination of two mechanisms:
1. The activator of CRP: cAMP is generated only under glucose starvation, and it activates CRP.
2. When glucose is pumped into the cell, lactose entry is blocked.
• Thus the two inducers cannot appear simultaneously in the cell; this phenomenon is called inducer exclusion.
[Figure: the lac system. Glucose and lactose outside the cell; cAMP, allolactose, CRP, LacI and LacZYA inside.]
Input-output relationship of the lac system
• There are four possible combinations of glucose and lactose in the environment, which define four input states.
• Each input state induces a certain combination of the signal molecules cAMP and allolactose in the cell:
The four input states     Signal patterns
Glucose   Lactose         cAMP   Allolactose
0         0               1      0
0         1               1      1
1         0               0      0
1         1               0      0
• Because of inducer exclusion, when both glucose and lactose appear in the environment, lactose will not enter the cell, so the last two input states result in the same signal pattern.
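The mapping from input states to signal patterns can be sketched as a small function. This is a minimal illustration of the two rules stated above (cAMP is made only when glucose is absent; lactose cannot enter, so no allolactose forms, when glucose is present); the function name is hypothetical.

```python
def signal_pattern(glucose, lactose):
    """Map an input state (glucose, lactose) to the signals (cAMP, allolactose)."""
    camp = 0 if glucose else 1                            # cAMP only under glucose starvation
    allolactose = 1 if (lactose and not glucose) else 0   # inducer exclusion
    return camp, allolactose


if __name__ == "__main__":
    for glucose in (0, 1):
        for lactose in (0, 1):
            print((glucose, lactose), "->", signal_pattern(glucose, lactose))
```

The last two input states, (1,0) and (1,1), return the same signal pattern, which is why only three signal patterns occur.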
Input-output relationship of the lac system
• Thus, there are four possible binding states for CRP and LacI on the regulatory region: (CRP, LacI) = (1,1), (1,0), (0,1) or (0,0).
• However, one binding state, (CRP, LacI) = (0,0), cannot be reached because of inducer exclusion; it is called the excluded state.
• There are four output states, one associated with each binding state.
Input state          Signals               Binding state   Output state
(Glucose, Lactose)   (cAMP, Allolactose)   (CRP, LacI)
(0, 0)               (1, 0)                (1, 1)          Z2 = 0.06
(0, 1)               (1, 1)                (1, 0)          Z4 = 1
(1, 0/1)             (0, 0)                (0, 1)          Z1 = 0.003
Excluded state       not reached           (0, 0)          Z3 = 0.13
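The chain from input state to binding state to output state for the natural (AR) mechanism can be sketched as follows. The pairing of outputs Z1 to Z4 with binding states is read off the slide; the function and dictionary names are hypothetical.

```python
# Output levels from the slide, keyed by binding state (CRP, LacI).
OUTPUT_AR = {
    (1, 1): ("Z2", 0.06),
    (1, 0): ("Z4", 1.0),
    (0, 1): ("Z1", 0.003),
    (0, 0): ("Z3", 0.13),   # excluded state, never reached
}


def binding_state_ar(glucose, lactose):
    """Binding state (CRP, LacI) of the AR mechanism for a given input state."""
    camp = 0 if glucose else 1
    allolactose = 1 if (lactose and not glucose) else 0   # inducer exclusion
    crp = camp                # CRP (activator) binds when cAMP is present
    laci = 1 - allolactose    # LacI (repressor) binds when allolactose is absent
    return crp, laci


def output_ar(glucose, lactose):
    """Output state (label, level) of the AR mechanism for a given input state."""
    return OUTPUT_AR[binding_state_ar(glucose, lactose)]


if __name__ == "__main__":
    for g in (0, 1):
        for l in (0, 1):
            print((g, l), binding_state_ar(g, l), output_ar(g, l))
```

Looping over all four input states never produces the binding state (0, 0), which is the excluded state.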
Input-output relationship of the lac system
• Mapping from the input states to the binding states, and finally to the output states.
[Figure: input states [Glucose, Lactose] mapped through binding states [CRP, LacI] to output states Z1 to Z4, with the excluded state marked.]
There are four possible regulatory mechanisms that two regulators can have:
1. Activator-repressor (AR);
2. Activator-activator (AA);
3. Repressor-repressor (RR);
4. Repressor-activator (RA).
[Figure: four panels (AR, RR, RA, AA), each mapping input states [Glucose, Lactose] through binding states to output states Z1 to Z4.]
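The count of 2^N mechanisms for N regulators (four for the two regulators of the lac system) can be checked by enumerating every choice of activator (A) or repressor (R) per regulator. A minimal sketch; the function name is hypothetical.

```python
from itertools import product


def regulation_mechanisms(n_regulators):
    """List every assignment of A (activator) or R (repressor) to n regulators."""
    return ["".join(modes) for modes in product("AR", repeat=n_regulators)]


if __name__ == "__main__":
    print(regulation_mechanisms(2))   # ['AA', 'AR', 'RA', 'RR']
```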
Input-output relationship of AA
• All three non-naturally occurring mechanisms map the input states onto the output states in the same way as the naturally occurring mechanism, AR, but through different binding states.
Input state          Signals               Binding state   Output state
(Glucose, Lactose)   (cAMP, Allolactose)   (CRP, LacI)
(0, 0)               (1, 0)                (1, 0)          Z2 = 0.003
(0, 1)               (1, 1)                (1, 1)          Z4 = 1
(1, 0/1)             (0, 0)                (0, 0)          Z1 = 0.003
Excluded state       not reached           (0, 1)          Z3 = 0.13
Input-output relationship of RR
• All three non-naturally occurring mechanisms map the input states onto the output states in the same way as the naturally occurring mechanism, AR, but through different binding states.
Input state          Signals               Binding state   Output state
(Glucose, Lactose)   (cAMP, Allolactose)   (CRP, LacI)
(0, 0)               (1, 0)                (0, 1)          Z2 = 0.003
(0, 1)               (1, 1)                (0, 0)          Z4 = 1
(1, 0/1)             (0, 0)                (1, 1)          Z1 = 0.003
Excluded state       not reached           (1, 0)          Z3 = 0.13
Input-output relationship of RA
• All three non-naturally occurring mechanisms map the input states onto the output states in the same way as the naturally occurring mechanism, AR, but through different binding states.
Input state          Signals               Binding state   Output state
(Glucose, Lactose)   (cAMP, Allolactose)   (CRP, LacI)
(0, 0)               (1, 0)                (0, 0)          Z2 = 0.003
(0, 1)               (1, 1)                (0, 1)          Z4 = 1
(1, 0/1)             (0, 0)                (1, 0)          Z1 = 0.003
Excluded state       not reached           (1, 1)          Z3 = 0.13
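The binding states of all four mechanisms can be generated from one rule. The sketch below assumes that an activator occupies its site when its signal is present and a repressor occupies its site when its signal is absent; the first letter of the mechanism refers to the regulator at the CRP site (signal: cAMP) and the second to the regulator at the LacI site (signal: allolactose). Function names are hypothetical.

```python
def binding_state(mechanism, glucose, lactose):
    """Binding state (CRP site, LacI site) for a mechanism such as 'AR' or 'RA'."""
    camp = 0 if glucose else 1
    allolactose = 1 if (lactose and not glucose) else 0   # inducer exclusion
    first, second = mechanism
    crp_site = camp if first == "A" else 1 - camp
    laci_site = allolactose if second == "A" else 1 - allolactose
    return crp_site, laci_site


def excluded_state(mechanism):
    """The single binding state that no input state can produce."""
    reachable = {binding_state(mechanism, g, l) for g in (0, 1) for l in (0, 1)}
    all_states = {(a, b) for a in (0, 1) for b in (0, 1)}
    (missing,) = all_states - reachable
    return missing


if __name__ == "__main__":
    for mech in ("AR", "AA", "RR", "RA"):
        states = [binding_state(mech, g, l) for g, l in ((0, 0), (0, 1), (1, 0))]
        print(mech, states, "excluded:", excluded_state(mech))
```

Each mechanism reaches three binding states and has its own excluded state, which is what distinguishes the four mechanisms even though they map inputs to outputs identically.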
Error load associated with AR mechanism
• Therefore, it seems that AR was selected for its unique TF binding state. To see this, let's look at the error load for each mechanism.
• If we assume errors are associated with free binding sites, then there are two error loads associated with the AR mechanism, as shown below:
Input state          Bound TFs     Error load    Output state
(Glucose, Lactose)
(0, 0)               CRP, LacI     none          Z2 = 0.06
(0, 1)               CRP           Δf4’          Z4 = 1
(1, 0/1)             LacI          Δf1           Z1 = 0.003
Excluded state       none          not reached   Z3 = 0.13
Error load associated with AA mechanism
• If we assume errors are associated with free binding sites, then there are three error loads associated with the AA mechanism, as shown below:
Input state          Bound TFs     Error load    Output state
(Glucose, Lactose)
(0, 0)               CRP           Δf2’          Z2 = 0.003
(0, 1)               CRP, LacI     none          Z4 = 1
(1, 0/1)             none          Δf1, Δf1’     Z1 = 0.003
Excluded state       LacI          not reached   Z3 = 0.13
Error load associated with RR mechanism
• If we assume errors are associated with free binding sites, then there are three error loads associated with the RR mechanism, as shown below:
Input state          Bound TFs     Error load    Output state
(Glucose, Lactose)
(0, 0)               LacI          Δf2           Z2 = 0.003
(0, 1)               none          Δf4, Δf4’     Z4 = 1
(1, 0/1)             CRP, LacI     none          Z1 = 0.003
Excluded state       CRP           not reached   Z3 = 0.13
Error load associated with RA mechanism
• If we assume errors are associated with free binding sites, then there are four error loads associated with the RA mechanism, as shown below:
Input state          Bound TFs     Error load    Output state
(Glucose, Lactose)
(0, 0)               none          Δf2, Δf2’     Z2 = 0.003
(0, 1)               LacI          Δf4           Z4 = 1
(1, 0/1)             CRP           Δf1’          Z1 = 0.003
Excluded state       CRP, LacI     not reached   Z3 = 0.13
Error-loads for the four possible mechanisms for a two-regulator system
Mapping from input states to error-loads
Regulation    Input state (Glucose, Lactose)
mechanism     (0,0)          (0,1)          (1,0)/(1,1)
AA            Δf2’           0              Δf1 + Δf1’
AR            0              Δf4’           Δf1
RA            Δf2 + Δf2’     Δf4            Δf1’
RR            Δf2            Δf4 + Δf4’     0
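The entries of this table can be derived mechanically from the assumption that every free binding site contributes one error load. In the sketch below the labelling convention is inferred from the per-mechanism slides and the E_RA formula, not stated explicitly: Δfk is the load of a free CRP site and Δfk' the load of a free LacI site, where k is the index of the output state Zk. All names are hypothetical.

```python
# Index k of the output state Zk reached from each input state (glucose, lactose).
OUTPUT_INDEX = {(0, 0): 2, (0, 1): 4, (1, 0): 1, (1, 1): 1}


def error_terms(mechanism, glucose, lactose):
    """List the error-load symbols for a mechanism ('AR', 'AA', 'RR', 'RA') and input."""
    camp = 0 if glucose else 1
    allolactose = 1 if (lactose and not glucose) else 0   # inducer exclusion
    first, second = mechanism
    crp_bound = camp if first == "A" else 1 - camp
    laci_bound = allolactose if second == "A" else 1 - allolactose
    k = OUTPUT_INDEX[(glucose, lactose)]
    terms = []
    if not crp_bound:
        terms.append(f"Δf{k}")     # free CRP site (inferred convention)
    if not laci_bound:
        terms.append(f"Δf{k}'")    # free LacI site (inferred convention)
    return terms


if __name__ == "__main__":
    for mech in ("AA", "AR", "RA", "RR"):
        row = [" + ".join(error_terms(mech, g, l)) or "0"
               for g, l in ((0, 0), (0, 1), (1, 0))]
        print(mech, row)
```

Under this convention the natural AR mechanism carries the fewest terms (two) and RA the most (four).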
Average error-load of a two-regulator system
• The average error load is calculated by multiplying the probability of each input state by the relevant fitness reduction and summing over all input states.
• Since there are only three binding states for the four possible input states, two probabilities are needed for the calculation. Let's denote them by p00 and p01.
p00: the probability that neither glucose nor lactose is present in the environment;
p01: the probability that glucose is absent but lactose is present in the environment;
p10,11: the probability that glucose is present, whether lactose is present or absent. Its value is
p10,11 = 1 − p00 − p01.
Average error-load of a two-regulator system
• Using the error loads of the four mechanisms, their average error loads can be computed as follows:
E_AR = p01 Δf4’ + (1 − p00 − p01) Δf1,
E_AA = p00 Δf2’ + (1 − p00 − p01)(Δf1 + Δf1’),
E_RR = p00 Δf2 + p01 (Δf4 + Δf4’),
E_RA = p00 (Δf2 + Δf2’) + p01 Δf4 + (1 − p00 − p01) Δf1’.
Regulation    Input state (Glucose, Lactose)
mechanism     (0,0)          (0,1)          (1,0)/(1,1)
AA            Δf2’           0              Δf1 + Δf1’
AR            0              Δf4’           Δf1
RA            Δf2 + Δf2’     Δf4            Δf1’
RR            Δf2            Δf4 + Δf4’     0
• Since p00 + p01 ≤ 1, the selection diagram is a triangle defined by the axes p00 and p01 and the line p00 + p01 = 1.
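The four average error loads can be evaluated directly from the formulas above. The sketch below is only an illustration: the dictionary keys and the function name are hypothetical, and the fitness reductions are placeholder values (all set to 1), not measured quantities.

```python
def average_error_loads(p00, p01, df):
    """Average error load of each mechanism for an environment (p00, p01).

    df maps the labels "1", "1'", "2", "2'", "4", "4'" to fitness reductions.
    """
    if p00 < 0 or p01 < 0 or p00 + p01 > 1:
        raise ValueError("(p00, p01) must lie inside the triangle p00 + p01 <= 1")
    p1x = 1.0 - p00 - p01   # probability that glucose is present
    return {
        "AR": p01 * df["4'"] + p1x * df["1"],
        "AA": p00 * df["2'"] + p1x * (df["1"] + df["1'"]),
        "RR": p00 * df["2"] + p01 * (df["4"] + df["4'"]),
        "RA": p00 * (df["2"] + df["2'"]) + p01 * df["4"] + p1x * df["1'"],
    }


if __name__ == "__main__":
    # Placeholder fitness reductions, chosen only for illustration.
    df = {"1": 1.0, "1'": 1.0, "2": 1.0, "2'": 1.0, "4": 1.0, "4'": 1.0}
    print(average_error_loads(0.5, 0.25, df))
```

Points outside the triangle are rejected because the three probabilities p00, p01 and 1 − p00 − p01 must all be non-negative.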
Mechanisms that minimize error load
• Given a specific environment (p00, p01), the mechanism that has the lowest error load can be identified.
• Under different conditions, AR, AA and RR can each be the mechanism with the minimal error load. However, under no condition can the RA mechanism have the lowest error load.
• The AR mechanism minimizes the error load in a region of the diagram that includes environments where lactose and glucose are present with low probability, i.e., p01 << 1 and p00 ≈ 1.
[Figure: selection diagram in the (p00, p01) plane, bounded by the line p01 = −p00 + 1.]
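A selection diagram of this kind can be approximated by scanning a grid over the triangle and recording the mechanism with the lowest average error load at each point. The sketch below uses placeholder fitness reductions (all equal to 1), so the regions it finds illustrate the procedure only; where the boundaries lie, and which mechanisms can win, depends on the actual Δf values. All names are hypothetical.

```python
def best_mechanism(p00, p01, df):
    """Mechanism with the lowest average error load at (p00, p01)."""
    p1x = max(0.0, 1.0 - p00 - p01)   # guard against tiny negative round-off
    loads = {
        "AR": p01 * df["4'"] + p1x * df["1"],
        "AA": p00 * df["2'"] + p1x * (df["1"] + df["1'"]),
        "RR": p00 * df["2"] + p01 * (df["4"] + df["4'"]),
        "RA": p00 * (df["2"] + df["2'"]) + p01 * df["4"] + p1x * df["1'"],
    }
    return min(loads, key=loads.get)   # ties go to the first key listed


def scan_triangle(df, steps=20):
    """Evaluate best_mechanism on a grid covering p00 + p01 <= 1."""
    winners = {}
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            p00, p01 = i / steps, j / steps
            winners[(p00, p01)] = best_mechanism(p00, p01, df)
    return winners


if __name__ == "__main__":
    # Placeholder fitness reductions, chosen only for illustration.
    df = {"1": 1.0, "1'": 1.0, "2": 1.0, "2'": 1.0, "4": 1.0, "4'": 1.0}
    winners = scan_triangle(df)
    print(sorted(set(winners.values())))
    print(best_mechanism(0.9, 0.05, df))
```

With these placeholder values AR wins near p00 ≈ 1, AA near p01 ≈ 1, and RR where glucose is almost always present.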
Why does Nature choose AR for the lac system?
• The most frequent input state in the environment of E. coli is (glucose, lactose) = (0,0), which corresponds to the binding state (CRP, LacI) = (1,1). Thus the AR mechanism keeps the TF binding sites protected from errors most of the time.
• Furthermore, inducer exclusion guarantees that the noisiest binding state can never be reached.
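Input state          Signals               Bound TFs     Error load    Output state
(Glucose, Lactose)   (cAMP, Allolactose)
(0, 0)               (1, 0)                CRP, LacI     none          Z2 = 0.06
(0, 1)               (1, 1)                CRP           Δf4’          Z4 = 1
(1, 0/1)             (0, 0)                LacI          Δf1           Z1 = 0.003
Excluded state       not reached           none          not reached   Z3 = 0.13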
What is life?
• Life usually possesses the following features:
1. Complexity: the components that form a living system, and the interactions among them, are much more complex than those of non-living matter.
2. Robustness: life is very tolerant to environmental disturbances.
3. Reproductivity: life can autonomously reproduce a similar copy of itself.
4. Evolvability: life can adapt itself to long-term changes in the environment through evolution.
• So, can we fully understand life, including ourselves, by scientific research?
Simplicity in Biology
• Biological systems evolved to function, not to be understood by humans. Nevertheless, complex biological systems can be understood through simple rules discovered by systems-level studies:
1. Network motifs: structurally, biological interaction networks can be understood in terms of network motifs, each of which performs a specific information-processing function at different levels of the system.
Network motifs are discovered by Nature through convergent evolution, instead of duplication.
More complex networks are formed by intertwining basic network motifs, but the ways that they connect to one another are understandable.
Simplicity in Biology
2. Modularity: a set of components that perform a specific function tend to interact strongly with one another and only weakly with outside components, through input and output nodes.
Therefore, modules can work in relative isolation.
If functionality is the only constraint on the system, then non-modular systems are always the optimal solutions, and modularity can never evolve.
Simulation studies suggest that modularity evolved for the reuse of components rather than for functionality.
Specifically, the goals of evolution change from time to time, but all goals share the same sub-problems, so the existing solutions to them are reused again and again.
Simplicity in Biology
3. Strong separation of timescales: temporally, biological functions performed by network motifs can be separated by their different timescales.
Thus, fast processes can be modeled by their steady-state behaviors when we study the dynamics of slow processes.
4. Universality of simple mathematical modeling: many different biological processes can be described by simple mathematical models without loss of the global picture of the biological process,
e.g., the logical approximation of the input functions of transcription networks and of neuronal integration.
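As one illustration of a logical approximation, the sketch below compares a sigmoidal input function with a step (logic) function. The Hill form and the parameter names k, n and beta are assumptions made for this example; the slides do not specify the form of the input function.

```python
def hill_input(x, k, n, beta=1.0):
    """Assumed Hill-type input function: beta * x^n / (k^n + x^n)."""
    return beta * x**n / (k**n + x**n)


def logic_input(x, k, beta=1.0):
    """Logical approximation: full output above the threshold k, none below."""
    return beta if x > k else 0.0


if __name__ == "__main__":
    for x in (0.1, 0.5, 2.0, 10.0):
        print(x, round(hill_input(x, k=1.0, n=4), 4), logic_input(x, k=1.0))
```

Away from the threshold the two functions nearly coincide, which is why the step function keeps the global picture while dropping the detailed shape of the curve.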
Simplicity in Biology
5. Robustness: biological systems are generally robust to environmental changes, which can be used to eliminate most incorrect models when analyzing biological systems.
This is because, although many simple models can explain a biological system, only a few can account for its robustness.
We know a few ways to achieve robustness:
1) Integral feedback, 2) Kinetic proofreading, and
3) Self-enhanced degradation of morphogen.
6. Stochastic nature of biological systems: genetically identical cells in the same environment respond in a probabilistic way to the same stimulus.
This may broaden the range of responses to an unpredictable future, and thus increase the chance that at least a fraction of cells survive sudden environmental changes.