Dynamics of adaptation in sexual and asexual populations

Dynamics of adaptation
in sexual and asexual populations
Joachim Krug
Institut für Theoretische Physik, Universität zu Köln
• Motivation and basic concepts
• The speed of evolution in large asexual populations
• Epistasis and recombination
Joint work with Su-Chan Park and Arjan de Visser
[PNAS 104, 18135 (2007); arXiv:0807.3002]
I have deeply regretted that I did not proceed far enough at least to understand
something of the great leading principles of mathematics, for men thus
endowed seem to have an extra sense.
The Autobiography of Charles Darwin
The modern synthesis
R.A. Fisher
J.B.S. Haldane
S. Wright
The evolutionary process is concerned, not with individuals, but with the
species, an intricate network of living matter, physically continuous in spacetime, and with modes of response to external conditions which it appears can
be related to the genetics of individuals only as a statistical consequence of
Sewall Wright (1931)
the latter.
It is not generally realized that genetics has finally solved the age-old problem
of the reason for the existence (i.e., the function) of sexuality and sex, and that
only geneticists can properly answer the question, “Is sex necessary?”
H.J. Muller (1932)
The problem of sex
Sex is costly:
• Two-fold cost of males (Maynard-Smith, 1971)
• Cost of finding and courting a mate
Nevertheless sexual reproduction is ubiquitous in plants and animals
⇒ What evolutionary forces are responsible for the maintenance of sex?
Genetic mechanisms:
• Sex helps to eliminate deleterious mutations (Muller’s ratchet)
• Sex speeds up the establishment of beneficial mutations
(Muller-Fisher effect/clonal interference)
The Muller-Fisher hypothesis for the advantage of sex
J.F. Crow & M. Kimura, Am. Nat. 99, 439 (1965)
• Dynamics of an adapting population:
periodic selection
The Muller-Fisher hypothesis for the advantage of sex
• Clonal interference slows down the adaptation of asexual populations
• How to quantify this picture?
The speed of evolution
in large asexual populations
The Wright-Fisher model
U
U
U
U
time
• Constant population size N , discrete non-overlapping generations
• Each individual chooses an ancestor from the preceding generation
• Individual i is chosen with probability ∼ wi
Wrightian fitness
• Mutations occur with probability U per individual and generation
Fixation
• U = 0: When a single mutant of fitness w ′ is introduced into a genetically
homogeneous population of fitness w, the outcome for t → ∞ is either
fixation (all w ′) or loss of the mutation (all w)
• Fixation probability for the Wright-Fisher model
1 − e−2s
w′
πN (s) ≈
, s = −1
−2Ns
1−e
w
(Kimura, 1962)
selection coefficient
• Under strong selection: (N|s| ≫ 1) deleterious mutations (s < 0) cannot fix,
while beneficial mutations (s > 0) fix with probability
π (s) = 1 − e−2s ≈ 2s, s ≪ 1
• Mean time to fixation of a beneficial mutation:
ln N
tfix ≈
s
Model I: Single selection coefficient
• Infinitely many sites approximation:
Each mutation creates a new genotype, no recurrent mutations
• Multiplicative fitness:
Fitness of offspring w ′ related to parental fitness w by
w → w ′ = w(1 + s)
• Beneficial mutations occur with probability Ub , and all have the same
selection coefficient s = s0 > 0
• Time between fixed beneficial mutations:
1
tmut =
2s0UbN
• s0 < 0 corresponds to Muller’s ratchet
Periodic selection vs. clonal interference
• Periodic selection requires tfix ≪ tmut
population fraction
1
0
t fix
t
t mut
• Rate of adaptation in the periodic selection regime:
lnhwi
s0
R = lim
≈
= 2s20NUb
t→∞
t
tmut
• Beneficial mutations interfere when tfix ≫ tmut or 2NUb ln N ≫ 1
• Clonal interference is inevitable for large N
[provided Ub is constant]
Rate of adaptation in the clonal interference regime
• Crow & Kimura 1965: Each fixed beneficial mutation must occur in the
offspring of the preceding one
• Number of offspring of a successful mutant arising at time t = 0:
Noff (t) =
Z
0
t
−2
τ
)
=
s
d τ s−1
exp(s
0
0
0 exp(s0t)
• Time ts between successful mutants determined by 2s0UbNoff (ts) = 1
s20
s0
⇒ R= =
ts ln(s0/2Ub )
• Fluctuations lead to a logarithmic speedup:
2s20 ln N
R≈ 2
ln (s0/2Ub )
M.M. Desai & D.S. Fisher 2007
Model II: Distribution of selection coefficients
• Extremal statistics arguments suggest that the distribution of selection
coefficients for beneficial mutations is exponential:
H.A. Orr 2003
−s/sb
Pb (s) = s−1
, s>0
b e
• Probability distribution of contending mutations destined for fixation:
Pc(s) ∼ sPb (s) ∼ se−s/sb
1
all mutations
contending mutations
0.8
P(s)
0.6
0.4
0.2
0
0
1
2
3
4
s/s_b
5
6
7
8
The Gerrish-Lenski theory of clonal interference
P.J. Gerrish, R.E. Lenski, Genetica 102/103, 127 (1998)
• Key idea: A mutation s survives clonal competition if no superior mutation
s ′ > s arises during the time to fixation of s.
• The survival probability is exp[−λ (s)] with
λ (s) = NUb tfix
Z
s
∞
N ln NUb
′
′
′
ds π (s )Pb (s ) =
s
Z
s
∞
′
ds ′ π (s ′)s−1
b exp[−s /sb ]
• GL theory does not (explicitly) account for the complex interaction of
different clones. In particular, the possibility of beneficial mutations arising
within a growing clone (multiple mutations) is ignored.
• Qualitative predictions:
Clonal interference reduces the rate of substition Γ = 1/ts but increases the
mean selection coefficient of fixed mutations hsi.
Distribution of fixed mutations: Simulation [Ub = 10−6, sb = 0.02]
20
1034
10
5
106
107
10
8
109
10
P.S.
Pfix(s)
16
12
8
4
0
0
0.05
0.1
0.15
0.2
s
0.25
0.3
0.35
Distribution of fixed mutations: Experiments (E. coli)
L. Perfeito et al., Science 317, 813 (2007)
A: N = 2 × 104
B: N = 107
GL-theory: Asymptotics for large N
C.O. Wilke (2004); S.C. Park & JK (2007)
• Rate of substitution:
γ ≈ 0.577215... Euler’s constant
sb
[ln(UbN ln N) + γ − 1] → sb
Γ≈
ln N
• Mean selection coefficient of fixed mutations:
hsi ≈ sb[ln(UbN ln N) + γ ]
• Rate of adaptation:
R = hsiΓ → s2b ln(UbN ln N)
• Logarithmic dependence on the mutation supply NUb
Extremal statistics estimates
• Probability to find a selection coefficient larger than S :
Prob[s > S ] =
Z
∞
Pb (s) ds = e−S/sb
S
• The largest selection coefficient smax in t generations is determined by
1
Prob[s > smax] =
⇔ smax = sb ln(NUbt)
NUb t
• Self-consistency requires that t = tfix(smax) = ln N/smax
⇒ smax = sb ln(NUb ln N/smax) ⇒ smax = hsi ≈ sb ln(NUb ln N)
• Rate of substitution:
Γ=
1
smax
sb
=
≈
ln(NUb ln N)
tfix(smax) ln N ln N
Other mutation distributions
• Extremal statistics for Pb (s) ∼ exp[−(s/sb)β ] yields
smax ∼ sb(ln N)1/β , Γ ∼ sb(ln N)1/β −1, R ∼ s2b(ln N)2/β −1
• Compare to behavior for mutations of single strength Pb (s) = δ (s − s0):
2s20 ln N
R≈ 2
∼ s20 ln N
ln (Ub/s0)
• Adaptation driven by
(i) single mutations of large effect for β < 1
(ii) multiple mutations of average effect for β > 1
• The “standard case” β = 1 is marginal
GL-theory vs. simulations: Rate of adaptation
10
-3
10
-4
10
-5
10
-6
simulation
GL(Wilke)
ln w̄
t
0.002
0.001
10
0
3
10
3
4
10
10
4
10
5
10
5
10
6
10
N
6
7
10
8
10
9
10
GL-theory vs. simulations: Selection coefficient of fixed mutations
0.25
E[s]
0.2
0.15
0.1
simulation
GL
0.05
large N
0
3
10
10
4
10
5
6
10
N
10
7
10
8
10
9
ln w̄
Breakdown of GL-theory at very large N
10
6
10
4
10
2
10
0
3
10
-2
10
-4
10
-6
10
-8
10
104
105
106
107
10
8
109
10
∞
Eq.
0
10
1
(17)
10
2
10
3
10
4
t
• Top curve: Exact solution of the infinite population limit
• For large N the process is dominated by multiple mutations
⇒ temporal substructure of the substitution process, Γ → 1 ≫ sb
E[k]
Rate of substitution
10
-1
10
-2
10
-3
GL
10
10
-4
Fixation
Origination
sb
-5
10
3
10
4
10
5
6
10
N
10
7
10
8
10
9
Epistasis and recombination
Epistasis
• So far: Fitness effects of different beneficial mutations are independent
• Epistasis implies interactions between the effects of different mutations
• General setting: Genome of L binary loci (sites) i = 1, ..., L at which a
mutation can be present (σi = 1) or absent (σi = 0)
• A fitness landscape is a function w(σ ) on the set of 2L genotype sequences
σ = (σ1, ..., σL)
• In the absence of epistatic interactions w(σ ) = ∏Li=1 ωi(σi)
• What is the effect of epistasis on the speed of adaptation?
• How epistatic are real fitness landscapes?
A two-locus system with recombination
S.C. Park, JK, in preparation
w(00)=1
00
µ
µ
01
10
µ
w(10)=w(01)=1−t−s
µ
11
w(11)=1−t
• Infinite population dynamics X0 = P(00), X1 = P(10) = P(01), X2 = P(11)
w̄X0′ = (1 − µ )2w00X0 + 2µ (1 − µ )w10 X1 + µ 2 w11X2 − (r/2w̄)(1 − 2µ )2D
w̄X1′ = [1 − 2µ (1 − µ )]w10X1 + µ (1 − µ )[w00X0 + w11X2] + (r/2w̄)(1 − 2µ )2D
w̄X2′ = (1 − µ )2w11X2 + 2µ (1 − µ )w10 X1 − (r/2w̄)(1 − 2µ )2D
w̄ = w00X0 + 2w10X1 + w11X2 mean fitness
D = w00w11X0X2 − w210X12 linkage disequilibrium
Types of epistasis
ln w
negative
null
positive
sign
00
10
01
11
• Epistasis measure: ε = w00w11 − w10w01
• Follow adaptation from low fitness genotype 11 to global optimum 00
Effect of recombination on the speed of adaptation
No epistasis (t=0.5, s=-0.207106...)
1.3
r=0
r=0.4
r=1
1.2
1.1
fitness
1
0.9
0.8
0.7
0.6
0.5
0.4
0
10
20
30
generations
40
50
• In the absence of epistasis (w00w11 = w10w01) recombination has no effect
on infinite populations
Effect of recombination on the speed of adaptation
Negative epistasis (t=0.5, s=-0.4)
1.2
r=0
r=0.1
r=1
1.1
1
fitness
0.9
0.8
0.7
0.6
0.5
0.4
0
20
40
60
generations
80
100
• Negative epistasis (w00w11 < w10w01) speeds up adaptation by generating
negative linkage disequilibrium (D < 0)
Effect of recombination on the speed of adaptation
Positive epistasis (t=0.5, s=-0.1)
1.3
r=0
r=0.4
r=1
1.2
1.1
fitness
1
0.9
0.8
0.7
0.6
0.5
0.4
0
10
20
30
generations
40
50
• Positive epistasis (w00w11 > w10w01) slows down adaptation by generating
positive linkage disequilibrium (D > 0)
Bistability
Fitness valley (t=0.2, s=0.2)
1.1
r=0
r=0.2
r=0.4
r=1
1.05
1
fitness
0.95
0.9
0.85
0.8
0.75
0.7
0
200
400
600
generations
800
1000
• In the presence of sign epistasis (w10 < min[w11, w00]) the population
remains at the initial genotype 11 forever if r > 2t and 2µ < s/(1 − t)
A fitness landscape for Aspergillus niger
J.A.G.M. de Visser, S.C. Park, JK, arXiv:0807.3002
• Fitness measurements for all 25 = 32 possible combinations of five
mutations known to be individually deleterious J.A.G.M. de Visser et al. (1997)
• Fitness landscape is rugged: Several local optima (underlined)
Adaptation in the A. niger landscape
1
1
A
C
0.9
w̄ (t)
w̄ (t)
0.9
0.8
U = 10−3
N = 109
0.7
r =0
r = 10−2
r = 10−1
r =1
0.8
U = 10−5
N = 109
0.7
0.6
0.6
0
100
200
300
400
0
t
500
1000
1500
2000
t
• Finite and infinite population behaviors agree
• Large recombination rate leads generically to localization at local fitness
optima
• Intermediate levels of recombination can be slightly advantageous
Summary and outlook
• Clonal interference is a generic feature of asexual adaptation with clear
experimental signatures and nontrivial theory
• Simple GL-theory works remarkably well in the regime of experimental
interest, though significant deviations are predicted for very large N
• Depending on the shape of Pb(s) adaptation is dominated by a few extreme
mutations or by many typical mutations
• First study of sexual reproduction on an empirical fitness landscape shows
that recombination is mostly disadvantageous
• Then why sex? Possible explanations:
The A. niger landscape is not representative
Other factors (e.g., time-dependent environments) are important