NOTE
A Unified Treatment of the Probability of Fixation
when Population Size and the Strength of Selection
Change Over Time
D. Waxman1
Centre for Computational Systems Biology, Fudan University, Shanghai 200433, People’s Republic of China, and Faculty of Science
and Engineering, University of Brighton, Brighton BN2 4GJ, United Kingdom
ABSTRACT The fixation probability is determined when population size and selection change over time and differs from Kimura’s
result, with long-term implications for a population. It is found that changes in population size are not equivalent to the corresponding
changes in selection and can result in less drift than anticipated.
A new mutation in a finite population is subject to genetic
drift and its ultimate fate is random: it may be either
extinction (loss) or complete establishment (fixation). In
a randomly mating population, the typical outcome for
a new mutation is its loss, with fixation occurring only with
small probability. Under static conditions (constant population size and constant strength of selection) a new beneficial
mutation with small selective advantage, s, in a randomly
mating population with discrete generations, has only
a small probability of fixation: 2s when reproduction is
treated as a branching process and the number of offspring
has a Poisson distribution (Haldane 1927). A deleterious
mutation has a yet smaller probability of fixation that is
not calculable under a branching process. However, despite
the relative rarity of fixation among the fates of all mutations, attention is largely focused on this outcome because
the fixation of beneficial mutations plays a central role in the
long-term adaptation of populations, and the fixation of deleterious mutations, in the absence of recombination, plays
an important role in the long-term survival of populations
(Muller 1964; Felsenstein 1974). Our understanding of the
rate of such phenomena depends sensitively on the probability of fixation, and deviations from its static value, due to
time-dependent conditions, are of particular significance.
Copyright © 2011 by the Genetics Society of America
doi: 10.1534/genetics.111.129288
Manuscript received December 03, 2010; accepted for publication April 20, 2011
Supporting information is available online at http://www.genetics.org/content/
suppl/2011/04/28/genetics.111.129288.DC1.
1
Address for correspondence: School of Mathematical Sciences, Fudan University,
Shanghai 200433, People's Republic of China. E-mail: [email protected]
Indeed, there are a variety of reasons, both abiotic and biotic, why population size and the strength of selection do
not generally remain constant over time.
Temporal changes, such as systematic trends in the
composition or temperature of the atmosphere or oceans
over time, although abiotic in nature, often have major
implications for biological systems and may force biotic
change. For example, atmospheric temperature changes
may affect various biological processes within an organism,
but also affect the vegetation on which an organism feeds,
thereby affecting both selection and carrying capacity. Thus
the general situation is complex, with selection fluctuating
for multiple reasons; indeed, “. . . natural selection is very
complicated, it is unlikely that the selection coefficient stays
constant” (Ohta 1972, p. 307). Additionally, changes in, for
example, resource/habitat availability or the density of parasites or predators will generally change the strength of selection as well as the size of a population. Thus generally we
should expect variation in population size and the strength
of selection.
The Soay sheep provide an illustration of the interplay of
the various factors that affect population size and the
strength of selection and the interrelation of these two
quantities. The Soay sheep are an intensively studied wild
mammalian population and their survival is density dependent and closely tied in with the availability of vegetation, whose quality and abundance are highly variable
(Clutton-Brock and Pemberton 2003). Parasite population
dynamics have been shown to regulate vertebrate populations and, in the Soay sheep, over-winter survival has been
Genetics, Vol. 188, 907–913 August 2011
identified with a response of the host’s immune system to
parasitic load (Coltman et al. 1999). Milner et al. (2003)
concluded that the 20-year study of Soay sheep clearly demonstrated selection intensity fluctuating in a temporal fashion, and during years of high population density mortality
was highest and hence selection most intense.
Theoretical studies of the way that population-size
change affects fixation have a long history that includes an
approximate diffusion analysis by Kimura and Ohta (1974)
for logistic population growth and a much more recent
investigation by Otto and Whitlock (1997), who considered
various scenarios of population-size change, using a generalization of the branching process of Haldane (1927). Among
many other things, the intuition established from the work
of Otto and Whitlock (1997) is that when a beneficial mutation segregates in a population of increasing size, it has an
increased probability of fixation. Subsequent work extended
the calculations to include population-size changes of a stochastic nature, either within a branching-process framework
(Engen et al. 2009) or for the Moran model (Parsons and
Quince 2007; Parsons et al. 2010). The treatment of how
changes in the strength of selection in finite populations
affect fixation includes work of Kimura and Ohta (1972),
while correlated fluctuations in the strength of selection,
when population size is finite, have been considered by
Takahata et al. (1975). Lambert (2006) combined drift
and branching processes (and hence incorporates stochastic
number fluctuations) and has obtained results in the regime
of weak selection, which is defined by 4Ne|s| ≪ 1, where Ne
denotes the effective population size.
The previous work has shown how the probability of
fixation is affected by temporal variation in either population size or the strength of selection. In the present work we
aim to compare and quantify the different ways that
temporal changes in population size and the strength of
selection affect fixation. We consider cases where selection
may not be weak (i.e., 4Ne|s| may not be small compared
with 1). To this end, we present a unified treatment of such
temporal variations on the probability of fixation that covers
beneficial, neutral/nearly neutral, and deleterious mutations. To cover this range of selective regimes we work
within the framework of the diffusion approximation
(Kimura 1955a), where the relative frequency of a gene is
treated as a random variable that takes continuous values.
This approximation derives its name from the diffusion
equation that governs the distribution of the relative gene
frequency, and a diffusion analysis has been used to derive
many fundamental results in population genetics (Crow and
Kimura 1970).
Let us begin with the standard case where a single locus
determines fitness in a randomly mating diploid sexual
population under static conditions (i.e., constant population
size and constant strength of selection). (Asexual populations can also be studied via diffusion analysis; for a recent
example, see Waxman and Loewe 2010.) The locus has two
alleles, denoted A and a, and selection is on viability and is
semidominant, with AA, Aa, and aa genotype individuals
having relative fitnesses of 1 + 2s, 1 + s, and 1, respectively.
Generations are discrete and the processes taking place in
one generation are given by the life cycle
adults (generation t)
↓ random mating, followed by the death of all adults
zygotes
↓ viability selection
juveniles
↓ thinning (number regulation)
adults (generation t + 1).
We assume that each adult contributes to a very large
number of zygotes, so that viability selection may be treated as
being deterministic in character. Juveniles (the individuals
that survive viability selection) undergo a nonselective process of ecological thinning that leads to an adult population
of N individuals in each generation.
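The deterministic effect of viability selection in this life cycle can be written out explicitly. The following is a short worked calculation, assuming Hardy–Weinberg proportions among zygotes (a consequence of random mating when the number of zygotes is very large); the symbol x* for the frequency of A among juveniles is notation introduced here for illustration only.

```latex
% x  : frequency of A among adults of generation t
% x* : frequency of A among juveniles, after viability selection (notation assumed here)
\begin{align*}
x^{*} &= \frac{x^{2}(1+2s) + x(1-x)(1+s)}
              {x^{2}(1+2s) + 2x(1-x)(1+s) + (1-x)^{2}}
       = \frac{x\,(1+s+sx)}{1+2sx} \\
      &= x + \frac{s\,x(1-x)}{1+2sx}.
\end{align*}
```

For small s the change in frequency due to selection is approximately sx(1 − x) per generation, which is the combination that multiplies the selection coefficient in the diffusion equation (Equation 1).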
The proportion of all genes at the locus in adults that are
allele A is written X(t); this is the relative frequency (henceforth termed frequency) of allele A. Because of the process of
thinning in the life cycle, the frequency, X(t), generally
varies randomly from generation to generation and may
have different values in different replicates of a population.
Statistics of X(t) can be described by a Wright–Fisher model
(Fisher 1930; Wright 1931). However, to make theoretical
progress, we consider an analysis based on the diffusion
approximation, using methods of Kimura (1955), McKane
and Waxman (2007), and Waxman (2011).
Readers not concerned with the detailed technical
aspects of this work may omit the derivations/proofs
contained in the main text and the material in supporting
information, File S1.
A diffusion analysis is based on the diffusion equation
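$$
-\frac{\partial}{\partial t} f(x,t) = -\frac{1}{4N_e}\frac{\partial^{2}}{\partial x^{2}}\left[x(1-x)\,f(x,t)\right] + \frac{\partial}{\partial x}\left[s\,x(1-x)\,f(x,t)\right] \tag{1}
$$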
for the probability density f(x, t) of the frequency of the A
allele at time t and frequency x.
Equation 1 can be solved to determine the probability of
fixation of the A allele at long times, where the only possible
outcomes for the locus are the A allele fixing or being lost. In
terms of the frequency X(t) we have
$$
\lim_{t\to\infty} X(t) =
\begin{cases}
1, & \text{if the } A \text{ allele fixes} \\
0, & \text{if the } A \text{ allele is lost.}
\end{cases}
$$
The probability of occurrence of these different outcomes
depends on the frequency, p, of the A allele at the initial time
t = 0. The fixation probability can be written as
$P_{\mathrm{fix}}(p) = \lim_{t\to\infty} E_p[X(t)]$, where Ep[. . .] denotes an average over
replicate populations when the A allele frequency has the
value p at time t = 0. Thus Ep[. . .] is a shorthand for the
conditional expectation E[. . . | X(0) = p].
The diffusion approximation of the fixation probability
follows from Equation 1 and was found by Kimura (1955b)
to be
$$
P_{\mathrm{fix}}(p) = \lim_{t\to\infty} E_p[X(t)] = \frac{1 - e^{-Sp}}{1 - e^{-S}}, \tag{2}
$$
where S = 4Nes. [The result of Equation 2 was derived assuming discrete generations. A closely related but different
result is obtained if it is assumed from the outset that generations are overlapping (Moran 1958).] The numerical errors
in the approximation of Equation 2 are remarkably small,
even for haploid populations of size 12 (equivalent to diploid
populations of size 6) as demonstrated by Ewens (1963).
The fixation probability of a single copy of an A allele in
a population of census size N is obtained by setting p = 1/(2N) in Equation 2. When the population size is such that
S = 4Nes is large (S ≫ 1) but Sp ≡ 2Nes/N is small (Sp ≪ 1), we arrive at the approximation Pfix(p) ≃ 2Nes/N, which,
when Ne = N, coincides with the leading term, 2s, of
a branching process (Haldane 1927). Thus for a constant
strength of selection and a constant population size, branching processes are valid for beneficial mutations in populations of suitably large size (S ≫ 1). Diffusion results have
a broader range of applicability that includes beneficial, deleterious, and neutral/nearly neutral mutations in populations of essentially arbitrary size (in such an approach,
there are essentially no restrictions on the sign and size of S).
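Equation 2 and its branching-process limit are easy to check numerically. The following is a minimal sketch; the function name `kimura_pfix` and the parameter values are assumptions chosen for illustration, and Ne = N is assumed in the example.

```python
import math


def kimura_pfix(p, S):
    """Equation 2: (1 - exp(-S p)) / (1 - exp(-S)); the neutral limit S -> 0 gives p."""
    if abs(S) < 1e-12:
        return p
    # expm1 keeps precision when |S| or |S p| is small.
    # Very large negative S (below about -700) would overflow in this simple form.
    return math.expm1(-S * p) / math.expm1(-S)


# Illustrative values (assumed): a single new copy, p = 1/(2N), with Ne = N.
N, s = 10_000, 0.01
S = 4 * N * s           # S >> 1
p = 1 / (2 * N)         # S * p << 1
print(kimura_pfix(p, S), 2 * s)
```

With these values S = 400 and Sp = 0.02, so the two printed numbers should be close, in line with the approximation Pfix(p) ≃ 2s discussed above. The same function accepts zero or negative S, where a branching-process result is not available.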
When the effective population size and the selection
coefficient depend on the time t, the parameter S acquires
time dependence and becomes S(t) = 4Ne(t)s(t). We initially proceed under the assumption that all changes in the
composite quantity S(t) are deterministic in character and
have the property that they cease after a finite time, which
we denote T. That is to say, for times t ≥ T we take S(t) to have
a constant value, and schematically
$$
S(t) = 4N_e(t)\,s(t) =
\begin{cases}
\text{arbitrary}, & t < T \\
S_\infty, & t \ge T.
\end{cases} \tag{3}
$$
Such an assumption on S(t) is not greatly restrictive. It
could, for example, describe the situation where the
strength of selection remains constant, while population size
alone changes for a finite time before achieving a constant
value. While logistic growth of a population does not precisely cease after a finite time, the population size can be
well approximated as achieving its carrying capacity after
a finite time and hence closely fits within the framework
of Equation 3. Beyond such a case, Equation 3 could also
describe environmental change of finite duration, which
affects both population size and selection.
Generalization of Pfix(p)
The generalization of Equation 2 to the case of time-dependent population sizes, Ne(t), and time-dependent selection coefficients, s(t), can be obtained from some basic
considerations. For deterministic changes in Ne(t) and s(t)
that lead to the composite quantity S(t) = 4Ne(t)s(t) having
the form in Equation 3, the generalization of Equation 2 is
$$
P_{\mathrm{fix}}(p) = \frac{1 - E_p\!\left[e^{-S_\infty X(T)}\right]}{1 - e^{-S_\infty}} \tag{4}
$$
(the derivation of Equation 4 is given in the following
paragraph and an alternative derivation is given in Part 1 of
File S1). The only statistic required in Equation 4 involves
X(T), the random value of the A allele frequency at time T.
The statistic in question, $E_p[e^{-S_\infty X(T)}]$, represents the average
of $e^{-S_\infty X(T)}$ over all replicate populations where the initial
frequency is p at time t = 0 [i.e., having X(0) = p].
Derivation
To derive Equation 4 we note that, as in the static case, the
fixation probability can be written as $P_{\mathrm{fix}}(p) = \lim_{t\to\infty} E_p[X(t)]$.
Conditioning on the value of X(t) at time T [i.e., after changes
in Ne(t) and s(t) have taken place], we have
$$
P_{\mathrm{fix}}(p) = \lim_{t\to\infty} E_p[X(t)] = \lim_{t\to\infty} E_p\big[E[X(t)\mid X(T)]\big] = E_p\Big[\lim_{t\to\infty} E[X(t)\mid X(T)]\Big].
$$
The quantity $\lim_{t\to\infty} E[X(t)\mid X(T)]$ appearing in the last expression
can be written as $\lim_{t\to\infty} E_{X(T)}[X(t)]$ and follows from
Equation 2 with the substitution p → X(T). We obtain
$\lim_{t\to\infty} E_{X(T)}[X(t)] = (1 - e^{-S_\infty X(T)})/(1 - e^{-S_\infty})$ and hence the
generalization of Equation 2 is $P_{\mathrm{fix}}(p) = E_p[(1 - e^{-S_\infty X(T)})/(1 - e^{-S_\infty})]$,
which is equivalent to Equation 4.
We note that the solution for Pfix(p) in Equation 4, which
directly follows from a diffusion analysis, does not generally
agree with the forms assumed by Kimura and Ohta (1972,
1974) for situations of changing selection strength or changing population size. Additionally, Equation 4 will not be
compatible with a branching-process approach when S∞ is
not large (S∞ ≲ 1) or indeed when S∞ is zero or negative;
there is limited validity to a branching-process treatment.
Before considering the numerical estimation of the
fixation probability from Equation 4, we investigate some
limiting cases and properties of Equation 4. Some of the
limiting cases allow us to verify that Equation 4 is in
accordance with well-known/well-understood results.
Limiting cases
1. When the time-interval T, over which all change occurs,
tends to zero there is a negligibly short period of time
dependence in the parameters. It can be shown (see Part
2 of File S1) that as a consequence, the frequency X(T) is
unaffected by these changes and tends to its initial value:
X(T) → p. Equation 4 then collapses to Kimura’s result,
Equation 2, for a population of constant size and constant
selection coefficient.
2. When the problem is static, in the sense that neither
population size nor the strength of selection changes over
time [i.e., Ne(t) = Ne and s(t) = s], it should follow that
Equation 4 coincides with Kimura’s result, Equation 2,
which covers the static case. In this case, T can be taken
to have any nonnegative value and it can be shown (see
Part 3 of File S1) that under static conditions, the expectation appearing in Equation 4, namely $E_p[e^{-S_\infty X(T)}]$, is
independent of T and hence equals its value at T = 0,
which is $e^{-S_\infty p}$. As a consequence, Equation 4 collapses to
Kimura’s result. The property of $E_p[e^{-S_\infty X(T)}]$ of being independent of T is a well-known martingale property of
the diffusion approximation of the Wright–Fisher model;
it appears to have been first identified as a property of the
diffusion approximation by Ewens (1964).
3. When S∞ becomes vanishingly small (S∞ → 0), Equation
4 reduces to Pfix(p) = Ep[X(T)]. Thus when there is selective neutrality after time T, the probability of fixation
equals the mean allele frequency at time T.
4. When S∞ becomes large and positive (S∞ → +∞), the
result in Equation 4 depends only on the probability that
X(T) ≠ 0. Equation 4 leads to Pfix(p) = 1 − Ploss(T; p),
where Ploss(T; p) is the probability of loss of the A allele
by time T given it had a frequency of p at time t = 0 (the
proof is given in the following paragraph). Thus in an
environment that has strongly positive selection after
time T, the probability of fixation is determined by the
probability that the A allele has not been lost by time T.
Any A alleles that are present in a population after time T
will subsequently fix.
Proof. The result Pfix(p) = 1 − Ploss(T; p) follows since
$\lim_{S_\infty\to\infty} (1 - e^{-S_\infty X(T)})/(1 - e^{-S_\infty})$ corresponds to the indicator
function 1{X(T) > 0}, which has the value of unity when X(T)
> 0 and vanishes when X(T) = 0. This indicator function
coincides with 1 − 1{X(T) = 0} and thus Equation 4 becomes
Pfix(p) = 1 − Ep[1{X(T) = 0}] = 1 − Ploss(T; p), where Ploss(T; p)
is the probability that X(T) = 0, given X(0) = p.
5. When S∞ becomes large and negative (S∞ → −∞), the
result in Equation 4 depends only on the probability that
X(T) = 1. Equation 4 leads to Pfix(p) = Pfix(T; p), namely
the probability of fixation of the A allele by time T, given
it had a frequency of p at time t = 0 (the proof is given in
the following paragraph). Thus in an environment that
has strongly negative selection after time T, the probability of fixation is determined by fixations that occur up to
time T; after time T any A alleles present in a population
will be lost.
Proof. The result Pfix(p) = Pfix(T; p) follows since
$\lim_{S_\infty\to-\infty} (1 - e^{-S_\infty X(T)})/(1 - e^{-S_\infty}) = \lim_{S_\infty\to-\infty} e^{-|S_\infty|(1 - X(T))}$, which corresponds to the indicator function 1{X(T) = 1}. Equation 4 then
becomes Pfix(p) = Ep[1{X(T) = 1}] = Pfix(T; p), where
Pfix(T; p) is the probability that X(T) = 1, given X(0) = p.
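The limiting cases 3 to 5 can be checked numerically on the quantity inside the expectation of Equation 4. The sketch below is illustrative only; the function name `fixation_kernel` is an assumption, and the two branches are algebraically identical forms chosen so that large |S∞| does not overflow.

```python
import math


def fixation_kernel(x, S_inf):
    """(1 - exp(-S_inf x)) / (1 - exp(-S_inf)) for a frequency 0 <= x <= 1."""
    if abs(S_inf) < 1e-12:
        return x  # neutral limit (case 3)
    if S_inf > 0:
        return math.expm1(-S_inf * x) / math.expm1(-S_inf)
    # S_inf < 0: numerator and denominator multiplied by exp(S_inf),
    # so that every exponent is non-positive.
    return math.exp(S_inf * (1 - x)) * math.expm1(S_inf * x) / math.expm1(S_inf)


for x in (0.0, 0.01, 0.5, 0.99, 1.0):
    print(x,
          fixation_kernel(x, 1e-6),   # close to x
          fixation_kernel(x, 1e4),    # close to the indicator of x > 0
          fixation_kernel(x, -1e4))   # close to the indicator of x = 1
```

Averaging the second column over replicates gives 1 − Ploss(T; p), and averaging the third gives the probability of fixation by time T, as in the proofs above.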
The above results may give the impression that only the
time dependence of the parameter S(t) = 4Ne(t)s(t) is of
significance for the fixation probability, Pfix(p), of Equation
4, and that Pfix(p) is not sensitive to the way that Ne(t) and
s(t) separately change. This is not generally true. To illustrate
this, let us consider two cases where either Ne(t) or s(t)
linearly increases by a factor of f in time T and then remains
constant after this:
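Case a:
$$
N_e(t) =
\begin{cases}
\left[1 + \dfrac{(f-1)\,t}{T}\right] N_0, & t < T \\
f N_0, & t \ge T;
\end{cases}
\qquad
s(t) =
\begin{cases}
s_0, & t < T \\
s_0, & t \ge T.
\end{cases}
$$
Case b:
$$
N_e(t) =
\begin{cases}
N_0, & t < T \\
N_0, & t \ge T;
\end{cases}
\qquad
s(t) =
\begin{cases}
\left[1 + \dfrac{(f-1)\,t}{T}\right] s_0, & t < T \\
f s_0, & t \ge T.
\end{cases}
$$
Both case a and case b lead to
$$
S(t) =
\begin{cases}
4\left[1 + \dfrac{(f-1)\,t}{T}\right] N_0 s_0, & t < T \\
4 f N_0 s_0, & t \ge T,
\end{cases}
$$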
yet they lead to different values for the fixation probability.
As a concrete example of the fixation probabilities that can
arise for case a and case b, consider the parameter values
N0 = 10, f = 10, T = 10, s0 = 0.04, and p = 1/(2N0) = 0.05.
From simulations we find that case a leads to a fixation probability of Pfix(p) ≃ 0.363 while case b leads to
Pfix(p) ≃ 0.275. [All simulations carried out in this work
are made within the framework of a Wright–Fisher model
(Fisher 1930; Wright 1931). In such a framework, selection
is treated as a deterministic process, and only the random
sampling of individuals without regard to type, i.e., the process of random genetic drift, is treated stochastically.] In this
example, S(t) increases by a factor of 10 in each case but
when it is just Ne(t) that varies over time, the probability of
fixation is 30% larger than when just s(t) varies.
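A comparison of this kind can be sketched with a short Wright–Fisher simulation. The code below is an illustration under stated assumptions, not the simulation code used for the results quoted above: it takes Ne = N, applies s(t) to the selection step of generation t, thins to N(t + 1) adults, and rounds N to the nearest integer. All function names are hypothetical. Because of these conventions and the smaller number of replicates, its estimates need not match the quoted values to three decimals.

```python
import numpy as np


def linear_ramp(t, start, factor, T):
    """Rise linearly from start at t = 0 to factor * start at t = T, then stay constant."""
    return start * (1.0 + (factor - 1.0) * min(t, T) / T)


def simulate_pfix(n_of_t, s_of_t, n_reps=20_000, max_gen=100_000, seed=0):
    """Fraction of replicates in which a single initial copy of A fixes."""
    rng = np.random.default_rng(seed)
    x = np.full(n_reps, 1.0 / (2 * int(round(n_of_t(0)))))
    for t in range(max_gen):
        active = (x > 0.0) & (x < 1.0)           # replicates still segregating
        if not active.any():
            break
        s = s_of_t(t)                             # assumed: s(t) acts in generation t
        two_n = 2 * int(round(n_of_t(t + 1)))     # assumed: thinning to N(t + 1) adults
        xa = x[active]
        # Deterministic semidominant viability selection (fitnesses 1 + 2s, 1 + s, 1).
        x_sel = xa + s * xa * (1.0 - xa) / (1.0 + 2.0 * s * xa)
        # Random genetic drift: binomial sampling of 2N genes.
        x[active] = rng.binomial(two_n, x_sel) / two_n
    return float(np.mean(x == 1.0))


N0, f, T, s0 = 10, 10, 10, 0.04
p_a = simulate_pfix(lambda t: linear_ramp(t, N0, f, T), lambda t: s0)   # case a
p_b = simulate_pfix(lambda t: N0, lambda t: linear_ramp(t, s0, f, T))   # case b
print(f"case a: {p_a:.3f}   case b: {p_b:.3f}")
```

Both runs use the same S(t), so any difference between the two estimates reflects only whether the change is carried by Ne(t) or by s(t).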
The results of cases a and b illustrate the more general
phenomenon that, as far as fixation is concerned, the way
that S(t) = 4Ne(t)s(t) varies over time is not the full story:
the probability of fixation generally depends on the source
of the variation of S(t).
The phenomenon that a varying Ne or a varying s has
nonequivalent effects on the probability of fixation follows
ultimately from the way that a changing Ne modifies the
timescale of random genetic drift. Since we shall compare
populations with the same initial size, Ne(0), we take Ne(0)
as a fixed parameter and find it convenient to take the timescale associated with genetic drift to be
$$
\tau = \int_0^{t} \frac{N_e(0)}{N_e(u)}\,du. \tag{5}
$$
We call τ the drift time. We can express t as a function of the
drift time, τ, and write t = t(τ). The form of t(τ) follows
from solving Equation 5 for t. Let us now explain the advantage of using the drift time, τ, instead of the actual time, t.
Assuming a time-dependent population size and a time-dependent strength of selection, we transform the diffusion
equation, Equation 1, so that the drift time τ (Equation 5) is
used instead of the actual time t. It can be shown (see Part 4
of File S1) that under such a transformation, the quantity
playing the role of the strength of selection in the resulting
diffusion equation is R(τ) = 4Ne(t(τ))s(t(τ)) and we call
R(τ) the overall strength of selection. The quantity Ne(t) does
not appear elsewhere in the transformed diffusion equation.
In a static problem, the probability of fixation is determined
by just the initial frequency, p, and the overall strength of
selection (see Equation 2). In a more general case, where
Ne(t) and s(t) exhibit deterministic change over time, the
probability of fixation is determined by p, and the entire
history of R(τ), from τ = 0 onward.
We note that when population size is static (Ne(t) =
Ne(0)), the drift time and the actual time coincide: t(τ) =
τ, independent of any variation of s. Thus for a static population size but a changing strength of selection, the overall
strength of selection is
$$
R_0(\tau) = 4N_e(0)\,s(\tau), \qquad \text{static } N_e. \tag{6}
$$
By contrast, when Ne varies with time, the drift time does
not generally coincide with the actual time: t(τ) ≠ τ. In such
a case, the overall strength of selection, when s has the fixed
value s(0), is
$$
R_1(\tau) = 4N_e(t(\tau))\,s(0), \qquad \text{varying } N_e. \tag{7}
$$
There is a key difference between Equations 6 and 7: the
time that s depends on in Equation 6 is simply the time we
have adopted for the diffusion equation, namely τ. By contrast, in Equation 7, the time that Ne depends upon is t(τ). It
can be shown (see Part 4 of File S1) that irrespective of
whether Ne(t) increases with time or whether it decreases
with time, the quantity Ne(t(τ)) generally satisfies
$$
N_e(t(\tau)) > N_e(\tau) \quad \text{for all } \tau > 0. \tag{8}
$$
This inequality stems directly from the modification of the
timescale induced by an Ne that exhibits either increase or
decrease. [For an Ne(t) that changes monotonically, the inequality in Equation 8 becomes replaced with Ne(t(τ)) ≥ Ne(τ).] It
means the population size Ne(t(τ)) in the transformed diffusion
equation is larger than the value of the population size that we
might believe is relevant at time τ, namely Ne(τ).
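A case where Equation 5 can be solved in closed form makes the inequality of Equation 8 concrete. The following worked example assumes exponential growth, Ne(t) = Ne(0)e^{rt} with a rate r > 0; this growth law is chosen here purely for illustration and is not one of the scenarios analyzed in this work. The symbol τ denotes the drift time of Equation 5.

```latex
% Assumed for illustration: N_e(t) = N_e(0) e^{rt}, with r > 0.
\begin{align*}
\tau &= \int_0^{t} e^{-ru}\,du = \frac{1 - e^{-rt}}{r},
\qquad
t(\tau) = -\frac{1}{r}\ln(1 - r\tau), \qquad 0 \le \tau < \frac{1}{r}, \\[4pt]
N_e(t(\tau)) &= \frac{N_e(0)}{1 - r\tau}
\;>\; N_e(0)\,e^{r\tau} = N_e(\tau)
\qquad \text{for } 0 < r\tau < 1,
\end{align*}
% The inequality follows from e^{-x} > 1 - x for x > 0, applied with x = r\tau.
```

In this example the drift time never exceeds 1/r, however long the population grows, which is one way of seeing that a growing population accumulates less drift than the elapsed number of generations would suggest.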
All other things being equal, a population whose size is
larger than anticipated exhibits less drift than anticipated.
This is the reason case a above, for a positively selected
allele, leads to a larger probability of fixation than case b.
Similarly, the inequality in Equation 8 indicates that
a negatively selected allele, when present in a population
that increases, or one that decreases, will have a reduced
probability of fixation (due to less drift) compared with the
case where population size is static and all variation occurs
in the strength of selection.
Figure 1 Scaled overall strength of selection. We present plots of the
scaled “overall strength of selection,” namely R(τ) = 4Ne(t(τ))s(t(τ)) divided
by its initial value 4Ne(0)s(0). Different scenarios of change are illustrated,
where the selection coefficient s(t) is positive for all t. We used the two
functions a(t) and b(t) to produce this figure. The function a(t) corresponds to logistic growth by a factor of 10; i.e., $a(t) = (1/10 + (9/10)e^{-rt})^{-1}$,
where r is a positive constant, and hence a(0) = 1 while a(∞) = 10. The
function b(t) corresponds to logistic decay by a factor of 10; i.e., $b(t) = (10 - 9e^{-rt})^{-1}$
and hence b(0) = 1 while b(∞) = 1/10. (A) We compare (i)
R0(τ) for a fixed effective population size and a logistically increasing
selection coefficient (Ne(t) = Ne(0), s(t) = s(0)a(t)), with (ii) R1(τ) for a logistically increasing effective population size and a fixed selection coefficient
(Ne(t) = Ne(0)a(t), s(t) = s(0)). (B) We compare (iii) R0(τ) for a fixed effective
population size and a logistically decreasing selection coefficient (Ne(t) =
Ne(0), s(t) = s(0)b(t)), with (iv) R1(τ) for a logistically decreasing effective
population size and a fixed selection coefficient (Ne(t) = Ne(0)b(t), s(t) =
s(0)). In A, an increasing population size leads to a larger overall strength
of selection than that of an increasing selection coefficient. In B, a decreasing population size leads again to a larger overall strength of selection than that of a decreasing selection coefficient. These properties result
from either an increasing population size or a decreasing population size
modifying the natural timescale of the diffusion equation in such a way
that there is less drift than might be anticipated (see Equation 8).
In Figure 1 we illustrate the forms of the “overall
strengths of selection,” R0(τ) and R1(τ), of Equations 6
and 7 to show how different the overall strength of selection
can be under “fixed Ne, varying s” and “varying Ne, fixed s.”
For definiteness, Figure 1 is restricted to alleles with positive
selection coefficients.
When Ne(t) exhibits periods of both increase and decrease,
no inequality of the form in Equation 8 generally holds.
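The curves described in the caption of Figure 1 can be reproduced approximately by integrating Equation 5 numerically. The sketch below is an illustration under assumptions: the value of the rate constant r, the grid sizes, and all function names are hypothetical choices, τ denotes the drift time of Equation 5, and the trapezoid rule with linear interpolation stands in for whatever method was used to draw the figure.

```python
import numpy as np

R = 0.1  # assumed value of the positive rate constant r


def a(t):
    """Logistic growth by a factor of 10: a(0) = 1, a(inf) = 10."""
    return 1.0 / (0.1 + 0.9 * np.exp(-R * t))


def b(t):
    """Logistic decay by a factor of 10: b(0) = 1, b(inf) = 1/10."""
    return 1.0 / (10.0 - 9.0 * np.exp(-R * t))


def drift_time(scale, t_grid):
    """Equation 5 by the trapezoid rule for Ne(t) = Ne(0) * scale(t); t_grid starts at 0."""
    integrand = 1.0 / scale(t_grid)
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t_grid)
    return np.concatenate(([0.0], np.cumsum(steps)))


def scaled_strengths(scale, t_max=100.0, n=10_001):
    """Return tau, R0(tau)/R(0) and R1(tau)/R(0) for the given change scale(t)."""
    t_grid = np.linspace(0.0, t_max, n)
    tau_grid = drift_time(scale, t_grid)
    tau = np.linspace(0.0, min(t_max, tau_grid[-1]), 201)
    t_of_tau = np.interp(tau, tau_grid, t_grid)   # invert tau(t) numerically
    r0 = scale(tau)        # fixed Ne, varying s (Equation 6)
    r1 = scale(t_of_tau)   # varying Ne, fixed s (Equation 7)
    return tau, r0, r1


for name, scale in (("increase", a), ("decrease", b)):
    tau, r0, r1 = scaled_strengths(scale)
    print(name, "R1 >= R0 everywhere:", bool(np.all(r1 >= r0 - 1e-9)))
```

For both the increasing and the decreasing case the scaled R1 should lie on or above the scaled R0, which is the content of Equation 8.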
Table 1 Illustrative numerical results for the probability of fixation when the composite quantity S(t) = 4Ne(t)s(t) changes with time

| Method | T | N0 | N∞ | s0 | s∞ | S0 | S∞ | Pfix | Cost/rep | Data set |
|---|---|---|---|---|---|---|---|---|---|---|
| Direct simulation | 20 | 50 | 50 | 0.005 | 0.050 | 1 | 10 | 0.0687 | 12 | 1 |
| Finite T simulation | 20 | 50 | 50 | 0.005 | 0.050 | 1 | 10 | 0.0689 | 4 | 2 |
| Direct simulation | 20 | 50 | 500 | 0.005 | 0.005 | 1 | 10 | 0.0817 | 115 | 3 |
| Finite T simulation | 20 | 50 | 500 | 0.005 | 0.005 | 1 | 10 | 0.0818 | 9 | 4 |
| Direct simulation | 200 | 50 | 50 | 0.005 | 0.050 | 1 | 10 | 0.0320 | 11 | 5 |
| Finite T simulation | 200 | 50 | 50 | 0.005 | 0.050 | 1 | 10 | 0.0321 | 10 | 6 |
| Direct simulation | 200 | 50 | 500 | 0.005 | 0.005 | 1 | 10 | 0.0451 | 51 | 7 |
| Finite T simulation | 200 | 50 | 500 | 0.005 | 0.005 | 1 | 10 | 0.0447 | 18 | 8 |
| Direct simulation | 20 | 500 | 500 | 0.005 | 0.050 | 10 | 100 | 0.0672 | 18 | 9 |
| Finite T simulation | 20 | 500 | 500 | 0.005 | 0.050 | 10 | 100 | 0.0671 | 4 | 10 |
| Direct simulation | 20 | 500 | 5000 | 0.005 | 0.005 | 10 | 100 | 0.0814 | 192 | 11 |
| Finite T simulation | 20 | 500 | 5000 | 0.005 | 0.005 | 10 | 100 | 0.0812 | 8 | 12 |
| Direct simulation | 200 | 500 | 500 | 0.005 | 0.050 | 10 | 100 | 0.0298 | 14 | 13 |
| Finite T simulation | 200 | 500 | 500 | 0.005 | 0.050 | 10 | 100 | 0.0302 | 11 | 14 |
| Direct simulation | 200 | 500 | 5000 | 0.005 | 0.005 | 10 | 100 | 0.0428 | 90 | 15 |
| Finite T simulation | 200 | 500 | 5000 | 0.005 | 0.005 | 10 | 100 | 0.0425 | 17 | 16 |

We take S(t) to start at the positive value S0 at time t = 0 and then to linearly increase to the value S∞ = 10S0 by time T and then remain constant at the value S∞ for all times
larger than T. Two different methods are used to estimate the probability of fixation when initially there is only a single copy of an A allele: (i) direct simulation, where we
“follow” each replicate population until either fixation or loss occurs and (ii) simulations based on Equation 4, where we follow each replicate population for, at most, T
generations. The column labeled “Cost/rep” gives the mean number of generations a replicate population was followed in a simulation. In the simulations, 5 × 10^5 replicate
populations were used, and we adopted a Wright–Fisher model where the only place in the life cycle where randomness occurs is in the thinning of the number of individuals
to Ne = N adults. The initial values of Ne and s are N0 and s0, while the final values are N∞ and s∞; data sets 1, 2, 5, 6, 9, 10, 13, and 14 correspond to Ne fixed and s changing
with time. It is evident from the table that there are differences in the fixation probability, depending on whether Ne changed with time, at fixed s, or s changed with time at
fixed Ne.
Estimation of the probability of fixation
Consider now how we would estimate the probability of
fixation, for a case where potentially complicated deterministic changes of s(t) and Ne(t) take place up to a finite time,
T. A direct approach would simply be to simulate the behavior of many replicates of a population. Each replicate population would need to be “followed” for a sufficient number
of generations until either fixation or loss occurred. The
fraction of such populations that fix is an estimate of the
fixation probability. By contrast, using Equation 4 it is necessary to follow replicate populations for, at most, only T
generations. On average it will require less than T generations, since fixation or loss will occur in some replicate populations prior to time T, and no further change will occur in
such populations. An estimate of the fixation probability
follows from such simulations by using the average of
$e^{-S_\infty X(T)}$ in Equation 4. This procedure may be significantly
shorter than the direct approach, depending on the parameter values in the problem (Table 1 illustrates differences in
the time required). Alternatively, the quantity $E_p[e^{-S_\infty X(T)}]$ that appears in Equation 4 can be estimated from numerical
solution of the backward diffusion equation (see Part 5 of
File S1). This thus provides an alternative route to estimation of the fixation probability.
The different approaches to the calculation of the fixation
probability are illustrated with an example in Table 1.
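The estimator based on Equation 4 can be sketched as follows. This is a minimal illustration rather than the code behind Table 1: it assumes Ne = N, applies s(t) in generation t with thinning to N(t + 1) adults, and uses far fewer replicates than the table. The function names are hypothetical, and the parameter values are those of data set 2 (Ne fixed at 50, s rising linearly from 0.005 to 0.050 over T = 20 generations).

```python
import numpy as np


def ramp(t, start, end, T):
    """Linear change from start at t = 0 to end at t >= T."""
    return start + (end - start) * min(t, T) / T


def wf_step(x, n_adults, s, rng):
    """One generation: deterministic semidominant selection, then binomial thinning."""
    x_sel = x + s * x * (1.0 - x) / (1.0 + 2.0 * s * x)
    x_sel = np.clip(x_sel, 0.0, 1.0)   # guard against rounding outside [0, 1]
    return rng.binomial(2 * n_adults, x_sel) / (2 * n_adults)


def pfix_finite_T(n_of_t, s_of_t, T, n_reps=100_000, seed=1):
    """Estimate Equation 4 by following each replicate for T generations only."""
    rng = np.random.default_rng(seed)
    x = np.full(n_reps, 1.0 / (2 * int(round(n_of_t(0)))))   # single initial copy
    for t in range(T):
        x = wf_step(x, int(round(n_of_t(t + 1))), s_of_t(t), rng)
    S_inf = 4 * int(round(n_of_t(T))) * s_of_t(T)             # assumes Ne = N
    return (1.0 - np.mean(np.exp(-S_inf * x))) / (1.0 - np.exp(-S_inf))


estimate = pfix_finite_T(lambda t: 50, lambda t: ramp(t, 0.005, 0.050, 20), T=20)
print(f"finite-T estimate of Pfix: {estimate:.4f}")
```

Replicates that have already fixed or been lost stay at 1 or 0 under binomial sampling, so they need not be treated separately; skipping them, as described above, only reduces the cost per replicate.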
Stochastic fluctuations
So far we have presented results of the probability of fixation
for cases where the time-dependent changes in the population size and the strength of selection are deterministic in
character. Let us now point out a generalization of these
results that includes stochastic fluctuations in population size
and the strength of selection.
We note that Lambert (2006) obtained results in the regime of weak selection, when there is fluctuating population
size and random genetic drift, by combining branching and
Wright–Fisher processes, while Parsons et al. (2010) considered the effects of fluctuating population size, in a quasineutral (i.e., weak selection) situation, where different
alleles have the same ratio of intrinsic birth to death rates.
The work of Parsons and Quince (2007) covers the nonneutral regime and includes density dependence and fluctuations in population size that arise from uncorrelated
stochastic births and deaths. By contrast, Karlin and Levikson (1974) considered the case where stochastic changes in
Ne(t) and s(t) have temporal correlations that relate the
values of these quantities in adjacent generations. These
authors found that various statistics of Ne(t) and s(t) make
contributions to the drift and diffusion coefficients of the
diffusion equation. Here, we make the alternative assumption that the stochastic fluctuations of Ne(t) and s(t) have
temporal autocorrelations that decay slowly, over very many
generations. It is then possible to account for these fluctuations using the approach of Takahata et al. (1975); see also
Huerta-Sanchez et al. (2008) where two different models of
autocorrelation are incorporated into stochastic fluctuations
of selection.
To generalize Equation 4, we first assume there are both
deterministic changes and stochastic fluctuations in Ne(t)
and s(t) for times t ≤ T, but only stochastic fluctuations
for times t > T. Then the appropriate generalization of Equation 4 for this case is $P_{\mathrm{fix}}(p) = E_p[C_1(X(T))]$, where $C_1(x)$ is
an eigenfunction of an averaged backward diffusion operator: see Part 6 of File S1 for further details.
To summarize: in this work we have presented results,
based on the diffusion approximation, which generalize
Kimura’s result for the probability of fixation to cases where
population size and the strength of selection are time dependent. We have provided results when the changes are
deterministic and also shown their generalization when
there are stochastic fluctuations with temporal autocorrelations that decay over many generations. This work has
implications for the long-term adaptation of populations,
demonstrating that while temporal variation in population
size and the strength of selection both affect the probability
of fixation, the changes are not equivalent and that generally, a population size that either increases or decreases will
lead to less drift and hence less fixation of deleterious mutations and greater fixation of beneficial mutations than would
otherwise be anticipated. There are many possible scenarios
where population size and the strength of selection change,
and the result of Equation 4 allows an efficient way to explore the implications of these for fixation.
Acknowledgments
I thank Warren Ewens, Jianfeng Feng, Guillaume Martin,
Yannis Michalakis, and Andy Overall for stimulating discussions
and the latter for very helpful suggestions on the manuscript. I
also thank the two anonymous reviewers for comments and
suggestions that substantially improved this work.
Note: See Uecker and Hermisson (pp. 915–930) in this issue,
for a related work.
Literature Cited
Coltman, D. W., J. G. Pilkington, J. A. Smith, and J. M. Pemberton, 1999 Parasite-mediated selection against inbred Soay
sheep in a free-living, island population. Evolution 53: 1259–
1267.
Clutton-Brock, T. H., and J. M. Pemberton, 2003 Soay Sheep:
Dynamics and Selection in an Island Population. Cambridge
University Press, Cambridge, UK.
Crow, J. F., and M. Kimura, 1970 An Introduction to Population
Genetics Theory. Harper & Row, New York.
Engen, S., R. Lande, and B. Saether, 2009 Fixation probability of
beneficial mutations in a fluctuating population. Genet. Res. 91:
73–82.
Ewens, W. J., 1963 Numerical results and diffusion approximations in a genetic process. Biometrika 50: 241–249.
Ewens, W. J., 1964 The pseudo-transient distribution and its uses
in genetics. J. Appl. Probab. 1: 141–156.
Felsenstein, J., 1974 The evolutionary advantage of recombination. Genetics 78: 737–756.
Fisher, R. A., 1930 The Genetical Theory of Natural Selection.
Clarendon Press, Oxford.
Haldane, J. B. S., 1927 A mathematical theory of natural and
artificial selection. V. Selection and mutation. Proc. Camb.
Philos. Soc. 23: 838–844.
Huerta-Sanchez, E., R. Durrett, and C. D. Bustamante,
2008 Population genetics of polymorphism and divergence
under fluctuating selection. Genetics 178: 325–337.
Karlin, S., and B. Levikson, 1974 Temporal fluctuations in selection intensities: Case of small population size. Theor. Popul.
Biol. 6: 383–412.
Kimura, M., 1955a Stochastic processes and distribution of gene
frequencies under natural selection. Cold Spring Harbor Symp.
Quant. Biol. 20: 33–53.
Kimura, M., 1955b Solution of a process of random genetic drift
with a continuous model. Proc. Natl. Acad. Sci. USA 41: 141–150.
Kimura, M., and T. Ohta, 1972 Probability of fixation of a mutant
gene in a finite population when selective advantage decreases
with time. Genetics 65: 525–534.
Kimura, M., and T. Ohta, 1974 Probability of gene fixation in an
expanding finite population. Proc. Natl. Acad. Sci. USA 71:
3377–3379.
Lambert, A., 2006 Probability of fixation under weak selection:
a branching process unifying approach. Theor. Popul. Biol. 69:
419–441.
Milner, J. M., S. D. Albon, L. E. B. Kruuk, and J. M. Pemberton,
2003 Selection on phenotype, pp. 190–216 in Soay Sheep:
Dynamics and Selection in an Island Population, edited by
T. H. Clutton-Brock and J. M. Pemberton. Cambridge University
Press, Cambridge, UK.
Moran, P. A. P., 1958 Random processes in genetics. Math. Proc.
Camb. Philos. Soc. 54: 60–71.
McKane, A. J., and D. Waxman, 2007 Singular solutions of the
diffusion equation of population genetics. J. Theor. Biol. 247:
849–858.
Muller, H. J., 1964 The relation of recombination to mutational
advance. Mutat. Res. 1: 2–9.
Ohta, T., 1972 Population size and the rate of evolution. J. Molec.
Evol 1: 305–314.
Otto, S. P., and M. Whitlock, 1997 The probability of fixation in
populations of changing size. Genetics 146: 723–733.
Parsons, T. L., and C. Quince, 2007 Fixation in haploid populations exhibiting density dependence I: the non-neutral case.
Theor. Popul. Biol. 72: 121–135.
Parsons, T. L., C. Quince, and J. B. Plotkin, 2010 Some consequences of demographic stochasticity in population genetics.
Genetics 185: 1345–1354.
Takahata, N., K. Ishii, and H. Matsuda, 1975 Effect of temporal
fluctuation of selection coefficient on gene frequency in a population. Proc. Natl. Acad. Sci. USA 72: 4541–4545.
Waxman, D., 2011 Comparison and content of the Wright–Fisher
model of random genetic drift, the diffusion approximation, and
an intermediate model. J. Theor. Biol. 269: 79–87.
Waxman, D., and L. Loewe, 2010 A stochastic model for a single
click of Muller’s ratchet. J. Theor. Biol. 264: 1120–1132.
Wright, S., 1931 Evolution in Mendelian populations. Genetics
16: 97–159.
Communicating editor: L. M. Wahl
GENETICS
Supporting Information
http://www.genetics.org/content/suppl/2011/04/28/genetics.111.129288.DC1
A Unified Treatment of the Probability of Fixation
when Population Size and the Strength of Selection
Change Over Time
D. Waxman
Copyright © 2011 by the Genetics Society of America
DOI: 10.1534/genetics.111.129288
FILE S1
SUPPORTING INFORMATION
The Supporting Information presented here is separated into background material followed by six separate parts. In Part 1 an alternative derivation is given for the probability of fixation when the effective population size and the strength of selection may change up to time T, but do not change beyond time T. In Part 2 a limiting case of the probability of fixation is determined, while in Part 3 it is shown how Equation (4) of the main text, for the fixation probability, simplifies when conditions are static. In Part 4 a useful transformation of the diffusion equation is made and a property of the population size in this equation is exposed. In Part 5 details are given of a way to numerically evaluate the quantity $E_p[e^{-S_\infty X(T)}]$ (which appears in Equation (4) of the main text). Lastly, in Part 6, we give details of the generalisation of Equation (4) of the main text to include stochastic fluctuations in population size and the strength of selection.
Background
Consider a single locus that determines fitness in a population of randomly mating diploid individuals
of (variance) effective size Ne (t) at time t. The locus has two alleles, denoted A and a, and is subject to
semidominant selection where, at time t, the relative fitnesses of AA, Aa and aa genotype individuals are
1 + 2s(t), 1 + s(t) and 1, respectively. Until Part 6 of the Supporting Information, we will assume that Ne (t)
and s(t) change deterministically.
We write the relative frequency (henceforth referred to as the frequency) of allele A at time t as X(t). Given an initial A allele frequency of y at time u, the probability density of the A allele frequency at later time t, at frequency x, is written as K(x, t|y, u). Under a diffusion approximation K(x, t|y, u) obeys the forward diffusion equation
$$-\frac{\partial}{\partial t} K(x,t|y,u) = -\frac{1}{4N_e(t)}\frac{\partial^2}{\partial x^2}\left[x(1-x)K(x,t|y,u)\right] + s(t)\frac{\partial}{\partial x}\left[x(1-x)K(x,t|y,u)\right] \tag{S1}$$
(Kimura 1955) and is subject to the initial condition $K(x,u|y,u) = \delta(x-y)$ where $\delta(x)$ denotes a Dirac delta function of argument x.
The result of the main text for the fixation probability, Equation (4), is given here for completeness:
$$P_{\mathrm{fix}}(p) = \frac{1 - E_p\left[e^{-S_\infty X(T)}\right]}{1 - e^{-S_\infty}} \tag{S2}$$
where p is the frequency at time t = 0, $S_\infty$ is the value of $4N_e(t)s(t)$ when $t \ge T$ and $E_p[\ldots]$ denotes the conditional expectation $E[\ldots|X(0) = p]$.
In what follows, we shall repeatedly make use of the relation
$$E_p\left[e^{-S_\infty X(T)}\right] \equiv E\left[e^{-S_\infty X(T)}\,\middle|\,X(0) = p\right] = \int_0^1 e^{-S_\infty x} K(x,T|p,0)\,dx \tag{S3}$$
where the last equality follows since K(x, T|p, 0) is the probability density, at frequency x, of X(T), conditional on X(0) = p.
PART 1
To give more insight into Equation (4) of the main text (reproduced in Equation (S2) above), and for use
in Part 6 of the Supporting Information, we give another derivation of Equation (S2).
We use a property of K(x, t|y, u) that follows from Equation (S1) being first order in time derivatives and linear, namely
$$K(x,t|y,u) = \int_0^1 K(x,t|z,r)K(z,r|y,u)\,dz \tag{S4}$$
where $u \le r \le t$. Equation (S4) is often known as the Chapman–Kolmogorov equation.
To determine the fixation probability, given an A allele frequency of p at time t = 0, we determine the behaviour of K(x, t|p, 0) at large values of t. We work under the assumption that s(t) and Ne(t) do not change when the time is larger than a specific time, T, hence
$$4N_e(t)s(t) = S_\infty \ \text{(a constant)} \quad \text{for } t \ge T. \tag{S5}$$
We take Equation (S5) into account when making specific choices of r, y and u in Equation (S4):
$$K(x,t|p,0) = \int_0^1 K(x,t|z,T)K(z,T|p,0)\,dz. \tag{S6}$$
The factor K(z, T|p, 0) in Equation (S6) is potentially complicated since it is determined by the time-dependent changes of Ne(t) and s(t) that occur from time 0 to time T. By contrast, K(x, t|z, T) applies for the range of times where the strength of selection and the population size have achieved constant values.
In the work of McKane and Waxman (2007) and Waxman (2011), ‘zero current’ boundary conditions were imposed on the solution of the forward diffusion equation (see footnote 1). Zero current boundary conditions, in contrast to the approach adopted by Kimura (1955), ensure that probability is conserved and lead to an interpretation of Equation (S1) that is consistent with the underlying Wright–Fisher model for all x including x = 0 and x = 1, i.e., including fixation and loss (see Waxman 2011). We apply these boundary conditions in the present context.
For times t > T, where the strength of selection and the population size have achieved the constant values $s_\infty$ and $N_\infty$, we can write
$$K(x,t|z,T) = \sum_{n=0}^{\infty} \Phi_n(x)\Psi_n(z)e^{-\lambda_n (t-T)} \tag{S7}$$
where $\Phi_n(x)$ ($\Psi_n(z)$) is an eigenfunction of the forward (backward) diffusion operator associated with eigenvalue $\lambda_n$:
$$-\frac{1}{4N_\infty}\frac{d^2}{dx^2}\left[x(1-x)\Phi_n(x)\right] + s_\infty\frac{d}{dx}\left[x(1-x)\Phi_n(x)\right] = \lambda_n\Phi_n(x) \tag{S8}$$
$$-\frac{y(1-y)}{4N_\infty}\frac{d^2}{dy^2}\Psi_n(y) - s_\infty\, y(1-y)\frac{d}{dy}\Psi_n(y) = \lambda_n\Psi_n(y). \tag{S9}$$
For K(x, t|z, T) to be a solution of the appropriate diffusion equation, the eigenfunctions appearing in Eqs. (S7), (S8) and (S9) are orthogonal and normalised (Waxman 2011) in the sense
$$\int_0^1 \Phi_n(x)\Psi_m(x)\,dx = \delta_{m,n} \tag{S10}$$
where $\delta_{m,n}$ is a Kronecker delta, which equals 1 when m = n and vanishes otherwise.
A key feature of the analysis of McKane and Waxman (2007) and Waxman (2011) is that under ‘zero current’ boundary conditions, there are two eigenfunctions of the forward and backward eigenvalue equations with zero eigenvalue. These are given the labels n = 0 and n = 1 and as $t \to \infty$ only terms with these labels persist in Equation (S7):
$$\lim_{t\to\infty} K(x,t|z,T) = \Phi_0(x)\Psi_0(z) + \Phi_1(x)\Psi_1(z). \tag{S11}$$
Footnote 1: Zero current boundary conditions correspond to the probability current density $-\frac{1}{4N_e(t)}\frac{\partial}{\partial x}\left[x(1-x)K(x,t|y,u)\right] + s(t)x(1-x)K(x,t|y,u)$ vanishing at x = 0 and x = 1. Since K(x, t|y, u) is a probability density, it is an integrable function of x with the consequence that x(1 − x)K(x, t|y, u) vanishes at x = 0 and x = 1. Accordingly, zero current boundary conditions can be taken as requiring that $\frac{\partial}{\partial x}\left[x(1-x)K(x,t|y,u)\right]$ vanishes at x = 0 and x = 1. See McKane and Waxman (2007) and Waxman (2011) for further details.
We take $\Phi_0(x) = \delta(x)$ and $\Phi_1(x) = \delta(1-x)$ (see footnote 2): these are probability densities associated with the A allele frequency having the precise values 0 and 1, respectively, and correspond to loss and fixation of the A allele. Using Equation (S11) in Equation (S6), we arrive at
$$\lim_{t\to\infty} K(x,t|p,0) = \Phi_0(x)\int_0^1 \Psi_0(z)K(z,T|p,0)\,dz + \Phi_1(x)\int_0^1 \Psi_1(z)K(z,T|p,0)\,dz$$
and the coefficient of $\Phi_1(x)$ in this expression is the probability of fixation. We thus find
$$P_{\mathrm{fix}}(p) = \int_0^1 \Psi_1(z)K(z,T|p,0)\,dz = E_p\left[\Psi_1(X(T))\right]. \tag{S12}$$
The eigenfunction $\Psi_1(y)$ of the backward eigenvalue equation has zero eigenvalue; it obeys Equation (S9) with n = 1 and $\lambda_1 = 0$. The eigenfunction is subject to $\Psi_1(0) = 0$ and $\Psi_1(1) = 1$ which result from Equation (S10) with m = 1 and n = 0 or n = 1. It follows that
$$\Psi_1(y) = \frac{1 - e^{-S_\infty y}}{1 - e^{-S_\infty}}.$$
We can thus write Equation (S12) as
$$P_{\mathrm{fix}}(p) = E_p\left[\frac{1 - e^{-S_\infty X(T)}}{1 - e^{-S_\infty}}\right] = \frac{1 - E_p\left[e^{-S_\infty X(T)}\right]}{1 - e^{-S_\infty}}.$$
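The decomposition behind Equation (S12) can be illustrated in a discrete setting. The sketch below is not the diffusion calculation of the text: it uses an exact Wright–Fisher Markov chain with a fixed number of gene copies, and all numerical values (`n_copies`, `T`, `start`, `s_inf`, and the schedule `s_of_t`) are hypothetical choices made only for illustration. The distribution at time T is propagated through time-dependent selection, then weighted by the static fixation probabilities that apply after T, and the answer is compared with brute-force iteration of the chain.

```python
import numpy as np
from math import comb

def wf_matrix(n_copies, s):
    """One-generation Wright-Fisher transition matrix for n_copies gene copies."""
    x = np.arange(n_copies + 1) / n_copies
    # Frequency after semidominant selection (fitnesses 1+2s, 1+s, 1).
    x_sel = x * (1 + s * (1 + x)) / (1 + 2 * s * x)
    j = np.arange(n_copies + 1)
    binom = np.array([comb(n_copies, k) for k in j], dtype=float)
    # Row i is the binomial distribution of next-generation counts given count i.
    return binom * x_sel[:, None] ** j * (1 - x_sel[:, None]) ** (n_copies - j)

def static_fixation_vector(P):
    """Fixation probability from every state for a time-independent matrix P."""
    Q = P[1:-1, 1:-1]            # transitions among unfixed states
    r = P[1:-1, -1]              # one-step absorption at fixation
    psi = np.zeros(P.shape[0])
    psi[-1] = 1.0                # already fixed; psi[0] = 0 means already lost
    psi[1:-1] = np.linalg.solve(np.eye(Q.shape[0]) - Q, r)
    return psi

# Hypothetical illustration values (not taken from the paper).
n_copies, T, start = 50, 20, 5
s_inf = 0.02
s_of_t = lambda t: 0.1 * (1 - t / T) + s_inf * t / T   # selection relaxes to s_inf

# Distribution of the allele count at time T under time-dependent selection.
dist = np.zeros(n_copies + 1)
dist[start] = 1.0
for t in range(T):
    dist = dist @ wf_matrix(n_copies, s_of_t(t))

P_inf = wf_matrix(n_copies, s_inf)
# Discrete analogue of Equation (S12): average the static fixation vector at time T.
p_fix_decomposed = dist @ static_fixation_vector(P_inf)
# Brute force: keep iterating the static chain until essentially all mass is absorbed.
p_fix_brute = (dist @ np.linalg.matrix_power(P_inf, 5000))[-1]
print(p_fix_decomposed, p_fix_brute)
```

The two numbers should agree to high precision, which is the discrete counterpart of splitting the evolution at time T as in Equation (S6).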
PART 2
In this part of the Supporting Information we obtain the limiting case of Equation (S2) when $T \to 0$. Given that K(x, t|p, 0) obeys the forward diffusion equation, Equation (S1), a direct calculation, assuming $[N_e(t)]^{-1}$ and $|s(t)|$ remain bounded for $0 \le t \le T$, yields $\left|E_p\left[e^{-S_\infty X(T)}\right] - e^{-S_\infty p}\right| = O(T)$ and hence as $T \to 0$, Equation (S2) collapses to
$$P_{\mathrm{fix}}(p) = \frac{1 - e^{-S_\infty p}}{1 - e^{-S_\infty}}$$
which is Equation (2) of the main text.
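As a small numerical companion to this limiting form, the sketch below evaluates the static expression $(1 - e^{-Sp})/(1 - e^{-S})$. The function name `kimura_fixation` and the parameter values are assumptions made for illustration; `numpy.expm1` is used only to limit rounding error when S p is small.

```python
import numpy as np

def kimura_fixation(p, S):
    """Static-condition fixation probability (1 - exp(-S p)) / (1 - exp(-S))."""
    if abs(S) < 1e-12:
        return p                      # neutral limit S -> 0
    # expm1(z) = exp(z) - 1; the two minus signs cancel in the ratio.
    return np.expm1(-S * p) / np.expm1(-S)

# Hypothetical example: a single new mutation in a population with N = Ne = 1000.
N = 1000
s = 0.01
p_new = 1 / (2 * N)
print(kimura_fixation(p_new, 4 * N * s))    # close to 2s for a beneficial mutation
print(kimura_fixation(p_new, -4 * N * s))   # far below the neutral value p_new
```

For a weakly beneficial new mutation the result is close to the classical value 2s quoted in the introduction of the main text.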
PART 3
When population size and the strength of selection are independent of time ($N_e(t) = N_e$ and $s(t) = s$) the expectation $E_p\left[e^{-S_\infty X(T)}\right]$ appearing in Equation (S2) takes a simple form which allows Equation (S2) to be significantly simplified.
To establish the form of $E_p\left[e^{-S_\infty X(T)}\right]$ we use Equation (S3) and obtain an equation for the quantity $\int_0^1 e^{-S_\infty x}K(x,T|p,0)\,dx$ by multiplying Equation (S1) (with t replaced by T) by $e^{-S_\infty x}$ and integrating over all x. When Ne and s are independent of time we obtain
$$-\frac{\partial}{\partial T}\int_0^1 e^{-S_\infty x}K(x,T|p,0)\,dx = -\frac{1}{4N_e}\int_0^1 e^{-S_\infty x}\frac{\partial^2}{\partial x^2}\left[x(1-x)K(x,T|p,0)\right]dx + s\int_0^1 e^{-S_\infty x}\frac{\partial}{\partial x}\left[x(1-x)K(x,T|p,0)\right]dx. \tag{S13}$$
Integrating the first term on the right hand side by parts twice, the second term once by parts, and using $4N_e s = S_\infty$, leads to
$$-\frac{\partial}{\partial T}\int_0^1 e^{-S_\infty x}K(x,T|p,0)\,dx = -\left[\frac{e^{-S_\infty x}}{4N_e}\frac{\partial}{\partial x}\left[x(1-x)K(x,T|p,0)\right]\right]_{x=0}^{x=1}. \tag{S14}$$
Footnote 2: There are two eigenfunctions associated with zero eigenvalue. Different linear combinations of these eigenfunctions also have zero eigenvalue. In the present work we have made the particular choice of these eigenfunctions made by Waxman (2011); an alternative, but equivalent choice of these eigenfunctions has been made by McKane and Waxman (2007).
The right hand side vanishes under the ‘zero current’ boundary conditions imposed by McKane and Waxman (2007) and Waxman (2011). We then have
$$\frac{\partial}{\partial T}\int_0^1 e^{-S_\infty x}K(x,T|p,0)\,dx \equiv \frac{\partial}{\partial T}E_p\left[e^{-S_\infty X(T)}\right] = 0$$
corresponding to $E_p\left[e^{-S_\infty X(T)}\right]$ being independent of T. We can take T = 0 and arrive at
$$E_p\left[e^{-S_\infty X(T)}\right] = e^{-S_\infty p}. \tag{S15}$$
Using this last result in Equation (S2) shows that when parameters are time-independent, the fixation probability reduces to $P_{\mathrm{fix}}(p) = \dfrac{1 - e^{-S_\infty p}}{1 - e^{-S_\infty}}$, i.e., Equation (2) of the main text.
The property of Equation (S15), that under static conditions $E_p\left[e^{-S_\infty X(T)}\right]$ is independent of T (and hence equals its value when T = 0) is a martingale property of the diffusion approximation that appears to have been first identified by Ewens (1964). It can also be derived by noting that $e^{-S_\infty y}$ is a linear superposition of $\Psi_0(y)$ and $\Psi_1(y)$, the two eigenfunctions of the backward equation with zero eigenvalue.
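The martingale property of Equation (S15) can be checked approximately by simulation. The sketch below is only an illustration under stated assumptions: it simulates a Wright–Fisher model (for which the diffusion result holds up to small corrections) rather than the diffusion itself, and the function name `simulate_wf`, the seed, and the parameter values are all hypothetical.

```python
import numpy as np

def simulate_wf(p, N, s, T, reps, rng):
    """Simulate reps independent Wright-Fisher populations for T generations."""
    x = np.full(reps, float(p))
    for _ in range(T):
        # Deterministic change under semidominant selection (1+2s, 1+s, 1).
        x_sel = x * (1 + s * (1 + x)) / (1 + 2 * s * x)
        # Binomial sampling of 2N gene copies (random genetic drift).
        x = rng.binomial(2 * N, x_sel) / (2 * N)
    return x

rng = np.random.default_rng(1)
N, s, p = 500, 0.002, 0.2          # hypothetical values, giving S = 4 N s = 4
S = 4 * N * s
for T in (0, 50, 200):
    x_T = simulate_wf(p, N, s, T, reps=20000, rng=rng)
    # Under static conditions the sample mean should stay near exp(-S p) for all T.
    print(T, np.exp(-S * x_T).mean(), np.exp(-S * p))
```

Up to sampling noise, the printed averages should be close to $e^{-Sp}$ for each T, even though the distribution of X(T) itself changes with T.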
PART 4
Transformation of the diffusion equation
In this part of the Supporting Information, we transform the diffusion equation, Equation (S1), by replacing the time t by the ‘drift time’ given in Equation (5) of the main text:
$$\tau = \int_0^t \frac{N_e(0)}{N_e(u)}\,du \tag{S16}$$
and determine a key property of Ne(t).
We shall compare populations with the same value of $N_e(0)$ and hence consider $N_e(0)$ a fixed parameter. We can, in principle, solve Equation (S16) for t and obtain it as a function of $\tau$, which we write as $t(\tau)$. Defining
$$\tilde{K}(x,\tau|y,\sigma) = K(x,t(\tau)|y,t(\sigma)) \tag{S17}$$
we find that Equation (S1) becomes
$$-4N_e(0)\frac{\partial}{\partial\tau}\tilde{K}(x,\tau|y,\sigma) = -\frac{\partial^2}{\partial x^2}\left[x(1-x)\tilde{K}(x,\tau|y,\sigma)\right] + R(\tau)\frac{\partial}{\partial x}\left[x(1-x)\tilde{K}(x,\tau|y,\sigma)\right] \tag{S18}$$
where
$$R(\tau) = 4N_e(t(\tau))\,s(t(\tau)). \tag{S19}$$
By virtue of its position in Equation (S18) the quantity $R(\tau)$ encapsulates selection and population size in a single term that we call the ‘overall strength of selection’. Note that the time arguments of $N_e$ and $s$ in Equation (S19) are $t(\tau)$ and hence are determined by the relationship between t and $\tau$ of Equation (S16), i.e., are determined by the way that $N_e$ varies over time.
Equation (S18) is the diffusion equation that is obtained from the original diffusion equation, Equation (S1), when it is transformed to depend on the drift time, $\tau$.
Property of $N_e(t(\tau))$
Let us now demonstrate a property of the quantity $N_e(t(\tau))$ appearing in Equation (S19). We consider the two cases, where Ne(t) either increases with t or where it decreases with t.
1) Increasing Ne(t) (i.e., dNe(t)/dt > 0). This immediately yields (i) Ne(t1) > Ne(t2) for t1 > t2. From Equation (S16) we obtain (ii) $t(\tau) > \tau$, since the integrand $N_e(0)/N_e(u)$ is then less than 1 for u > 0. It follows from (i) and (ii) that $N_e(t(\tau)) > N_e(\tau)$.
2) Decreasing Ne(t) (i.e., dNe(t)/dt < 0). This immediately yields (i) Ne(t1) < Ne(t2) for t1 > t2. From Equation (S16) we obtain (ii) $\tau > t(\tau)$, since the integrand $N_e(0)/N_e(u)$ is then greater than 1 for u > 0. It follows from (i) and (ii) that $N_e(t(\tau)) > N_e(\tau)$.
Thus in both cases we have $N_e(t(\tau)) > N_e(\tau)$ and this generally holds for an Ne(t) that exhibits only increase or only decrease.
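The inequality $N_e(t(\tau)) > N_e(\tau)$ can be checked numerically for particular trajectories. The following is a minimal sketch, assuming exponentially growing and shrinking populations chosen purely for illustration; the helper names `drift_time` and `Ne_at_drift_time` are hypothetical, and the integral of Equation (S16) is approximated by the trapezoidal rule and inverted by linear interpolation.

```python
import numpy as np

def drift_time(t_grid, Ne):
    """Cumulative trapezoidal estimate of tau(t) = int_0^t Ne(0)/Ne(u) du on t_grid."""
    f = Ne(t_grid[0]) / Ne(t_grid)
    steps = 0.5 * (f[1:] + f[:-1]) * np.diff(t_grid)
    return np.concatenate(([0.0], np.cumsum(steps)))

def Ne_at_drift_time(tau, t_grid, Ne):
    """Evaluate Ne(t(tau)) by inverting the increasing map t -> tau(t)."""
    tau_grid = drift_time(t_grid, Ne)
    t_of_tau = np.interp(tau, tau_grid, t_grid)
    return Ne(t_of_tau)

# Hypothetical trajectories with the same Ne(0) = 1000.
growing = lambda t: 1000.0 * np.exp(0.01 * t)
shrinking = lambda t: 1000.0 * np.exp(-0.01 * t)

t_grid = np.linspace(0.0, 200.0, 4001)
tau = 50.0
for name, Ne in (("growing", growing), ("shrinking", shrinking)):
    # In both cases the first number should exceed the second.
    print(name, Ne_at_drift_time(tau, t_grid, Ne), Ne(tau))
```

For these exponential examples the inversion can also be done by hand, which gives a simple check on the numerical values.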
PART 5
The quantity $E_p\left[e^{-S_\infty X(T)}\right]$ that appears in Equation (S2) can be determined by numerically solving a diffusion equation. To establish this we first note that K(x, t|y, u) not only obeys Equation (S1) but also the backward equation
$$\frac{\partial}{\partial u}K(x,t|y,u) = -\frac{y(1-y)}{4N_e(u)}\frac{\partial^2}{\partial y^2}K(x,t|y,u) - s(u)\,y(1-y)\frac{\partial}{\partial y}K(x,t|y,u) \tag{S20}$$
subject to
$$K(x,t|y,t) = \delta(x-y), \qquad K(x,t|0,u) = \delta(x), \qquad K(x,t|1,u) = \delta(1-x). \tag{S21}$$
The second and third conditions in Equation (S21) follow from the first condition since K(x, t|0, u) and K(x, t|1, u) are independent of u, by virtue of Equation (S20), and hence may be evaluated at u = t.
If we multiply K(x, T|y, u) by $e^{-S_\infty x}$ and integrate from x = 0 to x = 1 we obtain the result $G(y,u) \overset{\mathrm{def}}{=} \int_0^1 e^{-S_\infty x}K(x,T|y,u)\,dx = E\left[e^{-S_\infty X(T)}\,\middle|\,X(u) = y\right]$. The quantity G(y, u) is often known as the Laplace–Stieltjes transform of X(t), and in the present case it is evaluated at the solution of Equation (S1). The equation for G(y, u) follows from Equation (S20) by multiplying by $e^{-S_\infty x}$ and integrating from x = 0 to x = 1. It reads
$$\frac{\partial}{\partial u}G(y,u) = -\frac{y(1-y)}{4N_e(u)}\frac{\partial^2}{\partial y^2}G(y,u) - s(u)\,y(1-y)\frac{\partial}{\partial y}G(y,u) \tag{S22}$$
and is subject to (i) $G(y,T) = e^{-S_\infty y}$, (ii) $G(0,u) = 1$ and (iii) $G(1,u) = e^{-S_\infty}$ which follow from Equation (S21).
Equation (S22), subject to the initial condition (i), and the boundary conditions (ii) and (iii) is a well defined mathematical problem for G(y, u) that can be solved by a standard numerical technique such as the Crank–Nicolson method (see e.g., Press et al. 2007). Thus we ‘integrate backwards’ from u = T to u = 0 and obtain $G(y,0) = E\left[e^{-S_\infty X(T)}\,\middle|\,X(0) = y\right] \equiv E_y\left[e^{-S_\infty X(T)}\right]$ which, when used in Equation (S2), yields a numerical estimate of the fixation probability.
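The backward integration described above can be sketched as follows. This is a minimal illustration rather than the author's implementation: it assumes a uniform grid in y, central differences, coefficients frozen at the midpoint of each time step, and dense linear algebra for brevity. The function names, grid sizes and example parameter values are all hypothetical.

```python
import numpy as np

def laplace_transform_G(Ne, s, T, M=100, steps=200):
    """Crank-Nicolson solution of Equation (S22), stepped from u = T back to u = 0."""
    y = np.linspace(0.0, 1.0, M + 1)
    h = y[1] - y[0]
    dt = T / steps
    S_inf = 4.0 * Ne(T) * s(T)
    G = np.exp(-S_inf * y)                   # condition (i) at u = T
    w = y[1:-1] * (1.0 - y[1:-1])
    I = np.eye(M - 1)
    for k in range(steps):
        u_mid = T - (k + 0.5) * dt           # coefficients frozen at the step midpoint
        a = w / (4.0 * Ne(u_mid))            # multiplies the second y-derivative
        b = s(u_mid) * w                     # multiplies the first y-derivative
        lower = a / h**2 - b / (2 * h)       # weight of G[j-1]
        upper = a / h**2 + b / (2 * h)       # weight of G[j+1]
        L = np.diag(-2 * a / h**2) + np.diag(upper[:-1], 1) + np.diag(lower[1:], -1)
        bc = np.zeros(M - 1)
        bc[0] = lower[0] * 1.0               # condition (ii): G(0, u) = 1
        bc[-1] = upper[-1] * np.exp(-S_inf)  # condition (iii): G(1, u) = exp(-S_inf)
        rhs = (I + 0.5 * dt * L) @ G[1:-1] + dt * bc
        G[1:-1] = np.linalg.solve(I - 0.5 * dt * L, rhs)
    return y, G, S_inf

def fixation_probability(p, Ne, s, T, **grid):
    """Insert G(p, 0) into Equation (S2); linear interpolation between grid points."""
    y, G, S_inf = laplace_transform_G(Ne, s, T, **grid)
    return (1.0 - np.interp(p, y, G)) / (1.0 - np.exp(-S_inf))

# Hypothetical example: population size doubles linearly over 0 <= u <= T = 50.
Ne_growing = lambda u: 100.0 * (1.0 + u / 50.0)
s_const = lambda u: 0.01
p_fix_growing = fixation_probability(0.1, Ne_growing, s_const, T=50.0)
print(p_fix_growing)
```

A useful sanity check is the static case: with constant Ne and s the result should reproduce Equation (2) of the main text up to discretisation error, in line with Part 3.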
PART 6
The result in Equation (S2) for the fixation probability has a wider applicability than just for Equation (S1). In a more general case, we assume there are both deterministic changes and stochastic fluctuations in Ne(t) and s(t) for times t ≤ T, but only stochastic fluctuations for times t > T. Then the diffusion operator may be ‘time averaged’ and results in a time-independent form for t > T that differs from Equation (S1). Time averaging over correlated fluctuations in the strength of selection was first carried out by Takahata et al. (1975) and we adopt the same approach here, implicitly assuming that the correlations persist over many generations.
To demonstrate this in the simplest way possible, we transform the diffusion equation using the drift time $\tau = \int_0^t \frac{1}{4N_e(u)}\,du$ (which differs from the drift time defined in Equation (S16) by an overall constant factor). This drift time eliminates the factor $4N_e(0)$ on the left hand side of Equation (S18) but otherwise leaves it unchanged in form. As already stated, the quantity $R(\tau)$, by virtue of its position in the transformed diffusion equation plays the role of an effective strength of selection. We can directly follow the approach of Takahata et al. (1975), making similar assumptions:
(i) we assume that $R(\tau)$ has bounded fluctuations around a mean value of $\bar{R}$
(ii) with overbars denoting ensemble averages over fluctuations, we assume the correlations of $R(\tau)$ obey
#"
#
!! "
R(" + "1 ) ! R̄ R(") ! R̄ d" = V for all "1 larger than some small correlation time.
0
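The average of $\tilde{K}(x,\tau|y,\sigma)$ over fluctuations is written $\bar{K}(x,\tau|y,\sigma)$ and this obeys (cf. Takahata et al. 1975)
$$-\frac{\partial}{\partial\tau}\bar{K}(x,\tau|y,\sigma) = -\frac{\partial^2}{\partial x^2}\left\{\left[x(1-x) + Vx^2(1-x)^2\right]\bar{K}(x,\tau|y,\sigma)\right\} + \frac{\partial}{\partial x}\left\{\left[\bar{R}x(1-x) + Vx(1-x)(1-2x)\right]\bar{K}(x,\tau|y,\sigma)\right\}. \tag{S23}$$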
The eigenvalue equation of the backward diffusion operator associated with Equation (S23) is
$$-y(1-y)\left[1 + Vy(1-y)\right]\frac{d^2}{dy^2}\Psi(y) - y(1-y)\left[\bar{R} + V(1-2y)\right]\frac{d}{dy}\Psi(y) = \lambda\Psi(y). \tag{S24}$$
To employ the result in Equation (S12) within Part 1 of the Supporting Information, we require the eigenfunction $\Psi_1(y)$ with eigenvalue 0 that obeys $\Psi_1(0) = 0$ and $\Psi_1(1) = 1$. This is
$$\Psi_1(y) = \frac{1 - \left(\dfrac{1 - y/\alpha_+}{1 - y/\alpha_-}\right)^{\beta}}{1 - \left(\dfrac{1 - 1/\alpha_+}{1 - 1/\alpha_-}\right)^{\beta}}$$
where $\alpha_\pm = \dfrac{1 \pm \sqrt{1 + 4/V}}{2}$ and $\beta = \dfrac{\bar{R}}{V\sqrt{1 + 4/V}}$. The form of $\Psi_1(y)$ is equivalent to Equation (27) of Takahata et al. (1975) under the substitutions $\bar{s} \to \bar{R}$ and $N \to 1/4$.
The generalisation of Equation (S2), that includes fluctuations in Ne(t) and s(t) is then given by $P_{\mathrm{fix}}(p) = E_p\left[\Psi_1(X(T))\right]$.
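The zero-eigenvalue eigenfunction of Equation (S24) is straightforward to evaluate numerically. The sketch below is an illustration only, assuming V > 0 and a nonzero mean $\bar{R}$; the function name `psi1_fluctuating` and the variable names `alpha_plus`, `alpha_minus` and `beta` are labels chosen here for the two roots and the exponent defined above. It also shows that for very small V the function approaches the static form $(1 - e^{-\bar{R}y})/(1 - e^{-\bar{R}})$.

```python
import numpy as np

def psi1_fluctuating(y, R_bar, V):
    """Zero-eigenvalue backward eigenfunction with Psi(0) = 0 and Psi(1) = 1.

    Assumes V > 0 and R_bar != 0 (otherwise the ratio below is 0/0).
    """
    root = np.sqrt(1.0 + 4.0 / V)
    alpha_plus = 0.5 * (1.0 + root)      # root larger than 1
    alpha_minus = 0.5 * (1.0 - root)     # negative root
    beta = R_bar / (V * root)
    g = lambda z: ((1.0 - z / alpha_plus) / (1.0 - z / alpha_minus)) ** beta
    return (1.0 - g(y)) / (1.0 - g(1.0))

y = np.linspace(0.0, 1.0, 6)
R_bar = 2.0                              # hypothetical mean overall strength of selection
print(psi1_fluctuating(y, R_bar, V=0.5))
# With negligible fluctuations the static (Kimura-type) form is recovered.
print(psi1_fluctuating(y, R_bar, V=1e-6))
print(np.expm1(-R_bar * y) / np.expm1(-R_bar))
```

Averaging this function over the distribution of X(T), for example with a solver of the kind described in Part 5 but with terminal condition $\Psi_1(y)$, would give the generalised fixation probability; that step is not shown here.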
LITERATURE CITED IN THE
SUPPORTING INFORMATION
Ewens, W. J., 1964 The pseudo-transient distribution and its uses in genetics. J. Appl. Probab. 1: 141–156.
Kimura, M., 1955 Stochastic processes and distribution of gene frequencies under natural selection. Cold
Spring Harbor Symp. Quant. Biol. 20: 33–53.
McKane, A. J. and Waxman, D., 2007 Singular solutions of the diffusion equation of population genetics. J.
Theor. Biol. 247: 849–858.
Press, W. H., Teukolsky, S. A., Vetterling, W. T. and Flannery, B. P., 2007 Numerical Recipes 3rd Edition: The Art
of Scientific Computing. Cambridge University Press, Cambridge.
Takahata, N., Ishii, K. and Matsuda, H., 1975 Effect of temporal fluctuation of selection coefficient on
gene frequency in a population. Proc. Natl. Acad. Sci. USA 72: 4541–4545.
Waxman, D., 2011 Comparison and content of the Wright–Fisher model of random genetic drift, the diffusion
approximation, and an intermediate model. J. Theor. Biol. 269: 79–87.