Hereditas 141: 74-80 (2004)
On the effects of background selection in small populations on
comparisons of molecular variation
SNÆBJÖRN PÁLSSON
Institute of Biology, University of Iceland, Reykjavik, Iceland
Pálsson, S. 2004. On the effects of background selection in small populations on comparisons of molecular variation. - Hereditas 141: 74-80. Lund, Sweden. ISSN 0018-0661. Received August 19, 2003. Accepted May 10, 2004
Deleterious mutations affect genetic variation at linked neutral loci. Neutral variation can be reduced due to background selection, but in small populations and with tight linkage such variation may increase due to associative overdominance. Here I report the results of computer simulations of diploid genotypes in small populations, where I look at the effect of deleterious mutations and linkage on comparisons of intra- and interspecific variation. Each chromosome consisted of 2000 loci where deleterious and neutral mutations occurred. The ratio of nonsynonymous to synonymous substitution rates (Ka/Ks) either increases with tight linkage or is unaffected, depending on the strength of selection. The ratio of the number of segregating mutations to the number of fixed mutations decreases under the conditions leading to background selection but can increase at tight linkage. The number of segregating sites (Sn) is less affected than nucleotide site diversity (π): π is reduced more than Sn at intermediate linkage, but π increases more than Sn when linkage is tight. Effects similar to those found for Sn and π are observed for heterozygosity and variance in allele size of tandem repeat loci.
Snæbjörn Pálsson, Institute of Biology, University of Iceland, Sturlugata 7, IS-101 Reykjavik, Iceland. E-mail: [email protected]
Several studies on intra- and interspecific variation in
DNA have been conducted in recent years. Whether
such variation reflects a process of adaptation or is
simply noise due to mutations and random genetic
drift is one of the questions of molecular evolutionary
biology, and both theory and statistics have been
developed to guide sampling and testing of data
(LI 1997). Such testing is made problematic partly
because of the assumption of equilibrium between
mutations and demographic factors, such as drift and
migration, and partly because neutral variation is
affected by selection at linked loci. Such selective
effects depend on linkage disequilibrium between the
loci and therefore on the rate of recombination as well
as on the mode of selection (CHARLESWORTH et al.
1993; PÁLSSON and PAMILO 1999).
The selective effects on neutral loci due to linkage
have been described for different modes of selection,
for hitchhiking (MAYNARD SMITH and HAIGH 1974),
background selection (CHARLESWORTH et al. 1993;
HUDSON and KAPLAN 1995; NORDBORG et al. 1996),
the joint effects of positively and negatively selected
mutations (KIM and STEPHAN 2000), and models
based on temporal fluctuations in the direction of
selection (BARTON 1995). Directional selection reduces variation of neutral or weakly selected variants, either because neutral alleles hitchhike with linked beneficial mutations that sweep through the population or because background selection against deleterious mutations reduces the effective population size (Ne) (STEPHAN et al. 1999).
The effect of linkage on variation at selected sites can be understood by considering the Hill-Robertson effect (HILL and ROBERTSON 1966; BARTON 1995), which results in reduced Ne and consequently increased genetic drift and reduced intensity of selection. These effects of interference among selected loci apply mostly to mutations under moderate or weak selection, as these mutations can segregate for a longer time within a population. Studies by MCVEAN and CHARLESWORTH (2000), COMERON et al. (1999) and COMERON and KREITMAN (2002) have suggested that stronger interference can be expected in regions of low recombination. Empirical work has provided some evidence that these theoretical predictions hold in natural populations (CHARLESWORTH 1996; COMERON et al. 1999).
The background selection theory predicts that the
rate of substitution of deleterious alleles is accelerated
as the efficacy of selection is reduced and that
nucleotide diversity is reduced more than the number of
segregating sites (CHARLESWORTH et al. 1993). Selection in small diploid populations is not simply
directional against chromosomes carrying harmful
mutations (PAMILO and PÁLSSON 1998; PÁLSSON
and PAMILO 1999). Chromosomes that are most
divergent from each other carry different harmful
mutations, and selection depends not only on the
number of such mutations but also on how often they
are expressed as homozygotes. This can result in frequency-dependent selection favouring rare variants at the level of chromosomes, with deleterious alleles at linked loci being in repulsion equilibrium, and in balancing selection when recombination is restricted. Such selection will promote neutral variation at linked loci by associative overdominance (FRYDENBERG 1963), opposite to the effect of background selection (CHARLESWORTH et al. 1993), when linkage is tight and the product of population size (N), dominance coefficient (h) and selection coefficient (s), Nhs, is small: close to or lower than one (PÁLSSON and PAMILO 1999).
In this study I look particularly at the situations
when the product of Nhs and the recombination rate
(r) in the study by PÁLSSON and PAMILO (1999)
resulted (1) in little or no linkage effect, (2) in a
reduction due to background selection, and (3) in an
increase in heterozygosity at the neutral marker due to
associative overdominance.
I explore how comparisons based on inter- and intraspecific variation depend on deleterious mutations and linkage in small populations. Firstly, I look at the ratio of nonsynonymous to synonymous substitution rates, which was suggested by OHTA (1995) as a test of her nearly neutral hypothesis and has been used in species comparisons (PAMILO and O’NEILL 1997). Secondly, I look at the test by MCDONALD and KREITMAN (1991), which contrasts polymorphism within species and divergence among species. Thirdly, I look at two comparisons of intraspecific variation: i) the Tajima test (TAJIMA 1989), which contrasts two estimates of $\theta = 4N\mu$, based on nucleotide diversity (π) and on the number of segregating sites (Sn), and a similar statistic, ii) the imbalance index β developed by KIMMEL et al. (1998) for microsatellite loci, which contrasts two estimates of θ based on variation in allele size and heterozygosity of a locus with a variable number of tandem repeats. Previous studies (CHARLESWORTH et al. 1993; PÁLSSON and PAMILO 1999) have shown how background selection can be detected by methods which contrast estimates of π and Sn by the Tajima test (TAJIMA 1989). As selection may affect the estimates differently, comparison of the estimates may indicate whether and how selection has acted (TAJIMA 1989; KIMMEL et al. 1998).
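The contrast used by the Tajima test can be illustrated with a minimal sketch. It follows the standard formulas of TAJIMA (1989) and is not taken from the simulation program used in this study; the function names and the representation of sequences as equal-length strings are assumptions made for the example.

```python
from itertools import combinations
import math


def pairwise_diversity(seqs):
    """Mean number of differences over all pairs of sequences (pi)."""
    n = len(seqs)
    diffs = sum(sum(a != b for a, b in zip(x, y))
                for x, y in combinations(seqs, 2))
    return diffs / (n * (n - 1) / 2)


def segregating_sites(seqs):
    """Number of alignment columns carrying more than one state (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def tajimas_d(seqs):
    """Tajima's D; returns None when there is no variation.

    Intended for samples of at least four sequences.
    """
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: pi minus the estimate of theta based on S.
    return (pairwise_diversity(seqs) - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

A sample in which every variant is a singleton, such as `["1000", "0100", "0010", "0001"]`, gives a negative D, whereas variants at intermediate frequency, as in `["11", "11", "00", "00"]`, give a positive D.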
METHODS
I simulated diploid genotypes consisting of a single pair of chromosomes carrying 2000 loci (or sites), spread uniformly along the chromosome (as in PÁLSSON and PAMILO 1999). The parameter values studied here were chosen specifically to contrast the effects of background selection and associative overdominance and are a subset of the values studied by PÁLSSON and PAMILO (1999). The number of mutations per gamete was drawn from a Poisson distribution, and the sites at which deleterious mutations occurred were sampled from a uniform distribution. Neutral mutations were introduced at every second locus at a rate of $10^{-5}$ per locus, and were studied within regions of 200 loci at the center and 100 loci from both ends of the chromosome, whereas deleterious mutations occurred at every second locus over the whole chromosome at a rate of $10^{-4}$ per locus, or U = 0.1 per genome per generation. The deleterious mutation rate is consistent with mutation rates used by CHARLESWORTH et al. (1993) and is somewhat conservative (KEIGHTLEY 1994). Five microsatellite loci were located at the center; changes in repeat number occurred by single-step mutations with equal probability of an increase or a decrease in length. The microsatellite mutation rate per locus was $2 \times 10^{-3}$ per generation.
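The mutation step described above can be sketched as follows. This is an illustration in Python rather than the original C program; the helper names, the choice of even-numbered loci as the deleterious sites, and the use of Knuth's method for Poisson sampling are assumptions made for the example.

```python
import math
import random

N_LOCI = 2000   # loci per chromosome (from the text)
U_DEL = 1e-4    # deleterious mutation rate per locus (from the text)


def poisson(mean, rng):
    """Poisson variate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def deleterious_sites(rng):
    """Loci hit by new deleterious mutations in one gamete."""
    # Every second locus is a target; which parity is used is an assumption.
    targets = range(0, N_LOCI, 2)
    mean = U_DEL * len(targets)   # 1000 loci * 1e-4 = 0.1 per gamete
    n_new = poisson(mean, rng)
    return [rng.choice(targets) for _ in range(n_new)]
```

With these values about 90% of gametes receive no new deleterious mutation in a given generation, since exp(-0.1) is approximately 0.905.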
The deleterious effect of each mutation was identical within each run and the overall fitnesses of the genotypes were calculated multiplicatively from fitnesses at individual loci as $w_i = (1 - s)^x (1 - hs)^y$, where x and y are the numbers of homozygous and heterozygous loci, respectively. Selection (s) and dominance (h) coefficients varied among simulations: for s = 0.1 the effect of dominance was studied with varying dominance (h = 0.0, 0.1, 0.5), and the strength of selection was varied (s = 0.01, 0.05, 0.1, 0.2) in simulations with dominance h = 0.1. Population size (N) was 100 or 400. The product Nhs thus varied from 0 to 20.
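As an illustration of the multiplicative fitness function, the sketch below computes w for a diploid genotype. Representing each gamete as a set of loci carrying a deleterious mutation is an assumption made for the example, as are the function names.

```python
def fitness(x_hom, y_het, s, h):
    """Multiplicative fitness w = (1 - s)^x * (1 - h*s)^y."""
    return (1.0 - s) ** x_hom * (1.0 - h * s) ** y_het


def genotype_fitness(gamete_a, gamete_b, s, h):
    """Fitness of a diploid built from two gametes given as sets of mutated loci."""
    x = len(gamete_a & gamete_b)   # mutation present on both chromosomes
    y = len(gamete_a ^ gamete_b)   # mutation present on one chromosome only
    return fitness(x, y, s, h)
```

For example, `genotype_fitness({1, 2}, {2, 3}, s=0.1, h=0.1)` has one homozygous and two heterozygous loci and evaluates to 0.9 × 0.99 × 0.99. With h = 0 the heterozygous loci have no effect on fitness, so only mutations shared by both chromosomes are selected against.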
The number of recombination events was sampled from a Poisson distribution with mean L. Three different parameter values were used, namely L = 0.01, 0.1 and 1.0, the recombination frequency (r) between adjacent loci being given by Haldane’s equation (HALDANE 1919), $r = \frac{1}{2}\left(1 - e^{-2L/(n-1)}\right)$. The corresponding r-values are $10^{-5}$, $10^{-4}$ and $10^{-3}$. The sites of recombination events along the chromosomes were sampled from a uniform distribution.
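A short sketch of the recombination step, under stated assumptions: n in Haldane's equation is taken to be the number of loci on the chromosome, and a gamete is built by switching between the two parental chromosomes at each breakpoint. Neither detail is spelled out in the text, and the function names are hypothetical. The values returned by `haldane_r` depend on how n is counted, so they need not match the rounded r-values quoted above.

```python
import math


def haldane_r(L, n_loci):
    """Recombination frequency between adjacent loci for map length L."""
    d = L / (n_loci - 1)   # map distance per interval (n taken as number of loci)
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def recombinant(chrom_a, chrom_b, breakpoints):
    """Copy chrom_a, switching template at every breakpoint index."""
    gamete = []
    current, other = chrom_a, chrom_b
    for i in range(len(chrom_a)):
        # An odd number of crossovers at the same position switches the template.
        if breakpoints.count(i) % 2 == 1:
            current, other = other, current
        gamete.append(current[i])
    return gamete
```

For small map distances r is close to the distance itself, and for very large distances it approaches 0.5, as expected for unlinked loci.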
Mating was random and the offspring consisted of
two gametes randomly selected from their two parents.
The simulations were run for 30 000 generations,
with five or more replicates each, and the statistics
studied were scored every 1000 generations. Number of
segregating sites within the population (Si), mean
number of segregating mutations per gamete (Mi)
and fixations (Fi) were studied for both deleterious
and neutral mutations at the two chromosomal
regions (i) consisting each of 100 loci from the center
and the tips of the chromosome.
Neutral and harmful mutations were used to estimate synonymous (Ks) and nonsynonymous (Ka) substitution rates, respectively, as $K_i = M_i + F_i$. This would be the expected estimate one would get when sampling one gamete or sequence. Comparisons of synonymous and nonsynonymous substitution rates are generally based on comparisons of two sequences obtained from two species (LI 1997; NEI and KUMAR 2000). To include several sequences, HUGHES and NEI (1988) based the comparisons on the average values of the substitution rates. How well this method reflects analogous DNA sequence variation can be questioned, as this simulation is based on mutations per locus rather than on single nucleotides. What is of interest here, however, is the relative differences among regions and between simulations with different rates of recombination.
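The summary statistics can be illustrated with a small sketch. It assumes that each gamete is stored as a set of mutated loci, that fixed mutations are still present in every gamete when the statistics are scored, and that the substitution estimate is the sum of M and F; the function names are hypothetical and the original program may have kept its tallies differently.

```python
from collections import Counter


def summarize(gametes):
    """Return (S, M, F) for a list of gametes given as sets of mutated loci."""
    n = len(gametes)
    counts = Counter(site for g in gametes for site in g)
    fixed = sum(1 for c in counts.values() if c == n)          # F
    segregating = sum(1 for c in counts.values() if c < n)     # S
    mean_seg = sum(c for c in counts.values() if c < n) / n    # M
    return segregating, mean_seg, fixed


def substitution_estimate(gametes):
    """K = M + F: differences expected in one sampled gamete."""
    _, mean_seg, fixed = summarize(gametes)
    return mean_seg + fixed
```

For the three gametes `{1, 2}`, `{1, 3}` and `{1, 2, 4}`, locus 1 is fixed, three loci are segregating, and a gamete carries on average 4/3 segregating mutations, so K = 7/3.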
MCDONALD and KREITMAN (1991) proposed the use of the likelihood ratio, or G-test, as a 2 × 2 contingency test to contrast the relative levels of divergence and polymorphism at both replacement and silent sites. In this study I examine whether the associations differ with and without background selection, by comparing the ratio of the numbers of fixations (Fi) and segregating mutations (Si) and its dependence on the rate of recombination.
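For reference, a G-test on a 2 × 2 table can be sketched as below. The counts in the usage note are hypothetical and serve only to show the calculation; no continuity or Williams correction is applied, and the P-value uses the chi-square distribution with one degree of freedom.

```python
import math


def g_statistic(table):
    """Likelihood-ratio (G) statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_totals[i] * col_totals[j] / total
                g += observed * math.log(observed / expected)
    return 2.0 * g


def p_value_1df(g):
    """Upper tail of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(g / 2.0))
```

With rows for fixed and polymorphic mutations and columns for replacement and silent sites, the hypothetical table `[[7, 17], [2, 42]]` gives G of about 7.9 and P below 0.01, while a table with identical proportions in both rows gives G = 0.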
The average number of mutational differences for the neutral loci was calculated as $\pi = \sum_{i<j} \pi_{ij} / [n(n-1)/2]$. For each microsatellite locus I calculated the variance in allele size and the heterozygosity. The expected variance within a sample is given by $\mu t \sigma_m^2$, where t is the average coalescence time (here equal to twice the population size N), and $\sigma_m^2$ is the variance in the change in allele size resulting from each mutational event (SLATKIN 1995). Under the one-step mutational model used here $\sigma_m^2 = 1$. The expected heterozygosity in a stepwise model is $H = 1 - 1/(1 + 8N\mu)^{1/2}$ (KIMURA and OHTA 1978). An imbalance index designed by KIMMEL et al. (1998) contrasts two estimates of Nμ, based on variance (V) in allele size and heterozygosity (P): $\ln \beta = \ln \theta_V - \ln \theta_P$.
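The two microsatellite estimates can be related with a short sketch. The estimators below are obtained by inverting the expectations given above, with θ = 4Nμ, expected variance 2Nμ and expected homozygosity 1 − H; that these are the exact forms used in the study is an assumption, and the function names are hypothetical.

```python
import math


def expected_heterozygosity(n_pop, mu):
    """H = 1 - 1/sqrt(1 + 8*N*mu) under the stepwise mutation model."""
    return 1.0 - 1.0 / math.sqrt(1.0 + 8.0 * n_pop * mu)


def theta_from_variance(variance):
    """Invert E[V] = 2*N*mu with theta = 4*N*mu (assumed estimator)."""
    return 2.0 * variance


def theta_from_homozygosity(homozygosity):
    """Invert E[homozygosity] = 1/sqrt(1 + 2*theta) (assumed estimator)."""
    return (1.0 / homozygosity ** 2 - 1.0) / 2.0


def ln_beta(variance, homozygosity):
    """Imbalance index: ln(theta_V) - ln(theta_P)."""
    return (math.log(theta_from_variance(variance))
            - math.log(theta_from_homozygosity(homozygosity)))
```

With N = 100 and μ = 2 × 10^-3 the expected heterozygosity is about 0.38, and plugging in the expected variance (0.4) and homozygosity gives ln β = 0. A variance larger than expected for the observed homozygosity makes ln β positive.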
The source code for the simulation program was written in C; pseudo-random numbers were generated using procedures from Numerical Recipes in C (PRESS et al. 1988).
Fig. 1. Ratio of nonsynonymous to synonymous substitution rates (Ka/Ks) at population sizes of (a) N = 100 and (b) N = 400 individuals. Each combination of h and s is shown with its own symbol: (h, s) = (0.0, 0.1), (0.1, 0.01), (0.1, 0.05), (0.1, 0.1), (0.1, 0.2) and (0.5, 0.1).
RESULTS AND DISCUSSION
Comparisons of molecular variation between
regions with different recombination rates reflect the
effect of linked deleterious mutations, strengthened
in small populations by increased dominance and
weak selection.
Fig. 2. Ratio of the number of segregating to fixed neutral sites (Qs) with background selection at linked loci, divided by the corresponding ratio with no selection (Qo), for population sizes of (a) N = 100 and (b) N = 400 individuals. The same notation is used for the different combinations of h and s as in Fig. 1.
Interspecific comparisons
The prediction of background selection theory that the rate of substitution of deleterious alleles (Ka) is accelerated as the efficacy of selection is reduced with tighter linkage was confirmed. A clear pattern is observed in Fig. 1. Under parametric conditions which resulted in reduced heterozygosity in the study by PÁLSSON and PAMILO (1999) the ratio Ka/Ks is almost unaffected by recombination rate; background selection affects both substitution rates equally. At tighter linkage (conditions which result in increased heterozygosity), especially at $r = 10^{-5}$, the ratio increases substantially, mainly due to an increased number of segregating deleterious mutations. The substitution rates for the selected and neutral sites become more similar with increased linkage; a larger observed value for the former (as shown by Ka/Ks > 1) results from a tenfold higher
mutation rate at the selected sites than for neutral
sites. The ratio of Ka/Ks /1 for all recombination
rates when there is no effect of associative overdominance (h = 0.5). A ratio of Ka/Ks > 1 is generally interpreted as evidence for positive selection (NEI and KUMAR 2000 and a recent summary by FORD 2002), possibly as a result of overdominance plus short-term or frequency-dependent selection (OHTA 1998).
The dependency of the ratio Ka/Ks on the product Nhs is also observed. A clear and significant negative relationship was found at all recombination rates between the product hs and the ratio for a given population size, with Pearson correlation coefficients ranging from -0.632 to -0.994 and P-values ranging from 0.0274 to $5.7 \times 10^{-7}$.
Larger effects of the deleterious mutations were clearly seen in the middle of the chromosomes when linkage was tight ($r < 10^{-5}$), where in all cases studied the average ratio Ka/Ks was larger at the center than at the edge of the chromosome (by 7% to more than twofold). With little or no linkage ($r = 10^{-3}$ and $10^{-4}$) the ratios were in most cases similar.
Inter- and intraspecific comparisons
The estimates of Ka and Ks confound numbers of segregating loci and numbers of fixations, which are studied e.g. by the McDonald-Kreitman test. Recombination rate affects the comparison of the numbers of segregating mutations and fixations (Qs = S/F) (Fig. 2a and 2b), where S and F are the observed values with background selection, and Qs is divided by the corresponding ratio with no background selection (Qo). The ratio Qs/Qo is on average generally lower than 1 when $r = 10^{-4}$ and is lower when the population size is larger: background selection reduces the effective population size, and the number of segregating neutral alleles is reduced. For the larger population size studied (N = 400, Fig. 2b), a ratio above 1 is only found at tighter linkage. The ratio is largest when associative overdominance is strong, at $r = 10^{-5}$ and r = 0 (data not shown), when the number of segregating mutations increases as they are blocked within heterozygous segments favoured by selection, and as a consequence the number of fixations is reduced.
Fig. 3. The ratio of the number of segregating sites with and without background selection (Ss/So) (a) and (d), corresponding ratios for nucleotide diversity (πs/πo) (b) and (e), and Tajima’s D (c) and (f) for populations of (a-c) N = 100 and (d-f) N = 400 individuals. The same notation is used for the different combinations of h and s as in Fig. 1.
Intraspecific comparisons
Background selection affects the number of segregating sites (Sn) less than nucleotide diversity (π), as described by CHARLESWORTH et al. (1993). However, its effects differ between $r = 10^{-4}$ and $r = 10^{-5}$, as presented in Fig. 3. The standard error of the mean (SE) ranged from 0.0120 to 0.3258, the larger values being found only at tight linkage ($r = 10^{-5}$). A larger reduction (compared with $r = 10^{-3}$) is found in π (Fig. 3b and 3e) at $r = 10^{-4}$ than is observed for Sn (Fig. 3a and 3d), resulting in Tajima’s D < 0, especially when N = 400 (Fig. 3c and 3f). At tight linkage ($r = 10^{-5}$) π increases more than Sn and Tajima’s D is larger than 0, except when the intensity of selection is strong (large Nhs-values).
Similar patterns are observed for the estimates of variation obtained at the microsatellite loci (Fig. 4). Figure 4 presents the observed means; standard errors ranged from 0.0068 to 0.7716, the larger values being found only at tight linkage ($r = 10^{-5}$). The ranking of the mean values larger than the neutral expectation at $r = 10^{-5}$ partly reflects the number of generations each simulation lasted, as an increased number of segregating loci exceeded the capacity of the program.
Variance in allele size is more affected than the heterozygosity when the recombination rate is altered. A larger effect of selection at linked loci on the variance has also been described by SLATKIN (1995). The imbalance index (ln β) of KIMMEL et al. (1998) can be less than 0 at $r = 10^{-4}$ (especially at the larger population size) and is above 0 at $r = 10^{-5}$ when N = 100 (Fig. 4c and 4f).
Fig. 4. Diversity in tandem repeat markers. The ratio of variances with and without background selection (Vs/Vo) (a) and (d), corresponding ratios for the estimate of θ based on homozygosity (Ps/Po) (b) and (e), and the imbalance index ln β (c) and (f) for populations of (a-c) N = 100 and (d-f) N = 400 individuals. The same notation is used for the different combinations of h and s as in Fig. 1.
The heterozygosity-increasing effect of linked deleterious mutations is mainly apparent under tight linkage. Constraints on sexual mating can result in increased linkage disequilibrium and favour such an effect, e.g. in subdivided populations (PAMILO et al. 1999) or in species with cyclical parthenogenesis (PÁLSSON 2001). Although the population sizes studied here are small, this could still be of concern in larger populations, as effective population sizes are often only 10% of the actual size (FRANKHAM 1996), or at nonrecombining chromosomal regions in larger populations, as in crested newts where heteromorphism for chromosome 1 is a requirement for normal development (MACGREGOR and HORNER 1980). Functional overdominance or any other conditions resulting in heterozygote advantage would further add to the associative overdominance effect. The extent of this effect can in addition be affected by selection on a modifier of recombination rate, which may generally favour increased recombination (PÁLSSON 2002).
The effects of linkage may be better observed in microsatellite loci by studying the variance in allele size than by studying their heterozygosity. A comparison of the two with the method of KIMMEL et al. (1998) may reveal effects of selection at linked loci, although deviations, as observed with the Tajima test, may also be due to fluctuations in population size or mixing of different allelic classes. A careless use of the test may therefore lead to erroneous conclusions, and a study of several independent loci might be needed to distinguish between the selective effects and the population history.
Microsatellites may be more sensitive to linked deleterious mutations due to their high mutation rate, which contributes to increased linkage disequilibrium. Genetic distances based on variance in allele size [such as Fst (ROUSSET 1996) and $(\delta\mu)^2$ (GOLDSTEIN and POLLOCK 1997)] may be more affected by selection on linked loci than genetic distances based on heterozygosity. A lower correspondence between genetic and geographic distances for the former has been found in bears (PAETKAU et al. 1997), and in Daphnia, where variance in allele size was larger than one could expect based on the heterozygosity (PÁLSSON 2000).
Acknowledgements - I want to thank Pekka Pamilo for
comments on the early stages of this work. The work has
been supported by the Icelandic Research Council.
REFERENCES
Barton, N. H. 1995. Linkage and the limits to natural selection. - Genetics 140: 821-884.
Charlesworth, B. 1996. Background selection and patterns of genetic diversity in Drosophila melanogaster. - Genet. Res. 68: 131-149.
Charlesworth, B., Morgan, M. T. and Charlesworth, D. 1993. The effect of deleterious mutations on neutral molecular variation. - Genetics 134: 1289-1303.
Comeron, J. M. and Kreitman, M. 2002. Population, evolutionary and genomic consequences of interference selection. - Genetics 161: 389-410.
Comeron, J. M., Kreitman, M. and Aguadé, M. 1999. Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. - Genetics 151: 329-349.
Ford, M. J. 2002. Applications of selective neutrality tests to molecular ecology. - Mol. Ecol. 11: 1245-1262.
Frankham, R. 1996. Effective population size/adult population size ratios in wildlife: a review. - Genet. Res. 66: 95-107.
Frydenberg, O. 1963. Population studies of a lethal mutant in Drosophila melanogaster. I. Behaviour in populations with discrete generations. - Hereditas 48: 89-116.
Goldstein, D. B. and Pollock, D. D. 1997. Launching microsatellites: a review of mutation processes and methods of phylogenetic inferences. - J. Heredity 88: 335-342.
Haldane, J. B. S. 1919. The combination of linkage values, and the calculation of distance between the loci of linked factors. - J. Genet. 8: 299-309.
Hill, W. G. and Robertson, A. 1966. The effect of linkage on limits to artificial selection. - Genet. Res. 8: 269-294.
Hudson, R. R. and Kaplan, N. L. 1995. Deleterious background selection with recombination. - Genetics 141: 1605-1617.
Hughes, A. L. and Nei, M. 1988. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. - Nature 335: 167-170.
Keightley, P. T. 1994. The distribution of mutation effects on viability in Drosophila melanogaster. - Genetics 138: 1315-1322.
Kim, Y. and Stephan, W. 2000. Joint effects of genetic hitchhiking and background selection on neutral variation. - Genetics 160: 765-777.
Kimmel, M., Chakraborty, R., King, J. P. et al. 1998. Signatures of populations expansion in microsatellite repeat data. - Genetics 148: 1921-1930.
Kimura, M. and Ohta, T. 1978. Stepwise mutation model and distribution of allelic frequencies in a finite population. - Proc. Natl Acad. Sci. USA 75: 2868-2872.
Li, W.-H. 1997. Molecular evolution. - Sinauer Ass.
Macgregor, H. C. and Horner, H. A. 1980. Heteromorphism for chromosome 1 is a requirement for a normal development in crested newts. - Chromosoma 76: 111-122.
Maynard Smith, J. and Haigh, J. 1974. The hitchhiking effect of a favourable gene. - Genet. Res. 23: 23-35.
McDonald, J. H. and Kreitman, M. 1991. Adaptive protein evolution at Adh locus in Drosophila. - Nature 351: 652-654.
McVean, G. A. T. and Charlesworth, B. 2000. The effects of Hill-Robertson interference between weakly selected mutations on patterns of molecular evolution and variation. - Genetics 155: 929-944.
Nei, M. and Kumar, S. 2000. Molecular evolution and phylogenetics. - Oxford Univ. Press.
Nordborg, M., Charlesworth, B. and Charlesworth, D. 1996. The effect of recombination on background selection. - Genet. Res. 67: 159-174.
Ohta, T. 1995. Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. - J. Mol. Evol. 40: 56-63.
Ohta, T. 1998. On the pattern of polymorphisms at the major histocompatibility complex loci. - J. Mol. Evol. 46: 633-638.
Paetkau, D., Waits, L. P., Clarkson, P. L. et al. 1997. An empirical evaluation of genetic distance statistics using microsatellite data from bear. - Genetics 147: 1943-1957.
Pálsson, S. 2000. Microsatellite variation in Daphnia pulex from both sides of the Baltic Sea. - Mol. Ecol. 9: 1075-1088.
Pálsson, S. 2001. The effects of deleterious mutations in cyclically parthenogenetic organisms. - J. Theor. Biol. 208: 201-214.
Pálsson, S. 2002. Selection on a modifier of recombination rate due to linked deleterious mutations. - J. Heredity 93: 22-26.
Pálsson, S. and Pamilo, P. 1999. The effects of deleterious mutations on linked, neutral variation in small populations. - Genetics 153: 475-483.
Pamilo, P. and O’Neill, R. J. W. 1997. Evolution of the Sry genes. - Mol. Biol. Evol. 14: 49-55.
Pamilo, P. and Pálsson, S. 1998. Associative overdominance, heterozygosity and fitness. - Heredity 81: 381-389.
Pamilo, P., Pálsson, S. and Savolainen, O. 1999. Deleterious mutations can reduce differentiation in small, subdivided populations. - Hereditas 130: 257-264.
Press, W. H., Flannery, B. P., Teukolsky, S. A. et al. 1988. Numerical recipes in C. The art of scientific computing. - Cambridge Univ. Press.
Rousset, F. 1996. Equilibrium values of measures of population subdivision for stepwise mutation process. - Genetics 142: 1357-1362.
Slatkin, M. 1995. Hitchhiking and associative overdominance at a microsatellite locus. - Mol. Biol. Evol. 12: 473-480.
Stephan, W., Charlesworth, B. and McVean, G. A. T. 1999. The effect of background selection at a single locus on weakly selected, linked variants. - Genet. Res. 73: 133-146.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis. - Genetics 123: 597-601.