Neutrality tests of conservative-radical amino acid

Gene 291 (2000) 115–125
www.elsevier.com/locate/gene
Neutrality tests of conservative-radical amino acid changes in nuclear- and
mitochondrially-encoded proteins
David M. Rand*, Daniel M. Weinreich, Brent O. Cezairliyan
Department of Ecology and Evolutionary Biology, Box G-W, 69 Brown Street, Brown University, Providence, RI 02912, USA
Received 8 June 2000; received in revised form 26 September 2000; accepted 5 October 2000
Received by G. Bernardi
Abstract
The neutralist-selectionist debate should not be viewed as a dichotomy but as a continuum. While the strictly neutral model suggests a
neutralist-selectionist dichotomy, the nearly neutral model is a continuous model spanning strict neutrality through weak selection (Ns < 1)
to deterministic selection (Ns > 3). We illustrate these points with polymorphism and divergence data from a sample of 73 genes (31
mitochondrial, 36 nuclear genes from Drosophila, and six Arabidopsis data sets). In an earlier study we used the McDonald–Kreitman (MK)
test to show that amino acid replacement polymorphisms in animal mitochondrial genes and Arabidopsis genes show a consistent trend toward
negative selection, whereas nuclear genes from Drosophila span a range from negative selection, through neutrality, to positive selection.
Here we analyze a subset of these genes (13 Drosophila nuclear, ten mitochondrial, and six Arabidopsis nuclear) for polymorphism and
divergence of conservative and radical amino acid replacements (a protein-based conservative-radical MK, or pMK, test). The distinct
patterns of selection between the different genomes are not apparent with the pMK test. Different definitions of conservative and radical (based
on amino acid polarity, volume or charge) give inconsistent results across genes. We suggest that segregating fitness differences between silent
and replacement mutations are more visible to selection than are segregating fitness differences between conservative and radical amino acid
mutations. New data on the variation among genes with different opportunities for positive and negative selection are as important to the
continuum view of the neutralist-selectionist debate as is the distribution of selection coefficients within individual genes. © 2000 Elsevier
Science B.V. All rights reserved.
Keywords: Neutrality test; Mildly deleterious single nucleotide polymorphism; Neutral theory; Natural selection; Molecular evolution; Drosophila; mtDNA;
McDonald–Kreitman test
1. Introduction
The controversial statement that “…the great majority of evolutionary changes at the molecular level…are caused not
by Darwinian selection but by random drift of selectively neutral or nearly neutral mutants” (Kimura, 1983, pg. xi) has
been the focal point of the long-running neutralist-selectionist debate. While some evolutionists have taken this view as
a threat to the foundation of the Modern Synthesis, Kimura clearly qualifies his statement in the next sentence by
clarifying that he does not deny the role of natural selection in adaptive evolution. The issue of whether mutations are
neutral OR selected seems to have greatly overshadowed two crucial words in Kimura's statement: `great majority'.
Just how much is a `great majority'? In an election, 75% of the votes would be considered a landslide, but Kimura
would probably not agree that as many as 25% of substitutions are not neutral. Indeed, he goes on to say that “…only
a minute fraction of changes at the DNA level are adaptive in nature…” (Kimura, 1983, pg. xi). Here we argue that the
neutralist-selectionist debate is over because it is not a qualitative, dichotomous problem. Rather, if the debate is to
continue it should focus on quantifying the relative sizes of the `great majority' of neutral substitutions and the
`minute fraction' of adaptive DNA changes.
Abbreviations: N.I., Neutrality Index; pN.I., protein Neutrality Index; cN.I., Codon Neutrality Index; MK test,
McDonald–Kreitman test; pMK test, protein McDonald–Kreitman test; cMK, codon McDonald–Kreitman test; mtDNA,
mitochondrial DNA
* Corresponding author. Tel.: +1-401-863-2890; fax: +1-401-863-2166.
E-mail addresses: [email protected] (D.M. Rand); [email protected] (D.M. Weinreich).
In recent years the neutral theory has been subjected to a
variety of tests using the growing database of DNA
sequences that have become available (Dispersion index,
HKA test, Tajima and Fu and Li tests, McDonald–Kreitman
tests; see Kreitman and Akashi, 1995). For example, the
dispersion index, or the ratio of the variance to the mean
0378-1119/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0378-1119(00)00483-2
number of substitutions between species is generally greater
than the neutral expectation of 1.0 (Gillespie, 1989; 1995;
Ohta, 1996). Among a sample of nuclear genes in Drosophila, about half of them showed departures from the strict
neutral models (Moriyama and Powell, 1996). For mitochondrial genes, about half of published data sets also depart
from neutral expectations, generally in the direction of
negative selection (Nachman, 1998; Rand and Kann,
1998). Even silent sites have been shown to deviate from
neutral evolution (Akashi, 1995; 1996). On balance then, a
`sizeable proportion' of the tests from recent data actually
reject strict neutral assumptions. Does this mean that the
neutral theory is wrong, or that we just need more insightful
tests of departures from strict neutrality? (e.g. Kreitman,
1996; Ohta, 1996). Tests that focus on polymorphic
sequences sampled from natural populations (Tajima,
1989; Fu and Li, 1993) can `reject' neutrality if the sample
is, in fact, not a truly random one. But as argued above, the
issue of neutrality vs. non-neutrality will become secondary
to studies that allow one to characterize the distribution of
selection coefficients by placing an individual data set
somewhere on the continuum from strong purifying selection through neutrality to strong positive selection.
Studies of polymorphism and divergence in DNA sequences allow one to translate empirical data into selection
coefficients. Following the predictions of Kimura (1983, pg. 44–45) and Sawyer and Hartl (1992), Akashi
(1995) pointed out that the ratio of polymorphism to divergence (rpd) scales monotonically with the effective
selection coefficient, Ns. High values of rpd are indicative of negative selection and low values indicate
positive selection. One can extend this to the McDonald–Kreitman (MK) test and express this 2 × 2 table as a
ratio of ratios: rpd-replacement/rpd-silent (referred to as the Neutrality Index; Rand and Kann, 1996). The
intention of a neutrality index is to provide a measure of the direction and magnitude of a gene's departure
from neutral expectation (Rand and Kann, 1996). N.I. scales monotonically with selection: N.I. < 1 indicates an
excess of amino acid fixations, or positive selection; N.I. > 1 indicates an excess of amino acid polymorphisms,
or negative selection (Rand and Kann, 1996; Nachman, 1998; Weinreich and Rand, 2000). One assumption that
follows from Kimura's (1983) analyses is that one class of nucleotide sites is strictly neutral (e.g. silent
sites; but see Akashi, 1995; 1996). Empirically, it becomes relatively straightforward to use MK tests and N.I.
values to place genes on the spectrum from negative to positive selection. Because the MK test focuses on the
ratios of counts of segregating and fixed sites, it should be less sensitive to non-equilibrium conditions than
other neutrality tests such as the Tajima test or the HKA test (cf. McDonald and Kreitman, 1991; Akashi, 1999).
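The ratio-of-ratios definition can be illustrated with a minimal sketch. The function names and the example counts below are assumptions introduced for illustration only; they are not taken from the data sets analyzed here, and all four counts are assumed to be non-zero.

```python
def neutrality_index(poly_repl, fixed_repl, poly_silent, fixed_silent):
    """Return N.I. = (rpd for replacement sites) / (rpd for silent sites).

    rpd is the ratio of polymorphic to fixed differences in one class.
    All counts are assumed to be positive (no zero cells).
    """
    rpd_replacement = poly_repl / fixed_repl
    rpd_silent = poly_silent / fixed_silent
    return rpd_replacement / rpd_silent


def direction(ni):
    """Interpret an N.I. value following the convention in the text."""
    if ni < 1:
        return "positive selection (excess amino acid fixations)"
    if ni > 1:
        return "negative selection (excess amino acid polymorphism)"
    return "neutral expectation"


# Hypothetical counts, chosen only to show the arithmetic.
ni = neutrality_index(poly_repl=2, fixed_repl=7, poly_silent=42, fixed_silent=17)
print(round(ni, 3), direction(ni))
```

Because N.I. is a ratio of ratios, swapping the roles of the two site classes inverts the index, so the class treated as neutral (here, silent sites) must be stated explicitly.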
In an earlier paper we showed that patterns of molecular evolution were significantly different for genes
encoded in nuclear vs. mitochondrial genomes (Weinreich and Rand, 2000). Specifically, mitochondrial genes show
a clear trend toward excess amino acid polymorphism (N.I. > 1 for 25/31 data sets) while nuclear genes show a
roughly normal distribution centered around neutrality (mean N.I. = 1.2; 15/36 data sets with N.I. > 1). We
suggested that the low recombination environment of mtDNA hinders the fixation of advantageous mutations as
they arise on haplotypes carrying accumulated deleterious mutations. In support of this argument is the
observation that five out of six genes from Arabidopsis thaliana also show N.I. > 1; this plant is known to be
highly selfing, resulting in low effective recombination (but see Kuittinen and Aguadé (2000)).
In principle, any two or more functionally distinct classes of DNA changes can be subjected to the
McDonald–Kreitman test format. These lead logically to a variety of possible ratios of ratios, or neutrality
index (N.I.) values, that are effective for measuring selection (Sawyer and Hartl, 1992; Akashi, 1995; Nachman,
1998; Weinreich and Rand, 2000). Here we extend these studies by performing conservative-radical
McDonald–Kreitman tests (CRMK tests) on a subset of the genes analyzed in Weinreich and Rand (2000). Only amino
acid sequence data are considered, and amino acid changes are classified as conservative or radical depending
on charge, volume or polarity (Zhang, 2000). Our intention is to examine the relationship between the
distribution of selection coefficients within versus between individual genes. By contrasting the distributions
of neutrality index values based on protein sequences (pN.I. values) with neutrality index values based on
silent and replacement changes in codons (cN.I. values), we are asking the question: Do polymorphism and
divergence in protein sequences reveal the same patterns of genome-specific non-neutral evolution as for gene
sequences at the DNA level? From these analyses we seek to assess the relative strengths and directions of
selection acting on nucleotide changes with varying levels of functional constraint. A similar approach has
proven effective in analyses of specific genes. In a study of MHC variation, Hughes et al. (1990) showed that
non-synonymous changes exceeded synonymous changes in the binding cleft of the molecule, suggesting
overdominant selection. They went on to show that amino acid changes altering side-chain charge occurred more
frequently than by chance alone, further suggesting that selection was acting to promote a diversity of charge
profiles among alleles at MHC (Hughes et al., 1990). These types of comparisons seek to distinguish between the
phenotypic effects of mutations that alter codons in a messenger RNA, from mutations that alter amino acids in
a protein.
2. Materials and methods
2.1. Data sets
The genes selected for this study are a subset of those analyzed in Weinreich and Rand (2000), which consisted
of Drosophila nuclear genes, animal mitochondrial genes, and Arabidopsis thaliana nuclear genes. We restricted
our analyses to only those data sets from Weinreich and Rand (2000) for which there was an appreciable number
(>12) of amino acid replacement differences (fixed and/or polymorphic) so that the 2 × 2 tests would have
reasonable power. Thirteen Drosophila nuclear genes were studied including four carboxylesterases (Est-5A,
Est-5B, Est-5C, Est-6), two accessory gland proteins (Acp26Aa, Acp29AB), an acid phosphatase (Acph-1), the
period gene (per) involved in circadian rhythms, a gene involved in conferring viral resistance (ref(2)p),
three genes of unknown function (Anon1A3, Anon1E9, Anon1G5), and relish, an NF-κB/IκB protein. Ten animal
mitochondrial data sets were studied, consisting primarily of the cytochrome b gene (cyt b) for mammals
(Microtus, Ursus, Isothrix, Sciurus), birds (Grus, Melospiza/Passerella), reptiles (Emoia), and amphibians
(Ambystoma), as well as the ND5 locus in Drosophila. Six Arabidopsis thaliana genes were studied including
alcohol dehydrogenase (Adh), the homeotic genes APETALA3 (AP3) and PISTILLATA (PI), acidic endochitinase
(ChiA), and basic endochitinase (ChiB), plus Chalcone Isomerase (Kuittinen and Aguadé, 2000, and references
therein).
2.2. Conservative-radical McDonald–Kreitman test
Three modifications of the McDonald–Kreitman (MK) test were performed for each data set according to groups
of amino acids distinguished by polarity, both polarity and volume, and charge (Zhang, 2000). We refer to these
as protein-based conservative-radical MK (CRMK) tests, or more generally as pMK tests. Amino acid changes
involving amino acids within a category (see below) are defined as conservative, while changes involving amino
acids from different categories are defined as radical. Classification by polarity was according to the
following categories: Polar (R, N, D, C, Q, E, G, H, K, S, T, Y); Non-polar (A, I, L, M, F, P, W, V).
Classification by polarity and volume was according to the following categories: Special (C); Neutral and small
(A, G, P, S, T); Polar and relatively small (N, D, Q, E); Polar and relatively large (R, H, K); Non-polar and
relatively small (I, L, M, V); Non-polar and relatively large (F, W, Y). Classification by charge was according
to the following categories: Positive (R, H, K); Negative (D, E); Neutral (A, N, C, Q, G, I, L, M, F, P, S, T,
W, Y, V) (Zhang, 2000). Note that the three different ways of defining `conservative' and `radical' differ in
the number of opportunities for variability within vs. between groups.
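The three classification schemes can be written down directly as lookup tables. The following is a minimal sketch, not the QuickBASIC program used in this study; the names SCHEMES and classify_change are hypothetical, and only the category memberships are taken from the text above.

```python
# Category memberships as listed in the text (after Zhang, 2000).
POLARITY = {
    "polar": "RNDCQEGHKSTY",
    "non-polar": "AILMFPWV",
}
POLARITY_VOLUME = {
    "special": "C",
    "neutral and small": "AGPST",
    "polar and relatively small": "NDQE",
    "polar and relatively large": "RHK",
    "non-polar and relatively small": "ILMV",
    "non-polar and relatively large": "FWY",
}
CHARGE = {
    "positive": "RHK",
    "negative": "DE",
    "neutral": "ANCQGILMFPSTWYV",
}


def _lookup(groups):
    """Invert {category: residues} into {residue: category}."""
    return {aa: name for name, members in groups.items() for aa in members}


SCHEMES = {
    "polarity": _lookup(POLARITY),
    "polarity_volume": _lookup(POLARITY_VOLUME),
    "charge": _lookup(CHARGE),
}


def classify_change(aa1, aa2, scheme):
    """Return 'conservative' if both residues share a category, else 'radical'."""
    table = SCHEMES[scheme]
    return "conservative" if table[aa1] == table[aa2] else "radical"


# The same replacement can be classified differently by different schemes.
for scheme in SCHEMES:
    print(scheme, classify_change("A", "S", scheme))
```

An alanine-serine replacement, for example, is radical under the polarity scheme but conservative under the polarity-and-volume and charge schemes, which illustrates why the three definitions need not agree for a given gene.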
Protein sequences taken from GenBank were aligned using a web interface to ClustalW at www.ibc.wustl.edu/
clustal.html. Alignment files were converted manually into MEGA (Kumar et al., 1993) format, and MEGA was used
to identify and export data for variable amino acid sites. These data were read by a program written in
QuickBASIC 4.5 by B.O.C. that performed the conservative/radical McDonald–Kreitman tests (pMK tests).
Among the aligned sequences of alleles from two species, the program finds amino acid positions that have
exactly two different amino acids, and these sites are classified as conservative or radical according to the
categories defined above. If there is no variation among the sequences within either of the two species, the
site is labeled `fixed'; otherwise it is labeled `polymorphic'. If a site has more than two different amino
acids, the program queries the user for input. These sites were classified systematically by hand. In the
presence of multiple amino acids at a site within any of the species, each unique amino acid was labeled as a
polymorphism and either conservative or radical depending on its comparison to the most common amino acid at
that site. The detection of sites with multiple polymorphisms is desirable to account for multiple mutations
within codons. Because the species compared were not greatly diverged, these cases represented only a small
fraction of the data so no correction for multiple substitutions was attempted (cf. Maynard Smith, 1994). After
the test has been performed on all sites, the program totals the number of fixed radical (FR), fixed
conservative (FC), polymorphic radical (PR), and polymorphic conservative (PC) differences. Contingency tests
for the pMK tests and other 2 × 2 tests were done using G-tests. Some variation in results was obtained when
different outgroups were used in the pMK tests. This was most notable in the Arabidopsis data, so an effort was
made to use data sets employing Arabis lyrata as the outgroup.
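The site-counting step described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the original QuickBASIC program: the function name count_pmk_sites and the toy sequences are hypothetical, the alignment is assumed to be gap-free, only the charge scheme is shown, and sites with more than two amino acids are merely reported for the manual classification described in the text.

```python
# Charge categories from Section 2.2, inverted to {residue: category}.
CHARGE = {
    aa: name
    for name, members in {
        "positive": "RHK",
        "negative": "DE",
        "neutral": "ANCQGILMFPSTWYV",
    }.items()
    for aa in members
}


def count_pmk_sites(species1, species2, category):
    """Tally FR, FC, PR and PC over aligned protein sequences.

    species1, species2: lists of equal-length sequences (alleles).
    category: mapping from residue to its conservative/radical category.
    Returns (counts, unresolved), where unresolved lists positions with
    more than two amino acids that need classification by hand.
    """
    counts = {"FR": 0, "FC": 0, "PR": 0, "PC": 0}
    unresolved = []
    n1 = len(species1)
    for pos, column in enumerate(zip(*(species1 + species2))):
        residues = set(column)
        if len(residues) < 2:
            continue  # invariant site
        if len(residues) > 2:
            unresolved.append(pos)
            continue
        a, b = residues
        kind = "C" if category[a] == category[b] else "R"
        # Fixed: no variation within either species; otherwise polymorphic.
        monomorphic = len(set(column[:n1])) == 1 and len(set(column[n1:])) == 1
        status = "F" if monomorphic else "P"
        counts[status + kind] += 1
    return counts, unresolved


# Toy alignment (hypothetical sequences, for illustration only).
sp1 = ["MKDAG", "MKDAS", "MRDAG"]
sp2 = ["MKEDT", "MKEDT"]
print(count_pmk_sites(sp1, sp2, CHARGE))
```

In this toy alignment the K/R site is polymorphic and conservative, the D/E site is fixed and conservative, the A/D site is fixed and radical, and the last site carries three amino acids and is left for manual classification.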
2.3. Neutrality indexes
To describe the magnitude and direction of departures from neutrality, a conservative-radical neutrality index,
or a `protein' N.I., was determined for each gene, defined as pN.I. = (PR/FR)/(PC/FC). Here, PR = polymorphic
radical, FR = fixed radical, PC = polymorphic conservative, and FC = fixed conservative. In the case that
either FR or PC equals zero, they were replaced with 1 so that N.I. was not undefined (Rand and Kann, 1996;
Weinreich and Rand, 2000). For ease of nomenclature, we will refer to the N.I. value from a traditional
silent-replacement McDonald–Kreitman test as a `codon' N.I., or cN.I., and contrast this with the protein
N.I.'s (pN.I.'s) based on the conservative-radical comparison defined above. These N.I. terms are distinguished
from their respective McDonald–Kreitman tests (cN.I. and cMK test; pN.I. and pMK test) on the grounds that N.I.
values seek to pin a number on the mode of selection whereas MK tests provide a statistical statement of the
departure from neutral expectations. Hence, a cN.I. may indicate positive or negative selection, and the
respective cMK test can determine if the departure is significant.
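The index and its accompanying contingency test can be sketched together. The code below is a hedged illustration rather than the procedure actually run for this study: the function names are hypothetical, the G statistic is computed without any continuity or Williams correction (the text does not say whether one was applied), and replacing a zero FC with 1 is an assumption added here so that PC/FC is always defined (the text mentions only FR and PC).

```python
import math


def protein_neutrality_index(FR, FC, PR, PC):
    """pN.I. = (PR/FR)/(PC/FC), with zero denominators replaced by 1.

    The text states that a zero FR or PC is replaced with 1; treating a
    zero FC the same way is an assumption made for this sketch.
    """
    FR = FR or 1
    PC = PC or 1
    FC = FC or 1
    return (PR / FR) / (PC / FC)


def g_test_2x2(a, b, c, d):
    """G-test of independence for the table [[a, b], [c, d]].

    Returns (G, P) with P from the chi-square distribution with 1 d.f.
    No continuity correction is applied (an assumption of this sketch).
    """
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    g = 0.0
    for observed, row_total, col_total in cells:
        if observed > 0:  # a zero cell contributes nothing to G
            g += observed * math.log(observed * n / (row_total * col_total))
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error
    # For 1 d.f., the chi-square tail probability equals erfc(sqrt(G/2)).
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


# Example with counts FR=24, FC=42, PR=3, PC=6.
print(round(protein_neutrality_index(24, 42, 3, 6), 3))
print(g_test_2x2(24, 42, 3, 6))
```

The index gives the direction and size of the departure, while the G-test on the same four counts indicates whether that departure is statistically distinguishable from the neutral expectation.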
2.4. Amino acid composition
The amino acid compositions of the proteins under study were calculated for each gene studied using MEGA. A
random allele was chosen from each gene, and mean values for all genes in each genome in the data sets used
(Drosophila nuclear, mitochondrial, and Arabidopsis nuclear) were tabulated. For presentation these mean values
for each amino acid were pooled into a non-polar group (A, F, I, L, M, P, V, W), a polar group (C, G, N, Q, S,
T, Y), a negatively charged group (D, E) and a positively charged group (H, K, R).
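The pooling of amino acid frequencies into four groups can be sketched as below. The compositions in this study were calculated with MEGA; this function, its name and the example sequence are hypothetical and serve only to make the grouping explicit.

```python
# Pooled groups as defined in the text.
GROUPS = {
    "non-polar": "AFILMPVW",
    "polar": "CGNQSTY",
    "negative": "DE",
    "positive": "HKR",
}
_GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def group_composition(protein):
    """Return the fraction of residues in each pooled group.

    Characters that are not one of the 20 standard amino acids
    (e.g. gaps or ambiguity codes) are ignored.
    """
    counts = {name: 0 for name in GROUPS}
    total = 0
    for aa in protein.upper():
        if aa in _GROUP_OF:
            counts[_GROUP_OF[aa]] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {name: count / total for name, count in counts.items()}


# Hypothetical four-residue peptide.
print(group_composition("MKDA"))
```

Averaging these fractions over one randomly chosen allele per gene gives genome-level mean values of the kind described above.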
3. Results
For initial comparison, the results from our earlier study (Weinreich and Rand, 2000) are summarized in Fig. 1.
Fig. 1A shows the distribution of N.I. values for 36 Drosophila nuclear data sets and 31 animal mtDNA data
sets. The difference between these distributions is significant (G = 18.56, d.f. = 6, P = 0.005). In Fig. 1B we
show the distribution of N.I. values for MK tests that significantly reject the null hypothesis. Splitting the
tests into those with N.I. values less than, and greater than, 1.0, the data sets from the two genomes are also
significantly different (G = 12.37, d.f. = 1, P = 0.0004; Weinreich and Rand, 2000). Drosophila nuclear genes
tend to depart from neutrality in the direction of excess amino acid fixations, while mitochondrial genes tend
to depart from neutrality in the direction of excess amino acid polymorphism.
3.1. Drosophila nuclear genes
The protein-based conservative/radical McDonald–Kreitman (or pMK) test results for Drosophila nuclear genes are
presented in Table 1. In Fig. 2A the numbers of genes with neutrality index less than one and greater than one
are tabulated for each type of conservative/radical test, as well as for the traditional codon-based
silent/replacement McDonald–Kreitman (cMK) test from Weinreich and Rand (2000). Two of the cMK tests were
significant below the 0.01 level (per and relish) and Est-5B was marginally significant (P = 0.053). All three
of these cMK tests depart from neutrality in the direction of adaptive fixation of amino acid changes (data
from Weinreich and Rand, 2000). For the pMK tests in Table 1, three were significant at the 5% level by G-tests
in the direction of excess radical fixations (pN.I. < 1) and two were significant in the direction of an excess
of radical polymorphism (pN.I. > 1). The 13 tests show a slight bias towards pN.I. < 1 with the polarity tests
and the polarity + volume test giving reciprocal results. But no partitioning of the data by pN.I. < 1 vs.
pN.I. > 1 is significant when comparing the three different versions of pN.I. (polarity, polarity + volume, or
charge), or when comparing pN.I. values with cN.I. values.
3.2. Animal mitochondrial genes
Table 2 and Fig. 2B present the results for animal mitochondrial DNA. Compared to the cMK tests which show a
Fig. 1. Distribution of neutrality index values for Drosophila nuclear, animal mitochondrial, and Arabidopsis
nuclear genes. (A) Distribution of all genes examined. The animal mtDNA and Arabidopsis distributions are not
significantly different, but the Drosophila nuclear distribution is significantly different from both the
animal mtDNA and Arabidopsis nuclear gene distributions (P < 0.005). (B) Distribution of neutrality index
values for those genes showing a significant departure from neutral expectations by the McDonald–Kreitman test.
The same patterns of significant differences between genomes observed in (A) hold for this restricted set of
non-neutral genes. Data from Weinreich and Rand (2000).
skew towards cN.I. values > 1, the pMK tests show a more
even split between pN.I. values greater than one and less
than one. The charge-based pN.I. values showed seven
pN.I.'s < 1 and three pN.I.'s > 1. This is significantly
Table 1
Locus, number of codons, species and sample size, mutation class counts, pN.I. and cN.I. values for nuclear-encoded DNA sequence surveys used in this study.
Citations as in Weinreich and Rand (2000)
Locus              Codons  Species a                 AA grouping b  FR c  FC c  PR c  PC c  pN.I. d,e  cN.I. d,e
Mst26Aa (Acp26Aa)  267     10 mel, 1 sim             pol            24    42    3     6     0.88       0.41
                                                     pol&vol        42    24    4     5     0.46
                                                     charge         22    44    3     6     1.00
Acp29AB            220     39 mel, 1 sim             pol            8     20    3     3     2.50       0.37
                                                     pol&vol        19    9     3     3     0.47
                                                     charge         13    15    3     3     1.15
Acph-1             447     53 sub, 1 mad             pol            0     4     10    20    2.00       0.68
                                                     pol&vol        0     4     15    15    4.00
                                                     charge         0     4     8     22    1.45
Anon1A3            310     26 mel, 12 sim, (1 yak)   pol            12    14    8     13    0.72       1.61
                                                     pol&vol        16    10    14    7     1.25
                                                     charge         7     19    8     13    1.67
Anon1E9            595     15 mel, 8 sim, (1 yak)    pol            11    28    12    24    1.27       0.51
                                                     pol&vol        23    16    19    17    0.78
                                                     charge         21    18    12    24    0.43
Anon1G5            261     3 mel, 10 sim, (1 yak)    pol            9     14    7     9     1.21       0.48
                                                     pol&vol        12    11    8     8     0.92
                                                     charge         6     17    6     10    1.70
Est-5A             548     8 pse, 1 per, (1 mir)     pol            1     2     5     14    0.71       0.53
                                                     pol&vol        2     1     10    9     0.56
                                                     charge         2     1     4     15    0.13
Est-5B             545     16 pse, 1 per, (1 mir)    pol            23    13    9     23    0.22       0.48
                                                     pol&vol        29    7     16    16    0.24
                                                     charge         19    17    10    22    0.41
Est-5C             545     8 pse, 1 per, (1 mir)     pol            1     0     7     5     1.40       2.36
                                                     pol&vol        0     1     8     4     2.00
                                                     charge         0     1     4     8     0.50
Est-6              544     30 mel, 3 sim             pol            2     17    8     12    5.67       0.47
                                                     pol&vol        6     13    6     14    0.93
                                                     charge         5     14    4     16    0.70
Per                402     18 wil, 7 equ, (1 yak)    pol            4     12    9     5     5.40       0.28
                                                     pol&vol        7     9     7     7     1.29
                                                     charge         1     15    1     13    1.15
Ref(2)p            599     10 mel, 1 sim             pol            14    20    3     7     0.61       1.15
                                                     pol&vol        16    18    4     6     0.75
                                                     charge         11    23    3     7     0.90
Relish             803     6 mel, 7 sim, (1 yak)     pol            21    60    3     8     1.07       0.12
                                                     pol&vol        50    31    9     2     2.79
                                                     charge         36    45    2     9     0.28
a All genes from Drosophila spp. equ = D. equinoxialis, mad = D. madeirensis, mel = D. melanogaster, per = D. persimilis, pse = D. pseudoobscura,
sim = D. simulans, sub = D. subobscura, wil = D. willistoni.
b Amino acid grouping from Zhang (2000): pol = polarity, pol&vol = polarity and volume.
c Mutation classes: FR = fixed radical, FC = fixed conservative, PR = polymorphic radical, PC = polymorphic conservative.
d pN.I. = Protein neutrality index. See Section 2. cN.I. = Codon neutrality index. Data from Weinreich and Rand (2000). Some minor differences in cN.I.
values between Weinreich and Rand (2000) and those reported here stem from a reanalysis of the DNA sequences deposited in GenBank for the gene in
question.
e Zero replaced with 1 for the purposes of calculating pN.I. N.I. values appearing in bold face correspond to MK tests that are significant at or below the 0.05
level. See Section 2. The Relish data are from Begun and Whitley (2000).
different from the skewed pattern of nine cN.I.'s > 1 and one cN.I. < 1 (P < 0.05). Considering tests that are
significant at the 5% level, four of the ten cMK tests are significant (one with cN.I. < 1, three with
cN.I. > 1), while only three of the 30 pMK tests are significant (all with pN.I. > 1). Thus, although there
appears to be excess `general' amino acid polymorphism relative to divergence for mtDNA (cN.I.'s > 1), this
excess is less pronounced when one considers radical amino acid polymorphism (smaller proportion of pN.I.'s
greater than 1).
3.3. Arabidopsis nuclear genes
Table 3 and Fig. 2C present the results for Arabidopsis
nuclear genes. None of the individual pMK tests were
significant at the 5% level. Similar to the results for animal mtDNA, five out of six cN.I.'s are greater than
one (and three of these five are significant at the 5% level; Weinreich and Rand, 2000), but the pN.I.'s for
polarity or polarity + volume are, if anything, slightly skewed towards tests with pN.I. < 1. The polarity +
volume-based pN.I. values in Arabidopsis stand out as skewed to values less than 1 (five out of six), which
differs significantly from the pattern for cN.I. values (five out of six > 1; G = 5.2; P < 0.05).
To summarize the results presented in Tables 1–3 and Fig. 2, there is no significant heterogeneity among
Drosophila, mitochondrial, or Arabidopsis data sets for the conservative-radical neutrality index (pN.I.)
values based on polarity or polarity + volume when one considers pN.I. values greater than vs. less than 1.
This is in contrast to our earlier study which showed that cN.I. values for Drosophila nuclear genes are
significantly different from cN.I.'s from animal mtDNA and Arabidopsis genes (Weinreich and Rand, 2000).
However, there is significant heterogeneity between cN.I. and charge-based pN.I. values for the mtDNA genes,
and between the cN.I. and polarity + volume based pN.I.'s for Arabidopsis genes. In both of these cases, the
cN.I. values tend to be greater than 1.0 more often than the pN.I. values.
There were no significant correlations between the three different pN.I. values (polarity, polarity + volume
and charge) within any of the Drosophila, mtDNA or Arabidopsis data sets alone, or across all genes studied in
the three data sets. There were also no significant correlations between cN.I. and pN.I. values across the
entire data set.
3.4. Amino acid composition
Fig. 2. Distribution of Conservative-Radical and Silent-Replacement neutrality index scores in Drosophila
nuclear, animal mitochondrial and Arabidopsis nuclear genes. Three different Conservative-Radical
McDonald–Kreitman tests were done for each gene, based on polarity, polarity and volume, and charge properties
of individual amino acids (Zhang, 2000). Among the 87 possible protein-based McDonald–Kreitman (pMK) tests,
only eight were significant, which is not significantly different from chance after correcting for multiple
tests. Mitochondrial genes show a significant difference between their conservative-radical neutrality index
scores (protein N.I., or pN.I.'s) and their codon-based N.I. scores (cN.I.'s). There is no significant
difference between the pN.I. and cN.I. distributions for Drosophila. For animal mtDNA, pN.I. based on charge
and cN.I. distributions are significantly different (P < 0.05). For Arabidopsis genes, the pN.I. from polarity
+ volume is significantly different from the cN.I. distribution (P < 0.05). None of the pN.I. distributions are
significantly heterogeneous across genomes, but the cN.I. distributions for mtDNA and for Arabidopsis are
significantly different from the Drosophila nuclear distribution (P < 0.05).
Amino acid frequencies in the three categories of genes studied are presented in Fig. 3. Proteins coded for by
the mitochondrial DNA have a greater proportion of the non-polar residues F, I, L, M in comparison to
nuclear-coded proteins (P < 0.001 by Chi-square tests; see also Naylor et al., 1995). This is due most likely
to the presence of multiple hydrophobic transmembrane regions in these proteins that serve to anchor them in
the mitochondrial membrane. There is also a significant deficiency of charged residues (D, E, K, R) in the
mitochondrial genes, relative to the Drosophila and Arabidopsis nuclear genes (P < 0.01). The Drosophila and
Arabidopsis nuclear proteins studied have very similar proportions of the different groups of amino acids
(Fig. 3).
4. Discussion
Our original goal was to perform a different test of the
observation that animal mitochondrial genes tend to deviate
from neutrality in the direction of negative selection, while
nuclear genes do not show this trend. Our aim was to use a
protein-based conservative-radical McDonald–Kreitman
(pMK) test to ask if the functional distinction between
conservative and radical amino acid mutations within
proteins would be `perceived' by evolutionary forces in
Table 2
Locus, number of codons, species and sample size, mutation class counts, pN.I. and cN.I. values for mtDNA-encoded DNA sequence surveys used in this study.
Citations as in Weinreich and Rand (2000)
Locus   Codons  Species a                                         AA grouping b  FR c  FC c  PR c  PC c  pN.I. d,e  cN.I. d,e
ND5     505     25 mel, 20 sim, (1 yak)                           pol            6     9     4     6     1.00       2.11
                                                                  pol&vol        8     7     5     5     0.88
                                                                  charge         2     13    0     10    0.00
cyt b   227     23 Ensatina eschscholtzii, 1 Plethodon elongatus  pol            10    7     14    31    0.32       0.28
                                                                  pol&vol        13    5     20    24    0.32
                                                                  charge         2     15    5     39    0.96
cyt b   380     9 Grus antigone, 2 Grus rubicunda                 pol            1     0     2     5     0.40       16.20
                                                                  pol&vol        1     0     4     3     1.33
                                                                  charge         0     1     2     5     0.40
cyt b   380     15 Microtus arvalis, 9 M. rossiaemeridionalis     pol            3     5     6     12    0.83       5.07
                                                                  pol&vol        4     3     13    5     1.95
                                                                  charge         0     7     5     13    2.69
cyt b   266     10 Isothrix bistriata, 1 Isothrix pagurus         pol            3     3     6     10    0.60       1.20
                                                                  pol&vol        4     2     4     12    0.17
                                                                  charge         1     5     1     15    0.33
cyt b   266     29 Mesomys hispidus, 2 Mesomys stimulax           pol            0     0     15    18    0.83       4.71
                                                                  pol&vol        0     0     20    13    1.54
                                                                  charge         0     0     3     30    0.10
cyt b   143     11 Passerella iliaca, 6 Melospiza melodia         pol            0     10    2     7     2.86       1.90
                                                                  pol&vol        3     7     8     1     18.67
                                                                  charge         0     10    6     3     20.00
cyt b   97      8 Phyllobates lugubris, 1 Phyllobates vittatus    pol            0     0     1     10    0.10       3.54
                                                                  pol&vol        0     0     6     5     1.20
                                                                  charge         0     0     2     9     0.22
cyt b   379     20 Sciurus aberti, 1 Sciurus niger                pol            5     17    6     3     6.80       1.67
                                                                  pol&vol        12    10    5     4     1.04
                                                                  charge         2     20    1     8     1.25
cyt b   379     28 Ursus arctos, 1 Ursus americanus               pol            6     11    6     5     2.20       1.10
                                                                  pol&vol        7     10    3     8     0.54
                                                                  charge         1     16    0     11    0.00
a Drosophila species names as in Table 1.
b Amino acid grouping from Zhang (2000): pol = polarity, pol&vol = polarity and volume.
c Mutation classes: FR = fixed radical, FC = fixed conservative, PR = polymorphic radical, PC = polymorphic conservative.
d pN.I. = Protein neutrality index. See Section 2. cN.I. = Codon neutrality index. See Weinreich and Rand (2000).
e Zero replaced with 1 for the purposes of calculating pN.I. N.I. values appearing in bold face correspond to MK tests that are significant at or below the 0.05
level. See Section 2.
the same manner that silent-replacement mutations within codons are perceived. In particular we predicted that
the distribution of neutrality index values for proteins (pN.I.'s) would show a similar pattern to the
distributions of neutrality index values for codons (cN.I.'s), namely that pN.I. values for mitochondrial genes
would tend to be greater than those for Drosophila nuclear genes, as seen for cN.I. values (Weinreich and Rand,
2000). We would further predict that if adaptive evolution has been inferred using the standard
McDonald–Kreitman test (low cN.I.), the pN.I. values for these genes should also be low.
Our results show that the partitioning of protein sequence data according to conservative and radical amino
acid changes gives a very different picture of evolution than that evident in comparisons of silent and
replacement sites in DNA. Mitochondrial and Arabidopsis genes do not show a trend towards high pN.I. values,
and those Drosophila nuclear genes with significant adaptive evolution at the codon level do not tend to show
low pN.I. values. Indeed there is no significant correlation between cN.I. values and any of the three pN.I.
values in any genome (data not shown).
4.1. Power of the tests
This different signal for protein-based neutrality tests as compared to codon-based neutrality tests could be
due to the lower power of the data sets when restricted to amino acid changes. It is interesting to note that
seven of the 87 pMK tests are significant at the 5% level (four among 3 × 13 = 39 Drosophila tests, three among
the 3 × 10 = 30 mitochondrial tests, and none among the 3 × 6 = 18 Arabidopsis tests). While the different
tests for each gene may not be independent, the pN.I. values are not correlated (see Section 3.3). Considering
the 29 traditional MK tests (cMK tests) among these same data sets, two, four, and three tests, respectively,
are significant at the 5% level. In total, 31.0% of the cMK tests are significant, while only 8.0% of the pMK
tests
Table 3
Locus, number of codons, species and sample size, mutation class counts, pN.I. and cN.I. values for nuclear-encoded DNA sequence surveys in Arabidopsis
thaliana used in this study
Locus   Codons  Species                                     AA grouping a  FR b  FC b  PR b  PC b  pN.I. c  cN.I. c
Adh     361     15 A. thaliana, 1 Arabis lyrata             pol            5     10    3     4     1.50     1.29
                                                            pol&vol        5     10    2     5     0.80
                                                            charge         4     11    1     6     0.46
CHI     246     23 A. thaliana, 1 Arabis lyrata             pol            8     16    0     2     0.00     2.10
                                                            pol&vol        11    13    0     2     0.00
                                                            charge         6     18    0     2     0.00
AP3     231     17 A. thaliana, 1 Arabis lyrata             pol            2     4     5     11    0.91     4.00
                                                            pol&vol        4     2     10    6     0.83
                                                            charge         1     5     8     8     5.00
ChiA    302     14 A. thaliana, 1 Arabis lyrata (no Ci-0)   pol            5     12    6     6     2.40     3.07
                                                            pol&vol        9     8     6     6     0.89
                                                            charge         3     14    1     11    0.42
ChiB    335     16 A. thaliana, 1 Arabis gemmifera          pol            6     11    1     3     0.61     0.38
                                                            pol&vol        9     8     1     3     0.30
                                                            charge         5     12    2     2     2.40
PI      209     16 A. thaliana, 1 Arabis lyrata             pol            1     7     3     9     2.33     6.00
                                                            pol&vol        3     5     6     6     1.67
                                                            charge         1     7     5     7     5.00
a Amino acid grouping from Zhang (2000): pol = polarity, pol&vol = polarity and volume.
b Mutation classes: FR = fixed radical, FC = fixed conservative, PR = polymorphic radical, PC = polymorphic conservative.
c pN.I. = Protein neutrality index. See Section 2. cN.I. = Codon neutrality index. Data from Weinreich and Rand (2000). CHI data from Kuittinen and
Aguadé (2000). N.I. values appearing in bold face correspond to MK tests that are significant at or below the 0.05 level.
are signi®cant. These two proportions are signi®cantly different (G ˆ 5:0, P , 0:05). A pMK test partitions only the
amino acid `row' of the traditional cMK test into the four
cells of the comparable 2 £ 2 test. While this will reduce
counts, and hence power (e.g. Akashi, 1999), some of the
genes in each of the three genomes have a reasonable number
of counts for each cell in the pMK test (see Tables 1±3).
While we acknowledge that our pMK tests will have lower
power, there appear to be enough data to detect a positive
Fig. 3. Protein composition of genes studied. Individual amino acids were
grouped in non-polar, polar, positively charge, and negatively charged as
described in Section 2. Mitochondrial genes show signi®cantly greater
percent composition of non-polar amino acids, and signi®cantly smaller
proportion of charged amino acids (both P , 0:001).
correlation between pN.I. and cN.I. values if a strong one
existed.
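The comparison of the two proportions of significant tests in Section 4.1 (9 of 29 cMK tests versus 7 of 87 pMK tests) can be sketched as a G-test of independence on a 2 × 2 table. The text does not say which continuity correction, if any, was applied, so this uncorrected version should not be expected to reproduce the reported G value exactly. It only illustrates the form of the test, and the function name is an assumption of this sketch.

```python
from math import erfc, log, sqrt


def g_test_2x2(a, b, c, d):
    """Uncorrected G-test of independence for the table [[a, b], [c, d]].

    Returns (G, P), with P taken from the chi-square distribution
    with one degree of freedom.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = [a, b, c, d]
    expected = [
        row_totals[0] * col_totals[0] / n,
        row_totals[0] * col_totals[1] / n,
        row_totals[1] * col_totals[0] / n,
        row_totals[1] * col_totals[1] / n,
    ]
    # Cells with zero observed counts contribute nothing to G.
    g = 2.0 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    p = erfc(sqrt(g / 2.0))  # chi-square survival function, 1 d.f.
    return g, p


# Rows: cMK tests, pMK tests. Columns: significant, not significant.
g_stat, p_value = g_test_2x2(9, 20, 7, 80)
print(f"G = {g_stat:.2f}, P = {p_value:.4f}")
```

Under either version of the statistic, the conclusion that the two proportions differ at the 5% level is unchanged.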
4.2. Protein and nucleotide composition
An alternative hypothesis is that the contrasting patterns of
cN.I. and pN.I. values in the three genomes are due to
differences in protein composition inherent in our limited
sample of genes. Fig. 3 shows that the percent composition of
non-polar amino acids and of charged amino acids is
significantly different among the three genomes. However, it is
the mitochondrial proteins that differ most from the
other two genomes with respect to non-polar composition,
yet there is no heterogeneity among genomes for pN.I.
values based on polarity (Fig. 2). Since the mitochondrial
genes show approximately a 50% deficiency in charged
residues relative to the two other data sets (Fig. 3), this
may account for the significant difference between the
charge-based pN.I. distribution and the cN.I. distribution
for mitochondrial genes. It is not entirely clear how these
amino acid composition differences might account for the
patterns we see among the pN.I. values from the different genomes,
or with respect to the distribution of their respective cN.I.
values. A study contrasting the amount of radical and
conservative polymorphism in transmembrane and
extramembrane regions will assist in differentiating these two
possibilities.
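The kind of composition summary plotted in Fig. 3 can be sketched as follows. The four-way grouping used here is an assumption for illustration only: the grouping actually used is the one described in Section 2. The peptide string is likewise a made-up example, not data from the study.

```python
# Assumed grouping of the 20 standard amino acids (illustrative only).
GROUPS = {
    "non-polar": set("AFILMPVW"),
    "polar": set("CGNQSTY"),
    "positive": set("HKR"),
    "negative": set("DE"),
}


def percent_composition(protein):
    """Percent of residues in each group; non-standard symbols are ignored."""
    counts = {name: 0 for name in GROUPS}
    for residue in protein.upper():
        for name, members in GROUPS.items():
            if residue in members:
                counts[name] += 1
                break
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acid residues found")
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical eight-residue peptide.
example = percent_composition("MKDLLEAW")
print(example)
```

Comparing such percentages across genes from each genome is one way to check whether a composition difference of the size described above is present in a given sample.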
An obvious question posed by our data is whether the
differences between pN.I.'s and cN.I.'s scored within and
between genomes are due mostly to differences in silent site
evolution. This seems plausible given recent evidence for
non-neutral patterns of evolution at silent sites in both
nuclear (Akashi, 1995; 1996; Akashi and Schaeffer, 1997)
and mitochondrial genes (Ballard and Kreitman, 1995; Rand
and Kann, 1998; Ballard, 2000a,b). While the data are
limited, the silent site N.I. values for mitochondrial genes
(`sN.I.'s', where preferred and unpreferred synonymous
codons are scored as fixed or polymorphic) also tend to be
greater than 1.0, consistent with the pattern for traditional
cN.I. values (Rand and Kann, 1998). We are currently
examining the data sets in Weinreich and Rand (2000)
with various silent site tests with the aim of comparing
pN.I., cN.I. and sN.I. values across genomes. It is clear
that differences in base composition, mutational properties
and constraints of amino acid composition could interact to
affect pN.I. and cN.I. values. This may be especially true
across different genomes with different mutation rates (Li,
1997), or for genes on leading vs. lagging strands of replication that could experience different mutation rates (Rand
and Kann, 1998; but see Francino et al., 1996). Since
local rates of recombination can also alter the efficacy of
selection on silent or amino acid sites (Kliman and Hey,
1993; Moran, 1996), it will be important to disentangle
nucleotide and protein composition issues from other
confounding factors that might alter the strength of selection.
4.3. Different selection differentials
A third possible explanation for the lower levels of
among-genome heterogeneity for pN.I. values in contrast
to the significant among-genome heterogeneity for the
cN.I. values, is that the phenotypic differences between
radical and conservative polymorphisms may be smaller
than the differences between replacement and silent polymorphisms. By this we do not mean that there is less
selective difference between an average conservative and
radical mutation than between a silent and replacement
mutation. Many amino acid mutations may have sufficiently
strong phenotypic effects that they are eliminated
quickly and not observed as polymorphisms. The residual
amino acid variants that do reach observable polymorphic
frequencies or eventual fixation, however, may have only
subtle phenotypic consequences for protein function.
Alternatively, these patterns may suggest that factors affecting
the expression of proteins (e.g. preferred vs. unpreferred
codons within mRNAs; cf. Akashi, 1995; 1996) are a more
important currency in the economy of molecular evolution
than are the alternative phenotypic states of individual
proteins.
If the selection differentials between conservative and
radical substitutions are indeed smaller than those for silent
and replacement changes, pN.I. values should be less
`extreme' than cN.I. values, and fewer pMK tests should
be significant than cMK tests. To address the issue of the
`extreme-ness' of pN.I. vs. cN.I. values we defined less
extreme as being closer to the strictly neutral expectation
of pN.I. or cN.I. = 1.0. Considering the 29 genes studied,
each with three pN.I. values for the different amino acid
classifications (polarity, polarity and volume, and charge),
there are 87 pN.I. values. For the 13 Drosophila genes,
23/39 pN.I. values are less extreme than the comparable cN.I.
values. For the ten animal mtDNA genes, 17/30
pN.I. values are less extreme than their respective cN.I.
values. And for the six Arabidopsis genes, 10/18 pN.I.
values are less extreme than the cN.I. values. Combining
the data, 50/87 pN.I. values are less extreme than their
respective cN.I. values. While the data do indicate that
less extreme pN.I. values are more common than more
extreme pN.I. values, these differences are not significant
(P < 0.2). These simple contingency tests would be flawed
if there were some non-independence among the three pN.I.
values for the different amino acid classifications within a
given gene (and cN.I.s). Since none of the three pN.I. values
are significantly correlated with each other, we have treated
them as statistically independent.
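The tally of less extreme pN.I. values can be checked against an even split with a simple sign test. This is a minimal sketch that assumes an exact two-sided binomial test with probability 0.5 and independent comparisons. The text refers only to simple contingency tests, so the exact procedure used is not known, and the function name is illustrative.

```python
from math import comb


def two_sided_sign_test(successes, trials):
    """Exact two-sided binomial P-value for a null probability of 0.5."""
    k = max(successes, trials - successes)
    upper_tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2.0 * upper_tail)


# (less extreme pN.I. values, total pN.I. values) per data set, from the text.
counts = {
    "Drosophila": (23, 39),
    "mitochondrial": (17, 30),
    "Arabidopsis": (10, 18),
}
less_extreme = sum(k for k, _ in counts.values())
total = sum(n for _, n in counts.values())
p_value = two_sided_sign_test(less_extreme, total)

print(f"{less_extreme}/{total} less extreme, two-sided P = {p_value:.3f}")
```

Under these assumptions the combined P-value is close to 0.2, in line with the conclusion that the excess of less extreme pN.I. values is not significant.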
As for the number of significant tests, the data do reveal
that cMK tests reject neutrality a greater proportion of the
time than do pMK tests (31% vs. 8%; see Section 4.1). As
discussed above, it is not clear whether this is a result of
limited power, or a pattern more consistent with neutrality.
While the tests of these two predictions appear weak individually, both suggest the same trend consistent with the
notion that conservative-radical amino acid differences
have smaller selection differentials than silent-replacement
differences. There simply may not be enough data at
present to distinguish between these two alternatives.
Clearly these two different classes of mutations have different mutational properties which need to be examined.
Furthermore, it will be important to examine the site
frequencies of segregating amino acid polymorphisms to
get a sense of the relative ages of distinct amino acid
mutations (Nielsen and Weinreich, 1999). If purifying
selection is acting disproportionately on radical amino
acid replacement mutations, we predict that the mean age
of such polymorphisms will be less than the mean age of
conservative polymorphisms (Nielsen and Weinreich,
1999). Since there appears to be some filtering of general
replacement polymorphisms that are admitted into
populations, this could provide a test of whether the observed
amino acid variants are closer to effectively neutral
variants than are silent and replacement changes.
4.4. Neutrality and selection as a continuum
An important distinction between the strictly and nearly
neutral models is that the former places the
neutralist-selectionist debate in a dichotomous context, while the latter
clearly allows for a continuous view (cf. Ohta, 1992,
1995). The distribution of selection coefficients across the
genome thus becomes a problem in the units of selection.
The important question is to distinguish between the
distribution of biological functions performed by individual
genes, and the distribution of phenotypically distinct mutations presented to those genes. Different genes with very
different roles in the biology of the cell, or the organism,
or the population of individuals in the reproductive and
physical environment will capitalize on different subsets
of mutations that are provided by raw DNA change. The
behavior of the various neutrality index values for genes
subject to strong purifying selection (e.g. histones), weak
or relaxed selection (e.g. fibrinopeptides, Dickerson, 1971;
ANON loci, Schmid et al., 1999), or strong adaptive
evolution such as reproductive proteins (accessory gland proteins:
Tsaur et al., 1998; Aguadé, 1999) will be heavily influenced
by the variety of biological functions across genes that
comprise one's sample of the genome. Moreover, the
history of adaptive vs. purifying episodes of evolution
may change across loci (e.g. lysozyme; Messier and Stewart, 1997).
If we believe that the neutralist-selectionist debate should
be viewed across the continuum from strong negative selection, through neutrality, to strong positive selection, the
different views of this spectrum provided by pN.I. and
cN.I. values can help tabulate `votes' in this debate. At
the level of silent and replacement sites in Drosophila
nuclear genes, about 17% (6/36) of genes depart from
neutral expectations, and 5/6 of these depart in the direction
of adaptive evolution (Fig. 1; Weinreich and Rand, 2000).
While this may represent the `great majority' of neutral
cases (83%), it certainly is not consistent with a `minute
fraction' of changes being adaptive. For animal mitochondrial genes, almost half (15/31, 48%) of the cases depart
from neutrality, and of those that do, 14/15 or 93% depart
in the direction of negative selection. This is certainly not a
`great majority' of neutral cases (52%), but is not far from
the notion that a `minute fraction' are adaptive. The Arabidopsis story is similar to that for animal mtDNA (Weinreich
and Rand, 2000).
But when we consider conservative and radical amino
acid changes, our data are in closer agreement with the
proportions stated by Kimura (only 7/87, or 8%, of pMK
tests were significant, and 5/8 were in the direction of
negative selection). When one corrects for multiple tests,
for amino acid changes the predictions of Kimura (1983,
p. xi) seem to hold (non-neutral results emerge very
slightly more frequently than they should by chance). But
for DNA changes in codons, there seems to be an excess of
non-neutrality. We feel that contrasts among different functional classes of all kinds of molecular changes will help
elucidate the distributions of selection coefficients within
and between genes (e.g. Ballard, 2000a,b). This will help in
providing upper and lower bounds for the strengths of
selection coefficients working on these different kinds of
mutations. When we have a clear understanding of the
nature of the functional differences between molecular
polymorphisms we will be more able to appreciate the
significance of these different patterns of polymorphism
and divergence.
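The remark that non-neutral pMK results emerge only slightly more often than expected by chance can be made concrete with a binomial calculation. This sketch assumes that every test is independent and has a 5% false-positive rate. That is a simplification, since tests on the same gene may not be independent, and the function name is illustrative.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """P(X >= k) for X ~ Binomial(n, alpha)."""
    return sum(
        comb(n, i) * alpha ** i * (1 - alpha) ** (n - i) for i in range(k, n + 1)
    )


alpha = 0.05

# pMK tests: 7 significant out of 87.
expected_pmk = 87 * alpha
tail_pmk = prob_at_least(7, 87, alpha)

# cMK tests on the same 29 genes: 9 significant out of 29.
expected_cmk = 29 * alpha
tail_cmk = prob_at_least(9, 29, alpha)

print(f"pMK: expected {expected_pmk:.2f}, P(>= 7) = {tail_pmk:.3f}")
print(f"cMK: expected {expected_cmk:.2f}, P(>= 9) = {tail_cmk:.2e}")
```

Under these assumptions, seven significant pMK tests is not an unusual outcome when about four are expected by chance, whereas nine significant cMK tests among 29 is far more than chance alone would produce.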
4. Note added in proof
The Microtus mtDNA data have been retracted (Baker et
al., 1997. Nature 390, 100). Excluding these data alters our
counts of pN.I. and cN.I. values, but does not affect our conclusions.
Acknowledgements
Supported by NSF grants DEB 9707676 to DMR and
9981497 to DMR and DMW.
References
Aguadé, M., 1999. Positive selection drives evolution of the Acp29AB accessory gland protein locus in Drosophila. Genetics 152, 543–551.
Akashi, H., 1995. Inferring weak selection from patterns of polymorphism and divergence at `silent' sites in Drosophila DNA. Genetics 139, 1067–1076.
Akashi, H., 1996. Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144, 1297–1307.
Akashi, H., 1999. Inferring the fitness effects of DNA mutations from polymorphism and divergence data: statistical power to detect directional selection under stationarity and free recombination. Genetics 151, 221–238.
Akashi, H., Schaeffer, S.W., 1997. Natural selection and the frequency distributions of `silent' DNA polymorphism in Drosophila. Genetics 146, 295–307.
Ballard, J.W.O., Kreitman, M., 1995. Is mitochondrial DNA a strictly neutral marker? Trends Ecol. Evol. 10, 485–488.
Ballard, J.W.O., 2000a. Comparative genomics of mitochondrial DNA in members of the Drosophila melanogaster subgroup. J. Mol. Evol. 51, 48–63.
Ballard, J.W.O., 2000b. Comparative genomics of mitochondrial DNA in Drosophila simulans. J. Mol. Evol. 51, 64–75.
Begun, D., Whitley, P., 2000. Adaptive evolution of relish, a Drosophila NF-κB/IκB protein. Genetics 154, 1231–1238.
Dickerson, R.E., 1971. The structure of cytochrome c and the rates of molecular evolution. J. Mol. Evol. 1, 26–45.
Francino, M.P., Chao, L., Riley, M.A., Ochman, H., 1996. Asymmetries generated by transcription-coupled repair in enterobacterial genes. Science 272, 107–109.
Fu, Y.-X., Li, W.-H., 1993. Statistical tests of neutrality of mutations. Genetics 133, 693–709.
Gillespie, J.H., 1989. Lineage effects and the index of dispersion of molecular evolution. Mol. Biol. Evol. 6, 636–647.
Gillespie, J.H., 1995. On Ohta's hypothesis: most amino acid substitutions are deleterious. J. Mol. Evol. 40, 64–69.
Hughes, A.L., Ohta, T., Nei, M., 1990. Positive Darwinian selection promotes charge profile diversity in the antigen-binding cleft of class I major-histocompatibility-complex molecules. Mol. Biol. Evol. 7 (6), 515–524.
Kimura, M., 1983. The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK.
Kliman, R.M., Hey, J., 1993. Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10, 1239–1258.
Kreitman, M., 1996. The neutral theory is dead. Long live the neutral theory. Bioessays 18, 678–683.
Kreitman, M., Akashi, H., 1995. Molecular evidence for natural selection. Ann. Rev. Ecol. Systemat. 26, 403–422.
Kuittinen, H., Aguadé, M., 2000. Nucleotide variation at the CHALCONE ISOMERASE locus in Arabidopsis thaliana. Genetics 155, 863–872.
Kumar, S., Tamura, K., Nei, M., 1993. MEGA: Molecular Evolutionary Genetics Analysis. Pennsylvania State University, University Park, PA 16802.
Li, W.-H., 1997. Molecular Evolution. Sinauer, Sunderland, MA.
Maynard Smith, J., 1994. Estimating selection by comparing synonymous and substitutional changes. J. Mol. Evol. 39, 123–128.
McDonald, J.H., Kreitman, M., 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652–654.
Messier, W., Stewart, C.-B., 1997. Episodic adaptive evolution of primate lysozymes. Nature 385, 151–154.
Moran, N.A., 1996. Accelerated evolution and Muller's ratchet in endosymbiotic bacteria. Proc. Natl Acad. Sci. USA 93, 2873–2878.
Moriyama, E.N., Powell, J.R., 1996. Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13, 261–277.
Nachman, M.W., 1998. Deleterious mutations in animal mitochondrial DNA. Genetica 102/103, 61–69.
Naylor, G.J., Collins, T.M., Brown, W.M., 1995. Hydrophobicity and phylogeny. Nature 373, 565–566.
Nielsen, R., Weinreich, D., 1999. The age of nonsynonymous and synonymous mutations in animal mtDNA and implications for the mildly deleterious theory. Genetics 153, 497–506.
Ohta, T., 1992. The nearly neutral theory of molecular evolution. Annu. Rev. Ecol. Syst. 23, 263–286.
Ohta, T., 1995. Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. J. Mol. Evol. 40, 56–63.
Ohta, T., 1996. The current significance and standing of neutral and nearly neutral theories. Bioessays 18, 673–677.
Rand, D.M., Kann, L.M., 1996. Excess amino acid polymorphism in mitochondrial DNA: contrasts among genes from Drosophila, mice and humans. Mol. Biol. Evol. 13, 735–748.
Rand, D.M., Kann, L.M., 1998. Mutation and selection at silent and replacement sites in the evolution of animal mitochondrial DNA. Genetica 102/103, 393–407.
Sawyer, S.A., Hartl, D.L., 1992. Population genetics of polymorphism and divergence. Genetics 132, 1161–1176.
Schmid, K.J., Nigro, L., Aquadro, C.F., Tautz, D., 1999. Large number of replacement polymorphisms in rapidly evolving genes of Drosophila: implications for genome wide surveys. Genetics 153, 1717–1729.
Tajima, F., 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123, 585–595.
Tsaur, S.-C., Ting, C.-T., Wu, C.-I., 1998. Positive selection driving the evolution of a gene of male reproduction, Acp26Aa, of Drosophila: II. Divergence versus polymorphism. Mol. Biol. Evol. 15, 1040–1046.
Weinreich, D.M., Rand, D.M., 2000. Contrasting patterns of non-neutral evolution in proteins encoded in nuclear and mitochondrial genomes. Genetics 156, 385–399.
Zhang, J., 2000. Rates of conservative and radical nonsynonymous nucleotide substitutions in mammalian nuclear genes. J. Mol. Evol. 50, 56–68.