a likelihood ratio test to detect conflicting phylogenetic signal

Syst. Biol. 45(1):92-98, 1996
A LIKELIHOOD RATIO TEST TO DETECT CONFLICTING
PHYLOGENETIC SIGNAL
JOHN P. HUELSENBECK 1 AND J. J. BULL 2
1 Department of Integrative Biology, University of California, Berkeley, California 94720, USA;
E-mail: johnh@mws4.biol.berkeley.edu
2 Department of Zoology, University of Texas, Austin, Texas 78712, USA;
E-mail: [email protected]
Abstract.—Molecular data are commonly used to reconstruct the evolutionary histories of organisms. However, evolutionary reconstructions from different molecular data sets sometimes conflict.
It is generally unknown whether these different estimates of history result from random variation
in the processes of nucleotide substitution or from fundamentally different evolutionary mechanisms underlying the histories of the genes analyzed. We describe a novel likelihood ratio test
that compares different topologies (each estimated from a different data partition for the same
taxa) to determine if they are significantly different. The results of this test indicate that different
genes provide significantly different phylogenies for amniotes, supporting earlier suggestions
based on less direct tests. These results suggest that some molecular data can give misleading
information about evolutionary history. [Likelihood ratio test; maximum likelihood; phylogenetic
methods; phylogenetic heterogeneity.]
Debate about which kinds of data provide the most accurate estimates of phylogeny is fueled by observations that different data sets for the same taxa
sometimes yield discordant estimates. In
some cases there is a good explanation for
the different phylogenetic estimates. For
example, horizontal gene transfer and recombination in bacteria cause different
portions of the genome to have different
histories (Dykhuizen and Green, 1991;
Maynard Smith et al., 1991; Medigue et al.,
1991; Souza et al., 1992; Valdez and Pinero,
1992). Also, in many organisms, polymorphisms that predate speciation may fail to
reflect the phylogenetic history of the rest
of the genome (the gene-tree vs. species-tree problem; Wilson et al., 1977; Pesole et
al., 1991; Doyle, 1992).
It is nonetheless generally accepted that
most genes in higher taxa are free from
these problems and that any differences in
phylogenies produced with different data
sets can be attributed to sampling. Statistical procedures such as bootstrapping reveal that different estimates of phylogeny
may arise purely as a consequence of sampling error. However, conflicting phylogenies may also arise when the underlying
processes of molecular evolution differ
among genes, even though the genes have
the same history (Bull et al., 1993). Genes
often evolve at different rates depending
on their position in the genome (Wolfe et
al., 1989), they are subject to different functional constraints (Luo et al., 1989), and
they experience different selection pressures (Stewart and Wilson, 1987).
Phylogenetic methods make specific assumptions about the evolutionary process,
and when these assumptions are not met,
the methods can provide positively misleading estimates of history (Felsenstein,
1978). Although the sensitivity of phylogenetic estimates to evolutionary process
has been recognized as a theoretical possibility for two decades, there are no unequivocal demonstrations in which different genes lead to significantly different
phylogenetic estimates because of systematic differences in evolutionary process.
(We acknowledge the many examples in
which different portions of prokaryotic genomes have different histories because of
recombination, but we restrict consideration here to other kinds of differences in
evolutionary processes.) Such examples of
heterogeneous estimates are important for
two reasons. First, they reveal the limitations of phylogenetic analysis; both phylogenies cannot be correct, so there must
be a poor match between evolutionary process and the model's assumptions for at
least one of the data sets. Identifying heterogeneity thus encourages improvement
of reconstruction methods. Second, some
philosophies advocate combining all data
prior to reconstruction (the total evidence
approach; Kluge, 1989). Combining heterogeneous data may ensure an incorrect reconstruction (Bull et al., 1993).
LIKELIHOOD HETEROGENEITY TEST
The standard approach in other fields of
science is to develop a statistical test of the
hypothesis that observed differences between data sets are due to sampling error;
the alternative hypothesis is that the data
sets are heterogeneous. This general approach is developed here and has also
been developed in earlier studies of phylogeny reconstruction, but our statistical
model differs from all earlier models. Previously, a comparison of bootstrapping
values has been used to infer whether conflicting estimates of phylogeny are consistent with sampling variation (Dykhuizen
and Green, 1991; de Queiroz, 1993). However, bootstrap support cannot easily be
used to assess the statistical significance of
conflicting phylogenies: the bootstrap values do not apply to the entire phylogeny
and are biased (i.e., bootstrap proportions
cannot be interpreted as the probability
that an estimate is correct; Zharkikh and
Li, 1992a, 1992b; Hillis and Bull, 1993).
Other tests of heterogeneity have also been
proposed; those of Rodrigo et al. (1993)
and Farris et al. (1994) may confound differences in topology with differences in
other properties of molecular evolution.
We developed a likelihood ratio test (the
likelihood heterogeneity test) to evaluate
the hypothesis that differences in phylogenetic estimates can be explained by stochastic variation. We specifically test for
heterogeneity in topology (branching order), but the test is trivially modified to
evaluate other aspects of the phylogenetic
model. The likelihood heterogeneity test
compares the likelihood $L_0$, obtained under the constraint that the same phylogeny underlies all of the data sets, with the likelihood $L_1$, obtained when this constraint is relaxed. The data sets may consist of different genes or other groupings of homologous nucleotide positions. Let the model parameters of the $i$th data partition be the ordered pair $\theta_i = (\tau_i, \phi_i)$, where $\tau_i$ represents the bifurcating tree and $\phi_i$ represents the other parameters (such as branch lengths, transition:transversion ratio, or shape parameter of the gamma distribution) to be estimated from the $i$th data set, and let the estimates be $\omega = \{\theta_1, \theta_2, \ldots, \theta_n\} = \{(\tau_1, \phi_1), (\tau_2, \phi_2), \ldots, (\tau_n, \phi_n)\} \in \Omega$. Our likelihood heterogeneity test compares the likelihood, $L_0$, under the null hypothesis, where
$$L_0 = \max[L(\omega)] \;\big|\; \omega \in \Omega,\ \tau_1 = \tau_2 = \cdots = \tau_n,$$
to the likelihood, $L_1$, under the alternative hypothesis, where
$$L_1 = \max[L(\omega)] \;\big|\; \omega \in \Omega.$$
Under the null hypothesis ($H_0$), the same tree is assumed to underlie the data from different genes, although the overall rates (for the genes as wholes) and the relative rates (from branch to branch of the trees) of evolution as well as other parameters are allowed to vary among the genes. Under the alternative hypothesis ($H_1$), different trees and different evolutionary rates can underlie each gene. The likelihood ratio test statistic is
$$\delta = 2(\ln L_1 - \ln L_0).$$
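Because each data partition has its own parameters and the partitions are treated as independent, the two maximizations can be written partition by partition. The restatement below is a sketch under that assumption; the per-partition likelihood $L_i$ is notation introduced here for illustration and is not the authors' own. It agrees with the way the test statistics are computed in Table 1.

```latex
\begin{align*}
\ln L_0 &= \max_{\tau} \sum_{i=1}^{n} \max_{\phi_i} \ln L_i(\tau, \phi_i), \\
\ln L_1 &= \sum_{i=1}^{n} \max_{\tau_i,\,\phi_i} \ln L_i(\tau_i, \phi_i), \\
\delta  &= 2(\ln L_1 - \ln L_0) \ge 0 .
\end{align*}
```

The inequality holds because every configuration allowed under the null hypothesis is also allowed under the alternative; the statistic is zero when a single tree maximizes the likelihood of every partition.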
Because $H_0$ is a subset of $H_1$, this statistic should be asymptotically distributed as a $\chi^2$ distribution with n − m degrees of freedom, where n is the number of parameters under $H_1$ and m is the number of parameters under $H_0$ (Rice, 1995). However, Goldman (1993) showed that for the phylogeny problem, the $\chi^2$ distribution is not appropriate and instead suggested Markov simulation of the null distribution to determine the critical values for $\delta$. In the absence of suitable asymptotic
results appropriate for all parameter values under the null hypothesis, the maximum likelihood values are instead used in
the simulations. The simulations thus assume the same tree for all genes but different branch lengths (and other parameter values) among data partitions.
The performance of the likelihood heterogeneity test depends on several assumptions. A basic assumption is that the distribution of $\delta$ determined by parametric bootstrapping ($\delta_b$) matches the distribution of true $\delta$ ($\delta_t$). To evaluate this assumption, we simulated data according to a four-taxon tree with equal branch lengths (0.8 substitutions/site). Two sets of 100 nucleotide sites each were numerically evolved, and a $\delta_t$ value was calculated. This process was repeated 1,000 times, always using the same tree parameters (the distribution of $\delta_t$ will vary with the tree used). The 1,000 values of $\delta_t$ then provided a distribution that should closely approximate the exact distribution of $\delta_t$ (Fig. 1a).
To assess the match between the distribution of $\delta_t$ and $\delta_b$, we applied our likelihood heterogeneity test to the first five pairs of data simulated above (by chance, $\delta_t$ = 0 for each of these five cases). We then obtained five distributions of $\delta_b$ (averaged in Fig. 1b). The distributions appear to match well: the 5% threshold value in the distribution of $\delta_t$ was 0.53, whereas for $\delta_b$ it averaged 0.75, within one standard deviation of 0.53. Future studies will be required to determine whether this form of parametric bootstrapping provides acceptable estimates of the distribution of $\delta$ under a wide range of conditions.
FIGURE 1. Parametric bootstrapping provides a close approximation to the distribution of the likelihood ratio test statistic, $\delta$. The 5% critical value of true $\delta$ determined directly is 0.53. The 5% critical values determined from parametric bootstrapping are 0.41, 0.64, 0.67, 0.92, and 1.09. (a) Distribution of true $\delta$. (b) Average distribution from five parametric bootstrap estimates of bootstrapped $\delta$.
FIGURE 2. The three possible unrooted trees (Tree 1, Tree 2, and Tree 3) for the relationship of amniotes (lepidosaur, bird, crocodilian, and mammal).
SIGNIFICANTLY DIFFERENT PHYLOGENETIC ESTIMATES FOR AMNIOTES
We applied the likelihood heterogeneity test to a controversial phylogenetic problem, the phylogeny of amniotes (here represented by mammals, birds, crocodilians, and lepidosaurs). Over the past decade, phylogenetic analyses of morphological data and of at least 16 genes have provided different estimates of this phylogeny (Gardiner, 1982; Løvtrup, 1985; Gauthier et al., 1988; Hedges et al., 1990; Hedges and Maxson, 1991; Eernisse and Kluge, 1993;
Hedges, 1994). Usually, one of two trees is
estimated (Fig. 2, trees 1 and 3). One tree
depicts a bird-crocodilian relationship,
whereas the other depicts a bird-mammal
relationship. Three lines of evidence suggest that the bird-crocodilian relationship
represents the best estimate of the phylogeny of amniotes. First, amniotes have a
rich fossil history, and inclusion of some of
these fossil taxa strongly supports a bird-crocodilian relationship (Gauthier et al.,
1988). Second, a bird-crocodilian relationship better fits the stratigraphic occurrence
of fossil taxa (Gauthier et al., 1988). Third,
the majority of trees estimated from different data partitions supports a bird-crocodilian relationship (whether analyzed in
a combined or separate analysis; Hedges,
1994). Most of the debate has centered on
which data are the most reliable and on
how to combine the different sources of
data to provide an accurate estimate of
phylogeny. What has not been addressed
is whether the differences in the phylogenetic trees can be explained simply by stochastic variation.
We examined the 12S, 16S, 18S, and 28S
ribosomal RNA (rRNA) sequences and the
valine transfer RNA (tRNA) sequence for
four amniote taxa: Sceloporus undulatus
(GenBank nos. L28075, L28075, M59400,
M59404, L28075), Alligator mississippiensis
(L28074, L28074, M59383, M59406,
L28074), Gallus gallus (X52392, X52392,
M59389, M59414, X52392), and Mus musculus (J01420, J01420, X00686, X00525,
J01420). The alignment of Hedges (Hedges
et al., 1990; Hedges, 1994) was used, but
all sites with missing data or gaps were
omitted. Log likelihoods were calculated
using the program PAUP* (Swofford,
1995), which provides maximum-likelihood estimates of the lengths of the
branches (in terms of number of substitutions per site), the transition:transversion ratio, the equilibrium nucleotide frequencies, and the shape parameter of the gamma distribution. Each data set was then analyzed using maximum likelihood implemented with either a Jukes-Cantor (Jukes and Cantor, 1969 [JC]) or Hasegawa-Kishino-Yano (Hasegawa et al., 1985; Yang, 1993 [HKY85+Γ]) model of DNA substitution. The 18S rRNA gene was also
analyzed separately, using the minimum
evolution criterion with LogDet distances
(Lockhart et al., 1994) and maximum likelihood with a nonhomogeneous model of
DNA substitution (Yang, 1995; Yang and
Roberts, 1995). The best tree for both analyses of combined data was consistent with
a close bird-mammal relationship (Fig. 2,
tree 3). The null distribution for the test
statistic, $\delta$, was determined by simulating nucleotide sequences under the hypothesis that the same tree underlies all of the data partitions. One hundred simulated data sets were generated for each model of DNA substitution examined (JC and HKY85+Γ). For each simulated data set, $\delta$ was calculated anew and compared with the original value of the likelihood ratio test statistic. Table 1 shows the log likelihoods of the three possible trees under both models of DNA substitution. The likelihood heterogeneity values were significant at P = 0.03 for
both models, indicating the presence of
conflicting phylogenetic signal among the
genes at a level greater than expected purely from sampling error (Fig. 3). However,
our choice of these taxa and genes was
based on earlier work suggesting that they
exhibited heterogeneity. In limiting our
demonstration to this one example, we
have thus introduced a bias that typically
requires a Bonferroni-type correction and
would require a more stringent threshold
for rejection of the null hypothesis than
0.05. We cannot estimate the magnitude of
this effect and will proceed on the assumption that this heterogeneity is statistically significant.
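The comparison of the observed statistic with its simulated null distribution can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name and the simulated values are hypothetical, and it assumes that the P value is the proportion of simulated values at least as large as the observed one (the text does not state how ties are handled).

```python
from typing import Sequence


def monte_carlo_p_value(observed: float, simulated: Sequence[float]) -> float:
    """Fraction of null-simulated statistics at least as large as the observed one."""
    if not simulated:
        raise ValueError("need at least one simulated value")
    exceed = sum(1 for s in simulated if s >= observed)
    return exceed / len(simulated)


# Hypothetical null sample of 100 values (NOT the paper's simulations):
# 97 values below the observed statistic and 3 values above it.
simulated_null = [0.1 * i for i in range(97)] + [11.0, 12.5, 14.0]

# 10.05 is the observed statistic reported for the HKY85+Gamma model.
p = monte_carlo_p_value(10.05, simulated_null)
print(f"P = {p:.2f}")
```

With 100 simulated data sets, a reported P of 0.03 corresponds under this convention to 3 simulated values reaching or exceeding the observed statistic.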
Inclusion of the 18S rRNA gene is responsible for the majority of the data heterogeneity: when the likelihood heterogeneity test was applied to all of the genes except the 18S, the null hypothesis of no heterogeneity among genes was tentatively accepted ($\delta_1$ = 1.09, P = 0.15 for JC; $\delta_2$ = 1.93, P = 0.24 for HKY85+Γ). These tests thus suggest that the 18S rRNA gene has been subject to different processes of molecular evolution than have the other genes analyzed here and that the differences in process lead to a different estimate of phylogeny. As expected from these results, the inclusion of the 18S rRNA data decreased the bootstrap support for the bird-crocodilian tree (bootstrap proportions [BP] for the bird-crocodilian relationship [Fig. 2, tree 1]: JC, BP = 0.92 with 18S and 0.99 without 18S; HKY85+Γ, BP = 0.89 with 18S and 0.99 without 18S). This result suggests that inclusion of 18S data hinders estimation of the bird-crocodilian tree, even though the bird-crocodilian tree is obtained as the best estimate from the combined data.
TABLE 1. Log likelihoods under two models of substitution for amniote genes: the Jukes-Cantor (1969) model of DNA substitution (JC) and the Hasegawa, Kishino, and Yano (1985) model of DNA substitution with rate heterogeneity among sites as described by a gamma distribution (Yang, 1993) (HKY85+Γ). $\delta_1$ is the likelihood heterogeneity statistic for the inclusion of all genes when analyzed using the JC model, and $\delta_2$ is the likelihood heterogeneity statistic for the inclusion of all genes when analyzed using the HKY85+Γ model. Both $\delta_1$ and $\delta_2$ are significant at P = 0.03, indicating that the differences in phylogenetic estimates among genes cannot be explained by stochastic variation.
                       Genes (a)
Model     Tree no.  12S           16S           18S           28S          tRNA
JC        1         -2451.37 (b)  -3603.93 (b)  -2089.59      -447.30 (b)  -223.98
JC        2         -2458.43      -3623.90      -2091.62      -454.53      -224.27
JC        3         -2453.23      -3628.92      -2072.38 (b)  -454.53      -223.43 (b)
          $\delta_1$ = 2[(-8798.43) - (-8816.19)] = 35.51 (P = 0.03)
HKY85+Γ   1         -2357.30      -3487.60 (b)  -2058.73      -432.02 (b)  -205.38 (b)
HKY85+Γ   2         -2358.10      -3497.91      -2058.81      -434.70      -205.46
HKY85+Γ   3         -2356.33 (b)  -3498.20      -2054.67 (b)  -434.70      -205.43
          $\delta_2$ = 2[(-8536.03) - (-8541.06)] = 10.05 (P = 0.03)
(a) 12S, 16S, 18S, and 28S rRNA genes and the valine tRNA genes.
(b) Values for the maximum-likelihood tree.
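The arithmetic behind the two test statistics in Table 1 can be reproduced with a short sketch. It is an illustration under stated assumptions, not the authors' procedure: the function and variable names are hypothetical, the log likelihoods are transcribed from Table 1, and the partitions are assumed to contribute additively to the total log likelihood.

```python
from typing import Dict, List, Tuple

# Log likelihoods transcribed from Table 1: tree number -> values for the
# 12S, 16S, 18S, 28S, and tRNA partitions (in that order).
JC: Dict[int, List[float]] = {
    1: [-2451.37, -3603.93, -2089.59, -447.30, -223.98],
    2: [-2458.43, -3623.90, -2091.62, -454.53, -224.27],
    3: [-2453.23, -3628.92, -2072.38, -454.53, -223.43],
}
HKY85_GAMMA: Dict[int, List[float]] = {
    1: [-2357.30, -3487.60, -2058.73, -432.02, -205.38],
    2: [-2358.10, -3497.91, -2058.81, -434.70, -205.46],
    3: [-2356.33, -3498.20, -2054.67, -434.70, -205.43],
}


def heterogeneity_statistic(
    loglik: Dict[int, List[float]],
) -> Tuple[float, int, float, float]:
    """Return (delta, best common tree, ln L0, ln L1) for one substitution model."""
    # Null hypothesis: one tree for all partitions, so take the best total.
    totals = {tree: sum(values) for tree, values in loglik.items()}
    best_tree = max(totals, key=totals.get)
    ln_l0 = totals[best_tree]
    # Alternative hypothesis: each partition may have its own best tree.
    n_partitions = len(next(iter(loglik.values())))
    ln_l1 = sum(
        max(values[i] for values in loglik.values()) for i in range(n_partitions)
    )
    return 2.0 * (ln_l1 - ln_l0), best_tree, ln_l0, ln_l1


for name, table in (("JC", JC), ("HKY85+Gamma", HKY85_GAMMA)):
    delta, tree, ln_l0, ln_l1 = heterogeneity_statistic(table)
    print(f"{name}: common tree {tree}, ln L0 = {ln_l0:.2f}, "
          f"ln L1 = {ln_l1:.2f}, delta = {delta:.2f}")
```

Recomputed from the rounded table entries, the sketch selects tree 1 as the best common tree under both models and gives statistics within about 0.01 of the reported 35.51 and 10.05; the small gap presumably reflects rounding in the published values.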
What is different about the 18S rRNA
gene? Long branch attraction, horizontal
gene transfer, ancestral polymorphism,
and convergence in nucleotide content between different lineages are all mechanisms that have been suggested to explain
conflicting phylogenetic signal (Wilson et
al., 1977; Felsenstein, 1978; Bernardi et al.,
1985; Dykhuizen and Green, 1991; Maynard Smith et al., 1991; Medigue et al.,
1991; Pesole et al., 1991; Doyle, 1992; Souza
et al., 1992; Valdez and Pinero, 1992; Bull
et al., 1993; Hedges, 1994). However, several of these possibilities are implausible
for the present example. Horizontal gene
transfer does not seem likely because the
mechanisms known for transferring essential genes among vertebrates (e.g., hybridization) are confined to closely related organisms. Ancestral polymorphism is
likewise implausible because gene conversion is known to homogenize rRNA sequences within populations (Hillis et al.,
1991). Both of these explanations are further discounted because the 18S rRNA
gene is part of a cluster that includes other
nuclear rRNA genes (e.g., the 28S gene,
which was included in this analysis). Long
branch attraction (the incorrect estimation
of phylogeny because of parallel changes
along the longest branches of the phylogeny) and shifts in equilibrium nucleotide
frequencies are possible explanations for
the heterogeneity, although phylogenetic
analysis of the 18S rRNA gene with either
LogDet distances (Lockhart et al., 1994;
Swofford, 1995) or a nonhomogeneous
model of DNA substitution (Yang, 1995;
Yang and Roberts, 1995), both of which
correct for shifts in equilibrium nucleotide
frequency, still produces a bird-mammal
estimate with this gene.
DISCUSSION
Application of the likelihood heterogeneity test to additional genes and taxa might reveal heterogeneity on a broader scale. For example, some phylogenies based on 18S rRNA are in conflict with phylogenies estimated using other data (e.g., holometabolous insects; Carmean and Crespi, 1995). If the differences in these other cases cannot be reconciled as due to sampling error, then the reliability of the 18S rRNA gene as a phylogenetic marker is questioned on a much broader level than suggested from our analysis.
This analysis offers evidence that different genes provide significantly different estimates of phylogeny in higher organisms. This situation is unique and probably cannot be explained by horizontal gene transfer or ancestral polymorphism, as can other instances in some organisms (e.g., bacteria; Dykhuizen and Green, 1991). Heterogeneity of this sort has a profound impact on the larger realm of phylogenetic analysis because it suggests that the models used in phylogeny reconstruction are making mistakes by failing to capture the relevant information about molecular evolution. Identification of heterogeneity is thus an important step in improving these models. If significant heterogeneity in tree estimates is widespread, systematists should reconsider their methods of analysis and their data. The method proposed here provides a new avenue of research in phylogenetics. Such studies would represent an extension of contemporary systematics in which only the patterns of evolution, not the processes, are considered.
FIGURE 3. Simulated distribution of the likelihood ratio test statistic, $\delta$, for the test of significant differences in phylogeny estimated from different data partitions under the JC model (a) and the HKY85+Γ model (b). Data were simulated under the null hypothesis that the same tree underlies both data partitions, using maximum-likelihood estimates of evolutionary rates for each branch. Each distribution was based on 100 simulated data sets. The values observed from the data (35.51 for the JC model and 10.05 for the HKY85+Γ model) fall outside of the 95% confidence region, so the null hypothesis of homogeneity is rejected.
ACKNOWLEDGMENTS
David Hillis provided insightful comments. Blair Hedges provided the aligned sequences. This work was supported by NSF grants DEB-9106746 awarded to David Hillis, J.J.B., and Ian Molineux and DEB-9221052 awarded to David M. Hillis and by the Johann Friedrich Miescher Regents Chair in Molecular Biology (J.J.B.).
REFERENCES
BERNARDI, G., B. OLOFSSON, J. FILIPSKI, J. ZERIAL, J. SALINAS, G. CUNY, M. MEUNIER-ROTIVAL, AND F. RODIER. 1985. The mosaic genome of warm-blooded vertebrates. Science 228:953-958.
BULL, J. J., J. P. HUELSENBECK, C. W. CUNNINGHAM, D. L. SWOFFORD, AND P. J. WADDELL. 1993. Partitioning and combining data in phylogenetic analysis. Syst. Biol. 42:384-397.
CARMEAN, D., AND B. CRESPI. 1995. Do long branches attract flies? Nature 373:666.
DE QUEIROZ, A. 1993. For consensus (sometimes). Syst. Biol. 42:368-372.
DOYLE, J. J. 1992. Gene trees and species trees: Molecular systematics as one-character taxonomy. Syst. Bot. 17:144-163.
DYKHUIZEN, D. E., AND L. GREEN. 1991. Recombination in Escherichia coli and the definition of biological species. J. Bacteriol. 173:7257-7268.
EERNISSE, D. J., AND A. G. KLUGE. 1993. Taxonomic congruence versus total evidence, and amniote phylogeny inferred from fossils, molecules, and morphology. Mol. Biol. Evol. 10:1170-1195.
FARRIS, J. S., M. KALLERSJO, A. G. KLUGE, AND C. BULT. 1994. Testing significance of incongruence. Cladistics 10:315-319.
FELSENSTEIN, J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27:401-410.
GARDINER, B. G. 1982. Tetrapod classification. Zool. J. Linn. Soc. 74:207-232.
GAUTHIER, J., A. G. KLUGE, AND T. ROWE. 1988. Amniote phylogeny and the importance of fossils. Cladistics 4:105-209.
GOLDMAN, N. 1993. Statistical tests of models of DNA substitution. J. Mol. Evol. 36:182-198.
HASEGAWA, M., H. KISHINO, AND T. YANO. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160-174.
HEDGES, S. B. 1994. Molecular evidence for the origin of birds. Proc. Natl. Acad. Sci. USA 91:2621-2624.
HEDGES, S. B., AND L. R. MAXSON. 1991. Pancreatic polypeptide and the sister group of birds. Mol. Biol. Evol. 8:888-891.
HEDGES, S. B., K. D. MOBERG, AND L. R. MAXSON. 1990. Tetrapod phylogeny inferred from 18S and 28S ribosomal RNA sequences and a review of the evidence for amniote relationships. Mol. Biol. Evol. 7:607-633.
HILLIS, D. M., AND J. J. BULL. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42:182-192.
HILLIS, D. M., C. MORITZ, C. A. PORTER, AND R. J. BAKER. 1991. Evidence for biased gene conversion in concerted evolution of ribosomal DNA. Science 251:308-310.
JUKES, T. H., AND C. R. CANTOR. 1969. Evolution of protein molecules. Pages 21-132 in Mammalian protein metabolism (H. Munro, ed.). Academic Press, New York.
KLUGE, A. G. 1989. A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Syst. Zool. 38:7-25.
LOCKHART, P. J., M. A. STEEL, M. D. HENDY, AND D. PENNY. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11:605-612.
LØVTRUP, S. 1985. On the classification of the taxon Tetrapoda. Syst. Zool. 34:463-470.
LUO, C.-C., W.-H. LI, AND L. CHAN. 1989. Structure and expression of dog apolipoprotein A-I, E, and C-I mRNAs: Implications for the evolution and functional constraints of apolipoprotein structure. J. Lipid Res. 30:1735-1746.
MAYNARD SMITH, J., C. G. DOWSON, AND B. G. SPRATT. 1991. Localized sex in bacteria. Nature 349:29-31.
MEDIGUE, C., T. ROUXEL, P. VIGIER, A. HENAUT, AND A. DANCHIN. 1991. Evidence for horizontal gene transfer in Escherichia coli speciation. J. Mol. Biol. 222:851-856.
PESOLE, G., E. SBISA, F. MIGNOTTE, AND C. SACCONE. 1991. The branching order of mammals: Phylogenetic trees inferred from nuclear and mitochondrial molecular data. J. Mol. Evol. 33:537-542.
RICE, J. A. 1995. Mathematical statistics and data analysis. Duxbury Press, Belmont, California.
RODRIGO, A. G., M. KELLY-BORGES, P. R. BERGQUIST, AND P. L. BERGQUIST. 1993. A randomisation test of the null hypothesis that two cladograms are sample estimates of a parametric phylogenetic tree. N.Z. J. Bot. 31:257-268.
SOUZA, V. T., T. NGUYEN, R. R. HUDSON, D. PINERO, AND R. E. LENSKI. 1992. Hierarchical analysis of linkage disequilibrium in Rhizobium populations: Evidence for sex? Proc. Natl. Acad. Sci. USA 89:8389-8393.
STEWART, C.-B., AND A. C. WILSON. 1987. Sequence convergence and functional adaptation of stomach lysozymes from foregut fermenters. Cold Spring Harbor Symp. Quant. Biol. 52:891-899.
SWOFFORD, D. L. 1995. PAUP*: Phylogenetic analysis using parsimony*, version 4.0. Sinauer, Sunderland, Massachusetts.
VALDEZ, A. M., AND D. PINERO. 1992. Phylogenetic estimation of plasmid exchange in bacteria. Evolution 46:641-656.
WILSON, A. C., S. S. CARLSON, AND T. J. WHITE. 1977. Biochemical evolution. Annu. Rev. Biochem. 46:473-639.
WOLFE, K. H., P. M. SHARP, AND W.-H. LI. 1989. Mutation rates differ among regions of the mammalian genome. Nature 337:283-285.
YANG, Z. 1993. Maximum likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites. Mol. Biol. Evol. 10:1396-1401.
YANG, Z. 1995. PAML: Phylogenetic analysis by maximum likelihood. Distributed by author, Univ. California, Berkeley.
YANG, Z., AND D. ROBERTS. 1995. On the use of nucleic acid sequences to infer early branching in the tree of life. Mol. Biol. Evol. 12:451-458.
ZHARKIKH, A., AND W.-H. LI. 1992a. Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. I. Four taxa with a molecular clock. Mol. Biol. Evol. 9:1119-1147.
ZHARKIKH, A., AND W.-H. LI. 1992b. Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. II. Four taxa without a molecular clock. J. Mol. Evol. 35:356-366.
Received 25 May 1995; accepted 1 September 1995
Associate Editor: David Cannatella