Ancestral State Estimation and Taxon Sampling Density

Syst. Biol. 50(4):557–564, 2001
BENJAMIN A. SALISBURY¹ AND JUNHYONG KIM²
¹Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut 06520-8106, USA; E-mail: [email protected]
²Department of Ecology and Evolutionary Biology, Department of Molecular, Cellular, and Developmental Biology, and Department of Statistics, Yale University, New Haven, Connecticut 06520, USA; E-mail: [email protected]
Abstract.—A set of experiments based on simulation and analysis found that using the parsimony
algorithm for ancestral state estimation can benefit from increased sampling of terminal taxa. Estimation at the base of small clades showed strong sensitivity to tree topology and number of descendent
tips. These effects were largely driven by the creation and negation of ambiguity across a topology.
Root state and internal state estimation showed similar behavior. We conclude that increased taxon
sampling density is generally advisable, and attention to topological effects may be advisable in evaluating the confidence placed in state estimation. We also explore the factors affecting ancestral state
estimation and conjecture that as taxa are added to a tree, the total amount of information for root
state estimation depends on the tree topology and distance to root state of added taxa. For a pure-birth
model tree, we conjecture that the addition of N taxa increases root state information in proportion to
log(N). [Parsimony; state estimation; taxon sampling; tree topology.]
The challenge of estimating ancestral character states has recently received attention from many authors, including those
of the seven-paper symposium published
in Systematic Biology 48, no. 3 (Cunningham,
1999; Martins, 1999; Mooers and Schluter,
1999; Omland, 1999; Pagel, 1999; Ree and
Donoghue, 1999; Schultz and Churchill,
1999). Relative to our understanding of phylogeny estimation, methods of ancestral state
estimation are somewhat poorly characterized; this disparity is perhaps due to the logical priority of the former step in inferring
evolutionary history. Most papers on state
estimation theory have revolved around the
merits and liabilities of competing analytical methods, such as parsimony and various
flavors of maximum likelihood. In this note
we explore a single issue: how the accuracy
of estimating ancestral states depends on the
density of taxon sampling. We consider only
parsimony estimation because it is the most
commonly applied method and its algorithm
is amenable to analytical analysis.
Frumhoff and Reeve (1994) and a subsequent paper by Schultz et al. (1996) considered evolution and root state estimation on
a “null model” phylogenetic star tree analyzing the effects of asymmetry of character
change rates and other sources of correlated
homoplasy. Under this simple tree model, the
probability of successfully estimating the ancestral state increases with added taxa unless
the asymmetry of character evolution makes
the root state less likely to be observed at
each tip than is some other state. Of course,
by assuming a star phylogeny, those estimates were effectively nonphylogenetic, derived simply from the plurality state among
the observations.
Zhang and Nei (1997) used character simulation on a few, fully branched model trees
(up to 10 tips) to estimate probabilities of
correctly estimating ancestral states at internal nodes by using parsimony, maximum
likelihood, and a hybrid distance–maximum
likelihood method. Under parsimony, they
found that having more taxa usually improved the proportion of correct state estimates, with some unexplained exceptions.
However, their experiments considered only
small changes in taxon sampling (differences
of 1 or 2 tips).
Steel and Charleston (1995) asked what
happens to the probability of correct root
state estimation by parsimony when tree size
is increased by adding large numbers of taxa.
Their investigation was prompted by recognizing that when taxa are added to the tree,
the information level for root state estimation
increases, but the total amount of evolution
in the tree (thus erasure of the root state information) also increases. Therefore, which
of the two “forces,” effecting information increase and decrease, respectively, would win
out was not clear.
In Steel and Charleston, taxa were added
to a fully balanced tree by doubling the numbers at each time step and preserving the
balanced tree structure. Furthermore, they
assumed that each added branch had a fixed
probability, p, of state change such that the
time depth of the tree increased with each
doubling of the taxa. From this model of
taxon sampling they obtained the result that
the probability of correct root state estimate,
Pc, goes to

$$P_c = \frac{1}{2}\left(1 - 2x + \frac{\sqrt{(1-6x)(1-2x)}}{1-2p}\right), \quad \text{where } x = \frac{p}{1-2p}, \tag{1}$$

when p < 1/8 and goes to 1/3 when p > 1/8. In this taxon sampling model, the probability of correct root state estimate is a
decreasing function of number of taxa. However, this model is rather unrealistic. In the
usual empirical cases where we might be interested in the root state estimate of a fixed
clade, the expected amount of evolution per
lineage would not increase with increased
taxon sampling unless we accidentally sampled outside the clade or by chance included
a very deviant subclade with high rates of
evolution. Therefore, in this paper, we revisit
the problem of asking what happens to root
state estimation probabilities when a more
realistic taxon sampling model is applied. We
also extend this research to estimation of internal node states.
METHODS
We began by creating trees that would each
represent an entire, fully sampled clade. For
simplicity, we generated trees using a pure
birth (i.e., no extinction) Markovian speciation model (also known as a Yule model); all
lineages were equally likely to speciate and
the speciation rate was constant over time.
To build the trees, we used the conditioned
sampling approach (Ross, 1996). We conditioned on the number of tips equaling 512
(i.e., 2^9) over a unit time interval. The speciation rate was set to ln(512/2), for which 512
tips are expected after one unit time under
the pure birth model after an initial speciation event. With these settings we hoped to
generate trees that were not drastically different from those encountered in natural investigations.
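As a quick check on the chosen rate, a pure-birth process that starts from two lineages (that is, just after the initial speciation event) has an expected lineage count that grows exponentially. Writing λ for the speciation rate (a symbol assumed here for illustration), the arithmetic is:

$$E[N(t)] = 2e^{\lambda t}, \qquad \lambda = \ln(512/2) = \ln 256 \approx 5.545, \qquad E[N(1)] = 2 \cdot 256 = 512.$$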
Subsamples of different sizes (details below) were taken from the parent trees such
that smaller subsamples were always nested
within the larger ones. Our results would
thus show the effect of adding more information to an existing set of observations. Taxa
were chosen equiprobably from the initial
512. The subsamples were always required
to span the root of the tree.
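The nested, root-spanning subsampling described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the parent tree is reduced to a hypothetical list `side_of_root` recording which of the two root branches leads to each tip, and the smallest sample is redrawn until both branches are represented. Larger samples are prefixes of the same random ordering, so nesting holds by construction.

```python
import random


def nested_root_spanning_samples(side_of_root, sizes, rng):
    """Draw nested tip subsamples whose smallest member spans the root.

    side_of_root[i] is 0 or 1 according to which root branch leads to tip i
    (a hypothetical encoding of the parent tree).
    Returns a dict mapping each sample size to a list of tip indices.
    """
    sizes = sorted(sizes)
    order = list(range(len(side_of_root)))
    while True:
        rng.shuffle(order)  # equiprobable ordering of all tips
        smallest = order[: sizes[0]]
        if {side_of_root[i] for i in smallest} == {0, 1}:
            break  # both root branches are represented
    # Prefixes of a single ordering are nested by construction.
    return {n: order[:n] for n in sizes}


if __name__ == "__main__":
    rng = random.Random(1)
    sides = [0] * 200 + [1] * 312  # made-up split of 512 tips across the root
    samples = nested_root_spanning_samples(sides, [16, 32, 64, 128, 256, 512], rng)
    print({n: len(tips) for n, tips in samples.items()})
```

Because every larger sample contains the smallest one, checking the spanning condition on the smallest sample is enough.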
We used exact calculations to determine
the probabilities of correct, incorrect, and ambiguous (Pc , Pi , and Pa , respectively) estimates of the root states of characters evolved
on these trees. The calculations were enabled by algorithms derived independently
by Maddison (1995) and Kim (1996). Here we
assumed (1) binary characters, (2) time homogeneous rates of evolution, and (3) symmetric (equal) change rates between the two
states. We also assumed that tree topologies
were correctly estimated before estimation of
ancestral states. We extended the previous algorithm to calculate the conditional probabilities of correctly, incorrectly, and ambiguously reconstructing the internal node states
for our trees. This was accomplished by effectively rerooting a tree as a trichotomy at
each node.
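To make the exact calculation concrete, here is a minimal sketch for the root of a binary tree under the three assumptions just listed. It is not the published algorithm of Maddison (1995) or Kim (1996), only a recursion in the same spirit for the binary symmetric case. The tree encoding (a tip is `None`, an internal node is a pair of `(subtree, branch_length)` tuples) is hypothetical, and the branch change probability uses the form 1/2 − (1/2)e^(−rt) that the paper states in its later section on effective information.

```python
import math


def change_prob(r, t):
    """Probability that a branch of length t ends in the other state.

    Uses the form given in the paper: p = 1/2 - (1/2) exp(-r t).
    """
    return 0.5 - 0.5 * math.exp(-r * t)


def state_set_probs(node, r):
    """Return (P_same, P_other, P_ambiguous) for the parsimony state set at node.

    Probabilities are conditional on the true state at node. A tip is None;
    an internal node is ((left_subtree, left_length), (right_subtree, right_length)).
    """
    if node is None:
        return 1.0, 0.0, 0.0  # a tip state is observed directly
    seen = []
    for child, length in node:
        a, b, c = state_set_probs(child, r)
        p = change_prob(r, length)
        # Re-express the child's set relative to the parent's true state.
        seen.append(((1 - p) * a + p * b, (1 - p) * b + p * a, c))
    (a1, b1, c1), (a2, b2, c2) = seen
    # Fitch rule: intersect the child sets if possible, otherwise take the union.
    same = a1 * a2 + a1 * c2 + c1 * a2
    other = b1 * b2 + b1 * c2 + c1 * b2
    ambiguous = c1 * c2 + a1 * b2 + b1 * a2
    return same, other, ambiguous


if __name__ == "__main__":
    # Hypothetical four-tip ultrametric tree of depth 1: ((A,B),(C,D)).
    cherry = ((None, 0.4), (None, 0.4))
    tree = ((cherry, 0.6), (cherry, 0.6))
    pc, pi, pa = state_set_probs(tree, r=2.0)
    print(f"Pc={pc:.4f}  Pi={pi:.4f}  Pa={pa:.4f}")
```

At the root the returned triple corresponds to Pc, Pi, and Pa, because the parsimony state set at the root is determined entirely by the two descendant subtrees.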
EXPERIMENTS AND RESULTS
Root State Estimation
Our first experiments were designed to
reflect a commonly encountered endeavor:
estimating ancestral character states at the
root of a particular clade (e.g., orchid habit;
Frohlich, 1987). The researcher must decide
how thoroughly to sample taxa from among
the observable, extant members of the clade.
To address this situation, we examined how
the probability of correct root state estimation varies in relation to subsampling different numbers of taxa from a larger clade.
We first examined how the probability, Pc,
of correctly estimating root ancestral states
responds to large changes in sample size.
For each of 100 replicate 512-tip trees, we examined nested subsamples of size N = 16,
32, 64, 128, 256, and 512 tips. For each tree,
we began with a subsample of 16 equiprobably chosen tips that together spanned the
root of the tree. Each larger sample was generated by adding more taxa to the preceding smaller sample, as might be done in an
empirical study. We calculated Pc for a homogeneous instantaneous rate of character
change, r, for each tree and subtree created
in the above fashion. This analysis was conducted for character change rates of 0.5, 1.0,
and 2.0. Because the total time depth of the
tree was a unit interval, the characters were
expected to have 1, 2, or 4 changes over a path
connecting any pair of tips that spanned the
root of the tree. For a pure birth model of
speciation, the total number of steps over the
entire tree has the expectation (derived from
Ross, 1996):

$$Z = 2t + (N - 2)\,\frac{1 - e^{-\lambda t} - \lambda t e^{-\lambda t}}{\lambda\left(1 - e^{-\lambda t}\right)} \tag{2}$$

Therefore, with time t = 1 and r = 0.5, 1.0,
and 2.0, the total expected numbers of character changes over the whole tree are roughly
bounded by 46, 92, and 184, respectively.
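The quoted figures can be reproduced directly from equation 2. The sketch below reads λ as the speciation rate ln(512/2) and Z as the expected summed branch length, so that the expected number of changes is r × Z; that reading is an interpretation of the text rather than something it states outright, and the function name is hypothetical.

```python
import math


def expected_total_branch_length(n_tips, rate, t=1.0):
    """Equation 2: expected summed branch length of a pure-birth tree.

    rate is the speciation rate (lambda in equation 2) and n_tips is N.
    """
    decay = math.exp(-rate * t)
    # Expected length of each of the N - 2 lineages that arise after the root split.
    per_lineage = (1 - decay - rate * t * decay) / (rate * (1 - decay))
    return 2 * t + (n_tips - 2) * per_lineage


if __name__ == "__main__":
    speciation_rate = math.log(512 / 2)
    z = expected_total_branch_length(512, speciation_rate)
    for r in (0.5, 1.0, 2.0):
        print(f"r={r}: expected changes about {r * z:.1f}")
```

With these inputs Z comes out just under 92, so the three rates give values close to 46, 92, and 184.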
Figure 1 depicts the mean values of the
probability of correct root state estimate, Pc ,
for the 100 trees under each set of conditions.
There is an invariant trend of increasing accuracy with increased sample sizes. Variation
masked by averaging of Pc over the sample
is hinted at by the standard deviation bars.
Details are discussed further later.
Secondarily, we considered the effects of
sampling at very small clade sizes. Because
the parsimony state estimation algorithm is
strictly a function of the tree topology, it can
display aberrant behavior. For example, in a
comb-shaped tree, the most “basal” lineages
are longer (i.e., more error prone) than the
rest yet exert an overwhelming influence regarding estimation of the root state. Such tree
topologies and behavior can be especially
common and pronounced when the number
of taxa is small.
FIGURE 2. Behavior of Pi, Pa, and Pc for subsample
trees with very few tips. Details as in Figure 1. Error bars
are shown only for Pc.
Using the same 512-tip tree generation as
above and an analogous sampling strategy,
we calculated the root ancestral estimation
probabilities for every value of N from 2 to
8 given r = 0.5, 1.0, and 2.0. Figure 2 shows
the results for r = 2, which were comparable with, though more extreme than, results
for the other two values of r. Probabilities for
each value of N were averaged over the 100
samples. The previously observed tendency
for Pc to increase with sample size (Fig. 1)
is still apparent. However, for the smallest
values of N, the exact number of tips is a
strong determinant of Pc , as is evident in the
clear oscillatory variation. Pi oscillates synchronously with Pc , whereas Pa oscillates out
of phase with the others.
Internal Node State Estimation
FIGURE 1. Mean probabilities, Pc , of correctly estimating the root state of a binary character evolving
at three rates (r) on subsamples of 512-tip, pure-birth
model trees. The bars around each mean indicate ± 1
SD based on a sample of 100 trees.
Our second set of experiments was designed to complement the above work by focusing on the estimation of ancestral states
throughout a phylogeny. We generated 100
trees as above and subsampled N = 16, 32,
64, 128, 256, and 512 tips, again requiring
that the root be spanned. We calculated the
conditional probabilities Pc , Pi , and Pa for
every internal node above the root. For each
node, we also noted the number of descendent tips and the temporal distance from the
root (from 0 to 1). This experimental design
treats each internal node state (a random
variable) as if it were the root state parameter. It does not assess the joint probability of
correct estimates at all internal nodes, only
the marginal states at each node.
Figure 3 depicts second-order local regressions (loess fit; Venables and Ripley, 1997)
of distance from root and Pc for the nodes
of each subsample size. The clear result is
that, at any depth in the tree, adding more
terminal taxa to the study (not necessarily
within the subtended clade) tends to increase
estimation success at an internal node. Furthermore, the deeper the node is in the tree,
the more important taxon sampling density
becomes.
FIGURE 3. Probability of correctly estimating an internal node state as a function of its position. Characters are modeled as binary with symmetric change
rate r = 2.0. Time depth is the position of the internal
node relative to the terminal taxa and the root node (the
present = 0; the root position = 1). Pc is the conditional
probability of correctly reconstructing the internal node
state. The curves are for different sizes of samples of
the original tree. From top to bottom the sample sizes
are 512, 256, 128, 64, 32, and 16, respectively. The curves
were obtained by a second-order local regression of the
internal node positions and their probability of correct
state estimate.
DISCUSSION
The primary implication of our results is
that the probability of correctly estimating
the ancestral states of characters can be increased by adding more taxa to an analysis. This holds for both internal states and
root states. Despite variability, the pattern
of increased root state Pc with increased
taxon sampling held almost universally in
our first experiment. When the taxon density was doubled, only 2.3% of the cases resulted in a decreased Pc. Furthermore, the
magnitudes of those decreases tended to be
slight and they occurred primarily for r =
0.5, where sample size had little effect in
either direction because of the conservative
pace of character evolution. Mean improvement for root state Pc varied by size; Figure 1
shows the diminishing return of sample doubling. Diminishing returns on taxon sampling investment were less evident for high
r, as seen in Figure 2 and the bottom curve of
Figure 1.
The positive association between sampling
density and accuracy seen in Figure 1 is an
average tendency that hides some interesting
variation. Rather than being normally distributed around the means, Pc is distinctly
bimodal, especially at low taxon sizes. The
one restriction we placed on our subsampling was that the root had to be spanned;
that is, the two branches distinguished by the
root node had to be represented by at least
one taxon each. In a repeat of the first experiment, if we include an additional requirement that the two halves of each tree subsample be represented by at least two tips each,
the bimodality effectively disappears (data
not shown). The original bimodality can be
attributed to the presence of trees that were
excluded by the new criterion: trees in which
a monotypic lineage forms the sister group
to the rest of the taxa (1 and N − 1). In those
trees, the monotypic branch contributes an
observed rather than estimated state to the
estimation process at the root; although this
branch is the longest branch in the tree (and
therefore most error prone), the state observation at its tip has more influence than any
other tip state observation because it contributes directly to the root estimation and is
valued equally with the estimate at the root of
the sister clade. Trees with a 2:14 split have
higher Pc and lower Pa and Pi on average
than the 1:15 trees.
A different kind of variability was evident
for small taxon samples. At low N, parity is a
key determinant of root state estimation success (Fig. 2). When an odd number of tips is
used, ambiguity is more likely to be negated
at the root. We do not believe that these results argue for intentionally sampling even
or odd numbers of taxa. Rather, the differential results reflect parity-dependent topological sampling and parsimony’s topology-dependent estimation effects. However, the
findings do suggest that if a study finds an
unambiguous root estimate when N is low
and odd, the estimate should be viewed suspiciously as a possible artifact of parsimony
when there is any ambiguity at nodes near
to the root. We also considered the role of
parity for internal state estimation. When we
plotted (not shown) Pc for the internal nodes
of 16-taxon samples separately according to
number of descendent taxa, we found that
parity again made a large difference: Especially when near the root, Pc values were relatively higher for nodes with an even number
of descendents.
Parity, however, is only a crude indicator
of the likely extent of ambiguity. Figure 4
shows the internal Pa values for every four-descendent node in 16-tip subsamples of
500 random 512-tip trees. The points are
marked according to whether the clade is
topologically symmetrical (2:2) or asymmetrical (1:3). Pa is distinctly greater for the balanced clades. Similarly, Pc and Pi also show
bimodality. Clearly, topological considerations may be important when assessing the
success of parsimony estimates of ancestral
states.
FIGURE 4. Probability of ambiguously estimating the
state of an internal node with four descendent terminal
taxa as a function of its position and topology. Characters are modeled as binary with symmetric change rate
r = 2.0. Pa is the conditional probability of ambiguously
reconstructing the internal node state. The points shown
represent all four-tip clades from a set of 500 16-tip, root-spanning subtrees. The points are labeled according to
whether the clade is symmetrical (2:2) or asymmetrical
(1:3).
Another aspect of parsimony estimates
that deserves mention is the difference between root and internal state estimation. In a
fully resolved tree, root estimates derive from
two subestimates (left and right), whereas
internal estimates are based on left, right,
and ancestral subestimates. This difference
dramatically affects how much ambiguity is
expected. Notice that the Pc values at time
depth = 1 in Figure 3 are distinctly greater
than the corresponding Pc values in Figure 1
(the r = 2 curve). This discrepancy appears
to largely reflect the shift in ambiguity; for
example, when r = 2 and N = 16, Pa = 0.32
for the root, whereas Pa = 0.23 for internal
nodes near the root (time depth ≥ 0.99).
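The three-subestimate structure of an internal node can be sketched by treating the node as the center of a trichotomy. The code below is an illustration under assumptions, not the authors' implementation: trees use a hypothetical encoding (a tip is `None`, a binary node is a pair of `(subtree, branch_length)` tuples), the caller supplies the already rerooted three neighbors, and the three-way node is resolved by a majority vote among the neighbor state sets, with ties counted as ambiguous.

```python
import itertools
import math


def change_prob(r, t):
    """Branch change probability in the form used in the paper: 1/2 - (1/2) exp(-r t)."""
    return 0.5 - 0.5 * math.exp(-r * t)


def seen_from_parent(child, length, r):
    """Child's state-set probabilities relative to the true state at the far end of the branch."""
    a, b, c = subtree_set_probs(child, r)
    p = change_prob(r, length)
    return (1 - p) * a + p * b, (1 - p) * b + p * a, c


def subtree_set_probs(node, r):
    """(P_same, P_other, P_ambiguous) for a binary subtree, given its root's true state."""
    if node is None:
        return 1.0, 0.0, 0.0
    (a1, b1, c1), (a2, b2, c2) = (seen_from_parent(ch, ln, r) for ch, ln in node)
    return (a1 * a2 + a1 * c2 + c1 * a2,
            b1 * b2 + b1 * c2 + c1 * b2,
            c1 * c2 + a1 * b2 + b1 * a2)


def trichotomy_probs(neighbors, r):
    """(Pc, Pi, Pa) at a node joined to three (subtree, branch_length) neighbors."""
    views = [seen_from_parent(child, length, r) for child, length in neighbors]
    totals = [0.0, 0.0, 0.0]  # correct, incorrect, ambiguous
    # Index 0: set equals the true state; 1: the other state; 2: both states.
    for combo in itertools.product(range(3), repeat=3):
        weight = math.prod(view[k] for view, k in zip(views, combo))
        votes_true, votes_other = combo.count(0), combo.count(1)
        if votes_true > votes_other:
            totals[0] += weight
        elif votes_other > votes_true:
            totals[1] += weight
        else:
            totals[2] += weight
    return tuple(totals)


if __name__ == "__main__":
    cherry = ((None, 0.2), (None, 0.2))
    print(trichotomy_probs([(cherry, 0.3), (None, 0.5), (cherry, 0.4)], r=2.0))
```

With three subestimates an exact tie requires at least one ambiguous neighbor or a one-against-one split with the third neighbor ambiguous, which is one way to see why ambiguity can be lower at internal nodes than at a two-branch root.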
Effective Information of Added Taxa
It is useful to consider a simpler case to analyze the general factors affecting root state
estimation. Suppose we have a single lineage
with the ancestor at time t = 0 and a descendent at time t = τ and binary state characters with symmetric probability of change. If
the character state, X, of the descendent is
0, then our best guess at the ancestor state
is also 0, because we have no other information at hand. If we model the character evolution process as a continuous-time Markov
model, the probability that the descendent
will be identical to the ancestor is $\frac{1}{2} + \frac{1}{2}e^{-rt}$.
Thus Pc is a decreasing function at the order of $\sim O(e^{-rt})$, where r is the rate constant.
As expected, the quality of information about
root state degrades the further away in time
we sample the descendent state.
Suppose now we have N descendent lineages and these are arranged as a star topology. Then we have N independent lines of
evidence about the ancestral state. Let f_0 and
f_1 be the frequency of state 0 and state 1
observed over the N descendent taxa. Then
the maximum parsimony estimate of the ancestral state is the most common state (as
is the maximum likelihood estimate). In the
next steps we assume state 0 is the true ancestral state without loss of generality. As
we add independent lineages, Prob{f_0 > f_1}
goes to 1 as N goes to infinity (even if the
length of each lineage varies) as long as
Prob{descendent state is different from ancestor state} < 1/2, consistent with the results of
Frumhoff and Reeve (1994). Thus, when we
have a star topology and the probability of
difference between the ancestor and descendent is bounded below 1/2, we always converge
to the true estimate as we add taxa.
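The convergence claim for the star topology can be illustrated with an exact binomial calculation. This is a sketch with assumed names: every lineage is given the same probability p of differing from the ancestor, and a tie between the two state counts is counted as ambiguous.

```python
import math


def star_tree_probs(n_tips, p):
    """Exact (Pc, Pi, Pa) for plurality estimation on a star tree.

    Each of the n_tips lineages independently differs from the ancestor with
    probability p; a tie in the counts is treated as ambiguous.
    """
    pc = pi = pa = 0.0
    for k in range(n_tips + 1):  # k tips show the non-ancestral state
        prob = math.comb(n_tips, k) * p ** k * (1 - p) ** (n_tips - k)
        if 2 * k < n_tips:
            pc += prob
        elif 2 * k > n_tips:
            pi += prob
        else:
            pa += prob
    return pc, pi, pa


if __name__ == "__main__":
    p = 0.5 - 0.5 * math.exp(-2.0)  # r = 2, t = 1 in the paper's parameterization
    for n in (1, 4, 16, 64, 256):
        pc, pi, pa = star_tree_probs(n, p)
        print(f"N={n:4d}  Pc={pc:.4f}  Pi={pi:.4f}  Pa={pa:.4f}")
```

The direct sum is adequate for the tip counts shown; much larger N would call for log-space arithmetic to avoid floating-point overflow in the binomial coefficients.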
Let Prob{descendent state is different from ancestor state} be $p = \frac{1}{2} - \frac{1}{2}e^{-rt}$; then f_0 and f_1
follow a binomial distribution. To compute
Prob{f_0 > f_1}, we can use the normal approximation to compute the Z-score first.

$$Z \sim \frac{N(1-p) - N/2}{\sqrt{Np(1-p)}} = \frac{\sqrt{N}}{\sqrt{e^{2rt} - 1}} \approx O\!\left(\sqrt{N}\,e^{-rt}\right). \tag{3}$$

Therefore, the standard Z-score increases as a
square root of N and decreases as a negative
exponential function of time. If N is large,
we can use an approximation to the cumulative distribution of the normal distribution
(Rohatgi, 1976):

$$\text{Prob}\{Z > x\} = \frac{1}{x\sqrt{2\pi}}\,e^{-x^2/2} \quad (x \to \infty) \tag{4}$$

and substituting (3) gives us the scaling relationship:

$$\text{Prob}\{\text{correct root state estimate}\} \approx 1 - \frac{1}{\sqrt{2\pi N}}\exp\!\left(rt - \frac{N}{2}e^{-2rt}\right) \tag{5}$$

When the lineages have more treelike
structure, we have dependencies in the lineages and therefore less total information as
well as less potentially misleading information. For example, if we have 99 lineages
forming a very recently diversified clade
with a very long stem and a single sister
lineage, even if we have 99 lineages with
state 0, we would be wary of declaring the
ancestral state as 0 if the monotypic sister
was state 1. Tree-dependent estimates, such
as parsimony estimates or maximum likelihood estimates, attempt to resolve this dependency problem by incorporating the tree
structure into the state estimates. What about
“less information” provided by the tree itself? In the star topology case, each lineage
provides independent information about the
root state, leading to the scaling relationship
given in equation 5. In a “normal” tree the
dependencies reduce the effective number of
informative lineages. Here we conjecture that
the number of informative lineages in a Yule
tree (or similar model trees with constant expected branch length) scales as Log(N). This
gives us a conjectured scaling relationship for
Yule trees in terms of Z-scores:

$$Z \sim \frac{\sqrt{\operatorname{Log} N}}{\sqrt{e^{2rt} - 1}} \approx O\!\left(\sqrt{\operatorname{Log} N}\,e^{-rt}\right) \tag{6}$$

The scaling relationship given by equation
6 fits our data well (not shown). It is also supported by comparing our results with those
of Steel and Charleston (1995). Because the
total amount of time in our trees is given approximately by equation 2 and because there
are 2N − 2 branches in any tree, the average branch length of our trees (equation 2
divided by 2N − 2) rapidly asymptotes to a
constant length as the number of lineages is
increased. Therefore, contrary to the intuition
of “breaking up” branches, average branch
length of a pure-birth tree stays constant regardless of the number of taxa; it does not decrease to 0. Steel and Charleston’s trees also
have constant average branch length. However, the total time from the tips to the root
is a fixed constant in our trees, whereas it
scales as a function of Log(N) in Steel and
Charleston’s trees. Letting rt = Log(N) and
substituting this into equation 6 shows that
we expect Z-scores to go to 0 as N increases
for Steel and Charleston’s trees. Therefore,
we would expect the probability of correct
root state estimate to decrease with N, consistent with their results.
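The contrast between the two sampling models can be made concrete by evaluating the Z-score expressions of equations 3 and 6. The sketch below only evaluates those formulas; equation 6 is a conjecture in the paper, and the function names and the example value rt = 2 are assumptions chosen for illustration.

```python
import math


def z_star(n, rt):
    """Equation 3: Z-score for a star tree with n tips."""
    return math.sqrt(n) / math.sqrt(math.exp(2 * rt) - 1)


def z_yule_conjecture(n, rt):
    """Equation 6 (conjectured): sqrt(n) is replaced by sqrt(Log n)."""
    return math.sqrt(math.log(n)) / math.sqrt(math.exp(2 * rt) - 1)


if __name__ == "__main__":
    for n in (16, 64, 256, 1024, 4096):
        fixed_depth = z_yule_conjecture(n, rt=2.0)            # root-to-tip depth held constant
        growing_depth = z_yule_conjecture(n, rt=math.log(n))  # depth grows as Log(N)
        print(f"N={n:5d}  star={z_star(n, 2.0):8.3f}  "
              f"fixed depth={fixed_depth:.4f}  growing depth={growing_depth:.6f}")
```

With the depth held fixed the conjectured Z-score rises slowly with N, whereas setting rt = Log(N) drives it toward zero, in line with the comparison to Steel and Charleston's trees.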
Conditions and Conclusions
As listed in the Methods section, we made
several assumptions in this study, of which
two are particularly noteworthy. First, tree
topology is known (i.e., correctly estimated)
before ancestral character states are estimated. The significance of this assumption is
perhaps not as great as it might seem; Zhang
and Nei (1997) found that errors in topology estimation had negligible effects on state
estimation at nodes that were not in the immediate neighborhood of a topology error.
From our analysis, we can see that the main
effect of the tree topology estimate is to correct for the dependence structure in the data.
Then, as long as the estimated trees are not
wildly deviant from the true trees (i.e., as long
as they capture the rough dependence structure of the trees), ancestral state estimates apparently would not be greatly affected. However, a more thorough investigation of this
problem should still be attempted.
A second critical assumption of this study
was that taxon sampling is random with respect to taxon identity and phylogeny. In
practice, taxon sampling will depend on
the availability of specimens and data, researchers’ opinions on what constitutes an
appropriate sampling scheme, and the objectives of the study beyond the estimation
of ancestral states of particular characters. At
worst, or perhaps best, taxa may be chosen
with specific regard to a character of interest. It is quite possible to sample taxa in a
pathological manner that will decrease the
probability of correctly estimating an ancestral state. However, from our analysis we expect the results here to be generally robust
to any particular taxon sampling scheme—
as long as the total depth of the tree (as measured by expected number of changes) does
not significantly increase as a result of the
added taxa.
The findings of this paper present clear evidence that increased taxon sampling, as a
general practice, can be helpful in estimating
ancestral character states. This pattern appears unaffected by rate of character change
as long as the total depth of the tree does
not increase with increased taxa. Sampling
taxa more densely, especially when the sample would otherwise be sparse, appears to
be a reliable way to improve ancestral character state estimates. We also demonstrated
some peculiar properties of the parsimony algorithm. Because parsimony has the effect of
considering all branches to be equally prone
to character change and because it deals in
absolute state assignment and nongraded
ambiguity, topology can have a large influence on estimation success.
Finally, we note that all our results are
with respect to marginal states at a particular
node, root or otherwise. The joint estimate at
all nodes is a complicated problem. For an
N-taxon tree, we have N state observations
that we are hoping to use to deduce N − 1
unobserved states. The difficulty of such a
problem is evident. Ancestral state estimation is crucial to phylogenetic biology, but far
less attention has been paid to the problem
than to the problem of tree topology estimation. Many open questions remain for future
studies.
ACKNOWLEDGMENTS
We are grateful to Dick Olmstead, David Ackerly, and
an anonymous reviewer for their useful comments, especially the encouragement to expand our research to
include internal node estimation. This work was supported in part by NSF grant DEB-9806570 to J.K. B.A.S.
was also supported through a Forest B.H. and Elizabeth
D.W. Brown Postdoctoral Fellowship. This paper is dedicated to F. James Rohlf on his 65th birthday.
REFERENCES
CUNNINGHAM, C. W. 1999. Some limitations of ancestral character-state reconstruction when testing evolutionary hypotheses. Syst. Biol. 48:665–674.
FROHLICH, M. W. 1987. Common-is-primitive—a partial validation by tree counting. Syst. Bot. 12:217–237.
FRUMHOFF, P. C., and H. K. REEVE. 1994. Using phylogenies to test hypotheses of adaptation—a critique of some current proposals. Evolution 48:172–180.
KIM, J. 1996. General inconsistency conditions for maximum parsimony: Effects of branch lengths and increasing numbers of taxa. Syst. Biol. 45:363–374.
MADDISON, W. P. 1995. Calculating the probability distributions of ancestral states reconstructed by parsimony on phylogenetic trees. Syst. Biol. 44:474–481.
MARTINS, E. P. 1999. Estimation of ancestral states of continuous characters: A computer simulation study. Syst. Biol. 48:642–650.
MOOERS, A. Ø., and D. SCHLUTER. 1999. Reconstructing ancestor states with maximum likelihood: Support for one- and two-rate models. Syst. Biol. 48:623–633.
OMLAND, K. E. 1999. The assumptions and challenges of ancestral state reconstructions. Syst. Biol. 48:604–611.
PAGEL, M. 1999. The maximum likelihood approach to reconstructing ancestral character states of discrete characters on phylogenies. Syst. Biol. 48:612–622.
REE, R. H., and M. J. DONOGHUE. 1999. Inferring rates of change in flower symmetry in asterid angiosperms. Syst. Biol. 48:633–641.
ROHATGI, V. K. 1976. An introduction to probability theory and mathematical statistics. Wiley & Sons, New York.
ROSS, S. M. 1996. Stochastic Processes, 2nd edition. Wiley & Sons, New York.
SCHULTZ, T. R., and G. A. CHURCHILL. 1999. The role of subjectivity in reconstructing ancestral character states: A Bayesian approach to unknown rates, states, and transformation asymmetries. Syst. Biol. 48:651–664.
SCHULTZ, T. R., R. B. COCROFT, and G. A. CHURCHILL. 1996. The reconstruction of ancestral character states. Evolution 50:504–511.
STEEL, M., and M. CHARLESTON. 1995. Five surprising properties of parsimoniously colored trees. Bull. Math. Biol. 57:367–375.
VENABLES, W. N., and B. D. RIPLEY. 1997. Modern Applied Statistics with S-Plus. Springer-Verlag, New York.
ZHANG, J., and M. NEI. 1997. Accuracies of ancestral amino acid sequences inferred by the parsimony, likelihood, and distance methods. J. Mol. Evol. 44:S139–S146.
Received 4 May 2000; accepted 17 July 2000.
Associate Editor: R. Olmstead