Cladistics

Cladistics 20 (2004) 579–582
www.blackwell-synergy.com
Mapping characters on a tree with or without the outgroups
Philippe Grandcolas, Eric Guilbert, Tony Robillard, Cyrille A. D’Haese, Jérôme Murienne
and Frédéric Legendre
FRE 2695 CNRS, Département Systématique et Evolution, case 50, Muséum national d’Histoire naturelle, 45, rue Buffon, 75005 Paris, France
Accepted 27 September 2004
Abstract
A character of special interest in evolutionary studies is usually optimized on a phylogenetic tree, with or without the outgroups
employed in that analysis. Neither practice is ever justified, and both look like arbitrary choices. Focusing on one example, we draw the
conclusion that authors retain or remove outgroups depending on the way these outgroups sample the diversity of states of the
character(s) of special interest. The topology without outgroups is often used by authors when different outgroup taxa non-exhaustively sample the different states of the character of interest outside of the ingroup. This can make the analysis incoherent,
because its different steps are not based on the same data matrix (outgroups are removed in the last step). It can provide several
incoherent and possibly different patterns for the same character of interest, one issuing from the first step of phylogeny construction
and the other resulting from the a posteriori optimization on the truncated topology. Phylogenetic analyses should be designed to
minimize this problem, selecting outgroup and ingroup taxa whose diversity of character states is needed for reconstructing the
evolutionary history of the character of interest.
© The Willi Hennig Society 2004.
Over the last two decades, phylogenetic trees have
increasingly been used to infer the evolutionary histories
of characters of special interest (Coddington, 1988, 1994;
Carpenter, 1989; O’Hara, 1992; Eggleton and Vane-Wright, 1994; Miller and Wenzel, 1995; Grandcolas,
1997; Brooks and McLennan, 2002; Grandcolas and
D’Haese, 2003). This particular use of phylogenetic trees
has produced some methodological problems. The
assessment of ‘‘primary homology’’ of characters of
special interest (behavioral, ecological, etc.) has been
questioned (but see de Queiroz and Wimberger, 1993;
Wenzel, 1992; Desutter-Grandcolas and Robillard,
2003). The exclusion versus inclusion of characters of
interest in the matrix has been diversely favored (de
Queiroz, 1996 versus Deleporte, 1993; Kluge and Wolf,
1993; Luckow and Bruneau, 1997; Zrzavý, 1997; Grandcolas et al., 2001). Adaptational studies have been ill-conceived, optimizing characters together with their
supposed selective regimes on the tree (see Grandcolas
and D’Haese, 2003 contra Baum and Larson, 1991) and
the phylogenetic reconstruction of supposedly adaptive
characters has also been questioned itself, on the basis of
particular adaptationist models (see Schultz et al., 1996
contra Frumhoff and Reeve, 1994). As we will show in this
short review, using trees in evolutionary studies has often
separated the characters of interest and has led to some ad
hoc treatments and spurious cladistic conclusions.
Another methodological problem related to the use of
phylogenetic trees has not been mentioned until now
and emerges from an examination of the relevant
literature. When characters of interest are optimized a
posteriori on a tree (sometimes obtained from the
analysis of other evidence), they are optimized either
on the complete tree or on the tree minus the outgroup(s). A large and recent sample of studies has
shown that both practices are common, equally represented and never justified in this respect (see for example
the studies listed by Brooks and McLennan, 2002). We
examine the rationale for each practice and whether one
must be discarded in deference to general cladistic
methodology.
At first glance, excluding the outgroup from the tree
for optimizing a particular character seems like discarding evidence. A cladogram results from the analysis of a
matrix with the character states coded for every taxon,
including the taxon selected as an outgroup. Rooting the
tree is the last step in the analysis, and does not change
the topology of the network relating the taxa (Farris,
1982; Nixon and Carpenter, 1993). So, why remove the
outgroup from the topology when optimizing the
character of interest? If the outgroup is good enough
to polarize the characters, is it not good enough to
polarize the character of interest by optimization on a
complete tree?
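The claim that rooting leaves the network unchanged can be illustrated with a minimal Python sketch. It assumes a hypothetical five-taxon tree written as nested tuples (the taxon names Out, A, B, C and D and the helper functions tips and splits are illustrative and come from no cited analysis). It compares the sets of bipartitions implied by two rootings of the same network with those of a genuinely different topology.

```python
def tips(tree):
    """Return the set of terminal names below a node (a string is a terminal)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))


def splits(tree):
    """Return the bipartitions of the terminals implied by a rooted tree.

    Each split is stored as an unordered pair {clade, complement}, so the
    position of the root does not affect the result.
    """
    all_tips = tips(tree)
    found = set()

    def visit(node):
        clade = tips(node)
        if clade != all_tips:
            found.add(frozenset([clade, all_tips - clade]))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return found


# Hypothetical trees: the first two are the same network rooted on
# different branches; the third swaps the positions of A and B.
ROOTED_ON_OUTGROUP = ("Out", ("A", ("B", ("C", "D"))))
ROOTED_ELSEWHERE = (("Out", "A"), ("B", ("C", "D")))
OTHER_TOPOLOGY = (("Out", "B"), ("A", ("C", "D")))

same_network = splits(ROOTED_ON_OUTGROUP) == splits(ROOTED_ELSEWHERE)
different_network = splits(ROOTED_ON_OUTGROUP) != splits(OTHER_TOPOLOGY)
print(same_network, different_network)
```

Under these assumptions the two rootings share the same seven splits, whereas the rearranged tree does not; only the position of the root differs between the first two trees.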
In addition, the logical consistency of the whole
cladistic analysis must be carefully considered. In
particular, when the characters of interest have been
used in the analysis (included in the matrix), using a
truncated tree for a posteriori optimizations is inconsistent: two possibly different phylogenetic patterns are
de facto considered for that character, one supporting
the tree and one used for evolutionary inference. It
seems incoherent to keep the second pattern for reference, knowing that the tree topology (on which the
optimization was derived) was perhaps strongly justified
by the first pattern.
Looking further at case studies may help to clarify the
reasons for different practices in this respect. Recently,
Thompson et al. (2000) studied the evolution of the
‘‘true’’ worker caste in termites with reference to a
molecular tree. According to different and opposite
evolutionary models, this caste was thought to be either
ancestral or derived in termites, especially with respect
to the evolution of foraging behavior. According to the
optimization of this caste onto the tree without the
outgroup (cockroaches), Thompson et al. (2000) found
the true worker caste to be ancestral and claimed to have
solved the controversy (Fig. 1A). In fact, using the
whole tree for optimizing the character produces a
different and indecisive result with several equally
parsimonious patterns (Figs 1A–C), which include a
worker caste, either ancestral to termites, or derived in
some termites, or even ancestral to cockroaches and
termites altogether (Grandcolas and D’Haese, 2002,
2004). In this case, using the outgroup does not help to
answer the question which was asked with the tree,
showing once more that a resolved or complete tree does
not necessarily bring more decisive answers (Wenzel,
1997). But there are no a priori justifications for using a
truncated tree lacking the cockroach outgroup which
was otherwise part of the analysis (contra Thompson
et al., 2003). In addition, any outgroup and any putative
sister-group (either mantids or cockroaches, or any
other hemimetabolous insects) in that case would
present the same state of the character of interest,
namely ‘‘lacking a worker caste’’ (‘‘worker caste’’ being
defined here as resulting from the hemimetabolous
developmental pathway of dictyopteran nymphs).
In that case, not using the outgroup suppresses
information, even though there are no reasons to think that
this information is doubtful or the result of a poor
sampling strategy. Brooks and McLennan (2002, p. 270)
argued accordingly when they stated ‘‘the only way to be
sure that you have the most parsimonious optimization
for your character is to include information from
outgroups in the analysis …’’. We would specify: not
the most parsimonious but the most logically consistent
optimization for the character of interest.
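The contrast between the truncated and the complete tree can be reproduced with a toy example. The following Python sketch is not the termite data set: the tree, the taxon names (Out, A to D) and the binary states are hypothetical, and the exhaustive enumeration in most_parsimonious is a brute-force stand-in for the optimization routines of phylogenetic software. It lists every assignment of states to the internal nodes and keeps those with the fewest changes.

```python
from itertools import product


def internal_nodes(tree):
    """Return all internal nodes (nested tuples) of a tree, root first."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree:
        nodes.extend(internal_nodes(child))
    return nodes


def count_steps(tip_states, assignment):
    """Count state changes over all branches for one assignment of states."""
    def state(node):
        return tip_states[node] if isinstance(node, str) else assignment[node]

    return sum(state(node) != state(child)
               for node in assignment for child in node)


def most_parsimonious(tree, tip_states, alphabet=(0, 1)):
    """Return (minimum number of steps, list of all optimal assignments)."""
    nodes = internal_nodes(tree)
    best, patterns = None, []
    for combo in product(alphabet, repeat=len(nodes)):
        assignment = dict(zip(nodes, combo))
        cost = count_steps(tip_states, assignment)
        if best is None or cost < best:
            best, patterns = cost, [assignment]
        elif cost == best:
            patterns.append(assignment)
    return best, patterns


# Hypothetical data: state 1 = character present, 0 = absent.
INGROUP = ("A", ("B", ("C", "D")))
COMPLETE = ("Out", INGROUP)
STATES = {"Out": 0, "A": 1, "B": 0, "C": 1, "D": 1}

steps_cut, patterns_cut = most_parsimonious(INGROUP, STATES)
steps_full, patterns_full = most_parsimonious(COMPLETE, STATES)

# States reconstructed at the ingroup node across all optimal patterns.
ingroup_cut = {p[INGROUP] for p in patterns_cut}
ingroup_full = {p[INGROUP] for p in patterns_full}

print("truncated:", steps_cut, len(patterns_cut), sorted(ingroup_cut))
print("complete: ", steps_full, len(patterns_full), sorted(ingroup_full))
```

With these hypothetical states, the truncated tree yields a single reconstruction in which state 1 is ancestral to the ingroup, whereas the complete tree yields three equally parsimonious reconstructions that disagree on the state at the ingroup node.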
Looking further at the same case study shows,
however, that there are some cases where
using a truncated tree may have a rationale, even if a
disputable one. With their molecular phylogenetic tree,
Thompson et al. (2000) also studied the evolution of
foraging behaviour, which is supposed to be relevant to
the evolution of a ‘‘true’’ worker caste. Termites are
either foraging within the wood nest (‘‘one-piece’’), or
outside the nest (‘‘intermediate’’ or ‘‘separate’’ life
types), according to the terminology of Abe (1990).
Arguably, Thompson et al. (2000) assumed that
cockroaches exhibit the ‘‘one-piece’’ foraging behaviour
or are a ‘‘one-piece-like ancestor’’, according to the
[Figure 1: three panels (A, B, C), each showing the same cladogram with the terminal taxa Blattidae (cockroaches, outgroup), Mastotermitidae, Termopsidae, Hodotermitidae, Kalotermitidae, Serritermitidae, Rhinotermitidae and Termitidae (termites).]
Fig. 1. Optimization of the character ‘‘worker caste’’ on the cladogram of termites using cockroaches as an outgroup. If cockroaches are not used for
optimization, there is only one most parsimonious pattern (A) with worker caste (black line) ancestral and absence of worker caste homoplastically
derived (gray line). If optimization is made on the complete tree including cockroaches, there are three equiparsimonious patterns (A, B, C) with the
caste ancestral or not. Any hemimetabolous potential outgroup lacks a caste, so there is no conceivable rationale for considering a topology without
an outgroup.
[Figure 2: four panels (A, B, C, D), each showing the same cladogram with the terminal taxa Blattidae, Mastotermitidae, Termopsidae, Hodotermitidae, Kalotermitidae, Serritermitidae, Rhinotermitidae and Termitidae.]
Fig. 2. Optimization of the character ‘‘foraging behaviour’’ on the cladogram of termites using cockroaches as the outgroup. ‘‘Foraging behaviour’’
has two states, either foraging within the wood nest (‘‘one-piece’’ foraging type, here black lines), or outside the nest (‘‘intermediate’’ or ‘‘separate’’
life types, here grey lines). Any group related to termites (including cockroaches) from which outgroups can be taken is polymorphic, which
makes the choice of a particular outgroup a crucial question. There are four more parsimonious patterns (A, B, C, D), and patterns A (two steps) versus
B, C, D (three steps) depend, respectively, on the outgroup (Blattidae) state.
misleading conception of the woodroach Cryptocercus
taken as a ‘‘living ancestor’’ of termites (Grandcolas and
D’Haese, 2002, 2004). Actually, cockroaches are polymorphic with respect to foraging behaviour and indeed
present some additional character states not observed in
termites. In that case, Thompson et al. (2000) optimized
foraging behaviour onto the complete tree comprising
the outgroup coded monomorphic and they obtained
one single pattern (Fig. 2A, according to Grandcolas
and D’Haese, 2002). It is, however, obvious to any
entomologist that all groups of insects more or less
closely related to termites, including cockroaches, which
can be selected as outgroups, are polymorphic with
respect to foraging behaviour. Thus, using the whole
tree (with such kinds of outgroups) for optimising the
character of foraging behaviour would result in a lottery
with some sort of random outgroup (Wheeler, 1990). In
that case, ironically, one might wonder if truncating the
tree by suppressing the outgroup nodes would not be
less poorly conceived, even if it remains a questionable
practice for the reasons explained above. The rationale
would be that it could permit dealing with reasonable
optimisations which do not depend on a random
character state in some outgroup taxa which do not
represent a correct sample of the polymorphism of their
groups.
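The dependence of the result on the state coded for the outgroup can be sketched in the same spirit. The example below assumes a hypothetical tree and hypothetical binary states (0 and 1), not the data of Thompson et al. (2000), and uses the Fitch downpass for an unordered character. Treating a polymorphic terminal as the set of its observed states is one common convention, adopted here as an assumption.

```python
def fitch(tree, tip_sets):
    """Fitch downpass on a binary tree of nested tuples.

    Returns (state set at this node, number of steps at or below it).
    """
    if isinstance(tree, str):
        return tip_sets[tree], 0
    left, right = tree
    left_set, left_steps = fitch(left, tip_sets)
    right_set, right_steps = fitch(right, tip_sets)
    common = left_set & right_set
    if common:
        # The descendants agree on at least one state: no extra step.
        return common, left_steps + right_steps
    # No shared state: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


# Hypothetical tree and ingroup states.
TREE = ("Out", ("A", ("B", ("C", "D"))))
INGROUP_STATES = {"A": {0}, "B": {0}, "C": {1}, "D": {1}}

# Three alternative codings of the same outgroup terminal.
CODINGS = [("coded 0", {0}), ("coded 1", {1}), ("polymorphic", {0, 1})]

results = {}
for label, coding in CODINGS:
    tip_sets = dict(INGROUP_STATES, Out=coding)
    root_set, length = fitch(TREE, tip_sets)
    results[label] = (sorted(root_set), length)
    print(f"outgroup {label}: root states {sorted(root_set)}, {length} step(s)")
```

Under these assumptions the tree length is one step when the outgroup is coded 0 or as polymorphic and two steps when it is coded 1, so the outcome follows the coding chosen for the outgroup.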
Thompson et al.’s (2000) case study is exemplary
because it deals with two kinds of characters of interest,
one of which is obviously monomorphic in the outgroups
and another which is obviously polymorphic in the
outgroups. It shows that one could be tempted, for
reasons of apparent common sense, to use both strategies
for optimizing the character of interest: using a tree
topology with or without the outgroup. However, as a
matter of principle, polarizations should be made with the
topologies including the outgroups coming out of the
analysis to preserve the logical consistency of the analysis
as far as possible, especially if the characters of interest are
included in the matrix. In addition, phylogenetic case
studies can deal, as in the present example, with several
interrelated characters that all deserve the same treatment.
In that case, treating two such characters differently
would add another kind of incoherence, dealing with
several topologies for different characters whose phylogenetic pattern is to be compared in the same clade.
If the outgroups are representative of the states of the
character of interest in their group, it is easy to fit the
prescription of keeping the topology complete as truly
resulting from the analysis. If the outgroup belongs to a
group which is highly polymorphic regarding the character of interest, it may not sample correctly the different
possible character states outside the ingroup. To
improve the situation, the study should be designed so
carefully that different outgroups (Nixon and Carpenter,
1993) used in turn (Barriel and Tassy, 1998) may be
selected to represent a complete sampling of the states of
the character of interest (e.g., D’Haese, 2000). However,
this strategy may prove deceptive in cases with several
characters of interest when it provides very
complicated, ambiguous and not necessarily robust
answers. An even better designed study should include
the groups closely related to the focal ingroup for the
question that is currently under study. In that way,
studying the evolution of a worker caste in termites
necessitates taking into consideration the related groups
lacking a worker caste but showing a diversity of states
for the characters supposedly causally related to evolution of the worker caste (Grandcolas and D’Haese,
2002, 2004).
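Whether a given choice of outgroups samples the known states of a character can be checked before the analysis is run. The short Python sketch below is only an illustration of that bookkeeping: the candidate taxa (O1 to O4), their states and the function missing_states are hypothetical and correspond to no published matrix.

```python
def missing_states(selected, taxon_states):
    """Return the states known among the candidates but absent from the selection."""
    known = set().union(*taxon_states.values())
    sampled = set().union(*(taxon_states[taxon] for taxon in selected))
    return known - sampled


# Hypothetical candidate outgroup taxa and the states each one shows.
CANDIDATES = {
    "O1": {"one-piece"},
    "O2": {"one-piece"},
    "O3": {"separate"},
    "O4": {"other"},
}

narrow = missing_states(["O1", "O2"], CANDIDATES)
broad = missing_states(["O1", "O3", "O4"], CANDIDATES)
print(sorted(narrow), sorted(broad))
```

In this hypothetical case the first selection leaves two states unsampled outside the ingroup, whereas the second covers all of them.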
In any case, the analysis should be carefully explained
so that optimizing with or without the outgroup does
not appear as a default option never questioned nor
justified. This subject draws attention to the problem of
design of the phylogenetic tests of evolutionary scenarios. Trees should not be taken at random in the
literature to test an evolutionary question, but should
be reconstructed in a carefully and specifically designed
analysis to answer appropriately this particular question. It means that the ingroup and the outgroups
should be delimited and sampled so that the diversity of
character states and taxa is optimal regarding the
question under study.
The design of phylogenetic studies should take this
point into consideration, rather than sampling
taxa only to find a correct or resolved tree for
classification purposes. Indeed, phylogenetic analyses should be considered open-ended studies, the data
sampling of which should be improved to test any evolutionary question adequately. Arguing that the
accuracy of the phylogenetic reconstruction does not
depend so much on the taxon sample (Rosenberg and
Kumar, 2001) seems odd in this respect. A small taxon
sample will always be detrimental to the use of phylogenetic trees for answering evolutionary questions.
Acknowledgements
We are grateful to James Carpenter, Pierre Deleporte,
Laure Desutter-Grandcolas, two anonymous referees
and Arnold Kluge who all read the manuscript and
helped us with their incisive comments.
References
Abe, T., 1990. Evolution of worker caste in termites. In: Veeresh,
G.K., Mallik, B., Viraktamath, C.A. (Eds.), Social Insects and the
Environment, pp. 29–30. Oxford & IBH, New Delhi.
Barriel, V., Tassy, P., 1998. Rooting with multiple outgroups:
consensus versus parsimony. Cladistics, 14, 193–200.
Baum, D.A., Larson, A., 1991. Adaptation reviewed: a phylogenetic
methodology for studying character macroevolution. Syst. Zool.
40, 1–18.
Brooks, D.R., McLennan, D.A., 2002. The Nature of Diversity. An
Evolutionary Voyage of Discovery. The University of Chicago
Press, Chicago.
Carpenter, J.M., 1989. Testing scenarios: Wasp social behavior.
Cladistics, 5, 131–144.
Coddington, J.A., 1988. Cladistic tests of adaptational hypotheses.
Cladistics, 4, 3–22.
Coddington, J.A., 1994. The roles of homology and convergence in
studies of adaptation. In: Eggleton, P., Vane-Wright, R.I. (Eds.),
Phylogenetics and Ecology. Linnean Society Symposium Series,
Number 17. Academic Press, London, pp. 53–78.
D’Haese, C., 2000. Is psammophily an evolutionary dead end? A
phylogenetic test in the genus Willemia (Collembola: Hypogastruridae). Cladistics, 16, 255–273.
Deleporte, P., 1993. Characters, attributes and tests of evolutionary
scenarios. Cladistics, 9, 427–432.
Desutter-Grandcolas, L., Robillard, T., 2003. Phylogeny and the
evolution of calling songs in Gryllus (Insecta, Orthoptera, Gryllidae). Zool. Scr. 32, 173–183.
Eggleton, P., Vane-Wright, R.I. (Eds.), 1994. Phylogenetics and
Ecology. Linnean Society Symposium Series, Number 17. Academic Press, London.
Farris, J.S., 1982. Outgroups and parsimony. Syst. Zool. 31, 328–334.
Frumhoff, P.C., Reeve, H.K., 1994. Using phylogenies to test
hypotheses of adaptation: a critique of some current proposals.
Evolution, 48, 172–180.
Grandcolas, P. (Ed.), 1997. The origin of biodiversity in Insects.
Phylogenetic tests of evolutionary scenarios. Mém. Mus. natl. Hist.
nat. 173, 1–345.
Grandcolas, P., D’Haese, C., 2002. The origin of a ‘true’ worker caste
in termites: phylogenetic evidence is not decisive. J. Evol. Biol. 15,
885–888.
Grandcolas, P., D’Haese, C., 2003. Testing adaptation with phylogeny:
How to account for phylogenetic pattern and selective value
together? Zool. Scr. 32, 483–490.
Grandcolas, P., D’Haese, C., 2004. The origin of a ‘true’ worker caste
in termites: Mapping the real world on the phylogenetic tree.
J. Evol. Biol. 17, 461–463.
Grandcolas, P., Deleporte, P., Desutter-Grandcolas, L., Daugeron, C.,
2001. Phylogenetics and ecology: as many characters as possible
should be included in the cladistic analysis. Cladistics, 17, 104–110.
Kluge, A.G., Wolf, A.J., 1993. Cladistics: what’s in a word? Cladistics,
9, 183–199.
Luckow, M., Bruneau, A., 1997. Circularity and independence in
phylogenetic tests of ecological hypotheses. Cladistics, 13, 145–151.
Miller, J.S., Wenzel, J.W., 1995. Ecological characters and phylogeny.
Ann. Rev. Entomol. 40, 389–415.
Nixon, K.C., Carpenter, J.M., 1993. On outgroups. Cladistics, 9, 413–
426.
O’Hara, R.J., 1992. Telling the tree: narrative representation and the
study of evolutionary history. Biol. Phil. 7, 135–160.
de Queiroz, A., 1996. Including the characters of interest during tree
reconstruction and the problems of circularity and bias in studies of
character evolution. Am. Nat. 148, 700–708.
de Queiroz, A., Wimberger, P.H., 1993. The usefulness of behavior for
phylogeny estimation – levels of homoplasy in behavioral and
morphological characters. Evolution, 47, 46–60.
Rosenberg, M.S., Kumar, S., 2001. Incomplete taxon sampling is not a
problem for phylogenetic inference. Proc. Natl. Acad. Sci. USA,
98, 10751–10756.
Schultz, T.R., Cocroft, R.B., Churchill, G.A., 1996. The reconstruction of ancestral character states. Evolution, 50, 504–511.
Thompson, G.J., Kitade, O., Lo, N., Crozier, R.H., 2000. Phylogenetic
evidence for a single, ancestral origin of a ‘true’ worker caste in
termites. J. Evol. Biol. 13, 869–881.
Thompson, G.J., Kitade, O., Lo, N., Crozier, R.H., 2003. The origin
of a ‘true’ worker caste in termites: weighing up the phylogenetic
evidence. J. Evol. Biol. 17, 217–220.
Wenzel, J.W., 1992. Behavioral homology and phylogeny. Ann. Rev.
Ecol. Syst. 23, 361–381.
Wenzel, J.W., 1997. When is a phylogenetic test good enough? In:
Grandcolas, P. (Ed.), The Origin of Biodiversity in Insects.
Phylogenetic Tests of Evolutionary Scenarios. Mém. Mus. natl.
Hist. nat. 173, 31–45.
Wheeler, W.C., 1990. Nucleic acid sequence phylogeny and random
outgroups. Cladistics, 6, 363–367.
Zrzavý, J., 1997. Phylogenetics and ecology: all characters should be
included in the cladistic analysis. Oikos, 80, 186–192.