Evolutionary Trees: A Brief Tutorial
Throughout Parasitology: A Conceptual Approach, but
especially in Chapter 2, evolutionary trees are used to
depict relationships among groups of parasites. As the
interpretation of such trees may not be familiar to many students,
here we provide a brief tutorial regarding basic terminology, construction
and use of evolutionary trees.
An evolutionary tree is a branching diagram (a dendrogram) that depicts
relationships among taxa of organisms. A taxon (plural, taxa) refers to members of a particular group at a taxonomic level, such as a species or family, and
while one tree might explore relationships among a group of species, another
might explore relationships among a group of families. Some basic aspects
of evolutionary trees are described in Figure 1. On such evolutionary trees a
group consisting of an ancestor and all of its descendants is referred to as a
clade or a monophyletic group. In Figure 1 the group consisting of the ancestor marked with the double asterisk and taxa D, F and C forms a clade,
and the group consisting of the ancestor marked with a single asterisk and
taxa D and F is another clade. As taxa D and F form a clade to the exclusion
of any other taxa they are termed “sister taxa”. The more common ancestors
that are shared to the exclusion of other taxa, the more closely related two
taxa are. For example, taxa D and F share four common ancestors, whereas A
is the most distantly related, sharing only one common ancestor (at the root
of the tree) with all the other members of the tree. An outgroup is a taxon outside
the group of interest that serves as a reference point used to “root” the tree. Some trees are unrooted, meaning they lack an
outgroup for comparison.
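
The terminology above can be made concrete with a short Python sketch. Because the Figure 1 diagram is not reproduced here, the topology in `TREE` is an assumption: it was chosen only because it agrees with the statements in the text (D and F are sister taxa, C is their next closest relative, D and F share four common ancestors, and A shares only the root with the other taxa). The function names are illustrative, not part of any library.

```python
# Hypothetical topology written as nested tuples. It is an assumption chosen
# to agree with the text; the real Figure 1 may differ in detail.
TREE = ("A", (("B", "E"), (("D", "F"), "C")))


def tips(node):
    """Return the set of terminal taxa found below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def clades(node):
    """Return one tip set per internal node; each set is a clade."""
    if isinstance(node, str):
        return []
    found = [tips(node)]
    for child in node:
        found.extend(clades(child))
    return found


def shared_ancestors(tree, x, y):
    """Count the internal nodes that are ancestors of both x and y."""
    return sum(1 for clade in clades(tree) if x in clade and y in clade)


def is_monophyletic(tree, group):
    """True if the group is exactly an ancestor plus all of its descendants."""
    return frozenset(group) in clades(tree)


print(shared_ancestors(TREE, "D", "F"))       # ancestors shared by the sister taxa
print(shared_ancestors(TREE, "A", "D"))       # A shares only the root
print(is_monophyletic(TREE, {"D", "F", "C"}))
print(is_monophyletic(TREE, {"D", "C"}))      # leaves out F, so not a clade
```

Counting shared internal nodes is one simple way to express "more common ancestors means more closely related"; it says nothing about how much time or change separates the taxa.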
Trees that depict only topology or branching order are referred to as cladograms and can be depicted in several ways that are equivalent (Figure 2). The
critical feature common to all these trees is that they have the same branching order, while branch length is arbitrary and does not convey any specific
information. In contrast to cladograms, phylograms depict not only the
[Figure 2 diagrams: rooted trees with tips A to F.]
[Figure 1 diagram. Labels: outgroup, terminal nodes, sister taxa, branches, internal nodes, root; taxa A to F; ancestor nodes marked * and **.]
Figure 1 Basic parts of an evolutionary tree.
Each taxon is depicted as a terminal node or tip
of a branch on the tree (Taxa A to F). All such taxa
are contemporaneous (for example, all are alive
today). These terminal nodes are all connected
to internal nodes through branches, and internal
nodes indicate common ancestors. Nodes
closer to the root of the tree depict more distant
ancestors. On this tree the present day is shown
at the top with evolutionary time extending
deeper into the past moving towards the base.
The pattern of branching on the tree is referred
to as its topology. Two taxa joined by branches
to a common internal node to the exclusion of
other taxa are sister taxa. The node marked by
an asterisk represents the most recent common
ancestor of sister taxa D and F. The node marked
by the double asterisk indicates the most recent
common ancestor of taxa D, F and C. (Adapted from
Gregory TR [2008] Evo Edu Outreach 1:121-137.
With permission from Springer).
Figure 2 Different ways to portray trees. Shown
are six different ways that rooted evolutionary
trees can be presented. All are equivalent with
respect to their essential feature, the ordering of
branching and depiction of relatedness. These
are cladograms, meaning the branch lengths do
not convey any specific meaning. (Adapted from
Gregory TR [2008] Evo Edu Outreach 1:121-137.
With permission from Springer).
Figure 3 Cladograms and phylograms. (a) A
cladogram, which portrays an order of branching
but for which branch lengths convey no
specific meaning. (b) A phylogram, which portrays
the branching order and in which the branch
lengths are proportional to some measure
of divergence between taxa. Included with a
phylogram is a scale bar to indicate the degree
of divergence. Although the various branches
in the phylogram have different lengths and the
letters do not all align vertically, taxa A to F are
still contemporaneous. (c) A phylogram that has
been “ultrametricized” to line up the taxa vertically
but with the lines scaled appropriately to show
divergence among sister groups rather than
among species. (Adapted from Gregory TR [2008]
Evo Edu Outreach 1:121-137. With permission
from Springer).
Figure 4 Different ways to show equivalent
relationships. These trees exemplify the point
that each node of the tree can be thought of
as a swivel, around which branches connected
to it can rotate. Nodes highlighted with black
circles are the swivel points used to produce the
arrangement in the next tree. All trees depict the
same evolutionary relationships among taxa. For
example, all feature F and G as sister taxa with E
as the next closest relative. The arrangement of
the tips is not important as long as the branching
pattern is retained. (Adapted from Gregory
TR [2008] Evo Edu Outreach 1:121-137. With
permission from Springer).
[Figure 4 diagrams: trees with tips A to G.]
[Figure 3 diagrams: panels a) to c) with tips A to F; scale bar, 0.5 changes; axis marked 3.0, 2.0, 1.0, 0.0.]
branching order but also branch lengths that provide a measure of divergence among
taxa (Figure 3). A critical point in understanding and interpreting
trees is that each node of the tree can be thought of as a swivel around which
branches can rotate (Figures 4 and 5), and the order of the terminal nodes
does not convey information about relatedness. It is the branching pattern
revealed by the internal nodes that conveys this information.
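
The swivel idea can be checked mechanically. In the sketch below, a tree is a nested tuple and `topology` throws away the left-to-right order of branches at every node, so two drawings compare equal exactly when they have the same branching pattern. The seven-taxon trees are hypothetical; they only respect the statement in the Figure 4 caption that F and G are sister taxa with E as their next closest relative.

```python
def topology(node):
    """Order-free form of a tree: each internal node becomes a frozenset."""
    if isinstance(node, str):
        return node
    return frozenset(topology(child) for child in node)


def swivel(node):
    """Rotate every internal node by reversing the order of its branches."""
    if isinstance(node, str):
        return node
    return tuple(swivel(child) for child in reversed(node))


def tip_order(node):
    """List the terminal taxa in the order they would be drawn."""
    if isinstance(node, str):
        return [node]
    return [tip for child in node for tip in tip_order(child)]


# Hypothetical trees (assumptions for illustration only).
tree_a = ("A", ("B", ("C", ("D", ("E", ("F", "G"))))))
tree_b = swivel(tree_a)                                  # same tree, rotated
tree_c = ("A", ("B", ("C", ("D", ("F", ("E", "G"))))))   # E and G now sisters

print(tip_order(tree_a), tip_order(tree_b))
print(topology(tree_a) == topology(tree_b))  # rotation keeps the relationships
print(topology(tree_a) == topology(tree_c))  # a real change in branching
```

Comparing order-free forms rather than tip orders is the design choice that matters here: the tip order of `tree_b` is the reverse of `tree_a`, yet the two describe identical relationships.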
Common misconceptions when interpreting evolutionary trees
include assuming that evolutionary relationships are shown by the order of
taxa at the tree’s tips, and that the number of internal nodes relates to a taxon’s
complexity. For example, in Figure 5a, although frogs and lizards are depicted
next to each other they are not each other’s closest relatives. The branching
pattern shows that lizards are actually more closely related to birds as they
form a monophyletic group with a common ancestor which is not shared
with any other taxa. The order of taxa at the tips of this tree could also be mistaken for a preconceived notion of progression of evolution from “lower” to
“higher” forms (Figure 5a). However, as each node can swivel, the depiction
in Figure 5b shows the same relationships among the taxa even though the
order at the tips is different. Looking specifically at the positions of frogs and
humans relative to bony fish, there are two nodes in the human lineage from
the last common ancestor between frogs and humans, but none between
frogs and the same ancestor. Does one interpret this tree to mean that frogs
are more closely related to bony fish than humans are? No, because frogs and
humans shared the same common ancestor, and the interval of time available
for divergence from fishes, from the time of that common ancestor to the
present day has been the same for both frogs and humans. Furthermore, just
because more nodes are depicted in the human lineage than in the frog lineage does not mean that humans are “higher” or more “derived” than frogs,
which are “lower” or more “basal” in the tree. Again, both have had the same
time to evolve and change, and the amount of branching shown along a lineage does not
necessarily relate to complexity or sophistication in some arbitrary sense.
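
The frog and human argument can be worked through in a few lines of Python. The topology in `TREE5` is an assumption reconstructed from the text and the Figure 5 caption (lizards with birds, cats with humans, frogs outside both pairs, bony fish outside everything); the helper names are illustrative.

```python
# Hypothetical reconstruction of the Figure 5 tree as nested tuples.
TREE5 = ("bony fish", ("frogs", (("lizards", "birds"), ("cats", "humans"))))


def ancestors(node, taxon, path=()):
    """Return the internal nodes from the root down to a taxon, or None."""
    if isinstance(node, str):
        return path if node == taxon else None
    for child in node:
        found = ancestors(child, taxon, path + (node,))
        if found is not None:
            return found
    return None


def shared_ancestors(tree, x, y):
    """Count the internal nodes that lie on both root-to-tip paths."""
    pairs = zip(ancestors(tree, x), ancestors(tree, y))
    return sum(1 for a, b in pairs if a is b)


def nodes_after_split(tree, x, y):
    """Internal nodes on x's lineage after its last common ancestor with y."""
    return len(ancestors(tree, x)) - shared_ancestors(tree, x, y)


# Relatedness follows shared ancestry, not position at the tips.
print(shared_ancestors(TREE5, "lizards", "birds"))
print(shared_ancestors(TREE5, "lizards", "frogs"))

# Frogs and humans are equally related to bony fish ...
print(shared_ancestors(TREE5, "frogs", "bony fish"))
print(shared_ancestors(TREE5, "humans", "bony fish"))

# ... even though the drawn lineages contain different numbers of nodes.
print(nodes_after_split(TREE5, "humans", "frogs"))
print(nodes_after_split(TREE5, "frogs", "humans"))
```

The last two numbers differ only because of which other taxa happen to be included in the tree; adding more amphibians would add nodes to the frog lineage without changing how frogs and humans relate to bony fish.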
In molecular phylogenies the sequences of orthologous genes—genes that
share a common ancestry and function—are compared among organisms.
If an evolutionary tree is based on comparisons derived from a single target
gene it really only reflects the evolutionary history of that one gene. This gene
tree may be at variance with the evolutionary history of the lineage of organisms in which it is found. This might occur, for example, if the gene were
acquired by horizontal transfer and not by direct descent from its ancestors,
or if the gene had undergone duplication and we then were unable to tell
which daughter gene was being analyzed. Therefore, a phylogeny that actually reflects the evolutionary history of the organisms involved should be based
on multiple genes, to overcome the potential errors that could originate from
reliance on one gene, and may also include phenotypic or morphological characters. Such a
phylogeny might be more likely to fulfill the quest for an evolutionary-based
taxonomic scheme for the organisms in question.
Regarding the interpretation of trees, it must be borne in mind that there
are many different ways to construct molecular phylogenies, but all begin
with good alignments of the sequences of the genes being targeted. The
methods for analyzing the matrix of aligned sequences vary considerably.
Evolutionary distance methods such as neighbor joining rely on calculation
of genetic distance, the proportion of sequence mismatches encountered,
among all the possible pairs of sequences being considered to calculate tree
topology. Maximum parsimony methods determine an ancestral sequence
and then find the trees that represent the fewest steps from this ancestral
condition to derive the modern sequences. Maximum likelihood assigns
probabilities to particular possible trees, based on a nucleotide substitution
model that assesses the probability that particular mutations will occur, with
less probable trees being rejected. Bayesian inference is also used to produce
trees. This approach assumes a prior probability
distribution over all possible trees and uses Markov chain Monte Carlo sampling algorithms
for implementation.
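
As an illustration of the genetic distance described above for distance methods, the following Python sketch computes the proportion of mismatched sites for every pair of sequences in an alignment. The four sequences are invented for the example, and the sketch makes simplifying assumptions: the sequences are already aligned to equal length, and gaps or ambiguous bases are not treated specially. A method such as neighbor joining would take a matrix like this as its input; the tree-building step itself is not shown.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def distance_matrix(alignment):
    """Map each pair of taxon names to the p-distance between them."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }


# Hypothetical toy alignment (not real data).
alignment = {
    "taxon1": "ACGTACGTAC",
    "taxon2": "ACGTACGTAA",
    "taxon3": "ACGAACGGAA",
    "taxon4": "TCGAACGGTA",
}

for pair, distance in distance_matrix(alignment).items():
    print(pair, distance)
```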
Once a tree has been constructed, what measure of confidence does one
have in the relationships it portrays? To test this, a bootstrap analysis is
commonly performed. The basic idea is that subsamples of columns from
the original sequence data matrix are drawn at random (with replacement)
and branching patterns are determined. The nodes indicating certain clades
are scored for how frequently they are retrieved in the re-sampling efforts.
The degree of bootstrap support deemed sufficient to support a particular
Figure 5 Interpreting the order of tips and
branching on the trees. Both trees depict the
same evolutionary relationships; the difference
is that the branches have been swiveled at two
internal nodes (highlighted by black circles).
The order of the tips on the trees is arbitrary.
For example, in (b), although frogs and humans,
or birds and fishes, occur side by side, they are
not each other’s closest relatives. Two pairs of
sister taxa are shown: cats and humans, and
birds and lizards. These taxa are closest
relatives because of their shared branching
pattern. Each pair shares its most recent
common ancestor to the exclusion of any other
taxa, and both pairs share four common
ancestors. (Adapted from Gregory TR [2008] Evo
Edu Outreach 1:121-137. With permission from
Springer).
conclusion varies, but values of 90% or higher, particularly if achieved with multiple methods of analysis, are generally considered to be sufficiently robust.
Progressively lower values of bootstrap support for particular nodes decrease
confidence in how accurately that node portrays a group’s evolutionary history. Bootstrap values, listed as percentages and indicating the degree of
support, are often shown at the nodes on a tree.
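
The resampling step of a bootstrap analysis can be sketched in Python as follows. This is a deliberately simplified illustration, not a full phylogenetic bootstrap: instead of rebuilding a tree for each replicate, it uses a stand-in criterion (two taxa count as "grouped" when each is the other's strictly nearest neighbor by proportion of mismatches). A real analysis would re-run the chosen tree-building method on every replicate. The alignment, function names and default of 1000 replicates are all assumptions made for the example.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b) / len(seq_a)


def resample_columns(alignment, rng):
    """Draw alignment columns at random, with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def are_mutual_nearest(alignment, x, y):
    """Stand-in for tree building: is each taxon the other's closest match?"""
    def nearest_is(a, b):
        d_ab = p_distance(alignment[a], alignment[b])
        others = [name for name in alignment if name not in (a, b)]
        return all(d_ab < p_distance(alignment[a], alignment[o]) for o in others)
    return nearest_is(x, y) and nearest_is(y, x)


def bootstrap_support(alignment, x, y, replicates=1000, seed=0):
    """Percentage of resampled alignments in which x and y are grouped."""
    rng = random.Random(seed)
    hits = sum(
        are_mutual_nearest(resample_columns(alignment, rng), x, y)
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


# Hypothetical toy alignment (not real data).
alignment = {
    "taxon1": "ACGTACGTACGTACGT",
    "taxon2": "ACGTACGTACGTACGA",
    "taxon3": "ACGAACGGACTTACGA",
    "taxon4": "TCGAACGGTCTTAGGA",
}

print(bootstrap_support(alignment, "taxon1", "taxon2"))
```

Fixing the random seed makes a run repeatable, which helps when comparing support values across methods of analysis.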
Reference
Gregory TR (2008) Understanding evolutionary trees. Evo Edu Outreach 1:121-137.