Phylogeny Reconstruction - Indiana University Bloomington

G404 Geobiology
Phylogeny Reconstruction
Trees, Methods and Characters
Reading: Gregory, 2008. Understanding Evolutionary Trees
Department of Geological Sciences | Indiana University
(Polly, 2006)
(c) 2011, P. David Polly
Lab tomorrow
Meet in Geology GY522
Bring computers if you have them (they will be
more important next week and the week after)
Download and save PHYLIP program (no
installation) (http://evolution.genetics.washington.edu/phylip.html)
Download and install Mesquite program suite
(http://mesquiteproject.org/mesquite/mesquite.html)
Key ingredients of phylogenetic analysis
❖ An understanding of the characteristics of a group of organisms
❖ A list of characters that vary among the group
❖ A tabulation of the state of the character in each member of the group
❖ Information on which state is ancestral for the group for each character
❖ A formatted data file that can be used with programs that perform
phylogenetic analysis
❖ A computer and software that are capable of performing the analysis
❖ An understanding of phylogenetic trees to aid in interpreting the results
Tree terminology
Terminal node (leaf, tip)
Internal node (hypothetical ancestor)
Root
Branch (edge)
After Page & Holmes, 1998, Molecular
Evolution: a Phylogenetic Approach
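The terms above can be made concrete with a small data structure. This is a minimal sketch, assuming a hypothetical five-tip tree stored as nested tuples; the helper names (tips, internal_nodes, branches) are illustrative and do not come from any phylogenetics package.

```python
# Hypothetical example tree written as nested tuples: a string is a
# terminal node (leaf, tip); a tuple is an internal node holding its
# descendants. The outermost tuple is the root.
tree = (("A", "B"), ("C", ("D", "E")))

def tips(node):
    """Return the terminal nodes below a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [t for child in node for t in tips(child)]

def internal_nodes(node):
    """Count internal nodes (hypothetical ancestors), including the root."""
    if isinstance(node, str):
        return 0
    return 1 + sum(internal_nodes(child) for child in node)

def branches(node):
    """Count branches (edges): one for each child of every internal node."""
    if isinstance(node, str):
        return 0
    return len(node) + sum(branches(child) for child in node)

n_tips = len(tips(tree))
n_internal = internal_nodes(tree)
n_branches = branches(tree)
print(n_tips, n_internal, n_branches)  # 5 4 8
```

A rooted, fully bifurcating tree with n tips has n - 1 internal nodes and 2n - 2 branches, which matches the counts printed for this five-tip example.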
Trees show closeness of relationship
Trees are read from bottom up, with each node
representing an ancient speciation that led to its
descendant branches
Tree shows recency of common ancestry, same as
showing nested sets
Sets can be drawn as a tree, or they can be written in
parenthetical form
The parenthetical form is very close to the file format
used by many programs to store or analyze trees
[Figure: tree with tips A, B, C, D, E, also drawn as nested sets]
Examples:
A and B are more closely related to one another than
either is to C, D, or E.
C is more closely related to D and E than to A or B.
A and B share a more recent common ancestor than
either does with C, D, or E.
((A,B), (C, (D,E)))
After Page & Holmes, 1998, Molecular
Evolution: a Phylogenetic Approach
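The link between the parenthetical form and nested sets can be sketched in a few lines. This is an illustrative parser only, assuming tip names without spaces, branch lengths, or quoting; the function names parse and clades are hypothetical, and real tree files carry more detail than this handles.

```python
def parse(text):
    """Parse a parenthetical tree such as '((A,B),(C,(D,E)))' into nested tuples."""
    text = text.replace(" ", "").rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # step past "("
            children = [node()]
            while text[pos] == ",":
                pos += 1                  # step past ","
                children.append(node())
            pos += 1                      # step past ")"
            return tuple(children)
        start = pos                       # otherwise read a tip name
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]

    return node()

def clades(tree):
    """Return the set of tips under every internal node (the nested sets)."""
    found = []

    def walk(n):
        if isinstance(n, str):
            return frozenset([n])
        members = frozenset().union(*(walk(child) for child in n))
        found.append(members)
        return members

    walk(tree)
    return found

tree = parse("((A,B), (C, (D,E)))")
nested_sets = clades(tree)
for members in nested_sets:
    print(sorted(members))
```

Each printed list is one nested set. A and B share a set that excludes C, D, and E, which is the sense in which they are more closely related to one another than either is to the other tips.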
Cladograms are like mobiles
Cladograms are types of trees that show recency
of common ancestry
Order of tree labels can vary without changing the
meaning of the cladogram
[Figure: the same cladogram drawn with its tips in different orders (A B C D; D C B A; B C D A), shown as equivalent]
After Page & Holmes, 1998, Molecular
Evolution: a Phylogenetic Approach
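One way to check that two drawings are the same cladogram is to compare the groups of tips they define rather than the left-to-right order of labels. The sketch below assumes a hypothetical four-tip branching pattern (the pattern in the original figure is not recoverable here) and an illustrative helper named clade_set.

```python
def clade_set(tree):
    """Collect the tip set under every internal node of a nested-tuple tree."""
    out = set()

    def walk(n):
        if isinstance(n, str):
            return frozenset([n])
        members = frozenset().union(*(walk(child) for child in n))
        out.add(members)
        return members

    walk(tree)
    return out

# Hypothetical cladogram and two redrawn versions of it.
original = ("A", ("B", ("C", "D")))      # tips read A B C D
rotated = ((("D", "C"), "B"), "A")       # every node flipped: tips read D C B A
different = (("A", "B"), ("C", "D"))     # a genuinely different branching pattern

same_meaning = clade_set(original) == clade_set(rotated)
print(same_meaning)                                  # True
print(clade_set(original) == clade_set(different))   # False
```

Rotating branches around a node, like turning the arms of a mobile, changes the label order but leaves every group intact. Only a change in the groups themselves changes the meaning of the cladogram.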
Tree-thinking challenge
Baum et al., 2005. The tree-thinking challenge. Science, 310:
979-980.
Different trees for different purposes
Some differ simply by what is
represented in the tree diagram
Some differ by the method used to
construct them from data
[Figure: one branching pattern drawn as three kinds of tree; any axis that carries no information is labeled "This axis means nothing"]
Cladogram:
shows only recency of common ancestry
(neither axis carries meaning)
Additive Tree:
shows recency of common ancestry by branching pattern and evolutionary change
as branch lengths
(one axis shows amount of change; the other means nothing)
Ultrametric Tree:
shows recency of common ancestry by branching pattern and time
as branch lengths
(one axis shows time; the other means nothing)
After Page & Holmes, 1998, Molecular
Evolution: a Phylogenetic Approach
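The difference between an ultrametric and an additive tree can be checked numerically. In an ultrametric tree whose tips were all sampled at the same time, every tip is the same distance from the root. The sketch below assumes two hypothetical trees with invented branch lengths and illustrative function names.

```python
import math

# A node is either a tip name (str) or a list of (child, branch_length) pairs.
# Both trees are hypothetical and share one branching pattern: ((A,B),C).
ultrametric_tree = [([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)]
additive_tree = [([("A", 0.5), ("B", 2.0)], 1.0), ("C", 0.7)]

def root_to_tip(node, so_far=0.0):
    """Return {tip: summed branch length from the root to that tip}."""
    if isinstance(node, str):
        return {node: so_far}
    dist = {}
    for child, length in node:
        dist.update(root_to_tip(child, so_far + length))
    return dist

def is_ultrametric(tree, tol=1e-9):
    """True when every tip lies the same distance from the root."""
    d = list(root_to_tip(tree).values())
    return all(math.isclose(x, d[0], abs_tol=tol) for x in d)

print(root_to_tip(ultrametric_tree))   # {'A': 3.0, 'B': 3.0, 'C': 3.0}
print(is_ultrametric(ultrametric_tree), is_ultrametric(additive_tree))  # True False
```

A cladogram would drop the branch lengths altogether and keep only the nesting.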
Cladograms are constructed from character evolution
[Figure: cladograms of Didelphis, Dimetrodon, and Tyrannosaurus with the states (0, 1) of one character mapped at the tips and along the branches in different styles]
Character changes can be mapped onto trees using any one of several
conventions
0 = absence of synapsid fenestra
1 = presence of synapsid fenestra
After Page & Holmes, 1998, Molecular
Evolution: a Phylogenetic Approach
Terminology for characters on a tree
Character states that evolve above the root (or a particular node) are
“derived characters” or apomorphies
Character states present at the root are “ancestral characters”,
“primitive characters”, or plesiomorphies
Synapomorphies are apomorphies shared by common ancestry (these character
states are homologous and provide evidence of close relationship)
Autapomorphies are apomorphies found in only one tip (they are interesting,
but don’t provide evidence of relationship)
Homoplasy is the independent evolution of a derived character state on a tree,
so that it is shared by two tips but was not present in their common ancestor
(same as “analogy”)
[Figure: tree with examples of apomorphies, plesiomorphies, a synapomorphy, an autapomorphy, and a homoplasy labeled]
After Page & Holmes, 1998, Molecular
Evolution: a Phylogenetic Approach
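These definitions can be applied mechanically once a tree and a polarized character are in hand. The sketch below is a simplification under stated assumptions: the tree and character scores are hypothetical, state 0 is taken to be ancestral at the root, and any derived state shared by tips that do not form a complete group is labeled homoplasy, whether it arose by independent gain or by reversal.

```python
# Hypothetical tree (nested tuples) used only for illustration.
tree = (("A", "B"), ("C", ("D", "E")))

def clade_sets(node, out):
    """Add the tip set under every internal node to `out`."""
    if isinstance(node, str):
        return frozenset([node])
    members = frozenset().union(*(clade_sets(child, out) for child in node))
    out.add(members)
    return members

def classify(tree, states):
    """Label the derived state (1) of one binary character; 0 is ancestral."""
    clades = set()
    clade_sets(tree, clades)
    derived = frozenset(tip for tip, s in states.items() if s == 1)
    if not derived:
        return "plesiomorphy only"
    if len(derived) == 1:
        return "autapomorphy"      # derived state confined to one tip
    if derived in clades:
        return "synapomorphy"      # derived state marks one complete group
    return "homoplasy"             # derived tips do not form a single group

print(classify(tree, {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}))  # synapomorphy
print(classify(tree, {"A": 1, "B": 0, "C": 0, "D": 0, "E": 0}))  # autapomorphy
print(classify(tree, {"A": 1, "B": 0, "C": 0, "D": 1, "E": 0}))  # homoplasy
```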
Characters versus character states
A character is a feature that can be recognized, named, and described, such as a bone
or a fenestra
A character state is the particular configuration of the character in a specific taxon
A character with discrete states is called a “meristic character”, distinct from a
“continuous character” or “quantitative character”, which is measured and can
take on any of an infinite number of values.
Character: quadratojugal condition
State A: quadratojugal present, large
State B: quadratojugal present, small
State C: quadratojugal absent
Methods for phylogeny reconstruction
Parsimony (=maximum parsimony, =cladistics). Uses only derived states of
meristic characters to construct a tree based on “parsimony”. Parsimony is
defined as minimizing the number of character state changes on the tree
or, in other words, finding the shortest tree: the tree that makes the
fewest assumptions of homoplasy.
Maximum likelihood (=ML). Uses derived states of meristic characters or
quantitative characters to construct a tree based on the probabilities of
character states changing on the tree. The probability of change is estimated
from the data. The likelihood of a tree is the probability that the tree,
together with a particular model of character change, would give rise to the
observed character states. The tree with the highest likelihood is the one
favored.
Bayesian. Similar to maximum likelihood, but offers the possibility of easily
combining different kinds of data (e.g., morphological and molecular) and
offers the possibility of taking into account our confidence in relationships
based on prior work.
Recipe for a parsimony analysis
1. Observe and compare morphology to identify characters and character
states. Best practice is to systematically work through the entire organism
from nose to tail finding all characters that vary.
2. Determine the plesiomorphic and derived states of each character using
one of several methods (outgroup method is the most accepted).
3. Score the characters and states in a data matrix, with one row for each
character and one column for each taxon. Plesiomorphic state is given a 0,
derived states a 1 (or an integer greater than one for a multistate
character).
4. Use one of several software packages to find the shortest or most
parsimonious tree from the data. These algorithms find the tree that
maximizes the number of synapomorphies and minimizes the number of
homoplasies.
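Step 4 can be sketched with Fitch's algorithm, a standard way to count the minimum number of changes an unordered character needs on a given bifurcating tree. Everything below is illustrative: the taxa, the matrix, and the three candidate trees are hypothetical, and real software searches far more trees than this.

```python
def fitch_length(tree, states):
    """Minimum number of changes for one character on a bifurcating tree."""
    changes = 0

    def walk(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (walk(child) for child in node)
        if left & right:
            return left & right        # descendants agree: no change needed
        changes += 1                   # descendants disagree: count one change
        return left | right

    walk(tree)
    return changes

def tree_length(tree, matrix):
    """Sum the Fitch lengths of all characters (the parsimony tree length)."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_length(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_chars)
    )

# Hypothetical matrix: one entry per taxon, one score per character.
matrix = {
    "Out": [0, 0, 0, 0],
    "A":   [1, 0, 0, 1],
    "B":   [1, 1, 1, 0],
    "C":   [1, 1, 1, 1],
}

candidates = [
    ("Out", ("A", ("B", "C"))),
    ("Out", ("B", ("A", "C"))),
    ("Out", ("C", ("A", "B"))),
]

lengths = [tree_length(t, matrix) for t in candidates]
best = min(candidates, key=lambda t: tree_length(t, matrix))
print(lengths)   # [5, 6, 7]
print(best)      # ('Out', ('A', ('B', 'C')))
```

The shortest tree groups B with C because two characters support that pairing, while the single character shared by A and C is explained as homoplasy.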
Methods for “polarizing characters”
Determining which states are plesiomorphic and which are apomorphic is known as
“polarizing” a character. This step is essential for parsimony analysis.
Outgroup criterion. The preferred method. One or more outgroups are identified and
the state common between them and the ingroup taxa is assumed to be the
plesiomorphic state. Other states are left “unordered” or are given an order based on
logic. Works well if the rate of character evolution is not high.
Paleontological criterion. The state that appears earliest in earth history is assumed to
be plesiomorphic based on the logic that it evolved first. Works well if the fossil record
is good.
Ontogenetic criterion. The state that appears first in embryonic development is
assumed to be plesiomorphic based on the logic that the most general developmental
state is likely to have evolved first. Dubious at best.
Commonality criterion. The state found in most taxa is assumed to be plesiomorphic
based on the logic that among many taxa, some are likely to be outgroups. Works only
in cases where the sample of taxa includes more outgroups than ingroups.
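The outgroup criterion amounts to a recoding rule: whatever state the outgroups show becomes 0, and every other state receives a derived code. The sketch below assumes hypothetical taxon names and observations, and the function name polarize is illustrative. Derived codes are assigned in order of appearance and imply no ordering among states.

```python
def polarize(raw, outgroups):
    """Recode one character so the outgroup state becomes 0 (plesiomorphic)."""
    outgroup_states = {raw[taxon] for taxon in outgroups}
    if len(outgroup_states) != 1:
        raise ValueError("outgroups disagree; polarity is ambiguous")
    ancestral = outgroup_states.pop()
    codes = {ancestral: 0}
    for taxon in raw:
        # Each new state gets the next unused integer (unordered labels).
        codes.setdefault(raw[taxon], len(codes))
    return {taxon: codes[raw[taxon]] for taxon in raw}

# Hypothetical observations of one character in two outgroups and three ingroup taxa.
raw = {"Out1": "large", "Out2": "large", "X": "small", "Y": "large", "Z": "absent"}
scored = polarize(raw, outgroups=["Out1", "Out2"])
print(scored)  # {'Out1': 0, 'Out2': 0, 'X': 1, 'Y': 0, 'Z': 2}
```

Raising an error when the outgroups disagree is a design choice for this sketch; in practice such a character might instead be left unpolarized or resolved with additional outgroups.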
The outgroup criterion at work
Character 1: postparietal condition. 0 - present and large; 1 - small or absent.
Character 2: tabular condition. 0 - present and large; 1 - small or absent.
Character 3: supratemporal condition. 0 - present and large; 1 - small or absent.
Character 4: infratemporal condition. 0 - present and large; 1 - small or absent.
Non-amniotes
Outgroup state: large tabular, postparietal, supratemporal, infratemporal
Amniotes
Ingroup states: small or absent tabular, postparietal, supratemporal,
infratemporal, often positioned on posterior of cranium
[Figure: skulls of Edmontosaurus, Archaeopteryx, Kuhneosaurus, Titanophoneus, Diadectes, Dimetrodon, Youngina, Captorhinus, and Protogyrinus, with the clade Amniota marked]
Character matrix
Cladogram based on posterior skull bones
[Figure: consensus cladogram of Protogyrinus, Diadectes, Edmontosaurus, Kuhneosaurus, Youngina, Dimetrodon, Archaeopteryx, and Captorhinus]
Consensus cladogram
950 equally parsimonious trees found
Tree length: 4
CI: 1.0
Tree support
The support for a tree varies according to the ability of the data to
unambiguously resolve nodes
The simplest index of tree support in parsimony is the consistency index (CI)
Calculated as the minimum number of changes in the data (number of derived
character states summed over all characters) divided by the number of changes
on the tree (tree length)
Here there are four characters each with one derived state, so the minimum
number of changes is 4
There are 4 changes on the tree, so the consistency index is 4 / 4 = 1.0
[Figure: consensus cladogram of Protogyrinus, Diadectes, Edmontosaurus, Kuhneosaurus, Youngina, Dimetrodon, Captorhinus, and Archaeopteryx, with the 0 to 1 change in each of Characters 1 to 4 marked]
Cladogram (consensus)
950 equally parsimonious trees found
Tree length: 4
CI: 1.0
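The consistency index is the minimum number of changes the data require divided by the number of changes on the tree, so it equals 1.0 when there is no homoplasy and falls below 1.0 as extra changes accumulate. A minimal sketch of the arithmetic follows; the function name is illustrative, and the second call uses a hypothetical tree length of 5 for comparison.

```python
def consistency_index(states_per_character, tree_length):
    """CI = minimum possible number of changes / changes observed on the tree."""
    # A character with k states needs at least k - 1 changes.
    minimum = sum(k - 1 for k in states_per_character)
    return minimum / tree_length

# Four binary characters and a tree length of 4, as in the slide.
ci_slide = consistency_index([2, 2, 2, 2], 4)
# Same characters on a hypothetical tree that needs one extra change.
ci_with_homoplasy = consistency_index([2, 2, 2, 2], 5)
print(ci_slide, ci_with_homoplasy)  # 1.0 0.8
```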
Character data are not always able to fully
resolve relationships
A totally unresolved tree is a “star tree”
“Hard polytomies” are supported by the data: the lineages are inferred to have
diverged at essentially the same time
“Soft polytomies” arise when the data are contradictory or insufficient to
resolve relationships
[Figure: a star tree, a partially resolved tree with a polytomy, and a fully resolved (fully bifurcating) tree]
After Page & Holmes, 1998, Molecular
Evolution: a Phylogenetic Approach
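The degree of resolution can be read directly from the number of descendants at each internal node. The sketch below assumes trees stored as nested tuples with hypothetical tips; the function name resolution is illustrative. It cannot tell a hard polytomy from a soft one, since that depends on the data rather than on the shape of the tree.

```python
def resolution(tree):
    """Classify a nested-tuple tree by how well resolved it is."""
    def internal(node):
        if isinstance(node, str):
            return []
        return [node] + [n for child in node for n in internal(child)]

    nodes = internal(tree)
    if len(nodes) == 1 and len(tree) > 2:
        return "star tree"            # one node, all tips attached to it
    if all(len(n) == 2 for n in nodes):
        return "fully resolved"       # every node bifurcates
    return "partially resolved"       # at least one polytomy remains

print(resolution(("A", "B", "C", "D")))        # star tree
print(resolution((("A", "B"), "C", "D")))      # partially resolved
print(resolution((("A", "B"), ("C", "D"))))    # fully resolved
```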
Scientific papers for further reading
Baldauf, S. L. 2003. Phylogeny for the faint of heart: a tutorial. Trends in
Genetics, 19: 345-351.
de Queiroz, K. and J. A. Gauthier. 1992. Phylogenetic taxonomy. Annual
Review of Ecology and Systematics, 23: 449-480.
Gauthier, J., A. G. Kluge, and T. Rowe. 1988. Amniote phylogeny and the
importance of fossils. Cladistics, 4: 105-209.
Gregory, T. R. 2008. Understanding Evolutionary Trees. Evolution: Education and
Outreach, 1: 121-137. [Required reading]
Padian, K., D. R. Lindberg, and P. D. Polly. 1994. Cladistics and the fossil
record: the uses of history. Annual Review of Earth and Planetary Sciences,
22: 63-91.