Characters
• Character = set of evidence (character states)
about the relationships among a set of taxa.
– A character comprises a homologous set of states.
• Characters are variables, and character states
are instantiations of variables.
– Character states represent evolutionary
transformations of one another.
Characters
(1) Simple variables: scored by direct observation.
(a) Nominal
(b) Ordinal
(c) Mensural
• Discrete (counts)
• Continuous (interval and ratio)
(2) Derived (composite) variables
(a) Ratios
(b) Factors and functions (e.g., warps)
Simple variables
• Nominal variables: named states, no implied
transitional sequence.
– Properties, attributes (e.g. color).
– Categories (e.g., DNA/RNA bases, amino acids).
• Ordinal variables: states ordered or ranked.
– Values arbitrary and relative.
• Sequenced by magnitude or other criterion.
• Differences between consecutive states not important.
– E.g., transition series:
• Morphology: bump on bone absent, small,
elongated, bifid.
Simple variables
• Mensural variables: measured.
– States expressed in a numerically ordered fashion,
on an interval scale.
[Figure: number line marked -2, -1, 0, 1, 2]
– Differences between units are constant.
(1) Discrete variables (discontinuous, cardinal,
meristic):
• Non-arbitrary integer values, usually non-negative or
positive.
• For example:
– Number of petals/flower.
– Number of dorsal fin rays.
– Number of abdominal setae.
Simple variables
• Mensural variables: measured.
(2) Continuous variables:
[Figure: number line marked -2, -1, 0, 1, 2]
• Lengths, densities, colors, frequencies.
– E.g., humerus length, pelt color, allele frequency.
– Distance between two morphometric landmarks.
• Can theoretically assume an infinite number of
values.
• Actual continuous values are estimated as discrete
states by a measurement procedure or device.
– E.g., calipers, densitometer.
– Each has some degree of resolution or “precision”.
– Measurements are expressed by an interval of
indistinguishable values.
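The point about resolution can be made concrete with a minimal sketch. The function names and the 0.1-unit resolution are illustrative assumptions, not part of the original slides: a continuous value is recorded as a whole number of resolution steps, and every true value inside the corresponding interval yields the same reading.

```python
def to_steps(value, resolution):
    """Record a continuous value as a whole number of resolution steps."""
    return round(value / resolution)


def indistinguishable_interval(steps, resolution):
    """Return the (low, high) range of true values that share one reading."""
    return ((steps - 0.5) * resolution, (steps + 0.5) * resolution)


# Hypothetical caliper with a resolution of 0.1 units.
RESOLUTION = 0.1
reading_a = to_steps(12.31, RESOLUTION)
reading_b = to_steps(12.34, RESOLUTION)
reading_c = to_steps(12.36, RESOLUTION)
low, high = indistinguishable_interval(reading_a, RESOLUTION)
```

Here 12.31 and 12.34 fall in the same interval and receive the same reading, while 12.36 falls in the next one.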
Issues with discrete characters
(1) Ordering:
– Character-state graphs (transformation series) and
corresponding transition (step) matrices.
– Evidence: ontogeny, “morphoclines” in adults.
– Represent evolutionary constraints (longer trees).
– Phylogenetic information always lost when states
treated as unordered.
(Swofford and Maddison 1992)
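As a rough illustration of the transition (step) matrices mentioned above, the sketch below builds one for an ordered and one for an unordered character. The helper name `step_matrix` is an assumption, and the four states reuse the bump-on-bone example (absent, small, elongated, bifid) coded 0 to 3.

```python
def step_matrix(n_states, ordered):
    """Return an n x n matrix of step costs between states 0..n-1.

    Ordered: cost is the number of intermediate steps, |i - j|.
    Unordered: any change between different states costs 1.
    """
    return [
        [abs(i - j) if ordered else int(i != j) for j in range(n_states)]
        for i in range(n_states)
    ]


ordered_costs = step_matrix(4, ordered=True)
unordered_costs = step_matrix(4, ordered=False)
```

Under ordering, a change from absent (0) to bifid (3) costs three steps. Unordered, it costs one, which is why ordered characters can only make trees the same length or longer.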
Issues with discrete characters
(2) Polarity: identification of ‘ancestral’ vs ‘derived’ states.
– Terminology:
• Plesiomorphy – ancestral state.
• Apomorphy – derived state.
– Two basic choices for inferring polarity:
(a) Specify on a character-by-character basis, based on:
– Outgroup criterion: state outside study group is
plesiomorphic.
– Ontogenetic criterion: developmentally earlier is
plesiomorphic.
– Paleontologic criterion: stratigraphically earlier is
plesiomorphic.
(b) Use outgroup assumption to root tree:
– Assess polarities from distribution of character states on
tree.
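The outgroup criterion in option (a) can be sketched in a few lines. The function name and the taxon labels are hypothetical, and the sketch assumes a single outgroup with one fixed state.

```python
def polarize_by_outgroup(ingroup_states, outgroup_state):
    """Label each ingroup state as plesiomorphic if it matches the outgroup."""
    return {
        taxon: "plesiomorphic" if state == outgroup_state else "apomorphic"
        for taxon, state in ingroup_states.items()
    }


# Hypothetical character: the outgroup shows state 0.
polarity = polarize_by_outgroup({"A": 1, "B": 1, "C": 0}, outgroup_state=0)
```

Option (b) avoids this character-by-character step: the outgroup only roots the tree, and polarities are read off afterwards.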
Issues with discrete characters
(3) Polymorphism:
– Variation in character states within taxa (e.g., species).
• Independent of ontogenetic and sexual variation.
– Common problem in phylogenetic studies.
• Evolving characters must vary within taxa at some point in
history.
– Several methods:
• Subdivide taxon into homogeneous
groups.
• Code character state as ‘missing’.
• Code polymorphism as intermediate
state between two fixed states.
• Most common: reject polymorphic
character.
(Wiens 2000)
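Two of the coding methods listed above can be sketched for a binary character. The function names, the taxon labels, and the 0/1/2 recoding are illustrative assumptions. Each taxon is described by the set of states observed within it.

```python
def code_as_missing(observed):
    """Fixed taxa keep their state; polymorphic taxa are coded '?'."""
    return {
        taxon: str(next(iter(states))) if len(states) == 1 else "?"
        for taxon, states in observed.items()
    }


def code_as_intermediate(observed):
    """Recode a binary character as three ordered states.

    Fixed 0 -> 0, polymorphic {0, 1} -> 1, fixed 1 -> 2.
    """
    coded = {}
    for taxon, states in observed.items():
        if states == {0}:
            coded[taxon] = 0
        elif states == {1}:
            coded[taxon] = 2
        else:
            coded[taxon] = 1
    return coded


# Hypothetical observations: species B is polymorphic.
observed = {"A": {0}, "B": {0, 1}, "C": {1}}
missing_coding = code_as_missing(observed)
intermediate_coding = code_as_intermediate(observed)
```

The intermediate coding only makes sense if the recoded character is then treated as ordered.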
Issues with discrete characters
(4) Character weighting: characterizing ‘cost’ of the
transformation from one state to another.
– Higher weight designates more ‘likely’ or ‘significant’
change from one state to another.
– E.g.,
• Transitions may be weighted less than
transversions, in proportion to observed frequency.
• Change in 3rd-codon position may be differentially
weighted relative to change in 1st or 2nd position, due to
degeneracy of genetic code.
– Up-weight or down-weight?
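One way to picture transition/transversion weighting is as a cost function over nucleotide pairs. The costs of 1 and 2 below are placeholder assumptions, not values from the slides. In practice they would be set from observed frequencies.

```python
PURINES = {"A", "G"}
BASES = "ACGT"


def substitution_cost(x, y, transition_cost=1, transversion_cost=2):
    """Cost of changing base x into base y under a two-class weighting."""
    if x == y:
        return 0
    # A transition stays within purines (A, G) or within pyrimidines (C, T).
    same_class = (x in PURINES) == (y in PURINES)
    return transition_cost if same_class else transversion_cost


# Full 4 x 4 step matrix as a dictionary keyed by (from, to).
cost_matrix = {(x, y): substitution_cost(x, y) for x in BASES for y in BASES}
```

The same pattern would apply to codon positions, with one weight per position instead of one per substitution class.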
Issues with discrete characters
• Iterative re-weighting (Farris) and dynamic weighting
(Goloboff, etc.):
– Find shortest tree by ‘parsimony’.
– Re-weight characters inversely proportional to the
number of character-state changes on the tree
(homoplasy).
– Find shortest tree with weights imposed.
– Repeat until solution stabilizes.
– Problem: circularity.
» Iterates toward ‘best’ solution, but ‘best’ in what
sense?
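The re-weighting loop above can be sketched as a toy example. Everything here is an assumption made for illustration: the 3-character binary matrix is hypothetical, the search is an exhaustive comparison of the three four-taxon topologies, and the weight 1/steps is the simplest reading of "inversely proportional", not the specific weighting function of Farris or Goloboff.

```python
TAXA = ("A", "B", "C", "D")
# Hypothetical binary matrix: one row per character, states in TAXA order.
MATRIX = [
    (1, 1, 0, 0),
    (1, 1, 0, 0),
    (1, 0, 1, 0),
]
# The three possible unrooted topologies for four taxa, as nested pairs.
TOPOLOGIES = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]


def fitch_steps(tree, states):
    """Minimum number of changes for one character (Fitch parsimony)."""
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        steps = left_steps + right_steps
        shared = left_set & right_set
        if shared:
            return shared, steps
        return left_set | right_set, steps + 1
    return visit(tree)[1]


def steps_per_character(tree):
    return [fitch_steps(tree, dict(zip(TAXA, row))) for row in MATRIX]


def best_tree(weights):
    """Pick the topology with the smallest weighted tree length."""
    def weighted_length(tree):
        return sum(w * s for w, s in zip(weights, steps_per_character(tree)))
    return min(TOPOLOGIES, key=weighted_length)


def successive_weighting(max_rounds=10):
    weights = [1.0] * len(MATRIX)
    tree = best_tree(weights)              # shortest tree, equal weights
    for _ in range(max_rounds):
        # Down-weight characters that change more often on the current tree.
        weights = [1.0 / max(s, 1) for s in steps_per_character(tree)]
        new_tree = best_tree(weights)      # shortest tree, weights imposed
        if new_tree == tree:               # solution has stabilized
            break
        tree = new_tree
    return tree, weights


final_tree, final_weights = successive_weighting()
```

In this toy run the third character conflicts with the initial tree, so it is down-weighted and the initial tree is confirmed. That is the circularity noted above: the weights are derived from the very tree they are then used to support.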
Issues with discrete characters
(5) Kinds of inference methods:
– Distance: based on pairwise measures of ‘similarity’.
• Usually give unique tree.
– ‘Parsimony’: based on finding topology having
minimum tree length.
• Tree length measured as total number of character-state
changes across tree.
• Usually gives sets of ‘equally parsimonious’ trees.
– Maximum likelihood and Bayesian:
• Based on particular models of character-state
change.
• Usually give unique tree.
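For the distance approach, the starting point is a matrix of pairwise dissimilarities. The sketch below counts mismatched character states between each pair of taxa. The two-character matrix and the function names are illustrative assumptions, and building a tree from the distances (e.g., by clustering) is left out.

```python
from itertools import combinations

# Hypothetical matrix: two binary characters (a, b) for four taxa.
CHARACTERS = {"A": (1, 0), "B": (1, 0), "C": (0, 1), "D": (0, 0)}


def mismatch_distance(x, y):
    """Number of characters in which two taxa differ."""
    return sum(1 for s, t in zip(x, y) if s != t)


def distance_matrix(characters):
    """Pairwise distances keyed by (taxon, taxon), each pair listed once."""
    return {
        (p, q): mismatch_distance(characters[p], characters[q])
        for p, q in combinations(sorted(characters), 2)
    }


distances = distance_matrix(CHARACTERS)
```

Once the distances are computed, the individual character states no longer enter the analysis, which is the main contrast with parsimony and model-based methods.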
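Tree length
[Figure: tree for taxa A (a1 b0), B (a1 b0), C (a0 b1), and D (a0 b0), with ancestral states a0 b0. Two changes are marked on branches, a1 and b1. Tree length = 2.]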
Discrete vs. continuous
• Characters have two evolutionary ‘options’:
– Remain constant from ancestor to descendant.
– Change between ancestor and descendant.
• Issues with discrete vs. continuous characters:
– Change in state between nodes on tree:
• Discrete character states might change or not.
– Synapomorphies can be identified qualitatively.
– Character-state changes can be counted.
– Total tree length can be defined in terms of number of
character-state changes.
• Continuous character states are likely to always change
between tree nodes.
– Cladistic premise (based on parsimony): continuous
characters are inappropriate for phylogenetic
analysis.
Issues with continuous characters
• Two approaches:
(1) Convert continuous characters to discrete characters.
• Gap-coding (Archie 1985).
• Range-coding (Pimentel and Riggins 1987).
• Homogeneous subsets (Mishler and De Luna 1991).
(2) Use continuous characters directly:
• Taxa summarized by means or medians.
• Distance methods: pairwise similarity.
• Maximum likelihood or Bayesian methods.
– ‘Brownian motion’ model (random walk).
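Approach (1) can be sketched with a simplified form of gap-coding: taxon means are sorted, and a new discrete state starts wherever the gap between consecutive means exceeds a threshold. The function name, the fixed threshold of 0.5, and the taxon means are assumptions for illustration. Published procedures such as Archie's derive the threshold from within-taxon variation instead of fixing it by hand.

```python
def gap_code(means, min_gap):
    """Assign ordered integer states to taxa from their mean values.

    A new state begins whenever two consecutive sorted means differ
    by more than min_gap.
    """
    codes = {}
    state = 0
    previous = None
    for taxon, value in sorted(means.items(), key=lambda item: item[1]):
        if previous is not None and value - previous > min_gap:
            state += 1
        codes[taxon] = state
        previous = value
    return codes


# Hypothetical taxon means for one continuous character.
taxon_means = {"A": 2.2, "B": 2.0, "C": 3.0, "D": 3.6}
coded_states = gap_code(taxon_means, min_gap=0.5)
```

With these values A and B share a state, while C and D each get their own. A different threshold would give a different coding, which is the main criticism of this family of methods.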
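Tree length
[Figure, left panel: discrete characters. Taxa A (a1 b0), B (a1 b0), C (a0 b1), and D (a0 b0), with ancestral states a0 b0. Two changes are marked on branches, a1 and b1. Tree length = 2.]
[Figure, right panel: continuous character. Taxa A = 2.2, B = 2.0, C = 3.0, D = 3.6, with reconstructed ancestral values 2.6, 3.2, and 3.0. Tree length = 2.0.]
(Requires explicit ‘reconstruction’ of ancestral states)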
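Both tree lengths can be reproduced once ancestral states are written out explicitly. The sketch below rests on assumptions: the figure's layout is not recoverable from the text, so the topology (((A,B),C),D), the internal node names, and the placement of the values 2.6, 3.0, and 3.2 on those nodes were chosen because they are consistent with the reported lengths of 2 and 2.0.

```python
# Assumed topology (((A,B),C),D), listed as (parent, child) branches.
EDGES = [
    ("AB", "A"), ("AB", "B"),
    ("ABC", "AB"), ("ABC", "C"),
    ("root", "ABC"), ("root", "D"),
]


def tree_length(values, cost):
    """Sum a per-branch cost over all branches of the tree."""
    return sum(cost(values[parent], values[child]) for parent, child in EDGES)


def count_changes(x, y):
    """Discrete cost: number of characters whose state differs."""
    return sum(1 for s, t in zip(x, y) if s != t)


def absolute_change(x, y):
    """Continuous cost: absolute difference along the branch."""
    return abs(x - y)


# States are (a, b) pairs; internal nodes carry assumed reconstructions.
discrete = {
    "A": (1, 0), "B": (1, 0), "C": (0, 1), "D": (0, 0),
    "AB": (1, 0), "ABC": (0, 0), "root": (0, 0),
}
# Assumed placement of the reconstructed values on internal nodes.
continuous = {
    "A": 2.2, "B": 2.0, "C": 3.0, "D": 3.6,
    "AB": 2.6, "ABC": 3.0, "root": 3.2,
}

discrete_length = tree_length(discrete, count_changes)
continuous_length = tree_length(continuous, absolute_change)
```

The discrete length is a count of changes, most branches contributing zero. The continuous length is a sum of amounts of change, with nearly every branch contributing something, so it depends directly on the reconstructed ancestral values.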