Evolution
Intro to Phylogenetic Trees
Lecture 6
Sections 7.1, 7.2, in Durbin et al.
Chapter 17 in Gusfield
Slides by Shlomo Moran and by Ydo Wexler. Modifications by Benny Chor
Source: Alberts et al
The Tree of Life
Tree of life- a better picture
Primate evolution
Historical Note
Until mid 1950’s phylogenies were constructed by
experts based on their opinion (subjective criteria)
Since then, focus on objective criteria for
constructing phylogenetic trees
Thousands of articles in the last decades
Important for many aspects of biology
Classification
Understanding biological mechanisms
Morphological vs. Molecular
Classical phylogenetic analysis: morphological
features: number of legs, lengths of legs, etc.
Modern biological methods allow the use of molecular
features
Gene sequences
Protein sequences
Morphological topology
Analysis based on homologous sequences (e.g.,
globins) in different species
From sequences to a phylogenetic tree
Rat      QEPGGLVVPPTDA
Rabbit   QEPGGMVVPPTDA
Gorilla  QEPGGLVVPPTDA
Cat      REPGGLVVPPTEG
Nuclear topology
[Figure: tree with leaves Round Eared Bat, Flying Fox, Hedgehog, Mole, Pangolin, Whale, Hippo, Cow, Pig, Cat, Dog, Horse, Rhino, Rat, Capybara, Rabbit, Flying Lemur, Tree Shrew, Human, Galago, Sloth, Hyrax, Dugong, Elephant, Aardvark, Elephant Shrew, Opossum, Kangaroo]
Mitochondrial topology
[Figure: tree with leaves Donkey, Horse, Indian rhino, White rhino, Grey seal, Harbor seal, Dog, Cat, Blue whale, Fin whale, Sperm whale, Hippopotamus, Sheep, Cow, Alpaca, Pig, Little red flying fox, Ryukyu flying fox, Horseshoe bat, Japanese pipistrelle, Long-tailed bat, Jamaican fruit-eating bat, Asiatic shrew, Long-clawed shrew, Mole, Small Madagascar hedgehog, Aardvark, Elephant, Armadillo, Rabbit, Pika, Tree shrew, Bonobo, Chimpanzee, Man, Gorilla, Sumatran orangutan, Bornean orangutan, Common gibbon, Barbary ape, Baboon, White-fronted capuchin, Slow loris, Squirrel, Dormouse, Cane-rat, Guinea pig, Mouse, Rat, Vole, Hedgehog, Gymnure, Bandicoot, Wallaroo, Opossum, Platypus]
Theory of Evolution
Basic idea
speciation events lead to creation of different
species.
Speciation caused by physical separation into
groups where different genetic variants become
dominant
Any two species share a (possibly distant) common
ancestor
Phylogenetic trees
Types of Trees
A natural model to consider is that of rooted trees
Common
Ancestor
Aardvark Bison Chimp Dog
Elephant
Leaves - current day species
Nodes - hypothetical most recent common ancestors
Edge lengths - “time” from one speciation to the next
Types of trees
Unrooted tree represents the same phylogeny without
the root node
[Figure: Tree a, Tree b, Tree c]
Depending on the model, data from current day species does
not distinguish between different placements of the root.
Positioning Roots in Unrooted Trees
We can estimate the position of the root by
introducing an outgroup:
a set of species that are definitely distant from all
the species of interest
Proposed root
Falcon
Aardvark Bison Chimp Dog
Elephant
Two Approaches to Tree Construction
We start with distance based methods, considering
the following question:
Given a set of species (leaves in a supposed tree),
and distances between them – construct a
phylogeny which best “fits” the distances.
Type of Reconstruction
Distance-based
Input is a matrix of distances between species
Can be the fraction of residues on which they disagree, or
an alignment score between them, or …
Character-based
Examine all characters (AAs or DNA bases).
Do not "summarize" sequences or pairs of
sequences by a single number.
Major methods: Parsimony; Likelihood.
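One simple distance mentioned above is the fraction of residues on which two aligned sequences disagree. A minimal sketch, assuming the sequences are already aligned and of equal length; the function name `p_distance` is made up for illustration, and the two fragments are the Rat and Rabbit sequences quoted earlier in these slides.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return mismatches / len(seq_a)


# Fragments quoted earlier in the slides (13 residues each).
rat = "QEPGGLVVPPTDA"
rabbit = "QEPGGMVVPPTDA"

# One mismatch (L vs M) out of 13 positions.
rat_rabbit = p_distance(rat, rabbit)
```

Filling an L×L matrix with such values gives the input expected by the distance-based methods.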
Exact solution: Additive sets
Given a set M of L objects with an L×L distance matrix:
d(i,i)=0, and for i ≠ j, d(i,j)>0
d(i,j)=d(j,i).
For all i,j,k it holds that d(i,k) ≤ d(i,j)+d(j,k).
Can we construct a weighted tree which realizes these
distances?
Additive Distances (cont)
We say that the set of distances M over L objects is
additive if there is a tree T, L of whose nodes correspond to
the L objects, with positive weights on the edges, such that
for all i,j,
d(i,j) = dT(i,j), the length of the path from i to j in T.
Distances for three objects are always additive:
for L=3, there is always a (unique) tree with one
internal node m (by simple linear algebra).
With edge lengths a, b, c from the leaves i, j, k to m:
d(i,j) = a + b
d(i,k) = a + c
d(j,k) = b + c
Thus
c = d(k,m) = (1/2)[d(i,k) + d(j,k) - d(i,j)] ≥ 0
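The three-object case can be checked numerically. A minimal sketch, assuming leaves i, j, k hang off a single internal node m with edge lengths a, b, c; the function name `three_leaf_edges` is hypothetical.

```python
def three_leaf_edges(d_ij: float, d_ik: float, d_jk: float):
    """Solve a+b=d_ij, a+c=d_ik, b+c=d_jk for the edge lengths to the internal node."""
    a = (d_ij + d_ik - d_jk) / 2  # edge from i to m
    b = (d_ij + d_jk - d_ik) / 2  # edge from j to m
    c = (d_ik + d_jk - d_ij) / 2  # edge from k to m
    return a, b, c


# Illustrative distances (not from the slides): 5, 6, 7.
a, b, c = three_leaf_edges(5, 6, 7)
```

Each length is non-negative exactly when the triangle inequality holds, which is why three objects are always additive.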
The Four Points Condition
How about four objects?
Not all distance matrices with 4 objects are additive, even
if they satisfy triangle inequality.
E.g., no tree realizes these distances:
[Figure: a distance matrix on four objects i, j, k, l]
Theorem: A set M of distances is additive iff any subset of
four objects can be labeled i,j,k,l so that:
d(i,k) + d(j,l) = d(i,l) + d(k,j) ≥ d(i,j) + d(k,l)
We call (i,j),(k,l) the “split” of {i,j,k,l}.
Note: Sometimes the tree is required to be binary, and then
the edge weights are required to be just non-negative.
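The four points condition translates directly into a check: of the three pairwise sums, the two largest must be equal, and the smallest sum names the split. A minimal sketch, assuming the matrix is a symmetric list of lists; `quartet_split`, `is_additive` and the tolerance `tol` are illustrative names, and `d_bad` is a made-up non-additive matrix. The matrix `d_tree` uses the distances of the A, B, C, D example that appears later in these slides.

```python
from itertools import combinations


def quartet_split(d, i, j, k, l, tol=1e-9):
    """Return the split of {i,j,k,l}, or None if the four points condition fails."""
    sums = [
        (d[i][j] + d[k][l], ((i, j), (k, l))),
        (d[i][k] + d[j][l], ((i, k), (j, l))),
        (d[i][l] + d[j][k], ((i, l), (j, k))),
    ]
    sums.sort(key=lambda t: t[0])
    smallest, middle, largest = sums
    if abs(largest[0] - middle[0]) > tol:
        return None  # the two largest sums differ: not additive
    return smallest[1]


def is_additive(d, tol=1e-9):
    """Check the four points condition on every quartet (O(L^4) time)."""
    return all(
        quartet_split(d, *q, tol) is not None
        for q in combinations(range(len(d)), 4)
    )


d_tree = [[0, 15, 25, 34], [15, 0, 30, 31], [25, 30, 0, 49], [34, 31, 49, 0]]
d_bad = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]  # hypothetical
```

This sketch only tests the four points condition itself; it does not separately verify symmetry or the triangle inequality.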
Proof:
Additivity ⇒ 4P Condition:
By inspecting the figure, additivity ⇒ 4 points condition.
[Figure: tree with leaves i, j attached to internal vertex m, leaves k, l attached to internal vertex n, and internal edge (m,n)]
4P Condition ⇒ Additivity:
Induction on the number of objects, L.
For L ≤ 3 the condition is empty and tree exists.
Consider L=4.
Denote B = d(i,k) + d(j,l) = d(i,l) + d(j,k) ≥ d(i,j) + d(k,l) = A
Let y = (B - A)/2 ≥ 0 (length of internal edge).
Then the tree should look as follows:
[Figure: edges (i,m) = a, (j,m) = b, (m,n) = y, (k,n) = c, (l,n) = f]
We want to find the distances a, b, c and f.
Tree construction for L=4
Construct the tree by the given distances as follows:
1. Construct a tree for {i,j,k}, with internal vertex m
2. Add vertex n, d(m,n) = y
3. Add edge (n,l), c+f = d(k,l)
Remains to prove:
d(i,l) = dT(i,l)
d(j,l) = dT(j,l)
Again, an instance of linear algebra
Proof for L=4
By the 4 points condition and the definition of y:
d(i,l) = d(i,j) + d(k,l) + 2y - d(k,j) = a + y + f = dT(i,l)
(the middle equality holds since d(i,j), d(k,l) and d(k,j)
are realized by the tree)
d(j,l) = dT(j,l) is proved similarly.
Splits Approach to Proof: Intuition
Suppose 4 points condition holds with strict inequality, >,
for every four leaves.
This defines a (2,2) partition of every quartet.
Can use 4 points condition to show all quartets are consistent.
This in turn is used to construct the
tree (homework assignment).
Finally show tree distances agree
with original distances using linear
algebra.
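The L=4 construction can be followed step by step in code. A minimal sketch, assuming the split is (i,j),(k,l), so that i and j attach to internal vertex m, k and l attach to internal vertex n, and y is the length of edge (m,n); the function name `quartet_tree` and the tolerance are illustrative. The numbers in the example are the A, B, C, D distances used in the sisters identification example of these slides, with i=A, j=C, k=B, l=D.

```python
def quartet_tree(d_ij, d_ik, d_il, d_jk, d_jl, d_kl, tol=1e-9):
    """Return edge lengths (a, b, c, f, y) of the tree with split (i,j),(k,l)."""
    A = d_ij + d_kl
    B = d_ik + d_jl
    y = (B - A) / 2                      # internal edge (m, n)
    # Step 1: tree for {i, j, k} with internal vertex m.
    a = (d_ij + d_ik - d_jk) / 2         # edge (i, m)
    b = (d_ij + d_jk - d_ik) / 2         # edge (j, m)
    k_to_m = (d_ik + d_jk - d_ij) / 2    # path from k to m
    # Step 2: place n on that path, at distance y from m.
    c = k_to_m - y                       # edge (k, n)
    # Step 3: attach l to n so that c + f = d(k, l).
    f = d_kl - c                         # edge (l, n)
    # The proof shows d(i,l) = a + y + f when the four points condition holds.
    if abs(a + y + f - d_il) > tol or abs(b + y + f - d_jl) > tol:
        raise ValueError("distances do not satisfy the four points condition")
    return a, b, c, f, y


# i=A, j=C, k=B, l=D with d(A,B)=15, d(A,C)=25, d(A,D)=34, d(B,C)=30, d(B,D)=31, d(C,D)=49.
edges = quartet_tree(d_ij=25, d_ik=15, d_il=34, d_jk=30, d_jl=49, d_kl=31)
```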
Linear Algebraic Approach: Induction
Induction step:
Remove L-th object from the set
By induction, there is a tree, T’, for {1,2,…,L-1}.
For each pair of labeled nodes (i,j) in T’, let aij, bij, cij
be defined by the following figure:
[Figure: tree T’ with the path from i to j; mij is the point on that path at distance aij from i and bij from j, and the new leaf L hangs off mij at distance cij]
cij = (1/2)[d(i,L) + d(j,L) - d(i,j)]
Pick i and j that minimize cij.
T is constructed by adding L (and possibly mij) to T’,
as in the figure. Then d(i,L) = dT(i,L) and d(j,L) = dT(j,L)
Remains to prove: For each k ≠ i,j: d(k,L) = dT(k,L).
Induction step (cont.)
Let k ≠ i,j be an arbitrary node in T’, and let n be the
branching point of k in the path from i to j.
By the minimality of cij, (i,j),(k,L) is not a split of {i,j,k,L}.
Assume WLOG that (i,L),(j,k) is a split of {i,j,k,L}.
Induction step (end)
Since (i,L),(j,k) is a split, by the 4 points condition
d(L,k) = d(i,k) + d(L,j) - d(i,j)
d(i,k) = dT(i,k) and d(i,j) = dT(i,j) by induction, and
d(L,j) = dT(L,j) by the construction.
Hence d(L,k) = dT(L,k).
QED
From Additive Distance to a Tree
By following the proof, the four point condition can
be used to construct a tree from a distance matrix, or
to decide that there is no such tree (namely that the
distance is not additive).
But this algorithm will go over all quartets, resulting
in O(L^4) many steps for L species (too sllllllllllllow).
The most popular method for constructing trees for
additive sets uses the neighbor joining approach.
Constructing additive trees:
The neighbor joining problem
• Let i, j be sisters (neighboring leaves) in a tree, let k be
their father, and let m be any other vertex.
• Using eq. d(k,m) = [d(i,m) + d(j,m) - d(i,j)]/2
we can compute the distances from k to all other leaves.
This suggests the following method to construct a tree from an
additive distance matrix:
1. Find sisters i,j in the tree,
2. Replace i,j by their father, k, and recursively construct a
tree T for the smaller set.
3. Add i,j as children of k in T.
Neighbor Finding
How can we find from distances alone a pair of sisters
(neighboring leaves)?
Closest nodes are not necessarily neighboring leaves.
[Figure: tree with leaves A, B, C, D and subtrees T1, T2]
Next, we show a way to find neighbors from distances.
Neighbor Finding: Saitou & Nei method
For a leaf i, let ri = Σ d(i,m), where the sum is over all leaves m.
Definition: Let i, j be two leaves (out of L leaves in T).
Then their divergence is D(i,j) = d(i,j) - (ri + rj)/(L - 2)
Theorem (Saitou & Nei): Assume d is additive, with all tree
edge weights positive. If D(i,j) is minimal (among all pairs
of leaves), then i and j are sister
taxa in the tree.
The proof is rather involved, and
will be skipped (no tears pls).
A simpler neighbor finding method:
Select an arbitrary (fixed) node r.
For each pair of labeled nodes (i,j) let C(i,j) be defined
by the following expression (also see figure):
C(i,j) = (1/2)[d(i,r) + d(j,r) - d(i,j)]
[Figure: node r and leaves i, j; C(i,j) is the length of the part of the path from r that is shared by the paths to i and to j]
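The divergence criterion is easy to try on a small additive matrix. A minimal sketch, assuming a symmetric list-of-lists matrix with zero diagonal and at least three leaves; `divergences` and `pair_with_min_divergence` are illustrative names. The matrix is the additive A, B, C, D example from these slides, in which the closest pair by raw distance (A,B) is not a pair of sisters.

```python
def divergences(d):
    """Return {(i, j): D(i, j)} with D(i,j) = d(i,j) - (r_i + r_j)/(L - 2)."""
    L = len(d)
    r = [sum(row) for row in d]  # r_i: sum of distances from i to all leaves
    return {
        (i, j): d[i][j] - (r[i] + r[j]) / (L - 2)
        for i in range(L)
        for j in range(i + 1, L)
    }


def pair_with_min_divergence(d):
    """Pick a pair of leaves minimizing the divergence (ties: first found)."""
    D = divergences(d)
    return min(D, key=D.get)


# Leaves 0..3 stand for A, B, C, D.
d = [[0, 15, 25, 34], [15, 0, 30, 31], [25, 30, 0, 49], [34, 31, 49, 0]]
D = divergences(d)
sisters = pair_with_min_divergence(d)
```

Here the minimum is shared by (A,C) and (B,D), the two sister pairs of the tree, while (A,B), the pair with the smallest raw distance, scores higher.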
Sisters Identification: Example
[Figure: tree with leaves A, B, C, D and edge weights 5, 4, 20, 6, 25]
C(i,j) = (1/2)[d(i,r) + d(j,r) - d(i,j)]
Select arbitrarily r=A.
C(B,C)=(15+25-30)/2=5
C(B,D)=(15+34-31)/2=9
C(C,D)=(25+34-49)/2=5
Claim: Let i, j be such that C(i,j) is maximized.
Then i and j are neighboring leaves.
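The three values in this example can be recomputed mechanically. A minimal sketch, assuming the pairwise distances implied by the example (d(A,B)=15, d(A,C)=25, d(A,D)=34, d(B,C)=30, d(B,D)=31, d(C,D)=49) and r=A; the helper name `c_value` is hypothetical.

```python
def c_value(d, r, i, j):
    """C(i,j) = (1/2)[d(i,r) + d(j,r) - d(i,j)] for a fixed node r."""
    return (d[i][r] + d[j][r] - d[i][j]) / 2


names = "ABCD"
d = [[0, 15, 25, 34], [15, 0, 30, 31], [25, 30, 0, 49], [34, 31, 49, 0]]
r = 0  # r = A

# C(i,j) for every pair of leaves other than r.
scores = {
    (names[i], names[j]): c_value(d, r, i, j)
    for i in range(1, 4)
    for j in range(i + 1, 4)
}
best_pair = max(scores, key=scores.get)
```

The maximizing pair is (B,D), which the claim identifies as neighboring leaves.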
Neighbor Joining Algorithm
Set M to contain all leaves, and select a root r. |M|=L
If L=2, return a tree of two vertices
Iteration:
Choose i,j such that C(i,j) is maximal
Create a new vertex k, and update distances:
d(i,k) = [d(i,j) + d(i,r) - d(j,r)] / 2
d(j,k) = d(i,j) - d(i,k)
for each other node m, d(k,m) = (1/2)[d(i,m) + d(j,m) - d(i,j)]
remove i,j, and add k to M
Recursively construct a tree on the smaller set.
When done, add i,j as children of k, at distances d(i,k) and d(j,k).
Complexity of Neighbor Joining Algorithm
Naive Implementation:
Initialization: Θ(L^2) to compute the C(i,j)’s.
Each Iteration:
O(L) to update {C(i,k): i ∈ M} for the new node k.
O(L^2) to find the maximal C(i,j).
Total of O(L^3).
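The iteration can be sketched as a short loop. This is a naive O(L^3) illustration of the C(i,j)-based variant described in these slides, not a tuned implementation; it assumes an additive, symmetric distance dictionary with at least two nodes, and it assumes the generated internal names `k0`, `k1`, ... do not clash with leaf names. The function name `neighbor_joining` and the edge-list output format are choices made for this example.

```python
from itertools import combinations


def neighbor_joining(dist, root):
    """dist: {u: {v: distance}}. Returns a list of (node, node, length) edges."""
    # Work on a copy without self-distances.
    d = {u: {v: w for v, w in row.items() if v != u} for u, row in dist.items()}
    edges = []
    count = 0
    while len(d) > 2:
        candidates = [u for u in d if u != root]
        # Maximizing d(i,r) + d(j,r) - d(i,j) is the same as maximizing C(i,j).
        i, j = max(
            combinations(candidates, 2),
            key=lambda p: d[p[0]][root] + d[p[1]][root] - d[p[0]][p[1]],
        )
        k = f"k{count}"  # new internal vertex (hypothetical naming scheme)
        count += 1
        d_ik = (d[i][j] + d[i][root] - d[j][root]) / 2
        d_jk = d[i][j] - d_ik
        edges.append((k, i, d_ik))
        edges.append((k, j, d_jk))
        # Distances from k to every remaining node m.
        new_row = {
            m: (d[i][m] + d[j][m] - d[i][j]) / 2 for m in d if m not in (i, j)
        }
        del d[i], d[j]
        for m, row in d.items():
            del row[i], row[j]
            row[k] = new_row[m]
        d[k] = new_row
    u, v = d  # the two remaining nodes
    edges.append((u, v, d[u][v]))
    return edges


names = ["A", "B", "C", "D"]
matrix = [[0, 15, 25, 34], [15, 0, 30, 31], [25, 30, 0, 49], [34, 31, 49, 0]]
dist = {a: {b: matrix[x][y] for y, b in enumerate(names)} for x, a in enumerate(names)}
edges = neighbor_joining(dist, root="A")
```

On the four-leaf example with r=A this recovers the edge weights 5, 4, 20, 6, 25 of the tree in the sisters identification example.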
Complexity of Neighbor Joining Algorithm
Using a Heap to store the C(i,j)’s:
Initialization: Θ(L^2) to compute and heapify the C(i,j)’s.
Each Iteration:
O(1) to find the maximal C(i,j).
O(L log L) to delete {C(m,i), C(m,j)} and add C(m,k) for
all vertices m.
Total of O(L^2 log L).
(implementation details are omitted)
Reconstructing Trees from Additive Matrices
Given a distance matrix constituting an additive metric, the
topology of the corresponding additive tree is unique.
     A    B    C    D    E
A    0    2    7    4    7
B         0    7    4    7
C              0    7    6
D                   0    7
E                        0
[Figure: the tree with leaves A, B, C, D, E realizing this matrix, with edge weights 1, 1, 1, 2, 2, 3, 3]
Q: Do we have to test additivity before running NJ?
A: This would be bad news, as this takes O(L^4) time!
A: By Saitou-Nei, if the matrix is additive, NJ will
construct the correct tree. The algorithm does not care
about awareness and need not know anything
about the matrix!
NJ Algorithm: Example
Here, we use the divergence,
D(i,j) = d(i,j) - (ri + rj)/(L - 2)
Let ri be the sum of distances
from i to every other node:
ri = Σ d(i,j), summing over j = 1, ..., n
• Identify i,j as neighbours if their divergence is minimal.
• Combine i,j into a new node u.
• Update the distance matrix.
• If only 3 nodes are left – finish.
Distance Matrix
     A    B    C    D
A    0    2    3    6
B    2    0    3    5
C    3    3    0    6
D    6    5    6    0
rA = 11, rB = 10, rC = 12, rD = 17
D(A,B) = -8.5   D(A,C) = -8.5   D(A,D) = -8
D(B,C) = -8     D(B,D) = -8.5   D(C,D) = -8.5
A and B are joined into a new node U.
Distance Matrix
     U    C    D
U    0    3    5.5
C    3    0    6
D    5.5  6    0
rU = 8.5, rC = 9, rD = 11.5
X(U,C) = -5.75   X(U,D) = -4.5   X(C,D) = -4.25
Distance Matrix
     Y    D
Y    0    5.6
D    5.6  0
Reconstructing Trees from
non Additive Matrices
Almost Additive Matrix
Let d be the additive distances of a tree T with edge lengths l(e).
A distance matrix d’ is almost additive with respect to T if
|d - d’|∞ = max over i,j of |d(i,j) - d’(i,j)| < (1/2) min over edges e of l(e)
Atteson: If d’ is almost additive with respect to a tree
T, then the output of NJ is a tree T’ with the same
topology as T
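The almost-additive condition can be stated as a one-line test. A minimal sketch, assuming the condition is read as "the largest entrywise deviation from the tree distances is strictly less than half the shortest edge"; the function name `is_almost_additive` and the perturbed matrices are made up for illustration, while the tree distances and edge lengths are those of the A, B, C, D example in these slides.

```python
def is_almost_additive(d_obs, d_tree, edge_lengths):
    """True if max |d_obs(i,j) - d_tree(i,j)| < min(edge_lengths) / 2."""
    n = len(d_tree)
    max_dev = max(
        abs(d_obs[i][j] - d_tree[i][j]) for i in range(n) for j in range(n)
    )
    return max_dev < min(edge_lengths) / 2


d_tree = [[0, 15, 25, 34], [15, 0, 30, 31], [25, 30, 0, 49], [34, 31, 49, 0]]
edge_lengths = [5, 20, 4, 6, 25]  # shortest edge is 4, so the bound is 2

# Hypothetical noisy measurements: d(A,B) off by 1, and off by 3.
d_small_noise = [row[:] for row in d_tree]
d_small_noise[0][1] = d_small_noise[1][0] = 16
d_large_noise = [row[:] for row in d_tree]
d_large_noise[0][1] = d_large_noise[1][0] = 18
```

Under this reading, `d_small_noise` is within the bound and `d_large_noise` is not, so only for the former does the theorem guarantee the topology.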
Unrooted Tree - NJ
Root
Output - NJ
Branch length
is proportional
to distance
N-J Method produces an Unrooted,
Additive tree
Neighbor-Joining Method
An Example
What is required for the Neighbour joining method?
Distance matrix
0. Distance Matrix
PAM        Spinach   Rice    Mosquito   Monkey   Human
Spinach    0.0       84.9    105.6      90.8     86.3
Rice       84.9      0.0     117.8      122.4    122.6
Mosquito   105.6     117.8   0.0        84.7     80.8
Monkey     90.8      122.4   84.7       0.0      3.3
Human      86.3      122.6   80.8       3.3      0.0
1. First Step
Monkey and Human, the pair with the smallest distance (3.3), are joined into the subtree Mon-Hum.
2. Calculation of New Distances
After we have joined two species in a subtree we have to compute the
distances from every other node to the new subtree. We do this with a
simple average of distances:
Dist[Spinach, MonHum]
= (Dist[Spinach, Monkey] + Dist[Spinach, Human])/2
= (90.8 + 86.3)/2 = 88.55
[Figure: subtree Mon-Hum joining Monkey and Human; Spinach, Rice and Mosquito not yet joined]
3. Next Cycle
PAM        Spinach   Rice    Mosquito   MonHum
Spinach    0.0       84.9    105.6      88.6
Rice       84.9      0.0     117.8      122.5
Mosquito   105.6     117.8   0.0        82.8
MonHum     88.6      122.5   82.8       0.0
[Figure: Mosquito is joined to Mon-Hum, giving Mos-(Mon-Hum)]
4. Penultimate Cycle
PAM         Spinach   Rice    MosMonHum
Spinach     0.0       84.9    97.1
Rice        84.9      0.0     120.2
MosMonHum   97.1      120.2   0.0
[Figure: Spinach and Rice are joined, giving Spin-Rice]
5. Last Joining
PAM         SpinRice   MosMonHum
SpinRice    0.0        108.7
MosMonHum   108.7      0.0
[Figure: (Spin-Rice)-(Mos-(Mon-Hum))]
The result:
Unrooted Neighbor-Joining Tree
[Figure: unrooted tree with leaves Spinach, Rice, Mosquito, Monkey, Human]
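The worked example can be replayed in a few lines. This sketch follows the example exactly as stated there, joining the pair with the smallest distance and replacing it by a simple average of distances; it does not use the divergence criterion. The function name `join_closest_and_average` and the nested-parentheses naming of merged nodes are choices made for this illustration. The slides round to one decimal at every step, so intermediate values here can differ from the tables in the last digit.

```python
from itertools import combinations


def join_closest_and_average(names, matrix):
    """Repeatedly join the closest pair; return the list of (u, v, distance) merges."""
    d = {
        a: {b: matrix[i][j] for j, b in enumerate(names) if j != i}
        for i, a in enumerate(names)
    }
    merges = []
    while len(d) > 1:
        u, v = min(combinations(d, 2), key=lambda p: d[p[0]][p[1]])
        merges.append((u, v, d[u][v]))
        merged = f"({u},{v})"
        # Simple average of the distances to the two joined nodes.
        new_row = {x: (d[x][u] + d[x][v]) / 2 for x in d if x not in (u, v)}
        del d[u], d[v]
        for x, row in d.items():
            del row[u], row[v]
            row[merged] = new_row[x]
        d[merged] = new_row
    return merges


names = ["Spinach", "Rice", "Mosquito", "Monkey", "Human"]
pam = [
    [0.0, 84.9, 105.6, 90.8, 86.3],
    [84.9, 0.0, 117.8, 122.4, 122.6],
    [105.6, 117.8, 0.0, 84.7, 80.8],
    [90.8, 122.4, 84.7, 0.0, 3.3],
    [86.3, 122.6, 80.8, 3.3, 0.0],
]
merges = join_closest_and_average(names, pam)
```

The merge order matches the cycles above: Monkey with Human, then Mosquito, then Spinach with Rice, then the two subtrees.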
Dangers of Paralogs
[Figure: a gene duplication followed by speciation events, giving genes 1A, 2A, 3A and 3B, 2B, 1B in species 1, 2, 3]
If we happen to consider genes 1A, 2B, and 3A of species
1,2,3, we get a wrong tree that does not represent the
species phylogeny.
Distance Based Reconstruction:
We now move to character
based methods
Character-based methods
for constructing phylogenies
In this approach, trees are constructed by comparing the
characters of the corresponding species. Characters may be
morphological (teeth structures, hip joint) or molecular
(homologous DNA sequences). The most popular
approaches are maximum parsimony (MP) and maximum
likelihood (ML)
In both methods, we will assume independence of
characters (no interactions). Each method has a well
defined objective function. The goal is to find the tree or trees
that optimize (maximize or minimize) the respective function.
1. Maximum Parsimony
[Figure: example tree with the number of substitutions (1, 2, 1) marked on its edges]
Here, total #substitutions = 4
Example Continued
There are many trees possible. For example:
[Figure: two trees; left: Total #substitutions = 3, right: Total #substitutions = 4]
The left tree is preferred over the right tree.
Example With One Letter
Suppose we have five species, such that three
have ‘C’ and two ‘T’ at a specified position
Minimal tree has one evolutionary change:
[Figure: tree with leaves C, C, C, T, T and a single change between C and T]
Extension to Many Letters
What is the parsimony score of
[Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant]
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA
Weighted Parsimony Scores
Evaluating Weighted Parsimony
Scores
Each position is independent and computed by itself.
Use Dynamic Programming on a given tree.
S(k,a) is the minimum score of the subtree rooted at k when k has
character a.
If k is a node with children i and j, then
S(k,a) = min_x(S(i,x)+c(a,x)) + min_y(S(j,y)+c(a,y))
Evaluating Parsimony Scores
Dynamic programming on a given tree
Initialization:
For each leaf i set S(i,a) = 0 if i is labeled by a, otherwise S(i,a) = ∞
Iteration:
if k is a node with children i and j, then
S(k,a) = min_x(S(i,x)+c(a,x)) + min_y(S(j,y)+c(a,y))
Termination:
cost of tree is min_x S(r,x) where r is the root
Comment:
To reconstruct an optimal assignment, we need to keep in each
node k and for each character a the two characters x, y that
bring about the minimum when k has character a.
Cost of Evaluating Parsimony for binary
trees
If there are n nodes, m characters, and k possible
values for each character, then complexity is O(nmk^2)
Of course, we still need to search over possible trees
and find the best one. One usually resorts to
heuristic search techniques.
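The dynamic program for one character position can be sketched as a short recursion. This is an illustration under stated assumptions: the tree is binary and given as nested pairs with leaf names as strings, `cost(a, x)` plays the role of c(a,x), and the function names, leaf names and the two example topologies are invented for the example. It only returns the score; it does not keep the back-pointers needed to reconstruct an optimal assignment.

```python
import math


def parsimony_score(tree, leaf_state, alphabet, cost):
    """Minimum weighted parsimony cost of one character on a given binary tree."""

    def scores(node):
        # S(node, a) for every character a in the alphabet.
        if isinstance(node, str):  # leaf: 0 for its own label, infinity otherwise
            return {a: (0 if leaf_state[node] == a else math.inf) for a in alphabet}
        left, right = node
        s_left, s_right = scores(left), scores(right)
        return {
            a: min(s_left[x] + cost(a, x) for x in alphabet)
            + min(s_right[y] + cost(a, y) for y in alphabet)
            for a in alphabet
        }

    return min(scores(tree).values())  # termination: best character at the root


def unit_cost(a, x):
    """Unweighted parsimony: every substitution costs 1."""
    return 0 if a == x else 1


# Five species, three with 'C' and two with 'T' at the position (names are hypothetical).
states = {"s1": "C", "s2": "C", "s3": "C", "s4": "T", "s5": "T"}
good_tree = ((("s1", "s2"), "s3"), ("s4", "s5"))
other_tree = (("s1", "s4"), (("s2", "s5"), "s3"))

best = parsimony_score(good_tree, states, "CT", unit_cost)
worse = parsimony_score(other_tree, states, "CT", unit_cost)
```

Grouping the two 'T' species together needs a single change, while the second topology needs two. For whole sequences the per-position scores are summed, since positions are treated independently.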
2. Perfect Phylogeny
Data on species is given by a Character State Matrix.
Cell (p,i) has value j iff character i of object (species) p has state j.
Goal: constructing evolution tree for the species.
Character State Matrix (rows: objects, columns: characters)
Object   c1   c2   c3   c4   c5
A        1    1    2    0    0
B        2    0    1    2    1
C        3    2    3    3    1
D        0    3    4    1    0
E        1    1    0    0    1
Motivation: Evolution Tree