Assignment 7 -- Evolution Trees

Due: Wednesday, March 15th by 9:00 PM
10 points
Dean Zeller
CS10051
Spring, 2006
Objective
The student will use a graphics package to create diagrams of binary evolution trees.
Background
This assignment deals with the cutting-edge topic of bioinformatics, also called computational
biology. It is a complex field of graph theory with applications to mathematics, computer science, and
genetics. An evolutionary tree (or phylogeny) is a tree structure that demonstrates the evolution of species
over time. A tree consists of vertices (nodes) connected by edges (connections). In evolution, a node
represents a point at which a species population “splits” into two genetically different species. Once a
species splits, the two species created are genetically unable to produce offspring with each other. Nodes
with no offspring are called the leaves of the tree, representing the extant (non-extinct) species. Two trees
are isomorphic if they contain the same structure and differ only through symmetry at a node, that is, in the
left-to-right order of that node's offspring. In order to simplify the problem, isomorphic trees are
considered the same for purposes of this research.
[Figure 1 diagram: six trees labeled A through F. Trees A, B, C, and D each carry the label (1242) and
mark internal nodes x, y, and z.]
Figure 1 – Isomorphic trees
A and B are isomorphic at node x.
B and C are isomorphic at node y.
C and D are isomorphic at node z.
A and D are isomorphic at node y.
As such, A, B, C, and D are all isomorphic, and thus considered the
same for purposes of this assignment.
E and F are not isomorphic because they are structurally different.
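The symmetry test described in Figure 1 can also be checked mechanically. The sketch below is one possible approach and is not part of the assignment. It assumes a tree is written as nested Python tuples, with () for a leaf and a tuple of offspring for any other node. The names LEAF, canonical, and isomorphic are hypothetical, chosen only for illustration, and the four sample trees are not claimed to match the lettered trees in the figure.

```python
LEAF = ()  # assumed encoding: a leaf is the empty tuple

def canonical(tree):
    """Return the tree with every node's offspring put in a fixed sorted order."""
    return tuple(sorted(canonical(child) for child in tree))

def isomorphic(a, b):
    """Two trees are isomorphic when their canonical forms are identical."""
    return canonical(a) == canonical(b)

# The same five-leaf tree written twice, with offspring order swapped at two nodes
tree_1 = (((LEAF, LEAF), LEAF), (LEAF, LEAF))
tree_2 = ((LEAF, LEAF), (LEAF, (LEAF, LEAF)))

# Two structurally different six-leaf trees
tree_3 = ((LEAF, LEAF), ((LEAF, LEAF), (LEAF, LEAF)))
tree_4 = (((LEAF, LEAF), LEAF), ((LEAF, LEAF), LEAF))

print(isomorphic(tree_1, tree_2))  # True
print(isomorphic(tree_3, tree_4))  # False
```

Sorting the offspring at every node removes the left-to-right order, so two drawings of the same tree reduce to the same nested tuple.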

To make trees easy to refer to, each one is given a text description called a classification label. This
allows a text label, instead of a picture, to represent a tree structure. A good classification system
has a unique name for each tree. For this assignment, the classification system is simply a listing of the
number of nodes at each level. Trees A, B, C, and D above have the label (1242), indicating the first
level has one node, the second has two nodes, the third has four nodes, and the fourth has two. Since the
four trees are isomorphic, the same label represents all four trees. Ultimately, this classification system
is incomplete. While simple to understand, it does not provide unique names for each tree at the higher
levels. Trees E and F, for example, have the same number of nodes at every level but are not isomorphic,
and thus are given separate labels (1244a and 1244b). This system will suffice for now, but can get
confusing as the size of the trees increases.
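As an illustration of how a label is derived, the following sketch counts the nodes on each level of a tree. It assumes the tree is stored as nested Python tuples, with () for a leaf. That encoding and the function names level_counts and classification are assumptions made for this example, not something the assignment prescribes. The letter suffixes (a, b) are not generated, since the text does not define a rule for assigning them.

```python
LEAF = ()  # assumed encoding: a leaf is the empty tuple

def level_counts(tree):
    """Return the number of nodes on each level, starting at the root."""
    counts = []
    level = [tree]
    while level:
        counts.append(len(level))
        # the next level is made up of every offspring of the current level
        level = [child for node in level for child in node]
    return counts

def classification(tree):
    """Join the level counts into a label such as (1242)."""
    return "(" + "".join(str(n) for n in level_counts(tree)) + ")"

five_leaf = (((LEAF, LEAF), LEAF), (LEAF, LEAF))
six_leaf_1 = ((LEAF, LEAF), ((LEAF, LEAF), (LEAF, LEAF)))
six_leaf_2 = (((LEAF, LEAF), LEAF), ((LEAF, LEAF), LEAF))

print(classification(five_leaf))   # (1242)
print(classification(six_leaf_1))  # (1244)
print(classification(six_leaf_2))  # (1244)
```

The last two trees are structurally different yet produce the same digits, which is the situation the a and b suffixes are meant to resolve. Note also that the digits become ambiguous once a level holds ten or more nodes.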
Introductory study of phylogenies makes another assumption that greatly simplifies the problem.
For purposes of this assignment, the only non-leaf nodes allowed are nodes with exactly two
offspring, representing a point in time at which a population “splits” into two populations. A node with
a single offspring is called a redundant node and does not significantly add to the tree structure. Nodes
with three or more children can be isomorphically approximated, and thus can be ignored at this stage in
the research. While all nodes in evolution trees are unique species, at this point only the leaf nodes need
to be considered.
[Figure 2 diagram: two tree structure replacements, a) and b).]
Figure 2 – Tree structure replacements
a) Redundant node removed
b) Isomorphic approximation
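The redundant-node replacement in Figure 2a can be sketched in code. The example below assumes a tree is stored as nested Python tuples, with () for a leaf and a tuple of offspring otherwise, and it assumes that "removing" a redundant node means replacing it with its single offspring. The function name remove_redundant is hypothetical. The isomorphic approximation in Figure 2b is not attempted here, because the text does not spell out its rule.

```python
LEAF = ()  # assumed encoding: a leaf is the empty tuple

def remove_redundant(tree):
    """Return a copy of the tree in which every single-offspring node is skipped."""
    # Walk down past any chain of nodes that have exactly one offspring
    while len(tree) == 1:
        tree = tree[0]
    # Rebuild the node from its cleaned-up offspring
    return tuple(remove_redundant(child) for child in tree)

# The root's first offspring is a redundant node sitting above a two-leaf split
with_redundant = (((LEAF, LEAF),), LEAF)

print(remove_redundant(with_redundant))  # (((), ()), ())
```

After the call, every remaining non-leaf node in this sample has exactly two offspring, which is the form the assignment works with.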
Task 1 – Draw Given Trees (5 points)
Given below are thirteen evolution trees. These represent the complete set of all non-isomorphic
evolution trees of up to six leaves (11 nodes). Use a graphics package to recreate these trees. Give the
classification for each tree and label its leaves with successive letters (a, b, c, etc.). Your tree style
may differ from the trees below, but the structure must be correct.
[Figure: the thirteen trees, with classifications (1222), (12222), (122), (124), (12), (1224), (1242),
(122222), (12224), (12242), (12422), (1244a), and (1244b). The leaves of each tree are labeled with
successive letters starting at a.]
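Before drawing a tree in a graphics package, it can help to work out where each node goes. The sketch below is one simple layout scheme and is not required by the assignment. Leaves are spaced one unit apart from left to right and given successive letters, each non-leaf node is centered over its offspring, and the vertical position is the node's depth. The nested-tuple encoding (() for a leaf) and the function name layout are assumptions made for this example.

```python
from string import ascii_lowercase

LEAF = ()  # assumed encoding: a leaf is the empty tuple

def layout(tree):
    """Return (nodes, edges) giving a position for every node of the tree."""
    nodes = []  # one (x, depth, letter) entry per node; letter is None for non-leaf nodes
    edges = []  # (parent_index, child_index) pairs that index into nodes
    leaf_count = 0

    def place(node, depth):
        nonlocal leaf_count
        if not node:
            # a leaf takes the next free column and the next letter
            x, letter = float(leaf_count), ascii_lowercase[leaf_count]
            leaf_count += 1
            children = []
        else:
            # a non-leaf node sits midway above its offspring
            children = [place(child, depth + 1) for child in node]
            x = sum(nodes[c][0] for c in children) / len(children)
            letter = None
        nodes.append((x, depth, letter))
        index = len(nodes) - 1
        edges.extend((index, c) for c in children)
        return index

    place(tree, 0)
    return nodes, edges

nodes, edges = layout(((LEAF, LEAF), LEAF))  # the (122) tree
for x, depth, letter in nodes:
    print(x, depth, letter)
```

Each entry in nodes can then be drawn as a point or circle, and each pair in edges as a line between two of those points, scaled to whatever units the graphics package uses.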
Task 2 – Generate Trees (5 points)
Use a graphics package to create the eleven isomorphically unique evolutionary trees of 7 leaves (13
nodes). The following labels are the classifications for the unique trees: 1222222, 122224, 122242,
122422, 12244a, 12244b, 124222, 12424, 12442a, 12442b, and 1246.
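The tree counts quoted in Tasks 1 and 2 can be cross-checked by enumeration. The sketch below is one way to do that, under stated assumptions: trees are nested Python tuples with () for a leaf, every non-leaf node has exactly two offspring, and mirror images are merged by sorting each pair of offspring. The function names trees and level_counts are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def trees(n):
    """Return every non-isomorphic tree with n leaves, each in sorted (canonical) form."""
    if n == 1:
        return ((),)  # the only one-leaf tree is a single leaf
    found = set()
    for k in range(1, n // 2 + 1):  # the smaller subtree gets k leaves
        for small in trees(k):
            for large in trees(n - k):
                # sorting the pair makes mirror images collapse into one entry
                found.add(tuple(sorted((small, large))))
    return tuple(sorted(found))

def level_counts(tree):
    """Return the number of nodes on each level, starting at the root."""
    counts, level = [], [tree]
    while level:
        counts.append(len(level))
        level = [child for node in level for child in node]
    return counts

for n in range(2, 8):
    print(n, len(trees(n)))
# 2 1
# 3 1
# 4 2
# 5 3
# 6 6
# 7 11

seven_leaf_digits = sorted("".join(map(str, level_counts(t))) for t in trees(7))
print(seven_leaf_digits)
```

The counts for two through six leaves add up to thirteen, and there are eleven seven-leaf trees, matching the figures in the task descriptions. The seven-leaf digit strings agree with the Task 2 list, with 12244 and 12442 each appearing twice, which is where the a and b suffixes are needed.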
Grading:
You will be graded on the following criteria:
Accuracy
Correctly drawing the evolutionary trees, attention to detail.
Extra Credit:
Extra credit will be given for including the following:
• Create the 8-leaf (15-node) trees with the following seventeen classifications: 12222222, 1222224,
1222242, 1222422, 1224222, 1242222, 124224, 124242, 124422a, 124422b, 12444aa, 12444ab, 12444ba,
12444bb, 12462a, 12462b, 1248.
• Create the 9-leaf (17-node) non-isomorphic trees and classifications.
• Create the trees without the redundant node and isomorphic approximation assumptions, allowing
for any number of offspring from a node.