Viking database!

• Family tree of Gorm den Gamle, Harald Blåtand,
Svend Tveskæg ..
• Build new class for representing this tree..?
No!
tree.py
We already have a
generic tree class:
Phylogeny_node
general_tree.py
• Create copy with more
general name
• Build Royal class as a
subclass of this class:
– Needs same attributes
and methods, plus
perhaps more
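A minimal sketch of the idea, assuming the generic class in general_tree.py is called Node and stores a name, a father link and a list of sons (the slides do not show its actual contents, so the attribute and method names here are hypothetical):

```python
class Node:
    """Generic tree node: a name, a link to the father, a list of sons."""

    def __init__(self, name, father=None):
        self.name = name
        self.father = father
        self.sons = []
        if father is not None:
            father.sons.append(self)  # register this node with its father

    def siblings(self):
        """Return the father's other sons (empty list for the root)."""
        if self.father is None:
            return []
        return [s for s in self.father.sons if s is not self]

    def __str__(self):
        return self.name


class Royal(Node):
    """Inherits all attributes and methods of Node; more can be added later."""


gorm = Royal("Gorm den Gamle")
harald = Royal("Harald Blåtand", gorm)
print(harald.father, "->", harald)
```

Because Royal is a subclass, every Royal object can be used wherever a Node is expected.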
royal_vikings.py (part 1)
Overrides __str__
method of Node class
• A Royal viking in a family tree has a
father/parent node, a name, and a string
representing the reigning period (if viking was
queen/king)
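The description above could be sketched as follows. The Node class here is only a stand-in for the one in general_tree.py, and the constructor signature and the attribute name reign are assumptions, not the code from the slides:

```python
class Node:
    """Stand-in for the generic class in general_tree.py (assumed interface)."""

    def __init__(self, name, father=None):
        self.name = name
        self.father = father
        self.sons = []
        if father is not None:
            father.sons.append(self)

    def __str__(self):
        return self.name


class Royal(Node):
    def __init__(self, name, father=None, reign=None):
        super().__init__(name, father)
        self.reign = reign  # e.g. "958-987"; None if never queen/king

    def __str__(self):
        # Overrides Node.__str__: append the reign when there is one
        if self.reign is None:
            return self.name
        return f"{self.name} ({self.reign})"


gorm = Royal("Gorm den Gamle", reign="?-958")
harald = Royal("Harald Blåtand", gorm, "958-987")
svend = Royal("Svend Tveskæg", harald, "987-1014")
estrid = Royal("Estrid", svend)  # not queen/king: no reign given
print(svend)   # Svend Tveskæg (987-1014)
print(estrid)  # Estrid
```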
royal_vikings.py (part 2)
Test program
Not queen/king:
no reign given
[..]
royal_vikings.py (part 3)
Navigating the family tree, starting with Niels
Name: Svend Estridsen
Parent: Estrid
Siblings:
Sons: Harald Hen, Knud den Hellige, Oluf Hunger, Erik Ejegod, Niels
(f)ather, (s)on, si(b)ling, (p)rint, (q)uit? f
Name: Estrid
Parent: Svend Tveskæg
Siblings: Harald 2., Knud den Store
Sons: Svend Estridsen
(f)ather, (s)on, si(b)ling, (p)rint, (q)uit? b
sibling (0-1)? 1
Name: Knud den Store
Parent: Svend Tveskæg
Siblings: Harald 2., Estrid
Sons: Knud 3. Hardeknud
(f)ather, (s)on, si(b)ling, (p)rint, (q)uit? p
Knud den Store (1014-1035) - Svend Tveskæg (987-1014) - Harald Blåtand (958-987) - Gorm den Gamle (?-958)
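The (p)rint command lists a viking followed by all ancestors up to the root. One way this might be done is to follow the father links in a loop; the small Royal class and the function name lineage below are hypothetical, written only to make the sketch self-contained:

```python
class Royal:
    """Minimal hypothetical node: name, father link, optional reign."""

    def __init__(self, name, father=None, reign=None):
        self.name = name
        self.father = father
        self.reign = reign

    def __str__(self):
        return f"{self.name} ({self.reign})" if self.reign else self.name


def lineage(node):
    """Return 'node - father - grandfather - ...' up to the root."""
    parts = []
    while node is not None:
        parts.append(str(node))
        node = node.father  # step one generation up
    return " - ".join(parts)


gorm = Royal("Gorm den Gamle", reign="?-958")
harald = Royal("Harald Blåtand", gorm, "958-987")
svend = Royal("Svend Tveskæg", harald, "987-1014")
knud = Royal("Knud den Store", svend, "1014-1035")
print(lineage(knud))
```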
Another kind of tree: Newick trees
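((monkey:100.85,cat:47.14):20.59);
[Figure: tree with leaves monkey (branch length 100.85) and cat (branch length 47.14), joined at an internal node with branch length 20.59]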
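A small recursive parser is enough to read a string like the one above. This is a sketch under simplifying assumptions: no quoted names, no comments, no whitespace inside the string. The class name NewickNode and the function names are hypothetical:

```python
class NewickNode:
    """Hypothetical node: name, distance to father, list of sons."""

    def __init__(self, name="", distance=None):
        self.name = name
        self.distance = distance
        self.sons = []


def parse_newick(text):
    """Parse a Newick string such as '((monkey:100.85,cat:47.14):20.59);'."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    node, pos = _parse_subtree(text, 0)
    if text[pos] != ";":
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return node


def _parse_subtree(text, pos):
    """Parse one subtree starting at pos; return (node, next position)."""
    node = NewickNode()
    if text[pos] == "(":
        while True:
            son, pos = _parse_subtree(text, pos + 1)  # skip '(' or ','
            node.sons.append(son)
            if text[pos] == ")":
                pos += 1
                break
            if text[pos] != ",":
                raise ValueError(f"expected ',' or ')' at position {pos}")
    # Optional name
    start = pos
    while text[pos] not in ",():;":
        pos += 1
    node.name = text[start:pos]
    # Optional ':distance'
    if text[pos] == ":":
        start = pos + 1
        pos = start
        while text[pos] not in ",();":
            pos += 1
        node.distance = float(text[start:pos])
    return node, pos


root = parse_newick("((monkey:100.85,cat:47.14):20.59);")
inner = root.sons[0]
for leaf in inner.sons:
    print(leaf.name, leaf.distance)
```

The outer pair of parentheses gives an unnamed root with a single son, the internal node at distance 20.59, which in turn has the two leaves as sons.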
Project: Newick trees
• Load and parse Newick tree file
– Need Newick class
• Newick node has name, list of sons, distance to father, sequence
• Inherit from general_tree's Node class!
– Need parser
• Check that loaded tree corresponds to “current sequences”
– Create (ID, sequence) dictionary from current seqs (efficient!)
– After parsing tree file, traverse tree and look up sequence from
each node ID, store in node
– Give error message if ID not found
• Calculate “Average Hamming error”
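The consistency check could look roughly like this. The node class, the function name attach_sequences and the IDs are made up for illustration; only the idea (dictionary lookup per node ID, error if the ID is missing) comes from the list above:

```python
class NewickNode:
    """Hypothetical node with a name (ID), sons and a sequence slot."""

    def __init__(self, name, sons=()):
        self.name = name
        self.sons = list(sons)
        self.sequence = None


def attach_sequences(node, seq_by_id):
    """Traverse the tree and store the sequence for every node ID."""
    if node.name not in seq_by_id:
        raise KeyError(f"ID {node.name!r} in tree not among current sequences")
    node.sequence = seq_by_id[node.name]
    for son in node.sons:
        attach_sequences(son, seq_by_id)


# Made-up IDs; a dictionary makes each lookup fast regardless of size
current_seqs = [("root", "CATAT"), ("a", "CGATAT"), ("b", "CGTAT")]
seq_by_id = dict(current_seqs)

tree = NewickNode("root", [NewickNode("a"), NewickNode("b")])
attach_sequences(tree, seq_by_id)
print([son.sequence for son in tree.sons])
```

Building the dictionary once means each node costs a single lookup instead of a scan through the whole sequence list.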
Average Hamming Error in tree
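[Figure: tree over the sequences CATAT, CGATAT, CGTAT, GTAT and CGAGAT; each alignment is labeled mismatches/alignment length: 2/5, 1/6, 1/5, 1/4]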
• Average number of mismatches per alignment position
over all alignments in tree
• (2+1+1+1)/(5+6+5+4) = 5/20 = 0.25 errors per alignment
position
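The arithmetic can be checked with a few lines. The list simply holds the four (mismatches, alignment length) pairs from the example; the variable names are arbitrary:

```python
# (mismatches, alignment length) for each alignment in the example tree
alignments = [(2, 5), (1, 6), (1, 5), (1, 4)]

total_mismatches = sum(m for m, _ in alignments)
total_length = sum(n for _, n in alignments)
average = total_mismatches / total_length

print(f"Average Hamming error: {average:.3f}")  # Average Hamming error: 0.250
```

Note that the totals are summed before dividing. Averaging the four fractions instead would give about 0.254, which is a different quantity.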
hamming.py (part 1)
• Newick_node derives from Node
• Exercise: Newick_node method
• Example tree: CATAT, CGATAT, CGTAT, GTAT, CGAGAT
• Start with mismatches = 0, alignmentlength = 0
• A leaf returns 0/0
• Add 2/5: mismatches = 2, alignmentlength = 5
• Add 1/6: mismatches = 3, alignmentlength = 11
• This subtree returns 3/11
• The other alignments add 1/5 and 1/4 (leaf below: 0/0)
• Total for the tree: 5/20
hamming.py (part 2)
• Average Hamming Error for the example tree (CATAT, CGATAT, CGTAT, GTAT, CGAGAT)
• Output: Average Hamming error: 0.250
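A recursive method in this spirit might look as follows. It is not the solution to the exercise, just one possible shape: each node returns the totals for its subtree and a leaf returns 0/0. The slides do not say how sequences of different length are aligned, so this sketch assumes equal-length, pre-aligned sequences and uses a toy tree rather than the one in the example; all names are hypothetical:

```python
class NewickNode:
    """Hypothetical node holding a sequence and a list of sons."""

    def __init__(self, sequence, sons=()):
        self.sequence = sequence
        self.sons = list(sons)

    def hamming_totals(self):
        """Return (mismatches, alignment length) summed over this subtree."""
        mismatches = 0
        alignmentlength = 0
        for son in self.sons:
            # Alignment between this node and the son
            m, n = count_mismatches(self.sequence, son.sequence)
            # Everything below the son (a leaf contributes 0/0)
            sub_m, sub_n = son.hamming_totals()
            mismatches += m + sub_m
            alignmentlength += n + sub_n
        return mismatches, alignmentlength


def count_mismatches(a, b):
    """Position-wise comparison; assumes equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length sequences")
    return sum(x != y for x, y in zip(a, b)), len(a)


# Toy tree (not the one on the slides)
tree = NewickNode("CATAT", [
    NewickNode("CGTAT", [NewickNode("CGTAA")]),
    NewickNode("CATAT"),
])
m, n = tree.hamming_totals()
print(f"Average Hamming error: {m / n:.3f}")
```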
.. on to the exercises