ELE4120 Bioinformatics
Tutorial 9

Content
• Building Phylogenetic Tree
  – Principles and terminology
  – Methods to construct the tree
  – Samples

The construction of trees
• We construct cladograms (= phylogenies or trees) to determine how closely related organisms are.
• We need to understand certain terms before we can construct a tree.
• Polytomy, outgroup, sister group, multifurcation/starbursts, bifurcation

Additional Terms
• Recall: Bifurcation versus Multifurcation/Starbursts (e.g. Trifurcation)
[Figure: two trees over taxa A–E, one fully bifurcating and one containing a trifurcation]
Multifurcation
  – May represent a lack of resolution because too few data are available for inferring the phylogeny
  – May stand for rapid speciation

Additional Terms
• Character – any phenotypic trait of an organism
  – E.g., tongue length, head shape, nucleotide position
• Character state – variant (presumed to be homologous) forms of a character
  – E.g., if the characters are sequence positions, the states are coded as the different amino acids, or as G, A, T, or C

Terms: Homology vs. Homoplasy
• Homology: similar traits inherited from a common ancestor
• Homoplasy: similar traits that are not directly caused by common ancestry (convergent evolution)

Terms: What traits should one evaluate to construct a taxonomy?
1. Phenetic approach: all traits can be useful; the taxonomist must use subjective judgment to decide how important each trait is relative to other traits.
2. Phylogenetic approach: Which traits should be used?
  – Analogies: traits shared because of convergent evolution
  – Homologies: traits shared because a common ancestor had the trait
    – Derived homologies
    – Ancestral homologies

Terms: Should We Use Analogies to Construct Phylogeny?
• Characters are versions of a trait.
• Species 1 & 3 share character B because of convergent evolution: an analogy.
[Figure: tree of species 1–4 with characters B, A, B, A at the tips; B arises independently on the branches to species 1 and 3]

Terms: Should We Use Derived Homologies to Construct Phylogeny?
• Characters are versions of a trait.
• Species 1 & 2 share character B because a recent common ancestor derived B, a new character that species 3 & 4 lack.
[Figure: tree of species 1–4 with characters B, B, A, A at the tips; B arises once in the common ancestor of species 1 and 2]

Terms: Should We Use Ancestral Homologies to Construct Phylogeny?
• Characters are versions of a trait.
• Species 2, 3, & 4 share character A because a common ancestor of these species had A, and other descendants of that common ancestor lost A.
[Figure: tree of species 1–4 with characters B, A, A, A at the tips; A is the ancestral character, replaced by B only in species 1]

Terms: Summary of Previous 4 Slides
• According to a type of phylogenetics called “cladistic phylogenetics” or “cladistics”:
  – Traits to use to construct phylogenetic trees: shared derived characters = derived homologies
  – Traits NOT to use to construct phylogenetic trees:
    • shared ancestral characters = ancestral homologies
    • analogies

Recall Steps of Reconstructing trees
• Choose the taxa
  – whose evolutionary relationships interest you
  – must themselves be clades
• Determine the characters
  – examine each taxon to determine the character states
  – e.g., anatomical traits, or the 362 bases in a particular gene
• Determine the polarity of characters
  – figure out the order of evolution for each character
  – may take some work
  – fossil evidence is helpful
• Group taxa by synapomorphies
  – derived or "changed" character states shared by two or more taxa
  – Assumption: similar features are caused by common ancestry
• Work out conflicts that arise
  – by some clearly stated method, usually parsimony
• Build your tree, following these rules:
  – All taxa go to endpoints, never nodes
  – All nodes must have a list of synapomorphies
  – All synapomorphies appear on the tree only once
• You have made a phylogeny!?
  – A phylogeny is a hypothesis
  – The tree is only as good as the data

Example 1: How to determine the most likely evolutionary tree?
1. Pick a group of taxa-of-interest: for example, cow, deer, hippo, pig, and whale.
2.
Pick a set of traits that vary between these taxa: e.g., whether or not each species contains a DNA insertion at various specific locations in the chromosomes. Observe the traits in those species.

Observed Traits: Presence or Absence of DNA Insertions at Different Specific Locations
(columns are the names of the locations of the different specific insertions)

Taxon    1   3   5   6   7   8  10  11  12  15  18  19  20
Cow      0   0   0   0   0   1   1   1   1   1   1   0   0
Deer     0   0   0   0   0   1   1   1   1   1   1   0   0
Hippo    0   0   1   1   1   0   1   0   1   0   1   0   0
Pig      0   0   0   0   0   0   0   0   0   0   1   1   1
Whale    1   1   1   1   1   0   1   0   1   0   1   0   0

“0” means the insertion is absent; “1” means the insertion is present.

Example 1: building tree
3. Find a control taxon that is similar to the other taxa, but lacks all the traits you are using to construct this tree: for example, a mammal that lacks all of these insertions: the camel.
  – What do we call this control group?

Observed Traits: Include the Outgroup
(columns are the names of the locations of the different specific insertions)

Taxon    1   3   5   6   7   8  10  11  12  15  18  19  20
Cow      0   0   0   0   0   1   1   1   1   1   1   0   0
Deer     0   0   0   0   0   1   1   1   1   1   1   0   0
Hippo    0   0   1   1   1   0   1   0   1   0   1   0   0
Pig      0   0   0   0   0   0   0   0   0   0   1   1   1
Whale    1   1   1   1   1   0   1   0   1   0   1   0   0
Camel    0   0   0   0   0   0   0   0   0   0   0   0   0

“0” means the insertion is absent; “1” means the insertion is present.

Example 1: building tree
4. Construct many hypothetical phylogenetic trees, each tree showing each mutational event: gaining or losing a DNA insertion.
5. Which tree is the most likely?
  – The simplest one = the one that hypothesizes the fewest mutational events!
  – “Parsimony” is the general logical principle that the simplest hypothesis that successfully explains all the observations is the most likely.
  – Select the most parsimonious tree.

Construct many hypothetical phylogenetic trees, each tree showing each mutational event: gaining or losing a DNA insertion.
Any tree must contain 6 tips, representing the 6 taxa (in alphabetical order below) and their current traits, the DNA insertions.

Species  Traits
Camel    none
Cow      8, 10, 11, 12, 15, 18
Deer     8, 10, 11, 12, 15, 18
Hippo    5, 6, 7, 10, 12, 18
Pig      18, 19, 20
Whale    1, 3, 5, 6, 7, 10, 12, 18

Consider one possible phylogenetic tree.
[Figure: tree with Camel as the outgroup; Cow and Deer share one branch (same traits); Hippo, Pig, and Whale (the last 3 taxa in alphabetical order) each have their own branch]
  – Cow + Deer branch: gain 8, 10, 11, 12, 15, 18
  – Hippo branch: gain 5, 6, 7, 10, 12, 18
  – Pig branch: gain 18, 19, 20
  – Whale branch: gain 1, 3, 5, 6, 7, 10, 12, 18
This tree hypothesizes 23 mutational events.
Can you think of another, more likely tree: one that is simpler = has fewer mutational events?
For example, would it be simpler to hypothesize that the insertion at 18 occurred once rather than 4 separate times?

Simplify the previous phylogenetic tree.
[Figure: the same tree, with the insertion at 18 gained once on the branch leading to all taxa except Camel]
  – Branch shared by all taxa except Camel: gain 18
  – Cow + Deer branch: gain 8, 10, 11, 12, 15
  – Hippo branch: gain 5, 6, 7, 10, 12
  – Pig branch: gain 19, 20
  – Whale branch: gain 1, 3, 5, 6, 7, 10, 12
This tree assumes 20 mutational events.
Can you think of another, more likely tree: one that is simpler = has fewer mutational events?
What other simplifications can you think of?

Different algorithms used to infer phylogeny from sequence data
1. Distance based methods
  a. Calculate evolutionary distances between sequences
  b. Build a tree based on those distances
2. Maximum Parsimony (character based method)
  a. Find the simplest tree that explains the data with the fewest substitutions/mutations (the method used in Example 1)
3. Maximum Likelihood (probabilistic method based on an explicit model)
  a. Find the tree that is most likely, given an evolutionary model
4.
New Bayesian approaches (also probabilistic)

Example 2: building trees - parsimony
• Now, how do we take a bunch of data, say DNA sequence data, and make a (phylogenetic) tree?
• One way to do this is to draw a tree on which all of the features of the organisms (or alleles) evolve in the simplest possible way.
• This is called a parsimony analysis. Parsimony means stingy, so we are trying to find the tree on which the sequences evolve with the fewest substitutions.

Example 2: Building phylogenetic trees
Maximum Parsimony (character based method)
Search all possible trees and find the one requiring the fewest substitutions
Sequences: a = AAG, b = GGA, c = AAA, d = AGA
[Figure: candidate tree with tips in the order a, b, c, d]

[Figure: candidate tree with tips in the order a, c, b, d, grouping a (AAG) with c (AAA) and b (GGA) with d (AGA)]
What are the ancestral sequences at each node? How many base changes are required for this tree?

[Figure: the same tree with ancestral sequences: ancestor of a and c = AAA; ancestor of b and d = AGA; root = AAA or AGA]
3 changes are required.
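The count for Example 2 can be checked with a short script. This is a minimal sketch, assuming Fitch's counting rule for a fixed rooted binary tree (the slides do not name a specific algorithm); the function names `fitch_site` and `parsimony_score` and the nested-tuple tree encoding are illustrative choices, not part of the tutorial. It scores the three possible groupings of the four sequences a, b, c, d.

```python
def fitch_site(tree, states):
    """Fitch pass for one site.

    tree   : a tip name (str) or a pair (left, right) of subtrees
    states : dict mapping tip name -> base at this site
    Returns (possible ancestral bases, number of changes in the subtree).
    """
    if isinstance(tree, str):            # a tip: its observed base, no change
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:                           # children agree: no extra change
        return common, left_cost + right_cost
    # children disagree: one substitution is needed at this node
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Sum the per-site change counts over all aligned positions."""
    length = len(next(iter(seqs.values())))
    total = 0
    for i in range(length):
        site = {name: seq[i] for name, seq in seqs.items()}
        total += fitch_site(tree, site)[1]
    return total


# Sequences from Example 2.
seqs = {"a": "AAG", "b": "GGA", "c": "AAA", "d": "AGA"}

# The three ways to pair up four taxa (hypothetical labels for this sketch).
topologies = {
    "((a,b),(c,d))": (("a", "b"), ("c", "d")),
    "((a,c),(b,d))": (("a", "c"), ("b", "d")),
    "((a,d),(b,c))": (("a", "d"), ("b", "c")),
}

scores = {name: parsimony_score(t, seqs) for name, t in topologies.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(name, score)
print("most parsimonious:", best)
```

Under these assumptions the grouping of a with c and b with d needs 3 changes, matching the slide, while the other two groupings need 4 each. With only four taxa an exhaustive search is trivial; for larger data sets the number of trees grows too fast to enumerate, so practical programs use heuristic searches.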