Systematics - Bio 615
Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2
1. Consistency Index
2. g1 statistic, PTP - test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Statistical hypothesis testing (frequentist)
7. Posterior probability (see lecture on Bayesian)
Derek S. Sikes
University of Alaska

Multiple optimal trees
• Many methods can yield multiple equally optimal trees
• We can further select among these trees with additional criteria, but
• Typically, relationships common to all the optimal trees are summarized with consensus trees

Multiple optimal trees
• If multiple optimal trees are found we know that all of them are wrong except, possibly (hopefully), one (as species tree, not gene trees)
• Some have argued against consensus tree methods for this reason
• Debate over quest for true tree (point estimate) versus quantification of uncertainty

Consensus methods
• A consensus tree is a summary of the agreement among a set of fundamental trees
• There are many consensus methods that differ in:
  1. the kind of agreement
  2. the level of agreement
• Consensus methods can be used with multiple trees from a single analysis or from multiple analyses

Strict consensus methods
• Strict consensus methods require agreement across all the fundamental trees
• They show only those relationships that are unambiguously supported by the data
• The commonest method (strict component consensus) focuses on clades/components/full splits

Strict consensus methods
• This method produces a consensus tree that includes all and only those full splits found in all the fundamental trees
• Other relationships (those in which the fundamental trees disagree) are shown as unresolved polytomies
• Can be less optimal than any of the optimal trees
• Simplest to interpret

Strict consensus methods
[Figure: two fundamental trees for taxa A-G and their strict consensus tree]
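A minimal sketch of strict component consensus, assuming rooted trees written as nested tuples of taxon names. That representation, the helper names `clades` and `strict_consensus`, and the two five-taxon trees are all hypothetical choices made for illustration; they are not taken from the lecture or from PAUP*. Each tree is reduced to its set of clades, and the strict consensus keeps only the clades present in every tree:

```python
def clades(tree):
    """Return the set of non-trivial clades (as frozensets of taxa) in a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a leaf is a single taxon name
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)                  # every internal node defines one clade
        return leaves

    all_taxa = walk(tree)
    found.discard(all_taxa)                # the root clade holds every taxon: uninformative
    return found


def strict_consensus(trees):
    """Clades found in ALL of the fundamental trees."""
    return set.intersection(*(clades(t) for t in trees))


# Two hypothetical fundamental trees that disagree only inside (A,B,C).
tree1 = ((("A", "B"), "C"), ("D", "E"))
tree2 = ((("A", "C"), "B"), ("D", "E"))

consensus = strict_consensus([tree1, tree2])
for clade in sorted(consensus, key=sorted):
    print(sorted(clade))
```

In this toy case A+B and A+C each occur in only one tree, so the consensus retains just (A,B,C) and (D,E), and the relationships within (A,B,C) collapse to a polytomy.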
Majority rule consensus
• Majority-rule consensus methods require agreement across a majority of the fundamental trees
• May include relationships that are not supported by the most parsimonious interpretation of the data
• The commonest method focuses on clades/components/full splits

Majority rule consensus
• This method produces a consensus tree that includes all and only those full splits found in a majority (>50%) of the fundamental trees
• Other relationships are shown as unresolved polytomies
• Of particular use in bootstrapping and Bayesian inference (best not to use for single searches)
• Implemented in PAUP* and MrBayes

Majority rule consensus
[Figure: three fundamental trees for taxa A-G and their majority-rule consensus tree. Numbers indicate frequency of clades in the fundamental trees: 66, 100, 66, 66, 66]

Majority rule consensus
Majority-rule consensus trees are used for
1. Summarizing multiple equally optimal trees from one search (but they shouldn’t be!)
2. Summarizing the results of a bootstrapping analysis (multiple searches)
3. Summarizing the results of a Bayesian analysis
Don’t confuse these! The numbers on the branches mean very different things in each case

Reduced consensus methods
[Figure: two fundamental trees for taxa A-G. The strict component consensus is completely unresolved; the agreement subtree (PAUP*) excludes taxon G]
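The majority-rule method described above can be sketched by counting how often each clade occurs. This is an illustration only: each tree is assumed to be already reduced to its set of clades, and the names `majority_rule` and `as_tree` and the three six-taxon trees are hypothetical, not part of PAUP* or MrBayes.

```python
from collections import Counter


def majority_rule(trees, threshold=0.5):
    """Map each clade found in more than `threshold` of the trees to its frequency."""
    counts = Counter(clade for tree in trees for clade in tree)
    n_trees = len(trees)
    return {clade: count / n_trees
            for clade, count in counts.items()
            if count / n_trees > threshold}


def as_tree(*groups):
    """Hypothetical helper: turn strings such as "AB" into a set of clades."""
    return {frozenset(group) for group in groups}


# Three hypothetical fundamental trees on taxa A-F.
trees = [
    as_tree("AB", "ABC", "EF", "DEF"),
    as_tree("AB", "ABC", "DE", "DEF"),
    as_tree("AC", "ABC", "EF", "DEF"),
]

support = majority_rule(trees)
for clade, freq in sorted(support.items(), key=lambda item: sorted(item[0])):
    print("".join(sorted(clade)), f"{100 * freq:.0f}")
```

The same counting applies whether the input trees are equally optimal trees, bootstrap trees, or a Bayesian sample; only the meaning of the frequencies changes. Passing `threshold=0.8` keeps only the clades found in more than 80% of the trees, which in this toy case collapses AB and EF.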
Consensus methods
[Figure: three fundamental trees for Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Tracheloraphis, Spirostomum, Euplotes, and Gruberia, with their strict consensus, agreement subtree (Euplotes excluded), and majority-rule consensus (clade frequencies 100, 100, 66, 100, 66, 100)]

Consensus methods
• Use strict methods to identify those relationships unambiguously supported by parsimonious interpretation of the data
• Use reduced methods where consensus trees are poorly resolved
• Avoid methods which have ambiguous interpretations. Prevent possible confusion between a MR consensus for an optimal tree search and a MR consensus for a bootstrapping search

Recall
• Stochastic error vs systematic error
• These assessment methods help identify stochastic error
  - How repeatable are the results?
  - How strongly do the data support them?
  - This is a measure of precision (which is hopefully related to accuracy)

Accuracy and Precision
• Accuracy
  - Accuracy is correctness. How close a measurement is to the true value. (Unless we know the “true tree” in advance we cannot measure this.)
• Precision
  - Precision is reproducibility. How closely two or more measurements agree with one another. (This we can measure!)

Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2
1. Consistency Index
2. g1 statistic, PTP - test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Statistical hypothesis testing (frequentist)
7.
Posterior probability (see lecture on Bayesian)

Branch Support
• Several methods have been proposed that attach numerical values to internal branches in trees that are intended to provide some measure of the strength of support for those branches and the corresponding groups
• These methods include:
  - The Bootstrap (BS) and jackknife
  - Decay analyses (aka Bremer Support)
  - Bayesian Posterior Probabilities (PP or BPP)

Decay analysis
• In parsimony analysis, a way to assess support for a group is to see if the group occurs in slightly less parsimonious trees also
• The length difference between the shortest trees including the group and the shortest trees that exclude the group (the extra steps required to collapse a group) is the decay index or Bremer support

Decay analysis - example
[Figure: trees with decay indices on the branches. Ciliate SSUrDNA data: +27, +45, +10, +15, +7. Randomly permuted data: +1, +1, +8, +3]

Decay indices - interpretation
• Generally, the higher the decay index the better the relative support for a group
• Like Bootstrap values (BS), decay indices may be misleading if the data are misleading
• Magnitude of decay indices and BS generally correlated (i.e.
they tend to agree)
• Only groups found in all most parsimonious trees have decay indices > zero

Decay analyses - in practice
• Decay indices for each clade can be determined by:
  - Using PAUP* to search for the shortest tree that lacks the branch of interest using reverse topological constraints
  - With the Autodecay or TreeRot programs (in conjunction with PAUP*)
  - MacClade 4 will also help prepare for a Decay analysis
  - An excellent use for the Parsimony Ratchet because finding the shortest tree length is all that matters (not finding multiple shortest trees)

Decay indices - interpretation
• Unlike BS, decay indices are not scaled (0-100)
  - This has the advantage that the value can exceed 100 whereas BS “tops out” at 100, meaning that we cannot distinguish between the support of two branches with BS values of 100 although one might have a far greater decay index than the other
• It is even less clear what is an acceptable decay index than a BS value…
  - Unlike the BS value, very little work has examined the properties and behavior of decay indices

Decay indices - interpretation
One key study is that of DeBry (2001)
  - He showed that decay indices should be interpreted in light of branch lengths
  - That the same values, even within the same tree, do not represent the same support if the branch lengths differ, i.e. decay indices are not easily comparable as measures of branch support
  - Values < 4 should be considered weak regardless of branch length
DeBry, R.W. (2001) Improving interpretation of the Decay Index for DNA sequence data. Systematic Biology 50: 742-752.

Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2
1. Consistency Index
2. g1 statistic, PTP - test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Statistical hypothesis testing (frequentist)
7.
Posterior probability (see lecture on Bayesian)

Decay values versus Bootstrap and Jackknife values from one empirical study
Norén, M. & U. Jondelius. 1999. Phylogeny of the Prolecithophora (Platyhelminthes) inferred from 18S rDNA sequences. Cladistics 15: 103-112.

Bootstrapping (non-parametric)
• Bootstrapping is a statistical technique that uses computer-intensive random resampling of data to determine sampling error or confidence intervals for some estimated parameter
• Introduced to phylogenetics by Felsenstein in 1985
• Based on idea of Efron (1979)

Bootstrapping (non-parametric)
1. Characters are sampled with replacement to create many (100-1000) bootstrap replicate data sets (think shuffle vs random play of music)
2. Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML)
3. Agreement among the resulting trees is summarized with a majority-rule consensus tree

Bootstrapping (non-parametric)
• Frequency of occurrence of groups, bootstrap support (BS), is a measure of support for those groups
• Additional information is given in partition tables (for groups below 50% support)
• Can ask PAUP* to create MR con-tree of higher cut-off, e.g. 80% - all weaker branches collapse

Bootstrapping
Original data matrix
        Characters
Taxa    1 2 3 4 5 6 7 8
A       R R Y Y Y Y Y Y
B       R R Y Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y R R R R R R
Outgp   R R R R R R R R

Resampled data matrix
        Characters
Taxa    1 2 2 5 5 6 6 8
A       R R R Y Y Y Y Y
B       R R R Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y Y R R R R R
Outgp   R R R R R R R R

Randomly resample characters from the original data with replacement to build many bootstrap replicate data sets of the same size as the original - analyse each replicate data set
[Figure: tree for taxa A-D labelled with characters 1-8]
Summarize the results of multiple analyses with a majority-rule consensus tree
Bootstrap values (BS) are the frequencies with which groups are encountered in analyses of replicate data sets
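The resampling step of the bootstrap can be sketched as follows. This is a minimal illustration, assuming the data matrix is held as one string of character states per taxon; the matrix is the small example from the Bootstrapping slide as best it can be read, and the function names and the seed are hypothetical. The tree search on each replicate and the consensus step are not shown.

```python
import random

# Original data matrix from the slide: 8 characters (columns) for 5 taxa.
MATRIX = {
    "A":     "RRYYYYYY",
    "B":     "RRYYYYYY",
    "C":     "YYYYYRRR",
    "D":     "YYRRRRRR",
    "Outgp": "RRRRRRRR",
}


def resample_columns(matrix, columns):
    """Build a data set from the given 0-based character indices (repeats allowed)."""
    return {taxon: "".join(states[c] for c in columns)
            for taxon, states in matrix.items()}


def bootstrap_replicate(matrix, rng):
    """Draw as many characters as the original has, with replacement."""
    n_chars = len(next(iter(matrix.values())))
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    return columns, resample_columns(matrix, columns)


# The slide's resampled matrix uses characters 1 2 2 5 5 6 6 8 (1-based).
slide_replicate = resample_columns(MATRIX, [0, 1, 1, 4, 4, 5, 5, 7])

rng = random.Random(615)  # arbitrary seed so the run is repeatable
columns, replicate = bootstrap_replicate(MATRIX, rng)
print("characters drawn:", [c + 1 for c in columns])
for taxon, states in replicate.items():
    print(f"{taxon:6s}{states}")
```

Because whole columns are drawn, every taxon keeps its states for a sampled character together; some characters appear more than once in a replicate and others not at all.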
[Figure: bootstrap consensus tree for taxa A-D and the outgroup; bootstrap values 96% and 66%]

Bootstrapping - an example
Ciliate SSUrDNA - parsimony bootstrap
[Figure: majority-rule consensus tree of Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7), Euplotes (8), and Tetrahymena (9); bootstrap values 100, 84, 96, 100, 100, 100]
Partition Table
123456789    Freq
-----------------
.**......  100.00
...**....  100.00
.....**..  100.00
...****..  100.00
...******   95.50
.......**   84.33
...****.*   11.83
...*****.    3.83
.*******.    2.50
.**....*.    1.00
.**.....*    1.00

Bootstrapping
The probability of a character being omitted from a bootstrap sample ranges from 0-0.367 (depending on N, the number of characters)
N    1    2     3     4     …
P    0    0.25  0.29  0.31  0.367
Rule of thumb: a branch must be supported by 3 or more characters to be recovered in >95% of bootstraps

Bootstrapping - random data
Randomly permuted data - parsimony bootstrap
[Figure: 50% majority-rule consensus tree and the same tree with minority components; branch frequencies include 71, 59, 26, 21, and 16]
Partition Table
123456789    Freq
-----------------
.*****.**   71.17
..**.....   58.87
....*..*.   26.43
.*......*   25.67
.***.*.**   23.83
...*...*.   21.00
.*..**.**   18.50
.....*..*   16.00
.*...*..*   15.67
.***....*   13.17
....**.**   12.67
....**.*.   12.00
..*...*..   12.00
.**..*..*   11.00
.*...*...   10.80
.....*.**   10.50
.***.....   10.00

Bootstrap - interpretation
• Bootstrapping was introduced as a way of establishing confidence intervals for phylogenies
• This interpretation of bootstrap values depends on the assumption that the original data are a random sample from a much larger set of independent and identically distributed data (i.i.d.)
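The omission probabilities and the rule of thumb quoted on the Bootstrapping slide can be checked with a short calculation. A given character is missed on one draw with probability 1 - 1/N, so it is missed on all N draws with probability (1 - 1/N)^N, which approaches 1/e (about 0.368) as N grows. The sketch below assumes, as a simplification, that a branch is recovered whenever at least one of its k supporting characters is sampled and that no characters conflict with it; the function name `p_omitted` is hypothetical.

```python
import math


def p_omitted(n_chars, k=1):
    """Probability that none of k given characters is drawn in n_chars draws with replacement."""
    return (1 - k / n_chars) ** n_chars


# Omission probability for one character as N grows.
for n in (1, 2, 3, 4, 10, 100, 1000):
    print(f"N = {n:4d}   P(omitted) = {p_omitted(n):.3f}")
print(f"limit 1/e       = {math.exp(-1):.3f}")

# Rule of thumb: chance that at least one of k supporting characters is sampled.
for k in (1, 2, 3):
    print(f"k = {k}   P(at least one sampled, N = 1000) = {1 - p_omitted(1000, k):.3f}")
```

The slide's values for N = 3 and N = 4 (0.29 and 0.31) appear to be truncated rather than rounded. Under the stated simplification, three uncontradicted characters are all missed in only about e^-3 (roughly 5%) of replicates, which is where the >95% figure comes from.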
Bootstrap - interpretation
• However, several things complicate this interpretation
  - These assumptions are often wrong, making any strict statistical interpretation of BS invalid
  - Some theoretical work indicates that BS are very conservative (too low), and may underestimate confidence intervals; the problem increases with numbers of taxa
  - BS can be high for incongruent relationships in separate analyses, and can therefore be misleading (misleading data -> misleading BS)
Recall the mantra: the data are the things

Bootstrap - interpretation
Huelsenbeck & Rannala (2004) list 3 common interpretations
1. Probability that a clade is correct (accuracy)
2. Robustness of the results to perturbation (repeatability / precision)
3. Probability of incorrectly rejecting a hypothesis of monophyly (1-P): probability of getting that much evidence if, in fact, the group did not exist
Huelsenbeck, J.P. and Rannala, B. (2004) Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Systematic Biology 53: 904-913.

Bootstrap - interpretation
Be suspicious of maximum bootstrap values… they might be due to systematic error.
“…bootstrapping provides us a confidence interval within which is contained not [necessarily] the true phylogeny but the phylogeny that would be estimated on repeated sampling of many characters from the underlying pool of characters.” Joseph Felsenstein (1985)

Bootstrap - interpretation
• High BS (e.g. > 85%) is indicative of strong ‘signal’ in the data (some use 70% as the cutoff; there is no consensus as to which value is best)
• Provided we have no evidence of strong misleading signal due to violation of assumptions (e.g.
base composition biases, great differences in branch lengths) high BS values are likely to reflect strong phylogenetic signal
• In other words, although technically they are meant to be a measure of precision, they are usually thought to be at least strongly correlated with accuracy

Bootstrap - interpretation
• Low BS values, however, need not mean the relationship is false, only that it is poorly supported
  - This is especially true of morphological data
  - Morphologists often use the Decay index instead
• Bootstrapping can be viewed as a way of exploring the robustness of phylogenetic inferences to perturbations in the balance of supporting and conflicting evidence for groups (Paul Lewis)

Bootstrap - interpretation
Two types of precision (Hillis & Bull 1993): precision of bootstrap value vs repeatability of finding a branch
  - Precision of bootstrap values increases with the number of bootstrap replicates (variance among analyses decreases)
  - Repeatability tells us how likely we are to find the same results using a different but similar dataset (Felsenstein’s original idea)

Hillis & Bull (1993) examined precision, repeatability, and accuracy of the bootstrap
a) 1,089 BS of 100 reps from 1 “real” dataset = 1,089 pseudo datasets
b) 100 real datasets
“Comparison of these two distributions reveals that the process of bootstrap resampling is not the same as repeated, independent sampling of data.”
Hillis, D.M. and Bull, J.J. (1993) An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Systematic Biology 42: 182-192.

Bootstrap - interpretation
Hillis & Bull (1993) examined precision, repeatability, and accuracy of the bootstrap
  - Found that BS provide a very imprecise measure of repeatability, so imprecise as to be worthless as a measure of repeatability
  - Determined that in some cases a BS as low as 70% was equivalent to a 95% probability of being true
  - Bias confirmed by Newton (1996)
Hillis, D.M. and Bull, J.J. (1993) An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Systematic Biology 42: 182-192.

Bootstrap - interpretation
BS values have been criticized for a variety of reasons:
Sanderson, M.J. (1995) Objections to Bootstrapping Phylogenies: A Critique. Systematic Biology 44: 299-320.
But the top reason has been that they seem to be too conservative, i.e. they underestimate the probability of the branch being correct and are biased downward (erratically & unpredictably)
Newton, M.A. (1996) Bootstrapping phylogenies: Large deviations and dispersion effects. Biometrika 83: 315-328.

Jackknifing
• Jackknifing is very similar to bootstrapping and differs only in the character resampling strategy
• Some proportion of characters (e.g. 37%, 50%) are randomly selected and deleted
• Replicate data sets are analyzed and the results summarized with a majority-rule consensus tree
• Jackknifing and bootstrapping tend to produce broadly similar results and have similar interpretations
  - Jackknifing is preferred by cladists

Low Support
Low branch support can result from
1. Conflicting data (homoplasy)
2. Lack of data - even a dataset with no homoplasy can yield poorly resolved trees if there are branches without change
3. Use of a poorly fitting model (too complex or too simple)
4. Artifact of mid-sized clades?
“This indicates that, for all support measures on trees of a given size, the largest clades and the smallest clades are supported most strongly, whereas medium sized clades receive lower support”
Pickett, K.M. and Randle, C.P. (2005) Strange bayes indeed: uniform topological priors imply non-uniform clade priors. Molecular Phylogenetics and Evolution 34: 203-211.
SEE ALSO: Brandley, M. et al. (2006) Are unequal clade priors problematic for Bayesian phylogenetics? Systematic Biology 55: 138-146.

Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2
1.
Consistency Index
2. g1 statistic, PTP - test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Statistical hypothesis testing (frequentist)
7. Posterior probability (see lecture on Bayesian)

Terms - from lecture & readings
consensus methods, consensus tree, strict consensus, splits, majority rule consensus, reduced consensus trees, agreement subtree, branch support, Decay analysis, Decay index (Bremer Support), DeBry (2001), Bootstrapping, resampling with replacement, repeatability, jackknifing

Study questions
Describe the difference between a strict and majority rule consensus tree.
What were the key findings of DeBry in his (2001) paper on Decay Indices?
What is the rule of thumb in bootstrapping for a branch to receive > 95% support?
What are two common but different interpretations of bootstrap values? What did Hillis & Bull (1993) conclude regarding these interpretations?
What are two common explanations for low branch support?