BIOINFORMATICS LETTER TO THE EDITOR
Vol. 25 no. 1 2009, pages 147–149
doi:10.1093/bioinformatics/btn539

Phylogenetics

Comment on ‘A congruence index for testing topological similarity between trees’

Anne Kupczok∗ and Arndt von Haeseler
Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, University of Veterinary Medicine Vienna, Dr. Bohr-Gasse 9/6, A-1030 Vienna, Austria

Received on August 21, 2008; revised and accepted on October 13, 2008
Advance Access publication October 14, 2008
Associate Editor: Martin Bishop
Contact: [email protected]

Testing the congruence of trees is a major task in phylogenetic research. Comparing trees makes it possible to assess the similarity of trees originating from different genes, or of trees inferred from the same data with different reconstruction methods. Another important application of phylogenetic comparisons is the study of cospeciation. Cospeciation refers to the simultaneous speciation of ecologically associated lineages, e.g. hosts and their parasites. Since cospeciation is not the only reason for congruent host and parasite trees, the task of assessing the congruence is denoted as cophylogenetic analysis (de Vienne et al., 2007b).

Recently, de Vienne et al. (2007a) suggested a novel topological test for a cophylogenetic analysis: given two phylogenies with the same number of taxa and a one-to-one mapping of the taxa, the host–parasite mapping, they test whether the two trees are more congruent than expected by chance. Their test is based on the maximum agreement subtree (MAST) between two trees. The MAST is the largest possible subtree that is identical in both input phylogenies (Finden and Gordon, 1985). Here, a subtree is obtained by pruning taxa from the phylogenies and collapsing inner nodes of degree two, and its size is the number of taxa in the subtree.
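To make the pruning and collapsing operations concrete, the following minimal Python sketch computes the MAST size of two small trees by exhaustive search. It is an illustration only. The nested-tuple tree encoding, the function names (leaves, restrict, mast_size) and the treatment of the trees as rooted are assumptions of this sketch, and the brute-force search over taxon subsets is not the algorithm of any reference cited here; it is feasible only for a handful of taxa.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of taxon labels of a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def restrict(tree, keep):
    """Prune every taxon not in `keep` and collapse nodes left with one child.

    The result is an order-independent canonical form (nested frozensets),
    or None if no taxon of this subtree is kept.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]  # collapse the node of degree two
    return frozenset(kids)


def mast_size(tree1, tree2):
    """Return (size, taxa) of one maximum agreement subtree, by brute force."""
    taxa = sorted(leaves(tree1))
    if set(taxa) != leaves(tree2):
        raise ValueError("both trees must carry the same taxon set")
    # Try the largest subsets first; the first agreement found is maximal.
    for k in range(len(taxa), 0, -1):
        for subset in combinations(taxa, k):
            keep = set(subset)
            if restrict(tree1, keep) == restrict(tree2, keep):
                return k, keep
    return 0, set()


# Hypothetical example: two rooted five-taxon trees that differ in one clade.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
size, kept = mast_size(tree_a, tree_b)
print(size)  # 4
```

In this example, pruning any one of the taxa A, B or C leaves identical subtrees in both trees, so one taxon out of five is pruned and the MAST size is 4.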
Thus, the MAST size refers to the maximal number of taxa retained in the subtree of both input phylogenies. The larger the MAST between two trees, the more congruent these trees are.

First, we outline the method of de Vienne et al. (2007a), which we will denote as the MAST test. The null distribution of the MAST size is obtained by generating pairs of trees, where all trees are assumed to be equally likely, and evaluating their MAST size. From this null distribution, they estimated functions for the mean and SD of the MAST size depending on the number of taxa n. In doing so, de Vienne et al. (2007a) confirmed the result of Bryant et al. (2003) that the mean MAST size grows proportionally to the square root of n (Fig. 1). For example, for n = 50, the mean MAST size is 10, thus on average 40 taxa (4/5 of all taxa) are pruned from random trees. The test statistic for two trees of n taxa is the MAST size centered by the mean for n and rescaled by the SD for n. The resulting standardized distribution for 7 ≤ n ≤ 50 is then used to fit an analytical curve to the left tail of the distribution. With this curve, P-values up to 0.05 can be estimated.

∗ To whom correspondence should be addressed.

Using the centered and rescaled MAST size as a test statistic causes two inherent problems. First, only the taxa in the MAST contribute to the significance, while the topological information of the other taxa is ignored. In a biological framework, however, this may not use all the information present in the topologies. Second, the mean MAST size increases only with the square root of the number of taxa n. Hence, the mean relative MAST size (the MAST size divided by n) approaches zero with increasing n. That means that, for two large trees, on average a high proportion of taxa is pruned to obtain the MAST.

Our main concern when applying the MAST test is, however, a statistical one. When applying a statistical test at a significance level α (e.g.
5%), the assumption is that the null hypothesis is rejected in no more than 5% of the tests when it is in fact true. The actual fraction of significant results under the null hypothesis for a predefined α is known as the size of a test. For discrete tests, the size will rarely match α exactly, since the sum of probabilities for the extreme cases grows in discrete steps. This behavior is well known for discrete distributions like the binomial distribution. The binomial distribution describes the number of successes in a sequence of n independent experiments, each of which yields success with probability P. For example, for n = 7 and P = 0.5, seven successes occur with a probability of 0.78%, whereas six or seven successes occur with a probability of 6.25%. In such a case, one has to choose the test statistic to be either conservative, i.e. the size is always smaller than α, or liberal, i.e. the size may exceed α. To be sure that a significant result or a more extreme case occurs under the null hypothesis no more often than the significance level, the test statistic must be conservative. For the binomial example with n = 7, the conservative critical value for a significance level of at most 5% is 7, and thus the size is only 0.78%. The significance could also be computed in analogy to the MAST test. In that case, mean and SD are computed for 7 ≤ n ≤ 50, and significance is assigned to the values in the 5%-quantile of the distribution of the centered and rescaled values combined for all n. The critical value for n = 7 is then 6, and the resulting size of 6.25% is too liberal.

To determine the size of the MAST test, we first compute the critical value of the MAST size for α = 0.05 (Fig. 1). For example, for n = 50, the critical value is 13, thus when pruning 37 (≈ 3/4) or fewer of the taxa, the trees are considered to be congruent. This high proportion is counterintuitive, but results from the vast number of trees for large n and the fact already mentioned that the mean MAST size grows more slowly than n.

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]

[Figures 1–3 (plots). Fig. 1: MAST size against number of taxa, with series ‘Mean MAST size’ and ‘Critical value’. Fig. 2: size of the test against number of taxa, with series ‘Original size’ and ‘Conservative size’. Fig. 3: fraction of significant results against number of taxa.]

The size of the MAST test cannot be computed analytically as for the binomial distribution. We determine it by simulating 10 000 pairs of random trees, where all trees are equally likely. For the resulting pairs of trees, the MAST size is computed with the algorithm of Goddard et al. (1994) as implemented in PAUP* (Swofford, 2002). The average size of the test over all n and over 7 ≤ n ≤ 50 is 0.043 and 0.048, respectively, and thus below the significance level of 0.05. However, in Figure 2, we see that the size exceeds the significance level for some n. In these cases, the estimate of the conservative size is much smaller, although the critical value obtained from the MAST test is only one taxon less than the conservative critical value (Fig. 1). For example, for n = 7, the critical value of the MAST test is 5, but the corresponding size is 12.7%, whereas a MAST size of 6 or 7 was observed in only 0.8% of the cases. Thus, 0.8% is the correct size of the conservative test with a critical value of 6.

We observe that the original size exceeds the significance level when the number of taxa approaches the right boundary of an area delimited by vertical lines. Within these areas, all n have the same critical value (cf. Fig. 1). For instance, with 40 ≤ n ≤ 47 the critical value is 12, thus a maximum of 28 (n = 40) to 35 (n = 47) taxa can be pruned while the pairs of trees are still significant.
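The distinction between the original (liberal) and the conservative critical value can be checked exactly on the binomial illustration used earlier in this letter (n = 7, P = 0.5). The following minimal Python sketch is an illustration under stated assumptions: the function names upper_tail and conservative_critical_value are hypothetical, and the code covers only the binomial analogy, not the MAST null distribution, which has to be simulated.

```python
from math import comb


def upper_tail(n, p, k):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def conservative_critical_value(n, p, alpha):
    """Return the smallest k with P(X >= k) <= alpha.

    The value n + 1 means that no outcome can be declared significant.
    """
    for k in range(n + 2):
        if upper_tail(n, p, k) <= alpha:
            return k


n, p, alpha = 7, 0.5, 0.05
k_cons = conservative_critical_value(n, p, alpha)
print(k_cons)                        # 7
print(upper_tail(n, p, k_cons))      # 0.0078125 (conservative size, about 0.78%)
print(upper_tail(n, p, k_cons - 1))  # 0.0625 (liberal size, 6.25%)
```

Lowering the critical value by a single step raises the size from 0.78% to 6.25%. The same mechanism underlies the gap between the original and the conservative size of the MAST test at n = 7.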
While the critical value remains constant between two lines, the number of taxa allowed to be pruned increases, thus more pairs show significance.

To evaluate the behavior of the MAST test in a more realistic setting, we used real trees but random taxa mappings. To this end, we downloaded all 5023 trees from TreeBASE (http://www.treebase.org/treebase/data/Tree.txt, April 2008). Of these, we investigated the 4610 trees comprising between 7 and 100 taxa. Unfortunately, the number of available trees varies strongly with the number of taxa. In particular, for each n ≥ 94 there are fewer than 10 trees available, but for n ≤ 50 there are always more than 30 trees present. Two different trees with the same number of taxa are drawn randomly. If a tree contains multifurcations, these are randomly resolved each time the tree is drawn, where each resolution is equally likely. The resulting bifurcating trees are relabeled randomly with the same taxa set. This corresponds to random host–parasite mappings. In Figure 3, the fraction of significant results is shown for each number of taxa. On average the MAST test is slightly too liberal, with a size of 0.064 and 0.058 for all n and for 7 ≤ n ≤ 50, respectively. This may be due to the fact that the assumption of the null hypothesis of equally likely trees is not true (see e.g. Blum and François, 2006, for a study about tree shape distributions on a similar data set). Note that we attenuated this effect by resolving the multifurcations in the trees randomly. We observe that the size of the test depends strongly on the number of taxa, as already observed for random trees (Fig. 2).

Fig. 1. Mean MAST size and critical value of the MAST size for α = 0.05: the mean is given by equation (1) in de Vienne et al. (2007a) and the critical value is computed with equation (6) in de Vienne et al. (2007a). The vertical lines indicate the steps in the critical value.

Fig. 2. Size of the test: evaluation of the test statistic for 10 000 random pairs of trees. ‘Original size’ is obtained by using the critical values of the MAST test (Fig. 1). ‘Conservative size’ is obtained by using the largest critical value which yields a size of 5% or smaller. The vertical lines are the same as in Figure 1. The horizontal line displays the significance level at 0.05.

Fig. 3. Simulation results: results are considered significant if P < 0.05. A total of 1000 repetitions for each number of taxa. The vertical lines are the same as in Figure 1. The horizontal line displays the significance level at 0.05.

We have shown that a number of pitfalls exist when using the MAST test introduced by de Vienne et al. (2007a) to test whether two phylogenetic trees are congruent. First, by using the MAST size as the basis of the test statistic, the positions of the taxa pruned from the trees are completely ignored, and any positional information, e.g. whether they were in the same subtrees, is discarded. When applying the test in a biological framework, the identity of the taxa in the maximum agreement subtree should be considered, not only their number. Second, a high number of taxa can be pruned from the phylogenies while the pair remains significant. Our third and major concern is that tree topologies are discrete, as is the MAST size of two trees. One pitfall of the discreteness of the MAST size is the strongly varying size of the test for different numbers of taxa. The MAST test is too liberal for a considerable number of values of n. Therefore, we recommend adjusting the critical value such that the test is conservative for all n. Finally, the test is more liberal when using random phylogenies from TreeBASE, which indicates that the assumption of equally likely trees may not be an appropriate null model.

ACKNOWLEDGEMENTS

The authors would like to thank Heiko Schmidt for helpful comments on the article and the three reviewers for valuable feedback.
Funding: Wiener Wissenschafts-, Forschungs- and Technologiefonds (WWTF).

Conflict of Interest: none declared.

REFERENCES

Blum,M.G.B. and François,O. (2006) Which random processes describe the tree of life? A large-scale study of phylogenetic tree imbalance. Syst. Biol., 55, 685–691.
Bryant,D. et al. (2003) The size of a maximum agreement subtree for random binary trees. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 61, 55–65.
de Vienne,D.M. et al. (2007a) A congruence index for testing topological similarity between trees. Bioinformatics, 23, 3119–3124.
de Vienne,D.M. et al. (2007b) When can host shifts produce congruent host and parasite phylogenies? A simulation approach. J. Evol. Biol., 20, 1428–1438.
Finden,C.R. and Gordon,A.D. (1985) Obtaining common pruned trees. J. Classif., 2, 255–276.
Goddard,W. et al. (1994) The agreement metric for labeled binary trees. Math. Biosci., 123, 215–226.
Swofford,D.L. (2002) PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, MA.