Syst. Biol. 50(1):60–66, 2001 Changing the Landscape: A New Strategy for Estimating Large Phylogenies D ONALD L. J. Q UICKE,1 J ASON TAYLOR,2 AND ANDY PURVIS 2,3 1 Unit of Parasite Systematics, CABI, Bioscience UK Centre (Ascot), Department of Biology, Imperial College at Silwood Park, Ascot, Berkshire, SL5 7PY, and Department of Entomology, The Natural History Museum, London, SW7 5BD, U.K.; E-mail: [email protected] 2 Department of Biology, Imperial College, Silwood Park, Ascot, Berkshire, SL5 7PY, U.K.; E-mail: [email protected] 3 NERC Centre for Population Biology, Silwood Park, Ascot, Berkshire, SL5 7PY, U.K.; E-mail: [email protected] Abstract.—In this paper we describe a new heuristic strategy designed to nd optimal (parsimonious) trees for data sets with large numbers of taxa and characters. This new strategy uses an iterative searching process of branch swapping with equally weighted characters, followed by swapping with reweighted characters. This process increases the efciency of the search because, after each round of swapping with reweighted characters, the subsequent swapping with equal weights will start from a different group (island) of trees that are only slightly, if at all, less optimal. In contrast, conventional heuristic searching with constant equal weighting can become trapped on islands of suboptimal trees. We test the new strategy against a conventional strategy and a modied conventional strategy and show that, within a given time, the new strategy nds trees that are markedly more parsimonious. We also compare our new strategy with a recent, independently developed strategy known as the Parsimony Ratchet. [Heuristic algorithms; large phylogenies; Parsimony Ratchet; search time.] Without phylogenies, many evolutionary studies would be difcult, if not impossible (Harvey et al., 1996). In studies that utilize phylogenies, the inclusion of more taxa often allows the analysis of more data. Large phylogenies, therefore, have some advantages, such as increasing sample size for statistical analyses and lending greater credibility to extrapolations from the data. Large phylogenies also facilitate broad-scale studies that prevent our view of evolutionary trends from becoming distorted by small, possibly nonrepresentative, examples (Gould, 1997). Furthermore, certain types of analysis require large and complete species-level phylogenies, such as the localization of diversication rates and trait values to nodes in the phylogeny to allow sister clade comparisons (Barraclough et al., 1998). The importance of phylogenetic reconstruction has led to investigations of the consistency, efciency, accuracy, and relative performance of various optimality criteria (Stewart, 1993; Hillis et al., 1994; Sidow, 1994; Hillis, 1995) and of the strategies used for estimating large phylogenies (Farris et al., 1996; Nixon, 1999). Here, after briey summarizing the problems in searching for optimal trees, we describe a new search strategy and show that it is markedly more efcient at nding optimal trees, in this case most- parsimonious trees, for data sets containing large numbers of characters and taxa. We also compare our strategy with a similar and independently developed strategy known as the Parsimony Ratchet (Nixon, 1999). Searching for Optimal Trees The reconstruction of large phylogenies is difcult, whatever the optimality criterion used, because the number of possible trees grows very rapidly as the number of taxa increases. With 10 taxa, for example, >34 £ 106 trees are possible; for 20 taxa, >8:2 £ 1021 ; and for 228 taxa, »1:2 £ 10502 (Quicke 1993; Hillis, 1996; Purvis and Quicke, 1997). Consequently, for more than »20 taxa, searching strategies that guarantee to nd the optimal tree are unreasonably time-consuming, such that approximate (heuristic) methods have to be used. Conventional heuristic strategies generally proceed by randomly choosing taxa and adding them, one by one, to an optimal position on the growing tree until all taxa are added and the tree is complete (random step-wise addition). Unfortunately, as taxa are added, alternative and better positions for some of the previously added taxa may be formed. After all taxa have been added, branch swapping is then used to try 60 2001 QUICKE ET AL.—CHANGING THE LANDSCAPE 61 FIGURE 1. A hypothetical treescape with many suboptimal peaks, each of which has the potential to trap conventional heuristic searches. to nd these new positions and thus nd shorter trees. However, as the number of taxa increases, these conventional strategies become increasingly limited because of the time taken to swap branches, especially when a large number of very similar trees are held in memory. The large amount of search time devoted to branch swapping results in fewer random additions and thus limits the search to small groups of similar trees; this in turn reduces condence in the results of the search. The New Strategy The new strategy can be explained most easily with reference to a hypothetical threedimensional tree-landscape (Fig. 1). Both the horizontal axes of this landscape can be any measures of tree similarity, whereas the height of the landscape is tree optimality. In this study, parsimony has been chosen as the optimality criterion because several of the data sets are morphological and also because this is the most widely used criterion. Therefore, in this study the height of the peak refers to how parsimonious the trees are; that is, higher peaks are more parsimonious (Maddison’s [1991] “tree-islands” analogy). Although we have concentrated on parsimony analysis, our new strategy could equally well be applied to other types of analysis, for example, distance methods. The new strategy (Fig. 2) explores many more peaks on the treescape than would a conventional search. This is achieved by initially holding (saving) only one tree during branch swapping (stage 1), whereas conventional strategies get bogged down by saving and swapping large numbers of trees (Swofford et al., 1996). On completion of stage 1, the search will have found at least one tree FIGURE 2. A step-by-step ow chart of the New strategy. See text for details. on one, possibly suboptimal, peak. Further branch swapping would be able only to move up this peak, which may be only a local optimum (Maddison, 1991). To escape these peaks, the characters are reweighted proportionately to their goodness-of-t in the previously found trees (stage 2), and because the data have been altered, the treescape is then changed. Perhaps isolating dips will be transformed into accessible uphill climbs, thereby allowing the search to escape from the peaks during the next phase of branch-swapping (stage 3). The treescape is then returned to its original form by giving the characters their original weights (stage 4). Any uphill movement across the modied treescape can move the search off of the peaks on the original treescape, allowing branch swapping to 62 VOL. 50 S YSTEMATIC BIOLOGY move the search up new, potentially higher, peaks (stage 5). When tree length stabilizes (for equally weighted characters), the procedure stops (stage 6). To recap: Stage 1 of the new strategy involves exploring many peaks on the treescape. The rest of the stages explore the surroundings by crossing depressions in the treescape. The weighting and swapping is iterative and can be continued until tree length stabilizes. This strategy is more efcient than conventional strategies because, after each round of swapping with the weighted characters, the swapping with equal weights starts from trees that are only slightly less optimal than trees found previously during the search. This is much more efcient than swapping the branches of trees found by random addition, which could be much worse than previously found trees. Furthermore, if one assumes that the phylogenetic information of a character can be estimated by its t with the trees previously found by the strategy, then the weighting can be thought of as amplifying this information. Because characters that are considered good indicators of phylogeny are given more weight, the uphill movement on the treescape will be in the direction of shorter trees for the improved (weighted) data. When the treescape is converted to its original form, these trees may be longer; that is, the search may have moved downhill. However, the search has moved to trees in which weighted characters have had more inuence. Therefore, under the assumption that the characters that best t with the trees are related to their phylogenetic reliability, the weighting and swapping will cross the depressions in the direction of more optimal trees for the equally weighted data rather than move in a random direction. This will also increase the efciency of the searching. M ATERIALS AND M ETHODS Because the efciency of the new searching strategy will depend on the data set investigated—that is, some treescapes will be easier to search than others—we tested the new strategy on four sets of data that vary in number of taxa, number of characters, and character type: (1) 95 morphological characters for 126 genera of doryctine wasps (Insecta: Hymenoptera: Braconidae: Doryctinae) (Belokobylskij et al., in prep.), (2) 575 aligned nucleotides of nuclear 28S rDNA for 111 taxa of eulophid wasps (Insecta: Hymenoptera: Eulophidae) (Gauthier et al., 2000), (3) 123 morphological characters for 80 taxa of ichneumonid wasps (Insecta: Hymenoptera: Ichneumonidae) (Quicke et al., 2000), and (4) 259 morphological characters for 234 taxa of chewing lice (Insecta: Phthiraptera: Ischnocera: Trichodectidae) (Taylor et al., in prep.). All the data sets are available from www.bio.ic.ac.uk/research/data/ and a summary description of each data set is given in the legend to Table 1. TABLE 1. Shown are the lengths for the trees found by the various strategies for (1) the doryctine wasps data, which consists of 95 morphological characters (2 uninformative) for 126 general (2) the eulophid wasp data consisting of 575 aligned nucleotides from the D2 expansion region of the nuclear 28S rDNA gene (328 uninformative sites) for 111 taxa; (3) the ichneumonid wasp data consisting of 123 morphological characters (1 uninformative) for 80 taxa; and (4) the chewing louse data which consists of 259 morphological characters (6 uninformative) for 234 taxa. Six replicates were carried out for the conventional, the modied, and the new strategies. See text for details. Data set Doryctine wasps Eulophid wasps Ichneumonid wasps Chewing lice Strategy Conventional Modied New (max CI) New (max RI) Conventional Modied New (max CI) New (max RI) Conventional Modied New (max CI) New (max RI) Conventional Modied New (max CI) New (max RI) Tree lengths 663 660 655 656 1867 1863 1863 1863 1367 1368 1366 1365 1647 1643 1641 1645 661 657 655 659 660 658 663 660 656 667 657 654 660 658 655 1867 1862 1863 1864 1863 1863 1867 1863 1863 1863 1863 1863 1867 1862 1862 1369 1367 1364 1367 1366 1365 1368 1368 1364 1369 1366 1364 1364 1367 1366 1647 1644 1643 1639 1643 1642 1645 1643 1645 1639 1640 1644 1647 1640 1639 2001 QUICKE ET AL.—CHANGING THE LANDSCAPE Each data set was analyzed with the following strategies: 1. Conventional strategy (Con) of building starting trees by using random addition, followed by tree bisection and reconnection (TBR) branch swapping with the number of trees held effectively unlimited (set to 30,000). 2. Modied conventional strategy (Mod) performed the same way as the conventional search, but holding only one tree during TBR swapping; this strategy is stage 1 of the new strategy (Fig. 2). 3. The new strategy (New), carried out by using maximum character consistency index (max CI) as the reweighting factor. 4. The same as strategy 3 but reweighting by using the maximum character retention index (max RI). All analyses were carried out with PAUP¤ (4.0d64(PPC, test)) (kindly provided by D. Swofford) run on a 266 MHz PowerMacintosh G3, and each strategy was allowed a total of 80 min to analyze each data set. A time limit was set rather than the number of random additions, because in practice, time is usually the limiting factor. For the New strategy, the 80 min was divided into eight blocks of 10 min each: The rst 10 min was allocated to obtaining the starting tree (Fig. 2, step 1) with random additions followed by TBR branch swapping, holding no more than one tree during swapping; 10 min was then allocated for each subsequent round of weighting and swapping, and for these, maxtrees was increased to 30,000 (i.e., effectively unlimited). Each strategy was then performed six times per data set, except for the New strategy with max RI weighting, which was carried out once as a preliminary investigation of different weighting methods. All analyses were started with a different random seed. To test if the differences in the tree lengths found by each strategy were significantly different, a one-way ANOVA and a Least Signicant Difference (LSD) test were carried out on the results from each data set. A direct comparison between the new technique and the Parsimony Ratchet (Nixon, 1999) was carried out for one of our four data sets (ichneumonid wasps). We used only the one data set, in view of the time taken for completing searches with maxtrees 63 unlimited, as would be the situation in most real analyses. Although we used only one data set, 41 random additions with TBR swapping (number of trees held and saved unlimited) led to 40 different islands of trees, which suggests that the treescape contains many peaks. That makes this a suitable data set for testing the relative ability of these strategies to move between the peaks of the treescape to nd more-parsimonious trees. To compare the strategies, each of the 40 islands was used as a starting point for one iteration of the New strategy with maximum values of the rescaled consistency index (RC), CI, and RI as the reweighting factors. This resulted in three searches (each with different weighting function) for 40 starting islands, giving a total of 120 searches for the New strategy. Each of the 40 islands was also used as a starting point for one iteration of the Parsimony Ratchet. The Parsimony Ratchet was simulated by using PAUP¤ by up-weighting 15% of the characters to a weight of 2, compared with the remaining 85%, which were weighted at 1 (in line with Nixon, 1999). Because characters chosen at random might affect the performance of the Parsimony Ratchet, each search was therefore performed with 13 sets of randomly chosen characters up-weighted. This resulted in 13 searches consisting of a single iteration for each of the 40 starting islands, for a total of 520 searches. Note that the same 13 sets of randomly chosen and up-weighted characters were used on each of the starting islands. R ES ULTS Comparisons between Con, Mod, and New Strategies The lengths of the shortest trees found by each strategy are shown in Table 1 and the results from the ANOVA and LSD tests carried out on these tree lengths are given in Table 2. The results show that, except for the chewing lice data, and the modied strategy for the eulophid molecular data, the New method nds signicantly shorter trees. In the New strategy’s analyses of the ichneumonid morphological data, the search came to a premature nish; that is, a single tree was found in ·80 min. The same numbers of iterations were carried out, but the search time was greatly decreased, averaging 28 min. Moreover, the tree found in this much shorter 64 VOL. 50 S YSTEMATIC BIOLOGY TABLE 2. The results of the statistical analysis of the tree lengths found by various strategies (data in Table 1). The P-values and the F-ratios (degrees of freedom in brackets) from the ANOVA are shown in the left of the table. The right of the table contains the comparisons made by the LSD test to show which strategies nd signicantly different tree lengths. The nal column contains various condence intervals of the differences in tree length in the strategies being compared. If this range does not encompass zero, the results are signicantly different. If both values are <0, the rst-named strategy nds signicantly shorter trees; if both are positive, the rst-named strategy nds signicantly longer trees. The LSD test were carried out for 90%, 95%, and 98% condence intervals; however, all tests were either signicant at the 2% level (P < 0:02; identied with¤ ) or not signicant at the 10% level (P > 0:1; ns). The LSD tests were not performed for the louse data because the ANOVA showed no signicant difference among strategies. ANOVA LSD Data set F (2, 17) P-value Dorycine wasps 16.23 <0.01 Eulophidae 15.04 <0.01 Icneumonid wasps 6.34 0.01 Chewing lice 0.80 0.469 time was signicantly shorter ( p · 0:01) than those found by the conventional method. The new searching strategy was also carried out by using max RI to reweight the characters. Although only one replicate was performed, the results suggest that, at least for these four data sets, using max RI and max CI weighting produced no obvious difference between the lengths of trees found. Comparisons Condence interval New versus Con New versus Mod Mod versus Con New versus Con New versus Mod Mod versus Con New versus Con New versus Mod Mod versus Con ¡4.484, ¡0.516 ¤ ¡4.151, ¡0.182 ¤ ¡1.670, 1.003ns ¡4.691, ¡1.309 ¤ ¡0.972, 1.306ns ¡4.857, ¡1.476 ¡4.484, ¡0.516 ¤ ¡4.151, ¡0.182 ¤ ¡1.670, 1.003ns trial that never led to trees longer than the starting trees. D IS CUS SION We have described our New strategy and shown that it nds signicantly shorter trees than both the conventional and modied conventional strategy (Table 2). However, at least three issues remain to be dealt with: choosing a method for character Comparison with the Parsimony Ratchet The results of our comparisons are presented in Table 3. Of the 13 ratchet replicates, some of the randomly selected and up-weighted character sets consistently led to shorter trees than the starting island (see Table 3, best), whereas others led to trees that were longer than the starting trees more often than to shorter ones (Table 3, worse). The median performance of the ratchet indicates that it performs well, leading to shorter tree islands on 13 occasions, leading to longer islands on 4 occasions, and making no difference in 23 runs. The New strategy, using both RC and CI as reweighting functions, performed marginally better than the median ratchet, nding shorter trees from 15 and 14 of the starting islands, respectively, although with RC, more islands led to worse trees than in the median ratchet. Using RI found shorter trees fewer times than the median ratchet, but it was the only reweighting scheme in this TABLE 3. Comparison of the New strategy and a simulation of the Parsimony Ratchet (using 15% of randomly selected characters up-weighted by 1). The number of reweighting runs (out of 40) that found shorter or longer trees than the starting trees for the New strategy using maximum RC, CI, and RI as weighting functions and are shown on the left half of the table. The results from the Ratchet simulation are shown on the right half of the table. The column entitled Best shows the number of the 40 replicates that found trees shorter or longer than the starting tree by using the set of randomly chosen and up-weighted characters that performed best. The Worse column contains the number of the 40 replicates that found trees shorter or longer than the starting tree when the set of randomly chosen and up-weighted characters used performed worse. The nal column (Median) contains the median number of shorter or longer trees found from the 13 sets of randomly chosen and up-weighted characters. New strategy Shorter Longer Parsimony ratchet RC CI RI Best Worse Median 15 9 14 3 10 0 22 1 10 18 13 4 2001 QUICKE ET AL.—CHANGING THE LANDSCAPE reweighting; possible modications to our strategy; and how our work relates to the Parsimony Ratchet (Nixon, 1999). Reweighting the Characters Character reweighting can be carried out automatically within the PAUP¤ program by a range of methods. Unfortunately, it is difcult to determine whether a particular method of weighting is suitable for all data sets, or whether an examination of the data will allow the appropriate method to be determined. Therefore, the user must make the choice of method. In a further analysis of the chewing lice data (Taylor et al., in prep.), we chose the max CI because it cannot weight characters as zero. This is an attractive property because the data contain much homoplasy. If the chosen method can allocate zero weights, many potentially useful characters would be excluded from the analysis on the basis of their assessment on the preliminary starting trees. In this study the original weights of the characters are equal. However, this is not a necessary part of the strategy. Characters can be given any starting weights that are appropriate, and after each round of reweighting and swapping, they should be returned to those weights. Strategy Modications One obvious modication that can be made to our strategy is to carry out the search in parallel for each initial tree found by stage 1. This would explore more of the treescape by exploring around each peak and is likely to be particularly important when the treescape is too large to be examined by starting from one point. Alternatively, rather than stopping the iterations when tree length stabilizes, more iterations could be carried out to explore more of the treescape. 65 portant is currently undetermined and will require further work. The availability of two similar, contemporary, strategies can have only positive consequences. For instance, researchers estimating large phylogenies now have a choice of two strategies that are clearly described, implemented by using two different user-friendly programs, and both proven to work better than conventional strategies. Not only does this give a choice of strategies, it also encourages additional development and improvements. An example of such an anticipated improvement may come from further investigation of the most appropriate methods for character weighting (see also Nixon, 1999). This may clarify the problem of choosing a method for character weighting or lead to implementation of other versions of these strategies that are suitable for particular types of data. If no one method is proved to be better for all data sets, that might lead to the implementation of our strategy on fast parsimony programs such as NONA (Goloboff, 1993), which, as currently set up, cannot be used with our strategy. In summary, the New strategy is easy to understand, easy to implement by using PAUP, and more efcient than conventional methods. This, coupled with other methods for constructing large trees (Farris et al., 1996; Goloboff, 1999; Nixon, 1999), and the inevitable future improvements, will facilitate the estimation of large, useful phylogenies that, until now, have been very difcult. ACKNOWLEDGMENTS We thank A. Webster, A. Katzourakis, G. Broad, P.-M. Agapow, J. Harral, and especially M. Goddard and T. Barraclough for comments on the manuscript. We also thank D. Swofford for access to developmental versions of PAUP¤ and two anonymous reviewers for their helpful comments. This work was funded by a Natural Environment Research Council (NERC) PhD studentship and the NERC Initiative in Taxonomy. The Parsimony Ratchet R EFERENCES As previously mentioned, the Parsimony Ratchet (Nixon, 1999) is a similar, independently derived strategy. Its main difference is that, to escape the peaks, the weight of a set of randomly chosen characters is increased by 1 (or more) rather than by their t to the previously found trees. Whether any of the differences in weighting between the New strategy and the Parsimony Ratchet is im- B ARRACLOUGH, T. G., A. P. VOGLER, AND P. H. HARVEY. 1998. Revealing factors that promote speciation. Philos. Trans. R. Soc. London Ser. B 353:241–249. FARRIS , J. S ., V. A. ALBERT , M. K ALLERSJO , D. LIPS COMB , AND A. G. K LUGE . 1996. Parsimony jackkning outperforms neighbor-joining. Cladistics 12:99–124. G AUTHIER, N., J. LASALLE, D. L. J. Q UICKE, AND H. C. J. G ODFRAY. 2000. The phylogeny and classication of the Eulophidae (Hymenoptera, Chalcidoidea) and the recognition that the Elasmidae are derived eulophids. Syst. Entomol. 25:251–539. 66 S YSTEMATIC BIOLOGY G OLOBOFF, P. A. 1993. NONA, version 1.1. Computer program and documentation distributed by J. M. Carpenter, American Museum of Natural History, New York. ([email protected]). G OLOBOFF, P. A. 1999. Analyzing large data sets in reasonable times: Solutions for composite optima. Cladistics 15:415–428. G OULD , S . J. 1997. Cope’s rule as psychological artefact. Nature 385:199–200. HARVEY, P. H., A. J. L. B ROWN, J. M. S MITH, AND S. NEE. 1996. New uses for new phylogenies. Oxford University Press, Oxford. HILLIS , D. M. 1995. Approaches for assessing phylogenetic accuracy. Syst. Biol. 44:3–16. HILLIS , D. M. 1996. Inferring complex phylogenies. Nature 383:130–131. HILLIS , D. M., J. P. HUELSENBECK , AND C. W. CUNNING HAM . 1994. Application and accuracy of molecular phylogenies. Science 264:671–264. M ADDIS ON, D. R. 1991. The discovery and importance of multiple islands of most parsimonious trees. Syst. Zool. 40:315–328. NIXON, K. C. 1999. The Parsimony Ratchet, a new method for rapid parsimony analysis. Cladistics 15:407–414. VOL. 50 P URVIS , A., AND D. L. J. Q UICKE. 1997. Building phylogenies: Are the big easy? Trends Ecol. Evol. 12: 49–50. Q UICKE, D. L. J. 1993. Principles and techniques of contemporary taxonomy. Chapman and Hall, St Edmunds, Suffolk, U.K. Q UICKE, D. L. J., M. G. FITTON, D. G. NOTTON, G. R. B ROAD, AND K. D OLPHIN. 2000. Phylogeny of the subfamilies of Ichneumonidae (Hymenoptera): A simultaneous molecular and morphological analysis. Pages 74–83 in Hymenoptera: evolution, biodiversity and biological control (A. D. Austin and M. Dowton, eds.). CSIRO, Canberra. S IDOW, A. 1994. Parsimony or statistics? Nature 367: 26–27. S TEWART , C.-B. 1993. The powers and pitfalls of parsimony. Nature 361:603–607. S WOFFORD, D. L., G. J. O LSEN, P. J. WADDELL, AND D. M. HILLIS . 1996. Phylogenetic inference. Pages 407–514. in Molecular systematics (D. M. Hillis, C. Moritz, and B. K. Mable, eds.). Sinauer Associates, Sunderland, Massachusetts. Received 9 June 1999; accepted 30 December 1999 Associate Editor: R. Olmstead
© Copyright 2026 Paperzz