BIOINFORMATICS APPLICATIONS NOTE
Vol. 16 no. 11 2000, Pages 1056–1058

PhyloDraw: a phylogenetic tree drawing system

Jeong-Hyeon Choi, Ho-Youl Jung, Hye-Sun Kim and Hwan-Gue Cho
Department of Computer Science, Pusan National University, Pusan, Korea

Received on March 3, 2000; revised on June 27, 2000; accepted on July 7, 2000

Abstract
Summary: PhyloDraw is a unified viewing tool for phylogenetic trees. PhyloDraw supports various kinds of multiple alignment formats (Dialign2, Clustal-W, Phylip format, NEXUS, MEGA, and pairwise distance matrix) and visualizes various kinds of tree diagrams, e.g. rectangular cladogram, slanted cladogram, phylogram, unrooted tree, and radial tree. By using several control parameters, users can easily and interactively manipulate the shape of phylogenetic trees. The program can export the final tree layout to BMP (bitmap image format) and PostScript.
Availability: http://pearl.cs.pusan.ac.kr/phylodraw/
Contact: [email protected]

In a phylogenetic tree, every leaf node represents a species, each edge denotes a relationship between two neighboring species, and the length of an edge indicates the evolutionary distance between species. The length of a path between two species in a phylogenetic tree should be as close as possible to the evolutionary distance between them. However, it is theoretically very difficult to compute a perfect phylogenetic tree. To overcome this difficulty, the least squares method has been devised, which attempts to minimize the following sum of squares $SSQ(T)$ in a tree $T$:

$$SSQ(T) = \sum_{i=1}^{n} \sum_{j=i}^{n} w_{ij} (D_{ij} - d_{ij})^2,$$

where $D_{ij}$ is the observed evolutionary distance between species $i$ and $j$, $d_{ij}$ is the length of the path from $i$ to $j$ in the phylogenetic tree $T$, and $w_{ij}$ is an element of a weight matrix. In general, finding the optimal least squares tree is an NP-complete problem (Day, 1986).

Drawing a phylogenetic tree on a plane is another problem.
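The sum of squares defined above can be illustrated with a short calculation. The following is a minimal sketch, not PhyloDraw's implementation: the function name `sum_of_squares`, the three-species distances, and the star-tree branch lengths are all hypothetical, and the weights default to 1 when no weight matrix is given. The inner index starts at i, as in the formula; the diagonal terms contribute nothing when both matrices have zeros on the diagonal.

```python
def sum_of_squares(observed, path, weights=None):
    """Return SSQ(T) = sum_i sum_{j>=i} w_ij * (D_ij - d_ij)^2.

    observed: matrix D of observed evolutionary distances
    path:     matrix d of path lengths measured in the tree T
    weights:  optional weight matrix w (assumed to be all ones if omitted)
    """
    n = len(observed)
    total = 0.0
    for i in range(n):
        for j in range(i, n):
            w = 1.0 if weights is None else weights[i][j]
            total += w * (observed[i][j] - path[i][j]) ** 2
    return total


# Hypothetical observed distances for three species.
observed = [
    [0.0, 0.3, 0.5],
    [0.3, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]

# Hypothetical star tree: each species hangs off one internal node,
# so the path length between two species is the sum of their branches.
branch = [0.1, 0.2, 0.3]
path = [
    [0.0 if i == j else branch[i] + branch[j] for j in range(3)]
    for i in range(3)
]

ssq = sum_of_squares(observed, path)
print(f"SSQ = {ssq:.4f}")
```

In this toy case only the pair (0, 2) is fitted imperfectly (observed 0.5 against a path length of 0.4), so the sum is approximately 0.01. A least squares tree search would compare such values across candidate trees and branch lengths.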
A phylogenetic tree must not have edge crossings because such crossings would prevent users from recognizing the phylogenetic information. It is desirable for a phylogenetic viewing system to support various types of drawings: unrooted tree, radial tree, rooted tree, slanted cladogram, rectangular cladogram, and phylogram. Also, because there are many kinds of phylogenetic tree formats, support for various types of input data and interactive editing are major concerns in evaluating drawing software.

There are currently several tools for drawing phylogenetic trees, for example, NJPLOT, GENETREE, PHYLIP, GENEDOC, DAMBE, TREECON, TREEVIEW, and SPECTRUM (Perriere and Gouy, 1996; Felsenstein, 1993; Page and Charleston, 1997; Charleston, 1998). NJPLOT (GENETREE) accepts the *.ph (*.gtr) format as input and generates a rectangular cladogram (phylogram). One of the drawbacks of PHYLIP is that it allows edge crossings in the final layout. DAMBE displays only two types of trees, cladogram and phylogram, and it also allows edge crossings when there are many species. TREECON, TREEVIEW, and SPECTRUM allow labels to overlap, although label overlap can easily be avoided by manual adjustment.

In contrast, PhyloDraw produces no edge crossings, in practice even with up to 100 species. PhyloDraw uses a reliable labeling algorithm to avoid overlapping names in the final tree layout. We also tried to distribute all the species in a phylogenetic tree as uniformly as possible over the whole output screen. Previous phylogenetic tree drawing systems did not allow users to edit the shape of the tree on screen, but PhyloDraw provides several editing functions through a user-friendly interface. Users can select the tree type (rectangular cladogram, slanted cladogram, phylogram, unrooted tree, or radial tree), resize the tree, and change the branching pattern of the phylogenetic tree (rooted, unrooted, or half-rooted).
PhyloDraw can import the output of various multiple alignment programs and formats, such as DIALIGN (Morgenstern et al., 1998), Clustal-W (Thompson et al., 1994), Phylip, NEXUS, and MEGA, as well as a pairwise distance matrix. A pairwise distance matrix is a matrix of the evolutionary distances between every pair of species. For constructing a phylogenetic tree from a pairwise distance matrix, we provide two well-known clustering methods: Neighbor Joining (Saitou and Nei, 1987) and Fitch–Margoliash (Felsenstein, 1997). Figure 1a shows a testing pairwise distance matrix of 5 species. The corresponding phylogenetic tree is shown in Figure 1b. By manipulating the entry values of the distance matrix, we can easily examine the clustering method. PhyloDraw exports three types of output data (BMP, PostScript, and pairwise distance matrix) so that users can include them in word processors, graphics tools, and PostScript.

        apple  pear  cat  dog  pig
apple   0      0.1   0.8  0.9  0.7
pear    0.1    0     0.9  0.8  0.6
cat     0.8    0.9   0    0.2  0.2
dog     0.9    0.8   0.2  0    0.1
pig     0.7    0.6   0.2  0.1  0

(a) A testing pairwise distance matrix. (b) Corresponding phylogenetic tree.
Fig. 1. A test of a pairwise distance input matrix and its corresponding phylogenetic tree by using the Neighbor-Joining method.

Fig. 2. System interface of PhyloDraw.

© Oxford University Press 2000

Figure 2 shows a snapshot of PhyloDraw. The left subwindow shows the current tree information and control buttons. In a rooted tree, users can control the fanout degree from the root. Users can also set a control variable $w$ ($0 \le w \le 1$) that balances evolutionary distance against branching pattern. The length of an edge is computed as

$$D = w \cdot D_u + (1 - w) \cdot D_e,$$

where $D_u$ is the unit distance and $D_e$ is the evolutionary distance. If $w$ is 0, the length of a tree edge reflects evolutionary time. If $w$ is 1, the tree shows only the branching process.
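The effect of the control variable w on edge lengths can be sketched as follows. This is an illustrative example only, not PhyloDraw's code: the function name `edge_length`, the default unit distance of 1.0, and the sample evolutionary distances are assumptions made for the illustration.

```python
def edge_length(evolutionary, w, unit=1.0):
    """Blend the unit distance Du and the evolutionary distance De.

    Implements D = w * Du + (1 - w) * De for 0 <= w <= 1.
    The default unit distance of 1.0 is an assumption for this sketch.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in the interval [0, 1]")
    return w * unit + (1.0 - w) * evolutionary


# Hypothetical evolutionary distances of three edges.
edges = [0.05, 0.4, 0.9]

for w in (0.0, 0.5, 1.0):
    drawn = [edge_length(de, w) for de in edges]
    print(f"w = {w}: {[round(d, 3) for d in drawn]}")
```

With w = 0 the drawn lengths equal the evolutionary distances, and with w = 1 every edge has the same unit length, so only the branching order remains visible. Intermediate values lengthen very short edges, which can make dense parts of a tree easier to read.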
PhyloDraw was implemented with Visual C++ 6.0, and can be easily installed in Windows 95/98 or Windows NT.

References
Charleston,M.A. (1998) Spectrum: spectral analysis of phylogenetic data. Bioinformatics, 14, 98–99.
Day,W.H. (1986) Computational complexity of inferring phylogenies from dissimilarity matrices. Bull. Math. Biol., 49, 461–467.
Felsenstein,J. (1993) PHYLIP: Phylogeny Inference Package. Version 3.5. University of Washington, Seattle, WA.
Felsenstein,J. (1997) An alternating least squares approach to inferring phylogenies from pairwise distances. Syst. Biol., 46, 101–111.
Morgenstern,B., Frech,K., Dress,A. and Werner,T. (1998) Dialign: finding local similarities by multiple sequence alignment. Bioinformatics, 14, 290–294.
Page,R.D. and Charleston,M.A. (1997) From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. Mol. Phylogenet. Evol., 7, 231–240.
Perriere,G. and Gouy,M. (1996) WWW-query: an on-line retrieval system for biological sequence banks. Biochimie, 78, 364–369.
Saitou,N. and Nei,M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. 4, 406–425.
Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Clustal-W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673–4680.