PhyloDraw: a phylogenetic tree drawing system

BIOINFORMATICS APPLICATIONS NOTE
Vol. 16 no. 11 2000
Pages 1056–1058
Jeong-Hyeon Choi, Ho-Youl Jung, Hye-Sun Kim and Hwan-Gue Cho
Department of Computer Science, Pusan National University, Pusan, Korea
Received on March 3, 2000; revised on June 27, 2000; accepted on July 7, 2000
Abstract
Summary: PhyloDraw is a unified viewing tool for
phylogenetic trees. PhyloDraw supports various kinds of
multi-alignment formats (Dialign2, Clustal-W, Phylip format, NEXUS, MEGA, and pairwise distance matrix) and
visualizes various kinds of tree diagrams, e.g. rectangular
cladogram, slanted cladogram, phylogram, unrooted tree,
and radial tree. By using several control parameters,
users can easily and interactively manipulate the shape of
phylogenetic trees. This program can export the final tree
layout to BMP (bitmap image format) and PostScript.
Availability: http://pearl.cs.pusan.ac.kr/phylodraw/
Contact: [email protected]
In a phylogenetic tree, every leaf node represents a
species, each edge denotes a relationship between two
neighboring species, and the length of an edge indicates
the evolutionary distance between species.
The length of the path between two species in a phylogenetic tree should be as
close as possible to the evolutionary distance between them.
However, it is theoretically very difficult to compute a
perfect phylogenetic tree. To overcome this difficulty, the
least squares method, which attempts to minimize the
following sum of squares ($SSQ(T)$) in a tree ($T$), has
been devised:
$$SSQ(T) = \sum_{i=1}^{n} \sum_{j=i}^{n} w_{ij} (D_{ij} - d_{ij})^2,$$
where $D_{ij}$ is the observed evolutionary distance between
species $i$ and $j$, and $d_{ij}$ is the length of the path from $i$
to $j$ in a phylogenetic tree $T$. Note that $w_{ij}$ is an element
of a weight matrix. In general, finding the optimal least
squares tree is an NP-complete problem (Day, 1986).
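As a minimal sketch of the criterion above, the following Python function evaluates the weighted sum of squares for a given pair of matrices. The function name `ssq`, the default of unit weights, and the 3-species matrices are illustrative assumptions and do not come from PhyloDraw itself.

```python
def ssq(observed, path_len, weights=None):
    """Weighted sum of squared differences between observed distances
    and tree path lengths, summed over all pairs with i <= j."""
    n = len(observed)
    total = 0.0
    for i in range(n):
        for j in range(i, n):
            # Assumption: unit weights when no weight matrix is supplied.
            w = 1.0 if weights is None else weights[i][j]
            diff = observed[i][j] - path_len[i][j]
            total += w * diff * diff
    return total


# Hypothetical observed distances for three species.
D = [[0.0, 0.3, 0.5],
     [0.3, 0.0, 0.6],
     [0.5, 0.6, 0.0]]
# Hypothetical path lengths in some candidate tree.
d = [[0.0, 0.3, 0.5],
     [0.3, 0.0, 0.4],
     [0.5, 0.4, 0.0]]

print(ssq(D, d))  # only the pair (2, 3) differs, by 0.2
```

A tree whose path lengths reproduce every observed distance scores zero, so a smaller value indicates a better fit.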
Drawing a phylogenetic tree on a plane is another
problem. A phylogenetic tree must not have edge
crossings because such crossings would prevent users
from recognizing the phylogenetic information. It is
desirable for the phylogenetic viewing system to be
capable of supporting various types of drawings: unrooted tree, radial tree, rooted tree, slanted cladogram,
rectangular cladogram, and phylogram. Also, because
there are many kinds of phylogenetic tree formats,
support for various types of input data and for interactive editing is a major concern in evaluating drawing
software.
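One simple way to obtain a drawing without edge crossings can be sketched as follows: give each leaf its own row in depth-first order and centre every internal node over its children, so that each subtree occupies a contiguous band of rows. This is an illustrative sketch and not PhyloDraw's layout algorithm. The nested-tuple tree representation, the function name `rectangular_layout`, and the use of depth from the root as the x coordinate are all assumptions.

```python
def rectangular_layout(tree):
    """Return {node: (x, y)} for a rooted tree given as nested tuples.

    Leaves are strings with unique names; internal nodes are tuples
    of children. x is the depth from the root, y is the row.
    """
    coords = {}
    next_row = [0]  # mutable counter for the next free leaf row

    def place(node, depth):
        if isinstance(node, str):
            # Leaf: take the next free row.
            y = float(next_row[0])
            next_row[0] += 1
        else:
            # Internal node: centre it over its children.
            ys = [place(child, depth + 1) for child in node]
            y = sum(ys) / len(ys)
        coords[node] = (depth, y)
        return y

    place(tree, 0)
    return coords


# Hypothetical rooted tree over five species.
tree = (("apple", "pear"), ("cat", ("dog", "pig")))
for node, (x, y) in rectangular_layout(tree).items():
    print(node, x, y)
```

Because sibling subtrees never share a row, the horizontal and vertical segments of a rectangular drawing built from these coordinates cannot cross.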
There are currently several tools for drawing phylogenetic trees, for example, NJPLOT, GENETREE, PHYLIP,
GENEDOC, DAMBE, TREECON, TREEVIEW, and
SPECTRUM (Perriere and Gouy, 1996; Felsenstein,
1993; Page and Charleston, 1997; Charleston, 1998).
NJPLOT (GENETREE) accepts the *.ph (*.gtr) format
as its input and generates a rectangular cladogram
(phylogram). One of the drawbacks of PHYLIP is that it
allows edge crossings in the final layout. DAMBE displays
only two types of trees, cladogram and phylogram, and
it also allows edge crossings when there are many species.
TREECON, TREEVIEW, and SPECTRUM allow labels
to overlap, although label overlap can easily be avoided
by manual adjustment. In contrast, PhyloDraw does not
produce any edge crossings, in practice even with up to 100 species. PhyloDraw uses a reliable labeling algorithm
to avoid overlapping names in the final tree layout. We also
tried to distribute all the species in a phylogenetic tree as uniformly as possible over the whole output
screen.
Previous phylogenetic tree drawing systems did not
allow users to edit the shape of the tree on screen, but
PhyloDraw provides several editing functions to give a
user-friendly interface. Users can select the tree type
(rectangular cladogram, slanted cladogram, phylogram,
unrooted tree, or radial tree), resize the tree, and change
the branching patterns of the phylogenetic tree (rooted,
unrooted, or half-rooted).
PhyloDraw can import the output of various multiple
alignment programs and formats, such as DIALIGN (Morgenstern et
al., 1998), Clustal-W (Thompson et al., 1994), Phylip,
NEXUS, and MEGA, as well as a pairwise distance matrix. A
pairwise distance matrix is a matrix of the evolutionary
distances between every pair of species.
© Oxford University Press 2000
         apple   pear    cat     dog     pig
apple    0       0.1     0.8     0.9     0.7
pear     0.1     0       0.9     0.8     0.6
cat      0.8     0.9     0       0.2     0.2
dog      0.9     0.8     0.2     0       0.1
pig      0.7     0.6     0.2     0.1     0
(a) A test pairwise distance matrix.
(b) Corresponding phylogenetic tree.
Fig. 1. A test pairwise distance matrix and its corresponding phylogenetic tree, obtained with the Neighbor-Joining method.
Fig. 2. System interface of PhyloDraw.
In constructing a phylogenetic tree from a pairwise
distance matrix, we provide two well-known clustering
methods: Neighbor Joining (Saitou and Nei, 1987)
and Fitch–Margoliash (Felsenstein, 1997). Figure 1a
shows a test pairwise distance matrix of 5 species.
The corresponding phylogenetic tree is shown in Figure 1b. By changing the entries of the distance
matrix, users can easily examine the behavior of the clustering method.
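To make the clustering step concrete, here is a minimal neighbor-joining sketch in Python applied to the test matrix of Figure 1a. It is not PhyloDraw's implementation: the names `neighbor_joining`, `NAMES`, `TENTHS` and `MATRIX` are assumptions, branch lengths are omitted, and only the unrooted topology is returned as nested tuples. Distances are stored as exact fractions (entered in tenths) so that ties are broken deterministically by position.

```python
from fractions import Fraction


def neighbor_joining(names, matrix):
    """Return the unrooted NJ topology as a tuple of three subtrees."""
    nodes = list(names)
    d = {(a, b): Fraction(matrix[i][j])
         for i, a in enumerate(names) for j, b in enumerate(names)}
    while len(nodes) > 3:
        n = len(nodes)
        # Row sums of the current distance matrix.
        r = {a: sum(d[a, b] for b in nodes) for a in nodes}
        # Pick the pair minimising Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        _, i, j = min(
            ((n - 2) * d[nodes[i], nodes[j]] - r[nodes[i]] - r[nodes[j]], i, j)
            for i in range(n) for j in range(i + 1, n))
        a, b = nodes[i], nodes[j]
        new = (a, b)
        rest = [c for k, c in enumerate(nodes) if k not in (i, j)]
        # Distance from the new internal node to every remaining cluster.
        for c in rest:
            d[new, c] = d[c, new] = (d[a, c] + d[b, c] - d[a, b]) / 2
        d[new, new] = Fraction(0)
        nodes = rest + [new]
    return tuple(nodes)


NAMES = ["apple", "pear", "cat", "dog", "pig"]
# Figure 1a distances, expressed in tenths to keep the arithmetic exact.
TENTHS = [[0, 1, 8, 9, 7],
          [1, 0, 9, 8, 6],
          [8, 9, 0, 2, 2],
          [9, 8, 2, 0, 1],
          [7, 6, 2, 1, 0]]
MATRIX = [[Fraction(v, 10) for v in row] for row in TENTHS]

print(neighbor_joining(NAMES, MATRIX))
# ('pig', ('apple', 'pear'), ('cat', 'dog'))
```

With these inputs the sketch joins apple with pear first and then cat with dog, leaving pig attached to the internal branch. When four clusters remain, the two complementary pairs always receive the same score, so the tie-break affects only how the result is written, not the unrooted topology.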
PhyloDraw exports three types of output data (BMP,
PostScript, and pairwise distance matrix) so that users
can include them in word processors, graphics tools, and
PostScript documents.
Figure 2 shows a snapshot of PhyloDraw. The left
subwindow shows the current tree information and control
buttons. In a rooted tree, users can control the fanout
degree from the root. Users can also set a control
variable $w$ ($0 \le w \le 1$) that balances evolutionary
distance against branching pattern. The length of an edge is
computed as
$$D = w \cdot D_u + (1 - w) \cdot D_e,$$
where $D_u$ is the unit distance and $D_e$ is the evolutionary
distance. If $w$ is 0, the length of a tree edge
reflects evolutionary time. If $w$ is 1, it shows only
the branching process.
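The blend between the two extremes can be written as a short function. This is a sketch of the formula only; the function name `edge_length`, its argument names, and the example values are assumptions.

```python
def edge_length(w, unit, evolutionary):
    """Displayed edge length: w * unit + (1 - w) * evolutionary."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * unit + (1.0 - w) * evolutionary


# Hypothetical edge with unit distance 1.0 and evolutionary distance 0.25.
for w in (0.0, 0.5, 1.0):
    print(w, edge_length(w, 1.0, 0.25))
```

At w = 0 the function returns the evolutionary distance unchanged, and at w = 1 every edge has the unit length, which matches the two cases described above.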
PhyloDraw was implemented in Visual C++ 6.0 and
can be easily installed on Windows 95/98 or Windows NT.
References
Charleston,M.A. (1998) Spectrum: spectral analysis of phylogenetic
data. Bioinformatics, 14, 98–99.
Day,W.H. (1986) Computational complexity of inferring phylogenies from dissimilarity matrices. Bull. Math. Biol., 49, 461–467.
Felsenstein,J. (1993) PHYLIP: Phylogeny Inference Package. Version 3.5. University of Washington, Seattle, WA.
Felsenstein,J. (1997) An alternating least squares approach to
inferring phylogenies from pairwise distances. Syst. Biol., 46,
101–111.
Morgenstern,B., Frech,K., Dress,A. and Werner,T. (1998) Dialign:
finding local similarities by multiple sequence alignment. Bioinformatics, 14, 290–294.
Page,R.D. and Charleston,M.A. (1997) From gene to organismal
phylogeny: reconciled trees and the gene tree/species tree
problem. Mol. Phylogenet. Evol., 7, 231–240.
Perriere,G. and Gouy,M. (1996) WWW-query: an on-line retrieval
system for biological sequence banks. Biochimie, 78, 364–369.
Saitou,N. and Nei,M. (1987) The neighbor-joining method: a new
method for reconstructing phylogenetic trees. Mol. Biol. Evol., 4, 406–425.
Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Clustal-W:
improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Res., 22,
4673–4680.