Syst. Biol. 45(2):223-246, 1996
TESTING HYPOTHESES OF CHAETOGNATH ORIGINS: LONG
BRANCHES REVEALED BY 18S RIBOSOMAL DNA
KENNETH M. HALANYCH1
Department of Zoology, University of Texas, Austin, Texas 78712, USA; and
Department of Zoology and Entomology, University of Pretoria, Pretoria 0002, South Africa
Abstract.—Many hypotheses regarding the phylogenetic position of the Chaetognatha (arrow
worms) have been proposed; these organisms are problematic primarily because their morphology
offers few unambiguous systematic characters that ally them with other taxa. Early researchers
proposed a plethora of phylogenetic placements for the Chaetognatha, grouping them with such
divergent taxa as acanthocephalans and mollusks, but more traditional hypotheses posit that
chaetognaths are, in fact, deuterostomes. Recently, Telford and Holland (1993, Mol. Biol. Evol. 10:
660-676) and Wada and Satoh (1994, Proc. Natl. Acad. Sci. USA 91:1801-1804) disputed the deuterostome affinities of chaetognaths based on 18S nuclear ribosomal RNA (rDNA) gene sequence
data. By employing published 18S rDNA gene sequence data, I extended these previous analyses
by testing specific hypotheses of chaetognath affinities to nematodes, mollusks, acanthocephalans,
and deuterostomes. Both parsimony and neighbor-joining analyses supported the monophyly of
a chaetognath-nematode clade. Faith's T-PTP test and winning-sites analyses were employed to
discriminate among competing hypotheses. The possibility of long-branch attraction accounting
for the chaetognath-nematode relationship was explored by analyzing alternative four-taxon trees.
An evolutionary scenario for the origin of the chaetognath lineage from a vermiform benthic
organism is presented. [Chaetognatha; Nematoda; 18S rDNA; long branches; winning-sites test;
phylogeny.]
"I think ... that the relations of Chaetognaths remain obscure and buried; nor is any sign to be seen
that they might be discovered in a short while". Let
us hope that Grassi's pessimistic views may be dispelled in a shorter time than has passed since he
expressed them.
—Ghirardelli, 1968
Chaetognatha (arrow worms) is a marine phylum of vermiform predatory organisms. They are usually planktonic, are
approximately 0.5-12 cm in length, and
are characterized in part by the presence
of chitinous spines around the mouth and
peculiar ciliary organs located dorsally.
This group of organisms has perplexed biologists since its discovery in 1768 because
the unique morphology offers few clues to
its evolutionary origins.
1 Present address: Department of Biological Sciences, Southern Methodist University, 227 Fondren
Science Building, Dallas, Texas 75275, USA. E-mail:
[email protected].
Traditionally, chaetognaths have been
considered a basal deuterostome lineage
(Hyman, 1959:66; Willmer, 1990). This hypothesis emphasized the importance of the
tripartite coelomic arrangement and embryological characters. However, because
of the lack of synapomorphies with other
phyla, a myriad of hypotheses have been
proposed concerning chaetognath origins.
For example, Nielsen (1985) argued that
the arrow worms are most closely related
to the acoelomate acanthocephalans, and
Günther (1907) and Casanova (1987) proposed a protostome affinity by suggesting
that chaetognaths are derived mollusks.
An older hypothesis is that chaetognaths
are pseudocoelomates closely related to
the nematodes (Schneider, 1886, op. cit.
Ghirardelli, 1968). This last proposal was
originally supported by the apparent lack
of circular muscle in the body wall in both
taxa. Unfortunately, most of the morphological and embryological information is
ambiguous or conflicting. For example, the
coelomic condition of chaetognaths is still
debated, and depending upon the interpretation of the arrangement of various cell
and tissue layers, chaetognaths are either
pseudocoelomate or eucoelomate.
In the absence of unambiguous morphological information, DNA sequence data
can be a very useful tool for determining
the phylogenetic placement of a group of
organisms. Both Telford and Holland
(1993) and Wada and Satoh (1994) explored chaetognath origins using the 18S
nuclear ribosomal RNA (rDNA) gene. This
gene has been widely used to reconstruct
ancient phylogenetic events (hundreds of
millions of years ago) because of its slow
rate of change and concerted evolution
(Field et al., 1988; Hillis and Dixon, 1991).
The 18S rDNA data suggest that chaetognaths are not deuterostomes, and Telford
and Holland (1993) argued that the chaetognath lineage was established prior to
the origin of coelomate metazoans. These
previous molecular studies, however, focused largely on deuterostome taxa. Here,
I used a greater representation of metazoan life to reanalyze various phylogenetic
hypotheses.
Four explicit hypotheses of the evolutionary origins of chaetognaths were tested
to determine if chaetognaths are most
closely related to nematodes, acanthocephalans, mollusks, or deuterostomes. These
taxa were specifically chosen to represent
each of the major types of coelomic formation in triploblastic metazoans (nematode = pseudocoelomate; acanthocephalan
= acoelomate; mollusk = schizocoelomate;
deuterostome = enterocoelomate). Recent
ultrastructural and developmental observations (Welsch and Storch, 1982; Shinn
and Roberts, 1994) corroborated the presence of a true coelom in arrow worms. In
contrast, the 18S rDNA data suggest that
chaetognaths are most closely related to a
pseudocoelomate group, the nematodes.
Clearly, the phylogenetic position of chaetognaths is important for understanding
plasticity and constraints of coelom evolution.
The goal of this study was to further our
understanding of chaetognath origins by
testing alternative hypotheses using 18S
rDNA sequence data and standard phylogenetic reconstruction methods. Various
statistical tests (bootstrap analyses, Faith's
[1991] topology-dependent cladistic permutation tail probability [T-PTP] test, and
winning-sites analyses) were used to determine the robustness of the results. Analyses of four-taxon trees suggest that the
chaetognath-nematode association is not
due to long-branch attraction. A speculative evolutionary scenario is proposed that
illustrates the functional feasibility of the
derivation of modern chaetognaths from a
common benthic chaetognath-nematode
ancestor.
MATERIALS AND METHODS
The phylogenetic analyses employed
complete 18S rDNA sequence data obtained from GenBank for 21 taxa. Table 1
lists the species used, their GenBank accession numbers, and the taxa they represent. To polarize character states and root
the resultant phylogeny, the sponge and
anemone sequences were designated as
outgroups. In situations when more than
one complete 18S sequence was available
for a given taxon, the more slowly evolving
representative was used. For example, the Tenebrio molitor sequence was used for the insect representative instead of the Drosophila
melanogaster sequence because it is evolving at a
"slower" pace. This criterion was used in
an attempt to reduce homoplasy or erroneous results due to large divergence values/long-branch attraction. Furthermore,
in practice, the use of more slowly evolving representatives allows a greater portion of the 18S molecule to be unambiguously aligned.
The alignment of sequences was conducted with the multiple alignment program Clustal V (Higgins et al., 1992) and
then corrected by hand for obvious alignment errors (Appendix). This correction
employed a secondary structure model of
the eukaryotic small ribosomal subunit
(from Saccharomyces cerevisiae; De Rijk et
al., 1992) to identify conserved regions and
variable domains (Neefs and De Wachter,
1990). Regions that could not be unambiguously aligned were excluded from subsequent analyses. These regions differed in
sequence length across taxa and were contained within the previously identified
variable regions (except for one five-basepair region of ambiguity in helix 25, which
was caused by a two-base-pair insert in
Paraspadella). The boundaries of the excluded regions were defined by pruning the sequences
surrounding the ambiguous region back to the last character that was uninformative among phyla.
For example, character 960 had the state G for 19 taxa and state T for 3 taxa, but these 3 taxa were
all nematodes. Thus, character 960 was included in the analyses (see Appendix).
However, character 337 was excluded because it has a T for both Anemonia and Caenorhabditis,
whereas the other taxa had a C in this position (except Saccoglossus, which was a G). (The aligned
data set with secondary structure annotation and the PAUP file can be obtained from the Systematic
Biology World Wide Web site.)
TABLE 1. Metazoan taxa used in this study.
Species                             GenBank no.   Taxon designation
Sagitta elegans                     Z19551        chaetognath 1
Sagitta crassa forma naikaiensis    D14363        chaetognath 2
Paraspadella gotoi                  D14362        chaetognath 3
Caenorhabditis elegans              X03680        nematode 1
Haemonchus contortus                L04153        nematode 2
Nematodirus battus                  U01230        nematode 3
Schistosoma haematobium             Z11976        flatworm 1
Opisthorchis viverrini              X55357        flatworm 2
Artemia salina                      X01723        crustacean
Tenebrio molitor                    X07801        insect
Eurypelma californica               X13457        spider
Moniliformis moniliformis           Z19562        acanthocephalan
Limicolaria kambeul                 X66374        snail
Placopecten magellanicus            X53899        bivalve
Acanthopleura japonica              X70210        chiton
Phoronis vancouverensis             U12648        phoronid
Branchiostoma floridae              M97571        chordate
Strongylocentrotus purpuratus       L28056        echinoderm
Saccoglossus kowalevskii            L28054        hemichordate
Anemonia sulcata                    X53498        anemone
Scypha ciliata                      L10827        sponge
PAUP 3.1.2d5 (Swofford, 1993) was used for parsimony analyses, and PHYLIP 3.5
(Felsenstein, 1993) was used for neighbor joining and maximum-likelihood estimates.
Unfortunately, the size of the data set made a maximum-likelihood search for the best tree
computationally prohibitive. MacClade 3.0 (Maddison and Maddison, 1992) was also used to determine
various character statistics and tree lengths. For the analysis, the ratios 1:1, 1.2:1, 2:1, 3:1, 10:1,
and 100:1 were used to weight transversions over transitions. The ratios of 1:1, 2:1, 3:1, 10:1, and
100:1 were arbitrarily chosen. The ratio of 1.2:1 was determined empirically with the "state changes
and stasis" option of MacClade by counting the average number of transition (Ti) and
transversion (Tv) events on 1,000 random trees. The range of the Ti/Tv ratios over the 1,000 random
trees was 1.13393:1 to 1.25793:1, and the average ratio of 1.19049:1 was rounded to 1.2:1 to reduce
computation time. This ratio is similar to that of the most-parsimonious tree based on equal
weighting (ratio = 1.24:1).
Testing Hypotheses
A T-PTP test was used to examine the significance of monophyly for certain clades
(Faith, 1991). By comparing the difference in length of the shortest monophyletic and
shortest nonmonophyletic trees for randomized data, a random distribution can be generated.
The difference in length for the observed data can be compared with the random distribution,
and the standard significance value of 0.05 can be applied. For the T-PTP analysis conducted herein,
100 random data sets were created using the Seqboot program in PHYLIP 3.5.
Two different winning-sites analyses (Prager and Wilson, 1988) were incorporated into the study.
Both analyses used a
sign-rank test with a one-tail binomial
probability to determine significance
among pairwise comparisons of alternative hypotheses. The "compare two trees"
option of MacClade was used to determine
the number of characters supporting alternative topologies. The first analysis examined the number of winning sites for each
of the 21-taxon trees representing the
shortest trees of the alternative hypotheses
(i.e., nematode affinity, acanthocephalan
affinity, mollusk affinity, and deuterostome
affinity). The second analysis compared
various combinations of four-taxon trees.
Figure 1 shows the three four-taxon trees
examined. Because there are only three
possible topologies for a four-taxon tree,
the mollusks and the acanthocephalan hypotheses were grouped together. Thus,
when the four taxa included a chaetognath,
a nematode, a mollusk or acanthocephalan,
and a deuterostome, the three topologies
represented mutually exclusive hypotheses
(given that the trees were rooted along the
center branch). All possible combinations
of the three chaetognaths, three nematodes, three mollusks + the acanthocephalan, and three deuterostomes used in this
study were tested (i.e., 108 different taxonomic combinations).
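The design of the four-taxon comparison can be sketched in a few lines of Python. This is an illustrative sketch only: the taxon labels follow Table 1, `four_taxon_combinations` and `sign_test_p` are hypothetical helper names, and the tail convention of the one-tail binomial probability (here, the tail includes the observed count and tied sites are ignored) is an assumption, not a detail stated in the text.

```python
from itertools import product
from math import comb

# Taxon labels follow Table 1; the grouping mirrors the text above.
CHAETOGNATHS = ["chaetognath 1", "chaetognath 2", "chaetognath 3"]
NEMATODES = ["nematode 1", "nematode 2", "nematode 3"]
MOLLUSKS_PLUS_ACANTH = ["snail", "bivalve", "chiton", "acanthocephalan"]
DEUTEROSTOMES = ["chordate", "echinoderm", "hemichordate"]


def four_taxon_combinations():
    """Enumerate every chaetognath / nematode / (mollusk or acanthocephalan) / deuterostome quartet."""
    return list(product(CHAETOGNATHS, NEMATODES, MOLLUSKS_PLUS_ACANTH, DEUTEROSTOMES))


def sign_test_p(wins, losses):
    """One-tail binomial probability of at least `wins` successes in wins + losses trials (p = 0.5).

    Assumption: sites that fit both trees equally well are ignored, and
    the tail includes the observed count.
    """
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


quartets = four_taxon_combinations()
print(len(quartets))  # 3 * 3 * 4 * 3 quartets
print(sign_test_p(9, 1))
```

Each quartet yields two pairwise comparisons of the nematode topology against an alternative topology, which accounts for the 216 comparisons reported alongside the 108 taxonomic combinations.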
[Figure 1, panels (a) Chaetognath-Nematode Hypothesis and (b) Chaetognath-Deuterostome Hypothesis: four-taxon trees joining a chaetognath, a deuterostome, a nematode, and a mollusk or acanthocephalan.]
RESULTS
The final alignment of the 21 taxa consisted of 2,072 nucleotide positions, of
which 1,454 could be unambiguously
aligned. Of these, 642 were variable and
456 were phylogenetically informative (i.e.,
parsimony sites). The g1 statistic for 10^4
randomly generated trees (g1 = -1.732) indicated that the data were significantly
more structured than random (Hillis and
Huelsenbeck, 1992).
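The distinction between variable and phylogenetically informative positions can be illustrated with a short sketch. It assumes the usual parsimony definition (a site is informative when at least two states each occur in at least two taxa) and skips gaps and ambiguity codes; the function name and the toy alignment are hypothetical and are not data from this study.

```python
from collections import Counter


def classify_sites(alignment):
    """Return (variable, informative) site counts for equal-length aligned sequences.

    A site is variable if it shows more than one state, and parsimony
    informative if at least two states each occur in at least two taxa.
    Assumption: gaps and ambiguity codes are skipped, not treated as states.
    """
    valid = set("ACGT")
    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(base.upper() for base in column if base.upper() in valid)
        if len(counts) > 1:
            variable += 1
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


# Hypothetical toy alignment: four taxa, five aligned positions.
toy = ["ACGTA", "ACGTC", "ATGAC", "ATGAC"]
print(classify_sites(toy))
```

In the toy alignment the last position is variable but not informative, because one of its two states occurs in a single taxon only.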
[Figure 1, panel (c): Chaetognath-Mollusk Hypothesis or Chaetognath-Acanthocephalan Hypothesis.]
FIGURE 1. The three possible topologies of the four-taxon statements used in the winning-sites tests.
Each topology reflects a different hypothesis of chaetognath relationships. Depending on the taxonomic
combination, the last tree is consistent with the hypothesis of molluscan affinities or acanthocephalan
affinities. All possible combinations (108) of the three chaetognaths, three nematodes, three mollusks plus
one acanthocephalan, and three deuterostomes used in this study were examined.
FIGURE 2. The results of the parsimony analyses. Branch lengths are proportional to the relative amount of
change along the branch. (a) One of two most-parsimonious trees generated by the general heuristic search
with the TBR branch-swapping algorithm of PAUP (Swofford, 1993). The tree was based on equal weighting
(i.e., transitions:transversions = 1:1) and consisted of 1,509 steps. The CI was 0.628 (CI excluding uninformative
characters = 0.566). The alternative tree differed only in that the bivalve clustered with the chiton. (b) The
topology obtained when transversions are weighted 100 times greater than transitions. Weighting was achieved
using a step matrix.
The nucleotide composition of the sequence data was examined as a possible
source of bias in the phylogenetic signal.
The base composition of the variable sites
was determined for each taxon. A distance
table was constructed by taking the absolute value of the difference in total GC content (merely the sum of the percentage of
guanine and the percentage of cytosine)
for all possible pairs of taxa. The taxa were
then clustered using both the Neighbor option and the UPGMA option of the
"DNAdist" program in PHYLIP (Felsenstein, 1993). In neither case did the chaetognaths and nematodes cluster, indicating
that nucleotide composition was not a
source of bias accounting for the repeated
recovery of a chaetognath-nematode clade.
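The base-composition check described above amounts to a simple pairwise table, sketched here in Python. The function names and toy sequences are hypothetical; in the study the sequences were restricted to variable sites and the resulting table was clustered with PHYLIP, which this sketch does not attempt to reproduce.

```python
from itertools import combinations


def gc_percent(seq):
    """Percentage of G plus C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence has no unambiguous bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def gc_distance_table(seqs):
    """Map each pair of taxon names to the absolute difference in GC percentage."""
    gc = {name: gc_percent(seq) for name, seq in seqs.items()}
    return {(a, b): abs(gc[a] - gc[b]) for a, b in combinations(sorted(gc), 2)}


# Hypothetical toy sequences, not data from the study.
toy_seqs = {"taxon_a": "GGCCAT", "taxon_b": "ATATGC", "taxon_c": "GCGCGC"}
gc_table = gc_distance_table(toy_seqs)
for pair, dist in gc_table.items():
    print(pair, round(dist, 1))
```

If two groups were drawn together by shared base composition alone, they would be expected to show small distances in such a table and to cluster when it is analyzed.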
Using PAUP, a heuristic search employing the general setting and the tree bisection-reconnection branch-swapping algorithm produced two most-parsimonious
trees with a length of 1,509 steps and a
consistency index (CI) of 0.628 (CI excluding uninformative characters = 0.566). One
of these trees is shown in Figure 2a. The
other tree only differed in that the bivalve
clustered with the chiton. Because the frequencies of transitions and transversions
are not always equal over the course of
evolutionary time, I employed step matrices to weight transversions more heavily
than transitions. In addition to the ratio of
1:1 (i.e., equal weighting), the ratios of 1.2:
1, 2:1, 3:1, 10:1, and 100:1 weighting transversions over transitions were used to test
the robustness of the equally weighted
maximum parsimony results. All weighting ratios yielded topologies in which the
chaetognaths and nematodes formed a
monophyletic group. The flatworms were
always the sister taxon to this clade. The
deuterostomes were paraphyletic when ratios greater than 2:1 were used. With
weighting schemes <3:1, the acanthocephalans and the arthropods were placed just
outside of the chaetognath-nematode-flatworm clade, but at 10:1 and 100:1 the arthropods
clustered with the mollusks and the phoronid. The 100:1 tree is shown in
Figure 2b for comparison. A weighted parsimony analysis (Williams and Fitch,
1990), which is accurate over a wider range of parameters than is an equally weighted
parsimony analysis, also clustered the chaetognaths and nematodes.
A majority rule consensus tree of 200 bootstrap iterations with a heuristic search
and the empirical weighting ratio of 1.2:1 is shown in Figure 3. Branches with bootstrap
frequencies <50% were collapsed. The bootstrap results indicated strong
support (81% with either the 1.2:1 or the 1:1 ratio) for the chaetognath-nematode
clade (see Hillis and Bull, 1993).
FIGURE 3. The results of the parsimony bootstrap analysis based on the weighting ratio of 1.2:1
empirically derived from the observed transition:transversion value. The topology shown is a
majority-rule consensus tree generated by 200 iterations of the heuristic search algorithm.
Branches supported <50% were collapsed, and values ≥50% are shown.
Using the PHYLIP package, I obtained
maximum-likelihood estimations of distances based on a Kimura two-parameter
model and used these estimates to reconstruct the phylogeny by neighbor joining.
The topologies obtained using weighting
ratios of 1:1, 1.2:1, 2:1, and 3:1 are identical.
This topology is shown in Figure 4a with
the branch lengths derived from the 1.2:1
weighting ratio. The 10:1 and 100:1 topologies differed in the arrangement of taxa
within the mollusk-phoronid clade, and
the 10:1 topology also showed the flatworm clade as being more derived. A
neighbor-joining bootstrap analysis
(weighting ratio 1.2:1 and 200 iterations)
produced a topology (Fig. 4b) consistent
with the parsimony bootstrap topology.
The chaetognath-nematode clade was
strongly supported by a high bootstrap
value (91%). Additionally, modification of
the Kimura two-parameter model, which
accounted for site-to-site variation by employing a gamma distribution (Jin and
Nei, 1990), produced similar results.
FIGURE 4. The results of the neighbor-joining analysis based on a Kimura two-parameter model with a
weighting ratio of 1.2:1. (a) Tree derived when the PHYLIP (Felsenstein, 1993) software package was used.
Branch lengths are proportional to the relative amount of change along a branch. (b) Bootstrap analysis using
likelihood estimates of Kimura distances. The majority-rule consensus tree generated by 200 iterations is shown.
Branches supported <50% were collapsed, and values ≥50% are shown.
Testing Hypotheses
Table 2 shows the length of the most-parsimonious trees (based on equal
weighting; i.e., ratio of 1:1), the number of
most-parsimonious trees, and the CI for
the alternative hypotheses of chaetognath
affinities examined. The parsimony and
neighbor-joining analyses were consistent
with the hypothesis of a nematode affinity.
However, to determine if alternative hypotheses
were significantly different, a T-PTP
test and winning-sites analyses were performed.
The T-PTP test was used to determine if
the chaetognath-nematode hypothesis was
significantly better supported than the
other hypotheses, which imply that chaetognaths
and nematodes do not form a
monophyletic group. The random distribution
ranged from -57 to +3 when the
length of the shortest monophyly tree was
subtracted from the length of the shortest
nonmonophyly tree. For the observed data,
this difference was +10. Thus, the T-PTP
test supported monophyly of the chaetognath-nematode clade at a level of P ≤ 0.01.
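The logic of this comparison can be sketched as a permutation-style tail probability. This is a minimal illustration, not the procedure as implemented in the study: the helper name is hypothetical, the randomized values are simulated stand-ins (only their range of -57 to +3 and the observed difference of +10 come from the text), and counting the observed data set as one member of the reference distribution is an assumption.

```python
import random


def tptp_p_value(observed_diff, randomized_diffs):
    """Tail probability of a length difference at least as large as the observed one.

    Assumption: the observed data set counts as one member of the
    reference distribution, so the smallest attainable value is 1 / (n + 1).
    """
    as_extreme = sum(1 for d in randomized_diffs if d >= observed_diff)
    return (as_extreme + 1) / (len(randomized_diffs) + 1)


# Hypothetical stand-in for the 100 randomized data sets: only the range
# (-57 to +3) and the observed difference (+10) come from the text.
rng = random.Random(0)
randomized = [rng.randint(-57, 3) for _ in range(100)]
p = tptp_p_value(10, randomized)
print(p <= 0.01)
```

Under this convention, an observed difference that exceeds all 100 randomized differences gives a probability of 1/101, which is consistent with the reported level of significance.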
The winning-sites test, which employed
all 21 taxa, significantly supported the
chaetognath-nematode hypothesis over
both the chaetognath-mollusk and the
chaetognath-deuterostome hypotheses
(Table 3). Even though the nematode hypothesis was significantly supported over
the deuterostome hypothesis in only 10 out
of 16 comparisons, the other 6 comparisons all had P values of <0.06. The chaetognath-acanthocephalan hypothesis,
however, was not significantly rejected for
any of the three most-parsimonious trees
in which these taxa form an exclusive clade.
TABLE 2. A comparison of alternative phylogenetic
hypotheses of chaetognath origins. The results were
obtained from parsimony analyses based on equal
weighting (Ti/Tv ratio = 1:1).
Hypothesis         Tree length(a)   No. trees   CI(b)
Nematode           1,509            2           0.628
Acanthocephalan    1,521            3           0.623
Deuterostome       1,527            8           0.620
Mollusk            1,540            2           0.615
a Length of the shortest tree(s) in which chaetognaths and
the taxa in question (i.e., nematodes, acanthocephalans, deuterostomes, or mollusks) were monophyletic.
b CI = consistency index.
The results of the winning-sites tests for
the four-taxon combinations are shown in
Table 4. The chaetognath-nematode hypothesis is clearly the most strongly supported, and the chaetognath-mollusk hypothesis was significantly rejected in all of
the combinations for which it was tested.
Furthermore, the chaetognath-nematode
tree lost only 19 out of 216 comparisons,
and none of these losses were significantly
supported.
TABLE 3. The results of the winning-sites tests for 21-taxon trees. A sign-rank test with a one-tail binomial
probability was used to determine significance among pairwise comparisons of alternative hypotheses.
                          Versus nematode tree 1                      Versus nematode tree 2
                          Alternative     Nematode                    Alternative     Nematode
Tree                      hypothesis(a)   hypothesis   Probability(b) hypothesis      hypothesis   Probability
Acanthocephalan tree 1    39              52           0.071          38              50           0.083
Acanthocephalan tree 2    40              52           0.087          38              50           0.083
Acanthocephalan tree 3    39              51           0.085          38              50           0.083
Deuterostome tree 1       32              47           0.036*         32              47           0.036*
Deuterostome tree 2       33              47           0.046*         33              47           0.046*
Deuterostome tree 3       36              51           0.043*         37              51           0.055
Deuterostome tree 4       34              49           0.039*         33              47           0.046*
Deuterostome tree 5       37              51           0.055          37              51           0.055
Deuterostome tree 6       38              53           0.046*         37              51           0.055
Deuterostome tree 7       33              47           0.046*         34              48           0.049*
Deuterostome tree 8       37              51           0.055          38              52           0.057
Mollusk tree 1            37              64           0.003*         36              63           0.002*
Mollusk tree 2            40              68           0.003*         39              67           0.002*
a Numbers are the numbers of more parsimonious characters (i.e., winning sites) versus the competing hypothesis.
b * P < 0.05 considered statistically significant.
Long Branch Length
The results of analyses presented here
strongly support a chaetognath-nematode
clade. One concern raised by both the parsimony and the neighbor-joining analyses
is that of long branch lengths. Felsenstein
(1978) showed that parsimony methods
can be misled by branch attraction when
two or more taxa exhibit relatively long
branch lengths. In this study, several taxa,
including chaetognaths and nematodes,
have long branches relative to those of other taxa. Previous studies in which long-branch attraction has been examined have
used four-taxon cases for which the real
phylogeny was well known or where there
was strong support for a topology in
which the two long branches failed to
group (Felsenstein, 1978; Allard and Miyamoto, 1992; Huelsenbeck and Hillis,
1993). Unfortunately, neither of these situations is true here, and some of the methods previously applied to data sets to reveal long-branch attraction cannot be used.
Chaetognaths and nematodes do, however,
cluster under a wide variety of parameters,
including situations in which their branch
lengths are not much longer than those of
other taxa.
Figure 5 shows three groups of four-taxon trees in which branch lengths were determined by calculating likelihood estimates (Felsenstein, 1981) assuming a
Jukes-Cantor model of evolution and equal
nucleotide frequencies (Huelsenbeck and
Hillis, 1993). Although this approach will
underestimate branch length, it is nonetheless a standard estimation procedure. In
group 1, both the chaetognath and the
nematode have long branch lengths relative to the remaining three branches, and
parsimony highly favored the chaetognath-nematode tree (bootstrap value =
100%). Thus, group 1 matches the prediction of long-branch attraction.
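For readers unfamiliar with the Jukes-Cantor model, its correction for multiple substitutions can be illustrated with the pairwise distance form, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The sketch below is only an illustration of that formula under the model's assumption of equal base frequencies and equal substitution rates; it is not the likelihood estimation of branch lengths on a four-taxon tree that was used here, and the function name and the handling of gaps are assumptions.

```python
from math import log


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor estimate of substitutions per site between two aligned sequences.

    Uses d = -(3/4) * ln(1 - (4/3) * p), where p is the observed proportion
    of differing sites. Assumption: sites with a gap or ambiguity code in
    either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy sequences: one difference in four sites (p = 0.25).
d = jukes_cantor_distance("ACGT", "ACGA")
print(round(d, 4))
```

The corrected distance is always at least as large as the raw proportion of differences, and the gap between the two widens as sequences diverge, which is why long branches are the ones most affected by the choice of model.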
Groups 2 and 3 represent cases in which
the chaetognath and nematode branches
are not very long relative to the other taxa.
In the case of group 2, parsimony favored
the chaetognath-deuterostome tree. The
acanthocephalan and the nematode are the
longest two branches and yet they did not
cluster when the entire data set was considered. Group 3 includes a flatworm sequence roughly the same length as the
chaetognath sequence used, and yet parsimony still favored the chaetognath-nematode tree. Moreover, maximum-likelihood
reconstructions, which employed the
weighting ratio of 1.2:1 empirically derived
from the observed Ti/Tv value, supported
the chaetognath-nematode clade for all
three groups. The chaetognath-nematode
tree also consistently scored the greatest
number of invariants when Lake's method
(1987) was used, but the results were not
significantly supported. Both likelihood
and Lake's method are considered less
prone to long-branch attraction and problems of inconsistency (Felsenstein, 1978;
Huelsenbeck and Hillis, 1993), although
these methods may also become inconsistent in extreme situations (Allard and Miyamoto, 1992).
TABLE 4. The results of the winning-sites test for four-taxon trees. A sign-rank test with a one-tail binomial
probability was used to determine significance among pairwise comparisons of alternative hypotheses. There
were 108 taxonomic combinations and 216 comparisons.
Hypotheses compared                     No. comparisons   No. won   No. significant(a)   % significant
Nematode over deuterostome              108               89        27                   25
Deuterostome over nematode              108               19        0                    0
Nematode over acanthocephalan           27                27        23                   85
Acanthocephalan over nematode           27                0         0                    0
Nematode over mollusk                   81                81        81                   100
Mollusk over nematode                   81                0         0                    0
Nematode over other three hypotheses    216               197       131                  61
Other three hypotheses over nematode    216               19        0                    0
a The number of comparisons that were significant (P ≤ 0.05) using a one-tail binomial probability test.
FIGURE 5. Four-taxon statements examining the possibility of long-branch attraction. The three possible
topologies for three different four-taxon combinations are shown. The values along the branches represent the
percentage of expected internodal change along the branch as calculated from estimates based on a Jukes-Cantor
model. The numbers under the tree indicate the bootstrap support for the internal branch (out of 100
iterations); ML values indicate which topology in each group was supported by maximum likelihood. Branch
lengths shown are properly proportioned.
Because the three deuterostome taxa employed in the analysis were relatively slowly evolving, I also explored the four-taxon
situations using a sequence from the urochordate Herdmania momus (GenBank no.
X53538). Although the branch length of
Herdmania was considerably longer than
that of other deuterostomes, the chaetognath and nematode representatives still
clustered together.
When all nine topologies shown in Figure 5 were plotted against the results of
Huelsenbeck and Hillis (1993), they fell
within the region of graph space where
parsimony is consistent, i.e., they were not
in the Felsenstein zone. However, they fell
in the lower left of the graph space, and
others (Allard and Miyamoto, 1992; Huelsenbeck and Hillis, 1993) have shown that
even in the "consistent" graph space long-branch attraction can occur. Also, the three
groups shown in Figure 5 were representative of particular situations. Several other
groupings of taxa were examined with
similar results.
Evolutionary history, and not long-branch attraction, may account for the repeated recovery of the chaetognath-nematode clade. Not only do methods of
reconstruction that more reliably (than
parsimony) handle long branches support
the chaetognath-nematode hypothesis, but
this clade was well supported even when
the branch lengths of the chaetognaths and
nematodes were not significantly longer
than those of the other taxa.
DISCUSSION
In a review of chaetognath affinities,
Ghirardelli (1968) argued that chaetognaths are not closely related to any extant
metazoan phylum. This idea was reminiscent of Darwin's (1844) statement that species within the chaetognath genus Sagitta
will be remembered for the "obscurity of
their affinities." Because of this obscurity,
researchers have tried to relate chaetognaths to a variety of different metazoan
phyla. In this study, I examined four hypotheses of chaetognath affinities, using
independent data and more rigor than previously possible.
The results of all of the phylogenetic
analyses conducted herein support the hypothesis of a chaetognath-nematode relationship. Various weighting schemes, reconstruction methods, and statistical tests
(i.e., bootstrap analyses, Faith's T-PTP test,
and winning-sites tests) indicate that this
finding is robust. Schneider (1886, op. cit.
Ghirardelli, 1968) recognized a chaetognath-nematode taxon and placed both in
the phylum Nemathelminthes based on the
presumed similar arrangement of their
muscular bands. However, since Schneider,
the term Nemathelminthes has been used
to represent several different assemblages
of aschelminth or "pseudocoelomate" taxa,
including nematodes, nematomorphs, gastrotrichs, priapulids, rotifers, acanthocephalans (usually considered an acoelomate),
kinorhynchs, and loriciferans. Thus, to
avoid ambiguity inherent in the term Nemathelminthes, this term is not used here.
However, the node defined by the last
common ancestor of chaetognaths and
nematodes may include other metazoan
taxa that were not examined in this study
(e.g., Nematomorpha and Gastrotricha).
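As a rough illustration of the logic behind a winning-sites comparison, the sketch below assumes the test is a two-sided sign (binomial) test on the number of sites that require fewer changes on one tree than on the other, with tied sites ignored. The function name and the example counts are hypothetical and are not results from this study.

```python
from math import comb


def winning_sites_p(wins_a, wins_b):
    """Two-sided exact sign test for counts of sites favoring tree A versus tree B.

    Under the null hypothesis each non-tied site is equally likely to favor
    either tree, so the larger count follows a Binomial(n, 0.5) upper tail.
    """
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # no informative comparison possible
    k = max(wins_a, wins_b)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Hypothetical counts: 8 sites favor tree A, 2 sites favor tree B.
print(winning_sites_p(8, 2))
```

A small p-value would indicate that one tree is favored by significantly more sites than the other; with few differing sites the test has little power, so a nonsignificant result does not by itself show that two hypotheses are equally good.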
Based on the 18S rDNA sequence data,
the relative position of the chaetognath-nematode clade within the Metazoa is ambiguous and depends upon the method of
reconstruction and the weighting scheme
used during analysis. The lack of resolution of more ancient phylogenetic events is
presumably due to the short internal
branch lengths that resulted from the rapid divergence of major metazoan lineages
in the Precambrian era (Turbeville et al.,
1994; Halanych, 1995). Although additional data are needed to resolve the deep internal branches, my findings are consistent
with those of Telford and Holland (1993)
and Wada and Satoh (1994). A general comparison between the topologies found in these previous works (which focused mainly on deuterostome taxa) and the study here (which includes a wide range of metazoans) is not feasible because the taxonomic representation is too different. However, I did not find convincing evidence either for or against the interpretation that the chaetognath lineage arose prior to coelomate metazoans (Telford and Holland, 1993).
Controversial Characters
The disagreement among researchers on chaetognath affinities has resulted primarily from controversy over four characters: the coelomic condition, the absence of circular muscle, a tripartite Bauplan, and the fate of the blastopore. Of these four, the actual states, conditions, and importance of the first three have never been satisfactorily resolved. Hyman (1959) used embryological characters, including blastopore fate, as the primary reason for the designation of chaetognaths as deuterostomes. In chaetognaths, the blastopore is located posteriorly before it is lost, and the anus forms de novo (Hyman, 1959:32). The recent 18S rDNA findings suggest that the fate of the blastopore has been more variable across animal phyla than traditional dogma contends.
Debate continues over whether chaetognaths are coelomates or pseudocoelomates. The fact that chaetognaths do not satisfy Hyman's (1951a:23) classical definition of a coelom as both "bounded on all sides by tissue of entomesodermal origin and lined by peritoneum" accounts for much of the continued skepticism over the interpretation of chaetognaths as coelomates (Shinn and Roberts, 1994). Additionally, Shinn and Roberts argued that Hyman's definition is irrelevant because the coelomic lining of chaetognaths is ultrastructurally very similar to the coelomic lining in other coelomates (e.g., in many small polychaetes, pterobranchs, and enteropneusts and in the tentacular region of lophophorates). Welsch and Storch (1982) described the body cavity of the chaetognath Sagitta elegans as coelomate, based on the presence of a thin epithelium lining the body cavity. Thus, based on these recent ultrastructural studies, chaetognaths can clearly be considered eucoelomate.
If chaetognaths are eucoelomate, then either eucoely arose at least twice in metazoan evolution or the pseudocoel of nematodes is a derived eucoel. The discrepancy between morphology and phylogeny suggests that the conditions of acoely, pseudocoely, and eucoely are ecological in nature, not historical. Furthermore, there are several examples of metazoans that do not fit the standard coelomic pattern of their ancestors. For example, phoronids, enteropneusts, and cephalochordates often demonstrate both schizocoely and enterocoely in the same organism (Davis, 1908; Heath, 1917; Zimmer, 1964).
Scenario of Early Evolution
The putative relationship between nematodes and chaetognaths is not unreasonable from a functional morphology standpoint, and the evolution of a chaetognath lineage from a marine benthic vermiform organism is plausible. Before considering one of many possible evolutionary scenarios, a common origin of chaetognaths and nematodes, a few possible misconceptions need to be addressed. Although nematodes are arguably the most ubiquitous metazoan on the planet, "there are no genuine pelagic marine nematodes" (Hyman, 1951b:391). Most marine nematodes are benthic infauna, and their presence in the water column is limited. Thus, the most parsimonious assumption is that the chaetognath-nematode ancestor was a benthic organism (there are benthic chaetognaths, e.g., Spadella).
Second, the common perception of nematodes as a cigar-shaped organism with a smooth cuticle has largely been shaped by the extensive study of Caenorhabditis elegans. Many marine groups of nematodes are considerably different. The Epsilonematidae and the Draconematidae both have several free-living species that are covered with a combination of spines, stilt bristles, cuticular folds, and/or annuli.
Assuming that the common ancestor of chaetognaths and nematodes was benthic,
several morphological modifications would ensure a more successful invasion of a pelagic environment. Modifications that allow more efficient swimming or buoyancy control would be strongly favored. One simple way to increase buoyancy is to increase surface area without increasing volume. With the increasing surface area, drag also increases, slowing the rate of negative vertical displacement in the water column due to gravity. If this increase in surface area were properly shaped, it could also serve as a fin and could increase the efficiency of locomotion. Chaetognaths have well-developed fins, and some nematodes have caudal alae and bursae. A structure similar to an ala or bursa could have served as a rudimentary fin upon which selection acted to increase the surface area.
Modified feeding structures would allow food to be more easily captured. In an open pelagic environment, prey organisms often use locomotion to escape predators. Thus, in addition to improved swimming abilities, ancestral chaetognaths needed some type of modification to allow prey to be captured and quickly subdued. Throughout organismal evolution this problem has been repeatedly solved with enlarged and/or hardened mouth parts. Because both the cuticle of nematodes and the spines of chaetognaths are composed of chitin (Bone et al., 1983), it is easy to envision the modification of anterior spines similar to those observed in modern nematodes (e.g., the Draconematidae, see Hyman, 1951b: fig. 138). Similar to arrow worms, some nematodes possess head shields or cuticular folds into which the anterior end and associated structures can be withdrawn. Furthermore, the presence of teeth (e.g., oncholaims, mononchs) and hardened spinelike or spearlike structures (e.g., tylenchoids, dorylaims) are common in predatory nematodes.
Conclusions
The phylogenetic affinities of the chaetognaths have been debated since the discovery of these organisms. Traditionally, they have been considered deuterostomes, but recent molecular analyses suggest that this hypothesis is incorrect. The findings herein, based on 18S rDNA sequence data, indicate that the chaetognaths are most closely related to the nematodes. The long branch length of the chaetognaths and nematodes, relative to other metazoan branches, was one possible source of error in my phylogenetic reconstructions. However, four-taxon analyses suggest that the clustering of chaetognaths and nematodes was not merely due to long-branch attraction. Assuming that the chaetognath-nematode ancestor was a benthic organism, selective pressures may have acted to increase the surface area of the structures similar to alae and to modify the hardness and size of anterior spines, resulting in a chaetognath lineage.
ACKNOWLEDGMENTS
The following people provided helpful comments and valuable criticism of this manuscript: T. J. Robinson, J. E. Husti, J. P. Huelsenbeck, K. Crandall, C. Simon, and anonymous reviewers. This work was partially funded through a bursary from the FRD of The Republic of South Africa.
REFERENCES
ALLARD, M. W., AND M. M. MIYAMOTO. 1992. Testing phylogenetic approaches with empirical data, as illustrated with the parsimony method. Mol. Biol. Evol. 9:778-786.
BONE, Q., K. P. RYAN, AND A. L. PULSFORD. 1983. The structure and composition of the teeth and grasping spines of chaetognaths. J. Mar. Biol. Assoc. U.K. 63:929-939.
CASANOVA, J.-P. 1987. Deux Chaetognaths benthiques nouveaux du genre Spadella de parages de Gibraltar. Remarques phylogenetiques. Bull. Mus. Natl. Hist. Nat. Paris Ser. 4e Vol. 9 Sec. A 2:375-390.
DARWIN, C. 1844. Observations on the structure and propagation of the genus Sagitta. Ann. Mag. Nat. Hist. Ser. 1 13:1-6.
DAVIS, B. M. 1908. The early life history of Dolichoglossus pusillus, Ritter. Univ. Calif. Publ. Zool. 4:187-226.
DE RIJK, P., J.-M. NEEFS, Y. VAN DE PEER, AND R. DE WACHTER. 1992. Compilation of small ribosomal subunit RNA sequences. Nucleic Acids Res. 20(suppl.):2075-2089.
FAITH, D. P. 1991. Cladistic permutation tests for monophyly and nonmonophyly. Syst. Zool. 40:366-375.
FELSENSTEIN, J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27:401-410.
FELSENSTEIN, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17:368-376.
FELSENSTEIN, J. 1993. PHYLIP: Phylogeny inference package, version 3.5c. Department of Genetics, Univ. Washington, Seattle.
FIELD, K. G., G. J. OLSEN, D. J. LANE, S. J. GIOVANNONI, M. T. GHISELIN, E. C. RAFF, N. R. PACE, AND R. A. RAFF. 1988. Molecular phylogeny of the animal kingdom. Science 239:748-753.
GHIRARDELLI, E. 1968. Some aspects of the biology of the chaetognaths. Adv. Mar. Biol. 6:271-375.
GUNTHER, R. T. 1907. The Chaetognatha, or primitive Mollusca. Q. J. Microsc. Sci. 51:357-394.
HALANYCH, K. M. 1995. The phylogenetic position of the pterobranch hemichordates based on 18S rDNA sequence data. Mol. Phylogenet. Evol. 4:72-76.
HEATH, H. 1917. The early development of Pateria (Asterina) mineata. J. Morphol. 29:461-469.
HIGGINS, D. G., A. J. BLEASBY, AND R. FUCHS. 1992. CLUSTAL V: Improved software for multiple sequence alignment. Comput. Appl. Biosci. 8:189-191.
HILLIS, D. M., AND J. J. BULL. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42:182-192.
HILLIS, D. M., AND M. T. DIXON. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Q. Rev. Biol. 66:411-453.
HILLIS, D. M., AND J. P. HUELSENBECK. 1992. Signal, noise, and reliability in molecular phylogenetic analysis. J. Hered. 83:189-195.
HUELSENBECK, J. P., AND D. M. HILLIS. 1993. Success of phylogenetic methods in the four-taxon case. Syst. Biol. 42:247-264.
HYMAN, L. H. 1951a. The invertebrates: Platyhelminthes and Rhynchocoela. The acoelomate Bilateria, Volume II. McGraw-Hill, New York.
HYMAN, L. H. 1951b. The invertebrates: Acanthocephala, Aschelminthes, and Entoprocta. The pseudocoelomate Bilateria, Volume III. McGraw-Hill, New York.
HYMAN, L. H. 1959. The invertebrates: Smaller coelomic groups, Volume V. McGraw-Hill, New York.
JIN, L., AND M. NEI. 1990. Limitations of the evolutionary parsimony method of phylogenetic analysis. Mol. Biol. Evol. 7:82-102.
LAKE, J. A. 1987. A rate-independent technique for analysis of nucleic acid sequences: Evolutionary parsimony. Mol. Biol. Evol. 4:167-191.
MADDISON, W. P., AND D. R. MADDISON. 1992. MacClade: Analysis of phylogeny and character evolution, version 3. Sinauer, Sunderland, Massachusetts.
NEEFS, J.-M., AND R. DE WACHTER. 1990. A proposal for the secondary structure of a variable area of eukaryotic small ribosomal subunit RNA involving the existence of a pseudoknot. Nucleic Acids Res. 18:5695-5704.
NIELSEN, C. 1985. Animal phylogeny in the light of the trochaea theory. Biol. J. Linn. Soc. 25:243-299.
PRAGER, E. M., AND A. C. WILSON. 1988. Ancient origin of lactalbumin from lysozyme: Analysis of DNA and amino acid sequences. J. Mol. Evol. 27:326-335.
SHINN, G. L., AND M. E. ROBERTS. 1994. Ultrastructure of hatchling chaetognaths (Ferosagitta hispida): Epithelial arrangement of mesoderm and its phylogenetic implications. J. Morphol. 219:143-163.
SWOFFORD, D. L. 1993. PAUP: Phylogenetic analysis using parsimony, version 3.1.2d5. Illinois Natural History Survey, Champaign.
TELFORD, M. J., AND P. W. H. HOLLAND. 1993. The phylogenetic affinities of the chaetognaths: A molecular analysis. Mol. Biol. Evol. 10:660-676.
TURBEVILLE, J. M., J. R. SCHULZ, AND R. A. RAFF. 1994. Deuterostome phylogeny and the sister group of the chordates: Evidence from molecules and morphology. Mol. Biol. Evol. 11:648-655.
WADA, H., AND N. SATOH. 1994. Details of the evolutionary history from invertebrates to vertebrates, as deduced from the sequences of 18S rDNA. Proc. Natl. Acad. Sci. USA 91:1801-1804.
WELSCH, U., AND V. STORCH. 1982. Fine structure of the coelomic epithelium of Sagitta elegans (Chaetognatha). Zoomorphology 100:217-222.
WILLIAMS, P. L., AND W. M. FITCH. 1990. Phylogeny determination using dynamically weighted parsimony. Methods Enzymol. 183:615-626.
WILLMER, P. 1990. Invertebrate relationships: Patterns in animal evolution. Cambridge Univ. Press, New York.
ZIMMER, R. L. 1964. Reproductive biology and development of the Phoronida. Ph.D. Dissertation, Univ. Washington, Seattle.
Received 27 January 1995; accepted 20 December 1995
Associate Editor: Chris Simon
Note added in proof.—While this manuscript was in press, the phoronid sequence (U12648) was updated in GenBank. The changes to this sequence do not influence any of the conclusions herein.
APPENDIX
The 18S rDNA sequence data of the 21 taxa used in this study were aligned with Clustal V and corrected by hand. The regions excluded from the analyses, due to ambiguous alignment, are indicated by the heavy bar at the top of the sequence (positions 71-84, 130-155, 191-290, 310-337, 554-558, 700-899, 935-959, 1005-1014, 1178-1181, 1228-1235, 1525-1618, 1639-1663, 1729-1747, 1941-2000.) The secondary structure of the rDNA is based on the Saccharomyces cerevisiae model of De Rijk et al. (1992). The helical domains are denoted at the bottom of the alignment by the bars and the corresponding number. The open boxes under the alignment denote the variable regions (V) of the eukaryotic small ribosomal subunit as defined by Neefs and De Wachter (1990). Positions for which the exact nucleic acid is unknown are scored as question marks. The N's in the extreme 5' and 3' regions represent conserved positions for which the exact nucleic acids were not available in GenBank. N's, question marks, and gaps were all treated as missing data.
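The exclusion scheme described in the Appendix can be sketched in a few lines of Python. The position ranges below are transcribed from the Appendix text (1-based, inclusive); the function names and the short example sequence are hypothetical, and the treatment of N, question marks, and gaps as missing data follows the Appendix description. This is only an illustration of the bookkeeping, not the procedure actually used to prepare the data set.

```python
# Excluded alignment regions, 1-based and inclusive, as listed in the Appendix.
EXCLUDED_RANGES = [
    (71, 84), (130, 155), (191, 290), (310, 337), (554, 558),
    (700, 899), (935, 959), (1005, 1014), (1178, 1181), (1228, 1235),
    (1525, 1618), (1639, 1663), (1729, 1747), (1941, 2000),
]

# Characters scored as missing data (N, question mark, gap).
MISSING = set("N?-")


def excluded_positions(ranges=EXCLUDED_RANGES):
    """Return the set of all 1-based alignment positions that are excluded."""
    return {pos for lo, hi in ranges for pos in range(lo, hi + 1)}


def retained_columns(seq, excluded):
    """Return the retained characters of one aligned sequence.

    Excluded positions are dropped; missing-data characters become None.
    """
    return [
        None if ch.upper() in MISSING else ch.upper()
        for pos, ch in enumerate(seq, start=1)
        if pos not in excluded
    ]


excluded = excluded_positions()
print(len(excluded))                      # number of excluded positions
print(retained_columns("ACGTN", {2}))     # hypothetical 5-base example
```

With the ranges as listed, the excluded set contains 618 positions; the number of retained positions depends on the total alignment length, which is not assumed here.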
[Alignment figure: 18S rDNA sequence alignment, blocks covering positions 1-1200. Rows in each block: Phoronis, Paraspadella, Sagitta crassa, Artemia, Schistosoma, Sagitta elegans, Moniliform, Anemonia, Caenorhabditis, Eurypelma, Placopecten, Tenebrio, Opisthorchis, Scypha, Acanthopleura, Limicolaria, Haemonchus, Nematodirus, Strongylocentrotus, Branchiostoma, Saccoglossus. Annotations below the alignment: numbered helical domains (including E21-1 through E21-9) and variable regions V1-V4.]
aTTTTCTGGACT-TOAGGTAA-TGGT-AACAGAGACAGA-CGGGGGCATTCGTACTGCGACG-CTAGAGGTGAAATTCTTGGACCGT-GCAAGACGAACA
OTTTTCGGA—TCCGAAGTAA-TGGTTAAGAGGGACAGA-CGGGGGCATTTaTATGGCGGTG-TTAGAGGTGAAATTCTGGGATCGCCGCCAGACAAACT
Phoronis
Paraspadella
Sagitta crassa
Artemia
Schistosoma
Sagitta elegans
Moniliform
Anemonia
Caenorhabditis
Eurypelma
Placopecten
Tenebrio
Opisthorchis
Scypha
Acanthopleura
Limicolaria
Haemonchus
Nematodirus
Strongylocentrotus
Branchiostoma
Saccoglossus
I
I
[
Phoronis
Paraspadella
Sagitta crassa
Artemia
Schistosoma
Sagitta elegans
Moniliform
Anemonia
Caenorhabditis
Eurypelma
Placopecten
Tenebrio
Opisthorchis
Scypha
Acanthopleura
Limicolaria
Haemonchus
Nematodirus
Strongylocentrotus
Branchiostoma
Saccoglossus
1310
1320
27
1330
V5
1340
1350
1360
28
1370
26
1380
20
1390
29
29
30
31
32
33
34
35
AGCTGAAACTTAAAGGAATTGACOGAAGGGCACCACCAGGAGTO-AGC-TGCG-CTTAATTTGACTCAACACGGGAAAACTCACCCGGCCCGGACACOGC
AGCTGAAACTTAAAGGAATTGACGGAAGGGCACCACCAGOAGTOGAOCCTOCG-CTTAATTTGACTCAACACGGGAAAACTCACCCGGCCCGGACACAGT
AGCTGAAACTTAAAOGAATTOACGOAAGGGCACCACCAGGAGTaGCAG-TOCGGCTTAATTCGACTCAACACGGGAAATCTCACCCGGCCCGGACACTGT
AATCGAAACTTAAAGGAATTGACGGAGGGGCAC-ACCAGAAGTGGAGCCTGCGGCTCAATTTGACTCAACGCACGAAAACTTACCCGGCCCGAACACCGT
AGCTGAAACTTAAAGGAATTGACGGAAGGGCACC-CCAGGAGTGGAGCCTOCGGCTTAATTTGACTCAACACGGGAAACCTCACCAGGCCCGGA-ACTGG
26
TATOCCAACTaOQOATCCQTCOOTTOCCATTTQTAOOCTCOOCOaOCACCC-TACOOOAAACCA- -AAGTGAACAGGTTCCGGGGGGAGTATGGTTGCAA
OATOCCOACTAOOOATCAOAOAOTOTTA-TTGOATaACCTCTTTOOCACCT-TAOQOOAAACCA- -AAGTTTTTGGGTTTCGGGGGGAGTATGGTTGCAA
QATOCCATCTCQCOATTCQOAOO-OTT
TTTSCCCTaCCSAOaAOCT-ATCCOOAAACSA--AAGTCTTTCGGTTCCGGGGGTAOTATGOTTGCAA
OATOCCAACCAOCOATCCaCCTOAOTTCCTCAAATGACTCQOCOOOCAOCT-TCCaOQAAACCA- -AAGTGTTTGGGTTCCGGGGGAAGTATGGTTOCAA
OATOCCAACTAOCOATCCaCCaOAOTTOCTTCAATOACTCOOCOOOCAOCT-TACOOOAAACCA- -AAGTTTTTGGGTTCCGGGGGAAGTATGGTTGCAA
OATOCCAQCTAQCQATCCOCCQACQTTCCTCCSATOACTCOOCOOQCAQCT-TCCOOOAAACCA- -AAGCTTTTGGGTTCCGGGGGAAGTATGGTTGCAA
OATOCCAACTOACGATCCOTOaTOaCOCOATTATTOaCCCCOCOGOCAOCC-CCCOOOAAACCT: TAAGTCTTTGGGCTCCGGGGGAAGTATGGTTGCAA
OATQCCOACTAQQOATCQOTOQATOTTA-TTAAATOACTCCATCQOCACCT-TATOAOAAATCA- -AAGTTTTTGGGTTCCGGGGGGAGTATAATCGCAA
OATOCCAACTAOCQATCCQCCQOAOTTQTTTCAATOACTCOOCOGQCAOCT-TCCOOQAAACCA- -AAGTTTOTGGGTTCCGGGGGAAGTATGGTTGCAA
QATSCCOACTAOCOATCCOCAQOAOTTOCTTCOATQACTCTOCOOQCAGCT-TCCQaOAAACCA- -AAGTTTTTGGGTTCCGGGGGAAGTATGGTTGCAA
TATOCCATCTAQCOATCCOATOOOOT
ATAOTTOCCTTOTCOAOOAOCT-TCCCOOAAACOA- -AAGTCTTTCGGTTCCTGGGGTAGTATGGTTGCAA
TATOCCATCTAGCOATCCOATOGaOT
ATATTTOCCTTOTCOAOOAOCT-TCCCaOAAACSA- -AAGTCTTTCGGTTCCTGGGGTAGTATGGTTGCAA
OATOCCOACTOACOATCCOCCOGCOTTACTCCCATOACOCOOC-OOCAOTC-TAAQOOAAACCA- -AAGTCTTTGGGTTCCGGGGGAAGTATGGTTGCAA
OATGCCAACCAOCGATCCOCCGOCGTTACTTCOATOACCCOATOGOCAOCT-CCCOGGAAACCT- -GAGTTTTCGGGTTCCGGGGGAAGTATGGTTGCAA
GATGCCGACT—CGATCTOC-QOCGTTACTCTCTAOACCCOGC—GCAOCT—CC-OGAAACCA- -AAOTCTTTGOQTTCCGGOOQAAQTATQOTTGCAA
OATOCCAACCA-COATCCQCQQACQTTACTTOAATOACTCCOCQGQCAQCT-TCCOOQAAACCA—AAGTGTTT-GGTTCCGGGGGAAGTATGGTTGCAA
1220
1230
1240
1250
1260
1270
1280
1290
1300]
.]
.
.
.
.
.
.
.
GATGCCAACTGAGCATCCGCCGGAGTTGCTTCAATGACTCGGCGAGCAGCT-TCCGGGAAACCA—AAGTCTTTGGGTTCCGGGGGAAGTATGGTTGCAA
1210
4
4
1420
.
1430
.
1440
.
1450
.
1460
.
1470
.
1480
.
1490
.
1530
1540
1550
1570
1580
CACCGGTCCCGA—TGTGCCGGTGC
•
1560
1600]
GGCTTCT
GGCG
AGCTTCTCCTGCCG
1590
37
40
I
V7
V7
41
V7
V7
~|
GTTG
TGCTTCT
ACGAGACCCCAACCTGCTAACTAGCCCTCGGGTCC
GTGCCGATTCA
ACGAGACTCTAGCCTATTAATTAGG
TTACGCC
ACOAGACCTTAACCTOCTAAATAG
GCGAGACTCTAOCCTGCTAAATAOTTGGCGAATCTTC
GTCCCGATCAC—TTCTGTCGGGCG
ACOAGACTCTAGCCTACTAAATAGGC
CACCGATCCGCTCTGCGTCGGTGC
ACOAGACTCTAGCCTGCTAAATAGTT
GCCCGCTGGTCCCGGGTTCGCTCGGTGACCGTGCGCGGTTTTTAC
ACGAGACTCTAGCCTGCTAAATAGGCGTATTTCGACATCCCAAAG
ACGAGACTTTGGCCTGCTAAATAGTACGCCTGTCCTCTGTGCTCGTGCAGGTGGCGGTGCTCATTGCCTCTC—TGGGGTGATGGTGCCGTTCGCCGGCG
TTACGCG
ACGAGACCTTAACCTGCTAAATAG
CGCCGATCCCTGATGCGTCGGCGCC
ACOAGACTCTAGCCTATTAAATAOTT
CGCCGGTTCCTCGATCGCCGGCGC
ACGAGACTCTAOCCTATTAAATAGTT
GCGAOACTCTAGCCTGCTAAATAGTGGCTGGATTTTT
GCGAGACTCTAGCCTGCTAAATAGTGCCTGGATTTTT
ACGAGACTCTOGCTTGCTAAATAGTTGCGCCACCCCG
ACGAGACTCTGGCATGCTAACTAGTTGCGGCGATCCCG
T
ACGAGACTCTGGCTTQCTAAATAQTCGTG-COACCCT
1520
Sagitta elegans
Moniliform
Anemonia
Caenorhabditis
Eurypelma
Placopecten
Tenebrio
Opisthorchis
Scypha
Acanthopleura
Limicolaria
Haemonchus
Nematodirus
Strongylocentrotus
Branchiostoma
Saccoglossus
1510
OAGGATTGACAGACTGAGAGCTCTTTCTCGATTCGGTGGGTGGTGGTGCATGGCCGTTCTTAGTTGGTGGA-CGATCTGTCTGGTTAATTCCGATAACGA
ACGAGACTCTAGCCTGCTAAATAGTT
ACOAGACCCCGACCTGCTAACTAGCCCTCGGGTCC
ACGAGACCCCAACCTGCTAACTAGCCCTCGGGTCT
ACGAGACTCTAGCC-GCTAAATAaACGATGGATCCTA
us
.
GAG-J
AAGGATTGACAGATTGAGAGCTCTTTCTTGATTCAGTGGGTaGTGGTGCATGGC-GTTCTTAGTTGGTGGAGCGATTTGTCTGGTTAATTCCGATAACGA
1410
Phoronis
Paraspadella
Sagitta crassa
Artemia
[
Limicolaria
Haemonchus
Nematodirus
Strongylocentrotus
Acanthopleura
Sagitta elegans
Moniliform
Anemonia
Caenorhabditis
Eurypelma
Placopecten
Tenebrio
Opisthorchis
Scypha
[
[
Phoronis
Paraspadella
Sagitta crassa
Artemia
H
fi
Caenorhabditis
Eurypelma
Placopecten
Tenebrio
Opisthorchis
Scypha
Acanthopleura
Limicolaria
Haemonchus
Nematodirus
Strongylocentrotus
Branchiostoma
Saccoglossus
Schistosoma
Sagitta elegans
Moniliform
Anemonia
Phoronis
Paraspadella
Sagitta crassa
Artemia
Saccoglossus
Branchiostoma
Nematodirus
Strongylocentrotus
Acanthopleura
Limicolaria
Haemonchus
Placopecten
Tenebrio
Opisthorchis
Scypha
Caenorhabditis
Eurypelma
Phoronis
Paraspadella
Sagitta crassa
Artemia
Schistosoma
Sagitta elegans
Moniliform
Anemonia
1620
1630
1640
1650
1660
1670
1680
1690
•]
1700]
41
1710
V7
1720
1730
V7
1740
42
1750
V7
I
1760
40
1770
36
1780
1790
34
1800]
32
I
43^
V8
CTGCACGCGCGCTACAATGGAGGGCTCAG-AAAGCGT
CCGCACGCGCGCTACACTGACGATGTCAA-CGAGT
CTGCACGCGTGCTACACTGGTGGAGTCAG-CGGGTTT
CCGCACGCGCGCTACACTGAAGGAATCAG-CGTGTGC
CCGCACGCGCGCTACACTGAAGGAATCAA-CGTGCTC
CCGCACOCGCGCTACACTGAAGGAATCAG-CGTG
CCGCACGTGCGCTACAATGACGGTTTCAA-CGAGTTT
CCGCACGCGCGCTACACTGATGAAGTCAG-CGAOT
CCGCACGCGCGCTACACTGAAGGAATCAG-CGTGTGT
CCGCACACACGCTACACTGAAGGAATCAG-CGTGGAT
CTGCACGCGCGCTACAATGGAAGAATCAG-CTGGCCT
CTGCACGCGCGCTACAATGGAAOAATCAG-CTGGCCT
CCGCACGCGCCGTACACTGGCGGAATCCA-GCGGGTA
CCGCACGCGCGCTACAATGAAGGCATCAG-CGAGTCT
CCACOCQCGCO-TACACTOAAGGOATCAG-CQQQTQT
V8
44
V8
43
~1
45
AA-CGATTTCGACAGAAATCGG-CAATCA
TAAATAGCCTTCTTGATTGGGATC
CT-CTCCTTGGCCGAAAGGTCT-GGGTAATCTTCTCAAACATCGTCGTGCTGGGGATA
TTCCTATGCCGAAAGGTATC-GGTAAACCGTTGAAATTCTTCCATGTCCGGGATA
TT-TCCCT-GTCCGGTAGGACT-GGGTAACCCGTTCAACCTCCTTCGTGATAGGGATA
TT-ATCCTTGCCCGGAAGGGTT-GGGTAACCCGTTGAACCTCCTTCGTGCTAGGGATT
TC-CTCCCTGGCCGAGCGGCCC-GGGTAACCCGCTGAACCTCCTTCOTGCTAGGGATT
GGGATCCTAGCCCGAAAGGTTT-GGGTAAACTGAACCATAACCGTCGTGACTGGGATC
TC-TTCCTTCACCGATAGGTGT-GGGTAATCTTGTGAAACTTCATCGTGCTGGGGATA
CG-TTCCTGGCCCGGAAGGGCT-GGGTAACCCGTTGAACCTCCTTCGTGCTAGGGATT
GC-CTCCCTQGCCCGAAAGGTT-OGGAAACCCGTTGAATCTCCTTCGTGCTAGGGATT
ATCCATTGCCGAAAGGCATT-GGTAAACCGTTGAAACTCTTCCGTGACCGGGATA
ATCCATTACCGAAAGGTATT-GGTAAACCGTTGAAACTCTTCCGTGACCGGGATA
CACTGCCCTTGGCCGGAAGGTCT-GGGTAATCCGCTGAACCTCCTCCGTGATGGGGATA
TTCGCCTTCGCCGAAAGGTGC-GGGTAACCTGCTGAACCGCCTTCGTGCTAGGGATC
COCCTCCCTQGCCQACAQGACCCAGGCAATCCOATQAGCCCCCTTCOTQCTAOQQATA
_
;
CCGCACGTGCGCTACACTGAAGGGATCAGGCGTGCG
TC-TACCCTGGCCTGGAAGGTT-GGGTAACCCGTTGAACCCCCTTCGTGCTAGGGATT
CTGCACGCGCGCTACACTGAAGGCATCAG—GTGCGCTTTGTCGGTTCCCTGCCTGAAAAGGCT-GGGTAACCCGCTGAACCGCCTTCGTGCCTGGGATA
CCGCAC-CGCGCTACACTGGAAGAATCAG-CGCGTC
CTCCCTGTCCGAGAGGACC-GGGTAACC-GCT-GACCTCTTCCGTGGTTGGGATT
CCACACGTGCGCTACAATGACGGTGCCAG-CGAGTCT
GGGAACCTGGCCCGAAAGGGTT-GGGCAAACTGTTTCATCACCGTCGTGACTGGGATC
I
GGG-GATAACAGGTCTGTGATGCCCTTAGATGTTCGGGG
GGCCCGAOGGAATGATTCGCTTCTTAGAGGGACTCGCGGCGC-TAGCCGCACGAAGGG
GGAGCAATAACAGGTCTGTGATGCCCTTAGATGTTCGGGG
ATTAGTCGGCATTGTTAAACTTCTTAGAGG-ACAGGTGCTTCTTAAGCACACGAAGA
AGGCAATAACAGGTCTGTGATGCCCTTAGATGTTCTGGG
CATCGCGATGGGCAACTAACTTCTTAGAGGGACTGTTGGTGTTTAACCAAAGTCAGGA
GATTGAGCGATAACAGGTCTGTGATGCCCTTAGATGTCCGGGG
GGGTTCGTATAACTTCTTAGAGGGATAAGCGGTGTTTAGCCGCACGA
TTCTTCTTAGAGGGACAAATGGCGTTTAGCCGCACGA—GA-CAGAGCAATAACAGGTCTGAGATGCCCTTAGATGTCCGGGG
AACTTCTTAGAGGGACAAGTGGCGTTTAGCCACACGA—GA-TTGAGCAATAACAGGTCTGTGATGCCCTTAGATGTTAGGGG
TGTCGGCGTACAAACAATTCTTCTTAGAGGGACAGGCGGCTTCTAGCCGAACGA—GA-TTGAGCAATAACAGGTCTGTGATGCCCTTAGATGTTCTGGG
GGTGCGGC-CAGGTGTCTACTTCTTAGAGGGACAAGCGGCGTGC—CAGTCGCACGAAATTGAGCAATAACAGGTCTGTGATGCCCTTAGATGTCCGGGG
ATTCTCGAATCGCGGCCAACTTCTTAGAGGGACTATTGGTGTTTAACCGATGGAAGTT-TGAGGCAATAACAGGTCTGTGATGCCCTTAGATGTTCTGGG
AACTTCTTAGAGGGACAAGTGGCGTTTAGCCACACGA—AA-TTGAGCAATAACAGGTCTGTGATGCCCTTAGATGTTCGGGG
AACTTCTTAGAGGGACGAGTGGCGTTTAGCCA-ACGA—GA-TTGAGCAATAACAGGTCTGTGATGCCCTTAGATGTCCGGGG
GATTGAGCGATAACAGGTCTGTGATGCCCTTAGATGTCCGGGG
GAGTCCAGTCTACTTCTTAGAGGGATAAGCGGTGTTTAGCCGCACGA
GATTGAGCGATAACAGGTCTGTGATGCCCTTAGATGTCCGGGG
ACGTCCAGTCTACTTCTTAGAGGGATAAGCGGTGTTTAGCCGCACGA
CGGTGCGCGTCAACTTCTTAGAGGGACAAGTGGCGTTTAG-AG
ATTGAGCAATAACAGGTCTGTGATGCCCTTAGATGTTCGGGG
ATGGAGCAATAACAGGTCTGTGATGCCCTTAGATGTCCGGGG
CGCGATCGGCCGCAACTTCTTAGAGGGACAGCCGGCAGTAAGCCGGACGAG
ATTOAGCAATAACAQQTCTQTGATGCCCTTAQATQTCCGGGG
AACQTTGTCOGCGACCGAACTTCTTAGAOGOACAAQCQGCGTTCCCQAQ
AACTTCTTAGAGG-ACAAGCGCG—ATAGCCGCACGA—GA-TTGAGCAATAACAGGTCTGTGATGCCCTTAGATGTTCGGGG
GGGCGATAACAGGTCTGTGATGCCCTTAGATGTTCGGGG
GGCCCGACGGAATTATTCGCTTCTTAGAGGGACTCGCGGCGCCTAGCCGCACAAGGGG
GGGCGATAACAGGTCTGTGATGCCCTTAGATGTTCGGGG
GGCCCGACGAACTGATTCGCTTCTTAGAGGGACTGCGCGGCCCTAGCCGCACGAAGGG
GTGGTGGATCGCTCTTCTTAGAGGGACAAGTGGCGT-CAGCC
ATATGAGAGTGAGCAATAACAGGTCTGTGATGCCCTTAGATGTCCTGGG
1610
Acanthopleura
Limicolaria
Haemonchus
Nematodirus
Strongylocentrotus
Branchiostoma
Saccoglossus
Caenorhabditis
Eurypelma
Placopecten
Tenebrio
Opisthorchis
Scypha
Phoronis
Paraspadella
Sagitta crassa
Artemia
Schistosoma
Sagitta elegans
Moniliform
Anemonia
[
Caenorhabditis
Eurypelma
Placopecten
Tenebrio
Opisthorchis
Scypha
Acanthopleura
Limicolaria
Haemonchus
Nematodirus
Strongylocentrotus
Branchiostoma
Saccoglossus
Schistosoma
Sagitta elegans
Moniliform
Anemonia
[
Phoronis
Paraspadella
Sagitta crassa
Artemia
1840
1850
1860
1870
1880
1890
1920
1930
1940
1950
46
1960
1970
1980
30
1990
2000]
V9
V9
CTG-TCCGGGACTGAGCTGTCTCGAGAGGACTGCGGACTGCT
CTG-TCCGGGACTGAGCTGTCTCGAGAGGACTGCGGACTGCT
CTA-CTACCGATTGAATGGTTTAOTGAGATCCTCGGATCGTC
CTA-CTACCGATTGAATGGTTTAGTGAGGTCAGTCGATCGGC
CTA-CTACCGATTGAACGGTTTAGTGAGATCTTCGGATCGC
47
V9
GTATCGAGGCCTTCGGGTCGCGGTA
GTATCGAGGCTTTCGGGTCGCGGTA
GGCGTCGGGCTTGCGCCTCGCTCGCA
CCCTCTCGGGCCGGCAACGGTCTGGAG
ACGCGCGGGGACTGGTTCTCGGCCCTCG
V9
TGGCGGGAAACAGTT
TGGCGGGAAACAGTT
TGTACGAGAAGACGAT
GAGCGCCGAGAAGCCGAT
TGTGTCCGAGAAGACGAT
C6TOGTOT
9OTTOAAA6OTT6TT
CTA-CTACCOATTOAATOATTTAOT6AOAACTTCOQACaACTC6CCAGOGCAOCTCC6OOCQCT
CGTTGCC-GCTCGACTGA
TGCTGAGAAGATGAC
CTA-CTACCGATTGAATGGTTTAGTGAGGTCGTTGGATTGGTGTCGTTGTAGTGG
CTA-CTACCGATCGAACGATGTAGTGAGGTCCTCGGACTGGCATGTACTCGGAAGCCGGG—TTCGCTCGGTAGAAGAGCTTGTTTGCCGGGAAGAGAAC
GTTGTAT
TACCTAAAAGTTGGC
CTA-CTACCGATTGAATGGCTTAGTGAGGTCTACGGATAGGCTACAAGGTAGCCATCAGCTCT
GCCACGGAGCAGCGGAC
TGCCGAGAAGTTGTT
CTA-CTACCGATTGAATGGTTTAGTGAGGACTCCTGATTGGCGCCGCGCCCCG
CTA-TCCGGGACTGAACTGATTCGAGAAGAGTGGGGACTGTC
GCTTCGAGGTTTAACGACTTCGTTG
TTGCGGAAACCATTT
CGGAAAGCGATTGAC
CTA-CTACCGATTGAATGATTTAGTGAGGTCTTCGGATTGGCGCTCGGAGCGGCCGCAAGG-TCGCGCCGGCGTGC
CG
AGAAGACGAG
CTACCTACCGATTGATTGGTTTGGTGAGCTCCTCGGATTGGTCCCGACACGGGGGGCAACCCTCGAGTCGGTGCGC
GG
AAGATGAC
CTA-CTACCGATTGAATGATTTAGTGAGGTCTTCGGACCGGTACGCGGTGGCGTTTCGGCGTCGCCGATGTTGCTG
CTTCGGCAGCTCGACCGG
TGCTGAAAAGACGAC
CTA-CTACCGATTGAATGGTTTAGCAAGGTCCTCGGATTGGTGCCATTGTAGTTG
CTTGTGTTGCCGGACAC
AGC-GAGAAGTTGAT
CTA-CTACCGATTGAATGGTTTAGTGAGATCTTCGGATTGCTGGCCCGGCGGC
GCCGGTGCGC
CG
AGAAGTTGTT
CTA-CTACCGATTGAATGGTTTAOTGAGAGCCCCGGATTGGTCCCGGCATGGGGG7AACCTCC
CGAGAAGC
TOCT
CTA-CTACCGATTGAACGGTTTAGTGAGAGCCTCGGATTGGTCCTG-CATGGTGGGCAACCATCGCGCCGGTGTGC
CTA-CTACCOATCGAACOATOTAOTOAOOTCCTCOOACTQaCCACTACTCaOAACCCCaC—TCCOGTOOGOAOAAOAOCTTOCTTOCCOaOAAOAOOAT
1910
45
GGGGACTGCAAGGAT-CCCCATGAACCAGGAATCCCTAG-AGGCGCAAGTCATTAGCTTGCGTCGATTACGTCCCTGCCCTTTGTACACACCGCCCGTC-
1830
ON
47
V9
I
48
CAAACTTOATCCTTTAOAOOAAOTAAAAOTCO-AACAAOOTTTCCOTAOOTOAACCTQCOOAAQOATCATTA
CTAATTTOACTATTTAOAOOAAOTAAAAOTCOTAACAAOOTTTCCOTAOOTOAAiniNNiniNMNNiniHNMNlIM
CAAACTQAATCQTTTAOA<WAAGTAAAAQTCQTAACAA<WTTTCCQTAGQTQAACCTQCQGAAQOATCATTA
COAACTTTACCATTTAOAOOAAOTAAAAOTCOTAACAAOOTTTCCOTAOOTOAACCTQCOOAAOOATCATTA
CAAACTTQATCATTTAQAaOAAOTAAAAOTCOTAACAAOOTTTCCOTAQQTOAACCTOCQOAAOOATCATTA
TTATCOCATTQOTTTOAACCOQOTAAAAQTCQTAACAAQQTAQCTOTAOOTOAACCTOCAOCTOOATCATCO
CAAACTTSATCATTTASAOSAAQTAAAAOTCQTAACAAOOTTTCCOTAOOTOAACCTQCOOAAGOATCATTA
CAAACTTOATTATTTAOAQQAAGTAAAAOTCQTAACAAOQTTTCCOTAOOTOAACCTOCAOAAOOATCAAQC
CAAACTTOATCATTTAOAaOAAQTAAAAOTCQTAACAAOOTTTCCOTAQOTOAACCTOCQOAAOOATCATTA
CAAACTTaATCATTTAQAQOAAOTAAAAOTCOTAACAAOQTTTCCaTAOQTOAACCTaCOOAAOOATCATTQ
CAAACTTOATCATTTAaAaOAAOTAAAAOTCOTAACAAaOTTTCCOTAaGTOAACCTOCAOAAOOATCANNN
COAACTTOATCATTTAQAOOAAOTAAAAOTCOTAACAAOOTTTCCOTAOQTSAACCTOCQOAAOOATCANira
CaAACTCOATCOCTTOOAOAAAOTAAAAOTCGTAACAAOOTTTCCaTAOOTOAACCTaCOOAAaOATCAinra
CAATCOCAATOOCTTOAACCOGOTAAAAQTCaTAACAAQQTATCTOTAOOTOAACCTOCAOATOOATCATCO
CAATCOCAATGOCTTOAACCOOOTAAAAOTCOTAACAAaOTATCTOTAOOTOAACCTOCAOATOOATCATCO
CAAACTTOATCATTTAOAOOAAOTAAAAOTCGTAACAAOQTTTCCQTAOOTOAACCTOCAOAAOOATCNiraN
CAAACTTOACCATTTAOAQGAAOTAAAAOTCQTNBinnnnnnnnnnnmiraNNim
COAACTTOATCOTTTAQAQOAAGTAAAAOTCqTAACAAOQTTTC-OTAOGTOAACCTgCAOATOOATCMllNM
Paraspadella
Sagitta crassa
Artemia
Schistosoma
Sagitta elegans
Moniliform
Anemonia
Caenorhabditis
Eurypelma
Placopecten
Tenebrio
Opisthorchis
Scypha
Acanthopleura
Limicolaria
Haemonchus
Nematodirus
Strongylocentrotus
Branchiostoma
Saccoglossus
I
2030
2040
2050
2060
2070]
2020
]
.
.
.
.
.
.
.
CAAACTTOATCOTTTAaAOOAAOTAAAAaTCQTAACAAQQTTTCCQTNNNNNNNNNiniNNNNNNiraNNNNNN
2010
[
[
Phoronis