The Genome Of The Contractile Demosponge Tethya

bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
1
2
3
The genome of the contractile demosponge Tethya wilhelma and
the evolution of metazoan neural signalling pathways
Warren R. Francis1 , Michael Eitel1 , Sergio Vargas1 , Marcin Adamski2 ,
Steven H.D. Haddock 3 , Stefan Krebs4 , Helmut Blum4 , Dirk Erpenbeck1,5
Gert Wörheide1,5,6
1
4
Department of Earth and Environmental Sciences, Paleontology and Geobiology,
Ludwig-Maximilians-Universität München
Richard-Wagner Straße 10, 80333 Munich, Germany
2
Research School of Biology, College of Medicine, Biology & Environment,
Australian National University, Canberra ACT 0200 Australia
3
Monterey Bay Aquarium Research Institute, Moss Landing, CA 95039, USA
4
Laboratory for Functional Genome Analysis (LAFUGA), Gene Center,
Ludwig-Maximilians-University Munich, Munich, Germany
5
GeoBio-Center, Ludwig-Maximilians-Universität München, Munich, Germany
6
Bavarian State Collection for Paleontology and Geology, Munich, Germany
Abstract
18
Porifera are a diverse animal phylum with species performing important ecological roles in aquatic ecosystems, and have become models for multicellularity and early-animal evolution. Demosponges form the
largest class in sponges, but previous studies have relied on the only draft demosponge genome of Amphimedon queenslandica. Here we present the 125-megabase draft genome of a contractile laboratory demosponge
Tethya wilhelma, sequenced to almost 150x coverage. We explore the genetic repertoire of transporters, receptors, and neurotransmitter metabolism across early-branching metazoans in the context of the evolution
of these gene families. Presence of many genes is highly variable across animal groups, with many gene
family expansions and losses. Three sponge classes show lineage-specific expansions of GABA-B receptors,
far exceeding the gene number in vertebrates, while ctenophores appear to have secondarily lost most genes
in the GABA pathway. Both GABA and glutamate receptors show lineage-specific domain rearrangements,
making it difficult to trace the evolution of these gene families. Gene sets in the examined taxa suggest that
nervous systems evolved independently at least twice and either changed function or were lost in sponges.
Changes in gene content are consistent with the view that ctenophores and sponges are the earliest-branching
metazoan lineages and provide additional support for the proposed clade of Placozoa/Cnidaria/Bilateria.
19
Introduction
5
6
7
8
9
10
11
12
13
14
15
16
17
20
21
22
23
24
25
26
The presence of neurons is a defining character of animals, and is symbolic of their alleged superiority over all
other life on earth. Nonetheless, the four non-bilaterian phyla, Porifera, Placozoa, Ctenophora and Cnidaria,
are most different from other animals in their sensory systems and are often mistakenly referred to as “lower”
animals in common parlance, despite the fact that, like bilaterians, non-bilaterians exist at the tips of the
tree of life. Indeed, animals such as corals and sponges appear immobile or often unresponsive, challenging
early theorists in their ideas of what is and is not an animal. Yet we now know that representatives from all
four non-bilaterian phyla demonstrate dynamic responses to outside stimuli.
27
1
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
28
29
30
31
32
Neural evolution has been discussed previously in the context of paleontology (reviewed in [Wray et al.,
2015]) and metazoan phylogeny (reviewed in [Jékely et al., 2015]). Indeed, it has been suggested that many
features of bilaterian neurons and nervous systems represent separate, parallel evolutionary events from a
“simple” nervous system. A simple nervous system then must arise from proto-neurons [Schierwater et al.,
2009], however it is unclear what that might look like.
33
34
35
36
37
38
39
40
Several qualities can be used to define neurons or proto-neurons [Leys, 2015,Nickel, 2010] such as synapses,
electrical excitability, membrane potential, or secretory functions, though no single quality (and ultimately
gene set) solely defines such cells as neurons. Two non-bilaterian groups, ctenophores and cnidarians, are
thought to have true neurons. When considering the remaining two non-bilaterian phyla, sponges and
placozoans, many components of neural cells are found without any neuron-like cells having been identified [Srivastava et al., 2010, Riesgo et al., 2014a, Leys, 2015], although synapse-like structures have been
identified in placozoan fiber cells that show vesicles close to an osmophile contact [Grell and Benwitz, 1974].
41
42
43
44
45
46
47
48
49
50
Comparative analyses revealed a gradient of neural-like qualities indicating that “neuron-or-not” classifications are not straightforward. While ctenophores, cnidarians, and bilaterians have true neurons, structural and biochemical differences, [Moroz et al., 2014, Moroz, 2015] led to the proposition that neurons in
ctenophores and cnidarians may not be homologous, but rather separate evolutionary outcomes from neurallike precursor cells. Potentially, in the case of independent evolutions, neurons are “easy” to evolve, since it
involves co-expression of various pan-metazoan genetic modules in the same cell type. Alternatively, early
rudimentary signaling systems may have been energetically costly and not especially useful in pre-Cambrian
oceans, and in such cases, it may have been comparatively easy to lose such genes and with them neuronaltype cells.
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
Interpretation of neural evolution requires an accurate metazoan phylogeny, and the phylogenetic relationships of early-branching metazoans have been a topic of continued controversy. Some analyses support the
traditional phylogenetic position of sponges as sister group to all other metazoans (“Porifera-sister”) [Philippe
et al., 2009,Pick et al., 2010,Nosenko et al., 2013,Pisani et al., 2015,Simion et al., 2017] while others suggest
that Ctenophora are the sister group to all other animals (“Ctenophora-sister”) [Dunn et al., 2008, Ryan
et al., 2013, Whelan et al., 2015], and some analyses also recover the classical view, a Coelenterata clade
uniting Cnidaria and Ctenophora [Philippe et al., 2009, Simion et al., 2017]. Importantly, phylogenomic
analyses can be prone to systematic artifacts under some circumstances, depending on taxon sampling [Pick
et al., 2010, Philippe et al., 2011], gene set [Nosenko et al., 2013], phylogenetic model [Pisani et al., 2015], or
use of nucleotides instead of proteins [Jarvis et al., 2014]. Other methods based on presence or absence of
the genes themselves have been proposed to provide a sequence-independent inference of phylogeny [Ryan
et al., 2010, Ryan et al., 2013, Pisani et al., 2015], relying on the assumption that gene loss is a rare event.
However, non-bilaterians have the additional problem that basic knowledge of many aspects of their biology
is absent [Dunn et al., 2015], and so the biological context that may separate or unite groups is limited.
66
67
68
69
70
71
72
In the context of phylogeny, the branching order critically affects whether neurons evolved multiple times
or were lost (see schematic in Figure 1). Given the gradient of neural-like qualities, the actual evolutionary
scenario may be somewhere in between a simple gain-loss of neurons. While some previous studies have
focused on neural evolution in ctenophores [Ryan et al., 2013, Alberstein et al., 2015, Li et al., 2015] or
analysing the genomic data from A. queenslandica [Krishnan et al., 2014], these alone do not provide a
comprehensive picture of all animals.
73
74
75
76
77
78
79
80
Here we have sequenced the genome of the contractile laboratory demosponge Tethya wilhelma [Sara
et al., 2001] and examined the protein repertoire in the context of genes mediating the contraction, and
other neural-like functions. Many metabolic genes show unique expansions in different sponge clades, as well
as other phyla, making it challenging to clearly assign functions based on similarity to human proteins. We
consider these expansions in the context of phylogenetics, showing that even though sponges lack neurons,
signaling pathways have still expanded. This gives support to the hypothesis that early neural-like cells have
become neurons multiple times in the history of animals.
2
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
A
B
HAS
NEURONS
GAIN
C
LOSS
D
Bilateria
Cnidaria
Placozoa
Porifera
Ctenophora
Figure 1: Schematic of neural evolution depending on metazoan phylogeny The presence of neurons or neural-like cells in ctenophores, cnidarians, and bilaterians can be viewed differently depending on
the phylogeny. Two different metazoan phylogenies based on recent multi-gene phylogenetic analyses are
the source of the Porifera-sister (A,B) [Philippe et al., 2009, Pisani et al., 2015, Simion et al., 2017] and
Ctenophora-sister (C,D) [Ryan et al., 2013, Whelan et al., 2015] scenarios. Neurons can either have evolved
once requiring a secondary loss in sponges, placozoans, or both (A,C), or evolved twice, in ctenophores and
in cnidarians/bilaterians (B,D).
81
Results
82
Genome assembly and annotation
83
84
85
86
87
88
89
90
91
92
93
94
95
96
We generated a total of 61 gigabases of paired-end reads from a whole specimen of T. wilhelma (Figure 2)
and all associated bacteria. Because of a close association with microbes, some contigs were expected to
have derived from bacteria, as many reads have unexpectedly high GC content (Supplemental Figures 1-4).
After assembly and filtration of bacterial contigs, the final assembly was 125Mb, similar to A. queenslandica,
with a N50 value of 70kb (Supplemental Table 1). Gene annotation was done with a combination of a
deeply-sequenced RNAseq library from an adult sponge and ab initio gene predictions. Because of high
density of genes, extensive manual curation was often necessary to correct genes of the same strand that
were erroneously merged. After correction and filtering of the ab initio predictions, we counted 37,416
predicted genes, comparable with the counts in A. queenslandica (40,122) [Fernandez-Valverde et al., 2015]
and S. ciliatum (40,504) [Fortunato et al., 2014].
General trends in splice variation were similar between T. wilhelma and A. queenslandica (Supplemental
Tables 2 and 3), suggesting similar underlying biology or genome structure. One-to-one orthologs from T.
wilhelma and A. queenslandica had relatively low identity (Supplemental Figure 5), with the average identity
of 57.8%, showing a high genetic diversity within Porifera. The average identity is lower when compared to
3
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Figure 2: Contraction of a normal specimen (A) Maximally expanded state of Tethya wilhelma. (B)
The same specimen approximately an hour later in the most contracted state. (C) Cartoon view of most
contracted versus expanded state. Scale bar applies to all images. Photos courtesy of Dan B. Mills.
97
98
99
100
S. ciliatum (49.7%), N. vectensis (53.5%) and human (52.0%), which is not surprising given that A. queenslandica and T. wilhelma are both demosponges. Although both genomes are too fragmented to find syntenic
chromosomal regions, ordered blocks of genes are still identifiable between T. wilhelma and A. queenslandica
(Supplemental Figure 6), though not with S. ciliatum.
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
Neurotransmitter metabolism across early-branching metazoans
Compared to most other metazoans, sponges have a limited set of behaviors (contraction, closure of osculum
or choanocyte chambers to control flow), yet respond to many signaling molecules present in bilaterians [Ellwanger and Nickel, 2006, Ellwanger et al., 2007]. Some genes involved in vertebrate-like neurotransmitter
metabolism have been found in sponges [Riesgo et al., 2014a, Krishnan and Schiöth, 2015], although many
display a sister-group relationship to homologs found in other animals and appear to have a complex evolutionary history with duplications in sponges and other non-bilaterian animals (Figure 3, Supplemental
Figures 7-9), making the prediction of their functions difficult. Implicitly, presence of a gene usually means
a one-to-one orthology relationship with a functionally annotated protein, probably a human protein. Since
many of the non-bilaterian proteins in our set are many-to-many orthologs to human proteins with known
functions, declaring presence or absence of any individual gene or genetic module is not correct in the strictest
sense, as one-to-many or many-to-many orthologs are not the same gene. In such cases, it is not currently
possible to computationally predict which, if any, of the sponge orthologs shares its function with a human
protein.
116
117
118
119
120
121
122
123
For instance, biosynthesis of monoamine neurotransmitters (dopamine, serotonin, etc.) requires two
enzymes, tryptophan hydroxylase and tyrosine hydroxylase. These two enzymes appear to have arisen in
bilaterians from duplications of an ancestral phenylalanine hydroxylase [Cao et al., 2010], though evidence
is lacking as to whether this ancestral protein had multiple functions that specialized after duplication (subfunctionalization) or developed new functions (neofunctionalization) post-duplication. The absence of these
proteins in non-bilaterians seems to be ancestral; in other words, they had not evolved yet when these groups
split and diversified.
124
125
126
127
128
129
130
Among other non-bilaterians, some monoamine neurotransmitters are found in cnidarians [Carlberg and
Rosengren, 1985], but are mostly absent in ctenophores (or at detection limit) [Moroz et al., 2014]. Indeed,
previous studies were unable to find homologs of DOPA decarboxylase (AADC, Supplemental Figure 8),
dopamine β-hydroxylase (DBH, Supplemental Figure 7), monoamine oxidase (MAO, Supplemental Figure 9), or tyrosine hydroxylase (TH) in the genome of the ctenophore M. leidyi or any available ctenophore
transcriptome, and it was suggested that some of these proteins were absent in sponges as well (see Supple-
4
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
OH
O
O
GATP
Gamma-amino
butyric acid
Succinate
semialdehyde
Succinic acid
ABAT
Phenylpyruvate
COO-
GAD
4-hydroxyphenylpyruvate
O
HPD
COO-
KYAT
TAT
Phenylalanine
COONH3
Phenylalanine
catabolites
O
HO
Tyrosine
PAH
Standard
amino acids
4-hydroxyphenylacetate
Tyramine
COO-
HO
HGD
COO-
NH3+
+
NH3+
HO
Acetoacetate
+ Fumarate
DOPA
COO-
AADC
NH3+
HO
Dopamine
Dihydroxyphenylacetate
HO
HO
NH3+
HO
OH
COO-
HO
DOMA
COO
Catecholamine
degradation
products
Adrenaline
OH
VitC is a cofactor
HO
HGD
NH2Me+
AADC
OH
MeO
HO
HO
PNMT
HO
-
2
PAH
VMA
COMT
OH
HO
COO
2
2 3
2
TH
MAO
NH3+
HO
Enzyme requires O2
2 M
HPD
Homovanillate
MeO
COO-
HO
Noradrenaline
HO
NADH is a product
GLUD
TAT
DBH VitC
Catecholamine
neurotransmitters
M 2
KYAT
HO
TH
HO
GS
GATP
SSDH
-
DBH
M M
MAO
2
PNMT
COMT
2
M M M
Present
Homolog
Absent
Absent in transcriptome
Figure 3: Neurotransmitter overview across metazoans Summary schematic of neurotransmitter
biosynthesis and degradation pathways across early-branching metazoans. Each of the four non-bilaterian
groups is presumed to be monophyletic, although some individual trees of genes or gene families may display
alternate topologies. Bold letters refer to the enzymes. Individual gene trees that display the orthology of the
clades are found in the supplemental information. Arrows are shown in one direction though many reactions
can be reversible. (A) Glutamate and GABA metabolism from the citric acid cycle through the “GABA
shunt” pathway. Because ABAT is absent in ctenophores, GLUD and TAT are potentially alternatives to
convert α-ketoglutarate to glutamate. GATP can convert GABA to succinate semialdehyde, but this enzyme
was only found in some demosponges and plants. (B) Monoamine metabolism, excluding tryptophan. (C)
Table of presence-absence for genes in parts A and B. Presence (green) refers to a 1-to-1 ortholog where
orthology is clear from the tree position. Homolog (blue) refers to a sister group position in trees before
duplications with different or unknown functions. Secondary loss (red) refers to the gene missing in the
clade, but homologs are found in non-metazoan phyla. Numbers inside the boxes indicate copy number
specific to that group, M refers to multiple duplications within the group where the copy number is variable
among species. Abbreviations for clades are as follows: Cteno, ctenophores; Hsm, homoscleromorphs; Demo,
demosponges; Hexact, hexactinellids; Choano, choanoflagellates.
131
132
133
134
mentary Tables 17 and 19 in [Ryan et al., 2013]). However, we found orthologs of MAO and homologs of
AADC and DBH in several sponges, though it is unclear if they perform the same function as the human
proteins. Additionally, homologs of four enzymes, AADC, MAO, DBH, and ABAT, are present in singlecelled eukaryotes but not ctenophores, implying a secondary loss of these protein families in this phylum.
135
5
Loss
Plant
O
HO
NH2
B
SSDH
O
HO
Choano
ABAT
O
HO
*
Filasterea
GAD
2OG-enzymes
VitC
Hexact
O
NH2
Metazoa
OGDC-E1
SUCLG1
OH
Demo
O
HO
Calcarea
O
GLUD
OH
Hsm
GS
NH2
O
HO
Cteno
O
OH
Cnidaria
O
H 2N
C
Citric acid
cycle
Alpha-keto glutarate
Placozoa
O
(TAT)
Glutamic acid
Glutamine
Bilateria
A
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
GABA receptors
The neurotransmitter gamma-amino butyric acid (GABA) has been shown to affect contraction in T. wilhelma [Ellwanger et al., 2007] and the freshwater sponge E. muelleri [Elliott and Leys, 2010]. The genome of
T. wilhelma contains metabotropic receptors (GABA-B, mGABARs), but not ionotropic GABA receptors
(GABA-A, iGABARs). While humans have two mGABARs and the ctenophore M. leidyi has only one, the
T. wilhelma genome has nine. Sponges appear to have undergone a large expansion of this protein family
(Figure 4), similar to the expansion of glutamate-binding GPCRs previously observed in sponges [Krishnan
et al., 2014]. Based on the structure of the binding pocket of human GABAR-B1 [Geng et al., 2013], many
differences are observed across the mGABAR protein family, even showing that many residues involved in
coordination of GABA are not conserved between the two human proteins or all other animals (Supplemental Figure 11). Contrary to previous reports [Ramoino et al., 2010], we were unable to find normal
mGABARs in the two calcareous sponges S. ciliatum and L. complicata. Instead, in these two species, the
best BLAST hits from human GABA-B receptors (the putative mGABARs) had the best reciprocal hits to
Insulin-like growth-factor receptors. Structurally, this was due to the normal seven-transmembrane domain
being swapped with a C-terminal protein kinase domain (Figure 5), meaning these are not true metabotropic
GABA receptors. Similarly, in the filasterean Capsaspora owczarzaki, the N-terminal ligand binding domain
is also exchanged with other domains, suggesting as well that these are not true metabotropic GABA receptors.
154
155
156
157
158
159
160
161
162
163
164
Glutamate receptors
Glutamate is of particular interest as it is a key metabolic intermediate and the main excitatory neurotransmitter in animal nervous systems, acting on two types of receptors: the metabotropic glutamate receptors
(mGluRs) and the ionotropic ones (iGluRs). Some sponge species possess iGluRs, though these receptors
were absent in the transcriptomes of several demosponges [Riesgo et al., 2014a]. We were unable to find
iGluRs in the genome of T. wilhelma, in the genome and transcriptomes of any other demosponge, or in the
genomes of two choanoflagellates (M. brevicollis and S. rosetta). The top BLAST hits in demosponges have
a GPCR domain instead of the ion channel domain, indicating that these are not true iGluRs (Supplemental
Figure 13). Because the domain structure in plants is the same as most animal iGluRs, the ligand binding
domain was swapped out in demosponges.
165
166
167
168
169
170
171
172
The homoscleromorpha/calcarea clade appears to have an independent expansion of iGluRs (Supplemental Figure 12), though the normal ion transporter domain is switched with a SBP-bac-3 domain (PFAM
domain PF00497) compared to all other iGluRs (Supplemental Figure 13). Additionally, ctenophores and
placozoans appeared to have dramatic expansions of this protein family as well [Ryan et al., 2013, Moroz
et al., 2014,Alberstein et al., 2015], suggesting that a small set of iGluRs was present in the common ancestor
of eukaryotes and have diversified multiple times in both plants and animals, while other clades appear to
have modified or lost these proteins.
173
174
175
176
177
178
179
180
181
182
183
184
Vesicular transporters
Secretory systems are a common feature of all eukaryotes, as most cells have endoplasmic reticulum to secrete
proteins or make membrane proteins. Neurons secrete peptides (conceptually identical to any other protein)
or small-molecule neurotransmitters in a paracrine fashion, specifically to other neural cells. Compared to
peptides, small-molecule neurotransmitters need to be loaded into vesicles by dedicated transport proteins.
Vesicular glutamate transporters (VGluTs, SLC17A6-8) are part of a superfamily of transporters [Sreedharan et al., 2010] that carry glutamate, aspartate, and nucleotides. The position of sponge proteins in the tree
is inconsistent with a clear role in glutamate transport (Supplemental Figure 14), as several sponge clades
and ctenophores occur as sister group to multiple duplications. Transporters in sponges, ctenophores, and
choanoflagellates may well act upon glutamate or other amino acids, but this needs to be experimentally
investigated.
185
6
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Protostome
GABAR-B3 (7)
GABAR-B2
Placozoans (2)
Cnidarians (9)
Vertebrate
GABAR-B1 (4)
Filasterea (1)
Ctenophores (2)
Homoscleromorphs (1)
**
Hexactinellids (4)
*
Demosponges (4)
++
BS>80
BS>90
Figure 4: Metabotropic GABA receptors (GABA-B type) across metazoans Protein tree generated
with RAxML. Numbers in parentheses indicate the number of species from that group, so the 46 demosponge
mGABARs come from 4 species. Key bootstrap values are summarized as yellow or gray dots, for values of
90 or more, or 80 or more, respectively. Single star indicates sequences annotated as mGABARs in [Krishnan
et al., 2014], double plus indicates the clade annotated as “sponge specific expansion” in [Krishnan et al.,
2014]. For complete version with protein names and all bootstrap values, see Supplemental Figure 10
190
Similar to glutamate, GABA is loaded into vesicles with the vesicular inhibitory amino acid transporter
(VIAAT). Ctenophores, sponges, and placozoans lack one-to-one homologs of VIAAT (Supplemental Figure 15). Several other transporters are thought to transport GABA (ANTL or SLC6 class) and many other
amino acids. SLC6-class transporters, which transport diverse amino acids, are found in all non-bilaterian
groups, so the function of VIAAT may be redundant.
191
Glycine receptors
186
187
188
189
192
193
194
195
196
197
Glycine is known to affect the contraction of T. wilhelma [Ellwanger and Nickel, 2006]. Some ctenophore
iGluRs have been shown to bind glycine [Alberstein et al., 2015] due to the substitution of serine for arginine
(S687 in human GluN1), though this appears to be specific only to ctenophores, as essentially all other
iGluRs have the conserved serine/threonine at this position. Because no ionotropic glycine receptors could
be identified in the T. wilhelma genome (or any other sponges, ctenophores or placozoans), other proteins
may be responsible for mediating this effect.
198
7
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Vertebrate mGABARs
GABR1_HUMAN
ANF_receptor
7-transmembrane
Tyrosine kinase
GABR1_MOUSE
GABR2_HUMAN
Signal peptide
Sushi
TIG
Laminin G3
Human Receptor tyrosine kinases
INSRR_HUMAN
GABR2_MOUSE
Recep_L_domain
Furin−like
fibronectin3
IGF1R_HUMAN
Placozoans
Triad1_g72.t1__scaffold_1
Ctenophores
ML02335a−AUGUSTUS
Calcisponge mGABAR-like
scict1.027609.1_0
Sycon ciliatum
scict1.029082.1_0
Demosponges
Twilhelma_twi_ss.20824.4_2
lctid9879_0
Leucosolenia complicata
Aqu2.1.28011_001
aphrocallistes_vastus_comp16141_c0_seq1_1
Hexactinellids
lctid37767_0
rosella_fibulata_TR14675_c0_g1_i1_0
Capsaspora owczarzaki
oscarella_t_h60_42123_comp11388_c0_seq18
0
200
400
600
Homoscleromorph
800
1000
Filasterean
CAOG_05982T0
0
200
400
600
800
1000 1200
1400
Figure 5: Domain organization of GABA-B type receptors across metazoans
Scale bar displays number of amino acids. Top reciprocal BLAST hits to human for putative mGABARs
in calcareous sponges are INSRR and IGF1R, due to high-scoring hits to the tyrosine kinase domain. All
mGABARs from demosponges, glass sponges, and the homoscleromorph Oscarella carmela share the 7transmembrane domain (green) with mGABARs from other animals, while calcarean proteins have the same
ligand-binding domain (red) but instead have a protein tyrosine kinase domain (purple) at the C-terminus,
similar to growth-factor receptors. The filasterean Capsaspora owczarzaki has alternate domains at the
N-terminus.
199
200
201
202
203
204
205
206
207
208
209
210
Mechanical receptors
Some sponges can contract in response to mechanical agitation, as reported for the demosponges E. muelleri [Elliott and Leys, 2007] and T. wilhelma [Nickel, 2010]. Several diverse protein families appear to
be responsible for the sense of touch [Árnadóttir and Chalfie, 2010]. A subgroup of the TRP (transient
receptor-potential) channels, TRP-N, thought to mediate mechanosensation was determined to be absent
in sponges [Schuler et al., 2015], and we were unable to identify any in either T. wilhelma or S. ciliatum,
although other TRP-class channels were found [Ludeman et al., 2014, Schuler et al., 2015]. Because the
mechanosensory function of TRP channels may be redundant, we analysed for the presence of PIEZO, a
280kDa trimeric protein [Ge et al., 2015] involved in touch sensation in mammals [Coste et al., 2012]. Although two homologs were found in vertebrates, we found one copy in all other animals (Supplemental
Figure 16) as well as fungi, plants and most other eukaryotic groups, suggesting an ancient and conserved
function of this protein.
211
8
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
Voltage-gated channels
Voltage-gated ion channels are necessary for the propagation of electrical signals down axons and dendrites
[Zakon, 2012,Moran et al., 2015], and have specificities for sodium, potassium, or calcium. Previous analyses
were unable to clearly identify potassium or sodium channels in sponges [Liebeskind et al., 2011]; only one
partial potassium channel was found in the transcriptome of the homoscleromorph Corticium candelabrum
[Riesgo et al., 2014a,Li et al., 2015]. We were unable to find any voltage-gated sodium or potassium channels
in the genome or transcriptome of any sponge. We then examine voltage-gated hydrogen channels (hvcn1), as
these proteins have been found in a number of single-celled eukaryotes [Smith et al., 2011], and are extremely
conserved. These channels were found in all sponge groups, although the high protein identity resulted in a
poorly-resolved tree (Supplemental Figure 19).
Reports of action potentials in hexactinellids [Leys et al., 1999,Leys et al., 2007,Nickel, 2010] showed that
sponge action potentials were inhibited by divalent cations [Leys et al., 1999], suggesting a role of calcium
channels instead. Because voltage-gated sodium and calcium channels arose from a duplication event [Gur
Barzilai et al., 2012], the ion selectivity may be variable within this protein family. Most sponges have only a
single CaV-channel (Supplemental Figure 18) and several Hv-channels, and no voltage-gated channel of any
kind was found in any glass sponge. However, all glass sponge sequences are from transcriptomes, therefore
either the expression level of the true channels is low in glass sponges, or they have independently evolved
another mechanism to propagate action potentials.
230
9
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
231
Discussion
232
Gene content variation of metazoa
233
234
235
236
237
238
Among the thousands of genes in the genome, we focused on genes that may be mediating contractile behavior in T. wilhelma, and the interactions of those genes within broader metabolic pathways. Many of the
“housekeeping” genes in our study have lineage-specific duplications in at least one animal phylum. Considering the importance of “single-copy” proteins in phylogenetic analyses, as taxon sampling improves, it may
be found that very few or no genes are single copy across most or all animal phyla. Many other genes that
are critical for neural functioning in bilaterians have independent losses in other animal lineages (Figure 6).
239
Gene family expansions
Demo/Hexact
Calcarea/Hsm
mGluRs
mGABARs
MAO/PAOX-like
DBH-like
iGluRs
DBH-like *
Hv-channels *
Placozoans
Cnidarians
iGluRs
mGABARs
DBH-like
NaV-channels
VGluT
VIAAT
MAO/PAOX-like
DBH-like
Ctenophores
iGluRs
Kv-channels
Gene family losses
iGluRs
VIAAT
AADC-like
NaV-channels
Kv-channels
mGABARs *#
DBH-like *
NaV-channels
Kv-channels
ABAT
MAO
DBH-like
AADC-like
MAO
-
Figure 6: Summary of gene expansions and losses
Demo/Hexact refers to the clade of demosponges and glass sponges. Calcarea/Hsm refers to the unnamed
clade of calcareous sponges and homoscleromorphs. The star indicates that expansion or loss was found in
one class but not the other. Number sign indicates the domain rearrangement in Calcarea.
240
241
242
243
244
245
246
247
248
Glutamate and GABA receptor evolution
There is stark contrast in the relative abundance of mGABARs and iGluRs in sponges and ctenophores.
The relative dearth of mGABARs in ctenophores may reflect the apparently absence of amino-butyrate
amino-transferase (ABAT) in ctenophores, suggesting that ctenophores use an alternate pathway to produce
glutamate or metabolize GABA (Figure 3), rarely use GABA as a neurotransmitter, or simply are missing
this pathway. Other aminotransferases such as GLUD or TAT may perform some of the exchange between
α-ketoglutarate and glutamate, particularly as ctenophores have two copies of GLUD while most animals
have only one. Ctenophores also have multiple (variable) copies of glutamate synthase and three copies of
KYAT one of which may serve to balance glutamate metabolism in these animals.
249
250
251
252
253
254
There are two explanations for the diversity of mGABARs in sponges. Given the high variability of
amino acids in the mGABAR binding pocket (Supplemental Figure 11), it is plausible that many of these
receptors do not bind GABA at all, and have diversified for other ligands. There is precedent for this as it
was shown that the independent expansion of ctenophore iGluRs also included several key mutations to the
binding pocket which changed the ligand specificity of these proteins [Alberstein et al., 2015]. For the other
10
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
255
256
257
258
259
260
261
262
hypothesis, all of the receptors could bind GABA, essentially mediating the same contraction signal, but
their kinetics could differ and be influenced by factors such as, for instance, temperature. Because sponges
are mostly immobile, they often can be subject to environment variation in terms of light, oxygen, and
temperature. The possession of a set of proteins capable of triggering the same response (e.g. contraction)
with varying daily or seasonal environmental conditions (e.g. temperature) would be beneficial and may
explain the diverse set of receptors observed in sponges. Experimental characterization of these binding
domains is necessary and may even show that a combination of these hypotheses explains the diversification
of mGABARs in Porifera.
263
264
265
266
267
268
269
270
271
272
273
274
275
276
The apparent absence of true mGABARs in calcareous sponges (the genome of S. ciliatum and transcriptome of L. complicata) conflicts with a previous study that identified key proteins in the GABA pathway
by immunostaining [Ramoino et al., 2010]. The best mGABARs BLAST hits found in the two calcareous
sponges display a conserved ligand binding domain but the seven-transmembrane domain has been swapped
with a tyrosine kinase domain (Figure 5). Structural similarity of the conserved N-terminal domain may
result in a false-positive signal in studies using immunostaining with standard antibodies [Ramoino et al.,
2010]. On the other hand, compared to ctenophores, which apparently lack ABAT, this enzyme was found
in both of the calcareous sponges analyzed. Thus it would be surprising if these sponges had no capacity
to create or respond to GABA. Since true vertebrate-like mGABARs are found in all other sponge classes,
and our study could only examine two calcareous sponges, it could be that mGABAR presence is variable in
this class. The genome of S. ciliatum contains 40 proteins annotated as mGluRs [Fortunato et al., 2014], so
a third possibility is that even in the absence of true mGABARs, some of these proteins may have evolved
affinity for GABA and mediate its signaling in calcareous sponges.
277
278
279
280
281
282
283
284
285
286
Although a putative iGluR was identified in the transcriptome of the demosponge Ircinia fasciculata,
this sequence was only a fragment, so the glutamate affinity and domain structure could not be determined.
As with the mGABARs, the domain structure is different between the sponge classes. Otherwise, it appears
that only calcareous sponges and homoscleromorphs have NMDA/AMPA-like iGluRs. The presence of these
proteins in plants and other single-celled eukaryotes suggests that at least iGluRs were present in the common ancestor of all eukaryotes, and their absence in demosponges is likely the product of secondary losses.
In the context of contractions of T. wilhelma, the abundance of mGluRs and mGABARs could plausibly
work in antagonistic ways via the action of different G-proteins making ionotropic channels not necessary
for the modulation of this behavior.
287
288
289
290
291
292
293
294
Variation in neurotransmitter metabolism
Many of the oxidative enzymes in the monoamine pathway require molecular oxygen, suggesting an important role of this molecule both the synthesis of the neurotransmitters (with PAH, TH, and DBH) and
their inactivation (with MAO). Two catabolic pathways arise from tyrosine (Figure 3) and require oxygen
at nearly all steps. It is unclear why intermediate products of one of these two pathways, the catecholamine
pathway, became neurotransmitters and the other did not, particularly as hydroxyphenylpyruvate pathway
is universally found in animals and catabolic intermediates are likely to be ubiquitous.
295
296
297
298
299
300
301
302
303
304
MAO was found in most animal groups, but we were unable to find any in placozoans or ctenophores.
The topology of the MAO phylogenetic tree suggests a secondary loss of this protein in these phyla (Supplemental Figure 9). Related genes (PAOX, polyamine oxidase) were found in placozoans with several
placozoan-specific duplications, and again, potentially one of these may catalyze the oxidation of aromatic
amines. The analysis of these proteins also uncovered a clade including sponges, cnidarians, and lancelets,
though the function of these proteins cannot be predicted based on homology searches. In vitro characterization of these enzymes may reveal the function to provide evidence as to how these could have been
important for metabolism in early animals, and was subsequently replaced or lost in most other metazoan
lineages.
305
306
Remarkably, the DBH group has independent expansions in three sponge classes as well as placozoans
11
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
307
308
309
310
311
and cnidarians (Supplemental Figure 7). No DBH homologs were identified in calcareous sponges or in
ctenophores. A putative homolog of this group was found in the choanoflagellate M. brevicollis but not in
any other non-metazoan. The alignment and the phylogenetic position of the M. brevicollis protein suggest
that it may be a member of the copper-binding oxygenase superfamily, rather than a true homolog of DBH
(see Supplemental Alignment).
312
313
314
315
316
317
The presence of DBH-like and AADC-like enzymes in most animal groups suggests the possibility to
make phenylethanolamines (like octopamine or noradrenaline) from tyrosine, and then subsequently inactivate them with MAO. All demosponges appear to lack AADC, and ctenophores appear to lack both of
these enzymes calling into question a previous report of the detectability of monoamine neurotransmitters
in ctenophores [Carlberg and Rosengren, 1985].
318
319
Conserved properties of neurons
331
Neurons are generally defined by the presence of five key aspects: membrane potential, voltage-gated ion
channels, secretory pathways, ligand-gated ion channels, and cell-cell junctions to form synapses. Voltagegated channels, secretory systems, and ligand-gated ion channels are discussed above. Membrane potential
is maintained in animal cells by sodium-potassium pumps (ATP-ases), which are a class of cation pumps
exclusively found in animals [Stein, 1995, Sáez et al., 2009]. It is thought that such pumps are necessary
because animals are the only multicellular group that lacks any kind of cell wall, thus careful control of
ionic balance is necessary to resist osmotic stress [Stein, 1995]. For non-bilaterian animals, cell layers were
in direct contact with water, so potentially all cells needed this protein to function normally. Therefore
having neuron-like functionality is unlikely to rest upon the gain or loss of this gene. The last feature is
the presence of cell-cell connections. Many proteins involved in synapse structure or neurotransmission are
found in sponges, [Srivastava et al., 2010,Riesgo et al., 2014a,Moran et al., 2015,Leys, 2015] though it is not
clear which genes are necessary for neural functioning, or may have evolved independently.
332
Neural evolution and losses
320
321
322
323
324
325
326
327
328
329
330
333
334
335
336
337
338
339
340
Based on recent phylogenies, both Porifera-sister and Ctenophora-sister evolutionary scenarios require either
at least one loss of neurons or two independent gains (Figure 1) of this cell-type. The only scenario that
allows for a single evolution of neurons and no losses is the “Coelenterata” hypothesis (reviewed in [Jékely
et al., 2015]), which joins cnidarians and ctenophores in a clade. However, many molecular datasets [Dunn
et al., 2008, Ryan et al., 2013, Whelan et al., 2015, Pisani et al., 2015] and morphological evidence [Harbison,
1985] argue against this scenario (but also see [Philippe et al., 2009] and [Simion et al., 2017]). One other
alternative is that placozoans have an unidentified neuron-like cell in a Porifera-sister context, which would
therefore allow for a single origin of neural systems in animals and no losses.
341
342
343
344
345
346
347
What do the two different scenarios mean for evolution of neuronal cells? Considering the basic properties
of neurons related to electrical signaling or secretory pathways, it had been shown before that many of the
genes involved are universally found in animals. A single origin and multiple losses implies that the genetic
toolkit necessary for all of these functions was present in the same single-celled organism or the same cell
type (an hypothetical proto-neuron) of the last common ancestor of crown-group Metazoa, and either that
cell type was lost or its functions were split up.
348
349
350
351
352
353
354
355
356
357
Sponges and ctenophores both appear to have lost several gene families (Figure 6), though ctenophores
nonetheless have neural cells. Thus, the losses of the GABA or monoamine pathways are not critical for the
functioning of neural cell types overall. However, voltage-gated potassium and sodium channels are thought
to be essential for the propagation of electrical signals down axons and dendrites and have been found in
all animal groups except sponges [Moran et al., 2015]. The NaV-channel tree shows a single origin of this
protein family (Supplemental Figure 17), and presence of these channels in choanoflagellates suggests they
were present in the common ancestor of all animals; the apparent absence in sponges therefore is probably a
secondary loss. By comparison, ctenophores have a mostly-unique expansion of Kv-channels relative to the
rest of metazoans [Li et al., 2015] and a duplication in NaV channels. Together with the loss of this protein
12
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
358
359
family in sponges, the gene content argues for a combination of both multiple, independent gains and a loss
of neural-type cells and their associated functions across animals.
360
361
362
363
364
365
366
367
368
369
370
371
Properties of the earliest metazoans are unknown, including life cycle or number of cell types, but it is
most parsimonious that the first obligate multicellular animals did not have anything resembling a modern
bilaterian nervous system [Wray et al., 2015]. Yet, the genomic evidence shows that these animals have cellular capacity to respond to environmental or paracrine signals, regulate the cell internal ion concentrations
and respond to changes in their concentrations, and secrete small molecules that could serve as effectors
in unconnected (but proximal) cells. Thus, the earliest animals likely had the capacity to develop nerve
cells using the genetic toolkit they possessed, though the number of times this occurred is unclear. This
capacity appears to have been lost in sponges with the loss of voltage-gated channels. As we were unable to
find putative genes to mediate action potentials in glass sponges, either all of the four transcriptomes were
incomplete or the unique action potentials of glass sponges may represent a third case of the evolution of
neural-like functions in Metazoa.
372
373
Methods
374
Sequence data
375
376
Project overview can be found at spongebase.net. Reference data from the demosponge Tethya wilhelma
are available at: https://bitbucket.org/molpalmuc/tethya_wilhelma-genome
377
378
379
Raw genomic reads for T. wilhelma are available on NCBI SRA under accession numbers SRR2163223
(genomic reads), SRR2296844 (mate pairs), SRR5369934 (DNA Moleculo), and SRR4255675 (RNAseq).
380
381
Genome assembly
382
Processing and assembly
383
384
385
386
387
388
We generated 25Gb of 100bp paired-end Illumina reads of genomic DNA and 35Gb of 125bp Illumina gel-free
mate-pair reads. Contigs were assembled with SOAPdenovo2 [Luo et al., 2012] using a kmer of 83bp. We
also generated 436Mb of Moleculo synthetic long reads. Because both haplotypes are represented in the
Moleculo reads, we merged the Moleculo reads using HaploMerger [Huang et al., 2012]. Contigs and merged
Moleculo reads were then scaffolded using the gel-free mate-pairs with SSPACE [Boetzer et al., 2011] and
BESST [Sahlin et al., 2014]. The first draft assembly had 7,947 contigs, totaling 145 megabases.
389
390
391
392
393
394
395
396
Removal of low-coverage contigs
To examine the completeness of the genome, we generated a plot of kmer coverage against GC percentage for
the contigs (Supplemental Figure 1) using custom Python scripts (available at http://github.org/wrf/lavaLampPlot).
This revealed 1,040 contigs with a coverage of zero that were carried over from the Moleculo reads and were
not assembled (Supplemental Figure 2), accounting for 6 megabases. As these reads likely derived from
bacterial contamination in the aquarium water, these contigs were removed, leaving 6,907 contigs totalling
138 megabases.
397
398
399
400
401
402
Separation of bacterial contigs
Additionally, the plot revealed many contigs with lower coverage (20x-90x) and high GC content (50-75%)
suggesting the presence of bacteria (Supplemental Figure 3). Because many of these contigs were shorter
than 10kb, separation of the bacterial contigs was done through several steps. We found 4,858 contigs with
mapped RNAseq reads and GC content under 50%, as expected of metazoans. These contigs accounted for
13
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
403
404
405
406
407
408
409
410
411
88% of the sponge assembly, or 121 megabases. For the 2,014 contigs with no mapped RNAseq, we used
blastn to search the contigs against the A. queenslandica scaffolds and all complete bacterial genomes from
Genbank (5,242 sequences). Based on subtraction of bitscores, 62 contigs were identified as sponge and 565
were identified as bacterial. For the remaining 1,387 contigs, most of which were under 10kb, we repeated
the search with tblastx against A. queenslandica scaffolds and the genomes of Sinorhizobium medicae and
Roseobacter litoralis, which were the most similar complete genomes to the two bacterial 16S rRNAs identified in the contigs. After all sorting, 798 putative bacterial contigs accounted for 12.7 megabases and were
separated to bring the total to 6,109 sponge contigs. Contigs for the two bacteria were binned by tetranucleotide frequency using MetaWatt [Strous et al., 2012] (Supplemental Figure 4).
412
413
414
415
416
417
418
419
420
421
422
Genome coverage and completeness
Coverage was estimated two ways: kmer frequency and read mapping. Kmers of 31bp were counted using
the Jellyfish kmer counter [Marçais and Kingsford, 2011] and analyzed using custom Python and R scripts
(“fastqdumps2histo.py” and “jellyfish gc coverage blob plot v2.R”, available at http://github.org/wrf/
lavaLampPlot). As expected, the kmer distribution showed two peaks, one for kmers at heterozygous
positions and one for homozygous positions, whereupon the coverage peak was at 131-fold coverage for homozygous positions. Because of sequencing errors, this method often underestimates coverage, and so to
confirm this estimate we then mapped all reads to the genome using Bowtie2 [Langmead and Salzberg, 2012].
The sum of mapped reads divided by the total length provided an estimated coverage of 159-fold physical
coverage.
423
424
425
426
427
Of the original reads, 185 million (86.5%) mapped back to the assembled sponge contigs. Completeness
for gene content was assessed with BUSCO [Simão et al., 2015], whereupon we found 728 (86%) complete
genes and 42 (4.9%) predicted-incomplete genes. Overall, these data suggest that the genome assembly is
adequate for downstream analyses.
428
429
Genome annotation
430
Transcriptome versions
431
432
433
434
435
436
The transcriptome for T. wilhelma was assembled de novo using Trinity (release r20140717) [Grabherr et al.,
2011, Haas et al., 2013]. Default parameters were used, except for strand specific assembly, in silico read
normalization, and trimming (–SS lib type RF –normalize reads –trimmomatic). This produced 127,012
transcripts with an average length of 913bp. Assembled transcripts were mapped to the genomic assembly
using GMAP [Wu and Watanabe, 2005] to produce a GFF file of the transcript mapping. Of these, 114,744
transcripts were mapped 166,847 times, allowing for multiple mappings.
437
438
439
440
441
For the genome-guided transcriptome, strand-specific RNAseq reads were mapped against the genome
build using Tophat2 v2.0.13 [Kim et al., 2013] using strand-specific mapping (option –library-type frfirststrand) and otherwise default parameters. Mapped reads were then joined into transcripts using StringTie
v1.0.2 [Pertea et al., 2015] with default parameters.
442
443
444
445
446
447
448
449
450
451
452
Additionally, ab initio gene models were predicted using AUGUSTUS [Stanke et al., 2008]. AUGUSTUS
was trained on the webAugustus server [Hoff and Stanke, 2013] using the highest expression transcripts for
each Trinity component and the assembled contigs. This identified 27,551 putative genes. The majority of
these overlapped partially or completely with a predicted gene based on the Trinity mapping or Stringtie
genes. However, 3,866 genes (4,321 transcripts) had no overlap with any predicted exon from either the
Trinity or StringTie set, and were kept for the final set. Considering the possibility that some of these may
be pseudogenes, we aligned these proteins to the SwissProt database with BLASTP [Camacho et al., 2009].
Of these, only 759 had reliable hits (E-value < 10−5 ) to 688 proteins. The annotated functions were diverse,
including proteins similar to many receptors and large structural proteins such as fibrillin (potentially any
protein with EGF repeats), dynein heavy chain, and titin; because very large proteins may be split across
14
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
454
multiple contigs, the predicted genes may be only fragments of the full gene. Only 42 of the hits were against
transposable elements.
455
Filtering of the final gene set
453
456
457
458
459
460
461
462
Because assembly of transcripts for both StringTie and Trinity relies on overlaps in the genome or RNAseq
reads, genes that overlap in the untranslated regions (UTRs) can sometimes be erroneously fused. For
StringTie, we developed a custom Python script to separate non-overlapping transcripts belonging to the
same “gene” (stringtiesplitgenes.py, available at https://bitbucket.org/wrf/sequences/). Tandem duplications can lead to RNAseq reads bridging the two tandem copies and result in both copies being called
the same gene. The original StringTie set contained 46,572 transcripts for 32,112 genes, while the corrected
set contained 33,200 genes and identified 1,088 new non overlapping genes.
463
464
465
466
467
468
469
Positional errors in the genome or allelic variations may result in some RNAseq reads not mapping
to the genome, so some genes are fragmented in the genome-guided transcriptome but not the de novo
assembly. Making use of the protein predictions from TransDecoder, we compared the predicted proteins between the two transcriptomes using a custom Python script (transdecodersplitgenes.py, available
at https://bitbucket.org/wrf/sequences/). This identified 406 StringTie transcripts that were better
modeled by Trinity transcripts.
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
Functional gene annotation
Many genes of functional importance were examined manually, and the best transcript from StringTie, Trinity, or AUGUSTUS was retained for the final gene set. In the GFF and fasta versions of the transcriptomes,
names of protein functions were assigned several ways. Target genes that were manually curated and edited,
such as those used in all trees, are named by the generic function or the annotated function of the closest
human protein. For instance, the dopamine beta-hydroxylase (DBH) homolog in T. wilhelma was manually corrected, and the position in a phylogenetic tree demonstrated that demosponges diverged before the
duplication which created DBH and the two DBH-like proteins in humans, thus the T. wilhelma protein
is annotated as DBH-like. Secondly, automated ortholog finding pipelines (HaMStR [Ebersberger et al.,
2009]) used for phylogeny [Cannon et al., 2016] have identified homologs in T. wilhelma, which have been
manually checked based on positions in the phylogenetic trees. Thirdly, single-direction BLAST results were
kept as annotations provided that the BLAST hit had a bitscore over 1000, or a bitscore over 300 and the
T. wilhelma protein covered at least 75% of the best hit against the human protein dataset from SwissProt.
The bitscore and length cutoffs were applied to reduce the number of annotations based on a single domain.
485
486
487
488
489
490
491
492
493
494
495
496
497
Analysis of splice variation
Using the transcriptome from StringTie, splice variation was assessed using a custom Python script (splicevariantstats.py, available at https://bitbucket.org/wrf/sequences/). In this script, several ambiguous
definitions were clarified to define the different splice types. Firstly, single exon genes with no variants are
distinguished from single exon genes with variations, that is, a gene with two exons can have a variant with
one exon. For loci with only two transcripts, the canonical or main transcript is defined as the one with
the higher expression level, as measured by the higher FPKM value reported from StringTie. For loci with
three or more transcripts, main or canonical exons are those included in at least two transcripts. A cassette
exon must occur in less than 50% of the transcripts for a locus, otherwise such case is defined as a skipped
exon. A retained intron is any portion that exactly spans two other exons; for highly expressed transcripts
this may include erroneously retained introns due to intermediates in splicing. A summary of the splicing
types is displayed in Supplemental Table 2.
498
499
500
501
Intron retention was recently reported to be a common mode of alternative splicing in A. queenslandica [Fernandez-Valverde et al., 2015]. We found 3,295 transcripts with 3,565 retained intron events
(Supplemental Table 2). We then analyzed the length of the retained introns and found the phase of the
15
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
502
503
retained piece to be randomly distributed (unlike cassette exons, Supplemental Table 3), suggesting that
many of the retained introns result from incomplete splicing rather than functional retention.
504
505
506
507
508
509
510
511
512
513
Microsynteny across sponges
Putative synteny blocks were identified using a custom Python script (microsynteny.py, available at https:
//bitbucket.org/wrf/sequences/). Briefly, the script combines the gene positions on scaffolds for both
the query and the reference with BLASTX hits for the query against the reference. If a minimum of three
genes in a row on a query scaffold match to different genes on the same reference scaffold, the group is
kept. By default, this mandated a gap of no more than five genes before discarding the block, and that the
next gene must occur within 30kb. This method was designed to work for highly fragmented genomes with
thousands of scaffolds, so the order and direction of the corresponding genes on the reference scaffold do not
need to match those of the query scaffold.
514
515
516
517
518
519
StringTie transcripts for T. wilhelma were aligned against the A. queenslandica v2.0 protein set with
BLASTX [Camacho et al., 2009], and positions were taken from the accompanying A. queenslandica v2.0
GTF. The same procedure was attempted against the S. ciliatum gene models, though essentially no syntenic blocks were detected, indicating either substantial differences in gene content or gene order between
demosponges and calcareous sponges.
520
521
522
523
524
525
526
527
Collection of reference data
Proteins for Oikopleura dioica [Denoeud et al., 2010] were downloaded from Genoscope. Gene models
for Ciona intestinalis [Dehal et al., 2002], Branchiostoma floridae [Putnam et al., 2008], Trichoplax adherens [Srivastava et al., 2008], Capitella teleta, Lottia gigantea, Helobdella robusta [Simakov et al., 2013],
Saccoglossus kowalevskii [Simakov et al., 2015], and Monosiga brevicollis [King et al., 2008] were downloaded
from the JGI genome portal. Gene models for Sphaeroforma arctica, Capsaspora owczarzaki [Suga et al.,
2013] and Salpingoeca rosetta [Fairclough et al., 2013] were downloaded from the Broad Institute.
528
529
530
531
532
We used genomic data of the cnidarians Nematostella vectensis [Moran et al., 2014], Exaiptasia pallida [Baumgarten et al., 2015], and Hydra magnipapillata as well as transcriptomes from 33 other cnidarians [Bhattacharya et al., 2016, Zapata et al., 2015, Pratlong et al., 2015, Brinkman et al., 2015, Ponce et al.,
2016], mostly corals.
533
534
535
536
537
538
539
540
541
542
543
For demosponges, we used the genome of Amphimedon queenslandica [Srivastava et al., 2010, FernandezValverde et al., 2015] and transcriptomic data from: Mycale phyllophila [Qiu et al., 2015], Petrosia ficiformis [Riesgo et al., 2014a], Crambe crambe [Versluis et al., 2015], Cliona varians [Riesgo et al., 2014b], Halisarca dujardini [Borisenko et al., 2016], Crella elegans [Pérez-Porro et al., 2013], Stylissa carteri, Xestospongia testutinaria [Ryu et al., 2016], Scopalina sp., and Tedania anhelens. We used data from the genome of the
calcareous sponge Sycon ciliatum [Fortunato et al., 2014] and the transcriptome of Leucosolenia complicata.
For hexactinellids (glass sponges), we used transcriptome data from Aphrocallistes vastus [Ludeman et al.,
2014], Hyalonema populiferum, Rosella fibulata, and Sympagella nux [Whelan et al., 2015]. For homoscleromorphs, we used two transcriptomes from Oscarella carmela and Corticium candelabrum [Ludeman et al.,
2014].
544
545
546
547
548
We used data from the two published draft genomes of ctenophores [Ryan et al., 2013,Moroz et al., 2014],
as well as transcriptome data from 11 additional ctenophores: Bathocyroe fosteri, Bathyctena chuni, Beroe
abyssicola, Bolinopsis infundibulum, Charistephane fugiens, Dryodora glandiformis, Euplokamis dunlapae,
Hormiphora californensis, Lampea lactea, Thalassocalyce inconstans, and Velamen parallelum.
549
550
We used data from the unpublished draft genome of a novel placozoan species, designated H13.
551
16
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
552
553
554
555
556
557
558
559
560
561
Gene trees
For protein trees, candidate proteins were identified by reciprocal BLAST alignment using blastp or tblastn.
All BLAST searches were done using the NCBI BLAST 2.2.29+ package [Camacho et al., 2009]. Because
most functions were described for human, mouse, or fruit fly proteins, these served as the queries for all
datasets. Candidate homologs were kept for analysis if they reciprocally aligned by blastp to a query protein, usually human. Alignments for protein sequences were created using MAFFT v7.029b, with L-INS-i
parameters for accurate alignments [Katoh and Standley, 2013]. Phylogenetic trees were generated using either FastTree [Price et al., 2010] with default parameters or RAxML-HPC-PTHREADS v8.1.3 [Stamatakis,
2014], using the PROTGAMMALG model for proteins and 100 bootstrap replicates with the “rapid bootstrap” (-f a) algorithm and a random seed of 1234.
562
563
564
565
566
567
Domain annotation
Domains for individual protein trees were annotated with “hmmscan” v3.1b1 from the HHMER package [Eddy, 2011] using the PFAM-A database v27.0 [Finn et al., 2016] as queries. Signal peptides were predicted using the stand-alone version of SignalP v4.1 [Petersen et al., 2011]. Domain structures were visualized
using a custom Python script, “pfampipeline.py”, available at https://github.com/wrf/genomeGTFtools .
568
569
Acknowledgments
575
W.R.F would like to thank K. Achim, M. Nickel, J. Musser, J. Ryan, and I. Oldenburg for helpful comments.
G.W and D.E. would like to thank M. Nickel for providing the initial T. wilhelma specimens to set up the
culture in Munich. This work was supported by a LMUexcellent grant (Project MODELSPONGE) to D.E.
and G.W. as part of the German Excellence Initiative, partially by research grant 9278 (“Early evolution
of multicellular sponges”) from VILLUM FONDEN to G.W., and NIH grant NIGMS-5-R01-GM087198 to
S.H.D.H. The authors declare no competing interests.
576
References
570
571
572
573
574
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
References
[Alberstein et al., 2015] Alberstein, R., Grey, R., Zimmet, A., Simmons, D. K., and Mayer, M. L.
(2015). Glycine activated ion channel subunits encoded by ctenophore glutamate receptor genes.
Proceedings of the National Academy of Sciences, 112(44):E6048–E6057.
[Árnadóttir and Chalfie, 2010] Árnadóttir, J. and Chalfie, M. (2010). Eukaryotic Mechanosensitive
Channels. Annual Review of Biophysics, 39(1):111–137.
[Baumgarten et al., 2015] Baumgarten, S., Simakov, O., Esherick, L. Y., Liew, Y. J., Lehnert, E. M.,
Michell, C. T., Li, Y., Hambleton, E. a., Guse, A., Oates, M. E., Gough, J., Weis, V. M., Aranda,
M., Pringle, J. R., and Voolstra, C. R. (2015). The genome of Aiptasia , a sea anemone model for
coral symbiosis. Proceedings of the National Academy of Sciences, page 201513318.
[Bhattacharya et al., 2016] Bhattacharya, D., Agrawal, S., Aranda, M., Baumgarten, S., Belcaid, M.,
Drake, J. L., Erwin, D., Foret, S., Gates, R. D., Gruber, D. F., Kamel, B., Lesser, M. P., Levy,
O., Liew, Y. J., MacManes, M., Mass, T., Medina, M., Mehr, S., Meyer, E., Price, D. C., Putnam,
H. M., Qiu, H., Shinzato, C., Shoguchi, E., Stokes, A. J., Tambutté, S., Tchernov, D., Voolstra,
C. R., Wagner, N., Walker, C. W., Weber, A. P., Weis, V., Zelzion, E., Zoccola, D., and Falkowski,
P. G. (2016). Comparative genomics explains the evolutionary success of reef-forming corals. eLife,
5:1–26.
[Boetzer et al., 2011] Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D., and Pirovano, W. (2011).
Scaffolding pre-assembled contigs using SSPACE. Bioinformatics, 27(4):578–579.
17
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
[Borisenko et al., 2016] Borisenko, I., Adamski, M., Ereskovsky, A., and Adamska, M. (2016). Surprisingly rich repertoire of Wnt genes in the demosponge Halisarca dujardini. BMC evolutionary
biology, 16(1):123.
[Brinkman et al., 2015] Brinkman, D. L., Jia, X., Potriquet, J., Kumar, D., Dash, D., Kvaskoff, D.,
and Mulvenna, J. (2015). Transcriptome and venom proteome of the box jellyfish Chironex fleckeri.
BMC Genomics, 16(1):407.
[Camacho et al., 2009] Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer,
K., and Madden, T. L. (2009). BLAST+: architecture and applications. BMC bioinformatics,
10:421.
[Cannon et al., 2016] Cannon, J. T., Vellutini, B. C., Smith, J., Ronquist, F., Jondelius, U., and
Hejnol, A. (2016). Xenacoelomorpha is the sister group to Nephrozoa. Nature, 530(7588):89–93.
[Cao et al., 2010] Cao, J., Shi, F., Liu, X., Huang, G., and Zhou, M. (2010). Phylogenetic analysis
and evolution of aromatic amino acid hydroxylase. FEBS Letters, 584(23):4775–4782.
[Carlberg and Rosengren, 1985] Carlberg, M. and Rosengren, E. (1985). Biochemical basis for adrenergic neurotransmission in coelenterates. Journal of Comparative Physiology B, 155(2):251–255.
[Coste et al., 2012] Coste, B., Xiao, B., Santos, J. S., Syeda, R., Grandl, J., Spencer, K. S., Kim, S. E.,
Schmidt, M., Mathur, J., Dubin, A. E., Montal, M., and Patapoutian, A. (2012). Piezo proteins are
pore-forming subunits of mechanically activated channels. Nature, 483(7388):176–181.
[Dehal et al., 2002] Dehal, P., Satou, Y., Campbell, R. K., Chapman, J., Degnan, B., De Tomaso, A.,
Davidson, B., Di Gregorio, A., Gelpke, M., Goodstein, D. M., Harafuji, N., Hastings, K. E. M., Ho,
I., Hotta, K., Huang, W., Kawashima, T., Lemaire, P., Martinez, D., Meinertzhagen, I. a., Necula,
S., Nonaka, M., Putnam, N., Rash, S., Saiga, H., Satake, M., Terry, A., Yamada, L., Wang, H.-G.,
Awazu, S., Azumi, K., Boore, J., Branno, M., Chin-Bow, S., DeSantis, R., Doyle, S., Francino, P.,
Keys, D. N., Haga, S., Hayashi, H., Hino, K., Imai, K. S., Inaba, K., Kano, S., Kobayashi, K.,
Kobayashi, M., Lee, B.-I., Makabe, K. W., Manohar, C., Matassi, G., Medina, M., Mochizuki, Y.,
Mount, S., Morishita, T., Miura, S., Nakayama, A., Nishizaka, S., Nomoto, H., Ohta, F., Oishi, K.,
Rigoutsos, I., Sano, M., Sasaki, A., Sasakura, Y., Shoguchi, E., Shin-i, T., Spagnuolo, A., Stainier,
D., Suzuki, M. M., Tassy, O., Takatori, N., Tokuoka, M., Yagi, K., Yoshizaki, F., Wada, S., Zhang,
C., Hyatt, P. D., Larimer, F., Detter, C., Doggett, N., Glavina, T., Hawkins, T., Richardson,
P., Lucas, S., Kohara, Y., Levine, M., Satoh, N., and Rokhsar, D. S. (2002). The draft genome
of Ciona intestinalis: insights into chordate and vertebrate origins. Science (New York, N.Y.),
298(5601):2157–2167.
[Denoeud et al., 2010] Denoeud, F., Henriet, S., Mungpakdee, S., Aury, J.-M., Da Silva, C.,
Brinkmann, H., Mikhaleva, J., Olsen, L. C., Jubin, C., Canestro, C., Bouquet, J.-M., Danks, G.,
Poulain, J., Campsteijn, C., Adamski, M., Cross, I., Yadetie, F., Muffato, M., Louis, A., Butcher,
S., Tsagkogeorga, G., Konrad, A., Singh, S., Jensen, M. F., Cong, E. H., Eikeseth-Otteraa, H.,
Noel, B., Anthouard, V., Porcel, B. M., Kachouri-Lafond, R., Nishino, A., Ugolini, M., Chourrout,
P., Nishida, H., Aasland, R., Huzurbazar, S., Westhof, E., Delsuc, F., Lehrach, H., Reinhardt, R.,
Weissenbach, J., Roy, S. W., Artiguenave, F., Postlethwait, J. H., Manak, J. R., Thompson, E. M.,
Jaillon, O., Du Pasquier, L., Boudinot, P., Liberles, D. a., Volff, J.-N., Philippe, H., Lenhard, B.,
Crollius, H. R., Wincker, P., and Chourrout, D. (2010). Plasticity of Animal Genome Architecture
Unmasked by Rapid Evolution of a Pelagic Tunicate. Science, 1381(2010).
[Dunn et al., 2008] Dunn, C. W., Hejnol, A., Matus, D. Q., Pang, K., Browne, W. E., Smith, S. a.,
Seaver, E., Rouse, G. W., Obst, M., Edgecombe, G. D., Sørensen, M. V., Haddock, S. H. D.,
Schmidt-Rhaesa, A., Okusu, A., Kristensen, R. M., Wheeler, W. C., Martindale, M. Q., and Giribet,
G. (2008). Broad phylogenomic sampling improves resolution of the animal tree of life. Nature,
452(7188):745–9.
[Dunn et al., 2015] Dunn, C. W., Leys, S. P., and Haddock, S. H. D. (2015). The hidden biology of
sponges and ctenophores. Trends in Ecology & Evolution, pages 1–10.
[Ebersberger et al., 2009] Ebersberger, I., Strauss, S., and von Haeseler, A. (2009). HaMStR: profile
hidden markov model based search for orthologs in ESTs. BMC evolutionary biology, 9:157.
18
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
[Eddy, 2011] Eddy, S. R. (2011). Accelerated profile HMM searches. PLoS Computational Biology,
7(10).
[Elliott and Leys, 2007] Elliott, G. R. D. and Leys, S. P. (2007). Coordinated contractions effectively
expel water from the aquiferous system of a freshwater sponge. Journal of Experimental Biology,
210(21):3736–3748.
[Elliott and Leys, 2010] Elliott, G. R. D. and Leys, S. P. (2010). Evidence for glutamate, GABA and
NO in coordinating behaviour in the sponge, Ephydatia muelleri (Demospongiae, Spongillidae). The
Journal of experimental biology, 213:2310–2321.
[Ellwanger et al., 2007] Ellwanger, K., Eich, A., and Nickel, M. (2007). GABA and glutamate specifically induce contractions in the sponge Tethya wilhelma. Journal of Comparative Physiology A:
Neuroethology, Sensory, Neural, and Behavioral Physiology, 193(1):1–11.
[Ellwanger and Nickel, 2006] Ellwanger, K. and Nickel, M. (2006). Neuroactive substances specifically
modulate rhythmic body contractions in the nerveless metazoon Tethya wilhelma (Demospongiae,
Porifera). Frontiers in zoology, 3:7.
[Fairclough et al., 2013] Fairclough, S. R., Chen, Z., Kramer, E., Zeng, Q., Young, S., Robertson,
H. M., Begovic, E., Richter, D. J., Russ, C., Westbrook, M. J., Manning, G., Lang, B. F., Haas,
B., Nusbaum, C., and King, N. (2013). Premetazoan genome evolution and the regulation of cell
differentiation in the choanoflagellate Salpingoeca rosetta. Genome biology, 14(2):R15.
[Fernandez-Valverde et al., 2015] Fernandez-Valverde, S. L., Calcino, A. D., and Degnan, B. M. (2015).
Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene
annotation in the sponge Amphimedon queenslandica. BMC Genomics, 16(1):1–11.
[Finn et al., 2016] Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L.,
Potter, S. C., Punta, M., Qureshi, M., Sangrador-Vegas, A., Salazar, G. A., Tate, J., and Bateman,
A. (2016). The Pfam protein families database: towards a more sustainable future. Nucleic Acids
Research, 44(D1):D279–D285.
[Fortunato et al., 2014] Fortunato, S. a. V., Adamski, M., Ramos, O. M., Leininger, S., Liu, J., Ferrier,
D. E. K., and Adamska, M. (2014). Calcisponges have a ParaHox gene and dynamic expression of
dispersed NK homeobox genes. Nature, 514(7524):620–623.
[Ge et al., 2015] Ge, J., Li, W., Zhao, Q., Li, N., Chen, M., Zhi, P., Li, R., Gao, N., Xiao, B., and Yang,
M. (2015). Architecture of the mammalian mechanosensitive Piezo1 channel. Nature, 527(5):64–69.
[Geng et al., 2013] Geng, Y., Bush, M., Mosyak, L., Wang, F., and Fan, Q. R. (2013). Structural
mechanism of ligand activation in human GABA(B) receptor. Nature, 504(7479):254–9.
[Grabherr et al., 2011] Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. a.,
Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N.,
Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N.,
and Regev, A. (2011). Full-length transcriptome assembly from RNA-Seq data without a reference
genome. Nature biotechnology, 29(7):644–52.
[Grell and Benwitz, 1974] Grell, K. G. and Benwitz, G. (1974). Spezifische Verbindungsstrukuren der
Faserzellen von Trichoplax adhaerens F.E. Schulze. Z. Naturforsch., 29c:790.
[Gur Barzilai et al., 2012] Gur Barzilai, M., Reitzel, A. M., Kraus, J. E. M., Gordon, D., Technau, U.,
Gurevitz, M., and Moran, Y. (2012). Convergent Evolution of Sodium Ion Selectivity in Metazoan
Neuronal Signaling. Cell Reports, 2(2):242–248.
[Haas et al., 2013] Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden,
J., Couger, M. B., Eccles, D., Li, B., Lieber, M., Macmanes, M. D., Ott, M., Orvis, J., Pochet,
N., Strozzi, F., Weeks, N., Westerman, R., William, T., Dewey, C. N., Henschel, R., Leduc, R. D.,
Friedman, N., and Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq
using the Trinity platform for reference generation and analysis. Nature protocols, 8(8):1494–512.
[Harbison, 1985] Harbison, G. R. (1985). On the classification and evolution of the Ctenophora. In
Conway Morris, S. C., George, J. D., Gibson, R., and Platt, H. M., editors, The Origin and Relationships of Lower Invertebrates, pages 78–100. Clarendon Press, Oxford.
19
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
[Hoff and Stanke, 2013] Hoff, K. J. and Stanke, M. (2013). WebAUGUSTUS–a web service for training
AUGUSTUS and predicting genes in eukaryotes. Nucleic Acids Research, 41(W1):W123–W128.
[Huang et al., 2012] Huang, S., Chen, Z., Huang, G., Yu, T., Yang, P., Li, J., Fu, Y., Yuan, S., Chen,
S., and Xu, A. (2012). HaploMerger: Reconstructing allelic relationships for polymorphic diploid
genome assemblies. Genome Research, 22(8):1581–1588.
[Jarvis et al., 2014] Jarvis, E. D., Mirarab, S., Aberer, A. J., Li, B., Houde, P., Li, C., Ho, S. Y. W.,
Faircloth, B. C., Nabholz, B., Howard, J. T., Suh, A., Weber, C. C., da Fonseca, R. R., Li, J., Zhang,
F., Li, H., Zhou, L., Narula, N., Liu, L., Ganapathy, G., Boussau, B., Bayzid, M. S., Zavidovych, V.,
Subramanian, S., Gabaldon, T., Capella-Gutierrez, S., Huerta-Cepas, J., Rekepalli, B., Munch, K.,
Schierup, M., Lindow, B., Warren, W. C., Ray, D., Green, R. E., Bruford, M. W., Zhan, X., Dixon,
A., Li, S., Li, N., Huang, Y., Derryberry, E. P., Bertelsen, M. F., Sheldon, F. H., Brumfield, R. T.,
Mello, C. V., Lovell, P. V., Wirthlin, M., Schneider, M. P. C., Prosdocimi, F., Samaniego, J. A.,
Velazquez, A. M. V., Alfaro-Nunez, A., Campos, P. F., Petersen, B., Sicheritz-Ponten, T., Pas, A.,
Bailey, T., Scofield, P., Bunce, M., Lambert, D. M., Zhou, Q., Perelman, P., Driskell, A. C., Shapiro,
B., Xiong, Z., Zeng, Y., Liu, S., Li, Z., Liu, B., Wu, K., Xiao, J., Yinqi, X., Zheng, Q., Zhang, Y.,
Yang, H., Wang, J., Smeds, L., Rheindt, F. E., Braun, M., Fjeldsa, J., Orlando, L., Barker, F. K.,
Jonsson, K. A., Johnson, W., Koepfli, K.-P., O’Brien, S., Haussler, D., Ryder, O. A., Rahbek, C.,
Willerslev, E., Graves, G. R., Glenn, T. C., McCormack, J., Burt, D., Ellegren, H., Alstrom, P.,
Edwards, S. V., Stamatakis, A., Mindell, D. P., Cracraft, J., Braun, E. L., Warnow, T., Jun, W.,
Gilbert, M. T. P., and Zhang, G. (2014). Whole-genome analyses resolve early branches in the tree
of life of modern birds. Science, 346(6215):1320–1331.
[Jékely et al., 2015] Jékely, G., Paps, J., and Nielsen, C. (2015). The phylogenetic position of
ctenophores and the origin(s) of nervous systems. EvoDevo, 6(1):1.
[Katoh and Standley, 2013] Katoh, K. and Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular biology and evolution, 30(4):772–80.
[Kim et al., 2013] Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S. L.
(2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and
gene fusions. Genome biology, 14(4):R36.
[King et al., 2008] King, N., Westbrook, M. J., Young, S. L., Kuo, A., Abedin, M., Chapman, J.,
Fairclough, S., Hellsten, U., Isogai, Y., Letunic, I., Marr, M., Pincus, D., Putnam, N., Rokas, A.,
Wright, K. J., Zuzow, R., Dirks, W., Good, M., Goodstein, D., Lemons, D., Li, W., Lyons, J. B.,
Morris, A., Nichols, S., Richter, D. J., Salamov, A., Sequencing, J. G. I., Bork, P., Lim, W. a.,
Manning, G., Miller, W. T., McGinnis, W., Shapiro, H., Tjian, R., Grigoriev, I. V., and Rokhsar,
D. (2008). The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans.
Nature, 451(7180):783–8.
[Krishnan et al., 2014] Krishnan, A., Dnyansagar, R., Almén, M. S., Williams, M. J., Fredriksson,
R., Manoj, N., and Schiöth, H. B. (2014). The GPCR repertoire in the demosponge Amphimedon
queenslandica: insights into the GPCR system at the early divergence of animals. BMC Evolutionary
Biology, 14(1):270.
[Krishnan and Schiöth, 2015] Krishnan, A. and Schiöth, H. B. (2015). The role of G protein-coupled
receptors in the early evolution of neurotransmission and the nervous system. The Journal of
experimental biology, 218(Pt 4):562–571.
[Langmead and Salzberg, 2012] Langmead, B. and Salzberg, S. L. (2012). Fast gapped-read alignment
with Bowtie 2.
[Leys, 2015] Leys, S. P. (2015). Elements of a ’nervous system’ in sponges. The Journal of experimental
biology, 218(Pt 4):581–91.
[Leys et al., 1999] Leys, S. P., Mackie, G. O., and Meech, R. W. (1999). Impulse conduction in a
sponge. The Journal of experimental biology, 202 (Pt 9)(June 1997):1139–1150.
[Leys et al., 2007] Leys, S. P., Mackie, G. O., and Reiswig, H. M. (2007). The Biology of Glass Sponges.
Advances in Marine Biology, 52(06):1–145.
20
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
[Li et al., 2015] Li, X., Liu, H., Chu Luo, J., Rhodes, S. a., Trigg, L. M., van Rossum, D. B., Anishkin,
A., Diatta, F. H., Sassic, J. K., Simmons, D. K., Kamel, B., Medina, M., Martindale, M. Q., and
Jegla, T. (2015). Major diversification of voltage-gated K\n +\n channels occurred in ancestral
parahoxozoans. Proceedings of the National Academy of Sciences, page 201422941.
[Liebeskind et al., 2011] Liebeskind, B. J., Hillis, D. M., and Zakon, H. H. (2011). Evolution of sodium
channels predates the origin of nervous systems in animals. Proceedings of the National Academy of
Sciences of the United States of America, 108(22):9154–9159.
[Ludeman et al., 2014] Ludeman, D. A., Farrar, N., Riesgo, A., Paps, J., and Leys, S. P. (2014).
Evolutionary origins of sensation in metazoans: functional evidence for a new sensory organ in
sponges. BMC Evolutionary Biology, 14(3):1–11.
[Luo et al., 2012] Luo, R., Liu, B., Xie, Y., Li, Z., Huang, W., Yuan, J., He, G., Chen, Y., Pan, Q.,
Liu, Y., Tang, J., Wu, G., Zhang, H., Shi, Y., Liu, Y., Yu, C., Wang, B., Lu, Y., Han, C., Cheung,
D. W., Yiu, S.-M., Peng, S., Xiaoqian, Z., Liu, G., Liao, X., Li, Y., Yang, H., Wang, J., Lam, T.-W.,
and Wang, J. (2012). SOAPdenovo2: an empirically improved memory-efficient short-read de novo
assembler. GigaScience, 1(1):18.
[Marçais and Kingsford, 2011] Marçais, G. and Kingsford, C. (2011). A fast, lock-free approach for
efficient parallel counting of occurrences of k-mers. Bioinformatics, 27(6):764–770.
[Moran et al., 2015] Moran, Y., Barzilai, M. G., Liebeskind, B. J., and Zakon, H. H. (2015). Evolution of voltage-gated ion channels at the emergence of Metazoa. Journal of Experimental Biology,
218:515–525.
[Moran et al., 2014] Moran, Y., Fredman, D., Praher, D., Li, X. Z., Wee, L. M., Rentzsch, F., Zamore,
P. D., Technau, U., and Seitz, H. (2014). Cnidarian microRNAs frequently regulate targets by
cleavage. Genome Research, 24(4):651–663.
[Moroz, 2015] Moroz, L. L. (2015). Convergent evolution of neural systems in ctenophores. Journal
of Experimental Biology, 218:598–611.
[Moroz et al., 2014] Moroz, L. L., Kocot, K. M., Citarella, M. R., Dosung, S., Norekian, T. P., Povolotskaya, I. S., Grigorenko, A. P., Dailey, C., Berezikov, E., Buckley, K. M., Ptitsyn, A., Reshetov,
D., Mukherjee, K., Moroz, T. P., Bobkova, Y., Yu, F., Kapitonov, V. V., Jurka, J., Bobkov, Y. V.,
Swore, J. J., Girardo, D. O., Fodor, A., Gusev, F., Sanford, R., Bruders, R., Kittler, E., Mills, C. E.,
Rast, J. P., Derelle, R., Solovyev, V. V., Kondrashov, F. a., Swalla, B. J., Sweedler, J. V., Rogaev,
E. I., Halanych, K. M., and Kohn, A. B. (2014). The ctenophore genome and the evolutionary
origins of neural systems. Nature, 17:1–123.
[Nickel, 2010] Nickel, M. (2010). Evolutionary emergence of synaptic nervous systems: what can we
learn from the non-synaptic, nerveless Porifera? Invertebrate Biology, 129(1):1–16.
[Nosenko et al., 2013] Nosenko, T., Schreiber, F., Adamska, M., Adamski, M., Eitel, M., Hammel,
J., Maldonado, M., Müller, W. E. G., Nickel, M., Schierwater, B., Vacelet, J., Wiens, M., and
Wörheide, G. (2013). Deep metazoan phylogeny: when different genes tell different stories. Molecular
phylogenetics and evolution, 67(1):223–33.
[Pérez-Porro et al., 2013] Pérez-Porro, a. R., Navarro-Gómez, D., Uriz, M. J., and Giribet, G. (2013).
A NGS approach to the encrusting Mediterranean sponge Crella elegans (Porifera, Demospongiae,
Poecilosclerida): transcriptome sequencing, characterization and overview of the gene expression
along three life cycle stages. Molecular ecology resources, 454:494–509.
[Pertea et al., 2015] Pertea, M., Pertea, G. M., Antonescu, C. M., Chang, T.-C., Mendell, J. T., and
Salzberg, S. L. (2015). StringTie enables improved reconstruction of a transcriptome from RNA-seq
reads. Nature Biotechnology, 33(3).
[Petersen et al., 2011] Petersen, T. N., Brunak, S., von Heijne, G., and Nielsen, H. (2011). SignalP
4.0: discriminating signal peptides from transmembrane regions. Nature methods, 8(10):785–6.
[Philippe et al., 2011] Philippe, H., Brinkmann, H., Lavrov, D. V., Littlewood, D. T. J., Manuel,
M., Wörheide, G., and Baurain, D. (2011). Resolving difficult phylogenetic questions: why more
sequences are not enough. PLoS biology, 9(3):e1000602.
21
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
[Philippe et al., 2009] Philippe, H., Derelle, R., Lopez, P., Pick, K., Borchiellini, C., Boury-Esnault,
N., Vacelet, J., Renard, E., Houliston, E., Quéinnec, E., Da Silva, C., Wincker, P., Le Guyader, H.,
Leys, S., Jackson, D. J., Schreiber, F., Erpenbeck, D., Morgenstern, B., Wörheide, G., and Manuel,
M. (2009). Phylogenomics revives traditional views on deep animal relationships. Current biology :
CB, 19(8):706–12.
[Pick et al., 2010] Pick, K. S., Philippe, H., Schreiber, F., Erpenbeck, D., Jackson, D. J., Wrede, P.,
Wiens, M., Alié, A., Morgenstern, B., Manuel, M., and Wörheide, G. (2010). Improved phylogenomic
taxon sampling noticeably affects nonbilaterian relationships. Molecular Biology and Evolution,
27(9):1983–1987.
[Pisani et al., 2015] Pisani, D., Pett, W., Dohrmann, M., Feuda, R., Rota-Stabelli, O., Philippe, H.,
Lartillot, N., and Wörheide, G. (2015). Genomic data do not support comb jellies as the sister group
to all other animals. Proceedings of the National Academy of Sciences, 112(50):201518127.
[Ponce et al., 2016] Ponce, D., Brinkman, D. L., Potriquet, J., and Mulvenna, J. (2016). Tentacle transcriptome and venom proteome of the pacific sea nettle, Chrysaora fuscescens (Cnidaria: Scyphozoa).
Toxins, 8(4).
[Pratlong et al., 2015] Pratlong, M., Haguenauer, A., Chabrol, O., Klopp, C., Pontarotti, P., and
Aurelle, D. (2015). The red coral ( Corallium rubrum ) transcriptome: a new resource for population
genetics and local adaptation studies. Molecular Ecology Resources, 15(5):1205–1215.
[Price et al., 2010] Price, M. N., Dehal, P. S., and Arkin, A. P. (2010). FastTree 2 - Approximately
maximum-likelihood trees for large alignments. PLoS ONE, 5(3).
[Putnam et al., 2008] Putnam, N. H., Butts, T., Ferrier, D. E. K., Furlong, R. F., Hellsten, U.,
Kawashima, T., Robinson-Rechavi, M., Shoguchi, E., Terry, A., Yu, J.-K., Benito-Gutiérrez, E. L.,
Dubchak, I., Garcia-Fernàndez, J., Gibson-Brown, J. J., Grigoriev, I. V., Horton, A. C., de Jong,
P. J., Jurka, J., Kapitonov, V. V., Kohara, Y., Kuroki, Y., Lindquist, E., Lucas, S., Osoegawa,
K., Pennacchio, L. a., Salamov, A. a., Satou, Y., Sauka-Spengler, T., Schmutz, J., Shin-I, T., Toyoda, A., Bronner-Fraser, M., Fujiyama, A., Holland, L. Z., Holland, P. W. H., Satoh, N., and
Rokhsar, D. S. (2008). The amphioxus genome and the evolution of the chordate karyotype. Nature,
453(7198):1064–71.
[Qiu et al., 2015] Qiu, F., Ding, S., Ou, H., Wang, D., Chen, J., and Miyamoto, M. M. (2015). Transcriptome changes during the life cycle of the red sponge, Mycale phyllophila (Porifera, Demospongiae, Poecilosclerida). Genes, 6(4):1023–1052.
[Ramoino et al., 2010] Ramoino, P., Ledda, F. D., Ferrando, S., Gallus, L., Bianchini, P., Diaspro, A.,
Fato, M., Tagliafierro, G., and Manconi, R. (2010). Metabotropic ??-aminobutyric acid (GABAB)
receptors modulate feeding behavior in the calcisponge Leucandra aspera. Journal of Experimental
Zoology Part A: Ecological Genetics and Physiology, 313A(3):132–140.
[Riesgo et al., 2014a] Riesgo, A., Farrar, N., Windsor, P. J., Giribet, G., and Leys, S. P. (2014a). The
Analysis of Eight Transcriptomes from All Poriferan Classes Reveals Surprising Genetic Complexity
in Sponges. Molecular biology and evolution.
[Riesgo et al., 2014b] Riesgo, A., Peterson, K., Richardson, C., Heist, T., Strehlow, B., McCauley,
M., Cotman, C., Hill, M., and Hill, A. (2014b). Transcriptomic analysis of differential host gene
expression upon uptake of symbionts: a case study with Symbiodinium and the major bioeroding
sponge Cliona varians. BMC genomics, 15(1):376.
[Ryan et al., 2010] Ryan, J. F., Pang, K., Mullikin, J. C., Martindale, M. Q., and Baxevanis,
A. D. (2010). The homeodomain complement of the ctenophore Mnemiopsis leidyi suggests that
Ctenophora and Porifera diverged prior to the ParaHoxozoa. EvoDevo, 1(1):9.
[Ryan et al., 2013] Ryan, J. F., Pang, K., Schnitzler, C. E., a. D. Nguyen, A.-d., Moreland, R. T.,
Simmons, D. K., Koch, B. J., Francis, W. R., Havlak, P., Smith, S. a., Putnam, N. H., Haddock,
S. H. D., Dunn, C. W., Wolfsberg, T. G., Mullikin, J. C., Martindale, M. Q., Baxevanis, A. D.,
Comparative, N., and Program, S. (2013). The Genome of the Ctenophore Mnemiopsis leidyi and
Its Implications for Cell Type Evolution. Science, 342(6164):1242592–1242592.
22
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
[Ryu et al., 2016] Ryu, T., Seridi, L., Moitinho-Silva, L., Oates, M., Liew, Y. J., Mavromatis, C.,
Wang, X., Haywood, A., Lafi, F. F., Kupresanin, M., Sougrat, R., Alzahrani, M. A., Giles, E.,
Ghosheh, Y., Schunter, C., Baumgarten, S., Berumen, M. L., Gao, X., Aranda, M., Foret, S.,
Gough, J., Voolstra, C. R., Hentschel, U., and Ravasi, T. (2016). Hologenome analysis of two
marine sponges with different microbiomes. BMC genomics, 17(1):158.
[Sáez et al., 2009] Sáez, A. G., Lozano, E., and Zaldı́var-Riverón, A. (2009). Evolutionary history of
Na,K-ATPases and their osmoregulatory role. Genetica, 136(3):479–490.
[Sahlin et al., 2014] Sahlin, K., Vezzi, F., Nystedt, B., Lundeberg, J., and Arvestad, L. (2014). BESST
- Efficient scaffolding of large fragmented assemblies. BMC Bioinformatics, 15(1):281.
[Sara et al., 2001] Sara, M., Sara, A., Nickel, M., and Brümmer, F. (2001). Three New Species
of Tethya (Porifera: Demospongia) from German Aquaria. Stuttgarter Beiträge zur Naturkunde,
631(15S):1–16.
[Schierwater et al., 2009] Schierwater, B., Kolokotronis, S., Eitel, M., and DeSalle, R. (2009). The
Diploblast-Bilateria Sister hypothesis. Communicative & Integrative Biology, 2(5):1–3.
[Schuler et al., 2015] Schuler, a., Schmitz, G., Reft, A., Ozbek, S., Thurm, U., and Bornberg-Bauer,
E. (2015). The rise and fall of TRP-N, an ancient family of mechanogated ion channels, in Metazoa.
Genome Biology and Evolution, 7(6):1–27.
[Simakov et al., 2015] Simakov, O., Kawashima, T., Marlétaz, F., Jenkins, J., Koyanagi, R., Mitros,
T., Hisata, K., Bredeson, J., Shoguchi, E., Gyoja, F., Yue, J.-X., Chen, Y.-C., Freeman, R. M.,
Sasaki, A., Hikosaka-Katayama, T., Sato, A., Fujie, M., Baughman, K. W., Levine, J., Gonzalez,
P., Cameron, C., Fritzenwanker, J. H., Pani, A. M., Goto, H., Kanda, M., Arakaki, N., Yamasaki,
S., Qu, J., Cree, A., Ding, Y., Dinh, H. H., Dugan, S., Holder, M., Jhangiani, S. N., Kovar, C. L.,
Lee, S. L., Lewis, L. R., Morton, D., Nazareth, L. V., Okwuonu, G., Santibanez, J., Chen, R.,
Richards, S., Muzny, D. M., Gillis, A., Peshkin, L., Wu, M., Humphreys, T., Su, Y.-H., Putnam,
N. H., Schmutz, J., Fujiyama, A., Yu, J.-K., Tagawa, K., Worley, K. C., Gibbs, R. A., Kirschner,
M. W., Lowe, C. J., Satoh, N., Rokhsar, D. S., and Gerhart, J. (2015). Hemichordate genomes and
deuterostome origins. Nature, pages 1–19.
[Simakov et al., 2013] Simakov, O., Marletaz, F., Cho, S.-J., Edsinger-Gonzales, E., Havlak, P., Hellsten, U., Kuo, D.-H., Larsson, T., Lv, J., Arendt, D., Savage, R., Osoegawa, K., de Jong, P.,
Grimwood, J., Chapman, J. a., Shapiro, H., Aerts, A., Otillar, R. P., Terry, A. Y., Boore, J. L.,
Grigoriev, I. V., Lindberg, D. R., Seaver, E. C., Weisblat, D. a., Putnam, N. H., and Rokhsar, D. S.
(2013). Insights into bilaterian evolution from three spiralian genomes. Nature, 493(7433):526–31.
[Simão et al., 2015] Simão, F. A., Waterhouse, R. M., Ioannidis, P., and Kriventseva, E. V. (2015).
BUSCO : assessing genome assembly and annotation completeness with single-copy orthologs.
Genome analysis, 31(June):9–10.
[Simion et al., 2017] Simion, P., Philippe, H., Baurain, D., Jager, M., Richter, D. J., Di Franco, A.,
Roure, B., Satoh, N., Quéinnec, É., Ereskovsky, A., Lapébie, P., Corre, E., Delsuc, F., King, N.,
Wörheide, G., and Manuel, M. (2017). A Large and Consistent Phylogenomic Dataset Supports
Sponges as the Sister Group to All Other Animals. Current Biology, pages 1–10.
[Smith et al., 2011] Smith, S. M. E., Morgan, D., Musset, B., Cherny, V. V., Place, A. R., Hastings,
J. W., and DeCoursey, T. E. (2011). Voltage-gated proton channel in a dinoflagellate. Proceedings
of the National Academy of Sciences, 108(44):18162–18167.
[Sreedharan et al., 2010] Sreedharan, S., Shaik, J. H. A., Olszewski, P. K., Levine, A. S., Schiöth, H. B.,
and Fredriksson, R. (2010). Glutamate, aspartate and nucleotide transporters in the SLC17 family
form four main phylogenetic clusters: evolution and tissue expression. BMC genomics, 11(iii):17.
[Srivastava et al., 2008] Srivastava, M., Begovic, E., Chapman, J., Putnam, N. H., Hellsten, U.,
Kawashima, T., Kuo, A., Mitros, T., Salamov, A., Carpenter, M. L., Signorovitch, A. Y., Moreno,
M. a., Kamm, K., Grimwood, J., Schmutz, J., Shapiro, H., Grigoriev, I. V., Buss, L. W., Schierwater, B., Dellaporta, S. L., and Rokhsar, D. S. (2008). The Trichoplax genome and the nature of
placozoans. Nature, 454(7207):955–60.
23
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
[Srivastava et al., 2010] Srivastava, M., Simakov, O., Chapman, J., Fahey, B., Gauthier, M. E. a.,
Mitros, T., Richards, G. S., Conaco, C., Dacre, M., Hellsten, U., Larroux, C., Putnam, N. H.,
Stanke, M., Adamska, M., Darling, A., Degnan, S. M., Oakley, T. H., Plachetzki, D. C., Zhai, Y.,
Adamski, M., Calcino, A., Cummins, S. F., Goodstein, D. M., Harris, C., Jackson, D. J., Leys, S. P.,
Shu, S., Woodcroft, B. J., Vervoort, M., Kosik, K. S., Manning, G., Degnan, B. M., and Rokhsar,
D. S. (2010). The Amphimedon queenslandica genome and the evolution of animal complexity.
Nature, 466(7307):720–6.
[Stamatakis, 2014] Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and
post-analysis of large phylogenies. Bioinformatics, 30(9):1312–1313.
[Stanke et al., 2008] Stanke, M., Diekhans, M., Baertsch, R., and Haussler, D. (2008). Using native and
syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics, 24(5):637–
644.
[Stein, 1995] Stein, W. D. (1995). The sodium pump in the evolution of animal cells. Philosophical
transactions of the Royal Society of London. Series B, Biological sciences, 349(1329):263–9.
[Strous et al., 2012] Strous, M., Kraft, B., Bisdorf, R., and Tegetmeyer, H. E. (2012). The binning of metagenomic contigs for microbial physiology of mixed cultures. Frontiers in Microbiology,
3(DEC):1–11.
[Suga et al., 2013] Suga, H., Chen, Z., de Mendoza, A., Sebé-Pedrós, A., Brown, M. W., Kramer, E.,
Carr, M., Kerner, P., Vervoort, M., Sánchez-Pons, N., Torruella, G., Derelle, R., Manning, G., Lang,
B. F., Russ, C., Haas, B. J., Roger, A. J., Nusbaum, C., and Ruiz-Trillo, I. (2013). The Capsaspora
genome reveals a complex unicellular prehistory of animals. Nature communications, 4:2325.
[Versluis et al., 2015] Versluis, D., D’Andrea, M. M., Ramiro Garcia, J., Leimena, M. M., Hugenholtz,
F., Zhang, J., Öztürk, B., Nylund, L., Sipkema, D., van Schaik, W., de Vos, W. M., Kleerebezem,
M., Smidt, H., and van Passel, M. W. J. (2015). Mining microbial metatranscriptomes for expression
of antibiotic resistance genes under natural conditions. Scientific reports, 5(January):11981.
[Whelan et al., 2015] Whelan, N. V., Kocot, K. M., Moroz, L. L., and Halanych, K. M. (2015). Error
, signal , and the placement of Ctenophora sister to all other animals. Proceedings of the National
Academy of Sciences, 112(18):1–6.
[Wray et al., 2015] Wray, G. A., Smith, A., Peterson, K., Donoghue, P., Benton, M., Thackray, J.,
Budd, G., Marshall, C., Shu, D., Isozaki, Y., Zhang, X., Han, J., Maruyama, S., Walcott, C., Knoll,
A., Walter, M., Narbonne, G., Christie-Blick, N., Raymond, P., Cloud, P., Budd, G., Jensen, S.,
Briggs, D., Fortey, R., Morris, S. C., Seilacher, A., Bose, P., Pfluger, F., Rasmussen, B., Bengtson,
S., Fletcher, I., McNaughton, N., Pecoits, E., Konhauser, K., Aubet, N., Heaman, L., Veroslavsky,
G., Stern, R., Gingras, M., Huldtgren, T., Cunningham, J., Yin, C., Stampanoni, M., Marone, F.,
Donoghue, P., Bengtson, S., Gaucher, C., Poire, D., Bossi, J., Bettucci, L., Beri, A., Fedonkin,
M., Simonetta, A., Ivantsov, A., Sprigg, R., Glaessner, M., Gehling, J., Seilacher, A., Retallack,
G., Narbonne, G., Seilacher, A., Grazhdankin, D., Legouta, A., Li, C., Chen, J., Hua, T., Maloof,
A., Tang, F., Bengtson, S., Wang, Y., Wang, X., Yin, C., Yin, Z., Grant, S., Grotzinger, J.,
Watters, W., Knoll, A., Fedonkin, M., Vickers-Rich, P., Swalla, B., Trusler, P., Hall, M., Xiao, S.,
Zhang, Y., Knoll, A., Chen, J.-Y., Oliveri, P., Li, C., Zhou, G., Gao, F., Hagadorn, J., Peterson,
K., Davidson, E., Hagadorn, J., Chen, L., Xiao, S., Pang, K., Zhou, C., Yuan, X., Bengtson, S.,
Budd, G., Cunningham, J., Margoliash, E., Brown, R., Richardson, M., Boulter, D., Ramshaw,
J., Jefferies, R., Runnegar, B., Knoll, A., Carroll, S., Wray, G., Levinton, J., Shapiro, L., ArisBrosou, S., Yang, Z., Bromham, L., Rambaut, A., Fortey, R., Cooper, A., Penny, D., Welch, J.,
Fontanillas, E., Bromham, L., Wheat, C., Wahlberg, N., Schulte, J., Erwin, D., Laflamme, M.,
Tweedt, S., Sperling, E., Pisani, D., Peterson, K., Filipski, A., Murillo, O., Freydenzon, A., Tamura,
K., Kumar, S., Tamura, K., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar,
S., Hug, L., Roger, A., Ho, S., Phillips, M., Sanderson, M., Thorne, J., Kishino, H., Painter, I.,
Sanderson, M., Britton, T., Anderson, C., Jacquet, D., Lundqvist, S., Bremer, K., Shaul, S., Graur,
D., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar, S., Mello, B., Schrago, C.,
Rambaut, A., Bromham, L., Kishino, H., Thorne, J., Bruno, W., Douzery, E., Snell, E., Bapteste,
E., Delsuc, F., Philippe, H., Battistuzzi, F., Filipski, A., Hedges, S., Kumar, S., Schwartz, R.,
24
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
Mueller, R., Ho, S., Phillips, M., Drummond, A., Cooper, A., Blair, J., Hedges, S., Warnock, R.,
Yang, Z., Donoghue, P., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar, S., Ayala,
F., Rzhetsky, A., Ayala, F., Peterson, K., Lyons, J., Nowak, K., Takacs, C., Wargo, M., McPeek,
M., Cutler, D., Chernikova, D., Motamedi, S., Csuros, M., Koonin, E., Rogozin, I., Richter, D.,
King, N., Dunn, C., Giribet, G., Edgecombe, G., Hejnol, A., Emes, R., Grant, S., Burkhardt, P.,
Hejnol, A., Martindale, M., Balavoine, G., Adoutte, A., Hirth, F., Miller, D., Ball, E., Northcutt,
R., Strausfeld, N., Hirth, F., Tosches, M., Arendt, D., Lyons, T., Reinhard, C., Planavsky, N.,
Planavsky, N., Sperling, E., Frieder, C., Raman, A., Girguis, P., Levin, L., Knoll, A., Stanley, S.,
Peterson, K., Cotton, J., Gehling, J., Pisani, D., Butterfield, N., Benito-Gutierrez, E., Arendt, D.,
Holland, L., Carvalho, J., Escriva, H., Laudet, V., Schubert, M., Shimeld, S., Yu, J.-K., Turner,
S., Young, J., Pisani, D., Poling, L., Lyons-Weiler, M., Hedges, S., Otsuka, J., Sugaya, N., Hedges,
S., Blair, J., Venturi, M., Shoe, J., and Gu, X. (2015). Molecular clocks and the early evolution
of metazoan nervous systems. Philosophical transactions of the Royal Society of London. Series B,
Biological sciences, 370(1684):424–431.
[Wu and Watanabe, 2005] Wu, T. D. and Watanabe, C. K. (2005). GMAP: a genomic mapping and
alignment program for mRNA and EST sequences. Bioinformatics, 21(9):1859–1875.
[Zakon, 2012] Zakon, H. H. (2012). Adaptive evolution of voltage-gated sodium channels: the first 800
million years. Proceedings of the National Academy of Sciences of the United States of America, 109
Suppl(Supplement 1):10619–25.
[Zapata et al., 2015] Zapata, F., Goetz, F. E., Smith, S. A., Howison, M., Siebert, S., Church, S. H.,
Sanders, S. M., Ames, C. L., McFadden, C. S., France, S. C., Daly, M., Collins, A. G., Haddock,
S. H. D., Dunn, C. W., and Cartwright, P. (2015). Phylogenomic Analyses Support Traditional
Relationships within Cnidaria. Plos One, 10(10):e0139068.
25
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
1
Supplemental Figures
Raw reads coverage vs GC
100
1e+06
80
1e+05
Bacteria
10000
60
Bacteria
Haploid
Diploid
Repeats
1000
Repeats
40
GC%
20
100
10
0
973
1
100
200
300
Coverage
400
500
Counts
Supplemental Figure 1: Coverage vs. GC content for reads
Lava lamp plot of the unfiltered paired-end reads. Coverage was calculated as the median 31-mer coverage
for each read. High-GC reads indicate the presence of bacteria in the raw reads.
26
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
80
100
T.wilhelma Moleculo contigs coverage vs GC
40
GC%
60
100
0
20
10
1
100
200
300
400
Coverage
Supplemental Figure 2: Coverage vs. GC content for genomic scaffolds
Heat map of percent GC versus coverage of reads for the all Moleculo reads.
27
Counts
60
80
100
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
0
20
40
%GC
10
50
100
150
200
250
300
1
Counts
Coverage
Supplemental Figure 3: Coverage vs. GC content for genomic scaffolds
Scatterplot of percent GC versus coverage of reads for the all scaffolds. The 1,040 contigs with zero coverage
are carried over from low-coverage Moleculo reads, and likely derive from amplified contaminating DNA.
28
80
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
20
30
40
50
60
70
Non−Rhizobium 1
Non−Rhizobium 2
Non−Rhodobacter 1
Non−Rhodobacter 2
Non−Rhodobacter 3
Non−Rhodobacter 4
Possible Rhodospirales
Sponge contigs
0
50
100
150
200
250
300
Supplemental Figure 4: Separation of contigs derived from bacterial symbionts
Sponge scaffolds are in green, while scaffolds assigned to bacteria are identified by blue (Rhizobiales) and
pink (Rhodobacter ). Low-coverage contigs were removed. Seven bins were identified with MetaWatt to
separate the bacterial contigs, though several bins appeared to correspond to the same bacteria.
29
100
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
60
40
20
Percent Identity
80
Twi to A.queenslandica
Twi to S.ciliatum
Twi to N.vectensis
Twi to H.sapiens
0
100
200
300
400
500
Genes ordered by identity with Aqu
Supplemental Figure 5: Percent identity between sponges
Calculated protein percent identity between 570 one-to-one orthologs between T. wilhelma and A. queenslandica (blue), S. ciliatum (green), the anemone N. vectensis (purple) and human (red). Average identity
between T. wilhelma and A. queenslandica is shown as the dotted black line at 57%.
30
300
400
513
100
200
750 total blocks
3400 Tethya genes in blocks
84
53
34
0
Number of blocks
500
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
3
6
17 9 14 8 7
2 3 1 0 1 1 1 0 1 0 0 0
9
12
15
18
21
Number of genes in block
Supplemental Figure 6: Length of microsyntenic blocks
Histogram of number of genes in detected microsyntenic blocks between T. wilhelma and A. queenslandica
v2.0 gene models.
31
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Hoilu
ng
ra r1
TriaHoiail_b
_g
d1_ungike
g42 a_bra 10333.t1
H
r1
Tr H oilung 44.t1_ke
_sca_g1033
iadoilu Tria_b
ffold 2.t1
Tr
ia
n
1
H
Ho o _g gia d1raker1
_4
ia
iluilu 4T
d1
r4ia_bra _g424_g10
2
ngng 6.d1 ke
_g
3.t2 056
r
_
t
i
i
1
42
a_a_ 1_ g4 _g
_ .t1
brbra _sc245 103 _scaff
53
ak ke aff .t1 28
old
.t1
_4
e
.t
o
_
r
r1 1_ ld_ _s 1
__
_g g1
sc
4 caff
10 03
af
old
fo
33 2 9
_4
ld
1. .t1
_4
t1
97
8
9408
2
85
97
Hexactinellids
98
99
77
54
54
73
88
Amel_4.5_GB53665-PA
NV15929-PA
38
59 87
35
52
98
86
99
99
Cgigas_EKC28453
0180
Lotgi1|12 6.2
31
09.p
eq.14m
as
irn
3126
ob
53 1249
v220
|64a21|14
1
Ocbim
o
Helr Capc
53
29
90
92
Sakowv30025742m
39 32
22
23
19
21
21
30
97
72
93
50
37
20
58
97
89
99
69
54
80
4
.1_
D1
Cm
ilii
_
XP
DBH-like 2
a
ca
pi
lis
ro
Xt
MOX11_DROME
Prototomes
Echinoderm/hemichordate
Chordates
Vertebrates
DBH-like 1
86m
X
E
E
O
ULS2
M
NR
MBOH
__
DA
2__D
CA
2_
XD.1_
XD
M2O12
NO
O
A
0
M
L2
0_
A3
DA DBH
UE
s_
9G
__
ru
H
.1
au
96
Bt
01
1
28
01
P_
_X
pm
NV10994-PA
Amel_4.5|GB41735-PA-trimmed
97
Amel_4
_GB406
NV.515
993-PA94-PA
OME
.18
MOX12_DR
.p
a
-sr
06
011
35247
0111
i1|1352
0000
NAP
otg
i1|1
88972
LoLtg
gnyi_S
000000
Csavi nyi_SNAP
Csavig
evo
r-d
26.1
300350
Sakowv
39026m
Sakowv300
2
029
026
M MM
O
_0 XDO1XO
X
07
_D D
89 M1O_H1_C
74 UU H
96 SEM IC
.1_ ANK
_
MODBH 9
XD L1 6
1_
bflo
DA
r na
NR
bflo
s
E
e
rna
seq bflorq.24
.34
114nase0q74.1
.1_0 .50 _0
bflo
705
rnase
.1_
bflorn q.28
0
aseq 386.2
.460 _1
1.9_0
30m
v22
eq.1
q1
se
01
0_
_0
_c
47
q2
57
75
_se
91
.2
c0
.1
p5
0_
m
u2
37
o
c
66
Aq
5_
3 omp
57 3
q9
i1_ _c
08 i1_
_se
3_ 67
ri_ 1_
_c0
_g 75
22 i_1 8610
lle _g
45 ler
6
ue 67
m 44 c1m0uel omp
tica10
_c
s_ia
das_ usytdriat 3971i2_5
2 g 1_
hytri
ch
_
ecpus Slaep lleri_
e 0 01
Sla
m u 03
tiais_c1
yudsatr
eSplahc
nas
obir
Clionavarians_TR46177|c0_g3_i8
Clionavarians_TR4
Clion
avarians_
|c5_g6_i
TR53827267
2|c0_g1_i1 1
_1
twi_ss.1
4527.1
_1
.22187
.1_1
Clio Clio
nav nav
79
aria arian
ns_T s_T
tw
R 2 9 R 4 79 8
twi_
i_sss.25
792
5
s.1 29
|c0 |c1_g1
_g 2
_i2
tw380 9b.2
_i8
i_s 7.1 _1
s.1 _0
88
9
6
Xt
.1_
es
1
t
Xt A
ut
ina A
es qu
tu 2
ria qu
tin .1.
_T 2.1
R1 .37
ar 27
29 88
ia 56
_T 6
37 3_
|c 001
R 1 _0 0
0_
A
76 1
qu
g1
7|
2.
_i1
1.
c0
_1
2
75
_g
48
1_
_0
i6
01
_0
twi_ss
99
96
bim
Oc
1_i1
seq1
3_c0_g
_c0_
R1342
2191
um_T
mp2
pulifer
us_co
vast onema_po
s_
te
hyal
_seq1
callis
648_c0
aphro
comp6
_i3
vastus_
070_c0_g1 0_g1_i2
allistes_
lata_TR4
aphroc rosella_fibuagel
la_nux_TR9313_c
symp
6_c0_g1_i1
nux_TR1348
sympagella_
81
99
Dreri
Homoscleromorphs
_4
ld
fo 4
af _
sc old
__ ff
t. 1 _sca d_4 d_4
1
seq
51 1_ fol fol
3_
42 2.t caf caf
_c
_g25 __s __s
15
16
d1 4 t1 t1
6
p1
q1
ia1_g 48. 49.
e
r
m
eq1
s
s
2
d
o
_
Tia 42g4
0_
_c
c0
8_ 620_c
98
Tr 1_dg1_
44
76
3
12 p10
iadria
_4
i4_
60 comp _com
Tr T
g 4_
_
h
0
_
6_ 95
_t
7|c
lla 267 294
853
are 0_6 h60_
TR5
ial
c
_
rt
s
6
_
a
o t_h a_t
rum
1_p
_ ll
lab
_i1_
lla are
nde
_g1
are osc
|c0
Cca
os c
586
R48
m_T
ru
lab
nde
Cca
99
85
92
DOP
D O_MDOOP
o_AA
I63055OPO
_HUUSOE_
Cmilii_ .1__DBHMABNOVIN
XP_007
65
Csavigny
898364
i_SNAP0
.1__DB
00000936
H
69
sima_20
778
98
83
Dopamine
β-hydroxylase
le
ur
ur
a_
el
e
llav
03 ia_rc
AI leg
anh
_au _f
PG an A
in
oef
ENtis IPG
fen relia alA
E2sim E
_rc SM
i_t_
7 9 a _ NE
h01 _fin _1
0529 2
_03 alA 221
529 SM 7_
NV 27967
1
_
_co 18 _p
E8
mp 59 ar
08
232 _5 tia
9
l
8 _c
0 _s
eq4
_a
Fungia_scutaria_26369
Porites_lobata_6537
03
op
Anthop NVE14232
leura_e
legantis
th
14732
Acropora_dig
ora_12988 2
lepitifera_
Acropora_mil
2750
scutaria_
Fungia_
197
2125
a_10
nsbisat_5
ralielo
s_aust tes_
Porite Pori
_
es
rit
An
ato
864 1
8691677
751 1_9
ra_3 baista_ _5
60
itife s_lo
ens aria
62
_dig riteali ut
o
P
17 6
ustr _sc
pora
s_a ungia
407 11 ia_
Acro
e
it
1132 _ r
F
Por
_14 ta ta
sitsa_ ba cu
enba 4 lo _s
alilo 4 s_ ia
str s _ 3 4 t e g
aurite _1 ori un
es_ o sis P F
rit P en
Po rali
st
au
Po
Cnidarians
Placozoans
Demosponges
0.3
Supplemental Figure 7: Dopamine β -hydroxylase homologs across metazoans
Tree of DBH and DBH-like proteins across all metazoan groups, generated with RAxML using the
PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.
32
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
83
TYDC1_ORYSJ
TYDC2_ARATH
TYDC1_ARATH
TYDC1_PAPSO
99 TYDC5_PAPSO
TYDC3_PAPSO
TYDC2_PAPSO
TYDC2_PETCR
TYDC1_PETCR
74
TYDC4_PETCR
85
TYDC3_PETCR
50
98
Plants
SARC_01107T0
50
Odioica_GSOIDT00000153001
Platygyra_carnosus_4652
40
99
75
98
Sponges
scict1.001978.2_0_partial
Syconcoactum_contig_14168_4-partial
92
42
lctid68041_0
Syconcoactum_contig_34351_2_partial
39
35
16
12
43
7
Capsaspora_KJE95580.1
Ccandelabrum_TR92788|c0_g5_i7_4-c0_g8_i14_2-partial
oscarellacarmela_comp38470_c0_seq2_2
Hoilungia_stringtie|12206.1|m.15829
Triad1_g1959.t1__scaffold_2
Nve_XP_001635455.1
Alatina_alata_c38444_g1_i1_0
Chironex_fleckeri_TR31060_c0_g1_i1_0
50
NVE18494
AIPGENE1003
AIPGENE10179
Cioin2|262216_partial
Cioin2|230165
Csavignyi_FGENESH00000078696
Csavignyi_GENEFINDER00000066949
Helro1|186120
81
Capca1|158583
Lotgi1|181541
Lotgi1|201667__AADC-like
82
15
Cgigas_EKC41301-trimmed
Ocbimv22022526m.p
60
obirnaseq.78969.4_1
Ocbimv22022529m.p
21
Ocbimv22022527m.p
Lingula_comp152792_c2_seq3__DDC-like
Sakowv30017927m
25
Bf_V2_253_g40219.t2
21
95
Bf_V2_287_g43385.t2
Bf_V2_250_g40085.t26
86
Cmilii_XP_007895676.1__AADC
34
Drerio_NP_998507.1__AADC
Xtropicalis_XP_012820058.1__AADC
98
94
DDC_MOUSE
DDC_HUMAN
88 DDC_BOVIN
Cgigas_EKC25403-trimmed
Ectopleura_larynx_g20429.2_i2_joined-partial
Ectopleura_larynx_g36591.2_i1_partial
91
HAEP_T-CDS_v02_46779
86
98
HAEP_T-CDS_v02_10472
6
hydra_sra.4784.1_2
66
HAEP_T-CDS_v02_45038_partial
99 hydra_sra.19042.1_0
82
HAEP_T-CDS_v02_1229_partial
33
hydra_sra.8276.1_2
Dmel_FBpp0080698
Dmel_FBpp0080697
TC013401
97
53
Bmori_XP_004931016.1
33
TC013402
23
Amel_4.5|GB45938-PA
54
71
NV11111-PA
E9RJV1_GRYBI
Amel_4.5|GB45973-PA__AADC-like
NV11109-PA
41
TC013480
35
DDC_DROME
82
DDC_MANSE
Bmori_NP_001037174.1__AADC
Spurpuratus_XP_011664746.1__AADC
93
Sakowv30016396m
Helro1|101612
86
Helro1|84539
Helro1|84403
Capca1|119245
75
Lotgi1|139922__HDC-like
96
Ocbimv22019847m.p
95
Amel_4.5|GB55830-PA
85
TC012567
Dmel_FBpp0085475
TC030580
59
Dmel_FBpp0085476
44
NV18137-PA
Amel_4.5|GB55831-PA
Spurpuratus_XP_789367.3__HDC
98
PFL3_pfl_40v0_9_20150316_1g34573.t1
Skowalevskii_NP_001161568.1_HDC
93
pmar16.36248-30514.1_0
Oreochromis_niloticus_XP_005463234.1__HDC
96
A7KBS5_DANRE__HDC
Xtropicalis_XP_002939672.3__HDC
CHICK_BAP16218.1__HDC
83
DCHS_BOVIN
DCHS_MOUSE
82
DCHS_HUMAN
Aplysia_XP_012940696.1
Aplysia_NP_001191536.1__HDC
93
Capca1|180248
73
Ocbimv22006132m.p
59
Cgigas_EKC37654__HDC
28
93
LOTGIDRAFT_119964__HDC
Dpulex_EFX79676.1
Apisum_XP_008179690.1__HDC
Bmori_XP_012551886.1__HDC
99 50
TC010062
70
DCHS_DROME
89
NV12919-PA
99
Amel_4.5|GB47379-PA__HDC-like
Bterrestris_XP_003393425.1__HDC
Placozoans
Cnidarians
Aromatic amino acid
Decarboxylase
Hydrozoan AADC
32
Invertebrate
AADC-like
Invertebrate
HDC-like
52
Homoscleromorphs
Calcarea
Histidine
Decarboxylase
Placozoans
Cnidarians
Protostomes
Echinoderm/hemichordate
Chordates
Vertebrates
0.5
Plants
Supplemental Figure 8: Aromatic amino acid decarboxylase homologs across metazoans
Tree of AADC and histidine decarboxylase (HDC) proteins across all metazoan groups, generated with
RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.
33
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Aqu2.1.28354_001
ial
_i2_2
1_g2i1_
4 _part
_i1_1
R35249_c
c0_g1_
Crellaelegans_T
76_c1_g1
22979_
_TR827
helens_TR
Tedaniaan calephyllophila
My
Demosponge
MAO
Cnidarian
MAO
83
6
q.4
8471
ase
2778
orn
20|1
bfl
1c|a1
caap
6
p
8
C
Ca
7 68
ed
8|510
oin
FL
3a21
.p-j
RA
|1pc
3m
_B
aC1 a
855
C1 FL
pc
ZD RA 200
Ca
CA30_B imv424
O AO
A
ZE O|c1b902 .1__0M.1A__M .1__MAO
C3
i1 00136162 8354 __MAOA
.1
7 39 _00108
Lotg AH_0
P 15155953
P is0_N_0
o__AX
la
n_XPIN
FuriguX
keev
Dre
OV
AN
Chic
E
_B
UM
FB_HB_MOUS
AOFB
AOF
AO
48
77
UMAN
AOFA_H
AOFA_BOVIN
9037
MOUS
61AOFA_
E
38
47
30 84
0 62
44 5
g
.1-
19
67
6.t
2
Monoamine
Oxidase A/B
84
52
25
H
ila
al
Te
_T
isa
da
R8 tw
rc
n
ad
20 i_
Sty Cli iaan
ss
4
uj
0
o
lis n he
ar
_c .16
My
sac av le
2
n
a
cale
_g693 din
art ri s_
i_
1_ .1
eri_ans TR
ph
H
i2 _1
yll
TR_TR 28
AD
_2
op
0
1
hil
0064 67
A0
a_T
1633 _c
10
4_0_ 0_
R8
eph
43
c0c0 g1
08
yda
__ _
72
50
tiam
i11
_c0 g1g_1i1
0.
_
i_1_ _
u
eph
_g2
1_
53 4
yda elleri_
_
5
i1_
tiam
1
1
uell 8278_
eri_
c
248 omp6
26 _ 6 6
com 77_
twi_
p69 c0_
421 seq ss.16
970
_c0 1
_se
.1_
q1
2
NV
E1
aa_p_ horrotpra
79n3sisra_1 604_
scspoar_odrlien
7 _4438147 partia
ut_4a_ iag_ sis
l
5
ar34pi itm
_
ia 1stiifeilrle872
_35 lla a_po 9
74 ta 16ra_
_
46 86241145
73 80
57
76
79
88
60
68
24 2 19
22 2
ri_
lle
ue
yll
op
h
m
ia
at
yd
lep
h
85
28
Cnidarian and
sponge group
h
ep
M
yc
a
5
58
86
6028
2N4E
VEE
NPG
AI
90
e
85 M AIPNV
E15952
GEN
E1
86 AosnPFu
ngi
Aercst
98 P o
trota
raa_scu 1888
oitro
_cataria_
peosp_reoaara
FurS1Aites
erno287
a_uss_trmvil
Fnat0
c o_a
sa 38
A
p_a2lile
gyvirl0
u
e po _29
iopc s
49
84 9
9
gu
_X Dr
P_ eri
01 o_
16 XP
1_
02 _6
g1
81 86
99
0. 31
8.
1_ 0
t2
__
_M.4
sc
A_O_M
af
fo
-liAO
ld
ke _4
lik
Bf_V
2_196_g33248.t2
C3Z
_BR
CapcEL9
a1|136AFL
246
br
96
Fu
M
on
43
8
|30097
Dappu1
6.1_0
359_0 1872
lctid61scict1.0
02
855.p
1|1
12342
4m.p Helro 9147m
2955
_9262 6 18
00
os5a9_23413701761155
v220
v22
ern_6r4ia
9p_2rraa__
Ocbim
cbim
cavcuta
a
O
ea_ia_slatsis__s ifpeo
strang tilenoragilte 988
nta Fu_pisali p di il
Mo horaustsrtreoraa__m E149
or
a
lop s_ A oppo
EN95
Styorite Accrro IPG12
P
A
A VE
N
Plants and other outgroups
S. arctica
Aqu2.1.28355_001
MAO-like
Ctenophores
Placozoans
Cnidarians
Protostomes
Chordates
Vertebrates
Stylissacarteri_TR109385_c0_
g1_i2_3
Demosponges
Homoscleromorphs
Calcarea
71
1_
p7
m
co
60
73
80
q1
se
0_
_c
SARC_08858T0
Bf_V AIPGE
2 _1
NE
Bf_V
2_26 84_g30 14762
Bf_V
3_g4
7
2_98
1383 03.t1
_g16
122..tt1
1
AIPGENE
NVE215
22275
36
AIPGENE2226
AIPGENE3902 7
3991 94
AIPGENE17020
AIPGENE9768
Capsaspora and
cnidarian group
94
99
CAOG_00719T0
Acropora_millepora_19274
AIPGENE7844
100
100
7
61 4
44 13 7
0
62 5
5
59
75
79
96
1_0
hydra_sra.16499. .1_2
.12474
hydra_sra
SARC_03516T0
69
25
S.rosetta_PTSG_06229T0
47
85
Primary amine
Oxidase (PAOX)
100
0
6
0
40
66
30
71
54 56
78
43
99 69
99
Triad1_g432.t1__scaffo
Hoilungia_braker1_g04833.t1 ld_1
Tria
d1_
g
Hoil9931.t1
ungi __sc
a_bra aff
Hoilu
ke old
ngia_b
raker1 r1_g0_9138
35.t1
_g0483
Hoilung
4.t1
Hoilungia_
akker
ia_br
er 1_g
bra
g04883
Triad1_g431 1_
35.t11
.t1__sca04
ffold6.t
_1
95
78
NV N
10V1
9609
2- 61
PA-P
A
Am
M
el
AO
-li _4.5
Am
ke
|G
el_
B5
4.5
00 91
|G
76
B5
-P
01
A
62
-PA
NV _p
FBp
F
B
17 art
p00 pp0
FB
77 ia
762 074
p
p
6-Pl
86 _ 9 5 F 0 3
_SM 8 Bp 08
A
p0 5
OX-l
ike 074962
52
00
l
Ctenophore
KDM1
X-like
__
5
97 7 36
__PAO
TC
om
Tia2_t
M_x
9S0_
NV
7_05
E1
HY94
71
DV61_
U com
Lo 56_
tg _K
p7
Cg i1 D
43
ig |14 M1
53
as 5 -li
_c
_E 16 ke
0_
seq
KC 5 _p
a
1
rti
24
al
08
9
t2
7.
52
39
-g
.2
19
15
.3
eq
as
rn
flo
-b
.t1
q1 q1
H
_se
25
se
0
_
E
C
c
0
1
95 R TK ANUS
23_ 88_c _seq
g3 NT_LAIC M
MODO
p87 p82 6_c0 1
6_ XE83C_HHBU_ON
1
com com 1676 _seq
M
B
_
24 2_AB3D1_2M
73_ 797_omp 45_c0
K 0
q1
0
2_ LRH3RDG
XM
095 _09738_cmp96
_c0_se
_V V 1BFK6
20_ h200_298
2_co omp12342
Bf F6 E83
t_h _t_
_c
h2_104
rak_ameis
_t_
h20 _08640
hpolo
s_at_ _h30
cp3ysiro
1
mip
ealiuth
te0na_t
9oo5yc
hor M
L0nth
bboba
88
6283T0
SG_0
tta_PT
7
S.rose __scaffold_2
t1
g8203.
66
res
Am
Lysine-specific
Histone Demethylase
(KDM1)
_g1_i1_1
Placozoan
PAOX-like
1
33483_00
Aqu2.1.
_001
Aqu2.1.40694
91
58
KDM1A
FBpp03
04939
33
el_ 9BKf_ AIP
4.5 DVM2_ GEN
|G 1A225_ E198 THrioialu
M g37 hy5d2_p d1ng
B4 _H
_gia1 ra
83 UOM
2 ra art 1_b
565ke
86 UASN
.t1r1___g10
E32.t2 _sraia
.3l0
-PA
scaff903.t1
586
old_4
.1_
7
_KD
M1
-lik
e_p
artia
958
KDM1B
1_
Monbr
NVE13555
1
NV
95
10A
N8V-P
5
09
Corticiumcandelabrum_TR56531_c1
87
PA
9-
84 8
4
99
EN
e
OUUMS-lAike
-lik
_M
_HO
X
OXEN
OXX
PA
2
USA
PAO__PA
_SOMM
5.t
MI_MU
ICK
82
ALOOXX__H
g8
_CH
_CSM
_
1
5
V
9
M
B S
J
_3
KL
F1N
2
V9
_V
Bf
Da
YSJ
LDL1LDL1_OR
_ARA
LDL2_
ARATH TH
LDL2_O
RYSJ
10
26
7
10
68
189 19
3
1
|31173
39-PA
Dappu1
B459
_4.5|G 27058
Amel
u1|3
Dapp
1
96
35
89
A
B45215
6-P-PA
Amel_4.5|G
NV1830
0
39
|5 0
u1 97
pp 01
Da1|3
u
pp
Plant Lysine
Demethylase
97
0.8
Supplemental Figure 9: Monoamine oxidase homologs across metazoans
Tree of monoamine oxidase (MAO) and related proteins across all metazoan groups, generated with RAxML
using the PROTGAMMALG model. Searches for lysine demethylase (KDM1) and primary amine oxidase
(PAOX) were not exhaustive, and were added to display the sole positions of ctenophores (only have KDM)
and placozoans (only have PAOX) in this protein family. Bootstrap values are 100 unless otherwise shown.
34
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
CAOG_08244T0
41
CAOG_00213T0
CAOG_09932T0
CAOG_05982T0
21
Filasterea
Aqu2.1.43530_001
twi_ss.25346.1_2
41
97
52
54
ephydatiamuelleri_20738_comp67567_c0_seq1
ephydatiamuelleri_25644_comp76203_c0_seq1
ephydatiamuelleri_25735_comp78991_c0_seq1
ephydatiamuelleri_22338_comp68117_c0_seq2
ephydatiamuelleri_06289_comp54437_c0_seq1
ephydatiamuelleri_05142_comp51299_c0_seq1
ephydatiamuelleri_20901_comp67615_c1_seq2
Aqu2.1.23001_001
Aqu2.1.22999_001
twi_ss.11316.1_0
ephydatiamuelleri_09349_comp60264_c0_seq6
ephydatiamuelleri_10299_comp61291_c0_seq1
Slacustris_c104630_g1_i1
rosella_fibulata_TR14631_c0_g1_i1_1
aphrocallistes_vastus_comp22152_c0_seq3_3
aphrocallistes_vastus_comp16141_c0_seq1_1
rosella_fibulata_TR14675_c0_g1_i1_0
sympagella_nux_TR10361_c0_g1_i1_0
rosella_fibulata_TR4584_c0_g1_i5_2
sympagella_nux_TR20329_c0_g1_i1_2_partial
95
aphrocallistes_vastus_comp22385_c1_seq6_1
aphrocallistes_vastus_comp22385_c1_seq2_1
aphrocallistes_vastus_comp22385_c1_seq1_1
aphrocallistes_vastus_comp12269_c0_seq1_2
aphrocallistes_vastus_comp17700_c0_seq1_4
97
rosella_fibulata_TR4083_c0_g1_i1_1
rosella_fibulata_TR4083_c0_g1_i2_1
hyalonema_populiferum_TR19_c0_g1_i1_4
rosella_fibulata_TR7683_c0_g1_i1_0
97
rosella_fibulata_TR7683_c0_g1_i3_0
62
aphrocallistes_vastus_comp11820_c0_seq1_4
rosella_fibulata_TR8723_c0_g2_i1_2
sympagella_nux_TR21451_c0_g1_i1_0
75
88
93
rosella_fibulata_TR3980_c0_g1_i1_1
rosella_fibulata_TR3353_c0_g1_i1_0
aphrocallistes_vastus_comp20208_c0_seq3_0
97
hyalonema_populiferum_TR13658_c0_g1_i1_1_partial
rosella_fibulata_TR7995_c0_g1_i1_2
sympagella_nux_TR21457_c0_g1_i3_1
86
sympagella_nux_TR21457_c0_g1_i2_0
sympagella_nux_TR21457_c0_g1_i1_0
twi_ss.15124.1_2
ephydatiamuelleri_17836_comp66476_c0_seq2
ephydatiamuelleri_08108_comp58383_c0_seq9
twi_c31705_g2_i8_0
twi_c31705_g2_i4_1
97
twi_ss.30746.2_1
85
twi_ss.29636.1_1
Aqu2.1.28011_001
Aqu2.1.38315_001
ephydatiamuelleri_22930_comp68288_c0_seq43
52
Slacustris_c104641_g1_i1
Slacustris_c98123_g1_i3_partial
ephydatiamuelleri_16611_comp65944_c0_seq8
25
Aqu2.1.39153_001
ephydatiamuelleri_16976_comp66116_c0_seq1
95
ephydatiamuelleri_16977_comp66116_c0_seq2
38
Slacustris_c103807_g1_i3
16
Aqu2.1.25564_001
Aqu2.1.38717_001
Aqu2.1.27571_001
54
Aqu2.1.27573_001-trimmed
Aqu2.1.39154_001
69
Aqu2.1.38823_001
45
97
Aqu2.1.41785_001
Aqu2.1.26762_001
70
ephydatiamuelleri_23301_comp68424_c0_seq32
36
twi_ss.13620.1_2
44
twi_ss.20824.4_2
ephydatiamuelleri_11626_comp62613_c0_seq9
98
ephydatiamuelleri_23659_comp68518_c0_seq1
ephydatiamuelleri_23660_comp68518_c0_seq5
ML02335a-AUGUSTUS
hormiphora_t_x0_09819_comp8395_c0_seq2
37
oscarella_t_h60_65871_comp34165_c0_seq1_partial
oscarella_t_h60_42123_comp11388_c0_seq18
oscarella_t_h60_52462_comp11760_c0_seq13
93
oscarella_t_h60_64267_comp17249_c0_seq1
Porites_australiensis_13238
Alatina_alata_c61081_g1_i1_5__lCt
86
Acropora_digitifera_408__lCt
32
Acropora_millepora_2820__lCt
Capca1|22448
Capca1|42446
47
Triad1_g522.t1__scaffold_1
Hoilungia_stringtie|10218.1|m.13450
Triad1_g72.t1__scaffold_1
20
Acropora_digitifera_6641
Acropora_millepora_2277
48
hydra_sra.36426.1_0
76
adi_v1.15790
Porites_australiensis_19582
1
C3Y433_BRAFL-trimmed
25
Capca1|22458
Lanatina_comp153988_c0_seq2_1
10
4
Stylophora_pistillata_2729
Porites_australiensis_23261
80
Stylophora_pistillata_11795
Stylophora_pistillata_9171
64
92
AIPGENE7219__lCt
Acropora_millepora_3159
NVE1157
NVE5469__lCt
56
Porites_australiensis_9038__lCt
Lanatina_comp149962_c0_seq3_2
5
40
98
obirnaseq.37923.1_2
94
Capca1|204049
TC007169
Amel_4.5|GB53009
71
Dmel_NP_001285554.1__GABA-B3G
Porites_australiensis_8820
95
Anthopleura_elegantissima_62189
G5ECB2_CAEEL__gbb-2
47
C3YEC0_BRAFL
Q1LUN9_DANRE
84
98
GABR2_HUMAN
GABR2_MOUSE
65
2
Capca1|222380
97
1
Lanatina_comp141557_c0_seq2_3
74
Amel_4.5|GB49239
TC014995
95
Dmel_NP_001287456.1__GABA-B2C
Hoilungia_stringtie|8751.1|m.11539
AIPGENE12245
5
Hoilungia_stringtie|15673.1|m.20278
36
Triad1_g1739.t1__scaffold_2
Hoilungia_stringtie|6504.1|m.8438
97
Triad1_g6059.t1__scaffold_6
Hoilungia_stringtie|6503.1|m.8445
Triad1_g6057.t1__scaffold_6
Dmel_NP_001246033.1__GABA-B1C
62
Amel_4.5|GB49131
2
TC016191-016192-partial
81
B3VBI8_CAEEL__gbb-1
obirnaseq.69248.1_2
43
64
Lanatina_comp155098_c1_seq4_1
93
Capca1|107055
58
C3Z4V0_BRAFL
GABR1_MOUSE
GABR1_HUMAN
F1QAJ3_DANRE
F1RDY7_DANRE
Hoilungia_stringtie|3833.2|m.4840
70
Triad1_g7917.t1__scaffold_9
Hoilungia_stringtie|3832.1|m.4618
5
Triad1_g7915.t1__scaffold_9
Hoilungia_stringtie|3838.1|m.4714
55
Hoilungia_stringtie|11079.1|m.14461
61
Triad1_g7871.t1__scaffold_9
Triad1_g3741.t1__scaffold_3
Hoilungia_stringtie|10558.1|m.13818
Triad1_g5475.t1__scaffold_5
11
Stylophora_pistillata_9477
Porites_australiensis_37273
Stylophora_pistillata_9644
Hexactinellids
69
41
15
Demosponges
20
Ctenophores
Homoscleromorph
Cnidarians
Protostome
GABAR-B3
GABAR-B2
Demosponges
Oscarella carmela
Hexactinellids
Ctenophores
Placozoans
Cnidarians
Protostomes
Chordates
Vertebrates
GABAR-B1
Placozoans
Capsaspora (Filasterean outgroup)
0.4
Supplemental Figure 10: mGABA receptors across metazoans
Complete version of Figure 4 metabotropic GABA receptor (GABA-B type) protein tree generated with
RAxML. Bootstrap values are 100 unless otherwise shown. The majority of deeper nodes were poorly resolved; branch transparency corresponds to bootstrap support for values under 50, meaning half-transparent.
35
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
246
GABR1_HUMAN
GABR1_MOUSE
F1QAJ3_DANRE
Dmel_NP_001246033.1__GABA-B1C
Amel_4.5|GB49131
C3Z4V0_BRAFL
B3VBI8_CAEEL__gbb-1
GABR2_HUMAN
GABR2_MOUSE
Q1LUN9_DANRE
C3YEC0_BRAFL
Dmel_NP_001287456.1__GABA-B2C
Amel_4.5|GB49239
Hoilungia_stringtie|15673.1|m.20278
Triad1_g1739.t1__scaffold_2
Hoilungia_stringtie|6503.1|m.8445
Triad1_g6057.t1__scaffold_6
Hoilungia_stringtie|6504.1|m.8438
G5ECB2_CAEEL__gbb-2
AIPGENE12245
Hoilungia_stringtie|10558.1|m.13818
Triad1_g5475.t1__scaffold_5
Hoilungia_stringtie|11079.1|m.14461
Hoilungia_stringtie|3832.1|m.4618
Hoilungia_stringtie|3833.2|m.4840
Triad1_g7917.t1__scaffold_9
Hoilungia_stringtie|3838.1|m.4714
Triad1_g3741.t1__scaffold_3
Capca1|42446
Capca1|22448
Hoilungia_stringtie|10218.1|m.13450
Triad1_g72.t1__scaffold_1
Triad1_g522.t1__scaffold_1
Capca1|22458
Lanatina_comp153988_c0_seq2_1
AIPGENE7219
C3Y433_BRAFL-trimmed
Hoilungia_stringtie|8751.1|m.11539
Dmel_NP_001285554.1__GABA-B3G
Amel_4.5|GB53009
Lanatina_comp149962_c0_seq3_2
NVE5469/23-200
twi_ss.15124.1_2/34-214
ephydatiamuelleri_17836_comp66476_c0_seq2
oscarella_t_h60_42123_comp11388_c0_seq18
oscarella_t_h60_52462_comp11760_c0_seq13
oscarella_t_h60_64267_comp17249_c0_seq1
Aqu2.1.25564_001
Aqu2.1.26762_001
Aqu2.1.41785_001
Aqu2.1.39154_001
twi_ss.13620.1_2
twi_ss.20824.4_2
ephydatiamuelleri_23659_comp68518_c0_seq1
ephydatiamuelleri_23660_comp68518_c0_seq5
ephydatiamuelleri_11626_comp62613_c0_seq9
ephydatiamuelleri_23301_comp68424_c0_seq32
Aqu2.1.27571_001
Aqu2.1.27573_001-trimmed
Aqu2.1.38717_001
Aqu2.1.28011_001
Aqu2.1.38315_001
twi_ss.29636.1_1
ephydatiamuelleri_22930_comp68288_c0_seq43
ephydatiamuelleri_16611_comp65944_c0_seq8
twi_ss.30746.2_1
Aqu2.1.38823_001
Aqu2.1.39153_001
ephydatiamuelleri_16977_comp66116_c0_seq2
ephydatiamuelleri_16976_comp66116_c0_seq1
twi_c31705_g2_i4_1
twi_c31705_g2_i8_0
ephydatiamuelleri_08108_comp58383_c0_seq9
Aqu2.1.23001_001/19-196
twi_ss.11316.1_0/38-229
ephydatiamuelleri_10299_comp61291_c0_seq1
ephydatiamuelleri_09349_comp60264_c0_seq6
hydra_sra.36426.1_0/39-213
hormiphora_t_x0_09819_comp8395_c0_seq2
ML02335a-AUGUSTUS
Aqu2.1.43530_001
twi_ss.25346.1_2
ephydatiamuelleri_25735_comp78991_c0_seq1
ephydatiamuelleri_25644_comp76203_c0_seq1
ephydatiamuelleri_06289_comp54437_c0_seq1
ephydatiamuelleri_05142_comp51299_c0_seq1
ephydatiamuelleri_20901_comp67615_c1_seq2
ephydatiamuelleri_22338_comp68117_c0_seq2
ephydatiamuelleri_20738_comp67567_c0_seq1
CAOG_00213T0
CAOG_05982T0
CAOG_09932T0
- P GC S S V - P GC S S V - P GC S S V - AGC S T V - AGC S T V - TG C S S V - TG C S P V GG VC P S V GG VC P S V GG VC P S V G G TC S S V G A A C TH V G A A C TH V G SG Y S S V G SG Y S S V G AG Y S S V G AG Y S S V G AD F S T V GGQC T E V G AG Y S KC G AQN S RC G AQN S R S GP VSS I C GP VL SAV GP VF SY I GP VL SSV G P I L SQ T G P L T T VG A A AN S E L GGGC S I V G A AC S I V G A AC S V V G S AC S T V GGGC S E A GGGC S V S G P GC S E S GPG - GAV G P GC S P V G S AC S E V G T AC S E V G S SC SD V G P AC S A V G ADC S I A G ADC S V S A E GC T I V A AGC S T V A E GC S V V GGGC S L V G S GC S V A G S GC S V A G AGC S A A GC GC S V A GC GC S V A GC GC S V A GC GC S V A GC GC S V A G S GC S V A GC GC S T A GC GC S I S GC GC S T A G S GC S L A G AGC S V A G AGC S V A GGGC E N A G S GC E A A G S GC S V A GGGC P S T G AGC S L A GC DC P E L GGGC P A V G AGC S I A G AGC P L A A P GC T E E GGGC S L A DGGC S P A GGGC S S S DGGC N T V G P P Y TEQ G P V F TD V G P V F TD V G P PC ED S G P GC S T S E
G P GC S E A G L AC S D S GP LC SEA G P RC Y N S G L SC YD L G P SC S E S GL PC S TP Q A L I Q SC V
Q A L VD S C V
Q A L VD S C V
268
- YGS S S P AL S
- YGS S S P AL S
- YGS S S P AL S
- YGAS S P AL S
- YGAS S P AL S
- YGS S S P AL S
- Y GG S S P A L S
- F AA T TP VL A
- F AA T TP VL A
- F AA T TP VL A
- YDV TS P E F S
- Y A D TH P M F T
- Y A D TH P M F T
- YGA TS P S L S
- YGA TS P S L S
- Y S A TS P VL S
- Y SASSP VL S
- FGS TS P VL S
- Y A E TH A K F A
- Y S S TS PQL S
- HG S TS T I L S
- HG S TS P I L S
- N AA TS A TL S
- P S S A SMK L
- H T A T SME L T
- H T A T S MQ L V
- P S A TS L KL A
- SG V S S A A L
- Y ASVSP I L S
- Y GC MS P S L S
- Y ASASP AL S
- Y ASASP AL S
- Y ASVSP AL S
- F S AG S P D L S
- YGAARL K L S
- MAN AN P A L G
- F S TY S P AL S
- Y AG S S P A L S
- FGS TS P AL S
- FGS TAP AL S
- HASSAP AL S
- Y A S TA SD L S
- PL SSSPSL T
- P L S TS PQL E
- F L SSSPSLD
- F L SSSP ALQ
- F L SSSP AL E
- Y Y S TS P S L N
- Y S S S S VH L R
- F S S S S VH L R
- Y S S A A P ML S
- YGS AA I TL S
- YG T TS T TL S
- Y F S AS TAL S
- Y F SAS I EL S
- Y VD V S S A L D
- P E P L E AQ P S
- A I AG A F V L N
- Y A TG A D V L T
- F A A S TH I L N
- C ASSSSEL A
- C A S S SHN L A
- C V S S SN E L N
- CDS S TP E AE
- SD S S A P E L E
-H I SSSPEL S
- Y N P G L MR Y S
- Y A SQ S S V L N
- Y D R G TM S L N
- Y D TG A I S L S
- HSSSSAVLD
- Y S S S SG V L D
- YGF S - - I L G
- Y F A TS P AL S
- L SASSP AL S
- FGS S S P TL T
- FGSSSP SL S
- Y T E AN E P F E
- P TAS S S E F S
- P TAS S S S L S
- Y AS T TPQL N
- YGYN S P I FH
VS VA T TVE L T
VS I A TS P S L N
VH L A S S A A L A
V H MC G S P N L G
AH I S S S F I L G
L HMA A S P A L K
I SG S S S P A L S
- I TVARP S A T
- - I E TG K A V A
- - I E TG K A V A
287
F R TH P S A
F R TH P S A
F R TH P S A
F R TH P S A
F R TH P S A
F R TH P S A
F R TH P S A
F R TV P SD
F R TV P SD
F R TV P SD
F R TV P SD
F RVVP SE
F RVVP SE
L R T I P SD
L R T I P SD
FRT I ESE
FRT I ESE
Y R TV S SD
F RVVPGS
Y R T I P DD
F R TA L SD
F R TA L SD
Y R TA L SD
MR T A I SD
F R TA I SD
F R TS I SD
F R TS I SD
F T T E T TD
F R TS S PD
F R T VQ P E
F R TY P S E
FR I YPSE
F R TRS S E
F R T I RSD
WR N I P TM
VR TVP P E
F K T I Q TD
Y RL Y P S E
Y R TVAPD
L R TVAPD
Y R L A A AD
Y R T VQ P D
L RV V S SD
L R TV A SD
YH L L P AE
HQ I N P D E
HQ L L P S E
F TL RP SD
FQ I Y VSE
FQ I Y ASE
F R TY P SD
F R TH P S I
F R AN P P S
F R TN P S F
F R TN P S L
F R TS P SD
F QL F P SG
F R TL VS F
F R TL VS F
Y RA L I SN
F QML P T E
F QML P S A
FQL L P TE
F Q TL PND
F QML P N E
F Q L L G SG
VQ V F P S R
F R TVP S S
F R TY P SD
I R TY P SD
L RA I L SD
F R T I P SD
F SML P S E
F RVSP SE
FRT I PSE
FRT I PSE
YRT I PSE
FQ TP P S I
L RLHP VE
L RLHP VE
F R TVP S F
YQ TARS I
L R TVS TS
P RP I SS S
G I RG S I D
TS T TPC I
VG I T P S M
F GMAG S A
Y RV A S SG
F Y N GQN Q
F G AG S N Q
F G AG S N Q
367
GL F Y E TE A
GL F Y E TE A
GL F Y E TE A
GL F Y VVAA
GL F Y VVAA
G V F Y EDMA
GL F Y V TE A
GQ F DQN MA
GQ F DQN MA
GQ F D E N L A
G L F D E RM A
GN F N E H F A
G N F N E TWA
A S F Y E K EG
A S F Y E K EG
A S F Y E D TG
A S F Y E D TG
AN F Y E D L G
VD VD E E MA
GN F F GQG A
A I GQ A P V I
A I GQ A P V I
V Y A Y P DM T
L F AY SVKA
L F AY SP I A
L F AY P SKA
L N A Y P GQ I
L F AD L I P T
T I C NMP Q A
TA S Y ED EM
AG F Y S P QG
GG F Y S S QG
TG F F P S P G
YN S Y AK E A
F DG S EQ T F
GMF R E T T A
A L MA P HQC
L S S Y ED I A
G S F SQ E L A
GS F S PQL A
G S F S ED I A
GMF Y E D A A
L SVY P AY A
V N T Y P TH A
G L F Y E E VG
AL L Y E P TA
G L F Y E E TG
L NC D P K I S
L N MWE P K A
L N MWE P K A
L NMY SG K A
I N TD P D L A
LNS Y PNP L N S Y SG P A
VN S Y P E P A
L N S Y P GQ A
L NMY SN I A
I N T Y QD I A
I N TY PD I A
I NAY P SF S
L AMY A P MA
L AMY P GH A
L AMF S KQ A
L AMY VN D T
L AVYQPDA
L AMF L P R A
L N T Y QN T A
L NMY P P H A
L S MN P L N A
L NMY P L I A
I N AD P V T T
L N VC P E Y A
I N AN A S T T
G F F Y SN K A
G Y F Y ED K A
I WM Y E D K A
A WM Y E D Q A
A L F S V SG A
A N F D TD D A
AN F D E DD A
V Y SQ P E Y L
AF L SE RL A
G F L GG E L T
AF L - GP VA
- F S QH L AQ
VN I DG I L T
AF I GSSL A
L L AN S D L A
G - - T E HQC
Y P K TQ A Q V
Y P M TQ A Q V
Y P M TQ A Q V
395
L I G - WY A
L I G - WY A
L I G - WY A
F I G - WY E
F I G - WY E
I I G - WY P
F I G - WY A
I P G - WY E
I P G - WY E
I P G - WY Q
L NG - P Y S
I MA - T Y S
I MG - T Y T
L L G - WL S
F L G - WL S
L I G - WL E
L I G - WL E
L L G - WL Q
L PG - YH S
L LG - Y I E
L V G - WY P
L V G - WY P
L L G - WY F
L PG - F F F
MP G - F Y Y
L PG - F Y L
L P G - WL Y
I TF - GY I
M P G - WF D
F A N - WF D
I P G - WL G
I TG - WL E
G P G - WF T
I P G - WWA
Y Y G - WY N
L MSG S Y R
F NG - A L R
F MG - WY S
L H E - S MG
LPS-DTI
L VG - D F T
L L D - WA D
T F A - WY P
Y F A - WY P
F TG - WY D
F P G - WY H
F V G - WF Q
T L G - WY P
L Y G - WY S
L Y G - WY N
I P G - WY Q
T Y G - WY Q
A Y G - WY A
M RD - S F V
M RD - S F V
TN D - WY N
TN G - D Y T
I P M - WY N
I P M - WY N
L P M - WY A
T Y G - WY T
I Y G - WY S
TYG - TY E
T L G - WY T
L R D - WY G
VY P - F Y S
I Y D - WY N
V Y G - WY P
T L G - WY P
T L G - WY P
L L E - WH V
M Y G - WY D
V H A - WN T
L PG - F YG
L P A - WY T
L P G - WY S
L P G - WY T
LFE- I LP
L PS - SL P
L PS - SYK
F VG - N T E
F I G TS Y S
F HD - R S V
F HD - R S L
NA- D I PL
F L R - TKL
F L Y - RKL
QVE - I TL
L P D - RD V
TPH - QY T
TPN - TY T
TPN - TY T
446
GG F Q E A P L
GG F Q E A P L
GG F Q E A P L
EG YQE A P L
EG YQE A P L
PGY P E AP L
GG F P E A P L
G P S K F HG Y
G P S K F HG Y
E TS K F HG F
R TN P F H G Y
E Y S R F HG Y
E Y S R F HG Y
D TYG Y SN F
E S YG Y SN F
QQD I Y A A F
K RD L Y A A F
QP LGY Y P F
P N N TWR G Y
R RDDH S S Y
T TN T L R A Y
T TN T L R A Y
Y L A S V H TW
P L L GS Y TF
P L YGS Y S Y
P L SG S Y S Y
H L SG TY TY
P F K Y F Q TN
H A TN L A TQ
TG K D F G P A
GA I DY ASY
GA I DF ASF
GG K K F AG F
GYN P YH TF
G F ND YH P F
E PNP Y AS F
KS TVY VP L
G G WK T A G F
A I SQ Y A PQ
T Y S K F AGQ
T R S AH A P Q
T AN L Y V P Y
A L F GQ AQ F
S L QDQ V P Y
D F QQ Y T A Y
QF TY Y AAY
D F DH Y T A Y
DDH S I E S Y
Q P T Y V AHM
L S S Y V AHM
Y P I PDAL T
SQ TY TA A L
I L D T T A AQ
E A S D AGG R
E A S D AGG R
E K S E A AG R
QY TY VS AL
TE L G L SG V
TE L G L SG V
T E L G I AG L
S Y L I N S EH
R Y S TH N I Q
TH Y D Y A P H
S S S QD E E T
D P N RDD P S
T S N I A S QQ
S E MD G Y A F
SESY AAP F
FQ TS VAP F
FQ TS VAP F
L VD E S V Y F
A AD E I A S F
S I TED TVP
DQ F L S P T L
E PHAY I TF
A V S K Y MR F
N MH S Y V P F
KC D EN L A Y
K E D K AG S L
I E E K AG S L
ND L S V A A S
G P D P R AG T
I DL L YGP T
SD L A YG P V
VD S D T F N F
S I N K L GN A
T T S L RGN L
A S P P WA N L
N DN I Y VN S
S C R F G L AD
YC RFGS A T
YC RFGS A T
Supplemental Figure 11: Binding pocket alignment of mGABARs across metazoans
Select residues involved in the binding of GABA are highlighted, and numbers correspond to the position
in the human protein GABR1/GABAR-B1, based on the structure of GABR1 [Geng et al., 2013]. Highly
conserved residues not thought to be involved in binding are highlighted in blue. The two human proteins
are highlighted in pink. Placozoan proteins are highlighted in gray. Sponge proteins are highlighted in blue.
The two ctenophore proteins are highlighted in violet.
36
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
A.thaliana_NP_565744.1__GluR5
G5EKN9_SOLLC__SlGLR3.1
V.vinifera_F6HIJ5_VITVI
V.vinifera_F6HIJ6_VITVI
A.thaliana_NP_565743.1__GluR3.5
A.thaliana_NP_001030971.1__GluR3.4
G5EKP2_SOLLC__SlGLR3.4
G5EKP0_SOLLC__SlGLR3.2
V.vinifera_A5AMA8_VITVI
99
A.thaliana_NP_974686.1__GluR3.2
A.thaliana_NP_028351.2__GluR
A.thaliana_NP_190716.3__GluR3.6
G5EKP3_SOLLC__SlGLR3.5
V.vinifera_D7SWB7_VITVI
54 52
A.thaliana_NP_174978.1__GluR3.3
9951
V.vinifera_D7UDC6_VITVI
G5EKP1_SOLLC__SlGLR3.3
G5EKN1_SOLLC__SlGLR1.1
G5EKN2_SOLLC__SlGLR1.2
V.vinifera_A5AGU5_VITVI
V.vinifera_A5BMN8_VITVI
V.vinifera_F6HM67_VITVI
V.vinifera_F6GXG0_VITVI
V.vinifera_A5AUG7_VITVI
62
A.thaliana_NP_187061.1__GluR1.1
A.thaliana_NP_187408.2__GluR1.4
77
A.thaliana_NP_199652.1__GluR1.3
A.thaliana_NP_199651.1__GluR1.2
G5EKN7_SOLLC__SlGLR2.5
98 G5EKN4_SOLLC__SlGLR2.2
G5EKN6_SOLLC__SlGLR2.4
91
G5EKN5_SOLLC__SlGLR2.3
78 G5EKN3_SOLLC__SlGLR2.1
A.thaliana_NP_180476.3__GluR2.7
A.thaliana_NP_180474.1__GluR2.9
93
A.thaliana_NP_180475.2__GluR2.8
A.thaliana_NP_196682.1__GluR2.5
95
A.thaliana_NP_196679.1__GluR2.6
A.thaliana_NP_180047.1__GluR2.3
98
A.thaliana_NP_180048.1__GluR2.2
A.thaliana_NP_194899.1__GluR2.4
88
A.thaliana_NP_198062.2__GluR2.1
G5EKN8_SOLLC__SlGLR2.6
V.vinifera_A5AIS1_VITVI
80
V.vinifera_A5AD54_VITVI
84
V.vinifera_A5BDG6_VITVI
V.vinifera_F6H9F4_VITVI
V.vinifera_F6H9D0_VITVI
V.vinifera_F6H9G4_VITVI
V.vinifera_A5AU42_VITVI
V.vinifera_A5AQR7_VITVI
V.vinifera_A5AVQ8_VITVI
85 V.vinifera_F6H9G5_VITVI
V.vinifera_F6H9H0_VITVI
oscarella_t_h60_63112_comp13096_c0_seq1
oscarella_t_h60_08441_comp7318_c0_seq1
L.complicata_lctid7089
91
S.ciliatum_scpid22929_GLUR3.5
scict1.030436.1_0
oscarella_t_h60_33991_comp10944_c0_seq1
oscarella_t_h60_03154_comp3263_c0_seq1
91
oscarella_t_h60_50237_comp11700_c0_seq3
oscarella_t_h60_50246_comp11700_c0_seq12
S.ciliatum_scpid26529_GLUR3.4
L.complicata_lctid31759
oscarella_t_h60_65012_comp23829_c0_seq1
oscarella_t_h60_06601_comp6249_c0_seq1
42
oscarella_t_h60_65051_comp24134_c0_seq1
78
S.ciliatum_scpid25297_GLUR3.1
L.complicata_lctid34850
81
S.ciliatum_scpid27594_GLUR3.7
L.complicata_lctid7088
69
S.ciliatum_scpid21909_GLUR2.9
S.ciliatum_scpid15641_GLUR2.9
euplokamis_t_h20_19863_comp13082_c0_seq1
Hcal_TR15585_c2_g1_i1_m.37437
Hcal_TR15585_c3_g1_i1_m.37441-TR15585_c0_g1_i1
Pleurobrachia_bachei_AEX15543.1
67
MLRB004413
euplokamis_t_h30_23254_comp14560_c1_seq1-h10_06856
MLRB00443-edited
55 95
Pleurobrachia_bachei_AEX15551.1_trimmed
Hcal_TR7392_c0_g1_i1_m.11709
Pleurobrachia_bachei_AEX15547.1
Hcal_TR7373_c0_g1_i1_m.11556
53
99
euplokamis_t_h20_19416_comp12942_c0_seq1
Pleurobrachia_bachei_AEX15548.1-trimmed
euplokamis_t_h30_09966_comp8826_c0_seq1
Pleurobrachia_bachei_AEX15546.1
61
MLRB306921
ML30697a-trimmed
Pleurobrachia_bachei_AEX15539.1
Hcal_TR18969_c1_g2_i2_m.48365_partial
94
Pleurobrachia_bachei_AEX15541.1
10
Pleurobrachia_bachei_AEX15549.1
91
Pleurobrachia_bachei_AEX15542.1
ML00626a
89
ML032222a
ML032221a
ML05909a
Pleurobrachia_bachei_AEX15550.1
Pleurobrachia_bachei_AEX15544.1
ML111714a
85
ML15636a
ML085016a
78 65
63
Pleurobrachia_bachei_AEX15540.1
36
ML0850-17a-18a-fusion
Pleurobrachia_bachei_AEX15545.1
MLRB150054-fixed
ML03683a
95
ML027316a-fixed
ML07344a-recut
58
ML150012a-MLRB150049
88 99
ML150010a-trimmed-MLRB150043
MLRB064020
C3ZFG6_BRAFL
96
C3YQ18_BRAFL
C3YZA0_BRAFL-trimmed
78
C3ZMS5_BRAFL-cutshort
C3ZKA8_BRAFL-trimmed
64
adi_v1.00972
85
A7RPU4_NEMVE
83
adi_v1.17049-trimmed
adi_v1.11422
54
AIPGENE1622
90
A7T1G4_NEMVE
H13_TR19588_c2_g2_i3_m.37163
g1037.t1__scaffold_1__Triad1-18943
H13_TR30549_c0_g1_i1_m.66150
g1036.t1__scaffold_1__Triad1-18262
57
H13_TR15638_c1_g1_i1_m.29638
g1035.t2__scaffold_1__Triad1-18823
g9262.t1__scaffold_14__Triad1-30612
g9261.t1__scaffold_14__Triad1-30609
Triad1_55165
94
H13_TR9210_c0_g1_i1_m.20124
g10441.t1__scaffold_23__Triad1-32461
H13_TR13015_c0_g2_i3_m.25087-partial
99
g10442.t1__scaffold_23__Triad1-61396
g10443.t2__scaffold_23__Triad1-3218
94
H13_TR30216_c5_g1_i3_m.64945
A7SFF8_NEMVE_NMDA-like
AIPGENE3629-joined_AIPGENE1829-allele
98
A7SGV8_NEMVE_NMDA-like
68
AIPGENE14101
A7RLA0_NEMVE_NMDA-like
18
A7SL56_NEMVE
adi_v1.13302
Ocbimv22034182m.p
99
99
D.melanogaster_NP_730940.1__NMDAR-Z-like
Amel_4.5_GB46886
81
X.laevis_NP_001081616.1_NMDA1.2
Q6ZM67_DANRE_grin1b
E7F101_DANRE_grin1a
46NMDZ1_MOUSE
71
NMDZ1_HUMAN
C3YII6_BRAFL
71
obirnaseq.137713.1_1
Mouse_NP_001263284.1_NMDA3A
NMD3A_HUMAN
NMD3B_MOUSE
NMD3B_HUMAN
99
Amel_4.5_GB48097
D.melanogaster_NP_001014714.1__NMDAR-like
C3Y993_BRAFL
C3Y747_BRAFL
22
96
NMDE2_MOUSE
NMDE2_HUMAN
86
NMDE1_HUMAN
NMDE1_MOUSE
NMDE3_MOUSE
NMDE3_HUMAN
E9QBK8_DANRE_grin2db_NMDE4-like
NMDE4_MOUSE
NMDE4_HUMAN
oscarella_t_h60_52136_comp11753_c0_seq16
oscarella_t_h60_15764_comp9193_c0_seq1
67
oscarella_t_h60_42587_comp11410_c0_seq2
H13_braker1_g01988.t1
T.adherens_g388.t5
0
obirnaseq.63134.1_0
13
D.melanogaster_NP_001260049.1_25a
84
99
D.melanogaster_NP_727328.1_8a
scict1.009114.1_0
13
obirnaseq.13493.1_0
95
obirnaseq.109126.1_2
PH13_braker1_g02946.t1
15
g5117.t1__scaffold_5__Triad1-14565
71
g5089.t1__scaffold_5__Triad1-25027
92
PH13_braker1_g02971.t1
PH13_braker1_g02972.t1
g5085.t1__scaffold_5__Triad1-25025
C3XPD8_BRAFL
C3ZRD9_BRAFL
60
86
C3ZQV0_BRAFL
0
X.tropicalis_NP_001096470.1_DELTA1
98
GRID1_HUMAN
GRID1_MOUSE
GRID2_DANRE
GRID2_MOUSE
GRID2_HUMAN
Ocbimv22018191m.p
36
C3ZMC8_BRAFL
89
C3ZZY6_BRAFL
3
obirnaseq.94822.1_1
60
Amel_4.5_GB53122
Amel_4.5_GB49273
80
Amel_4.5_GB49268
98
Q9VIE2_DROME__Clumsy
69
Amel_4.5_GB49275
D.melanogaster_CAB64942.1
Danio_rerio_XP_009290381.1_K5
GRIK5_HUMAN
8
GRIK5_MOUSE
Danio_rerio_XP_017206665.1_K4
GRIK4_MOUSE
GRIK4_HUMAN
54
C3ZWB5_BRAFL
Danio_rerio_XP_009303592.1_K1
GRIK1_MOUSE
81
GRIK1_HUMAN
Danio_rerio_XP_017206939.1_K3
GRIK3_MOUSE
GRIK3_HUMAN
87 Danio_rerio_XP_017206873.1_K2
99
B9V8S1_XENLA_GRIK2
98
GRIK2_MOUSE
99
GRIK2_HUMAN
obirnaseq.32300.1_1
85
obirnaseq.114940.1_2
79
Amel_4.5_GB40973
D.melanogaster_NP_476855.1
D.melanogaster_NP_001261621.2
Q71E63_DANRE_gria2a
Q71E62_DANRE_gria2b
B9V8R8_XENLA_GRIA2
99
GRIA2_HUMAN
99
GRIA2_MOUSE
55
B3DGS8_DANRE_gria1a
Q71E64_DANRE_gria2a
X.laevis_NP_001153151.1_AMPA1
GRIA1_MOUSE
78
GRIA1_HUMAN
Q71E61_DANRE_gria3a
99Q71E60_DANRE_gria3b
X.laevis_NP_001153154.1_AMPA3
99GRIA3_MOUSE
GRIA3_HUMAN
51
X.laevis_NP_001153157.1_AMPA4
GRIA4_MOUSE
52GRIA4_HUMAN
Q71E59_DANRE_gria4a
Q71E58_DANRE_gria4b
86
99
52
Plants
Sponges
O.carmela
L.complicata
S.ciliatum
90
99
81
Ctenophores
Cnidarians
Placozoans
NMDAR
Glycine
Acidic ligand
Unknown ligand
DeltaR
Clumsy
Ctenophores
Placozoans
Cnidarians
Protostomes
Chordates
Vertebrates
Plants
KainateR
AMPAR
0.5
Supplemental Figure 12: Phylogenetic tree of ionotropic glutamate receptors across metazoans
Protein tree generated with RAxML using the PROTCATWAG model. Bootstrap values are 100 unless
otherwise shown. These receptors are not found in the genomes or transcriptomes of demosponges or
hexactinellids, so the Sponge clade refers to calcareous sponges and homoscleromorphs. For the three sponges,
the blue star indicates sequences derived from a transcriptome. Based on [Alberstein et al., 2015], some
receptors are predicted to bind ligands other than glutamate, shown with the green star, red diamond, and
37
black star, for glycine, acidic ligands, and unknown, respectively.
Four placozoan proteins have substitutions
at the conserved acidic residue (D723 in human GluN1), as either GY in ctenophores, or GG/WY in
placozoans; the carboxyl of the glutamic/aspartic acid is needed to coordinate the amino group of glutamate,
suggesting that these proteins do not bind an α -amino acid.
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
NMD3A_HUMAN
Human NMDA-class iGluRs
NMDE1_HUMAN
GRIA2_HUMAN
Human AMPA-class iGluRs
euplokamis_t_h30_09966_comp8826_c0_seq1
Ctenophore iGluRs
ML032222a
Pleurobrachia_bachei_AEX15542.1
Calcisponge/Hsm AMPA-like
oscarella_t_h60_42587_comp11410_c0_seq2
scict1.009114.1_0
A.thaliana_NP_187061.1__GluR1.1
S.ciliatum_scpid25297_GLUR3.1
Plant iGluRs
Calcisponge/Hsm iGluR-like
oscarella_t_h60_06601_comp6249_c0_seq1
L.complicata_lctid34850
Demosponge mGluR-like
Aqu2.1.29334_001
E.muelleri−Emu1128−0.9−mRNA−1
Signal peptide
ANF_receptor
7-transmembrane
bac SBP type 3
Ligand channel
NCD3G
NMDAR2_C
twi_ss.7125c.5|m.10167
0
250
500
750
1000
1250
1500
Supplemental Figure 13: Domain organization of ionotropic glutamate receptors across metazoans
Scale bar displays number of amino acids. Top BLAST hits for human iGluRs in demosponges appear to be
metabotropic, due to the presence of a 7-transmembrane domain instead of the ion channel, while the ligandbinding domain is conserved. Ctenophore iGluRs and some calcarea/homoscleromorph (Hsm) proteins have
the vertebrate-type domain organization, though the other calcarea/homoscleromorph proteins (main sponge
group in Supplemental Figure 12) have an SBP domain.
38
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
100
CAOG_01813T0__S17A9-like
Alatina_alata_c59016_g1_i1__S17A9
hydra_sra.33023.2__s17a9
Scopalina_sp_TR21479_c0_g1_i1__s17a9
X.testutinaria_TR78129_c0_g1_i1__s17a9
Aqu2.1.12587_001_partial__s17a9
Stylissa_carteri_TR73134_c0_g1_i2__s17a9
91
twi_ss.30106.1__s17a9
81
NVE8304
NVE22746
75
Porites_australiensis_3031__S17A9
99
Acropora_millepora_10895__S17A9
Danio_rerio_NP_001002635.1__SLC17A9
50
Xtropicalis_XP_002944466.2__SLC17A9
Gallus_gallus_NP_001006292.1__SLC17A9
73 87
S17A9_HUMAN
75
S17A9_MOUSE
Branchiostoma_belcheri_XP_019645208.1__SLC17A9
FBpp0077146__S17A9
82
Amel_4.5|GB46143-PA__S17A9
HELRODRAFT_76708__SLC17A9
40
49 66
Capca1|155164__S17A9
Capca1|155158
62
Lingula_comp143954_c0_seq4__S17A9
obirnaseq.110733.2__S17A9
79
9883
Cgigas_EKC18089__S17A9
LOTGIDRAFT_112031__SLC17A9
CAOG_02811T0
obirnaseq.94311.1
Hoilungia_braker1_g07337.t1
6
Triad1_g69.t1__scaffold_1
Hoilungia_braker1_g07338.t2
scict1.031446.1
lctid25328
lctid59401
scict1.029742.13
scict1.015158.1
36
bathyctena_t_h10_20781_comp13657_c0_seq4
beroeabyssicola_t_h20_12199_comp12599_c0_seq1
65 99
ML011726a
bolinopsis_t_h20_00340_comp262_c0_seq1
91
euplokamis_t_h20_11645_comp9428_c0_seq1
beroeabyssicola_t_h20_15083_comp14311_c0_seq2
thalassocalyceinconstans_t_h10_05529_comp6090_c0_seq1
78
bathocyroe_t_h20_21180_comp15532_c0_seq1
45 31
Hcal_TR27180_c1_g1_i1
41
dryodora_t_h20_15255_comp12806_c0_seq1
72
ML21903a
38
bolinopsis_t_h20_19456_comp13734_c0_seq1
42 velamen_t_h20_17746_comp13342_c0_seq1
Monbr1_g5507.t1__scaffold_14
99
Srosetta_PTSG_01814T0
Halisarca_dujardini_HADA01062594.1
19
Halisarca_dujardini_HADA01062593.1
sympagella_nux_TR6482_c0_g1_i1_partial
aphrocallistes_vastus_comp19335_c0_seq1
94
Scopalina_sp_TR32724_c0_g1_i1
68
Scopalina_sp_TR32724_c0_g2_i2
95
72
twi_ss.14577.1
76
Tedania_anhelens_TR19189_c0_g1_i10
Petrosia_ficiformis_TR2695_c2_g1_i1
88
Aqu2.1.28662_001
97
X.testutinaria_TR52323_c0_g1_i1
69
X.testutinaria_TR24582_c0_g1_i4
Aqu2.1.29226_001
13
Aqu2.1.28663_001
scict1.023385.2
lctid55483
Corticium_candelabrum_TR84939_c2_g3_i1_partial
75
65
Oscarella_carmela_comp9729_c0_seq1
84
Oscarella_carmela_comp31204_c0_seq3
78 91
Oscarella_carmela_comp37189_c0_seq1
Oscarella_carmela_comp40500_c0_seq9
Hydractinia_symbiolongicarpus_c17024_g1_i4
Alatina_alata_g24140.1_i2
96
AIPGENE1813
NVE15609__VGLUT-like
Hydractinia_symbiolongicarpus_g7761.1_i1
Alatina_alata_g35508.1_i1
92
Acropora_millepora_17115
AIPGENE1891
97 NVE15610__VGLUT-like
NVE21028__VGLUT-like
14
95
NVE16311__VGLUT-like
Alatina_alata_c52723_g1_i1
35
Hydractinia_symbiolongicarpus_c16442_g1_i1
78
hydra_sra.17150.1
95
Alatina_alata_g57195.1_i1
hydra_sra.36087.1
Hydractinia_symbiolongicarpus_c19851_g1_i1
Acropora_millepora_10896
Porites_australiensis_15186
19
NVE20928__VGLUT-like
88 AIPGENE24249
Triad1_g5316.t1__scaffold_5
Hoilungia_braker1_g11731.t1
98
Hoilungia_braker1_g10108.t1
40
Alatina_alata_c54481_g1_i1
97
Hydractinia_symbiolongicarpus_c23495_g1_i1
96
hydra_sra.10353.1
21 22
Alatina_alata_g56947.1_i1
77
Alatina_alata_g51821.1_i3
34
hydra_sra.10649.1
Hydractinia_symbiolongicarpus_c14056_g1_i1
NVE17787__VGLUT
22
Acropora_millepora_12796
52
Porites_australiensis_46055
82
NVE8595__VGLUT
50
AIPGENE9842
33
Helro1|65417
FBpp0083297
TC010895
94
58
NV17997-PA
96
Amel_4.5|GB53933-PA
50
Lingula_comp156026_c0_seq2
52
Cgigas_EKC30442
61 64
Lotgi1|126078
65 76
Lotgi1|102784
Lotgi1|161078
bflornaseq.42024.2
bflornaseq.42037.1
bflornaseq.29980.1-29981.1-joined
EKC38935
20
Cgigas_EKC18280
Lingula_comp149033_c2_seq2
Lingula_comp152169_c0_seq4
89
Capca1|208926
Lingula_comp149080_c0_seq7
59
Cgigas_EKC34298
97
Lotgi1|233350
Bf_V2_327_g44834.t1__bflornaseq.13338.1
23
10
Spurpuratus_XP_786480.3__VGLUT
Lotgi1|128007__VGLUT
94 86
obirnaseq.87250.1
Cgigas_EKC24439__VGLUT
82
Lingula_comp143616_c3_seq1__VGLUT
75
Capca1|177109__VGLUT
35
TC008459__VGLUT
55
Q9VQC0_DROME__VGLUT
81
Amel_4.5|GB54867-PA__VGLUT
53
NV15959-PA__VGLUT
VGLU3_DANRE
99Gallus_gallus_NP_001305958.1__VGLUT3
VGLU3_MOUSE
VGLU3_HUMAN
VGLU1_XENTR
Danio_rerio_NP_001092225.1__VGLUT1
78 VGLU1_HUMAN
VGLU1_MOUSE
83 VGL2A_DANRE
VGL2B_DANRE
81
Gallus_gallus_NP_001161855.1__VGLUT2
2991
VGLU2_MOUSE
98
VGLU2_HUMAN
Spurpuratus_XP_780445.3
Spurpuratus_XP_795625.3
72
Spurpuratus_XP_783585.2
94
Spurpuratus_XP_782868.2
82
Spurpuratus_XP_785484.3
Danio_rerio_NP_001070195.1__sialin
47
Salmo_salar_NP_001167306.1__sialin
99S17A5_HUMAN
S17A5_MOUSE
99 Alligator_mississippiensis_XP_006269813.1
99
Gallus_NP_001026257.1
98
Gallus_gallus_XP_015140231.1__sialin
Callorhinchus_milii_XP_007906946.1
Danio_rerio_XP_005159826.1
91
Alligator_mississippiensis_XP_006268054.2
Pelodiscus_sinensis_XP_014424898.1
68
E1BDD2_BOVIN__SLC17A4
99
S17A4_MOUSE
95S17A4_HUMAN
NPT3_BOVIN
NPT3_MOUSE
56 NPT3_HUMAN
89
NPT4_HUMAN
94
NPT1_MOUSE
NPT1_HUMAN
Capca1|183805
15
40
Helro1|186317
Lingula_comp149120_c0_seq6
27
Lotgi1|197570
27
80
Lotgi1|133858
35
obirnaseq.111983.1
32
Cgigas_EKC24609
7
Alatina_alata_c52956_g1_i1
Alatina_alata_c52359_g1_i1
99 46
Alatina_alata_g38981.1_i1
50
Hydractinia_symbiolongicarpus_c22518_g1_i1
53
NVE10158__sialin-like
AIPGENE16892
Porites_australiensis_22512
74
Acropora_millepora_10266
NVE12518__sialin-like
95 72
AIPGENE10305
71
NVE9481__sialin-like
80
AIPGENE16935
TC005982
7
NV11798-PA
TC006631
TC006632
Amel_4.5|GB41670-PA
64
83
NV10357-PA
NV15849-PA
97
NV17612-PA
NV17613-PA
61
TC006625
NV11203-PA
Amel_4.5|GB42969-PA
Amel_4.5|GB51651-PA
57
NV18394-PA
87
Amel_4.5|GB51650-PA
2
91
FBpp0086261
FBpp0309074
Amel_4.5|GB43225-PA
NV15041-PA
Hoilungia_braker1_g05533.t1
Capca1|113983
EKC40068
Lotgi1|107883
87
Lotgi1|155133
91
10
99 67
Lotgi1|161420
Lotgi1|107712
27
Capca1|93612
Helro1|108787
Helro1|71683
31
93
obirnaseq.69929.1
81
Cgigas_EKC35063
Cgigas_EKC35062
bflornaseq.39778.2
Capca1|209782
25
Lotgi1|113326
90 63
97
Cgigas_EKC25794
Lotgi1|167964
93
obirnaseq.30932.1
obirnaseq.22900.1
81
obirnaseq.22904.3
obirnaseq.22889.1
53
obirnaseq.30926.2
obirnaseq.30928.1
5330
obirnaseq.22894.5
obirnaseq.30927.1
24
3765 obirnaseq.123251.2
obirnaseq.123253.1
36 obirnaseq.30934.1
47
obirnaseq.30937.1
37
98
46
Nucleotide transporter
(SLC17A9)
100
Cnidarian VGLUT-like
Glutamate transporter
VGLUT
(SLC17A6,7,8)
Sialin (SLC17A5)
SLC17A4
Demosponges
Homoscleromorphs
Calcarea
Hexactinellids
Ctenophores
Placozoans
Cnidarians
Protostomes
Echinoderm/hemichordate
Chordates
Vertebrates
Filastereans
Choanoflagellates
0.6
Supplemental Figure 14: Vesicular glutamate transporter homologs across metazoans
Tree of VGLUT (SLC17A6-8) proteins across all metazoan groups, generated with RAxML using the
PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.
39
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
32
91
SC6A1_HUMAN
SC6A1_MOUSE
Srosetta_PTSG_08575T0
Hoilungia_braker1_g07132.t1
Triad1_g3650.t1__scaffold_3
Srosetta_PTSG_00672T0
Spurpuratus_XP_788574.3__S38AB
29
S38AB_DANRE
80
S38AB_HUMAN
S38AB_MOUSE
44
TC015432__S38
NV10096-PA__S38
31
Lingula_comp153547_c1_seq2_1__S38
65
EKC22480__S38
99
Lotgi1|174956__S38
CAOG_10085T0
lampea_t_h30_15888_comp12890_c0_seq2__cg4
Hcal_TR16507_c0_g1_i4__cg4
97
bolinopsis_t_h20_07298_comp7564_c0_seq1__cg4
80
45 thalassocalyceinconstans_t_h10_14237_comp12108_c0_seq1__cg4
57
beroeabyssicola_t_h01_04633_comp4052_c0_seq1__cg4
99
76 dryodora_t_h20_15361_comp12868_c0_seq1__cg4
Hoilungia_braker1_g00642.t1
Triad1_g3043.t2__scaffold_3
82
Cgigas_EKC37816
AIPGENE11622
60
Porites_australiensis_5139
Montastraea_cavernosa_84647
37Madracis_auretenra_42167
scict1.009980.1
lctid63806
Alatina_alata_c58688_g1_i1
hydra_sra.11928.1
93
resomia_t_x0_062726_comp75224_c0_seq1
Alatina_alata_c52262_g1_i1
19
hydra_sra.11232.1
hydra_sra.11218.1
34
Alatina_alata_c49879_g1_i1
Alatina_alata_c53411_g1_i1
98
hydra_sra.20297.1
56
hydra_sra.20302.1
AIPGENE25242
94
Acropora_millepora_11455
Porites_australiensis_9197
AIPGENE7072
NVE17671
26
98 Porites_australiensis_38448
Acropora_millepora_17942
98 resomia_t_x0_049257_comp70251_c0_seq1
94
resomia_t_x0_027597_comp54692_c0_seq1
98
Alatina_alata_g42698.1_i1
51
Alatina_alata_c51053_g1_i1
94
Acropora_millepora_8356
56
Porites_australiensis_37320
Acropora_millepora_12688
99
Porites_australiensis_49386
59
NVE17670
43
AIPGENE27144
NVE5099
NVE22971
NVE15339
79
85
AIPGENE28594
83
Acropora_millepora_16417
74
Porites_australiensis_36679
Alatina_alata_c44674_g1_i1
99
NVE24524
hydra_sra.24414.1
81
AIPGENE4721
99
NVE8024
Alatina_alata_g47107.1_i1
hydra_sra.1815.1
99
Alatina_alata_g24024.1_i1
29
AIPGENE26117
NVE8637
76
AIPGENE29146
NVE25975
hydra_sra.28166.1
24
Alatina_alata_g44676.1_i1
PFL3_pfl_40v0_9_20150316_1g2763.t1
Spurpuratus_XP_795408.3
obirnaseq.30198.1
95
Helro1|183498
81
68
Capca1|225531
39
TC009185
FBpp0086703
95 Amel_4.5|GB40541-PA
NV15949-PA
Spurpuratus_XP_791315.3__VIAAT
bflornaseq.38036.1
31
Gallus_gallus_XP_417347.3__VIAAT
85
VIAAT_XENTR
54
Danio_rerio_NP_001074170.1__VIAAT
49VIAAT_MOUSE
VIAAT_HUMAN
oscarella_t_h60_07389_comp6812_c0_seq2
beroeabyssicola_t_h20_14227_comp13818_c0_seq1
bathocyroe_t_h20_04478_comp6860_c0_seq1
Hcal_TR19444_c0_g1_i2
52 32thalassocalyceinconstans_t_h20_14855_comp14523_c0_seq1
24 velamen_t_h20_22074_comp14838_c0_seq2
62bolinopsis_t_h20_25026_comp15044_c0_seq3
49
ML073035a
Monbr1_g6159.t1
Srosetta_PTSG_07915T0
Lotgi1|157294__ANTL1
28
99
Lingula_comp152251_c0_seq2__ANTL1
bflornaseq.11495.1
74
oscarella_t_h60_33621_comp10920_c0_seq1
twi_ss.6165.1
25 58 98
Petrosia_ficiformis_TR11201_c0_g1_i1
99
X.testutinaria_TR68852_c1_g1_i1
93
Aqu2.1.23868_001
38
Triad1_g5051.t1__scaffold_5
Hoilungia_braker1_g03002.t1
Porites_australiensis_6407__ANTL1
63
62
Acropora_millepora_13042__ANTL1
80
NVE21968__ANTL1
98
Exaiptasia_pallida_KXJ22393.1__ANTL1
euplokamis_t_h30_07022_comp6902_c0_seq1
90 charistephanefugiens_t_h10_09118_comp9861_c0_seq1
Hcal_TR17487_c0_g1_i1
dryodora_t_h20_09505_comp9226_c0_seq1
88
bathocyroe_t_h20_17157_comp14441_c0_seq1
5842 thalassocalyceinconstans_t_h20_24470_comp18878_c0_seq3
97
25ML104321a
81 velamen_t_h20_12670_comp11072_c0_seq1
83 bolinopsis_t_h20_08753_comp8580_c0_seq1
CAOG_00255T0
Monbr1_g2633.t1
34
scict1.017867.1
lctid70109_0
lctid78596_0
99
scict1.018556.1
scict1.015001.1
96
scict1.015001.2
90
lctid12311_0
lctid80334_0
aphrocallistes_vastus_comp20147_c1_seq1
rosella_fibulata_TR11879_c0_g1_i1
sympagella_nux_TR7911_c0_g2_i1
87
aphrocallistes_vastus_comp6524_c0_seq1
sympagella_nux_TR20508_c0_g1_i4
rosella_fibulata_TR13163_c0_g1_i1
78
Scopalina_sp_TR28065_c0_g3_i1
44
twi_ss.8957.1_1
56
Tedania_anhelens_TR6077_c3_g2_i2
Aqu2.1.31258_001
63
Aqu2.1.31260_001
Petrosia_ficiformis_TR11578_c0_g1_i1
33
25
X.testutinaria_TR48957_c0_g1_i8
82
X.testutinaria_TR65433_c1_g3_i10
oscarella_t_h60_10767_comp8125_c0_seq6
Triad1_g5318.t2__scaffold_5
Triad1_g5319.t1__scaffold_5
euplokamis_t_h20_13527_comp10429_c0_seq1__S36A
39
bathyctena_t_h30_00646_comp699_c0_seq1__S36A
dryodora_t_h20_27957_comp16799_c0_seq3__S36A
bathocyroe_t_h20_09052_comp10596_c0_seq3__S36A
95
76
thalassocalyceinconstans_t_h10_13472_comp11685_c1_seq1__S36A
41
68
velamen_t_h20_29161_comp16666_c0_seq1__S36A
96
bolinopsis_t_h20_13495_comp11280_c1_seq1__S36A
79 ML085720a__S36A
Acropora_millepora_4745__S36A
Porites_australiensis_21159__S36A
22
89
NVE5526__S36A
NVE5525__S36A
93
AIPGENE23310__S36A
68
Alatina_alata_c55134_g1_i1__S36A
hydra_sra.16612.1__S36A
hydra_sra.16610.1__S36A
45
Alatina_alata_c43290_g1_i1__S36A
98
hydra_sra.11681.1__S36A
21
Alatina_alata_c41537_g1_i1__S36A
bflornaseq.11538.1-g25112.t1
40
Spurpuratus_XP_003723741.1__S36A
Lingula_comp153549_c1_seq2_1__S36A
63
Lotgi1|218879__S36A
89
NV16821-PA__S36A
15
FBpp0292523__S36A
97
71
NV18592-PA__S36A
71
Amel_4.5|GB51487-PA__S36A
40
NV11499-PA__S36A
S36A4_MOUSE
S36A4_HUMAN
S36A3_HUMAN
S36A3_MOUSE
S36A1_HUMAN
97 S36A1_MOUSE
46
S36A2_MOUSE
S36A2_HUMAN
94
Na-coupled neutral AA transporter
(SLC38)
Unknown VIAAT-like
88
Cnidarian-specific VIAAT-like
32
GABA transporter VIAAT
(SLC32A1)
41
Demosponges
Homoscleromorphs
Calcarea
Hexactinellids
Similar to
Aromatic and neutral
AA transporter
(ANTL)
Ctenophores
Placozoans
Cnidarians
Protostomes
Echinoderm/hemichordate
Chordates
Vertebrates
Filastereans
Choanoflagellates
H-coupled AA transporter
(SLC36)
0.6
Supplemental Figure 15: Vesicular inhibitory amino acid transporter homologs across metazoans
Tree of VIAAT (SLC32A1) proteins and related transpoters across all metazoan groups, generated with
RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.
40
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Spongospora_subterranea_A0A0H5R9G2
20
Ostreococcus_lucimarinus_A4RY52_OSTLU
70
Red algae
M2XB31_GALSU
38
Klebsormidium_flaccidum_A0A0U9I7C4_KLEFL
Oryza_brachyantha_J3L063_ORYBR
Beta_vulgaris_A0A0J8B722_BETVU
K4CUY8_SOLLC
36
F6HQ45_VITVI
49
Plants
Citrus_sinensis_A0A067H648_CITSI
99
Eutrema_salsugineum_V4KNG8_EUTSA
PIEZO_ARATH
D.discoideum_Y2801_DICDI
92
Rozella_allomycis_EPZ31590.1
Monbr1_g1806.t2
99
Choanoflagellates
Salpingoeca_rosetta_PTSG_01300T0
Monbr1_g2005.t1
20
Ctenophores
euplokamis_t_h30_28970_comp15884_c0_seq4
bathyctena_t_x0_34756_comp23030_c0_seq1_0
beroeabyssicola_t_h20_27141_comp17803_c0_seq1
98
55
Pbachei_3461157_partial
68
hormiphora_t_h10_11700-TR2190_c1_g3_i2
46
bolinopsis_t_h20_24908_comp15026_c0_seq11
ML018021-AUGUSTUS
Aqu2.1.38329_001
ephydatiamuelleri_12638-joined
twi_ss.19116.1-19102.1
Sponges
oscarella_t_h60_60372_comp11964_c0_seq14
scict1.026728.1_scict1.029409.2_0
L.complicata_lctid5362
Placozoans
T.adherens_g4404.t1
Hoilungia_stringtie_3695.2_m.4488
99
resomia_t_x0_097703_comp79502_c0_seq1
hydra_sra.24978.1_2
NVE3870
Cnidarians
AIPGENE4679-13595-4700_partial
adi_v1.23438-05905-05906-05907_partial
Acropora_millepora_552
Porites_australiensis_8982
99
Montastraea_cavernosa_16922
Brafl_PIEZO_partial
O.dioica_2913001-5897001-5899001
57
94
F1NVW5_CHICK
39
PIEZ2_MOUSE
PIEZ2_HUMAN
Vertebrates
E1BX07_CHICK
48
PIEZ1_HUMAN
PIEZ1_MOUSE
Spur_390358264-joined
97
98
C.gigas_EKC42879-EKC42880
Transcriptome
Partial protein
obirnaseq.123544.1_0
42
Protostomes
Capca1_219762
81
L.anatina_comp156528_c0_seq1
D0VWN8_CAEEL
94
PIEZO_DROME
A0A087ZN43_APIME
Joined protein
99
TC013391
0.4
Supplemental Figure 16: Phylogenetic tree of Piezo homologs across metazoans
Protein tree generated with RAxML using the PROTCATWAG model and 100 bootstraps. Yellow circle
indicates the sequence was compete when joined with other genes or partial sequences, red partial circle
indicates the sequence is incomplete in the genome or transcriptome. A blue star indicates that the sequence
derived from a transcriptome, so copy number cannot be determined with certainty. Bootstrap values are
100 unless otherwise shown.
41
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Choanoflagellates
Monbr1_g1391.t1__scaffold_3
Srosetta_PTSG_00336T0
Hcal_TR17996_c0_g1_i1
beroeabyssicola_t_h20_12271_comp12645_c0_seq2
bolinopsis_t_h20_09329_comp8981_c0_seq1
velamen_t_h20_28621_comp16542_c0_seq1
77
ML17059a__SCN-like
euplokamis_t_h20_12092_comp9658_c0_seq3
Hcal_TR16070_c2_g2_i1
98
bathocyroe_t_h20_22759_comp15829_c0_seq2
88 74
thalassocalyceinconstans_t_h10_09040_comp8843_c0_seq1
98 bolinopsis_t_h20_18283_comp13341_c0_seq1
velamen_t_h20_34229_comp17681_c0_seq1
64
ML358826a__SCN-like
Hoilungia_braker1_g08315.t1
Triad1|54699_g3484.t1
Hoilungia_braker1_g10012.t1
Triad1_g3491.t2_Triad1-23340
Hoilungia_braker1_g08313.t1
Triad1_g3486.t1
Hoilungia_braker1_g10010.t1
Triad1_g3490.t1
Hoilungia_braker1_g05943.t1
Hoilungia_braker1_g08314.t1
99
Triad1_g3487.t1
PFL3_pfl_40v0_9_20150316_1g3063.t1
Spurpuratus_XP_793384.3
FBpp0300666
Amel_4.5|GB47190-PA
92
TC008776
Lingula_comp154129_c1_seq10
92
Capca1|134859
Lotgi1|118343
Cgigas_EKC21550
Halocynthia_roretzi_BAA95896.1
Brafl1|75071
Brafl1|249620_bflornaseq.37403.1
bflornaseq.37394.1-37393.1-Brafl1-143759
bflornaseq.37392.1-37391.1
Halocynthia_roretzi_BAA04133.1
Pmarinus_GENSCAN00000002143
Pmarinus_GENSCAN00000032390
F6XEC0_XENTR__scn4a
43
SCN4A_MOUSE
SCN4A_HUMAN
86
60
K9J7R3_XENTR__scn5a
SCN5A_HUMAN
99
SCN5A_MOUSE
SCNBA_HUMAN
99
98
SCNBA_MOUSE
99
SCNAA_HUMAN
SCNAA_MOUSE
F6XFJ2_XENTR__scn8a
SCN8A_RAT
SCN8A_HUMAN
87
K9J7Z6_XENTR__scn2a
F6UXH2_XENTR__scn1a
81
79
81
F6TZ24_XENTR__scn3a
SCN3A_RAT
25
SCN3A_HUMAN
SCN7A_HUMAN
44
SCN9A_HUMAN
SCN9A_MOUSE
34 SCN2A_HUMAN
SCN2A_RAT
92
SCN1A_HUMAN
SCN1A_MOUSE
SCNA_DROME__para
TC004749__SCN
48 Amel_4.5|GB42728-PA
93
NV14617-PA__SCN
Lingula_comp153551_c1_seq30
81
Cgigas_EKC22630__SCN
Lotgi1|107523
Capca1|210954
Helro1|89291
93
Helro1|119038
99
Helro1|109965
Helro1|64539
Rhopilema_esculentum_TR59849_c0_g1_i2
hydra_sra.25523.1
Craspedacusta_sowerbyi_TR39962_c1_g1_i1
Corallium_rubrum_TR22337_c0_g1_i3
AIPGENE4043
Stylophora_pistillata_5656
Porites_australiensis_12936
45
Acropora_millepora_9921
Nemve1|171660_NVE7195
AIPGENE22631
Nemve1|88319_NVE6936
Porites_australiensis_8690
Stylophora_pistillata_17883
Corallium_rubrum_TR32421_c1_g1_i1
AIPGENE9486
98
99
NVE7348__SCN-like
Stylophora_pistillata_13842
Acropora_millepora_904
98
Porites_australiensis_6111
Alatina_alata_c37058_g1_i1
70
Rhopilema_esculentum_TR83407_c2_g1_i2
Cyanea_capillata_AAA75572.1
Craspedacusta_sowerbyi_TR78689_c5_g1_i1
resomia_t_x0_058327_comp73969_c0_seq1
96
94
Hydractinia_symbiolongicarpus_c23789_g1_i2
77
Polyorchis_penicillatus_AAC38974.1
93
hydra_sra.26839.1
hydra_sra.39059.1
Ctenophores
Placozoans
85
100
91
81
99
99
49
84
50
Protostome
NaV2
Vertebrate
NaV1
73
Protostome
NaV1
99
94
81
22
Ctenophores
Placozoans
Cnidarians
Protostomes
Echinoderm/hemichordate
Chordates
Vertebrates
Choanoflagellates
67
Cnidarian
NaV Group 2
Anthozoan
NaV Group
Cnidarian
NaV Group 1
0.6
Supplemental Figure 17: Phylogenetic tree of voltage-gated sodium channel alpha subunits
across metazoans
Protein tree generated with RAxML using the PROTGAMMALG model and 100 bootstraps. Bootstrap
values are 100 unless otherwise shown.
42
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
scict1.007366.1
lctid45949
scict1.016080.1
lctid3586
Aqu2.1.42586_001
Sponges
Twilhelma_c34057_g1_i2_twi_ss.12332.4
59
Tedania_anhelens_TR33204_c2_g1_i1
64
100
Scopalina_sp_TR28310_c4_g1_i2
Srosetta_PTSG_09464T0
Monbr1_g2466.t1__scaffold_5
euplokamis_t_h30_14855_comp11501_c0_seq1
Hcal_TR25961_c0_g1_i1__CaC-like
bathocyroe_t_h20_30616_comp16858_c2_seq1
ML190424a__CaC-like
100
99velamen_t_h20_35925_comp17943_c0_seq2
55
Ctenophores
bolinopsis_t_h20_17550_comp13069_c0_seq1
Triad1_g1763.t1__scaffold_2
Hoilungia_braker1_g05522.t1
92
CAC1E_HUMAN
N/P/Q-type CaV
CAC1B_HUMAN
95
CAC1A_MOUSE
CAC1A_HUMAN
99
bflornaseq.27973.1
Amel_GB18730-PA
99
NV11619-PA
32
TC011227
TC011226
85
Cacophony
CAC1A_DROME__cacophony
99
Helro1|73530
Helro1|128993
69
71
Helro1|119050
Lingula_comp155476_c1_seq27
82
Cgigas_EKC27184
hydra_sra.3016.1
Acropora_millepora_1337
56
Porites_australiensis_47371
NVE1263__CaA-like
98
hydra_sra.19987.4
Rhopilema_esculentum_TR102482_c2_g1_i1
90
AIPGENE18550
NVE18768__CaA-like
Porites_australiensis_44917
Acropora_millepora_1120
Alatina_alata_c56718_g1_i3
hydra_sra.37904.1
NVE4667__CaA-like
AIPGENE13431
Acropora_millepora_894
Oscarella_carmela_comp39348_c0_seq21
Hoilungia_braker1_g06121.t1
Triad1_g1163.t2__scaffold_1
Lingula_comp156742_c0_seq2
56
Cgigas_EKC35362
Lotgi1|51275
98
TC004715
CAC1D_DROME__Ca-alpha1D
bflornaseq.3496.1_bflornaseq.3497.1
CAC1S_HUMAN
94
CAC1F_HUMAN
CAC1D_HUMAN
96
99
CAC1C_HUMAN
L-type CaV
CAC1C_MOUSE
hydra_sra.20483.3
Alatina_alata_c60357_g1_i1
Chironex_fleckeri_TR37575_c1_g1_i8__CAC1C
75
Chrysaora_fuscescens_TR20395_c0_g1_i2__CAC1C
Rhopilema_esculentum_TR83197_c1_g2_i1__CAC1C
Corallium_rubrum_TR89372_c0_g1_i1__CAC1C
95
NVE6254__CaC-like
Stylophora_pistillata_AAD11470.1
60
Porites_australiensis_45764
Acropora_millepora_1095
Srosetta_PTSG_03773T0
Hoilungia_braker1_g01453.t1
Triad1_g2446.t1__scaffold_2
T-type CaV
CAC1I_HUMAN
100
CAC1H_MOUSE
CAC1H_HUMAN
58
Demosponges
Homoscleromorphs
Calcarea
Ctenophores
Placozoans
Cnidarians
Protostomes
Chordates
Vertebrates
Choanoflagellates
CAC1G_HUMAN
bflornaseq.16516.1
100
NV13530-PA__CAC1G-like
96
82
98
54
58
44
81
FBpp0088415__CAC1G-like
TC005355__CAC1G-like
Capca1|89566__CAC1G-like
Helro1|66349__CAC1G-like
39
Helro1|170765__CAC1G-like
Helro1|148954__CAC1G-like
Lingula_comp156482_c1_seq2__CAC1G-like
98
EKC37236__CAC1G-like
71
Lymnaea_stagnalis_AAO83843.2
Lotgi1|220094__CAC1G-like
Chironex_fleckeri_TR32013_c0_g1_i1__CAC1G
Corallium_rubrum_TR78644_c0_g1_i1__CAC1G
99
48
Porites_australiensis_46716__CAC1G
99
AIPGENE17378__CAC1G-like
NVE5017__CaG-like
Rhopilema_esculentum_TR92782_c0_g1_i2__CAC1G
Acropora_millepora_3959_partial
0.7
NVE7616__CaG-like
Supplemental Figure 18: Phylogenetic tree of voltage-gated calcium channels across metazoans
Protein tree generated with RAxML using the PROTGAMMALG model and 100 bootstraps. Bootstrap
values are 100 unless otherwise shown.
43
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
8
03m
12
an
_c
m
ciu
rti
Co 3.1 916
3
54 d7
15lcti
l
de
88
9
92
g
4_
_c
2_
_
i1
al
rti
pa
l
rtia
pa
0_
08 5
863909
id
24.1
scplctid
018 0
t1.0 8201
scic lctid
Calcisponges
1.2
1545
ict1.0 331
sc
lctid79
56.2
scict1.0042
lctid66549
0.9
49
0.98
8
Plants
0.96
0.
67
0.75
4
0029
Chlorella_variabilis_EFN53563.1
0.925
Polysphondyli
um
_pallidum_EF
Sak
ow
v30
03
60
Pha
85
m
seq1
301_comp13867_c1_
beroeabyssicola_t_h20_14
ctylu
m_
_AEQ59
28
6.1
rudii_
ADM
25
lass
825.1
iosira
tric
_pse
orn
udo
utu
nan
m_
a _X
XP_
P_0
022
002
9 33
180
60.1
795
.1
Anthozoan Group 2
19
_s
p_
TR
12
St
yl
64
iss
0_
_fi
a_
c1
X.t cif
ca
est or
_g
rte
uti mi
ri_ 1_
na s_T
TR i2
ria R
My
2
_
77
cale
TR 18
1
69 72
_ph
75 _c 78_
yllo
8_ 0_
c
ph
c2 g1 0_g
Cre
ila
_g _
ll
_
a
3_i1 1_i
Ted
TR
ania _eleg
i2
1
97
a
96
_an
Cram
2_c
hele ns_TR
be_c
ns_ 35
0 _g
rambe
1_i1
_TR3 TR183316_
Halisa
0718
8 c0
rca_du
_c2_g1_c0__gg1_
jardin
i3
i_HAD
1_i5 1_i2
A01036
606.1
Halisarca
_dujardin
i_HADA0
1057378.1
sia
al
in
a
Pe
tro
Sc
op
lithu
s_braa
Tha
AIP
AnGE
th NE
op 73
leu 91
ra
ta P
_e
st o
leg
ra ri
an
ea tes
tis
_c _a
sim
a v us
a_
er tra
65
no lie
82
sa ns
4
_1 is_
14 15
8 6 48
3
6
03 29
05 68 ta_
_9 a_1 illa
ria nr ist
ta te _p
cu reora
_s au h
iais_lop
nrgaSc ty
Faud
M
q1
q1
_se
_se
i1
_c0
1_
_c0
663
_g
005 omp23
c0
p11
4_
_c
58
om 31421
20
3_c
_
TR
252 h20
al_
0_1 sis_t_
Hc
_h2 nop
n_t boli
me
vela
seq1
_c0_
263
mp7
56_co
1
_088
h10
c0_seq
9_seq1
is_t_
487_
4651_c
p149
kam
_comp1
euplo
_22914
26_com
s_t_h10
_458
_h30
constan
calycein
tena_t
thalasso ML024915a
bathyc
bathocyroe_t_h20_10680_comp11581_c0_seq1
Demosponges
M
on
eficum
Cocco
eod
a
1
_i1
_g2_i2i1 1
_c1_g1g2_ 9.
i4
839_c0 0_ 83 _i2 _
R40 90 _c 15 2 g1
s_TR2630663 i_ss. c1_g c0_
n
a
2 tw 2_ 8_
leg _T
a_e elens e_TR
9
3
ll
5 27
Cre anh mb
09
3
_
ra
R1 TR
ania e_c
i_T sp_
Ted amb
ter
Cr
a_
ar
lin
_c
a
a
ss
op
yli
Sc
St
Protists
A75681.1
Karlodi
nium_v
en
0.999
1
wv3
Acro
Acropora
_mi
pora_
Ast
ra_20758
digllepo
itifera
MPooreriop
ora_sp
_10161
tes_
FnuMta
a
_2
nagsia
tr ustra 1046
dra_asea_c liensis_
Sct iscu_taarvern 3166
yloau ia_ osa
3
phret 682 _10
or enr 97 215
a_ a _
1
Co
pis 91
ra
til 26
lli G
lat
umor
a
_1
_rgon
9
1
ub ia
1
ru _ve
m nt
_T a l
R2 ina
36 _29
90 58
_c 1
0_
g1
_i
2
NVE9761
0.8
51
_esculen
Rhopilema
Sako
0.9
0.96
0.805
R7
_T
4
0.109
86
80.8
7
.0
um
r
ab
0.89
0.918
0.
73
0.862
R36069_c0_g1_i1
Chironex_fleckeri_T _g2_i1
1317_c0
tum_TR7
490.1
sra.26
hydra_
obirnaseq.13988.1
.t1.t1
g02370
_braker1_3120
Hoilungia Triad1_g
1
ict
sc
7
0.581
2
0.
1
Hoilungia_brake
r1_g02369.t2
Triad1_g3119.t1
Anthozoan Group 1
IOIN
raker1
_g0237
1.
Triad1_
g3t1
122.t2
Hydrozoans
N1_C
Placozoans
HVC
llo
rh
Da inch
nio us
_r _m
pu
s_t erio_ ilii_
Bf
rop
N X
_V
ica P_0 P_0
2_
lis_
0 0
32
NP 100 790
_g
HV
_0 23 0
66
0
4
C
4
Gallu
1 0 6. 4 8
N
46
11 1 .1
_M
s_gal HVC1N
26
.t1
lus_N
1 OU
2.1
E
P_00 _HUSM
1025 AN
834.1
Xe
no
Vertebrate
HV Channel
Hoilu
ngia_b
eq1
.1
9282 0_s
q.60214 6_c
nase C4 29
obir as_EK 143
p
1
Cgig com
3. 5
la_
20 10
gu
80 14
9.1
Lin
37 | 2
77
01 i1
19
P_ otg
11
8.100
_X L
47P_
us
27 N
m
37 s_
he
00 tu
yp
P_ ra
ol
_ X pu
_p
us pur
us
S
ul
m
Li
Ca
rat
pu
ur
Sp
Protostomes
Ctenophores
0.3
Supplemental Figure 19: Phylogenetic tree of voltage-gated proton channels across metazoans
Protein tree generated with FastTree.
44
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
974
2
Supplemental Tables
Supplemental Table 1: Summary statistics of the Tethya
includes the sponge and all associated bacteria.
Feature
Type
Assembly size (Mb)
Sponge only
Total gaps (Mb)
Sponge only
Estimated kmer coverage
Sponge only
Estimated mapping coverage
Sponge only
Number of contigs
All
Number of contigs
Sponge only
Number of contigs
Bacterial
GC %
Sponge only
Contig N50 (kb)
All
Contig N50 (kb)
Sponge only
Contig N50 (kb)
Bacterial
Genome size (bp)
Mitochondrion
Estimated mapping coverage
Mitochondrion
GC %
Mitochondrion
Paired-end reads
Holobiont genomic 100bp
Total paired-end bases (Gb)
Holobiont genomic
Reads aligning back to genome
All contigs
Mate-pair reads
Holobiont genomic 125bp
Total mate-pair bases (Gb)
Holobiont genomic
Moleculo (TruSeq) long reads
Holobiont genomic
Total Moleculo bases (Mb)
Holobiont genomic
Paired-end RNA-seq reads
dUTP Stranded
Paired-end RNA-seq bases (Gb)
dUTP Stranded
Trinity
De novo transcripts
RNA-seq mapping fraction
All contigs
StringTie
Genome guided transcripts
45
wilhelma genome. Holobiont genomic
Count
126.0
1.348
131x
159.3x
6,907
6,109
789
39.98%
70.7
73.4
48.2
19,754
669.6x
34.43%
259,518,468
25.951
214,103,768
280,837,536
35.104
125,150
436.7
201,451,574
25.181
127,012
68.3%
46,572
bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Supplemental Table 2: Summary of splice variation for the genome-guided transcriptome for
T. wilhelma (Twi) and transcript set v2.0 for A. queenslandica (Aqu).
Splice Type
Twi transcripts Twi
Aqu transcripts Aqu
events/exons
events/exons
Cassette exons
4779
9089
3591
5535
Canonical splicing
5049
2602
Skipped exons
3868
8329
1022
721
Alternative splice acceptor
3747
638
Intron retention
3295
3565
3437
3400
Alternative splice donor
3264
521
Alternative N-terminus
1964
1965
Alternative C-terminus
1788
1968
Intronic start
471
571
Intronic end
285
592
Non-canonical
73
135
Single exon with variants
246
13
No variants
12088
24027
Single exon and no variants 15421
12400
-
Supplemental Table 3: Skipped exon and retained intron frame, for T. wilhelma (Twi) and A.
queenslandica (Aqu). Skipped exons tend to have lengths as multiples of three.
Feature
Position 1 Position 2 Position 3
Twi Skipped exons
6904
5179
5331
Twi Retained introns
1202
1204
1159
Aqu Skipped exons
2671
1914
1972
Aqu Retained introns
1244
1092
1101
46