bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 1 2 3 The genome of the contractile demosponge Tethya wilhelma and the evolution of metazoan neural signalling pathways Warren R. Francis1 , Michael Eitel1 , Sergio Vargas1 , Marcin Adamski2 , Steven H.D. Haddock 3 , Stefan Krebs4 , Helmut Blum4 , Dirk Erpenbeck1,5 Gert Wörheide1,5,6 1 4 Department of Earth and Environmental Sciences, Paleontology and Geobiology, Ludwig-Maximilians-Universität München Richard-Wagner Straße 10, 80333 Munich, Germany 2 Research School of Biology, College of Medicine, Biology & Environment, Australian National University, Canberra ACT 0200 Australia 3 Monterey Bay Aquarium Research Institute, Moss Landing, CA 95039, USA 4 Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, Ludwig-Maximilians-University Munich, Munich, Germany 5 GeoBio-Center, Ludwig-Maximilians-Universität München, Munich, Germany 6 Bavarian State Collection for Paleontology and Geology, Munich, Germany Abstract 18 Porifera are a diverse animal phylum with species performing important ecological roles in aquatic ecosystems, and have become models for multicellularity and early-animal evolution. Demosponges form the largest class in sponges, but previous studies have relied on the only draft demosponge genome of Amphimedon queenslandica. Here we present the 125-megabase draft genome of a contractile laboratory demosponge Tethya wilhelma, sequenced to almost 150x coverage. We explore the genetic repertoire of transporters, receptors, and neurotransmitter metabolism across early-branching metazoans in the context of the evolution of these gene families. Presence of many genes is highly variable across animal groups, with many gene family expansions and losses. Three sponge classes show lineage-specific expansions of GABA-B receptors, far exceeding the gene number in vertebrates, while ctenophores appear to have secondarily lost most genes in the GABA pathway. Both GABA and glutamate receptors show lineage-specific domain rearrangements, making it difficult to trace the evolution of these gene families. Gene sets in the examined taxa suggest that nervous systems evolved independently at least twice and either changed function or were lost in sponges. Changes in gene content are consistent with the view that ctenophores and sponges are the earliest-branching metazoan lineages and provide additional support for the proposed clade of Placozoa/Cnidaria/Bilateria. 19 Introduction 5 6 7 8 9 10 11 12 13 14 15 16 17 20 21 22 23 24 25 26 The presence of neurons is a defining character of animals, and is symbolic of their alleged superiority over all other life on earth. Nonetheless, the four non-bilaterian phyla, Porifera, Placozoa, Ctenophora and Cnidaria, are most different from other animals in their sensory systems and are often mistakenly referred to as “lower” animals in common parlance, despite the fact that, like bilaterians, non-bilaterians exist at the tips of the tree of life. Indeed, animals such as corals and sponges appear immobile or often unresponsive, challenging early theorists in their ideas of what is and is not an animal. Yet we now know that representatives from all four non-bilaterian phyla demonstrate dynamic responses to outside stimuli. 27 1 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 28 29 30 31 32 Neural evolution has been discussed previously in the context of paleontology (reviewed in [Wray et al., 2015]) and metazoan phylogeny (reviewed in [Jékely et al., 2015]). Indeed, it has been suggested that many features of bilaterian neurons and nervous systems represent separate, parallel evolutionary events from a “simple” nervous system. A simple nervous system then must arise from proto-neurons [Schierwater et al., 2009], however it is unclear what that might look like. 33 34 35 36 37 38 39 40 Several qualities can be used to define neurons or proto-neurons [Leys, 2015,Nickel, 2010] such as synapses, electrical excitability, membrane potential, or secretory functions, though no single quality (and ultimately gene set) solely defines such cells as neurons. Two non-bilaterian groups, ctenophores and cnidarians, are thought to have true neurons. When considering the remaining two non-bilaterian phyla, sponges and placozoans, many components of neural cells are found without any neuron-like cells having been identified [Srivastava et al., 2010, Riesgo et al., 2014a, Leys, 2015], although synapse-like structures have been identified in placozoan fiber cells that show vesicles close to an osmophile contact [Grell and Benwitz, 1974]. 41 42 43 44 45 46 47 48 49 50 Comparative analyses revealed a gradient of neural-like qualities indicating that “neuron-or-not” classifications are not straightforward. While ctenophores, cnidarians, and bilaterians have true neurons, structural and biochemical differences, [Moroz et al., 2014, Moroz, 2015] led to the proposition that neurons in ctenophores and cnidarians may not be homologous, but rather separate evolutionary outcomes from neurallike precursor cells. Potentially, in the case of independent evolutions, neurons are “easy” to evolve, since it involves co-expression of various pan-metazoan genetic modules in the same cell type. Alternatively, early rudimentary signaling systems may have been energetically costly and not especially useful in pre-Cambrian oceans, and in such cases, it may have been comparatively easy to lose such genes and with them neuronaltype cells. 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 Interpretation of neural evolution requires an accurate metazoan phylogeny, and the phylogenetic relationships of early-branching metazoans have been a topic of continued controversy. Some analyses support the traditional phylogenetic position of sponges as sister group to all other metazoans (“Porifera-sister”) [Philippe et al., 2009,Pick et al., 2010,Nosenko et al., 2013,Pisani et al., 2015,Simion et al., 2017] while others suggest that Ctenophora are the sister group to all other animals (“Ctenophora-sister”) [Dunn et al., 2008, Ryan et al., 2013, Whelan et al., 2015], and some analyses also recover the classical view, a Coelenterata clade uniting Cnidaria and Ctenophora [Philippe et al., 2009, Simion et al., 2017]. Importantly, phylogenomic analyses can be prone to systematic artifacts under some circumstances, depending on taxon sampling [Pick et al., 2010, Philippe et al., 2011], gene set [Nosenko et al., 2013], phylogenetic model [Pisani et al., 2015], or use of nucleotides instead of proteins [Jarvis et al., 2014]. Other methods based on presence or absence of the genes themselves have been proposed to provide a sequence-independent inference of phylogeny [Ryan et al., 2010, Ryan et al., 2013, Pisani et al., 2015], relying on the assumption that gene loss is a rare event. However, non-bilaterians have the additional problem that basic knowledge of many aspects of their biology is absent [Dunn et al., 2015], and so the biological context that may separate or unite groups is limited. 66 67 68 69 70 71 72 In the context of phylogeny, the branching order critically affects whether neurons evolved multiple times or were lost (see schematic in Figure 1). Given the gradient of neural-like qualities, the actual evolutionary scenario may be somewhere in between a simple gain-loss of neurons. While some previous studies have focused on neural evolution in ctenophores [Ryan et al., 2013, Alberstein et al., 2015, Li et al., 2015] or analysing the genomic data from A. queenslandica [Krishnan et al., 2014], these alone do not provide a comprehensive picture of all animals. 73 74 75 76 77 78 79 80 Here we have sequenced the genome of the contractile laboratory demosponge Tethya wilhelma [Sara et al., 2001] and examined the protein repertoire in the context of genes mediating the contraction, and other neural-like functions. Many metabolic genes show unique expansions in different sponge clades, as well as other phyla, making it challenging to clearly assign functions based on similarity to human proteins. We consider these expansions in the context of phylogenetics, showing that even though sponges lack neurons, signaling pathways have still expanded. This gives support to the hypothesis that early neural-like cells have become neurons multiple times in the history of animals. 2 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. A B HAS NEURONS GAIN C LOSS D Bilateria Cnidaria Placozoa Porifera Ctenophora Figure 1: Schematic of neural evolution depending on metazoan phylogeny The presence of neurons or neural-like cells in ctenophores, cnidarians, and bilaterians can be viewed differently depending on the phylogeny. Two different metazoan phylogenies based on recent multi-gene phylogenetic analyses are the source of the Porifera-sister (A,B) [Philippe et al., 2009, Pisani et al., 2015, Simion et al., 2017] and Ctenophora-sister (C,D) [Ryan et al., 2013, Whelan et al., 2015] scenarios. Neurons can either have evolved once requiring a secondary loss in sponges, placozoans, or both (A,C), or evolved twice, in ctenophores and in cnidarians/bilaterians (B,D). 81 Results 82 Genome assembly and annotation 83 84 85 86 87 88 89 90 91 92 93 94 95 96 We generated a total of 61 gigabases of paired-end reads from a whole specimen of T. wilhelma (Figure 2) and all associated bacteria. Because of a close association with microbes, some contigs were expected to have derived from bacteria, as many reads have unexpectedly high GC content (Supplemental Figures 1-4). After assembly and filtration of bacterial contigs, the final assembly was 125Mb, similar to A. queenslandica, with a N50 value of 70kb (Supplemental Table 1). Gene annotation was done with a combination of a deeply-sequenced RNAseq library from an adult sponge and ab initio gene predictions. Because of high density of genes, extensive manual curation was often necessary to correct genes of the same strand that were erroneously merged. After correction and filtering of the ab initio predictions, we counted 37,416 predicted genes, comparable with the counts in A. queenslandica (40,122) [Fernandez-Valverde et al., 2015] and S. ciliatum (40,504) [Fortunato et al., 2014]. General trends in splice variation were similar between T. wilhelma and A. queenslandica (Supplemental Tables 2 and 3), suggesting similar underlying biology or genome structure. One-to-one orthologs from T. wilhelma and A. queenslandica had relatively low identity (Supplemental Figure 5), with the average identity of 57.8%, showing a high genetic diversity within Porifera. The average identity is lower when compared to 3 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Figure 2: Contraction of a normal specimen (A) Maximally expanded state of Tethya wilhelma. (B) The same specimen approximately an hour later in the most contracted state. (C) Cartoon view of most contracted versus expanded state. Scale bar applies to all images. Photos courtesy of Dan B. Mills. 97 98 99 100 S. ciliatum (49.7%), N. vectensis (53.5%) and human (52.0%), which is not surprising given that A. queenslandica and T. wilhelma are both demosponges. Although both genomes are too fragmented to find syntenic chromosomal regions, ordered blocks of genes are still identifiable between T. wilhelma and A. queenslandica (Supplemental Figure 6), though not with S. ciliatum. 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 Neurotransmitter metabolism across early-branching metazoans Compared to most other metazoans, sponges have a limited set of behaviors (contraction, closure of osculum or choanocyte chambers to control flow), yet respond to many signaling molecules present in bilaterians [Ellwanger and Nickel, 2006, Ellwanger et al., 2007]. Some genes involved in vertebrate-like neurotransmitter metabolism have been found in sponges [Riesgo et al., 2014a, Krishnan and Schiöth, 2015], although many display a sister-group relationship to homologs found in other animals and appear to have a complex evolutionary history with duplications in sponges and other non-bilaterian animals (Figure 3, Supplemental Figures 7-9), making the prediction of their functions difficult. Implicitly, presence of a gene usually means a one-to-one orthology relationship with a functionally annotated protein, probably a human protein. Since many of the non-bilaterian proteins in our set are many-to-many orthologs to human proteins with known functions, declaring presence or absence of any individual gene or genetic module is not correct in the strictest sense, as one-to-many or many-to-many orthologs are not the same gene. In such cases, it is not currently possible to computationally predict which, if any, of the sponge orthologs shares its function with a human protein. 116 117 118 119 120 121 122 123 For instance, biosynthesis of monoamine neurotransmitters (dopamine, serotonin, etc.) requires two enzymes, tryptophan hydroxylase and tyrosine hydroxylase. These two enzymes appear to have arisen in bilaterians from duplications of an ancestral phenylalanine hydroxylase [Cao et al., 2010], though evidence is lacking as to whether this ancestral protein had multiple functions that specialized after duplication (subfunctionalization) or developed new functions (neofunctionalization) post-duplication. The absence of these proteins in non-bilaterians seems to be ancestral; in other words, they had not evolved yet when these groups split and diversified. 124 125 126 127 128 129 130 Among other non-bilaterians, some monoamine neurotransmitters are found in cnidarians [Carlberg and Rosengren, 1985], but are mostly absent in ctenophores (or at detection limit) [Moroz et al., 2014]. Indeed, previous studies were unable to find homologs of DOPA decarboxylase (AADC, Supplemental Figure 8), dopamine β-hydroxylase (DBH, Supplemental Figure 7), monoamine oxidase (MAO, Supplemental Figure 9), or tyrosine hydroxylase (TH) in the genome of the ctenophore M. leidyi or any available ctenophore transcriptome, and it was suggested that some of these proteins were absent in sponges as well (see Supple- 4 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. OH O O GATP Gamma-amino butyric acid Succinate semialdehyde Succinic acid ABAT Phenylpyruvate COO- GAD 4-hydroxyphenylpyruvate O HPD COO- KYAT TAT Phenylalanine COONH3 Phenylalanine catabolites O HO Tyrosine PAH Standard amino acids 4-hydroxyphenylacetate Tyramine COO- HO HGD COO- NH3+ + NH3+ HO Acetoacetate + Fumarate DOPA COO- AADC NH3+ HO Dopamine Dihydroxyphenylacetate HO HO NH3+ HO OH COO- HO DOMA COO Catecholamine degradation products Adrenaline OH VitC is a cofactor HO HGD NH2Me+ AADC OH MeO HO HO PNMT HO - 2 PAH VMA COMT OH HO COO 2 2 3 2 TH MAO NH3+ HO Enzyme requires O2 2 M HPD Homovanillate MeO COO- HO Noradrenaline HO NADH is a product GLUD TAT DBH VitC Catecholamine neurotransmitters M 2 KYAT HO TH HO GS GATP SSDH - DBH M M MAO 2 PNMT COMT 2 M M M Present Homolog Absent Absent in transcriptome Figure 3: Neurotransmitter overview across metazoans Summary schematic of neurotransmitter biosynthesis and degradation pathways across early-branching metazoans. Each of the four non-bilaterian groups is presumed to be monophyletic, although some individual trees of genes or gene families may display alternate topologies. Bold letters refer to the enzymes. Individual gene trees that display the orthology of the clades are found in the supplemental information. Arrows are shown in one direction though many reactions can be reversible. (A) Glutamate and GABA metabolism from the citric acid cycle through the “GABA shunt” pathway. Because ABAT is absent in ctenophores, GLUD and TAT are potentially alternatives to convert α-ketoglutarate to glutamate. GATP can convert GABA to succinate semialdehyde, but this enzyme was only found in some demosponges and plants. (B) Monoamine metabolism, excluding tryptophan. (C) Table of presence-absence for genes in parts A and B. Presence (green) refers to a 1-to-1 ortholog where orthology is clear from the tree position. Homolog (blue) refers to a sister group position in trees before duplications with different or unknown functions. Secondary loss (red) refers to the gene missing in the clade, but homologs are found in non-metazoan phyla. Numbers inside the boxes indicate copy number specific to that group, M refers to multiple duplications within the group where the copy number is variable among species. Abbreviations for clades are as follows: Cteno, ctenophores; Hsm, homoscleromorphs; Demo, demosponges; Hexact, hexactinellids; Choano, choanoflagellates. 131 132 133 134 mentary Tables 17 and 19 in [Ryan et al., 2013]). However, we found orthologs of MAO and homologs of AADC and DBH in several sponges, though it is unclear if they perform the same function as the human proteins. Additionally, homologs of four enzymes, AADC, MAO, DBH, and ABAT, are present in singlecelled eukaryotes but not ctenophores, implying a secondary loss of these protein families in this phylum. 135 5 Loss Plant O HO NH2 B SSDH O HO Choano ABAT O HO * Filasterea GAD 2OG-enzymes VitC Hexact O NH2 Metazoa OGDC-E1 SUCLG1 OH Demo O HO Calcarea O GLUD OH Hsm GS NH2 O HO Cteno O OH Cnidaria O H 2N C Citric acid cycle Alpha-keto glutarate Placozoa O (TAT) Glutamic acid Glutamine Bilateria A bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 GABA receptors The neurotransmitter gamma-amino butyric acid (GABA) has been shown to affect contraction in T. wilhelma [Ellwanger et al., 2007] and the freshwater sponge E. muelleri [Elliott and Leys, 2010]. The genome of T. wilhelma contains metabotropic receptors (GABA-B, mGABARs), but not ionotropic GABA receptors (GABA-A, iGABARs). While humans have two mGABARs and the ctenophore M. leidyi has only one, the T. wilhelma genome has nine. Sponges appear to have undergone a large expansion of this protein family (Figure 4), similar to the expansion of glutamate-binding GPCRs previously observed in sponges [Krishnan et al., 2014]. Based on the structure of the binding pocket of human GABAR-B1 [Geng et al., 2013], many differences are observed across the mGABAR protein family, even showing that many residues involved in coordination of GABA are not conserved between the two human proteins or all other animals (Supplemental Figure 11). Contrary to previous reports [Ramoino et al., 2010], we were unable to find normal mGABARs in the two calcareous sponges S. ciliatum and L. complicata. Instead, in these two species, the best BLAST hits from human GABA-B receptors (the putative mGABARs) had the best reciprocal hits to Insulin-like growth-factor receptors. Structurally, this was due to the normal seven-transmembrane domain being swapped with a C-terminal protein kinase domain (Figure 5), meaning these are not true metabotropic GABA receptors. Similarly, in the filasterean Capsaspora owczarzaki, the N-terminal ligand binding domain is also exchanged with other domains, suggesting as well that these are not true metabotropic GABA receptors. 154 155 156 157 158 159 160 161 162 163 164 Glutamate receptors Glutamate is of particular interest as it is a key metabolic intermediate and the main excitatory neurotransmitter in animal nervous systems, acting on two types of receptors: the metabotropic glutamate receptors (mGluRs) and the ionotropic ones (iGluRs). Some sponge species possess iGluRs, though these receptors were absent in the transcriptomes of several demosponges [Riesgo et al., 2014a]. We were unable to find iGluRs in the genome of T. wilhelma, in the genome and transcriptomes of any other demosponge, or in the genomes of two choanoflagellates (M. brevicollis and S. rosetta). The top BLAST hits in demosponges have a GPCR domain instead of the ion channel domain, indicating that these are not true iGluRs (Supplemental Figure 13). Because the domain structure in plants is the same as most animal iGluRs, the ligand binding domain was swapped out in demosponges. 165 166 167 168 169 170 171 172 The homoscleromorpha/calcarea clade appears to have an independent expansion of iGluRs (Supplemental Figure 12), though the normal ion transporter domain is switched with a SBP-bac-3 domain (PFAM domain PF00497) compared to all other iGluRs (Supplemental Figure 13). Additionally, ctenophores and placozoans appeared to have dramatic expansions of this protein family as well [Ryan et al., 2013, Moroz et al., 2014,Alberstein et al., 2015], suggesting that a small set of iGluRs was present in the common ancestor of eukaryotes and have diversified multiple times in both plants and animals, while other clades appear to have modified or lost these proteins. 173 174 175 176 177 178 179 180 181 182 183 184 Vesicular transporters Secretory systems are a common feature of all eukaryotes, as most cells have endoplasmic reticulum to secrete proteins or make membrane proteins. Neurons secrete peptides (conceptually identical to any other protein) or small-molecule neurotransmitters in a paracrine fashion, specifically to other neural cells. Compared to peptides, small-molecule neurotransmitters need to be loaded into vesicles by dedicated transport proteins. Vesicular glutamate transporters (VGluTs, SLC17A6-8) are part of a superfamily of transporters [Sreedharan et al., 2010] that carry glutamate, aspartate, and nucleotides. The position of sponge proteins in the tree is inconsistent with a clear role in glutamate transport (Supplemental Figure 14), as several sponge clades and ctenophores occur as sister group to multiple duplications. Transporters in sponges, ctenophores, and choanoflagellates may well act upon glutamate or other amino acids, but this needs to be experimentally investigated. 185 6 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Protostome GABAR-B3 (7) GABAR-B2 Placozoans (2) Cnidarians (9) Vertebrate GABAR-B1 (4) Filasterea (1) Ctenophores (2) Homoscleromorphs (1) ** Hexactinellids (4) * Demosponges (4) ++ BS>80 BS>90 Figure 4: Metabotropic GABA receptors (GABA-B type) across metazoans Protein tree generated with RAxML. Numbers in parentheses indicate the number of species from that group, so the 46 demosponge mGABARs come from 4 species. Key bootstrap values are summarized as yellow or gray dots, for values of 90 or more, or 80 or more, respectively. Single star indicates sequences annotated as mGABARs in [Krishnan et al., 2014], double plus indicates the clade annotated as “sponge specific expansion” in [Krishnan et al., 2014]. For complete version with protein names and all bootstrap values, see Supplemental Figure 10 190 Similar to glutamate, GABA is loaded into vesicles with the vesicular inhibitory amino acid transporter (VIAAT). Ctenophores, sponges, and placozoans lack one-to-one homologs of VIAAT (Supplemental Figure 15). Several other transporters are thought to transport GABA (ANTL or SLC6 class) and many other amino acids. SLC6-class transporters, which transport diverse amino acids, are found in all non-bilaterian groups, so the function of VIAAT may be redundant. 191 Glycine receptors 186 187 188 189 192 193 194 195 196 197 Glycine is known to affect the contraction of T. wilhelma [Ellwanger and Nickel, 2006]. Some ctenophore iGluRs have been shown to bind glycine [Alberstein et al., 2015] due to the substitution of serine for arginine (S687 in human GluN1), though this appears to be specific only to ctenophores, as essentially all other iGluRs have the conserved serine/threonine at this position. Because no ionotropic glycine receptors could be identified in the T. wilhelma genome (or any other sponges, ctenophores or placozoans), other proteins may be responsible for mediating this effect. 198 7 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Vertebrate mGABARs GABR1_HUMAN ANF_receptor 7-transmembrane Tyrosine kinase GABR1_MOUSE GABR2_HUMAN Signal peptide Sushi TIG Laminin G3 Human Receptor tyrosine kinases INSRR_HUMAN GABR2_MOUSE Recep_L_domain Furin−like fibronectin3 IGF1R_HUMAN Placozoans Triad1_g72.t1__scaffold_1 Ctenophores ML02335a−AUGUSTUS Calcisponge mGABAR-like scict1.027609.1_0 Sycon ciliatum scict1.029082.1_0 Demosponges Twilhelma_twi_ss.20824.4_2 lctid9879_0 Leucosolenia complicata Aqu2.1.28011_001 aphrocallistes_vastus_comp16141_c0_seq1_1 Hexactinellids lctid37767_0 rosella_fibulata_TR14675_c0_g1_i1_0 Capsaspora owczarzaki oscarella_t_h60_42123_comp11388_c0_seq18 0 200 400 600 Homoscleromorph 800 1000 Filasterean CAOG_05982T0 0 200 400 600 800 1000 1200 1400 Figure 5: Domain organization of GABA-B type receptors across metazoans Scale bar displays number of amino acids. Top reciprocal BLAST hits to human for putative mGABARs in calcareous sponges are INSRR and IGF1R, due to high-scoring hits to the tyrosine kinase domain. All mGABARs from demosponges, glass sponges, and the homoscleromorph Oscarella carmela share the 7transmembrane domain (green) with mGABARs from other animals, while calcarean proteins have the same ligand-binding domain (red) but instead have a protein tyrosine kinase domain (purple) at the C-terminus, similar to growth-factor receptors. The filasterean Capsaspora owczarzaki has alternate domains at the N-terminus. 199 200 201 202 203 204 205 206 207 208 209 210 Mechanical receptors Some sponges can contract in response to mechanical agitation, as reported for the demosponges E. muelleri [Elliott and Leys, 2007] and T. wilhelma [Nickel, 2010]. Several diverse protein families appear to be responsible for the sense of touch [Árnadóttir and Chalfie, 2010]. A subgroup of the TRP (transient receptor-potential) channels, TRP-N, thought to mediate mechanosensation was determined to be absent in sponges [Schuler et al., 2015], and we were unable to identify any in either T. wilhelma or S. ciliatum, although other TRP-class channels were found [Ludeman et al., 2014, Schuler et al., 2015]. Because the mechanosensory function of TRP channels may be redundant, we analysed for the presence of PIEZO, a 280kDa trimeric protein [Ge et al., 2015] involved in touch sensation in mammals [Coste et al., 2012]. Although two homologs were found in vertebrates, we found one copy in all other animals (Supplemental Figure 16) as well as fungi, plants and most other eukaryotic groups, suggesting an ancient and conserved function of this protein. 211 8 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 Voltage-gated channels Voltage-gated ion channels are necessary for the propagation of electrical signals down axons and dendrites [Zakon, 2012,Moran et al., 2015], and have specificities for sodium, potassium, or calcium. Previous analyses were unable to clearly identify potassium or sodium channels in sponges [Liebeskind et al., 2011]; only one partial potassium channel was found in the transcriptome of the homoscleromorph Corticium candelabrum [Riesgo et al., 2014a,Li et al., 2015]. We were unable to find any voltage-gated sodium or potassium channels in the genome or transcriptome of any sponge. We then examine voltage-gated hydrogen channels (hvcn1), as these proteins have been found in a number of single-celled eukaryotes [Smith et al., 2011], and are extremely conserved. These channels were found in all sponge groups, although the high protein identity resulted in a poorly-resolved tree (Supplemental Figure 19). Reports of action potentials in hexactinellids [Leys et al., 1999,Leys et al., 2007,Nickel, 2010] showed that sponge action potentials were inhibited by divalent cations [Leys et al., 1999], suggesting a role of calcium channels instead. Because voltage-gated sodium and calcium channels arose from a duplication event [Gur Barzilai et al., 2012], the ion selectivity may be variable within this protein family. Most sponges have only a single CaV-channel (Supplemental Figure 18) and several Hv-channels, and no voltage-gated channel of any kind was found in any glass sponge. However, all glass sponge sequences are from transcriptomes, therefore either the expression level of the true channels is low in glass sponges, or they have independently evolved another mechanism to propagate action potentials. 230 9 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 231 Discussion 232 Gene content variation of metazoa 233 234 235 236 237 238 Among the thousands of genes in the genome, we focused on genes that may be mediating contractile behavior in T. wilhelma, and the interactions of those genes within broader metabolic pathways. Many of the “housekeeping” genes in our study have lineage-specific duplications in at least one animal phylum. Considering the importance of “single-copy” proteins in phylogenetic analyses, as taxon sampling improves, it may be found that very few or no genes are single copy across most or all animal phyla. Many other genes that are critical for neural functioning in bilaterians have independent losses in other animal lineages (Figure 6). 239 Gene family expansions Demo/Hexact Calcarea/Hsm mGluRs mGABARs MAO/PAOX-like DBH-like iGluRs DBH-like * Hv-channels * Placozoans Cnidarians iGluRs mGABARs DBH-like NaV-channels VGluT VIAAT MAO/PAOX-like DBH-like Ctenophores iGluRs Kv-channels Gene family losses iGluRs VIAAT AADC-like NaV-channels Kv-channels mGABARs *# DBH-like * NaV-channels Kv-channels ABAT MAO DBH-like AADC-like MAO - Figure 6: Summary of gene expansions and losses Demo/Hexact refers to the clade of demosponges and glass sponges. Calcarea/Hsm refers to the unnamed clade of calcareous sponges and homoscleromorphs. The star indicates that expansion or loss was found in one class but not the other. Number sign indicates the domain rearrangement in Calcarea. 240 241 242 243 244 245 246 247 248 Glutamate and GABA receptor evolution There is stark contrast in the relative abundance of mGABARs and iGluRs in sponges and ctenophores. The relative dearth of mGABARs in ctenophores may reflect the apparently absence of amino-butyrate amino-transferase (ABAT) in ctenophores, suggesting that ctenophores use an alternate pathway to produce glutamate or metabolize GABA (Figure 3), rarely use GABA as a neurotransmitter, or simply are missing this pathway. Other aminotransferases such as GLUD or TAT may perform some of the exchange between α-ketoglutarate and glutamate, particularly as ctenophores have two copies of GLUD while most animals have only one. Ctenophores also have multiple (variable) copies of glutamate synthase and three copies of KYAT one of which may serve to balance glutamate metabolism in these animals. 249 250 251 252 253 254 There are two explanations for the diversity of mGABARs in sponges. Given the high variability of amino acids in the mGABAR binding pocket (Supplemental Figure 11), it is plausible that many of these receptors do not bind GABA at all, and have diversified for other ligands. There is precedent for this as it was shown that the independent expansion of ctenophore iGluRs also included several key mutations to the binding pocket which changed the ligand specificity of these proteins [Alberstein et al., 2015]. For the other 10 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 255 256 257 258 259 260 261 262 hypothesis, all of the receptors could bind GABA, essentially mediating the same contraction signal, but their kinetics could differ and be influenced by factors such as, for instance, temperature. Because sponges are mostly immobile, they often can be subject to environment variation in terms of light, oxygen, and temperature. The possession of a set of proteins capable of triggering the same response (e.g. contraction) with varying daily or seasonal environmental conditions (e.g. temperature) would be beneficial and may explain the diverse set of receptors observed in sponges. Experimental characterization of these binding domains is necessary and may even show that a combination of these hypotheses explains the diversification of mGABARs in Porifera. 263 264 265 266 267 268 269 270 271 272 273 274 275 276 The apparent absence of true mGABARs in calcareous sponges (the genome of S. ciliatum and transcriptome of L. complicata) conflicts with a previous study that identified key proteins in the GABA pathway by immunostaining [Ramoino et al., 2010]. The best mGABARs BLAST hits found in the two calcareous sponges display a conserved ligand binding domain but the seven-transmembrane domain has been swapped with a tyrosine kinase domain (Figure 5). Structural similarity of the conserved N-terminal domain may result in a false-positive signal in studies using immunostaining with standard antibodies [Ramoino et al., 2010]. On the other hand, compared to ctenophores, which apparently lack ABAT, this enzyme was found in both of the calcareous sponges analyzed. Thus it would be surprising if these sponges had no capacity to create or respond to GABA. Since true vertebrate-like mGABARs are found in all other sponge classes, and our study could only examine two calcareous sponges, it could be that mGABAR presence is variable in this class. The genome of S. ciliatum contains 40 proteins annotated as mGluRs [Fortunato et al., 2014], so a third possibility is that even in the absence of true mGABARs, some of these proteins may have evolved affinity for GABA and mediate its signaling in calcareous sponges. 277 278 279 280 281 282 283 284 285 286 Although a putative iGluR was identified in the transcriptome of the demosponge Ircinia fasciculata, this sequence was only a fragment, so the glutamate affinity and domain structure could not be determined. As with the mGABARs, the domain structure is different between the sponge classes. Otherwise, it appears that only calcareous sponges and homoscleromorphs have NMDA/AMPA-like iGluRs. The presence of these proteins in plants and other single-celled eukaryotes suggests that at least iGluRs were present in the common ancestor of all eukaryotes, and their absence in demosponges is likely the product of secondary losses. In the context of contractions of T. wilhelma, the abundance of mGluRs and mGABARs could plausibly work in antagonistic ways via the action of different G-proteins making ionotropic channels not necessary for the modulation of this behavior. 287 288 289 290 291 292 293 294 Variation in neurotransmitter metabolism Many of the oxidative enzymes in the monoamine pathway require molecular oxygen, suggesting an important role of this molecule both the synthesis of the neurotransmitters (with PAH, TH, and DBH) and their inactivation (with MAO). Two catabolic pathways arise from tyrosine (Figure 3) and require oxygen at nearly all steps. It is unclear why intermediate products of one of these two pathways, the catecholamine pathway, became neurotransmitters and the other did not, particularly as hydroxyphenylpyruvate pathway is universally found in animals and catabolic intermediates are likely to be ubiquitous. 295 296 297 298 299 300 301 302 303 304 MAO was found in most animal groups, but we were unable to find any in placozoans or ctenophores. The topology of the MAO phylogenetic tree suggests a secondary loss of this protein in these phyla (Supplemental Figure 9). Related genes (PAOX, polyamine oxidase) were found in placozoans with several placozoan-specific duplications, and again, potentially one of these may catalyze the oxidation of aromatic amines. The analysis of these proteins also uncovered a clade including sponges, cnidarians, and lancelets, though the function of these proteins cannot be predicted based on homology searches. In vitro characterization of these enzymes may reveal the function to provide evidence as to how these could have been important for metabolism in early animals, and was subsequently replaced or lost in most other metazoan lineages. 305 306 Remarkably, the DBH group has independent expansions in three sponge classes as well as placozoans 11 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 307 308 309 310 311 and cnidarians (Supplemental Figure 7). No DBH homologs were identified in calcareous sponges or in ctenophores. A putative homolog of this group was found in the choanoflagellate M. brevicollis but not in any other non-metazoan. The alignment and the phylogenetic position of the M. brevicollis protein suggest that it may be a member of the copper-binding oxygenase superfamily, rather than a true homolog of DBH (see Supplemental Alignment). 312 313 314 315 316 317 The presence of DBH-like and AADC-like enzymes in most animal groups suggests the possibility to make phenylethanolamines (like octopamine or noradrenaline) from tyrosine, and then subsequently inactivate them with MAO. All demosponges appear to lack AADC, and ctenophores appear to lack both of these enzymes calling into question a previous report of the detectability of monoamine neurotransmitters in ctenophores [Carlberg and Rosengren, 1985]. 318 319 Conserved properties of neurons 331 Neurons are generally defined by the presence of five key aspects: membrane potential, voltage-gated ion channels, secretory pathways, ligand-gated ion channels, and cell-cell junctions to form synapses. Voltagegated channels, secretory systems, and ligand-gated ion channels are discussed above. Membrane potential is maintained in animal cells by sodium-potassium pumps (ATP-ases), which are a class of cation pumps exclusively found in animals [Stein, 1995, Sáez et al., 2009]. It is thought that such pumps are necessary because animals are the only multicellular group that lacks any kind of cell wall, thus careful control of ionic balance is necessary to resist osmotic stress [Stein, 1995]. For non-bilaterian animals, cell layers were in direct contact with water, so potentially all cells needed this protein to function normally. Therefore having neuron-like functionality is unlikely to rest upon the gain or loss of this gene. The last feature is the presence of cell-cell connections. Many proteins involved in synapse structure or neurotransmission are found in sponges, [Srivastava et al., 2010,Riesgo et al., 2014a,Moran et al., 2015,Leys, 2015] though it is not clear which genes are necessary for neural functioning, or may have evolved independently. 332 Neural evolution and losses 320 321 322 323 324 325 326 327 328 329 330 333 334 335 336 337 338 339 340 Based on recent phylogenies, both Porifera-sister and Ctenophora-sister evolutionary scenarios require either at least one loss of neurons or two independent gains (Figure 1) of this cell-type. The only scenario that allows for a single evolution of neurons and no losses is the “Coelenterata” hypothesis (reviewed in [Jékely et al., 2015]), which joins cnidarians and ctenophores in a clade. However, many molecular datasets [Dunn et al., 2008, Ryan et al., 2013, Whelan et al., 2015, Pisani et al., 2015] and morphological evidence [Harbison, 1985] argue against this scenario (but also see [Philippe et al., 2009] and [Simion et al., 2017]). One other alternative is that placozoans have an unidentified neuron-like cell in a Porifera-sister context, which would therefore allow for a single origin of neural systems in animals and no losses. 341 342 343 344 345 346 347 What do the two different scenarios mean for evolution of neuronal cells? Considering the basic properties of neurons related to electrical signaling or secretory pathways, it had been shown before that many of the genes involved are universally found in animals. A single origin and multiple losses implies that the genetic toolkit necessary for all of these functions was present in the same single-celled organism or the same cell type (an hypothetical proto-neuron) of the last common ancestor of crown-group Metazoa, and either that cell type was lost or its functions were split up. 348 349 350 351 352 353 354 355 356 357 Sponges and ctenophores both appear to have lost several gene families (Figure 6), though ctenophores nonetheless have neural cells. Thus, the losses of the GABA or monoamine pathways are not critical for the functioning of neural cell types overall. However, voltage-gated potassium and sodium channels are thought to be essential for the propagation of electrical signals down axons and dendrites and have been found in all animal groups except sponges [Moran et al., 2015]. The NaV-channel tree shows a single origin of this protein family (Supplemental Figure 17), and presence of these channels in choanoflagellates suggests they were present in the common ancestor of all animals; the apparent absence in sponges therefore is probably a secondary loss. By comparison, ctenophores have a mostly-unique expansion of Kv-channels relative to the rest of metazoans [Li et al., 2015] and a duplication in NaV channels. Together with the loss of this protein 12 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 358 359 family in sponges, the gene content argues for a combination of both multiple, independent gains and a loss of neural-type cells and their associated functions across animals. 360 361 362 363 364 365 366 367 368 369 370 371 Properties of the earliest metazoans are unknown, including life cycle or number of cell types, but it is most parsimonious that the first obligate multicellular animals did not have anything resembling a modern bilaterian nervous system [Wray et al., 2015]. Yet, the genomic evidence shows that these animals have cellular capacity to respond to environmental or paracrine signals, regulate the cell internal ion concentrations and respond to changes in their concentrations, and secrete small molecules that could serve as effectors in unconnected (but proximal) cells. Thus, the earliest animals likely had the capacity to develop nerve cells using the genetic toolkit they possessed, though the number of times this occurred is unclear. This capacity appears to have been lost in sponges with the loss of voltage-gated channels. As we were unable to find putative genes to mediate action potentials in glass sponges, either all of the four transcriptomes were incomplete or the unique action potentials of glass sponges may represent a third case of the evolution of neural-like functions in Metazoa. 372 373 Methods 374 Sequence data 375 376 Project overview can be found at spongebase.net. Reference data from the demosponge Tethya wilhelma are available at: https://bitbucket.org/molpalmuc/tethya_wilhelma-genome 377 378 379 Raw genomic reads for T. wilhelma are available on NCBI SRA under accession numbers SRR2163223 (genomic reads), SRR2296844 (mate pairs), SRR5369934 (DNA Moleculo), and SRR4255675 (RNAseq). 380 381 Genome assembly 382 Processing and assembly 383 384 385 386 387 388 We generated 25Gb of 100bp paired-end Illumina reads of genomic DNA and 35Gb of 125bp Illumina gel-free mate-pair reads. Contigs were assembled with SOAPdenovo2 [Luo et al., 2012] using a kmer of 83bp. We also generated 436Mb of Moleculo synthetic long reads. Because both haplotypes are represented in the Moleculo reads, we merged the Moleculo reads using HaploMerger [Huang et al., 2012]. Contigs and merged Moleculo reads were then scaffolded using the gel-free mate-pairs with SSPACE [Boetzer et al., 2011] and BESST [Sahlin et al., 2014]. The first draft assembly had 7,947 contigs, totaling 145 megabases. 389 390 391 392 393 394 395 396 Removal of low-coverage contigs To examine the completeness of the genome, we generated a plot of kmer coverage against GC percentage for the contigs (Supplemental Figure 1) using custom Python scripts (available at http://github.org/wrf/lavaLampPlot). This revealed 1,040 contigs with a coverage of zero that were carried over from the Moleculo reads and were not assembled (Supplemental Figure 2), accounting for 6 megabases. As these reads likely derived from bacterial contamination in the aquarium water, these contigs were removed, leaving 6,907 contigs totalling 138 megabases. 397 398 399 400 401 402 Separation of bacterial contigs Additionally, the plot revealed many contigs with lower coverage (20x-90x) and high GC content (50-75%) suggesting the presence of bacteria (Supplemental Figure 3). Because many of these contigs were shorter than 10kb, separation of the bacterial contigs was done through several steps. We found 4,858 contigs with mapped RNAseq reads and GC content under 50%, as expected of metazoans. These contigs accounted for 13 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 403 404 405 406 407 408 409 410 411 88% of the sponge assembly, or 121 megabases. For the 2,014 contigs with no mapped RNAseq, we used blastn to search the contigs against the A. queenslandica scaffolds and all complete bacterial genomes from Genbank (5,242 sequences). Based on subtraction of bitscores, 62 contigs were identified as sponge and 565 were identified as bacterial. For the remaining 1,387 contigs, most of which were under 10kb, we repeated the search with tblastx against A. queenslandica scaffolds and the genomes of Sinorhizobium medicae and Roseobacter litoralis, which were the most similar complete genomes to the two bacterial 16S rRNAs identified in the contigs. After all sorting, 798 putative bacterial contigs accounted for 12.7 megabases and were separated to bring the total to 6,109 sponge contigs. Contigs for the two bacteria were binned by tetranucleotide frequency using MetaWatt [Strous et al., 2012] (Supplemental Figure 4). 412 413 414 415 416 417 418 419 420 421 422 Genome coverage and completeness Coverage was estimated two ways: kmer frequency and read mapping. Kmers of 31bp were counted using the Jellyfish kmer counter [Marçais and Kingsford, 2011] and analyzed using custom Python and R scripts (“fastqdumps2histo.py” and “jellyfish gc coverage blob plot v2.R”, available at http://github.org/wrf/ lavaLampPlot). As expected, the kmer distribution showed two peaks, one for kmers at heterozygous positions and one for homozygous positions, whereupon the coverage peak was at 131-fold coverage for homozygous positions. Because of sequencing errors, this method often underestimates coverage, and so to confirm this estimate we then mapped all reads to the genome using Bowtie2 [Langmead and Salzberg, 2012]. The sum of mapped reads divided by the total length provided an estimated coverage of 159-fold physical coverage. 423 424 425 426 427 Of the original reads, 185 million (86.5%) mapped back to the assembled sponge contigs. Completeness for gene content was assessed with BUSCO [Simão et al., 2015], whereupon we found 728 (86%) complete genes and 42 (4.9%) predicted-incomplete genes. Overall, these data suggest that the genome assembly is adequate for downstream analyses. 428 429 Genome annotation 430 Transcriptome versions 431 432 433 434 435 436 The transcriptome for T. wilhelma was assembled de novo using Trinity (release r20140717) [Grabherr et al., 2011, Haas et al., 2013]. Default parameters were used, except for strand specific assembly, in silico read normalization, and trimming (–SS lib type RF –normalize reads –trimmomatic). This produced 127,012 transcripts with an average length of 913bp. Assembled transcripts were mapped to the genomic assembly using GMAP [Wu and Watanabe, 2005] to produce a GFF file of the transcript mapping. Of these, 114,744 transcripts were mapped 166,847 times, allowing for multiple mappings. 437 438 439 440 441 For the genome-guided transcriptome, strand-specific RNAseq reads were mapped against the genome build using Tophat2 v2.0.13 [Kim et al., 2013] using strand-specific mapping (option –library-type frfirststrand) and otherwise default parameters. Mapped reads were then joined into transcripts using StringTie v1.0.2 [Pertea et al., 2015] with default parameters. 442 443 444 445 446 447 448 449 450 451 452 Additionally, ab initio gene models were predicted using AUGUSTUS [Stanke et al., 2008]. AUGUSTUS was trained on the webAugustus server [Hoff and Stanke, 2013] using the highest expression transcripts for each Trinity component and the assembled contigs. This identified 27,551 putative genes. The majority of these overlapped partially or completely with a predicted gene based on the Trinity mapping or Stringtie genes. However, 3,866 genes (4,321 transcripts) had no overlap with any predicted exon from either the Trinity or StringTie set, and were kept for the final set. Considering the possibility that some of these may be pseudogenes, we aligned these proteins to the SwissProt database with BLASTP [Camacho et al., 2009]. Of these, only 759 had reliable hits (E-value < 10−5 ) to 688 proteins. The annotated functions were diverse, including proteins similar to many receptors and large structural proteins such as fibrillin (potentially any protein with EGF repeats), dynein heavy chain, and titin; because very large proteins may be split across 14 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 454 multiple contigs, the predicted genes may be only fragments of the full gene. Only 42 of the hits were against transposable elements. 455 Filtering of the final gene set 453 456 457 458 459 460 461 462 Because assembly of transcripts for both StringTie and Trinity relies on overlaps in the genome or RNAseq reads, genes that overlap in the untranslated regions (UTRs) can sometimes be erroneously fused. For StringTie, we developed a custom Python script to separate non-overlapping transcripts belonging to the same “gene” (stringtiesplitgenes.py, available at https://bitbucket.org/wrf/sequences/). Tandem duplications can lead to RNAseq reads bridging the two tandem copies and result in both copies being called the same gene. The original StringTie set contained 46,572 transcripts for 32,112 genes, while the corrected set contained 33,200 genes and identified 1,088 new non overlapping genes. 463 464 465 466 467 468 469 Positional errors in the genome or allelic variations may result in some RNAseq reads not mapping to the genome, so some genes are fragmented in the genome-guided transcriptome but not the de novo assembly. Making use of the protein predictions from TransDecoder, we compared the predicted proteins between the two transcriptomes using a custom Python script (transdecodersplitgenes.py, available at https://bitbucket.org/wrf/sequences/). This identified 406 StringTie transcripts that were better modeled by Trinity transcripts. 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 Functional gene annotation Many genes of functional importance were examined manually, and the best transcript from StringTie, Trinity, or AUGUSTUS was retained for the final gene set. In the GFF and fasta versions of the transcriptomes, names of protein functions were assigned several ways. Target genes that were manually curated and edited, such as those used in all trees, are named by the generic function or the annotated function of the closest human protein. For instance, the dopamine beta-hydroxylase (DBH) homolog in T. wilhelma was manually corrected, and the position in a phylogenetic tree demonstrated that demosponges diverged before the duplication which created DBH and the two DBH-like proteins in humans, thus the T. wilhelma protein is annotated as DBH-like. Secondly, automated ortholog finding pipelines (HaMStR [Ebersberger et al., 2009]) used for phylogeny [Cannon et al., 2016] have identified homologs in T. wilhelma, which have been manually checked based on positions in the phylogenetic trees. Thirdly, single-direction BLAST results were kept as annotations provided that the BLAST hit had a bitscore over 1000, or a bitscore over 300 and the T. wilhelma protein covered at least 75% of the best hit against the human protein dataset from SwissProt. The bitscore and length cutoffs were applied to reduce the number of annotations based on a single domain. 485 486 487 488 489 490 491 492 493 494 495 496 497 Analysis of splice variation Using the transcriptome from StringTie, splice variation was assessed using a custom Python script (splicevariantstats.py, available at https://bitbucket.org/wrf/sequences/). In this script, several ambiguous definitions were clarified to define the different splice types. Firstly, single exon genes with no variants are distinguished from single exon genes with variations, that is, a gene with two exons can have a variant with one exon. For loci with only two transcripts, the canonical or main transcript is defined as the one with the higher expression level, as measured by the higher FPKM value reported from StringTie. For loci with three or more transcripts, main or canonical exons are those included in at least two transcripts. A cassette exon must occur in less than 50% of the transcripts for a locus, otherwise such case is defined as a skipped exon. A retained intron is any portion that exactly spans two other exons; for highly expressed transcripts this may include erroneously retained introns due to intermediates in splicing. A summary of the splicing types is displayed in Supplemental Table 2. 498 499 500 501 Intron retention was recently reported to be a common mode of alternative splicing in A. queenslandica [Fernandez-Valverde et al., 2015]. We found 3,295 transcripts with 3,565 retained intron events (Supplemental Table 2). We then analyzed the length of the retained introns and found the phase of the 15 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 502 503 retained piece to be randomly distributed (unlike cassette exons, Supplemental Table 3), suggesting that many of the retained introns result from incomplete splicing rather than functional retention. 504 505 506 507 508 509 510 511 512 513 Microsynteny across sponges Putative synteny blocks were identified using a custom Python script (microsynteny.py, available at https: //bitbucket.org/wrf/sequences/). Briefly, the script combines the gene positions on scaffolds for both the query and the reference with BLASTX hits for the query against the reference. If a minimum of three genes in a row on a query scaffold match to different genes on the same reference scaffold, the group is kept. By default, this mandated a gap of no more than five genes before discarding the block, and that the next gene must occur within 30kb. This method was designed to work for highly fragmented genomes with thousands of scaffolds, so the order and direction of the corresponding genes on the reference scaffold do not need to match those of the query scaffold. 514 515 516 517 518 519 StringTie transcripts for T. wilhelma were aligned against the A. queenslandica v2.0 protein set with BLASTX [Camacho et al., 2009], and positions were taken from the accompanying A. queenslandica v2.0 GTF. The same procedure was attempted against the S. ciliatum gene models, though essentially no syntenic blocks were detected, indicating either substantial differences in gene content or gene order between demosponges and calcareous sponges. 520 521 522 523 524 525 526 527 Collection of reference data Proteins for Oikopleura dioica [Denoeud et al., 2010] were downloaded from Genoscope. Gene models for Ciona intestinalis [Dehal et al., 2002], Branchiostoma floridae [Putnam et al., 2008], Trichoplax adherens [Srivastava et al., 2008], Capitella teleta, Lottia gigantea, Helobdella robusta [Simakov et al., 2013], Saccoglossus kowalevskii [Simakov et al., 2015], and Monosiga brevicollis [King et al., 2008] were downloaded from the JGI genome portal. Gene models for Sphaeroforma arctica, Capsaspora owczarzaki [Suga et al., 2013] and Salpingoeca rosetta [Fairclough et al., 2013] were downloaded from the Broad Institute. 528 529 530 531 532 We used genomic data of the cnidarians Nematostella vectensis [Moran et al., 2014], Exaiptasia pallida [Baumgarten et al., 2015], and Hydra magnipapillata as well as transcriptomes from 33 other cnidarians [Bhattacharya et al., 2016, Zapata et al., 2015, Pratlong et al., 2015, Brinkman et al., 2015, Ponce et al., 2016], mostly corals. 533 534 535 536 537 538 539 540 541 542 543 For demosponges, we used the genome of Amphimedon queenslandica [Srivastava et al., 2010, FernandezValverde et al., 2015] and transcriptomic data from: Mycale phyllophila [Qiu et al., 2015], Petrosia ficiformis [Riesgo et al., 2014a], Crambe crambe [Versluis et al., 2015], Cliona varians [Riesgo et al., 2014b], Halisarca dujardini [Borisenko et al., 2016], Crella elegans [Pérez-Porro et al., 2013], Stylissa carteri, Xestospongia testutinaria [Ryu et al., 2016], Scopalina sp., and Tedania anhelens. We used data from the genome of the calcareous sponge Sycon ciliatum [Fortunato et al., 2014] and the transcriptome of Leucosolenia complicata. For hexactinellids (glass sponges), we used transcriptome data from Aphrocallistes vastus [Ludeman et al., 2014], Hyalonema populiferum, Rosella fibulata, and Sympagella nux [Whelan et al., 2015]. For homoscleromorphs, we used two transcriptomes from Oscarella carmela and Corticium candelabrum [Ludeman et al., 2014]. 544 545 546 547 548 We used data from the two published draft genomes of ctenophores [Ryan et al., 2013,Moroz et al., 2014], as well as transcriptome data from 11 additional ctenophores: Bathocyroe fosteri, Bathyctena chuni, Beroe abyssicola, Bolinopsis infundibulum, Charistephane fugiens, Dryodora glandiformis, Euplokamis dunlapae, Hormiphora californensis, Lampea lactea, Thalassocalyce inconstans, and Velamen parallelum. 549 550 We used data from the unpublished draft genome of a novel placozoan species, designated H13. 551 16 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 552 553 554 555 556 557 558 559 560 561 Gene trees For protein trees, candidate proteins were identified by reciprocal BLAST alignment using blastp or tblastn. All BLAST searches were done using the NCBI BLAST 2.2.29+ package [Camacho et al., 2009]. Because most functions were described for human, mouse, or fruit fly proteins, these served as the queries for all datasets. Candidate homologs were kept for analysis if they reciprocally aligned by blastp to a query protein, usually human. Alignments for protein sequences were created using MAFFT v7.029b, with L-INS-i parameters for accurate alignments [Katoh and Standley, 2013]. Phylogenetic trees were generated using either FastTree [Price et al., 2010] with default parameters or RAxML-HPC-PTHREADS v8.1.3 [Stamatakis, 2014], using the PROTGAMMALG model for proteins and 100 bootstrap replicates with the “rapid bootstrap” (-f a) algorithm and a random seed of 1234. 562 563 564 565 566 567 Domain annotation Domains for individual protein trees were annotated with “hmmscan” v3.1b1 from the HHMER package [Eddy, 2011] using the PFAM-A database v27.0 [Finn et al., 2016] as queries. Signal peptides were predicted using the stand-alone version of SignalP v4.1 [Petersen et al., 2011]. Domain structures were visualized using a custom Python script, “pfampipeline.py”, available at https://github.com/wrf/genomeGTFtools . 568 569 Acknowledgments 575 W.R.F would like to thank K. Achim, M. Nickel, J. Musser, J. Ryan, and I. Oldenburg for helpful comments. G.W and D.E. would like to thank M. Nickel for providing the initial T. wilhelma specimens to set up the culture in Munich. This work was supported by a LMUexcellent grant (Project MODELSPONGE) to D.E. and G.W. as part of the German Excellence Initiative, partially by research grant 9278 (“Early evolution of multicellular sponges”) from VILLUM FONDEN to G.W., and NIH grant NIGMS-5-R01-GM087198 to S.H.D.H. The authors declare no competing interests. 576 References 570 571 572 573 574 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 References [Alberstein et al., 2015] Alberstein, R., Grey, R., Zimmet, A., Simmons, D. K., and Mayer, M. L. (2015). Glycine activated ion channel subunits encoded by ctenophore glutamate receptor genes. Proceedings of the National Academy of Sciences, 112(44):E6048–E6057. [Árnadóttir and Chalfie, 2010] Árnadóttir, J. and Chalfie, M. (2010). Eukaryotic Mechanosensitive Channels. Annual Review of Biophysics, 39(1):111–137. [Baumgarten et al., 2015] Baumgarten, S., Simakov, O., Esherick, L. Y., Liew, Y. J., Lehnert, E. M., Michell, C. T., Li, Y., Hambleton, E. a., Guse, A., Oates, M. E., Gough, J., Weis, V. M., Aranda, M., Pringle, J. R., and Voolstra, C. R. (2015). The genome of Aiptasia , a sea anemone model for coral symbiosis. Proceedings of the National Academy of Sciences, page 201513318. [Bhattacharya et al., 2016] Bhattacharya, D., Agrawal, S., Aranda, M., Baumgarten, S., Belcaid, M., Drake, J. L., Erwin, D., Foret, S., Gates, R. D., Gruber, D. F., Kamel, B., Lesser, M. P., Levy, O., Liew, Y. J., MacManes, M., Mass, T., Medina, M., Mehr, S., Meyer, E., Price, D. C., Putnam, H. M., Qiu, H., Shinzato, C., Shoguchi, E., Stokes, A. J., Tambutté, S., Tchernov, D., Voolstra, C. R., Wagner, N., Walker, C. W., Weber, A. P., Weis, V., Zelzion, E., Zoccola, D., and Falkowski, P. G. (2016). Comparative genomics explains the evolutionary success of reef-forming corals. eLife, 5:1–26. [Boetzer et al., 2011] Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D., and Pirovano, W. (2011). Scaffolding pre-assembled contigs using SSPACE. Bioinformatics, 27(4):578–579. 17 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 [Borisenko et al., 2016] Borisenko, I., Adamski, M., Ereskovsky, A., and Adamska, M. (2016). Surprisingly rich repertoire of Wnt genes in the demosponge Halisarca dujardini. BMC evolutionary biology, 16(1):123. [Brinkman et al., 2015] Brinkman, D. L., Jia, X., Potriquet, J., Kumar, D., Dash, D., Kvaskoff, D., and Mulvenna, J. (2015). Transcriptome and venom proteome of the box jellyfish Chironex fleckeri. BMC Genomics, 16(1):407. [Camacho et al., 2009] Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., and Madden, T. L. (2009). BLAST+: architecture and applications. BMC bioinformatics, 10:421. [Cannon et al., 2016] Cannon, J. T., Vellutini, B. C., Smith, J., Ronquist, F., Jondelius, U., and Hejnol, A. (2016). Xenacoelomorpha is the sister group to Nephrozoa. Nature, 530(7588):89–93. [Cao et al., 2010] Cao, J., Shi, F., Liu, X., Huang, G., and Zhou, M. (2010). Phylogenetic analysis and evolution of aromatic amino acid hydroxylase. FEBS Letters, 584(23):4775–4782. [Carlberg and Rosengren, 1985] Carlberg, M. and Rosengren, E. (1985). Biochemical basis for adrenergic neurotransmission in coelenterates. Journal of Comparative Physiology B, 155(2):251–255. [Coste et al., 2012] Coste, B., Xiao, B., Santos, J. S., Syeda, R., Grandl, J., Spencer, K. S., Kim, S. E., Schmidt, M., Mathur, J., Dubin, A. E., Montal, M., and Patapoutian, A. (2012). Piezo proteins are pore-forming subunits of mechanically activated channels. Nature, 483(7388):176–181. [Dehal et al., 2002] Dehal, P., Satou, Y., Campbell, R. K., Chapman, J., Degnan, B., De Tomaso, A., Davidson, B., Di Gregorio, A., Gelpke, M., Goodstein, D. M., Harafuji, N., Hastings, K. E. M., Ho, I., Hotta, K., Huang, W., Kawashima, T., Lemaire, P., Martinez, D., Meinertzhagen, I. a., Necula, S., Nonaka, M., Putnam, N., Rash, S., Saiga, H., Satake, M., Terry, A., Yamada, L., Wang, H.-G., Awazu, S., Azumi, K., Boore, J., Branno, M., Chin-Bow, S., DeSantis, R., Doyle, S., Francino, P., Keys, D. N., Haga, S., Hayashi, H., Hino, K., Imai, K. S., Inaba, K., Kano, S., Kobayashi, K., Kobayashi, M., Lee, B.-I., Makabe, K. W., Manohar, C., Matassi, G., Medina, M., Mochizuki, Y., Mount, S., Morishita, T., Miura, S., Nakayama, A., Nishizaka, S., Nomoto, H., Ohta, F., Oishi, K., Rigoutsos, I., Sano, M., Sasaki, A., Sasakura, Y., Shoguchi, E., Shin-i, T., Spagnuolo, A., Stainier, D., Suzuki, M. M., Tassy, O., Takatori, N., Tokuoka, M., Yagi, K., Yoshizaki, F., Wada, S., Zhang, C., Hyatt, P. D., Larimer, F., Detter, C., Doggett, N., Glavina, T., Hawkins, T., Richardson, P., Lucas, S., Kohara, Y., Levine, M., Satoh, N., and Rokhsar, D. S. (2002). The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science (New York, N.Y.), 298(5601):2157–2167. [Denoeud et al., 2010] Denoeud, F., Henriet, S., Mungpakdee, S., Aury, J.-M., Da Silva, C., Brinkmann, H., Mikhaleva, J., Olsen, L. C., Jubin, C., Canestro, C., Bouquet, J.-M., Danks, G., Poulain, J., Campsteijn, C., Adamski, M., Cross, I., Yadetie, F., Muffato, M., Louis, A., Butcher, S., Tsagkogeorga, G., Konrad, A., Singh, S., Jensen, M. F., Cong, E. H., Eikeseth-Otteraa, H., Noel, B., Anthouard, V., Porcel, B. M., Kachouri-Lafond, R., Nishino, A., Ugolini, M., Chourrout, P., Nishida, H., Aasland, R., Huzurbazar, S., Westhof, E., Delsuc, F., Lehrach, H., Reinhardt, R., Weissenbach, J., Roy, S. W., Artiguenave, F., Postlethwait, J. H., Manak, J. R., Thompson, E. M., Jaillon, O., Du Pasquier, L., Boudinot, P., Liberles, D. a., Volff, J.-N., Philippe, H., Lenhard, B., Crollius, H. R., Wincker, P., and Chourrout, D. (2010). Plasticity of Animal Genome Architecture Unmasked by Rapid Evolution of a Pelagic Tunicate. Science, 1381(2010). [Dunn et al., 2008] Dunn, C. W., Hejnol, A., Matus, D. Q., Pang, K., Browne, W. E., Smith, S. a., Seaver, E., Rouse, G. W., Obst, M., Edgecombe, G. D., Sørensen, M. V., Haddock, S. H. D., Schmidt-Rhaesa, A., Okusu, A., Kristensen, R. M., Wheeler, W. C., Martindale, M. Q., and Giribet, G. (2008). Broad phylogenomic sampling improves resolution of the animal tree of life. Nature, 452(7188):745–9. [Dunn et al., 2015] Dunn, C. W., Leys, S. P., and Haddock, S. H. D. (2015). The hidden biology of sponges and ctenophores. Trends in Ecology & Evolution, pages 1–10. [Ebersberger et al., 2009] Ebersberger, I., Strauss, S., and von Haeseler, A. (2009). HaMStR: profile hidden markov model based search for orthologs in ESTs. BMC evolutionary biology, 9:157. 18 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 [Eddy, 2011] Eddy, S. R. (2011). Accelerated profile HMM searches. PLoS Computational Biology, 7(10). [Elliott and Leys, 2007] Elliott, G. R. D. and Leys, S. P. (2007). Coordinated contractions effectively expel water from the aquiferous system of a freshwater sponge. Journal of Experimental Biology, 210(21):3736–3748. [Elliott and Leys, 2010] Elliott, G. R. D. and Leys, S. P. (2010). Evidence for glutamate, GABA and NO in coordinating behaviour in the sponge, Ephydatia muelleri (Demospongiae, Spongillidae). The Journal of experimental biology, 213:2310–2321. [Ellwanger et al., 2007] Ellwanger, K., Eich, A., and Nickel, M. (2007). GABA and glutamate specifically induce contractions in the sponge Tethya wilhelma. Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology, 193(1):1–11. [Ellwanger and Nickel, 2006] Ellwanger, K. and Nickel, M. (2006). Neuroactive substances specifically modulate rhythmic body contractions in the nerveless metazoon Tethya wilhelma (Demospongiae, Porifera). Frontiers in zoology, 3:7. [Fairclough et al., 2013] Fairclough, S. R., Chen, Z., Kramer, E., Zeng, Q., Young, S., Robertson, H. M., Begovic, E., Richter, D. J., Russ, C., Westbrook, M. J., Manning, G., Lang, B. F., Haas, B., Nusbaum, C., and King, N. (2013). Premetazoan genome evolution and the regulation of cell differentiation in the choanoflagellate Salpingoeca rosetta. Genome biology, 14(2):R15. [Fernandez-Valverde et al., 2015] Fernandez-Valverde, S. L., Calcino, A. D., and Degnan, B. M. (2015). Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene annotation in the sponge Amphimedon queenslandica. BMC Genomics, 16(1):1–11. [Finn et al., 2016] Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L., Potter, S. C., Punta, M., Qureshi, M., Sangrador-Vegas, A., Salazar, G. A., Tate, J., and Bateman, A. (2016). The Pfam protein families database: towards a more sustainable future. Nucleic Acids Research, 44(D1):D279–D285. [Fortunato et al., 2014] Fortunato, S. a. V., Adamski, M., Ramos, O. M., Leininger, S., Liu, J., Ferrier, D. E. K., and Adamska, M. (2014). Calcisponges have a ParaHox gene and dynamic expression of dispersed NK homeobox genes. Nature, 514(7524):620–623. [Ge et al., 2015] Ge, J., Li, W., Zhao, Q., Li, N., Chen, M., Zhi, P., Li, R., Gao, N., Xiao, B., and Yang, M. (2015). Architecture of the mammalian mechanosensitive Piezo1 channel. Nature, 527(5):64–69. [Geng et al., 2013] Geng, Y., Bush, M., Mosyak, L., Wang, F., and Fan, Q. R. (2013). Structural mechanism of ligand activation in human GABA(B) receptor. Nature, 504(7479):254–9. [Grabherr et al., 2011] Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. a., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N., and Regev, A. (2011). Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature biotechnology, 29(7):644–52. [Grell and Benwitz, 1974] Grell, K. G. and Benwitz, G. (1974). Spezifische Verbindungsstrukuren der Faserzellen von Trichoplax adhaerens F.E. Schulze. Z. Naturforsch., 29c:790. [Gur Barzilai et al., 2012] Gur Barzilai, M., Reitzel, A. M., Kraus, J. E. M., Gordon, D., Technau, U., Gurevitz, M., and Moran, Y. (2012). Convergent Evolution of Sodium Ion Selectivity in Metazoan Neuronal Signaling. Cell Reports, 2(2):242–248. [Haas et al., 2013] Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., Couger, M. B., Eccles, D., Li, B., Lieber, M., Macmanes, M. D., Ott, M., Orvis, J., Pochet, N., Strozzi, F., Weeks, N., Westerman, R., William, T., Dewey, C. N., Henschel, R., Leduc, R. D., Friedman, N., and Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature protocols, 8(8):1494–512. [Harbison, 1985] Harbison, G. R. (1985). On the classification and evolution of the Ctenophora. In Conway Morris, S. C., George, J. D., Gibson, R., and Platt, H. M., editors, The Origin and Relationships of Lower Invertebrates, pages 78–100. Clarendon Press, Oxford. 19 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 [Hoff and Stanke, 2013] Hoff, K. J. and Stanke, M. (2013). WebAUGUSTUS–a web service for training AUGUSTUS and predicting genes in eukaryotes. Nucleic Acids Research, 41(W1):W123–W128. [Huang et al., 2012] Huang, S., Chen, Z., Huang, G., Yu, T., Yang, P., Li, J., Fu, Y., Yuan, S., Chen, S., and Xu, A. (2012). HaploMerger: Reconstructing allelic relationships for polymorphic diploid genome assemblies. Genome Research, 22(8):1581–1588. [Jarvis et al., 2014] Jarvis, E. D., Mirarab, S., Aberer, A. J., Li, B., Houde, P., Li, C., Ho, S. Y. W., Faircloth, B. C., Nabholz, B., Howard, J. T., Suh, A., Weber, C. C., da Fonseca, R. R., Li, J., Zhang, F., Li, H., Zhou, L., Narula, N., Liu, L., Ganapathy, G., Boussau, B., Bayzid, M. S., Zavidovych, V., Subramanian, S., Gabaldon, T., Capella-Gutierrez, S., Huerta-Cepas, J., Rekepalli, B., Munch, K., Schierup, M., Lindow, B., Warren, W. C., Ray, D., Green, R. E., Bruford, M. W., Zhan, X., Dixon, A., Li, S., Li, N., Huang, Y., Derryberry, E. P., Bertelsen, M. F., Sheldon, F. H., Brumfield, R. T., Mello, C. V., Lovell, P. V., Wirthlin, M., Schneider, M. P. C., Prosdocimi, F., Samaniego, J. A., Velazquez, A. M. V., Alfaro-Nunez, A., Campos, P. F., Petersen, B., Sicheritz-Ponten, T., Pas, A., Bailey, T., Scofield, P., Bunce, M., Lambert, D. M., Zhou, Q., Perelman, P., Driskell, A. C., Shapiro, B., Xiong, Z., Zeng, Y., Liu, S., Li, Z., Liu, B., Wu, K., Xiao, J., Yinqi, X., Zheng, Q., Zhang, Y., Yang, H., Wang, J., Smeds, L., Rheindt, F. E., Braun, M., Fjeldsa, J., Orlando, L., Barker, F. K., Jonsson, K. A., Johnson, W., Koepfli, K.-P., O’Brien, S., Haussler, D., Ryder, O. A., Rahbek, C., Willerslev, E., Graves, G. R., Glenn, T. C., McCormack, J., Burt, D., Ellegren, H., Alstrom, P., Edwards, S. V., Stamatakis, A., Mindell, D. P., Cracraft, J., Braun, E. L., Warnow, T., Jun, W., Gilbert, M. T. P., and Zhang, G. (2014). Whole-genome analyses resolve early branches in the tree of life of modern birds. Science, 346(6215):1320–1331. [Jékely et al., 2015] Jékely, G., Paps, J., and Nielsen, C. (2015). The phylogenetic position of ctenophores and the origin(s) of nervous systems. EvoDevo, 6(1):1. [Katoh and Standley, 2013] Katoh, K. and Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular biology and evolution, 30(4):772–80. [Kim et al., 2013] Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S. L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome biology, 14(4):R36. [King et al., 2008] King, N., Westbrook, M. J., Young, S. L., Kuo, A., Abedin, M., Chapman, J., Fairclough, S., Hellsten, U., Isogai, Y., Letunic, I., Marr, M., Pincus, D., Putnam, N., Rokas, A., Wright, K. J., Zuzow, R., Dirks, W., Good, M., Goodstein, D., Lemons, D., Li, W., Lyons, J. B., Morris, A., Nichols, S., Richter, D. J., Salamov, A., Sequencing, J. G. I., Bork, P., Lim, W. a., Manning, G., Miller, W. T., McGinnis, W., Shapiro, H., Tjian, R., Grigoriev, I. V., and Rokhsar, D. (2008). The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature, 451(7180):783–8. [Krishnan et al., 2014] Krishnan, A., Dnyansagar, R., Almén, M. S., Williams, M. J., Fredriksson, R., Manoj, N., and Schiöth, H. B. (2014). The GPCR repertoire in the demosponge Amphimedon queenslandica: insights into the GPCR system at the early divergence of animals. BMC Evolutionary Biology, 14(1):270. [Krishnan and Schiöth, 2015] Krishnan, A. and Schiöth, H. B. (2015). The role of G protein-coupled receptors in the early evolution of neurotransmission and the nervous system. The Journal of experimental biology, 218(Pt 4):562–571. [Langmead and Salzberg, 2012] Langmead, B. and Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. [Leys, 2015] Leys, S. P. (2015). Elements of a ’nervous system’ in sponges. The Journal of experimental biology, 218(Pt 4):581–91. [Leys et al., 1999] Leys, S. P., Mackie, G. O., and Meech, R. W. (1999). Impulse conduction in a sponge. The Journal of experimental biology, 202 (Pt 9)(June 1997):1139–1150. [Leys et al., 2007] Leys, S. P., Mackie, G. O., and Reiswig, H. M. (2007). The Biology of Glass Sponges. Advances in Marine Biology, 52(06):1–145. 20 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 [Li et al., 2015] Li, X., Liu, H., Chu Luo, J., Rhodes, S. a., Trigg, L. M., van Rossum, D. B., Anishkin, A., Diatta, F. H., Sassic, J. K., Simmons, D. K., Kamel, B., Medina, M., Martindale, M. Q., and Jegla, T. (2015). Major diversification of voltage-gated K\n +\n channels occurred in ancestral parahoxozoans. Proceedings of the National Academy of Sciences, page 201422941. [Liebeskind et al., 2011] Liebeskind, B. J., Hillis, D. M., and Zakon, H. H. (2011). Evolution of sodium channels predates the origin of nervous systems in animals. Proceedings of the National Academy of Sciences of the United States of America, 108(22):9154–9159. [Ludeman et al., 2014] Ludeman, D. A., Farrar, N., Riesgo, A., Paps, J., and Leys, S. P. (2014). Evolutionary origins of sensation in metazoans: functional evidence for a new sensory organ in sponges. BMC Evolutionary Biology, 14(3):1–11. [Luo et al., 2012] Luo, R., Liu, B., Xie, Y., Li, Z., Huang, W., Yuan, J., He, G., Chen, Y., Pan, Q., Liu, Y., Tang, J., Wu, G., Zhang, H., Shi, Y., Liu, Y., Yu, C., Wang, B., Lu, Y., Han, C., Cheung, D. W., Yiu, S.-M., Peng, S., Xiaoqian, Z., Liu, G., Liao, X., Li, Y., Yang, H., Wang, J., Lam, T.-W., and Wang, J. (2012). SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience, 1(1):18. [Marçais and Kingsford, 2011] Marçais, G. and Kingsford, C. (2011). A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics, 27(6):764–770. [Moran et al., 2015] Moran, Y., Barzilai, M. G., Liebeskind, B. J., and Zakon, H. H. (2015). Evolution of voltage-gated ion channels at the emergence of Metazoa. Journal of Experimental Biology, 218:515–525. [Moran et al., 2014] Moran, Y., Fredman, D., Praher, D., Li, X. Z., Wee, L. M., Rentzsch, F., Zamore, P. D., Technau, U., and Seitz, H. (2014). Cnidarian microRNAs frequently regulate targets by cleavage. Genome Research, 24(4):651–663. [Moroz, 2015] Moroz, L. L. (2015). Convergent evolution of neural systems in ctenophores. Journal of Experimental Biology, 218:598–611. [Moroz et al., 2014] Moroz, L. L., Kocot, K. M., Citarella, M. R., Dosung, S., Norekian, T. P., Povolotskaya, I. S., Grigorenko, A. P., Dailey, C., Berezikov, E., Buckley, K. M., Ptitsyn, A., Reshetov, D., Mukherjee, K., Moroz, T. P., Bobkova, Y., Yu, F., Kapitonov, V. V., Jurka, J., Bobkov, Y. V., Swore, J. J., Girardo, D. O., Fodor, A., Gusev, F., Sanford, R., Bruders, R., Kittler, E., Mills, C. E., Rast, J. P., Derelle, R., Solovyev, V. V., Kondrashov, F. a., Swalla, B. J., Sweedler, J. V., Rogaev, E. I., Halanych, K. M., and Kohn, A. B. (2014). The ctenophore genome and the evolutionary origins of neural systems. Nature, 17:1–123. [Nickel, 2010] Nickel, M. (2010). Evolutionary emergence of synaptic nervous systems: what can we learn from the non-synaptic, nerveless Porifera? Invertebrate Biology, 129(1):1–16. [Nosenko et al., 2013] Nosenko, T., Schreiber, F., Adamska, M., Adamski, M., Eitel, M., Hammel, J., Maldonado, M., Müller, W. E. G., Nickel, M., Schierwater, B., Vacelet, J., Wiens, M., and Wörheide, G. (2013). Deep metazoan phylogeny: when different genes tell different stories. Molecular phylogenetics and evolution, 67(1):223–33. [Pérez-Porro et al., 2013] Pérez-Porro, a. R., Navarro-Gómez, D., Uriz, M. J., and Giribet, G. (2013). A NGS approach to the encrusting Mediterranean sponge Crella elegans (Porifera, Demospongiae, Poecilosclerida): transcriptome sequencing, characterization and overview of the gene expression along three life cycle stages. Molecular ecology resources, 454:494–509. [Pertea et al., 2015] Pertea, M., Pertea, G. M., Antonescu, C. M., Chang, T.-C., Mendell, J. T., and Salzberg, S. L. (2015). StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nature Biotechnology, 33(3). [Petersen et al., 2011] Petersen, T. N., Brunak, S., von Heijne, G., and Nielsen, H. (2011). SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature methods, 8(10):785–6. [Philippe et al., 2011] Philippe, H., Brinkmann, H., Lavrov, D. V., Littlewood, D. T. J., Manuel, M., Wörheide, G., and Baurain, D. (2011). Resolving difficult phylogenetic questions: why more sequences are not enough. PLoS biology, 9(3):e1000602. 21 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 [Philippe et al., 2009] Philippe, H., Derelle, R., Lopez, P., Pick, K., Borchiellini, C., Boury-Esnault, N., Vacelet, J., Renard, E., Houliston, E., Quéinnec, E., Da Silva, C., Wincker, P., Le Guyader, H., Leys, S., Jackson, D. J., Schreiber, F., Erpenbeck, D., Morgenstern, B., Wörheide, G., and Manuel, M. (2009). Phylogenomics revives traditional views on deep animal relationships. Current biology : CB, 19(8):706–12. [Pick et al., 2010] Pick, K. S., Philippe, H., Schreiber, F., Erpenbeck, D., Jackson, D. J., Wrede, P., Wiens, M., Alié, A., Morgenstern, B., Manuel, M., and Wörheide, G. (2010). Improved phylogenomic taxon sampling noticeably affects nonbilaterian relationships. Molecular Biology and Evolution, 27(9):1983–1987. [Pisani et al., 2015] Pisani, D., Pett, W., Dohrmann, M., Feuda, R., Rota-Stabelli, O., Philippe, H., Lartillot, N., and Wörheide, G. (2015). Genomic data do not support comb jellies as the sister group to all other animals. Proceedings of the National Academy of Sciences, 112(50):201518127. [Ponce et al., 2016] Ponce, D., Brinkman, D. L., Potriquet, J., and Mulvenna, J. (2016). Tentacle transcriptome and venom proteome of the pacific sea nettle, Chrysaora fuscescens (Cnidaria: Scyphozoa). Toxins, 8(4). [Pratlong et al., 2015] Pratlong, M., Haguenauer, A., Chabrol, O., Klopp, C., Pontarotti, P., and Aurelle, D. (2015). The red coral ( Corallium rubrum ) transcriptome: a new resource for population genetics and local adaptation studies. Molecular Ecology Resources, 15(5):1205–1215. [Price et al., 2010] Price, M. N., Dehal, P. S., and Arkin, A. P. (2010). FastTree 2 - Approximately maximum-likelihood trees for large alignments. PLoS ONE, 5(3). [Putnam et al., 2008] Putnam, N. H., Butts, T., Ferrier, D. E. K., Furlong, R. F., Hellsten, U., Kawashima, T., Robinson-Rechavi, M., Shoguchi, E., Terry, A., Yu, J.-K., Benito-Gutiérrez, E. L., Dubchak, I., Garcia-Fernàndez, J., Gibson-Brown, J. J., Grigoriev, I. V., Horton, A. C., de Jong, P. J., Jurka, J., Kapitonov, V. V., Kohara, Y., Kuroki, Y., Lindquist, E., Lucas, S., Osoegawa, K., Pennacchio, L. a., Salamov, A. a., Satou, Y., Sauka-Spengler, T., Schmutz, J., Shin-I, T., Toyoda, A., Bronner-Fraser, M., Fujiyama, A., Holland, L. Z., Holland, P. W. H., Satoh, N., and Rokhsar, D. S. (2008). The amphioxus genome and the evolution of the chordate karyotype. Nature, 453(7198):1064–71. [Qiu et al., 2015] Qiu, F., Ding, S., Ou, H., Wang, D., Chen, J., and Miyamoto, M. M. (2015). Transcriptome changes during the life cycle of the red sponge, Mycale phyllophila (Porifera, Demospongiae, Poecilosclerida). Genes, 6(4):1023–1052. [Ramoino et al., 2010] Ramoino, P., Ledda, F. D., Ferrando, S., Gallus, L., Bianchini, P., Diaspro, A., Fato, M., Tagliafierro, G., and Manconi, R. (2010). Metabotropic ??-aminobutyric acid (GABAB) receptors modulate feeding behavior in the calcisponge Leucandra aspera. Journal of Experimental Zoology Part A: Ecological Genetics and Physiology, 313A(3):132–140. [Riesgo et al., 2014a] Riesgo, A., Farrar, N., Windsor, P. J., Giribet, G., and Leys, S. P. (2014a). The Analysis of Eight Transcriptomes from All Poriferan Classes Reveals Surprising Genetic Complexity in Sponges. Molecular biology and evolution. [Riesgo et al., 2014b] Riesgo, A., Peterson, K., Richardson, C., Heist, T., Strehlow, B., McCauley, M., Cotman, C., Hill, M., and Hill, A. (2014b). Transcriptomic analysis of differential host gene expression upon uptake of symbionts: a case study with Symbiodinium and the major bioeroding sponge Cliona varians. BMC genomics, 15(1):376. [Ryan et al., 2010] Ryan, J. F., Pang, K., Mullikin, J. C., Martindale, M. Q., and Baxevanis, A. D. (2010). The homeodomain complement of the ctenophore Mnemiopsis leidyi suggests that Ctenophora and Porifera diverged prior to the ParaHoxozoa. EvoDevo, 1(1):9. [Ryan et al., 2013] Ryan, J. F., Pang, K., Schnitzler, C. E., a. D. Nguyen, A.-d., Moreland, R. T., Simmons, D. K., Koch, B. J., Francis, W. R., Havlak, P., Smith, S. a., Putnam, N. H., Haddock, S. H. D., Dunn, C. W., Wolfsberg, T. G., Mullikin, J. C., Martindale, M. Q., Baxevanis, A. D., Comparative, N., and Program, S. (2013). The Genome of the Ctenophore Mnemiopsis leidyi and Its Implications for Cell Type Evolution. Science, 342(6164):1242592–1242592. 22 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 [Ryu et al., 2016] Ryu, T., Seridi, L., Moitinho-Silva, L., Oates, M., Liew, Y. J., Mavromatis, C., Wang, X., Haywood, A., Lafi, F. F., Kupresanin, M., Sougrat, R., Alzahrani, M. A., Giles, E., Ghosheh, Y., Schunter, C., Baumgarten, S., Berumen, M. L., Gao, X., Aranda, M., Foret, S., Gough, J., Voolstra, C. R., Hentschel, U., and Ravasi, T. (2016). Hologenome analysis of two marine sponges with different microbiomes. BMC genomics, 17(1):158. [Sáez et al., 2009] Sáez, A. G., Lozano, E., and Zaldı́var-Riverón, A. (2009). Evolutionary history of Na,K-ATPases and their osmoregulatory role. Genetica, 136(3):479–490. [Sahlin et al., 2014] Sahlin, K., Vezzi, F., Nystedt, B., Lundeberg, J., and Arvestad, L. (2014). BESST - Efficient scaffolding of large fragmented assemblies. BMC Bioinformatics, 15(1):281. [Sara et al., 2001] Sara, M., Sara, A., Nickel, M., and Brümmer, F. (2001). Three New Species of Tethya (Porifera: Demospongia) from German Aquaria. Stuttgarter Beiträge zur Naturkunde, 631(15S):1–16. [Schierwater et al., 2009] Schierwater, B., Kolokotronis, S., Eitel, M., and DeSalle, R. (2009). The Diploblast-Bilateria Sister hypothesis. Communicative & Integrative Biology, 2(5):1–3. [Schuler et al., 2015] Schuler, a., Schmitz, G., Reft, A., Ozbek, S., Thurm, U., and Bornberg-Bauer, E. (2015). The rise and fall of TRP-N, an ancient family of mechanogated ion channels, in Metazoa. Genome Biology and Evolution, 7(6):1–27. [Simakov et al., 2015] Simakov, O., Kawashima, T., Marlétaz, F., Jenkins, J., Koyanagi, R., Mitros, T., Hisata, K., Bredeson, J., Shoguchi, E., Gyoja, F., Yue, J.-X., Chen, Y.-C., Freeman, R. M., Sasaki, A., Hikosaka-Katayama, T., Sato, A., Fujie, M., Baughman, K. W., Levine, J., Gonzalez, P., Cameron, C., Fritzenwanker, J. H., Pani, A. M., Goto, H., Kanda, M., Arakaki, N., Yamasaki, S., Qu, J., Cree, A., Ding, Y., Dinh, H. H., Dugan, S., Holder, M., Jhangiani, S. N., Kovar, C. L., Lee, S. L., Lewis, L. R., Morton, D., Nazareth, L. V., Okwuonu, G., Santibanez, J., Chen, R., Richards, S., Muzny, D. M., Gillis, A., Peshkin, L., Wu, M., Humphreys, T., Su, Y.-H., Putnam, N. H., Schmutz, J., Fujiyama, A., Yu, J.-K., Tagawa, K., Worley, K. C., Gibbs, R. A., Kirschner, M. W., Lowe, C. J., Satoh, N., Rokhsar, D. S., and Gerhart, J. (2015). Hemichordate genomes and deuterostome origins. Nature, pages 1–19. [Simakov et al., 2013] Simakov, O., Marletaz, F., Cho, S.-J., Edsinger-Gonzales, E., Havlak, P., Hellsten, U., Kuo, D.-H., Larsson, T., Lv, J., Arendt, D., Savage, R., Osoegawa, K., de Jong, P., Grimwood, J., Chapman, J. a., Shapiro, H., Aerts, A., Otillar, R. P., Terry, A. Y., Boore, J. L., Grigoriev, I. V., Lindberg, D. R., Seaver, E. C., Weisblat, D. a., Putnam, N. H., and Rokhsar, D. S. (2013). Insights into bilaterian evolution from three spiralian genomes. Nature, 493(7433):526–31. [Simão et al., 2015] Simão, F. A., Waterhouse, R. M., Ioannidis, P., and Kriventseva, E. V. (2015). BUSCO : assessing genome assembly and annotation completeness with single-copy orthologs. Genome analysis, 31(June):9–10. [Simion et al., 2017] Simion, P., Philippe, H., Baurain, D., Jager, M., Richter, D. J., Di Franco, A., Roure, B., Satoh, N., Quéinnec, É., Ereskovsky, A., Lapébie, P., Corre, E., Delsuc, F., King, N., Wörheide, G., and Manuel, M. (2017). A Large and Consistent Phylogenomic Dataset Supports Sponges as the Sister Group to All Other Animals. Current Biology, pages 1–10. [Smith et al., 2011] Smith, S. M. E., Morgan, D., Musset, B., Cherny, V. V., Place, A. R., Hastings, J. W., and DeCoursey, T. E. (2011). Voltage-gated proton channel in a dinoflagellate. Proceedings of the National Academy of Sciences, 108(44):18162–18167. [Sreedharan et al., 2010] Sreedharan, S., Shaik, J. H. A., Olszewski, P. K., Levine, A. S., Schiöth, H. B., and Fredriksson, R. (2010). Glutamate, aspartate and nucleotide transporters in the SLC17 family form four main phylogenetic clusters: evolution and tissue expression. BMC genomics, 11(iii):17. [Srivastava et al., 2008] Srivastava, M., Begovic, E., Chapman, J., Putnam, N. H., Hellsten, U., Kawashima, T., Kuo, A., Mitros, T., Salamov, A., Carpenter, M. L., Signorovitch, A. Y., Moreno, M. a., Kamm, K., Grimwood, J., Schmutz, J., Shapiro, H., Grigoriev, I. V., Buss, L. W., Schierwater, B., Dellaporta, S. L., and Rokhsar, D. S. (2008). The Trichoplax genome and the nature of placozoans. Nature, 454(7207):955–60. 23 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 [Srivastava et al., 2010] Srivastava, M., Simakov, O., Chapman, J., Fahey, B., Gauthier, M. E. a., Mitros, T., Richards, G. S., Conaco, C., Dacre, M., Hellsten, U., Larroux, C., Putnam, N. H., Stanke, M., Adamska, M., Darling, A., Degnan, S. M., Oakley, T. H., Plachetzki, D. C., Zhai, Y., Adamski, M., Calcino, A., Cummins, S. F., Goodstein, D. M., Harris, C., Jackson, D. J., Leys, S. P., Shu, S., Woodcroft, B. J., Vervoort, M., Kosik, K. S., Manning, G., Degnan, B. M., and Rokhsar, D. S. (2010). The Amphimedon queenslandica genome and the evolution of animal complexity. Nature, 466(7307):720–6. [Stamatakis, 2014] Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9):1312–1313. [Stanke et al., 2008] Stanke, M., Diekhans, M., Baertsch, R., and Haussler, D. (2008). Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics, 24(5):637– 644. [Stein, 1995] Stein, W. D. (1995). The sodium pump in the evolution of animal cells. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 349(1329):263–9. [Strous et al., 2012] Strous, M., Kraft, B., Bisdorf, R., and Tegetmeyer, H. E. (2012). The binning of metagenomic contigs for microbial physiology of mixed cultures. Frontiers in Microbiology, 3(DEC):1–11. [Suga et al., 2013] Suga, H., Chen, Z., de Mendoza, A., Sebé-Pedrós, A., Brown, M. W., Kramer, E., Carr, M., Kerner, P., Vervoort, M., Sánchez-Pons, N., Torruella, G., Derelle, R., Manning, G., Lang, B. F., Russ, C., Haas, B. J., Roger, A. J., Nusbaum, C., and Ruiz-Trillo, I. (2013). The Capsaspora genome reveals a complex unicellular prehistory of animals. Nature communications, 4:2325. [Versluis et al., 2015] Versluis, D., D’Andrea, M. M., Ramiro Garcia, J., Leimena, M. M., Hugenholtz, F., Zhang, J., Öztürk, B., Nylund, L., Sipkema, D., van Schaik, W., de Vos, W. M., Kleerebezem, M., Smidt, H., and van Passel, M. W. J. (2015). Mining microbial metatranscriptomes for expression of antibiotic resistance genes under natural conditions. Scientific reports, 5(January):11981. [Whelan et al., 2015] Whelan, N. V., Kocot, K. M., Moroz, L. L., and Halanych, K. M. (2015). Error , signal , and the placement of Ctenophora sister to all other animals. Proceedings of the National Academy of Sciences, 112(18):1–6. [Wray et al., 2015] Wray, G. A., Smith, A., Peterson, K., Donoghue, P., Benton, M., Thackray, J., Budd, G., Marshall, C., Shu, D., Isozaki, Y., Zhang, X., Han, J., Maruyama, S., Walcott, C., Knoll, A., Walter, M., Narbonne, G., Christie-Blick, N., Raymond, P., Cloud, P., Budd, G., Jensen, S., Briggs, D., Fortey, R., Morris, S. C., Seilacher, A., Bose, P., Pfluger, F., Rasmussen, B., Bengtson, S., Fletcher, I., McNaughton, N., Pecoits, E., Konhauser, K., Aubet, N., Heaman, L., Veroslavsky, G., Stern, R., Gingras, M., Huldtgren, T., Cunningham, J., Yin, C., Stampanoni, M., Marone, F., Donoghue, P., Bengtson, S., Gaucher, C., Poire, D., Bossi, J., Bettucci, L., Beri, A., Fedonkin, M., Simonetta, A., Ivantsov, A., Sprigg, R., Glaessner, M., Gehling, J., Seilacher, A., Retallack, G., Narbonne, G., Seilacher, A., Grazhdankin, D., Legouta, A., Li, C., Chen, J., Hua, T., Maloof, A., Tang, F., Bengtson, S., Wang, Y., Wang, X., Yin, C., Yin, Z., Grant, S., Grotzinger, J., Watters, W., Knoll, A., Fedonkin, M., Vickers-Rich, P., Swalla, B., Trusler, P., Hall, M., Xiao, S., Zhang, Y., Knoll, A., Chen, J.-Y., Oliveri, P., Li, C., Zhou, G., Gao, F., Hagadorn, J., Peterson, K., Davidson, E., Hagadorn, J., Chen, L., Xiao, S., Pang, K., Zhou, C., Yuan, X., Bengtson, S., Budd, G., Cunningham, J., Margoliash, E., Brown, R., Richardson, M., Boulter, D., Ramshaw, J., Jefferies, R., Runnegar, B., Knoll, A., Carroll, S., Wray, G., Levinton, J., Shapiro, L., ArisBrosou, S., Yang, Z., Bromham, L., Rambaut, A., Fortey, R., Cooper, A., Penny, D., Welch, J., Fontanillas, E., Bromham, L., Wheat, C., Wahlberg, N., Schulte, J., Erwin, D., Laflamme, M., Tweedt, S., Sperling, E., Pisani, D., Peterson, K., Filipski, A., Murillo, O., Freydenzon, A., Tamura, K., Kumar, S., Tamura, K., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar, S., Hug, L., Roger, A., Ho, S., Phillips, M., Sanderson, M., Thorne, J., Kishino, H., Painter, I., Sanderson, M., Britton, T., Anderson, C., Jacquet, D., Lundqvist, S., Bremer, K., Shaul, S., Graur, D., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar, S., Mello, B., Schrago, C., Rambaut, A., Bromham, L., Kishino, H., Thorne, J., Bruno, W., Douzery, E., Snell, E., Bapteste, E., Delsuc, F., Philippe, H., Battistuzzi, F., Filipski, A., Hedges, S., Kumar, S., Schwartz, R., 24 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 Mueller, R., Ho, S., Phillips, M., Drummond, A., Cooper, A., Blair, J., Hedges, S., Warnock, R., Yang, Z., Donoghue, P., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar, S., Ayala, F., Rzhetsky, A., Ayala, F., Peterson, K., Lyons, J., Nowak, K., Takacs, C., Wargo, M., McPeek, M., Cutler, D., Chernikova, D., Motamedi, S., Csuros, M., Koonin, E., Rogozin, I., Richter, D., King, N., Dunn, C., Giribet, G., Edgecombe, G., Hejnol, A., Emes, R., Grant, S., Burkhardt, P., Hejnol, A., Martindale, M., Balavoine, G., Adoutte, A., Hirth, F., Miller, D., Ball, E., Northcutt, R., Strausfeld, N., Hirth, F., Tosches, M., Arendt, D., Lyons, T., Reinhard, C., Planavsky, N., Planavsky, N., Sperling, E., Frieder, C., Raman, A., Girguis, P., Levin, L., Knoll, A., Stanley, S., Peterson, K., Cotton, J., Gehling, J., Pisani, D., Butterfield, N., Benito-Gutierrez, E., Arendt, D., Holland, L., Carvalho, J., Escriva, H., Laudet, V., Schubert, M., Shimeld, S., Yu, J.-K., Turner, S., Young, J., Pisani, D., Poling, L., Lyons-Weiler, M., Hedges, S., Otsuka, J., Sugaya, N., Hedges, S., Blair, J., Venturi, M., Shoe, J., and Gu, X. (2015). Molecular clocks and the early evolution of metazoan nervous systems. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 370(1684):424–431. [Wu and Watanabe, 2005] Wu, T. D. and Watanabe, C. K. (2005). GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics, 21(9):1859–1875. [Zakon, 2012] Zakon, H. H. (2012). Adaptive evolution of voltage-gated sodium channels: the first 800 million years. Proceedings of the National Academy of Sciences of the United States of America, 109 Suppl(Supplement 1):10619–25. [Zapata et al., 2015] Zapata, F., Goetz, F. E., Smith, S. A., Howison, M., Siebert, S., Church, S. H., Sanders, S. M., Ames, C. L., McFadden, C. S., France, S. C., Daly, M., Collins, A. G., Haddock, S. H. D., Dunn, C. W., and Cartwright, P. (2015). Phylogenomic Analyses Support Traditional Relationships within Cnidaria. Plos One, 10(10):e0139068. 25 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 1 Supplemental Figures Raw reads coverage vs GC 100 1e+06 80 1e+05 Bacteria 10000 60 Bacteria Haploid Diploid Repeats 1000 Repeats 40 GC% 20 100 10 0 973 1 100 200 300 Coverage 400 500 Counts Supplemental Figure 1: Coverage vs. GC content for reads Lava lamp plot of the unfiltered paired-end reads. Coverage was calculated as the median 31-mer coverage for each read. High-GC reads indicate the presence of bacteria in the raw reads. 26 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 80 100 T.wilhelma Moleculo contigs coverage vs GC 40 GC% 60 100 0 20 10 1 100 200 300 400 Coverage Supplemental Figure 2: Coverage vs. GC content for genomic scaffolds Heat map of percent GC versus coverage of reads for the all Moleculo reads. 27 Counts 60 80 100 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 0 20 40 %GC 10 50 100 150 200 250 300 1 Counts Coverage Supplemental Figure 3: Coverage vs. GC content for genomic scaffolds Scatterplot of percent GC versus coverage of reads for the all scaffolds. The 1,040 contigs with zero coverage are carried over from low-coverage Moleculo reads, and likely derive from amplified contaminating DNA. 28 80 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 20 30 40 50 60 70 Non−Rhizobium 1 Non−Rhizobium 2 Non−Rhodobacter 1 Non−Rhodobacter 2 Non−Rhodobacter 3 Non−Rhodobacter 4 Possible Rhodospirales Sponge contigs 0 50 100 150 200 250 300 Supplemental Figure 4: Separation of contigs derived from bacterial symbionts Sponge scaffolds are in green, while scaffolds assigned to bacteria are identified by blue (Rhizobiales) and pink (Rhodobacter ). Low-coverage contigs were removed. Seven bins were identified with MetaWatt to separate the bacterial contigs, though several bins appeared to correspond to the same bacteria. 29 100 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 60 40 20 Percent Identity 80 Twi to A.queenslandica Twi to S.ciliatum Twi to N.vectensis Twi to H.sapiens 0 100 200 300 400 500 Genes ordered by identity with Aqu Supplemental Figure 5: Percent identity between sponges Calculated protein percent identity between 570 one-to-one orthologs between T. wilhelma and A. queenslandica (blue), S. ciliatum (green), the anemone N. vectensis (purple) and human (red). Average identity between T. wilhelma and A. queenslandica is shown as the dotted black line at 57%. 30 300 400 513 100 200 750 total blocks 3400 Tethya genes in blocks 84 53 34 0 Number of blocks 500 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 3 6 17 9 14 8 7 2 3 1 0 1 1 1 0 1 0 0 0 9 12 15 18 21 Number of genes in block Supplemental Figure 6: Length of microsyntenic blocks Histogram of number of genes in detected microsyntenic blocks between T. wilhelma and A. queenslandica v2.0 gene models. 31 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Hoilu ng ra r1 TriaHoiail_b _g d1_ungike g42 a_bra 10333.t1 H r1 Tr H oilung 44.t1_ke _sca_g1033 iadoilu Tria_b ffold 2.t1 Tr ia n 1 H Ho o _g gia d1raker1 _4 ia iluilu 4T d1 r4ia_bra _g424_g10 2 ngng 6.d1 ke _g 3.t2 056 r _ t i i 1 42 a_a_ 1_ g4 _g _ .t1 brbra _sc245 103 _scaff 53 ak ke aff .t1 28 old .t1 _4 e .t o _ r r1 1_ ld_ _s 1 __ _g g1 sc 4 caff 10 03 af old fo 33 2 9 _4 ld 1. .t1 _4 t1 97 8 9408 2 85 97 Hexactinellids 98 99 77 54 54 73 88 Amel_4.5_GB53665-PA NV15929-PA 38 59 87 35 52 98 86 99 99 Cgigas_EKC28453 0180 Lotgi1|12 6.2 31 09.p eq.14m as irn 3126 ob 53 1249 v220 |64a21|14 1 Ocbim o Helr Capc 53 29 90 92 Sakowv30025742m 39 32 22 23 19 21 21 30 97 72 93 50 37 20 58 97 89 99 69 54 80 4 .1_ D1 Cm ilii _ XP DBH-like 2 a ca pi lis ro Xt MOX11_DROME Prototomes Echinoderm/hemichordate Chordates Vertebrates DBH-like 1 86m X E E O ULS2 M NR MBOH __ DA 2__D CA 2_ XD.1_ XD M2O12 NO O A 0 M L2 0_ A3 DA DBH UE s_ 9G __ ru H .1 au 96 Bt 01 1 28 01 P_ _X pm NV10994-PA Amel_4.5|GB41735-PA-trimmed 97 Amel_4 _GB406 NV.515 993-PA94-PA OME .18 MOX12_DR .p a -sr 06 011 35247 0111 i1|1352 0000 NAP otg i1|1 88972 LoLtg gnyi_S 000000 Csavi nyi_SNAP Csavig evo r-d 26.1 300350 Sakowv 39026m Sakowv300 2 029 026 M MM O _0 XDO1XO X 07 _D D 89 M1O_H1_C 74 UU H 96 SEM IC .1_ ANK _ MODBH 9 XD L1 6 1_ bflo DA r na NR bflo s E e rna seq bflorq.24 .34 114nase0q74.1 .1_0 .50 _0 bflo 705 rnase .1_ bflorn q.28 0 aseq 386.2 .460 _1 1.9_0 30m v22 eq.1 q1 se 01 0_ _0 _c 47 q2 57 75 _se 91 .2 c0 .1 p5 0_ m u2 37 o c 66 Aq 5_ 3 omp 57 3 q9 i1_ _c 08 i1_ _se 3_ 67 ri_ 1_ _c0 _g 75 22 i_1 8610 lle _g 45 ler 6 ue 67 m 44 c1m0uel omp tica10 _c s_ia das_ usytdriat 3971i2_5 2 g 1_ hytri ch _ ecpus Slaep lleri_ e 0 01 Sla m u 03 tiais_c1 yudsatr eSplahc nas obir Clionavarians_TR46177|c0_g3_i8 Clionavarians_TR4 Clion avarians_ |c5_g6_i TR53827267 2|c0_g1_i1 1 _1 twi_ss.1 4527.1 _1 .22187 .1_1 Clio Clio nav nav 79 aria arian ns_T s_T tw R 2 9 R 4 79 8 twi_ i_sss.25 792 5 s.1 29 |c0 |c1_g1 _g 2 _i2 tw380 9b.2 _i8 i_s 7.1 _1 s.1 _0 88 9 6 Xt .1_ es 1 t Xt A ut ina A es qu tu 2 ria qu tin .1. _T 2.1 R1 .37 ar 27 29 88 ia 56 _T 6 37 3_ |c 001 R 1 _0 0 0_ A 76 1 qu g1 7| 2. _i1 1. c0 _1 2 75 _g 48 1_ _0 i6 01 _0 twi_ss 99 96 bim Oc 1_i1 seq1 3_c0_g _c0_ R1342 2191 um_T mp2 pulifer us_co vast onema_po s_ te hyal _seq1 callis 648_c0 aphro comp6 _i3 vastus_ 070_c0_g1 0_g1_i2 allistes_ lata_TR4 aphroc rosella_fibuagel la_nux_TR9313_c symp 6_c0_g1_i1 nux_TR1348 sympagella_ 81 99 Dreri Homoscleromorphs _4 ld fo 4 af _ sc old __ ff t. 1 _sca d_4 d_4 1 seq 51 1_ fol fol 3_ 42 2.t caf caf _c _g25 __s __s 15 16 d1 4 t1 t1 6 p1 q1 ia1_g 48. 49. e r m eq1 s s 2 d o _ Tia 42g4 0_ _c c0 8_ 620_c 98 Tr 1_dg1_ 44 76 3 12 p10 iadria _4 i4_ 60 comp _com Tr T g 4_ _ h 0 _ 6_ 95 _t 7|c lla 267 294 853 are 0_6 h60_ TR5 ial c _ rt s 6 _ a o t_h a_t rum 1_p _ ll lab _i1_ lla are nde _g1 are osc |c0 Cca os c 586 R48 m_T ru lab nde Cca 99 85 92 DOP D O_MDOOP o_AA I63055OPO _HUUSOE_ Cmilii_ .1__DBHMABNOVIN XP_007 65 Csavigny 898364 i_SNAP0 .1__DB 00000936 H 69 sima_20 778 98 83 Dopamine β-hydroxylase le ur ur a_ el e llav 03 ia_rc AI leg anh _au _f PG an A in oef ENtis IPG fen relia alA E2sim E _rc SM i_t_ 7 9 a _ NE h01 _fin _1 0529 2 _03 alA 221 529 SM 7_ NV 27967 1 _ _co 18 _p E8 mp 59 ar 08 232 _5 tia 9 l 8 _c 0 _s eq4 _a Fungia_scutaria_26369 Porites_lobata_6537 03 op Anthop NVE14232 leura_e legantis th 14732 Acropora_dig ora_12988 2 lepitifera_ Acropora_mil 2750 scutaria_ Fungia_ 197 2125 a_10 nsbisat_5 ralielo s_aust tes_ Porite Pori _ es rit An ato 864 1 8691677 751 1_9 ra_3 baista_ _5 60 itife s_lo ens aria 62 _dig riteali ut o P 17 6 ustr _sc pora s_a ungia 407 11 ia_ Acro e it 1132 _ r F Por _14 ta ta sitsa_ ba cu enba 4 lo _s alilo 4 s_ ia str s _ 3 4 t e g aurite _1 ori un es_ o sis P F rit P en Po rali st au Po Cnidarians Placozoans Demosponges 0.3 Supplemental Figure 7: Dopamine β -hydroxylase homologs across metazoans Tree of DBH and DBH-like proteins across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown. 32 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 83 TYDC1_ORYSJ TYDC2_ARATH TYDC1_ARATH TYDC1_PAPSO 99 TYDC5_PAPSO TYDC3_PAPSO TYDC2_PAPSO TYDC2_PETCR TYDC1_PETCR 74 TYDC4_PETCR 85 TYDC3_PETCR 50 98 Plants SARC_01107T0 50 Odioica_GSOIDT00000153001 Platygyra_carnosus_4652 40 99 75 98 Sponges scict1.001978.2_0_partial Syconcoactum_contig_14168_4-partial 92 42 lctid68041_0 Syconcoactum_contig_34351_2_partial 39 35 16 12 43 7 Capsaspora_KJE95580.1 Ccandelabrum_TR92788|c0_g5_i7_4-c0_g8_i14_2-partial oscarellacarmela_comp38470_c0_seq2_2 Hoilungia_stringtie|12206.1|m.15829 Triad1_g1959.t1__scaffold_2 Nve_XP_001635455.1 Alatina_alata_c38444_g1_i1_0 Chironex_fleckeri_TR31060_c0_g1_i1_0 50 NVE18494 AIPGENE1003 AIPGENE10179 Cioin2|262216_partial Cioin2|230165 Csavignyi_FGENESH00000078696 Csavignyi_GENEFINDER00000066949 Helro1|186120 81 Capca1|158583 Lotgi1|181541 Lotgi1|201667__AADC-like 82 15 Cgigas_EKC41301-trimmed Ocbimv22022526m.p 60 obirnaseq.78969.4_1 Ocbimv22022529m.p 21 Ocbimv22022527m.p Lingula_comp152792_c2_seq3__DDC-like Sakowv30017927m 25 Bf_V2_253_g40219.t2 21 95 Bf_V2_287_g43385.t2 Bf_V2_250_g40085.t26 86 Cmilii_XP_007895676.1__AADC 34 Drerio_NP_998507.1__AADC Xtropicalis_XP_012820058.1__AADC 98 94 DDC_MOUSE DDC_HUMAN 88 DDC_BOVIN Cgigas_EKC25403-trimmed Ectopleura_larynx_g20429.2_i2_joined-partial Ectopleura_larynx_g36591.2_i1_partial 91 HAEP_T-CDS_v02_46779 86 98 HAEP_T-CDS_v02_10472 6 hydra_sra.4784.1_2 66 HAEP_T-CDS_v02_45038_partial 99 hydra_sra.19042.1_0 82 HAEP_T-CDS_v02_1229_partial 33 hydra_sra.8276.1_2 Dmel_FBpp0080698 Dmel_FBpp0080697 TC013401 97 53 Bmori_XP_004931016.1 33 TC013402 23 Amel_4.5|GB45938-PA 54 71 NV11111-PA E9RJV1_GRYBI Amel_4.5|GB45973-PA__AADC-like NV11109-PA 41 TC013480 35 DDC_DROME 82 DDC_MANSE Bmori_NP_001037174.1__AADC Spurpuratus_XP_011664746.1__AADC 93 Sakowv30016396m Helro1|101612 86 Helro1|84539 Helro1|84403 Capca1|119245 75 Lotgi1|139922__HDC-like 96 Ocbimv22019847m.p 95 Amel_4.5|GB55830-PA 85 TC012567 Dmel_FBpp0085475 TC030580 59 Dmel_FBpp0085476 44 NV18137-PA Amel_4.5|GB55831-PA Spurpuratus_XP_789367.3__HDC 98 PFL3_pfl_40v0_9_20150316_1g34573.t1 Skowalevskii_NP_001161568.1_HDC 93 pmar16.36248-30514.1_0 Oreochromis_niloticus_XP_005463234.1__HDC 96 A7KBS5_DANRE__HDC Xtropicalis_XP_002939672.3__HDC CHICK_BAP16218.1__HDC 83 DCHS_BOVIN DCHS_MOUSE 82 DCHS_HUMAN Aplysia_XP_012940696.1 Aplysia_NP_001191536.1__HDC 93 Capca1|180248 73 Ocbimv22006132m.p 59 Cgigas_EKC37654__HDC 28 93 LOTGIDRAFT_119964__HDC Dpulex_EFX79676.1 Apisum_XP_008179690.1__HDC Bmori_XP_012551886.1__HDC 99 50 TC010062 70 DCHS_DROME 89 NV12919-PA 99 Amel_4.5|GB47379-PA__HDC-like Bterrestris_XP_003393425.1__HDC Placozoans Cnidarians Aromatic amino acid Decarboxylase Hydrozoan AADC 32 Invertebrate AADC-like Invertebrate HDC-like 52 Homoscleromorphs Calcarea Histidine Decarboxylase Placozoans Cnidarians Protostomes Echinoderm/hemichordate Chordates Vertebrates 0.5 Plants Supplemental Figure 8: Aromatic amino acid decarboxylase homologs across metazoans Tree of AADC and histidine decarboxylase (HDC) proteins across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown. 33 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Aqu2.1.28354_001 ial _i2_2 1_g2i1_ 4 _part _i1_1 R35249_c c0_g1_ Crellaelegans_T 76_c1_g1 22979_ _TR827 helens_TR Tedaniaan calephyllophila My Demosponge MAO Cnidarian MAO 83 6 q.4 8471 ase 2778 orn 20|1 bfl 1c|a1 caap 6 p 8 C Ca 7 68 ed 8|510 oin FL 3a21 .p-j RA |1pc 3m _B aC1 a 855 C1 FL pc ZD RA 200 Ca CA30_B imv424 O AO A ZE O|c1b902 .1__0M.1A__M .1__MAO C3 i1 00136162 8354 __MAOA .1 7 39 _00108 Lotg AH_0 P 15155953 P is0_N_0 o__AX la n_XPIN FuriguX keev Dre OV AN Chic E _B UM FB_HB_MOUS AOFB AOF AO 48 77 UMAN AOFA_H AOFA_BOVIN 9037 MOUS 61AOFA_ E 38 47 30 84 0 62 44 5 g .1- 19 67 6.t 2 Monoamine Oxidase A/B 84 52 25 H ila al Te _T isa da R8 tw rc n ad 20 i_ Sty Cli iaan ss 4 uj 0 o lis n he ar _c .16 My sac av le 2 n a cale _g693 din art ri s_ i_ 1_ .1 eri_ans TR ph H i2 _1 yll TR_TR 28 AD _2 op 0 1 hil 0064 67 A0 a_T 1633 _c 10 4_0_ 0_ R8 eph 43 c0c0 g1 08 yda __ _ 72 50 tiam i11 _c0 g1g_1i1 0. _ i_1_ _ u eph _g2 1_ 53 4 yda elleri_ _ 5 i1_ tiam 1 1 uell 8278_ eri_ c 248 omp6 26 _ 6 6 com 77_ twi_ p69 c0_ 421 seq ss.16 970 _c0 1 _se .1_ q1 2 NV E1 aa_p_ horrotpra 79n3sisra_1 604_ scspoar_odrlien 7 _4438147 partia ut_4a_ iag_ sis l 5 ar34pi itm _ ia 1stiifeilrle872 _35 lla a_po 9 74 ta 16ra_ _ 46 86241145 73 80 57 76 79 88 60 68 24 2 19 22 2 ri_ lle ue yll op h m ia at yd lep h 85 28 Cnidarian and sponge group h ep M yc a 5 58 86 6028 2N4E VEE NPG AI 90 e 85 M AIPNV E15952 GEN E1 86 AosnPFu ngi Aercst 98 P o trota raa_scu 1888 oitro _cataria_ peosp_reoaara FurS1Aites erno287 a_uss_trmvil Fnat0 c o_a sa 38 A p_a2lile gyvirl0 u e po _29 iopc s 49 84 9 9 gu _X Dr P_ eri 01 o_ 16 XP 1_ 02 _6 g1 81 86 99 0. 31 8. 1_ 0 t2 __ _M.4 sc A_O_M af fo -liAO ld ke _4 lik Bf_V 2_196_g33248.t2 C3Z _BR CapcEL9 a1|136AFL 246 br 96 Fu M on 43 8 |30097 Dappu1 6.1_0 359_0 1872 lctid61scict1.0 02 855.p 1|1 12342 4m.p Helro 9147m 2955 _9262 6 18 00 os5a9_23413701761155 v220 v22 ern_6r4ia 9p_2rraa__ Ocbim cbim cavcuta a O ea_ia_slatsis__s ifpeo strang tilenoragilte 988 nta Fu_pisali p di il Mo horaustsrtreoraa__m E149 or a lop s_ A oppo EN95 Styorite Accrro IPG12 P A A VE N Plants and other outgroups S. arctica Aqu2.1.28355_001 MAO-like Ctenophores Placozoans Cnidarians Protostomes Chordates Vertebrates Stylissacarteri_TR109385_c0_ g1_i2_3 Demosponges Homoscleromorphs Calcarea 71 1_ p7 m co 60 73 80 q1 se 0_ _c SARC_08858T0 Bf_V AIPGE 2 _1 NE Bf_V 2_26 84_g30 14762 Bf_V 3_g4 7 2_98 1383 03.t1 _g16 122..tt1 1 AIPGENE NVE215 22275 36 AIPGENE2226 AIPGENE3902 7 3991 94 AIPGENE17020 AIPGENE9768 Capsaspora and cnidarian group 94 99 CAOG_00719T0 Acropora_millepora_19274 AIPGENE7844 100 100 7 61 4 44 13 7 0 62 5 5 59 75 79 96 1_0 hydra_sra.16499. .1_2 .12474 hydra_sra SARC_03516T0 69 25 S.rosetta_PTSG_06229T0 47 85 Primary amine Oxidase (PAOX) 100 0 6 0 40 66 30 71 54 56 78 43 99 69 99 Triad1_g432.t1__scaffo Hoilungia_braker1_g04833.t1 ld_1 Tria d1_ g Hoil9931.t1 ungi __sc a_bra aff Hoilu ke old ngia_b raker1 r1_g0_9138 35.t1 _g0483 Hoilung 4.t1 Hoilungia_ akker ia_br er 1_g bra g04883 Triad1_g431 1_ 35.t11 .t1__sca04 ffold6.t _1 95 78 NV N 10V1 9609 2- 61 PA-P A Am M el AO -li _4.5 Am ke |G el_ B5 4.5 00 91 |G 76 B5 -P 01 A 62 -PA NV _p FBp F B 17 art p00 pp0 FB 77 ia 762 074 p p 6-Pl 86 _ 9 5 F 0 3 _SM 8 Bp 08 A p0 5 OX-l ike 074962 52 00 l Ctenophore KDM1 X-like __ 5 97 7 36 __PAO TC om Tia2_t M_x 9S0_ NV 7_05 E1 HY94 71 DV61_ U com Lo 56_ tg _K p7 Cg i1 D 43 ig |14 M1 53 as 5 -li _c _E 16 ke 0_ seq KC 5 _p a 1 rti 24 al 08 9 t2 7. 52 39 -g .2 19 15 .3 eq as rn flo -b .t1 q1 q1 H _se 25 se 0 _ E C c 0 1 95 R TK ANUS 23_ 88_c _seq g3 NT_LAIC M MODO p87 p82 6_c0 1 6_ XE83C_HHBU_ON 1 com com 1676 _seq M B _ 24 2_AB3D1_2M 73_ 797_omp 45_c0 K 0 q1 0 2_ LRH3RDG XM 095 _09738_cmp96 _c0_se _V V 1BFK6 20_ h200_298 2_co omp12342 Bf F6 E83 t_h _t_ _c h2_104 rak_ameis _t_ h20 _08640 hpolo s_at_ _h30 cp3ysiro 1 mip ealiuth te0na_t 9oo5yc hor M L0nth bboba 88 6283T0 SG_0 tta_PT 7 S.rose __scaffold_2 t1 g8203. 66 res Am Lysine-specific Histone Demethylase (KDM1) _g1_i1_1 Placozoan PAOX-like 1 33483_00 Aqu2.1. _001 Aqu2.1.40694 91 58 KDM1A FBpp03 04939 33 el_ 9BKf_ AIP 4.5 DVM2_ GEN |G 1A225_ E198 THrioialu M g37 hy5d2_p d1ng B4 _H _gia1 ra 83 UOM 2 ra art 1_b 565ke 86 UASN .t1r1___g10 E32.t2 _sraia .3l0 -PA scaff903.t1 586 old_4 .1_ 7 _KD M1 -lik e_p artia 958 KDM1B 1_ Monbr NVE13555 1 NV 95 10A N8V-P 5 09 Corticiumcandelabrum_TR56531_c1 87 PA 9- 84 8 4 99 EN e OUUMS-lAike -lik _M _HO X OXEN OXX PA 2 USA PAO__PA _SOMM 5.t MI_MU ICK 82 ALOOXX__H g8 _CH _CSM _ 1 5 V 9 M B S J _3 KL F1N 2 V9 _V Bf Da YSJ LDL1LDL1_OR _ARA LDL2_ ARATH TH LDL2_O RYSJ 10 26 7 10 68 189 19 3 1 |31173 39-PA Dappu1 B459 _4.5|G 27058 Amel u1|3 Dapp 1 96 35 89 A B45215 6-P-PA Amel_4.5|G NV1830 0 39 |5 0 u1 97 pp 01 Da1|3 u pp Plant Lysine Demethylase 97 0.8 Supplemental Figure 9: Monoamine oxidase homologs across metazoans Tree of monoamine oxidase (MAO) and related proteins across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Searches for lysine demethylase (KDM1) and primary amine oxidase (PAOX) were not exhaustive, and were added to display the sole positions of ctenophores (only have KDM) and placozoans (only have PAOX) in this protein family. Bootstrap values are 100 unless otherwise shown. 34 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. CAOG_08244T0 41 CAOG_00213T0 CAOG_09932T0 CAOG_05982T0 21 Filasterea Aqu2.1.43530_001 twi_ss.25346.1_2 41 97 52 54 ephydatiamuelleri_20738_comp67567_c0_seq1 ephydatiamuelleri_25644_comp76203_c0_seq1 ephydatiamuelleri_25735_comp78991_c0_seq1 ephydatiamuelleri_22338_comp68117_c0_seq2 ephydatiamuelleri_06289_comp54437_c0_seq1 ephydatiamuelleri_05142_comp51299_c0_seq1 ephydatiamuelleri_20901_comp67615_c1_seq2 Aqu2.1.23001_001 Aqu2.1.22999_001 twi_ss.11316.1_0 ephydatiamuelleri_09349_comp60264_c0_seq6 ephydatiamuelleri_10299_comp61291_c0_seq1 Slacustris_c104630_g1_i1 rosella_fibulata_TR14631_c0_g1_i1_1 aphrocallistes_vastus_comp22152_c0_seq3_3 aphrocallistes_vastus_comp16141_c0_seq1_1 rosella_fibulata_TR14675_c0_g1_i1_0 sympagella_nux_TR10361_c0_g1_i1_0 rosella_fibulata_TR4584_c0_g1_i5_2 sympagella_nux_TR20329_c0_g1_i1_2_partial 95 aphrocallistes_vastus_comp22385_c1_seq6_1 aphrocallistes_vastus_comp22385_c1_seq2_1 aphrocallistes_vastus_comp22385_c1_seq1_1 aphrocallistes_vastus_comp12269_c0_seq1_2 aphrocallistes_vastus_comp17700_c0_seq1_4 97 rosella_fibulata_TR4083_c0_g1_i1_1 rosella_fibulata_TR4083_c0_g1_i2_1 hyalonema_populiferum_TR19_c0_g1_i1_4 rosella_fibulata_TR7683_c0_g1_i1_0 97 rosella_fibulata_TR7683_c0_g1_i3_0 62 aphrocallistes_vastus_comp11820_c0_seq1_4 rosella_fibulata_TR8723_c0_g2_i1_2 sympagella_nux_TR21451_c0_g1_i1_0 75 88 93 rosella_fibulata_TR3980_c0_g1_i1_1 rosella_fibulata_TR3353_c0_g1_i1_0 aphrocallistes_vastus_comp20208_c0_seq3_0 97 hyalonema_populiferum_TR13658_c0_g1_i1_1_partial rosella_fibulata_TR7995_c0_g1_i1_2 sympagella_nux_TR21457_c0_g1_i3_1 86 sympagella_nux_TR21457_c0_g1_i2_0 sympagella_nux_TR21457_c0_g1_i1_0 twi_ss.15124.1_2 ephydatiamuelleri_17836_comp66476_c0_seq2 ephydatiamuelleri_08108_comp58383_c0_seq9 twi_c31705_g2_i8_0 twi_c31705_g2_i4_1 97 twi_ss.30746.2_1 85 twi_ss.29636.1_1 Aqu2.1.28011_001 Aqu2.1.38315_001 ephydatiamuelleri_22930_comp68288_c0_seq43 52 Slacustris_c104641_g1_i1 Slacustris_c98123_g1_i3_partial ephydatiamuelleri_16611_comp65944_c0_seq8 25 Aqu2.1.39153_001 ephydatiamuelleri_16976_comp66116_c0_seq1 95 ephydatiamuelleri_16977_comp66116_c0_seq2 38 Slacustris_c103807_g1_i3 16 Aqu2.1.25564_001 Aqu2.1.38717_001 Aqu2.1.27571_001 54 Aqu2.1.27573_001-trimmed Aqu2.1.39154_001 69 Aqu2.1.38823_001 45 97 Aqu2.1.41785_001 Aqu2.1.26762_001 70 ephydatiamuelleri_23301_comp68424_c0_seq32 36 twi_ss.13620.1_2 44 twi_ss.20824.4_2 ephydatiamuelleri_11626_comp62613_c0_seq9 98 ephydatiamuelleri_23659_comp68518_c0_seq1 ephydatiamuelleri_23660_comp68518_c0_seq5 ML02335a-AUGUSTUS hormiphora_t_x0_09819_comp8395_c0_seq2 37 oscarella_t_h60_65871_comp34165_c0_seq1_partial oscarella_t_h60_42123_comp11388_c0_seq18 oscarella_t_h60_52462_comp11760_c0_seq13 93 oscarella_t_h60_64267_comp17249_c0_seq1 Porites_australiensis_13238 Alatina_alata_c61081_g1_i1_5__lCt 86 Acropora_digitifera_408__lCt 32 Acropora_millepora_2820__lCt Capca1|22448 Capca1|42446 47 Triad1_g522.t1__scaffold_1 Hoilungia_stringtie|10218.1|m.13450 Triad1_g72.t1__scaffold_1 20 Acropora_digitifera_6641 Acropora_millepora_2277 48 hydra_sra.36426.1_0 76 adi_v1.15790 Porites_australiensis_19582 1 C3Y433_BRAFL-trimmed 25 Capca1|22458 Lanatina_comp153988_c0_seq2_1 10 4 Stylophora_pistillata_2729 Porites_australiensis_23261 80 Stylophora_pistillata_11795 Stylophora_pistillata_9171 64 92 AIPGENE7219__lCt Acropora_millepora_3159 NVE1157 NVE5469__lCt 56 Porites_australiensis_9038__lCt Lanatina_comp149962_c0_seq3_2 5 40 98 obirnaseq.37923.1_2 94 Capca1|204049 TC007169 Amel_4.5|GB53009 71 Dmel_NP_001285554.1__GABA-B3G Porites_australiensis_8820 95 Anthopleura_elegantissima_62189 G5ECB2_CAEEL__gbb-2 47 C3YEC0_BRAFL Q1LUN9_DANRE 84 98 GABR2_HUMAN GABR2_MOUSE 65 2 Capca1|222380 97 1 Lanatina_comp141557_c0_seq2_3 74 Amel_4.5|GB49239 TC014995 95 Dmel_NP_001287456.1__GABA-B2C Hoilungia_stringtie|8751.1|m.11539 AIPGENE12245 5 Hoilungia_stringtie|15673.1|m.20278 36 Triad1_g1739.t1__scaffold_2 Hoilungia_stringtie|6504.1|m.8438 97 Triad1_g6059.t1__scaffold_6 Hoilungia_stringtie|6503.1|m.8445 Triad1_g6057.t1__scaffold_6 Dmel_NP_001246033.1__GABA-B1C 62 Amel_4.5|GB49131 2 TC016191-016192-partial 81 B3VBI8_CAEEL__gbb-1 obirnaseq.69248.1_2 43 64 Lanatina_comp155098_c1_seq4_1 93 Capca1|107055 58 C3Z4V0_BRAFL GABR1_MOUSE GABR1_HUMAN F1QAJ3_DANRE F1RDY7_DANRE Hoilungia_stringtie|3833.2|m.4840 70 Triad1_g7917.t1__scaffold_9 Hoilungia_stringtie|3832.1|m.4618 5 Triad1_g7915.t1__scaffold_9 Hoilungia_stringtie|3838.1|m.4714 55 Hoilungia_stringtie|11079.1|m.14461 61 Triad1_g7871.t1__scaffold_9 Triad1_g3741.t1__scaffold_3 Hoilungia_stringtie|10558.1|m.13818 Triad1_g5475.t1__scaffold_5 11 Stylophora_pistillata_9477 Porites_australiensis_37273 Stylophora_pistillata_9644 Hexactinellids 69 41 15 Demosponges 20 Ctenophores Homoscleromorph Cnidarians Protostome GABAR-B3 GABAR-B2 Demosponges Oscarella carmela Hexactinellids Ctenophores Placozoans Cnidarians Protostomes Chordates Vertebrates GABAR-B1 Placozoans Capsaspora (Filasterean outgroup) 0.4 Supplemental Figure 10: mGABA receptors across metazoans Complete version of Figure 4 metabotropic GABA receptor (GABA-B type) protein tree generated with RAxML. Bootstrap values are 100 unless otherwise shown. The majority of deeper nodes were poorly resolved; branch transparency corresponds to bootstrap support for values under 50, meaning half-transparent. 35 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 246 GABR1_HUMAN GABR1_MOUSE F1QAJ3_DANRE Dmel_NP_001246033.1__GABA-B1C Amel_4.5|GB49131 C3Z4V0_BRAFL B3VBI8_CAEEL__gbb-1 GABR2_HUMAN GABR2_MOUSE Q1LUN9_DANRE C3YEC0_BRAFL Dmel_NP_001287456.1__GABA-B2C Amel_4.5|GB49239 Hoilungia_stringtie|15673.1|m.20278 Triad1_g1739.t1__scaffold_2 Hoilungia_stringtie|6503.1|m.8445 Triad1_g6057.t1__scaffold_6 Hoilungia_stringtie|6504.1|m.8438 G5ECB2_CAEEL__gbb-2 AIPGENE12245 Hoilungia_stringtie|10558.1|m.13818 Triad1_g5475.t1__scaffold_5 Hoilungia_stringtie|11079.1|m.14461 Hoilungia_stringtie|3832.1|m.4618 Hoilungia_stringtie|3833.2|m.4840 Triad1_g7917.t1__scaffold_9 Hoilungia_stringtie|3838.1|m.4714 Triad1_g3741.t1__scaffold_3 Capca1|42446 Capca1|22448 Hoilungia_stringtie|10218.1|m.13450 Triad1_g72.t1__scaffold_1 Triad1_g522.t1__scaffold_1 Capca1|22458 Lanatina_comp153988_c0_seq2_1 AIPGENE7219 C3Y433_BRAFL-trimmed Hoilungia_stringtie|8751.1|m.11539 Dmel_NP_001285554.1__GABA-B3G Amel_4.5|GB53009 Lanatina_comp149962_c0_seq3_2 NVE5469/23-200 twi_ss.15124.1_2/34-214 ephydatiamuelleri_17836_comp66476_c0_seq2 oscarella_t_h60_42123_comp11388_c0_seq18 oscarella_t_h60_52462_comp11760_c0_seq13 oscarella_t_h60_64267_comp17249_c0_seq1 Aqu2.1.25564_001 Aqu2.1.26762_001 Aqu2.1.41785_001 Aqu2.1.39154_001 twi_ss.13620.1_2 twi_ss.20824.4_2 ephydatiamuelleri_23659_comp68518_c0_seq1 ephydatiamuelleri_23660_comp68518_c0_seq5 ephydatiamuelleri_11626_comp62613_c0_seq9 ephydatiamuelleri_23301_comp68424_c0_seq32 Aqu2.1.27571_001 Aqu2.1.27573_001-trimmed Aqu2.1.38717_001 Aqu2.1.28011_001 Aqu2.1.38315_001 twi_ss.29636.1_1 ephydatiamuelleri_22930_comp68288_c0_seq43 ephydatiamuelleri_16611_comp65944_c0_seq8 twi_ss.30746.2_1 Aqu2.1.38823_001 Aqu2.1.39153_001 ephydatiamuelleri_16977_comp66116_c0_seq2 ephydatiamuelleri_16976_comp66116_c0_seq1 twi_c31705_g2_i4_1 twi_c31705_g2_i8_0 ephydatiamuelleri_08108_comp58383_c0_seq9 Aqu2.1.23001_001/19-196 twi_ss.11316.1_0/38-229 ephydatiamuelleri_10299_comp61291_c0_seq1 ephydatiamuelleri_09349_comp60264_c0_seq6 hydra_sra.36426.1_0/39-213 hormiphora_t_x0_09819_comp8395_c0_seq2 ML02335a-AUGUSTUS Aqu2.1.43530_001 twi_ss.25346.1_2 ephydatiamuelleri_25735_comp78991_c0_seq1 ephydatiamuelleri_25644_comp76203_c0_seq1 ephydatiamuelleri_06289_comp54437_c0_seq1 ephydatiamuelleri_05142_comp51299_c0_seq1 ephydatiamuelleri_20901_comp67615_c1_seq2 ephydatiamuelleri_22338_comp68117_c0_seq2 ephydatiamuelleri_20738_comp67567_c0_seq1 CAOG_00213T0 CAOG_05982T0 CAOG_09932T0 - P GC S S V - P GC S S V - P GC S S V - AGC S T V - AGC S T V - TG C S S V - TG C S P V GG VC P S V GG VC P S V GG VC P S V G G TC S S V G A A C TH V G A A C TH V G SG Y S S V G SG Y S S V G AG Y S S V G AG Y S S V G AD F S T V GGQC T E V G AG Y S KC G AQN S RC G AQN S R S GP VSS I C GP VL SAV GP VF SY I GP VL SSV G P I L SQ T G P L T T VG A A AN S E L GGGC S I V G A AC S I V G A AC S V V G S AC S T V GGGC S E A GGGC S V S G P GC S E S GPG - GAV G P GC S P V G S AC S E V G T AC S E V G S SC SD V G P AC S A V G ADC S I A G ADC S V S A E GC T I V A AGC S T V A E GC S V V GGGC S L V G S GC S V A G S GC S V A G AGC S A A GC GC S V A GC GC S V A GC GC S V A GC GC S V A GC GC S V A G S GC S V A GC GC S T A GC GC S I S GC GC S T A G S GC S L A G AGC S V A G AGC S V A GGGC E N A G S GC E A A G S GC S V A GGGC P S T G AGC S L A GC DC P E L GGGC P A V G AGC S I A G AGC P L A A P GC T E E GGGC S L A DGGC S P A GGGC S S S DGGC N T V G P P Y TEQ G P V F TD V G P V F TD V G P PC ED S G P GC S T S E G P GC S E A G L AC S D S GP LC SEA G P RC Y N S G L SC YD L G P SC S E S GL PC S TP Q A L I Q SC V Q A L VD S C V Q A L VD S C V 268 - YGS S S P AL S - YGS S S P AL S - YGS S S P AL S - YGAS S P AL S - YGAS S P AL S - YGS S S P AL S - Y GG S S P A L S - F AA T TP VL A - F AA T TP VL A - F AA T TP VL A - YDV TS P E F S - Y A D TH P M F T - Y A D TH P M F T - YGA TS P S L S - YGA TS P S L S - Y S A TS P VL S - Y SASSP VL S - FGS TS P VL S - Y A E TH A K F A - Y S S TS PQL S - HG S TS T I L S - HG S TS P I L S - N AA TS A TL S - P S S A SMK L - H T A T SME L T - H T A T S MQ L V - P S A TS L KL A - SG V S S A A L - Y ASVSP I L S - Y GC MS P S L S - Y ASASP AL S - Y ASASP AL S - Y ASVSP AL S - F S AG S P D L S - YGAARL K L S - MAN AN P A L G - F S TY S P AL S - Y AG S S P A L S - FGS TS P AL S - FGS TAP AL S - HASSAP AL S - Y A S TA SD L S - PL SSSPSL T - P L S TS PQL E - F L SSSPSLD - F L SSSP ALQ - F L SSSP AL E - Y Y S TS P S L N - Y S S S S VH L R - F S S S S VH L R - Y S S A A P ML S - YGS AA I TL S - YG T TS T TL S - Y F S AS TAL S - Y F SAS I EL S - Y VD V S S A L D - P E P L E AQ P S - A I AG A F V L N - Y A TG A D V L T - F A A S TH I L N - C ASSSSEL A - C A S S SHN L A - C V S S SN E L N - CDS S TP E AE - SD S S A P E L E -H I SSSPEL S - Y N P G L MR Y S - Y A SQ S S V L N - Y D R G TM S L N - Y D TG A I S L S - HSSSSAVLD - Y S S S SG V L D - YGF S - - I L G - Y F A TS P AL S - L SASSP AL S - FGS S S P TL T - FGSSSP SL S - Y T E AN E P F E - P TAS S S E F S - P TAS S S S L S - Y AS T TPQL N - YGYN S P I FH VS VA T TVE L T VS I A TS P S L N VH L A S S A A L A V H MC G S P N L G AH I S S S F I L G L HMA A S P A L K I SG S S S P A L S - I TVARP S A T - - I E TG K A V A - - I E TG K A V A 287 F R TH P S A F R TH P S A F R TH P S A F R TH P S A F R TH P S A F R TH P S A F R TH P S A F R TV P SD F R TV P SD F R TV P SD F R TV P SD F RVVP SE F RVVP SE L R T I P SD L R T I P SD FRT I ESE FRT I ESE Y R TV S SD F RVVPGS Y R T I P DD F R TA L SD F R TA L SD Y R TA L SD MR T A I SD F R TA I SD F R TS I SD F R TS I SD F T T E T TD F R TS S PD F R T VQ P E F R TY P S E FR I YPSE F R TRS S E F R T I RSD WR N I P TM VR TVP P E F K T I Q TD Y RL Y P S E Y R TVAPD L R TVAPD Y R L A A AD Y R T VQ P D L RV V S SD L R TV A SD YH L L P AE HQ I N P D E HQ L L P S E F TL RP SD FQ I Y VSE FQ I Y ASE F R TY P SD F R TH P S I F R AN P P S F R TN P S F F R TN P S L F R TS P SD F QL F P SG F R TL VS F F R TL VS F Y RA L I SN F QML P T E F QML P S A FQL L P TE F Q TL PND F QML P N E F Q L L G SG VQ V F P S R F R TVP S S F R TY P SD I R TY P SD L RA I L SD F R T I P SD F SML P S E F RVSP SE FRT I PSE FRT I PSE YRT I PSE FQ TP P S I L RLHP VE L RLHP VE F R TVP S F YQ TARS I L R TVS TS P RP I SS S G I RG S I D TS T TPC I VG I T P S M F GMAG S A Y RV A S SG F Y N GQN Q F G AG S N Q F G AG S N Q 367 GL F Y E TE A GL F Y E TE A GL F Y E TE A GL F Y VVAA GL F Y VVAA G V F Y EDMA GL F Y V TE A GQ F DQN MA GQ F DQN MA GQ F D E N L A G L F D E RM A GN F N E H F A G N F N E TWA A S F Y E K EG A S F Y E K EG A S F Y E D TG A S F Y E D TG AN F Y E D L G VD VD E E MA GN F F GQG A A I GQ A P V I A I GQ A P V I V Y A Y P DM T L F AY SVKA L F AY SP I A L F AY P SKA L N A Y P GQ I L F AD L I P T T I C NMP Q A TA S Y ED EM AG F Y S P QG GG F Y S S QG TG F F P S P G YN S Y AK E A F DG S EQ T F GMF R E T T A A L MA P HQC L S S Y ED I A G S F SQ E L A GS F S PQL A G S F S ED I A GMF Y E D A A L SVY P AY A V N T Y P TH A G L F Y E E VG AL L Y E P TA G L F Y E E TG L NC D P K I S L N MWE P K A L N MWE P K A L NMY SG K A I N TD P D L A LNS Y PNP L N S Y SG P A VN S Y P E P A L N S Y P GQ A L NMY SN I A I N T Y QD I A I N TY PD I A I NAY P SF S L AMY A P MA L AMY P GH A L AMF S KQ A L AMY VN D T L AVYQPDA L AMF L P R A L N T Y QN T A L NMY P P H A L S MN P L N A L NMY P L I A I N AD P V T T L N VC P E Y A I N AN A S T T G F F Y SN K A G Y F Y ED K A I WM Y E D K A A WM Y E D Q A A L F S V SG A A N F D TD D A AN F D E DD A V Y SQ P E Y L AF L SE RL A G F L GG E L T AF L - GP VA - F S QH L AQ VN I DG I L T AF I GSSL A L L AN S D L A G - - T E HQC Y P K TQ A Q V Y P M TQ A Q V Y P M TQ A Q V 395 L I G - WY A L I G - WY A L I G - WY A F I G - WY E F I G - WY E I I G - WY P F I G - WY A I P G - WY E I P G - WY E I P G - WY Q L NG - P Y S I MA - T Y S I MG - T Y T L L G - WL S F L G - WL S L I G - WL E L I G - WL E L L G - WL Q L PG - YH S L LG - Y I E L V G - WY P L V G - WY P L L G - WY F L PG - F F F MP G - F Y Y L PG - F Y L L P G - WL Y I TF - GY I M P G - WF D F A N - WF D I P G - WL G I TG - WL E G P G - WF T I P G - WWA Y Y G - WY N L MSG S Y R F NG - A L R F MG - WY S L H E - S MG LPS-DTI L VG - D F T L L D - WA D T F A - WY P Y F A - WY P F TG - WY D F P G - WY H F V G - WF Q T L G - WY P L Y G - WY S L Y G - WY N I P G - WY Q T Y G - WY Q A Y G - WY A M RD - S F V M RD - S F V TN D - WY N TN G - D Y T I P M - WY N I P M - WY N L P M - WY A T Y G - WY T I Y G - WY S TYG - TY E T L G - WY T L R D - WY G VY P - F Y S I Y D - WY N V Y G - WY P T L G - WY P T L G - WY P L L E - WH V M Y G - WY D V H A - WN T L PG - F YG L P A - WY T L P G - WY S L P G - WY T LFE- I LP L PS - SL P L PS - SYK F VG - N T E F I G TS Y S F HD - R S V F HD - R S L NA- D I PL F L R - TKL F L Y - RKL QVE - I TL L P D - RD V TPH - QY T TPN - TY T TPN - TY T 446 GG F Q E A P L GG F Q E A P L GG F Q E A P L EG YQE A P L EG YQE A P L PGY P E AP L GG F P E A P L G P S K F HG Y G P S K F HG Y E TS K F HG F R TN P F H G Y E Y S R F HG Y E Y S R F HG Y D TYG Y SN F E S YG Y SN F QQD I Y A A F K RD L Y A A F QP LGY Y P F P N N TWR G Y R RDDH S S Y T TN T L R A Y T TN T L R A Y Y L A S V H TW P L L GS Y TF P L YGS Y S Y P L SG S Y S Y H L SG TY TY P F K Y F Q TN H A TN L A TQ TG K D F G P A GA I DY ASY GA I DF ASF GG K K F AG F GYN P YH TF G F ND YH P F E PNP Y AS F KS TVY VP L G G WK T A G F A I SQ Y A PQ T Y S K F AGQ T R S AH A P Q T AN L Y V P Y A L F GQ AQ F S L QDQ V P Y D F QQ Y T A Y QF TY Y AAY D F DH Y T A Y DDH S I E S Y Q P T Y V AHM L S S Y V AHM Y P I PDAL T SQ TY TA A L I L D T T A AQ E A S D AGG R E A S D AGG R E K S E A AG R QY TY VS AL TE L G L SG V TE L G L SG V T E L G I AG L S Y L I N S EH R Y S TH N I Q TH Y D Y A P H S S S QD E E T D P N RDD P S T S N I A S QQ S E MD G Y A F SESY AAP F FQ TS VAP F FQ TS VAP F L VD E S V Y F A AD E I A S F S I TED TVP DQ F L S P T L E PHAY I TF A V S K Y MR F N MH S Y V P F KC D EN L A Y K E D K AG S L I E E K AG S L ND L S V A A S G P D P R AG T I DL L YGP T SD L A YG P V VD S D T F N F S I N K L GN A T T S L RGN L A S P P WA N L N DN I Y VN S S C R F G L AD YC RFGS A T YC RFGS A T Supplemental Figure 11: Binding pocket alignment of mGABARs across metazoans Select residues involved in the binding of GABA are highlighted, and numbers correspond to the position in the human protein GABR1/GABAR-B1, based on the structure of GABR1 [Geng et al., 2013]. Highly conserved residues not thought to be involved in binding are highlighted in blue. The two human proteins are highlighted in pink. Placozoan proteins are highlighted in gray. Sponge proteins are highlighted in blue. The two ctenophore proteins are highlighted in violet. 36 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. A.thaliana_NP_565744.1__GluR5 G5EKN9_SOLLC__SlGLR3.1 V.vinifera_F6HIJ5_VITVI V.vinifera_F6HIJ6_VITVI A.thaliana_NP_565743.1__GluR3.5 A.thaliana_NP_001030971.1__GluR3.4 G5EKP2_SOLLC__SlGLR3.4 G5EKP0_SOLLC__SlGLR3.2 V.vinifera_A5AMA8_VITVI 99 A.thaliana_NP_974686.1__GluR3.2 A.thaliana_NP_028351.2__GluR A.thaliana_NP_190716.3__GluR3.6 G5EKP3_SOLLC__SlGLR3.5 V.vinifera_D7SWB7_VITVI 54 52 A.thaliana_NP_174978.1__GluR3.3 9951 V.vinifera_D7UDC6_VITVI G5EKP1_SOLLC__SlGLR3.3 G5EKN1_SOLLC__SlGLR1.1 G5EKN2_SOLLC__SlGLR1.2 V.vinifera_A5AGU5_VITVI V.vinifera_A5BMN8_VITVI V.vinifera_F6HM67_VITVI V.vinifera_F6GXG0_VITVI V.vinifera_A5AUG7_VITVI 62 A.thaliana_NP_187061.1__GluR1.1 A.thaliana_NP_187408.2__GluR1.4 77 A.thaliana_NP_199652.1__GluR1.3 A.thaliana_NP_199651.1__GluR1.2 G5EKN7_SOLLC__SlGLR2.5 98 G5EKN4_SOLLC__SlGLR2.2 G5EKN6_SOLLC__SlGLR2.4 91 G5EKN5_SOLLC__SlGLR2.3 78 G5EKN3_SOLLC__SlGLR2.1 A.thaliana_NP_180476.3__GluR2.7 A.thaliana_NP_180474.1__GluR2.9 93 A.thaliana_NP_180475.2__GluR2.8 A.thaliana_NP_196682.1__GluR2.5 95 A.thaliana_NP_196679.1__GluR2.6 A.thaliana_NP_180047.1__GluR2.3 98 A.thaliana_NP_180048.1__GluR2.2 A.thaliana_NP_194899.1__GluR2.4 88 A.thaliana_NP_198062.2__GluR2.1 G5EKN8_SOLLC__SlGLR2.6 V.vinifera_A5AIS1_VITVI 80 V.vinifera_A5AD54_VITVI 84 V.vinifera_A5BDG6_VITVI V.vinifera_F6H9F4_VITVI V.vinifera_F6H9D0_VITVI V.vinifera_F6H9G4_VITVI V.vinifera_A5AU42_VITVI V.vinifera_A5AQR7_VITVI V.vinifera_A5AVQ8_VITVI 85 V.vinifera_F6H9G5_VITVI V.vinifera_F6H9H0_VITVI oscarella_t_h60_63112_comp13096_c0_seq1 oscarella_t_h60_08441_comp7318_c0_seq1 L.complicata_lctid7089 91 S.ciliatum_scpid22929_GLUR3.5 scict1.030436.1_0 oscarella_t_h60_33991_comp10944_c0_seq1 oscarella_t_h60_03154_comp3263_c0_seq1 91 oscarella_t_h60_50237_comp11700_c0_seq3 oscarella_t_h60_50246_comp11700_c0_seq12 S.ciliatum_scpid26529_GLUR3.4 L.complicata_lctid31759 oscarella_t_h60_65012_comp23829_c0_seq1 oscarella_t_h60_06601_comp6249_c0_seq1 42 oscarella_t_h60_65051_comp24134_c0_seq1 78 S.ciliatum_scpid25297_GLUR3.1 L.complicata_lctid34850 81 S.ciliatum_scpid27594_GLUR3.7 L.complicata_lctid7088 69 S.ciliatum_scpid21909_GLUR2.9 S.ciliatum_scpid15641_GLUR2.9 euplokamis_t_h20_19863_comp13082_c0_seq1 Hcal_TR15585_c2_g1_i1_m.37437 Hcal_TR15585_c3_g1_i1_m.37441-TR15585_c0_g1_i1 Pleurobrachia_bachei_AEX15543.1 67 MLRB004413 euplokamis_t_h30_23254_comp14560_c1_seq1-h10_06856 MLRB00443-edited 55 95 Pleurobrachia_bachei_AEX15551.1_trimmed Hcal_TR7392_c0_g1_i1_m.11709 Pleurobrachia_bachei_AEX15547.1 Hcal_TR7373_c0_g1_i1_m.11556 53 99 euplokamis_t_h20_19416_comp12942_c0_seq1 Pleurobrachia_bachei_AEX15548.1-trimmed euplokamis_t_h30_09966_comp8826_c0_seq1 Pleurobrachia_bachei_AEX15546.1 61 MLRB306921 ML30697a-trimmed Pleurobrachia_bachei_AEX15539.1 Hcal_TR18969_c1_g2_i2_m.48365_partial 94 Pleurobrachia_bachei_AEX15541.1 10 Pleurobrachia_bachei_AEX15549.1 91 Pleurobrachia_bachei_AEX15542.1 ML00626a 89 ML032222a ML032221a ML05909a Pleurobrachia_bachei_AEX15550.1 Pleurobrachia_bachei_AEX15544.1 ML111714a 85 ML15636a ML085016a 78 65 63 Pleurobrachia_bachei_AEX15540.1 36 ML0850-17a-18a-fusion Pleurobrachia_bachei_AEX15545.1 MLRB150054-fixed ML03683a 95 ML027316a-fixed ML07344a-recut 58 ML150012a-MLRB150049 88 99 ML150010a-trimmed-MLRB150043 MLRB064020 C3ZFG6_BRAFL 96 C3YQ18_BRAFL C3YZA0_BRAFL-trimmed 78 C3ZMS5_BRAFL-cutshort C3ZKA8_BRAFL-trimmed 64 adi_v1.00972 85 A7RPU4_NEMVE 83 adi_v1.17049-trimmed adi_v1.11422 54 AIPGENE1622 90 A7T1G4_NEMVE H13_TR19588_c2_g2_i3_m.37163 g1037.t1__scaffold_1__Triad1-18943 H13_TR30549_c0_g1_i1_m.66150 g1036.t1__scaffold_1__Triad1-18262 57 H13_TR15638_c1_g1_i1_m.29638 g1035.t2__scaffold_1__Triad1-18823 g9262.t1__scaffold_14__Triad1-30612 g9261.t1__scaffold_14__Triad1-30609 Triad1_55165 94 H13_TR9210_c0_g1_i1_m.20124 g10441.t1__scaffold_23__Triad1-32461 H13_TR13015_c0_g2_i3_m.25087-partial 99 g10442.t1__scaffold_23__Triad1-61396 g10443.t2__scaffold_23__Triad1-3218 94 H13_TR30216_c5_g1_i3_m.64945 A7SFF8_NEMVE_NMDA-like AIPGENE3629-joined_AIPGENE1829-allele 98 A7SGV8_NEMVE_NMDA-like 68 AIPGENE14101 A7RLA0_NEMVE_NMDA-like 18 A7SL56_NEMVE adi_v1.13302 Ocbimv22034182m.p 99 99 D.melanogaster_NP_730940.1__NMDAR-Z-like Amel_4.5_GB46886 81 X.laevis_NP_001081616.1_NMDA1.2 Q6ZM67_DANRE_grin1b E7F101_DANRE_grin1a 46NMDZ1_MOUSE 71 NMDZ1_HUMAN C3YII6_BRAFL 71 obirnaseq.137713.1_1 Mouse_NP_001263284.1_NMDA3A NMD3A_HUMAN NMD3B_MOUSE NMD3B_HUMAN 99 Amel_4.5_GB48097 D.melanogaster_NP_001014714.1__NMDAR-like C3Y993_BRAFL C3Y747_BRAFL 22 96 NMDE2_MOUSE NMDE2_HUMAN 86 NMDE1_HUMAN NMDE1_MOUSE NMDE3_MOUSE NMDE3_HUMAN E9QBK8_DANRE_grin2db_NMDE4-like NMDE4_MOUSE NMDE4_HUMAN oscarella_t_h60_52136_comp11753_c0_seq16 oscarella_t_h60_15764_comp9193_c0_seq1 67 oscarella_t_h60_42587_comp11410_c0_seq2 H13_braker1_g01988.t1 T.adherens_g388.t5 0 obirnaseq.63134.1_0 13 D.melanogaster_NP_001260049.1_25a 84 99 D.melanogaster_NP_727328.1_8a scict1.009114.1_0 13 obirnaseq.13493.1_0 95 obirnaseq.109126.1_2 PH13_braker1_g02946.t1 15 g5117.t1__scaffold_5__Triad1-14565 71 g5089.t1__scaffold_5__Triad1-25027 92 PH13_braker1_g02971.t1 PH13_braker1_g02972.t1 g5085.t1__scaffold_5__Triad1-25025 C3XPD8_BRAFL C3ZRD9_BRAFL 60 86 C3ZQV0_BRAFL 0 X.tropicalis_NP_001096470.1_DELTA1 98 GRID1_HUMAN GRID1_MOUSE GRID2_DANRE GRID2_MOUSE GRID2_HUMAN Ocbimv22018191m.p 36 C3ZMC8_BRAFL 89 C3ZZY6_BRAFL 3 obirnaseq.94822.1_1 60 Amel_4.5_GB53122 Amel_4.5_GB49273 80 Amel_4.5_GB49268 98 Q9VIE2_DROME__Clumsy 69 Amel_4.5_GB49275 D.melanogaster_CAB64942.1 Danio_rerio_XP_009290381.1_K5 GRIK5_HUMAN 8 GRIK5_MOUSE Danio_rerio_XP_017206665.1_K4 GRIK4_MOUSE GRIK4_HUMAN 54 C3ZWB5_BRAFL Danio_rerio_XP_009303592.1_K1 GRIK1_MOUSE 81 GRIK1_HUMAN Danio_rerio_XP_017206939.1_K3 GRIK3_MOUSE GRIK3_HUMAN 87 Danio_rerio_XP_017206873.1_K2 99 B9V8S1_XENLA_GRIK2 98 GRIK2_MOUSE 99 GRIK2_HUMAN obirnaseq.32300.1_1 85 obirnaseq.114940.1_2 79 Amel_4.5_GB40973 D.melanogaster_NP_476855.1 D.melanogaster_NP_001261621.2 Q71E63_DANRE_gria2a Q71E62_DANRE_gria2b B9V8R8_XENLA_GRIA2 99 GRIA2_HUMAN 99 GRIA2_MOUSE 55 B3DGS8_DANRE_gria1a Q71E64_DANRE_gria2a X.laevis_NP_001153151.1_AMPA1 GRIA1_MOUSE 78 GRIA1_HUMAN Q71E61_DANRE_gria3a 99Q71E60_DANRE_gria3b X.laevis_NP_001153154.1_AMPA3 99GRIA3_MOUSE GRIA3_HUMAN 51 X.laevis_NP_001153157.1_AMPA4 GRIA4_MOUSE 52GRIA4_HUMAN Q71E59_DANRE_gria4a Q71E58_DANRE_gria4b 86 99 52 Plants Sponges O.carmela L.complicata S.ciliatum 90 99 81 Ctenophores Cnidarians Placozoans NMDAR Glycine Acidic ligand Unknown ligand DeltaR Clumsy Ctenophores Placozoans Cnidarians Protostomes Chordates Vertebrates Plants KainateR AMPAR 0.5 Supplemental Figure 12: Phylogenetic tree of ionotropic glutamate receptors across metazoans Protein tree generated with RAxML using the PROTCATWAG model. Bootstrap values are 100 unless otherwise shown. These receptors are not found in the genomes or transcriptomes of demosponges or hexactinellids, so the Sponge clade refers to calcareous sponges and homoscleromorphs. For the three sponges, the blue star indicates sequences derived from a transcriptome. Based on [Alberstein et al., 2015], some receptors are predicted to bind ligands other than glutamate, shown with the green star, red diamond, and 37 black star, for glycine, acidic ligands, and unknown, respectively. Four placozoan proteins have substitutions at the conserved acidic residue (D723 in human GluN1), as either GY in ctenophores, or GG/WY in placozoans; the carboxyl of the glutamic/aspartic acid is needed to coordinate the amino group of glutamate, suggesting that these proteins do not bind an α -amino acid. bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. NMD3A_HUMAN Human NMDA-class iGluRs NMDE1_HUMAN GRIA2_HUMAN Human AMPA-class iGluRs euplokamis_t_h30_09966_comp8826_c0_seq1 Ctenophore iGluRs ML032222a Pleurobrachia_bachei_AEX15542.1 Calcisponge/Hsm AMPA-like oscarella_t_h60_42587_comp11410_c0_seq2 scict1.009114.1_0 A.thaliana_NP_187061.1__GluR1.1 S.ciliatum_scpid25297_GLUR3.1 Plant iGluRs Calcisponge/Hsm iGluR-like oscarella_t_h60_06601_comp6249_c0_seq1 L.complicata_lctid34850 Demosponge mGluR-like Aqu2.1.29334_001 E.muelleri−Emu1128−0.9−mRNA−1 Signal peptide ANF_receptor 7-transmembrane bac SBP type 3 Ligand channel NCD3G NMDAR2_C twi_ss.7125c.5|m.10167 0 250 500 750 1000 1250 1500 Supplemental Figure 13: Domain organization of ionotropic glutamate receptors across metazoans Scale bar displays number of amino acids. Top BLAST hits for human iGluRs in demosponges appear to be metabotropic, due to the presence of a 7-transmembrane domain instead of the ion channel, while the ligandbinding domain is conserved. Ctenophore iGluRs and some calcarea/homoscleromorph (Hsm) proteins have the vertebrate-type domain organization, though the other calcarea/homoscleromorph proteins (main sponge group in Supplemental Figure 12) have an SBP domain. 38 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 100 CAOG_01813T0__S17A9-like Alatina_alata_c59016_g1_i1__S17A9 hydra_sra.33023.2__s17a9 Scopalina_sp_TR21479_c0_g1_i1__s17a9 X.testutinaria_TR78129_c0_g1_i1__s17a9 Aqu2.1.12587_001_partial__s17a9 Stylissa_carteri_TR73134_c0_g1_i2__s17a9 91 twi_ss.30106.1__s17a9 81 NVE8304 NVE22746 75 Porites_australiensis_3031__S17A9 99 Acropora_millepora_10895__S17A9 Danio_rerio_NP_001002635.1__SLC17A9 50 Xtropicalis_XP_002944466.2__SLC17A9 Gallus_gallus_NP_001006292.1__SLC17A9 73 87 S17A9_HUMAN 75 S17A9_MOUSE Branchiostoma_belcheri_XP_019645208.1__SLC17A9 FBpp0077146__S17A9 82 Amel_4.5|GB46143-PA__S17A9 HELRODRAFT_76708__SLC17A9 40 49 66 Capca1|155164__S17A9 Capca1|155158 62 Lingula_comp143954_c0_seq4__S17A9 obirnaseq.110733.2__S17A9 79 9883 Cgigas_EKC18089__S17A9 LOTGIDRAFT_112031__SLC17A9 CAOG_02811T0 obirnaseq.94311.1 Hoilungia_braker1_g07337.t1 6 Triad1_g69.t1__scaffold_1 Hoilungia_braker1_g07338.t2 scict1.031446.1 lctid25328 lctid59401 scict1.029742.13 scict1.015158.1 36 bathyctena_t_h10_20781_comp13657_c0_seq4 beroeabyssicola_t_h20_12199_comp12599_c0_seq1 65 99 ML011726a bolinopsis_t_h20_00340_comp262_c0_seq1 91 euplokamis_t_h20_11645_comp9428_c0_seq1 beroeabyssicola_t_h20_15083_comp14311_c0_seq2 thalassocalyceinconstans_t_h10_05529_comp6090_c0_seq1 78 bathocyroe_t_h20_21180_comp15532_c0_seq1 45 31 Hcal_TR27180_c1_g1_i1 41 dryodora_t_h20_15255_comp12806_c0_seq1 72 ML21903a 38 bolinopsis_t_h20_19456_comp13734_c0_seq1 42 velamen_t_h20_17746_comp13342_c0_seq1 Monbr1_g5507.t1__scaffold_14 99 Srosetta_PTSG_01814T0 Halisarca_dujardini_HADA01062594.1 19 Halisarca_dujardini_HADA01062593.1 sympagella_nux_TR6482_c0_g1_i1_partial aphrocallistes_vastus_comp19335_c0_seq1 94 Scopalina_sp_TR32724_c0_g1_i1 68 Scopalina_sp_TR32724_c0_g2_i2 95 72 twi_ss.14577.1 76 Tedania_anhelens_TR19189_c0_g1_i10 Petrosia_ficiformis_TR2695_c2_g1_i1 88 Aqu2.1.28662_001 97 X.testutinaria_TR52323_c0_g1_i1 69 X.testutinaria_TR24582_c0_g1_i4 Aqu2.1.29226_001 13 Aqu2.1.28663_001 scict1.023385.2 lctid55483 Corticium_candelabrum_TR84939_c2_g3_i1_partial 75 65 Oscarella_carmela_comp9729_c0_seq1 84 Oscarella_carmela_comp31204_c0_seq3 78 91 Oscarella_carmela_comp37189_c0_seq1 Oscarella_carmela_comp40500_c0_seq9 Hydractinia_symbiolongicarpus_c17024_g1_i4 Alatina_alata_g24140.1_i2 96 AIPGENE1813 NVE15609__VGLUT-like Hydractinia_symbiolongicarpus_g7761.1_i1 Alatina_alata_g35508.1_i1 92 Acropora_millepora_17115 AIPGENE1891 97 NVE15610__VGLUT-like NVE21028__VGLUT-like 14 95 NVE16311__VGLUT-like Alatina_alata_c52723_g1_i1 35 Hydractinia_symbiolongicarpus_c16442_g1_i1 78 hydra_sra.17150.1 95 Alatina_alata_g57195.1_i1 hydra_sra.36087.1 Hydractinia_symbiolongicarpus_c19851_g1_i1 Acropora_millepora_10896 Porites_australiensis_15186 19 NVE20928__VGLUT-like 88 AIPGENE24249 Triad1_g5316.t1__scaffold_5 Hoilungia_braker1_g11731.t1 98 Hoilungia_braker1_g10108.t1 40 Alatina_alata_c54481_g1_i1 97 Hydractinia_symbiolongicarpus_c23495_g1_i1 96 hydra_sra.10353.1 21 22 Alatina_alata_g56947.1_i1 77 Alatina_alata_g51821.1_i3 34 hydra_sra.10649.1 Hydractinia_symbiolongicarpus_c14056_g1_i1 NVE17787__VGLUT 22 Acropora_millepora_12796 52 Porites_australiensis_46055 82 NVE8595__VGLUT 50 AIPGENE9842 33 Helro1|65417 FBpp0083297 TC010895 94 58 NV17997-PA 96 Amel_4.5|GB53933-PA 50 Lingula_comp156026_c0_seq2 52 Cgigas_EKC30442 61 64 Lotgi1|126078 65 76 Lotgi1|102784 Lotgi1|161078 bflornaseq.42024.2 bflornaseq.42037.1 bflornaseq.29980.1-29981.1-joined EKC38935 20 Cgigas_EKC18280 Lingula_comp149033_c2_seq2 Lingula_comp152169_c0_seq4 89 Capca1|208926 Lingula_comp149080_c0_seq7 59 Cgigas_EKC34298 97 Lotgi1|233350 Bf_V2_327_g44834.t1__bflornaseq.13338.1 23 10 Spurpuratus_XP_786480.3__VGLUT Lotgi1|128007__VGLUT 94 86 obirnaseq.87250.1 Cgigas_EKC24439__VGLUT 82 Lingula_comp143616_c3_seq1__VGLUT 75 Capca1|177109__VGLUT 35 TC008459__VGLUT 55 Q9VQC0_DROME__VGLUT 81 Amel_4.5|GB54867-PA__VGLUT 53 NV15959-PA__VGLUT VGLU3_DANRE 99Gallus_gallus_NP_001305958.1__VGLUT3 VGLU3_MOUSE VGLU3_HUMAN VGLU1_XENTR Danio_rerio_NP_001092225.1__VGLUT1 78 VGLU1_HUMAN VGLU1_MOUSE 83 VGL2A_DANRE VGL2B_DANRE 81 Gallus_gallus_NP_001161855.1__VGLUT2 2991 VGLU2_MOUSE 98 VGLU2_HUMAN Spurpuratus_XP_780445.3 Spurpuratus_XP_795625.3 72 Spurpuratus_XP_783585.2 94 Spurpuratus_XP_782868.2 82 Spurpuratus_XP_785484.3 Danio_rerio_NP_001070195.1__sialin 47 Salmo_salar_NP_001167306.1__sialin 99S17A5_HUMAN S17A5_MOUSE 99 Alligator_mississippiensis_XP_006269813.1 99 Gallus_NP_001026257.1 98 Gallus_gallus_XP_015140231.1__sialin Callorhinchus_milii_XP_007906946.1 Danio_rerio_XP_005159826.1 91 Alligator_mississippiensis_XP_006268054.2 Pelodiscus_sinensis_XP_014424898.1 68 E1BDD2_BOVIN__SLC17A4 99 S17A4_MOUSE 95S17A4_HUMAN NPT3_BOVIN NPT3_MOUSE 56 NPT3_HUMAN 89 NPT4_HUMAN 94 NPT1_MOUSE NPT1_HUMAN Capca1|183805 15 40 Helro1|186317 Lingula_comp149120_c0_seq6 27 Lotgi1|197570 27 80 Lotgi1|133858 35 obirnaseq.111983.1 32 Cgigas_EKC24609 7 Alatina_alata_c52956_g1_i1 Alatina_alata_c52359_g1_i1 99 46 Alatina_alata_g38981.1_i1 50 Hydractinia_symbiolongicarpus_c22518_g1_i1 53 NVE10158__sialin-like AIPGENE16892 Porites_australiensis_22512 74 Acropora_millepora_10266 NVE12518__sialin-like 95 72 AIPGENE10305 71 NVE9481__sialin-like 80 AIPGENE16935 TC005982 7 NV11798-PA TC006631 TC006632 Amel_4.5|GB41670-PA 64 83 NV10357-PA NV15849-PA 97 NV17612-PA NV17613-PA 61 TC006625 NV11203-PA Amel_4.5|GB42969-PA Amel_4.5|GB51651-PA 57 NV18394-PA 87 Amel_4.5|GB51650-PA 2 91 FBpp0086261 FBpp0309074 Amel_4.5|GB43225-PA NV15041-PA Hoilungia_braker1_g05533.t1 Capca1|113983 EKC40068 Lotgi1|107883 87 Lotgi1|155133 91 10 99 67 Lotgi1|161420 Lotgi1|107712 27 Capca1|93612 Helro1|108787 Helro1|71683 31 93 obirnaseq.69929.1 81 Cgigas_EKC35063 Cgigas_EKC35062 bflornaseq.39778.2 Capca1|209782 25 Lotgi1|113326 90 63 97 Cgigas_EKC25794 Lotgi1|167964 93 obirnaseq.30932.1 obirnaseq.22900.1 81 obirnaseq.22904.3 obirnaseq.22889.1 53 obirnaseq.30926.2 obirnaseq.30928.1 5330 obirnaseq.22894.5 obirnaseq.30927.1 24 3765 obirnaseq.123251.2 obirnaseq.123253.1 36 obirnaseq.30934.1 47 obirnaseq.30937.1 37 98 46 Nucleotide transporter (SLC17A9) 100 Cnidarian VGLUT-like Glutamate transporter VGLUT (SLC17A6,7,8) Sialin (SLC17A5) SLC17A4 Demosponges Homoscleromorphs Calcarea Hexactinellids Ctenophores Placozoans Cnidarians Protostomes Echinoderm/hemichordate Chordates Vertebrates Filastereans Choanoflagellates 0.6 Supplemental Figure 14: Vesicular glutamate transporter homologs across metazoans Tree of VGLUT (SLC17A6-8) proteins across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown. 39 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 32 91 SC6A1_HUMAN SC6A1_MOUSE Srosetta_PTSG_08575T0 Hoilungia_braker1_g07132.t1 Triad1_g3650.t1__scaffold_3 Srosetta_PTSG_00672T0 Spurpuratus_XP_788574.3__S38AB 29 S38AB_DANRE 80 S38AB_HUMAN S38AB_MOUSE 44 TC015432__S38 NV10096-PA__S38 31 Lingula_comp153547_c1_seq2_1__S38 65 EKC22480__S38 99 Lotgi1|174956__S38 CAOG_10085T0 lampea_t_h30_15888_comp12890_c0_seq2__cg4 Hcal_TR16507_c0_g1_i4__cg4 97 bolinopsis_t_h20_07298_comp7564_c0_seq1__cg4 80 45 thalassocalyceinconstans_t_h10_14237_comp12108_c0_seq1__cg4 57 beroeabyssicola_t_h01_04633_comp4052_c0_seq1__cg4 99 76 dryodora_t_h20_15361_comp12868_c0_seq1__cg4 Hoilungia_braker1_g00642.t1 Triad1_g3043.t2__scaffold_3 82 Cgigas_EKC37816 AIPGENE11622 60 Porites_australiensis_5139 Montastraea_cavernosa_84647 37Madracis_auretenra_42167 scict1.009980.1 lctid63806 Alatina_alata_c58688_g1_i1 hydra_sra.11928.1 93 resomia_t_x0_062726_comp75224_c0_seq1 Alatina_alata_c52262_g1_i1 19 hydra_sra.11232.1 hydra_sra.11218.1 34 Alatina_alata_c49879_g1_i1 Alatina_alata_c53411_g1_i1 98 hydra_sra.20297.1 56 hydra_sra.20302.1 AIPGENE25242 94 Acropora_millepora_11455 Porites_australiensis_9197 AIPGENE7072 NVE17671 26 98 Porites_australiensis_38448 Acropora_millepora_17942 98 resomia_t_x0_049257_comp70251_c0_seq1 94 resomia_t_x0_027597_comp54692_c0_seq1 98 Alatina_alata_g42698.1_i1 51 Alatina_alata_c51053_g1_i1 94 Acropora_millepora_8356 56 Porites_australiensis_37320 Acropora_millepora_12688 99 Porites_australiensis_49386 59 NVE17670 43 AIPGENE27144 NVE5099 NVE22971 NVE15339 79 85 AIPGENE28594 83 Acropora_millepora_16417 74 Porites_australiensis_36679 Alatina_alata_c44674_g1_i1 99 NVE24524 hydra_sra.24414.1 81 AIPGENE4721 99 NVE8024 Alatina_alata_g47107.1_i1 hydra_sra.1815.1 99 Alatina_alata_g24024.1_i1 29 AIPGENE26117 NVE8637 76 AIPGENE29146 NVE25975 hydra_sra.28166.1 24 Alatina_alata_g44676.1_i1 PFL3_pfl_40v0_9_20150316_1g2763.t1 Spurpuratus_XP_795408.3 obirnaseq.30198.1 95 Helro1|183498 81 68 Capca1|225531 39 TC009185 FBpp0086703 95 Amel_4.5|GB40541-PA NV15949-PA Spurpuratus_XP_791315.3__VIAAT bflornaseq.38036.1 31 Gallus_gallus_XP_417347.3__VIAAT 85 VIAAT_XENTR 54 Danio_rerio_NP_001074170.1__VIAAT 49VIAAT_MOUSE VIAAT_HUMAN oscarella_t_h60_07389_comp6812_c0_seq2 beroeabyssicola_t_h20_14227_comp13818_c0_seq1 bathocyroe_t_h20_04478_comp6860_c0_seq1 Hcal_TR19444_c0_g1_i2 52 32thalassocalyceinconstans_t_h20_14855_comp14523_c0_seq1 24 velamen_t_h20_22074_comp14838_c0_seq2 62bolinopsis_t_h20_25026_comp15044_c0_seq3 49 ML073035a Monbr1_g6159.t1 Srosetta_PTSG_07915T0 Lotgi1|157294__ANTL1 28 99 Lingula_comp152251_c0_seq2__ANTL1 bflornaseq.11495.1 74 oscarella_t_h60_33621_comp10920_c0_seq1 twi_ss.6165.1 25 58 98 Petrosia_ficiformis_TR11201_c0_g1_i1 99 X.testutinaria_TR68852_c1_g1_i1 93 Aqu2.1.23868_001 38 Triad1_g5051.t1__scaffold_5 Hoilungia_braker1_g03002.t1 Porites_australiensis_6407__ANTL1 63 62 Acropora_millepora_13042__ANTL1 80 NVE21968__ANTL1 98 Exaiptasia_pallida_KXJ22393.1__ANTL1 euplokamis_t_h30_07022_comp6902_c0_seq1 90 charistephanefugiens_t_h10_09118_comp9861_c0_seq1 Hcal_TR17487_c0_g1_i1 dryodora_t_h20_09505_comp9226_c0_seq1 88 bathocyroe_t_h20_17157_comp14441_c0_seq1 5842 thalassocalyceinconstans_t_h20_24470_comp18878_c0_seq3 97 25ML104321a 81 velamen_t_h20_12670_comp11072_c0_seq1 83 bolinopsis_t_h20_08753_comp8580_c0_seq1 CAOG_00255T0 Monbr1_g2633.t1 34 scict1.017867.1 lctid70109_0 lctid78596_0 99 scict1.018556.1 scict1.015001.1 96 scict1.015001.2 90 lctid12311_0 lctid80334_0 aphrocallistes_vastus_comp20147_c1_seq1 rosella_fibulata_TR11879_c0_g1_i1 sympagella_nux_TR7911_c0_g2_i1 87 aphrocallistes_vastus_comp6524_c0_seq1 sympagella_nux_TR20508_c0_g1_i4 rosella_fibulata_TR13163_c0_g1_i1 78 Scopalina_sp_TR28065_c0_g3_i1 44 twi_ss.8957.1_1 56 Tedania_anhelens_TR6077_c3_g2_i2 Aqu2.1.31258_001 63 Aqu2.1.31260_001 Petrosia_ficiformis_TR11578_c0_g1_i1 33 25 X.testutinaria_TR48957_c0_g1_i8 82 X.testutinaria_TR65433_c1_g3_i10 oscarella_t_h60_10767_comp8125_c0_seq6 Triad1_g5318.t2__scaffold_5 Triad1_g5319.t1__scaffold_5 euplokamis_t_h20_13527_comp10429_c0_seq1__S36A 39 bathyctena_t_h30_00646_comp699_c0_seq1__S36A dryodora_t_h20_27957_comp16799_c0_seq3__S36A bathocyroe_t_h20_09052_comp10596_c0_seq3__S36A 95 76 thalassocalyceinconstans_t_h10_13472_comp11685_c1_seq1__S36A 41 68 velamen_t_h20_29161_comp16666_c0_seq1__S36A 96 bolinopsis_t_h20_13495_comp11280_c1_seq1__S36A 79 ML085720a__S36A Acropora_millepora_4745__S36A Porites_australiensis_21159__S36A 22 89 NVE5526__S36A NVE5525__S36A 93 AIPGENE23310__S36A 68 Alatina_alata_c55134_g1_i1__S36A hydra_sra.16612.1__S36A hydra_sra.16610.1__S36A 45 Alatina_alata_c43290_g1_i1__S36A 98 hydra_sra.11681.1__S36A 21 Alatina_alata_c41537_g1_i1__S36A bflornaseq.11538.1-g25112.t1 40 Spurpuratus_XP_003723741.1__S36A Lingula_comp153549_c1_seq2_1__S36A 63 Lotgi1|218879__S36A 89 NV16821-PA__S36A 15 FBpp0292523__S36A 97 71 NV18592-PA__S36A 71 Amel_4.5|GB51487-PA__S36A 40 NV11499-PA__S36A S36A4_MOUSE S36A4_HUMAN S36A3_HUMAN S36A3_MOUSE S36A1_HUMAN 97 S36A1_MOUSE 46 S36A2_MOUSE S36A2_HUMAN 94 Na-coupled neutral AA transporter (SLC38) Unknown VIAAT-like 88 Cnidarian-specific VIAAT-like 32 GABA transporter VIAAT (SLC32A1) 41 Demosponges Homoscleromorphs Calcarea Hexactinellids Similar to Aromatic and neutral AA transporter (ANTL) Ctenophores Placozoans Cnidarians Protostomes Echinoderm/hemichordate Chordates Vertebrates Filastereans Choanoflagellates H-coupled AA transporter (SLC36) 0.6 Supplemental Figure 15: Vesicular inhibitory amino acid transporter homologs across metazoans Tree of VIAAT (SLC32A1) proteins and related transpoters across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown. 40 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Spongospora_subterranea_A0A0H5R9G2 20 Ostreococcus_lucimarinus_A4RY52_OSTLU 70 Red algae M2XB31_GALSU 38 Klebsormidium_flaccidum_A0A0U9I7C4_KLEFL Oryza_brachyantha_J3L063_ORYBR Beta_vulgaris_A0A0J8B722_BETVU K4CUY8_SOLLC 36 F6HQ45_VITVI 49 Plants Citrus_sinensis_A0A067H648_CITSI 99 Eutrema_salsugineum_V4KNG8_EUTSA PIEZO_ARATH D.discoideum_Y2801_DICDI 92 Rozella_allomycis_EPZ31590.1 Monbr1_g1806.t2 99 Choanoflagellates Salpingoeca_rosetta_PTSG_01300T0 Monbr1_g2005.t1 20 Ctenophores euplokamis_t_h30_28970_comp15884_c0_seq4 bathyctena_t_x0_34756_comp23030_c0_seq1_0 beroeabyssicola_t_h20_27141_comp17803_c0_seq1 98 55 Pbachei_3461157_partial 68 hormiphora_t_h10_11700-TR2190_c1_g3_i2 46 bolinopsis_t_h20_24908_comp15026_c0_seq11 ML018021-AUGUSTUS Aqu2.1.38329_001 ephydatiamuelleri_12638-joined twi_ss.19116.1-19102.1 Sponges oscarella_t_h60_60372_comp11964_c0_seq14 scict1.026728.1_scict1.029409.2_0 L.complicata_lctid5362 Placozoans T.adherens_g4404.t1 Hoilungia_stringtie_3695.2_m.4488 99 resomia_t_x0_097703_comp79502_c0_seq1 hydra_sra.24978.1_2 NVE3870 Cnidarians AIPGENE4679-13595-4700_partial adi_v1.23438-05905-05906-05907_partial Acropora_millepora_552 Porites_australiensis_8982 99 Montastraea_cavernosa_16922 Brafl_PIEZO_partial O.dioica_2913001-5897001-5899001 57 94 F1NVW5_CHICK 39 PIEZ2_MOUSE PIEZ2_HUMAN Vertebrates E1BX07_CHICK 48 PIEZ1_HUMAN PIEZ1_MOUSE Spur_390358264-joined 97 98 C.gigas_EKC42879-EKC42880 Transcriptome Partial protein obirnaseq.123544.1_0 42 Protostomes Capca1_219762 81 L.anatina_comp156528_c0_seq1 D0VWN8_CAEEL 94 PIEZO_DROME A0A087ZN43_APIME Joined protein 99 TC013391 0.4 Supplemental Figure 16: Phylogenetic tree of Piezo homologs across metazoans Protein tree generated with RAxML using the PROTCATWAG model and 100 bootstraps. Yellow circle indicates the sequence was compete when joined with other genes or partial sequences, red partial circle indicates the sequence is incomplete in the genome or transcriptome. A blue star indicates that the sequence derived from a transcriptome, so copy number cannot be determined with certainty. Bootstrap values are 100 unless otherwise shown. 41 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Choanoflagellates Monbr1_g1391.t1__scaffold_3 Srosetta_PTSG_00336T0 Hcal_TR17996_c0_g1_i1 beroeabyssicola_t_h20_12271_comp12645_c0_seq2 bolinopsis_t_h20_09329_comp8981_c0_seq1 velamen_t_h20_28621_comp16542_c0_seq1 77 ML17059a__SCN-like euplokamis_t_h20_12092_comp9658_c0_seq3 Hcal_TR16070_c2_g2_i1 98 bathocyroe_t_h20_22759_comp15829_c0_seq2 88 74 thalassocalyceinconstans_t_h10_09040_comp8843_c0_seq1 98 bolinopsis_t_h20_18283_comp13341_c0_seq1 velamen_t_h20_34229_comp17681_c0_seq1 64 ML358826a__SCN-like Hoilungia_braker1_g08315.t1 Triad1|54699_g3484.t1 Hoilungia_braker1_g10012.t1 Triad1_g3491.t2_Triad1-23340 Hoilungia_braker1_g08313.t1 Triad1_g3486.t1 Hoilungia_braker1_g10010.t1 Triad1_g3490.t1 Hoilungia_braker1_g05943.t1 Hoilungia_braker1_g08314.t1 99 Triad1_g3487.t1 PFL3_pfl_40v0_9_20150316_1g3063.t1 Spurpuratus_XP_793384.3 FBpp0300666 Amel_4.5|GB47190-PA 92 TC008776 Lingula_comp154129_c1_seq10 92 Capca1|134859 Lotgi1|118343 Cgigas_EKC21550 Halocynthia_roretzi_BAA95896.1 Brafl1|75071 Brafl1|249620_bflornaseq.37403.1 bflornaseq.37394.1-37393.1-Brafl1-143759 bflornaseq.37392.1-37391.1 Halocynthia_roretzi_BAA04133.1 Pmarinus_GENSCAN00000002143 Pmarinus_GENSCAN00000032390 F6XEC0_XENTR__scn4a 43 SCN4A_MOUSE SCN4A_HUMAN 86 60 K9J7R3_XENTR__scn5a SCN5A_HUMAN 99 SCN5A_MOUSE SCNBA_HUMAN 99 98 SCNBA_MOUSE 99 SCNAA_HUMAN SCNAA_MOUSE F6XFJ2_XENTR__scn8a SCN8A_RAT SCN8A_HUMAN 87 K9J7Z6_XENTR__scn2a F6UXH2_XENTR__scn1a 81 79 81 F6TZ24_XENTR__scn3a SCN3A_RAT 25 SCN3A_HUMAN SCN7A_HUMAN 44 SCN9A_HUMAN SCN9A_MOUSE 34 SCN2A_HUMAN SCN2A_RAT 92 SCN1A_HUMAN SCN1A_MOUSE SCNA_DROME__para TC004749__SCN 48 Amel_4.5|GB42728-PA 93 NV14617-PA__SCN Lingula_comp153551_c1_seq30 81 Cgigas_EKC22630__SCN Lotgi1|107523 Capca1|210954 Helro1|89291 93 Helro1|119038 99 Helro1|109965 Helro1|64539 Rhopilema_esculentum_TR59849_c0_g1_i2 hydra_sra.25523.1 Craspedacusta_sowerbyi_TR39962_c1_g1_i1 Corallium_rubrum_TR22337_c0_g1_i3 AIPGENE4043 Stylophora_pistillata_5656 Porites_australiensis_12936 45 Acropora_millepora_9921 Nemve1|171660_NVE7195 AIPGENE22631 Nemve1|88319_NVE6936 Porites_australiensis_8690 Stylophora_pistillata_17883 Corallium_rubrum_TR32421_c1_g1_i1 AIPGENE9486 98 99 NVE7348__SCN-like Stylophora_pistillata_13842 Acropora_millepora_904 98 Porites_australiensis_6111 Alatina_alata_c37058_g1_i1 70 Rhopilema_esculentum_TR83407_c2_g1_i2 Cyanea_capillata_AAA75572.1 Craspedacusta_sowerbyi_TR78689_c5_g1_i1 resomia_t_x0_058327_comp73969_c0_seq1 96 94 Hydractinia_symbiolongicarpus_c23789_g1_i2 77 Polyorchis_penicillatus_AAC38974.1 93 hydra_sra.26839.1 hydra_sra.39059.1 Ctenophores Placozoans 85 100 91 81 99 99 49 84 50 Protostome NaV2 Vertebrate NaV1 73 Protostome NaV1 99 94 81 22 Ctenophores Placozoans Cnidarians Protostomes Echinoderm/hemichordate Chordates Vertebrates Choanoflagellates 67 Cnidarian NaV Group 2 Anthozoan NaV Group Cnidarian NaV Group 1 0.6 Supplemental Figure 17: Phylogenetic tree of voltage-gated sodium channel alpha subunits across metazoans Protein tree generated with RAxML using the PROTGAMMALG model and 100 bootstraps. Bootstrap values are 100 unless otherwise shown. 42 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. scict1.007366.1 lctid45949 scict1.016080.1 lctid3586 Aqu2.1.42586_001 Sponges Twilhelma_c34057_g1_i2_twi_ss.12332.4 59 Tedania_anhelens_TR33204_c2_g1_i1 64 100 Scopalina_sp_TR28310_c4_g1_i2 Srosetta_PTSG_09464T0 Monbr1_g2466.t1__scaffold_5 euplokamis_t_h30_14855_comp11501_c0_seq1 Hcal_TR25961_c0_g1_i1__CaC-like bathocyroe_t_h20_30616_comp16858_c2_seq1 ML190424a__CaC-like 100 99velamen_t_h20_35925_comp17943_c0_seq2 55 Ctenophores bolinopsis_t_h20_17550_comp13069_c0_seq1 Triad1_g1763.t1__scaffold_2 Hoilungia_braker1_g05522.t1 92 CAC1E_HUMAN N/P/Q-type CaV CAC1B_HUMAN 95 CAC1A_MOUSE CAC1A_HUMAN 99 bflornaseq.27973.1 Amel_GB18730-PA 99 NV11619-PA 32 TC011227 TC011226 85 Cacophony CAC1A_DROME__cacophony 99 Helro1|73530 Helro1|128993 69 71 Helro1|119050 Lingula_comp155476_c1_seq27 82 Cgigas_EKC27184 hydra_sra.3016.1 Acropora_millepora_1337 56 Porites_australiensis_47371 NVE1263__CaA-like 98 hydra_sra.19987.4 Rhopilema_esculentum_TR102482_c2_g1_i1 90 AIPGENE18550 NVE18768__CaA-like Porites_australiensis_44917 Acropora_millepora_1120 Alatina_alata_c56718_g1_i3 hydra_sra.37904.1 NVE4667__CaA-like AIPGENE13431 Acropora_millepora_894 Oscarella_carmela_comp39348_c0_seq21 Hoilungia_braker1_g06121.t1 Triad1_g1163.t2__scaffold_1 Lingula_comp156742_c0_seq2 56 Cgigas_EKC35362 Lotgi1|51275 98 TC004715 CAC1D_DROME__Ca-alpha1D bflornaseq.3496.1_bflornaseq.3497.1 CAC1S_HUMAN 94 CAC1F_HUMAN CAC1D_HUMAN 96 99 CAC1C_HUMAN L-type CaV CAC1C_MOUSE hydra_sra.20483.3 Alatina_alata_c60357_g1_i1 Chironex_fleckeri_TR37575_c1_g1_i8__CAC1C 75 Chrysaora_fuscescens_TR20395_c0_g1_i2__CAC1C Rhopilema_esculentum_TR83197_c1_g2_i1__CAC1C Corallium_rubrum_TR89372_c0_g1_i1__CAC1C 95 NVE6254__CaC-like Stylophora_pistillata_AAD11470.1 60 Porites_australiensis_45764 Acropora_millepora_1095 Srosetta_PTSG_03773T0 Hoilungia_braker1_g01453.t1 Triad1_g2446.t1__scaffold_2 T-type CaV CAC1I_HUMAN 100 CAC1H_MOUSE CAC1H_HUMAN 58 Demosponges Homoscleromorphs Calcarea Ctenophores Placozoans Cnidarians Protostomes Chordates Vertebrates Choanoflagellates CAC1G_HUMAN bflornaseq.16516.1 100 NV13530-PA__CAC1G-like 96 82 98 54 58 44 81 FBpp0088415__CAC1G-like TC005355__CAC1G-like Capca1|89566__CAC1G-like Helro1|66349__CAC1G-like 39 Helro1|170765__CAC1G-like Helro1|148954__CAC1G-like Lingula_comp156482_c1_seq2__CAC1G-like 98 EKC37236__CAC1G-like 71 Lymnaea_stagnalis_AAO83843.2 Lotgi1|220094__CAC1G-like Chironex_fleckeri_TR32013_c0_g1_i1__CAC1G Corallium_rubrum_TR78644_c0_g1_i1__CAC1G 99 48 Porites_australiensis_46716__CAC1G 99 AIPGENE17378__CAC1G-like NVE5017__CaG-like Rhopilema_esculentum_TR92782_c0_g1_i2__CAC1G Acropora_millepora_3959_partial 0.7 NVE7616__CaG-like Supplemental Figure 18: Phylogenetic tree of voltage-gated calcium channels across metazoans Protein tree generated with RAxML using the PROTGAMMALG model and 100 bootstraps. Bootstrap values are 100 unless otherwise shown. 43 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 8 03m 12 an _c m ciu rti Co 3.1 916 3 54 d7 15lcti l de 88 9 92 g 4_ _c 2_ _ i1 al rti pa l rtia pa 0_ 08 5 863909 id 24.1 scplctid 018 0 t1.0 8201 scic lctid Calcisponges 1.2 1545 ict1.0 331 sc lctid79 56.2 scict1.0042 lctid66549 0.9 49 0.98 8 Plants 0.96 0. 67 0.75 4 0029 Chlorella_variabilis_EFN53563.1 0.925 Polysphondyli um _pallidum_EF Sak ow v30 03 60 Pha 85 m seq1 301_comp13867_c1_ beroeabyssicola_t_h20_14 ctylu m_ _AEQ59 28 6.1 rudii_ ADM 25 lass 825.1 iosira tric _pse orn udo utu nan m_ a _X XP_ P_0 022 002 9 33 180 60.1 795 .1 Anthozoan Group 2 19 _s p_ TR 12 St yl 64 iss 0_ _fi a_ c1 X.t cif ca est or _g rte uti mi ri_ 1_ na s_T TR i2 ria R My 2 _ 77 cale TR 18 1 69 72 _ph 75 _c 78_ yllo 8_ 0_ c ph c2 g1 0_g Cre ila _g _ ll _ a 3_i1 1_i Ted TR ania _eleg i2 1 97 a 96 _an Cram 2_c hele ns_TR be_c ns_ 35 0 _g rambe 1_i1 _TR3 TR183316_ Halisa 0718 8 c0 rca_du _c2_g1_c0__gg1_ jardin i3 i_HAD 1_i5 1_i2 A01036 606.1 Halisarca _dujardin i_HADA0 1057378.1 sia al in a Pe tro Sc op lithu s_braa Tha AIP AnGE th NE op 73 leu 91 ra ta P _e st o leg ra ri an ea tes tis _c _a sim a v us a_ er tra 65 no lie 82 sa ns 4 _1 is_ 14 15 8 6 48 3 6 03 29 05 68 ta_ _9 a_1 illa ria nr ist ta te _p cu reora _s au h iais_lop nrgaSc ty Faud M q1 q1 _se _se i1 _c0 1_ _c0 663 _g 005 omp23 c0 p11 4_ _c 58 om 31421 20 3_c _ TR 252 h20 al_ 0_1 sis_t_ Hc _h2 nop n_t boli me vela seq1 _c0_ 263 mp7 56_co 1 _088 h10 c0_seq 9_seq1 is_t_ 487_ 4651_c p149 kam _comp1 euplo _22914 26_com s_t_h10 _458 _h30 constan calycein tena_t thalasso ML024915a bathyc bathocyroe_t_h20_10680_comp11581_c0_seq1 Demosponges M on eficum Cocco eod a 1 _i1 _g2_i2i1 1 _c1_g1g2_ 9. i4 839_c0 0_ 83 _i2 _ R40 90 _c 15 2 g1 s_TR2630663 i_ss. c1_g c0_ n a 2 tw 2_ 8_ leg _T a_e elens e_TR 9 3 ll 5 27 Cre anh mb 09 3 _ ra R1 TR ania e_c i_T sp_ Ted amb ter Cr a_ ar lin _c a a ss op yli Sc St Protists A75681.1 Karlodi nium_v en 0.999 1 wv3 Acro Acropora _mi pora_ Ast ra_20758 digllepo itifera MPooreriop ora_sp _10161 tes_ FnuMta a _2 nagsia tr ustra 1046 dra_asea_c liensis_ Sct iscu_taarvern 3166 yloau ia_ osa 3 phret 682 _10 or enr 97 215 a_ a _ 1 Co pis 91 ra til 26 lli G lat umor a _1 _rgon 9 1 ub ia 1 ru _ve m nt _T a l R2 ina 36 _29 90 58 _c 1 0_ g1 _i 2 NVE9761 0.8 51 _esculen Rhopilema Sako 0.9 0.96 0.805 R7 _T 4 0.109 86 80.8 7 .0 um r ab 0.89 0.918 0. 73 0.862 R36069_c0_g1_i1 Chironex_fleckeri_T _g2_i1 1317_c0 tum_TR7 490.1 sra.26 hydra_ obirnaseq.13988.1 .t1.t1 g02370 _braker1_3120 Hoilungia Triad1_g 1 ict sc 7 0.581 2 0. 1 Hoilungia_brake r1_g02369.t2 Triad1_g3119.t1 Anthozoan Group 1 IOIN raker1 _g0237 1. Triad1_ g3t1 122.t2 Hydrozoans N1_C Placozoans HVC llo rh Da inch nio us _r _m pu s_t erio_ ilii_ Bf rop N X _V ica P_0 P_0 2_ lis_ 0 0 32 NP 100 790 _g HV _0 23 0 66 0 4 C 4 Gallu 1 0 6. 4 8 N 46 11 1 .1 _M s_gal HVC1N 26 .t1 lus_N 1 OU 2.1 E P_00 _HUSM 1025 AN 834.1 Xe no Vertebrate HV Channel Hoilu ngia_b eq1 .1 9282 0_s q.60214 6_c nase C4 29 obir as_EK 143 p 1 Cgig com 3. 5 la_ 20 10 gu 80 14 9.1 Lin 37 | 2 77 01 i1 19 P_ otg 11 8.100 _X L 47P_ us 27 N m 37 s_ he 00 tu yp P_ ra ol _ X pu _p us pur us S ul m Li Ca rat pu ur Sp Protostomes Ctenophores 0.3 Supplemental Figure 19: Phylogenetic tree of voltage-gated proton channels across metazoans Protein tree generated with FastTree. 44 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. 974 2 Supplemental Tables Supplemental Table 1: Summary statistics of the Tethya includes the sponge and all associated bacteria. Feature Type Assembly size (Mb) Sponge only Total gaps (Mb) Sponge only Estimated kmer coverage Sponge only Estimated mapping coverage Sponge only Number of contigs All Number of contigs Sponge only Number of contigs Bacterial GC % Sponge only Contig N50 (kb) All Contig N50 (kb) Sponge only Contig N50 (kb) Bacterial Genome size (bp) Mitochondrion Estimated mapping coverage Mitochondrion GC % Mitochondrion Paired-end reads Holobiont genomic 100bp Total paired-end bases (Gb) Holobiont genomic Reads aligning back to genome All contigs Mate-pair reads Holobiont genomic 125bp Total mate-pair bases (Gb) Holobiont genomic Moleculo (TruSeq) long reads Holobiont genomic Total Moleculo bases (Mb) Holobiont genomic Paired-end RNA-seq reads dUTP Stranded Paired-end RNA-seq bases (Gb) dUTP Stranded Trinity De novo transcripts RNA-seq mapping fraction All contigs StringTie Genome guided transcripts 45 wilhelma genome. Holobiont genomic Count 126.0 1.348 131x 159.3x 6,907 6,109 789 39.98% 70.7 73.4 48.2 19,754 669.6x 34.43% 259,518,468 25.951 214,103,768 280,837,536 35.104 125,150 436.7 201,451,574 25.181 127,012 68.3% 46,572 bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license. Supplemental Table 2: Summary of splice variation for the genome-guided transcriptome for T. wilhelma (Twi) and transcript set v2.0 for A. queenslandica (Aqu). Splice Type Twi transcripts Twi Aqu transcripts Aqu events/exons events/exons Cassette exons 4779 9089 3591 5535 Canonical splicing 5049 2602 Skipped exons 3868 8329 1022 721 Alternative splice acceptor 3747 638 Intron retention 3295 3565 3437 3400 Alternative splice donor 3264 521 Alternative N-terminus 1964 1965 Alternative C-terminus 1788 1968 Intronic start 471 571 Intronic end 285 592 Non-canonical 73 135 Single exon with variants 246 13 No variants 12088 24027 Single exon and no variants 15421 12400 - Supplemental Table 3: Skipped exon and retained intron frame, for T. wilhelma (Twi) and A. queenslandica (Aqu). Skipped exons tend to have lengths as multiples of three. Feature Position 1 Position 2 Position 3 Twi Skipped exons 6904 5179 5331 Twi Retained introns 1202 1204 1159 Aqu Skipped exons 2671 1914 1972 Aqu Retained introns 1244 1092 1101 46
© Copyright 2026 Paperzz