Gene 386 (2007) 202–210
www.elsevier.com/locate/gene

Profiling of maternal and developmental-stage specific mRNA transcripts in Atlantic halibut Hippoglossus hippoglossus

Jialin Bai a,b, Christel Solberg b, Jorge M.O. Fernandes a, Ian A. Johnston a,b,⁎

a Fish Muscle Research Group, Gatty Marine Laboratory, School of Biology, University of St Andrews, East Sands, St Andrews, Fife, KY16 8LB, Scotland, UK
b Department of Fisheries and Natural Sciences, Bodø Regional University, N-8049 Bodø, Norway

Received 28 October 2005; received in revised form 4 September 2006; accepted 19 September 2006
Available online 5 October 2006
Received by C.T. Amemiya

Abstract

cDNA libraries were constructed from the following developmental stages (tissues) of the Atlantic halibut (Hippoglossus hippoglossus): 2-cell stage (embryos), 1 day-old yolk sac larvae (trunk) and juvenile (fast skeletal muscle). A total of 4249 high quality expressed sequence tags from the three libraries were clustered into a partial transcriptome of 2124 putative genes. A large proportion of the gene clusters (48.3%) had no significant matches against known proteins. The most abundant ESTs of nuclear transcripts in the 2-cell library included sequences with high identity to zebrafish H1M, a linker histone-like protein involved in primordial germ cell specification, zinc finger protein, rRNA external transcribed spacer, thymosin β-4, cyclin B1 and several predicted peptides from the Tetraodon nigroviridis genome assembly with unknown functions. 170 and 123 ESTs represented ribosomal proteins in the larval and juvenile libraries, respectively, compared with only two sequences in the 2-cell library, which may reflect an abundance of maternally inherited pre-formed ribosomes in the yolk.
Even though some clusters were common to all three libraries, most putative genes showed a developmental-stage specific distribution with 72% (2-cell embryo), 59% (larval) and 57% (juvenile) sequences having no significant matches against the 8400 adult halibut sequences in the EMBL nucleotide database. Comparison between the predicted halibut peptide data set and the human, zebrafish, and pufferfishes (T. nigroviridis and Takifugu rubripes) proteomes revealed that, as expected, the halibut sequences were more similar to the other two fish species than to human proteins. However, no clear bias towards the pufferfishes was observed, suggesting significant sequence variation between orthologues within the clade Acanthomorpha. The sequence information generated in the present study will represent a significant new resource for future studies on normal and abnormal development in Atlantic halibut.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Maternal mRNAs; cDNA library; Expressed sequence tags; Gene Ontology

Abbreviations: BLAST, Basic local alignment search tool; EST, Expressed sequence tag; GO, Gene Ontology; NCBI, National Center for Biotechnology Information; PCR, Polymerase chain reaction.
⁎ Corresponding author. Fish Muscle Research Group, Gatty Marine Laboratory, School of Biology, University of St Andrews, East Sands, St Andrews, Fife, KY16 8LB, Scotland, UK. Tel.: +44 1334 463440; fax: +44 1334 463443. E-mail address: [email protected] (I.A. Johnston).
0378-1119/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2006.09.012

1. Introduction

The genomes of four model teleost fishes have been sequenced to the draft level: the zebrafish (Danio rerio), the medaka (Oryzias latipes) and the pufferfishes Takifugu rubripes and Tetraodon nigroviridis (http://www.ensembl.org/index.html).
In contrast, large-scale genetic resources for farmed fish species such as common carp (Cyprinus carpio), Atlantic salmon (Salmo salar), rainbow trout (Oncorhynchus mykiss) and tilapia (Oreochromis niloticus) are restricted to expressed sequence tags (ESTs), the product of high-throughput, single-pass cDNA sequence analyses (Cossins and Crawford, 2005).

The Atlantic halibut Hippoglossus hippoglossus (order Pleuronectiformes) is a valuable flatfish with an established market that is starting to be farmed commercially in Canada, Norway and Scotland (Bergh et al., 2001). A major bottleneck in halibut farming is the production of juveniles for on-growing. Significant problems include high embryonic and larval mortality and the prevalence of body deformities (Kjørsvik et al., 1990; Bromage et al., 1992). Paternity analysis using DNA microsatellite markers indicates that surviving juveniles typically come from a small percentage of the total broodstock population, suggesting poor egg quality following artificial fertilization (Jackson et al., 2003). Abnormal patterns of blastomere cleavage are common and have been linked to low hatching success (Kjørsvik et al., 1990; Shields et al., 1997).

Maternal mRNA transcripts and/or proteins have critical roles in directing early developmental processes in teleosts (Kane and Kimmel, 1993). In zebrafish, zygotic transcription only begins at the mid-blastula transition (512-cell stage) (Kane and Kimmel, 1993). In contrast, zygotic gene activation starts at the 1-cell stage in mouse embryos and is clearly evident by the 2-cell stage (Aoki et al., 1997; Zeng et al., 2004). During the maternal-to-zygotic transition, oocyte-specific transcripts are degraded and maternal transcripts common to oocyte and early embryo are replaced with zygotic transcripts (Paynton et al., 1988; Davis et al., 1996; Zeng et al., 2004).
Maternal mRNA transcripts and/or proteins are thought to control cell cleavage patterns, the establishment of body axes and specification of early embryonic cells prior to the mid-blastula phase (Nishikata et al., 2001; Yamada et al., 2005; Zeng et al., 2004; Zeng and Schultz, 2005). Post-ovulatory ageing in rainbow trout oocytes altered the abundance of mRNA transcripts, including insulin-like growth factor-I receptor, and was associated with developmental abnormalities in the larval stages (Aegerter et al., 2004).

Studies on developmental mechanisms in Atlantic halibut are hindered by a lack of molecular markers. All published EST datasets from Atlantic halibut are from the adult stage of the life cycle (e.g. Park et al., 2005) with a total of 8400 sequences deposited in the EMBL nucleotide database. There have been no transcriptome analyses concerned with skeletal muscle and/or early developmental stages and nothing is known about the composition of maternal mRNAs. The first aim of the present study was therefore to characterize cDNA libraries from 2-cell embryos, the trunk of yolk-sac larvae and the fast myotomal muscle of juveniles as a resource for future studies on normal and abnormal development.

The two pufferfish genomes sequenced to draft level belong to the same clade as the Atlantic halibut, the Acanthomorpha (ray-finned fishes with true spines in their anal and dorsal fins) (Nelson, 1994). The Acanthomorpha represent nearly 60% of extant fish diversity with more than 15,300 species and 314 families (Nelson, 1988). Many of the nodes on the Acanthomorpha tree remain unresolved or poorly defined (Dettai and Lecointre, 2005). The second aim of the present study was to establish the extent to which genomic resources from model species, particularly pufferfishes, could be useful for studies with Atlantic halibut.
We therefore investigated patterns of phylogenetic affinity between halibut EST sequences and the proteome databases available from pufferfish, zebrafish and human using the Java/Perl-based application SimiTri (Parkinson and Blaxter, 2003), which allowed simultaneous display and analysis of relative similarity relationships.

2. Materials and methods

2.1. Animals and sample collection

2-cell embryos and 1 day-old yolk-sac larvae (1 day after hatching) were obtained from broodstock of mature Atlantic halibut kept at the Halibut Research Station, Bodø University College, Bodø, Norway. A single batch of Atlantic halibut eggs obtained by manual stripping was fertilized with milt from one male. All developmental stages were reared using sea water maintained at a salinity of 33–35‰ and treated with ozone. Eggs were incubated in a 280 l tank at a temperature of 5.2–5.4 °C in darkness until day 7 after fertilization, by which time epiboly had been completed. 2-cell embryos were collected approximately 10 hours post fertilization (hpf). 1 day-old yolk-sac larvae were obtained on day 15 after fertilization. Samples were kept in RNAlater (Ambion, Cambridgeshire, UK), packaged with dry ice and transported to the Gatty Marine Laboratory (University of St Andrews, UK), where they were stored at −70 °C. The head, yolk-sac and tail of 1 day-old yolk-sac larvae were removed prior to total RNA extraction.

Juvenile Atlantic halibut were obtained from Marine Harvest (Scotland) Ltd (UK) and maintained at the Gatty Marine Laboratory in sea water (salinity 34‰) at 10 °C under a 12 h dark:12 h light photoperiod. A juvenile halibut (∼0.42 kg) was humanely killed by over-anaesthesia in a solution of 0.2 mM 3-aminobenzoic acid ethyl ester (Sigma, Dorset, UK) buffered with sodium bicarbonate (Sigma).
Fast muscle was dissected from the dorsal epaxial myotome and either snap-frozen in liquid nitrogen or stored in RNAlater for subsequent nucleic acid extraction.

2.2. cDNA library construction

One hundred milligrams of each sample were added to FastRNA ProGreen beads (Qbiogene) containing 1 ml of TRI reagent (Sigma). The tissues were homogenized using the FastPrep Instrument (Qbiogene) for 40 s at the speed setting of 6.0. Total RNA was isolated from each sample according to the manufacturer's instructions. Potential contaminating DNA was removed from the RNA preparation using TURBO DNA-free (Ambion). Poly (A+) RNA was purified using the Absolutely mRNA™ purification kit (Stratagene) and quantified with the fluorescent nucleic acid stain RiboGreen (Molecular Probes) following the recommended protocol so as to obtain ∼5 μg poly (A+) RNA. cDNA libraries were generated using the pBlueScript II XR cDNA library construction kit (Stratagene) according to the manufacturer's instructions. Primary cDNA libraries were amplified once before colonies were picked for sequencing.

2.3. cDNA clone selection and insert sequencing

The amplified plasmid cDNA libraries were plated and individual clones were randomly picked following blue-white selection with isopropyl β-D-thiogalactopyranoside (20 mM) and 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (0.2 mM). Insert checks were carried out by PCR with T3 short (5′-ATTAACCCTCACTAAAG-3′) and T7 short (5′-AATACGACTCACTATTAG-3′) primers. PCR involved an initial denaturation step at 95 °C for 5 min, followed by 35 amplification cycles: 94 °C for 30 s, 53 °C for 45 s and 72 °C for 2 min. A final extension at 72 °C for 10 min was used. PCR products were cleaned up with shrimp alkaline phosphatase (SAP)/Exonuclease I (Exo I) (Amersham).
5′ end sequencing PCR reactions with T3 primer (5′-ATTAACCCTCACTAAAGGGAA-3′) were performed using the ABI prism Big Dye (v3.1) Terminator Cycle Sequencing Ready Reaction Kit (PE Applied Biosystems, USA). The sequencing reaction comprised an initial denaturation step at 96 °C for 1 min, and 25 cycles at 96 °C for 10 s, 50 °C for 5 s and 60 °C for 4 min. DNA pellets were sent for sequencing at the John Innes Centre (JIC) with an ABI 3770 DNA sequencer and at the Oxford DNA sequencing facility with an ABI 3700 capillary DNA sequencer (PE Applied Biosystems, USA).

2.4. Sequence processing and bioinformatics analysis

Raw trace data were processed with the EST analysis pipeline developed by the Natural Environment Research Council and Environmental Genomics Thematic Programme Data Centre (NERC–EGTDC, University of Edinburgh). The electropherograms were sequentially analyzed by trace2dbEST (http://envgen.nox.ac.uk/est.html), PartiGene (Parkinson et al., 2004), Prot4EST (Wasmuth and Blaxter, 2004), and annot8r (available from http://www.nematodes.org/PartiGene), as described by Fernandes et al. (2005).

For cross-taxon comparison, the 4249 EST sequences generated in the present study were used to construct a partial transcriptome database with PartiGene. The resulting putative genes of Atlantic halibut were subjected to BLASTX similarity searches (Altschul et al., 1997) against the zebrafish, pufferfishes and human predicted proteomes (downloadable from http://www.ensembl.org). Relative similarity relationships between the halibut data set and these proteomes were visualized as a triangular plot generated by the SimiTri software (Parkinson and Blaxter, 2003).
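The idea of placing a sequence inside a triangle according to its similarity to three databases can be illustrated with a small sketch. The version below is only an assumption about how such a plot could be computed (a score-weighted average of the three vertex positions); it is not taken from the SimiTri source code. The vertex layout, the function name `triangle_position` and the treatment of the score cutoff as "greater than or equal to" are all hypothetical choices made for illustration.

```python
import math

# Hypothetical vertex layout for an equilateral triangle (not from SimiTri itself).
VERTICES = {
    "human": (0.0, 0.0),
    "zebrafish": (1.0, 0.0),
    "pufferfish": (0.5, math.sqrt(3) / 2),
}


def triangle_position(scores, cutoff=50.0):
    """Map raw BLAST scores against three databases to an (x, y) point.

    Scores below the cutoff are treated as "no hit". A sequence with hits in
    fewer than two databases returns None (it would not be drawn). The point
    is the score-weighted average of the vertices, so a sequence with hits in
    only two databases falls on the edge joining them.
    """
    kept = {db: s for db, s in scores.items() if s >= cutoff}
    if len(kept) < 2:
        return None
    total = sum(kept.values())
    x = sum(VERTICES[db][0] * s for db, s in kept.items()) / total
    y = sum(VERTICES[db][1] * s for db, s in kept.items()) / total
    return (x, y)


# Equal similarity to all three proteomes lands at the centre of the triangle.
centre = triangle_position({"human": 200, "zebrafish": 200, "pufferfish": 200})

# Hits in only two databases fall on the edge between them (y == 0 here).
edge = triangle_position({"human": 100, "zebrafish": 100, "pufferfish": 0})

# A hit in a single database is not represented.
single = triangle_position({"human": 300, "zebrafish": 10, "pufferfish": 0})
```

Under this weighting, a tile drifts towards the vertex of the database it matches best, which is the qualitative behaviour the triangular plot is meant to convey.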
3. Results and discussion

3.1. Summary of ESTs from the cDNA libraries in the 2-cell embryo, the trunk of yolk-sac larvae and the fast skeletal muscle of juvenile Atlantic halibut

The amplified libraries contained at least 6.9 × 10^8, 1.1 × 10^9 and 7.6 × 10^8 total transformants in the 2-cell embryo, yolk-sac larvae (trunk) and juvenile (fast skeletal muscle) libraries, respectively, and insert sizes ranged from approximately 0.5 to 3.0 kb. Single-pass sequencing was performed on a total of 6610 randomly picked clones, including 1783 clones from the 2-cell library, 2112 clones from the larval trunk library and 2715 clones from the fast muscle library. After removing the vector sequences and adaptors, only ESTs longer than 150 bp were chosen for further analysis. As a result, 4249 high quality sequences were obtained and submitted to the EST database dbEST (http://www.ncbi.nlm.nih.gov/dbEST): 1419 sequences from the 2-cell library (GenBank accession numbers: DT805529–DT806308), 1300 sequences from the larval trunk library (GenBank accession numbers: DN794246–DN794954, DT806677–DT806820 and EB102920–EB103372) and 1530 sequences from the fast muscle library (GenBank accession numbers: DN792583–DN793123, DT806309–DT806676).

The proportion of known gene sequences containing a start codon was determined by comparing their putative translation products with the corresponding BLASTX hits against the NR (NCBI) and UniProt (EBI) databases. Putative peptides containing a start methionine were considered to correspond to ESTs with complete 5′ coding sequences. On this basis, the percentage of full-length cDNA clones in the 2-cell, larval trunk and juvenile fast muscle libraries was 47%, 56% and 60%, respectively.

A total of 2124 clusters (putative genes) were generated by PartiGene: 798 clusters from the 2-cell library, 719 clusters from the larval trunk library and 607 clusters from the fast muscle library (Fig. 1). A total of 120 (15.0%), 168 (23.4%) and 177 (29.1%) clusters were represented by multiple ESTs in the 2-cell, larval trunk and fast muscle libraries, respectively, with corresponding maximum cluster sizes of 70, 35 and 89 EST sequences. In some cases the computational assembly of clusters was too conservative, generating distinct clusters that corresponded to the same gene. Such clusters were identified and grouped into superclusters. A complete list of the partial transcriptome databases corresponding to all three libraries generated in this study is available online at the Fish Muscle Research Group web site: http://www.st-andrews.ac.uk/~fmrg.

Fig. 1. Diagram of the number of clusters isolated from three cDNA libraries and the overlap among them. Putative gene clusters were obtained from cDNA libraries of 2-cell embryo, yolk-sac larva (trunk) and juvenile (fast myotomal muscle) Atlantic halibut (Hippoglossus hippoglossus).

Table 1
Most abundant 20 EST clones of maternal mRNA transcripts from the 2-cell embryo cDNA library from Atlantic halibut (Hippoglossus hippoglossus)

Two-cell embryo (1419 sequences, 798 clusters)

Gene                                     No. of sequences   Cluster size   Transcript abundance (%)
Mitochondrial genes
Cytochrome c oxidase subunit II          132                8              9.30
Cytochrome c oxidase subunit III         108                5              7.61
Cytochrome b                             104                8              7.33
Cytochrome c oxidase subunit I           89                 7              6.27
NADH dehydrogenase subunits (1–4)        27                 7              1.90
Mitochondrial hypothetical 18K protein   26                 1              1.83
ATP synthase subunit 6                   19                 3              1.34
Nuclear genes
Novel Tetraodon protein                  13                 1              0.92
Histone (H1M, H2A and H3)                9                  3              0.63
rRNA external transcribed spacer         6                  1              0.42
Splicing factor                          6                  1              0.42
Zinc finger protein                      6                  3              0.42
Senescence-associated protein            5                  3              0.35
Thymosin beta-4                          5                  1              0.35
Cyclin B1                                4                  2              0.28
Nucleoside diphosphate kinase B          4                  2              0.28
Novel Tetraodon protein                  4                  1              0.28
Novel Tetraodon protein                  3                  1              0.21
Cathepsin L                              3                  1              0.21
Transcription elongation factor B        3                  3              0.21

The overall redundancy (defined as the ratio between the total number of sequences and the number of clusters) was 1.78, 1.81 and 2.52 for the 2-cell, larval trunk and fast muscle libraries, respectively. In fact, more than 70% of the putative genes from each library were singletons represented by only one EST. These low redundancy values suggest that our isolation of ESTs may not be saturated and that most clusters represent rare mRNAs.

Sixty-seven clusters were common to all three libraries. When pairwise comparisons between libraries were carried out, 110 superclusters were found to be present in both 2-cell and larval trunk, while 95 and 192 superclusters were common between 2-cell and fast muscle, and fast muscle and larval trunk, respectively. The majority of EST superclusters in each library (2-cell (82.7%), larval trunk (67.3%) and fast muscle (63.8%)) did not show any overlap. The number of clusters without significant hits (cutoff of 1e−8) against the 8400 adult halibut sequences available from EMBL was 579 (72.6%) (2-cell embryo), 424 (59.0%) (larval trunk) and 347 (57.1%) (juvenile).
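As a quick check on the redundancy figures quoted in this section, the calculation can be reproduced from the sequence and cluster counts given in the text. The sketch below is illustrative only; the variable and function names are ours, and the singleton fraction is derived by assuming that every cluster not reported as multi-EST contains exactly one EST.

```python
# (total ESTs, clusters, multi-EST clusters) per library, as reported in the text.
libraries = {
    "2-cell": (1419, 798, 120),
    "larval trunk": (1300, 719, 168),
    "fast muscle": (1530, 607, 177),
}


def redundancy(n_sequences, n_clusters):
    """Redundancy as defined in the text: total sequences / number of clusters."""
    return n_sequences / n_clusters


def singleton_fraction(n_clusters, n_multi_est_clusters):
    """Fraction of clusters represented by a single EST (assumed: all non-multi clusters)."""
    return 1.0 - n_multi_est_clusters / n_clusters


for name, (n_seq, n_clu, n_multi) in libraries.items():
    print(
        f"{name}: redundancy {redundancy(n_seq, n_clu):.2f}, "
        f"singletons {100 * singleton_fraction(n_clu, n_multi):.1f}%"
    )
```

The three redundancy values round to 1.78, 1.81 and 2.52, and the singleton fraction stays above 70% in every library, in line with the statements above.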
The full list of sequences unique to each developmental stage studied is shown as HTML tables available online at http://www.st-and.ac.uk/~fmrg/bai.html. These results indicate that each developmental stage has a significant number of unique clusters, probably reflecting differential gene expression during development.

3.2. Distribution of sequences at each Atlantic halibut developmental stage

Cluster sizes in a non-normalised EST library indicate the relative abundance of the corresponding transcripts. Tables 1, 2 and 3 show the top 20 EST clones in the three developmental stages studied. Transcript abundance is defined as the ratio between the number of EST sequences of putative genes and the total number of ESTs. The highest abundance of mitochondrial sequences was found in the 2-cell library (35.6%) compared with 8.9% for the larval trunk library and 26% for the juvenile fast muscle library. This may simply reflect the different composition of tissues present in the samples from which the RNA was extracted, e.g. whole organism versus specific tissue(s). Comparative studies with Xenopus (El Meziane et al., 1989) and mouse (Pikó and Taylor, 1987) indicate that mitochondrial transcripts are abundant in the oocyte but decrease dramatically shortly after fertilization, only increasing again after the transition to zygotic transcription at the 8-cell stage and mid-blastula transition respectively. The most abundant nuclear transcripts in the 2-cell library included a short peptide without conserved domains also present in the T. nigroviridis genome (GSTENG00007427001), a linker histone-like protein (H1M), thymosin β-4 (TMSB4X) and cyclin-dependent kinase regulatory subunit cyclin B1 (CCNB1) (Table 1). The histone-like protein was similar to zebrafish H1M, a maternally transmitted protein that is involved in the specification of primordial germ cells and is expressed during blastula stages (Muller et al., 2002).
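The transcript abundance defined above is a simple percentage, and the 35.6% mitochondrial share of the 2-cell library can be recovered from the mitochondrial rows of Table 1. The following sketch uses the EST counts listed in Table 1; the dictionary and function names are ours and are for illustration only.

```python
TOTAL_2CELL_ESTS = 1419  # ESTs in the 2-cell library

# Number of ESTs per mitochondrial gene in the 2-cell library (Table 1).
mitochondrial_counts = {
    "Cytochrome c oxidase subunit II": 132,
    "Cytochrome c oxidase subunit III": 108,
    "Cytochrome b": 104,
    "Cytochrome c oxidase subunit I": 89,
    "NADH dehydrogenase subunits (1-4)": 27,
    "Mitochondrial hypothetical 18K protein": 26,
    "ATP synthase subunit 6": 19,
}


def transcript_abundance(n_ests, total_ests):
    """Transcript abundance (%) = ESTs for the gene / total ESTs in the library."""
    return 100.0 * n_ests / total_ests


for gene, n_ests in mitochondrial_counts.items():
    print(f"{gene}: {transcript_abundance(n_ests, TOTAL_2CELL_ESTS):.2f}%")

# Summing the seven mitochondrial entries gives the library-wide mitochondrial share.
mito_total = sum(mitochondrial_counts.values())
mito_share = transcript_abundance(mito_total, TOTAL_2CELL_ESTS)
print(f"All mitochondrial ESTs: {mito_total} ({mito_share:.1f}%)")
```

The seven mitochondrial entries sum to 505 ESTs, i.e. 35.6% of the 1419 sequences, which matches the figure quoted for the 2-cell library.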
Cyclin B1 has a key regulatory role in the G2–M transition of the cell cycle (Kong et al., 2000) and thymosin β-4 is thought to control the rate and extent of actin polymerization in cells (Carlier et al., 1993). The largest gene clusters encoded structural proteins or metabolic enzymes such as mitochondrial genes for cytochrome oxidase, cytochrome b and ATPase subunit 6, α-actin and nucleoside diphosphate kinase (Tables 1, 2, and 3). Several of the most abundant transcripts in the larval and juvenile libraries corresponded to genes that exhibit a skeletal muscle-specific or skeletal muscle-predominant pattern of expression, including myosin light chain, myosin heavy chain, tropomyosin, troponin and muscle-specific creatine kinase (Table 2).

The libraries contained a range of potential markers of developmental processes. The sequence information reported here should provide a useful resource for future studies of normal and abnormal development in this species. For example, full-length cDNAs have now been characterized for the hairy-related 4 gene (accession number DQ885478), a basic helix-loop-helix transcription factor involved in cell fate specification and particularly in neurogenesis (Fisher and Caudy, 1998) (from the 2-cell embryo library), osteonectin (DQ912662), a glycoprotein involved in bone formation and mineralization (Estêvão et al., 2005) (from the larval trunk library), and the proteolytic enzyme cathepsin D (DQ912663) (from the juvenile fast muscle library).

Table 2
Most abundant 20 EST clones of zygotically transcribed mRNA transcripts from the yolk-sac larval (trunk) cDNA library from Atlantic halibut (Hippoglossus hippoglossus)

Yolk-sac larval (trunk) (1300 sequences, 719 clusters)

Gene                                       No. of sequences   Cluster size   Transcript abundance (%)
Mitochondrial genes
Cytochrome c oxidase subunit I             47                 2              3.62
ATP synthase subunit 6                     28                 7              2.15
Cytochrome c oxidase subunit II            19                 2              1.46
Cytochrome b                               16                 1              1.13
NADH dehydrogenase subunits (1, 2 and 4)   7                  5              0.54
Nuclear genes
Ribosomal proteins a                       170                40             13.08
Cardiac muscle alpha-actin                 43                 1              3.30
Myosin light chain (1, 2 and 3)            32                 5              2.46
Creatine kinase muscle isoform 1           32                 2              2.46
Novel Tetraodon protein                    16                 1              1.23
Myosin heavy chain                         13                 4              1.00
Novel Tetraodon protein                    11                 1              0.85
Aldolase A fructose-bisphosphate           10                 1              0.77
Parvalbumin beta                           9                  1              0.69
Skeletal fast troponin T                   9                  2              0.69
Nucleoside diphosphate kinase B            9                  1              0.69
Novel Tetraodon protein                    8                  1              0.62
Adenine nucleotide translocator s6         6                  1              0.46
Glyceraldehyde-3-phosphate dehydrogenase   6                  1              0.46
Skeletal alpha actin                       5                  2              0.38

a Although all ribosomal proteins were grouped into one cluster in this study, the majority of ribosomal proteins were represented by single copy genes.

Table 3
Most abundant 20 EST clones of zygotically transcribed mRNA transcripts from the juvenile (fast skeletal muscle) cDNA library from Atlantic halibut (Hippoglossus hippoglossus)

Juvenile (fast muscle) (1530 sequences, 607 clusters)

Gene                                       No. of sequences   Cluster size   Transcript abundance (%)
Mitochondrial genes
Cytochrome c oxidase subunit II            161                11             10.52
Cytochrome c oxidase subunit III           72                 3              4.71
NADH dehydrogenase subunits (1–6)          47                 11             3.07
ATPase subunit 6                           45                 8              3.14
Cytochrome b                               34                 3              2.22
Cytochrome c oxidase subunit I             24                 4              1.57
Cytochrome c oxidase subunit IV            11                 2              0.72
Nuclear genes
Ribosomal proteins a                       123                46             8.04
Slow troponin T2                           21                 2              1.37
Glyceraldehyde-3-phosphate dehydrogenase   19                 3              1.24
Myosin heavy chain 2                       16                 3              1.05
Novel Tetraodon protein                    15                 1              0.98
Cytoplasmic actin A3a2                     14                 1              0.92
Thymosin beta                              12                 1              0.78
Tropomyosin (2 and 4)                      12                 3              0.78
Novel Tetraodon protein                    11                 1              0.72
Eukaryotic translation factor              8                  6              0.52
Troponin I, cardiac muscle                 8                  1              0.52
Adenine nucleotide translocator            8                  1              0.52
Alpha actin, smooth muscle                 6                  2              0.39

a Although all ribosomal proteins were grouped into one cluster in this study, the majority of ribosomal proteins were represented by single copy genes.

3.3. Gene identification based on Prot4EST

As shown in Fig. 2, only a small proportion of EST clusters in each library corresponded to rRNA genes (2-cell (3.5%), larval trunk (1.4%) and fast muscle (1.6%)). The frequency of mitochondrial clusters was 4.1%, 1.9% and 7.9% in the 2-cell, larval trunk and fast muscle libraries, respectively. After eliminating rRNA and mitochondrial genes, 209, 337, and 305 of the clusters from the 2-cell, larval trunk and fast muscle libraries, respectively, had significant similarity matches to the nr database (herein defined as “known genes”). Of the remaining clusters, 442 (55.4%), 307 (42.7%) and 185 (30.5%) from the 2-cell, larval trunk and fast muscle libraries, respectively, could be translated with ESTScan and DECODER. Hence, these are likely to represent novel putative polypeptides, hereafter referred to as “unknown genes” (Fig. 2A, B and C). Combining the data from the three libraries, there were 728 known genes (38.3%), 918 unknown genes (48.3%), and 159 clusters (7.9%) that could not be translated with ESTScan and DECODER and probably correspond to untranslated regions or pseudogenes.

Fig. 2. Pie chart of gene classification of 2-cell embryo (A), yolk-sac larva (trunk) (B), juvenile (fast skeletal muscle) (C) and combined cDNA libraries (D), based on Prot4EST analyses. “Known genes” comprise ESTs that had significant matches to entries from the non-redundant protein database. The clusters termed “unknown genes” consist of novel, putative peptide sequences that have been predicted by ESTScan or DECODER. The category “others” refers to ESTs that were not predicted to code for functional peptides and include potential pseudogenes and untranslated regions.

3.4. Gene annotation based on Gene Ontology

To investigate the functional profile of genes expressed in Atlantic halibut, the putative translation products of the clusters from three libraries were grouped into different categories according to the GO slim terms (Ashburner et al., 2001; Harris et al., 2004), as shown in Fig. 3. A comprehensive list of EST clusters with the corresponding GO annotations can be found at http://www.st-andrews.ac.uk/~fmrg/bai.html. Clusters were assigned to 4 main cellular component categories (cell, extracellular, intracellular and cellular component unknown), 10 main molecular function categories (including motor, transcription regulator, signal transducer, enzyme regulator, catalytic, binding, structural molecule, transporter and molecular function unknown) and 11 main biological process categories (including electron transport, response to stimulus, nucleic acid derivative metabolism, transport, cell communication, cellular process and biological process unknown) (Fig. 3). A large proportion of clusters from the 2-cell library did not have an associated GO slim term and were referred to as “unclassified”: 67.7% in molecular function, 77.4% in cellular component and 74.3% in the biological process category. In the yolk-sac larval library, the unclassified rate was 44.4% in molecular function, 68.0% in cellular component and 65.9% in biological process, while the unclassified clusters accounted for 32.5%, 51.4% and 53.1%, respectively, in the juvenile library. The majority of these clusters represent hypothetical proteins with unknown function and cellular location. There is a striking difference in the number of unclassified proteins from the three libraries: earlier developmental stages correlated with a larger proportion of proteins with unidentified function. Most proteins in all three libraries were intracellular or located on the plasma membrane (Fig. 3A). In the molecular function categories, over 70% of the putative proteins with a GO slim classification were related to catalytic, binding, structural molecule, nucleic acid binding, transporter and motor activities (Fig. 3B). As far as their biological process is concerned, proteins in subcategories of physiological process, electron transport, nucleotide and nucleic acid metabolism as well as transport were preponderant in all three libraries (Fig. 3C).

Fig. 3. Gene classification of 2-cell embryo, yolk-sac larval (trunk) and juvenile (fast skeletal muscle) Atlantic halibut based on Gene Ontology. A. Cellular component. B. Molecular function. C. Biological process.

3.5. Differential expression of ribosomal protein genes in the yolk-sac larva and juvenile stages

Each ribosome contains some 50 distinct proteins that must be made at exactly the same rate (Nomura et al., 1984; Ju et al., 2000). As each of the ribosomal proteins is required in the formation of ribosomes, correct expression of ribosomal protein genes plays an important regulatory role in ribosome assembly. However, it is known that the primary control of ribosomal protein synthesis is at the level of translation rather than mRNA synthesis (Nomura et al., 1984). In this study, ESTs of 40 (170 clones) and 41 (161 clones) ribosomal protein genes were identified in the larval trunk and juvenile fast muscle libraries. Of these, 22 were for large and 18 for small ribosomal subunits in the larval trunk library, and 22 for large and 19 for small ribosomal subunits in the fast muscle library. Although ribosomal proteins are proportionally required for the assembly of ribosomes, large differences were observed in the relative abundance of ribosomal protein ESTs. In the larval trunk library, the most abundant ribosomal protein gene products were L23 (65 clones), followed by L6 (9 clones), S19 (8 clones), L1, L35 and S19 (each with 5 clones), L8, L17, L36a, S20 and S24 (each with 4 clones). Three clones were identified for L14, L27a, L28, L29, L38, S4 and S6. Eight ribosomal protein genes were each represented by two clones. Only one clone was sequenced for each of the remaining 12 ribosomal genes. In the fast muscle library, the most abundant ribosomal protein genes were S29 (11 clones), followed by S14 (10 clones), S19 and S17 (both with 7 clones), S26 and L30 (both with 6 clones), and P1, P2, L35, S18 and S30 (all with five clones each), S15 (4 clones), L10a, L18, L27a, L28, L38, S11, S20 and S24 (all with 3 clones), L30 (4 clones), S11, S30, L10a, and L35a (each with three clones). Two clones were sequenced for L6, L17, L23, L29 and S7. The remaining 14 genes corresponded to only one clone each. The expression profile of ribosomal proteins indicated a difference of up to 63-fold in their relative mRNA abundance in halibut larvae. Such a large difference suggests that for several ribosomal protein genes, such as L23 in larvae and S29 in juveniles, translational control has to account for this large difference in order to synthesize the same level of ribosomal proteins. Similar profiles of gene expression were observed for ribosomal proteins in catfish (Ictalurus punctatus) brain (Ju et al., 2000). In the halibut 2-cell library, there were only two ribosomal protein genes (40S ribosomal protein S27 and 60S ribosomal protein L32), each represented by a single clone. It has been shown that each Xenopus laevis oocyte accumulates an equivalent amount of ribosomes as thousands of somatic cells (reviewed by Brown, 2004). This mechanism of specific rDNA amplification has not been described in teleosts, but our data indicate that it might be occurring in Atlantic halibut oocytes.

3.6. Comparison of Atlantic halibut proteins with zebrafish, pufferfish and human proteomes

Fig. 4 shows the SimiTri representation of predicted Atlantic halibut genes compared to zebrafish, human and green-spotted pufferfish proteomes. Of the 1899 putative H. hippoglossus genes, only 753, 685, 726 and 717 genes had significant BLAST hits against the H. sapiens, D. rerio, T. nigroviridis and T. rubripes protein databases, respectively. The majority of putative proteins were located in the centre with a bias towards the top and right sections of the triangle showing that, as expected, H. hippoglossus is more closely related to T. nigroviridis and D. rerio than H. sapiens (Fig. 4). Similarly, H. hippoglossus is also more closely related to T. rubripes and D. rerio than H. sapiens (Supplementary Fig. S1). For further details please consult the interactive applets available online at http://www.st-andrews.ac.uk/~fmrg/bai.html. Interestingly, the Atlantic halibut cytochrome c oxidase subunit 3 (Cox3), which is involved in mitochondrial electron transport, is more similar to human COX3 than the zebrafish and green-spotted pufferfish Cox3, as shown in Fig. 4. Sequences located on the edge joining two databases did not have significant matches against the database indicated on the opposite apex of the triangle. For example, the clusters aligned along the edge joining the T. nigroviridis and D. rerio proteomes might have been lost in the human genome during evolution. The relatively equal distribution of proteins between the vertices that correspond to zebrafish, green-spotted pufferfish (Fig. 4) and T. rubripes (Supplementary Fig. S1) proteomes indicated that sequence variation within the clade Acanthomorpha is comparable to that between the Acanthomorpha and zebrafish, which is in the clade Ostariophysi. Our results emphasize the importance of EST analysis for studies of developmental processes in non-model species.

Fig. 4. Similarity of Hippoglossus hippoglossus ESTs to the proteomes of Homo sapiens, Danio rerio and Tetraodon nigroviridis. SimiTri plot showing sequence similarity relationships between H. hippoglossus putative peptides and H. sapiens (142,587 entries, 753 hits), D. rerio (42,184 entries, 685 hits) and T. nigroviridis (28,412 entries, 726 hits) proteomes. For each of the 1899 Atlantic halibut peptides, a BLASTX search was performed against the proteomes of the other 3 species. Each tile in the graphic represents a unique consensus sequence and its relative position is computed from the raw BLASTX scores derived above (with a cutoff of 50). Hence, each tile's position indicates its degree of sequence similarity to each of the three selected databases. The cluster outlined with a circle (HHC01291) is more similar to its human orthologue than to its counterpart in the other two fish species. Sequences showing similarity to a single database are not represented. Sequences with similarity to only two databases appear on the lines joining the two databases. Tiles are colored by their highest BLASTX score to each of the databases: red ≥300; yellow ≥200; green ≥150; blue ≥100 and purple <100.

3.7. Conclusions

• The 4249 high quality ESTs prepared from 2-cell embryo, larval (trunk) and juvenile (fast muscle) of Atlantic halibut were clustered into a partial transcriptome of 2124 putative genes, 48.3% of which corresponded to unknown proteins.
• The maternal mRNA in the 2-cell embryos contained a higher proportion of mitochondrial transcripts (35.6%) than the other libraries, but only had two ESTs for a ribosomal protein, which may reflect the relatively high abundance of pre-formed ribosomes in the yolk.
• A global analysis of protein similarity revealed significant differences between Atlantic halibut and other members of the clade Acanthomorpha (T. nigroviridis and T. rubripes).

Acknowledgements

We would like to thank Professor Igor Babiak of the Department of Fisheries and Natural Sciences, Bodø Regional University, Norway for collecting samples of 2-cell embryo and 1 day-old yolk-sac larvae of Atlantic halibut. We are grateful to Marine Harvest (Scotland) Ltd for providing the juvenile halibut. This work was funded by the Norwegian Research Council (Grant No.: NFR159594/S40) with additional funding for consumables from the MARBIT program (Grant no. AF0024).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.gene.2006.09.012.

References

Aegerter, S., Jalabert, B., Bobe, J., 2004. Messenger RNA stockpile of cyclin B, insulin-like growth factor I, insulin-like growth factor II, insulin-like growth factor receptor Ib and p53 in the rainbow trout oocyte in relation to developmental competence. Mol. Reprod. Dev. 67, 127–135.
Altschul, S.F., et al., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Aoki, F., Worrad, D.M., Schultz, R.M., 1997. Regulation of transcriptional activity during the first and second cell cycles in the preimplantation mouse embryo. Dev. Biol. 181, 296–307.
Ashburner, M., et al., 2001. Creating the gene ontology resource: design and implementation. Genome Res. 11, 1425–1443.
Bergh, O., Nilsen, F., Samuelsen, O.B., 2001. Diseases, prophylaxis and treatment of the Atlantic halibut Hippoglossus hippoglossus: a review. Dis. Aquat. Org. 48, 57–74.
Bromage, N., et al., 1992. Broodstock management, fecundity, egg quality and the timing of egg production in the rainbow trout (Oncorhynchus mykiss). Aquaculture 100, 141–166.
Brown, D.D., 2004. A tribute to the Xenopus laevis oocyte and egg. J. Biol. Chem. 279, 45291–45299.
Carlier, M.-F., Jean, C., Rieger, K.J., Lenfant, M., 1993. Modulation of the interaction between G-actin and thymosin β4 by the ATP/ADP ratio: possible implication in the regulation of actin dynamics. Proc. Natl. Acad. Sci. 90, 5034–5038.
Cossins, A.R., Crawford, D.L., 2005. Fish as models for environmental genomics. Nat. Rev., Genet. 4, 324–333.
Davis Jr., W., De Sousa, P.A., Schultz, R.M., 1996. Transient expression of translation initiation factor eIF-4C during the 2-cell stage of the preimplantation mouse embryo: identification by mRNA differential display and the role of DNA replication in zygotic gene activation. Dev. Biol. 174, 190–201.
Dettai, A., Lecointre, G., 2005. Further support for the clades obtained by multiple molecular phylogenies in the acanthomorph bush. C. R. Biol. 328, 674–689.
El Meziane, A., Callen, J.C., Mounolon, J.C., 1989. Mitochondrial gene expression during Xenopus laevis development: a molecular study. EMBO J. 8, 1649–1655.
Estêvão, M.D., Redruello, B., Canario, A.V.M., Power, D.M., 2005. Ontogeny of osteonectin expression in embryos and larvae of sea bream (Sparus auratus). Gen. Comp. Endocrinol. 142, 155–162.
Fernandes, J.M., et al., 2005. A genomic approach to reveal novel genes associated with myotube formation in the model teleost, Takifugu rubripes. Physiol. Genomics 22, 327–338.
Fisher, A., Caudy, M., 1998. The function of hairy-related bHLH repressor proteins in cell fate decisions. Bioessays 20, 298–306.
Harris, M.A., et al., 2004. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 32, D258–D261.
Jackson, T.R., Martin-Robichaud, D.J., Reith, M.E., 2003. Application of DNA markers to the management of Atlantic halibut (Hippoglossus hippoglossus) broodstock. Aquaculture 220, 245–259.
Ju, Z., et al., 2000. Transcriptome analysis of channel catfish (Ictalurus punctatus): genes and expression profile from the brain. Gene 261, 373–382.
Kane, D.A., Kimmel, C.B., 1993. The zebrafish midblastula transition. Development 119, 447–456.
Kjørsvik, E., Magnor-Jensen, A., Holmefjord, I., 1990. Egg quality in fishes. Adv. Mar. Biol. 26, 71–113.
Kong, M., Barnes, E.A., Ollendorf, V., Donoghue, D.J., 2000. Cyclin F regulates the nuclear localization of cyclin B1 through a cyclin–cyclin interaction. EMBO J. 19, 1378–1388.
Muller, K., Thisse, C., Thisse, B., Raz, E., 2002. Expression of a linker histone-like gene in the primordial germ cells in zebrafish. Mech. Dev. 117, 253–257.
Nelson, G., 1988. Phylogeny of major fish groups. Nobel Symposium 70: The Hierarchy of Life. Molecules and Morphology in Phylogenetic Analysis. Elsevier Science Publishers B.V. (Biomedical Division), Karlskoga, Sweden.
Nelson, J.S., 1994. Fishes of the World, 3rd edition. John Wiley and Sons, New York. 600 pp.
Nishikata, T., et al., 2001. Profiles of maternally expressed genes in fertilized eggs of Ciona intestinalis. Dev. Biol. 238, 315–331.
Nomura, M., Gourse, R., Baughman, G., 1984. Regulation of the synthesis of ribosomes and ribosomal components. Annu. Rev. Biochem. 53, 75–117.
Park, K.C., Osborne, J.A., Tsoi, S.C., Brown, L.L., Johnson, S.C., 2005. Expressed sequence tags analysis of Atlantic halibut (Hippoglossus hippoglossus) liver, kidney and spleen tissues following vaccination against Vibrio anguillarum and Aeromonas salmonicida. Fish Shellfish Immunol. 18, 393–415.
Parkinson, J., Blaxter, M., 2003. SimiTri—visualizing similarity relationships for groups of sequences. Bioinformatics 19, 390–395.
Parkinson, J., Anthony, A., Wasmuth, J., Schmid, R., Hedley, A., Blaxter, M., 2004. PartiGene—constructing partial genomes. Bioinformatics 20, 1398–1404.
Paynton, B.V., Rempel, R., Bachvarova, R., 1988. Changes in the state of adenylation and time course of degradation of maternal mRNAs during oocyte maturation and early embryonic development in the mouse. Dev. Biol. 129, 304–314.
Pikó, L., Taylor, K.D., 1987. Amounts of mitochondrial DNA and abundance of some mitochondrial gene transcripts in early mouse embryos. Dev. Biol. 123, 364–374.
Shields, R.J., Brown, N.P., Bromage, N.R., 1997. Blastomere morphology as a predictive measure of fish egg viability. Aquaculture 155, 1–12.
Wasmuth, J.D., Blaxter, M.L., 2004. Prot4EST: translating expressed sequence tags from neglected genomes. BMC Bioinformatics 5, 187.
Yamada, L., Kobayashi, K., Satou, Y., Satoh, N., 2005. Microarray analysis of localization of maternal transcript in eggs and early embryos of the ascidian, Ciona intestinalis. Dev. Biol. 284, 536–550.
Zeng, F., Schultz, R.M., 2005. RNA transcript profiling during zygotic activation in the preimplantation mouse embryo. Dev. Biol. 283, 40–57.
Zeng, F., Baldwin, D.A., Schultz, R.M., 2004. Transcript profiling during preimplantation mouse development. Dev. Biol. 272, 483–496.