Gene 386 (2007) 202 – 210
www.elsevier.com/locate/gene
Profiling of maternal and developmental-stage specific mRNA transcripts in
Atlantic halibut Hippoglossus hippoglossus
Jialin Bai a,b, Christel Solberg b, Jorge M.O. Fernandes a, Ian A. Johnston a,b,⁎
a Fish Muscle Research Group, Gatty Marine Laboratory, School of Biology, University of St Andrews, East Sands, St Andrews, Fife, KY16 8LB, Scotland, UK
b Department of Fisheries and Natural Sciences, Bodø Regional University, N-8049 Bodø, Norway
Received 28 October 2005; received in revised form 4 September 2006; accepted 19 September 2006
Available online 5 October 2006
Received by C.T. Amemiya
Abstract
cDNA libraries were constructed from the following developmental stages (tissues) of the Atlantic halibut (Hippoglossus hippoglossus): 2-cell
stage (embryos), 1 day-old yolk sac larvae (trunk) and juvenile (fast skeletal muscle). A total of 4249 high quality expressed sequence tags from
the three libraries were clustered into a partial transcriptome of 2124 putative genes. A large proportion of the gene clusters (48.3%) had no
significant matches against known proteins. The most abundant ESTs of nuclear transcripts in the 2-cell library included sequences with high
identity to zebrafish H1M, a linker histone-like protein involved in primordial germ cell specification, zinc finger protein, rRNA external
transcribed spacer, thymosin β-4, cyclin B1 and several predicted peptides from the Tetraodon nigroviridis genome assembly with unknown
functions. 170 and 123 ESTs represented ribosomal proteins in the larval and juvenile libraries respectively, compared with only two sequences in
the 2-cell library, which may reflect an abundance of maternally inherited pre-formed ribosomes in the yolk. Even though some clusters were
common to all three libraries, most putative genes showed a developmental-stage specific distribution with 72% (2-cell embryo), 59% (larval) and
57% (juvenile) sequences having no significant matches against the 8400 adult halibut sequences in the EMBL nucleotide database. Comparison
between the predicted halibut peptide data set and the human, zebrafish, and pufferfishes (T. nigroviridis and Takifugu rubripes) proteomes
revealed that, as expected, the halibut sequences were more similar to the other two fish species than to human proteins. However, no clear bias
towards the pufferfishes was observed, suggesting significant sequence variation between orthologues within the clade Acanthomorpha. The
sequence information generated in the present study will represent a significant new resource for future studies on normal and abnormal
development in Atlantic halibut.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Maternal mRNAs; cDNA library; Expressed sequence tags; Gene Ontology
Abbreviations: BLAST, Basic local alignment search tool; EST, Expressed sequence tag; GO, Gene Ontology; NCBI, National Center for Biotechnology Information; PCR, Polymerase chain reaction.
⁎ Corresponding author. Fish Muscle Research Group, Gatty Marine Laboratory, School of Biology, University of St Andrews, East Sands, St Andrews, Fife, KY16 8LB, Scotland, UK. Tel.: +44 1334 463440; fax: +44 1334 463443.
E-mail address: [email protected] (I.A. Johnston).
0378-1119/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.gene.2006.09.012
1. Introduction
The genomes of four model teleost fishes have been sequenced
to the draft level: the zebrafish (Danio rerio), the medaka (Oryzias
latipes) and the pufferfishes Takifugu rubripes and Tetraodon
nigroviridis (http://www.ensembl.org/index.html). In contrast,
large scale genetic resources for farmed fish species such as
common carp (Cyprinus carpio), Atlantic salmon (Salmo salar),
rainbow trout (Oncorhynchus mykiss) and tilapia (Oreochromis
niloticus) are restricted to expressed sequence tags (ESTs), the
product of high-throughput, single-pass cDNA sequence
analyses (Cossins and Crawford, 2005).
The Atlantic halibut Hippoglossus hippoglossus (order
Pleuronectiformes) is a valuable flatfish with an established
market that is starting to be farmed commercially in Canada,
Norway and Scotland (Bergh et al., 2001). A major bottleneck in
halibut farming is the production of juveniles for on-growing.
J. Bai et al. / Gene 386 (2007) 202–210
Significant problems include high embryonic and larval mortality
and the prevalence of body deformities (Kjørsvik et al., 1990;
Bromage et al., 1992). Paternity analysis using DNA microsatellite markers indicates that surviving juveniles typically come
from a small percentage of the total broodstock population,
indicating poor egg quality following artificial fertilization
(Jackson et al., 2003). Abnormal patterns of blastomere cleavage
are common and have been linked to low hatching success
(Kjørsvik et al., 1990; Shields et al., 1997). Maternal mRNA
transcripts and/or proteins have critical roles in directing early
developmental processes in teleosts (Kane and Kimmel, 1993). In
zebrafish, zygotic transcription only begins at the mid-blastula
transition (512-cell stage) (Kane and Kimmel, 1993). In contrast,
zygotic gene activation starts at the 1-cell stage in mouse embryos
and is clearly evident by the 2-cell stage (Aoki et al., 1997; Zeng
et al., 2004). During the maternal-to-zygotic transition, oocyte-specific transcripts are degraded and maternal transcripts common
to oocyte and early embryo are replaced with zygotic transcripts
(Paynton et al., 1988; Davis et al., 1996; Zeng et al., 2004).
Maternal mRNA transcripts and/or proteins are thought to control
cell cleavage patterns, the establishment of body axes and
specification of early embryonic cells prior to the mid-blastula
phase (Nishikata et al., 2001; Yamada et al., 2005; Zeng et al.,
2004; Zeng and Schultz, 2005). Post-ovulatory ageing in rainbow
trout oocytes altered the abundance of mRNA transcripts,
including insulin-like growth factor-I receptor, and was associated
with developmental abnormalities in the larval stages (Aegerter
et al., 2004).
Studies on developmental mechanisms in Atlantic halibut are
hindered by a lack of molecular markers. All published EST
datasets from Atlantic halibut are from the adult stage of the life
cycle (e.g. Park et al., 2005) with a total of 8400 sequences
deposited in the EMBL nucleotide database. There have been
no transcriptomic analyses concerned with skeletal muscle and/or
early developmental stages and nothing is known about the
composition of maternal mRNAs. The first aim of the present
study was therefore to characterize cDNA libraries from 2-cell
embryos, the trunk of yolk-sac larvae and the fast myotomal
muscle of juveniles as a resource for future studies on normal
and abnormal development.
The two pufferfish genomes sequenced to draft level belong
to the same clade as the Atlantic halibut, the Acanthomorpha
(ray-finned fishes with true spines in their anal and dorsal fins)
(Nelson, 1994). The Acanthomorpha represent nearly 60% of
extant fish diversity with more than 15,300 species and 314
families (Nelson, 1988). Many of the nodes on the
Acanthomorpha tree remain unresolved or poorly defined
(Dettai and Lecointre, 2005). The second aim of the present
study was to establish the extent to which genomic resources
from model species, particularly pufferfishes, could be useful
for studies with Atlantic halibut. We therefore investigated
patterns of phylogenetic affinity between halibut EST
sequences and the proteome databases available from pufferfish, zebrafish and human using the Java/Perl-based application SimiTri (Parkinson and Blaxter, 2003), which allowed
simultaneous display and analysis of relative similarity
relationships.
2. Materials and methods
2.1. Animals and sample collection
2-cell embryos and 1 day-old yolk-sac larvae (1 day after
hatching) were obtained from brood stock of mature Atlantic
halibut kept at Halibut Research Station, Bodø University
College, Bodø, Norway. A single batch of Atlantic halibut eggs
obtained by manual stripping was fertilized with milt from one
male. All developmental stages were reared using sea water
maintained at a salinity of 33–35‰ and treated with ozone. Eggs
were incubated in a 280 l tank at a temperature of 5.2–5.4 °C and
in darkness until day 7 after fertilization, at which
time epiboly had been completed. 2-cell embryos were collected at approximately 10 hours
post-fertilization (hpf). Yolk-sac larvae (1 day after
hatching) were obtained on day 15 after
fertilization. Samples were kept in RNAlater (Ambion, Cambridgeshire, UK), packaged with dry ice and transported to the
Gatty Marine Laboratory (University of St Andrews, UK) where
they were stored at −70 °C. The head, yolk-sac and tail of 1 day-old yolk-sac larvae were removed prior to total RNA extraction.
Juvenile Atlantic halibut were obtained from Marine Harvest
(Scotland) Ltd (UK) and maintained at the Gatty Marine
Laboratory in sea water (salinity 34‰) at 10 °C and under a 12 h
dark:12 h light photoperiod. A juvenile halibut
(∼0.42 kg) was humanely killed by overanaesthesia in a solution of 0.2 mM 3-aminobenzoic acid ethyl ester (Sigma, Dorset,
UK) buffered with sodium bicarbonate (Sigma). Fast muscle
was dissected from the dorsal epaxial myotome and either snap-frozen in liquid nitrogen or stored in RNAlater for subsequent
nucleic acid extraction.
2.2. cDNA library construction
One hundred milligrams of each sample were added to
FastRNA ProGreen beads (Qbiogene) containing 1 ml of TRI
reagent (Sigma). The tissues were homogenized using the
FastPrep Instrument (Qbiogene) for 40 s at the speed setting of
6.0. Total RNA was isolated from each sample according to the
manufacturer's instructions. Potential contaminating DNA was
removed from the RNA preparation using TURBO DNA-free
(Ambion). Poly (A+) RNA was purified using the Absolutely
mRNA™ purification kit (Stratagene) and quantified with the
fluorescent nucleic acid stain RiboGreen (Molecular Probes)
following the recommended protocol so as to obtain ∼ 5 μg poly
(A+) RNA. cDNA libraries were generated using the pBlueScript II XR cDNA library construction kit (Stratagene) according to the manufacturer's instructions. Primary cDNA
libraries were amplified once before colonies were picked for
sequencing.
2.3. cDNA clone selection and insert sequencing
The amplified plasmid cDNA libraries were plated and individual clones were randomly picked following blue-white clone
selection with isopropyl β-D-thiogalactopyranoside (20 mM) and
5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (0.2 mM).
Insert checks were carried out by PCR with T3 short (5′-ATTAACCCTCACTAAAG-3′) and T7 short (5′-AATACGACTCACTATTAG-3′) primers. PCR involved an initial
denaturation step at 95 °C for 5 min, followed by 35
amplification cycles: 94 °C for 30 s, 53 °C for 45 s and 72 °C
for 2 min. A final extension at 72 °C for 10 min was used. PCR
products were cleaned up with shrimp alkaline phosphatase
(SAP)/Exonuclease I (Exo I) (Amersham). 5′ end sequencing
PCR reactions with T3 primer (5′-ATTAACCCTCACTAAAGGGAA-3′) were performed using the ABI prism Big
Dye (v3.1) Terminator Cycle Sequencing Ready Reaction Kit
(PE Applied Biosystems, USA). The sequencing reaction
comprised an initial denaturation step at 96 °C for 1 min, and
25 cycles at 96 °C for 10 s, 50 °C for 5 s and 60 °C for 4 min.
DNA pellets were sent for sequencing at the John Innes Centre
(JIC) with an ABI 3770 DNA sequencer and at the Oxford
DNA sequencing facility with an ABI 3700 capillary DNA
sequencer (PE Applied Biosystems, USA).
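As a rough planning aid, the programmed hold times of the two thermal profiles described above can be added up. The following is a minimal sketch: the helper name `program_seconds` is hypothetical, and ramping time between temperatures is ignored, so real run times will be longer.

```python
def program_seconds(initial_s, cycles, step_seconds, final_s=0):
    """Sum of programmed hold times in seconds (ramping between steps is ignored)."""
    return initial_s + cycles * sum(step_seconds) + final_s

# Insert-check PCR: 95 °C for 5 min; 35 cycles of 30 s, 45 s and 2 min; 72 °C for 10 min.
insert_check_s = program_seconds(5 * 60, 35, [30, 45, 120], final_s=10 * 60)

# Sequencing reaction: 96 °C for 1 min; 25 cycles of 10 s, 5 s and 4 min.
sequencing_s = program_seconds(60, 25, [10, 5, 240])

print(f"insert check: {insert_check_s / 60:.1f} min")
print(f"sequencing:   {sequencing_s / 60:.1f} min")
```

Under these assumptions the insert-check programme holds for about 129 min and the sequencing programme for about 107 min.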
2.4. Sequence processing and bioinformatics analysis
Raw trace data were processed with the EST analysis pipeline
developed by the Natural Environment Research Council and
Environmental Genomics Thematic Programme Data Centre
(NERC–EGTDC, University of Edinburgh). The electropherograms were sequentially analyzed by trace2dbEST
(http://envgen.nox.ac.uk/est.html), Partigene (Parkinson et al., 2004),
Prot4EST (Wasmuth and Blaxter, 2004), and annot8r (available
from http://www.nematodes.org/PartiGene), as described by
Fernandes et al. (2005). For cross-taxon comparison, the 4249
EST sequences generated in the present study were used to
construct a partial transcriptome database with Partigene. The
resulting putative genes of Atlantic halibut were subjected to
BLASTX similarity searches (Altschul et al., 1997) against the
zebrafish, pufferfishes and human predicted proteomes (downloadable from http://www.ensembl.org). Relative similarity
relationships between these four data sets were visualized as a
triangular plot generated by the SimiTri software (Parkinson and
Blaxter, 2003).
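The idea behind a triangular similarity plot can be illustrated with a short sketch. The code below is not SimiTri itself: it assumes that each sequence is placed at the score-weighted average of the three triangle vertices (barycentric coordinates), that scores below a cutoff are discarded, and that sequences matching fewer than two databases are not plotted. The vertex layout and the function name `triangle_position` are hypothetical.

```python
import math

# Hypothetical vertex layout for the three reference proteomes
# (an assumption, not SimiTri's actual coordinate system).
VERTICES = {
    "H. sapiens": (0.0, 0.0),
    "D. rerio": (1.0, 0.0),
    "T. nigroviridis": (0.5, math.sqrt(3) / 2),
}

def triangle_position(scores, cutoff=50.0):
    """Return (x, y) for one sequence, or None if it matches fewer than two databases.

    scores maps a database name to the raw BLASTX score of the best hit.
    """
    kept = {db: s for db, s in scores.items() if s >= cutoff}
    if len(kept) < 2:
        return None  # similarity to a single database is not represented
    total = sum(kept.values())
    x = sum(VERTICES[db][0] * s for db, s in kept.items()) / total
    y = sum(VERTICES[db][1] * s for db, s in kept.items()) / total
    return (x, y)

# Equal similarity to all three proteomes lands at the centre of the triangle.
print(triangle_position({"H. sapiens": 100, "D. rerio": 100, "T. nigroviridis": 100}))
# A sequence with no human hit above the cutoff lies on the fish-fish edge.
print(triangle_position({"H. sapiens": 40, "D. rerio": 200, "T. nigroviridis": 200}))
```

With this weighting, a sequence scoring above the cutoff against only two databases falls on the edge joining them, and a higher score against one proteome pulls the point towards that vertex.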
3. Results and discussion
3.1. Summary of ESTs from the cDNA libraries in the 2-cell
embryo, the trunk of yolk-sac larvae and the fast skeletal
muscle of juvenile Atlantic halibut
The amplified libraries contained at least 6.9 × 10⁸, 1.1 × 10⁹
and 7.6 × 10⁸ total transformants in the 2-cell embryo, yolk-sac
larvae (trunk) and juvenile (fast skeletal muscle) library,
respectively, and insert sizes ranged from approximately 0.5
to 3.0 kb. Single-pass sequencing was performed on a total of
6610 randomly picked clones, including 1783 clones from the
2-cell library, 2112 clones from the larval trunk library and
2715 clones from the fast muscle library. After removing the
vector sequences and adaptors, only ESTs longer than 150 bp
were chosen for further analysis. As a result, 4249 high quality
sequences were obtained and submitted to the EST database
dbEST (http://www.ncbi.nlm.nih.gov/dbEST): 1419 sequences
from the 2-cell library (GenBank accession numbers:
DT805529–DT806308), 1300 sequences from the larval
trunk library (GenBank accession numbers: DN794246–
DN794954, DT806677–DT806820 and EB102920–
EB103372) and 1530 sequences from the fast muscle library
(GenBank accession numbers: DN792583–DN793123,
DT806309–DT806676). The proportion of known gene
sequences containing a start codon was determined by comparing
their putative translation products with the corresponding
BLASTX hits against the NR (NCBI) and UniProt (EBI)
databases. Putative peptides containing a start methionine were
considered to correspond to ESTs with complete 5′ coding
sequences. On this basis, the percentage of full-length cDNA
clones in the 2-cell, larval trunk and juvenile fast muscle libraries
was 47%, 56% and 60%, respectively.
Fig. 1. Diagram of the number of clusters isolated from three cDNA libraries and
the overlap among them. Putative gene clusters were obtained from cDNA
libraries of 2-cell embryo, yolk-sac larva (trunk) and juvenile (fast myotomal
muscle) Atlantic halibut (Hippoglossus hippoglossus).
Table 1
Most abundant 20 EST clones of maternal mRNA transcripts from the 2-cell
embryo cDNA library from Atlantic halibut (Hippoglossus hippoglossus)
Gene                                        No. of sequences   Cluster size   Transcript abundance (%)
Two-cell embryo (1419 sequences, 798 clusters)
Mitochondrial genes
  Cytochrome c oxidase subunit II           132                8              9.30
  Cytochrome c oxidase subunit III          108                5              7.61
  Cytochrome b                              104                8              7.33
  Cytochrome c oxidase subunit I            89                 7              6.27
  NADH dehydrogenase subunits (1–4)         27                 7              1.90
  Mitochondrial hypothetical 18K protein    26                 1              1.83
  ATP synthase subunit 6                    19                 3              1.34
Nuclear genes
  Novel Tetraodon protein                   13                 1              0.92
  Histone (H1M, H2A and H3)                 9                  3              0.63
  rRNA external transcribed spacer          6                  1              0.42
  Splicing factor                           6                  1              0.42
  Zinc finger protein                       6                  3              0.42
  Senescence-associated protein             5                  3              0.35
  Thymosin beta-4                           5                  1              0.35
  Cyclin B1                                 4                  2              0.28
  Nucleoside diphosphate kinase B           4                  2              0.28
  Novel Tetraodon protein                   4                  1              0.28
  Novel Tetraodon protein                   3                  1              0.21
  Cathepsin L                               3                  1              0.21
  Transcription elongation factor B         3                  3              0.21
A total of 2124 clusters (putative genes) were generated by
Partigene: 798 clusters from the 2-cell library, 719 clusters from
the larval trunk library and 607 clusters from the fast muscle
library (Fig. 1). About 120 (15.0%), 168 (23.4%) and 177 (29.1%)
of the clusters were represented by multiple ESTs in the 2-cell, larval
trunk and fast muscle libraries, respectively, with corresponding
maximum cluster sizes of 70, 35 and 89 EST sequences. In some
cases the computational assembly of clusters was too conservative, generating distinct clusters that corresponded to the same
gene. Such clusters were identified and grouped into super-clusters. A complete list of the partial transcriptome databases
corresponding to all three libraries generated in this study is
available online at the Fish Muscle Research Group web site:
http://www.st-andrews.ac.uk/∼fmrg. The overall redundancy
(defined as the ratio between the total number of sequences and
the number of clusters) was 1.78, 1.81 and 2.52 for the 2-cell,
larval trunk and fast muscle libraries, respectively. In fact, more
than 70% of the putative genes from each library were singletons
represented by only one EST. These low redundancy values
suggest that our isolation of ESTs may not be saturated and that
most clusters represent rare mRNAs. Sixty-seven clusters were
common to all three libraries. When pairwise comparisons between libraries were carried out, 110 super-clusters were found to
be present in both 2-cell and larval trunk, while 95 and 192 super-clusters were common between 2-cell and fast muscle, and fast
muscle and larval trunk, respectively. The majority of EST super-clusters in each library (2-cell (82.7%), larval trunk (67.3%) and
fast muscle (63.8%)) did not show any overlap. The number of
clusters without significant hits (cutoff of 1e−8) against the 8400
adult halibut sequences available from EMBL was 579 (72.6%)
for the 2-cell embryo, 424 (59%) for the larval trunk and 347 (57.1%)
for the juvenile library. The full list of sequences unique to each developmental stage studied is shown as HTML tables available online at
http://www.st-and.ac.uk/∼fmrg/bai.html. These results indicate
that each developmental stage has a significant number of unique
clusters, probably reflecting differential gene expression during
development.
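The redundancy values quoted above follow directly from the sequence and cluster counts reported for each library. A minimal sketch of the calculation (the dictionary and function names are illustrative only):

```python
# (number of high quality ESTs, number of clusters) per library, as reported in the text
LIBRARIES = {
    "2-cell": (1419, 798),
    "larval trunk": (1300, 719),
    "fast muscle": (1530, 607),
}

def redundancy(n_sequences, n_clusters):
    """Ratio between the total number of sequences and the number of clusters."""
    return n_sequences / n_clusters

for name, (n_seq, n_clu) in LIBRARIES.items():
    print(f"{name}: {redundancy(n_seq, n_clu):.2f}")
```

A value close to 1 means that almost every EST founds its own cluster, which is why low redundancy is read here as a sign that sampling of the libraries is far from saturated.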
3.2. Distribution of sequences at each Atlantic halibut
developmental stage
Cluster sizes in a non-normalised EST library indicate the
relative abundance of the corresponding transcripts. Tables 1, 2,
and 3 show the top 20 EST clones in the three developmental
stages studied. Transcript abundance is defined as the ratio
between the number of EST sequences of putative genes and the
total number of ESTs. The highest abundance of mitochondrial
sequences was found in the 2-cell library (35.6%) compared with
8.9% for the larval trunk library and 26% for the juvenile fast
muscle library. This may simply reflect the different composition
of tissues present in the samples from which the RNA was
extracted e.g. whole organism versus specific tissue(s). Comparative studies with Xenopus (El Meziane et al., 1989) and
mouse (Pikó and Taylor, 1987) indicate that mitochondrial transcripts
are abundant in the oocyte but decrease dramatically shortly after
fertilization, only increasing again after the transition to zygotic
transcription at the 8-cell stage and mid-blastula transition,
respectively. The most abundant nuclear transcripts in the 2-cell
library included a short peptide without conserved domains also
present in the T. nigroviridis genome (GSTENG00007427001), a
linker histone-like protein (H1M), thymosin β-4 (TMSB4X) and
cyclin-dependent kinase regulatory subunit cyclin B1 (CCNB1)
(Table 1). The histone-like protein was similar to zebrafish H1M,
a maternally transmitted protein that is involved in the specification of primordial germ cells and is expressed during blastula
stages (Muller et al., 2002). Cyclin B1 has a key regulatory role in
the G2–M transition of the cell cycle (Kong et al., 2000) and
thymosin β-4 is thought to control the rate and extent of actin
polymerization in cells (Carlier et al., 1993). The largest gene
clusters encoded structural proteins or metabolic enzymes such as
mitochondrial genes for cytochrome oxidase, cytochrome b and
ATPase subunit 6, α-actin and nucleoside diphosphate kinase
(Tables 1, 2, and 3). Several of the most abundant transcripts in the
larval and juvenile libraries corresponded to genes that exhibit a
skeletal muscle-specific or skeletal muscle-predominant pattern
of expression, including myosin light chain, myosin heavy chain,
tropomyosin, troponin and muscle-specific creatine kinase
(Table 2). The libraries contained a range of potential markers
of developmental processes. The sequence information reported
here should provide a useful resource for future studies of normal
and abnormal development in this species. For example, full-length cDNAs have now been characterized for the hairy-related 4
gene (accession number DQ885478), a basic helix-loop-helix
transcription factor involved in cell fate specification and
particularly in neurogenesis (Fisher and Caudy, 1998) (from the
2-cell embryo library), osteonectin (DQ912662), a glycoprotein
involved in bone formation and mineralization (Estêvão et al.,
2005) (from the larval trunk library), and the proteolytic enzyme
cathepsin D (DQ912663) (from the juvenile fast muscle library).
Table 2
Most abundant 20 EST clones of zygotically transcribed mRNA transcripts from
yolk-sac larval (trunk) cDNA library from Atlantic halibut (Hippoglossus
hippoglossus)
Gene                                          No. of sequences   Cluster size   Transcript abundance (%)
Yolk-sac larval (trunk) (1300 sequences, 719 clusters)
Mitochondrial genes
  Cytochrome c oxidase subunit I              47                 2              3.62
  ATP synthase subunit 6                      28                 7              2.15
  Cytochrome c oxidase subunit II             19                 2              1.46
  Cytochrome b                                16                 1              1.13
  NADH dehydrogenase subunits (1, 2 and 4)    7                  5              0.54
Nuclear genes
  Ribosomal proteins a                        170                40             13.08
  Cardiac muscle alpha-actin                  43                 1              3.30
  Myosin light chain (1, 2 and 3)             32                 5              2.46
  Creatine kinase muscle isoform 1            32                 2              2.46
  Novel Tetraodon protein                     16                 1              1.23
  Myosin heavy chain                          13                 4              1.00
  Novel Tetraodon protein                     11                 1              0.85
  Aldolase A fructose-bisphosphate            10                 1              0.77
  Parvalbumin beta                            9                  1              0.69
  Skeletal fast troponin T                    9                  2              0.69
  Nucleoside diphosphate kinase B             9                  1              0.69
  Novel Tetraodon protein                     8                  1              0.62
  Adenine nucleotide translocator s6          6                  1              0.46
  Glyceraldehyde-3-phosphate dehydrogenase    6                  1              0.46
  Skeletal alpha actin                        5                  2              0.38
a Although all ribosomal proteins were grouped into one cluster in this study,
the majority of ribosomal proteins were represented by single copy genes.
Table 3
Most abundant 20 EST clones of zygotically transcribed mRNA transcripts from
the juvenile (fast skeletal muscle) cDNA library from Atlantic halibut
(Hippoglossus hippoglossus)
Gene                                          No. of sequences   Cluster size   Transcript abundance (%)
Juvenile (fast muscle) (1530 sequences, 607 clusters)
Mitochondrial genes
  Cytochrome c oxidase subunit II             161                11             10.52
  Cytochrome c oxidase subunit III            72                 3              4.71
  NADH dehydrogenase subunits (1–6)           47                 11             3.07
  ATPase subunit 6                            45                 8              3.14
  Cytochrome b                                34                 3              2.22
  Cytochrome c oxidase subunit I              24                 4              1.57
  Cytochrome c oxidase subunit IV             11                 2              0.72
Nuclear genes
  Ribosomal proteins a                        123                46             8.04
  Slow troponin T2                            21                 2              1.37
  Glyceraldehyde-3-phosphate dehydrogenase    19                 3              1.24
  Myosin heavy chain 2                        16                 3              1.05
  Novel Tetraodon protein                     15                 1              0.98
  Cytoplasmic actin A3a2                      14                 1              0.92
  Thymosin beta                               12                 1              0.78
  Tropomyosin (2 and 4)                       12                 3              0.78
  Novel Tetraodon protein                     11                 1              0.72
  Eukaryotic translation factor               8                  6              0.52
  Troponin I, cardiac muscle                  8                  1              0.52
  Adenine nucleotide translocator             8                  1              0.52
  Alpha actin, smooth muscle                  6                  2              0.39
a Although all ribosomal proteins were grouped into one cluster in this study,
the majority of ribosomal proteins were represented by single copy genes.
3.3. Gene identification based on Prot4EST
As shown in Fig. 2, only a small proportion of EST clusters in
each library corresponded to rRNA genes (2-cell (3.5%), larval
trunk (1.4%) and fast muscle (1.6%)). The frequency of
mitochondrial clusters was 4.1%, 1.9% and 7.9% in the 2-cell,
larval trunk and fast muscle libraries, respectively. After eliminating rRNA and mitochondrial genes, 209, 337, and 305 of the
clusters from the 2-cell, larval trunk and fast muscle libraries,
respectively, had significant similarity matches to the nr database
(herein defined as “known genes”). Of the remaining clusters,
442 (55.4%), 307 (42.7%) and 185 (30.5%) from 2-cell, larval
trunk and fast muscle libraries, respectively, could be translated
with ESTScan and DECODER. Hence, these are likely to
represent novel putative polypeptides, hereafter referred to as
“unknown genes” (Fig. 2A, B and C). Combining the data from
the three libraries there were 728 known genes (38.3%), 918
unknown genes (48.3%), and 159 clusters (7.9%) that could not
be translated with ESTScan and DECODER and probably
correspond to untranslated regions or pseudogenes.
Fig. 2. Pie chart of gene classification of 2-cell embryo (A), yolk-sac larva
(trunk) (B), juvenile (fast skeletal muscle) (C) and combined cDNA libraries
(D), based on Prot4EST analyses. “Known genes” comprise ESTs that had
significant matches to entries from the non-redundant protein database. The
clusters termed “unknown genes” consist of novel, putative peptide sequences
that have been predicted by ESTScan or DECODER. The category “others”
refers to ESTs that were not predicted to code for functional peptides and include
potential pseudogenes and untranslated regions.
Fig. 3. Gene classification of 2-cell embryo, yolk-sac larval (trunk) and juvenile (fast skeletal muscle) Atlantic halibut based on Gene Ontology. A. Cellular
component. B. Molecular function. C. Biological process.
3.4. Gene annotation based on Gene Ontology
To investigate the functional profile of genes expressed in
Atlantic halibut, the putative translation products of the clusters
from three libraries were grouped into different categories
according to the GO slim terms (Ashburner et al., 2001; Harris
et al., 2004), as shown in Fig. 3. A comprehensive list of EST
clusters with the corresponding GO annotations can be found at
http://www.st-andrews.ac.uk/∼fmrg/bai.html. Clusters were assigned to
4 main cellular component categories (cell, extracellular, intracellular and cellular component unknown),
to 10 main molecular function categories (including motor, transcription regulator, signal transducer, enzyme regulator, catalytic, binding,
structural molecule, transporter and molecular function unknown) and to 11 main biological process categories (including electron transport, response to
stimulus, nucleic acid derivative metabolism, transport, cell
communication, cellular process and biological process unknown)
(Fig. 3). A large proportion of
clusters from the 2-cell library did not have an associated GO slim
term and were referred to as “unclassified”: 67.7% in molecular
function, 77.4% in cellular components and 74.3% in the
biological process category. In the yolk-sac larval library, the
unclassified rate was 44.4% in molecular function, 68.0% in
cellular component and 65.9% in biological process, respectively,
while the unclassified clusters accounted for 32.5%, 51.4% and
53.1% respectively in the juvenile library. The majority of these
clusters represent hypothetical proteins with unknown function
and cellular location. There is a striking difference in the number
of unclassified proteins from the three libraries: earlier developmental stages correlated with a larger proportion of proteins with
unidentified function. Most proteins in all three libraries were
intracellular or located on the plasma membrane (Fig. 3A). In the
molecular function categories, over 70% of the putative proteins
with a GO slim classification were related to catalytic, binding,
structural molecule, nucleic acid binding, transporter and motor
activities (Fig. 3B). As far as their biological process is concerned,
proteins in subcategories of physiological process, electron
transport, nucleotide and nucleic acid metabolism as well as
transport were preponderant in all three libraries (Fig. 3C).
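The trend in unclassified clusters across developmental stages can be checked from the percentages given above. The sketch below simply tabulates those figures; it assumes that the three juvenile values are listed in the same order as for the other two libraries (molecular function, cellular component, biological process), and the names used are illustrative.

```python
STAGES = ["2-cell", "yolk-sac larva", "juvenile"]  # earliest to latest

# Percentage of clusters without a GO slim term, per GO category and stage.
# The order of the juvenile values is an assumption (see lead-in).
UNCLASSIFIED = {
    "molecular function": [67.7, 44.4, 32.5],
    "cellular component": [77.4, 68.0, 51.4],
    "biological process": [74.3, 65.9, 53.1],
}

def decreases_with_stage(values):
    """True if each later stage has a smaller unclassified percentage than the one before."""
    return all(earlier > later for earlier, later in zip(values, values[1:]))

for category, values in UNCLASSIFIED.items():
    summary = ", ".join(f"{s}: {v}%" for s, v in zip(STAGES, values))
    print(f"{category}: {summary} -> decreasing: {decreases_with_stage(values)}")
```

In all three GO categories the unclassified fraction falls from the 2-cell embryo to the juvenile, which is the pattern described in the text.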
for this large difference in order to synthesize the same level of
ribosomal proteins. Similar profiles of gene expression were
observed for ribosomal proteins in catfish (Ictalurus punctatus)
brain (Ju et al., 2000). In the halibut 2-cell library, there were only
two ribosomal protein genes (40S ribosomal protein S27 and 60S
ribosomal protein L32) represented by a single clone. It has been
shown that each Xenopus laevis oocyte accumulates an equivalent
amount of ribosomes as thousands of somatic cells (reviewed by
Brown, 2004). This mechanism of specific rDNA amplification
has not been described in teleosts, but our data indicate that it
might be occurring in Atlantic halibut oocytes.
3.6. Comparison of Atlantic halibut proteins with zebrafish,
pufferfish and human proteomes
3.5. Differential expression of ribosomal protein genes in the
yolk-sac larva and juvenile stages
Fig. 4 shows the SimiTri representation of predicted Atlantic
halibut genes compared to zebrafish, human and green-spotted
pufferfish proteomes. Of the 1899 putative H. hippoglossus
genes, only 753, 685, 726 and 717 genes had significant BLAST
hits against the H. sapiens, D. rerio, T. nigroviridis and
T. rubripes protein databases, respectively. The majority of
Each ribosome contains some 50 distinct proteins that must be
made at the exactly the same rate (Nomura et al., 1984; Ju et al.,
2000). As each of the ribosomal proteins is required in the
formation of ribosomes, correct expression of ribosomal protein
genes plays an important regulatory role in ribosome assembly.
However, it is known that the primary control of ribosomal
protein synthesis is at the translation level, not on mRNA
synthesis (Nomura et al., 1984). In this study, ESTs of 40 (170
clones) and 41 (161 clones) ribosomal protein genes were
identified in larval trunk and juvenile fast muscle libraries. Of
these, 22 were for large and 18 for small ribosome subunits in the
larval trunk library, and 22 for large and 19 for small ribosomal
subunits in the fast muscle library, respectively. Although
ribosomal proteins are proportionally required for the assembly
of ribosomes, large differences were observed in the relative
abundance of ribosomal proteins ESTs. In the larval trunk library,
the most abundant ribosomal protein gene products were L23 (65
clones), followed by L6 (9 clones), S19 (8 clones), L1, L35 and
S19 (each with 5 clones), L8, L17, L36a, S20 and S24 (each with
4 clones). Three clones were identified for L14, L27a, L28, L29,
L38, S4 and S6. The eight ribosomal genes were sequenced with
two clones. Only one clone was sequenced for each of the
remaining 12 ribosomal genes. In the fast muscle library, the most
abundant ribosomal protein genes were S29 (11 clones), followed
by S14 (10 clones), S19 and S17 (both with 7 clones), S26 and
L30 (both with 6 clones), and P1, P2, L35, S18 and S30 (all with
five clones each), S15 (4 clones), L10a, L18, L27a, L28, L38,
S11, S20 and S24 (all with 3 clones), L30 (4 clones), S11, S30,
L10a, and L35a (each with three clones). Two clones were
sequenced for L6, L17, L23, L29 and S7. The remaining 14 genes
corresponded to only one clone each. The expression profile of
ribosomal proteins indicated a difference of up to 63-fold in their
relative mRNA abundance in halibut larvae. Such large difference
suggests that for several ribosomal protein genes such as L23 in
larvae and S29 in juveniles, translational control has to account
Fig. 4. Similarity of Hippoglossus hippoglossus ESTs to the proteomes of Homo
sapiens, Danio rerio and Tetraodon nigroviridis. SimiTri plot showing sequence
similarity relationships between H. hippoglossus putative peptides and
H. sapiens (142,587 entries, 753 hits), D. rerio (42,184 entries, 685 hits) and
T. nigroviridis (28,412 entries, 726 hits) proteomes. For each of the 1899
Atlantic halibut peptides, a BLASTX search was performed against the
proteomes of the other 3 species. Each tile in the graphic represents a unique
consensus sequence and its relative position is computed from the raw BLASTX
scores derived above (with a cutoff of 50). Hence, each tile's position indicates
its degree of sequence similarity to each of the three selected databases. The
cluster outlined with a circle (HHC01291) is more similar to its human
orthologue than to its counterpart in the other two fish species. Sequences
showing similarity to a single database are not represented. Sequences with
similarity to only two databases appear on the lines joining the two databases.
Tiles are colored by their highest BLASTX score to each of the databases: red
≥300; yellow ≥200; green ≥150; blue ≥100 and purple <100.
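The legend above describes how each tile is positioned and colored from its three BLASTX scores. The following is a minimal Python sketch of that idea, not the SimiTri implementation itself: it assumes that a tile's position is the score-weighted average of the three triangle corners, that the cutoff of 50 is inclusive, and that the species are assigned to the corners as shown. The names `place_tile`, `tile_color`, `VERTICES` and `COLOR_BINS` are hypothetical. The cutoff and the color thresholds are taken from the legend.

```python
from math import sqrt

CUTOFF = 50  # raw BLASTX score cutoff quoted in the legend (assumed inclusive)

# Corners of an equilateral triangle. Which species sits at which
# corner is an assumption made for this illustration.
VERTICES = {
    "H. sapiens": (0.0, 0.0),
    "D. rerio": (1.0, 0.0),
    "T. nigroviridis": (0.5, sqrt(3) / 2),
}

# Color thresholds from the legend, checked from highest to lowest.
COLOR_BINS = [(300, "red"), (200, "yellow"), (150, "green"), (100, "blue")]


def place_tile(scores):
    """Return (x, y) for one consensus sequence, or None if it is not drawn.

    `scores` maps a database name to the raw BLASTX score against it.
    Assumption: the position is the score-weighted mean of the corners.
    """
    kept = {db: s for db, s in scores.items() if s >= CUTOFF}
    if len(kept) < 2:
        return None  # similarity to a single database is not represented
    total = sum(kept.values())
    x = sum(s / total * VERTICES[db][0] for db, s in kept.items())
    y = sum(s / total * VERTICES[db][1] for db, s in kept.items())
    return (x, y)  # with exactly two databases this lies on their shared edge


def tile_color(scores):
    """Color a tile by its highest BLASTX score across the databases."""
    best = max(scores.values())
    for threshold, color in COLOR_BINS:
        if best >= threshold:
            return color
    return "purple"  # highest score below 100


if __name__ == "__main__":
    # Illustrative scores only, not values from the halibut dataset.
    example = {"H. sapiens": 120, "D. rerio": 260, "T. nigroviridis": 310}
    print(place_tile(example), tile_color(example))
```

Under this weighting, a sequence with equal scores against all three proteomes falls at the centre of the triangle, and a sequence scoring above the cutoff against only two proteomes falls on the edge joining them, which matches the behaviour described in the legend.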
putative proteins were located in the centre with a bias towards the
top and right sections of the triangle showing that, as expected,
H. hippoglossus is more closely related to T. nigroviridis and
D. rerio than to H. sapiens (Fig. 4). Similarly, H. hippoglossus is
also more closely related to T. rubripes and D. rerio than to
H. sapiens (Supplementary Fig. S1). For further details
please consult the interactive applets available online at
http://www.st-andrews.ac.uk/~fmrg/bai.html. Interestingly, the Atlantic
halibut cytochrome c oxidase subunit 3 (Cox3), which is
involved in mitochondrial electron transport, is more similar to
human COX3 than to the zebrafish and green-spotted pufferfish
Cox3, as shown in Fig. 4. Sequences located on the edge joining
two databases did not have significant matches against the
database indicated on the opposite apex of the triangle. For
example, the clusters aligned along the edge joining the
T. nigroviridis and D. rerio proteomes might correspond to genes
that have been lost from the human genome during evolution. The relatively equal
distribution of proteins between the vertices that correspond to
zebrafish, green-spotted pufferfish (Fig. 4) and T. rubripes
(Supplementary Fig. S1) proteomes indicated that sequence
variation within the clade Acanthomorpha is comparable to that
between the Acanthomorpha and zebrafish, which is in the clade
Ostariophysi. Our results emphasize the importance of EST
analysis for studies of developmental processes in non-model
species.
3.7. Conclusions
• The 4249 high quality ESTs prepared from 2-cell embryo,
larval (trunk) and juvenile (fast muscle) of Atlantic halibut
were clustered into a partial transcriptome of 2124 putative
genes, 48.3% of which corresponded to unknown proteins.
• The maternal mRNA in the 2-cell embryos contained a
higher proportion of mitochondrial transcripts (35.6%) than
the other libraries, but only had two ESTs for a ribosomal
protein, which may reflect the relatively high abundance of
pre-formed ribosomes in the yolk.
• A global analysis of protein similarity revealed significant
differences between Atlantic halibut and other members of
the clade Acanthomorpha (T. nigroviridis and T. rubripes).
Acknowledgements
We would like to thank Professor Igor Babiak of the
Department of Fisheries and Natural Sciences, Bodø Regional
University, Norway for collecting samples of 2-cell embryo and
1 day-old yolk-sac larvae of Atlantic halibut. We are grateful to
Marine Harvest (Scotland) Ltd for providing the juvenile
halibut. This work was funded by the Norwegian Research
Council (Grant No. NFR159594/S40) with additional funding
for consumables from the MARBIT program (Grant No.
AF0024).
Appendix A. Supplementary data
Supplementary data associated with this article can be found,
in the online version, at doi:10.1016/j.gene.2006.09.012.
References
Aegerter, S., Jalabert, B., Bobe, J., 2004. Messenger RNA stockpile of cyclin B,
insulin-like growth factor I, insulin-like growth factor II, insulin-like growth
factor receptor Ib and p53 in the rainbow trout oocyte in relation to
developmental competence. Mol. Reprod. Dev. 67, 127–135.
Altschul, S.F., et al., 1997. Gapped BLAST and PSI-BLAST: a new generation
of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Aoki, F., Worrad, D.M., Schultz, R.M., 1997. Regulation of transcriptional
activity during the first and second cell cycles in the preimplantation mouse
embryo. Dev. Biol. 181, 296–307.
Ashburner, M., et al., 2001. Creating the gene ontology resource: design and
implementation. Genome Res. 11, 1425–1443.
Bergh, O., Nilsen, F., Samuelsen, O.B., 2001. Diseases, prophylaxis and
treatment of the Atlantic halibut Hippoglossus hippoglossus: a review. Dis.
Aquat. Org. 48, 57–74.
Bromage, N., et al., 1992. Broodstock management, fecundity, egg quality and
the timing of egg production in the rainbow trout (Oncorhynchus mykiss).
Aquaculture 100, 141–166.
Brown, D.D., 2004. A tribute to the Xenopus laevis oocyte and egg. J. Biol.
Chem. 279, 45291–45299.
Carlier, M.-F., Jean, C., Rieger, K.J., Lenfant, M., 1993. Modulation of the
interaction between G-actin and thymosin β4 by the ATP/ADP ratio:
possible implication in the regulation of actin dynamics. Proc. Natl. Acad.
Sci. 90, 5034–5038.
Cossins, A.R., Crawford, D.L., 2005. Fish as models for environmental
genomics. Nat. Rev. Genet. 4, 324–333.
Davis Jr., W., De Sousa, P.A., Schultz, R.M., 1996. Transient expression of
translation initiation factor eIF-4C during the 2-cell stage of the
preimplantation mouse embryo: identification by mRNA differential display
and the role of DNA replication in zygotic gene activation. Dev. Biol. 174,
190–201.
Dettai, A., Lecointre, G., 2005. Further support for the clades obtained by
multiple molecular phylogenies in the acanthomorph bush. C. R. Biol. 328,
674–689.
El Meziane, A., Callen, J.C., Mounolon, J.C., 1989. Mitochondrial gene
expression during Xenopus laevis development: a molecular study. EMBO J.
8, 1649–1655.
Estêvão, M.D., Redruello, B., Canario, A.V.M., Power, D.M., 2005. Ontogeny
of osteonectin expression in embryos and larvae of sea bream (Sparus
auratus). Gen. Comp. Endocrinol. 142, 155–162.
Fernandes, J.M., et al., 2005. A genomic approach to reveal novel genes
associated with myotube formation in the model teleost, Takifugu rubripes.
Physiol. Genomics 22, 327–338.
Fisher, A., Caudy, M., 1998. The function of hairy-related bHLH repressor
proteins in cell fate decisions. Bioessays 20, 298–306.
Harris, M.A., et al., 2004. The Gene Ontology (GO) database and informatics
resource. Nucleic Acids Res. 32, D258–D261.
Jackson, T.R., Martin-Robichaud, D.J., Reith, M.E., 2003. Application of DNA
markers to the management of Atlantic halibut (Hippoglossus hippoglossus)
broodstock. Aquaculture 220, 245–259.
Ju, Z., et al., 2000. Transcriptome analysis of channel catfish (Ictalurus
punctatus): genes and expression profile from the brain. Gene 261, 373–382.
Kane, D.A., Kimmel, C.B., 1993. The zebrafish midblastula transition.
Development 119, 447–456.
Kjørsvik, E., Magnor-Jensen, A., Holmefjord, I., 1990. Egg quality in fishes.
Adv. Mar. Biol. 26, 71–113.
Kong, M., Barnes, E.A., Ollendorf, V., Donoghue, D.J., 2000. Cyclin F
regulates the nuclear localization of cyclin B1 through a cyclin–cyclin
interaction. EMBO J. 19, 1378–1388.
Muller, K., Thisse, C., Thisse, B., Raz, E., 2002. Expression of a linker
histone-like gene in the primordial germ cells in zebrafish. Mech. Dev. 117,
253–257.
Nelson, G., 1988. Phylogeny of major fish groups. Nobel Symposium 70: The
Hierarchy of Life. Molecules and Morphology in Phylogenetic Analysis.
Elsevier Science Publishers B.V. (Biomedical Division), Karlskoga, Sweden.
Nelson, J.S., 1994. Fishes of the World, 3rd edition. John Wiley and Sons, New
York. 600 pp.
Nishikata, T., et al., 2001. Profiles of maternally expressed genes in fertilized
eggs of Ciona intestinalis. Dev. Biol. 238, 315–331.
Nomura, M., Gourse, R., Baughman, G., 1984. Regulation of the synthesis of
ribosomes and ribosomal components. Annu. Rev. Biochem. 53, 75–117.
Park, K.C., Osborne, J.A., Tsoi, S.C., Brown, L.L., Johnson, S.C., 2005.
Expressed sequence tags analysis of Atlantic halibut (Hippoglossus
hippoglossus) liver, kidney and spleen tissues following vaccination against
Vibrio anguillarum and Aeromonas salmonicida. Fish Shellfish Immunol.
18, 393–415.
Parkinson, J., Blaxter, M., 2003. SimiTri—visualizing similarity relationships
for groups of sequences. Bioinformatics 19, 390–395.
Parkinson, J., Anthony, A., Wasmuth, J., Schmid, R., Hedley, A., Blaxter, M.,
2004. PartiGene—constructing partial genomes. Bioinformatics 20,
1398–1404.
Paynton, B.V., Rempel, R., Bachvarova, R., 1988. Changes in the state of
adenylation and time course of degradation of maternal mRNAs during
oocyte maturation and early embryonic development in the mouse. Dev.
Biol. 129, 304–314.
Pikó, L., Taylor, K.D., 1987. Amounts of mitochondrial DNA and abundance of
some mitochondrial gene transcripts in early mouse embryos. Dev. Biol.
123, 364–374.
Shields, R.J., Brown, N.P., Bromage, N.R., 1997. Blastomere morphology as a
predictive measure of fish egg viability. Aquaculture 155, 1–12.
Wasmuth, J.D., Blaxter, M.L., 2004. Prot4EST: translating expressed sequence
tags from neglected genomes. BMC Bioinformatics 5, 187.
Yamada, L., Kobayashi, K., Satou, Y., Satoh, N., 2005. Microarray analysis of
localization of maternal transcript in eggs and early embryos of the ascidian,
Ciona intestinalis. Dev. Biol. 284, 536–550.
Zeng, F., Schultz, R.M., 2005. RNA transcript profiling during zygotic
activation in the preimplantation mouse embryo. Dev. Biol. 283, 40–57.
Zeng, F., Baldwin, D.A., Schultz, R.M., 2004. Transcript profiling during
preimplantation mouse development. Dev. Biol. 272, 483–496.