Complete Sequence, Gene Arrangement, and Genetic Code of
Mitochondrial DNA of the Cephalochordate Branchiostoma floridae
(Amphioxus)
Jeffrey L. Boore, L. Lynne Daehler, and Wesley M. Brown
Department of Biology, University of Michigan, Ann Arbor
We have determined the 15,083-nucleotide (nt) sequence of the mitochondrial DNA (mtDNA) of the lancelet
Branchiostoma floridae (Chordata: Cephalochordata). As is typical in metazoans, the mtDNA encodes 13 protein,
2 rRNA, and 22 tRNA genes. The gene arrangement differs from the common vertebrate arrangement by only four
tRNA gene positions. Three of these are unique to Branchiostoma, but the fourth is in a position that is primitive
for chordates. Its genetic code shares the variations found in vertebrate mtDNAs, except that AGA specifies serine, a code
variation found in many invertebrate phyla but not in vertebrates (the related codon AGG was not found). Branchiostoma mtDNA lacks a vertebrate-like control region; its largest noncoding region (129 nt) is unremarkable in
sequence or base composition, and its location between ND5 and tRNAG differs from that usually found in vertebrates. It also lacks a potential hairpin DNA structure like those found in many (though not in all) vertebrates to
serve as the second-strand (i.e., L-strand) origin of replication. Perhaps related to this, the sequence corresponding
to the DHU arm of tRNAC cannot form a helical stem, a condition found in a few vertebrate mtDNAs that
also lack a canonical L-strand origin of replication. ATG and GTG codons appear to initiate translation in 11 and
2 of the protein-encoding genes, respectively. Protein genes end with complete (TAA or TAG) or incomplete (T or
TA) stop codons; the latter are presumably converted to TAA by post-transcriptional polyadenylation.
Introduction
Complete mitochondrial DNA (mtDNA) sequences
have been determined for 36 vertebrate species and partial sequences for hundreds of others. All are circular
DNA molecules containing 37 genes: 13 for proteins
(COI-III, ND1-6, ND4L, Cytb, A6, A8); two for rRNAs
(srRNA and lrRNA); and 22 for tRNAs (designated by
the one-letter amino acid code, with the two S and two
L tRNAs differentiated by the codons recognized [AGN/
UCN and CUN/UUR, respectively]). The genes are arranged very compactly, with no introns and few intergenic nucleotides. However, all vertebrate mtDNAs examined have a single, large, noncoding region, highly
variable in size among (and sometimes within) species,
that contains signalling elements for regulating transcription and replication (reviewed by Shadel and Clayton 1997).
Comparisons of mitochondrial systems are useful
for modeling genome evolution and for phylogenetic inference. Many complex features are available for comparison: modes of replication and transcription; RNA
processing; protein, tRNA, and rRNA secondary structures; patterns of transcript editing; genetic code variations; and the relative arrangements of genes (Sankoff
et al. 1992; Smith et al. 1993; Boore and Brown 1994;
Boore et al. 1995; Kumazawa and Nishida 1995; Boore
1996; Boore, Lavrov, and Brown 1998). While these
(and many additional) features are also present in nuclear genomes, they are currently more accessible for
study in the much smaller and simpler mitochondrial
genomes (Garesse et al. 1997). Here we describe and
analyze the complete sequence of the mtDNA (GenBank accession number AF098298) of Branchiostoma floridae, a
member of Cephalochordata, a primitive chordate group that diverged before the vertebrate radiation.
Key words: Branchiostoma, amphioxus, mitochondria, evolution,
chordate, genome.
Address for correspondence and reprints: J. L. Boore, Department
of Biology, University of Michigan, 830 N. University Avenue, Ann
Arbor, Michigan 48109. E-mail: [email protected].
Mol. Biol. Evol. 16(3):410–418. 1999
© 1999 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
Materials and Methods
Live specimens of B. floridae were purchased from
Gulf Specimen Co., Panacea, Fla. Mitochondrial DNA
was purified by cesium chloride–ethidium bromide centrifugation as in Wright, Spolsky, and Brown (1983). A
detailed restriction map was determined, separating radiolabeled fragments on both 1% agarose and 3.5% acrylamide gels, using conditions that allowed detection
of fragments as small as 40 base pairs (bp) (Brown
1980). The entire Branchiostoma mtDNA was then
cloned into the lambda vector EMBL4 (Stratagene) using the unique EcoRI site at position 8,834–8,839 (numbering from the first nucleotide of COI; see figs. 1 and
2). This clone was verified by comparing restriction enzyme fragment sizes with those expected from the cleavage map of the native mtDNA. After subsequent digestion with other restriction enzymes, mtDNA fragments
were subcloned into pBluescript plasmids (Stratagene)
and sequenced; some of the longer fragments required
additional exonuclease deletion cloning steps (Erase-a-Base; Promega). Sequences were determined on both
strands using dideoxynucleotide terminators and radiolabeled nucleotides (Sanger, Nicklen, and Coulson
1977); oligonucleotide sequencing primers were designed as necessary. Protein-encoding genes were identified by sequence similarity of open reading frames to
mitochondrial gene sequences of Cyprinus carpio
(Chang, Huang, and Lo 1994). Transfer RNA genes
were identified by their potential to form tRNA-like secondary structures; specific identifications were made according to anticodon sequences.
Lancelet Mitochondrial DNA
FIG. 1.—The gene map of Branchiostoma floridae mtDNA. Genes
are abbreviated as in the text and scaling is only approximate; NC
refers to the largest noncoding region. Asterisks mark those genes
whose positions differ from those of the basic vertebrate arrangement
as exemplified by human mtDNA (Anderson et al. 1981) (the choice
to mark M rather than Q is arbitrary, as these are in ‘‘switched’’ positions). Transfer RNA genes identified outside of the circle are transcribed clockwise in this figure, as are all other genes except ND6;
ND6 and the tRNAs marked inside are transcribed from the opposite
strand. An arc marks a homologous portion previously determined for
a congeneric species, Branchiostoma lanceolatum (Delarbre et al.
1997).
Results and Discussion
Gene Content and Organization
Complete mtDNA sequences have been published
for 36 vertebrate species. The 37 encoded genes are arranged identically in most, but minor variations of the
basic arrangement have been found in some animals (sea
lamprey [Lee and Kocher 1995], some frogs [Yoneyama
1987; Fujii et al. 1988], reptiles [Seutin et al. 1994; Kumazawa and Nishida 1995; Kumazawa et al. 1996;
Quinn and Mindell 1996; Janke and Arnason 1997; Macey et al. 1997], birds [Glaus et al. 1980; Desjardins,
Ramirez, and Morais 1990; Desjardins and Morais 1990,
1991; Quinn and Wilson 1993; Ramirez, Savoie, and
Morais 1993; Harlid, Janke, and Arnason 1997], and
marsupials [Pääbo et al. 1991; Janke et al. 1994]). Each
of these variations appears to be derived independently,
because the basic arrangement is shared among several
vertebrate classes and none of the variations are shared
among distantly related groups. This inference is
strengthened further by the gene arrangement in Branchiostoma mtDNA, which, except for four tRNA gene
positions, is identical to the basic vertebrate arrangement
(as exemplified in mammals [Anderson et al. 1981],
Xenopus [Roe et al. 1985], and bony fish [Chang,
Huang, and Lo 1994]). None of the four variant positions is shared by the corresponding gene in another
vertebrate species.
The four tRNA genes whose arrangement in Branchiostoma mtDNA differs from the basic vertebrate arrangement
are those for glycine (G), phenylalanine (F),
methionine (M), and asparagine (N) (fig. 1). Two of the
differences (tRNAM and tRNAN) are the same as noted
in the sequenced portion of Branchiostoma lanceolatum
mtDNA (Delarbre et al. 1997). Because the positions of
tRNAG and tRNAM are identical in the basic vertebrate
arrangement and in the mtDNA of Drosophila (Clary
and Wolstenholme 1985) and because the position of
tRNAF is similar in the basic arrangement and in the
mtDNAs of echinoderms (Jacobs et al. 1988; Smith et
al. 1989), it can be argued that the positions of these
three tRNAs are derived in Branchiostoma and that their
positions in the basic arrangement represent the primitive state for chordates. From published data it is not
possible to determine the primitive chordate position of
tRNAN. The primitive arrangement could be as in Branchiostoma mtDNA (between ND2 and tRNAW) with its
translocation to a position between tRNAA and tRNAC
being a derived condition for vertebrates or, alternatively, the basic vertebrate arrangement could be primitive
with an independent translocation in the lineage leading
to Branchiostoma. This is resolved, however, by noting
that the position of tRNAN in a hemichordate mtDNA
(S. Pääbo, personal communication) is identical to that
in Branchiostoma. Assuming that Hemichordata is an
outgroup to a clade of Cephalochordata + Vertebrata,
the most parsimonious explanation is that the tRNAN position in Branchiostoma mtDNA is primitive for chordates.
Branchiostoma mtDNA is arranged very compactly, even for a mitochondrial genome. In total, there appear to be only 154 noncoding nucleotides (nt): 129 in
a single region between ND5–tRNAG; 8 between tRNAS(UCN)–tRNAD; 2 between each of tRNAR–ND4L and
tRNAF–tRNAV; 1 between each of tRNAQ–ND2, ND6–
tRNAE, and tRNAE–Cytb; and 10 between tRNAY–COI
(fig. 2). Several protein-encoding genes are predicted to
end with abbreviated stop codons (see below). In six
cases, adjacent genes overlap (COI–tRNAS(UCN), A8–A6,
tRNAG–ND6, ND2–tRNAN, tRNAN–tRNAW, and tRNAW–
tRNAA); however, in all except one (A8–A6, discussed
below) the overlapping genes are encoded on opposite
DNA strands. The lack of overlap of the ND4L–ND4
genes is unusual.
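The tally of noncoding nucleotides given above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the counts are transcribed from the preceding paragraph, and the dictionary keys and variable names are labels chosen here, not identifiers from the original analysis.

```python
# Noncoding nucleotides between adjacent genes, as listed in the text.
noncoding_nt = {
    "ND5-tRNAG": 129,
    "tRNAS(UCN)-tRNAD": 8,
    "tRNAR-ND4L": 2,
    "tRNAF-tRNAV": 2,
    "tRNAQ-ND2": 1,
    "ND6-tRNAE": 1,
    "tRNAE-Cytb": 1,
    "tRNAY-COI": 10,
}

GENOME_LENGTH = 15083  # total mtDNA length in nt, from the abstract

total = sum(noncoding_nt.values())
largest_pair, largest = max(noncoding_nt.items(), key=lambda kv: kv[1])
outside_largest = total - largest  # noncoding nt outside the largest region
noncoding_percent = 100 * total / GENOME_LENGTH

print(f"total noncoding: {total} nt ({noncoding_percent:.1f}% of the genome)")
print(f"largest region: {largest_pair} ({largest} nt); elsewhere: {outside_largest} nt")
```

The total and the remainder outside the largest region correspond to the 154 nt stated here and the 25 nt quoted later under Unassigned DNA.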
Base Composition
Overall, B. floridae mtDNA is 63% A+T, slightly
higher than other chordate mtDNAs (e.g., Petromyzon
marinus = 62% [Lee and Kocher 1995], C. carpio =
57% [Chang, Huang, and Lo 1994], and Protopterus
dolloi = 58% [Zardoya and Meyer 1996]). The 11,262
nt making up the protein-encoding genes are 62% A+T,
nearly identical to the mtDNA overall (table 1). The
A+T content of first, second, and third codon positions
is 54%, 62%, and 70%, respectively. As is typical of
metazoan mtDNAs (Cardon et al. 1994), the dinucleotide CpG is significantly underrepresented in Branchiostoma mtDNA.
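The per-position A+T percentages can be recomputed from the codon counts of table 1. The sketch below assumes the counts as transcribed here from that table (stop codons excluded); the function and variable names are illustrative and are not part of the original analysis.

```python
# Codon counts transcribed from table 1 (sense codons only).
CODON_COUNTS = {
    "TTT": 170, "TTC": 82, "TTA": 289, "TTG": 112,
    "CTT": 77, "CTC": 15, "CTA": 93, "CTG": 36,
    "ATT": 187, "ATC": 46, "ATA": 138, "ATG": 63,
    "GTT": 130, "GTC": 23, "GTA": 124, "GTG": 68,
    "TCT": 77, "TCC": 19, "TCA": 64, "TCG": 17,
    "CCT": 76, "CCC": 20, "CCA": 40, "CCG": 22,
    "ACT": 65, "ACC": 15, "ACA": 65, "ACG": 32,
    "GCT": 126, "GCC": 22, "GCA": 86, "GCG": 34,
    "TAT": 95, "TAC": 53,
    "CAT": 65, "CAC": 26, "CAA": 38, "CAG": 47,
    "AAT": 87, "AAC": 24, "AAA": 40, "AAG": 33,
    "GAT": 57, "GAC": 23, "GAA": 58, "GAG": 45,
    "TGT": 23, "TGC": 15, "TGA": 58, "TGG": 48,
    "CGT": 16, "CGC": 12, "CGA": 27, "CGG": 21,
    "AGT": 73, "AGC": 33, "AGA": 12, "AGG": 0,
    "GGT": 92, "GGC": 20, "GGA": 67, "GGG": 113,
}


def at_percent_by_position(counts):
    """Return the A+T percentage at codon positions 1, 2 and 3."""
    total = sum(counts.values())
    percents = []
    for pos in range(3):
        at = sum(n for codon, n in counts.items() if codon[pos] in "AT")
        percents.append(100 * at / total)
    return percents


total_codons = sum(CODON_COUNTS.values())
by_position = at_percent_by_position(CODON_COUNTS)
overall = sum(by_position) / 3  # each position contributes equally

print(total_codons, [round(p) for p in by_position], round(overall))
```

Rounded to whole percentages, the three positions and their mean agree with the values quoted in the text.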
In earlier studies (Naylor and Brown 1997, 1998),
comparison of the protein-encoding portions of the B.
floridae mtDNA sequence with those of other metazoans
failed to confirm Branchiostoma as the sister taxon to
vertebrates, suggesting instead that it is the sister taxon
to (echinoderms + vertebrates) (which was viewed as
artifactual by the authors). From review of the original
data used for these studies, 13 sequencing errors were
detected, resulting in frameshifts in COI, ND2, ND4L,
and ND6. A total of 79 amino acids were inferred in
error for those previous studies, about 2% of the total
amino acids analyzed; whether this would significantly
affect their conclusions awaits further analysis.
FIG. 2.—A partly schematic representation of the mtDNA sequence of Branchiostoma floridae. Numbers within the slash marks indicate
omitted nucleotides. For two genes, the inferred initiation codon is GTG; the corresponding amino acid (M) is in parentheses here to indicate
presumed nonconformity with the generally employed genetic code. (Mitochondrial proteins appear to initiate with formyl-methionine [Smith
and Marcker 1968] as do those of their bacterial progenitors.) Stop codons, including those inferred to be abbreviated, are marked by an asterisk
and the single large noncoding region by a row of dots. A dart marks the last nucleotide of each gene and indicates the direction of
transcription. The EcoRI restriction enzyme site used for cloning (positions 8,826–8,831) is underlined.
Table 1
Number of Occurrences (N) and Percentage of Total (%) of the 3,754 Codons in the 13 Protein-Encoding Genes of
Branchiostoma floridae mtDNA
Amino Acid(a) Codon   N     %  Amino Acid(a) Codon   N     %  Amino Acid(a) Codon   N     %  Amino Acid(a) Codon   N     %
Phe (F) [GAA] TTT   170   4.5  Ser (S) [UGA] TCT    77   2.1  Tyr (Y) [GUA] TAT    95   2.5  Cys (C) [GCA] TGT    23   0.6
              TTC    82   2.2                TCC    19   0.5                TAC    53   1.4                TGC    15   0.4
Leu (L) [UAA] TTA   289   7.7                TCA    64   1.7  TER(b)        TAA     -     -  Trp (W) [UCA] TGA    58   1.5
              TTG   112   3.0                TCG    17   0.5                TAG     -     -                TGG    48   1.3
Leu (L) [UAG] CTT    77   2.1  Pro (P) [UGG] CCT    76   2.0  His (H) [GUG] CAT    65   1.7  Arg (R) [UCG] CGT    16   0.4
              CTC    15   0.4                CCC    20   0.5                CAC    26   0.7                CGC    12   0.3
              CTA    93   2.5                CCA    40   1.1  Gln (Q) [UUG] CAA    38   1.0                CGA    27   0.7
              CTG    36   1.0                CCG    22   0.6                CAG    47   1.3                CGG    21   0.6
Ile (I) [GAU] ATT   187   5.0  Thr (T) [UGU] ACT    65   1.7  Asn (N) [GUU] AAT    87   2.3  Ser (S) [GCU] AGT    73   1.9
              ATC    46   1.2                ACC    15   0.4                AAC    24   0.6                AGC    33   0.9
Met (M) [CAU] ATA   138   3.7                ACA    65   1.7  Lys (K) [UUU] AAA    40   1.1                AGA    12   0.3
              ATG    63   1.7                ACG    32   0.9                AAG    33   0.9                AGG     0   0.0
Val (V) [UAC] GTT   130   3.5  Ala (A) [UGC] GCT   126   3.4  Asp (D) [GUC] GAT    57   1.5  Gly (G) [UCC] GGT    92   2.5
              GTC    23   0.6                GCC    22   0.6                GAC    23   0.6                GGC    20   0.5
              GTA   124   3.3                GCA    86   2.3  Glu (E) [UUC] GAA    58   1.5                GGA    67   1.8
              GTG    68   1.8                GCG    34   0.9                GAG    45   1.2                GGG   113   3.0
(a) The anticodon of the corresponding tRNA is shown in brackets.
(b) Stop codons are omitted from this analysis.
Initiation and Termination of Protein-Encoding Genes
The mitochondrial protein genes of Branchiostoma
correspond well in size and sequence to those of other
metazoans (table 2). ATG codons initiate 11 of the 13,
COI and ND1 being the exceptions. The inferred initiation codon for COI is GTG, as no ATG or other initiation codon employed in metazoan mitochondrial systems is nearby, and because the sequence of the protein
initiated at this position is highly similar to the amino
terminal sequences of COI in other metazoans. There is
an in-frame, TAG stop codon 6 nt upstream of this GTG,
and Delarbre et al. (1997) found GTG at the corresponding position in B. lanceolatum, which they inferred as
the initiation codon for COI. GTG has been inferred to
initiate this and other genes in many metazoan mtDNAs
(Wolstenholme 1992). Inference of the initiation codon
for ND1 is less certain, as two commonly used initiation
codons, GTG and ATA, are in-frame and immediately
adjacent at the probable ND1 start site (positions
12,576–12,581 in fig. 2). The GTG codon directly abuts
the upstream tRNAL(UUR) gene, followed immediately by
ATA. The same ambiguity is present at the 5′ end of
ND1 in B. lanceolatum, for which Delarbre et al. (1997)
arbitrarily designated ATA as the initiation codon.
Complete TAG stop codons are present in COI,
Cytb, ND2, ND5, and ND6, none of which overlaps a
downstream gene having the same transcriptional orientation, and a complete TAA stop codon is present in
COIII (table 2). Also, A8 almost certainly ends at the
TAG codon following the highly conserved sequence
(WPW) at the 3′ end of A8, where A8 and A6 overlap.
These genes commonly overlap in chordate mtDNAs,
where they are known to be translated from the same
bicistronic mRNA (Fearnley and Walker 1986).
The other genes end at incomplete (i.e., T or TA)
codons. After transcription and processing, mRNAs
ending in T or TA are converted to TAA by polyadenylation (Ojala, Montoya, and Attardi 1981). This is
surely the case for COII; if this transcript extended to
the first in-frame stop codon, it would overlap the adjacent gene by 29 nt. A6, ND1, ND3, ND4, and ND4L
are all inferred to end at incomplete stop codons that
directly abut their adjacent, downstream genes; for
each, however, allowing the transcript to overlap the
downstream gene by 1 or 2 nt would complete the termination codon. Delarbre et al. (1997) sequenced a
cDNA to the ND1 mRNA of B. lanceolatum and found
that it terminated with TAA, thus confirming the incomplete codon hypothesis for that gene (overlap
would have resulted in a TAG stop codon). However,
it seems likely that the presence, at least prior to processing, of potentially complete termination codons is
not merely coincidence and may be a mechanism for
preventing translational readthrough in cases where
correct transcript processing fails.
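The completion of an abbreviated stop codon by polyadenylation can be illustrated with a small sketch. It assumes a coding sequence written as DNA, as in the text, that begins at the initiation codon and ends either with a complete stop codon or with a trailing T or TA; the function name, the default tail length, and the toy sequences are hypothetical and do not come from the B. floridae sequence.

```python
# Complete stop codons reported for lancelet mtDNA, written as DNA.
COMPLETE_STOPS = {"TAA", "TAG"}


def terminal_stop_after_polyadenylation(coding_dna, poly_a_length=10):
    """Return the stop codon read at the 3' end of the processed transcript.

    If the sequence length is not a multiple of three, the leftover T or TA
    is completed to TAA by the A's of the poly(A) tail.
    """
    leftover = len(coding_dna) % 3
    if leftover == 0:
        stop = coding_dna[-3:]
    else:
        # Append the tail, then read the first three bases of the last codon.
        stop = (coding_dna[-leftover:] + "A" * poly_a_length)[:3]
    if stop not in COMPLETE_STOPS:
        raise ValueError(f"no stop codon at transcript end: {stop!r}")
    return stop


# Toy transcripts (hypothetical): ATG GCT followed by T, TA, or TAG.
for toy in ("ATGGCTT", "ATGGCTTA", "ATGGCTTAG"):
    print(toy, "->", terminal_stop_after_polyadenylation(toy))
```

In this sketch a transcript ending in T or TA always yields TAA, whereas an unprocessed transcript that ran on into the downstream gene could instead expose a different complete stop codon, as noted above for ND1.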
Table 2
Comparisons of the Mitochondrial Protein-Coding Genes of a Lancelet (Branchiostoma floridae), a Carp (Cyprinus carpio; Chang, Huang, and Lo 1994), and a Sea
Urchin (Paracentrotus lividus; Cantatore et al. 1989)
          NUMBER OF AMINO ACIDS(a)          PERCENT AMINO ACID IDENTITY(b)                          PREDICTED INITIATION AND TERMINATION(c) CODONS
PROTEIN   Lancelet   Carp   Sea urchin      Lancelet/carp   Lancelet/sea urchin   Sea urchin/carp   Lancelet     Carp         Sea urchin
A6        227        227    232             52.4            34.9                  37.0              ATG TA(A)    ATG TA(A)    ATG TAA
A8        54         52     54              30.2            19.4                  16.7              ATG TAG      ATG TAG      GTG TAA
COI       515        516    517             77.4            77.3                  76.2              GTG TAG      GTG TAA      ATG TAA
COII      230        230    229             60.9            64.5                  62.5              ATG T*       ATG T*       ATG TAA
COIII     262        261    260             73.8            69.2                  69.7              ATG TAA      ATG TA(A)    ATR TAA
Cytb      380        380    380             69.7            63.7                  67.1              ATG TAG      ATG T*       ATG TA(G)
ND1       314        324    323             59.2            53.1                  57.7              GTG T(AG)    ATG TAA      ATG TAA
ND2       346        348    352             34.3            34.1                  36.0              ATG TAG      ATG TA(G)    ATG TAG
ND3       117        116    116             59.2            47.9                  54.5              ATG T(AA)    ATG T(AG)    ATR TAG
ND4       452        460    463             48.5            45.6                  44.8              ATG TA(G)    ATG T*       ATR TAG
ND4L      91         98     97              40.7            41.5                  41.0              ATG TA(A)    ATG TAA      ATC TA(A)
ND5       599        607    638             46.8            37.6                  40.1              ATG TAG      ATG TAA      ATG TAG
ND6       167        172    160             27.7            23.2                  33.1              ATG TAG      ATG TAA      ATG TAG
(a) Gene lengths are as inferred in the text and depicted in figure 1 or obtained from GenBank. Actual gene length could be slightly different due to ambiguity in determining start and stop codons.
(b) Percent identity is the number of identical inferred amino acids in a pairwise alignment divided by the mean length of the two compared sequences.
(c) For predicted stop codons the parentheses indicate the potential of a complete stop codon overlapping a downstream gene with the same transcriptional orientation. The asterisks indicate that no such potential reasonably
exists and that the stop codon is incomplete, presumably completed by polyadenylation of the mRNA (see text).
Transfer RNAs
Twenty-two sequences can be folded into tRNA-like structures (fig. 3). In these, the sequences corresponding
in position to the anticodons are identical to
those for the mitochondrial tRNA genes of human,
chicken, frog, and fish mtDNAs (Anderson et al. 1981;
Roe et al. 1985; Desjardins and Morais 1990; Chang,
Huang, and Lo 1994). All B. floridae mitochondrial
tRNA genes have a TΨC loop of 3–7 nt and a TΨC
stem of 3–6 nt; two (tRNAR and tRNAQ) have a single
mismatch in this stem. All but tRNAM, tRNAF, and
tRNAW have a fully paired, seven-member acceptor
stem, and all but tRNAQ and tRNAL(UUR) have a fully
paired five-member anticodon stem. All except tRNAT
have a four-member extra arm. The dinucleotide between the acceptor and DHU arms is TpA in all tRNA
genes except tRNAS(UCN) and tRNAY, where it is TpG,
and in tRNAV, where it is GpA. In all of the tRNA
genes, the 2 nt preceding the anticodon are pyrimidines
and the nucleotide following it is a purine. Except for
tRNAC and tRNAS(AGN), all have DHU arms with a stem
of 3–5 nt and a loop of 3–11 nt. For all but tRNAG, the
2 most proximal unpaired nt in the DHU loop are purines (almost always A’s). The unpaired replacement
for the DHU arm of tRNAS(AGN) is typical of metazoan
mtDNAs, as is the potential for additional pairing at
the end of the anticodon stem. An unpaired DHU arm
in tRNAC is unusual but has precedents among vertebrates; in some (but not all) cases (for example, Seutin
et al. 1994; Macey et al. 1997) it is correlated with the
loss of the immediately adjacent, stem–loop structure
that functions, in many vertebrate mtDNAs, as the second-strand (i.e., L-strand) origin of replication. We
speculate that this aberrant tRNAC might represent a
compromise structure, serving as both an origin of replication and a functional tRNA gene. A similar condition appears in the mtDNA of P. dolloi (Zardoya and
Meyer 1996) where tRNAC and this stem–loop structure
partially share the same sequence.
Unassigned DNA
The largest noncoding region in B. floridae mtDNA
is only 129 nt. By contrast, the largest noncoding region
is 198 nt in P. marinus (Lee and Kocher 1995), 928 nt
in C. carpio (Chang, Huang, and Lo 1994), and 1,183 nt
in P. dolloi (Zardoya and Meyer 1996) mtDNAs. A
search of all vertebrate mtDNA sequences identified no
obvious similarity with this noncoding sequence, and its
location, between ND5 and tRNAG, differs from that
usually found in vertebrates. In B. floridae, this region
is slightly less A+T-rich (59%) than the overall mtDNA
(63%). Other than in this region, there are only 25 nt,
distributed in blocks of 1–10 nt, that are unassigned to
genes, and the composition of these appears unremarkable.
Genetic Code: AGA Specifies Serine, Not Glycine, in
Branchiostoma mtDNA
In vertebrate mtDNAs only AGY specifies serine,
with AGR codons being absent or, when present, used
as stop codons. In all invertebrate mtDNAs except those
of cnidarians both AGR and AGY appear to specify serine (reviewed by Wolstenholme 1992). (AGR specifies
arginine in cnidarian mtDNA, as in the “universal”
code, presumably due to the use of imported, nuclear-encoded tRNAs.)
FIG. 3.—The potential secondary structures of the 22 inferred tRNAs of Branchiostoma floridae mtDNA. Nomenclature for tRNA arms is
shown for tRNAV. The five additional nucleotides in parentheses outside of the structures and accompanied by an arrow indicate the only
differences in comparing the eight sequenced tRNA genes of Branchiostoma lanceolatum (Delarbre et al. 1997).
Based on a single AGA and no AGG codons, Delarbre et al. (1997) suggested a variation for the lancelet
mitochondrial genetic code in which AGR specifies glycine and AGY, serine. Given the paucity of data
available to them, that suggestion may have been reasonable. However, with the much larger data set presented
here (and noting that no AGG codons are present), it is clear that AGA (along with AGY) specifies
serine in Branchiostoma mtDNA.
There are 12 AGA codons in B. floridae mtDNA,
1 of which is identical in position to the single AGA
codon found by Delarbre et al. (1997) in the ND2 gene
of B. lanceolatum. In alignments with the corresponding
gene sequences of P. marinus (Lee and Kocher 1995),
C. carpio (Chang, Huang, and Lo 1994), P. dolloi (Zardoya and Meyer 1996), Gadus morhua (Johansen, Guddal, and Johansen 1990), and Crossostoma lacustre
(Tzeng et al. 1992), the Branchiostoma AGA codons
correspond most frequently to serine codons, with the
correspondence to TCN codons being even more frequent than to AGY codons. No tRNA genes in B. floridae mtDNA have a TCT anticodon, as would be needed
to discriminate AGR from the AGN codon family;
moreover, only tRNAS(AGN) has an NCT anticodon.
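The alignment-based reasoning above amounts to tallying which amino acids the reference sequences encode wherever the query has an AGA codon. The following sketch shows one way such a tally could be made; it is not the procedure used for this study. It assumes codon-aligned, in-frame DNA sequences of equal length, translates the references with the vertebrate mitochondrial code, and uses short toy sequences that are entirely hypothetical.

```python
from collections import Counter
from itertools import product

# Vertebrate mitochondrial genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
VERT_MT_AA = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNKKSS**"
    "VVVVAAAADDEEGGGG"
)
VERT_MT_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), VERT_MT_AA)
}


def tally_aligned_residues(query, references, codon="AGA"):
    """Count amino acids encoded in the references opposite each query codon."""
    counts = Counter()
    for start in range(0, len(query) - 2, 3):
        if query[start:start + 3] != codon:
            continue
        for ref in references:
            ref_codon = ref[start:start + 3]
            if len(ref_codon) < 3 or "-" in ref_codon:
                continue  # skip alignment gaps
            counts[VERT_MT_CODE[ref_codon]] += 1
    return counts


# Toy codon alignment (hypothetical data, for illustration only).
query = "ATGAGACTTAGA"
references = ["ATGTCTCTTTCA", "ATGAGTCTAGGA", "ATGTCC---AGC"]
counts = tally_aligned_residues(query, references)
print(counts.most_common())
```

A tally dominated by serine, as in this toy case, is the kind of pattern the text describes for the Branchiostoma AGA codons; a tally dominated by glycine would instead point to the reassignment proposed for Halocynthia.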
Even though we believe that there is unequivocal
evidence that AGN specifies serine in Branchiostoma
mtDNA, the absence of AGG and reduced usage of
AGA codons in this mtDNA can be viewed as a precondition for codon reassignment, as shown even better
for the mtDNA of the hemichordate Balanoglossus
(Castresana, Feldmaier-Fuchs, and Pääbo 1998).
AGR codons appear to specify glycine in the urochordate Halocynthia roretzi mtDNA, because of their
frequent alignment with glycine (GGN) codons in other
metazoan mtDNAs (Yokobori, Ueda, and Watanabe
1993). The assignment of glycine as the amino acid
specified by AGR codons in Halocynthia mtDNA is
based on 19 occurrences in a 1,263-nt fragment of COI
(Yokobori, Ueda, and Watanabe 1993). No AGR codons
appear in COI of B. floridae; however, 11 of the 12
AGA codons and 5 of the 7 AGG codons in Halocynthia
COI align with glycine codons (GGN) in B. floridae
COI, providing further evidence for the reassignment of
AGR in the urochordate lineage, after it split from the
lineage leading to cephalochordates.
Sequence alignments of the B. floridae and vertebrate mitochondrial proteins suggest that there are no
other differences between its genetic code and that used
in vertebrate mtDNAs.
Nucleotide Sequence Comparisons with B.
lanceolatum mtDNA
The 2,562 nt previously determined for B. lanceolatum (Delarbre et al. 1997) are remarkably similar in
sequence to the corresponding region of B. floridae
mtDNA, with only 73 nt differences between the two
(>97% sequence identity). This common region includes complete genes for ND1, ND2, and eight tRNAs,
and parts of genes for COI and a ninth tRNA.
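The identity figure follows directly from the two numbers given above. A minimal sketch of the arithmetic, with a function name chosen here for illustration:

```python
def percent_identity(aligned_length, differences):
    """Percentage of aligned positions that are identical."""
    return 100 * (aligned_length - differences) / aligned_length


# 2,562 nt compared between the two species, 73 of which differ.
identity = percent_identity(2562, 73)
print(f"{identity:.2f}% identical")
```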
There is evidence for only one insertion/deletion
event. This results in a single nucleotide difference in
the lengths of the noncoding region between tRNAY–COI
(10 nt in B. floridae, 9 nt in B. lanceolatum) that are
otherwise identical.
In the aggregate, the tRNA gene sequences of the
two species differ by five substitutions (fig. 3). Of these,
four are in loops and one is in a stem (which changes
a T–G pair to a C–G pair); all five substitutions are
transitions.
The ND1 genes differ by 15 substitutions (disregarding whether initiated by GTG or ATA), of which
13 are synonymous and 2 nonsynonymous; 11 are transitions and 4 are transversions. The ND2 genes differ
by 52 substitutions, of which 47 are synonymous and 5
nonsynonymous; 50 are transitions and 2 are transversions. However, the ND2 situation is made complex by
Delarbre et al.’s (1997) report of a second ND2 copy,
which could be either mitochondrial or nuclear. The sequence of the second copy was not specifically reported,
but they noted that the two differed by 42 substitutions,
8 of which cause amino acid replacements, which they
describe. Of these eight, the B. floridae ND2 amino acid
sequence is identical to three in the first (reported) copy
and to five in the second. We found no evidence for a
second ND2 copy in the B. floridae mtDNA sequence,
and the size of the purified mtDNA, estimated from restriction fragment length summations, is clearly insufficient to accommodate a second copy of this gene.
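Substitutions are classed as transitions (purine to purine, or pyrimidine to pyrimidine) or transversions (purine to pyrimidine, or the reverse). The sketch below counts both classes between two aligned sequences; the function names and the toy sequences are hypothetical, and the closing lines only check that the tallies quoted above are internally consistent.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True if a and b differ but are both purines or both pyrimidines."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def count_ts_tv(seq1, seq2):
    """Return (transitions, transversions) between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b or "-" in (a, b):
            continue  # identical sites and gaps are not substitutions
        if is_transition(a, b):
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy aligned sequences (hypothetical): G/A and C/T transitions, A/C transversion.
print(count_ts_tv("ATGCCGTA", "ATACTGTC"))

# Consistency of the tallies reported in the text.
nd1 = {"synonymous": 13, "nonsynonymous": 2, "transitions": 11, "transversions": 4}
nd2 = {"synonymous": 47, "nonsynonymous": 5, "transitions": 50, "transversions": 2}
print(nd1["synonymous"] + nd1["nonsynonymous"], nd2["synonymous"] + nd2["nonsynonymous"])
```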
Acknowledgments
Thanks to Susan Fuerstenberg, Kevin Helfenbein,
Gavin Naylor, and Alan Wolf for helpful comments on
the manuscript and to Svante Pääbo, Jose Castresana,
and coworkers for sharing the Balanoglossus data with
us prior to publication. This work was supported by NSF
grant DEB-9220640 to W.M.B.
LITERATURE CITED
ANDERSON, S., A. T. BANKIER, B. G. BARRELL et al. (14 coauthors). 1981. Sequence and organization of the human
mitochondrial genome. Nature 290:457–465.
BOORE, J. L. 1996. Ancient patterns of arthropod evolution are
recorded in mitochondrial genome rearrangements. Pp. 69–
78 in M. NEI and N. TAKAHATA, eds. Current topics in
molecular evolution: proceedings of the U.S.–Japan Binational Workshop on Molecular Evolution. Graduate School
for Advanced Studies, Hayama, Japan.
BOORE, J. L., and W. M. BROWN. 1994. Mitochondrial genomes and the phylogeny of mollusks. Nautilus 108(Suppl.
2):61–78.
BOORE, J. L., T. M. COLLINS, D. STANTON, L. L. DAEHLER,
and W. M. BROWN. 1995. Deducing the pattern of arthropod
phylogeny from mitochondrial DNA rearrangements. Nature 376:163–165.
BOORE, J. L., D. V. LAVROV, and W. M. BROWN. 1998. Gene
translocation links insects and crustaceans. Nature 393:667–
668.
BROWN, W. M. 1980. Polymorphism in mitochondrial DNA of
humans as revealed by restriction endonuclease analysis.
Proc. Natl. Acad. Sci. USA 77:3605–3609.
CANTATORE, P., M. ROBERTI, G. RAINALDI, M. N. GADALETA,
and C. SACCONE. 1989. The complete nucleotide sequence,
gene order and genetic code of the mitochondrial genome
of Paracentrotus lividus. J. Biol. Chem. 264:10965–10975.
CARDON, L. R., C. BURGE, D. A. CLAYTON, and S. KARLIN.
1994. Pervasive CpG suppression in animal mitochondrial
genomes. Proc. Natl. Acad. Sci. USA 91:3799–3803.
CASTRESANA, J., G. FELDMAIER-FUCHS, and S. PÄÄBO. 1998.
Codon reassignment and amino acid composition in hemichordate
mitochondria. Proc. Natl. Acad. Sci. USA 95:3703–3707.
CHANG, Y.-S., F.-L. HUANG, and T.-B. LO. 1994. The complete
nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38:138–
155.
CLARY, D. O., and D. R. WOLSTENHOLME. 1985. The mitochondrial DNA molecule of Drosophila yakuba: nucleotide
sequence, gene organization, and genetic code. J. Mol.
Evol. 22:252–271.
DELARBRE, C., V. BARRIEL, S. TILLIER, P. JANVIER, and G.
GACHELIN. 1997. The main features of the craniate mitochondrial DNA between the ND1 and the COI genes were
established in the common ancestor with the lancelet. Mol.
Biol. Evol. 14:807–813.
DESJARDINS, P., and R. MORAIS. 1990. Sequence and gene organization of the chicken mitochondrial genome: a novel
gene order in higher vertebrates. J. Mol. Biol. 212:599–634.
DESJARDINS, P., and R. MORAIS. 1991. Nucleotide sequence and evolution of coding
and noncoding regions of a quail mitochondrial genome. J.
Mol. Evol. 32:153–161.
DESJARDINS, P., V. RAMIREZ, and R. MORAIS. 1990. Gene organization of the Peking duck mitochondrial genome. Curr.
Genet. 17:515–518.
FEARNLEY, I. M., and J. E. WALKER. 1986. Two overlapping
genes in bovine mitochondrial DNA encode membrane
components of ATP synthase. EMBO J. 5:2003–2008.
FUJII, H., T. SHIMADA, Y. GOTO, and T. OKAZAKI. 1988. Cloning of the mitochondrial genome of Rana catesbeiana and
the nucleotide sequences of the ND2 and five tRNA genes.
J. Biochem. 103:474–481.
GARESSE, R., J. A. CARRODEGUAS, J. SANTIAGO, M. L. PEREZ,
R. MARCO, and C. G. VALLEJO. 1997. Artemia mitochondrial genome: molecular biology and evolutive considerations. Comp. Biochem. Physiol. B. Biochem. Mol. Biol.
117:357–366.
GLAUS, K. R., H. P. ZASSENHAUS, N. S. FECHHEIMER, and P.
S. PERLMAN. 1980. Avian mtDNA: structure, organization
and evolution. Pp. 131–135 in A. M. KROON and C. SACCONE, eds. The organization and expression of the mitochondrial genome. Elsevier/North-Holland Biomedical
Press, Amsterdam.
HARLID, A., A. JANKE, and U. ARNASON. 1997. The mtDNA
sequence of the ostrich and the divergence between paleognathous and neognathous birds. Mol. Biol. Evol. 14:754–
761.
JACOBS, H. T., D. J. ELLIOTT, V. B. MATH, and A. FARQUHARSON. 1988. Nucleotide sequence and gene organization of
sea urchin mitochondrial DNA. J. Mol. Biol. 202:185–217.
JANKE, A., and U. ARNASON. 1997. The complete mitochondrial genome of Alligator mississippiensis and the separation between recent Archosauria (birds and crocodiles).
Mol. Biol. Evol. 14:1266–1272.
JANKE, A., G. FELDMAIER-FUCHS, W. K. THOMAS, A. VON
HAESELER, and S. PÄÄBO. 1994. The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137:243–256.
JOHANSEN, H., P. H. GUDDAL, and T. JOHANSEN. 1990. Organization of the mitochondrial genome of Atlantic cod, Gadus morhua. Nucleic Acids Res. 18:411–419.
KUMAZAWA, Y., and M. NISHIDA. 1995. Variations in mitochondrial tRNA gene organization of reptiles as phylogenetic markers. Mol. Biol. Evol. 12:759–772.
KUMAZAWA, Y., H. OTA, M. NISHIDA, and T. OZAWA. 1996.
Gene rearrangements in snake mitochondrial genomes:
highly concerted evolution of control-region-like sequences
duplicated and inserted into a tRNA gene cluster. Mol. Biol.
Evol. 13:1242–1254.
LEE, W.-J., and T. KOCHER. 1995. Complete sequence of a sea
lamprey (Petromyzon marinus) mitochondrial genome: early establishment of the vertebrate genome organization. Genetics 139:873–887.
MACEY, J. R., A. LARSON, N. B. ANANJEVA, and T. PAPENFUSS.
1997. Evolutionary shifts in three major structural features
of the mitochondrial genome among iguanian lizards. J.
Mol. Evol. 44:660–674.
NAYLOR, G. J. P., and W. M. BROWN. 1997. Structural biology
and phylogenetic estimation. Nature 388:527–528.
NAYLOR, G. J. P., and W. M. BROWN. 1998. Amphioxus mitochondrial DNA, chordate phylogeny, and the limits of inference based on comparisons
of sequences. Syst. Biol. 47:61–76.
OJALA, D., J. MONTOYA, and G. ATTARDI. 1981. tRNA punctuation model of RNA processing in human mitochondria.
Nature 290:470–474.
PÄÄBO, S., W. K. THOMAS, K. M. WHITFIELD, Y. KUMAZAWA,
and A. C. WILSON. 1991. Rearrangements of mitochondrial
transfer RNA genes in marsupials. J. Mol. Evol. 33:426–
430.
QUINN, T. W., and D. P. MINDELL. 1996. Mitochondrial gene
order adjacent to the control region in crocodile, turtle, and
tuatara. Mol. Phylogenet. Evol. 5:344–351.
QUINN, T. W., and A. C. WILSON. 1993. Sequence evolution in
and around the mitochondrial control region in birds. J.
Mol. Evol. 37:417–425.
RAMIREZ, V., P. SAVOIE, and R. MORAIS. 1993. Molecular characterization and evolution of a duck mitochondrial genome.
J. Mol. Evol. 37:296–310.
ROE, B. A., D.-P. MA, R. K. WILSON, and J. J.-H. WONG. 1985.
The complete nucleotide sequence of the Xenopus laevis
mitochondrial genome. J. Biol. Chem. 260:9759–9774.
SANGER, F., S. NICKLEN, and A. R. COULSON. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl.
Acad. Sci. USA 74:5463–5467.
SANKOFF, D., G. LEDUC, N. ANTOINE, B. PAQUIN, B. F. LANG,
and R. J. CEDERGREN. 1992. Gene order comparisons for
phylogenetic inference: evolution of the mitochondrial genome. Proc. Natl. Acad. Sci. USA 89:6575–6579.
SEUTIN, G., B. F. LANG, D. P. MINDELL, and R. MORAIS. 1994.
Evolution of the WANCY region in amniote mitochondrial
DNA. Mol. Biol. Evol. 11:329–340.
SHADEL, G. S., and D. A. CLAYTON. 1997. Mitochondrial DNA
maintenance in vertebrates. Annu. Rev. Biochem. 66:409–
435.
SMITH, M. J., A. ARNDT, S. GORSKI, and E. FAJBER. 1993. The
phylogeny of echinoderm classes based on mitochondrial
gene arrangements. J. Mol. Evol. 36:545–554.
SMITH, M. J., D. K. BANFIELD, K. DOTEVAL, S. GORSKI, and
D. J. KOWBEL. 1989. Gene arrangement in sea star mitochondrial DNA demonstrates a major inversion event during echinoderm evolution. Gene 76:181–185.
SMITH, A. E., and K. A. MARCKER. 1968. N-formylmethionyl
transfer RNA in mitochondria from yeast and rat liver. J.
Mol. Biol. 38:241–243.
TZENG, C.-S., C.-F. HUI, S.-C. SHEN, and P. C. HUANG. 1992.
The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations
among vertebrates. Nucleic Acids Res. 20:4853–4858.
WOLSTENHOLME, D. R. 1992. Animal mitochondrial DNA:
structure and evolution. Int. Rev. Cytol. 141:173–216.
WRIGHT, J. W., C. SPOLSKY, and W. M. BROWN. 1983. The
origin of the parthenogenetic lizard Cnemidophorus laredoensis inferred from mitochondrial DNA analysis. Herpetologica 39:410–416.
YOKOBORI, S.-I., T. UEDA, and K. WATANABE. 1993. Codons
AGA and AGG are read as glycine in ascidian mitochondria. J. Mol. Evol. 36:1–8.
YONEYAMA, Y. 1987. The nucleotide sequences of the heavy
and light strand replication origins of the Rana catesbeiana
mitochondrial genome. J. Nippon Med. Sch. (Nippon Ika
Daigaku Zasshi) 54:429–440 [in Japanese].
ZARDOYA, R., and A. MEYER. 1996. The complete nucleotide
sequence of the mitochondrial genome of the lungfish (Protopterus dolloi) supports its phylogenetic position as a close
relative of land vertebrates. Genetics 142:1249–1263.
STEPHEN PALUMBI, reviewing editor
Accepted December 9, 1998