Complete Sequence, Gene Arrangement, and Genetic Code of Mitochondrial DNA of the Cephalochordate Branchiostoma floridae (Amphioxus)

Jeffrey L. Boore, L. Lynne Daehler, and Wesley M. Brown
Department of Biology, University of Michigan, Ann Arbor

We have determined the 15,083-nucleotide (nt) sequence of the mitochondrial DNA (mtDNA) of the lancelet Branchiostoma floridae (Chordata: Cephalochordata). As is typical in metazoans, the mtDNA encodes 13 protein, 2 rRNA, and 22 tRNA genes. The gene arrangement differs from the common vertebrate arrangement by only four tRNA gene positions. Three of these are unique to Branchiostoma, but the fourth is in a position that is primitive for chordates. The genetic code shares the variations found in vertebrate mtDNAs, except that AGA specifies serine, a code variation found in many invertebrate phyla but not in vertebrates (the related codon AGG was not found). Branchiostoma mtDNA lacks a vertebrate-like control region; its largest noncoding region (129 nt) is unremarkable in sequence and base composition, and its location between ND5 and tRNAG differs from that usually found in vertebrates. The mtDNA also lacks a potential hairpin DNA structure like those found in many (though not all) vertebrates to serve as the second-strand (i.e., L-strand) origin of replication. Perhaps related to this, the sequence corresponding to the DHU arm of tRNAC cannot form a helical stem, a condition found in a few vertebrate mtDNAs that also lack a canonical L-strand origin of replication. ATG and GTG codons appear to initiate translation in 11 and 2 of the protein-encoding genes, respectively. Protein genes end with complete (TAA or TAG) or incomplete (T or TA) stop codons; the latter are presumably converted to TAA by post-transcriptional polyadenylation.

Introduction

Complete mitochondrial DNA (mtDNA) sequences have been determined for 36 vertebrate species and partial sequences for hundreds of others.
All are circular DNA molecules containing 37 genes: 13 for proteins (COI–III, ND1–6, ND4L, Cytb, A6, A8); two for rRNAs (srRNA and lrRNA); and 22 for tRNAs (designated by the one-letter amino acid code, with the two S and two L tRNAs differentiated by the codons recognized [AGN/UCN and CUN/UUR, respectively]). The genes are arranged very compactly, with no introns and few intergenic nucleotides. However, all vertebrate mtDNAs examined have a single, large, noncoding region, highly variable in size among (and sometimes within) species, that contains signalling elements for regulating transcription and replication (reviewed by Shadel and Clayton 1997).

Comparisons of mitochondrial systems are useful for modeling genome evolution and for phylogenetic inference. Many complex features are available for comparison: modes of replication and transcription; RNA processing; protein, tRNA, and rRNA secondary structures; patterns of transcript editing; genetic code variations; and the relative arrangements of genes (Sankoff et al. 1992; Smith et al. 1993; Boore and Brown 1994; Boore et al. 1995; Kumazawa and Nishida 1995; Boore 1996; Boore, Lavrov, and Brown 1998). While these (and many additional) features are also present in nuclear genomes, they are currently more accessible for study in the much smaller and simpler mitochondrial genomes (Garesse et al. 1997). Here we describe and analyze the complete sequence of Branchiostoma floridae mtDNA (GenBank accession number AF098298). Branchiostoma floridae belongs to the Cephalochordata, a primitive chordate group that diverged before the vertebrate radiation.

Key words: Branchiostoma, amphioxus, mitochondria, evolution, chordate, genome.
Address for correspondence and reprints: J. L. Boore, Department of Biology, University of Michigan, 830 N. University Avenue, Ann Arbor, Michigan 48109. E-mail: [email protected].
Mol. Biol. Evol. 16(3):410–418. 1999
© 1999 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
Materials and Methods

Live specimens of B. floridae were purchased from Gulf Specimen Co., Panacea, Fla. Mitochondrial DNA was purified by cesium chloride–ethidium bromide centrifugation as in Wright, Spolsky, and Brown (1983). A detailed restriction map was determined by separating radiolabeled fragments on both 1% agarose and 3.5% acrylamide gels, using conditions that allowed detection of fragments as small as 40 base pairs (bp) (Brown 1980). The entire Branchiostoma mtDNA was then cloned into the lambda vector EMBL4 (Stratagene) using the unique EcoRI site at position 8,834–8,839 (numbering from the first nucleotide of COI; see figs. 1 and 2). This clone was verified by comparing restriction enzyme fragment sizes with those expected from the cleavage map of the native mtDNA. After subsequent digestion with other restriction enzymes, mtDNA fragments were subcloned into pBluescript plasmids (Stratagene) and sequenced; some of the longer fragments required additional exonuclease deletion cloning steps (Erase-a-Base; Promega). Sequences were determined on both strands using dideoxynucleotide terminators and radiolabeled nucleotides (Sanger, Nicklen, and Coulson 1977); oligonucleotide sequencing primers were designed as necessary.

Protein-encoding genes were identified by sequence similarity of open reading frames to mitochondrial gene sequences of Cyprinus carpio (Chang, Huang, and Lo 1994). Transfer RNA genes were identified by their potential to form tRNA-like secondary structures; specific identifications were made according to anticodon sequences.

FIG. 1. The gene map of Branchiostoma floridae mtDNA. Genes are abbreviated as in the text and scaling is only approximate; NC refers to the largest noncoding region. Asterisks mark those genes whose positions differ from those of the basic vertebrate arrangement as exemplified by human mtDNA (Anderson et al.
1981) (the choice to mark M rather than Q is arbitrary, as these are in "switched" positions). Transfer RNA genes identified outside of the circle are transcribed clockwise in this figure, as are all other genes except ND6; ND6 and the tRNAs marked inside are transcribed from the opposite strand. An arc marks a homologous portion previously determined for a congeneric species, Branchiostoma lanceolatum (Delarbre et al. 1997).

Results and Discussion

Gene Content and Organization

Complete mtDNA sequences have been published for 36 vertebrate species. The 37 encoded genes are arranged identically in most, but minor variations of the basic arrangement have been found in some animals (sea lamprey [Lee and Kocher 1995], some frogs [Yoneyama 1987; Fujii et al. 1988], reptiles [Seutin et al. 1994; Kumazawa and Nishida 1995; Kumazawa et al. 1996; Quinn and Mindell 1996; Janke and Arnason 1997; Macey et al. 1997], birds [Glaus et al. 1980; Desjardins, Ramirez, and Morais 1990; Desjardins and Morais 1990, 1991; Quinn and Wilson 1993; Ramirez, Savoie, and Morais 1993; Harlid, Janke, and Arnason 1997], and marsupials [Pääbo et al. 1991; Janke et al. 1994]). Each of these variations appears to be derived independently, because the basic arrangement is shared among several vertebrate classes and none of the variations are shared among distantly related groups. This inference is strengthened further by the gene arrangement in Branchiostoma mtDNA, which, except for four tRNA gene positions, is identical to the basic vertebrate arrangement (as exemplified in mammals [Anderson et al. 1981], Xenopus [Roe et al. 1985], and bony fish [Chang, Huang, and Lo 1994]). None of the four variant positions is shared by the corresponding gene in another vertebrate species.

The four tRNA genes whose arrangement in Branchiostoma mtDNA differs from the basic vertebrate arrangement are those for glycine (G), phenylalanine (F), methionine (M), and asparagine (N) (fig. 1).
Two of the differences (tRNAM and tRNAN) are the same as noted in the sequenced portion of Branchiostoma lanceolatum mtDNA (Delarbre et al. 1997). Because the positions of tRNAG and tRNAM are identical in the basic vertebrate arrangement and in the mtDNA of Drosophila (Clary and Wolstenholme 1985) and because the position of tRNAF is similar in the basic arrangement and in the mtDNAs of echinoderms (Jacobs et al. 1988; Smith et al. 1989), it can be argued that the positions of these three tRNAs are derived in Branchiostoma and that their positions in the basic arrangement represent the primitive state for chordates.

From published data it is not possible to determine the primitive chordate position of tRNAN. The primitive arrangement could be as in Branchiostoma mtDNA (between ND2 and tRNAW), with its translocation to a position between tRNAA and tRNAC being a derived condition for vertebrates or, alternatively, the basic vertebrate arrangement could be primitive, with an independent translocation in the lineage leading to Branchiostoma. This is resolved, however, by noting that the position of tRNAN in a hemichordate mtDNA (S. Pääbo, personal communication) is identical to that in Branchiostoma. Assuming that Hemichordata is an outgroup to a clade of Cephalochordata + Vertebrata, the most parsimonious explanation is that the tRNAN position in Branchiostoma mtDNA is primitive for chordates.

Branchiostoma mtDNA is arranged very compactly, even for a mitochondrial genome. In total, there appear to be only 154 noncoding nucleotides (nt): 129 in a single region between ND5–tRNAG; 8 between tRNAS(UCN)–tRNAD; 2 between each of tRNAR–ND4L and tRNAF–tRNAV; 1 between each of tRNAQ–ND2, ND6–tRNAE, and tRNAE–Cytb; and 10 between tRNAY–COI (fig. 2). Several protein-encoding genes are predicted to end with abbreviated stop codons (see below).
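The tally of noncoding nucleotides given above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the spacer lengths and the genome length are taken from the text, while the dictionary layout and variable names are purely illustrative assumptions.

```python
# Intergenic (noncoding) spacer lengths in nt, as listed in the text.
# The dictionary keys and variable names are illustrative only.
noncoding = {
    "ND5-tRNAG": 129,
    "tRNAS(UCN)-tRNAD": 8,
    "tRNAR-ND4L": 2,
    "tRNAF-tRNAV": 2,
    "tRNAQ-ND2": 1,
    "ND6-tRNAE": 1,
    "tRNAE-Cytb": 1,
    "tRNAY-COI": 10,
}

genome_length = 15083  # total mtDNA length in nt, from the abstract

total = sum(noncoding.values())
largest = max(noncoding, key=noncoding.get)
remainder = total - noncoding[largest]
percent_noncoding = 100 * total / genome_length

print(f"total noncoding: {total} nt")
print(f"largest region: {largest} ({noncoding[largest]} nt)")
print(f"outside largest region: {remainder} nt")
print(f"share of genome: {percent_noncoding:.2f}%")
```

Under these figures the eight spacers sum to 154 nt, of which all but 25 nt lie in the ND5–tRNAG region, so roughly 1% of the genome is unassigned.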
In six cases, adjacent genes overlap (COI–tRNAS(UCN), A8–A6, tRNAG–ND6, ND2–tRNAN, tRNAN–tRNAW, and tRNAW–tRNAA); however, in all except one (A8–A6, discussed below) the overlapping genes are encoded on opposite DNA strands. The lack of overlap of the ND4L–ND4 genes is unusual.

Base Composition

Overall, B. floridae mtDNA is 63% A+T, slightly higher than other chordate mtDNAs (e.g., Petromyzon marinus = 62% [Lee and Kocher 1995], C. carpio = 57% [Chang, Huang, and Lo 1994], and Protopterus dolloi = 58% [Zardoya and Meyer 1996]). The 11,262 nt making up the protein-encoding genes are 62% A+T, nearly identical to the mtDNA overall (table 1). The A+T content of first, second, and third codon positions is 54%, 62%, and 70%, respectively. As is typical of metazoan mtDNAs (Cardon et al. 1994), the dinucleotide CpG is significantly underrepresented in Branchiostoma mtDNA.

Table 1
Number of Occurrences (N) and Percentage of Total (%) of the 3,754 Codons in the 13 Protein-Encoding Genes of Branchiostoma floridae mtDNA

Amino acid [anticodon]a: codon N (%)
Phe (F) [GAA]: TTT 170 (4.5); TTC 82 (2.2)
Leu (L) [UAA]: TTA 289 (7.7); TTG 112 (3.0)
Leu (L) [UAG]: CTT 77 (2.1); CTC 15 (0.4); CTA 93 (2.5); CTG 36 (1.0)
Ile (I) [GAU]: ATT 187 (5.0); ATC 46 (1.2)
Met (M) [CAU]: ATA 138 (3.7); ATG 63 (1.7)
Val (V) [UAC]: GTT 130 (3.5); GTC 23 (0.6); GTA 124 (3.3); GTG 68 (1.8)
Ser (S) [UGA]: TCT 77 (2.1); TCC 19 (0.5); TCA 64 (1.7); TCG 17 (0.5)
Pro (P) [UGG]: CCT 76 (2.0); CCC 20 (0.5); CCA 40 (1.1); CCG 22 (0.6)
Thr (T) [UGU]: ACT 65 (1.7); ACC 15 (0.4); ACA 65 (1.7); ACG 32 (0.9)
Ala (A) [UGC]: GCT 126 (3.4); GCC 22 (0.6); GCA 86 (2.3); GCG 34 (0.9)
Tyr (Y) [GUA]: TAT 95 (2.5); TAC 53 (1.4)
TERb: TAA -; TAG -
His (H) [GUG]: CAT 65 (1.7); CAC 26 (0.7)
Gln (Q) [UUG]: CAA 38 (1.0); CAG 47 (1.3)
Asn (N) [GUU]: AAT 87 (2.3); AAC 24 (0.6)
Lys (K) [UUU]: AAA 40 (1.1); AAG 33 (0.9)
Asp (D) [GUC]: GAT 57 (1.5); GAC 23 (0.6)
Glu (E) [UUC]: GAA 58 (1.5); GAG 45 (1.2)
Cys (C) [GCA]: TGT 23 (0.6); TGC 15 (0.4)
Trp (W) [UCA]: TGA 58 (1.5); TGG 48 (1.3)
Arg (R) [UCG]: CGT 16 (0.4); CGC 12 (0.3); CGA 27 (0.7); CGG 21 (0.6)
Ser (S) [GCU]: AGT 73 (1.9); AGC 33 (0.9); AGA 12 (0.3); AGG 0 (0.0)
Gly (G) [UCC]: GGT 92 (2.5); GGC 20 (0.5); GGA 67 (1.8); GGG 113 (3.0)
a The anticodon of the corresponding tRNA is shown in brackets.
b Stop codons are omitted from this analysis.

FIG. 2. A partly schematic representation of the mtDNA sequence of Branchiostoma floridae. Numbers within the slash marks indicate omitted nucleotides. For two genes, the inferred initiation codon is GTG; the corresponding amino acid (M) is in parentheses here to indicate presumed nonconformity with the generally employed genetic code. (Mitochondrial proteins appear to initiate with formyl-methionine [Smith and Marcker 1968] as do those of their bacterial progenitors.) Stop codons, including those inferred to be abbreviated, are marked by an asterisk and the single large noncoding region by a row of dots. A dart (>) marks the last nucleotide of each gene and indicates the direction of transcription. The EcoRI restriction enzyme site used for cloning (positions 8,826–8,831) is underlined.

In earlier studies (Naylor and Brown 1997, 1998), comparison of the protein-encoding portions of the B. floridae mtDNA sequence with those of other metazoans failed to confirm Branchiostoma as the sister taxon to vertebrates, suggesting instead that it is the sister taxon to (echinoderms + vertebrates) (which was viewed as artifactual by the authors). From review of the original data used for these studies, 13 sequencing errors were detected, resulting in frameshifts in COI, ND2, ND4L, and ND6.
A total of 79 amino acids were inferred in error for those previous studies, about 2% of the total amino acids analyzed; whether this would significantly affect their conclusions awaits further analysis.

Initiation and Termination of Protein-Encoding Genes

The mitochondrial protein genes of Branchiostoma correspond well in size and sequence to those of other metazoans (table 2). ATG codons initiate 11 of the 13, COI and ND1 being the exceptions. The inferred initiation codon for COI is GTG, as no ATG or other initiation codon employed in metazoan mitochondrial systems is nearby, and because the sequence of the protein initiated at this position is highly similar to the amino terminal sequences of COI in other metazoans. There is an in-frame TAG stop codon 6 nt upstream of this GTG, and Delarbre et al. (1997) found GTG at the corresponding position in B. lanceolatum, which they inferred as the initiation codon for COI. GTG has been inferred to initiate this and other genes in many metazoan mtDNAs (Wolstenholme 1992). Inference of the initiation codon for ND1 is less certain, as two commonly used initiation codons, GTG and ATA, are in-frame and immediately adjacent at the probable ND1 start site (positions 12,576–12,581 in fig. 2). The GTG codon directly abuts the upstream tRNAL(UUR) gene, followed immediately by ATA. The same ambiguity is present at the 5′ end of ND1 in B. lanceolatum, for which Delarbre et al. (1997) arbitrarily designated ATA as the initiation codon.

Complete TAG stop codons are present in COI, Cytb, ND2, ND5, and ND6, none of which overlaps a downstream gene having the same transcriptional orientation, and a complete TAA stop codon is present in COIII (table 2). Also, A8 almost certainly ends at the TAG codon following the highly conserved sequence (WPW) at the 3′ end of A8, where A8 and A6 overlap. These genes commonly overlap in chordate mtDNAs, where they are known to be translated from the same bicistronic mRNA (Fearnley and Walker 1986).
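Locating a complete stop codon of the kind described above amounts to a scan in a fixed reading frame. The following minimal Python sketch illustrates the idea under stated assumptions: only TAA and TAG are treated as stops (TGA is a tryptophan codon in this mtDNA, as table 1 shows), the function name is hypothetical, and the test sequence is made up rather than taken from Branchiostoma.

```python
# Complete stop codons named in the text; TGA is read as Trp here,
# so it is deliberately not included.
STOP_CODONS = {"TAA", "TAG"}


def first_in_frame_stop(seq, start=0):
    """Return the 0-based index of the first in-frame stop codon at or
    after `start`, or None if the reading frame stays open."""
    seq = seq.upper()
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Made-up toy sequence (not Branchiostoma data):
# GTG ATA TGA CCT TAG AAA
toy = "GTGATATGACCTTAGAAA"
print(first_in_frame_stop(toy))
```

In the toy sequence the in-frame TGA is skipped and the scan stops at the TAG that begins at index 12. The same scan, started from an inferred initiation codon, shows how far an open reading frame would extend if no abbreviated stop codon were assumed.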
The other genes end at incomplete (i.e., T or TA) codons. After transcription and processing, mRNAs ending in T or TA are converted to TAA by polyadenylation (Ojala, Montoya, and Attardi 1981). This is surely the case for COII; if this transcript extended to the first in-frame stop codon, it would overlap the adjacent gene by 29 nt. A6, ND1, ND3, ND4, and ND4L are all inferred to end at incomplete stop codons that directly abut their adjacent, downstream genes; for each, however, allowing the transcript to overlap the downstream gene by 1 or 2 nt would complete the termination codon. Delarbre et al. (1997) sequenced a cDNA to the ND1 mRNA of B. lanceolatum and found that it terminated with TAA, thus confirming the incomplete codon hypothesis for that gene (overlap would have resulted in a TAG stop codon). However, it seems likely that the presence, at least prior to processing, of potentially complete termination codons is not merely coincidence and may be a mechanism for preventing translational readthrough in cases where correct transcript processing fails.

Table 2
Comparisons of the Mitochondrial Protein-Coding Genes of a Lancelet (Branchiostoma floridae), a Carp (Cyprinus carpio; Chang, Huang, and Lo 1994), and a Sea Urchin (Paracentrotus lividus; Cantatore et al. 1989)
ATG TAA GTG TAA ATG TAA ATG TAA ATR TAA ATG TA(G) ATG TAA ATG TAG ATR TAG ATR TAG ATC TA(A) ATG TAG ATG TAG ATG TA(A) ATG TAG GTG TAA ATG T* ATG TA(A) ATG T* ATG TAA ATG TA(G) ATG T(AG) ATG T* ATG TAA ATG TAA ATG TAA 37.0 16.7 76.2 62.5 69.7 67.1 57.7 36.0 54.5 44.8 41.0 40.1 33.1 232 54 517 229 260 380 323 352 116 463 97 638 160 227 54 515 230 262 380 314 346 117 452 91 599 167 A6 A8 COI COII COIII Cytb ND1 ND2 ND3 ND4 ND4L ND5 ND6 227 52 516 230 261 380 324 348 116 460 98 607 172 52.4 30.2 77.4 60.9 73.8 69.7 59.2 34.3 59.2 48.5 40.7 46.8 27.7 34.9 19.4 77.3 64.5 69.2 63.7 53.1 34.1 47.9 45.6 41.5 37.6 23.2 ATG TA(A) ATG TAG GTG TAG ATG T* ATG TAA ATG TAG GTG T(AG) ATG TAG ATG T(AA) ATG TA(G) ATG TA(A) ATG TAG ATG TAG Sea urchin Carp Lancelet Sea urchin/carp Lancelet/ sea urchin Lancelet/carp Sea urchin Carp Lancelet PROTEIN NUMBER OF AMINO ACIDSa PERCENT AMINO ACID IDENTITYb PREDICTED INITIATION AND TERMINATIONc CODONS
a Gene lengths are as inferred in the text and depicted in figure 1 or obtained from GenBank. Actual gene length could be slightly different due to ambiguity in determining start and stop codons.
b Percent identity is the number of identical inferred amino acids in a pairwise alignment divided by the mean length of the two compared sequences.
c For predicted stop codons the parentheses indicate the potential of a complete stop codon overlapping a downstream gene with the same transcriptional orientation. The asterisks indicate that no such potential reasonably exists and that the stop codon is incomplete, presumably completed by polyadenylation of the mRNA (see text).

Transfer RNAs

Twenty-two sequences can be folded into tRNA-like structures (fig. 3). In these, the sequences corresponding in position to the anticodons are identical to those for the mitochondrial tRNA genes of human, chicken, frog, and fish mtDNAs (Anderson et al. 1981; Roe et al. 1985; Desjardins and Morais 1990; Chang, Huang, and Lo 1994). All B. floridae mitochondrial tRNA genes have a TΨC loop of 3–7 nt and a TΨC stem of 3–6 nt; two (tRNAR and tRNAQ) have a single mismatch in this stem. All but tRNAM, tRNAF, and tRNAW have a fully paired, seven-member acceptor stem, and all but tRNAQ and tRNAL(UUR) have a fully paired five-member anticodon stem. All except tRNAT have a four-member extra arm. The dinucleotide between the acceptor and DHU arms is TpA in all tRNA genes except tRNAS(UCN) and tRNAY, where it is TpG, and in tRNAV, where it is GpA.
In all of the tRNA genes, the 2 nt preceding the anticodon are pyrimidines and the nucleotide following it is a purine. Except for tRNAC and tRNAS(AGN), all have DHU arms with a stem of 3–5 nt and a loop of 3–11 nt. For all but tRNAG, the 2 most proximal unpaired nt in the DHU loop are purines (almost always A's). The unpaired replacement for the DHU arm of tRNAS(AGN) is typical of metazoan mtDNAs, as is the potential for additional pairing at the end of the anticodon stem. An unpaired DHU arm in tRNAC is unusual but has precedents among vertebrates; in some (but not all) cases (e.g., Seutin et al. 1994; Macey et al. 1997) it is correlated with the loss of the immediately adjacent stem–loop structure that functions, in many vertebrate mtDNAs, as the second-strand (i.e., L-strand) origin of replication. We speculate that this aberrant tRNAC might represent a compromise structure, serving as both an origin of replication and a functional tRNA gene. A similar condition appears in the mtDNA of P. dolloi (Zardoya and Meyer 1996), where tRNAC and this stem–loop structure partially share the same sequence.

Unassigned DNA

The largest noncoding region in B. floridae mtDNA is only 129 nt. By contrast, the largest noncoding region is 198 nt in P. marinus (Lee and Kocher 1995), 928 nt in C. carpio (Chang, Huang, and Lo 1994), and 1,183 nt in P. dolloi (Zardoya and Meyer 1996) mtDNAs. A search of all vertebrate mtDNA sequences identified no obvious similarity with this noncoding sequence, and its location, between ND5 and tRNAG, differs from that usually found in vertebrates. In B. floridae, this region is slightly less A+T-rich (59%) than the overall mtDNA (63%). Other than in this region, there are only 25 nt, distributed in blocks of 1–10 nt, that are unassigned to genes, and the composition of these appears unremarkable.
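The A+T percentages quoted for the noncoding region and for the genome as a whole are simple base-composition ratios. A minimal Python sketch of such a calculation follows; the function name and the example sequence are illustrative assumptions (the sequence is made up, not the Branchiostoma noncoding region), and ambiguous bases are simply ignored.

```python
def at_fraction(seq):
    """Return the fraction of A and T among the unambiguous bases
    (A, C, G, T) of a DNA sequence; other symbols are ignored."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


# Made-up 12-nt example (7 of 12 bases are A or T).
toy_region = "ATTAGCGTATCG"
print(f"A+T = {100 * at_fraction(toy_region):.0f}%")
```

Applied to the 129-nt region and to the complete 15,083-nt sequence, a function of this kind would be expected to give the 59% and 63% values reported above.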
FIG. 3. The potential secondary structures of the 22 inferred tRNAs of Branchiostoma floridae mtDNA. Nomenclature for tRNA arms is shown for tRNAV. The five additional nucleotides in parentheses outside of the structures and accompanied by an arrow indicate the only differences in comparing the eight sequenced tRNA genes of Branchiostoma lanceolatum (Delarbre et al. 1997).

Genetic Code: AGA Specifies Serine, Not Glycine, in Branchiostoma mtDNA

In vertebrate mtDNAs only AGY specifies serine, with AGR codons being absent or, when present, used as stop codons. In all invertebrate mtDNAs except those of cnidarians both AGR and AGY appear to specify serine (reviewed by Wolstenholme 1992). (AGR specifies arginine in cnidarian mtDNA, as in the "universal" code, presumably due to the use of imported, nuclear-encoded tRNAs.) Based on a single AGA and no AGG codons, Delarbre et al. (1997) suggested a variation for the lancelet mitochondrial genetic code in which AGR specifies glycine and AGY, serine. Based on the paucity of data available to them, that suggestion may have been reasonable. However, with the much larger data set presented here (and noting that no AGG codons are present), it is clear that AGA (along with AGY) specifies serine in Branchiostoma mtDNA. There are 12 AGA codons in B. floridae mtDNA, 1 of which is identical in position to the single AGA codon found by Delarbre et al. (1997) in the ND2 gene of B. lanceolatum. In alignments with the corresponding gene sequences of P. marinus (Lee and Kocher 1995), C. carpio (Chang, Huang, and Lo 1994), P. dolloi (Zardoya and Meyer 1996), Gadus morhua (Johansen, Guddal, and Johansen 1990), and Crossostoma lacustre (Tzeng et al. 1992), the Branchiostoma AGA codons correspond most frequently to serine codons, with the correspondence to TCN codons being even more frequent than to AGY codons. No tRNA genes in B.
floridae mtDNA have a TCT anticodon, as would be needed to discriminate AGR from the AGN codon family; moreover, only tRNAS(AGN) has an NCT anticodon. Even though we believe that there is unequivocal evidence that AGN specifies serine in Branchiostoma mtDNA, the absence of AGG and reduced usage of AGA codons in this mtDNA can be viewed as a precondition for codon reassignment, as shown even better for the mtDNA of the hemichordate Balanoglossus (Castresana, Feldmaier-Fuchs, and Pääbo 1998).

AGR codons appear to specify glycine in the urochordate Halocynthia roretzi mtDNA, because of their frequent alignment with glycine (GGN) codons in other metazoan mtDNAs (Yokobori, Ueda, and Watanabe 1993). The assignment of glycine as the amino acid specified by AGR codons in Halocynthia mtDNA is based on 19 occurrences in a 1,263-nt fragment of COI (Yokobori, Ueda, and Watanabe 1993). No AGR codons appear in COI of B. floridae; however, 11 of the 12 AGA codons and 5 of the 7 AGG codons in Halocynthia COI align with glycine codons (GGN) in B. floridae COI, providing further evidence for the reassignment of AGR in the urochordate lineage, after it split from the lineage leading to cephalochordates. Sequence alignments of the B. floridae and vertebrate mitochondrial proteins suggest that there are no other differences between its genetic code and that used in vertebrate mtDNAs.

Nucleotide Sequence Comparisons with B. lanceolatum mtDNA

The 2,562 nt previously determined for B. lanceolatum (Delarbre et al. 1997) are remarkably similar in sequence to the corresponding region of B. floridae mtDNA, with only 73 nt differences between the two (>97% sequence identity). This common region includes complete genes for ND1, ND2, and eight tRNAs, and parts of genes for COI and a ninth tRNA. There is evidence for only one insertion/deletion event. This results in a single nucleotide difference in the lengths of the noncoding region between tRNAY–COI (10 nt in B. floridae, 9 nt in B.
lanceolatum) that are otherwise identical. In the aggregate, the tRNA gene sequences of the two species differ by five substitutions (fig. 3). Of these, four are in loops and one is in a stem (which changes a T–G pair to a C–G pair); all five substitutions are transitions. The ND1 genes differ by 15 substitutions (disregarding whether initiated by GTG or ATA), of which 13 are synonymous and 2 nonsynonymous; 11 are transitions and 4 are transversions. The ND2 genes differ by 52 substitutions, of which 47 are synonymous and 5 nonsynonymous; 50 are transitions and 2 are transversions. However, the ND2 situation is made complex by Delarbre et al.'s (1997) report of a second ND2 copy, which could be either mitochondrial or nuclear. The sequence of the second copy was not specifically reported, but they noted that the two differed by 42 substitutions, 8 of which cause amino acid replacements, which they describe. At these eight positions, the B. floridae ND2 amino acid sequence is identical to the first (reported) copy at three and to the second copy at five. We found no evidence for a second ND2 copy in the B. floridae mtDNA sequence, and the size of the purified mtDNA, estimated from restriction fragment length summations, is clearly insufficient to accommodate a second copy of this gene.

Acknowledgments

Thanks to Susan Fuerstenberg, Kevin Helfenbein, Gavin Naylor, and Alan Wolf for helpful comments on the manuscript and to Svante Pääbo, Jose Castresana, and coworkers for sharing the Balanoglossus data with us prior to publication. This work was supported by NSF grant DEB-9220640 to W.M.B.

LITERATURE CITED

ANDERSON, S., A. T. BANKIER, B. G. BARRELL et al. (14 coauthors). 1981. Sequence and organization of the human mitochondrial genome. Nature 290:457–465.
BOORE, J. L. 1996. Ancient patterns of arthropod evolution are recorded in mitochondrial genome rearrangements. Pp. 69–78 in M. NEI and N. TAKAHATA, eds.
Current topics in molecular evolution: proceedings of the U.S.–Japan Binational Workshop on Molecular Evolution. Graduate School for Advanced Studies, Hayama, Japan.
BOORE, J. L., and W. M. BROWN. 1994. Mitochondrial genomes and the phylogeny of mollusks. Nautilus 108(Suppl. 2):61–78.
BOORE, J. L., T. M. COLLINS, D. STANTON, L. L. DAEHLER, and W. M. BROWN. 1995. Deducing the pattern of arthropod phylogeny from mitochondrial DNA rearrangements. Nature 376:163–165.
BOORE, J. L., D. V. LAVROV, and W. M. BROWN. 1998. Gene translocation links insects and crustaceans. Nature 393:667–668.
BROWN, W. M. 1980. Polymorphism in mitochondrial DNA of humans as revealed by restriction endonuclease analysis. Proc. Natl. Acad. Sci. USA 77:3605–3609.
CANTATORE, P., M. ROBERTI, G. RAINALDI, M. N. GADALETA, and C. SACCONE. 1989. The complete nucleotide sequence, gene order and genetic code of the mitochondrial genome of Paracentrotus lividus. J. Biol. Chem. 264:10965–10975.
CARDON, L. R., C. BURGE, D. A. CLAYTON, and S. KARLIN. 1994. Pervasive CpG suppression in animal mitochondrial genomes. Proc. Natl. Acad. Sci. USA 91:3799–3803.
CASTRESANA, J., G. FELDMAIER-FUCHS, and S. PÄÄBO. 1998. Codon reassignment and amino acid composition in hemichordate mitochondria. Proc. Natl. Acad. Sci. USA 95:3703–3707.
CHANG, Y.-S., F.-L. HUANG, and T.-B. LO. 1994. The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38:138–155.
CLARY, D. O., and D. R. WOLSTENHOLME. 1985. The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J. Mol. Evol. 22:252–271.
DELARBRE, C., V. BARRIEL, S. TILLIER, P. JANVIER, and G. GACHELIN. 1997. The main features of the craniate mitochondrial DNA between the ND1 and the COI genes were established in the common ancestor with the lancelet. Mol. Biol. Evol. 14:807–813.
DESJARDINS, P., and R. MORAIS. 1990. Sequence and gene organization of the chicken mitochondrial genome: a novel gene order in higher vertebrates. J. Mol. Biol. 212:599–634.
DESJARDINS, P., and R. MORAIS. 1991. Nucleotide sequence and evolution of coding and noncoding regions of a quail mitochondrial genome. J. Mol. Evol. 32:153–161.
DESJARDINS, P., V. RAMIREZ, and R. MORAIS. 1990. Gene organization of the Peking duck mitochondrial genome. Curr. Genet. 17:515–518.
FEARNLEY, I. M., and J. E. WALKER. 1986. Two overlapping genes in bovine mitochondrial DNA encode membrane components of ATP synthase. EMBO J. 5:2003–2008.
FUJII, H., T. SHIMADA, Y. GOTO, and T. OKAZAKI. 1988. Cloning of the mitochondrial genome of Rana catesbeiana and the nucleotide sequences of the ND2 and five tRNA genes. J. Biochem. 103:474–481.
GARESSE, R., J. A. CARRODEGUAS, J. SANTIAGO, M. L. PEREZ, R. MARCO, and C. G. VALLEJO. 1997. Artemia mitochondrial genome: molecular biology and evolutive considerations. Comp. Biochem. Physiol. B. Biochem. Mol. Biol. 117:357–366.
GLAUS, K. R., H. P. ZASSENHAUS, N. S. FECHHEIMER, and P. S. PERLMAN. 1980. Avian mtDNA: structure, organization and evolution. Pp. 131–135 in A. M. KROON and C. SACCONE, eds. The organization and expression of the mitochondrial genome. Elsevier/North-Holland Biomedical Press, Amsterdam.
HARLID, A., A. JANKE, and U. ARNASON. 1997. The mtDNA sequence of the ostrich and the divergence between paleognathous and neognathous birds. Mol. Biol. Evol. 14:754–761.
JACOBS, H. T., D. J. ELLIOTT, V. B. MATH, and A. FARQUHARSON. 1988. Nucleotide sequence and gene organization of sea urchin mitochondrial DNA. J. Mol. Biol. 202:185–217.
JANKE, A., and U. ARNASON. 1997. The complete mitochondrial genome of Alligator mississippiensis and the separation between recent Archosauria (birds and crocodiles). Mol. Biol. Evol. 14:1266–1272.
JANKE, A., G. FELDMAIER-FUCHS, W. K. THOMAS, A. VON HAESELER, and S. PÄÄBO. 1994. The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137:243–256.
JOHANSEN, H., P. H. GUDDAL, and T. JOHANSEN. 1990. Organization of the mitochondrial genome of Atlantic cod, Gadus morhua. Nucleic Acids Res. 18:411–419.
KUMAZAWA, Y., and M. NISHIDA. 1995. Variations in mitochondrial tRNA gene organization of reptiles as phylogenetic markers. Mol. Biol. Evol. 12:759–772.
KUMAZAWA, Y., H. OTA, M. NISHIDA, and T. OZAWA. 1996. Gene rearrangements in snake mitochondrial genomes: highly concerted evolution of control-region-like sequences duplicated and inserted into a tRNA gene cluster. Mol. Biol. Evol. 13:1242–1254.
LEE, W.-J., and T. KOCHER. 1995. Complete sequence of a sea lamprey (Petromyzon marinus) mitochondrial genome: early establishment of the vertebrate genome organization. Genetics 139:873–887.
MACEY, J. R., A. LARSON, N. B. ANANJEVA, and T. PAPENFUSS. 1997. Evolutionary shifts in three major structural features of the mitochondrial genome among iguanian lizards. J. Mol. Evol. 44:660–674.
NAYLOR, G. J. P., and W. M. BROWN. 1997. Structural biology and phylogenetic estimation. Nature 388:527–528.
NAYLOR, G. J. P., and W. M. BROWN. 1998. Amphioxus mitochondrial DNA, chordate phylogeny, and the limits of inference based on comparisons of sequences. Syst. Biol. 47:61–76.
OJALA, D., J. MONTOYA, and G. ATTARDI. 1981. tRNA punctuation model of RNA processing in human mitochondria. Nature 290:470–474.
PÄÄBO, S., W. K. THOMAS, K. M. WHITFIELD, Y. KUMAZAWA, and A. C. WILSON. 1991. Rearrangements of mitochondrial transfer RNA genes in marsupials. J. Mol. Evol. 33:426–430.
QUINN, T. W., and D. P. MINDELL. 1996. Mitochondrial gene order adjacent to the control region in crocodile, turtle, and tuatara. Mol. Phylogenet. Evol. 5:344–351.
QUINN, T. W., and A. C. WILSON. 1993. Sequence evolution in and around the mitochondrial control region in birds. J. Mol. Evol. 37:417–425.
RAMIREZ, V., P. SAVOIE, and R. MORAIS. 1993. Molecular characterization and evolution of a duck mitochondrial genome. J. Mol. Evol. 37:296–310.
ROE, B. A., D.-P. MA, R. K. WILSON, and J. J.-H. WONG. 1985. The complete nucleotide sequence of the Xenopus laevis mitochondrial genome. J. Biol. Chem. 260:9759–9774.
SANGER, F., S. NICKLEN, and A. R. COULSON. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463–5467.
SANKOFF, D., G. LEDUC, N. ANTOINE, B. PAQUIN, B. F. LANG, and R. J. CEDERGREN. 1992. Gene order comparisons for phylogenetic inference: evolution of the mitochondrial genome. Proc. Natl. Acad. Sci. USA 89:6575–6579.
SEUTIN, G., B. F. LANG, D. P. MINDELL, and R. MORAIS. 1994. Evolution of the WANCY region in amniote mitochondrial DNA. Mol. Biol. Evol. 11:329–340.
SHADEL, G. S., and D. A. CLAYTON. 1997. Mitochondrial DNA maintenance in vertebrates. Annu. Rev. Biochem. 66:409–435.
SMITH, M. J., A. ARNDT, S. GORSKI, and E. FAJBER. 1993. The phylogeny of echinoderm classes based on mitochondrial gene arrangements. J. Mol. Evol. 36:545–554.
SMITH, M. J., D. K. BANFIELD, K. DOTEVAL, S. GORSKI, and D. J. KOWBEL. 1989. Gene arrangement in sea star mitochondrial DNA demonstrates a major inversion event during echinoderm evolution. Gene 76:181–185.
SMITH, A. E., and K. A. MARCKER. 1968. N-formylmethionyl transfer RNA in mitochondria from yeast and rat liver. J. Mol. Biol. 38:241–243.
TZENG, C.-S., C.-F. HUI, S.-C. SHEN, and P. C. HUANG. 1992. The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations among vertebrates. Nucleic Acids Res. 20:4853–4858.
WOLSTENHOLME, D. R. 1992. Animal mitochondrial DNA: structure and evolution. Int. Rev. Cytol. 141:173–216.
WRIGHT, J. W., C. SPOLSKY, and W. M. BROWN. 1983. The origin of the parthenogenetic lizard Cnemidophorus laredoensis inferred from mitochondrial DNA analysis. Herpetologica 39:410–416.
YOKOBORI, S.-I., T. UEDA, and K. WATANABE. 1993. Codons AGA and AGG are read as glycine in ascidian mitochondria. J. Mol. Evol. 36:1–8.
YONEYAMA, Y. 1987. The nucleotide sequences of the heavy and light strand replication origins of the Rana catesbeiana mitochondrial genome. J. Nippon Med. Sch. (Nippon Ika Daigaku Zasshi) 54:429–440 [in Japanese].
ZARDOYA, R., and A. MEYER. 1996. The complete nucleotide sequence of the mitochondrial genome of the lungfish (Protopterus dolloi) supports its phylogenetic position as a close relative of land vertebrates. Genetics 142:1249–1263.

STEPHEN PALUMBI, reviewing editor
Accepted December 9, 1998