Widespread Recombination in Published Animal mtDNA Sequences

Widespread Recombination in Published Animal mtDNA Sequences1
A. D. Tsaousis,* D. P. Martin, E. D. Ladoukakis,*à D. Posada,§ and E. Zouros*
*Department of Biology, University of Crete, Iraklio, Crete, Greece; Institute of Infectious Diseases and Molecular Medicine,
University of Cape Town, Observatory 7925, Cape Town, South Africa; àCentre for the Study of Evolution & School of Life Sciences,
University of Sussex, Brighton, United Kingdom; and §Facultad de Biologı́a, Universidad de Vigo, Vigo, Spain
Mitochondrial DNA (mtDNA) recombination has been observed in several animal species, but there are doubts as to
whether it is common or only occurs under special circumstances. Animal mtDNA sequences retrieved from public databases were unambiguously aligned and rigorously tested for evidence of recombination. At least 30 recombination events
were detected among 186 alignments examined. Recombinant sequences were found in invertebrates and vertebrates,
including primates. It appears that mtDNA recombination may occur regularly in the animal cell but rarely produces
new haplotypes because of homoplasmy. Common animal mtDNA recombination would necessitate a reexamination
of phylogenetic and biohistorical inference based on the assumption of clonal mtDNA transmission. Recombination
may also have an important role in producing and purging mtDNA mutations and thus in mtDNA-based diseases and
senescence.
Introduction
Recombination of mitochondrial DNA (mtDNA) is
common in protist, fungi, and plant species (for review,
see Gray 1989) but was thought to be absent or rare in animals
(Avise 1994). This perception is based on failure to observe
recombinant haplotypes in surveys of mtDNA variation in
natural populations or animal cell cultures (Zuckerman
et al. 1984). Indirect support for this view comes from the
sequestration of mtDNA molecules (Nass 1969; Satoh and
Kuroiwa 1991) and the absence of detectable excision repair
activity and crossover products in mammalian mitochondria
(Clayton, Doda, and Friedberg 1974). However, the demonstration that human mitochondria contain enzymes for the
catalysis of homologous recombination (Thyagarajan,
Padua, and Campbell 1996) and the observation of mitochondrial fusion in Drosophila (Yaffe 1999) and in rat liver cells
(Cortese 1999) suggest the plausibility of recombination.
Homologs of the genes involvedin Drosophila mitochondrial
fusion (fuzzy onions) have been identified in yeast (Fzo 1)
and humans (Mfn1 and Mfn2) (Santel and Fuller 2001),
but the frequency with which human mitochondria actually
fuse is a matter of disagreement (Enriquez et al. 2000; Legros
et al. 2002).
Direct evidence of recombination now exists in four
animal species: the nematode Meloidogyne javanica (Lunt
and Hyman 1997), the mussel sister-species Mytilus galloprovincialis (Ladoukakis and Zouros 2001a) and Mytilus
trossulus (Burzynski et al. 2003), the flatfish Platichthys
flesus (Hoarau et al. 2002), and, most recently, man
(Kraytsberg et al. 2004). It is important to emphasize the
distinction between intramolecular and intermolecular
crossing-over and between nonhomologous and homologous recombination (homologous recombination is by definition intermolecular, but the reverse is not true). It is
homologous recombination that has strong implications
for the molecular biology and evolution of mitochondrial
genomes and whose occurrence in animals has been in
1
This article is dedicated to the memory of John Maynard Smith.
Key words: animal mtDNA, recombination.
E-mail: [email protected].
Mol. Biol. Evol. 22(4):925–933. 2005
doi:10.1093/molbev/msi084
Advance Access publication January 12, 2005
doubt. Recombination in Meloidogyne javanica (Lunt
and Hyman 1997) is an example of intramolecular crossing-over facilitated by the presence of tandem repeats. Size
heteroplasmy due to variable number of tandem repeats
(VNTRs) is a common feature of the mtDNA control region
of many animal species (Densmore, Wright, and Brown
1985; Moritz and Brown 1987; Bentzen, Legget, and
Brown 1988; Gjetvai, Cook, and Zouros 1992) and is
thought to be the result of either intramolecular or intermolecular crossing-over (Rand and Harrison 1986; Buroker
et al. 1990; La Roche et al. 1990). The study of Hoarau
et al. (2002) provides direct evidence for intermolecular
crossing-over between arrays of tandem repeats. The
reports of recombination in Mytilus galloprovincialis and
man, on the other hand, are clear cases of homologous
recombination.
However, there are several reasons why these diverse
observations cannot be considered conclusive evidence for
widespread animal mtDNA recombination. Unequal crossovers involving VNTRs are localized to small noncoding
parts of mitochondrial genomes, and although VNTRs
are common, they are not a general feature of animal
mtDNA. It would also be difficult to draw general conclusions about animal mtDNA recombination from studies
involving species of the mussel family Mytilidae because
they have an uncommon system of mtDNA inheritance
known as doubly uniparental inheritance (DUI). DUI is
characterized by the presence of two mitochondria genomes, one transmitted through the eggs and the other through
the sperm (Skibinski, Gallagher, and Beynon 1994; Zouros
et al. 1994). Heteroplasmy (presence of more than one type
of mtDNA molecules in an individual) for the two genomes
is obligatory in males. The high-frequency recombination
observed in these males may, therefore, be considered a
peculiarity of DUI.
There is, however, another way to interpret the available evidence for and against widespread animal mtDNA
recombination. This is that mtDNA recombination occurs
regularly in the animal cell, but its products are difficult to
detect under normal circumstances. Strict maternal inheritance in animals ensures homoplasmy (presence of a single
type of mtDNA molecule in an individual). Any two
mtDNA molecules in an organism will, with the exception
of postfertilization mutations, have identical sequences and
Molecular Biology and Evolution vol. 22 no. 4 Ó Society for Molecular Biology and Evolution 2005; all rights reserved.
926 Tsaousis et al.
so too will any molecules resulting from their homologous
recombination. All four studies where recombination was
observed were designed to take advantage of the fact that
recombinants, if they occurred, would be very different
from the paternal haplotypes.
It follows therefore that demonstration of widespread
animal mtDNA recombination requires the following evidence: (1) it is truly homologous recombination, (2) it is
not confined to specific regions of the genome, particularly
those that are prone to either intramolecular or intermolecular pairing and crossing-over, (3) it occurs in a variety of
distantly related species, and (4) its products can spread and
survive in the population. In this paper, we provide such evidence. We have extended the approach of Ladoukakis and
Zouros (2001b) by both applying it to a much larger data set
and, most importantly, making use of the most powerful statistical tests that are presently available for the detection of
recombination in sets of aligned DNA sequences.
Materials and Methods
Sequence Data
Sequences retrieved from public databases were used
to generate sets of sequences hereafter referred to as ‘‘alignments.’’ Each alignment consisted of sequences from the
same region of an mtDNA protein gene that had been
derived from individuals of the same or closely related species. In total, 308 alignments were generated from which
we retained 186 for analysis. These consisted of sequences
that contained no termination codons and could be unambiguously aligned with no indels from the first to the last
codon. These 186 alignments included 7 invertebrate, 22
fish, 10 amphibian, 22 reptile, 14 avian, and 111 mammalian genera, representing a total of 1,091 species. Thirty of
the mammalian genera were primates. The number of
sequences varied among alignments from 4 to 148, the
sequence length from 225 to 1,472 base pairs, the number
of informative sites from 5 to 605, and the nucleotide diversity from 0.003 to 0.224. One hundred and twenty-eight
alignments coded for cob, 12 for cox2, 12 for nad2, 8
for nad1, 7 for nad5, 6 for cox1, 6 for nad4, 4 for nad3,
2 for cox3, and 1 for nad6 (see Supplementary Material).
A second set of 186 alignments was generated by retaining
only third codon positions. The third codon position alignments were used to avoid detection of linkage disequilibria
generated by selection rather than recombination.
Detection of Recombination
Alignments were searched for evidence of recombination using nine different methods, which could be classified
as global or local methods. Global methods provide a P
value for the occurrence of recombination but without identifying or characterizing particular recombination events,
while local methods provide P values for particular regions
of the sequences delimited by putative recombination
breakpoints.
Global methods included MaxChi Global (Posada and
Crandall 2001), Reticulate (Jakobsen and Easteal 1996),
and Geneconv Global (Sawyer 1989). Results of tests per-
formed using these methods were considered to be significant if they produced a P value smaller than 0.05.
The MaxChi Global method (Posada and Crandall
2001) is an extension of that of Maynard Smith (Maynard
Smith 1992). It was implemented in a personal program
written in C. MaxChi Global uses a sliding window that
is moved along every pair of sequences considering only
variable sites. Within each window, a chi square is calculated with the number of matches and mismatches between
the two sequences in the first and second halves of the window. The test statistic is the maximum chi square observed
in the original alignment. Its null distribution (i.e., under no
recombination) is obtained by permuting the columns of the
alignment and recalculating its value several times. When
applying the test, the step size for moving the sliding window along the sequences was set to 2. The minimum halfsize of the sliding window was set to five variable sites,
while the maximum half-size of the sliding window was
set to the number of variable sites divided by 3. P values
were estimated by randomizing the alignment 1,000 times.
The Reticulate partition compatibility method (Jakobsen
and Easteal 1996) was implemented with the program
Reticulate (Jakobsen, Wilson, and Easteal 1997) available
at http://jcsmr.anu.edu.au/dmm/humgen/ingrid/reticulate.
htm. In this method, a matrix is constructed in which a cell
contains information about the compatibility of a pair of
informative sites. Two sites are incompatible when the
same tree cannot be constructed from these two sites without assuming multiple changes, otherwise they are compatible. The neighbor similarity score is calculated as the
number of adjacent cells that are either compatible or not
compatible. The statistical significance of this test is calculated by permutation. Here the number of permutations was
set to 1,000.
The Geneconv Global method is an extension of that
of Sawyer (Sawyer 1989; http://www.math.wustl.edu/
;sawyer/geneconv/). In this method, global P values are
based on a version of the Blast score (Altschul et al.
1990). The global permutation P value is the proportion
of permuted alignments for which some fragment for some
pair of sequences has a higher score than the observed fragment. These P values have a built-in multiple-comparison
correction for all sequence pairs in the alignment. The
g-scale parameter was set to 0 in order to be most conservative. The number of permutations was set to 10,000.
P values for local methods were Bonferroni corrected
for multiple tests. For each unique recombination event
detected in an alignment, the P values reported are the
smallest corrected probabilities found by each method. A
method was considered to indicate significant evidence
for recombination when the corrected P value was smaller
than 0.05. Local methods included recombination detection
program (RDP; Martin and Rybicki 2000), Geneconv
Local, MaxChi Local (Martin, Williamson, and Posada
2004), Chimaera Local (Martin, Williamson, and Posada
2004), SiScan (Gibbs, Armstrong, and Gibbs 2000), and
RecScan (Martin, et al. 2005). All local methods were
implemented in the program RDP2 (Martin, Williamson,
and Posada 2004), available from http://darwin.uvigo.es/
rdp/rdp.html for the Windows OS. The manual of this program contains detailed information on the implementation
Animal mtDNA Recombination 927
of the different algorithms. Common settings in RDP2 for
all methods were that sequences were considered as linear,
the P-value cutoff was set to 0.05, the standard Bonferroni
correction was used, consensus daughters were found,
breakpoints were polished, and only unique events detected
by two or more methods were listed (see RDP2’s manual).
The original RDP method (Martin and Rybicki 2000)
uses a sliding window to examine variable nucleotide positions of every possible sequence triplet for a change in their
percentage identities along an alignment. The significance
of this change is approximated using the binomial distribution. This P value is then corrected for multiple significance
by multiplying it by the number of independent windows
and triplets examined. Here the length of the window
was set to 10 variable sites, and the step size was set to
1 nucleotide (nt).
The Geneconv Local method (Martin, Williamson,
and Posada 2004) is identical to the Geneconv Global
method except that instead of analyzing the entire alignment, it works with three sequences at a time, and Bonferroni corrected Blast-like P values are used. Although in the
Geneconv Global test we used a g-scale of 0 to be most
conservative, detecting specific breakpoints is much more
difficult (there is less data than in the global case) and in this
case the g-scale parameter was set to 1.
The MaxChi Local method (Martin, Williamson, and
Posada 2004) examines every possible pair of sequences in
every triplet in the alignment. This implementation is a
modification of the MaxChi Global method but analyzes
three sequences at a time. Here the length of the window
was set to 30 variable sites, and the step size was set to
1 nt.
The Chimaera Local method (Martin, Williamson, and
Posada 2004) is similar to the MaxChi Local method, but in
this case, a match occurs when the putative recombinant
sequence matches parental A but not parental B, while a
mismatch occurs when the putative recombinant sequence
matches parental B but not parental A. Here the length of
the window was set to 30 variable sites, and the step size
was set to 1 nt.
The SiScan method (Gibbs, Armstrong, and Gibbs
2000) implemented in RDP2 examines all combinations
of three sequences plus an out-group sequence. This
method interprets an increase in the frequency of particular
patterns of sites as evidence of recombination. The statistical significance of the test statistic is obtained through permutation of the alignment. Here the size of the window was
set to 100, and it was moved 20 nt at a time. Only variable
sites were considered. The number of permutations was set
to 1,000.
The RecScan method (Martin et al. 2005) is a modification of the Bootscan method (Salminen et al. 1995). It
interprets significant alterations in pairwise distance patterns as evidence of recombination. The alterations of
the patterns are quantified in terms of their bootstrap support, and the P value is approximated using a Bonferroni
corrected version of the binomial distribution. Here the window size was set to 100 sites, while the window step size
was set to 20. Candidate events were identified from 100
bootstrap replicates, and the bootstrap cutoff was set to
70%.
Strength of Recombination Signal
Alignments were classified in four arbitrary categories
regarding the evidence for recombination:
1. Strong evidence: This category included alignments for
which the null hypothesis of no recombination was
rejected by at least one of the global methods and by
two local methods.
2. Good evidence: This category included alignments for
which the hypothesis of no recombination was rejected
by more than one test (regardless of being a global or a
local method) but did not meet the set of criteria for the
first class.
3. Weak evidence: This category included alignments for
which only one test (global or local) rejected the hypothesis of no recombination.
4. No evidence: This category included alignments for
which no test rejected the hypothesis of no recombination.
Correlation of Recombination Detection Power with
Alignment Features
To obtain a relative index of the overall evidence of
recombination in an alignment we used the score
k
Y 5 ÿ 2 + ln Pi ;
i51
where Pi is the probability from the ith recombination
detection method and k is the number of methods. If probabilities from different methods are uncorrelated, Y should
be distributed as a chi square with 2k degrees of freedom.
This score does not provide the overall probability of
recombination in an alignment because in fact some methods may be more likely to produce similar probabilities than
others. Here it was rather used to evaluate the effect of several varying alignment features (such as number, length,
and diversity of sequences) on the probability of detecting
recombination when multiple detection methods are used.
Results and Discussion
One strategy for the detection of mtDNA recombination in data sets from population surveys is to search for
linkage disequilibria at variable sites in haplotypes from
the same species (Awadalla, Eyre-Walker, and Maynard
Smith 1999; Elson et al. 2001). One difficulty with this
strategy is the confounding effect of homoplasy, which
can be overcome by looking at the correlation of linkage
disequilibrium with the proximity of variable sites in the
genome (Awadalla, Eyre-Walker, and Maynard Smith
1999; Elson et al. 2001). Another difficulty is that the frequencies of the different haplotypes in the sample must be
representative of the population frequencies, which would
not normally be the case in data sets drawn from public
databases or from population surveys that were not
designed for this purpose. In fact, results from this strategy
have proven either equivocal (Kivisild et al. 2000) or negative (Elson et al. 2001). Another strategy (Ladoukakis
and Zouros 2001b) is to look for linkage disequilibria
among genomes that were already highly differentiated
928 Tsaousis et al.
at the time of recombination. It is known that the mechanism of maternal inheritance may break down in crosses
between different species (Shitara et al. 1998). Hybrids
from these crosses are therefore likely to be heteroplasmic
for highly divergent mitochondrial genomes. If there is no
impediment to mtDNA recombination, fertile offspring
from such crosses will be vectors generating and passing
on recombinant haplotypes to the population that will be
easily distinguishable from the parental types. In due time,
parental and recombinant haplotypes will occur either as
intraspecific or as interspecific variants. A limitation of this
strategy is that such variants might be rare in natural populations because the reduced fitness of hybrids and their
immediate progeny diminishes the long-term survival prospect of recombinant haplotypes. Another limitation is that
postrecombination mutation may obscure the distinction
between recombinant and parental types.
The second strategy was first applied to a small survey
of sequences retrieved from public databases (Ladoukakis
and Zouros 2001b). Only protein-coding mtDNA sequences were examined because they offered the assurance of
unambiguous alignment. Putative triads of sequences
involved in a recombination event and putative crossover
points were first identified by visual inspection of amino
acid sequences. Statistical testing was based on generating
sequences of the same length as the putatively exchanged
segment by random sampling of nucleotide sites from the
sequence triad and asking whether any such random
sequence could produce a better degree of matching
between donor and recipient sequences than the matching
produced by the putatively exchanged sequence itself.
Three cases of potential recombination were detected: in
the amphipod Gammarus, in the frog Rana, and in the
rodent Apodemus. Maynard Smith and Smith (2002) reexamined the data and concluded that the evidence for recombination was strong for Rana, weak for Apodemus, and
absent for Gammarus.
In this paper, we report the results from a much larger
survey of mtDNA sequences. As in Ladoukakis and Zouros
(2001b), only protein-coding sequences were examined.
The data were analyzed with an array of statistical methods
that have been developed in recent years for the detection of
recombination in large sets of aligned DNA sequences. All
of these methods other than the SiScan method have been
recently evaluated. In fact, they were chosen because they
show reasonable power and low false-positive rates, the latter being independent of divergence levels and recurrent
mutation (Posada and Crandall 2001; Posada, Crandall,
and Holmes 2002; Martin, Williamson, and Posada 2004).
Based on the arbitrary criteria categorizing the strength
of evidence for recombination (see Materials and Methods), the 186 alignments were classified as follows: 26 with
strong evidence, 36 with good evidence, 36 with weak evidence, and 88 with no evidence. When third codon position
alignments were examined, these numbers were 21, 14, 26,
and 125 for strong, good, weak, and no evidence, respectively. Thirty alignments were sorted into the first class
when full or third codon position alignments were examined (table 1). These include the Rana and Apodemus alignments, for which evidence for recombination has been
previously detected (Ladoukakis and Zouros 2001b).
One risk in ‘‘real time’’ recombination studies is that
when polymerase chain reactions (PCRs) are presented
with more than one molecule, the polymerase may switch
templates thus creating recombinant products (Paabo,
Irwin, and Wilson 1990). The possibility that this might
have happened in the sequences we have examined must
be very low. Each sequence was extracted from a different
individual, and in many alignments, different sequences
were determined in different laboratories. In an experiment
designed to test the frequency with which ‘‘PCR recombinants’’ occur during the amplification of mixed populations
of undamaged target DNA (as opposed to highly degraded
DNA common in ancient DNA studies), no such recombinants were found (Ladoukakis and Zouros 2001a). Also, if
recombination occurred through PCR jumping, exchanged
regions found in two different sequences should be perfect
copies of one other. Visual inspection of the recombinant
sequences that we identified revealed no perfect matches
between the putatively exchanged parts of these sequences
and those of their most probable parent sequences (fig. 1).
Another possibility is that the nonrandom associations
we have observed are real but that they have resulted from
selection rather than recombination. Because the sequences
in our alignments were drawn from different species, it is
conceivable that selection acting on mtDNA alone or
through nuclear-mtDNA interactions has favored different
combinations of amino acids in different sequences. To
minimize the possibility that the associations are caused
by selection, we performed the same statistical analysis
on alignments that contained only third codon position
nucleotides. Examination of only third codon position
nucleotide sequences is an unduly conservative constraint
on the detection of recombination because the number of
compared sites is reduced to approximately one third,
and the retained sites are those that are most likely to accumulate postrecombination mutations and thus reduce the
power of the tests. Even so, the statistical evidence for
recombination from the full sequence alignments was
strongly correlated with the evidence when only third
codon base sequences were used (the regression of Y calculated from full sequences against the Y calculated from
third codon base sequences gave r 5 0.806, P 5 1 3
10ÿ33 on n 5 142). Also, 17 of the 30 alignments in table
1 were sorted into the first class whether full sequences or
third codon base sequences were used. This implies that the
mtDNA recombination events that we have detected are
unlikely to be the products of selection-induced linkage disequilibria.
The preponderance of cytochrome b sequences among
the alignments of the first class is simply a reflection of this
gene’s high frequency in the 186 alignments tested (v2 5
2.084, df 5 1, P 5 0.149). Thus, there is no reason to suspect that, at least for the protein-coding part of the genome,
recombination is more likely in one region than in another.
There are 2 invertebrate, 7 fish, 1 amphibian, 2 reptile, 2
avian, 11 nonprimate mammalian, and 4 primate species
among the 29 listed in table 1 (one genus, Myotis, is represented twice). Again, this distribution is a good representation of the frequency of the taxa examined (v2 5 6.199,
df 5 6, and P 5 0.401) and does not indicate that recombination is likely to be more common in some major divisions
Animal mtDNA Recombination 929
Table 1
Thirty Events of Recombination with the Strongest Statistical Support in the 186 Animal mtDNA Alignments Examined
Genus
(Common Name)
Data
Set Number
Number of
Sequences
Cephalophus1
Mammal
Collocalia3
Bird
Python3
Reptile
Rana1
Amphibian
Aerodramus3
Bird
Anguilla3
Fish
Seriola1
Fish
Tragelaphus3
Mammal
Artibeus1
Mammal
Atropoides1
Reptile
Rhamdia1
Fish
Kobus2
Mammal
Semnopithecus4
Mammal—Primate
Bactrocera1
Insect
Macaca3
Mammal—Primate
Ovis1 Mammal
037
23
044
3
Ctenomys
Mammal
Phalanger1
Mammal
Chironomus5 Insect
Lepilemur3
Mammal—Primate
Apodemus1 Mammal
Acipenser5 Fish
Tilapia4 Fish
Trachipithecus5
Mammal—Primate
Spermophilus5
Mammal
Myotis5 Mammal
Nandopsis5 Fish
Trachinotus1 Fish
Plecotus5 Mammal
Gene
Sequence
Length
Number of
Diagnostic Sites
Nucleotide
Divergence
Supporting
Tests
cob
1,140
307 (253)
0.082 (0.208)
15
cob
1,143
156 (104)
0.049 (0.109)
149
28
cob
313
93 (—)
0.094 (—)
150
14
cob
572
213 (176)
0.157 (0.411)
002
12
cob
1,143
116 (—)
0.043 (—)
010
39
cob
1,140
300 (251)
0.074 (0.188)
163
7
cob
1,141
318 (156)
0.121 (0.182)
179
13
cob
1,140
319 (239)
0.099 (0.234)
011
22
cob
1,140
375 (282)
0.103 (0.258)
020
21
nad4
689
164 (119)
0.07 (0.168)
153
121
cob
1,138
369 (280)
0.053 (0.127)
088
9
cob
1,140
191 (151)
0.072 (0.177)
162
8
cob
1,140
206 (141)
0.073 (0.151)
021
47
cox1
636
243 (191)
0.112 (0.282)
097
36
cob
859
303 (—)
0.079 (—)
M, R, G, D, Gl, RS, Ml, C,
S (G, D, Gl, RS, Ml, C, S)
M, R, G, D, Gl, RS, Ml, C,
S (M, G, D, Gl, RS, Ml, C)
M, R, G, D, Gl,
RS, Ml, C, S (—)
M, R, G, D, Gl, RS, Ml,
C, S (M, G, D, Gl, Ml, C)
M, R, G, D, Gl, RS, Ml,
C (—)
M, G, D, Gl, RS, Ml, C, S
(M, G, D, Gl, RS, Ml, C, S)
M, G, D, Gl, RS, Ml, C, S
(M, R, G, D, Gl, RS, Ml, C)
M, G, D, Gl, RS, Ml, C, S
(M, G, D, Gl, Ml, C)
M, D, Gl, RS, Ml, C, S
(M, G, Ml, RS, Ml, C, S)
M, D, Gl, RS, Ml, C, S
(M, D, Gl, RS, Ml, C, S)
G, D, Gl, RS, Ml, C, S
(G, D, Gl, RS, Ml, C, S)
M, G, D, RS, Ml, C (M, D,
RS, Ml, C)
M, R, D, RS, Ml, C (M, R,
D, Gl, RS, Ml, C)
G, D, Gl, Ml, C, S (G, D,
Gl, Ml, C, S)
M, D, Gl, Ml, C, S (—)
122
4
cob
1,143
97 (87)
0.045 (0.12)
049
65
cob
1,140
470 (317)
0.085 (0.194)
132
18
nad2
1,036
605 (171)
0.224 (0.194)
042
093
46
30
cob
cob
672
350
319 (—)
101 (—)
0.147 (—)
0.050 (—)
013
001
176
178
10
26
14
12
cob
cob
nad2
cob
1,140
1,141
1,047
825
385 (300)
308 (—)
370 (—)
231 (—)
0.147 (0.361)
0.056 (—)
0.126 (—)
0.121 (—)
168
148
cob
1,140
528 (—)
0.112 (—)
113
114
116
177
137
66
106
5
7
21
cob
nad1
cob
cob
nad1
1,140
779
1,062
1,140
766
525 (363)
— (250)
— (179)
— (216)
— (178)
0.146 (0.343)
— (0.301)
— (0.267)
— (0.247)
— (0.238)
M, RS, Ml, C, S (M, D, RS,
Ml, C, S)
G, RS, m, C, S (G, RS, C, S)
R, D, Ml, C, S (M, R, RS,
Ml, C, S)
R, D, RS, C, S (—)
M, Ml, C, S (—)
R, D, RS, Ml (RS, C, S)
R, Ml, S (—)
R, Ml, C (—)
G, C, S (—)
M, D, S (—)
M, D, Gl (R, RS, S)
— (M, RS, S)
— (M, D, Gl, Ml, C)
— (G, Gl, Ml, C)
— (M, RS, Ml)
NOTE. —Information in parentheses refers to third base sequence alignments. Open dashes indicate that recombination was not detected in full sequence alignments, and
dashes in parentheses indicate that the recombination was not detected in third base sequence alignments. Superscripts denote origin of sequences involved in recombination; 1:
the three sequences came from three different species, 2: the recombinant sequence came from one species and both parental sequences came from another species, 3: the
recombinant and one parental sequence came from the same species, 4: all three sequences come from the same species, 5: the specific origin of one or two sequences could not
be identified. Data set numbers refer to Appendix provided as online Supplementary Material to this paper. Abbreviation for genes are as follows: cob: cytochrome b, cox1:
cytochrome oxydase 1, nad1: NADH dehydrogenase subunit 1, nad2: NADH dehydrogenase subunit 2, nad4: NADH dehydrogenase subunit 4, Abbreviations for statistical
tests are as follows: M: MaxChi Global, R: Reticulate, G: Geneconv Global, D: RDP, Gl: Geneconv Local, RS: RecScan, Ml: MaxChi Local, C: Chimaera, S: SiScan. Sequence
accession numbers can be obtained from A.D.T.
of animals than in others. In figure 1 we present for visual
inspection the first recombination example in table 1 for an
invertebrate, fish, amphibian, reptile, bird, and primate. For
Collocalia it is apparent that both recombination points lie
within the sequence examined, whereas only one breakpoint is included for the other examples. It is difficult to
determine whether one or both breakpoints are included
in the Bactrocera sequence. In 22 of the 30 cases listed
in table 1, all three sequences involved in the recombination
event could be identified. In 11 of these, each of the three
sequences belonged to a different species, in 9 two
belonged to the same species and one to another, and in
930 Tsaousis et al.
FIG. 1.—Alignment of variable nucleotide sites of recombinant and parental sequences (with their accession numbers) for six of the cases of recombination listed in table 1 representing different animal divisions. Dots indicate nucleotide identity with the middle sequence. Sites of nonsynonymous
substitutions are starred. The first 100 base pairs of Bactrocera are not presented for simplicity.
2 all three sequences belonged to the same species. This
illustrates the importance of recombination in generating
mtDNA variation that eventually becomes an interspecific
difference.
The number of sequences, the sequence length, the
number of informative sites, and the degree of nucleotide
divergence varied widely among alignments. To evaluate
the effect of these variables on the likelihood of detecting
recombination, we implemented a simple linear regression
with each of these factors as the independent variable and Y
(see Materials and Methods) as the dependent variable. The
probability of detecting recombination was not correlated
with the number of sequences in the alignment (r 5
0.065, P 5 0.393), and it was weakly correlated with
sequence length (r 5 0.168, P 5 0.027). The probability
of detecting recombination was, however, strongly correlated with both the nucleotide diversity and number of
informative sites in the alignment (r 5 0.251, P 5 8 3
10ÿ4 and r 5 0.343, P 5 3.7 3 10ÿ6, respectively). Similar
correlations were obtained when third codon nucleotide
sequences were used instead of full sequences (r 5
0.129, P 5 0.127; r 5 0.225, P 5 0.007; r 5 0.387,
P 5 2 3 10ÿ6; r 5 0.493, P 5 4.6 3 10ÿ10, respectively).
This result is in agreement with simulation studies evaluat-
ing many of the recombination detection methods used here
(Posada and Crandall 2001), which showed that recombination detection power increases with the number of variable/phylogenetically informative sites in an alignment.
Our results satisfy the requirements necessary to suggest that recombination is a common event in animal
mtDNA. It is clear from figure 1 that the recombinant
sequences we have identified are products of true homologous recombination. Five different mitochondrial genes and
all major divisions of the animal kingdom are represented in
table 1. Finally, the fact that recombinant and parental types
were found in different individuals of different species
(whose mtDNA variation was surveyed for reasons other
than the detection of recombination) suggests that recombinant haplotypes can spread within populations. A very recent
survey of animal mtDNA recombination by Piganeau,
Gardner, and Eyre-Warker (2004) produced similar results.
These authors also used the strategy of searching for evidence of recombination among sequences of animal
mtDNA protein genes of the same or different species
and applied four statistical tests for the detection of recombination. Two of their tests, MaxChi and Geneconv, are
similar to some of the nine tests used here. The other
two tests in their study are haplotype frequency-dependent
Animal mtDNA Recombination 931
tests and were not used here for the reasons explained
above. Piganeau, Gardner, and Eyre-Warker (2004) examined 279 alignments and found that in 14.2% of them there
existed evidence for recombination supported by one or
more of the four tests used. Interestingly, they found no
such evidence among the 12 asexual species included in
their survey. The two studies are not strictly comparable
because of the different statistical tests used and also
because of the different sampling effort devoted to different
animal taxa. The studies do, however, supplement one
another in demonstrating that animal mtDNA recombination might be much more common than previously anticipated and that it is becoming increasingly more difficult to
maintain the view that animal mtDNA is intrinsically different from that of other eukaryotes with regard to mechanisms
and frequencies of recombination. The paucity of examples
of recombination in the animal kingdom seems to be more a
reflection of the difficulty with which its products can be
detected than of the frequency of its occurrence. In two
known cases of heteroplasmy, one in mussels resulting
from normal biparental inheritance and another in humans
resulting from paternal mtDNA leakage (Schwartz and
Vissing 2002) recombination has been detectable (Ladoukakis and Zouros 2001a; Kraytsberg et al. 2004).
Our study was not designed to provide estimates of
mtDNA recombination in the animal kingdom but only
to detect its presence. The number of sampled alignments
is biased in favor of vertebrates and in favor of mammals
within vertebrates. Moreover, there is a large degree of heterogeneity among the alignments with respect to the number of sequences they contain, the number of species from
which the sequences were obtained, and whether the
sequences that belong to the same species were derived
from one or more populations. We believe that most alignments gathered from sequences in GeneBank cannot provide reliable estimates of haplotype frequencies in
natural populations, which are essential for the estimation
of population parameters (like 4Nr, where N is the effective
population size and r is the recombination rate per gene per
generation) or linkage disequilibrium. Therefore, we did
not try to estimate the rate of recombination (4Nr; Kuhner,
Yamato, and Felsenstein 2000; McVean, Awadalla, and
Fearnhead 2002; Li and Stephens 2003) or the decay of
linkage disequilibrium with increasing distance between
makers (Awadalla, Eyre-Walker, and Maynard Smith
1999).
The implications of widespread animal mtDNA
recombination are diverse. Recombination constitutes a
means of evading uniparental inheritance and thus clonal
transmission. With regard to the use of mtDNA for evolutionary studies, it would mean that we should not, as it has
been common practice until now, draw conclusions about
the evolutionary history of the entire mitochondrial genome
by looking at parts of it. Such extrapolations are, for example, implicit in studies of hybridization and introgression in
natural populations (Ferris et al. 1983; Powell 1983;
Harrison 1989) and of mtDNA selection in natural or laboratory populations (Clark and Lyckegaard 1988; MacRae
and Anderson 1988; Ballard and Kreitman 1995). Homoplasy due to mutation cannot be considered as the sole or
even the most important factor that confounds rates of
mtDNA divergence and phylogenetic inference. Estimates
of effective population size for mtDNA may have to be
revised upward, and the theory of hierarchical analysis
of mtDNA variation, from the mitochondrion to the species
through the intermediate levels of the cell, the organism,
and the population (Birky, Maruyama, and Fuerst 1983;
Birky, Fuerst, and Maruyama 1989; Zouros and Rand
1999), will need to be reconsidered. Equally important is
that the mutation load of single genomes and the purging
of mutations from the mtDNA pool will not be governed by
the rules of strict clonal transmission. Nonclonal transmission of the animal mitochondrial genome may, therefore,
have important consequences in the etiology of mtDNAbased conditions for the organism, such as senescence
and diseases caused by mtDNA deletions and mutations.
Literature Cited
Altschul, S. F., W. Gish, W. Miller, E. Myers, and D. J. Lipman.
1990. Basic local alignment search tool. J. Mol. Biol. 215:
403–410.
Avise, J. C. 1994. Molecular markers, natural history and evolution. Chapman and Hall, London.
Awadalla, P., A. Eyre-Walker, and J. Maynard Smith. 1999. Linkage disequilibrium and recombination in hominid mitochondrial DNA. Science 286:2524–2525.
Ballard, J. W., and M. Kreitman. 1995. Is mitochondrial DNA a
strictly neutral marker? Trends Ecol. Evol. 10:485–488.
Bentzen, P., W. C. Legget, and G. G. Brown. 1988. Length
and restriction site heteroplasmy in the mitochondrial
DNA of American shad (Alisa sapidissima). Genetics 118:
509–518.
Birky, C. W. Jr., P. Fuerst, and T. Maruyama. 1989. Organelle
gene diversity under migration, mutation, and drift: equilibrium expectations, approach to equilibrium, effects of heteroplasmic cells, and comparison to nuclear genes. Genetics
121:613–627.
Birky, C. W. Jr., T. Maruyama, and P. Fuerst. 1983. An approach
to population and evolutionary genetic theory for genes in
mitochondria and chloroplasts, and some results. Genetics
103:513–527.
Buroker, N. E., J. R. Brown, T. A. Gilbert, P. J. O’Hara, A. T.
Beckenbach, W. K. Thomas, and M. J. Smith. 1990. Length
heteroplasmy of sturgeon mitochondrial DNA: an illegitimate
elongation model. Genetics 124:157–163.
Burzynski, A., M. Zbawicka, D. O. Skibinski, and R. Wenne.
2003. Evidence for recombination of mtDNA in the marine
mussel Mytilus trossulus from the Baltic. Mol. Biol. Evol.
20:388–392.
Clark, A. G., and E. M. Lyckegaard. 1988. Natural selection with
nuclear and cytoplasmic transmission. III. Joint analysis of segregation and mtDNA in Drosophila melanogaster. Genetics
118:471–481.
Clayton, D. A., J. N. Doda, and E. C. Friedberg. 1974. The
absence of a pyrimidine dimer repair mechanism in mammalian mitochondria. Proc. Natl. Acad. Sci. USA 71:
2777–2781.
Cortese, J. D. 1999. Rat liver GTP-binding proteins mediate
changes in mitochondrial membrane potential and organelle
fusion. Am. J. Physiol. 276:C611–C620.
Densmore, L. D., J. W. Wright, and W. M. Brown. 1985. Length
variation and heteroplasmy are frequent in mitochondrial DNA
from parthenogenetic and bisexual lizards (genus Cnemidophorus). Genetics 110:689–707.
932 Tsaousis et al.
Elson, J. R., R. M. Andrews, P. F. Chinnery, R. N. Lightowlers,
D. M. Turnbull, and N. Howell. 2001. Analysis of European
mtDNAs for recombination. Am. J. Hum. Genet. 68:145–153.
Enriquez, J. A., J. Cabezas-Herrera, M. P. Bayona-Bafaluy, and
G. Attardi. 2000. Very rare complementation between mitochondria carrying different mitochondrial DNA mutations
points to intrinsic genetic autonomy of the organelles in
cultured human cells. J. Biol. Chem. 275:11207–11215.
Ferris, S. D., R. D. Sage, C. M. Huang, J. T. Nielsen, U. Ritte,
and A. C. Wilson. 1983. Flow of mitochondrial DNA
across a species boundary. Proc. Natl. Acad. Sci. USA 80:
2290–2294.
Gibbs, M. J., J. S. Armstrong, and A. J. Gibbs. 2000. Sister-scanning: a Monte Carlo procedure for assessing signals in
recombinant sequences. Bioinformatics 16:573–582.
Gjetvai, B. M., D. I. Cook, and E. Zouros. 1992. Repeated sequences and large-scale size variation of mitochondrial DNA: a
common feature among scallops (Bivalvia: Pectinidae). Mol.
Biol. Evol. 8:106–114.
Gray, M. W. 1989. The evolutionary origins of organelles. Trends
Genet. 5:294–299.
Harrison, R. G. 1989. Animal mitochondrial DNA as a genetic
marker in population and evolutionary biology. Trends Ecol.
Evol. 4:6–11.
Hoarau, G., S. Holla, R. Lescasse, W. T. Stam, and J. L. Olsen.
2002. Heteroplasmy and evidence for recombination in the
mitochondrial control region of the flatfish Platichthys flesus.
Mol. Biol. Evol. 19:2261–2264.
Jakobsen, I. B., and S. Easteal. 1996. A program for calculating
and displaying compatibility matrices as an aid in determining
reticulate evolution in molecular sequences. CABIOS 12:
291–295.
Jakobsen, I. B., S. R. Wilson, and S. Easteal. 1997. The partition matrix: exploring variable phylogenetic signals along
nucleotide sequence alignments. Mol. Biol. Evol. 14:
474–484.
Kivisild, T., R. Villems, B. Jorde et al. (13 co-authors). 2000.
Questioning evidence for recombination in human mitochondrial DNA. Science 288:1931a.
Kraytsberg, Y., M. Schwartz, T. A. Brown, K. Ebralidse, W. S.
Kunz, D. A. Clayton, J. Vissing, and K. Khrapko. 2004.
Recombination of human mitochondrial DNA. Science
304:981.
Kuhner, M. K., J. Yamato, and J. Felsenstein. 2000. Maximum
likelihood estimation of recombination rates from population
data. Genetics 156:1393–1401.
Ladoukakis, E., and E. Zouros. 2001a. Direct evidence for homologous recombination in Mussel (Mytilus galloprovincialis)
mitochondrial DNA. Mol. Biol. Evol. 18:1168–1175.
———. 2001b. Recombination in animal mitochondrial DNA:
evidence from published sequences. Mol. Biol. Evol.
18:2127–2131.
La Roche, J., M. Snyder, D. I. Cook, K. Fuller, and E. Zouros.
1990. Molecular characterization of a repeat element causing
large-scale size variation in the mitochondrial DNA of the
sea scallop Placopecten magellanicus. Mol. Biol. Evol. 7:
45–64.
Legros, F., A. Lombes, P. Frachon, and M. Rojo. 2002. Mitochondrial fusion in human cells is efficient, requires the inner membrane potential, and is mediated by mitofusins. Mol. Biol. Cell
13:4343–4354.
Li, N., and M. Stephens. 2003. Modeling linkage disequilibrium
and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165:2213–2233.
Lunt, D. H., and B. C. Hyman. 1997. Animal mitochondrial DNA
recombination. Nature 387:247.
MacRae, A. F., and W. W. Anderson. 1988. Evidence for non-neutrality of mitochondrial DNA haplotypes in Drosophila pseudoobscura. Genetics 120:485–494.
Martin, D., and E. Rybicki. 2000. RDP: detection of recombination amongst aligned sequences. Bioinformatics 16:
562–563.
Martin, D. P., D. Posada, K. A. Crandall, and C. Williamson.
2005. A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints.
AIDS Res. Hum. Retrovir. 21:98–102.
Martin, D. P., C. Williamson, and D. Posada. 2004. RDP2: recombination detection and analysis from sequence alignments.
Bioinformatics 21:260–262.
Maynard Smith, J. 1992. Analyzing the mosaic structure of genes.
J. Mol. Evol. 34:126–129.
Maynard Smith, J., and N. H. Smith. 2002. Recombination
in animal mitochondrial DNA. Mol. Biol. Evol. 19:2330–
2332.
McVean, G. A., P. Awadalla, and P. Fearhead. 2002. A coalescent-based method for detecting and estimating recombination
from gene sequences. Genetics 160:1231–1241.
Moritz, C., and W. M. Brown. 1987. Tandem duplications in animal mitochondrial DNAs: variation in incidence and gene
content among lizards. Proc. Natl. Acad. Sci. USA 84:
7183–7187.
Nass, M. M. 1969. Mitochondrial DNA. I. Intramitochondrial distribution and structural relations of single- and double-length
circular DNA. J. Mol. Biol. 42:521–528.
Paabo, S., D. M. Irwin, and A. C. Wilson. 1990. DNA damage
promotes jumping between templates during enzymatic amplification. J. Biol. Chem. 265:4718–4721.
Piganeau, G., M. Gardner, and A. Eyre-Walker. 2004. A broad
survery of recombination in animal mitochondria. Mol. Biol.
Evol. 21:2319–2325.
Posada, D., and K. A. Crandall. 2001. Evaluation of methods for
detecting recombination from DNA sequences: computer simulations. Proc. Natl. Acad. Sci. USA 98:13757–13762.
Posada, D., K. A. Crandall, and E. C. Holmes. 2002. Recombination in evolutionary genomics. Annu. Rev. Genet. 36:
75–97.
Powell, J. R. 1983. Interspecific cytoplasmic gene flow in the
absence of nuclear gene flow: evidence from Drosophila. Proc.
Natl. Acad. Sci. USA 80:492–495.
Rand, D. M., and R. G. Harrison. 1986. Mitochondrial DNA transmission genetics in crickets. Genetics 114:955–970.
Salminen, M. O., J. K. Carr, D. S. Burke, and F. E. McCutchan.
1995. Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res. Hum. Retrovir. 11:1423–1425.
Santel, A., and M. T. Fuller. 2001. Control of mitochondrial
morphology by a human mitofusin. J. Cell Sci. 114:867–
874.
Satoh, M., and T. Kuroiwa. 1991. Organization of multiple nucleoids and DNA molecules in mitochondria of a human cell. Exp.
Cell Res. 196:137–140.
Sawyer, S. 1989. Statistical tests for detecting gene conversion.
Mol. Biol. Evol. 6:526–538.
Schwartz, M., and J. Vissing. 2002. Paternal inheritance of mitochondrial DNA. N. Engl. J. Med. 347:576–580.
Shitara, H., J.-I. Hayashi, S. Takahama, H. Kaneda, and H. Yonekawa. 1998. Maternal inheritance of mouse mtDNA in interspecific hybrids: segregation of the leaked paternal mtDNA
followed by the prevention of subsequent paternal leakage.
Genetics 148:851–857.
Skibinski, D. O., F. C. Gallagher, and C. M. Beynon. 1994. Mitochondrial DNA inheritance. Nature 368:817–818.
Animal mtDNA Recombination 933
Thyagarajan, B., R. A. Padua, and C. Campbell. 1996. Mammalian mitochondria possess homologous DNA recombination
activity. J. Biol. Chem. 271:27536–27543.
Yaffe, M. P. 1999. The machinery of mitochondrial inheritance
and behavior. Science 283:1493–1497.
Zouros, E., and D. M. Rand. 1999. Population genetics and evolution of animal mitochondrial DNA. Pp. 430–455 in R. M. Sing
and C. B. Krimbas, eds. Evolutionary genetics: from molecules
to morphology. Cambridge University Press, Cambridge, U.K.
Zouros, E., A. O. Ball, C. Saavedra, and K. R. Freeman. 1994.
Mitochondrial DNA inheritance. Nature 368:818.
Zuckerman, S. H., J. F. Solus, F. P. Gillespie, and J. M. Eisenstadt.
1984. Retention of both parental mitochondrial DNA species
in mouse-Chinese hamster somatic cell hybrids. Somat. Cell
Mol. Genet. 10:85–91.
Richard Thomas, Associate Editor
Accepted December 21, 2004