The Origin and Characterization of New Nuclear Genes Originating

The Origin and Characterization of New Nuclear Genes
Originating from a Cytoplasmic Organellar Genome
Andrew H. Lloyd*,1 and Jeremy N. Timmis1
1
School of Molecular and Biomedical Science, The University of Adelaide, South Australia 5005 Australia
*Corresponding author: E-mail: [email protected].
Associate editor: Douglas Crawford
Abstract
Key words: chloroplast, endosymbiosis, evolution, gene transfer, tobacco.
Introduction
Far beyond the traditional and well-researched origin of
new genes by mutations in duplicated copies, genomics
has revealed an array of previously unexpected sources
of genetic novelty (Kaessmann 2010). One pathway that
has often been ignored in discussions of the origin of novel
genes is functional transfer from the cytoplasmic organelles
or their endosymbiotic precursors (Timmis et al. 2004). Mitochondria and plastids were once free-living prokaryotes
that evolved into extant cytoplasmic organelles during endosymbiotic evolution (Margulis 1970). The genomes of
these organelles are now greatly reduced in size compared
with those of their free-living ancestors. In large part, this is
due to wholesale relocation of genes from the prokaryote
ancestors to the nuclear genome followed by their subsequent deletion from the organelle genomes (Timmis et al.
2004;Kleine et al. 2009). The constant ingress of organelle
DNA to the nucleus (Huang et al. 2003; Stegemann et al.
2003; Sheppard et al. 2008) has contributed a great diversity
of new genetic material for evolutionary tinkering and it is
a major past and present pathway for the generation of
new genes (Timmis et al. 2004).
In the majority of cases, organelle DNA transferred to
the nucleus is nonfunctional and its sequence may decay
rapidly and/or it may be eliminated from the nuclear genome (Matsuo et al. 2005; Sheppard and Timmis 2009). In
rare cases, however, gene transfer is accompanied or followed by the acquisition of a nuclear promoter and a poly-
adenylation signal leading to nuclear activation. Many
proteins encoded by such genes retain their original function and are imported back into the organelle if they also
acquire an appropriate transit peptide. Others may adopt
novel roles that are unrelated to those they previously carried out (Martin et al. 2002). Smaller fragments of organelle
DNA can also add to the complexity of nuclear genomes by
contributing exonic or intronic sequences to existing nuclear genes (Noutsos et al. 2007).
Recently, new techniques have enabled the experimental recapitulation and dissection of molecular events involved in endosymbiotic gene transfer. The frequency of
DNA transfer from the chloroplast to the nucleus in tobacco (Nicotiana tabacum) has been measured (Huang
et al. 2003; Stegemann et al. 2003; Sheppard et al. 2008)
and was found to be unexpectedly high. These studies used
transplastomic plants containing, within the chloroplast
genome, an aminoglycoside resistance gene (aadA) used
to select the transplastomic lines and a kanamycin resistance gene (neo) designed for exclusive nuclear expression.
Though inactive in the chloroplast, migration of neo to the
nucleus could be monitored simply by screening cells in
culture or seedling progeny for kanamycin resistance.
DNA fragments transferred to the nucleus in these experiments were large (Huang et al. 2003; Sheppard et al. 2008)
and so most kanamycin resistant (kr) lines generated contained the adjacent aadA gene in addition to neo. As aadA
was driven by the chloroplast promoters of psbA (Huang
et al. 2003) or the rRNA operon (Stegemann et al. 2003), it
© The Author 2011. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please
e-mail: [email protected]
Mol. Biol. Evol. 28(7):2019–2028. 2011 doi:10.1093/molbev/msr021
Advance Access publication January 20, 2011
2019
Research article
Endosymbiotic transfer of DNA and functional genes from the cytoplasmic organelles (mitochondria and chloroplasts) to
the nucleus has been a major factor driving the origin of new nuclear genes, a process central to eukaryote evolution.
Although organelle DNA transfers very frequently to the nucleus, most is quickly deleted, decays, or is alternatively
scrapped. However, a very small proportion of it gives rise, immediately or eventually, to functional genes. To simulate the
process of functional transfer, we screened for nuclear activation of a chloroplast reporter gene aadA, which had been
transferred from the chloroplast to independent nuclear loci in 16 different plant lines. Cryptic nuclear activity of the
chloroplast promoter was revealed, which became conspicuous when present in multiple nuclear copies. We screened ;50
million cells of each line and retrieved three plants in which aadA showed strong nuclear activation. Activation occurred
by acquisition of the CaMV 35S nuclear promoter or by nuclear activation of the native chloroplast promoter. Two
fortuitous sites within the 3# UTR of aadA mRNA both promoted polyadenylation without any sequence change.
Complete characterization of one nuclear sequence before and after gene transfer demonstrated integration by
nonhomologous end joining involving simultaneous insertion of multiple chloroplast DNA fragments. The real-time
observation of three different means by which a chloroplast gene can become expressed in the nucleus suggests that the
process, though rare, may be more readily achieved than previously envisaged.
MBE
Lloyd and Timmis · doi:10.1093/molbev/msr021
was not expected to be expressed in the nucleus/cytoplasm
genetic compartment (Huang et al. 2003). The aadA gene
should be expressed only if it acquired a nuclear promoter
and the transcript was translated on 80S ribosomes—
paralleling the pathway of genes functionally transferred
during endosymbiotic evolution.
A previous study (Stegemann and Bock 2006) used this
pair of reporter genes arranged in the nucleus in the same
transcriptional polarity relative to each other with 35Sdriven neo upstream of aadA. This set of experiments
monitored nuclear activation of the aadA gene and found
evidence of frequent activation. Approximately 1 in 3.3 107
cells of leaf explants cultured in medium containing aminoglycoside antibiotics could be regenerated into aminoglycoside resistant plants. However, in all cases, aadA activation
was due to simple small deletions which brought the
upstream strong nuclear promoter of neo to the 5# end
of aadA. Thus, the frequency measured was artificially dependent upon the orientation of the experimental reporter
genes, their proximity and their transcriptional polarity.
To simulate more closely the natural process of functional endosymbiotic gene transfer, we used an arrangement of the reporter gene cassette that precludes
simple deletion as a means of aadA activation, thereby enabling more evolutionarily authentic mechanisms to be revealed. We screened a large number of independent
nuclear aadA loci and uncovered two entirely new mechanisms of functional transgene transfer: By multiple copy
insertion of a transgene whose promoter has weak nuclear
activity and by rearrangements that recruit a nuclear enhancer. In addition, we present the characterization of
the nupt (nuclear integrant of chloroplast DNA) in kr2.2,
the line that activated aadA by rearrangement. This represents the first characterization of a complete de novo nupt
and its preinsertion site.
Materials and Methods
Growth Conditions, Plant Material, and Nucleic
Acid Isolation
N. tabacum plants were grown in soil in a controlled environment chamber with a 14 h light/10 h dark and 25 °C
day/18 °C night growth regime or in sterile jars on 0.5 MS
medium (Murashige and Skoog 1962) with a 16 h light/8 h
dark cycle at 25 °C.
Kr plant lines used in this study were generated in two
previously published screens with the kr series from Huang
et al. (2003) and the kr2 series from Sheppard et al. (2008).
These lines were produced by crossing pollen from a transplastomic parent to female wild type to remove the transplastome. Plants used in the aminoglycoside resistance
screen were kr progeny obtained from self-fertilized flowers
on the original kr plants generated in the studies detailed
above. Known hemizygous plants were kr progeny obtained by crossing the original kr plants to wild type.
DNA was prepared using a DNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to manufacturer’s instructions. RNA was prepared using an RNeasy
2020
Plant Mini Kit (Qiagen) according to manufacturer’s
instructions.
aadA Activation Screen
Leaf explants with an average area of ;34 mm2, from
plants grown in sterile jars, were placed on culture plates
containing MS104 medium (Mathis and Hinchee 1994)
supplemented with 400 mg l1 spectinomycin and 200
mg l1 streptomycin. Resistant shoots were transferred
to 0.5 MS medium to root and subsequently transferred
to soil. For seedling selection, surface-sterilized seeds were
grown on 0.5 MS medium containing 150 mg l1 kanamycin or 200 mg l1 spectinomycin. Plates were incubated
at 25 °C with 16 h light/8 h dark.
Cell counts were performed essentially as described
(Humphries and Wheeler 1960). Leaves from plants grown
in tissue culture were digested with pectinase (Sigma, St.
Louis, MO) for 1 h at 60 °C, macerated to single cells
and counted with a haemocytometer.
aadA Expression Analysis
For RT–polymerase chain reaction (PCR) DNA was removed
from RNA samples using a TURBO DNA-free kit (Ambion,
Austin, TX). Reverse transcription was performed using an
Advantage RT-for-PCR kit (Clontech, Mountain View, CA)
with an oligo(dT) primer in accordance with the manufacturer’s instructions. Samples were also prepared without RT.
For amplification of aadA, primers aadAF5, and aadAR3
were used. RPL25 mRNA was amplified using primers
L25F and L25R. The sequence of all primers can be found
in supplementary table S3, Supplementary Material online.
ImageQuant TL software (GE Healthcare, Buckinghamshire,
United Kingdom) was used to quantify bands in RT–PCR for
correlation with slot blot hybridization.
Real-Time Quantitative PCR was performed as described
(Schmidt and Delaney 2010). Sr line RNA was prepared from
plants 2 months after transfer to soil. For amplification of
aadA cDNA, primers QaadAF and QaadAR were used.
For relative quantification, RPL25 mRNA, amplified using primers QL25F and QL25F, was used as internal standard.
Polymerase Chain Reaction
Thermal asymmetric interlaced (TAIL)-PCR was performed
as described (Liu et al. 1995) using degenerate primer AD3
(Sessions et al. 2002) and gene-specific primers aadAT1, aadAT2, and aadAT3 (Stegemann and Bock 2006). Inverse
PCR followed by Genome Walking and standard PCR were
used to amplify the kr2.2 chloroplast integrant and its
flanking nuclear DNA. For full details, see supplementary
text, Supplementary Material online.
RACE
5# and 3# RACE were performed using Marathon cDNA
Amplification Kit (Roche, Indianapolis, IN) or FirstChoice
RLM-RACE Kit (Ambion) according to manufacturers’ instructions. For 5# RACE PCR of neo mRNA, nested primers
neoR1 and neoR2 were used sequentially in two rounds of
PCR; for aadA, primers aadAT1, aadAT2, and aadAT3 were
Organelle to Nucleus Functional Gene Transfer · doi:10.1093/molbev/msr021
MBE
used sequentially in three rounds of PCR. For 3# RACE PCR,
primers aadAF1 and aadAF2 (Sheppard and Timmis 2009)
were used in two sequential rounds of PCR.
Transient Expression Analysis
The pGreen system of binary vectors (Hellens et al. 2000)
was used in the transient expression assay. For full vector
construction details, see supplementary text, Supplementary Material online. The transient expression assay was
adapted from the Agrobacterium infiltration method
(Sparkes et al. 2006). Triplicate samples were prepared
for each expression construct with each sample comprising
material from three infiltrations. b-glucuronidase (GUS) activity was determined by assaying fluorescence generated
by hydrolysis of 4- methylumbelliferyl b-D-glucuronide
(MUG (Delaney et al. 2007). All samples were measured
with three technical replicates. Each replicate (40 ll)
contained 3 lg total protein. The microtitre plate was
incubated for 40 min at 37 °C prior to detection of 4methyl umbelliferone (4-MU) fluorescence. Fluorescence
was measured and 4-MU levels were calculated by comparison with a standard curve generated using known concentrations of 4-MU. For the concentration of total protein
used (75 lg ml1), linearity of fluorescence with respect
to GUS concentration was determined (R2 . 0.99). Data
were compared using one-way ANOVA with Bonferroni
post hoc comparison. ANOVA was performed using Prism
5 (GraphPad Software, Inc., San Diego, CA).
DNA Blot Analysis
DNA slot blot was performed and image quantified as described (Sheppard and Timmis 2009). Levels of aadA hybridization and expression (as shown by RT–PCR) were
compared. Correlation coefficient and P value were calculated using Prism 5.
Accession Number
The sequence of the kr2.2 chloroplast integrant and flanking nuclear DNA has been submitted to GenBank accession
number: HM066935.
FIG. 1. Comparison of aadA copy number and expression. (A)
Explants grown on regeneration medium containing 400 mg l1
spectinomycin and 200 mg l1 streptomycin. Scale bars 5 10 mm.
(B) DNA slot blot was performed using DNA from known
hemizygous plants. Triplicates of each sample were probed with
an aadA probe followed by probing with ribosomal DNA as
a loading control. The graphs show average aadA hybridization,
relative to kr2.9, after normalization to ribosomal DNA hybridization. Error bars show standard deviation. (C) RT–PCR showing aadA
expression (þ), with no reverse transcriptase () and no template
(-ve) controls. Control RT–PCRs with RPL25 primers are also shown.
Results
The Screen for aadA Activation
To identify events leading to nuclear expression of aadA,
we screened for aminoglycoside resistance in leaf explants
from 16 kr tobacco lines (supplementary table S1, Supplementary Material online) in which aadA as well as neo had
been cotransferred to independent nuclear genomic loci
(Huang et al. 2003; Sheppard et al. 2008). Some of these
lines were from experiments described in Huang et al.
(2003) (the kr series) and some from Sheppard et al.
(2008) (the kr2 series). One thousand leaf explants (;34
mm2) from each line were placed on plant regeneration
medium containing spectinomycin and streptomycin to
select for nuclear activation of aadA. Spontaneous mutations can provide resistance to each of these drugs separately but aadA must be involved to provide dual
resistance. Two distinct types of resistance were observed:
Several lines displayed immediate resistance to aminoglycosides in all cells of all explants screened and other lines
showed very rare growth foci where aadA was activated as
a result of an acquired mutation in a single cell.
Multiple Copy Insertion of aadA Leads Immediately
to Aminoglycoside Resistance
The full psbA promoter shows weak nuclear activity
thought to be dependent upon the presence of fortuitous
TATA and CAAT boxes (Cornelissen and Vandewiele
1989). For this reason, the psbA promoter used in this study
was truncated at the 5# end to remove the CAAT box.
However, despite this modification, all cells of all explants
from 2 of the 16 lines screened (kr2.7 and kr2.9) displayed
2021
Lloyd and Timmis · doi:10.1093/molbev/msr021
MBE
nucleus sufficient to confer significant aminoglycoside resistance. Given that gene transfer from the chloroplast to
the nucleus occurs once in every 11,000–16,000 pollen
grains (Huang et al. 2003; Sheppard et al. 2008), this corresponds to a frequency of one active gene transfer due to
multiple copy insertion in ;88,000–128,000 pollen grains.
Mutational Activation of aadA Occurred Either by
Acquisition of the CaMV 35S Nuclear Promoter or
by Nuclear Activation of the Native Chloroplast
Promoter
FIG. 2. Nuclear activation of aadA in lines sr1 and sr2. (A) Selection
for aadA activation in line kr2.2. Explants were grown on plant
regeneration medium with 400 mg l1 spectinomycin and 200 mg
l1 streptomycin. A single resistant shoot (sr1) can be seen. Scale
bar 5 10 mm. (B) RT–PCR demonstrating aadA expression (þ),
with no reverse transcriptase () and no template (-ve) controls.
(C) Real-time quantitative PCR shows aadA mRNA accumulation in
leaves. RPL25 normalized data are shown relative to sr1 mRNA
levels. Error bars show standard deviation. Sr1 and sr2 show
a significant increase in expression compared with the parent line
kr2.2 (P , 0.05, Student’s t-test). Sr3 showed no increase in aadA
expression.
significant resistance to aminoglycosides (fig. 1A) and three
others showed various degrees of partial resistance (kr2.10,
kr19, and kr17). Most lines were fully sensitive (e.g., kr2.2
and kr2.3 shown in fig. 1A). Southern blots (Sheppard
et al. 2008) indicate that the two highly resistant lines contain multiple nuclear insertions of chloroplast DNA,
whereas lines that are susceptible have Southern signals
consistent with single or low aadA copy numbers (Sheppard
et al. 2008). We used quantitative hybridization to determine
the relative aadA gene copy numbers from all lines in this
study that arose from the screen of Sheppard et al. (2008):
kr2.2, kr2.3, kr2.7, kr2.9, and kr2.10 (fig. 1B). Assuming that
kr2.2, the line giving the lowest aadA hybridization signal,
harbors a single copy of the gene, kr2.3, kr2.7, kr2.9, and
kr2.10 contain 3, 11, 17, and 5 copies, respectively (these
are approximate figures given the compound variance
associated with the calculations). Transcript accumulation
of aadA, assessed by RT–PCR relative to RPL25 in these lines
(fig. 1C) is correlated (r 5 0.93, P , 0.05) with the number of
nuclear copies of the gene.
Of the 16 gene transfer lines tested, 2 (12.5%) contained
multiple but probably unchanged copies of aadA in the
2022
Two independent resistant shoots (sr1 and sr2) emerged
from one of the 13 remaining lines (kr2.2). Figure 2A shows
one of the resistant shoots and RT–PCR of leaf RNA samples demonstrated elevated aadA transcripts in both (fig.
2B). The shoots were placed on 0.5 MS agar medium to
generate roots then transferred to soil and grown into mature plants designated sr1 and sr2. After plant regeneration,
aadA activation was confirmed by QPCR (fig. 2C) and progeny were tested for the inheritance of aminoglycoside resistance phenotype (supplementary fig. S1, Supplementary
Material online). Sr1 showed a significant deviation from
the expected 3:1, with a large excess of resistant progeny
(R 5 116, S 5 3, P 5 1.49 1008). For discussion, see
supplementary text, Supplementary Material online. Sr2
segregated normally (R 5 212, S 5 74, P 5 0.73). A third
resistant shoot (sr3) from kr18 showed increased aadA
transcript accumulation in the initial RT–PCR assay (fig.
2B) but this could not be detected by QPCR after 3 months
further growth in the absence of selection (fig. 2C) and the
spectinomycin resistance phenotype was not inherited by
progeny (supplementary fig. S2, Supplementary Material
online).
To determine the mechanism of gene activation in the
two lines showing stable activation of aadA, we recovered
the sequence 5# of this gene (fig. 3) using TAIL-PCR and
genome walking. The transcription start site was determined using 5# RLM-RACE (fig. 3). In sr1, aadA was activated by acquisition of the nearby CaMV 35S promoter
which, together with the first 47 bp of neo, was duplicated
and inserted in reverse polarity upstream of aadA (fig. 3B).
Sr2 showed activation by the recruitment of nuclear gene
regulatory elements within the 35S promoter to enhance
nuclear transcription from the psbA promoter (fig. 3C).
This activation was likely dependent upon the preexisting
weak nuclear activity of the psbA promoter. A 98 bp deletion brought the 35S promoter closer to the psbA promoter. Two adjacent single nucleotide substitutions 25
bp upstream from the site of the deletion were also observed (fig. 3C) that are unlikely to be of functional significance (supplementary text, Supplementary Material
online). In both sr1 and sr2, the two sequences joined
by rearrangement or deletion show microhomology
(2 bp) at the junction (fig. 3B–C).
We verified that the relocated 35S promoter was the
cause of activation using an in vivo transient expression
assay, quantifying the expression of the GUS reporter gene
Organelle to Nucleus Functional Gene Transfer · doi:10.1093/molbev/msr021
FIG. 3. Sequence rearrangements leading to aadA activation in sr1
and sr2. (A) Arrangement of experimental cassette in parent line
kr2.2. (B) Sequence of part of the experimental cassette in sr1. 35S
promoter (lowercase), neo first exon (underlined), STLS2 intron
(bold italics), psbA promoter (uppercase), aadA open reading frame
(ORF) (bold). Transcription of aadA (black arrow) is from the
expected 35S transcription start site. An upstream 41 aa ORF exists
from 241 / 89. The aadA ORF begins at þ1 (ORFs are
indicated by black bars above the sequence). A black box marks the
junction of STLS2 intron with the psbA promoter. The CA
dinucleotide is shared by both the STLS2 intron and the psbA
promoter representing microhomology that is commonly found at
sites of double strand break repair. (C) Sequence of part of the
experimental cassette in sr2. 35S promoter (lowercase), psbA
promoter (uppercase), aadA (bold). Transcription of aadA is from
the psbA promoter at position 81 relative to the start of
translation (black arrow), 4 nt downstream of the plastid
transcription start site. A TATA box is present within the psbA
promoter and a putative CAAT box is present within the
35S promoter. A 98 bp deletion occurred in line sr2 between the
35S promoter and the psbA promoter. The new junction of 35S
promoter and psbA promoter sequences is marked by a black box.
The GA dinucleotide present at this site is found in both the 35S
promoter and the psbA promoter. Two adjacent single nucleotide
substitutions are found in sr2 at positions 172 and 173. The
bases in bold shown above the substitutions show the original
sequence of kr2.2 at each position. Arrows indicate the direction of
transcription.
MBE
FIG. 4. Transient expression analysis of the novel sequence motifs
that activate aadA. (A) GUS expression constructs. Promoters as
follows: pG.GUS—negative control, no promoter; pG.35S—positive
control, CaMV 35S promoter; pG.kr2.2—kr2.2 promoter (sequence
5# of aadA in line kr2.2); pG.sr2CAAT—sr2 promoter (sequence 5#
of aadA in line sr2); pG.sr2CTTT—sr2 promoter with putative
CAAT box sequence altered to the sequence CTTT; pG.CAAT—
truncated sr2 promoter that included the putative CAAT box but
no other known regulatory elements further 5#; pG.CTTT—similar
to pG.CAAT but with putative CAAT box altered to the sequence
CTTT. (B) Quantitative measurement of GUS activity. Total protein
was prepared from leaf segments 3 days after infiltration with
Agrobacterium strains containing the various expression constructs.
GUS activity is shown as picomoles of product (4-methylumbelliferone; 4-MU) generated per hour per lg of total leaf protein.
Values shown are the average activity of triplicate samples of
protein extract with each sample comprising material from three
independent infiltrations. Error bars show the standard deviation.
Arrows below the line in (A) indicate 35S promoter sequence in
reverse orientation.
from a series of promoters involving variations of the initial
psbA promoter sequence and the upstream nuclear sequences in sr2 and its progenitor kr2.2 (fig. 4). The deletion
also introduced a CAAT motif at approximately the correct
position relative to the TATA motif in the native psbA promoter (fig. 3C). The GUS expression results confirmed that
2023
MBE
Lloyd and Timmis · doi:10.1093/molbev/msr021
mm2, and 1,000 explants for each line, a total of 650 million
cells were screened with a resulting estimated frequency of
one activation in 2 108 cells (kr2.7, kr2.9, and kr2.10 were
excluded as they displayed homogeneous aminoglycoside
resistance).
Characterization of a De Novo Nuclear Integrant of
Chloroplast DNA
FIG. 5. Characterization of the kr2.2 integrant. Three disparate
fragments of chloroplast DNA (gray background) with sizes of 14.4
kb, 31 bp, and 2.4 kb comprise the 16.8 kb integrant in kr2.2 (A).
Nuclear genomic sequence (boxed white background) before (B)
and nuclear/chloroplast sequence after integration (C) is shown.
Filler DNA sequences (bold) are found at each of the junctions (a–b,
c–d, e–f, and g–h) and are derived from sequences near the end of
chloroplast DNA fragments (a*, b*, g*, and d*). In junctions c–d,
e–f, and g–h sequences adjacent to the filler DNA are homologous
with the template that promoted the formation of the junction
such that b and b*, g and g*, and d and d* are complementary.
the closer proximity of the 35S promoter, even though it
was in reverse orientation, was responsible for increased
aadA expression in vivo (fig. 4, P , 0.01, one-way ANOVA,
Bonferroni corrected).
The effect on expression of the relocated CAAT motif
within the de novo sr2 promoter was investigated by its
mutation to CTTT (fig 4, pG.sr2CTTT) and by deletion
of 35S promoter sequence further 5# of aadA (constructs
pG.CAAT and pG.CTTT). The CAAT motif had little influence on expression indicating that enhancer elements further 5# of the CAAT sequence were responsible for
activation. It is known that a number of enhancer elements
within the 35S promoter are able to act in either orientation (Fang et al. 1989).
Maturation of aadA Transcripts
Polyadenylation status was determined by 3# RACE and
found to occur at two different sites within the 189 bp
psbA 3# UTR (supplementary fig. S3, Supplementary Material online). Both sites matched the rather flexible AT rich
plant polyadenylation consensus sequence (Li and Hunt
1997) and one was described previously (Stegemann and
Bock 2006). 5# RLM-RACE, which amplifies only cDNA
from full-length capped mRNA, was used to determine
the aadA transcription start site and to confirm the correct
5# maturation of transcripts.
Activation of aadA in sr1 and sr2 was due to two independent rearrangements of the same initial integrant of the
chloroplast gene aadA in the nucleus (kr2.2) suggesting
that this nuclear locus may be particularly susceptible to
change. Therefore, we characterized this locus to determine
the nuclear sequence context. The full sequence suggested
that minor microhomology involving only two bases in
each case was responsible for the two rearrangements, consistent with their rarity.
The molecular characterization of the kr2.2 integrant
shed considerable light on the integration process itself.
Three fragments of chloroplast DNA from disparate parts
of the chloroplast genome comprised the integrant (figs.
5A and 6A). These fragments corresponded to nucleotides
95667–107107 in inverted repeat B (IRB) (or 135523–
146963 in IRA) and 64850–64880 and 48075–50450 in
the large single copy region of the tobacco plastid genome
(Shinozaki et al. 1986). Among the 16,757 bp of the de novo
nuclear integrant in kr2.2 a single nucleotide deletion (nucleotide 101286 in IRB or 141344 in IRA), within a homopolymeric stretch, was the only difference from the original
plastid sequence.
To investigate the molecular mechanisms of insertion,
we determined the sequence of the insertion site before
(fig. 5B) and after (fig. 5C) integration. No deletion of nuclear DNA occurred during integration but filler DNA (5–
15 bp) was added at each of the four junctions (fig. 5C). In
every case, the filler sequence was derived from the end of
one of the chloroplast DNA molecules being inserted (fig.
5C). Sequences adjacent to the filler DNA were usually homologous with the template that promoted the formation
of the junction such that, b and b*, c and c*, and d and d*
are complementary (fig. 5C). This suggests that short regions of homology are used to prime the synthesis of filler
DNA which is then used to mediate joining (fig. 6C–D). The
availability of filler templates at the sites of filler synthesis
must be necessary during insertion and so, to explain the
pattern of filler DNA observed in this insertion event, the
two ends of each of the chloroplast DNA molecules, and
the two nuclear DNA ends must all have been in close
proximity at the time of integration (fig. 6B–D).
The Frequency of Gene Activation
Discussion
The frequency of gene activation, which is of consequence
from both evolutionary and biotechnological perspectives,
was determined. The cell density of leaves from plants
grown under experimental conditions was found to be
;1,500 cells mm2 (supplementary text, Supplementary
Material online). Given an average explant area of ;34
Although it has become increasingly clear that organelle
genomes have contributed large numbers of genes to
the nuclear genomes of all eukaryotes, this pathway is usually not appropriately recognized in discussions of the possible origins of new nuclear genes (Kaessmann 2010). In
Arabidopsis, 18% of nuclear genes are thought to be
2024
Organelle to Nucleus Functional Gene Transfer · doi:10.1093/molbev/msr021
MBE
FIG. 6. A model of chloroplast DNA insertion in line kr2.2. (A) Insertion of three chloroplast fragments (green) at a single nuclear location
(blue). Lower case letters indicate DNA ends involved. (a) and (h) indicate nuclear DNA ends. (b) and (c), (d) and (e), and (f) and (g) represent
the ends of the three chloroplast DNA fragments inserted (A, C–E). (B) The ends of chloroplast and nuclear DNA are brought together to
facilitate joining at a central repair node. At this site (C), using primarily polymerase-dependent nonhomologous end joining, 3# overhangs are
extended (red lines) to generate regions of complementarity enabling subsequent joining of two DNA molecules by single strand annealing. 3#
overhangs at (f) and (e) are extended using opposite DNA strands of (b) as a template (C). This generates complementary filler sequence (g)
which can be used to facilitate their joining, generating junction f–e (D). Similarly, 3# overhangs at (g) and (h) are extended using opposite
strands of (c) as template (C). Again this generates complementary filler sequence (d) which can be used to join these two DNA ends. Filler
sequence (b) at junction c–d is synthesized using template sequence at (e). Sequence at junction a–b (a) is introduced through a local
duplication.
derived from the chloroplast and its ancestor (Martin et al.
2002) and studies in yeast suggest an even greater number
of nuclear genes may be derived from the mitochondrial
ancestor (Esser et al. 2004). Although the processes involved in endosymbiotic gene transfer are beginning to
be understood, a number of key questions still remain unanswered. First, how do organelle DNA fragments insert into nuclear chromosomes, second, what are the
advantageous and disadvantageous genetic consequences
of the insertions, and third, what kinds of secondary
changes are needed to activate organelle genes once they
arrive in the nucleus.
Insertion of Chloroplast DNA
Despite the high frequency with which nuclear integration
of chloroplast DNA occurs (Huang et al. 2003; Stegemann
et al. 2003), characterization of de novo chloroplast DNA
insertions has so far proven elusive due to the large size of
the inserts, the high copy number of chloroplast genomes
present in each cell and the presence of multiple ancestral
2025
MBE
Lloyd and Timmis · doi:10.1093/molbev/msr021
copies of chloroplast DNA fragments present within the
nuclear genome (Huang et al. 2004). Our complete characterization of the kr2.2 integrant and flanking nuclear
DNA has, therefore, shed light on a number of previously
unanswered questions.
Naturally occurring regions of chloroplast DNA within
the nucleus often contain adjacent fragments of chloroplast DNA from disjunct parts of the chloroplast genome
and these linked fragments may show different polarities
with respect to the plastome (Noutsos et al. 2005). It has
not hereto been clear whether this is due to sequential
multiple insertions at a single location, concomitant insertions of multiple fragments, or insertion of a single fragment followed by fragmentation, deletion, inversion, and
shuffling (Richly and Leister 2004; Matsuo et al. 2005). The
chloroplast integrant characterized contained sequence
from three separate regions of the chloroplast genome.
Southern analysis of the transplastomic line that gave rise
to kr2.2 (Sheppard et al. 2008) confirms that the rearrangement did not occur in the plastid genome prior
to nuclear transfer. Our results, therefore, demonstrate
concomitant insertion of three chloroplast DNA
fragments and provide the first unequivocal evidence
that much of the complexity observed in genome sequences may be generated during the primary integration
events.
Another factor contributing to the complexity of chloroplast sequences in the nucleus is the presence of filler
DNA at the multiple junctions formed during nupt insertion (Huang et al. 2004). This filler DNA is usually short in
length (5–30 bp) and often derived from nuclear or chloroplast sequence adjacent to the integration site (Huang
et al. 2004). Filler DNA is also commonly found at sites
of T-DNA integration (Windels et al. 2003; Zhu et al.
2006) and double strand break (DSB) repair (Gorbunova
and Levy 1997) suggesting that, at least in some instances,
a similar DNA insertion pathway may be involved in all
three processes.
In kr2.2, the colocalization of filler DNA template sequences and sites of filler DNA synthesis during integration (fig.
6C–D) requires all the eight DNA ends involved (two ends
for each of the three chloroplast fragments and the two nuclear DNA ends) to be in close proximity during the insertion
process (fig. 6B–D). It is possible that this was a chance association; if so it necessitates very large amounts of chloroplast DNA to be present in the nucleus at the time that this
insertion occurred. Significantly this nupt was formed during
male gametogenesis when paternal plastids are degraded and
their genomes eliminated from the next generation (Sheppard et al. 2008). One intriguing alternative possibility is that
free DNA ends of multiple chloroplast DNA fragments are
brought together through active recruitment to a single repair node, probably a site of DSB repair, where they are processed and joined to mediate nuclear integration. This would
contribute to the highly complex nature of chloroplast DNA
insertions observed in sequenced nuclear genomes and provide a pathway for the cointegration of chloroplast, mitochondrial, and possibly other sequences at the one locus.
2026
Secondary Rearrangements Lead to aadA
Activation
A number of molecular and bioinformatic studies have investigated the activation of organelle genes transferred to
the nucleus through evolutionary time (Figueroa et al.
1999; Adams et al. 2000; Cusack and Wolfe 2007). Although
these studies provide good examples of how genes transferred to the nucleus can become active, they are neither
exhaustive nor do they give any indication of the rate at
which activation of a transferred gene is likely to occur. This
last point is of great interest in plastid biotechnology in
order to fully understand the level of transgene containment provided by maternal inheritance.
Only one experimental study aimed to determine the
rate of gene activation within the nucleus (Stegemann
and Bock 2006). In that study, the chloroplast marker aadA
was transferred to the nucleus and a number of lines were
generated which showed activation by short simple deletions bringing the closely linked 35S nuclear promoter to
the 5# end of aadA to drive its expression. The frequency at
which aadA becomes activated in the nucleus must be artificially high as a strong nuclear promoter, in the same
transcriptional polarity, is always present within 1 kb of
the chloroplast gene. Therefore, we addressed this same issue using an alternative arrangement of the neo and aadA
genes that precluded activation by simple deletion.
A frequency of one activation per 2 108 cells was estimated, which is an order of magnitude lower than that
ascertained by Stegemann and Bock (2006) reflecting the
very different events needed to activate aadA in the two
studies. Nonetheless, the 35S promoter was implicated in
all the events, we observed suggesting that this is also an
overestimate of the frequency of natural chloroplast gene
activation in the nucleus. As the present study differs also
in that it involves many different novel integrant loci, the
results clearly demonstrate that activation involving native
tobacco nuclear DNA is very rare—probably too rare for
experimental simulation.
Some Chloroplast Genes Contain Elements that
Promote Fortuitous Nuclear Expression
One of the surprising findings of this research was that explants from two lines in which aadA had been transferred
to the nucleus in multiple (;10–17) copies, immediately
displayed strong homogeneous resistance to aminoglycosides suggesting significant aadA expression without sequence change. This nuclear activity was observed
despite removal of the CAAT box from the psbA promoter,
suggesting that other unidentified regulatory elements may
be present or that the TATA box alone is sufficient for lowlevel expression. As segregation analysis shows all copies of
aadA in these lines are at a single locus (Sheppard and
Timmis 2009), it seems most likely that the correlation
observed between copy number and expression is due to
additive low-level transcription rather than multiple copies
of aadA throughout the genome providing more opportunity for activation by nuclear enhancers.
Organelle to Nucleus Functional Gene Transfer · doi:10.1093/molbev/msr021
Remarkably, not only were the genes in the nucleus
transcribed from the native chloroplast promoter but polyadenylation was found to occur at two separate sites
within the psbA UTR, 3# of the aadA coding sequence.
This finding supports earlier suggestions that the AT-rich
nature of chloroplast noncoding regions may provide
abundant chance polyadenylation sites (Stegemann and
Bock 2006) for newly transferred genes. Indeed, it seems
that some organelle genes may arrive in the nucleus complete with all the requirements for immediate expression:
A promoter that has nuclear activity, a 3# UTR containing
cryptic polyadenylation signals and, according to a recent
report (Ueda et al. 2008), may even encode a transit peptide. Although such transfer is unlikely to give a high level
of expression, low-level expression could provide a ‘‘starting point’’ for gene transfer enabling selective maintenance of the gene while transcription, polyadenylation,
and protein targeting is gradually strengthened under selection by postinsertional mutation.
Whether the nuclear activity we have observed for the
psbA promoter applies more widely, remains to be determined. It is possible that other chloroplast promoters harbor nuclear activity and that older transfer events may have
been facilitated by genes having prokaryote promoters with
partial nuclear activity. For the aadA gene driven by the
psbA promoter, we estimate one active gene transfer
through multi-copy insertion in ;88,000–128,000 pollen
grains. This frequency is up to three times that of paternal
plastid transmission (Ruf et al. 2007; Svab and Maliga 2007)
and is more significant for transgene containment as it results in regular nuclear inheritance (Sheppard and Timmis
2009 and supplementary table S2, Supplementary Material
online) rather than the patchy inheritance associated with
heteroplasmic plants.
These findings warn that potential nuclear activity of
a chloroplast promoter should be considered when designing chloroplast transgenes, particularly in cases where complete containment is required. However, provided
chloroplast transgenes do not contain, or are not placed
adjacent to, sequences that can promote nuclear activation, containment by maternal inheritance is extremely effective. Under the right circumstances, plastid leakage
through the male parent represents the major threat to
escape.
Supplementary Material
Supplementary text, supplementary tables S1–S3, and
supplementary figures S1–S3 are available at Molecular
Biology and Evolution online (http://www.mbe.oxfordjournals.
org/).
Acknowledgments
We would like to thank Y. Li for technical assistance and A.
Sheppard for discussion in preparation of this manuscript.
This research was supported under the Australian Research
Council’s Discovery Projects funding scheme (project
number DP0986973).
MBE
References
Adams KL, Daley DO, Qiu YL, Whelan J, Palmer JD. 2000. Repeated,
recent and diverse transfers of a mitochondrial gene to the
nucleus in flowering plants. Nature 408:354–357.
Cornelissen M, Vandewiele M. 1989. Nuclear transcriptional activity
of the plastid psbA promoter. Nucleic Acids Res. 17:19–29.
Cusack BP, Wolfe KH. 2007. When gene marriages don’t work out:
divorce by subfunctionalization. Trends Genet. 23:270–272.
Delaney SK, Orford SJ, Martin-Harris M, Timmis JN. 2007. The fiber
specificity of the cotton FSltp4 gene promoter is regulated by an
AT-rich promoter region and the AT-hook transcription factor
GhAT1. Plant Cell Physiol. 48:1426–1437.
Esser C, Ahmadinejad N, Wiegand C, et al. (15 co-authors). 2004. A
genome phylogeny for mitochondria among alpha-proteobacteria and a predominantly eubacterial ancestry of yeast nuclear
genes. Mol Biol Evol. 21:1643–1660.
Fang RX, Nagy F, Sivasubramaniam S, Chua NH. 1989. Multiple cis
regulatory elements for maximal expression of the Cauliflower
Mosaic Virus 35S promoter in transgenic plants. Plant Cell.
1:141–150.
Figueroa P, Gomez I, Holuigue L, Araya A, Jordana X. 1999. Transfer
of rps14 from the mitochondrion to the nucleus in maize
implied integration within a gene encoding the iran-sulphur
subunit of succinate dehydrogenase and expression by alternative splicing. Plant J. 18:601–609.
Gorbunova V, Levy AA. 1997. Non-homologous DNA end joining in
plant cells is associated with deletions and filler DNA insertions.
Nucleic Acids Res. 25:4650–4657.
Hellens RP, Edwards EA, Leyland NR, Bean S, Mullineaux PM. 2000.
pGreen: a versatile and flexible binary Ti vector for Agrobacterium-mediated plant transformation. Plant Mol Biol.
42:819–832.
Huang CY, Ayliffe MA, Timmis JN. 2003. Direct measurement of the
transfer rate of chloroplast DNA into the nucleus. Nature
422:72–76.
Huang CY, Ayliffe MA, Timmis JN. 2004. Simple and complex
nuclear loci created by newly transferred chloroplast DNA in
tobacco. Proc Natl Acad Sci U S A. 101:9710–9715.
Humphries EC, Wheeler EC. 1960. The effects of kinetin, gibberellic
acid, and light on expansion and cell division in leaf disks of
Dwarf Bean (Phaseolus vulgaris). J Exp Bot. 11:81–85.
Kaessmann H. 2010. Origins, evolution, and phenotypic impact of
new genes. Genome Res. 20:1313–1326.
Kleine T, Maier UG, Leister D. 2009. DNA transfer from organelles to
the nucleus: the idiosyncratic genetics of endosymbiosis. Annu
Rev Plant Biol. 60:115–138.
Li QS, Hunt AG. 1997. The polyadenylation of RNA in plants. Plant
Physiol. 115:321–325.
Liu YG, Mitsukawa N, Oosumi T, Whittier RF. 1995. Efficient
isolation and mapping of Arabidopsis thaliana T-DNA insert
junctions by thermal asymmetric interlaced PCR. Plant J.
8:457–463.
Margulis L. 1970. Origin of eukaryotic cells. New Haven (CT): Yale
University Press.
Martin W, Rujan T, Richly E, Hansen A, Cornelsen S, Lins T, Leister D,
Stoebe B, Hasegawa M, Penny D. 2002. Evolutionary analysis of
Arabidopsis, cyanobacterial, and chloroplast genomes reveals
plastid phylogeny and thousands of cyanobacterial genes in the
nucleus. Proc Natl Acad Sci U S A. 99:12246–12251.
Mathis NL, Hinchee AW. 1994. Agrobacterium inoculation techniques for plant tissues. In: Gelvin SB, Schilperoort RA, editors.
Plant molecular biology manual. Dordrecht (The Netherlands):
Kluwer Academic Publishers. p. 1–9.
Matsuo M, Ito Y, Yamauchi R, Obokata J. 2005. The rice nuclear
genome continuously integrates, shuffles, and eliminates the
2027
Lloyd and Timmis · doi:10.1093/molbev/msr021
chloroplast genome to cause chloroplast-nuclear DNA flux.
Plant Cell. 17:665–675.
Murashige T, Skoog F. 1962. A revised medium for rapid growth and
bio assays with tobacco tissue cultures. Physiol Plantarum.
15:473.
Noutsos C, Kleine T, Armbruster U, DalCorso G, Leister D. 2007.
Nuclear insertions of organellar DNA can create novel patches
of functional exon sequences. Trends Genet. 23:597–601.
Noutsos C, Richly E, Leister D. 2005. Generation and evolutionary
fate of insertions of organelle DNA in the nuclear genomes of
flowering plants. Genome Res. 15:616–628.
Richly E, Leister D. 2004. NUPTs in sequenced eukaryotes and their
genomic organization in relation to NUMTs. Mol Biol Evol.
21:1972–1980.
Ruf S, Karcher D, Bock R. 2007. Determining the transgene
containment level provided by chloroplast transformation. Proc
Natl Acad Sci U S A. 104:6998–7002.
Schmidt GW, Delaney SK. 2010. Stable internal reference genes for
normalization of real-time RT-PCR in tobacco (Nicotiana
tabacum) during development and abiotic stress. Mol Genet
Genomics. 283:233–241.
Sessions A, Burke E, Presting G, et al. (22 co-authors). 2002. A highthroughput Arabidopsis reverse genetics system. Plant Cell.
14:2985–2994.
Sheppard AE, Ayliffe MA, Blatch L, Day A, Delaney SK, KhairulFahmy N, Li Y, Madesis P, Pryor AJ, Timmis JN. 2008. Transfer of
plastid DNA to the nucleus is elevated during male gametogenesis in tobacco. Plant Physiol. 148:328–336.
Sheppard AE, Timmis JN. 2009. Instability of plastid DNA in the
nuclear genome. PLoS Genet. 5:e1000323.
2028
MBE
Shinozaki K, Ohme M, Tanaka M, et al. (23 co-authors). 1986. The
complete nucleotide sequence of the tobacco chloroplast genome:
its gene organization and expression. EMBO J. 5:2043–2049.
Sparkes IA, Runions J, Kearns A, Hawes C. 2006. Rapid, transient
expression of fluorescent fusion proteins in tobacco plants and
generation of stably transformed plants. Nat Protoc.
1:2019–2025.
Stegemann S, Bock R. 2006. Experimental reconstruction of
functional gene transfer from the tobacco plastid genome to
the nucleus. Plant Cell. 18:2869–2878.
Stegemann S, Hartmann S, Ruf S, Bock R. 2003. High-frequency gene
transfer from the chloroplast genome to the nucleus. Proc Natl
Acad Sci U S A. 100:8828–8833.
Svab Z, Maliga P. 2007. Exceptional transmission of plastids and
mitochondria from the transplastomic pollen parent and its
impact on transgene containment. Proc Natl Acad Sci U S A.
104:7003–7008.
Timmis JN, Ayliffe MA, Huang CY, Martin W. 2004. Endosymbiotic
gene transfer: organelle genomes forge eukaryotic chromosomes.
Nat Rev Genet. 5:123–135.
Ueda M, Fujimoto M, Arimura SI, Tsutsumi N, Kadowaki KI. 2008.
Presence of a latent mitochondrial targeting signal in gene on
mitochondrial genome. Mol Biol Evol. 25:1791–1793.
Windels P, De Buck S, Van Bockstaele E, De Loose M, Depicker A.
2003. T-DNA integration in Arabidopsis chromosomes.
Presence and origin f filler DNA sequences. Plant Physiol.
133:2061–2068.
Zhu QH, Ramm K, Eamens AL, Dennis ES, Upadhyaya NM. 2006.
Transgene structures suggest that multiple mechanisms
are involved in T-DNA integration in plants. Plant Sci. 171:
308–322.