The Origin and Characterization of New Nuclear Genes Originating from a Cytoplasmic Organellar Genome Andrew H. Lloyd*,1 and Jeremy N. Timmis1 1 School of Molecular and Biomedical Science, The University of Adelaide, South Australia 5005 Australia *Corresponding author: E-mail: [email protected]. Associate editor: Douglas Crawford Abstract Key words: chloroplast, endosymbiosis, evolution, gene transfer, tobacco. Introduction Far beyond the traditional and well-researched origin of new genes by mutations in duplicated copies, genomics has revealed an array of previously unexpected sources of genetic novelty (Kaessmann 2010). One pathway that has often been ignored in discussions of the origin of novel genes is functional transfer from the cytoplasmic organelles or their endosymbiotic precursors (Timmis et al. 2004). Mitochondria and plastids were once free-living prokaryotes that evolved into extant cytoplasmic organelles during endosymbiotic evolution (Margulis 1970). The genomes of these organelles are now greatly reduced in size compared with those of their free-living ancestors. In large part, this is due to wholesale relocation of genes from the prokaryote ancestors to the nuclear genome followed by their subsequent deletion from the organelle genomes (Timmis et al. 2004;Kleine et al. 2009). The constant ingress of organelle DNA to the nucleus (Huang et al. 2003; Stegemann et al. 2003; Sheppard et al. 2008) has contributed a great diversity of new genetic material for evolutionary tinkering and it is a major past and present pathway for the generation of new genes (Timmis et al. 2004). In the majority of cases, organelle DNA transferred to the nucleus is nonfunctional and its sequence may decay rapidly and/or it may be eliminated from the nuclear genome (Matsuo et al. 2005; Sheppard and Timmis 2009). In rare cases, however, gene transfer is accompanied or followed by the acquisition of a nuclear promoter and a poly- adenylation signal leading to nuclear activation. Many proteins encoded by such genes retain their original function and are imported back into the organelle if they also acquire an appropriate transit peptide. Others may adopt novel roles that are unrelated to those they previously carried out (Martin et al. 2002). Smaller fragments of organelle DNA can also add to the complexity of nuclear genomes by contributing exonic or intronic sequences to existing nuclear genes (Noutsos et al. 2007). Recently, new techniques have enabled the experimental recapitulation and dissection of molecular events involved in endosymbiotic gene transfer. The frequency of DNA transfer from the chloroplast to the nucleus in tobacco (Nicotiana tabacum) has been measured (Huang et al. 2003; Stegemann et al. 2003; Sheppard et al. 2008) and was found to be unexpectedly high. These studies used transplastomic plants containing, within the chloroplast genome, an aminoglycoside resistance gene (aadA) used to select the transplastomic lines and a kanamycin resistance gene (neo) designed for exclusive nuclear expression. Though inactive in the chloroplast, migration of neo to the nucleus could be monitored simply by screening cells in culture or seedling progeny for kanamycin resistance. DNA fragments transferred to the nucleus in these experiments were large (Huang et al. 2003; Sheppard et al. 2008) and so most kanamycin resistant (kr) lines generated contained the adjacent aadA gene in addition to neo. As aadA was driven by the chloroplast promoters of psbA (Huang et al. 2003) or the rRNA operon (Stegemann et al. 2003), it © The Author 2011. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected] Mol. Biol. Evol. 28(7):2019–2028. 2011 doi:10.1093/molbev/msr021 Advance Access publication January 20, 2011 2019 Research article Endosymbiotic transfer of DNA and functional genes from the cytoplasmic organelles (mitochondria and chloroplasts) to the nucleus has been a major factor driving the origin of new nuclear genes, a process central to eukaryote evolution. Although organelle DNA transfers very frequently to the nucleus, most is quickly deleted, decays, or is alternatively scrapped. However, a very small proportion of it gives rise, immediately or eventually, to functional genes. To simulate the process of functional transfer, we screened for nuclear activation of a chloroplast reporter gene aadA, which had been transferred from the chloroplast to independent nuclear loci in 16 different plant lines. Cryptic nuclear activity of the chloroplast promoter was revealed, which became conspicuous when present in multiple nuclear copies. We screened ;50 million cells of each line and retrieved three plants in which aadA showed strong nuclear activation. Activation occurred by acquisition of the CaMV 35S nuclear promoter or by nuclear activation of the native chloroplast promoter. Two fortuitous sites within the 3# UTR of aadA mRNA both promoted polyadenylation without any sequence change. Complete characterization of one nuclear sequence before and after gene transfer demonstrated integration by nonhomologous end joining involving simultaneous insertion of multiple chloroplast DNA fragments. The real-time observation of three different means by which a chloroplast gene can become expressed in the nucleus suggests that the process, though rare, may be more readily achieved than previously envisaged. MBE Lloyd and Timmis · doi:10.1093/molbev/msr021 was not expected to be expressed in the nucleus/cytoplasm genetic compartment (Huang et al. 2003). The aadA gene should be expressed only if it acquired a nuclear promoter and the transcript was translated on 80S ribosomes— paralleling the pathway of genes functionally transferred during endosymbiotic evolution. A previous study (Stegemann and Bock 2006) used this pair of reporter genes arranged in the nucleus in the same transcriptional polarity relative to each other with 35Sdriven neo upstream of aadA. This set of experiments monitored nuclear activation of the aadA gene and found evidence of frequent activation. Approximately 1 in 3.3 107 cells of leaf explants cultured in medium containing aminoglycoside antibiotics could be regenerated into aminoglycoside resistant plants. However, in all cases, aadA activation was due to simple small deletions which brought the upstream strong nuclear promoter of neo to the 5# end of aadA. Thus, the frequency measured was artificially dependent upon the orientation of the experimental reporter genes, their proximity and their transcriptional polarity. To simulate more closely the natural process of functional endosymbiotic gene transfer, we used an arrangement of the reporter gene cassette that precludes simple deletion as a means of aadA activation, thereby enabling more evolutionarily authentic mechanisms to be revealed. We screened a large number of independent nuclear aadA loci and uncovered two entirely new mechanisms of functional transgene transfer: By multiple copy insertion of a transgene whose promoter has weak nuclear activity and by rearrangements that recruit a nuclear enhancer. In addition, we present the characterization of the nupt (nuclear integrant of chloroplast DNA) in kr2.2, the line that activated aadA by rearrangement. This represents the first characterization of a complete de novo nupt and its preinsertion site. Materials and Methods Growth Conditions, Plant Material, and Nucleic Acid Isolation N. tabacum plants were grown in soil in a controlled environment chamber with a 14 h light/10 h dark and 25 °C day/18 °C night growth regime or in sterile jars on 0.5 MS medium (Murashige and Skoog 1962) with a 16 h light/8 h dark cycle at 25 °C. Kr plant lines used in this study were generated in two previously published screens with the kr series from Huang et al. (2003) and the kr2 series from Sheppard et al. (2008). These lines were produced by crossing pollen from a transplastomic parent to female wild type to remove the transplastome. Plants used in the aminoglycoside resistance screen were kr progeny obtained from self-fertilized flowers on the original kr plants generated in the studies detailed above. Known hemizygous plants were kr progeny obtained by crossing the original kr plants to wild type. DNA was prepared using a DNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to manufacturer’s instructions. RNA was prepared using an RNeasy 2020 Plant Mini Kit (Qiagen) according to manufacturer’s instructions. aadA Activation Screen Leaf explants with an average area of ;34 mm2, from plants grown in sterile jars, were placed on culture plates containing MS104 medium (Mathis and Hinchee 1994) supplemented with 400 mg l1 spectinomycin and 200 mg l1 streptomycin. Resistant shoots were transferred to 0.5 MS medium to root and subsequently transferred to soil. For seedling selection, surface-sterilized seeds were grown on 0.5 MS medium containing 150 mg l1 kanamycin or 200 mg l1 spectinomycin. Plates were incubated at 25 °C with 16 h light/8 h dark. Cell counts were performed essentially as described (Humphries and Wheeler 1960). Leaves from plants grown in tissue culture were digested with pectinase (Sigma, St. Louis, MO) for 1 h at 60 °C, macerated to single cells and counted with a haemocytometer. aadA Expression Analysis For RT–polymerase chain reaction (PCR) DNA was removed from RNA samples using a TURBO DNA-free kit (Ambion, Austin, TX). Reverse transcription was performed using an Advantage RT-for-PCR kit (Clontech, Mountain View, CA) with an oligo(dT) primer in accordance with the manufacturer’s instructions. Samples were also prepared without RT. For amplification of aadA, primers aadAF5, and aadAR3 were used. RPL25 mRNA was amplified using primers L25F and L25R. The sequence of all primers can be found in supplementary table S3, Supplementary Material online. ImageQuant TL software (GE Healthcare, Buckinghamshire, United Kingdom) was used to quantify bands in RT–PCR for correlation with slot blot hybridization. Real-Time Quantitative PCR was performed as described (Schmidt and Delaney 2010). Sr line RNA was prepared from plants 2 months after transfer to soil. For amplification of aadA cDNA, primers QaadAF and QaadAR were used. For relative quantification, RPL25 mRNA, amplified using primers QL25F and QL25F, was used as internal standard. Polymerase Chain Reaction Thermal asymmetric interlaced (TAIL)-PCR was performed as described (Liu et al. 1995) using degenerate primer AD3 (Sessions et al. 2002) and gene-specific primers aadAT1, aadAT2, and aadAT3 (Stegemann and Bock 2006). Inverse PCR followed by Genome Walking and standard PCR were used to amplify the kr2.2 chloroplast integrant and its flanking nuclear DNA. For full details, see supplementary text, Supplementary Material online. RACE 5# and 3# RACE were performed using Marathon cDNA Amplification Kit (Roche, Indianapolis, IN) or FirstChoice RLM-RACE Kit (Ambion) according to manufacturers’ instructions. For 5# RACE PCR of neo mRNA, nested primers neoR1 and neoR2 were used sequentially in two rounds of PCR; for aadA, primers aadAT1, aadAT2, and aadAT3 were Organelle to Nucleus Functional Gene Transfer · doi:10.1093/molbev/msr021 MBE used sequentially in three rounds of PCR. For 3# RACE PCR, primers aadAF1 and aadAF2 (Sheppard and Timmis 2009) were used in two sequential rounds of PCR. Transient Expression Analysis The pGreen system of binary vectors (Hellens et al. 2000) was used in the transient expression assay. For full vector construction details, see supplementary text, Supplementary Material online. The transient expression assay was adapted from the Agrobacterium infiltration method (Sparkes et al. 2006). Triplicate samples were prepared for each expression construct with each sample comprising material from three infiltrations. b-glucuronidase (GUS) activity was determined by assaying fluorescence generated by hydrolysis of 4- methylumbelliferyl b-D-glucuronide (MUG (Delaney et al. 2007). All samples were measured with three technical replicates. Each replicate (40 ll) contained 3 lg total protein. The microtitre plate was incubated for 40 min at 37 °C prior to detection of 4methyl umbelliferone (4-MU) fluorescence. Fluorescence was measured and 4-MU levels were calculated by comparison with a standard curve generated using known concentrations of 4-MU. For the concentration of total protein used (75 lg ml1), linearity of fluorescence with respect to GUS concentration was determined (R2 . 0.99). Data were compared using one-way ANOVA with Bonferroni post hoc comparison. ANOVA was performed using Prism 5 (GraphPad Software, Inc., San Diego, CA). DNA Blot Analysis DNA slot blot was performed and image quantified as described (Sheppard and Timmis 2009). Levels of aadA hybridization and expression (as shown by RT–PCR) were compared. Correlation coefficient and P value were calculated using Prism 5. Accession Number The sequence of the kr2.2 chloroplast integrant and flanking nuclear DNA has been submitted to GenBank accession number: HM066935. FIG. 1. Comparison of aadA copy number and expression. (A) Explants grown on regeneration medium containing 400 mg l1 spectinomycin and 200 mg l1 streptomycin. Scale bars 5 10 mm. (B) DNA slot blot was performed using DNA from known hemizygous plants. Triplicates of each sample were probed with an aadA probe followed by probing with ribosomal DNA as a loading control. The graphs show average aadA hybridization, relative to kr2.9, after normalization to ribosomal DNA hybridization. Error bars show standard deviation. (C) RT–PCR showing aadA expression (þ), with no reverse transcriptase () and no template (-ve) controls. Control RT–PCRs with RPL25 primers are also shown. Results The Screen for aadA Activation To identify events leading to nuclear expression of aadA, we screened for aminoglycoside resistance in leaf explants from 16 kr tobacco lines (supplementary table S1, Supplementary Material online) in which aadA as well as neo had been cotransferred to independent nuclear genomic loci (Huang et al. 2003; Sheppard et al. 2008). Some of these lines were from experiments described in Huang et al. (2003) (the kr series) and some from Sheppard et al. (2008) (the kr2 series). One thousand leaf explants (;34 mm2) from each line were placed on plant regeneration medium containing spectinomycin and streptomycin to select for nuclear activation of aadA. Spontaneous mutations can provide resistance to each of these drugs separately but aadA must be involved to provide dual resistance. Two distinct types of resistance were observed: Several lines displayed immediate resistance to aminoglycosides in all cells of all explants screened and other lines showed very rare growth foci where aadA was activated as a result of an acquired mutation in a single cell. Multiple Copy Insertion of aadA Leads Immediately to Aminoglycoside Resistance The full psbA promoter shows weak nuclear activity thought to be dependent upon the presence of fortuitous TATA and CAAT boxes (Cornelissen and Vandewiele 1989). For this reason, the psbA promoter used in this study was truncated at the 5# end to remove the CAAT box. However, despite this modification, all cells of all explants from 2 of the 16 lines screened (kr2.7 and kr2.9) displayed 2021 Lloyd and Timmis · doi:10.1093/molbev/msr021 MBE nucleus sufficient to confer significant aminoglycoside resistance. Given that gene transfer from the chloroplast to the nucleus occurs once in every 11,000–16,000 pollen grains (Huang et al. 2003; Sheppard et al. 2008), this corresponds to a frequency of one active gene transfer due to multiple copy insertion in ;88,000–128,000 pollen grains. Mutational Activation of aadA Occurred Either by Acquisition of the CaMV 35S Nuclear Promoter or by Nuclear Activation of the Native Chloroplast Promoter FIG. 2. Nuclear activation of aadA in lines sr1 and sr2. (A) Selection for aadA activation in line kr2.2. Explants were grown on plant regeneration medium with 400 mg l1 spectinomycin and 200 mg l1 streptomycin. A single resistant shoot (sr1) can be seen. Scale bar 5 10 mm. (B) RT–PCR demonstrating aadA expression (þ), with no reverse transcriptase () and no template (-ve) controls. (C) Real-time quantitative PCR shows aadA mRNA accumulation in leaves. RPL25 normalized data are shown relative to sr1 mRNA levels. Error bars show standard deviation. Sr1 and sr2 show a significant increase in expression compared with the parent line kr2.2 (P , 0.05, Student’s t-test). Sr3 showed no increase in aadA expression. significant resistance to aminoglycosides (fig. 1A) and three others showed various degrees of partial resistance (kr2.10, kr19, and kr17). Most lines were fully sensitive (e.g., kr2.2 and kr2.3 shown in fig. 1A). Southern blots (Sheppard et al. 2008) indicate that the two highly resistant lines contain multiple nuclear insertions of chloroplast DNA, whereas lines that are susceptible have Southern signals consistent with single or low aadA copy numbers (Sheppard et al. 2008). We used quantitative hybridization to determine the relative aadA gene copy numbers from all lines in this study that arose from the screen of Sheppard et al. (2008): kr2.2, kr2.3, kr2.7, kr2.9, and kr2.10 (fig. 1B). Assuming that kr2.2, the line giving the lowest aadA hybridization signal, harbors a single copy of the gene, kr2.3, kr2.7, kr2.9, and kr2.10 contain 3, 11, 17, and 5 copies, respectively (these are approximate figures given the compound variance associated with the calculations). Transcript accumulation of aadA, assessed by RT–PCR relative to RPL25 in these lines (fig. 1C) is correlated (r 5 0.93, P , 0.05) with the number of nuclear copies of the gene. Of the 16 gene transfer lines tested, 2 (12.5%) contained multiple but probably unchanged copies of aadA in the 2022 Two independent resistant shoots (sr1 and sr2) emerged from one of the 13 remaining lines (kr2.2). Figure 2A shows one of the resistant shoots and RT–PCR of leaf RNA samples demonstrated elevated aadA transcripts in both (fig. 2B). The shoots were placed on 0.5 MS agar medium to generate roots then transferred to soil and grown into mature plants designated sr1 and sr2. After plant regeneration, aadA activation was confirmed by QPCR (fig. 2C) and progeny were tested for the inheritance of aminoglycoside resistance phenotype (supplementary fig. S1, Supplementary Material online). Sr1 showed a significant deviation from the expected 3:1, with a large excess of resistant progeny (R 5 116, S 5 3, P 5 1.49 1008). For discussion, see supplementary text, Supplementary Material online. Sr2 segregated normally (R 5 212, S 5 74, P 5 0.73). A third resistant shoot (sr3) from kr18 showed increased aadA transcript accumulation in the initial RT–PCR assay (fig. 2B) but this could not be detected by QPCR after 3 months further growth in the absence of selection (fig. 2C) and the spectinomycin resistance phenotype was not inherited by progeny (supplementary fig. S2, Supplementary Material online). To determine the mechanism of gene activation in the two lines showing stable activation of aadA, we recovered the sequence 5# of this gene (fig. 3) using TAIL-PCR and genome walking. The transcription start site was determined using 5# RLM-RACE (fig. 3). In sr1, aadA was activated by acquisition of the nearby CaMV 35S promoter which, together with the first 47 bp of neo, was duplicated and inserted in reverse polarity upstream of aadA (fig. 3B). Sr2 showed activation by the recruitment of nuclear gene regulatory elements within the 35S promoter to enhance nuclear transcription from the psbA promoter (fig. 3C). This activation was likely dependent upon the preexisting weak nuclear activity of the psbA promoter. A 98 bp deletion brought the 35S promoter closer to the psbA promoter. Two adjacent single nucleotide substitutions 25 bp upstream from the site of the deletion were also observed (fig. 3C) that are unlikely to be of functional significance (supplementary text, Supplementary Material online). In both sr1 and sr2, the two sequences joined by rearrangement or deletion show microhomology (2 bp) at the junction (fig. 3B–C). We verified that the relocated 35S promoter was the cause of activation using an in vivo transient expression assay, quantifying the expression of the GUS reporter gene Organelle to Nucleus Functional Gene Transfer · doi:10.1093/molbev/msr021 FIG. 3. Sequence rearrangements leading to aadA activation in sr1 and sr2. (A) Arrangement of experimental cassette in parent line kr2.2. (B) Sequence of part of the experimental cassette in sr1. 35S promoter (lowercase), neo first exon (underlined), STLS2 intron (bold italics), psbA promoter (uppercase), aadA open reading frame (ORF) (bold). Transcription of aadA (black arrow) is from the expected 35S transcription start site. An upstream 41 aa ORF exists from 241 / 89. The aadA ORF begins at þ1 (ORFs are indicated by black bars above the sequence). A black box marks the junction of STLS2 intron with the psbA promoter. The CA dinucleotide is shared by both the STLS2 intron and the psbA promoter representing microhomology that is commonly found at sites of double strand break repair. (C) Sequence of part of the experimental cassette in sr2. 35S promoter (lowercase), psbA promoter (uppercase), aadA (bold). Transcription of aadA is from the psbA promoter at position 81 relative to the start of translation (black arrow), 4 nt downstream of the plastid transcription start site. A TATA box is present within the psbA promoter and a putative CAAT box is present within the 35S promoter. A 98 bp deletion occurred in line sr2 between the 35S promoter and the psbA promoter. The new junction of 35S promoter and psbA promoter sequences is marked by a black box. The GA dinucleotide present at this site is found in both the 35S promoter and the psbA promoter. Two adjacent single nucleotide substitutions are found in sr2 at positions 172 and 173. The bases in bold shown above the substitutions show the original sequence of kr2.2 at each position. Arrows indicate the direction of transcription. MBE FIG. 4. Transient expression analysis of the novel sequence motifs that activate aadA. (A) GUS expression constructs. Promoters as follows: pG.GUS—negative control, no promoter; pG.35S—positive control, CaMV 35S promoter; pG.kr2.2—kr2.2 promoter (sequence 5# of aadA in line kr2.2); pG.sr2CAAT—sr2 promoter (sequence 5# of aadA in line sr2); pG.sr2CTTT—sr2 promoter with putative CAAT box sequence altered to the sequence CTTT; pG.CAAT— truncated sr2 promoter that included the putative CAAT box but no other known regulatory elements further 5#; pG.CTTT—similar to pG.CAAT but with putative CAAT box altered to the sequence CTTT. (B) Quantitative measurement of GUS activity. Total protein was prepared from leaf segments 3 days after infiltration with Agrobacterium strains containing the various expression constructs. GUS activity is shown as picomoles of product (4-methylumbelliferone; 4-MU) generated per hour per lg of total leaf protein. Values shown are the average activity of triplicate samples of protein extract with each sample comprising material from three independent infiltrations. Error bars show the standard deviation. Arrows below the line in (A) indicate 35S promoter sequence in reverse orientation. from a series of promoters involving variations of the initial psbA promoter sequence and the upstream nuclear sequences in sr2 and its progenitor kr2.2 (fig. 4). The deletion also introduced a CAAT motif at approximately the correct position relative to the TATA motif in the native psbA promoter (fig. 3C). The GUS expression results confirmed that 2023 MBE Lloyd and Timmis · doi:10.1093/molbev/msr021 mm2, and 1,000 explants for each line, a total of 650 million cells were screened with a resulting estimated frequency of one activation in 2 108 cells (kr2.7, kr2.9, and kr2.10 were excluded as they displayed homogeneous aminoglycoside resistance). Characterization of a De Novo Nuclear Integrant of Chloroplast DNA FIG. 5. Characterization of the kr2.2 integrant. Three disparate fragments of chloroplast DNA (gray background) with sizes of 14.4 kb, 31 bp, and 2.4 kb comprise the 16.8 kb integrant in kr2.2 (A). Nuclear genomic sequence (boxed white background) before (B) and nuclear/chloroplast sequence after integration (C) is shown. Filler DNA sequences (bold) are found at each of the junctions (a–b, c–d, e–f, and g–h) and are derived from sequences near the end of chloroplast DNA fragments (a*, b*, g*, and d*). In junctions c–d, e–f, and g–h sequences adjacent to the filler DNA are homologous with the template that promoted the formation of the junction such that b and b*, g and g*, and d and d* are complementary. the closer proximity of the 35S promoter, even though it was in reverse orientation, was responsible for increased aadA expression in vivo (fig. 4, P , 0.01, one-way ANOVA, Bonferroni corrected). The effect on expression of the relocated CAAT motif within the de novo sr2 promoter was investigated by its mutation to CTTT (fig 4, pG.sr2CTTT) and by deletion of 35S promoter sequence further 5# of aadA (constructs pG.CAAT and pG.CTTT). The CAAT motif had little influence on expression indicating that enhancer elements further 5# of the CAAT sequence were responsible for activation. It is known that a number of enhancer elements within the 35S promoter are able to act in either orientation (Fang et al. 1989). Maturation of aadA Transcripts Polyadenylation status was determined by 3# RACE and found to occur at two different sites within the 189 bp psbA 3# UTR (supplementary fig. S3, Supplementary Material online). Both sites matched the rather flexible AT rich plant polyadenylation consensus sequence (Li and Hunt 1997) and one was described previously (Stegemann and Bock 2006). 5# RLM-RACE, which amplifies only cDNA from full-length capped mRNA, was used to determine the aadA transcription start site and to confirm the correct 5# maturation of transcripts. Activation of aadA in sr1 and sr2 was due to two independent rearrangements of the same initial integrant of the chloroplast gene aadA in the nucleus (kr2.2) suggesting that this nuclear locus may be particularly susceptible to change. Therefore, we characterized this locus to determine the nuclear sequence context. The full sequence suggested that minor microhomology involving only two bases in each case was responsible for the two rearrangements, consistent with their rarity. The molecular characterization of the kr2.2 integrant shed considerable light on the integration process itself. Three fragments of chloroplast DNA from disparate parts of the chloroplast genome comprised the integrant (figs. 5A and 6A). These fragments corresponded to nucleotides 95667–107107 in inverted repeat B (IRB) (or 135523– 146963 in IRA) and 64850–64880 and 48075–50450 in the large single copy region of the tobacco plastid genome (Shinozaki et al. 1986). Among the 16,757 bp of the de novo nuclear integrant in kr2.2 a single nucleotide deletion (nucleotide 101286 in IRB or 141344 in IRA), within a homopolymeric stretch, was the only difference from the original plastid sequence. To investigate the molecular mechanisms of insertion, we determined the sequence of the insertion site before (fig. 5B) and after (fig. 5C) integration. No deletion of nuclear DNA occurred during integration but filler DNA (5– 15 bp) was added at each of the four junctions (fig. 5C). In every case, the filler sequence was derived from the end of one of the chloroplast DNA molecules being inserted (fig. 5C). Sequences adjacent to the filler DNA were usually homologous with the template that promoted the formation of the junction such that, b and b*, c and c*, and d and d* are complementary (fig. 5C). This suggests that short regions of homology are used to prime the synthesis of filler DNA which is then used to mediate joining (fig. 6C–D). The availability of filler templates at the sites of filler synthesis must be necessary during insertion and so, to explain the pattern of filler DNA observed in this insertion event, the two ends of each of the chloroplast DNA molecules, and the two nuclear DNA ends must all have been in close proximity at the time of integration (fig. 6B–D). The Frequency of Gene Activation Discussion The frequency of gene activation, which is of consequence from both evolutionary and biotechnological perspectives, was determined. The cell density of leaves from plants grown under experimental conditions was found to be ;1,500 cells mm2 (supplementary text, Supplementary Material online). Given an average explant area of ;34 Although it has become increasingly clear that organelle genomes have contributed large numbers of genes to the nuclear genomes of all eukaryotes, this pathway is usually not appropriately recognized in discussions of the possible origins of new nuclear genes (Kaessmann 2010). In Arabidopsis, 18% of nuclear genes are thought to be 2024 Organelle to Nucleus Functional Gene Transfer · doi:10.1093/molbev/msr021 MBE FIG. 6. A model of chloroplast DNA insertion in line kr2.2. (A) Insertion of three chloroplast fragments (green) at a single nuclear location (blue). Lower case letters indicate DNA ends involved. (a) and (h) indicate nuclear DNA ends. (b) and (c), (d) and (e), and (f) and (g) represent the ends of the three chloroplast DNA fragments inserted (A, C–E). (B) The ends of chloroplast and nuclear DNA are brought together to facilitate joining at a central repair node. At this site (C), using primarily polymerase-dependent nonhomologous end joining, 3# overhangs are extended (red lines) to generate regions of complementarity enabling subsequent joining of two DNA molecules by single strand annealing. 3# overhangs at (f) and (e) are extended using opposite DNA strands of (b) as a template (C). This generates complementary filler sequence (g) which can be used to facilitate their joining, generating junction f–e (D). Similarly, 3# overhangs at (g) and (h) are extended using opposite strands of (c) as template (C). Again this generates complementary filler sequence (d) which can be used to join these two DNA ends. Filler sequence (b) at junction c–d is synthesized using template sequence at (e). Sequence at junction a–b (a) is introduced through a local duplication. derived from the chloroplast and its ancestor (Martin et al. 2002) and studies in yeast suggest an even greater number of nuclear genes may be derived from the mitochondrial ancestor (Esser et al. 2004). Although the processes involved in endosymbiotic gene transfer are beginning to be understood, a number of key questions still remain unanswered. First, how do organelle DNA fragments insert into nuclear chromosomes, second, what are the advantageous and disadvantageous genetic consequences of the insertions, and third, what kinds of secondary changes are needed to activate organelle genes once they arrive in the nucleus. Insertion of Chloroplast DNA Despite the high frequency with which nuclear integration of chloroplast DNA occurs (Huang et al. 2003; Stegemann et al. 2003), characterization of de novo chloroplast DNA insertions has so far proven elusive due to the large size of the inserts, the high copy number of chloroplast genomes present in each cell and the presence of multiple ancestral 2025 MBE Lloyd and Timmis · doi:10.1093/molbev/msr021 copies of chloroplast DNA fragments present within the nuclear genome (Huang et al. 2004). Our complete characterization of the kr2.2 integrant and flanking nuclear DNA has, therefore, shed light on a number of previously unanswered questions. Naturally occurring regions of chloroplast DNA within the nucleus often contain adjacent fragments of chloroplast DNA from disjunct parts of the chloroplast genome and these linked fragments may show different polarities with respect to the plastome (Noutsos et al. 2005). It has not hereto been clear whether this is due to sequential multiple insertions at a single location, concomitant insertions of multiple fragments, or insertion of a single fragment followed by fragmentation, deletion, inversion, and shuffling (Richly and Leister 2004; Matsuo et al. 2005). The chloroplast integrant characterized contained sequence from three separate regions of the chloroplast genome. Southern analysis of the transplastomic line that gave rise to kr2.2 (Sheppard et al. 2008) confirms that the rearrangement did not occur in the plastid genome prior to nuclear transfer. Our results, therefore, demonstrate concomitant insertion of three chloroplast DNA fragments and provide the first unequivocal evidence that much of the complexity observed in genome sequences may be generated during the primary integration events. Another factor contributing to the complexity of chloroplast sequences in the nucleus is the presence of filler DNA at the multiple junctions formed during nupt insertion (Huang et al. 2004). This filler DNA is usually short in length (5–30 bp) and often derived from nuclear or chloroplast sequence adjacent to the integration site (Huang et al. 2004). Filler DNA is also commonly found at sites of T-DNA integration (Windels et al. 2003; Zhu et al. 2006) and double strand break (DSB) repair (Gorbunova and Levy 1997) suggesting that, at least in some instances, a similar DNA insertion pathway may be involved in all three processes. In kr2.2, the colocalization of filler DNA template sequences and sites of filler DNA synthesis during integration (fig. 6C–D) requires all the eight DNA ends involved (two ends for each of the three chloroplast fragments and the two nuclear DNA ends) to be in close proximity during the insertion process (fig. 6B–D). It is possible that this was a chance association; if so it necessitates very large amounts of chloroplast DNA to be present in the nucleus at the time that this insertion occurred. Significantly this nupt was formed during male gametogenesis when paternal plastids are degraded and their genomes eliminated from the next generation (Sheppard et al. 2008). One intriguing alternative possibility is that free DNA ends of multiple chloroplast DNA fragments are brought together through active recruitment to a single repair node, probably a site of DSB repair, where they are processed and joined to mediate nuclear integration. This would contribute to the highly complex nature of chloroplast DNA insertions observed in sequenced nuclear genomes and provide a pathway for the cointegration of chloroplast, mitochondrial, and possibly other sequences at the one locus. 2026 Secondary Rearrangements Lead to aadA Activation A number of molecular and bioinformatic studies have investigated the activation of organelle genes transferred to the nucleus through evolutionary time (Figueroa et al. 1999; Adams et al. 2000; Cusack and Wolfe 2007). Although these studies provide good examples of how genes transferred to the nucleus can become active, they are neither exhaustive nor do they give any indication of the rate at which activation of a transferred gene is likely to occur. This last point is of great interest in plastid biotechnology in order to fully understand the level of transgene containment provided by maternal inheritance. Only one experimental study aimed to determine the rate of gene activation within the nucleus (Stegemann and Bock 2006). In that study, the chloroplast marker aadA was transferred to the nucleus and a number of lines were generated which showed activation by short simple deletions bringing the closely linked 35S nuclear promoter to the 5# end of aadA to drive its expression. The frequency at which aadA becomes activated in the nucleus must be artificially high as a strong nuclear promoter, in the same transcriptional polarity, is always present within 1 kb of the chloroplast gene. Therefore, we addressed this same issue using an alternative arrangement of the neo and aadA genes that precluded activation by simple deletion. A frequency of one activation per 2 108 cells was estimated, which is an order of magnitude lower than that ascertained by Stegemann and Bock (2006) reflecting the very different events needed to activate aadA in the two studies. Nonetheless, the 35S promoter was implicated in all the events, we observed suggesting that this is also an overestimate of the frequency of natural chloroplast gene activation in the nucleus. As the present study differs also in that it involves many different novel integrant loci, the results clearly demonstrate that activation involving native tobacco nuclear DNA is very rare—probably too rare for experimental simulation. Some Chloroplast Genes Contain Elements that Promote Fortuitous Nuclear Expression One of the surprising findings of this research was that explants from two lines in which aadA had been transferred to the nucleus in multiple (;10–17) copies, immediately displayed strong homogeneous resistance to aminoglycosides suggesting significant aadA expression without sequence change. This nuclear activity was observed despite removal of the CAAT box from the psbA promoter, suggesting that other unidentified regulatory elements may be present or that the TATA box alone is sufficient for lowlevel expression. As segregation analysis shows all copies of aadA in these lines are at a single locus (Sheppard and Timmis 2009), it seems most likely that the correlation observed between copy number and expression is due to additive low-level transcription rather than multiple copies of aadA throughout the genome providing more opportunity for activation by nuclear enhancers. Organelle to Nucleus Functional Gene Transfer · doi:10.1093/molbev/msr021 Remarkably, not only were the genes in the nucleus transcribed from the native chloroplast promoter but polyadenylation was found to occur at two separate sites within the psbA UTR, 3# of the aadA coding sequence. This finding supports earlier suggestions that the AT-rich nature of chloroplast noncoding regions may provide abundant chance polyadenylation sites (Stegemann and Bock 2006) for newly transferred genes. Indeed, it seems that some organelle genes may arrive in the nucleus complete with all the requirements for immediate expression: A promoter that has nuclear activity, a 3# UTR containing cryptic polyadenylation signals and, according to a recent report (Ueda et al. 2008), may even encode a transit peptide. Although such transfer is unlikely to give a high level of expression, low-level expression could provide a ‘‘starting point’’ for gene transfer enabling selective maintenance of the gene while transcription, polyadenylation, and protein targeting is gradually strengthened under selection by postinsertional mutation. Whether the nuclear activity we have observed for the psbA promoter applies more widely, remains to be determined. It is possible that other chloroplast promoters harbor nuclear activity and that older transfer events may have been facilitated by genes having prokaryote promoters with partial nuclear activity. For the aadA gene driven by the psbA promoter, we estimate one active gene transfer through multi-copy insertion in ;88,000–128,000 pollen grains. This frequency is up to three times that of paternal plastid transmission (Ruf et al. 2007; Svab and Maliga 2007) and is more significant for transgene containment as it results in regular nuclear inheritance (Sheppard and Timmis 2009 and supplementary table S2, Supplementary Material online) rather than the patchy inheritance associated with heteroplasmic plants. These findings warn that potential nuclear activity of a chloroplast promoter should be considered when designing chloroplast transgenes, particularly in cases where complete containment is required. However, provided chloroplast transgenes do not contain, or are not placed adjacent to, sequences that can promote nuclear activation, containment by maternal inheritance is extremely effective. Under the right circumstances, plastid leakage through the male parent represents the major threat to escape. Supplementary Material Supplementary text, supplementary tables S1–S3, and supplementary figures S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals. org/). Acknowledgments We would like to thank Y. Li for technical assistance and A. Sheppard for discussion in preparation of this manuscript. This research was supported under the Australian Research Council’s Discovery Projects funding scheme (project number DP0986973). MBE References Adams KL, Daley DO, Qiu YL, Whelan J, Palmer JD. 2000. Repeated, recent and diverse transfers of a mitochondrial gene to the nucleus in flowering plants. Nature 408:354–357. Cornelissen M, Vandewiele M. 1989. Nuclear transcriptional activity of the plastid psbA promoter. Nucleic Acids Res. 17:19–29. Cusack BP, Wolfe KH. 2007. When gene marriages don’t work out: divorce by subfunctionalization. Trends Genet. 23:270–272. Delaney SK, Orford SJ, Martin-Harris M, Timmis JN. 2007. The fiber specificity of the cotton FSltp4 gene promoter is regulated by an AT-rich promoter region and the AT-hook transcription factor GhAT1. Plant Cell Physiol. 48:1426–1437. Esser C, Ahmadinejad N, Wiegand C, et al. (15 co-authors). 2004. A genome phylogeny for mitochondria among alpha-proteobacteria and a predominantly eubacterial ancestry of yeast nuclear genes. Mol Biol Evol. 21:1643–1660. Fang RX, Nagy F, Sivasubramaniam S, Chua NH. 1989. Multiple cis regulatory elements for maximal expression of the Cauliflower Mosaic Virus 35S promoter in transgenic plants. Plant Cell. 1:141–150. Figueroa P, Gomez I, Holuigue L, Araya A, Jordana X. 1999. Transfer of rps14 from the mitochondrion to the nucleus in maize implied integration within a gene encoding the iran-sulphur subunit of succinate dehydrogenase and expression by alternative splicing. Plant J. 18:601–609. Gorbunova V, Levy AA. 1997. Non-homologous DNA end joining in plant cells is associated with deletions and filler DNA insertions. Nucleic Acids Res. 25:4650–4657. Hellens RP, Edwards EA, Leyland NR, Bean S, Mullineaux PM. 2000. pGreen: a versatile and flexible binary Ti vector for Agrobacterium-mediated plant transformation. Plant Mol Biol. 42:819–832. Huang CY, Ayliffe MA, Timmis JN. 2003. Direct measurement of the transfer rate of chloroplast DNA into the nucleus. Nature 422:72–76. Huang CY, Ayliffe MA, Timmis JN. 2004. Simple and complex nuclear loci created by newly transferred chloroplast DNA in tobacco. Proc Natl Acad Sci U S A. 101:9710–9715. Humphries EC, Wheeler EC. 1960. The effects of kinetin, gibberellic acid, and light on expansion and cell division in leaf disks of Dwarf Bean (Phaseolus vulgaris). J Exp Bot. 11:81–85. Kaessmann H. 2010. Origins, evolution, and phenotypic impact of new genes. Genome Res. 20:1313–1326. Kleine T, Maier UG, Leister D. 2009. DNA transfer from organelles to the nucleus: the idiosyncratic genetics of endosymbiosis. Annu Rev Plant Biol. 60:115–138. Li QS, Hunt AG. 1997. The polyadenylation of RNA in plants. Plant Physiol. 115:321–325. Liu YG, Mitsukawa N, Oosumi T, Whittier RF. 1995. Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant J. 8:457–463. Margulis L. 1970. Origin of eukaryotic cells. New Haven (CT): Yale University Press. Martin W, Rujan T, Richly E, Hansen A, Cornelsen S, Lins T, Leister D, Stoebe B, Hasegawa M, Penny D. 2002. Evolutionary analysis of Arabidopsis, cyanobacterial, and chloroplast genomes reveals plastid phylogeny and thousands of cyanobacterial genes in the nucleus. Proc Natl Acad Sci U S A. 99:12246–12251. Mathis NL, Hinchee AW. 1994. Agrobacterium inoculation techniques for plant tissues. In: Gelvin SB, Schilperoort RA, editors. Plant molecular biology manual. Dordrecht (The Netherlands): Kluwer Academic Publishers. p. 1–9. Matsuo M, Ito Y, Yamauchi R, Obokata J. 2005. The rice nuclear genome continuously integrates, shuffles, and eliminates the 2027 Lloyd and Timmis · doi:10.1093/molbev/msr021 chloroplast genome to cause chloroplast-nuclear DNA flux. Plant Cell. 17:665–675. Murashige T, Skoog F. 1962. A revised medium for rapid growth and bio assays with tobacco tissue cultures. Physiol Plantarum. 15:473. Noutsos C, Kleine T, Armbruster U, DalCorso G, Leister D. 2007. Nuclear insertions of organellar DNA can create novel patches of functional exon sequences. Trends Genet. 23:597–601. Noutsos C, Richly E, Leister D. 2005. Generation and evolutionary fate of insertions of organelle DNA in the nuclear genomes of flowering plants. Genome Res. 15:616–628. Richly E, Leister D. 2004. NUPTs in sequenced eukaryotes and their genomic organization in relation to NUMTs. Mol Biol Evol. 21:1972–1980. Ruf S, Karcher D, Bock R. 2007. Determining the transgene containment level provided by chloroplast transformation. Proc Natl Acad Sci U S A. 104:6998–7002. Schmidt GW, Delaney SK. 2010. Stable internal reference genes for normalization of real-time RT-PCR in tobacco (Nicotiana tabacum) during development and abiotic stress. Mol Genet Genomics. 283:233–241. Sessions A, Burke E, Presting G, et al. (22 co-authors). 2002. A highthroughput Arabidopsis reverse genetics system. Plant Cell. 14:2985–2994. Sheppard AE, Ayliffe MA, Blatch L, Day A, Delaney SK, KhairulFahmy N, Li Y, Madesis P, Pryor AJ, Timmis JN. 2008. Transfer of plastid DNA to the nucleus is elevated during male gametogenesis in tobacco. Plant Physiol. 148:328–336. Sheppard AE, Timmis JN. 2009. Instability of plastid DNA in the nuclear genome. PLoS Genet. 5:e1000323. 2028 MBE Shinozaki K, Ohme M, Tanaka M, et al. (23 co-authors). 1986. The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J. 5:2043–2049. Sparkes IA, Runions J, Kearns A, Hawes C. 2006. Rapid, transient expression of fluorescent fusion proteins in tobacco plants and generation of stably transformed plants. Nat Protoc. 1:2019–2025. Stegemann S, Bock R. 2006. Experimental reconstruction of functional gene transfer from the tobacco plastid genome to the nucleus. Plant Cell. 18:2869–2878. Stegemann S, Hartmann S, Ruf S, Bock R. 2003. High-frequency gene transfer from the chloroplast genome to the nucleus. Proc Natl Acad Sci U S A. 100:8828–8833. Svab Z, Maliga P. 2007. Exceptional transmission of plastids and mitochondria from the transplastomic pollen parent and its impact on transgene containment. Proc Natl Acad Sci U S A. 104:7003–7008. Timmis JN, Ayliffe MA, Huang CY, Martin W. 2004. Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nat Rev Genet. 5:123–135. Ueda M, Fujimoto M, Arimura SI, Tsutsumi N, Kadowaki KI. 2008. Presence of a latent mitochondrial targeting signal in gene on mitochondrial genome. Mol Biol Evol. 25:1791–1793. Windels P, De Buck S, Van Bockstaele E, De Loose M, Depicker A. 2003. T-DNA integration in Arabidopsis chromosomes. Presence and origin f filler DNA sequences. Plant Physiol. 133:2061–2068. Zhu QH, Ramm K, Eamens AL, Dennis ES, Upadhyaya NM. 2006. Transgene structures suggest that multiple mechanisms are involved in T-DNA integration in plants. Plant Sci. 171: 308–322.
© Copyright 2026 Paperzz