Insertion of group II intron retroelements after intrinsic transcriptional

Insertion of group II intron retroelements
after intrinsic transcriptional terminators
Aaron R. Robart, Wooseok Seo*, and Steven Zimmerly†
Department of Biological Sciences, University of Calgary, 2500 University Drive NW, Calgary, AB, Canada T2N 1N4
Edited by Alan M. Lambowitz, University of Texas, Austin, TX, and approved February 27, 2007 (received for review January 22, 2007)
Mobile DNAs use many mechanisms to minimize damage to their
hosts. Here we show that a subclass of group II introns avoids host
damage by inserting directly after transcriptional terminator motifs in bacterial genomes (stem-loops followed by Ts). This property
contrasts with the site-specific behavior of most group II introns,
which insert into homing site sequences. Reconstituted ribonucleoprotein particles of the Bacillus halodurans intron B.h.I1 are shown
to reverse-splice into DNA targets in vitro but require the DNA to
be single-stranded and fold into a stem-loop analogous to the RNA
structure that forms during transcription termination. Recognition
of this DNA stem-loop motif accounts for in vivo target specificity.
Insertion after terminators is a previously unrecognized strategy
for a selfish DNA because it prevents interruption of coding sequences
and restricts expression of the mobile DNA after integration.
reverse transcriptase 兩 ribozyme 兩 mobile DNA 兩 bacteria
R
eduction of host damage is a basic principle of survival for
selfish DNAs. The most common mechanisms involve suppression of transcription and translation, and integration into
comparatively innocuous sites (1–4). Among retroelements, for
example, R2 elements insert site-specifically into defined sequences within rDNA arrays while leaving most rDNAs intact
(5). LINE1 elements have a less stringent preference for the
sequence TTTTA, which results in integrations into gene-poor
regions (6). Other retroelements target genomic loci through
protein–protein interactions, to insert at pol III promoters (7),
into telomeric elements (8), or into heterochromatin regions (9).
Group II intron retroelements avoid host damage in two ways.
First, the introns splice out of interrupted sequence at the RNA
level to reconstitute functional coding sequence. Second, the
introns insert site-specifically into compatible targets through
retrohoming, thereby limiting potential targets. The mechanism
of retrohoming is well characterized and is carried out by a
ribonucleoprotein (RNP) complex composed of excised intron
lariat and intron-encoded protein (IEP). The process is initiated
by reverse splicing of lariat RNA into the top strand of a DNA
target. The bottom strand is cleaved by the endonuclease domain
of the IEP, and the integrated intron is reverse transcribed by the
IEP, using the cleaved DNA as a primer. Repair processes
complete the insertion steps (10–12).
Normally, retrohoming of group II introns is highly sitespecific because of an ⬇20- to 35-bp target sequence. Thirteen
positions of the DNA target are recognized by base pairings
between the intron RNA and the DNA exons via the interactions
IBS1–EBS1, IBS2–EBS2 (intron and exon binding sites 1 and 2;
6 bp each), and either ␦-␦⬘ (IIA introns; 1 bp) or IBS3–EBS3
(IIB, IIC introns; 1 bp). Additional target-specificity comes from
IEP recognition of upstream exon DNA sequences ⫺23 to ⫺1
and downstream exon sequences ⫹4 to ⫹9 (11, 13, 14).
In contrast to such site-specificity, introns of the phylogenetic
subclass ‘‘bacterial class C’’ instead are located after intrinsic
transcriptional terminators (stem-loop followed by a run of Ts)
(15–17). In some genomes, an intron copy is found in multiple
sites throughout a genome, after nonidentical terminators (15),
suggesting that a motif is targeted rather than a discrete homing
site. This apparent property contrasts with the well characterized
6620 – 6625 兩 PNAS 兩 April 17, 2007 兩 vol. 104 兩 no. 16
phenomenon of retrohoming and suggests that a mechanistic
variation exists to account for the difference.
In addition to the unusual insertion properties, bacterial class
C introns are novel structurally. Their RNAs are abbreviated in
size compared with other group II introns, and they are considered IIC because they do not fit into the IIA and IIB structural
classes (11). Importantly for this study, they lack EBS2 and IBS2
motifs, which are normally involved in 5⬘ exon recognition.
Furthermore, their IBS1–EBS1 pairings are shortened from 6 bp
to 4 bp, producing in all a total of 5 bp between intron and exons
(IBS1–EBS1 and IBS3–EBS3), rather than 13 bp as for IIA and
IIB introns. Bacterial class C ORFs also lack the endonuclease
domain for cleavage of the bottom strand during the mobility
reaction, and so they must employ an alternative priming mechanism. Other bacterial introns also lack endonuclease domains
(e.g., bacterial classes A, D, and E), but introns of these classes
are nevertheless mobile (18, 19). The mechanism has been shown
to involve the DNA replication fork, with the intron reverse
splicing into either ssDNA or dsDNA and the primer being
provided by either the leading or lagging strand (20, 21).
To address how bacterial class C introns insert at terminator
motifs, we have investigated the B.h.I1 intron of Bacillus halodurans because it initially appeared to have a degree of sitespecificity, which would facilitate characterization. There are
five B.h.I1 intron copies in the B. halodurans genome (⬎99.8%
identity), with four located at analogous sites after the terminators of ribosomal RNA (rrn) operons B, D, F, and G (Fig. 1A).
The four sites are identical upstream of the insertion but
nonidentical downstream, suggesting that the introns inserted
independently into the four sites rather than spreading by
gene-conversion events.
Here we demonstrate the basis for insertion of bacterial class
C introns at terminator motifs by reconstituting a mobility
reaction in vitro for the B.h.I1 intron. Like other group II introns,
the mechanism proceeds by reverse splicing into the DNA target
and reverse transcription of the inserted intron. However, the
target is recognized in a remarkable fashion, with the RNP
recognizing a stem-loop structure in ssDNA that corresponds to
the hairpin structure formed in RNA during transcription termination. This insertion-specificity represents a new strategy for
selfish DNAs to avoid damage to their hosts.
Results
Defining the Target-Specificity of Bacterial Class C Introns. To define
more thoroughly the targets of bacterial class C introns, we
Author contributions: A.R.R., W.S., and S.Z. designed research; A.R.R. and W.S. performed
research; A.R.R. and S.Z. analyzed data; and A.R.R. and S.Z. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Abbreviations: RNP, ribonucleoprotein; IEP, intron-encoded protein; RT, reverse transcriptase; IBS, intron binding site; EBS, exon binding site.
*Present address: Department of Medicine, University of British Columbia, 2222 Health
Sciences Mall, Vancouver, BC, Canada V6T 1Z3.
†To
whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/cgi/content/full/
0700561104/DC1.
© 2007 by The National Academy of Sciences of the USA
www.pnas.org兾cgi兾doi兾10.1073兾pnas.0700561104
compiled and analyzed 68 target sites of 42 introns in 36 species
[Fig. 1B and supporting information (SI) Fig. 5A]. Fifty-five of
these insertions were after putative terminators, although some
sites may be weak or factor-dependent terminators because the
hairpins are followed by fewer than six Ts. The remaining 13
insertions were after inverted repeats lacking the run of Ts.
Interestingly, six of the 13 introns were located at the attC sites
of integrons (SI Fig. 5A). attC sites are inverted repeats recognized by integrases, but they do not operate as typical terminators (22).
All 68 sequences are capable of forming GC-rich stems of 5–17
bp, with the introns inserted 7–12 bp downstream, precisely
between IBS1 and IBS3 motifs, which are 4 bp and 1 bp,
respectively (Fig. 1B). The 50% consensus shows little sequence
identity except for the IBS1 sequence, a G䡠C base pair closing the
stem, the Ts of the terminator motif, and upstream As that
correspond to bidirectional properties of some terminators.
To examine whether an individual intron might have greater
site-specificity, we examined 21 targets of the S.th.I1 intron of the
Symbiobacterium thermophilum genome (Fig. 1C and SI Fig. 5B).
These intron copies have ⱖ99% identity, but their targets differ,
and a 60% consensus is surprisingly similar to Fig. 1B. Notably,
an 80% consensus cutoff does not show any sequence identity in
the stem, revealing no absolute sequence requirement beyond
Robart et al.
Fig. 2. A mobility reaction in vitro. (A) RNPs generated by IEP-dependent
splicing were incubated with a 5⬘-end-labeled 45-nt oligonucleotide corresponding to rrnF exons for 15 min at 30°C, in the presence or absence of dNTPs,
ddNTPs, and a primer complementary to the 3⬘ exon. Products were phenolextracted, precipitated, and resolved on a polyacrylamide/urea gel. RNA size
markers are indicated to the left, and band identifications are to the right. The
cropped region of the gel contained no bands. (B) Diagram of the putative
mobility product. Asterisks represent the radiolabel.
IBS1. The S.th.I1 insertions clearly indicate that a terminator
motif is recognized rather than a homing site sequence.
A Mobility Reaction in Vitro. To reconstitute a mobility reaction in
vitro, the IEP of B.h.I1 was expressed in Escherichia coli and
affinity-purified, yielding protein active in reverse transcriptase
(RT) assays with poly(A)䡠oligo(dT)18 (SI Fig. 6 A and B). The
IEP was incubated with in vitro transcripts of unspliced B.h.I1
RNA, which resulted in rapid splicing to form lariats (SI Fig. 6C).
The product of this reaction is expected to be a mobilitycompetent RNP consisting of lariat and IEP (SI Fig. 6D).
RNPs were then incubated with potential mobility substrates,
including DNA and RNA sequences of rrnF exons (Fig. 1 A).
Incubation with an ssDNA alone produced no discernible reaction (Fig. 2A, lane 2); however, addition of a primer complementary to the downstream exon produced a faint product band,
which was greatly stimulated by the addition of dNTPs (Fig. 2 A,
lanes 6 and 7). Because bacterial class C introns do not possess
endonuclease domains, a preformed primer may be needed to
mimic the intermediate formed in other intron mobility reactions. The importance of polymerization is demonstrated by
ddNTP inhibition (Fig. 2 A, lanes 6–8).
The structure of the reaction product is expected to be an
intron integrated into the DNA target between IBS1 and IBS3
motifs, with cDNA synthesis initiating at the downstream primer
(Fig. 2B). This identification was confirmed by RNase A and
RNase H digestions (SI Fig. 7B) and by excision of the radiolabeled band from a polyacrylamide gel followed by RT-PCR
and sequencing of the two junctions (data not shown).
To test the specificity of the mobility reaction, different DNA
PNAS 兩 April 17, 2007 兩 vol. 104 兩 no. 16 兩 6621
BIOCHEMISTRY
Fig. 1. Compilation of bacterial class C insertion sites. Sequences were
aligned based on IBS1, IBS3, and inverted repeat boundaries and are depicted
as DNA secondary structures except for A. IBS1–EBS1 and IBS3–EBS3 pairings
are indicated by rectangles and circles, respectively. (A) Insertion sites of B.h.I1.
Four intron copies are inserted after identical rrn terminator motifs, and the
sites are identical upstream but not downstream of the insertion site. The fifth
insertion site is after a terminator-like motif. Potential pairings are indicated
by gray shading. (B) Summary of 68 target sites, expressed as a 50% consensus.
Introns with more than seven natural insertions were omitted from the
compilation to reduce bias. The G䡠C pair closing the stem is not apparent in a
60% consensus. The complete alignment is shown in SI Fig. 5A. (C) Summary
of the 21 insertions of the S.th.I1 intron, expressed as a 60% consensus. See SI
Fig. 5B for the full alignment. (D) Biochemically inferred target requirements
for B.h.I1 insertion.
Fig. 3. Substrate specificity for in vitro mobility. Mobility reactions were performed at 30°C for 15 min with RNPs, dNTPs, and various substitutions of
radiolabeled substrates, either with or without primer. Substrate variations tested are diagrammed to the right and are defined in Materials and Methods.
dsDNA, dsDNA containing the same sequence as the substrate in Fig. 2, labeled at the 5⬘ ends of both strands; RNA, RNA transcript containing sequence of the
standard substrate, with the same complementary primer as the standard substrate; AS, 5⬘-labeled oligonucleotide corresponding to the antisense, or bottom
strand, of the standard substrate, and with an analogous complementary primer; pKS, 5⬘-end-labeled oligonucleotide corresponding to pBluescript sequence,
with a complementary primer; S(T3), 5⬘-end-labeled standard substrate oligonucleotide with a noncomplementary primer corresponding to the T3 promoter
sequence; S, standard substrate consisting of 5⬘-end-labeled oligonucleotide and complementary primer. Substrates are drawn to scale, and gray indicates the
3⬘ exon. Sizes of the DNA substrates are to the left of the gel (kb), and no bands were excluded in cropping.
and RNA substrate variants were examined (Fig. 3). Interestingly, a dsDNA with the same sequence as the substrate in Fig.
2 did not support the reaction, indicating that the RNP cannot
unwind a DNA helix, unlike the well studied Ll.LtrB intron, but
like the RmInt1 intron, which similarly lacks an endonuclease
domain (23) (Fig. 3, lane 1). Furthermore, an RNA substrate
failed to support a mobility reaction, either with or without a
primer (Fig. 3, lanes 2 and 3). Two ‘‘nonspecific’’ DNA sequences
did not support mobility, including a DNA sequence derived
from pBluescript plasmid and a DNA sequence corresponding to
the antisense rrnF DNA exons (Fig. 3, lanes 4–7). A noncomplementary primer likewise did not stimulate the reaction (Fig. 3,
lanes 8 and 9). Together, these data show that the reaction has
specificity for the native insertion site and demonstrate a requirement for an ssDNA substrate and a complementary primer.
To examine the possibility of partial reverse splicing into the
DNA substrate, which is a major product for some introns (24,
25), a 3⬘-end-labeled substrate was used to simultaneously
visualize partial and full integration products (SI Fig. 7C). Partial
reverse splicing indeed occurs and predominates in the absence
of primer, where previously no product was detected (Fig. 2). We
conclude that, although a reverse splicing reaction occurs with an
ssDNA substrate, the primer and dNTPs stimulate the mobility
reaction by driving forward the reversible integration of intron
RNA because of reverse transcription of the ribozyme, as has
been shown for the Ll.LtrB intron in vitro (26).
Dissection of Target Substrate Requirements. Having reconstituted
a mobility reaction, the substrate was dissected to determine the
basis for terminator motif recognition. Through deletions and
mutations, a minimal target was determined to be a DNA
oligonucleotide comprising 35 nt of 5⬘ exon, 5 nt of 3⬘ exon, and
a 17-bp anchor sequence, annealed with a complementary
primer (SI Fig. 8). As expected, mutations in IBS1 and IBS3
motifs inhibited the reaction, and mutations downstream of IBS3
had no effect (Fig. 4A). Mutation of the spacer sequence
upstream and downstream of the stem-loop had little consequence, revealing that Ts of the terminator are not essential to
intron insertion (Fig. 4B). Moreover, the tract of Ts could be
expanded to 8 nt, indicating flexibility in length; however,
6622 兩 www.pnas.org兾cgi兾doi兾10.1073兾pnas.0700561104
shortening of the tract blocked the reaction and indicated a
minimum 3- to 4-nt spacer requirement (Fig. 4B). Similarly,
lengthening the stem to 14 bp was tolerated, whereas reducing
it below the wild-type length of 9 bp blocked the reaction (Fig.
4C). Mutations in the distal loop sequence had little effect (SI
Fig. 8D).
Most importantly, the pairing and sequence requirements of
the stem were examined (Fig. 4 D and E). First, mispairing
mutations in the 3⬘ half of the stem were found to block the
reaction, but compensating mutations in the 5⬘ half restored
wild-type levels, revealing a lack of absolute sequence requirement in the stem (Fig. 3D, lanes 2 and 3). Further sequence
requirements were probed by locating three adjacent G䡠C base
pairs at proximal, central, and distal locations in an otherwise
A䡠T stem, with mispaired versions tested for comparison (Fig.
4D, lanes 4–9). In each case, the mispaired substrate did not
support intron insertion, but paired substrate did. A second set
of substrates with GC contents varying from 100% to 0% (Fig.
4E) again revealed all paired substrates to be reactive and
unpaired substrates unreactive. Together, these data indicate
that pairing is the critical feature of the stem rather than
sequence. Notably, stems with a high GC content or basal G䡠C
pairs were superior substrates, suggesting that stem stability
enhances the reaction, which is consistent with the high GC
content of stems in Fig. 1 B and C (70% and 87% GC).
We also compared the five natural targets of B.h.I1 (Fig. 1 A).
All five sequences gave insertion reactions equivalent to rrnF,
except for rrnG, which has a mispaired IBS3–EBS3. The reactivity of rrnG was reduced similarly to the IBS3 mutation in Fig.
4A (data not shown).
The biochemical experiments thus define the in vitro DNA
target to be a ⱖ9 bp DNA stem followed by a spacer of 4–8 nt,
a 4-nt IBS1 motif, a 1-nt IBS3 motif, and a bottom strand primer
with its 3⬘ end at ⫹5 to ⫹10 relative to the insertion site (Fig.
1D). Interestingly, these requirements are largely consistent with
the consensus structures for all class C targets (Fig. 1B). Having
failed to identify a homing site sequence requirement for B.h.I1
as we had initially expected, we next considered whether the
biochemical rules might be sufficient to explain the in vivo
insertions in the B. halodurans genome. The genome was
Robart et al.
scanned by using the program RNAMotif (27) for GC-rich stems
followed by IBS1 and IBS3 motifs. When a 70% GC stem was
required, no additional targets were found, whereas a 50%
requirement identified only four more targets (SI Fig. 9). Given
the complexities expected to influence formation of DNA
stem-loops in vivo, we conclude that no additional sequence
specificity is required to explain B.h.I1 insertions in vivo. Furthermore, we propose that the B.h.I1 target requirements largely
represent the specificity of other bacterial class C introns.
Discussion
We have established a previously undescribed mobility behavior
for group II introns in which a structural motif is targeted rather
than a defined homing site sequence. Furthermore, the specificity represents a new strategy of survival for selfish DNAs. This
strategy has three obvious advantages for the retroelement.
First, the specificity ensures that the element will insert outside
of transcription units and avoid inactivation of vital host genes.
Second, insertion after terminators provides the retroelement
with low levels of transcription via terminator read-through, thus
limiting further transpositions. The B.h.I1 intron, for example, is
expressed by 1% read-through of the rrnB terminator (W.S. and
S.Z., unpublished data), and a similar scenario is expected for
other class C introns. Finally, insertion after intrinsic terminator
motifs ensures the mobile DNA of finding targets in many
organisms because intrinsic terminators are found across the
eubacterial kingdom.
Bacterial genomes are densely packed (⬇90% coding), making
it difficult for a mobile DNA to avoid disrupting important genes.
Most bacterial mobile DNAs do not exhibit site-specificity and
instead limit damage by infrequent transpositions because of low
levels of transcription and translation. There are notable excepRobart et al.
tions, however, such as Tn7, which targets the attTn7 sequence
found in many bacterial genomes, and IS231, IS21, and IS911, which
each insert into sequences of other transposable elements (4). Some
IS elements, such as IS1397 (28) and IS621 (29), target extragenic,
palindromic repeats that are not terminators, a specificity similar to
class C introns. The targeting of terminator motifs by class C introns
provides an additional mechanism to avoid gene disruptions in
bacteria: the element elegantly locates the end of a transcription
unit and inserts after it.
A parallel strategy exists in eukaryotes, in that regions upstream of promoters are sometimes targeted by retroelements
because this region is usually noncoding. For example, Ty1 and
Ty3, and to a lesser extent Ty2 and Ty4, of Saccharomyces
cerevisiae insert upstream of pol III promoters (30), whereas Tf1
of Schizosaccharomyces pombe inserts upstream of pol II promoters (31). In the case of Ty3, the basis of specificity is well
defined and is due to recognition of proteins that assemble at
promoters, rather than recognition of a DNA sequence (7).
Targeting to terminator motifs has not been reported in eukaryotes, and this may be because of the nature of eukaryotic
termination signals. Pol II lacks a discrete termination signal,
and the poly(U) tract of pol III terminators can be found within
protein genes.
Unwound DNA is rarely found in the cell, which raises the
question as to how the RNP encounters the stem-loop in vivo.
The most probable source is ssDNA at the DNA replication fork
because this ssDNA is the substrate for other group II introns
that lack endonuclease domains (20, 21, 32). In this pathway, the
intron reverse splices into either ssDNA or dsDNA at the
replication fork and then uses nascent DNA of the leading or
lagging strands to prime reverse transcription of the intron.
Adapting this mechanism to explain class C intron mobility, the
PNAS 兩 April 17, 2007 兩 vol. 104 兩 no. 16 兩 6623
BIOCHEMISTRY
Fig. 4. Sequence and motif requirements of the DNA target substrate. Oligonucleotide variants with annealed 3⬘ exon primers were incubated with RNPs and
dNTPs for 2 min at 30°C and resolved on denaturing polyacrylamide gels. Mutations tested correspond to lane numbers. (A) Mutagenesis of IBS1, IBS3, and 3⬘
exon sequence. Gray shading shows deviations from wild-type sequence. (B) Mutagenesis of spacers upstream and downstream of the stem-loop. (C) Mutagenesis
of the stem length. (D) Mutagenesis of stem sequence. Mutations were made in sets of two to test mispaired and paired substrates. Substrates 2 and 3 maintain
GC content but not sequence of the stem; substrates 4 –9 test three G䡠C base pairs located at different positions in an A䡠T-rich stem. In all cases, paired but not
unpaired substrates were reactive. (E) Effect of GC content in the stem.
intron would reverse-splice into unwound ssDNA at the replication fork, in either strand, and reverse transcription would be
primed by either the leading strand or the lagging strand.
A second potential source for ssDNA is the transcription
bubble, which, although short, might allow a cruciform structure
to form more readily during transcriptional termination. One
possibility combining the two models is that the RNP may bind
to the DNA stem-loop during transcription, either with or
without the reverse-splicing reaction, and then remain bound to
the hairpin until the arrival of the replication fork. This model
is attractive because the RNP and RNA polymerase would
interact with opposite strands of DNA, such that the RNP–DNA
complex might persist during multiple passes of RNA polymerase. This model is also consistent with the observation that most
class C introns insert downstream of the motifs, whereas the
former model would predict no correspondence with the direction of transcription.
The model for an in vivo DNA hairpin target is strengthened
by the six intron examples that insert at attC sites of integrons.
Crystal structure data show that the integron integrase recognizes the attC site as a DNA stem-loop (33), and it has been
proposed that a hairpin within ssDNA presented during conjugation is the substrate for integration in vivo rather than dsDNA
(34). It follows that the ability of attC sites to form DNA
stem-loops in vivo may be what allowed them to become targets
for class C introns. Conjugation is a potential source of ssDNA
targets for class C introns also, particularly introns that insert
into attC sites; however, it is unlikely to be the primary source
because most class C introns are located on chromosomes rather
than plasmids (e.g., Sy. thermophilum; Fig. 1C).
Finally, we consider the evolutionary implications of the target
preference: did the ancestral intron mobilize by homing or by
insertion at terminator motifs? It has been proposed that
bacterial class C introns are the earliest branching based on IEP
phylogenies (35), which makes it possible that they represent an
ancestral form. However, as discussed previously (36), it is not
yet clear whether class C introns are in fact the earliest branching, and even if so the properties of the ancestor are not
straightforward because all lineages have diverged since their
split. Nevertheless, the mobility properties of class C introns
reinforce the retroelement character of group II introns in
bacteria and support the hypothesis for a retroelement ancestor
of known group II introns (37). The selfish DNA character is less
pronounced for group II introns in organelles, where many
introns are immobile and the mobile introns reside in housekeeping genes. For bacterial class C introns, one might question
whether they should even be considered introns because they are
excluded from genes and transcription units.
Materials and Methods
Plasmids. Oligonucleotides were purchased from Alpha DNA
(Montreal, QC, Canada), and sequences are provided in SI
Table 1. The ORF expression construct pQE30-BhOhis was
made by PCR amplification of the B.h.I1 ORF from genomic
DNA prepared as previously described (36) with the primers
BhHis-5⬘ and BHO3⬘-AS1. The resulting 1,267-bp DNA was
cloned into the SphI and SmaI sites of pQE-30 (Qiagen, Valencia, CA). The plasmid pKS-SEP3 was cloned in two steps. The
5⬘ portion of the B.h.I1 ribozyme was amplified from genomic
DNA with the primers B.h.Int-5⬘ and BHR5⬘-AS1, which produced a 704-bp fragment (18-nt 5⬘ exon and 686-nt intron) that
was cloned into the SmaI site of pBluescript II KS⫹ (Stratagene,
La Jolla, CA). The 3⬘ portion of the ribozyme was amplified with
the primers BHR3⬘-S1 and BHNG3⬘-AS1 to produce a 428-bp
DNA (271-bp intron, 150-nt 3⬘ exon), which was cloned into the
EcoRV site of the 5⬘ clone intermediate. Plasmid pKS-BhD35/10
was made by blunt ligation of a 45-bp DNA into the pBluescript
EcoRV site in the forward orientation. The 45-bp DNA was
6624 兩 www.pnas.org兾cgi兾doi兾10.1073兾pnas.0700561104
made by annealing and polymerizing the primers BhD35/10-S
and BhD10-AS. All PCR amplifications were performed by
using Pfu DNA polymerase (Stratagene) to minimize mutations,
and final constructs were verified by sequencing.
Expression and Purification of Recombinant B.h.I1 IEP. pQE30-
BhOhis was expressed in M15[prep4] E. coli cells (Qiagen). A
single colony was inoculated into 3 ml of LB with 100 ␮g/ml
ampicillin and 30 ␮g/ml kanamycin and grown at 37°C for 12 h.
The starter culture was diluted 50-fold into 50 ml of LB with the
same antibiotics and grown at 37°C to an OD600 of 0.3, followed
by induction with 0.5 mM IPTG for 4 h at 30°C. Cells were
harvested at 5,000 ⫻ g, washed, and resuspended in 500 ␮l of lysis
buffer (50 mM NaH2PO4, pH 8.0/0.5 M NaCl/2 mM imidazole/
0.5% Triton X-100). Cells were lysed on ice with sonication with
five bursts of 15 sec interspersed with 60-sec rests. All subsequent
steps were carried out at 4°C. Lysates were cleared at 10,000 ⫻
g in a microfuge and batch-bound to 0.2 ml of Ni-NTA Agarose
(Qiagen) for 30 min with constant gentle mixing. The resin was
loaded on a 1-ml syringe column and washed with 3 ml of
washing buffer (50 mM NaH2PO4, pH 8.0/0.5 M NaCl/20 mM
imidazole/0.5% Triton X-100), followed by elution with 250-␮l
aliquots of elution buffer (50 mM NaH2PO4, pH 8.0/0.5 M
NaCl/100 mM imidazole/0.5% Triton X-100). The majority of
protein eluted in the first two fractions, which were pooled.
Glycerol was added to a final concentration of 50%, DTT was
added to 5 mM, and the IEP was stored at ⫺20°C.
RT Assays. Poly(rA)䡠oligo(dT)18 RT assays were performed as
described previously (38) by using 500 ng (10 pmol) of purified
IEP per reaction and a modified assay buffer of 40 mM Tris䡠HCl
(pH 8.5), 300 mM NH4Cl, 5 mM MgCl2, and 5 mM DTT, at
either 37°C or 42°C for 10 min. For reactions with RNase, 1 ␮g
of RNase A (specific for U and C) and 5 units of RNase T1
(specific for G) were added to the reaction, allowing simultaneous release of the RT from bound RNAs and reverse transcription of the poly(A) template.
In Vitro Transcription Reactions. Transcription reactions were per-
formed with 0.5 ␮g of template pKS-SEP3 linearized by ClaI and
with 50 units of T7 RNA polymerase (Invitrogen, Carlsbad, CA)
according to the manufacturer’s protocol. Transcripts were extracted with phenol-CIA (25:24:1 of phenol:chloroform:isoamyl
alcohol) and ethanol-precipitated in the presence of 2 M NH4OAc,
washed with 70% ethanol, and resuspended in TE (10 mM
Tris䡠HCl, pH 7.6/1 mM EDTA). For radiolabeled B.h.I1 transcripts,
transcription included 10 ␮Ci of [␣-32P]UTP (3,000 Ci/mmol; GE
Healthcare, Piscataway, NJ) with 0.1 mM UTP and 0.5 mM other
NTPs. Transcripts were gel-purified on a 1% agarose TBE gel (90
mM Tris䡠borate, pH 8.3/1 mM EDTA), extracted by a MinElute gel
extraction column (Qiagen) according to the manufacturer’s instructions for dsDNA molecules, and resuspended in TE.
In Vitro IEP-Dependent Splicing Reactions. Before splicing, transcripts of B.h.I1 in TE were subjected to a folding protocol using
a PerkinElmer 2400 PCR machine (PerkinElmer, Wellesley,
MA) with the following temperature cycles: 90°C for 1 min, 75°C
for 5 min, and then cooling to 45°C over 15 min. Initial
experiments used radiolabeled transcript (20,000 cpm, 13 fmol)
and 500 ng (10 pmol) of purified B.h.I1 RT protein. Reactions
were initiated by adding to the prefolded RNA (5 ␮l) a prewarmed mixture containing IEP and buffer to make a final
volume of 25 ␮l of 1⫻ splicing buffer (40 mM Tris䡠HCl, pH 8.5/5
mM MgCl2/300 mM NH4Cl). The reaction was incubated at 30°C
for specified times and stopped by placement on dry ice.
Products were then phenol-CIA-extracted and ethanolprecipitated with 0.3 M NaOAc (pH 5.2) and 1 ␮g of carrier
tRNA (Sigma–Aldrich, St. Louis, MO). Samples were resusRobart et al.
In Vitro Mobility Assays. Substrates for mobility assays were
5⬘-end-labeled oligonucleotides that were made by using
[␥-32P]ATP (3,000 Ci/mmol; GE Healthcare) and T4 polynucleotide kinase (Invitrogen), with resulting specific activities of 106
to 107 cpm/␮g. For the experiment in SI Fig. 7C, oligonucleotides
were 3⬘-end-labeled by using [␣-32P]dCTP (3,000 Ci/mmol; GE
Healthcare) and terminal transferase (New England Biolabs,
Ipswitch, MA). In this case, the specific activity was made equal
to the 5⬘-end-labeled substrates by dilution with unlabeled
oligonucleotide. Radiolabeled DNAs were purified by a G-50
spin column (Sigma–Aldrich) (39), phenol-CIA-extracted, ethanol-precipitated with 1 ␮g of carrier tRNA, washed with 70%
ethanol, and resuspended in TE buffer.
Substrates for reverse-splicing assays were prepared by prefolding 100,000 cpm (23 fmol) radiolabeled oligonucleotide, as
described for RNA transcripts above, together with 20 pmol
complementary primer, and 200 ␮M of each dNTP or ddNTP in
a volume of 5 ␮l. Buffer was added to produce 25 ␮l with 1⫻
splicing buffer concentrations. Reactions were initiated by adding 25 ␮l of RNPs from the splicing reaction and incubated at
30°C. Reactions were for 15 min for experiments in Figs. 2 A and
3 and SI Figs. 7 and 8 A–C and 2 min for experiments in Fig. 4
and SI Fig. 8D. Reactions were stopped by placing on dry ice and
were then phenol-CIA-extracted and ethanol-precipitated by
using 0.3 M NaOAc (pH 5.2) with 1 ␮g of carrier tRNA
1. Boeke JD, Devine SE (1998) Cell 93:1087–1089.
2. Bushman FD (2003) Cell 115:135–138.
3. Chandler M, Mahillon J (2002) in Mobile DNA II, eds Craig NL, Craigie R,
Gellert M, Lambowitz AM (Am Soc Microbiol, Washington, DC), pp 305–366.
4. Craig NL (1997) Annu Rev Biochem 66:437–474.
5. Luan DD, Korman MH, Jakubczak JL, Eickbush TH (1993) Cell 72:595–605.
6. Moran JV, Gilbert N (2002) in Mobile DNA II, eds Craig NL, Craigie R, Gellert
M, Lambowitz AM (Am Soc Microbiol, Washington, DC), pp 836–869.
7. Kirchner J, Connolly CM, Sandmeyer SB (1995) Science 267:1488–1491.
8. Pardue ML, DeBaryshe PG (2003) Annu Rev Genet 37:485–511.
9. Zhu Y, Zou S, Wright DA, Voytas DF (1999) Genes Dev 13:2738–2749.
10. Belfort M, Derbyshire V, Parker MM, Cousineau B, Lambowitz AM (2002) in
Mobile DNA II, eds Craig NL, Craigie R, Gellert M, Lambowitz AM (Am Soc
Microbiol, Washington, DC), pp 761–783.
11. Lambowitz AM, Zimmerly S (2004) Annu Rev Genet 38:1–35.
12. Smith D, Zhong J, Matsuura M, Lambowitz AM, Belfort M (2005) Genes Dev
19:2477–2487.
13. Mohr G, Smith D, Belfort M, Lambowitz AM (2000) Genes Dev 14:559–573.
14. Singh NN, Lambowitz AM (2001) J Mol Biol 309:361–386.
15. Dai L, Zimmerly S (2002) Nucleic Acids Res 30:1091–1102.
16. Granlund M, Michel F, Norgren M (2001) J Bacteriol 183:2560–2569.
17. Zimmerly S, Hausner G, Wu X (2001) Nucleic Acids Res 29:1238–1250.
18. Martı́nez-Abarca F, Garcı́a-Rodrı́guez FM, Toro N (2000) Mol Microbiol
35:1405–1412.
19. Jiménez-Zurdo JI, Garcı́a-Rodrı́guez FM, Barrientos-Durán A, Toro N (2003)
J Mol Biol 326:413–423.
20. Ichiyanagi K, Beauregard A, Lawrence S, Smith D, Cousineau B, Belfort M
(2002) Mol Microbiol 46:1259–1272.
21. Zhong J, Lambowitz AM (2003) EMBO J 22:4555–4565.
Robart et al.
(Sigma–Aldrich). Products were resuspended in formamide
loading buffer and heated to 85°C for 2 min before loading onto
4% polyacrylamide (19:1 acrylamide:bisacrylamide)/8 M urea
gels. All reverse-splicing assays were verified by repeating the
data in an independent experiment.
The DNA substrates for mobility reactions consisted of a long
radiolabeled oligonucleotide and a short primer. For initial
characterizations (Figs. 2 and 3 and SI Figs. 7 and 8B) the
radiolabeled sequence was BhD35/10-S or the specified variation
with primer BhD10-AS. For subsequent assays (Fig. 4 and SI Fig.
8 A, C, and D), the radiolabeled sequence was BhDhs35/5/For or
a specified variation with M13 forward primer (For).
For the substrates in Fig. 3, the dsDNA substrate was prepared
by PCR amplification of pKS-BhD35/10 using primers T7 and
T3, followed by gel extraction and 5⬘-end-labeling. The RNA
substrate was prepared by in vitro transcription of pKS-BhD35/10
template, followed by 5⬘-end-labeling. The substrate S (T3) was
the oligonucleotide BhD35/10-S with T3 or no primer. The
substrate AS was the oligonucleotide BhDhs35/10-AS and the
primer 10S. The substrate pKS was the oligonucleotide pKShs45/F and primer For.
Bioinformatic Methods. Alignment of all bacterial class C and
S.th.I1 homing sites for Fig. 1 A and B was performed manually
by using BioEdit (www.mbio.ncsu.edu/BioEdit/bioedit.html).
Alignments are shown in SI Fig. 5. Genome scans for potential
unfilled homing sites were performed by using RNAMotif (27)
against the complete B. halodurans C-125 genome sequence
(BA000004/NC㛭002570) using the descriptor shown in SI Fig. 9A.
This work was supported by a grant from the Canadian Institutes of
Health Research, a Natural Sciences and Engineering Council PGS-D
studentship (to A.R.R.), and a Senior Scholar award from the Alberta
Heritage Foundation for Medical Research (to S.Z.).
22. Stokes HW, O’Gorman DB, Recchia GD, Parsekhian M, Hall RM (1997) Mol
Microbiol 26:731–745.
23. Muñoz-Adelantado E, San Filippo J, Martı́nez-Abarca F, Garcı́a-Rodrı́guez
FM, Lambowitz AM, Toro N (2003) J Mol Biol 327:931–943.
24. Yang J, Mohr G, Perlman PS, Lambowitz AM (1998) J Mol Biol 282:505–523.
25. Zimmerly S, Guo H, Eskes R, Yang J, Perlman PS, Lambowitz AM (1995) Cell
83:529–538.
26. Aizawa Y, Xiang Q, Lambowitz AM, Pyle AM (2003) Mol Cell 11:795–805.
27. Macke TJ, Ecker DJ, Gutell RR, Gautheret D, Case DA, Sampath R (2001)
Nucleic Acids Res 29:4724–4735.
28. Wilde C, Escartin F, Kokeguchi S, Latour-Lambert P, Lectard A, Clément JM
(2003) Nucleic Acids Res 31:4345–4353.
29. Choi S, Ohta S, Ohtsubo E (2003) J Bacteriol 185:4891–2569.
30. Lesage P, Todeschini AL (2005) Cytogenet Genome Res 110:70–90.
31. Singleton TL, Levin HL (2002) Eukaryot Cell 1:44–55.
32. Coros CJ, Landthaler M, Piazza CL, Beauregard A, Esposito D, Perutka J,
Lambowitz AM, Belfort M (2005) Mol Microbiol 56:509–524.
33. MacDonald D, Demarre G, Bouvier M, Mazel D, Gopaul DN (2006) Nature
440:1157–1162.
34. Bouvier M, Demarre G, Mazel D (2005) EMBO J 24:4356–4367.
35. Rest JS, Mindell DP (2003) Mol Biol Evol 20:1134–1142.
36. Toor N, Robart AR, Christianson J, Zimmerly S (2006) Nucleic Acids Res
34:6461–6471.
37. Toor N, Hausner G, Zimmerly S (2001) RNA 7:1142–1152.
38. Zimmerly S, Moran JV, Perlman PS, Lambowitz AM (1999) J Mol Biol
289:473–490.
39. Sambrook J, Russell DW (2001) Molecular Cloning: A Laboratory Manual (Cold
Spring Harbor Lab Press, Woodbury, NY), 3rd Ed.
PNAS 兩 April 17, 2007 兩 vol. 104 兩 no. 16 兩 6625
BIOCHEMISTRY
pended in formamide loading buffer (39), heated to 85°C for 2
min, and resolved on 4% polyacrylamide (19:1 acrylamide:bisacrylamide)/8 M urea gels. To make RNPs for in vitro mobility
reactions, 0.1 ␮g of nonradioactive intron transcript (280 fmol)
was prefolded and incubated in the same splicing buffer with the
same IEP concentrations at 30°C for 15 min, followed by
immediate use in mobility assays.