Selection of novel forms of a functional domain

© 1994 Oxford University Press
Nucleic Acids Research, 1994, Vol. 22, No. 11
2003-2009
Selection of novel forms of a functional domain within the
Tetrahymena ribozyme
Kelly P.Williams+, Hiroshi lmahori§, Denise N.Fujimoto and Tan Inoue*
The Salk Institute for Biological Studies, 10010 N. Torrey Pines Road, San Diego, CA 92037, USA
Received March 31, 1994; Revised and Accepted April 26, 1994
ABSTRACT
P5abc is an RNA structure within the self-splicing
Tetrahymena group I intron that provides an activation
function to the remainder of the ribozyme, either when
present in cis or when added in trans. This
69-nucleotide activator domain was replaced with
randomized sequence of 20 or 40 nt in length, and
individuals among these pools with sequences that
could functionally replace P5abc were selected. The
basis of selection was a reaction in which two separate
halves of the ribozyme became joined; selection was
completed by reverse transcription and the polymerase
chain reaction, using primers with sequence from either
side of the ligation junction. Selectant sequences fell
into three families that appear unrelated to P5abc; for
example they lack the A-rich bulge thought to be a
important feature of P5abc. Thus, rather than defining
some consensus sequence for activator domains, this
result reveals a certain tolerance in the ribozyme in its
ability to derive activation function from diverse
sequence types. In the context of splicing precursor
RNA, the new sequences supported self-splicing, but
failed to activate a related reaction, hydrolysis of the
3' splice site, implying that this region of the intron can
differentially control two related reactions.
INTRODUCTION
Despite fundamental differences between RNA and protein
folding, a portion of the Tetrahymena group I intron ribozyme
makes a particularly apt comparison with protein domains, the
distinct segments of primary sequence with functional and
structural independence (1), that in some cases may also have
been units of evolution (2). P5abc is a 69-nt element, comprising
a sixth of the total molecular mass of the intron (Figures 4 and
1). The AP5abc mutant ribozyme, precisely deleted of P5abc,
is essentially inactive under normal in vitro reaction conditions,
but can be reactivated by adding spermidine and elevating the
Mg 2+ concentration (3). Thus P5abc is neither part of the
catalytic core nor essential for its folding, yet provides an
activation function; a role in stabilizing ribozyme structure is
suggested. A close P5abc homolog is found in only about a sixth
of the known group I introns, but some others bear an extension
from the L5 loop in which at least an equivalent to the A-rich
bulge feature of P5abc has been discerned (4). From a viewpoint
of modular evolution of group I introns, P5abc may have been
acquired as one solution to a general problem of activating the
core, that has been solved differently in other subgroups.
Structural autonomy of P5abc has been revealed using solventbased free radicals to probe backbone ribose moieties; protection
from the reagent indicates internalization due to tertiary structure.
In the context of the ribozyme, about half the positions of P5abc
are protected, a higher level than any other region outside of the
catalytic core (which is almost completely protected) (5-8). But
P5abc also adopts tertiary structure as an isolated molecule,
evidenced by protection of seven positions in P5c and flanking
the A-rich bulge, a subset of the pattern observed in the intact
ribozyme (8). In the ribozyme, these seven P5abc-autonomous
protections are adopted at a distinctly lower Mg 2+ concentration
than the other protections within P5abc and throughout the
remainder of the intron; it has been suggested that P5abc folds
first to nucleate ribozyme folding (7,8). Perhaps the most vivid
evidence of its domain-like relationship to the rest of the ribozyme
is the fact that the separately synthesized P5abc molecule, added
in trans, reactivates the AP5abc deletion mutant (9). Gel-shift
experiments show that the two RNAs form a specific complex
(9); there is no apparent Watson-Crick base-pairing, although
specific tertiary interactions have been suggested (6,8,10).
Activation of the AP5abc deletion mutant is not uniquely
provided by P5abc. Not only can spermidine and extra Mg 2+
also reactivate the mutant, so can a protein molecule, the tyrosyltRNA synthetase CYT-18 of Neurospora mitochondria (11). This
protein was previously shown to be required for splicing by a
group I intron of Neurospora mitochondria, both in vivo and in
vitro (12,13), and is just one example of such proteins known
to facilitate group I intron splicing (14,15). CYT-18 can also
reactivate diverse splicing-defective point mutants of an intron
from bacteriophage T4 (16); it acts through binding the catalytic
core of these and other group I introns that lack P5abc, however
it does not efficiendy bind the P5abc-containing wild-type
Tetrahymena intron (11). That the P5abc activator domain can
*To whom correspondence should be addressed at: Department of Chemistry, Kyoto 606, Japan
Present addresses: +Institute of Cell Biology, National Research Council, viale Marx 43, 00137 Rome, Italy and 8The Institute of Scientific and Industrial
Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567, Japan
2004 Nucleic Acids Research, 1994, Vol. 22, No. 11
| — P 5 a b c ( 6 9 n t ) - | WT
r 40 random n t - ] 4 0 N
(242nt)
(213m)
LjjX) random m J J 20N (193 m)
^ ^ > _
. - ^ A P S a b c (173 nl)
upper-half RNAs
Table 1. Synthetic DNA strands and their uses
Initial randomized upper-half RNA template
Dl
143 nt, upper-half: PT7 + (5'end-126)
D2
89 nt, upper-half complement: (235-196)+N 20 + (126-98)
D3
109 nt, upper-half complement: (235-196) + N 4O +(126-98)
Lower-half RNA template
D4
54 nt, lower-half: P T7 +(5'end-254)
D5
30 nt, lower-half complement: (3'end-385)
Reverse transcription
D6
27 nt, upper-half complement: (3'end-216)
Selective (first) PCR
D7
28 nt, lower-half: (5'end-254)
D6
(above)
Upper-half template-regenerating (second) PCR; control templates
D8
58 nt, upper-half: PT7 + (5' end-41)
D6
(above)
lower-half RNA
(188
nt)
self-ligation ^
Figure 1. Self-ligation of ribozyme halves, the basis of selection for P5abcreplacements. The group I intron of Tetrahymena LSU rRNA (sequence in capital
letters) was divided in the L6 loop (dotted line indicates the connection forming
L6 in the intact ribozyme) into two halves (boxed), which are separately
nonfunctional but can hybridize to restore ribozyme activity (19). Standard
nomenclature of some base-paired regions and loops, and sequence numbering
of the intron (4) are indicated. Sequences not part of the intron (printed in lower
case) are a 4-nt sequence replacing the 14 nt of the wild-type LI, the 5'-miniexon, and nt outside of the separation at L6. Four forms of the upper-half RNA
were used, differing in the sequences extending from and connecting L5, as
indicated (see Figure 4 for the sequence of the wild-type P5abc). The upper-half
RNA pools used to initiate the 20N and 40N experiments lacked the 3' seven
nt of the sequence specified here. In the self-ligation reaction (large arrow), the
two halves became joined inverse to their normal order, by nucleophilic attack
of the 3'-OH of the terminal guanosine residue at the 5' splice site. Lined sequences
correspond to the D6 and D7 primers used in the selective (first) PCR. Also marked
are the restriction sites used ultimately for subcloning.
be replaced by a molecule so strikingly different as a protein
suggested to us that other, unrelated RNA sequences might also
be able to replace P5abc in the ribozyme, perhaps conferring
even stronger activation function than does P5abc. To test this
line of thought, we developed a system for the in vitro selection
of novel activator domains of the Tetrahymena ribozyme. We
report the selection and preliminary characterization of three
families of functional P5abc-replacement sequences, which do
not resolve but provide new insights into the mechanism of P5abc
function.
MATERIALS AND METHODS
General methods
Synthetic DNA strands, designated D 1 - D 8 (Table 1), were
prepared using phosphoramidite chemistry on an Applied
Biosystems model 490 synthesizer. PCR reactions were
performed with 30 temperature cycles (95°C, 1 min; 55°C, 2
min; 72°C, 3 min) using Pfu DNA polymerase (Stratagene)
according to manufacturer's instructions, with 200pmol of each
primer in a 200 pA reaction volume using a Perkin-Elmer-Cetus
model 480 thermocycler. T7 RNA polymerase was purified and
used as described (17) for transcription with uniform [32P]-GMP
PYJ-. the T7 RNA polymerase promoter sequence TAATACGACTCACTATA.
In parenthesis: end-inclusive DNA versions of sequence indexed in Figure 1 (or
its complement). N: randomized position (according to the manufacturer, the
reactivity of the phosphoramidites is such that each of these positions actually
had 30% T, 26% G, 24% C and 20% A).
labeling. Electrophoresis of DNA and RNA was in 4 %
polyacrylamide (29:1 acrylamide:bisacrylamide) gels containing
7 M urea, 90 mM Tris-borate, 0.2 mM EDTA; purifications
continued with crush-and-soak elution from the gel slice in 0.3
M NaOAc, pH 5.2, and ethanol precipitation. RNA was further
purified by gel permeation chromatography on Sephadex G-50
columns.
Starting templates (for RNAs of Figure 1)
For the lower-half RNA template, a segment of pTZIVSU (18)
was deleted according to (19), and the resulting plasmid used
to prime a PCR reaction with D4 and D5; the product was
purified in an agarose gel. Templates for the control WT and
AP5abc forms of the upper-half RNA were prepared by PCR
with D6 and D8 using pTZIVSU bearing either wild-type or
AP5abc intron sequences.
To prepare the templates for the inital 20N and 40N upperhalf RNAs of the selection experiments, 20 pmol Dl was
hybridized to 20 pmol 5'-[32P]-labeled D2 or D3 and the DNA
strands in these pairs were extended using Sequenase (US
Biochemicals) according to the manufacturer's instructions;
approximately 30% of the D2 or D3 molecules were converted
into complete bottom strands in these templates. Templates with
full-length bottom strands should have supported transcription
initiation equally well; more questionable would bethe efficiency
of transcript elongation in the downstream region of the template,
where top strand synthesis may have been incomplete (not
normally considered problematic for T7 RNA polymerase) or
damage sustained during synthesis of D2 or D3 may have blocked
RNA polymerase progress; we were content to observe that there
was little smearing below the gel band of the full-length
transcripts, no more than for comparable templates produced by
PCR (although direct measurement of active templates was in
principle accessible by single-round transcription with excess
RNA polymerase). One-half of each template preparation, which
we estimate contained 2.4 pmol of active template, was
transcribed; upper-half RNA yields were 95 (20N) and 52 (40N)
pmol.
Nucleic Acids Research, 1994, Vol. 22, No. 11 2005
START
hybridize oligo pairs
29 bDP
ZZ
Dl
p
^
5'-rriini-exon
primer extension
D2(20N)or
D3 (40N)
T
randomized P5abc region
V
template DNA for upper-half RNA
X
second PCR (D6+D8):
replace lower-half DNA with
T7 promoter and mini-5'-exon
V
transcribe upper-half RNAs
\
gel-purify
first PCR (D6+D7):
across ligation junction
subclone
sequence
assay activity
END
hybridize with
lower-half RNA
reverse transcribe (D6)
\
self-ligation reaction
5 mM Mg+\ 30°, 1 or 15 min
lower-half RNA ligated to upper-half RNA
5'-mini-exon
Figure 2. Outline of selection procedure.
Self-ligation reactions
Upper-half RNA (5 pmol) was hybridized with 10 pmol lowerhalf RNA by heating to 70°C in 5 jtl 0.1 mM EDTA for 10 min
and incubating at 20°C for 20 min. The sample was preheated
to 30°C and the reaction was initiated by dilution to 10 jtl
containing 50 mM Epps, pH 6.5, 200 mM NH4OAc, 5 mM
MgCl2, 0.1 mM EDTA; reactions were incubated 1 or 15 min
at 30°C, then stopped by adding 5 /tl of 12 mM EDTA. The
RNA was precipitated with ethanol, and resuspended in 12 /tl
of 0.1 mM EDTA. The reaction was monitored by subjecting
2 /il to electrophoresis, followed by autoradiography.
Completion of the selection cycle
Half of the RNA from each self-ligation reaction was incubated
2 min at 37°C with 100 pmol D6 in 20 /A containing 50 mM
Tris-Cl, pH 8.3, 75 mM KC1, 3 mM MgCl2, 10 mM DTT,
0.5 mM each dNTP. RNase H~ M-MLV reverse transcriptase
(BRL) was added and the sample incubated at 65°C for 10 min,
then 95°C for 5 min. The bulk of the RNA was degraded by
adding NaOH to 200 mM and incubating at 37°C for 10 min.
After neutralizing, the cDNA was precipitated with ethanol and
10 mg glycogen, then subjected to PCR using D6 and D7,
radiolabeling with [a-32P]-dATP. The product was purified by
electrophoresis in a 1.5 mm thick gel (gels used for the first three
cycles were composed of 5 % polyacrylamide without urea), and
subjected to a second PCR using D6 and D8. Half of the yield
was used as the transcription template to initiate the subsequent
selection cycle. The amplification product from the 1 min selfligation reaction was used to continue selection, except that in
the first cycle for both 20N and 40N, and in the second cycle
for 40N, no PCR products were obtained for the 1 min
timepoints; in these cases, the product from the 15 min timepoint
was substituted to continue selection.
Control experiments revealed that two types of background
could occur in this selection system. A carry-over background
was observed occasionally when upper-half RNA alone was taken
through the amplification steps; this was attributed to
contamination of the gel slice of the hypothetical product of the
first PCR with upper-half cDNA, which is readily amplified in
the second PCR. Smaller DNAs can sometimes be trapped in
the electrophoretic band of a larger DNA; we found the use of
long thick porous gels effective in eliminating carry-over. When
AP5abc upper-half RNA was allowed to react with lower-half
RNA for 15 min, the product was too scarce to be detected
directly by autoradiography, but there was sufficient ligated RNA
to yield products in the subsequent reverse transcription and PCR.
Such a reactivity background was probably also operating in the
early cycles of the 20N and 40N selections, with few appreciably
active molecules vastly outnumbered by very weakly active
molecules; thus directly detectable self-ligation activity did not
appear in the pools until after four cycles of selection.
Analysis of individual selectants
DNA from the second PCR of the sixth selection cycle for the
20N and 40N pools was subcloned into pTZIVSU using the Sphl
and BglE restriction sites; the sequence between these sites was
determined for 52 subclones. Outside of the randomized region,
the sequences exhibited no base-substitutions from the wild-type,
although the four consecutive A residues of P2.1 were found in
three instances to have been converted into three, and in one
instance, into five A residues. Plasmid pTZFVSU and its
derivatives bearing the AP5abc and each of the six different
selectant sequences (Figure 4) were digested with HindM to
prepare templates for precursor RNAs (with wild-type LI and
P2.1, and with both 5' and 3' mini-exons.) Precursor RNA (0.16
pmol) was incubated at 37°C for 5, 15, or 45 min in 10 /tl of
5 mM MgCl2 with either 50 mM Mops, pH 7, 0.2 mM GTP
(splicing) or 50 mM Epps, pH 8.3 (3' splice site hydrolysis),
and reactions were stopped by adding 10 /il of 7 M urea, 12 mM
EDTA. Reaction extent was evaluated by autoradiography after
electrophoresis.
Experimental outline
Several in vitro selection systems on a related plan have been
described recently for exploring RNA function (20, 21).
Typically, a pool of a large number of sequence-varied RNAs,
transcribed from a synthetic template, is treated so as to select
individual RNAs that exhibit a desired function; these individuals
are converted back into transcription templates by reverse
transcription and PCR, so that subsequent cycles of selection can
proceed, ultimately subcloning for analysis of individuals. The
functions that have been used as a basis for selection are mainly
of two types: specific binding to a ligand, or ability to act as
a substrate or autocatalyst for a reaction (such as cleavage or
ligation) that alters the RNA chain. RNA-altering reactions can
allow for the isolation of desired product RNAs in gels, but
ligation provides a better means of selection than cleavage, since
subsequent reverse transcription and PCR steps can be arranged
so as to amplify only the product RNAs.
We thought it might be possible to select novel forms of the
replaceable activator domain in the Tetrahymena ribozyme with
such an approach. Not only can the intron ligate its flanking
exons, it can also be a substrate for self-catalyzed ligation
reactions. PCR using oligonucleotides with convergent sequences
from opposite sides of the ligation junction is, in principle, an
extremely powerful method for selecting self-ligating RNAs from
a pool containing many inactive RNAs. An appropriate selection
2006 Nucleic Acids Research, 1994, Vol. 22, No. 11
system was developed (Figure 2 and Materials and Methods),
bearing two major potential problems in mind. First, since the
wild-type ribozyme is highly active, and the selective methods
are very powerful, such systems are extremely susceptible to
contamination from stray molecules bearing the wild-type
sequence. To avoid this problem, the 69-nt P5abc sequence was
replaced in starting RNA pools not with randomized sequence
of the same length, but with shorter sequences, of either 20 or
40 random nt (each of the four bases present at approximately
equal frequency at each position). Such size difference was
sufficient to allow separation from potential wild-type
contaminants in the two gel-purification steps employed in each
selection cycle. A second problem to address was that the
ribozyme is highly active under the conditions used for
transcription; the most active molecules in a sequence-varied
RNA pool could be lost if they engage in undesirable reactions
during the transcription reaction or if the transcripts are to be
purified in a gel. The problem of premature reaction was solved
by splitting the ribozyme into two halves (upper-half and lowerhalf RN As; Figure 1), neither active separately, but capable of
base-pairing in two regions to restore ribozyme activity. The
efficacy of such a bimolecular ribozyme has been demonstrated
previously for the Tetrahymena and other group I introns (19,
22). Separate synthesis of the two ribozyme halves had other
technical advantages. A large batch of the lower-half RNA was
prepared and used throughout the course of the selection
procedures; fresh replacement of lower-half RNA in each cycle
reduced the potential for accumulation of unwanted mutations
outside the randomized region. P5abc resides in the upper-half
RNA, which is short enough so that the templates for transcription
of the initial sequence-varied RNAs could be constructed simply
by enzymatically extending synthetic DNA strands paired at their
3' ends. For larger initial RNAs, incorporation of specified
sequence variation into the transcription template would require
DNA ligation, which could engender multiple side-products.
We chose an equivalent to the self-circularizing reaction of the
intact intron (23), as a basis for selecting sequences that could
functionally replace the P5abc domain of the Tetrahymena
ribozyme. Circularization can occur after the 3' exon has been
lost from the precursor due to either splicing or hydrolysis at
the 3' splice site. The 3'-terminal residue of all group I introns
is a guanosine; me exposed 3'-OH of the intron thus resembles
the 3'-OH of free guanosine, the nucleophile that normally
initiates splicing by specific binding to the intron and attack of
the 5' splice site. The preferred site of attack by the terminal
3'-OH of the intron is likewise the 5' splice site, but certain
positions in the loop LI can also serve as circularization sites.
To eliminate secondary circularization sites, the LI loop was
replaced in upper-half RNA with a shorter inert sequence (Figure
1). Reaction was allowed to proceed for 1 or 15 min, at which
timepoints the reaction with the upper-half RNA bearing the wildtype P5abc was in an early phase (Figure 3).
In our case, the equivalent to the self-circularized intron was
of course a linear self-ligation product of the two ribozyme halves.
This product RNA was selectively amplified in the DNA phase
of the selection cycle. Following the self-ligation reaction, RNAs
were treated by reverse transcription using a primer
complementary to the 3' end of the upper-half RNA; thus all
upper-half RNAs, not just those with lower-half RNA attached
upstream, were subject to conversion into cDNA. The cDNA
of ligated RNA was by design substantially longer than that of
unligated RNA; more importantly, the subsequent PCR reaction
20N 20N 20N 40N 40N WT WT Upper AP5abc
-
+
+
+
+
+
+ Lower
+
15'
1'
15'
1'
15'
1'
15' Time
15'
419
390
370
Figure 3. Self-ligation activity attained by in vitro selection. The indicated
combinations of 5 pmol upper-half and 10 pmol lower-half RNAs were hybridized
and allowed to react (reaction time in minutes) as described in Materials and
Methods; 20N and 40N upper-half RNAs were from the pools after six selection
cycles. Densitometric scans of multiple exposures of the gel indicated that
approximately 150 fmol of the ligation product was produced in the 15 min WT
control. Reaction extents, relative to the 15 min WT control (100 %), were more
accurately determined as 43%, 64%, and 12%, for the 15 min 20N, 15 min 40N,
and 1 min WT samples, respectively. Ligation products were not detected for
the other samples. Substrate and product RNAs are marked by their expected
lengths in nt (see also Figure 1); lines mark the positions of end-labeled DNA
markers of 310, 281, 271, 234 and 194 nt in a lane not shown. Lower-half RNA
(188 nt) was radiolabeled at a lower specific activity than the upper-half RNAs.
amplified only the desired cDNA, by using primers with sequence
from either side of the ligation junction. A second PCR was then
necessary to regenerate the template for the upper-half RNA in
order to proceed with the subsequent selection cycle.
RESULTS
For both the 20N and 40N experiments, self-ligation activity was
not directly detectable (by autoradiography) for the starting RNA
pools (data not shown), but became observable after four selection
cycles; activity further improved after both the fifth and sixth
cycles, nearly attaining the level of activity produced by upperhalf RNA containing wild-type P5abc (Figure 3), but there was
no further improvement after a seventh cycle. Selectant DNA
was subcloned and individual sequences were determined (Figure
4). Each pool was dominated by a particular sequence (20N-a
or 40N-la); the 20N pool additionally contained two variant
sequences, one differing from the dominant sequence by a single
base and the other differing at six positions. The 40N pool also
contained a single-base variant of the dominant sequence,
however an unrelated sequence (40N-2) was also found. At the
primary structural level, these three selectant families bear no
apparent relationship to P5abc (or each other).
Nucleic Acids Research, 1994, Vol. 22, No. 11 2007
Table 2. Activity of selectants in precursor form
Sequence at
P5abc region
•£=g'
&
u
u
20N-b
G-*-A(n=l)
S
20N-C
(si* c g e s
in bold type)
u.g
5f-
wild-type
P5abc
~gsu~
20N-a
(n=2S)
Figure 4. P5abc and replacement sequences that were selected. Six different
selectant sequences were found, falling into three apparent families; the number
of occurrences (n) among the analyzed subclones are given. Sequences are
presented in the secondary structure predicted by a standard RNA-folding algorithm
(24), with positions that were randomized in upper case and nonrandomized
neighboring sequences in lower case. Alternative potential base-pairing partner
sequences are overlined. 20N-b and 40N-lb differ from 20N-a and 40N-la by
single G-to-A transversions (large arrows). Small arrows mark the six bases (bold
type) that differ between 20N-la and 20N-lc. Note that several intra-family
differences are compensatory base-pair changes in the given structures.
The Zuker free-energy minimization algorithm (24) was used
to predict secondary structures for the selectant sequences (Figure
4); such predicted structures must be viewed with some
skepticism, but are a useful starting point for discussion. Some
of the base changes within the selectant families tend to support
the assignment of these strucures in that they conserve predicted
base-pairing: the single G - A difference between 40N-la and
40N-lb maintains a predicted pairing with a U; four of the six
base differences between 20N-a and 20N-C are conservative
changes of two base-pairs in the predicted structure. The
secondary structures show some resemblances to that predicted
for P5abc which may be only superficial. The A-rich bulge
feature, with its 3' end proximal to P5, has been discerned in
many group I introns in sequence extending from the L5 loop
(4) and some functional importance has been attached to it
(8,10,25); a bulge in the predicted 40N-2 structure is A-rich,
but is oriented with its 3' end distal to P5. The structure given
for 40N-la and b, like that of P5abc, has three closely joined
base-paired stems.
Alternative potential base-paired stems can be found for each
selectant (Figure 4, overlined sequences), particularly convincing
for the 40N-1 family. Each alternative stem exhibits, with respect
to a stem of the computer-predicted structure, the general
interdigitated topology of a pseudoknot (26), i.e., the partner
segments of each stem surround one partner of the other.
However, these hypothetical pseudoknots do not appear to be
favored, because die loops would be very short and one or two
bases would intervene between the stems (27); therefore the
potential stems are probably instead mutually exclusive. A
potential alternative stem was also observed in P5abc, that is
definitely mutually exclusive with both the P5b and P5c stems
(Figure 4). The overlapping organization of these potential
alternative base-paired structures might facilitate the transition
from one hairpin to the other, perhaps upon interaction with the
Splicing
3' splice site
hydrolysis
wild-type
AP5abc
20N-a
-b
-c
40N-la
-lb
40N-2
Individual selectant sequences, in (unimolecular) splicing precursor RNA form,
were tested for activity in the splicing and 3' splice site hydrolysis reactions,
as described in Materials and Methods. Symbols: + + + , rate for the wild-type;
+ +, 30-60 % wild-type rate; + , 10-20 % wild-type rate; - , no activity
detected.
remainder of the intron in such a way as to dynamically activate
the catalytic core.
Precursor RNA (the conventional unimolecular intron, with
wild-type LI and flanked by mini-exons) was prepared for each
of the selectants and assayed for splicing activity (Table 2). All
showed reasonable activity, although none was as active as the
wild-type precursor. From the 40N selectant pool, the rare 40N-2
was more active than the predominant 40N-la. The same
precursors were also assayed for hydrolysis of the 3' splice site,
another well-known activity of the intron that is related to splicing.
Surprisingly, none of the selectants performed this reaction at
detectable levels.
DISCUSSION
From RNA pools in which the P5abc region of the Tetrahymena
ribozyme was replaced with shorter randomized sequences, we
selected three families of sequences that provide to the catalytic
core an activation function comparable to that of P5abc, both
in the self-ligation reaction of the split ribozyme and, in the
context of the conventional precursor RNA, self-splicing. We
were unable to discern clear similarities among the new sequence
families and P5abc; for example the A-rich bulge is not properly
represented among the selectants. Thus rather than defining
critical features of sequence or structure in the P5abc activator
domain, this result instead emphasizes a tolerance in the catalytic
core in its ability to derive activation function from apparently
quite different RNA sequence types. This tolerance is in general
accord with the reactivation of the DP5abc deletion mutant by
the CYT-18 protein or by elevated Mg 2+ concentration in the
presence of spermidine, and with the absence of a P5abc-like
structure in many group I introns.
Is the ribozyme merely tolerant in its ability to use apparently
dissimilar RNA sequences, or is it indiscriminate? The inability
of the initial upper-half RNA pools to support directly detectable
self-ligation shows that most 20- or 40-nt random sequences do
not provide activation function, although it is possible that most
sequences interfere with proper folding of the rest of the
ribozyme. What then is the fraction of the possible sequences
that can replace P5abc function? At face value, the finding that
our selectant pools were dominated by few different sequences
would suggest that this fraction is indeed very small, that there
were very few active sequences in the original pools. However
we cannot rule out unintended selective principles acting later
2008 Nucleic Acids Research, 1994, Vol. 22, No. 11
in the procedure to unbalance a larger set of similarly-active
sequences that may have survived the crucial first selective step.
Our single minuscule sampling of possible 40-nt sequences
leaves the number of active sequence families an open question;
the first self-ligation reaction contained one-trillionth of the
number of possible 40-nt sequences. If the 40N experiment were
repeated from the beginning, with success, we would expect
different sequence families from the two found here. Incidentally
we note that since the probability that a single-base variant of
a given 40N pool member would be present in the same initial
pool is vanishingly small, 40N-la and 40N-lb were probably
not legitimately co-selected, but one arose from the other by a
misincorporation error in one of the many copying steps
employed throughout selection.
In contrast, thorough sampling of 20-nt sequences is within
reach. With a rough correction for the non-negligible bias at the
randomized positions (Table 1 legend), an estimate of the number
of templates initially transcribed (Materials and Methods), and
the assumption of random sampling of this population at each
step, we estimate that our sample from the first self-ligation
reaction contained a third of all possible 20-nt sequences, at an
average copy number of 4 molecules. However, reaction time
was sufficiently short that only about 3 % of the molecules in
a hypothetical subpopulation of sequences conferring activity
equal to that of the wild-type P5abc would have undergone selfligation (Figure 3). Although any given fast-reacting pool member
was more likely to have been selected than a given slower one,
survival in the first selection cycle was stochastic, because of
the short reaction time and low representation of individual
species. Thus, many sequences in the pool that could functionally
replace P5abc were probably lost, some perhaps with even more
activity than P5abc. Reaction time could have been extended,
but then less active molecules would comprise a greater fraction
of the product, which could necessitate more selection cycles;
we had also anticipated that the most active ligated RNAs would
begin the essentially irreversible reaction of hydrolysis at the
ligation junction, as do the circular forms of the wild-type intron.
A given active sequence would be expected to have active
variant forms (bearing compensatory base-pair changes, variation
at nonessential positions, etc.). The three sequences selected in
the 20N experiment appear to comprise such a family: 20N-b
differs from 20N-a at a single position, and 20N-C differs from
20N-a at six positions forming two compensatory base-pair
changes in the predicted secondary structure (Figure 4).
Representation of die possible 20-nt sequences in our experiment
was incomplete yet sufficiently high to warrant the expectation
that more than one, but not all, members of any hypothetical
active sequence family could have been selected. The 20N-a and
20N-b species could well be legitimate co-selectants from the
first cycle, although they may, like 40N-la and 40N-lb, be
related by a copying error subsequent to the first selection step.
However, the enzymes used were chosen on the basis of low
misincorporation frequencies, and no base-substitutions were
found in sequences outside the randomized region (but see
Materials and Methods), although many should have been
tolerable. This apparent high fidelity makes it unlikely that 20N-a
and 20N-c, are related by copying errors; they were probably
legitimately co-selected in the first cycle. To legitimately select
multiple sequences, all closely related, from such a pool would
imply that few, if any, different 20-nt sequence families could
have been selected. We conclude that the ribozyme is not
indiscriminate in its ability to tolerate different activator domains.
It is surprising that the selectant sequences, though good
activators of splicing, conferred no detectable activity in specific
hydrolysis of the 3' splice site; in this sense, the selectants do
not completely replace P5abc function. Specific hydrolysis is
related to the second step of splicing, in that the phosphodiester
bond of the 3' splice site undergoes nucleophilic attack, by water
in one case and by the 3'-OH of the 5' exon in the other; the
splicing step, requiring precise alignment of the 5' exon terminus,
is considered more demanding than specific hydrolysis. Certain
mutations in the Tetrahymena and other group I introns have been
reported to exert differential effects on the two reactions, but
these generally diminish splicing relative to specific hydrolysis
activity, and none of them map to the P5abc region. Selection
of the transesterification-competent/hydrolysis-deficient phenotype can be rationalized by considering that in our system, the
equivalent of 3' splice site hydrolysis would be cleavage of the
ligation product, counter to the principle of selection. The
selectants may contain structures that exclude water from the 3'
splice site, or perhaps the larger P5abc provides a secondary
contribution to activation function, that improves splicing to some
extent or under particular conditions, and incidentally improves
access of water. Longer replacement sequences and modification
of the protocol may be required to select sequences that fully
replace P5abc function.
Our results have not clarified the functional mechanism of the
wild-type P5abc either in transesterification or hydrolysis
reactions. A simple sequence or structural motif common to
P5abc and the selectants was not apparent, nor do they necessarily
function similarly in transesterification. We need to learn more
about the structures of P5abc and our selectants, and their modes
of interaction with the ribozyme core; an attractive approach is
suggested by the work of Murphy and Cech (8), who have
localized several P5abc-ribozyme tertiary interactions within a
higher-order substructure composed of P5abc together with the
P5, P4 and P6 duplexes. We anticipate interesting comparisons
between the actions of the different RNA sequences and of the
CYT-18 protein in activation of the ribozyme core. Future
experiments will address whether the selectant sequences can
activate the DP5abc mutant precursor in trans, as can P5abc.
ACKNOWLEDGEMENTS
K.P.W. was supported by an American Cancer Society
fellowship. The work was supported by a grant from the National
Institutes of Health. We thank Jacqueline Wyatt for discussions.
REFERENCES
1. Janin,J. and Chothia.C. (1985) Methods in Enzymol. 115, 420-430.
2. Dorit,R.L., Schoenbach,L. and GUbert.W. (1990) Science250, 1377-1382.
3. Joyce.G.F., van der Horst,G. and Inoue,T. (1989) Nucleic Acids Res. 17,
7879-7889.
4. Michel.F. and Westhof,E. (1990) / Mol. Biol. 216, 585-610.
5. Latham.J.A. and Cech.T.R. (1989) Science 245, 276-282.
6. Heuer.T.S., Chandry.P.S., Belfort,M., Celander.D.W. and Cech.T.R. (1991)
Proc. Nail. Acad. Sci. USA 88, 11105-11109.
7. Celander.D.W. and Cech,T.R. (1991) Science 251, 401-407.
8. Murphy,F.M. and Cech,T.R. (1993) Biochemistry 32, 5291-5300.
9. van der Horst,G., Christian,A. and Inoue,T. (1991) Proc. Natl. Acad. Sci.
USA 88, 184-188.
10. Flor,P.J., Flanegan,J.B. and CechJ.R. (1989) EMBO J. 8, 3391-3399.
11. Guo,Q. and Lambowitz,A.M. (1992) Genes and Development 6, 1357-1372.
12. CoUins,R.A. and Lambowitz,A.M. (1985) J. Mol. Biol. 184, 413-428.
13. Guo.Q., Akins,R.A., Garriga,G. and Lambowitz.A.M. (1991) J. Biol. Chan.
266, 1809-1819.
Nucleic Acids Research, 1994, Vol. 22, No. 11 2009
14. Gampel.A. andCech.T.R. (1991) Genes and Development 5, 1870-1880.
15. Lambowitz,A.M. and Perlman.P.S. (1990) Trends Biochem. Sci. 15,
440-444.
16. Mohr.G., Zhang.A., Gianelos,J.A., Belfort.M. and Lambowitz.A.M. (1992)
Cell 69, 483-494.
17. MUliganJ.F. and Uhlenbeck,O.C. (1989) Methods Enzymol.180, 51-69.
18. Williamson.C.L., Desai,N.M. and Burke,J.M. (1989) Nucleic Acids Res.
17, 675-689.
19. Waring.R.B. (1989) Nucleic Acids Res. 17:10281-10293.
20. Szostak,J.W. and Ellington,A.D. (1993) In Gesteland,R.F. and Atkins.J.F.
(ed.), The RNA World. Cold Spring Harbor Laboratory Press, pp. 511 -533.
21. Gold,L., Allen,P., Binkley,J., Brown.D., Schneider.D., Eddy,S.R.,
Tuerk.C, Green,L., MacDougal.S. and Tasset.D. (1993) In Gesteland,R.F.
and Atkins.J.F. (ed.), The RNA World. Cold Spring Harbor Laboratory
Press, pp. 497-509.
22. Salvo,J.L.G., Coetzee.T. and Belfort.M. (1990) J. Mol. Biol. 211, 537-549.
23. Inoue.T. Sullivan,F.X. and CechJ.R. (1986) / Mol. Biol. 189, 143-165.
24. Zuker,M. (1989) Science 244, 48-52.
25. Pace,U. and Szostak.J.W. (1991) FEBS Letters 280, 171-174.
26. Pleij,C.W.A., Reitveld.K. and Bosch,L. (1987) Nucleic Acids Res. 13,
1717-1731.
27. WyatU.R., Puglisi,J.D. and Tinoco,!. ,Jr. (1990)/ Mol. Biol. 214, 455-470.