© 1994 Oxford University Press
Nucleic Acids Research, 1994, Vol. 22, No. 11 2003-2009

Selection of novel forms of a functional domain within the Tetrahymena ribozyme

Kelly P.Williams+, Hiroshi Imahori§, Denise N.Fujimoto and Tan Inoue*

The Salk Institute for Biological Studies, 10010 N. Torrey Pines Road, San Diego, CA 92037, USA

Received March 31, 1994; Revised and Accepted April 26, 1994

ABSTRACT
P5abc is an RNA structure within the self-splicing Tetrahymena group I intron that provides an activation function to the remainder of the ribozyme, either when present in cis or when added in trans. This 69-nucleotide activator domain was replaced with randomized sequence of 20 or 40 nt in length, and individuals among these pools with sequences that could functionally replace P5abc were selected. The basis of selection was a reaction in which two separate halves of the ribozyme became joined; selection was completed by reverse transcription and the polymerase chain reaction, using primers with sequence from either side of the ligation junction. Selectant sequences fell into three families that appear unrelated to P5abc; for example, they lack the A-rich bulge thought to be an important feature of P5abc. Thus, rather than defining some consensus sequence for activator domains, this result reveals a certain tolerance in the ribozyme in its ability to derive activation function from diverse sequence types. In the context of splicing precursor RNA, the new sequences supported self-splicing, but failed to activate a related reaction, hydrolysis of the 3' splice site, implying that this region of the intron can differentially control two related reactions.
INTRODUCTION
Despite fundamental differences between RNA and protein folding, a portion of the Tetrahymena group I intron ribozyme makes a particularly apt comparison with protein domains, the distinct segments of primary sequence with functional and structural independence (1), that in some cases may also have been units of evolution (2). P5abc is a 69-nt element, comprising a sixth of the total molecular mass of the intron (Figures 1 and 4). The ΔP5abc mutant ribozyme, precisely deleted of P5abc, is essentially inactive under normal in vitro reaction conditions, but can be reactivated by adding spermidine and elevating the Mg2+ concentration (3). Thus P5abc is neither part of the catalytic core nor essential for its folding, yet provides an activation function; a role in stabilizing ribozyme structure is suggested. A close P5abc homolog is found in only about a sixth of the known group I introns, but some others bear an extension from the L5 loop in which at least an equivalent to the A-rich bulge feature of P5abc has been discerned (4). From a viewpoint of modular evolution of group I introns, P5abc may have been acquired as one solution to a general problem of activating the core that has been solved differently in other subgroups.

Structural autonomy of P5abc has been revealed using solvent-based free radicals to probe backbone ribose moieties; protection from the reagent indicates internalization due to tertiary structure. In the context of the ribozyme, about half the positions of P5abc are protected, a higher level than any other region outside of the catalytic core (which is almost completely protected) (5-8). But P5abc also adopts tertiary structure as an isolated molecule, evidenced by protection of seven positions in P5c and flanking the A-rich bulge, a subset of the pattern observed in the intact ribozyme (8).
In the ribozyme, these seven P5abc-autonomous protections are adopted at a distinctly lower Mg2+ concentration than the other protections within P5abc and throughout the remainder of the intron; it has been suggested that P5abc folds first to nucleate ribozyme folding (7,8). Perhaps the most vivid evidence of its domain-like relationship to the rest of the ribozyme is the fact that the separately synthesized P5abc molecule, added in trans, reactivates the ΔP5abc deletion mutant (9). Gel-shift experiments show that the two RNAs form a specific complex (9); there is no apparent Watson-Crick base-pairing, although specific tertiary interactions have been suggested (6,8,10).

Activation of the ΔP5abc deletion mutant is not uniquely provided by P5abc. Not only can spermidine and extra Mg2+ also reactivate the mutant, so can a protein molecule, the tyrosyl-tRNA synthetase CYT-18 of Neurospora mitochondria (11). This protein was previously shown to be required for splicing by a group I intron of Neurospora mitochondria, both in vivo and in vitro (12,13), and is just one example of such proteins known to facilitate group I intron splicing (14,15). CYT-18 can also reactivate diverse splicing-defective point mutants of an intron from bacteriophage T4 (16); it acts through binding the catalytic core of these and other group I introns that lack P5abc; however, it does not efficiently bind the P5abc-containing wild-type Tetrahymena intron (11). That the P5abc activator domain can be replaced by a molecule so strikingly different as a protein suggested to us that other, unrelated RNA sequences might also be able to replace P5abc in the ribozyme, perhaps conferring even stronger activation function than does P5abc. To test this line of thought, we developed a system for the in vitro selection of novel activator domains of the Tetrahymena ribozyme. We report the selection and preliminary characterization of three families of functional P5abc-replacement sequences, which do not resolve the mechanism of P5abc function but provide new insights into it.

*To whom correspondence should be addressed at: Department of Chemistry, Kyoto 606, Japan
Present addresses: +Institute of Cell Biology, National Research Council, viale Marx 43, 00137 Rome, Italy and §The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567, Japan

[Figure 1 diagram: upper-half RNAs WT (P5abc, 69 nt; 242 nt), 40N (40 random nt; 213 nt), 20N (20 random nt; 193 nt) and ΔP5abc (173 nt); lower-half RNA (188 nt); self-ligation.]

Figure 1. Self-ligation of ribozyme halves, the basis of selection for P5abc replacements. The group I intron of Tetrahymena LSU rRNA (sequence in capital letters) was divided in the L6 loop (dotted line indicates the connection forming L6 in the intact ribozyme) into two halves (boxed), which are separately nonfunctional but can hybridize to restore ribozyme activity (19). Standard nomenclature of some base-paired regions and loops, and sequence numbering of the intron (4) are indicated. Sequences not part of the intron (printed in lower case) are a 4-nt sequence replacing the 14 nt of the wild-type L1, the 5'-miniexon, and nt outside of the separation at L6. Four forms of the upper-half RNA were used, differing in the sequences extending from and connecting L5, as indicated (see Figure 4 for the sequence of the wild-type P5abc). The upper-half RNA pools used to initiate the 20N and 40N experiments lacked the 3' seven nt of the sequence specified here. In the self-ligation reaction (large arrow), the two halves became joined inverse to their normal order, by nucleophilic attack of the 3'-OH of the terminal guanosine residue at the 5' splice site. Lined sequences correspond to the D6 and D7 primers used in the selective (first) PCR. Also marked are the restriction sites used ultimately for subcloning.

Table 1. Synthetic DNA strands and their uses

Initial randomized upper-half RNA template
  D1   143 nt, upper-half: PT7 + (5' end-126)
  D2   89 nt, upper-half complement: (235-196) + N20 + (126-98)
  D3   109 nt, upper-half complement: (235-196) + N40 + (126-98)
Lower-half RNA template
  D4   54 nt, lower-half: PT7 + (5' end-254)
  D5   30 nt, lower-half complement: (3' end-385)
Reverse transcription
  D6   27 nt, upper-half complement: (3' end-216)
Selective (first) PCR
  D7   28 nt, lower-half: (5' end-254)
  D6   (above)
Upper-half template-regenerating (second) PCR; control templates
  D8   58 nt, upper-half: PT7 + (5' end-41)
  D6   (above)

PT7: the T7 RNA polymerase promoter sequence TAATACGACTCACTATA. In parentheses: end-inclusive DNA versions of sequence indexed in Figure 1 (or its complement). N: randomized position (according to the manufacturer, the reactivity of the phosphoramidites is such that each of these positions actually had 30% T, 26% G, 24% C and 20% A).

MATERIALS AND METHODS

General methods
Synthetic DNA strands, designated D1-D8 (Table 1), were prepared using phosphoramidite chemistry on an Applied Biosystems model 490 synthesizer. PCR reactions were performed with 30 temperature cycles (95°C, 1 min; 55°C, 2 min; 72°C, 3 min) using Pfu DNA polymerase (Stratagene) according to manufacturer's instructions, with 200 pmol of each primer in a 200 µl reaction volume using a Perkin-Elmer-Cetus model 480 thermocycler. T7 RNA polymerase was purified and used as described (17) for transcription with uniform [32P]-GMP labeling. Electrophoresis of DNA and RNA was in 4% polyacrylamide (29:1 acrylamide:bisacrylamide) gels containing 7 M urea, 90 mM Tris-borate, 0.2 mM EDTA; purifications continued with crush-and-soak elution from the gel slice in 0.3 M NaOAc, pH 5.2, and ethanol precipitation. RNA was further purified by gel permeation chromatography on Sephadex G-50 columns.
Starting templates (for RNAs of Figure 1)
For the lower-half RNA template, a segment of pTZIVSU (18) was deleted according to (19), and the resulting plasmid was used as the template for a PCR reaction with D4 and D5; the product was purified in an agarose gel. Templates for the control WT and ΔP5abc forms of the upper-half RNA were prepared by PCR with D6 and D8 using pTZIVSU bearing either wild-type or ΔP5abc intron sequences.

To prepare the templates for the initial 20N and 40N upper-half RNAs of the selection experiments, 20 pmol D1 was hybridized to 20 pmol 5'-[32P]-labeled D2 or D3 and the DNA strands in these pairs were extended using Sequenase (US Biochemicals) according to the manufacturer's instructions; approximately 30% of the D2 or D3 molecules were converted into complete bottom strands in these templates. Templates with full-length bottom strands should have supported transcription initiation equally well; more questionable would be the efficiency of transcript elongation in the downstream region of the template, where top strand synthesis may have been incomplete (not normally considered problematic for T7 RNA polymerase) or damage sustained during synthesis of D2 or D3 may have blocked RNA polymerase progress. We were content to observe that there was little smearing below the gel band of the full-length transcripts, no more than for comparable templates produced by PCR (although direct measurement of active templates was in principle accessible by single-round transcription with excess RNA polymerase). One-half of each template preparation, which we estimate contained 2.4 pmol of active template, was transcribed; upper-half RNA yields were 95 (20N) and 52 (40N) pmol.

[Figure 2 flow diagram: START; hybridize oligo pairs (D1 with D2 (20N) or D3 (40N), paired over 29 bp) and primer-extend across the randomized P5abc region to give template DNA for upper-half RNA; transcribe upper-half RNAs; gel-purify; hybridize with lower-half RNA; self-ligation reaction (5 mM Mg2+, 30°C, 1 or 15 min) giving lower-half RNA ligated to upper-half RNA and releasing the 5' mini-exon; reverse transcribe (D6); first PCR (D6+D7) across the ligation junction; second PCR (D6+D8) to replace lower-half DNA with the T7 promoter and 5' mini-exon; return to transcription, or subclone, sequence and assay activity; END.]

Figure 2. Outline of selection procedure.

Self-ligation reactions
Upper-half RNA (5 pmol) was hybridized with 10 pmol lower-half RNA by heating to 70°C in 5 µl 0.1 mM EDTA for 10 min and incubating at 20°C for 20 min. The sample was preheated to 30°C and the reaction was initiated by dilution to 10 µl containing 50 mM Epps, pH 6.5, 200 mM NH4OAc, 5 mM MgCl2, 0.1 mM EDTA; reactions were incubated 1 or 15 min at 30°C, then stopped by adding 5 µl of 12 mM EDTA. The RNA was precipitated with ethanol, and resuspended in 12 µl of 0.1 mM EDTA. The reaction was monitored by subjecting 2 µl to electrophoresis, followed by autoradiography.

Completion of the selection cycle
Half of the RNA from each self-ligation reaction was incubated 2 min at 37°C with 100 pmol D6 in 20 µl containing 50 mM Tris-Cl, pH 8.3, 75 mM KCl, 3 mM MgCl2, 10 mM DTT, 0.5 mM each dNTP. RNase H- M-MLV reverse transcriptase (BRL) was added and the sample incubated at 65°C for 10 min, then 95°C for 5 min. The bulk of the RNA was degraded by adding NaOH to 200 mM and incubating at 37°C for 10 min. After neutralizing, the cDNA was precipitated with ethanol and 10 mg glycogen, then subjected to PCR using D6 and D7, radiolabeling with [α-32P]-dATP. The product was purified by electrophoresis in a 1.5 mm thick gel (gels used for the first three cycles were composed of 5% polyacrylamide without urea), and subjected to a second PCR using D6 and D8.
Half of the yield was used as the transcription template to initiate the subsequent selection cycle. The amplification product from the 1 min self-ligation reaction was used to continue selection, except that in the first cycle for both 20N and 40N, and in the second cycle for 40N, no PCR products were obtained for the 1 min timepoints; in these cases, the product from the 15 min timepoint was substituted to continue selection.

Control experiments revealed that two types of background could occur in this selection system. A carry-over background was observed occasionally when upper-half RNA alone was taken through the amplification steps; this was attributed to contamination of the gel slice of the hypothetical product of the first PCR with upper-half cDNA, which is readily amplified in the second PCR. Smaller DNAs can sometimes be trapped in the electrophoretic band of a larger DNA; we found the use of long, thick, porous gels effective in eliminating carry-over. When ΔP5abc upper-half RNA was allowed to react with lower-half RNA for 15 min, the product was too scarce to be detected directly by autoradiography, but there was sufficient ligated RNA to yield products in the subsequent reverse transcription and PCR. Such a reactivity background was probably also operating in the early cycles of the 20N and 40N selections, with few appreciably active molecules vastly outnumbered by very weakly active molecules; thus directly detectable self-ligation activity did not appear in the pools until after four cycles of selection.

Analysis of individual selectants
DNA from the second PCR of the sixth selection cycle for the 20N and 40N pools was subcloned into pTZIVSU using the SphI and BglII restriction sites; the sequence between these sites was determined for 52 subclones.
Outside of the randomized region, the sequences exhibited no base-substitutions from the wild-type, although the four consecutive A residues of P2.1 were found in three instances to have been converted into three, and in one instance, into five A residues. Plasmid pTZIVSU and its derivatives bearing the ΔP5abc and each of the six different selectant sequences (Figure 4) were digested with HindIII to prepare templates for precursor RNAs (with wild-type L1 and P2.1, and with both 5' and 3' mini-exons). Precursor RNA (0.16 pmol) was incubated at 37°C for 5, 15, or 45 min in 10 µl of 5 mM MgCl2 with either 50 mM Mops, pH 7, 0.2 mM GTP (splicing) or 50 mM Epps, pH 8.3 (3' splice site hydrolysis), and reactions were stopped by adding 10 µl of 7 M urea, 12 mM EDTA. Reaction extent was evaluated by autoradiography after electrophoresis.

Experimental outline
Several in vitro selection systems on a related plan have been described recently for exploring RNA function (20, 21). Typically, a pool of a large number of sequence-varied RNAs, transcribed from a synthetic template, is treated so as to select individual RNAs that exhibit a desired function; these individuals are converted back into transcription templates by reverse transcription and PCR, so that subsequent cycles of selection can proceed, ending with subcloning for analysis of individuals. The functions that have been used as a basis for selection are mainly of two types: specific binding to a ligand, or ability to act as a substrate or autocatalyst for a reaction (such as cleavage or ligation) that alters the RNA chain. RNA-altering reactions can allow for the isolation of desired product RNAs in gels, but ligation provides a better means of selection than cleavage, since subsequent reverse transcription and PCR steps can be arranged so as to amplify only the product RNAs.

We thought it might be possible to select novel forms of the replaceable activator domain in the Tetrahymena ribozyme with such an approach.
Not only can the intron ligate its flanking exons, it can also be a substrate for self-catalyzed ligation reactions. PCR using oligonucleotides with convergent sequences from opposite sides of the ligation junction is, in principle, an extremely powerful method for selecting self-ligating RNAs from a pool containing many inactive RNAs. An appropriate selection system was developed (Figure 2 and Materials and Methods), bearing two major potential problems in mind. First, since the wild-type ribozyme is highly active, and the selective methods are very powerful, such systems are extremely susceptible to contamination from stray molecules bearing the wild-type sequence. To avoid this problem, the 69-nt P5abc sequence was replaced in starting RNA pools not with randomized sequence of the same length, but with shorter sequences, of either 20 or 40 random nt (each of the four bases present at approximately equal frequency at each position). Such size difference was sufficient to allow separation from potential wild-type contaminants in the two gel-purification steps employed in each selection cycle. A second problem to address was that the ribozyme is highly active under the conditions used for transcription; the most active molecules in a sequence-varied RNA pool could be lost if they engage in undesirable reactions during the transcription reaction or if the transcripts are to be purified in a gel. The problem of premature reaction was solved by splitting the ribozyme into two halves (upper-half and lower-half RNAs; Figure 1), neither active separately, but capable of base-pairing in two regions to restore ribozyme activity. The efficacy of such a bimolecular ribozyme has been demonstrated previously for the Tetrahymena and other group I introns (19, 22). Separate synthesis of the two ribozyme halves had other technical advantages.
A large batch of the lower-half RNA was prepared and used throughout the course of the selection procedures; fresh replacement of lower-half RNA in each cycle reduced the potential for accumulation of unwanted mutations outside the randomized region. P5abc resides in the upper-half RNA, which is short enough so that the templates for transcription of the initial sequence-varied RNAs could be constructed simply by enzymatically extending synthetic DNA strands paired at their 3' ends. For larger initial RNAs, incorporation of specified sequence variation into the transcription template would require DNA ligation, which could engender multiple side-products.

We chose an equivalent to the self-circularizing reaction of the intact intron (23), as a basis for selecting sequences that could functionally replace the P5abc domain of the Tetrahymena ribozyme. Circularization can occur after the 3' exon has been lost from the precursor due to either splicing or hydrolysis at the 3' splice site. The 3'-terminal residue of all group I introns is a guanosine; the exposed 3'-OH of the intron thus resembles the 3'-OH of free guanosine, the nucleophile that normally initiates splicing by specific binding to the intron and attack of the 5' splice site. The preferred site of attack by the terminal 3'-OH of the intron is likewise the 5' splice site, but certain positions in the loop L1 can also serve as circularization sites. To eliminate secondary circularization sites, the L1 loop was replaced in upper-half RNA with a shorter inert sequence (Figure 1). Reaction was allowed to proceed for 1 or 15 min, at which timepoints the reaction with the upper-half RNA bearing the wild-type P5abc was in an early phase (Figure 3).

In our case, the equivalent to the self-circularized intron was of course a linear self-ligation product of the two ribozyme halves. This product RNA was selectively amplified in the DNA phase of the selection cycle.
Following the self-ligation reaction, RNAs were treated by reverse transcription using a primer complementary to the 3' end of the upper-half RNA; thus all upper-half RNAs, not just those with lower-half RNA attached upstream, were subject to conversion into cDNA. The cDNA of ligated RNA was by design substantially longer than that of unligated RNA; more importantly, the subsequent PCR reaction amplified only the desired cDNA, by using primers with sequence from either side of the ligation junction. A second PCR was then necessary to regenerate the template for the upper-half RNA in order to proceed with the subsequent selection cycle.

[Figure 3 gel autoradiograph: lanes labeled by upper-half RNA (ΔP5abc, 20N, 40N, WT), presence of lower-half RNA, and reaction time (1 or 15 min); RNA lengths marked at 419, 390 and 370 nt.]

Figure 3. Self-ligation activity attained by in vitro selection. The indicated combinations of 5 pmol upper-half and 10 pmol lower-half RNAs were hybridized and allowed to react (reaction time in minutes) as described in Materials and Methods; 20N and 40N upper-half RNAs were from the pools after six selection cycles. Densitometric scans of multiple exposures of the gel indicated that approximately 150 fmol of the ligation product was produced in the 15 min WT control. Reaction extents, relative to the 15 min WT control (100%), were more accurately determined as 43%, 64%, and 12%, for the 15 min 20N, 15 min 40N, and 1 min WT samples, respectively. Ligation products were not detected for the other samples. Substrate and product RNAs are marked by their expected lengths in nt (see also Figure 1); lines mark the positions of end-labeled DNA markers of 310, 281, 271, 234 and 194 nt in a lane not shown. Lower-half RNA (188 nt) was radiolabeled at a lower specific activity than the upper-half RNAs.
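The logic of the selective first PCR can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: RNAs are represented as ordered lists of segment names rather than real sequences, and the segment names and the function `first_pcr_amplifies` are hypothetical, introduced here for illustration. It ignores the carry-over and reactivity backgrounds described in Materials and Methods.

```python
# Toy model: each RNA is an ordered list of segment names, 5' to 3'.
# All segment names are hypothetical placeholders, not real sequences.

# Upper-half RNA: 5' mini-exon, intron sequence with the randomized region,
# and the 3'-terminal site to which primer D6 is complementary.
UPPER = ["mini_exon", "intron_5p", "random_region", "D6_site"]

# Lower-half RNA: its 5' end carries the sequence matched by primer D7.
LOWER = ["D7_site", "lower_3p"]

def ligate(lower, upper):
    """Join lower-half RNA upstream of upper-half RNA at the 5' splice site.

    The 5' mini-exon of the upper half is released in the reaction.
    """
    return lower + upper[1:]

def first_pcr_amplifies(rna):
    """Return True if the D6 + D7 primer pair can amplify cDNA of this RNA.

    Reverse transcription is primed at the D6 site, so an RNA lacking it
    yields no cDNA. Amplification also needs the D7 site upstream (5') of
    the D6 site on the same molecule, which only ligation provides.
    """
    if "D6_site" not in rna or "D7_site" not in rna:
        return False
    return rna.index("D7_site") < rna.index("D6_site")

ligated = ligate(LOWER, UPPER)

for name, rna in [("upper only", UPPER), ("lower only", LOWER), ("ligated", ligated)]:
    print(name, first_pcr_amplifies(rna))
```

In this idealized picture only the ligated molecule carries both primer sites in the required order, which is why unligated upper-half cDNA, although produced by reverse transcription, is not expected to amplify in the first PCR.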
RESULTS
For both the 20N and 40N experiments, self-ligation activity was not directly detectable (by autoradiography) for the starting RNA pools (data not shown), but became observable after four selection cycles; activity further improved after both the fifth and sixth cycles, nearly attaining the level of activity produced by upper-half RNA containing wild-type P5abc (Figure 3), but there was no further improvement after a seventh cycle. Selectant DNA was subcloned and individual sequences were determined (Figure 4). Each pool was dominated by a particular sequence (20N-a or 40N-1a); the 20N pool additionally contained two variant sequences, one differing from the dominant sequence by a single base and the other differing at six positions. The 40N pool also contained a single-base variant of the dominant sequence; however, an unrelated sequence (40N-2) was also found. At the primary structural level, these three selectant families bear no apparent relationship to P5abc (or each other).

[Figure 4: predicted secondary structures of wild-type P5abc and the selectant sequences 20N-a, 20N-b, 20N-c, 40N-1a, 40N-1b and 40N-2.]

Figure 4. P5abc and replacement sequences that were selected. Six different selectant sequences were found, falling into three apparent families; the number of occurrences (n) among the analyzed subclones are given. Sequences are presented in the secondary structure predicted by a standard RNA-folding algorithm (24), with positions that were randomized in upper case and nonrandomized neighboring sequences in lower case. Alternative potential base-pairing partner sequences are overlined. 20N-b and 40N-1b differ from 20N-a and 40N-1a by single G-to-A transitions (large arrows). Small arrows mark the six bases (bold type) that differ between 20N-a and 20N-c. Note that several intra-family differences are compensatory base-pair changes in the given structures.

The Zuker free-energy minimization algorithm (24) was used to predict secondary structures for the selectant sequences (Figure 4); such predicted structures must be viewed with some skepticism, but are a useful starting point for discussion. Some of the base changes within the selectant families tend to support the assignment of these structures in that they conserve predicted base-pairing: the single G-to-A difference between 40N-1a and 40N-1b maintains a predicted pairing with a U; four of the six base differences between 20N-a and 20N-c are conservative changes of two base-pairs in the predicted structure. The secondary structures show some resemblances to that predicted for P5abc, which may be only superficial. The A-rich bulge feature, with its 3' end proximal to P5, has been discerned in many group I introns in sequence extending from the L5 loop (4) and some functional importance has been attached to it (8,10,25); a bulge in the predicted 40N-2 structure is A-rich, but is oriented with its 3' end distal to P5. The structure given for 40N-1a and 40N-1b, like that of P5abc, has three closely joined base-paired stems.

Alternative potential base-paired stems can be found for each selectant (Figure 4, overlined sequences), particularly convincing for the 40N-1 family. Each alternative stem exhibits, with respect to a stem of the computer-predicted structure, the general interdigitated topology of a pseudoknot (26), i.e., the partner segments of each stem surround one partner of the other. However, these hypothetical pseudoknots do not appear to be favored, because the loops would be very short and one or two bases would intervene between the stems (27); therefore the potential stems are probably instead mutually exclusive. A potential alternative stem was also observed in P5abc, that is definitely mutually exclusive with both the P5b and P5c stems (Figure 4). The overlapping organization of these potential alternative base-paired structures might facilitate the transition from one hairpin to the other, perhaps upon interaction with the remainder of the intron in such a way as to dynamically activate the catalytic core.

Precursor RNA (the conventional unimolecular intron, with wild-type L1 and flanked by mini-exons) was prepared for each of the selectants and assayed for splicing activity (Table 2). All showed reasonable activity, although none was as active as the wild-type precursor. From the 40N selectant pool, the rare 40N-2 was more active than the predominant 40N-1a. The same precursors were also assayed for hydrolysis of the 3' splice site, another well-known activity of the intron that is related to splicing. Surprisingly, none of the selectants performed this reaction at detectable levels.

Table 2. Activity of selectants in precursor form
Columns: Sequence at P5abc region; Splicing; 3' splice site hydrolysis.
Rows: wild-type; ΔP5abc; 20N-a; 20N-b; 20N-c; 40N-1a; 40N-1b; 40N-2.
Individual selectant sequences, in (unimolecular) splicing precursor RNA form, were tested for activity in the splicing and 3' splice site hydrolysis reactions, as described in Materials and Methods. Symbols: +++, rate for the wild-type; ++, 30-60% wild-type rate; +, 10-20% wild-type rate; -, no activity detected.

DISCUSSION
From RNA pools in which the P5abc region of the Tetrahymena ribozyme was replaced with shorter randomized sequences, we selected three families of sequences that provide to the catalytic core an activation function comparable to that of P5abc, both in the self-ligation reaction of the split ribozyme and, in the context of the conventional precursor RNA, self-splicing. We were unable to discern clear similarities among the new sequence families and P5abc; for example, the A-rich bulge is not properly represented among the selectants.
Thus, rather than defining critical features of sequence or structure in the P5abc activator domain, this result instead emphasizes a tolerance in the catalytic core in its ability to derive activation function from apparently quite different RNA sequence types. This tolerance is in general accord with the reactivation of the ΔP5abc deletion mutant by the CYT-18 protein or by elevated Mg2+ concentration in the presence of spermidine, and with the absence of a P5abc-like structure in many group I introns.

Is the ribozyme merely tolerant in its ability to use apparently dissimilar RNA sequences, or is it indiscriminate? The inability of the initial upper-half RNA pools to support directly detectable self-ligation shows that most 20- or 40-nt random sequences do not provide activation function, although it is possible that most sequences interfere with proper folding of the rest of the ribozyme. What then is the fraction of the possible sequences that can replace P5abc function? At face value, the finding that our selectant pools were dominated by few different sequences would suggest that this fraction is indeed very small, that there were very few active sequences in the original pools. However, we cannot rule out unintended selective principles acting later in the procedure to unbalance a larger set of similarly-active sequences that may have survived the crucial first selective step.

Our single minuscule sampling of possible 40-nt sequences leaves the number of active sequence families an open question; the first self-ligation reaction contained one-trillionth of the number of possible 40-nt sequences. If the 40N experiment were repeated from the beginning, with success, we would expect different sequence families from the two found here.
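The order of magnitude of this sampling fraction can be checked with a short calculation. The sketch below is an idealization, not the authors' own estimate: it assumes four equally likely bases at each randomized position and treats every molecule in the 5 pmol of upper-half RNA used per self-ligation reaction (Materials and Methods) as a distinct sequence, so it gives an upper bound. The function names are introduced here for illustration.

```python
AVOGADRO = 6.022e23  # molecules per mole

def sequence_space(n_random, alphabet=4):
    """Number of possible sequences with n_random randomized positions."""
    return alphabet ** n_random

def molecules(pmol):
    """Convert an amount in picomoles to a number of molecules."""
    return pmol * 1e-12 * AVOGADRO

# 5 pmol of upper-half RNA entered each self-ligation reaction.
pool_40n = molecules(5)
space_40n = sequence_space(40)

# Upper bound on the fraction of 40-nt sequence space present in the reaction,
# assuming (unrealistically) that no sequence occurs more than once.
fraction_40n = pool_40n / space_40n

print(f"possible 40-nt sequences: {space_40n:.2e}")
print(f"molecules in reaction:    {pool_40n:.2e}")
print(f"fraction sampled (max):   {fraction_40n:.1e}")
```

Under these assumptions the fraction comes out at a few parts in 10^12, in line with the "one-trillionth" figure quoted in the text.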
Incidentally, we note that since the probability that a single-base variant of a given 40N pool member would be present in the same initial pool is vanishingly small, 40N-1a and 40N-1b were probably not legitimately co-selected, but one arose from the other by a misincorporation error in one of the many copying steps employed throughout selection.

In contrast, thorough sampling of 20-nt sequences is within reach. With a rough correction for the non-negligible bias at the randomized positions (Table 1 legend), an estimate of the number of templates initially transcribed (Materials and Methods), and the assumption of random sampling of this population at each step, we estimate that our sample from the first self-ligation reaction contained a third of all possible 20-nt sequences, at an average copy number of 4 molecules. However, reaction time was sufficiently short that only about 3% of the molecules in a hypothetical subpopulation of sequences conferring activity equal to that of the wild-type P5abc would have undergone self-ligation (Figure 3). Although any given fast-reacting pool member was more likely to have been selected than a given slower one, survival in the first selection cycle was stochastic, because of the short reaction time and low representation of individual species. Thus, many sequences in the pool that could functionally replace P5abc were probably lost, some perhaps with even more activity than P5abc. Reaction time could have been extended, but then less active molecules would comprise a greater fraction of the product, which could necessitate more selection cycles; we had also anticipated that the most active ligated RNAs would begin the essentially irreversible reaction of hydrolysis at the ligation junction, as do the circular forms of the wild-type intron.

A given active sequence would be expected to have active variant forms (bearing compensatory base-pair changes, variation at nonessential positions, etc.).
The three sequences selected in the 20N experiment appear to comprise such a family: 20N-b differs from 20N-a at a single position, and 20N-c differs from 20N-a at six positions forming two compensatory base-pair changes in the predicted secondary structure (Figure 4). Representation of the possible 20-nt sequences in our experiment was incomplete yet sufficiently high to warrant the expectation that more than one, but not all, members of any hypothetical active sequence family could have been selected. The 20N-a and 20N-b species could well be legitimate co-selectants from the first cycle, although they may, like 40N-1a and 40N-1b, be related by a copying error subsequent to the first selection step. However, the enzymes used were chosen on the basis of low misincorporation frequencies, and no base-substitutions were found in sequences outside the randomized region (but see Materials and Methods), although many should have been tolerable. This apparent high fidelity makes it unlikely that 20N-a and 20N-c are related by copying errors; they were probably legitimately co-selected in the first cycle. To legitimately select multiple sequences, all closely related, from such a pool would imply that few, if any, different 20-nt sequence families could have been selected. We conclude that the ribozyme is not indiscriminate in its ability to tolerate different activator domains.

It is surprising that the selectant sequences, though good activators of splicing, conferred no detectable activity in specific hydrolysis of the 3' splice site; in this sense, the selectants do not completely replace P5abc function. Specific hydrolysis is related to the second step of splicing, in that the phosphodiester bond of the 3' splice site undergoes nucleophilic attack, by water in one case and by the 3'-OH of the 5' exon in the other; the splicing step, requiring precise alignment of the 5' exon terminus, is considered more demanding than specific hydrolysis.
Certain mutations in the Tetrahymena and other group I introns have been reported to exert differential effects on the two reactions, but these generally diminish splicing relative to specific hydrolysis activity, and none of them map to the P5abc region. Selection of the transesterification-competent/hydrolysis-deficient phenotype can be rationalized by considering that in our system, the equivalent of 3' splice site hydrolysis would be cleavage of the ligation product, counter to the principle of selection. The selectants may contain structures that exclude water from the 3' splice site, or perhaps the larger P5abc provides a secondary contribution to activation function that improves splicing to some extent or under particular conditions, and incidentally improves access of water. Longer replacement sequences and modification of the protocol may be required to select sequences that fully replace P5abc function. Our results have not clarified the functional mechanism of the wild-type P5abc in either transesterification or hydrolysis reactions. A simple sequence or structural motif common to P5abc and the selectants was not apparent, nor do they necessarily function similarly in transesterification. We need to learn more about the structures of P5abc and our selectants, and their modes of interaction with the ribozyme core; an attractive approach is suggested by the work of Murphy and Cech (8), who have localized several P5abc-ribozyme tertiary interactions within a higher-order substructure composed of P5abc together with the P5, P4 and P6 duplexes. We anticipate interesting comparisons between the actions of the different RNA sequences and of the CYT-18 protein in activation of the ribozyme core. Future experiments will address whether the selectant sequences can activate the ΔP5abc mutant precursor in trans, as can P5abc.
ACKNOWLEDGEMENTS K.P.W. was supported by an American Cancer Society fellowship.
The work was supported by a grant from the National Institutes of Health. We thank Jacqueline Wyatt for discussions.
REFERENCES
1. Janin,J. and Chothia,C. (1985) Methods in Enzymol. 115, 420-430.
2. Dorit,R.L., Schoenbach,L. and Gilbert,W. (1990) Science 250, 1377-1382.
3. Joyce,G.F., van der Horst,G. and Inoue,T. (1989) Nucleic Acids Res. 17, 7879-7889.
4. Michel,F. and Westhof,E. (1990) J. Mol. Biol. 216, 585-610.
5. Latham,J.A. and Cech,T.R. (1989) Science 245, 276-282.
6. Heuer,T.S., Chandry,P.S., Belfort,M., Celander,D.W. and Cech,T.R. (1991) Proc. Natl. Acad. Sci. USA 88, 11105-11109.
7. Celander,D.W. and Cech,T.R. (1991) Science 251, 401-407.
8. Murphy,F.M. and Cech,T.R. (1993) Biochemistry 32, 5291-5300.
9. van der Horst,G., Christian,A. and Inoue,T. (1991) Proc. Natl. Acad. Sci. USA 88, 184-188.
10. Flor,P.J., Flanegan,J.B. and Cech,T.R. (1989) EMBO J. 8, 3391-3399.
11. Guo,Q. and Lambowitz,A.M. (1992) Genes and Development 6, 1357-1372.
12. Collins,R.A. and Lambowitz,A.M. (1985) J. Mol. Biol. 184, 413-428.
13. Guo,Q., Akins,R.A., Garriga,G. and Lambowitz,A.M. (1991) J. Biol. Chem. 266, 1809-1819.
14. Gampel,A. and Cech,T.R. (1991) Genes and Development 5, 1870-1880.
15. Lambowitz,A.M. and Perlman,P.S. (1990) Trends Biochem. Sci. 15, 440-444.
16. Mohr,G., Zhang,A., Gianelos,J.A., Belfort,M. and Lambowitz,A.M. (1992) Cell 69, 483-494.
17. Milligan,J.F. and Uhlenbeck,O.C. (1989) Methods Enzymol. 180, 51-69.
18. Williamson,C.L., Desai,N.M. and Burke,J.M. (1989) Nucleic Acids Res. 17, 675-689.
19. Waring,R.B. (1989) Nucleic Acids Res. 17, 10281-10293.
20. Szostak,J.W. and Ellington,A.D. (1993) In Gesteland,R.F. and Atkins,J.F. (ed.), The RNA World. Cold Spring Harbor Laboratory Press, pp. 511-533.
21. Gold,L., Allen,P., Binkley,J., Brown,D., Schneider,D., Eddy,S.R., Tuerk,C., Green,L., MacDougal,S. and Tasset,D. (1993) In Gesteland,R.F. and Atkins,J.F. (ed.), The RNA World. Cold Spring Harbor Laboratory Press, pp. 497-509.
22. Salvo,J.L.G., Coetzee,T. and Belfort,M. (1990) J. Mol. Biol. 211, 537-549.
23. Inoue,T., Sullivan,F.X. and Cech,T.R. (1986) J. Mol. Biol. 189, 143-165.
24. Zuker,M. (1989) Science 244, 48-52.
25. Pace,U. and Szostak,J.W. (1991) FEBS Letters 280, 171-174.
26. Pleij,C.W.A., Reitveld,K. and Bosch,L. (1987) Nucleic Acids Res. 13, 1717-1731.
27. Wyatt,J.R., Puglisi,J.D. and Tinoco,I.,Jr. (1990) J. Mol. Biol. 214, 455-470.