1998 Oxford University Press 1834–1840 Nucleic Acids Research, 1998, Vol. 26, No. 7 A common 40 amino acid motif in eukaryotic RNases H1 and caulimovirus ORF VI proteins binds to duplex RNAs Susana M. Cerritelli, Oleg Y. Fedoroff1,+, Brian R. Reid1,2 and Robert J. Crouch* Laboratory of Molecular Genetics, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA and 1Department of Chemistry and 2Department of Biochemistry, University of Washington, Seattle, WA 98195, USA Received October 21, 1997; Revised and Accepted February 4, 1998 ABSTRACT Eukaryotic RNases H from Saccharomyces cerevisiae, Schizosaccharomyces pombe and Crithidia fasciculata, unlike the related Escherichia coli RNase HI, contain a non-RNase H domain with a common motif. Previously we showed that S.cerevisiae RNase H1 binds to duplex RNAs (either RNA–DNA hybrids or double-stranded RNA) through a region related to the double-stranded RNA binding motif. A very similar amino acid sequence is present in caulimovirus ORF VI proteins. The hallmark of the RNase H/caulimovirus nucleic acid binding motif is a stretch of 40 amino acids with 11 highly conserved residues, seven of which are aromatic. Point mutations, insertions and deletions indicated that integrity of the motif is important for binding. However, additional amino acids are required because a minimal peptide containing the motif was disordered in solution and failed to bind to duplex RNAs, whereas a longer protein bound well. Schizosaccharomyces pombe RNase H1 also bound to duplex RNAs, as did proteins in which the S.cerevisiae RNase H1 binding motif was replaced by either the C.fasciculata or by the cauliflower mosaic virus ORF VI sequence. The similarity between the RNase H and the caulimovirus domain suggest a common interaction with duplex RNAs of these two different groups of proteins. INTRODUCTION Ribonucleases H degrade the RNA strand of RNA–DNA hybrids. RNase H has been found in prokaryotes and eukaryotes and also in retroviruses (1). In Escherichia coli two different RNases H have been described: RNase HI, a very active and well studied RNase H (2–4) whose crystal structure has been determined (5–7), and RNase HII, a low activity enzyme which is poorly characterized (8). Prokaryotic RNases H of the E.coli RNase HI family are small proteins containing only the RNase H domain (9–11). Retroviral RNases H have sequence and structural conservation with E.coli RNase HI (12), but they differ in that the RNase H domain is part of a larger polypeptide that also contains a polymerase domain (13). The eukaryotic RNases H that resemble E.coli RNase HI are, like the retroviral enzymes, complex proteins and they have an additional domain that does not appear related to polymerase and whose function is unknown. Eukaryotic RNases H1 from Crithidia fasciculata (14), from Saccharomyces cerevisiae (R.Crouch et al., manuscript in preparation), from Schizosaccharomyces pombe (R.Crouch et al., manuscript in preparation) and from a chicken cDNA (accession no. D26340) have been sequenced. Although it has not yet been demonstrated that the chicken cDNA encodes an active RNase H, the high degree of similarity with the other sequences indicates that this protein is likely to be the chicken homolog of RNase H1 (15). All these proteins present a similar domain organization, with the RNase H domain located in the C-terminal part of the protein (Fig. 1). The N-terminal region, of variable length, is not required for RNase H activity (R.Crouch et al., manuscript in preparation) and is characterized by an abundance of basic amino acids and a conserved sequence of ∼40 residues (Fig. 1). Saccharomyces cerevisiae RNase H1 is, so far, the only member of this family that contains two copies of the motif (Fig. 1). The conserved sequence is related to the conserved domain of caulimovirus ORF VI product (Fig. 2), which facilitates translation of polycistronic virus RNA in plant cells (15). It was first observed in S.cerevisiae RNase H1 (16) that the last 20 amino acids of the conserved domain are related to the last 20 amino acids of a motif present in proteins that bind to double-stranded (ds)RNA (17–19). The dsRNA binding domain (dsRBD) is a sequence of ∼70 amino acids, of which the last 20 are the most conserved (18,20). Saccharomyces cerevisiae RNase H1 binds in vitro to dsRNA and to RNA–DNA hybrids with similar strength and to other nucleic acids with lower affinity. It is thought that the sequence similar to the dsRBD is required for binding, because the ability of yeast RNase H1 to bind to dsRNA is much reduced by deletion of this sequence. Similarly, mutation of two K residues that are highly conserved in the RNases H/caulimovirus ORF VI motif (RNH/CaMV) and in the dsRBD reduces binding in both domains (16,17,21–23). N-176, a *To whom correspondence should be addressed. Tel: +1 301 496 4082; Fax: +1 301 496 0243; Email: [email protected] +Present address: Drug Dynamics Institute, College of Pharmacy, University of Texas Austin, Austin, TX 78712, USA 1835 Nucleic Acids Acids Research, Research,1994, 1998,Vol. Vol.22, 26,No. No.17 Nucleic Figure 1. Structural organization of eukaryotic RNases H1. The RNase H domains are represented by dark boxes. The conserved motifs in the N-terminal region are related by dotted lines. R1 and R2 indicate the two conserved sequences present in S.cerevisiae RNase H1. truncated version of S.cerevisiae RNase H1 which lacks the RNase H domain, binds to nucleic acids, as does the full-length protein, indicating that the RNase H domain is not required for binding (16). 1835 The first 20 residues of the 40 amino acid motif are highly homologous in eukaryotic RNases H and the caulimovirus ORF VI proteins (Fig. 2). This region has been reported to be essential for the translation activation capacity of the cauliflower mosaic virus ORF VI protein (24). The mechanism of transactivation is unknown, but efficient translation of a polycistronic message into multiple proteins might involve interaction with dsRNA regions of the ribosomes or of mRNA. In this study we investigated the relative apparent affinity for nucleic acid of the conserved motif from cauliflower mosaic virus ORF VI protein (CaMV) and also from Crithidia fasciculata RNase H1 (ChRH) by making chimeric constructs in which the first motif of the N-176 protein was replaced by the 40 conserved amino acids from ChRH or CaMV. Northwestern assays showed that both constructs could bind dsRNA and RNA–DNA hybrids. We also made deletions, insertions and point mutations in the N-176 protein and determined that the first 20 amino acids of the conserved sequence are necessary for duplex RNA binding and that the first of the two motifs of S.cerevisiae RNase H1 is sufficient for the nucleic acid binding activity of the protein. Figure 2. Alignment of homologous sequences from eukaryotic RNases H and caulimovirus ORF VI proteins. The numbering on the top refers to the first amino acid of the conserved sequence. Shaded boxes highlight identical amino acids. A capital letter indicates an amino acid that occurs at the same position in >80%; a lower case letter represents a residue present in >50%; an x denotes a position that is not conserved. Arrows point to amino acids that are identical in all sequences. (A) Sequences of the conserved motif present at the N-terminal region of eukaryotic RNases H. The sequences shown are: S.pombe RNase H1 8–50; C.fasciculata RNase H1 156–198; chicken 22–64; S.cerevisiae R1 7–49; S.cerevisiae R2 108–153. (B) Sequences of the conserved motif of caulimovirus ORF VI proteins. The sequences shown are: CaMV, cauliflower mosaic virus 140–182; CERV, carnation etched ring virus 134–176; FMV, figwort mosaic virus 156–196; SVB, strawberry vein banding virus 146–185; PCISV, peanut chlorotic streak virus 80–122; for soybean chlorotic mottle virus the sequence reported (38) is SoCMV 107–146, which presents homology to the other sequences only in the C-terminal part of the motif. Homology in the N-terminal region appeared in the –1 reading frame that is shown under SoCMV# and would correspond to amino acids 105–143. This discrepancy has been previously reported (24). To derive the consensus sequence we used the first 20 amino acids of SoCMV# and the last 19 of SoCMV. (C) Comparison of the consensus sequence of the motif of eukaryotic RNases H1 and caulimovirus ORF VI proteins. The stars denote amino acids identical in both sequences. 1836 Nucleic Acids Research, 1998, Vol. 26, No. 7 Figure 3. Schematic representation of wild-type proteins and derivatives. (A) The structures of wild-type S.cerevisiae RNase H1, S.pombe RNase H1 and C.fasciculata RNase H1 (ChRH) are represented as in Figure 1. In cauliflower mosaic virus ORF VI protein (CaMV) only the conserved motif is featured. The protein derivatives constructed in this work are displayed in the same schematic representation. (B). The sequence of R1 of N-176 is presented and the substitutions in 4As(N-176), 3-R2(N-176) and 8-R2(N-176) are indicated. On the bottom are shown the sequences of cauliflower mosaic virus ORF VI product [CaMV(N-176)] and C.fasciculata RNase H1 [ChRH(N-176)] and the mutations present in CaMV-mut and ChRH-mut. The consensus sequence of the eukaryotic RNases H1 N-terminal motif is also shown. MATERIALS AND METHODS Construction of S.cerevisiae derivatives Construction of the N-176 derivative was as previously described (16). It contains the N-terminal part of the RNH1 gene cloned in pET-20b (Novagen). All gene constructs described in this work were derivatives of this plasmid. The N-64 construct was made by inserting a His6 tag and a termination site between the AvrII and SphI restriction endonuclease sites of the RNH1 gene. The ∆80–162(N-176) protein was obtained by digesting with SphI and HindIII and by filling with an oligonucleotide that has the sequence encoding amino acids 80–162 deleted. The ∆7–23(N-176) derivative was constructed by digesting with XbaI (in the T7 promoter region) and BstXI and ligating with an oligonucleotide that replaced the promoter region and the first six amino acids of the protein and deleted residues 7–23. The 4As(N-176) construct was made by PCR using oligonucleotides that extend from the XbaI site to the AvrII site and substituted F7, Y8, G13 and Y19 by A residues in the wild-type RNH1 gene. In most respects the 3-R2(N-176) and 8-R2(N-176) derivatives were made in the same way by PCR amplification and digestion with XbaI and AvrII, except that the sequence for PNI from R2 was inserted between R14 and E15 in R1 [3-R2(N-176)] or the sequence for GRETG (amino acids 13–17) in R1 was substituted with the sequence for NVPNIESK (amino acids 114–121) of R2 [8-R2(N-176)]. All constructs were confirmed by sequencing. Chimera construction The constructs ChRH(N-176), CaMV(N-176) and ChRH(∆80–162) were made by PCR amplification using oligonucleotides that substituted the sequence of R1 (amino acids 1–50) from the NdeI to the AvrII site of the N-176 or the ∆80–162(N-176) constructs with the 50 amino acid sequence from C.fasciculata RNase H1 or from cauliflower mosaic virus ORF VI protein (Fig. 3). CaMV-mut and ChRH-mut were mutants obtained by PCR errors. All constructs were confirmed by sequencing. Protein expression and purification Escherichia coli strain BL21(DE3) harboring the S.cerevisiae RNH1 gene in the pET-3a vector or N-176 and its derivatives in pET-20b (both Novagen vectors) under the control of bacteriophage T7 RNA polymerase (25) was grown and induced as previously described (16). For crude cell extract preparation a 3 ml culture was grown and induced, then 1 ml was pelleted and resuspended in 100 µl SDS cracking buffer (26) and boiled for 5 min prior to SDS–PAGE. For purification using His-Bind resin a 50 ml culture was grown and induced for 4 h. After spinning down, the pellet was resuspended in 1 ml binding buffer (5 mM imidazole, 500 mM NaCl and 20 mM Tris–HCl, pH 7.9). The sample was sonicated for 10 s 10 times, cooling each time. The cell lysate was treated with 10 µg/ml DNase I for 20 min at room temperature in the presence of 10 mM MgCl2 and then centrifuged at 20 000 g for 15 min. After resuspending the pellet in 500 µl binding buffer and running 1 µl soluble and insoluble fractions in SDS–PAGE it was observed that, in most cases, the protein of interest was mostly in the resuspended pellet. The pellet was centrifuged and resuspended in 500 µl binding buffer containing 6 M urea. The protein solution was incubated on ice for 1 h followed by centrifugation at 39 000 g for 15 min to remove any insoluble material. The supernatant was mixed with 200 µl His-Bind resin that had been previously charged with 50 mM NiSO4 and equilibrated with binding buffer containing 6 M urea. After incubation for 5 min on ice the mixture was spun for 1 min at 500 g and the supernatant removed. The resin was washed first with 750 µl binding buffer containing 6 M urea and then twice with 750 µl 20 mM imidazole in binding buffer with 6 M urea. 1837 Nucleic Acids Acids Research, Research,1994, 1998,Vol. Vol.22, 26,No. No.17 Nucleic The protein was eluted with 200 µl 500 mM imidazole in binding buffer with 6 M urea. The protein was used for Northwestern analysis directly or after dialysis into 20 mM Tris–HCl, pH 7.9, 100 mM NaCl, 3 mM β-mercaptoethanol, 0.1 mM EDTA, which gradually removes 6 M urea to allow refolding of the protein. Schizosaccharomyces pombe RNase H1 was expressed and purified using a procedure to be published elsewhere. Preparation of nucleic acid RNA–DNA hybrids and dsRNA substrates for the binding assays were prepared by in vitro transcription, using the plasmid pSPORT 1 (Life Technologies), which contains two promoters in opposite orientations. Plasmid DNA was linearized with NarI for transcription using the SP6 promoter or with SphI for transcription from the T7 promoter. Linear plasmids were incubated with the corresponding polymerase according to a procedure previously described (27). Only the RNA made from the SP6 promoter was labeled with [α-32P]CTP. To form dsRNA equal amounts of the SP6 and T7 run-off transcripts were annealed by heating to 95C for 5 min and slowly cooled. To form RNA–DNA hybrids the labeled SP6 transcript was used to make cDNA from the T7 primer that annealed to an internal sequence ∼100 bp from the end of the RNA. The cDNA was made using SuperScript II RNase H– reverse transcriptase (Life Technologies) following the manufacturer’s recommendations. RNA–RNA and RNA–DNA duplexes were treated with RNase A and T1, to remove unpaired ends and make both substrates of equal length, followed by phenol extraction and ethanol precipitation. The nucleic acids were either purified using NucTrap push columns (Stratagene) or fractionated by electrophoresis in a non-denaturing 12% polyacrylamide, 0.5× Tris–borate, EDTA (TBE) gel. RNA–DNA hybrids and dsRNA were extracted from the gel and purified using the RNaid Kit protocol (Bio 101), then stored in diethylpyrocarbonate (DEPC)-treated distilled water at –20C. Nucleic acid binding assays Northwestern assays using dsRNA and RNA–DNA hybrids purified on either NucTrap push columns (Stratagene) or from polyacrylamide gels were performed as previously described (16) in the absence of MgCl2. For quantification of binding the intensity of phosphor emission from nucleic acid bound to the protein was quantified using a PhosphorImager and then related to the amount of protein determined by staining a parallel gel with SYPRO orange (Molecular probes), a fluorescent dye that can be quantified using the Storm system (Molecular Dynamics). RESULTS Northwestern assay Data presented in this work were obtained using Northwestern assays in which proteins were transferred to PVDF membranes following SDS–PAGE. Renaturation of the protein involved removal of SDS and refolding. The Northwestern assays not only measure binding of a particular protein to duplex RNAs but also indicate the extent to which the protein has folded into an active form. Thus the data are useful mainly for qualitative comparisons, although the different level of binding to RNA–RNA versus RNA–DNA duplexes are meaningful because the same proteins are being tested in each case. The fraction of properly folded 1837 protein available for binding will be the same for both substrates. Other methods, such as band shift assays, can give data that are quantitative. However, such methods rely on proteins that are free of nucleic acid. At least some of the proteins we have purified are contaminated with nucleic acids which when removed yield a protein that does not remain in solution, either because of aggregation or because they bind to plastic or siliconized tubes, membranes or other substances with which they come into contact. Thus when incubating proteins with nucleic acids in solution, a large but unknown fraction of the protein may not be detected in binding because it is already bound to nucleic acid or has been removed from solution. Neither of these problems is present in Northwestern assays. Caulimovirus ORF VI proteins have a conserved motif that binds to nucleic acids The conserved motif present in the N-terminal region of eukaryotic RNases H1 is related to a sequence in caulimovirus ORF VI proteins (Fig. 2) necessary for translation of a viral polycistronic mRNA (24). Comparison of the consensus sequence of these two motifs revealed a striking similarity, with certain amino acids being absolutely conserved between these two very different groups of proteins (Fig. 2C). To investigate whether the conserved region from caulimovirus ORF VI proteins binds to nucleic acids we constructed a gene to express a protein in which amino acids 1–50 of S.cerevisiae R1 (in the N-176 protein) were replaced by amino acids 134–183 of cauliflower mosaic virus ORF VI protein, designated CaMV(N-176) (Fig. 3). CaMV(N-176) interacted strongly with dsRNA and RNA–DNA hybrids (Fig. 4, lanes 6 and 11). A mutant protein (CaMV-mut, Fig. 3B), in which amino acids 1–8 are deleted and in which L is replaced by V at residue 9, did not bind to nucleic acids (Fig. 4, lane 12), providing additional evidence that the CaMV motif was responsible for the binding observed in CaMV(N-176) protein. These results suggest that the conserved region of caulimovirus ORF VI proteins is involved in binding to dsRNA and RNA–DNA hybrids, a property that might be related to the transactivating activity of these proteins. The first copy (R1) of the S.cerevisiae RNase H1 motif is necessary for binding Saccharomyces cerevisiae RNase H1 is the only currently known eukaryotic RNase H1 to contain two copies of the conserved sequence (R1 and R2 in Fig. 1). Wild-type S.cerevisiae RNase H1 binds to nucleic acids, but a protein derivative that is lacking the first conserved motif (R1) does not (16). To investigate whether R1 binds to duplex RNA in the absence of R2 we tested a 50 amino acid synthetic peptide for its ability to bind to dsRNA and RNA–DNA hybrids. Surprisingly, no binding to either nucleic acid was observed (data not shown). Because failure of the peptide to bind to duplex RNA might result from poor folding of the protein, we examined the structure of the 50 amino acid peptide by NMR. We were able to sequentially assign almost all backbone protons from 2D-NMR experiments, but NOESY spectra of the peptide contained no long range NOEs (data not shown). This pattern of NOE cross-peaks, as well as the very narrow dispersion of chemical shifts, indicated the absence of stable folded structure in solution. To identify additional amino acid sequences that might be necessary to restore binding activity we constructed one gene that would express the first 64 amino 1838 Nucleic Acids Research, 1998, Vol. 26, No. 7 does not bind either to dsRNA or to RNA–DNA hybrids (16). In the light of our findings that R1 binds only when present with additional amino acids between R1 and R2 (Fig. 4) and that shorter proteins bearing amino acids 1–50 are probably disordered, it is possible that the 30 kDa protein also requires amino acids prior to residue 82 for stability or proper folding. However, inspection of the sequence comprising R2 (Fig. 2A) reveals two features that distinguish it from the other sequences: (i) highly conserved residues G7, R8 and G11 are replaced by N, V and K respectively; (ii) three amino acids, P, N and I are inserted between amino acids 8 and 9 (Fig. 2A, amino acids 116–118 in S.cerevisiae RNase H1). To test if the differences in binding properties of R1 and R2 can be attributed to the differences near amino acid 8 we generated two mutant genes: 8-R2(N-176) constructed to replace amino acids 7–11 of R1 with amino acids 7–14 of R2 (Fig. 2A, amino acids 114–121 in S.cerevisiae RNase H1), including the additional three amino acids PNI of R2, for a total of 8 amino acids; 3-R2(N-176) by inserting amino acids PNI between amino acids 8 and 9 of R1 (Fig. 3B). Neither of these two mutant proteins, 8-R2(N-176) and 3-R2(N-176), bound to dsRNA or to RNA–DNA hybrids in Northwestern assays (Fig. 4, lanes 15 and 16), indicating that R2 fails to bind to duplex RNAs due, at least in part, to the insertion of the three amino acids, possibly by perturbing the structure essential for nucleic acid binding. Figure 4. Nucleic acid binding assays. His tag-purified samples (1–6, 9–17), partially purified S.pombe RNase H1 (7) and cell extracts of E.coli BL21(DE3) induced to produce S.cerevisiae RNase H1 (8) were separated by electrophoresis in polyacrylamide gels and then stained with Coomassie blue (top panel) or transferred to an Immobilon P membrane, renatured and incubated with α-32P-labeled RNA–DNA hybrid (middle panel) or dsRNA (bottom panel), purified in NucTrap push columns (A) or extracted from polyacrylamide gels (B). Molecular weight markers (M.W. STD) are shown for the different gels and their weights in kDa are indicated. acids of S.cerevisiae RNase H1 attached to a His6 tag (designated N-64) and a second to yield a protein containing residues 1–79 fused to residues 163–176 (also with a His6 tag), which we called ∆80–162(N-176) (Fig. 3A). N-64 did not bind to nucleic acids, as judged by Northwestern assays (Fig. 4, lane 4), even though it has the entire first copy of the conserved region (Fig. 3), whereas ∆80–162(N-176) exhibited significant binding (Fig. 4, lane 3). A parallel Western blot revealed that both proteins were transferred to the membrane to a similar extent. One interpretation of these results is that stable folding of R1 requires amino acids residues between 64 and 80 (the 13 amino acids 163–176 may also contribute to stability). It is also possible that amino acids 64–80 are involved in binding to nucleic acids. Binding of ∆80–162(N-176) was less than that observed for N-176 (Fig. 4, lane 1) and for the wild-type protein (Fig. 4, lane 8), indicating that R1 is sufficient for binding, but R2 contributes to the binding strength. It remains unclear whether R2 acts directly in binding to nucleic acids or has an indirect role, such as increasing proper folding or stability of R1. A three amino acid insertion prevents R1 from binding to nucleic acids Previously we found that a 30 kDa protein derived by proteolysis of the 39 kDa S.cerevisiae RNase H1 at or near residue 81 (26) Crithidia fasciculata and S.pombe RNases H also bind to nucleic acids Because C.fasciculata and S.pombe RNases H have a region very similar in sequence to S.cerevisiae R1, we examined these proteins, or portions thereof, for their abilities to bind to duplex RNAs. To study the nucleic acid binding properties of the conserved region from C.fasciculata RNase H, we made chimeric constructs either replacing amino acids 1–50 of the N-176 protein or of the ∆80–162(N-176) construct with amino acids 149–200 of C.fasciculata RNase H1 [Fig. 3, ChRH(N-176) and ChRH(∆80–162)]. The ChRH(N-176) protein was produced in small amounts in E.coli cells and recovery from a nickel resin was low, although sufficient to detect binding in Northwestern assays. Binding of ChRH(N-176) to RNA–DNA hybrids and dsRNA was at a reduced level compared with the N-176 protein (Fig. 4, lanes 9 and 10). A mutant of the ChRH(N-176) protein (ChRH-mut, Fig. 3B) was defective in nucleic acid binding (not shown), indicating that the ChRH region is responsible for interaction with the nucleic acids substrates. Protein ChRH(∆80–162) was expressed well in E.coli and, after partial purification, was found to bind both substrates in Northwestern assays (Fig. 4, lane 5). Schizosaccharomyces pombe RNase H1 also bound to dsRNA and RNA–DNA hybrids (Fig. 4, lane 7), indicating that interaction with RNA–DNA and RNA–RNA duplexes is a common property of this class of eukaryotic RNases H. The complete 40 amino acid conserved sequence is required for binding Previously we demonstrated that the last 20 amino acids of the first conserved domain of S.cerevisiae RNase H1 are necessary for dsRNA and RNA–DNA binding (16). The first 20 residues of the motif are highly conserved in eukaryotic RNases H1 (Fig. 2A) and in the various caulimovirus proteins (Fig. 2B), suggesting an important role for the N-terminal part of the motif. Also, the lack of binding of the two constructs that have a deletion or alteration 1839 Nucleic Acids Acids Research, Research,1994, 1998,Vol. Vol.22, 26,No. No.17 Nucleic at the beginning of the motif (ChRH-mut and CaMV-mut, Figs 3 and 4) indicates that the first 20 amino acids are necessary for nucleic acid binding. To test this assumption we made a protein [∆7–23(N-176)] in which most of this region in R1 of N-176 protein was deleted (Fig. 3A). We also made a protein with substitutions of four of the most conserved residues by A in R1: F1, Y2, G11 and Y13 [4As(N-176), Fig. 3B]. Neither of these constructs bound to dsRNA or RNA–DNA hybrids in Northwestern assays (Fig. 4, lanes 2, 13 and 14), indicating that the N-terminal part of the conserved sequence is required for interaction with nucleic acids. Relative protein binding strength for nucleic acids Each protein that binds to duplex RNAs appears to do so with different ratios of duplex RNA to protein. To determine the relative binding of each protein to dsRNA and RNA–DNA hybrids we compared the intensity of phosphor emission from the substrate bound to the protein as a function of the amount of protein. Protein levels in each band were determined by staining, after electrophoresis in polyacrylamide gels, with SYPRO orange (Molecular probes), a fluorescent dye. Both fluorescence levels and phosphor emission were measured using the Storm system (Molecular Dynamics). The values presented in Table 1 correspond to the average of at least three independent experiments. In every experiment the wild-type RNase H1 of S.cerevisiae exhibited the highest amount of nucleic acid bound per unit of protein. For purposes of comparison the values given in Table 1 are expressed as a percentage of S.cerevisiae RNase H1 binding. These values give an estimate of the binding strengths of the different proteins, although additional kinetic analysis will be required to accurately determine the affinity of these proteins for dsRNA and RNA–DNA hybrids. Binding of N-176 (without the RNase H domain) decreased to ∼30% of the value obtained for the wild-type enzyme and removal of R2 [∆80–162(N-176)] resulted in a further decrease to 8–14% of that value. Schizosaccharomyces pombe RNase H1 bound to about the same degree as ∆80–162(N-176). The lowest values obtained were those using the C.fasciculata chimeric proteins. Table 1. Percentage of relative binding affinity to dsRNA and RNA–DNA hybrids Proteina dsRNA binding (%) RNA–DNA binding (%) S.cerevisiae RNase H1 100 100 N-176 35.5 30.1 ∆80–162(N-176) 14.0 7.6 8.0 7.5 CaMV(N-176) 15.1 22.9 ChRH(N-176) 4.1 6.1 ChRH(∆80–162) 1.4 1.7 S.pombe RNase H1 aOnly the proteins that gave measurable binding are shown. The relative binding of each protein to dsRNA and RNA–DNA hybrids is expressed as a percentage of the binding obtained for wild-type S.cerevisiae RNase H. DISCUSSION Data we have presented (this paper and 16) define unique features of a motif present in eukaryotic RNases H and caulimovirus ORF VI proteins. The motif binds to dsRNA and RNA–DNA hybrids 1839 and its signature is a conserved sequence of 40 amino acids, with seven of them aromatic residues. Flanking two lysines involved in nucleic acid binding, two of the aromatic residues may contribute to the strength of the interaction, while the other five may play a structural role. The presence of a common domain in these two very different families of proteins raises two interesting questions: (i) how these proteins recognize duplex RNAs; (ii) what role does this domain play in these two protein families? Although the RNase H/CaMV motif bears little resemblance to the dsRBD, it is likely that the two have similar structural features, in addition to their similar role in binding to dsRNA. When the structures of two dsRBD were solved by NMR (23,28) it became obvious that they are very similar to each other and to the N-terminal domain of ribosomal protein S5 (29). In addition to having a similar structure, the S5 family of proteins and the dsRBD share a number of conserved residues (23), although the overall sequences are very different. In the structure of the dsRBD domain two conserved lysines are well positioned to bind the substrate and they have also been implicated in binding by mutational analysis (17,21–23). The RNase H/CaMV motif has some residues in common with the dsRBD and the S5 domains and two conserved lysines are also required for binding in S.cerevisiae RNase H1 (16). It may be significant that the weakest interactions with duplex RNAs (Table 1) are those of the C.fasciculata chimeric proteins [ChRH(N-176) and ChRH(∆80–162)]. The C.fasciculata motif is the only one among eukaryotic RNases H1 that has one K residue (Fig. 2A). Two highly conserved aromatic amino acids flank the two important lysines (Fig. 2) in the RNase H/CaMV motif. These amino acids may contribute to the strength of binding by intercalating into the bases of the nucleic acids, perhaps inserting between the bases in the minor groove, forming van der Waal’s and π–π interactions, as in the case of TBP and its dsDNA target, which has an A-like structure (30). The RNase H/CaMV motif is much shorter than the dsRBD (40 amino acids versus 70). The additional 30 amino acids of the dsRBD form an α-helix, which contributes to stability of the domain through hydrophobic interactions. Similarly, the S5 domain is tightly associated with the other domain of the protein by an extensive inter-domain hydrophobic core (29). In a similar fashion, hydrophobic interactions may play a role in stability of the RNase H/CaMV motif. A 50 amino acid peptide including the 40 conserved residues of S.cerevisiae RNase H1 is insufficient for binding, most likely because it does not form a stable folded structure in solution, as assessed by NMR analysis. While the sequence adjacent to the motif is required for proper folding, it is not conserved among the various RNases H or between the RNases H and the caulimovirus ORF VI protein, although all these sequences are rich in S, P and T residues. When the 40 amino acid peptide of the caulimovirus ORF VI protein is substituted for the S.cerevisiae peptide the chimeric protein binds almost as well as the unmodified S.cerevisiae protein (Table 1), indicating proper folding of the CaMV motif in a very different environment. These observations are consistent with the notion that several of the highly conserved amino acids in the 40 amino acid sequence stabilize this peptide by hydrophobic interactions with the SPT-rich region. The question of how these proteins recognize their duplex RNA substrates is related to that of how many of these motifs are necessary for nucleic acid binding. Saccharomyces cerevisiae RNase H1, the only protein of its family with two copies (Fig. 1), exhibits the strongest binding in the Northwestern assay (Table 1840 Nucleic Acids Research, 1998, Vol. 26, No. 7 1), suggesting that two copies of the motif bind to nucleic acids better than one, although one is sufficient for binding. However, the two copies are not equivalent. In the assays employed, R2 does not bind to nucleic acids in the absence of R1 (16), while in ∆80–162(N-176) protein R1 alone does (Fig. 4). Most of the dsRNA binding proteins have more than one copy of the dsRBD (17,31,32) and for PKR, the human protein kinase activated by dsRNA, the number and relative positions of the motifs affect the binding properties of these proteins (33,34). The similarity between the RNase H and the CaMV motifs is striking, considering that these two groups of proteins appear to have such different functions. Proteins containing the 40 amino acid conserved sequence either from S.cerevisiae or from cauliflower mosaic virus bound to RNA–RNA duplexes and RNA–DNA hybrids similarly well, suggesting a common mode of interaction of the proteins with duplex RNAs. The question arises whether this also indicates a common feature in the target nucleic acid. Little is known about a target for RNase H1, but because the ORF VI protein participates in translation of a polycistronic mRNA to permit internal initiation (35,36), the RNA recognized by the RNase H/CaMV motif could be either the mRNA itself or a portion of rRNA (that might tether the ribosome to the mRNA, thereby stimulating reinitiation). RNase H1 might also recognize mRNA or rRNA but would, presumably, do this in the nucleus or nucleolus. It is also possible that these proteins are bifunctional and interact with RNA–RNA duplexes or with RNA–DNA hybrids, depending upon other proteins or nucleic acids. For example, the ORF VI protein could participate in interactions with the RNA–DNA hybrids formed during reverse transcription of RNA to form the DNA that ultimately is packaged into virus particles (37). Insight into the role(s) of the RNase H/CaMV motif will require identification of the target nucleic acid for both types of proteins. ACKNOWLEDGEMENTS We thank Hirofumi Yamada for providing purified S.pombe RNase H1 and Richard Maraia and Stephen White for critical reading of the manuscript. This work was supported in part by the National Institutes of Health Intramural AIDS Targeted Antiviral Program. REFERENCES 1 Crouch,R.J. (1990) New Biol., 2, 771–777. 2 Kanaya,S. and Crouch,R.J. (1983) J. Biol. Chem., 258, 1276–1281. 3 Kanaya,S., Kohara,A., Miura,Y., Sekiguchi,A., Iwai,S., Inoue,H., Ohtsuka,E. and Ikehara,M. (1990) J. Biol. Chem., 265, 4615–4621. 4 Nakamura,H., Oda,Y., Iwai,S., Inoue,H., Ohtsuka,E., Kanaya,S., Kimura,S., Katsuda,C., Katayanagi,K., Morikawa,K., Miyashiro,H. and Ikehara,M. (1991) Proc. Natl. Acad. Sci. USA, 88, 11535–11539. 5 Yang,W., Hendrickson,W.A., Crouch,R.J. and Satow,Y. (1990) Science, 249, 1398–1405. 6 Katayanagi,K., Miyagawa,M., Ishikawa,M., Matsushima,M., Kanaya,S., Ikehara,M., Matsuzaki,T. and Morikawa,K. (1990) Nature, 347, 306–309. 7 Katayanagi,K., Miyagawa,M., Matsushima,M., Ishikawa,M., Kanaya,S., Nakamura,H., Ikehara,M., Matsuzaki,T. and Morikawa,K. (1992) J. Mol. Biol., 223, 1029–1052. 8 Itaya,M. (1990) Proc. Natl. Acad. Sci. USA, 87, 8587–8591. 9 Itaya,M., McKelvin,D., Chatterjie,S.K. and Crouch,R.J. (1991) Mol. Gen. Genet., 227, 438–445. 10 Kanaya,S. and Itaya,M. (1992) J. Biol. Chem., 267, 10184–10192. 11 Dawes,S.S., Crouch,R.J., Morris,S.L. and Mizrahi,V. (1995) Gene, 165, 71–75. 12 Davies,J.F., Hostomska,Z., Hostomsky,Z., Jordan,S.R. and Matthews,D. (1991) Science, 252, 88–95. 13 Xiong,Y. and Eickbush,T.H. (1990) EMBO J., 9, 3353–3362. 14 Campbell,A.G. and Ray,D.S. (1993) Proc. Natl. Acad. Sci. USA, 90, 9350–9354. 15 Mushegian,A.M., Edskes,H.K. and Koonin,E.V. (1994) Nucleic Acids Res., 22, 4163–4166. 16 Cerritelli,S.M. and Crouch,R.J. (1995) RNA, 1, 246–259. 17 Green,S.R. and Mathews,M.B. (1992) Genes Dev., 6, 2478–2490. 18 St Johnston,D., Brown,N.H., Gall,J.G. and Jantsch,M. (1992) Proc. Natl. Acad. Sci. USA, 89, 10979–10983. 19 Gatignol,A., Buckler,C. and Jeang,K.T. (1993) Mol. Cell. Biol., 13, 2193–2202. 20 Gibson,T.J. and Thompson,J.D. (1994) Nucleic Acids Res., 22, 2552–2556. 21 Romano,P.R., Green,S.R., Barber,G.N., Mathews,M.B. and Hinnebusch,A.G. (1995) Mol. Cell. Biol., 15, 365–378. 22 McMillan,N.A.J., Carpick,B.W., Hollis,B., Toone,W.M., Zamanian-Daryoush,M. and Williams,B.R.G. (1995) J. Biol. Chem., 270, 2601–2606. 23 Bycroft,M., Grünert,S., Murzin,A.G., Proctor,M. and St Johnston,D. (1995) EMBO J., 14, 3563–3571. 24 De Tapia,M., Himmelbach,A. and Hohn,T. (1993) EMBO J., 12, 3305–3314. 25 Studier,F.W., Rosenberg,A.H., Dunn,J.J. and Dubendorff,J.W. (1990) Methods Enzymol., 185, 60–89. 26 Cerritelli,S.M., Shin,D.Y., Chen,H.C., Gonzales,M. and Crouch,R.J. (1993) Biochimie, 75, 107–111. 27 Milligan,J.F. and Uhlenbeck,O.C. (1989) Methods Enzymol., 180, 51–62. 28 Kharrat,A., Macias,M.J., Gibson,T.J., Nilges,M. and Pastore,A. (1995) EMBO J., 14, 3572–35784. 29 Ramakrishnan,V. and White,S.W. (1992) Nature, 358, 768–771. 30 Kim,J.L. and Burley,S.K. (1994) Nature Struct. Biol., 1, 638–653. 31 Melcher,T., Maas,S., Herb,A., Sprengel,R., Seeburg,P.H. and Higuchi,M. (1996) Nature, 379, 460–464. 32 O’Connell,M.A., Krause,S., Higuchi,M., Hsuan,J.J., Totty,N.F., Jenny,A. and Keller,W. (1995) Mol. Cell. Biol., 15, 1389–1397. 33 Green,S.R., Manche,L. and Mathews,M.B. (1995) Mol. Cell. Biol., 15, 358–364. 34 Schmedt,C., Green,S.R., Manche,L., Taylor,D.R., Ma,Y. and Mathews,M.B. (1995) J. Mol. Biol., 249, 29–44. 35 Bonneville,J.M., Sanfacon,H., Futterer,J. and Hohn,T. (1989) Cell, 59, 1135–1143. 36 Scholthof,H.B., Gowda,S., Wu,F.C. and Shepherd,R.J. (1992) J. Virol., 65, 5190–5195. 37 Pfeiffer,P. and Hohn,T. (1983) Cell, 33, 781–789. 38 Hasegawa,A., Verver,J., Shimada,A., Saito,M., Goldbach,R., VanKammen,A., Miki,K., Kameya-Iwaki,M. and Hibi,T. (1989) Nucleic Acids Res., 17, 9993–10013.
© Copyright 2026 Paperzz