© 1998 Oxford University Press
1834–1840 Nucleic Acids Research, 1998, Vol. 26, No. 7
A common 40 amino acid motif in eukaryotic RNases
H1 and caulimovirus ORF VI proteins binds to duplex
RNAs
Susana M. Cerritelli, Oleg Y. Fedoroff1,+, Brian R. Reid1,2 and Robert J. Crouch*
Laboratory of Molecular Genetics, National Institute of Child Health and Human Development, National Institutes
of Health, Bethesda, MD 20892, USA and 1Department of Chemistry and 2Department of Biochemistry,
University of Washington, Seattle, WA 98195, USA
Received October 21, 1997; Revised and Accepted February 4, 1998
ABSTRACT
Eukaryotic RNases H from Saccharomyces cerevisiae,
Schizosaccharomyces pombe and Crithidia fasciculata,
unlike the related Escherichia coli RNase HI, contain a
non-RNase H domain with a common motif. Previously
we showed that S.cerevisiae RNase H1 binds to duplex
RNAs (either RNA–DNA hybrids or double-stranded
RNA) through a region related to the double-stranded
RNA binding motif. A very similar amino acid sequence
is present in caulimovirus ORF VI proteins. The
hallmark of the RNase H/caulimovirus nucleic acid
binding motif is a stretch of 40 amino acids with 11
highly conserved residues, seven of which are aromatic.
Point mutations, insertions and deletions indicated
that integrity of the motif is important for binding.
However, additional amino acids are required because
a minimal peptide containing the motif was disordered
in solution and failed to bind to duplex RNAs, whereas
a longer protein bound well. Schizosaccharomyces
pombe RNase H1 also bound to duplex RNAs, as did
proteins in which the S.cerevisiae RNase H1 binding
motif was replaced by either the C.fasciculata or by the
cauliflower mosaic virus ORF VI sequence. The similarity
between the RNase H and the caulimovirus domain
suggests a common interaction with duplex RNAs of
these two different groups of proteins.
INTRODUCTION
Ribonucleases H degrade the RNA strand of RNA–DNA hybrids.
RNase H has been found in prokaryotes and eukaryotes and also
in retroviruses (1). In Escherichia coli two different RNases H
have been described: RNase HI, a very active and well studied
RNase H (2–4) whose crystal structure has been determined
(5–7), and RNase HII, a low activity enzyme which is poorly
characterized (8). Prokaryotic RNases H of the E.coli RNase HI
family are small proteins containing only the RNase H domain
(9–11). Retroviral RNases H have sequence and structural
conservation with E.coli RNase HI (12), but they differ in that the
RNase H domain is part of a larger polypeptide that also contains
a polymerase domain (13). The eukaryotic RNases H that
resemble E.coli RNase HI are, like the retroviral enzymes,
complex proteins and they have an additional domain that does
not appear related to polymerase and whose function is unknown.
Eukaryotic RNases H1 from Crithidia fasciculata (14), from
Saccharomyces cerevisiae (R.Crouch et al., manuscript in
preparation), from Schizosaccharomyces pombe (R.Crouch et al.,
manuscript in preparation) and from a chicken cDNA (accession
no. D26340) have been sequenced. Although it has not yet been
demonstrated that the chicken cDNA encodes an active RNase H,
the high degree of similarity with the other sequences indicates
that this protein is likely to be the chicken homolog of RNase H1
(15). All these proteins present a similar domain organization,
with the RNase H domain located in the C-terminal part of the
protein (Fig. 1). The N-terminal region, of variable length, is not
required for RNase H activity (R.Crouch et al., manuscript in
preparation) and is characterized by an abundance of basic amino
acids and a conserved sequence of ∼40 residues (Fig. 1).
Saccharomyces cerevisiae RNase H1 is, so far, the only member
of this family that contains two copies of the motif (Fig. 1). The
conserved sequence is related to the conserved domain of
caulimovirus ORF VI product (Fig. 2), which facilitates translation
of polycistronic virus RNA in plant cells (15).
It was first observed in S.cerevisiae RNase H1 (16) that the last
20 amino acids of the conserved domain are related to the last
20 amino acids of a motif present in proteins that bind to
double-stranded (ds)RNA (17–19). The dsRNA binding domain
(dsRBD) is a sequence of ∼70 amino acids, of which the last 20
are the most conserved (18,20). Saccharomyces cerevisiae RNase
H1 binds in vitro to dsRNA and to RNA–DNA hybrids with
similar strength and to other nucleic acids with lower affinity. It
is thought that the sequence similar to the dsRBD is required for
binding, because the ability of yeast RNase H1 to bind to dsRNA
is much reduced by deletion of this sequence. Similarly, mutation
of two K residues that are highly conserved in the RNases
H/caulimovirus ORF VI motif (RNH/CaMV) and in the dsRBD
reduces binding in both domains (16,17,21–23). N-176, a
truncated version of S.cerevisiae RNase H1 which lacks the
RNase H domain, binds to nucleic acids, as does the full-length
protein, indicating that the RNase H domain is not required for
binding (16).
*To whom correspondence should be addressed. Tel: +1 301 496 4082; Fax: +1 301 496 0243; Email: [email protected]
+Present address: Drug Dynamics Institute, College of Pharmacy, University of Texas Austin, Austin, TX 78712, USA
Figure 1. Structural organization of eukaryotic RNases H1. The RNase H
domains are represented by dark boxes. The conserved motifs in the N-terminal
region are related by dotted lines. R1 and R2 indicate the two conserved
sequences present in S.cerevisiae RNase H1.
The first 20 residues of the 40 amino acid motif are highly
homologous in eukaryotic RNases H and the caulimovirus ORF
VI proteins (Fig. 2). This region has been reported to be essential
for the translation activation capacity of the cauliflower mosaic
virus ORF VI protein (24). The mechanism of transactivation is
unknown, but efficient translation of a polycistronic message into
multiple proteins might involve interaction with dsRNA regions
of the ribosomes or of mRNA. In this study we investigated the
relative apparent affinity for nucleic acid of the conserved motif
from cauliflower mosaic virus ORF VI protein (CaMV) and also
from Crithidia fasciculata RNase H1 (ChRH) by making
chimeric constructs in which the first motif of the N-176 protein
was replaced by the 40 conserved amino acids from ChRH or
CaMV. Northwestern assays showed that both constructs could
bind dsRNA and RNA–DNA hybrids. We also made deletions,
insertions and point mutations in the N-176 protein and determined
that the first 20 amino acids of the conserved sequence are
necessary for duplex RNA binding and that the first of the two
motifs of S.cerevisiae RNase H1 is sufficient for the nucleic acid
binding activity of the protein.
Figure 2. Alignment of homologous sequences from eukaryotic RNases H and caulimovirus ORF VI proteins. The numbering on the top refers to the first amino acid
of the conserved sequence. Shaded boxes highlight identical amino acids. A capital letter indicates an amino acid that occurs at the same position in >80%; a lower
case letter represents a residue present in >50%; an x denotes a position that is not conserved. Arrows point to amino acids that are identical in all sequences.
(A) Sequences of the conserved motif present at the N-terminal region of eukaryotic RNases H. The sequences shown are: S.pombe RNase H1 8–50; C.fasciculata
RNase H1 156–198; chicken 22–64; S.cerevisiae R1 7–49; S.cerevisiae R2 108–153. (B) Sequences of the conserved motif of caulimovirus ORF VI proteins. The
sequences shown are: CaMV, cauliflower mosaic virus 140–182; CERV, carnation etched ring virus 134–176; FMV, figwort mosaic virus 156–196; SVB, strawberry
vein banding virus 146–185; PCISV, peanut chlorotic streak virus 80–122; for soybean chlorotic mottle virus the sequence reported (38) is SoCMV 107–146, which
presents homology to the other sequences only in the C-terminal part of the motif. Homology in the N-terminal region appeared in the –1 reading frame that is shown
under SoCMV# and would correspond to amino acids 105–143. This discrepancy has been previously reported (24). To derive the consensus sequence we used the
first 20 amino acids of SoCMV# and the last 19 of SoCMV. (C) Comparison of the consensus sequence of the motif of eukaryotic RNases H1 and caulimovirus ORF
VI proteins. The stars denote amino acids identical in both sequences.
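The consensus notation used in Figure 2 (capital letter for >80% occurrence, lower case for >50%, x otherwise) can be illustrated with a short sketch. This is only an illustration of the stated thresholds, not the procedure actually used to build the figure. The function names, the treatment of gaps, the case-insensitive comparison of the two consensus strings and the toy alignment are all assumptions; the toy sequences are not those of Figure 2.

```python
from collections import Counter


def consensus(aligned, hi=0.8, lo=0.5):
    """Consensus of equal-length aligned sequences.

    Upper case: the most common residue occurs in > hi of the sequences.
    Lower case: the most common residue occurs in > lo of the sequences.
    'x': position not conserved (a gap as most common symbol is also 'x',
    which is an assumption of this sketch).
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        frac = count / len(column)
        if residue == "-":
            out.append("x")
        elif frac > hi:
            out.append(residue.upper())
        elif frac > lo:
            out.append(residue.lower())
        else:
            out.append("x")
    return "".join(out)


def shared_positions(cons_a, cons_b):
    """Put '*' where two consensus strings carry the same residue.

    Case is ignored and 'x' never counts as a match (both assumptions).
    """
    return "".join(
        "*" if a.lower() == b.lower() and a.lower() != "x" else " "
        for a, b in zip(cons_a, cons_b)
    )


# Hypothetical toy alignment, NOT the sequences shown in Figure 2.
toy = ["FYAVK", "FYGVK", "FYGLR", "FWGIK", "FYSAK"]
print(consensus(toy))
print(repr(shared_positions(consensus(toy), "FYxxK")))
```

Because the thresholds are strict inequalities, a residue present in exactly 4 of 5 sequences (80%) is written in lower case in this sketch.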
Figure 3. Schematic representation of wild-type proteins and derivatives. (A) The structures of wild-type S.cerevisiae RNase H1, S.pombe RNase H1 and C.fasciculata RNase
H1 (ChRH) are represented as in Figure 1. In cauliflower mosaic virus ORF VI protein (CaMV) only the conserved motif is featured. The protein derivatives constructed
in this work are displayed in the same schematic representation. (B) The sequence of R1 of N-176 is presented and the substitutions in 4As(N-176), 3-R2(N-176) and
8-R2(N-176) are indicated. On the bottom are shown the sequences of cauliflower mosaic virus ORF VI product [CaMV(N-176)] and C.fasciculata RNase H1
[ChRH(N-176)] and the mutations present in CaMV-mut and ChRH-mut. The consensus sequence of the eukaryotic RNases H1 N-terminal motif is also shown.
MATERIALS AND METHODS
Construction of S.cerevisiae derivatives
Construction of the N-176 derivative was as previously described
(16). It contains the N-terminal part of the RNH1 gene cloned in
pET-20b (Novagen). All gene constructs described in this work
were derivatives of this plasmid. The N-64 construct was made
by inserting a His6 tag and a termination site between the AvrII
and SphI restriction endonuclease sites of the RNH1 gene. The
∆80–162(N-176) protein was obtained by digesting with SphI
and HindIII and by filling with an oligonucleotide that has the
sequence encoding amino acids 80–162 deleted. The
∆7–23(N-176) derivative was constructed by digesting with XbaI
(in the T7 promoter region) and BstXI and ligating with an
oligonucleotide that replaced the promoter region and the first six
amino acids of the protein and deleted residues 7–23. The
4As(N-176) construct was made by PCR using oligonucleotides
that extend from the XbaI site to the AvrII site and substituted F7,
Y8, G13 and Y19 by A residues in the wild-type RNH1 gene. In
most respects the 3-R2(N-176) and 8-R2(N-176) derivatives
were made in the same way by PCR amplification and digestion
with XbaI and AvrII, except that the sequence for PNI from R2
was inserted between R14 and E15 in R1 [3-R2(N-176)] or the
sequence for GRETG (amino acids 13–17) in R1 was substituted
with the sequence for NVPNIESK (amino acids 114–121) of R2
[8-R2(N-176)]. All constructs were confirmed by sequencing.
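The residue numbering behind the construct names can be made explicit with a small sketch. It applies the four kinds of change described above (deletion, point substitution, insertion and segment replacement) to a protein sequence using 1-based, inclusive positions. The helper names are hypothetical and the 176-residue placeholder string is synthetic; it is not the N-176 sequence, so only positions and lengths are meaningful here.

```python
def delete_range(seq, first, last):
    """Delete residues first..last (1-based, inclusive)."""
    return seq[:first - 1] + seq[last:]


def substitute(seq, positions, new="A"):
    """Replace the residues at the given 1-based positions by `new`."""
    residues = list(seq)
    for pos in positions:
        residues[pos - 1] = new
    return "".join(residues)


def insert_after(seq, pos, insertion):
    """Insert `insertion` between residue pos and residue pos + 1."""
    return seq[:pos] + insertion + seq[pos:]


def replace_range(seq, first, last, replacement):
    """Replace residues first..last (1-based, inclusive) by `replacement`."""
    return seq[:first - 1] + replacement + seq[last:]


# Synthetic 176-residue placeholder (hypothetical, NOT the real N-176 sequence).
alphabet = "CDEFGHIKLMNPQRSTVWY"
placeholder = "".join(alphabet[i % len(alphabet)] for i in range(176))

d80_162 = delete_range(placeholder, 80, 162)            # residues 1-79 fused to 163-176
d7_23 = delete_range(placeholder, 7, 23)                # residues 7-23 removed
four_as = substitute(placeholder, [7, 8, 13, 19])       # positions 7, 8, 13 and 19 to A
three_r2 = insert_after(placeholder, 14, "PNI")         # PNI between residues 14 and 15
eight_r2 = replace_range(placeholder, 13, 17, "NVPNIESK")  # residues 13-17 replaced

print(len(d80_162), len(d7_23), len(three_r2), len(eight_r2))
```

With this numbering, deleting residues 80–162 from a 176-residue protein leaves 79 + 14 = 93 residues.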
Chimera construction
The constructs ChRH(N-176), CaMV(N-176) and
ChRH(∆80–162) were made by PCR amplification using oligonucleotides that substituted the sequence of R1 (amino acids 1–50)
from the NdeI to the AvrII site of the N-176 or the
∆80–162(N-176) constructs with the 50 amino acid sequence
from C.fasciculata RNase H1 or from cauliflower mosaic virus
ORF VI protein (Fig. 3). CaMV-mut and ChRH-mut were
mutants obtained by PCR errors. All constructs were confirmed
by sequencing.
Protein expression and purification
Escherichia coli strain BL21(DE3) harboring the S.cerevisiae
RNH1 gene in the pET-3a vector or N-176 and its derivatives in
pET-20b (both Novagen vectors) under the control of bacteriophage
T7 RNA polymerase (25) was grown and induced as previously
described (16). For crude cell extract preparation a 3 ml culture
was grown and induced, then 1 ml was pelleted and resuspended
in 100 µl SDS cracking buffer (26) and boiled for 5 min prior to
SDS–PAGE. For purification using His-Bind resin a 50 ml culture
was grown and induced for 4 h. After spinning down, the pellet
was resuspended in 1 ml binding buffer (5 mM imidazole, 500 mM
NaCl and 20 mM Tris–HCl, pH 7.9). The sample was sonicated
for 10 s 10 times, cooling each time. The cell lysate was treated
with 10 µg/ml DNase I for 20 min at room temperature in the
presence of 10 mM MgCl2 and then centrifuged at 20 000 g for
15 min. After resuspending the pellet in 500 µl binding buffer and
running 1 µl soluble and insoluble fractions in SDS–PAGE it was
observed that, in most cases, the protein of interest was mostly in the
resuspended pellet. The pellet was centrifuged and resuspended in
500 µl binding buffer containing 6 M urea. The protein solution was
incubated on ice for 1 h followed by centrifugation at 39 000 g for
15 min to remove any insoluble material. The supernatant was
mixed with 200 µl His-Bind resin that had been previously charged
with 50 mM NiSO4 and equilibrated with binding buffer containing
6 M urea. After incubation for 5 min on ice the mixture was spun
for 1 min at 500 g and the supernatant removed. The resin was
washed first with 750 µl binding buffer containing 6 M urea and then
twice with 750 µl 20 mM imidazole in binding buffer with 6 M urea.
The protein was eluted with 200 µl 500 mM imidazole in binding
buffer with 6 M urea. The protein was used for Northwestern
analysis directly or after dialysis into 20 mM Tris–HCl, pH 7.9,
100 mM NaCl, 3 mM β-mercaptoethanol, 0.1 mM EDTA, which
gradually removes 6 M urea to allow refolding of the protein.
Schizosaccharomyces pombe RNase H1 was expressed and purified
using a procedure to be published elsewhere.
Preparation of nucleic acid
RNA–DNA hybrids and dsRNA substrates for the binding assays
were prepared by in vitro transcription, using the plasmid
pSPORT 1 (Life Technologies), which contains two promoters in
opposite orientations. Plasmid DNA was linearized with NarI for
transcription using the SP6 promoter or with SphI for transcription
from the T7 promoter. Linear plasmids were incubated with the
corresponding polymerase according to a procedure previously
described (27). Only the RNA made from the SP6 promoter was
labeled with [α-32P]CTP. To form dsRNA equal amounts of the
SP6 and T7 run-off transcripts were annealed by heating to 95°C
for 5 min and slowly cooled. To form RNA–DNA hybrids the
labeled SP6 transcript was used to make cDNA from the T7 primer
that annealed to an internal sequence ∼100 bp from the end of the
RNA. The cDNA was made using SuperScript II RNase H– reverse
transcriptase (Life Technologies) following the manufacturer’s
recommendations.
RNA–RNA and RNA–DNA duplexes were treated with
RNase A and T1, to remove unpaired ends and make both
substrates of equal length, followed by phenol extraction and ethanol
precipitation. The nucleic acids were either purified using NucTrap
push columns (Stratagene) or fractionated by electrophoresis in a
non-denaturing 12% polyacrylamide, 0.5× Tris–borate, EDTA
(TBE) gel. RNA–DNA hybrids and dsRNA were extracted from the
gel and purified using the RNaid Kit protocol (Bio 101), then stored
in diethylpyrocarbonate (DEPC)-treated distilled water at –20°C.
Nucleic acid binding assays
Northwestern assays using dsRNA and RNA–DNA hybrids
purified on either NucTrap push columns (Stratagene) or from
polyacrylamide gels were performed as previously described (16)
in the absence of MgCl2. For quantification of binding the
intensity of phosphor emission from nucleic acid bound to the
protein was quantified using a PhosphorImager and then related
to the amount of protein determined by staining a parallel gel with
SYPRO orange (Molecular Probes), a fluorescent dye that can be
quantified using the Storm system (Molecular Dynamics).
RESULTS
Northwestern assay
Data presented in this work were obtained using Northwestern
assays in which proteins were transferred to PVDF membranes
following SDS–PAGE. Renaturation of the protein involved
removal of SDS and refolding. The Northwestern assays not only
measure binding of a particular protein to duplex RNAs but also
indicate the extent to which the protein has folded into an active
form. Thus the data are useful mainly for qualitative comparisons,
although the different levels of binding to RNA–RNA versus
RNA–DNA duplexes are meaningful because the same proteins
are being tested in each case. The fraction of properly folded
protein available for binding will be the same for both substrates.
Other methods, such as band shift assays, can give data that are
quantitative. However, such methods rely on proteins that are free
of nucleic acid. At least some of the proteins we have purified are
contaminated with nucleic acids which when removed yield a
protein that does not remain in solution, either because of
aggregation or because they bind to plastic or siliconized tubes,
membranes or other substances with which they come into
contact. Thus when incubating proteins with nucleic acids in
solution, a large but unknown fraction of the protein may not be
detected in binding because it is already bound to nucleic acid or
has been removed from solution. Neither of these problems is
present in Northwestern assays.
Caulimovirus ORF VI proteins have a conserved motif that
binds to nucleic acids
The conserved motif present in the N-terminal region of
eukaryotic RNases H1 is related to a sequence in caulimovirus
ORF VI proteins (Fig. 2) necessary for translation of a viral
polycistronic mRNA (24). Comparison of the consensus sequence
of these two motifs revealed a striking similarity, with certain
amino acids being absolutely conserved between these two very
different groups of proteins (Fig. 2C). To investigate whether the
conserved region from caulimovirus ORF VI proteins binds to
nucleic acids we constructed a gene to express a protein in which
amino acids 1–50 of S.cerevisiae R1 (in the N-176 protein) were
replaced by amino acids 134–183 of cauliflower mosaic virus
ORF VI protein, designated CaMV(N-176) (Fig. 3).
CaMV(N-176) interacted strongly with dsRNA and RNA–DNA
hybrids (Fig. 4, lanes 6 and 11). A mutant protein (CaMV-mut,
Fig. 3B), in which amino acids 1–8 are deleted and in which L is
replaced by V at residue 9, did not bind to nucleic acids (Fig. 4,
lane 12), providing additional evidence that the CaMV motif was
responsible for the binding observed in CaMV(N-176) protein.
These results suggest that the conserved region of caulimovirus ORF
VI proteins is involved in binding to dsRNA and RNA–DNA
hybrids, a property that might be related to the transactivating
activity of these proteins.
The first copy (R1) of the S.cerevisiae RNase H1 motif is
necessary for binding
Saccharomyces cerevisiae RNase H1 is the only currently known
eukaryotic RNase H1 to contain two copies of the conserved
sequence (R1 and R2 in Fig. 1). Wild-type S.cerevisiae RNase H1
binds to nucleic acids, but a protein derivative that is lacking the
first conserved motif (R1) does not (16). To investigate whether
R1 binds to duplex RNA in the absence of R2 we tested a 50 amino
acid synthetic peptide for its ability to bind to dsRNA and
RNA–DNA hybrids. Surprisingly, no binding to either nucleic
acid was observed (data not shown). Because failure of the
peptide to bind to duplex RNA might result from poor folding of
the protein, we examined the structure of the 50 amino acid
peptide by NMR. We were able to sequentially assign almost all
backbone protons from 2D-NMR experiments, but NOESY
spectra of the peptide contained no long range NOEs (data not
shown). This pattern of NOE cross-peaks, as well as the very
narrow dispersion of chemical shifts, indicated the absence of
stable folded structure in solution. To identify additional amino
acid sequences that might be necessary to restore binding activity
we constructed one gene that would express the first 64 amino
acids of S.cerevisiae RNase H1 attached to a His6 tag (designated
N-64) and a second to yield a protein containing residues 1–79
fused to residues 163–176 (also with a His6 tag), which we called
∆80–162(N-176) (Fig. 3A). N-64 did not bind to nucleic acids, as
judged by Northwestern assays (Fig. 4, lane 4), even though it has
the entire first copy of the conserved region (Fig. 3), whereas
∆80–162(N-176) exhibited significant binding (Fig. 4, lane 3). A
parallel Western blot revealed that both proteins were transferred
to the membrane to a similar extent. One interpretation of these
results is that stable folding of R1 requires amino acid residues
between 64 and 80 (the 13 amino acids 163–176 may also
contribute to stability). It is also possible that amino acids 64–80
are involved in binding to nucleic acids. Binding of
∆80–162(N-176) was less than that observed for N-176 (Fig. 4,
lane 1) and for the wild-type protein (Fig. 4, lane 8), indicating
that R1 is sufficient for binding, but R2 contributes to the binding
strength. It remains unclear whether R2 acts directly in binding
to nucleic acids or has an indirect role, such as increasing proper
folding or stability of R1.
A three amino acid insertion prevents R1 from binding to
nucleic acids
Previously we found that a 30 kDa protein derived by proteolysis
of the 39 kDa S.cerevisiae RNase H1 at or near residue 81 (26)
does not bind either to dsRNA or to RNA–DNA hybrids (16). In
the light of our findings that R1 binds only when present with
additional amino acids between R1 and R2 (Fig. 4) and that shorter
proteins bearing amino acids 1–50 are probably disordered, it is
possible that the 30 kDa protein also requires amino acids prior
to residue 82 for stability or proper folding. However, inspection
of the sequence comprising R2 (Fig. 2A) reveals two features that
distinguish it from the other sequences: (i) highly conserved
residues G7, R8 and G11 are replaced by N, V and K respectively;
(ii) three amino acids, P, N and I are inserted between amino acids
8 and 9 (Fig. 2A, amino acids 116–118 in S.cerevisiae RNase H1).
To test if the differences in binding properties of R1 and R2 can
be attributed to the differences near amino acid 8 we generated
two mutant genes: 8-R2(N-176) constructed to replace amino
acids 7–11 of R1 with amino acids 7–14 of R2 (Fig. 2A, amino acids
114–121 in S.cerevisiae RNase H1), including the additional three
amino acids PNI of R2, for a total of 8 amino acids;
3-R2(N-176) by inserting amino acids PNI between amino
acids 8 and 9 of R1 (Fig. 3B). Neither of these two mutant
proteins, 8-R2(N-176) and 3-R2(N-176), bound to dsRNA or
to RNA–DNA hybrids in Northwestern assays (Fig. 4, lanes 15 and
16), indicating that R2 fails to bind to duplex RNAs due, at least
in part, to the insertion of the three amino acids, possibly by
perturbing the structure essential for nucleic acid binding.
Figure 4. Nucleic acid binding assays. His tag-purified samples (1–6, 9–17),
partially purified S.pombe RNase H1 (7) and cell extracts of E.coli BL21(DE3)
induced to produce S.cerevisiae RNase H1 (8) were separated by electrophoresis
in polyacrylamide gels and then stained with Coomassie blue (top panel) or
transferred to an Immobilon P membrane, renatured and incubated with
α-32P-labeled RNA–DNA hybrid (middle panel) or dsRNA (bottom panel),
purified in NucTrap push columns (A) or extracted from polyacrylamide gels
(B). Molecular weight markers (M.W. STD) are shown for the different gels and
their weights in kDa are indicated.
Crithidia fasciculata and S.pombe RNases H also bind to
nucleic acids
Because C.fasciculata and S.pombe RNases H have a region very
similar in sequence to S.cerevisiae R1, we examined these
proteins, or portions thereof, for their abilities to bind to duplex
RNAs. To study the nucleic acid binding properties of the
conserved region from C.fasciculata RNase H, we made chimeric
constructs either replacing amino acids 1–50 of the N-176 protein
or of the ∆80–162(N-176) construct with amino acids 149–200
of C.fasciculata RNase H1 [Fig. 3, ChRH(N-176) and
ChRH(∆80–162)]. The ChRH(N-176) protein was produced in
small amounts in E.coli cells and recovery from a nickel resin was
low, although sufficient to detect binding in Northwestern assays.
Binding of ChRH(N-176) to RNA–DNA hybrids and dsRNA
was at a reduced level compared with the N-176 protein (Fig. 4,
lanes 9 and 10). A mutant of the ChRH(N-176) protein
(ChRH-mut, Fig. 3B) was defective in nucleic acid binding (not
shown), indicating that the ChRH region is responsible for
interaction with the nucleic acid substrates. Protein
ChRH(∆80–162) was expressed well in E.coli and, after partial
purification, was found to bind both substrates in Northwestern
assays (Fig. 4, lane 5). Schizosaccharomyces pombe RNase H1 also
bound to dsRNA and RNA–DNA hybrids (Fig. 4, lane 7), indicating
that interaction with RNA–DNA and RNA–RNA duplexes is a
common property of this class of eukaryotic RNases H.
The complete 40 amino acid conserved sequence is required
for binding
Previously we demonstrated that the last 20 amino acids of the
first conserved domain of S.cerevisiae RNase H1 are necessary
for dsRNA and RNA–DNA binding (16). The first 20 residues of
the motif are highly conserved in eukaryotic RNases H1 (Fig. 2A)
and in the various caulimovirus proteins (Fig. 2B), suggesting an
important role for the N-terminal part of the motif. Also, the lack
of binding of the two constructs that have a deletion or alteration
at the beginning of the motif (ChRH-mut and CaMV-mut, Figs 3
and 4) indicates that the first 20 amino acids are necessary for
nucleic acid binding. To test this assumption we made a protein
[∆7–23(N-176)] in which most of this region in R1 of N-176
protein was deleted (Fig. 3A). We also made a protein with
substitutions of four of the most conserved residues by A in R1: F1,
Y2, G11 and Y13 [4As(N-176), Fig. 3B]. Neither of these constructs
bound to dsRNA or RNA–DNA hybrids in Northwestern assays
(Fig. 4, lanes 2, 13 and 14), indicating that the N-terminal part of
the conserved sequence is required for interaction with nucleic acids.
Relative protein binding strength for nucleic acids
Each protein that binds to duplex RNAs appears to do so with
different ratios of duplex RNA to protein. To determine the
relative binding of each protein to dsRNA and RNA–DNA
hybrids we compared the intensity of phosphor emission from the
substrate bound to the protein as a function of the amount of
protein. Protein levels in each band were determined by staining,
after electrophoresis in polyacrylamide gels, with SYPRO orange
(Molecular Probes), a fluorescent dye. Both fluorescence levels
and phosphor emission were measured using the Storm system
(Molecular Dynamics). The values presented in Table 1 correspond
to the average of at least three independent experiments. In every
experiment the wild-type RNase H1 of S.cerevisiae exhibited the
highest amount of nucleic acid bound per unit of protein. For
purposes of comparison the values given in Table 1 are expressed
as a percentage of S.cerevisiae RNase H1 binding. These values
give an estimate of the binding strengths of the different proteins,
although additional kinetic analysis will be required to accurately
determine the affinity of these proteins for dsRNA and RNA–DNA
hybrids. Binding of N-176 (without the RNase H domain)
decreased to ∼30% of the value obtained for the wild-type
enzyme and removal of R2 [∆80–162(N-176)] resulted in a
further decrease to 8–14% of that value. Schizosaccharomyces
pombe RNase H1 bound to about the same degree as
∆80–162(N-176). The lowest values obtained were those using
the C.fasciculata chimeric proteins.
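The normalization described above (phosphor signal divided by the amount of protein, expressed as a percentage of wild-type S.cerevisiae RNase H1 and averaged over independent experiments) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the arithmetic mean is assumed for averaging, and the intensities in the example are made-up numbers, not measurements from this work.

```python
def relative_binding(signal, protein, reference):
    """Percentage binding per unit protein, relative to a reference protein.

    signal:  dict mapping protein name to phosphor emission intensity
    protein: dict mapping protein name to protein stain intensity
    """
    per_unit = {name: signal[name] / protein[name] for name in signal}
    ref = per_unit[reference]
    return {name: 100.0 * value / ref for name, value in per_unit.items()}


def mean_over_experiments(experiments):
    """Arithmetic mean of the percentages from several experiments."""
    names = experiments[0].keys()
    return {
        name: sum(exp[name] for exp in experiments) / len(experiments)
        for name in names
    }


# Made-up intensities for illustration only (arbitrary units).
signal = {"wild type": 5000.0, "derivative": 600.0}
protein = {"wild type": 2.0, "derivative": 1.5}
one_experiment = relative_binding(signal, protein, reference="wild type")
print(one_experiment)

# Made-up percentages from three hypothetical experiments.
runs = [
    {"wild type": 100.0, "derivative": 16.0},
    {"wild type": 100.0, "derivative": 20.0},
    {"wild type": 100.0, "derivative": 18.0},
]
print(mean_over_experiments(runs))
```

Dividing by the protein stain signal before taking the ratio to wild type corrects for unequal loading between lanes, which is why the reference is always 100% by construction.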
Table 1. Percentage of relative binding affinity to dsRNA and RNA–DNA
hybrids
Protein (a)              dsRNA binding (%)   RNA–DNA binding (%)
S.cerevisiae RNase H1    100                 100
N-176                    35.5                30.1
∆80–162(N-176)           14.0                7.6
S.pombe RNase H1         8.0                 7.5
CaMV(N-176)              15.1                22.9
ChRH(N-176)              4.1                 6.1
ChRH(∆80–162)            1.4                 1.7
(a) Only the proteins that gave measurable binding are shown. The relative binding
of each protein to dsRNA and RNA–DNA hybrids is expressed as a percentage
of the binding obtained for wild-type S.cerevisiae RNase H.
DISCUSSION
Data we have presented (this paper and 16) define unique features
of a motif present in eukaryotic RNases H and caulimovirus ORF
VI proteins. The motif binds to dsRNA and RNA–DNA hybrids
and its signature is a conserved sequence of 40 amino acids, with
seven of them aromatic residues. Flanking two lysines involved
in nucleic acid binding, two of the aromatic residues may
contribute to the strength of the interaction, while the other five
may play a structural role. The presence of a common domain in
these two very different families of proteins raises two interesting
questions: (i) how do these proteins recognize duplex RNAs;
(ii) what role does this domain play in these two protein families?
Although the RNase H/CaMV motif bears little resemblance to
the dsRBD, it is likely that the two have similar structural
features, in addition to their similar role in binding to dsRNA.
When the structures of two dsRBD were solved by NMR (23,28)
it became obvious that they are very similar to each other and to
the N-terminal domain of ribosomal protein S5 (29). In addition
to having a similar structure, the S5 family of proteins and the
dsRBD share a number of conserved residues (23), although the
overall sequences are very different. In the structure of the dsRBD
domain two conserved lysines are well positioned to bind the
substrate and they have also been implicated in binding by
mutational analysis (17,21–23). The RNase H/CaMV motif has
some residues in common with the dsRBD and the S5 domains
and two conserved lysines are also required for binding in
S.cerevisiae RNase H1 (16). It may be significant that the weakest
interactions with duplex RNAs (Table 1) are those of the
C.fasciculata chimeric proteins [ChRH(N-176) and
ChRH(∆80–162)]. The C.fasciculata motif is the only one among
eukaryotic RNases H1 that has one K residue (Fig. 2A). Two highly
conserved aromatic amino acids flank the two important lysines
(Fig. 2) in the RNase H/CaMV motif. These amino acids may
contribute to the strength of binding by intercalating into the
nucleic acid, perhaps inserting between the bases in the minor
groove and forming van der Waals and π–π interactions, as in the case
of TBP and its dsDNA target, which has an A-like structure (30).
The RNase H/CaMV motif is much shorter than the dsRBD
(40 amino acids versus 70). The additional 30 amino acids of the
dsRBD form an α-helix, which contributes to stability of the
domain through hydrophobic interactions. Similarly, the S5
domain is tightly associated with the other domain of the protein
by an extensive inter-domain hydrophobic core (29). In a similar
fashion, hydrophobic interactions may play a role in stability of
the RNase H/CaMV motif. A 50 amino acid peptide including the
40 conserved residues of S.cerevisiae RNase H1 is insufficient for
binding, most likely because it does not form a stable folded
structure in solution, as assessed by NMR analysis. While the
sequence adjacent to the motif is required for proper folding, it is
not conserved among the various RNases H or between the
RNases H and the caulimovirus ORF VI protein, although all
these sequences are rich in S, P and T residues. When the
40 amino acid peptide of the caulimovirus ORF VI protein is
substituted for the S.cerevisiae peptide the chimeric protein binds
almost as well as the unmodified S.cerevisiae protein (Table 1),
indicating proper folding of the CaMV motif in a very different
environment. These observations are consistent with the notion
that several of the highly conserved amino acids in the 40 amino
acid sequence stabilize this peptide by hydrophobic interactions
with the SPT-rich region.
The question of how these proteins recognize their duplex RNA
substrates is related to that of how many of these motifs are
necessary for nucleic acid binding. Saccharomyces cerevisiae
RNase H1, the only protein of its family with two copies (Fig. 1),
exhibits the strongest binding in the Northwestern assay (Table
1), suggesting that two copies of the motif bind to nucleic acids
better than one, although one is sufficient for binding. However,
the two copies are not equivalent. In the assays employed, R2
does not bind to nucleic acids in the absence of R1 (16), whereas in the
∆80–162(N-176) protein R1 alone does (Fig. 4). Most of the
dsRNA binding proteins have more than one copy of the dsRBD
(17,31,32) and for PKR, the human protein kinase activated by
dsRNA, the number and relative positions of the motifs affect the
binding properties of the protein (33,34).
The similarity between the RNase H and the CaMV motifs is
striking, considering that these two groups of proteins appear to
have such different functions. Proteins containing the 40 amino
acid conserved sequence either from S.cerevisiae or from
cauliflower mosaic virus bound to RNA–RNA duplexes and
RNA–DNA hybrids similarly well, suggesting a common mode
of interaction of the proteins with duplex RNAs. The question
arises whether this also indicates a common feature in the target
nucleic acid. Little is known about a target for RNase H1, but
because the ORF VI protein participates in translation of a
polycistronic mRNA to permit internal initiation (35,36), the
RNA recognized by the RNase H/CaMV motif could be either the
mRNA itself or a portion of rRNA (that might tether the ribosome
to the mRNA, thereby stimulating reinitiation). RNase H1 might
also recognize mRNA or rRNA but would, presumably, do this
in the nucleus or nucleolus. It is also possible that these proteins
are bifunctional and interact with RNA–RNA duplexes or with
RNA–DNA hybrids, depending upon other proteins or nucleic
acids. For example, the ORF VI protein could participate in
interactions with the RNA–DNA hybrids formed during reverse
transcription of RNA to form the DNA that ultimately is
packaged into virus particles (37). Insight into the role(s) of the
RNase H/CaMV motif will require identification of the target
nucleic acid for both types of proteins.
ACKNOWLEDGEMENTS
We thank Hirofumi Yamada for providing purified S.pombe RNase
H1 and Richard Maraia and Stephen White for critical reading of the
manuscript. This work was supported in part by the National
Institutes of Health Intramural AIDS Targeted Antiviral Program.
REFERENCES
1 Crouch,R.J. (1990) New Biol., 2, 771–777.
2 Kanaya,S. and Crouch,R.J. (1983) J. Biol. Chem., 258, 1276–1281.
3 Kanaya,S., Kohara,A., Miura,Y., Sekiguchi,A., Iwai,S., Inoue,H.,
Ohtsuka,E. and Ikehara,M. (1990) J. Biol. Chem., 265, 4615–4621.
4 Nakamura,H., Oda,Y., Iwai,S., Inoue,H., Ohtsuka,E., Kanaya,S.,
Kimura,S., Katsuda,C., Katayanagi,K., Morikawa,K., Miyashiro,H. and
Ikehara,M. (1991) Proc. Natl. Acad. Sci. USA, 88, 11535–11539.
5 Yang,W., Hendrickson,W.A., Crouch,R.J. and Satow,Y. (1990) Science,
249, 1398–1405.
6 Katayanagi,K., Miyagawa,M., Ishikawa,M., Matsushima,M., Kanaya,S.,
Ikehara,M., Matsuzaki,T. and Morikawa,K. (1990) Nature, 347, 306–309.
7 Katayanagi,K., Miyagawa,M., Matsushima,M., Ishikawa,M., Kanaya,S.,
Nakamura,H., Ikehara,M., Matsuzaki,T. and Morikawa,K. (1992) J. Mol.
Biol., 223, 1029–1052.
8 Itaya,M. (1990) Proc. Natl. Acad. Sci. USA, 87, 8587–8591.
9 Itaya,M., McKelvin,D., Chatterjie,S.K. and Crouch,R.J. (1991) Mol. Gen.
Genet., 227, 438–445.
10 Kanaya,S. and Itaya,M. (1992) J. Biol. Chem., 267, 10184–10192.
11 Dawes,S.S., Crouch,R.J., Morris,S.L. and Mizrahi,V. (1995) Gene, 165,
71–75.
12 Davies,J.F., Hostomska,Z., Hostomsky,Z., Jordan,S.R. and Matthews,D.
(1991) Science, 252, 88–95.
13 Xiong,Y. and Eickbush,T.H. (1990) EMBO J., 9, 3353–3362.
14 Campbell,A.G. and Ray,D.S. (1993) Proc. Natl. Acad. Sci. USA, 90,
9350–9354.
15 Mushegian,A.M., Edskes,H.K. and Koonin,E.V. (1994) Nucleic Acids Res.,
22, 4163–4166.
16 Cerritelli,S.M. and Crouch,R.J. (1995) RNA, 1, 246–259.
17 Green,S.R. and Mathews,M.B. (1992) Genes Dev., 6, 2478–2490.
18 St Johnston,D., Brown,N.H., Gall,J.G. and Jantsch,M. (1992) Proc. Natl.
Acad. Sci. USA, 89, 10979–10983.
19 Gatignol,A., Buckler,C. and Jeang,K.T. (1993) Mol. Cell. Biol., 13,
2193–2202.
20 Gibson,T.J. and Thompson,J.D. (1994) Nucleic Acids Res., 22, 2552–2556.
21 Romano,P.R., Green,S.R., Barber,G.N., Mathews,M.B. and Hinnebusch,A.G.
(1995) Mol. Cell. Biol., 15, 365–378.
22 McMillan,N.A.J., Carpick,B.W., Hollis,B., Toone,W.M.,
Zamanian-Daryoush,M. and Williams,B.R.G. (1995) J. Biol. Chem., 270,
2601–2606.
23 Bycroft,M., Grünert,S., Murzin,A.G., Proctor,M. and St Johnston,D.
(1995) EMBO J., 14, 3563–3571.
24 De Tapia,M., Himmelbach,A. and Hohn,T. (1993) EMBO J., 12, 3305–3314.
25 Studier,F.W., Rosenberg,A.H., Dunn,J.J. and Dubendorff,J.W. (1990)
Methods Enzymol., 185, 60–89.
26 Cerritelli,S.M., Shin,D.Y., Chen,H.C., Gonzales,M. and Crouch,R.J. (1993)
Biochimie, 75, 107–111.
27 Milligan,J.F. and Uhlenbeck,O.C. (1989) Methods Enzymol., 180, 51–62.
28 Kharrat,A., Macias,M.J., Gibson,T.J., Nilges,M. and Pastore,A. (1995)
EMBO J., 14, 3572–3584.
29 Ramakrishnan,V. and White,S.W. (1992) Nature, 358, 768–771.
30 Kim,J.L. and Burley,S.K. (1994) Nature Struct. Biol., 1, 638–653.
31 Melcher,T., Maas,S., Herb,A., Sprengel,R., Seeburg,P.H. and Higuchi,M.
(1996) Nature, 379, 460–464.
32 O’Connell,M.A., Krause,S., Higuchi,M., Hsuan,J.J., Totty,N.F., Jenny,A.
and Keller,W. (1995) Mol. Cell. Biol., 15, 1389–1397.
33 Green,S.R., Manche,L. and Mathews,M.B. (1995) Mol. Cell. Biol., 15,
358–364.
34 Schmedt,C., Green,S.R., Manche,L., Taylor,D.R., Ma,Y. and Mathews,M.B.
(1995) J. Mol. Biol., 249, 29–44.
35 Bonneville,J.M., Sanfacon,H., Futterer,J. and Hohn,T. (1989) Cell, 59,
1135–1143.
36 Scholthof,H.B., Gowda,S., Wu,F.C. and Shepherd,R.J. (1992) J. Virol., 65,
5190–5195.
37 Pfeiffer,P. and Hohn,T. (1983) Cell, 33, 781–789.
38 Hasegawa,A., Verver,J., Shimada,A., Saito,M., Goldbach,R., VanKammen,A.,
Miki,K., Kameya-Iwaki,M. and Hibi,T. (1989) Nucleic Acids Res., 17,
9993–10013.