Journal of General Virology (1991), 72, 1505-1514. Printedin Great Britain 1505 Nucleotide sequence of tomato ringspot virus RNA-2 M . E. Rott, 1 J. H . Tremaine 2 and D. M . Rochon 2. 1Department of Plant Science, University of British Columbia, Vancouver, British Columbia V6H 2A2 and 2Agriculture Canada, Research Station, 6660 N.W. Marine Drive, Vancouver, British Columbia V6T 1X2, Canada The sequence of tomato ringspot virus (TomRSV) RNA-2 has been determined. It is 7273 nucleotides in length excluding the 3' poly(A) tail and contains a single long open reading frame (ORF) of 5646 nucleotides in the positive sense beginning at position 78 and terminating at position 5723. A second in-frame A U G at position 441 is in a more favourable context for initiation of translation and may act as a site for initiation of translation. The TomRSV RNA-2 3' noncoding region is 1550 nucleotides in length. The coat protein is located in the C-terminal region of the large polypeptide and shows significant but limited amino acid sequence similarity to the putative coat proteins of the nepoviruses tomato black ring (TBRV), Hungarian grapevine chrome mosaic (GCMV) and grapevine fanleaf (GFLV). Comparisons of the coding and non-coding regions of TomRSV RNA-2 and the R N A components of TBRV, GCMV, GFLV and the comovirus cowpea mosaic virus revealed significant similarity for over 300 amino acids between the coding region immediately to the N-terminal side of the putative coat proteins of T o m R S V and GFLV; very little similarity could be detected among the noncoding regions of TomRSV and any of these viruses. Introduction part of the picornavirus-like supergroup which includes the comoviruses, potyviruses and piconorvaruses (Goldbach, 1987). Some common features within this group include genomic structure and organization, as well as regions of nucleotide and amino acid sequence similarity. Martelli (1975) suggested that members of the nepovirus group could be divided into three subgroups based on the Mr of their RNA-2 components. Subgroup I would include grapevine fanleaf (GFLV), raspberry ringspot, tobacco ringspot and arabis mosaic virus with RNA-2 components of Mr 1"4 × 106 to 1"5 × 106, subgroup II would include TBRV, Hungarian grapevine chrome mosaic (GCMV) and artichoke Italian latent virus with RNA-2 components of Mr 1"5 × 106 to 1"6 × 106; and subgroup III would include TomRSV, peach rosette mosaic, cherry leaf roll and myrobalan latent ringspot virus with RNA-2 components of Mr > 1"6 x 106. We report here the nucleotide sequence of TomRSV RNA-2. This is the first sequence of a nepovirus from the subgroup having a large RNA-2 component. Comparisons between this sequence and other nepovirus sequences are presented also. Tomato ringspot virus (TomRSV), a member of the nepovirus group (Harrison & Murant, 1977), is found mainly in North America around the Great Lakes and along the Pacific coast where populations of its nematode vectors Xiphinema sp. occur. The most serious disease problems caused by TomRSV occur in Prunus sp. such as peach and cherry, but TomRSV also causes diseases in apple, raspberry and grapevine. Chronically infected plants do not show obvious symptoms but are characterized by a general decline in productivity (StaceSmith, 1984). Like other nepoviruses, TomRSV consists of 28 nm isometric particles composed of 60 copies of a single coat protein species encapsidating each of the two components of the bipartite, single-stranded, positive-sense R N A genome (Harrison & Murant, 1977; Schneider et al., 1974). The 5' terminus of each viral R N A component is covalently attached to a small protein (VPg) (Mayo et al., 1982) and their 3' termini are polyadenylated (Mayo et al., 1979). Mature TomRSV proteins are probably released from two large polyprotein precursors corresponding to RNA-I and -2 by proteolytic processing as shown for the nepovirus tomato black ring (TBRV) RNA-1 (Demangeat et al., 1990). It has been suggested that nepoviruses be included as Methods cDNA clones. Viral RNA used for preparation of cDNA was obtained from a raspberry isolate of TomRSV. Two overlapping 0001-0085 © 1991 SGM Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 1506 M . E. Rott, J. H. Tremaine and D. M . Rochon eDNA clones derived from TomRSV RNA-2 were used to determine most of the nucleotide sequence of RNA-2; clone K6 which has been described previously (Rott et al., 1988), and clone 035. Generation of 035 was essentially as described by D'Alessio et al. (1987), except that first-strand eDNA synthesis was primed from virion RNA using the phosphorylated synthetic oligonucleotide, oligonucleotide no. 2 (5" TTCTGGTTCCTCTTCC 3'), which was complementary to TomRSV RNA-2at nucleotide positions 2201 to 2216. The amount of first-strand product was estimated using agarose gel electrophoresis. Solution containing first-strand eDNA was adjusted to 0.2 M-sodium acetate pH 5-8, and eDNA was precipitated with 2.5 volumes ethanol, and then washed with 70~ ethanol before continuing with secondstrand synthesis (D'Alessio et aL, 1987). Synthesized double-stranded eDNA was ligated into the EcoRV site of Biuescript (Stratagene) and used to transform competent Escherichia coli DH5ct cells (BRL). Nortbern-blotted TomRSV RNA (Vrati et al., 1987) was used to confirm that clone 035 originated from TomRSV RNA-2. Clone 035 (approximately 2.2 kb) was analysed further by restriction enzyme mapping and dideoxynueleotide sequencing (see below). Sequencing and sequence analysis. Subclones used to sequence clones K6 and 035 were generated by restriction enzyme digestion and/or exonuclease III-generated nested deletions (Henikoff, 1984). The dideoxynucleotide chain termination method of Sanger et al. (1977) was used to sequence double-stranded plasmid DNA templates using modified T7 DNA polymerase (Sequenase) as described by Toneguzzo et al. (1988). All portions of RNA-2 were sequenced completely in both directions except for the 5' extreme 28 nucleotides which were not present in any cDNA clone analysed. These 28 nucleotides were sequenced in one direction only using viral RNA as a template (see below). On average each nucleotide was sequenced over three times. Two regions of TomRSV RNA-2 were sequenced using viral RNA as template with synthetic oligonucleotide primers no. 1 (5' GCCTTCGATGGAACC 3', complementary to positions 115 to 130) or no. 2 and Moloney murine leukaemia virus reverse transcriptase in the presence of dideoxynucleotides (Ahlquist et al., 1981). Sequence data were assembled and analysed using the Gene-Master (Bio-Rad) sequence analysis software package on a Compaq 386 computer. Further analysis was performed using programs and databases available through the Canadian National Research Council's Scientific Numeric Database Service on a DEC VAX. We would like to thank Bill Ronald and Jim Purves of Agriculture Canada for writing a program used to test the goodness of fit between the amino acid composition of the TomRSV coat protein (Tremainc & StaceSmith, 1968) and the calculated amino acid composition of segments of the deduced polyprotein sequence encoded by RNA-2 (see Meyer et aL, 1986). The raspberry isolate of TomRSV used by Tremaine & StaceSmith for the coat protein amino acid composition analysis and the raspberry isolate we have sequenced were both obtained from the British Columbia Frazer Valley growing area and are therefore probably very similar. The coat protein multiple sequence alignment was made by a progressive alignment procedure which uses the Needleman-Wunsch algorithm (Needleman & Wunsch, 1970) iteratively to generate a multiple sequence alignment (Feng & Doolittle, 1987). The predicted amino acid sequences were compared using the TFASTA program (Pearson & Lipman, 1988) with sequences in the protein sequence librariesof the NBRF, Swiss-Prot and Pseqlp, and the nucleotide sequence libraries of the NBRF, GenBank and EMBL. The 5' region between two potential AUG initiation codons was analysed for a possible coding function using the program TESTSCORE (Fickett, 1982) and a window size of 67. Scores within the top 77 to 100~ predict coding regions. This was followed with the absolute positional base preference method (Staden, 1984), using a window size of 50, to determine the coding frame. Results and Discussion Sequencing and primary structure o f R N A - 2 T w o o v e r l a p p i n g c D N A clones, d e s i g n a t e d K 6 a n d 0 3 5 , were used to d e t e r m i n e m o s t o f the n u c l e o t i d e s e q u e n c e o f T o m R S V R N A - 2 . C l o n e K 6 w h i c h has b e e n p r e v i o u s l y d e s c r i b e d ( R o t t et al., 1988), c o r r e s p o n d s to 5498 n u c l e o t i d e s at the 3' e n d o f R N A - 2 (Fig. 5 a). C l o n e 0 3 5 , w h i c h c o r r e s p o n d s to the 5 ' - t e r m i n a l r e g i o n o f T o m R S V R N A - 2 , was d e r i v e d using the s y n t h e t i c o l i g o n u c l e o t i d e p r i m e r o l i g o n u c l e o t i d e no. 2, w h i c h was c o m p l e m e n t a r y to R N A - 2 at a r e g i o n a p p r o x i m a t e l y 400 n u c l e o t i d e s f r o m the 5' e n d o f clone K 6 (Fig. 5a). T w o o l i g o n u c l e o t i d e s (no. 1, c o m p l e m e n t a r y to T o m R S V R N A - 2 n u c l e o t i d e s 115 to 130 a n d no. 2, c o m p l e m e n t a r y to n u c l e o t i d e s 2201 to 2216) were used to c o n f i r m a n d / o r extend the sequence obtained by sequencing cDNA clones. T h e s e q u e n c i n g r e a c t i o n s using o l i g o n u c l e o t i d e no. 1 y i e l d e d 28 n u c l e o t i d e s at the 5" t e r m i n u s o f R N A - 2 n o t p r e s e n t in 0 3 5 a n d resulted in t w o s t r o n g stop points. T h e s e stop p o i n t s c o r r e s p o n d to n u c l e o t i d e s 1 a n d 2 o f T o m R S V R N A - 2 a n d a r e e a c h d e n o t e d N in F i g . 1. O l i g o n u c l e o t i d e no. 2 was used to s e q u e n c e p a r t o f the v i r a l R N A to c o n f i r m the p r e s e n c e o f t h r e e t a n d e m r e p e a t s d e d u c e d f r o m the n u c l e o t i d e s e q u e n c e w i t h i n the o v e r l a p p i n g p o r t i o n s o f clones K 6 a n d 0 3 5 (see below). T o m R S V R N A - 2 d e t e r m i n e d as d e s c r i b e d a b o v e is 7273 n u c l e o t i d e s in l e n g t h e x c l u d i n g the 3' p o l y ( A ) tail (Fig. 1). T h e c a l c u l a t e d Mr o f R N A - 2 is 2.35 x 106 [excluding the 3' p o l y ( A ) tail a n d 5' VPg], w h i c h is s i m i l a r to the 2.4 x 106 M r value d e t e r m i n e d b y d e n a t u r i n g gel e l e c t r o p h o r e s i s ( M u r a n t et al., 1981). T h e base c o m p o s i t i o n o f R N A - 2 is 22.5~o A , 23"3~o C, 24-9~o G a n d 29.3Y/o U. Open reading f r a m e s A single long o p e n r e a d i n g f r a m e ( O R F ) c o n s i s t i n g o f 5646 n u c l e o t i d e s is p r e s e n t in t h e v i r i o n sense o r i e n t a t i o n o f R N A - 2 . T h i s O R F , w h i c h a c c o u n t s for 7 8 ~ o f t h e R N A - 2 sequence, b e g i n s at the first A U G at p o s i t i o n 78 ( A U G 78) a n d t e r m i n a t e s at a U A A stop c o d o n at p o s i t i o n 5723. T h e p r e d i c t e d t r a n s l a t i o n p r o d u c t w o u l d h a v e a n Mr o f 207K. A U G 78 ( U U U G A U G U C ) d o e s n o t h a v e e i t h e r the o p t i m a l K o z a k ( C G / A C C A U G G ) o r Liitcke ( A A C A A U G G C ) c o n t e x t for t r a n s l a t i o n initiat i o n in a n i m a l s o r plants, r e s p e c t i v e l y ( K o z a k , 1986; Liitcke et al., 1987). T h e n e x t i n - f r a m e A U G , w h i c h is the fourth A U G f r o m t h e 5' t e r m i n u s ( U G C A A U G G A ) occurs at p o s i t i o n 441 a n d is in a f a v o u r a b l e K o z a k c o n t e x t for the i n i t i a t i o n o f t r a n s l a t i o n w i t h a G in the - 3 a n d + 4 positions. T h i s raises t h e p o s s i b i l i t y t h a t A U G 441 m a y be a n i n i t i a t i o n site for t r a n s l a t i o n . T h e Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 T o m R S V R N A - 2 sequence predicted translation product beginning at A U G 441 would have an Mr of 194K. However, there is further evidence that A U G 78 may act as an initiation site for translation. The 5' 800 nucleotides of TomRSV RNA-2 were analysed using the program TESTSCORE (Fickett, 1982). The region beginning from A U G 78 gave a score of between 50 and 77~o for the first 100 nucleotides, and then rose to 100~o through to the end of the sequence analysed. Scores between 77 and 100~ indicate coding regions (Fickett, 1982). Further analysis using the absolute positional base preference method (Staden, 1984) indicates that there is only one coding frame in this region beginning at approximately nucleotide 180 to the end of the sequence analysed (the first 800 nucleotides). A U G 78 is the only potential in-frame initiation site upstream of the coding region indicated by TEST SCORE and the absolute positional base preference method. Translation initiation of TomRSV RNA-2 therefore, may be similar to that described for the comovirus cowpea mosaic virus (CPMV) M RNA. The latter has at least two in-frame sites for translation initiation, one at position 161 and the other at position 512 and/or 524 (van Wezenbeek et al., 1983). Both sites appear to be functional (Holness et al., 1989) and preliminary results suggest both leaky scanning and internal initiation as mechanisms for initiation at the 512 site (Wellink et al., 1990). Initiation of translation at internal A U G codons has been established to occur in both poliovirus (PeUetier & Sonenberg, 1988) and encephalomyocarditis virus (Kaminski et al., 1990) by a mechanism of internal ribosome binding. For TomRSV RNA-2, it is not known whether A U G 78 and/or A U G 441 act as the site for initiation of translation and further work is required to address this question. All other potential ORFs in the positive-strand were less than 355 nucleotides in length. Two ORFs in the negative-strand orientation are present which are 603 and 582 nucleotides in length (positive-strand nucleotide positions 956 to 353 and 3103 to 2521, respectively). A search of the NBRF, Swiss-Prot, Pseqlp, GenBank and EMBL databases failed to detect sequences with significant amino acid sequence similarity to either of these two ORFs. Non-coding regions The 5' non-coding region of TomRSV RNA-2 was analysed and compared to the corresponding regions of the two R N A components of TBRV (Meyer et al., 1986; Greif et al., 1988) and GCMV (Brault et al., 1989; Le Gall et al., 1989), RNA-2 of GFLV (Serghini et al., 1990) and the B and M components of the comovirus CPMV (Lomonossoff & Shanks, 1983; van Wezenbeek et al., 1983). It has been reported previously that TBRV, 1507 GCMV, GFLV and CPMV share a conserved U G A A A A A U sequence downstream from the 5' terminus (Serghini et al., 1990). TomRSV RNA-2 has a similar sequence (CGAAAAAU). This short octanucleotide was the longest region of sequence identity that could be detected between the 77 nucleotide 5' non-coding region of TomRSV RNA-2 and the 114 to 287 nucleotide 5' noncoding regions of any of these other viruses. The base composition of the 5' TomRSV RNA-2 non-coding region is similar to those of TBRV, GCMV, GFLV and CPMV in having a high U content (44.2%) and a low G + C content (35.1%). The 3' non-coding region of TomRSV RNA-2 is 1550 nucleotides excluding the poly(A) tail. This is much longer than the sizes of the 3' non-coding regions of TBRV, GCMV, GFLV and CPMV, as well as those of most other positive polarity R N A viruses. The 3' noncoding regions for TBRV, GCMV, GFLV and CPMV are less than 305 nucleotides. Comparison of the 3' noncoding region of TomRSV RNA-2 with those of the RNA-2 components of GFLV, TBRV and GCMV, the M R N A from CPMV, and poliovirus R N A did not reveal any significant sequence similarity except for the sequence AAAAGC found immediately preceding the poly(A) tail in RNA-2 of TomRSV, TBRV and GCMV. The conserved blocks of sequence among the 3' noncoding regions of TBRV, GCMV, GFLV and CPMV previously reported by Serghini et al. (1990) could not be detected in the 3' non-coding region of TomRSV RNA-2. The 3' non-coding regions of TBRV, GCMV, GFLV and CPMV have a high U content (40.5 to 48.0~o) which is similar to that of their 5' non-coding regions. The entire 3' non-coding region of TomRSV RNA-2 has a lower U content of 31.2%. However, its extreme 3' end (approximately 110 nucleotides) has a U content of 44-2~o which is similar to the high U content of the 3' non-coding regions of the other viruses. The possible significance of the very long 3' noncoding region on TomRSV RNA-2 is unknown. In a previous paper (Rott et al., 1988) we reported that the 3'terminal regions of TomRSV RNA-I and RNA-2 share an extended region of nucleotide sequence homology. Nucleotide sequence analysis of part of RNA-1 has confirmed that RNAs 1 and 2 share 3'-terminal sequence identity over a 1533 nucleotide region (unpublished results). The possible significance of the 3'-terminal identity of RNAs 1 and 2 in viral R N A replication will be discussed in that communication. Analysis o f the RNA-2-coding region (i) Location of the putative coat protein-coding region Tremaine & Stace-Smith (1968) reported the coat protein amino acid composition for a raspberry isolate of Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 1508 M. E. Rott, J. H. Tremaine and D. M. Rochon MS I NNAGCGAAAAAUCUGGUGAUADU~AA~UU~UCUCAAUUCACACUU~CAW~GUGUCGUUUUGUUUU~U~u-U~W~UUGAUGU~CUC P 15 $ K A A Y ¥ R A I S D R E L D R E G R F P C G C L A O Y T V Q A P P P A K T Q E 121 C A U C G A A G C ~ C U G C U U A C U A U C G G GCUALVGUCCGAUAGG G A G C U G G A C C G C G A G G G U C G C U U C CCUUGCC4GGUGUC U A G C A C A G U A U A C U G U G C A A G C C C C C C C U C C U G C C A A G A C A C A G G 55 K A V G R $ A D L Q K G N V A P 241 A G A A A G C C G U A G G C A G G U C C G C U G A C C U C C A A A A G G G U A A U G U U G C U C C C C I C F A CAU~GUUOCGCC G G N H A R GGUGGCAACCACGCC~ L K K Q R C D V V V A V S G P P P L E L V Y P UUAAGAAGCAACGCUGCGAUGUUGUGGUCGCAGUCUCUGGACCUCCUCCUUUGGAGUUGGUCUACCCUG A 95 R V G Q H R L D Q P S K G P L A V P S A K Q T $ T A M E V V L S A E E A A I T 36] C C C G G G U A G G G C A A C A U A G G L ~ J G G A C C A A C C [ ~ J C A A A A G G U C C C t r J G G C A G U C C C CUCUGCCAAGCAAACCUCCACUGCAAUGGAGGUUGL"JCDI~3CUGCUGAGGAGGCGGCUAUCACCG A 135 P W L L R P C K G E A P P P P P L T Q R Q Q F k A L K K R L A %; K G Q Q I I R E 481 C C C C C UG GCUUCtKJCGCC C C U G C A A G G G C G A A G C C C C C C C C C C C C C C C C CCtr~ACAC A G A G G C A G C AALr~UGCUGCC C U A A A G A A G A G G C U G G C C G U C ~ G G G C C AG C A ~ U C A L r ~ C G C G 175 H I R A R K A 601 A G C AC A U U C G U G C U C G C A A G G C G A K Y A A I A K A K K A A A L A A V K A A Q E ~% P R L A A Q K A A GC C A A A U A U G C C G C C A U C G C C A A A G C C A A A A A G G C U G C G GC UC U U G C U G C C G U U A A G G C A G C G C A G G A G G C UC C U C G C C U C G C G G C C C A A A A G GC CG 215 I S K I L R D R D V A A L P P P P P P $ A A R L A A E A E L A S K A E S L R a L 721 C C A U C A G C A A G A U C C U U C GG G A U C G A G A U G U U G CUG C U C U C C CC C C U C C CC C C C C U C C U U C U G C U G C C A G A U U G G C A G C C G A G G C C G A A U U G G C C U C A A A G G C C G A G U C U C U C C G G A G G C 255 K A F K T F S R V R P A L N T S F P P P P P P P P A R S $ E L L A A F E A A M N 84] U C A A A G C C U U U A A A A C W O W J A G C A G G G U A C G C C C U G C U W J A A A C A C U U C U U U U C C U C C C C C U C C C C C A C C C C C U C C G GC UC G G U C U U C C G A G C U D U U G G C A G C U U U U G A G G C D G C C A U G A 295 R S O P V Q G G F 961 A C A G G U C U C A G C C U G U U C A A G G G G G C U U C 335 S L P T R K G V ¥ V A P T V Q G V V R A G L R A Q K G F L N A UC C C U C C C U A C C C GC A A G G G U G U U U A U G U C G C U C C C A C C G U U C A G G G U G U G G U G C G C G C U G G G C U U C G U G C C C A G A A G G G C U U U U U G A A U G V S T G I V A G A R I L K S 1081 C C G UC U C C A C C G G C A U U G U G G C U G G A G C U C G C A U L K J U A A A G A G C G C A G P V V 1201 U A G G U U G U G C U G G G CC U G ~ G U G L T C S T F T L S K S Q N W F R R $ M G I A H D Y V E G C M A S T V L A A A A G C C AAAAUOGGtrtTUAG G A G G A G C A U G G G C A U U G C C C A U G A t K / A U G U U G A A G G A U G C A U G G C CAGCACUGUD-O 375 Q R Q E A C S V V A A P P I V Z P V L W V P P L S E Y A N D F P K C A A C G A C A G G A A G C U U G C A G C G U U G U U G C U G C A C C U C C U A U A G U G G A G C CCGUOtrOGUGGGUUCC CC CA'JqJGAGCGAGUACGCUAACGAUUUO C C U A 415 E 455 W Q R P R K Q S I A I $ N L F R K L I D R A L L V S G V S L I A 132l AGCUUAcUUG~U~UAcU°UUACUGAAUGGCAAAGAcCGCGC~AGCAAUCUADUGCCAI.R~UCU/~CCU~JDUCCGcA/~GC~C ADUG/~UCGDGC~GC~GUGAGUGG~GDUAG~DGADUG S V L L F E I A 1441 C G A G C GUC C U C C U A U U U G A G A D U G C E N F A V R O A V C P V E M P S C A T S V 5 E K S L V S L D E G G G ~ % A A D U D U G CC G U G C G A C A G G C G G U U U G CC C A G U G G A A A U G C C G U C U U G C GC A A C C UC U G U D U C G G A G A A A U C C L~JAGUC UC A U U G G A U G A A G 495 N F Y L R K Y L S P P P Y p F G R E S F Y F Q A R P R F I G P M P S M V R A V P 1561 GGAAUDIKJUAUC UC C G U A A A U A U t ~ J A U C A C C A C CUC CC U A U C CUO-OUGG U A G G GAAAGUtW3C UALK/UUCAAGCUAGGCC C C G U U U U A U U G G G C C U A U G C C U U C U A U G G U U A G G G C U G U A C 535 Q I V Q Q P T M T E E L [68] C AC A A A U U G U A C A A C A G C C U A C C A U G A C G G A G G A A C U C 575 E F E V P S S W S S P L P L F A N F K V N R G A C F E Q GAAUUUGAAGUUC CUUC CUCAUGGUC UUCUCCUUUGC CUCUGUUC GCGAAUUUUAAAGUGAAUAGGG GCGCAUGUUUUUUGC F 615 M D L L $ L F E AUGG AUUUGCUUUCUCUUUUUGAG D Q L P E G P GAUC AAUUGC CGGAAGGGC 655 Q V V L P Q R V V L P D E C M D L L S L F E D Q L P E [801 A A G U C C U G C C U C A A A G G G U U G U U U U A C C C G A U G A A U G C & U G G A U U U G C U U U C U C U U U U U G A G G A U C A A U U G C C G G A A G G G A N F K V N R G A C F L Q V L P Q R V V L P D" E C 1921 U C G C G A A U U U U A A A G U G A A U A G G G G UG C A U G U U U U U U G C A A G UC C U G C C U C A A A G G G U U G U U U U A C C U G A U G A A U G C L P S F S W S S P L P L F A S F K V N R G A C F L P L P S F $ W $ $ P L P L GCCUUUGCCUUCCUUUAGUUGGUCUUCACCUUUACCUCUAU V L P A R K V S D E F M D 695 D D 735 D A 775 D C T C L D G N M G E W E W R E S V D A M W R C P G R L L N T G A U U G U A C C U G C C UC G A U G G C A A U A U G G GC GAG U G G G A A U G GC G C G A A A G C G U C G A C G C U A U G U G G C G U U G C C C A G G G C G C U U G C U C A A U A 855 2041 CUUUACCUUCCUUUAGUUGGUCUUcACCUUUACcUCUAUUCGCGAGUUUUAAAGUGAAUAGGGGUGCAuGcuUUUUGCAGGUUUUGCCUGCGCGCAAGGUCGUcUCUGAUGAGUuCAUGG 2161 ACGUCUUGcCCUUUUUGUUCUCUC~ACUGGUAUCGCACcAGGAAGAGC~AACcAc~AAAUGGUUCCUGCUGUGUUGGAAGCAGCAGAUUCAGUUGGUGAUAUCAcCGAAGCCUUCUUUGAUG V L P F L F S P L V S H Q E E E P E M V P A V L E A A D S V G D I T E A F L E C E S F ¥ D S Y $ D E E E A E W A E V P R C K T M S E L C A S L T L A 2281 A C U U G G A A U G U G A G U C U U U C U A U G A C U C A U A U U C U G A U G A A G A A G A A G C U G A G U G G G C U G A A G U G C C A A G G U G U A A G A C U A U G U C U G A A C U U U G C G C U U C U C U U A C U C U U G C C G G U G A C G R P K K F E G H I 252l A U C G U C C C A A A A A A U U U G A G G G A C A U A U C F G K R T F T R D D W E R V Q Y L R I G F N E G R Y R R N W R V L N L E E M D L S L 2641 C A A A G CGC AC A U U C A C U C G C G A U G A U U G G G A G C G U G U G C A A O A U U U A C G C A U A G ° C U U C A A O G A G G G U A G A O A C C GC C G A A A U U G G C G C G U U U U A A A C C U U G A G G A G A U G G A U C U C u C U U 895i H E Y P E I S S A P V Q S S L F S R V V D R G A T L A S S I P F V 2761 • • C A • G A A U A U C C A G A G A U U U C G U C U • C C C C A • U A C A G U C C U C U C U U U U U U C G A G • • U U • U C • A U A • • G • A G C C A C C U U G • C A A G U A G U A U C C C C U U U • U U A C • C S N C Q S GCUCUAACUGCCAGU 935 S L G T P ° L N V H T I H Q E A P T T L R A P P F T G A R N V M G S S D A G A N 2881 C U U C U C U A G G A A C U C C U G G U U U A A A U G U A C A U A C U A U A C A C C A G G A A G C CC C U A C U A C A U U G C G A G C U C C A C C U U U U A C AG G A G C G C G C A A U G U A A U G G G A U C C U C U G A U G C G G ° U G C A A 975 A A P Y R S E A R K R W L S R K Q E D S O E D N I K R Y A D K H G I S F E E A R 3001 A U G C U G C U C C G U A C C G C U C A G A A G C G C G C A A G C G C U G G C U G A G U C G U~u~ACAAGAGGAUUCC C A A G ~ A G A U A A C A t r O A A G A G G U A U G C C G AUAAGCACGGCAUtrGCCD-OUGAAGAG G C U A I015 A V Y K A P K E G V P T Q R S I L P D V R D A Y $ A R S A G A R V R S L F G G S 3121 G A G C U G U C U A C A A G G C C C C A A A G G A A G G A G U G C C U A C C C A G C GC U C C A U U C U G C C U G A U G U U A G A G A U G C U U A U U C U G C C C G U U C UGC U G G C G C U C G G G U U C G G U C C C U C U U C G G A G G AU 1055 T R P T T R A O R T E D F V L T S P S A G D A $ S F S F Y F N P V S E Q E M A 3241 C C C C U A C C A C G C G C G C A C A G A G G A C • G A A G A U U U U • U G U U A A C G A • C C C G U C U G C G G • • • A • • C A A • C U C G U U U A G U U U U U A U U U • A A U C C C • • U U C U G A G C A A • A • A U G G C U • A G C A A • Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 E Q E I095 TomRSV RNA-2 sequence 1509 R G G N T M L S L D A V E V V I D P V G M P G D D T D L T V M V L W C Q N S D D 1 1 3 5 3361 AGCGUGGU~U~UACUA~CUG~GAUGC~GR~UCGUUAUCGACC~AG~GG~ADG~cUGGUGAUGACACUGAU~GACUG~AUGGU~GUGGUGUC~CAGAUG O R A L I G A M S T F V G N G L A R A V F Y P G L K L L Y A N C R V R D G R V L I I 7 5 ~AUCAGCGCGCU~UGAU~GGGGCCAUGUCUAC~GUGGGC~AUGG~CU~CCAGAGC~G~UAUC~CGGGC~A~A~AUAUG~C~GUAGAGUGCGAGAUGGCCGAGU~ K V I V S S T N S T L T H G L P Q A Q V S I G T L R O H L G P G H D R T I S G A I 2 1 5 3~1 U~AGGUCAUUGUGAGCA~A~A~CAACGCUCACGCAUGGU~GCCCCAGGCUC~GUCUCCA~GGGAC~UG~GCCAGCAUUUGGGGCCAGGUCAUGAUCGCACUAUCUCUGGUG L Y A S Q Q Q G F N I R A T E Q G G A V T F A P O G G H V E G I P S A N V Q M G I 2 5 5 3721 CCCUGUAUGC~CCC~C~CAGGG~UC~UAUACGCGCCACGGAACAAGGUGGUGCUGUAACAU~GCCCCCC~GGGGGCCAUGUUGAGGGUAUCCCCAGCGCC~UGUACAGAUGG A G E H L I Q A G P M Q W R L Q R S Q S S R F V V S G H S R T R G S S L F T G S I ~ 5 3841 GCGCUGGGGAGCAU~GCGGGUCCCAUGCAGUGGCGC~GCAGAGGUCGC~UCUUCUCGA~UGUGGUCU~UGGUCA~CGCGAACGCGUGGAAGC~CU~GU~ACUGG~ V D R T O Q G T G A F E D P G F L P P R N S S V Q G G S W Q E G T E A A Y L G K 1 3 3 5 3961 GUGUCGAUAGGACGCAGCAGGGAACGGGGGC~G~GACCCGGGUUU~UACCACCCAGG~CUUCUG~CAGGGCGGGUCCUGGC~GAAGGUACUGAAGCCGC~A~UAGGCA V T C A K D A K G G T L L H T L D I I K E C K S Q N L L R Y K E W Q R Q G F L H I ~ 5 ~8~AAG~ACCUGUGCG~GGACGCC~GGGUGG~C~UA~GCACACUUUGGAUAUUAUAAAAGAGUGC~AUCCC~UA~GGUAU~AG~UGGC~CGUCAAGGCU~CUUC G K L R L R C F I P T N I F C G H S M M C S L D A F G R Y D S N V L G A S F P V 1 4 1 5 4~I AUGGA~GCUUAGA~GCGCUGC~UAUACCCACUAACAU~UGUGGGCAUUCCAUGAUGUG~CC~GGACGCG~UGGUCG~AUGAUUCGAACGUGCUAGGUGCUAG~CCAG K L A S L L P T E V I S L A D G P V V T W T F D I G R L C G H G L Y Y S E G A Y I 4 5 5 4321UGAAG~GGC~GU~AUUGCCAACGGAGGUGA~AGUCUGGCUGAUGGGCCCGUGGUCACGUGGACG~UGAUA~GGGCGUCUGUGUGGUCAUGGUCUCUA~AUUCCGAGGGCGC~ ~l A R P K I Y F L V L S D N D V P A E A D W Q F T Y Q L L F E D H T F S N S F G A I 4 9 5 AUGCGAGGCCCA~AU~AUUUU~AG~CU~CCGAU~UGAUGUUCCUGCAGAAGCAGAUUGGC~UACCUAUCAGC~G~CGAGGAUCAUACAUUUUCG~CCU~GGGG V P F I T L P H I F N R L D I G Y W R G P T E I D L T S T P A P N A Y R L L F G I 5 3 5 4561CGG~CCU~UA~ACCUUACCCCAUA~UAAUAGA~AGAUAUAGG~AUUGGCGCGGGCC~CAGAGAUAGAU~CAUC~CUCCCGCACCA~CGCCUAUCG~UAC~CG L S T V I S G N M S T L N A N Q A L L R F F Q G S N G T L H G R I K K I G T A L I 5 7 5 ~81GCUUGUCCACUGUCAUUAGUGGU~CAUGUCGAC~UG~UGCC~UCAAGCC~AUUGCGU~UCCAGGG~CG~UGGCACUUUACAUGGGCGCA~AGAUAGGGACAGCAC T T C S L L L S L R H K D A S L T L E T A Y Q R P H Y I L A D G Q G A F S L P I I 6 1 5 4801UUA~AACCUG~CCCU~UA~AUCGUUGCGCCACA~GAUGCGAGUCUCACA~GGAGACCGCAUAUCA~GGCCCCA~ACAUUUUGGCUGACGGAC~GGGGCU~CAUUACC~ S T P H A A T S F L E D M L R L E I F A I A G P F S P K D N K A K Y Q F M C Y F 1 6 5 5 ~21U~CUACCCCCCAUGCAGCAACCUCCU~AGAGGACAUGUUGCGCCUGGAGAU~GCUAUUGCUGGGCC~AGUCCC~AGAUAAU~AGCAAAAUACC~CAUGUGUUA~ D H I E L V E G V P R T I A G E Q Q F N W C S F R N F K I D D W K F E W P A R L I 6 9 5 5~I U C G A U C A C A U A G ~ G G ~ G A G G G G G U A C C U A G ~ C U A U A G C A G G C G A A C A G C A G U U C ~ C U G G U G C A G ~ A G A ~ U C ~ U C G A U G A C U G G ~ G ~ U G A G U G G C C G G C U C G C C P D I L D D K S E V L L R Q H P L S L L I S S T G F F T G R A I F V F Q W G L N I 7 3 5 5161UUCCAGAUAUAC~GAUGAU~GUCAG~GUGCUCUUAAGGC~CAUCC~UAUCUCUGC~AUCUCAUCUACCGGU~ACGGGUAGAGCCA~UUUGU~UCCAGUGGGGU~GA T T A G N M K G S F S A R L A F G K G V E E I E Q T S T V Q P L V G A C E A R I 1 7 7 5 5281AUACUACUGCUGGG~UAUGAAAGG~CA~CUGCGCGCCUGGCC~UGGC~GGGCG~GAGG~AUUGAGC~ACGUCAACAGUGC~CCAC~G~GGCGCUUGUG~GCCCGCA P V E F K T Y T G Y T T S G P P G S M E P Y I Y V R L T Q A K L V D R L S V N V I S I 5 5401UACCCGUGGAGU~GACUUACACGGGUUAUACUACUUCGGGUCCUCCUGGAUCUAUGG~CCAUACAU~ACGUGAGGC~ACGCAAGCU~GC~GUGGAUAGGCU~CUGUG~UG I L Q E G F S F Y G P S V K H F K K E V G T P S A T L G T N N P V G R P P E N V I 8 5 5 5521UUA~ACAGGAGGGAU~UCU~CUAUGGACCUAGUGUCA~CAUUUUAAG~AG~GUCGGCACGCCUAGUGCCACCCUAGGGAC~AU~UCC~G~GGGCGCC~ACCUGAG~CG D T G G P G G Q Y A A A L Q A A Q Q A G K N P F G R G * 1862 5~I UCGAUACAGGGGGUCCUGGCGG~AGU~UGCAGCUGCC~AC~GCAGCCCAGCAAGCUGGA~AUCC~UCGGGCGUGGCUAAGUUGGC~CCUGAAAGGCGAGUAGCUGCCG~AG 5761CAGCUUCC~GGUGGC~UC~AGCU~U~UAGGGG~AUCCAGCCUUAAGCAAGCUGGCACCGGUCCUGAUGGACUA~CAGGA~GCACCUGGU~GG~GAAUUCGAGUA~ 5881A~CUU~AUCUUG~UACUCGUGACUUAUAGUA~A~CAAGAGGAAUGAcUCAUGU~UGU~AU~ACAUGAUGGCAUA~GAG~CGGCUCAUAUGGUGCUCAUUACGUUC~GU 6~1G~GAAGGAUCC~UAG~CUUGAACUGUGGUG~CAUGUGAGG~AUCCACG~AUCUCUGAUUGUC~UAGACUAGUCUAGGAGACGAUA~UCCUAUGUGGGUGAGUCcCACUCUGG 6121 CGAGACACGCAGUGCC~A~UG~UGAGGUUAUCA~CAUCAUAUC~GAGUCUGCA~UA~CG~U~UGUAGUUGUCAUAGCCUACCGAUGAGUCUGCGAG~AGG~CCAU ~41GAGGAC~GGG~GGCUAACCCCCACUUAAUCUC~CAUAGAUCAUUCGACAGUGUGUCGAG~ACUAUGGGUUUUGACACCUUAAGGGAAGCGAGAG~CUCGUAUGGAUAUCACUCUA 636! AU~AUGGUAC~ACCAUG~CAUG~UAGAGU~AUCAUCG~CUCGACGGUGUGAUAC~UCC~GU~UAGUC~A~GAUA~CUCGUUGAUCGUAUGUGAAGCGUGGAGCGAG~CG ~8[ A~CG~CUGAUUA~CAGAGGUAGGACGCUA~G~CCAGGCGU~C~AUGGGCAU~GCUGU~AC~GGt~3CGCAAGCCAUGCAGCACCUCCCG~ACUUGUGUACU~CUAGGGG 6601CUCCCGGCCUU~UU~CGGUACAAUACCUAGUG~GCAAGUAAUUGCGUUGAGGGAU~GAGUAGCAUGUUCCUACUUAAGGAAGG~UAUGUCGUGUUUUCCACACGUUAGUG~GC~ 6721UG~UGU~UGGCACUGCAGUG~AGG~UGGUUCCCAGCCACU~CUGGGAUUCUAAUCGUACGUCACAAUUGUGUGUGUAU~G~ACGGAGGAGUAGCGAUCCUCUACCACGCGAGU 6841CUGGAAGUGA~ACCAGGGCCUAAGAUGGCCAGCACACGGUACGAUU~AU~AGCUGU~UGUAGUGGUAUG~GUUGAGACU~CUUACCCGUACGAGUUA~CUCU~GAUGGAU ~ 6 1 GUGUG~CUGCCAUCUUAGAGGAAGUAGAUGUGU-~U~ACCAAUCUGAGACGAGCCG~UUCGGUGCU~UACGUCAAUGAU~UACUCGUGCAG~GCAGCUGCACGAGUAUG~ 7081 ~UA~A~A~AGU~CUCGGAUACG~U~G~ACCCU~C~UA~G~A~CUCUCUC~UCUUA~CU~CUGCAAG~ACG~GUUUUCGCA~GGU~UGUU~UCCGU~UG~UC~ 7201 AAC GC U G C ~ U G C A A U U U U C U ~ U G ~ A U U G C ~ U C GUAGUGUCG~CU~GUCC ~ G ~ C A U A ~ A G C Fig. 1. Nucl~tide sequence of TomRSV RNA-2 and deduced amino acid ~quence of the long pol~rotein. The first two nuci~tides which could not be read ~om the sequencing gel are each repre~nted by an N. The second in-~ame A U G proposed to be the initiation site ~ r translation is indicated by three asterisks. The underlined regions are complementa~ to oligonucleotides no. 1 and no. 2. Nucl~tides are numbered on the le~ beginning at the first N. Amino acids are numbered on the right and begin at the ~ s t in-~ame M. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 1510 M. E. Rott, J. H. Tremaine and D. M. Rochon t TomRSVI320 TBRV 834 GCMV 810 GFLV 680 O G G S W O E G T E A A Y L G K V T C K A G G S Y A F G E T I E L R A G G E F A F I H T I D L R G L A G R G V I Y I AKDAKGGTL PATVTPGTV PTAVTEGQV PKDCQANRY L L L L H A A G T V K T L F I L D N D N I I I I I F F R K D K D E K K M C I I I K Q Q S S E D D Q T A F N N K K L T S G L K M V R V V Q Y C C Y K S V E E K Q K W W W W Q L M I TomRSVI371 TBRV 880 GCMV 856 GFLV 723 R E Q T FCGHSMMCS FSGIAIWYI FCGVAIWY! FTGLTWVMS L F F F D D D D A A A A F Y Y Y G G G N R K K R Y I I I D P P T S G S S N D D R V V V I L - G T T - A T T T S T S A F F L S P E E A V L L D K E E P L M I V A A A Y S R R T L L L M Y N N W Y F F F S S S K E G G S G Q R T A G G T Y Y F F A C S E R V K S P P P P K K T R I I L L Y W W H F V V F L I I T V A A C L A A L S S S T D T T G N F A N D Q Q N I E S V F A L F N R E N R H H G L L L K T K F D D D Q I L I F G N E - Y L I L W M M T R L L C G A P P P P P P T K R I E Q Q F G T T T T G G G L I I V H H H R G F F G R V K R I V I V Q Q A A G G G G F Y Y L L V V V H S N M G Q K P K N N T L L L F R T T K L A F I R I I V C S S I F H H R I L L L P A A P T P P A N N S N I A Q A * *** . TomRSVI421 TBRV 929 GCMV ~5 GFLV 771 L S S L L F L S P D C V T P P P E H H H V V V W I Q H L S V V I L L L H TomRSVI47! TBRV 9~ GCMV 955 GFLV 821 V L L K P A P E A R W L E S S A A T A A D A Q D W T V W Q K T Q F F Y A * A R R H D D D K T R R V * G V S L Y L L V P S K G Q E E E L F A L V T T T L Y L Y V S S F * T T V C F T A A W W W S E R Q E T V T E D G G L . J . F I I I DIGRLCGHG DFHKICGQT DFHKICGQS DYGELCGHA H E D E T K E E F L I A S V A T N R H S S G G F F L L L G A A G A E T K V Q R P P P S T F I L I L V V T S T F L Y Y D P P P P H I I G t TomRSV[5~ TBRV 1028 GCMV 1~5 GFLV 870 I I M F D A A D L V I L T G G T S T N A T Y A V P A G T A M S A P I I L N T N R A S Y F F A R P P G L V L L L S S T F L F L G A A G L A V Q S K Q V T L Q P V Q K M TomRSVI570 TBRV 1077 GCMV 1054 GFLV 919 K R Q H K T C I I T T C G S S A T S S P S K R V L L L L L R R W L L L S R W W E H W K N D G G G A T D T S V T T L P I M T E T D L T L W E D E N T Q Q E A L L L T o m R S V 1620 TBRV 1124 GCMV 1101 GFLV 968 F F Y H g Y H A S A R A ...... T ...... T ...... T P A R L L A G T A A Q S F L E D N F G N S N F G D S S Q R D M G G M L S A S R A R S L F F L E Y W N I V V F F S T Y A T P A I L M I A C S A G A S G P P P P F M M I S A A A TomRSVI~3 TBRV 1168 GCMV [145 GFLV 1017 E P P P G G P D V L M L P C C S R R R L T E Q P I I I S A N N F G Y Y E E K D D Q Q Q D Q R R Y F F F F N A A V W W W W F - C C T V S L L D F L L F R E R S C P - L P - D D - N N P E F S K F K K L T I A S L D S K D D P I K W I L E TomRSVI709 TBRV 1217 GCMV 1195 GFLV 1~3 R F Y G Q V V E H N N N P A A P L L F F S A A A L I I A L L M M I C C I S A A A S T T C T T T H G G G G F M M L F H H H T H A S G G G G R N K V A C C L TomRSV1759 TBRV 12~ GCMV 1243 GFLV II~ I N S K T A I - N G M E E E G Q H H L T I H D S G G G T D E P V T F S Q M H H P C L V L Y G F V N G A G S P I A L L Q C S S K E N S L A T S E TomRSVI803 TBRV 1312 GCMV 1293 GFLV 1157 T P A D Q D H Q A F W A K S D K L W W S V V L I D A T K R S S V L L L L S H T R V V V V N S D L V I I C I E Q K L V V P Q H L R E E P P G G G G F F F F * A A A I L F F F T V V Y T T T S C S A I L I V V S A A V I S S V S T S G G S S T N G G T M R ~ K S T I V T A A - L Y Y Y . N S S N A Y Y L * Q A A S * Y A S F Q Q Q K A G G T L L L L L L L V * * F H H C F F F V Q L L L G G G G S V I M N G G G * * * ** Y - I V V V L D D Y A V C V D E D E G V V E Q N D D G V V G A D V S F A S F S S S E P P P P K E E S D T T G N V M E K E E T A T S A K G K Q Y S L L Q E E P F Y Y I M M Y V C I I V Y Q Q Q F I I I K K K E F V L I E K T E W I L I P P P G A S S S R R R R L I V F P G C F D N N D I L I F L S A T D S Y S I T I D F V F Q W G L N T T A G N M K G I H F S W L W H - - P A E L G K L H F S W T L N - - K G T S F K L K L Q W S L N .... T E F S Q D G F L L K S Q S A G G S K R H G L L I S A K S V H L G R S A T I V V T P P P E V F F L S K R S F F F F Y Y Y Y G G G G P S V K H F K K E V G T P S A T L G T N N P V G R P P R S A G P L T I P A T V A D V S A V S G S RSAGPLTIPS RTSFPV F F F V K G G G P G G Q Y A A A L Q A A Q Q A G K N P F G T S S N P Q K K I I I I S Q Q R D K L D H G G E I I I I E E D - L A A V V N D R D K K N K H E T S V A C E K T R V V V V L T T S L N N M F F F T G V Y I K G V E E Q G M G I S G M G D T K L V G D ** Y F F F T A A A G G G G L L L V * H H H G E Q E L R M M Y R S S S P P P P * TomRSV 1853 E N V D T G G N A A N * ** P P - Y I V A T T T N T S S P S G G N G G G T P K T R P A P F G D F S S E T L M A S Y E E E S P N N R T S S S P P P P * Y W W W I I L M Y E R A V I V I R Q E K T S T L RG Fig. 2. Alignment of the amino acid sequences of the TomRSV, TBRV, GCMV and GFLV putative coat proteins. Alignment was generated using the multiple sequence alignment program of Feng & Doolittle (1987). A double asterisk indicates amino acids common to all four sequences. A single asterisk indicates three of four amino acids at that position are identical. Underlined dipeptides are known or potential cleavage sites. Numbers to the left of each line of sequence refer to amino acid position on each nepovirus RNA-2 polyprotein. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 TomRSI/ RNA-2 sequence * * * * * * ** ** * ** * ** 1511 * TomRSV 984 A R K R W L S - R K Q E D S Q E D N I K R Y - - - A D K H G I S F E E A R A V Y K A P K E G V P T Q R S I L P D V R D A Y S A R S A G A R V R S L F G GFLV 360 GRAQWI S E R R Q A L R R R E Q A N S F E G L A A Q T D M T F E Q A R N A Y L G A A D M I E Q G L P L L P P L R S A Y ....... A P R G L W R * * *** * ** * * ***** ** * * ** * * * **** ***** TomRSV 1059' G S P T T R A Q R T E D F V L T S P S A G D A S S F S F Y F N P V S E Q E M A E Q E R G G N T M L S L D A V E V V I D P V G M P G D D T D L T V M V L GFLV 435 G-P S T R A N Y T L D F R L N G I P TG -TNTLE I L Y N P V S E E E M E E Y R D R G M S A W I D A L E IAI N P F G M P G N P T D L T V V A T * ** ** *** ******* * *** * * * *** * * * ****** TomRSV 1134 W C Q N S D D Q R A L I G A M S T F V G N G L A R A V F Y P G L K L L Y A N C R V I % D G R V L K V I V S S T N S T L T H G L P Q A Q V S I G T L R Q H GFLV 510 Y G H E R D M T R A F IGSASTFLGNGLARAI F F P G L Q - -YSQE EPRRES I I R L Y V A S T N A T V D T D S V L A A I S V G T L R Q H * * ** * ** *** * * * * * * * * * * * ** ** * TomRSV 1209 L G P G H D R T I S G A L Y A S Q Q Q G F N I R A T E Q G G A V T F A P Q G G H V E G I P S A N V Q M G A G E H L I Q A G P M Q W R L Q R S Q S S R F GFLV 585 V G S M H Y R T V A S T V H Q A Q V Q G T T L R A T M M G N T V V V S PEGS L V T G TPEARVE I GGGS S I RMVGP LQWE SVE E PGQ TF TornRSV GFLV 1284 WSGHSRT 660 S I R S R S R S Fig. 3. Amino acid sequence alignment of the regions N-terminal to the TomRSV and GFLV coat proteins. Dashes indicate gaps introduced to maximize the alignment. Asterisks above the sequences indicate positions of amino acid identity. The numbers to the left of the sequences indicate amino acid positions on the long polyproteins of TomRSV RNA-2 and GFLV RNA-2. Table 1. Amino acid composition analysis of the TomRSV coat protein Amino acid Relative mole ratio* Scaledt Determined from sequence:~ A C D+N E+ Q F G H I K L 14-9 5.1 16-0 18-1 14-5 18.1 5-2 12-7 9-7 24-1 36 12 38 43 35 43 12 30 23 58 43 10 45 49/47 37 53/51 13 28 26 60 M P 2-7 11-5 10.5 16.2 14-9 9-9 5-2 7.3 7 28 25 39 36 24 12 18 7 32 26 38/37 40 26 9/8 20 R S T V W Y * From Tremaine & Stace-Smith (1968). t Relative mole ratio values were rescaled for a coat protein with an Mr of approximately 58000 as determined by Allen & Dias (1977). :~ Value to the left of the slash is determined from the potential Q - G cleavage site, the value to the right of the slash is determined for the potential Q-E cleavage site. TomRSV assuming an Mr of 24000. Later, Allen & Dias (1977) reported that the TomRSV coat protein had an Mr of 58 000. We have rescaled the coat protein amino acid composition determined by Tremaine & Stace-Smith for a protein with an Mr of 58K (Table 1) and compared this with the predicted amino acid composition encoded by the RNA-2 long ORF by chi-squared analysis. Values of chi-squared were greater than 27.6 for sequences C-terminal to residue 1037 and were less than 5-4 for sequences between residues 1235 and 1882. This would suggest that the TomRSV coat protein is encoded at the C-terminal region of the RNA-2-encoded polyprotein. A comparison of the amino acid composition of the TomRSV coat protein and the C-terminal region of the RNA-2-encoded polyprotein is shown in Table 1. The coat protein-coding regions for the nepoviruses TBRV, GCMV and GFLV have also been localized to the C-terminal region of the RNA-2-encoded polyproteins (Meyer et al., 1986; Brault et al., 1989; Serghini et al., 1990). An alignment of the amino acid sequence at the C-terminal region of the TomRSV RNA-2 polyprotein and the putative coat protein sequences of TBRV, GCMV and GFLV is shown in Fig. 2. The putative TomRSV coat protein sequence shared sequence identity of 21.4~, 22.9~ and 23.4~o with the putative coat protein regions of TBRV, GCMV and GFLV, respectively. The N-terminal region of the putative coat protein was analysed for the following potential protease cleavage sites: Q-S, Q-G, Q-M, E-S, E-G, R-A, R - G and K - A (Palmenberg, 1990; WeUink et al., 1986; Serghini et aL, 1990; Brault et aL, 1989; Demangeat et al., 1990). The sites Q - G and E-G, at amino acid positions 1320-1321 and 1325-1326, are tentatively identified as potential cleavage sites for the TomRSV putative coat protein based on their location and the previously determined size of the TomRSV coat protein (Fig. 2). The Q - G and E - G sites proposed for TomRSV are different from the R-A, R - G and K - A cleavage sites proposed for GCMV, GFLV and TBRV, respectively, but are similar to those of the como-, poty- and picornaviruses (Hellen et al., 1989). A third potential cleavage site may occur at the K - G site at position 13431344 which would conform to a nepovirus consensus (K or R - G or A). The putative TomRSV coat protein would have a calculated size of 60K to 62K which is similar to Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 1512 M. E. Rott, J. H. Tremaine and D. M. Rochon the 58K determined by S D S - P A G E (Allen & Dias, 1977). The TomRSV coat protein is larger than the coat proteins o f T B R V , G C M V and GFLV. The larger size of the TomRSV coat protein is at least partially due to an additional 36 to 48 amino acids at the C terminus (Fig. 2). (ii) Coding region N-terminal to the putative coat protein The region to the N-terminal side of the putative coat protein has the capacity to encode a 132K to 145K polypeptide depending on which A U G acts as the translation initiation site. It is not known how many different protein products are encoded by this region or what their functions are. However this region can be divided into several domains based on similarities in amino acid sequence and genomic organization with other nepo- and comoviruses as well as some unique features of the TomRSV RNA-2 polyprotein itself. The TBRV and G C M V RNA-2 polyproteins share amino acid sequence similarity throughout their entire length. However, the most highly conserved region is immediately N-terminal of the putative coat protein sequences (Brault et al., 1989). We were unable to align this conserved region with the corresponding region in TomRSV. Interestingly however, we did detect sequence identity of 36.7% for over 300 amino acids between the regions immediately N-terminal of the TomRSV and G F L V putative coat proteins (Fig. 3). This was the highest match obtained in comparisons between the TomRSV RNA-2 polyprotein and those of other nepoviruses. Only a short region of identity could be detected in this region between TomRSV RNA-2 and (a) TomRSV (7273 nt) Repeat 1 554 Repeat 2 SWSSPLPLFANFKVNRGACFLQVLPQRWLPDECMDLLSLFEDOLPEGPLPSF ***************************************************** 607 S W S S P L P L F A N F K V N R G A C F L Q V L P Q R V V L P D E C M D L L S L F E D Q ~ E G P L P S F ********** ************** * * ** ** * Repeat 3 660 SWSSPLPLFASFRVNRGACFLQV/a~ARKVVSDEFMDVLP Fig. 4. Alignment of the amino acid sequence of the three tandem repeats near the N-terminal region of the TomRSV RNA-2 polyprotein. Asterisks between repeats indicate positions of amino acid identity. Proline residues and the L-P dipeptides are underlined. The numbers at the beginning of each line refer to amino acid position on the long TomRSV RNA-2 polyprotein. AUGAUG 1781 441 VpG Coat protein [ ~11 (b) RNA-2 of TBRV or GCMV. It has been suggested, by analogy with the M RNA-encoded proteins of comovirus CPMV, that this region may encode a cell-to-cell transport function (Meyer et al., 1986; Wellink & van Kammen, 1989). If so, the difference between the T o m R S V / G F L V and T B R V / G C M V sequences may reflect slightly different mechanisms to potentiate virus movement throughout infected plants. An internal region of TomRSV RNA-2 from nucleotides 1812 to 2244 consists of three tandem repeats. Two of the repeats were near perfect and 159 nucleotides in length whereas the third was degenerate and 114 nucleotides in length. This region fell almost completely within the overlap portion of clones K6 and 035, and the presence of the repeats was therefore confirmed by two independent clones. In addition, part of the repeated sequence was confirmed by dideoxynucleotide sequence analysis using oligonucleotide no. 2 as a primer and viral R N A as template. Sequence analysis indicated that this region encodes two identical amino acid repeats 53 amino acids in length, and one partial and degenerate UAA [ 5724 I po|y(A) 035 ,~. 2 AUG AUG I 8] 233 K6 Coat protein UAG [ 3560 GFLV (3774 nt) AUG I 288 Coat protein AUG [ 218 Coat protein U~G4361 TBRV (4662 nt) U/~G4190 GCMV (4441 nt) Fig. 5. (a) Locationof the cDNA clones 035 and K6 relative to TomRSVRNA-2. Arrowslabelled 1 and 2 refer to oligonucleotidesno. 1 and no. 2, respectively.(b) Comparison of the TomRSV, TBRV, GCMV and GFLV RNA-2 components. Thick lines represent nucleotidesequenceand the largeboxedregionsindicate the longORFs. Amino acid sequencesimilarityis indicatedby similar shading within boxed regions.The diagonal shading within the TomRSV sequenceindicates the three tandem repeats. Boxedregions without shading do not share amino acid sequence similarity. Numbers refers to nucleotide (nt) positions on the respective RNAs. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 T o m R S V R N A - 2 sequence repeat of 38 amino acids (Fig. 4). The sequence has a high proline content (12.1%) and the dipeptide L-P is repeated 13 times. The significance of this repeated region, the high frequency of P or the presence of the L-P dipeptide is unknown. A search of the NBRF, GenBank, EMBL, Swiss-Prot and Pseqlp databases failed to detect significant amino acid sequence similarity with any other sequence. In conclusion TomRSV RNA-2 has many similarities with the RNA-2 components of the nepoviruses TBRV, GCMV and GFLV, as well as with the M R N A of CPMV. These include a 5' VPg, 3' poly(A) tail, expression by cleavage of a polyprotein, location of the coat protein-coding region at the C terminus of the large polyprotein, and some sequence similarities within the N-terminal region of the long polyprotein. TomRSV RNA-2 also has some unique features which can account for the much larger size of its R N A in comparison to TBRV, GCMV and GFLV. These features include a very long 3' non-coding region of 1550 nucleotides, a larger putative coat protein-coding region and repeated elements within the N-terminal coding region. A comparison of the genomic organization of the RNA-2 components of TomRSV, TBRV, GCMV and GFLV is shown in Fig. 5. We believe that the unique features of TomRSV as compared to TBRV, GCMV and GFLV confirm the inclusion of TomRSV in a separate subgroup of nepoviruses (Martelli, 1975; Harrison & Murant, 1977). A grant from NSERC of Canada to J. H. T. providing financial support for M. E. R. is gratefully acknowledged. References AHLQUIST, P., LUCKOW, V. & KAESBERG, P. (1981). Complete nucleotide sequence of brome mosaic virus RNA3. Journal of Molecular Biology 153, 23-38. ALLEN,W. R. & DIAS,H. F. (1977). Properties of the single protein and two nucleic acids of tomato ring-spot virus. Canadian Journal of Botany 55, 1028-1037. BRAULT,V., HIBRAND,L., CANDRESSE,T., LE GALL, O. & DUNEZ, J. (1989). Nucleotide sequence and genetic organization of Hungarian grapevine chrome mosaic nepovirus RNA2. Nucleic Acids Research 17, 7809-7819. D'ALESSlO,J. M., NOON, M. C., LEY, H. L., III & GERAD,G. F. (1987). One tube double-stranded cDNA synthesis using cloned M-MLV reverse transcriptase. Focus 9 (1), 1-4. Gaithersburg: Bethesda Research Laboratories. DEMANGEAT, G., GREIF, C., HEMMER, O. & FRITSCH, C. (1990). Analysis of the in vitro cleavage products of the tomato black ring virus RNA-l-encoded 250K polyprotein. Journalof General Virology 71, 1649-1654. FENG, D. & DOOLITTLE,R. F. (1987). Progressive sequence alignment as a prerequisite to correct phylogenetic trees. Journal of Molecular Evolution 25, 351-360. FICKETr, J. W. (1982). Recognition of protein coding regions in DNA sequences. Nucleic Acids Research 10, 5303-5318. GOLDnACH,R. (1987). G enomic similarities between plant and animal RNA viruses. Microbiological Sciences 4, 197-202. 1513 GREIF, C., HEMMER,O. & FRrrscH, C. (1988). Nucleotide sequence of tomato black ring virus RNA- 1. Journalof General Virology69, 15171529. HARRISON, B. D. & MURANT, A. F. (1977). Nepovirus group. CMI/AAB Descriptions of Plant Viruses, no. 185. HELLEN, C. U. T., KR.AUSsLICH,H. & WIMMER,E. (1989). Proteolytic processing of polyproteins in the replication of RNA viruses. Biochemistry 28, 9881-9890. HENIKOFF, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene28, 351-359. HOLNESS, C. L., LOMONOSSOFF,G. P., EVANS, D. & MAULE, A. J. (1989). Identification of the initiation codons for translation of cowpea mosaic virus middle component RNA using site directed mutagenesis of an infectious cDNA clone. Virology 172, 311-320. KAMINSKI,A., HOWELL, M. T. & JACKSON,R. J. (1990). Initiation of encephalomyocarditis virus RNA translation: the authentic site is not selected by a scanning mechanism. EMBO Journal9, 3753-3759. KOZAK, M. (1986). Point mutations define a sequence flanking the AUG initiation codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292. LE GALL, O., CANDRESSE, T., BRAULT, V. & DUNEZ, J. (1989). Nucleotide sequence of Hungarian grapevine chrome mosaic nepovirus RNAI. Nucleic Acids Research 17, 7795-7807. LOMONOSSOFF,G. P. & SHANKS,i . (1983). The nucleotide sequence of cowpea mosaic virus B RNA. EMBO Journal 2, 2253-2258. LUTCKE,H. A., CHOW,K. C., MICKEL,F. C., Moss, K. A., KERN,H. F. & SCHEELE,G. A. (1987). Selection of AUG initiation codons differs in plants and animals. EMBO Journal 6, 43-48. MARTELLI, G. O. (1975). Nematode Vectors of Plant Viruses, pp. 223. Edited by F. Lamberti, C. E. Taylor & J. W. Seinhorst. London & New York: Plenum Press. MAYO, M. A., BARKER,H. & HARRXSON,B. D. (1979). Polyadenylate in the RNA of five nepoviruses. Journal of General Virology 43, 603610. MAYO, M. A., BARKER,H. & HARRISON,B. D. (1982). Specificity and properties of the genome-linked proteins of nepoviruses. Journal of General Virology 59, 149-162. MEYER, M., HEMMER,O., MAYO, M. A. & FRITSCH, C. (1986). The nucleotide sequence of tomato black ring virus RNA-2. Journal of General Virology 67, 1257-1271. MURANT,A. F., TAYLOR,M., DUNCAN,G. H. & RACHKf~,J. H. (1981). Improved estimates of molecular weight of plant virus RNA by agarose gel electrophoresis and electron microscopy after denaturation with gtyoxal. Journal of General Virology 53, 321-332. NEEDLEMAN, S. B. & WUNSCH, C. D. (1970). General method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48, 443-453. PALMENBERG, A. C. (1990). Proteolytic processing of picornaviral polyprotein. Annual Review of Microbiology 44, 603-623. PEARSON,W. R. & LIPMAN,D. J. (1988). Improved tools for biological sequence comparision. Proceedings-of the National Academy of Sciences, U.S.A. 85, 2444-2448. PELLETIER, J. & SONENBER6, N. (1988). Internal initiation of translation of eukaryotic mRNA directed by a sequence derived from poliovirus RNA. Nature, London 334, 320-325. ROTT, M. E., ROCHON,D. M. & TREMAINE,J. H. (1988). A 1-9 kilobase homology in the 3'-terminal regions of RNA-I and RNA-2 of tomato ringspot virus. Journal of General Virology 69, 745-750. SANGER,F., NICKLEN,S. & COULSON,A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467. SCHNEIDER,I. R., WHITE,R. M. & CIVEROLO,E. L. (1974). Two nucleic acid-containing components of tomato ringspot virus. Virology 57, 139-146. SERGHINI,M. A., FuctIS, M., PINCK, M., REINBOLT,J., WALTER,B. & PINCK, L. (1990). RNA2 of grapevine fanleaf virus: sequence analysis and coat protein cistron location. Journalof General Virology 71, 1433-144I. STACE-SmTH,R. (1984). Tomato ringspot virus. CMI/AAB Description of Plant Viruses, no. 290. STAI)EN, R. (1984). Measurements of the effects that coding for a protein have on a DNA sequence and their use for finding genes. Nucleic Acids Research 12, 551-567. Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52 1514 M . E. Rott, J. H. Tremaine and D. M . Rochon TONEGUZZO, F., GLYNN, S., LEVI, E., MJOLS~mSS,S. & HAYDAY,A. (1988). Use of a chemically modified T7 DNA polymerase for manual and automated sequencing of supercoiled DNA. BIO- Techniques 6, 460-469. TREMAINE,J. H. & STAcE-SMITH,R. (1968). Chemical composition and biophysical properties of tomato ringspot virus. Virology 35, 102107. VAN WEZENBEEK, P., VERVER, J., HARMSEN, J., VOS, P. & VAN KAMMEN,A. (1983). Primary structure and gene organization of the middle-component RNA of cowpea mosaic virus. EMBO Journal 2, 941-946. VRATI,S., MANN,D. A. & REED, K. C. (1987). Alkaline northern blots: transfer of RNA from agarose gels to Zeta-Probe TM membrane in dilute NaOH. Molecular Biology Reports I (3), 1-4. Richmond: BioRad. WELLINK, J. 8£ VAN KAMMEN, A. (1989). Cell-to-cell transport of cowpea mosaic virus requires both the 58K/48K proteins and the capsid proteins. Journal of General Virology 70, 2279-2286. WELLINK,J., REZELMAN,G., GOLDBACH,R. & BEYREUTHER,K. (1986). Determination of the proteolytic processing sites in the polyprotein encoded by the bottom-component of cowpea mosaic virus. Journal of Virology 59, 50-58. WELLIr,rK, J., VERVER, J., KROON, G. & VAN KAMMEN, A. (1990). Initiation of translation of cowpea mosaic virus M RNA. Vlllth International Congress of Virology Abstracts, Berlin, abstract W53- 005. (Received 14 December 1990; Accepted 25 March 1991) Downloaded from www.microbiologyresearch.org by IP: 88.99.165.207 On: Sat, 17 Jun 2017 00:23:52
© Copyright 2026 Paperzz