Volume 8 Number 24 1980 Nucleic Acids Research

The nucleotide sequence and transcript map of the herpes simplex virus thymidine kinase gene

Steven L. McKnight

Department of Embryology, Carnegie Institution of Washington, 115 W. University Parkway, Baltimore, MD 21210, USA

Received 5 November 1980

ABSTRACT

This paper presents the nucleotide sequence of the Herpes Simplex Virus thymidine kinase (tk) gene. The positions on the DNA sequence corresponding to the 5' and 3' termini of tk messenger RNA have been mapped. The mRNA termini are separated by slightly more than 1,300 nucleotides. The same 1,300 nucleotide segment of tk coding strand DNA is fully protected from S1 nuclease digestion when hybridized to tk mRNA. The location and size of the mRNA-coding segment correspond to a region of the viral DNA that is essential for tk gene expression in microinjected frog oocytes. The nucleotide sequence of the HSV tk gene exhibits an open translational reading frame of 376 codons that extends from the methionine codon most proximal to the 5' terminus of tk mRNA to a UGA stop codon ~70 nucleotides from the poly-A addition site. The results of these experiments indicate that the tk gene is not interrupted by intervening DNA sequences, and that certain oligonucleotide sequences adjacent to the termini of the tk gene are homologous to similarly positioned sequences common to structural genes of eukaryotic cells.

INTRODUCTION

The thymidine kinase (tk) gene of Herpes Simplex Virus (HSV) is well suited for detailed genetic analysis. The tk gene can be stably introduced into living cells by transfection (Wigler et al., 1977; Maitland and McDougall, 1977; Pellicer et al., 1978), and is expressed in the form of enzymatically active tk when injected into frog oocyte nuclei (McKnight and Gavis, 1980).
Moreover, since the translated product of the tk gene is not essential for productive infection of cultured cells, genetic variants of the gene can easily be generated, identified and maintained in viral stocks.

© IRL Press Limited, 1 Falconberg Court, London W1V 5FG, U.K.

During the course of HSV infection of cultured mammalian cells, three groups of viral genes are expressed in sequential order. The three groups are referred to as α, β and γ genes according to their temporal order of expression (Honess and Roizman, 1974). The translated products of one or more of the α genes are required for the expression of β genes, which are expressed during the course of viral DNA synthesis. Following viral DNA replication, γ genes are expressed. The HSV thymidine kinase gene is classed as a β gene. Preston (1979) has found that a temperature sensitive lesion in a specific α gene effectively blocks tk mRNA production at the non-permissive temperature. It was postulated that the product of this particular α gene, termed K, is directly responsible for activating tk gene transcription. It is puzzling, however, that the tk gene requires a positive regulatory effector, since the isolated gene is expressed when introduced into cells by either transfection or microinjection.

In order to establish a framework for studying the molecular mechanisms that underlie tk gene regulation, I have resolved the nucleotide sequence and transcript map of the gene. The results of these experiments show that certain oligonucleotide sequences that flank the tk gene are homologous to similar regions of cellular genes that are expressed by RNA polymerase form II.
These homologies include the octanucleotide TATTAAGG, which precedes the 5' terminus of the mRNA coding component of the tk gene by ~28 nucleotides; the octanucleotide GGCGAATT, which precedes the 5' terminus by ~84 nucleotides; and the heptanucleotide AATAAAA, which precedes, by ~16 nucleotides, the site of poly-A addition onto tk mRNA. Unlike most structural genes, however, the HSV tk gene appears to lack intervening sequences.

MATERIALS AND METHODS

Preparation and Radiolabeling of DNA Fragments: Recombinant DNA was digested with restriction enzymes under the following conditions: Bam HI (isolated according to H. Erba, personal communication), Hind III (a gift of Y. Suzuki), Pst I (a gift of R. Peterson), Pvu II and Hinf I (New England Biolabs) in 50 mM NaCl/6 mM Tris-HCl, pH 7.5/6 mM MgCl2; Bgl II (New England Biolabs) in 20 mM Tris-HCl, pH 7.5/6 mM MgCl2; Taq I (New England Biolabs) in 100 mM NaCl/10 mM Tris-HCl, pH 8.4/6 mM MgCl2; Kpn I and Hae III (New England Biolabs) in 6 mM NaCl/6 mM Tris-HCl, pH 7.5/6 mM MgCl2; Ava II (New England Biolabs) in 30 mM NaCl/20 mM Tris-HCl, pH 7.5/10 mM MgCl2; Bst NI (New England Biolabs) in 6 mM KCl/6 mM Tris-HCl, pH 7.5/10 mM MgCl2; and Sma I (a gift of B. Sollner-Webb) in 15 mM KCl/30 mM Tris-HCl, pH 9.0/3 mM MgCl2. After digestion at 37°C or (for Taq I and Bst NI) 65°C, reactions were stopped by the addition of EDTA to 15 mM.

Double stranded DNA fragments were isolated on either 1% agarose gels or on composite gels consisting of 2-6% polyacrylamide (20:1 acrylamide:bisacrylamide) and 0.5% agarose run in 40 mM Tris-acetic acid, pH 7.8/20 mM sodium acetate/2 mM EDTA (1 x "E" buffer). Fragments were recovered on hydroxylapatite according to Tabak and Flavell (1978). Single stranded DNA fragments were strand separated on 6-10% polyacrylamide gels (ranging from 20:1 to 30:1 acrylamide:bisacrylamide) run in 50 mM Tris-Borate, pH 8.3/1 mM EDTA (1 x TBE).
Single stranded DNA fragments were recovered from polyacrylamide gel slices by chemical elution (Maxam and Gilbert, 1980).

Double stranded DNA fragments with protruding 5' termini were end labeled (Weiss et al., 1968) using T4 polynucleotide kinase (Bethesda Research Labs) and γ-32P-ATP (1,800-2,200 Ci/mmole, New England Nuclear) according to methods described by Fedoroff and Brown (1978). Uniquely labeled fragments were generated either by cleaving radiolabeled duplex DNA with a second restriction enzyme, or by strand separation. The 3' terminus of single stranded DNA was labeled with terminal transferase (a gift of B. Sollner-Webb) and α-32P-CTP (400 Ci/mmole, Amersham) according to published methods (Roychoudhury et al., 1976; Maxam and Gilbert, 1977).

Nucleic Acid Sequencing: DNA sequencing was performed according to published methods (Maxam and Gilbert, 1980). Chemically degraded DNA fragments were sized on 0.4 mm x 40 cm polyacrylamide gels (Sanger and Coulson, 1978) consisting of either 8% polyacrylamide (20:1 acrylamide:bisacrylamide)/8 M urea/1 x TBE, or 20% polyacrylamide (30:1 acrylamide:bisacrylamide)/8 M urea/2 x TBE. The sequence of certain DNA fragments was poorly resolved on 8 M urea gels due, presumably, to secondary structure. The chemically degraded products of such fragments were sized on 20% polyacrylamide gels polymerized in the presence of 90% formamide (Sollner-Webb and Reeder, 1979).

RNA Isolation, Hybridization and S1 Nuclease Digestion: Poly-A+ RNA was purified from HSV I infected African green monkey kidney cells according to Anderson et al. (1979). Four hours post-infection cells were homogenized and polyribosomes were isolated. Following deproteinization, mRNA was fractionated from rRNA by chromatography on oligo-dT cellulose (Collaborative Research). RNA/DNA hybridizations were performed in 2 x SET at 65°C for 4-16 hours.
RNA/DNA hybrids were prepared for digestion with S1 nuclease (Boehringer) either by ethanol precipitation and resuspension into 1 x S1 buffer (50 mM sodium acetate-acetic acid, pH 4.5/150 mM NaCl/0.5 mM ZnSO4) or by dilution of the hybridization mix into a 20-fold excess of 1 x S1 buffer. S1 reactions were incubated at 25°C for 30 min, iced and stopped by the addition of 1/5 volume of 0.5 M Tris-HCl, pH 9.5/100 mM EDTA. Samples were then ethanol precipitated, vacuum desiccated and resuspended in the appropriate gel sample buffer. For mRNA sizing experiments, S1 resistant hybrids were electrophoresed on 1% agarose gels run in either 1 x "E" buffer or 30 mM NaOH/2 mM EDTA. Following electrophoresis, gels were dried and autoradiographed at -80°C using Kodak XR-1 X-ray film and intensifying screens. For mRNA termini mapping experiments, S1 resistant DNA fragments were sized on either 8% or 20% polyacrylamide/8 M urea sequencing gels (Sanger and Coulson, 1978).

Construction of M13/tk Recombinant Bacteriophage: M13 strain MP-5 was generously provided by J. Messing. Double stranded (replicative form) DNA was isolated from a culture of infected E. coli strain JM-101 by the cleared lysis procedure of Clewell and Helinski (1970). Covalently closed circular DNA was linearized by Hind III cleavage and 5' termini were dephosphorylated by reaction with bacterial alkaline phosphatase (Bethesda Research Laboratories). The tk DNA fragment that was inserted into M13 was derived from the 3' deletion mutant, Δ3'-1.13, described in the accompanying paper (McKnight and Gavis, 1980). Δ3'-1.13 DNA was restricted with Pvu II and ligated to 32P-labeled synthetic Hind III linker molecules according to procedures described in the accompanying report (McKnight and Gavis, 1980). The DNA was then digested with Hind III and the 1.8 kb tk DNA fragment was isolated by agarose gel electrophoresis and inserted into the Hind III site of pBR-322.
A recombinant plasmid containing an inserted tk DNA fragment reaching from the position formerly occupied by the Pvu II site to the end point of Δ3'-1.13 was isolated. The tk insert of the plasmid was excised by Hind III cleavage and inserted into the Hind III site of M13 strain MP-5 (Messing, 1979). Recombinant bacteriophage plaques detected on Xgal plates were isolated and grown in liquid culture. Replicative form DNA was purified and screened by restriction enzyme cleavage. Recombinant phage isolates containing the entire 1.8 kb tk insert oriented in both of the two possible directions were identified. Bacteriophage were labeled in vivo in 5 ml cultures containing 5 mCi 32P-orthophosphate (carrier free, Amersham), centrifuged to equilibrium in CsCl gradients, and the single stranded phage DNA was purified by phenol extraction.

The M13/tk fragment used to S1 map the 3' terminus of the tk gene was excised from tk coding strand DNA by digestion of M13/tk-a single stranded DNA with Hae III (Horiuchi and Zinder, 1975). Restricted DNA fragments were sized by electrophoresis on a 6% polyacrylamide/8 M urea/1 x TBE gel. The 339 nucleotide Hae III fragment used for mapping the poly-A addition site on tk mRNA was recovered by chemical elution and labeled at its 3' terminus with α-32P-CTP and terminal transferase.

RESULTS AND DISCUSSION

Nucleotide Sequence of the HSV tk Gene: The chemical method of DNA sequencing (Maxam and Gilbert, 1977, 1980) was used to resolve the nucleotide sequence of the HSV tk gene. All DNA sequencing runs were carried out by end labeling 5' termini exposed by restriction enzyme cleavage of cloned tk DNA. Uniquely labeled fragments were generated either by cleaving labeled fragments at a second restriction enzyme recognition site or by strand separating DNA fragments labeled at both 5' termini. The strategy used to sequence the tk gene is summarized in Figure 1.
A preliminary sequence was established by sequencing from the end point of each deletion mutant described in the accompanying report (McKnight and Gavis, 1980). This provided an unconfirmed sequence of about 90% of the tk gene. A restriction enzyme map was generated from this sequence and confirming sequencing runs were carried out using labeled fragments excised at naturally-occurring restriction enzyme sites. The entire tk gene sequence was resolved by sequence analysis of both strands of the DNA molecule. In no case was it observed that the nucleotide sequence of deletion mutants of the tk gene differed from that of the parental tk gene isolate (pHSV-106).

The nucleotide sequences determined for complementary strands of the tk gene did, however, fail to match at 12 locations. In each case, the source of the discrepancy could be accounted for by the methylation of the second cytidine residue of an Eco RII restriction enzyme recognition site (Ohmori et al., 1978). The empirical assignments of all 12 Eco RII sites were confirmed by restricting tk DNA with an isoschizomer of Eco RII that cleaves methylated Eco RII sites (Bst NI) and identifying the appropriately sized fragments on a 4% acrylamide/0.5% agarose composite electrophoresis gel (data not shown).

Figure 2 displays the nucleotide sequence of the non-coding strand of the HSV tk gene. The sequence progresses from a Pvu II restriction enzyme recognition site to the end point of the 3' deletion mutant Δ3'-1.13. Previous studies have shown that this segment of the HSV genome contains a functional tk gene (Colbere-Garapin et al., 1979; McKnight and Gavis, 1980). The end points of seven 5' deletion mutants and four 3' deletion mutants are indicated in Figure 2.
The sequence is numbered from 1 to 1,799, with nucleotide 1 corresponding to the 5' cytidine residue of the Pvu II restriction enzyme site and nucleotide 1,799 corresponding to the end point of Δ3'-1.13.

[Figure 1: restriction map of the 3.4 kb Bam HI fragment with the sequenced deletion mutants diagrammed beneath it. Labels recoverable from the figure include Δ3'-1.94, Δ3'-1.60, Δ5'-1.34, Δ5'-1.24, Δ5'-1.19, Δ5'-1.09, Δ5'-1.01, Δ5'-0.72 and Δ5'-0.67.]

Fig. 1: Strategy Used for Sequencing of the Herpes Simplex Virus tk Gene. The upper portion of the diagram shows a restriction enzyme map of the 3.4 kb Bam HI fragment of HSV that contains the thymidine kinase gene. Below the restriction map are lines diagramming deletion mutants that were sequenced. All deletion mutant end points terminate at synthetic Hind III restriction enzyme recognition sites. Dark lines with arrows show the region of each deletion mutant that was sequenced. The lower portion of the diagram shows a map of naturally-occurring restriction endonuclease sites that were used for DNA sequencing. Only those sites from which nucleotide sequencing runs were generated are shown. The hatched area denotes the segment of the tk gene for which a complete sequence was resolved (see Figure 2). The sequence of the entire segment was determined for both DNA strands.
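The strand discrepancies attributed above to Eco RII sites can be screened for computationally once a sequence is in hand. The following is a minimal sketch, not the procedure used in this work. It assumes the commonly cited recognition sequence CC(A/T)GG for Eco RII and Bst NI, which the text itself does not spell out, and it uses a made-up toy sequence rather than tk DNA. The function names are hypothetical.

```python
import re

# ASSUMPTION: Eco RII and its isoschizomer Bst NI recognize CC(A/T)GG.
# A lookahead is used so that overlapping sites would also be reported.
ECO_RII_SITE = re.compile(r"(?=(CC[AT]GG))")


def find_eco_rii_sites(sequence):
    """Return 1-based start positions of CC(A/T)GG sites in a DNA string."""
    seq = sequence.upper()
    return [match.start() + 1 for match in ECO_RII_SITE.finditer(seq)]


def methylated_cytidines(sequence):
    """Return 1-based positions of the second C of each site.

    This is the residue whose methylation is described in the text as the
    source of the mismatches between complementary sequencing runs.
    """
    return [start + 1 for start in find_eco_rii_sites(sequence)]


# Hypothetical toy sequence for illustration only (not from Figure 2).
toy = "GATCCAGGTTTCCTGGAACCGGG"
sites = find_eco_rii_sites(toy)
print(sites)
print(methylated_cytidines(toy))
```

Applied to a transcribed copy of Figure 2, a scan of this kind would list candidate positions to compare against the 12 experimentally confirmed sites. Note that CCGGG at the end of the toy sequence is deliberately not reported, since the central base must be A or T.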
5954 Nucleic Acids Research 100 s'-CAGCTGCTTC ATCCCCGTGG CCCGTTGCTC GCGTTTGCTG GCGGTGTCCC CGGAAGAAAT ATArTTGCAT GTCTTTAGTT CTATGATGAC ACAAACCCCG " VUl1 ,75 200 CCCAGCGTCT TGTCATTGGC GAATTCGAAC ACGCAGATGC AGTCGGGGCG GCGCGGTCCG AGGTCCACTT CGCATATTAA GGTGACGCGT GTGGCCTCGA 201 EcoRI a5'-Q67-> xx> ACACCGAGCG ACCCTGCAGC GACCCGCTTA ACAGC6TCAA CAGCGTGCCG CAGATCTTGG TGGCGTGAAA CTCCCGCACC TCTTCGGCCA GCGCCTTGTA •<* P l " AS-072 B "» «oo GAAGCGCGTA TGGCTTCGTA CCCCGGCCAT CAACACGCGT CTGCSTTCGA CCAGGCTGC6 CGTTCTCGCG GCCATAGCAA CCGACGTACG GCGFTGCGCC CTCGCCG6CA GCAA6AAGCC ACGGAAGTCC GCCCGGAGCA GAAAATGCCC ACGCTACTGC GGGTTTATAT AGACGGTCCC CACGGGATGG GGAAAACCAC 600 CACCACGC4A CTGCTGGTGG CCCTG6GTTC GCGCGACGAT(ATCGTCTACG TACCCGAGCC GATGACTTAC TGGCGGGTGC TGGGGGCTTC CGAGACAATC 65'- IX>I 700 GCGAACATCT ACACCACACA ACACCGCCTC GACCAGGGTG AGATATCGGC CGGGGACGCG GCGGTGGTAA TGACAAGCGC CCAGATAACA ATGGGCATGC A5'-1.09 800 CTTATGCCfT GACCGACGCC GTTCTGGCTC CTCATATCGG GGGGGAGGCT GGGAGCTCAC ATGCCC^GCC CCCGGCCCTC ACCCTCATCT TCGACCGCCA a5-1.19 A5-I.24 9O0 TCCCATCGCC GCCCTCCTGT GCTACCCGGC CGCGCGGTAC CTTATGGGCA GCATGACCCC(CCAGGCCGTG CTGGCGTTCG TGGCCCTCAT CCCGCCGACC KpnI as-1.34 1000 TTGCCCGGCA CCAACATCGT GCTTGGGGCC CTTCCGGAGG ACAGACACAT CGACCGCCTG GCCAAACGCC AGCGCCCCGG CGAGCGGCTG GACCTGGCTA £3'-1.94 1100 TGCTGGCTGC GATTCGCCGC GTTTACGGGC TACTTGCCAA TACGGTGCGG TATCTGCAGT GCG6CGGGTC GTGGCGGGAG GACTGGGGAC AGCTTTCGGG "'I 1200 GACGGCCGTG CCGCCCCAGG GTGCCGAGCC CCAGAGCAAC GCGGGCCCAC GACCCCATAT CG3GGACACG TTATTTACCC TGTTTCGGGC CCCCGAGTTG 1300 CTGGCCCCCA ACGGCGACCT GTATAACGTG TTTGCCTGGG CCTTGGACGT CTTGGCCAAA CGCCTCCGTT CCATGCACGT CTTTATCCTG GATTACGACC Hotm I4O0 AATCGCCCGC CGGCTGCCGG GACGCCCTGC TGCAACTTAC CTCCGGGATG GTCCAGACCC ACGTCACCAC CCCCGGCTCC ATACC6ACGA TATGCGACCT :cf A3 1 '" - 60 JUS l«6 I5O I5OO GGCGCGCACG TTTGCCCGGG AGATGGGGGA GGCTAACTGA AACACGGAAG GAGACAATAC CGGAAGGAAC CCGCGCTATG ACGGCAATAA AAAGACAGAA 1506 1510 S " 0 1 1600 TAAAACGCAC GGGTGTTGGG TCGTTTGTTC ATAAACGCGG GGTTCGGTCC CAGGGCTGGC ACTCTGTCGA 
TACCCCACCG AGACCCCATT GGGGCCAATA MoOtt |700 CGCCCGCGT(T TCHCCTTTT CCCCACCCCA CCCCCCAAGT TCGGGTGAAG GCCCAGGGCT CGCAGCCAAC GTCGGGGCGG CAGGCCCTGC CATAGCCACf a3'-L32 l?» GGCCCCGTGG GTTAGGGACG GGGTCCCCCA TGGGGAATGG TTTATGGTTC GTGGGGGTTA TTATTTTG66 CGTTGCGTGG GGTCTGGTGG ACGACCCAG - 3'

Fig. 2: The Nucleotide Sequence of the Herpes Simplex Virus Thymidine Kinase Gene. The non-coding strand of the nucleotide sequence is displayed progressing from a Pvu II restriction enzyme recognition site to the end point of the 3' deletion mutant, Δ3'-1.13. Restriction enzyme recognition sites that have been confirmed experimentally are underlined. Only the two Hae III restriction enzyme sites used to prepare the 3' hybridization probe are indicated. The end points of seven 5' deletion mutants and four 3' deletion mutants are indicated by arrows. The segment of the sequence that encodes tk mRNA reaches from nucleotide 201 to nucleotide 1,510. The AUG triplet located at nucleotide 310 is the most proximal translation start codon to the putative 5' terminus of tk mRNA. An open translation reading frame for the predicted tk protein progresses from the AUG triplet located at nucleotide 310 for 376 codons without interruption by a translation stop codon. The UGA codon located at nucleotide 1,438 is the putative translation stop codon.

Localization of the 5' Terminus of tk mRNA: As reported in the accompanying paper, two partially functional 5' deletion mutants of the tk gene were isolated (McKnight and Gavis, 1980). These mutants direct the synthesis of low levels of HSV tk enzymatic activity when microinjected into frog oocyte nuclei. It was predicted that these mutants might lack 5' regulatory signals of the tk gene while maintaining the entire protein coding component.
Such a prediction suggests that the end points of Δ5'-0.67 and Δ5'-0.72 might be located in either the 5' flanking sequences of the gene or a segment of the gene that is complementary to the 5' untranslated segment of tk mRNA.

The end points of Δ5'-0.67 and Δ5'-0.72 are shown in Figure 2. The octanucleotide sequence TATTAAGG is located 22 nucleotides upstream from the end point of Δ5'-0.67. This sequence is homologous to the conserved sequence, TATAAAGG, that is located in the 5' flanking DNA of most eukaryotic genes transcribed by RNA polymerase form II (Hogness and Goldberg, referenced in Cordell et al., 1979). A number of well documented studies have shown that the 5' terminus (or "cap site") of mRNA maps to a position 25-30 nucleotides 3' to the TATAAAGG sequence (Ziff and Evans, 1978; Gannon et al., 1979; Tsujimoto and Suzuki, 1979; Tsuda et al., 1979; Hentschel et al., 1980). I have used the S1 nuclease mapping procedure (Berk and Sharp, 1977) to test whether the 5' terminus of tk mRNA also maps adjacent to the TATTAAGG sequence.

A 131 base pair (bp) DNA fragment, delimited on the 5' end by an Eco RI restriction cut and on the 3' end by a Bgl II restriction cut (see Figure 2), was excised from cloned tk DNA and end labeled with 32P (Weiss et al., 1968). The two DNA strands of the molecule were separated on a polyacrylamide gel and the tk mRNA coding strand was recovered and identified by DNA sequencing. The coding strand probe was hybridized to mRNA prepared from HSV I infected African green monkey kidney cells and then digested with S1 nuclease (Materials and Methods). Figure 3 shows an autoradiographic exposure of a 20% polyacrylamide sequencing gel that was used to size S1 resistant DNA fragments. The autoradiogram reveals two major bands of radioactive DNA that measure 54 and 56 nucleotides in length.
The sizes of these DNA fragments suggest that the most abundant species of tk mRNA protects a segment of tk DNA reaching from the Bgl II site to either adenosine residue number 201 or adenosine residue 203. These particular residues map 27 and 29 nucleotides, respectively, from the conserved octanucleotide sequence TATTAAGG.

*For a detailed description of the logic behind these assignments, see Sollner-Webb and Reeder (1979).

[Figure 3: autoradiogram, lanes 1-10.]

Fig. 3: S1 Nuclease Map of the 5' Terminus of tk mRNA. The figure shows an autoradiogram of a 20% polyacrylamide sequencing gel that was used to size S1 nuclease digested DNA fragments. The numbers to the left of the figure designate the positions of molecular weight marker DNA fragments. The coding strand of the tk gene was terminally labeled at the Bgl II site and sequenced as shown in lanes 2-5 (G, A, T and C reactions, respectively). The coding strand of a Bgl II-Eco RI fragment (131 bp) was terminally labeled and isolated by strand separation. Part of this probe DNA was identified by sequencing. Lane 1 was loaded with the 131 bp probe DNA fragment after modification with pyridinium formate and degradation with piperidine (A reaction in lane 1 = A reaction in lane 3). The remaining portion of the terminally labeled 131 bp coding strand fragment was hybridized to tk mRNA and digested with S1 nuclease. Lanes 6-8 were loaded with hybridized probe DNA that was digested with 0.25, 2.5 and 12.5 units S1/100 µl, respectively. Lane 9 was loaded with a small amount of probe DNA and lane 10 was loaded with probe DNA that was digested with 0.25 units S1/100 µl without having been hybridized to mRNA. A longer exposure of the gel shows a pattern of bands in lane 10 corresponding exactly to the size of the bands in lane 6 in the 72-80 nucleotide range.
The appearance of these bands as a function of very mild S1 digestion is judged an artefact of secondary structure of the probe DNA.

S1 nuclease mapping of the 5' terminus of tk mRNA results in a slightly heterogeneous display of putative 5' termini (see Figure 3). One cannot determine whether this result is caused by bona fide heterogeneity at the 5' terminus of tk mRNA, or by artefacts inherent to the S1 mapping technique. Although a conclusive identification of the precise 5' terminus of tk mRNA must await sequence analysis of tk mRNA, the S1 mapping experiment presented here does show that the 5' terminus is located between nucleotide 200 and nucleotide 204 (see Figure 2). That these residues range from 25 to 30 nucleotides downstream from the TATTAAGG sequence is consistent with a pattern that has emerged from the analysis of a number of structural genes.

Assuming that the 5' terminus of the tk gene is located within the pentanucleotide region delimited by S1 mapping, certain conclusions can be drawn regarding the contribution of 5' flanking sequences to tk gene expression. The end point of the 5' deletion mutant, Δ5'-0.67, is located at nucleotide 196 (Figure 2). In this deletion mutant most 5' flanking sequences, with the possible exception of up to 8 nucleotides, are replaced by pBR-322 sequences. As reported in the accompanying paper, Δ5'-0.67 will support tk enzyme synthesis in oocytes to roughly 10% the level of the parental tk gene cloned in pHSV-106. An Eco RI restriction enzyme recognition site is located ~80 nucleotides upstream from the putative 5' terminus of the tk gene. It is reported that restriction cleavage at this site inactivates the tk gene in the tk- cell transfection assay (Wigler et al., 1977). These two observations indicate that sequences flanking the 5' terminus of the tk gene are required for maximal levels of expression.
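The conversion of S1-protected fragment lengths into positions on the Figure 2 numbering is simple coordinate arithmetic, sketched below for illustration. The coordinate of the labeled 5' nucleotide at the Bgl II cut (taken here as 256) is an assumption: it is not stated in the text and is back-calculated from the reported result that the 56 nucleotide fragment reaches residue 201. The function name is hypothetical.

```python
def five_prime_terminus(label_position, protected_length):
    """Return the Figure 2 coordinate reached by an S1-protected fragment.

    The probe is the coding strand, labeled at its 5' end. Because the
    coding strand runs antiparallel to the displayed non-coding strand,
    the label sits at the higher coordinate and the protected fragment
    extends toward lower coordinates.
    """
    return label_position - protected_length + 1


# ASSUMPTION: labeled 5' nucleotide of the coding strand at the Bgl II
# cut corresponds to position 256 in Figure 2 (inferred, not quoted).
BGL_II_LABEL = 256

for length in (54, 56):
    print(length, "nt protected ->", five_prime_terminus(BGL_II_LABEL, length))
```

Under that assumption the 54 and 56 nucleotide bands correspond to residues 203 and 201, both inside the 200-204 window given for the 5' terminus.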
A rigorous identification of the 5' flanking sequences necessary for tk gene expression will require the construction and analysis of a more refined series of extragenic deletion mutants.

Localization of the 3' Terminus of tk mRNA: Deletion mapping studies have predicted that a functional boundary of the HSV tk gene is located between the end points of deletion mutants Δ3'-1.60 and Δ3'-1.32 (McKnight and Gavis, 1980). It has also been reported that a Sma I restriction enzyme recognition site is located close to the 3' terminus of the tk gene (Colbere-Garapin et al., 1979). Inspection of the nucleotide sequence between the end points of Δ3'-1.60 and Δ3'-1.32 reveals a Sma I restriction site at nucleotide 1,415 (see Figure 2). The nucleotide sequence in this region of the tk gene also reveals a tandem repeat of the heptanucleotide sequence AATAAAA. The first repeat of this sequence is located in Figure 2 starting at nucleotide 1,486. Proudfoot and Brownlee (1976) found this same heptanucleotide sequence ~18 nucleotides from the poly-A addition site on four different eukaryotic mRNAs.

To test whether the poly-A addition site on tk mRNA maps adjacent to the AATAAAA sequences, a transcript map of this region of the gene was generated. Coding strand tk DNA propagated in bacteriophage M13 (see following section) was restricted with Hae III and a 339 nucleotide fragment was isolated by gel electrophoresis. This single stranded Hae III fragment reaches from nucleotide 1,254 to nucleotide 1,593 as shown in Figure 2. This fragment should encompass the 3' terminus of the tk gene since it covers most of the sequences between the end points of deletion mutants Δ3'-1.60 and Δ3'-1.32. The single stranded DNA was labeled at its 3' terminus with α-32P-CTP and terminal transferase (Roychoudhury et al., 1976) and hybridized to poly-A+ RNA isolated from HSV infected cells (Materials and Methods). RNA/DNA hybrids were then digested with S1 nuclease.
Figure 4 shows an autoradiographic exposure of an 8% polyacrylamide sequencing gel that was used to size S1 resistant DNA fragments. The autoradiograph shows that mRNA prepared from HSV infected cells protects ~260 nucleotides of the 339 nucleotide Hae III probe. This result indicates that the poly-A addition site on tk mRNA is located some 16 nucleotides downstream from the final adenosine residue of the first AATAAAA sequence. This position corresponds to the region between nucleotides 1,506 and 1,510 on the sequence displayed in Figure 2.

[Figure 4: autoradiogram. Marker sizes legible in lane 1 include 309, 242, 217, 201, 190, 180, 160, 147, 122 and 110 nucleotides.]

Fig. 4: S1 Nuclease Map of the Site of Poly-A Addition onto tk mRNA. An autoradiogram of an 8% polyacrylamide sequencing gel that was used to size S1 nuclease digests of RNA/DNA hybrids. The tk hybridization probe is a 339 nucleotide coding strand Hae III fragment labeled at the 3' terminus with 32P. The numbers adjacent to the marker DNA fragments (lane 1) indicate their sizes in nucleotides. Lane 2 was left blank and lane 3 was loaded with probe DNA that was digested with 0.25 units S1/100 µl without being hybridized to tk mRNA. Lane 4 was loaded with probe DNA that was hybridized to tk mRNA before being digested with 2.5 units S1/100 µl. Lane 5 was loaded with a small sample of the intact 339 nucleotide Hae III probe DNA.

These results indicate that the Sma I site is 90 nucleotides internal to the mRNA coding portion of the tk gene. Colbere-Garapin et al. (1979) reported that Sma I restriction cleavage of tk DNA reduces, by roughly 95%, the capacity of tk DNA to transfect tk- cells to the tk+ phenotype. If these restriction enzyme sites are the same, the data of Colbere-Garapin et al. (1979) suggest that the 90 3' proximal nucleotides are not absolutely required for tk gene function. The end point of Δ3'-1.60 is 171 nucleotides internal to the 3' terminus of the tk mRNA coding segment.
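The placement of the poly-A addition site relative to the AATAAAA repeat and the Sma I site can be checked with a few lines of arithmetic. This is only an illustrative sketch using the coordinates quoted in the text; the variable and function names are hypothetical, and the "16 nucleotides" and "90 nucleotides" figures are approximate values as reported.

```python
# Coordinates quoted in the text (Figure 2 numbering).
FIRST_AATAAAA_START = 1486   # first base of the first AATAAAA repeat
MOTIF_LENGTH = 7             # AATAAAA is a heptanucleotide
POLY_A_WINDOW = (1506, 1510) # region assigned to the poly-A addition site
SMA_I_SITE = 1415            # Sma I site reported at this nucleotide


def last_base_of_motif(start, length):
    """Coordinate of the final base of a motif beginning at `start`."""
    return start + length - 1


def within(position, window):
    """True if `position` lies inside the inclusive coordinate window."""
    low, high = window
    return low <= position <= high


final_adenosine = last_base_of_motif(FIRST_AATAAAA_START, MOTIF_LENGTH)
estimated_site = final_adenosine + 16  # "some 16 nucleotides downstream"

print("final A of first repeat:", final_adenosine)
print("estimated poly-A site:", estimated_site)
print("inside reported window:", within(estimated_site, POLY_A_WINDOW))

# Distance from the Sma I site to each edge of the poly-A window.
print("Sma I to window edges:", [edge - SMA_I_SITE for edge in POLY_A_WINDOW])
```

The estimate falls inside the 1,506 to 1,510 window, and the Sma I site lies 91 to 95 nucleotides upstream of it, in line with the rounded figure of 90 nucleotides given in the text.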
Since Δ3'-1.60 is incapable of directing tk enzyme synthesis in frog oocytes, it can be concluded that at least some of these 171 nucleotides are necessary for tk gene expression.

Sizing of tk mRNA: The distance separating the putative 5' and 3' termini of the HSV tk gene measures ~1,310 bp. The finding that this distance is not appreciably larger than the 1,100 ribonucleotides needed to code for the 42,000 dalton tk polypeptide (Summers et al., 1975; Cremer et al., 1978) suggests that the tk gene might not be interrupted by intervening DNA sequences. To test this possibility, tk mRNA was sized by S1 mapping using single-stranded probe DNA cloned in bacteriophage M13 (Messing, 1979).

In order to avoid mapping transcripts encoded by segments of the HSV I genome that closely flank the tk gene, I used a tk DNA probe that contains only 200-300 bp of 5' and 3' extragenic flanking sequences. This 1.8 kb probe extends from a Pvu II site to the end point of the 3' deletion mutant Δ3'-1.13, precisely the same sequences shown in Figure 2. It was expected that this DNA fragment should encompass the tk transcription unit since it is fully capable of directing the synthesis of tk enzymatic activity when microinjected into frog oocyte nuclei (data not shown). The 1.8 kb tk fragment was constructed to contain synthetic Hind III restriction sites at both termini (Materials and Methods), and inserted into the Hind III site of the mp5 strain of M13 replicative form DNA. M13/tk isolates carrying the 1.8 kb tk insert in both of the two possible orientations were isolated. In one orientation, M13/tk-a, the single stranded form of the recombinant phage contains tk coding strand DNA. When the tk fragment is inserted in the opposite orientation, M13/tk-b, the single stranded phage DNA contains the non-coding strand DNA of the tk gene.
These assignments are based on the known transcription polarity of the tk gene (Smiley et al., 1980) and the life cycle and construction of M13 strain mp5 (Messing, 1979).

Both types of M13/tk recombinant phage were grown in E. coli strain JM-101 in the presence of 32P orthophosphate. Radiolabeled single stranded DNA was prepared from purified phage and hybridized to poly-A+ RNA prepared from HSV I infected African green monkey kidney cells (Materials and Methods). RNA/DNA hybrids were digested with S1 nuclease and electrophoresed on neutral and alkaline agarose gels. Figure 5 shows an autoradiogram of an alkaline gel used to size S1 resistant DNA fragments. The autoradiogram shows that roughly 1.3 kb of the M13/tk-a probe is protected from S1 digestion by tk mRNA. In contrast, the M13/tk-b probe DNA is not protected from S1 nuclease digestion by tk mRNA. When the same S1 digested samples were sized on a neutral agarose gel, essentially the same results were obtained.

[Figure 5: autoradiogram. Marker sizes legible in lane 1 include 1855, 1060, 828 and 383 nucleotides.]

Fig. 5: S1 Nuclease Sizing of tk mRNA. An autoradiogram of an alkaline electrophoresis gel that was used to size S1 nuclease digested RNA/DNA hybrid molecules. The numbers adjacent to the marker DNA bands in lane 1 indicate their sizes in nucleotides. Lanes 2 and 3 were loaded with M13/tk coding strand DNA that, after hybridization to tk mRNA, was digested with 2.5 and 25 units S1 nuclease/100 µl, respectively. Lanes 4 and 5 were loaded with M13/tk non-coding strand DNA that was digested with the same S1 concentrations after hybridization to tk mRNA.

Three conclusions can be drawn from these results. First, the coding strand assignment for tk transcription reported by Smiley et al. (1980) is confirmed. That is, the transcription polarity of the tk gene proceeds from left to right as diagrammed in Figure 1 of the accompanying report (McKnight and Gavis, 1980).
Second, the size of the most abundant species of tk mRNA synthesized in HSV infected cultured cells is approximately 1.3 kb excluding the poly-A tail. Finally, by showing that tk mRNA protects a continuous 1.3 kb segment of the tk gene from S1 digestion, the experiment strongly suggests that the HSV tk gene is not interrupted by intervening DNA sequences.

Predicted Polypeptide Sequence of the tk Gene: If the HSV tk gene lacks intervening DNA sequences then one should be able to demonstrate an open reading frame throughout the protein-coding portion of the DNA sequence. The closest methionine coding triplet to the putative 5' terminus of the tk gene is located at nucleotide 310 (Figure 2). Since the tk polypeptide has not been sequenced one cannot be certain that this particular AUG triplet is the authentic translation start site. For the large majority of eukaryotic mRNAs, however, the most 5' proximal AUG is used as the translation start site (Kozak, 1978). When amino acids are assigned to the tk sequence starting from the methionine codon at nucleotide 310, a translation stop triplet is not encountered for 376 codons (Figure 2).

Three observations suggest that the amino acid sequence predicted by the DNA sequence may be correct. First, the two reading frames other than that predicted to code for tk protein are interrupted at multiple sites by translation stop codons. Second, the translation stop codon that I have tentatively assigned for the tk polypeptide is the UGA stop codon at nucleotide 1,438 (Figure 2). Cremer et al. (1979) have shown that chain termination of tk protein synthesis is suppressed in vitro by an opal (UGA) suppressor tRNA. Third, the polypeptide predicted from the tk DNA sequence is composed of 376 amino acid residues. The molecular weight of this predicted polypeptide chain, accounting for the amino acid composition, is 39,931 daltons.
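The open reading frame argument above amounts to walking the sequence in triplets from the chosen AUG until the first in-frame stop codon. The sketch below illustrates that scan on a made-up toy sequence; it is not the analysis performed in this work, the function name is hypothetical, and the sequence is written in DNA letters (ATG, TGA) as in Figure 2. It also checks that 376 codons beginning at nucleotide 310 place the next triplet at nucleotide 1,438, the position reported for the UGA codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_reading_frame(sequence, start_index):
    """Scan in-frame from an ATG at `start_index` (0-based).

    Returns (codon_count, stop_index), where codon_count includes the
    initiating ATG and excludes the stop codon, and stop_index is the
    0-based position of the first base of the stop codon. Returns None
    if no in-frame stop codon is found.
    """
    seq = sequence.upper()
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("no ATG at start_index")
    for i in range(start_index, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return ((i - start_index) // 3, i)
    return None


# Hypothetical toy sequence for illustration only (not tk DNA).
toy = "CCATGGCTTCGTACTGAAA"
print(open_reading_frame(toy, 2))

# Coordinate check with the values quoted in the text:
# 376 codons starting at nucleotide 310 end just before nucleotide 1,438.
START_NUCLEOTIDE = 310
CODONS = 376
print(START_NUCLEOTIDE + 3 * CODONS)
```

Running the same function over the other two frames of a transcribed sequence would reproduce the first observation in the text, namely that those frames are interrupted by stop codons.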
This predicted molecular weight agrees closely with the 42,000 dalton estimate determined for HSV tk by various gel electrophoretic sizing methods (Summers et al., 1975; Cremer et al., 1978).

ACKNOWLEDGEMENTS

I thank Drs. G. Hayward and G. Reyes for the very generous provision of RNA samples; W. Summers and M. Wagner for the open communication of unpublished observations; R. Kingsbury and E. Gavis for competent technical assistance; P. Schmidt and S. Satchell for typing; and especially R. Peterson for helpful advice and encouragement. Funds for this research were provided by the Carnegie Institution of Washington. SLM is a fellow of the Helen Hay Whitney Foundation for Medical Research.

REFERENCES

1. Anderson, K. P., Stringer, J. R., Holland, L. E. and Wagner, E. K. (1979). J. Virol. 30: 805-820.
2. Berk, A. J. and Sharp, P. A. (1977). Cell 12: 721-732.
3. Clewell, D. B. and Helinski, D. R. (1970). Biochemistry 9: 4428-4440.
4. Colbere-Garapin, F., Chousterman, S., Horodniceanu, F., Kourilsky, P. and Garapin, A. (1979). Proc. Nat. Acad. Sci. USA 76: 3755-3759.
5. Cordell, B., Bell, G., Tischer, E., Denoto, F. M., Ullrich, A., Pictet, R., Rutter, W. J. and Goodman, H. M. (1979). Cell 18: 533-543.
6. Cremer, K. J., Bodemer, M. and Summers, W. C. (1978). Nucl. Acids Res. 5: 2333-2344.
7. Cremer, K. J., Bodemer, M., Summers, W. P., Summers, W. C. and Gesteland, R. F. (1979). Proc. Nat. Acad. Sci. USA 76: 430-434.
8. Fedoroff, N. V. and Brown, D. D. (1978). Cell 13: 702-716.
9. Gannon, F., O'Hare, K., Perrin, F., LePennec, J. P., Benoist, C., Cochet, M., Breathnach, R., Royal, A., Garapin, A., Cami, B., Chambon, P. (1979). Nature 278: 428-434.
10. Hentschel, C., Irminger, J., Bucher, P. and Birnstiel, M. L. (1980). Nature 285: 147-151.
11. Heyneker, H. L., Shine, J., Goodman, H. M., Boyer, H. W., Rosenberg, J., Dickerson, R. E., Narang, S. A., Itakura, K., Lin, S. and Riggs, A. D. (1976). Nature 263: 748-752.
12. Honess, R. W. and Roizman, B. (1974). J. Virol. 14: 8-19.
13. Horiuchi, K. and Zinder, N. D. (1975). Proc. Nat. Acad. Sci. USA 72: 2555-2558.
14. Kozak, M. (1978). Cell 15: 1109-1123.
15. Maitland, N. J. and McDougall, J. K. (1977). Cell 13: 233-241.
16. Maxam, A. and Gilbert, W. (1977). Proc. Nat. Acad. Sci. USA 74: 560-564.
17. Maxam, A. and Gilbert, W. (1980). Meth. Enzymol., L. Grossman and K. Moldave, eds. (Academic Press, N.Y.), Vol. 65: 499-560.
18. McKnight, S. L. and Gavis, E. R. (1980). Submitted for publication.
19. Messing, J. (1979). Federal Register: Recombinant DNA Technical Bulletin, Vol. 2: 43-48.
20. Ohmori, H., Tomizawa, J. I. and Maxam, A. M. (1978). Nucl. Acids Res. 5: 1479-1485.
21. Pellicer, A., Wigler, M., Axel, R. and Silverstein, S. (1978). Cell 14: 133-141.
22. Preston, C. M. (1979). J. Virol. 29: 275-284.
23. Proudfoot, N. J. and Brownlee, G. G. (1976). Nature 263: 211-214.
24. Roychoudhury, R., Jay, E. and Wu, R. (1976). Nucl. Acids Res. 3: 863-877.
25. Smiley, J. R., Wagner, M. J., Summers, W. P. and Summers, W. C. (1980). Virology 102: 83-93.
26. Sollner-Webb, B. and Reeder, R. H. (1979). Cell 18: 485-499.
27. Sanger, F. and Coulson, A. R. (1978). FEBS Letters 87: 107-110.
28. Summers, W. C., Wagner, M. and Summers, W. P. (1975). Proc. Nat. Acad. Sci. USA 72: 4081-4085.
29. Tabak, H. F. and Flavell, R. A. (1978). Nucl. Acids Res. 5: 2321-2332.
30. Tsuda, M., Ohshima, Y. and Suzuki, Y. (1979). Proc. Nat. Acad. Sci. USA 76: 4872-4876.
31. Tsujimoto, Y. and Suzuki, Y. (1979). Cell 18: 591-600.
32. Weiss, B., Live, T. E. and Richardson, C. C. (1968). J. Biol. Chem. 243: 4530-4542.
33. Wigler, M., Silverstein, S., Lee, L., Pellicer, A., Cheng, Y. and Axel, R. (1977). Cell 11: 223-232.
34. Ziff, E. B. and Evans, R. M. (1978). Cell 15: 1463-1475.