Volume 8 Number 24 1980
Nucleic Acids Research
The nucleotide sequence and transcript map of the herpes simplex virus thymidine kinase gene
Steven L. McKnight
Department of Embryology, Carnegie Institution of Washington, 115 W. University Parkway,
Baltimore, MD 21210, USA
Received 5 November 1980
ABSTRACT
This paper presents the nucleotide sequence of the Herpes Simplex Virus
thymidine kinase (tk) gene. The positions on the DNA sequence corresponding
to the 5' and 3' termini of tk messenger RNA have been mapped. The mRNA termini
are separated by slightly more than 1,300 nucleotides. The same 1,300 nucleotide
segment of tk coding strand DNA is fully protected from S1 nuclease digestion
when hybridized to tk mRNA. The location and size of the mRNA-coding segment
correspond to a region of the viral DNA that is essential for tk gene expression
in microinjected frog oocytes. The nucleotide sequence of the HSV tk gene exhibits
an open translational reading frame of 376 codons that extends from the
methionine codon most proximal to the 5' terminus of tk mRNA to a UGA stop codon
~70 nucleotides from the poly-A addition site. The results of these experiments
indicate that the tk gene is not interrupted by intervening DNA sequences, and
that certain oligonucleotide sequences adjacent to the termini of the tk gene
are homologous to similarly positioned sequences common to structural genes of
eukaryotic cells.
INTRODUCTION
The thymidine kinase (tk) gene of Herpes Simplex Virus (HSV) is well suited
for detailed genetic analysis. The tk gene can be stably introduced into living
cells by transfection (Wigler et al., 1977; Maitland and McDougall, 1977;
Pellicer et al., 1978), and is expressed in the form of enzymatically active tk
when injected into frog oocyte nuclei (McKnight and Gavis, 1980). Moreover,
since the translated product of the tk gene is not essential for productive
infection of cultured cells, genetic variants of the gene can easily be
generated, identified and maintained in viral stocks.
During the course of HSV infection of cultured mammalian cells, three groups
of viral genes are expressed in sequential order. The three groups are referred
to as α, β and γ genes according to their temporal order of expression (Honess and
Roizman, 1974). The translated products of one or more of the α genes are required
for the expression of β genes, which are expressed during the course of viral DNA
© IRL Press Limited, 1 Falconberg Court, London W1V 5FG, U.K.
synthesis. Following viral DNA replication, γ genes are expressed. The HSV thymidine kinase gene is classed as a β gene. Preston (1979) has found that a temperature sensitive lesion in a specific α gene effectively blocks tk mRNA
production at the non-permissive temperature. It was postulated that the
product of this particular α gene, termed K, is directly responsible for
activating tk gene transcription. It is puzzling, however, that the tk gene
requires a positive regulatory effector since the isolated gene is expressed
when introduced into cells by either transfection or microinjection.
In order to establish a framework for studying the molecular mechanisms
that underlie tk gene regulation, I have resolved the nucleotide sequence and
transcript map of the gene. The results of these experiments show that certain
oligonucleotide sequences that flank the tk gene are homologous to similar
regions of cellular genes that are expressed by RNA polymerase form II.
These homologies include the octanucleotide TATTAAGG, which precedes the 5'
terminus of the mRNA coding component of the tk gene by ~28 nucleotides; the
octanucleotide GGCGAATT, which precedes the 5' terminus by ~84 nucleotides;
and the heptanucleotide AATAAAA, which precedes, by ~16 nucleotides, the site
of poly-A addition onto tk mRNA. Unlike most structural genes, however, the
HSV tk gene appears to lack intervening sequences.
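The positions of the two 5' flanking motifs can be checked directly against the sequence. The following is a minimal sketch in Python, assuming the 100 nucleotides numbered 101-200 as transcribed in Figure 2 and taking nucleotide 201 as the 5' terminus of the mRNA (one of the two major termini mapped in the Results). The names `SEGMENT`, `SEGMENT_START`, `CAP_SITE` and `motif_position` are illustrative and do not come from the original work.

```python
# Nucleotides 101-200 of the non-coding strand, as transcribed from Figure 2.
SEGMENT = "".join("""
CCCAGCGTCT TGTCATTGGC GAATTCGAAC ACGCAGATGC AGTCGGGGCG
GCGCGGTCCG AGGTCCACTT CGCATATTAA GGTGACGCGT GTGGCCTCGA
""".split())
SEGMENT_START = 101   # Figure 2 coordinate of the first base in SEGMENT
CAP_SITE = 201        # assumed 5' terminus of tk mRNA (see Results)


def motif_position(motif: str) -> int:
    """Return the 1-based Figure 2 coordinate of the first base of `motif`."""
    index = SEGMENT.find(motif)
    if index < 0:
        raise ValueError(f"{motif} not found in segment")
    return SEGMENT_START + index


for motif in ("TATTAAGG", "GGCGAATT"):
    start = motif_position(motif)
    print(motif, "starts at", start, "->", CAP_SITE - start, "nt before the cap site")
```

Under these assumptions the two motifs start at nucleotides 175 and 118. The computed offsets fall within a few nucleotides of the approximate distances quoted above; the exact figure depends on which terminus (201 or 203) and which counting convention is used.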
MATERIALS AND METHODS
Preparation and Radiolabeling of DNA Fragments:
Recombinant DNA was digested with restriction enzymes under the following
conditions: Bam HI (isolated according to H. Erba, personal communication),
Hind III (a gift of Y. Suzuki), Pst I (a gift of R. Peterson), Pvu II and
Hinf I (New England Biolabs) in 50 mM NaCl/6 mM Tris-HCl, pH 7.5/6 mM MgCl2;
Bgl II (New England Biolabs) in 20 mM Tris-HCl, pH 7.5/6 mM MgCl2; Taq I
(New England Biolabs) in 100 mM NaCl/10 mM Tris-HCl, pH 8.4/6 mM MgCl2; Kpn I
and Hae III (New England Biolabs) in 6 mM NaCl/6 mM Tris-HCl, pH 7.5/6 mM
MgCl2; Ava II (New England Biolabs) in 30 mM NaCl/20 mM Tris-HCl, pH 7.5/10
mM MgCl2; Bst NI (New England Biolabs) in 6 mM KCl/6 mM Tris-HCl, pH 7.5/10
mM MgCl2; and Sma I (a gift of B. Sollner-Webb) in 15 mM KCl/30 mM Tris-HCl,
pH 9.0/3 mM MgCl2. After digestion at 37°C or (for Taq I and Bst NI) 65°C,
reactions were stopped by the addition of EDTA to 15 mM.
Double stranded DNA fragments were isolated on either 1% agarose gels or
on composite gels consisting of 2-6% polyacrylamide (20:1 acrylamide:bisacrylamide) and 0.5% agarose run in 40 mM Tris-acetic acid, pH 7.8/20 mM
sodium acetate/2 mM EDTA (1 x "E" buffer). Fragments were recovered on
hydroxylapatite according to Tabak and Flavell (1978). Single stranded DNA
fragments were strand separated on 6-10% polyacrylamide gels (ranging from 20:1
to 30:1 acrylamide:bisacrylamide) run in 50 mM Tris-Borate, pH 8.3/1 mM EDTA
(1 x TBE). Single stranded DNA fragments were recovered from polyacrylamide
gel slices by chemical elution (Maxam and Gilbert, 1980).
Double stranded DNA fragments with protruding 5' termini were end labeled
(Weiss et al., 1968) using T4 polynucleotide kinase (Bethesda Research Labs)
and γ-32P-ATP (1,800-2,200 Ci/mmole, New England Nuclear) according to methods
described by Fedoroff and Brown (1978). Uniquely labeled fragments were
generated either by cleaving radiolabeled duplex DNA with a second restriction
enzyme, or by strand separation. The 3' terminus of single stranded DNA was
labeled with terminal transferase (a gift of B. Sollner-Webb) and α-32P-CTP
(400 Ci/mmole, Amersham) according to published methods (Roychoudhury et al.,
1976; Maxam and Gilbert, 1977).
Nucleic Acid Sequencing:
DNA sequencing was performed according to published methods (Maxam and
Gilbert, 1980). Chemically degraded DNA fragments were sized on 0.4 mm x 40
cm polyacrylamide gels (Sanger and Coulson, 1978) consisting of either 8%
polyacrylamide (20:1 acrylamide:bisacrylamide)/8 M urea/1 x TBE; or 20%
polyacrylamide (30:1 acrylamide:bisacrylamide)/8 M urea/2 x TBE. The
sequence of certain DNA fragments was poorly resolved on 8 M urea gels due,
presumably, to secondary structure. The chemically degraded products of such
fragments were sized on 20% polyacrylamide gels polymerized in the presence
of 90% formamide (Sollner-Webb and Reeder, 1979).
RNA Isolation, Hybridization and S1 Nuclease Digestion:
Poly-A+ RNA was purified from HSV I infected African green monkey kidney
cells according to Anderson et al. (1979). Four hours post-infection cells
were homogenized and polyribosomes were isolated. Following deproteinization,
mRNA was fractionated from rRNA by chromatography on oligo-dT cellulose
(Collaborative Research). RNA/DNA hybridizations were performed in 2 x SET
at 65°C for 4-16 hours. RNA/DNA hybrids were prepared for digestion with S1
nuclease (Boehringer) either by ethanol precipitation and resuspension into
1 x S1 buffer (50 mM sodium acetate-acetic acid, pH 4.5/150 mM NaCl/0.5 mM
ZnSO4) or by dilution of the hybridization mix into a 20-fold excess of 1 x
S1 buffer. S1 reactions were incubated at 25°C for 30 min, iced and stopped
by the addition of 1/5 volume of 0.5 M Tris-HCl, pH 9.5/100 mM EDTA. Samples
were then ethanol precipitated, vacuum desiccated and resuspended in the appropriate gel sample buffer. For mRNA sizing experiments, S1 resistant hybrids
were electrophoresed on 1% agarose gels run in either 1 x "E" buffer or 30
mM NaOH/2 mM EDTA. Following electrophoresis, gels were dried and autoradiographed at -80°C using Kodak XR-1 X-ray film and intensifying screens. For
mRNA termini mapping experiments, S1 resistant DNA fragments were sized on
either 8% or 20% polyacrylamide/8 M urea sequencing gels (Sanger and Coulson,
1978).
Construction of M13/tk Recombinant Bacteriophage:
M13 strain MP-5 was generously provided by J. Messing. Double stranded
(replicative form) DNA was isolated from a culture of infected E. coli strain
JM-101 by the cleared lysis procedure of Clewell and Helinski (1970).
Covalently closed circular DNA was linearized by Hind III cleavage and 5'
termini were dephosphorylated by reaction with bacterial alkaline phosphatase
(Bethesda Research Laboratories). The tk DNA fragment that was inserted into
M13 was derived from the 3' deletion mutant, Δ3'-1.13, described in the
accompanying paper (McKnight and Gavis, 1980). Δ3'-1.13 DNA was restricted
with Pvu II and ligated to 32P-labeled synthetic Hind III linker molecules
according to procedures described in the accompanying report (McKnight and
Gavis, 1980). The DNA was then digested with Hind III and the 1.8 kb tk DNA
fragment was isolated by agarose gel electrophoresis and inserted into the
Hind III site of pBR-322. A recombinant plasmid containing an inserted tk
DNA fragment reaching from the position formerly occupied by the Pvu II site
to the end point of Δ3'-1.13 was isolated. The tk insert of the plasmid was
excised by Hind III cleavage and inserted into the Hind III site of M13
strain MP-5 (Messing, 1979). Recombinant bacteriophage plaques detected on
Xgal plates were isolated and grown in liquid culture. Replicative form DNA
was purified and screened by restriction enzyme cleavage. Recombinant phage
isolates containing the entire 1.8 kb tk insert oriented in each of the two
possible directions were identified. Bacteriophage were labeled in vivo in
5 ml cultures containing 5 mCi 32P-orthophosphate (carrier free, Amersham),
centrifuged to equilibrium in CsCl gradients, and the single stranded phage
DNA was purified by phenol extraction.
The M13/tk fragment used to S1 map the 3' terminus of the tk gene was
excised from tk coding strand DNA by digestion of M13/tk-a single stranded
DNA with Hae III (Horiuchi and Zinder, 1975). Restricted DNA fragments were
sized by electrophoresis on a 6% polyacrylamide/8 M urea/1 x TBE gel. The 339
nucleotide Hae III fragment used for mapping the poly-A addition site on tk
mRNA was recovered by chemical elution and labeled at its 3' terminus with
α-32P-CTP and terminal transferase.
RESULTS AND DISCUSSION
Nucleotide Sequence of the HSV tk Gene:
The chemical method of DNA sequencing (Maxam and Gilbert, 1977, 1980)
was used to resolve the nucleotide sequence of the HSV tk gene. All DNA
sequencing runs were carried out by end labeling 5' termini exposed by restriction enzyme cleavage of cloned tk DNA. Uniquely labeled fragments were
generated either by cleaving labeled fragments at a second restriction enzyme
recognition site or by strand separating DNA fragments labeled at both 5'
termini.
The strategy used to sequence the tk gene is summarized in Figure 1.
A preliminary sequence was established by sequencing from the end point
of each deletion mutant described in the accompanying report (McKnight and
Gavis, 1980). This provided an unconfirmed sequence of about 90% of the tk
gene. A restriction enzyme map was generated from this sequence and confirming sequencing runs were carried out using labeled fragments excised at naturally-occurring restriction enzyme sites.
The entire tk gene sequence was resolved by sequence analysis of both
strands of the DNA molecule. In no case was it observed that the nucleotide
sequence of deletion mutants of the tk gene differed from that of the parental
tk gene isolate (pHSV-106). The nucleotide sequences determined for complementary strands of the tk gene did, however, fail to match at 12 locations. In
each case, the source of the discrepancy could be accounted for by the methylation of the second cytidine residue of an Eco RII restriction enzyme recognition site (Ohmori et al., 1978). The empirical assignments of all 12 Eco
RII sites were confirmed by restricting tk DNA with an isoschizomer of Eco RII
that cleaves methylated Eco RII sites (Bst NI) and identifying the appropriately
sized fragments on a 4% acrylamide/0.5% agarose composite electrophoresis gel
(data not shown).
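A search for Eco RII recognition sites of this kind can be sketched as follows. This is a minimal Python example, assuming the recognition sequence CC(A/T)GG and using only nucleotides 1,101-1,140 as transcribed in Figure 2. The function name `eco_rii_sites` and the choice of segment are illustrative and are not part of the original analysis.

```python
import re

# Nucleotides 1,101-1,140 of the non-coding strand, as transcribed from Figure 2.
SEGMENT = "GACGGCCGTGCCGCCCCAGGGTGCCGAGCCCCAGAGCAAC"
SEGMENT_START = 1101


def eco_rii_sites(seq: str, offset: int = 1) -> list[int]:
    """Return 1-based coordinates of the first C of every CC(A/T)GG site.

    A lookahead is used so that overlapping sites would also be reported.
    """
    return [offset + m.start() for m in re.finditer(r"(?=CC[AT]GG)", seq)]


sites = eco_rii_sites(SEGMENT, SEGMENT_START)
# The second cytidine of each site is the residue assumed to be methylated.
methylated = [pos + 1 for pos in sites]
print(sites, methylated)
```

Applied to a complete, verified copy of the sequence, the same function would list every candidate site at which the two strands could be expected to disagree.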
Figure 2 displays the nucleotide sequence of the non-coding strand of
the HSV tk gene. The sequence progresses from a Pvu II restriction enzyme
recognition site to the end point of the 3' deletion mutant Δ3'-1.13. Previous
studies have shown that this segment of the HSV genome contains a functional tk
gene (Colbere-Garapin et al., 1979; McKnight and Gavis, 1980). The end points
of seven 5' deletion mutants and four 3' deletion mutants are indicated in
Figure 2. The sequence is numbered from 1-1,799 with nucleotide 1 corresponding
Fig. 1: Strategy Used for Sequencing of the Herpes Simplex Virus tk Gene.
The upper portion of the diagram shows a restriction enzyme map
of the 3.4 kb Bam HI fragment of HSV that contains the thymidine
kinase gene. Below the restriction map are lines diagramming
deletion mutants that were sequenced. All deletion mutant end
points terminate at synthetic Hind III restriction enzyme recognition sites. Dark lines with arrows show the region of each deletion
mutant that was sequenced. The lower portion of the diagram shows a
map of naturally-occurring restriction endonuclease sites that were
used for DNA sequencing. Only those sites from which nucleotide
sequencing runs were generated are shown. The hatched area denotes
the segment of the tk gene for which a complete sequence was resolved
(see Figure 2). The sequence of the entire segment was determined
for both DNA strands.
to the 5' cytidine residue of the Pvu II restriction enzyme site and nucleotide
1,799 corresponding to the end point of Δ3'-1.13.
5'-
   1  CAGCTGCTTC ATCCCCGTGG CCCGTTGCTC GCGTTTGCTG GCGGTGTCCC CGGAAGAAAT ATATTTGCAT GTCTTTAGTT CTATGATGAC ACAAACCCCG
 101  CCCAGCGTCT TGTCATTGGC GAATTCGAAC ACGCAGATGC AGTCGGGGCG GCGCGGTCCG AGGTCCACTT CGCATATTAA GGTGACGCGT GTGGCCTCGA
 201  ACACCGAGCG ACCCTGCAGC GACCCGCTTA ACAGCGTCAA CAGCGTGCCG CAGATCTTGG TGGCGTGAAA CTCCCGCACC TCTTCGGCCA GCGCCTTGTA
 301  GAAGCGCGTA TGGCTTCGTA CCCCGGCCAT CAACACGCGT CTGCGTTCGA CCAGGCTGCG CGTTCTCGCG GCCATAGCAA CCGACGTACG GCGTTGCGCC
 401  CTCGCCGGCA GCAAGAAGCC ACGGAAGTCC GCCCGGAGCA GAAAATGCCC ACGCTACTGC GGGTTTATAT AGACGGTCCC CACGGGATGG GGAAAACCAC
 501  CACCACGCAA CTGCTGGTGG CCCTGGGTTC GCGCGACGAT ATCGTCTACG TACCCGAGCC GATGACTTAC TGGCGGGTGC TGGGGGCTTC CGAGACAATC
 601  GCGAACATCT ACACCACACA ACACCGCCTC GACCAGGGTG AGATATCGGC CGGGGACGCG GCGGTGGTAA TGACAAGCGC CCAGATAACA ATGGGCATGC
 701  CTTATGCCGT GACCGACGCC GTTCTGGCTC CTCATATCGG GGGGGAGGCT GGGAGCTCAC ATGCCCCGCC CCCGGCCCTC ACCCTCATCT TCGACCGCCA
 801  TCCCATCGCC GCCCTCCTGT GCTACCCGGC CGCGCGGTAC CTTATGGGCA GCATGACCCC CCAGGCCGTG CTGGCGTTCG TGGCCCTCAT CCCGCCGACC
 901  TTGCCCGGCA CCAACATCGT GCTTGGGGCC CTTCCGGAGG ACAGACACAT CGACCGCCTG GCCAAACGCC AGCGCCCCGG CGAGCGGCTG GACCTGGCTA
1001  TGCTGGCTGC GATTCGCCGC GTTTACGGGC TACTTGCCAA TACGGTGCGG TATCTGCAGT GCGGCGGGTC GTGGCGGGAG GACTGGGGAC AGCTTTCGGG
1101  GACGGCCGTG CCGCCCCAGG GTGCCGAGCC CCAGAGCAAC GCGGGCCCAC GACCCCATAT CGGGGACACG TTATTTACCC TGTTTCGGGC CCCCGAGTTG
1201  CTGGCCCCCA ACGGCGACCT GTATAACGTG TTTGCCTGGG CCTTGGACGT CTTGGCCAAA CGCCTCCGTT CCATGCACGT CTTTATCCTG GATTACGACC
1301  AATCGCCCGC CGGCTGCCGG GACGCCCTGC TGCAACTTAC CTCCGGGATG GTCCAGACCC ACGTCACCAC CCCCGGCTCC ATACCGACGA TATGCGACCT
1401  GGCGCGCACG TTTGCCCGGG AGATGGGGGA GGCTAACTGA AACACGGAAG GAGACAATAC CGGAAGGAAC CCGCGCTATG ACGGCAATAA AAAGACAGAA
1501  TAAAACGCAC GGGTGTTGGG TCGTTTGTTC ATAAACGCGG GGTTCGGTCC CAGGGCTGGC ACTCTGTCGA TACCCCACCG AGACCCCATT GGGGCCAATA
1601  CGCCCGCGTT TCTTCCTTTT CCCCACCCCA CCCCCCAAGT TCGGGTGAAG GCCCAGGGCT CGCAGCCAAC GTCGGGGCGG CAGGCCCTGC CATAGCCACT
1701  GGCCCCGTGG GTTAGGGACG GGGTCCCCCA TGGGGAATGG TTTATGGTTC GTGGGGGTTA TTATTTTGGG CGTTGCGTGG GGTCTGGTGG ACGACCCAG
-3'
Fig. 2: The Nucleotide Sequence of the Herpes Simplex Virus Thymidine
Kinase Gene.
The non-coding strand of the nucleotide sequence is displayed
progressing from a Pvu II restriction enzyme recognition site to
the end point of the 3' deletion mutant, Δ3'-1.13. Restriction
enzyme recognition sites that have been confirmed experimentally
are underlined. Only the two Hae III restriction enzyme sites
used to prepare the 3' hybridization probe are indicated. The end
points of seven 5' deletion mutants and four 3' deletion mutants
are indicated by arrows. The segment of the sequence that encodes
tk mRNA reaches from nucleotide 201 to nucleotide 1,510. The AUG
triplet located at nucleotide 310 is the most proximal translation
start codon to the putative 5' terminus of tk mRNA. An open
translation reading frame for the predicted tk protein progresses
from the AUG triplet located at nucleotide 310 for 376 codons
without interruption by a translation stop codon. The UGA codon
located at nucleotide 1,438 is the putative translation stop
codon.
Localization of the 5' Terminus of tk mRNA:
As reported in the accompanying paper, two partially functional 5' deletion
mutants of the tk gene were isolated (McKnight and Gavis, 1980). These mutants
direct the synthesis of low levels of HSV tk enzymatic activity when microinjected into frog oocyte nuclei. It was predicted that these mutants might
lack 5' regulatory signals of the tk gene while maintaining the entire
protein coding component. Such a prediction suggests that the end points
of Δ5'-0.67 and Δ5'-0.72 might be located in either the 5' flanking sequences
of the gene or a segment of the gene that is complementary to the 5' untranslated
segment of tk mRNA.
The end points of Δ5'-0.67 and Δ5'-0.72 are shown in Figure 2. The octanucleotide sequence TATTAAGG is located 22 nucleotides upstream from the end
point of Δ5'-0.67. This sequence is homologous to the conserved sequence,
TATAAAGG, that is located in the 5' flanking DNA of most eukaryotic genes
transcribed by RNA polymerase form II (Hogness and Goldberg, referenced in
Cordell et al., 1979). A number of well documented studies have shown that
the 5' terminus (or "cap site") of mRNA maps to a position 25-30 nucleotides
3' to the TATAAAGG sequence (Ziff and Evans, 1978; Gannon et al., 1979;
Tsujimoto and Suzuki, 1979; Tsuda et al., 1979; Hentschell et al., 1980).
I have used the S1 nuclease mapping procedure (Berk and Sharp, 1977) to
test whether the 5' terminus of tk mRNA also maps adjacent to the TATTAAGG
sequence. A 131 base pair (bp) DNA fragment, delimited on the 5' end by an
Eco RI restriction cut and on the 3' end by a Bgl II restriction cut (see
Figure 2), was excised from cloned tk DNA and end labeled with 32P (Weiss et al.,
1968). The two DNA strands of the molecule were separated on a polyacrylamide
gel and the tk mRNA coding strand was recovered and identified by DNA
sequencing. The coding strand probe was hybridized to mRNA prepared from HSV
I infected African green monkey kidney cells and then digested with S1 nuclease
(Materials and Methods). Figure 3 shows an autoradiographic exposure of a 20%
polyacrylamide sequencing gel that was used to size S1 resistant DNA fragments.
The autoradiogram reveals two major bands of radioactive DNA that measure 54
and 56 nucleotides in length. The sizes of these DNA fragments suggest that
the most abundant species of tk mRNA protects a segment of tk DNA reaching
from the Bgl II site to either adenosine residue number 201 or adenosine
residue 203. These particular residues map 27 and 29 nucleotides, respectively,
from the conserved octanucleotide sequence TATTAAGG.
*For a detailed description of the logic behind these assignments, see Sollner-Webb and Reeder (1979).
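The arithmetic behind these assignments can be sketched in a few lines of Python. The sketch assumes that the labeled 5' end of the coding-strand probe pairs with nucleotide 256 of Figure 2, which follows from reading the Bgl II site (AGATCT) at nucleotides 252-257 and the usual A^GATCT cleavage position. That coordinate, and the names `LABELED_END` and `five_prime_terminus`, are inferences made here for illustration and are not stated in the text.

```python
# Assumed coordinate (Figure 2 numbering) paired with the labeled 5' end of the
# coding-strand probe: Bgl II site AGATCT read at 252-257, cleaved as A^GATCT.
LABELED_END = 256


def five_prime_terminus(protected_length: int, labeled_end: int = LABELED_END) -> int:
    """Map an S1-protected fragment length to the mRNA 5' terminus coordinate.

    The protected DNA runs from the labeled end back toward the cap site, so a
    fragment of n nucleotides covers labeled_end - n + 1 through labeled_end.
    """
    return labeled_end - protected_length + 1


for length in (56, 54):
    print(length, "nt protected -> 5' terminus at nucleotide", five_prime_terminus(length))
```

With this assumed label position, the 56 and 54 nucleotide bands correspond to nucleotides 201 and 203, in agreement with the assignment given above.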
Fig. 3: S1 Nuclease Map of the 5' Terminus of tk mRNA.
The figure shows an autoradiogram of
a 20% polyacrylamide sequencing gel
that was used to size S1 nuclease digested
DNA fragments. The numbers to the left
of the Figure designate the positions
of molecular weight marker DNA fragments.
The coding strand of the tk gene was
terminally labeled at the Bgl II site
and sequenced as shown in lanes 2-5
(G, A, T and C reactions, respectively).
The coding strand of a Bgl II-Eco RI
fragment (131 bp) was terminally labeled
and isolated by strand separation. Part
of this probe DNA was identified by
sequencing. Lane 1 was loaded with the
131 bp probe DNA fragment after modification with pyridinium formate and
degradation with piperidine (A reaction
in lane 1 = A reaction in lane 3).
The remaining portion of the terminally
labeled 131 bp coding strand fragment
was hybridized to tk mRNA and digested
with S1 nuclease. Lanes 6-8 were
loaded with hybridized probe DNA that
was digested with 0.25, 2.5 and 12.5
units S1/100 µl respectively. Lane 9
was loaded with a small amount of
probe DNA and lane 10 was loaded with
probe DNA that was digested with 0.25
units S1/100 µl without having been
hybridized to mRNA. A longer exposure
of the gel shows a pattern of bands in
lane 10 corresponding exactly to the
size of the bands in lane 6 in the 72-80
nucleotide range. The appearance of
these bands as a function of very mild
S1 digestion is judged an artefact of
secondary structure of the probe DNA.
S1 nuclease mapping of the 5' terminus of tk mRNA results in a slightly
heterogeneous display of putative 5' termini (see Figure 3). One cannot
determine whether this result is caused by bona fide heterogeneity at the 5'
terminus of tk mRNA, or by artefacts inherent to the S1 mapping technique.
Although a conclusive identification of the precise 5' terminus of tk mRNA must
await sequence analysis of tk mRNA, the S1 mapping experiment presented here
does show that the 5' terminus is located between nucleotide 200 and nucleotide
204 (see Figure 2). That these residues range from 25 to 30 nucleotides downstream from the TATTAAGG sequence is consistent with a pattern that has emerged
from the analysis of a number of structural genes.
Assuming that the 5' terminus of the tk gene is located within the
pentanucleotide region delimited by S1 mapping, certain conclusions can be
drawn regarding the contribution of 5' flanking sequences to tk gene
expression. The end point of the 5' deletion mutant, Δ5'-0.67, is located
at nucleotide 196 (Figure 2). In this deletion mutant most 5' flanking
sequences, with the possible exception of up to 8 nucleotides, are replaced
by pBR-322 sequences. As reported in the accompanying paper, Δ5'-0.67 will
support tk enzyme synthesis in oocytes to roughly 10% the level of the
parental tk gene cloned in pHSV-106. An Eco RI restriction enzyme recognition
site is located ~80 nucleotides upstream from the putative 5' terminus of the
tk gene. It is reported that restriction cleavage at this site inactivates
the tk gene in the tk- cell transfection assay (Wigler et al., 1977). These
two observations indicate that sequences flanking the 5' terminus of the tk
gene are required for maximal levels of expression. A rigorous identification
of the 5' flanking sequences necessary for tk gene expression will require the
construction and analysis of a more refined series of extragenic deletion
mutants.
Localization of the 3' Terminus of tk mRNA:
Deletion mapping studies have predicted that a functional boundary of
the HSV tk gene is located between the end points of deletion mutants
Δ3'-1.60 and Δ3'-1.32 (McKnight and Gavis, 1980). It has also been reported
that a Sma I restriction enzyme recognition site is located close to the 3'
terminus of the tk gene (Colbere-Garapin et al., 1979). Inspection of the
nucleotide sequence between the end points of Δ3'-1.60 and Δ3'-1.32 reveals
a Sma I restriction site at nucleotide 1,415 (see Figure 2). The nucleotide
sequence in this region of the tk gene also reveals a tandem repeat of the
heptanucleotide sequence AATAAAA. The first repeat of this sequence is
located in Figure 2 starting at nucleotide 1,486. Proudfoot and Brownlee
(1976) found this same heptanucleotide sequence ~18 nucleotides from the
poly-A addition site on four different eukaryotic mRNAs.
To test whether the poly-A addition site on tk mRNA maps adjacent to
the AATAAAA sequences, a transcript map of this region of the gene was
generated. Coding strand tk DNA propagated in bacteriophage M13 (see
following section) was restricted with Hae III and a 339 nucleotide fragment was isolated by gel electrophoresis. This single stranded Hae III
fragment reaches from nucleotide 1,254 to nucleotide 1,593 as shown in
Figure 2. This fragment should encompass the 3' terminus of the tk gene
since it covers most of the sequences between the end points of deletion
mutants Δ3'-1.60 and Δ3'-1.32. The single stranded DNA was labeled at its
3' terminus with α-32P-CTP and terminal transferase (Roychoudhury et al.,
1976) and hybridized to poly-A+ RNA isolated from HSV infected cells (Materials
and Methods). RNA/DNA hybrids were then digested with S1 nuclease. Figure
4 shows an autoradiographic exposure of an 8% polyacrylamide sequencing gel
that was used to size S1 resistant DNA fragments. The autoradiograph shows
that mRNA prepared from HSV infected cells protects ~260 nucleotides of the
339 nucleotide Hae III probe. This result indicates that the poly-A addition
site on tk mRNA is located some 16 nucleotides downstream from the final
adenosine residue of the first AATAAAA sequence. This position corresponds to
Fig. 4: S1 Nuclease Map of the Site of Poly-A Addition onto tk mRNA.
An autoradiogram of an 8% polyacrylamide
sequencing gel that was used to size S1
nuclease digests of RNA/DNA hybrids. The
tk hybridization probe is a 339 nucleotide
coding strand Hae III fragment labeled at
the 3' terminus with 32P. The numbers
adjacent to the marker DNA fragments (lane
1) indicate their sizes in nucleotides.
Lane 2 was left blank and lane 3 was
loaded with probe DNA that was digested
with 0.25 units S1/100 µl without being
hybridized to tk mRNA. Lane 4 was loaded
with probe DNA that was hybridized to tk
mRNA before being digested with 2.5 units
S1/100 µl. Lane 5 was loaded with a small
sample of the intact 339 nucleotide Hae
III probe DNA.
the region between nucleotides 1,506 and 1,510 on the sequence displayed in
Figure 2.
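As a rough consistency check, the numbers in this experiment can be combined in a short Python sketch. It uses only coordinates quoted in the text (probe from nucleotide 1,254 to 1,593, first AATAAAA starting at 1,486) and ignores the exact cut positions within the Hae III sites and any nucleotides added during labeling. The variable names are illustrative.

```python
PROBE_START, PROBE_END = 1254, 1593   # Hae III fragment coordinates from the text
FIRST_AATAAAA_START = 1486            # first copy of the heptanucleotide
MOTIF_LENGTH = len("AATAAAA")

# Final adenosine of the first AATAAAA, and a site 16 nt downstream of it.
final_a = FIRST_AATAAAA_START + MOTIF_LENGTH - 1
poly_a_site = final_a + 16

# Length of probe expected to be protected if the mRNA ends at a given site.
protected = {site: site - PROBE_START + 1 for site in range(1506, 1511)}

print("probe length:", PROBE_END - PROBE_START)
print("final A of first AATAAAA:", final_a)
print("predicted poly-A site:", poly_a_site)
print("expected protected lengths:", protected)
```

The expected protected lengths come out in the mid-250s, which is compatible with the approximate value of 260 nucleotides read from the gel.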
These results indicate that the Sma I site is 90 nucleotides internal
to the mRNA coding portion of the tk gene. Colbere-Garapin et al. (1979)
reported that Sma I restriction cleavage of tk DNA reduces, by roughly 95%,
the capacity of tk DNA to transfect tk- cells to the tk+ phenotype. If
these restriction enzyme sites are the same, the data of Colbere-Garapin
et al. (1979) suggest that the 90 3' proximal nucleotides are not absolutely
required for tk gene function. The end point of Δ3'-1.60 is 171 nucleotides
internal to the 3' terminus of the tk mRNA coding segment. Since this
deletion mutant is incapable of directing tk enzyme synthesis in frog oocytes,
it can be concluded that at least some of these 171 nucleotides are necessary
for tk gene expression.
Sizing of tk mRNA:
The distance separating the putative 5' and 3' termini of the HSV tk
gene measures ~1,310 bp. The finding that this distance is not appreciably larger than the 1,100 ribonucleotides needed to code for the 42,000
dalton tk polypeptide (Summers et al., 1975; Cremer et al., 1978) suggests
that the tk gene might not be interrupted by intervening DNA sequences. To
test this possibility, tk mRNA was sized by S1 mapping using single-stranded
probe DNA cloned in bacteriophage M13 (Messing, 1979).
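The comparison above can be made explicit with a short calculation. This Python sketch takes nucleotides 201 and 1,510 as the mRNA termini (Figure 2) and assumes an average residue mass of about 110 daltons, a common rule of thumb that is not taken from this paper. All names are illustrative.

```python
MRNA_START, MRNA_END = 201, 1510     # mRNA-coding segment in Figure 2 numbering
PROTEIN_MASS_DA = 42_000             # gel estimate for the tk polypeptide
AVG_RESIDUE_MASS_DA = 110            # assumed average mass per amino acid

transcript_length = MRNA_END - MRNA_START + 1
residues = PROTEIN_MASS_DA / AVG_RESIDUE_MASS_DA
coding_nucleotides = 3 * residues
non_coding_allowance = transcript_length - coding_nucleotides

print(f"transcribed segment: {transcript_length} nt")
print(f"coding requirement:  ~{coding_nucleotides:.0f} nt")
print(f"left for untranslated regions: ~{non_coding_allowance:.0f} nt")
```

Under this assumption the coding requirement is a little over 1,100 nucleotides, leaving fewer than 200 nucleotides of the transcribed segment unaccounted for.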
In order to avoid mapping transcripts encoded by segments of the HSV I
genome that closely flank the tk gene, I used a tk DNA probe that contains
only 200-300 bp of 5' and 3' extragenic flanking sequences. This 1.8 kb
probe extends from a Pvu II site to the end point of the 3' deletion mutant
Δ3'-1.13, precisely the same sequences shown in Figure 2. It was expected
that this DNA fragment should encompass the tk transcription unit since it
is fully capable of directing the synthesis of tk enzymatic activity when
microinjected into frog oocyte nuclei (data not shown).
The 1.8 kb tk fragment was constructed to contain synthetic Hind III
restriction sites at both termini (Materials and Methods), and inserted
into the Hind III site of the mp5 strain of M13 replicative form DNA.
M13/tk isolates carrying the 1.8 kb tk insert in both of the two possible
orientations were isolated. In one orientation, M13/tk-a, the single
stranded form of the recombinant phage contains tk coding strand DNA. When
the tk fragment is inserted in the opposite recombinant orientation, M13/tk-b,
the single stranded phage DNA contains the non-coding strand DNA of the tk
gene. These assignments are based on the known transcription polarity of
the tk gene (Smiley et al., 1980) and the life cycle and construction of
M13 strain mp5 (Messing, 1979).
Both types of M13/tk recombinant phage were grown in E. coli strain
JM-101 in the presence of 32P orthophosphate. Radiolabeled single stranded
DNA was prepared from purified phage and hybridized to poly-A+ RNA prepared
from HSV I infected African green monkey kidney cells (Materials and Methods).
RNA/DNA hybrids were digested with S1 nuclease and electrophoresed on neutral
and alkaline agarose gels. Figure 5 shows an autoradiogram of an alkaline gel
used to size S1 resistant DNA fragments. The autoradiogram shows that roughly
1.3 kb of the M13/tk-a probe is protected from S1 digestion by tk mRNA.
In contrast, the M13/tk-b probe DNA is not protected from S1 nuclease
digestion by tk mRNA. When the same S1 digested samples were sized on a
neutral agarose gel, essentially the same results were obtained.
Three conclusions can be drawn from these results. First, the coding
strand assignment for tk transcription reported by Smiley et al. (1980) is
confirmed. That is, the transcription polarity of the tk gene proceeds
Fig. 5: S1 Nuclease Sizing of tk mRNA.
An autoradiogram of an
alkaline electrophoresis gel
that was used to size S1
nuclease digested RNA/DNA
hybrid molecules. The
numbers adjacent to the
marker DNA bands in lane 1
indicate their sizes in nucleotides. Lanes 2 and 3 were
loaded with M13/tk coding
strand DNA that, after
hybridization to tk mRNA,
was digested with 2.5 and 25
units S1 nuclease/100 µl
respectively. Lanes 4 and 5
were loaded with M13/tk non-coding strand DNA that was
digested with the same S1
concentrations after
hybridization to tk mRNA.
from left to right as diagrammed in Figure 1 of the accompanying report
(McKnight and Gavis, 1980). Second, the size of the most abundant species
of tk mRNA synthesized in HSV infected cultured cells is approximately 1.3
kb excluding the poly-A tail. Finally, by showing that tk mRNA protects a
continuous 1.3 kb segment of the tk gene from S1 digestion, the experiment
strongly suggests that the HSV tk gene is not interrupted by intervening DNA
sequences.
Predicted Polypeptide Sequence of the tk Gene:
If the HSV tk gene lacks intervening DNA sequences, then one should be
able to demonstrate an open reading frame throughout the protein-coding
portion of the DNA sequence. The closest methionine coding triplet to the
putative 5' terminus of the tk gene is located at nucleotide 310 (Figure 2).
Since the tk polypeptide has not been sequenced, one cannot be certain that
this particular AUG triplet is the authentic translation start site. For
the large majority of eukaryotic mRNAs, however, the most 5' proximal AUG
is used as the translation start site (Kozak, 1978). When amino acids are
assigned to the tk sequence starting from the methionine codon at nucleotide
310, a translation stop triplet is not encountered for 376 codons (Figure 2).
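
The reading-frame argument above can be illustrated with a short sketch. It is not the procedure used in this study; the function names and the toy sequence are hypothetical, and the tk sequence itself (Figure 2) is not reproduced here. The sketch assumes that positions are 1-based and refer to the first base of a codon. Under that assumption, an AUG at nucleotide 310 followed by 376 sense codons places the first in-frame stop codon at nucleotide 1,438.

```python
# Minimal sketch: locate the open reading frame that begins at the
# 5'-most ATG of a DNA (mRNA-sense) sequence. All names are illustrative.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_atg_orf(dna):
    """Return (start, stop, n_codons) for the ORF opening at the first ATG.

    start and stop are 1-based positions of the first base of the ATG and
    of the first in-frame stop codon; n_codons counts sense codons only.
    Returns None if there is no ATG or no in-frame stop codon.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return start + 1, i + 1, (i - start) // 3
    return None


def stop_position(start, n_codons):
    """Position of the first base of the stop codon, given the position of
    the first base of the start codon and the number of sense codons."""
    return start + 3 * n_codons


# Toy sequence (hypothetical): CCG ATG GCT TCG TAC TGA CC
toy = "CCGATGGCTTCGTACTGACC"
toy_orf = first_atg_orf(toy)

# Coordinates quoted in the text: AUG at nucleotide 310, 376 codons.
tk_stop = stop_position(310, 376)
```

Applied to the full sequence of Figure 2, a scan of this kind would be expected to report the same coordinates as those quoted in the text, provided the numbering conventions match the assumption stated above.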
Three observations suggest that the amino acid sequence predicted by
the DNA sequence may be correct. First, the two reading frames other than
that predicted to code for tk protein are interrupted at multiple sites by
translation stop codons. Second, the translation stop codon that I have
tentatively assigned for the tk polypeptide is the UGA stop codon at
nucleotide 1,438 (Figure 2). Cremer et al. (1979) have shown that chain
termination of tk protein synthesis is suppressed in vitro by an opal (UGA)
suppressor tRNA. Third, the polypeptide predicted from the tk DNA sequence
is composed of 376 amino acid residues. The molecular weight of this
predicted polypeptide chain, accounting for the amino acid composition, is
39,931 daltons. This predicted size fits closely to the 42,000 dalton
estimate determined for HSV-tk by various gel electrophoretic sizing
methods (Summers et al., 1975; Cremer et al., 1978).
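
The size comparison in the third observation can be made explicit with a small sketch. The residue masses below are standard average values rounded to two decimals; they are an assumption of this example and are not taken from the paper, and the function names are hypothetical. Because the predicted amino acid sequence is not reproduced here, only the two figures quoted in the text (39,931 and 42,000 daltons) are compared directly.

```python
# Minimal sketch: polypeptide mass from amino acid composition, and the
# relative difference between a predicted and a gel-estimated size.
# Average residue masses in daltons (rounded standard values, assumed here).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.11, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per chain accounts for the free N and C termini


def polypeptide_mass(seq):
    """Sum of residue masses plus one water, for a one-letter sequence."""
    return sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER


def percent_difference(predicted, observed):
    """Difference between observed and predicted, as percent of observed."""
    return 100.0 * (observed - predicted) / observed


# Figures quoted in the text.
predicted_daltons = 39931
gel_estimate_daltons = 42000
n_residues = 376

shortfall = percent_difference(predicted_daltons, gel_estimate_daltons)
mean_residue_mass = (predicted_daltons - WATER) / n_residues
```

With these inputs the predicted mass falls roughly 5% below the gel estimate. A discrepancy of that size is within what is commonly seen for gel electrophoretic sizing, which is consistent with the statement that the two values agree closely.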
ACKNOWLEDGEMENTS
I thank Drs. G. Hayward and G. Reyes for the very generous provision
of RNA samples; W. Summers and M. Wagner for the open communication of
unpublished observations; R. Kingsbury and E. Gavis for competent technical
assistance; P. Schmidt and S. Satchell for typing; and especially R.
Peterson for helpful advice and encouragement. Funds for this research
were provided by the Carnegie Institution of Washington. SLM is a fellow
of the Helen Hay Whitney Foundation for Medical Research.
REFERENCES
1. Anderson, K. P., Stringer, J. R., Holland, L. E. and Wagner, E. K. (1979).
J. Virol. 30: 805-820.
2. Berk, A. J. and Sharp, P. A. (1977). Cell 12: 721-732.
3. Clewell, D. B. and Helinski, D. R. (1970). Biochemistry 9: 4428-4440.
4. Colbere-Garapin, F., Chousterman, S., Horodniceanu, F., Kourilsky, P.
and Garapin, A. (1979). Proc. Nat. Acad. Sci. USA 76: 3755-3759.
5. Cordell, B., Bell, G., Tischer, E., Denoto, F. M., Ulrich, A., Pictet,
R., Rutter, W. J. and Goodman, H. M. (1979). Cell 18: 533-543.
6. Cremer, K. J., Bodemer, M. and Summers, W. C. (1978). Nucl. Acids Res.
5: 2333-2344.
7. Cremer, K. J., Bodemer, M., Summers, W. P., Summers, W. C. and Gesteland,
R. F. (1979). Proc. Nat. Acad. Sci. USA 76: 430-434.
8. Fedoroff, N. V. and Brown, D. D. (1978). Cell 13: 702-716.
9. Gannon, F., O'Hare, K., Perrin, F., LePennec, J. P., Benoist, C., Cochet,
M., Breathnach, R., Royal, A., Garapin, A., Cami, B., Chambon, P. (1979).
Nature 278: 428-434.
10. Hentschel, C., Irminger, J., Bucher, P. and Birnstiel, M. L. (1980).
Nature 285: 147-151.
11. Heyneker, H. L., Shine, J., Goodman, H. M., Boyer, H. W., Rosenberg, J.,
Dickerson, R. E., Narang, S. A., Itakura, K., Lin, S. and Riggs, A. D.
(1976). Nature 263: 748-752.
12. Honess, R. W. and Roizman, B. (1974). J. Virol. 14: 8-19.
13. Horiuchi, K. and Zinder, N. D. (1975). Proc. Nat. Acad. Sci. USA 72:
2555-2558.
14. Kozak, M. (1978). Cell 15: 1109-1123.
15. Maitland, N. J. and McDougall, J. K. (1977). Cell 11: 233-241.
16. Maxam, A. and Gilbert, W. (1977). Proc. Nat. Acad. Sci. USA 74: 560-564.
17. Maxam, A. and Gilbert, W. (1980). Meth. Enzymol., L. Grossman and K.
Moldave, eds. (Academic Press, N.Y.), Vol. 65: 499-560.
18. McKnight, S. L. and Gavis, E. R. (1980). Submitted for publication.
19. Messing, J. (1979). Federal Register: Recombinant DNA Technical
Bulletin, Vol. 2: 43-48.
20. Ohmori, H., Tomizawa, J. I. and Maxam, A. M. (1978). Nucl. Acids Res.
5: 1479-1485.
21. Pellicer, A., Wigler, M., Axel, R. and Silverstein, S. (1978). Cell 14:
133-141.
22. Preston, C. M. (1979). J. Virol. 29: 275-284.
23. Proudfoot, N. J. and Brownlee, G. G. (1976). Nature 263: 211-214.
24. Roychoudhury, R., Jay, E. and Wu, R. (1976). Nucl. Acids Res. 3: 863-877.
25. Smiley, J. R., Wagner, M. J., Summers, W. P. and Summers, W. C. (1980).
Virology 102: 83-93.
26. Sollner-Webb, B. and Reeder, R. H. (1979). Cell 18: 485-499.
27. Sanger, F. and Coulson, A. R. (1978). FEBS Letters 87: 107-110.
28. Summers, W. C., Wagner, M. and Summers, W. P. (1975). Proc. Nat. Acad.
Sci. USA 72: 4081-4085.
29. Tabak, H. F. and Flavell, R. A. (1978). Nucl. Acids Res. 5: 2321-2332.
30. Tsuda, M., Ohshima, Y. and Suzuki, Y. (1979). Proc. Nat. Acad. Sci. USA
76: 4872-4876.
31. Tsujimoto, Y. and Suzuki, Y. (1979). Cell 18: 591-600.
32. Weiss, B., Live, T. E. and Richardson, C. C. (1968). J. Biol. Chem. 243:
4530-4542.
33. Wigler, M., Silverstein, S., Lee, L., Pellicer, A., Cheng, Y. and Axel,
R. (1977). Cell 11: 223-232.
34. Ziff, E. B. and Evans, R. M. (1978). Cell 15: 1463-1475.