PDF - College of Humanities and Sciences

Blackwell Science, LtdOxford, UKMMIMolecular Microbiology0950-382XBlackwell Publishing Ltd, 2003482429441Original ArticlePredicting mutations in stem–loop structuresB. E. Wright et al.
Molecular Microbiology (2003) 48(2), 429–441
Predicting mutation frequencies in stem–loop structures
of derepressed genes: implications for evolution
Barbara E. Wright,* Dennis K. Reschke,
Karen H. Schmidt, Jacqueline M. Reimers and
William Knight
Division of Biological Sciences, University of Montana,
Missoula, MT 59812, USA.
the background mutable bases in lacI. (iv) The data
suggest that environmental stressors may cause as
well as select mutations in derepressed genes.
The implications of these results for evolution are
discussed.
Summary
Introduction
This work provides evidence that, during transcription, the mutability (propensity to mutate) of a base in
a DNA secondary structure depends both on the stability of the structure and on the extent to which the
base is unpaired. Zuker's DNA folding computer program reveals the most probable stem–loop structures
∆G) for any
(SLSs) and negative energies of folding (–∆
given nucleotide sequence. We developed an interfacing program that calculates (i) the percentage of folds
in which each base is unpaired during transcription;
and (ii) the mutability index (MI) for each base,
expressed as an absolute value and defined as
follows: MI = (% total folds in which the base is
∆G of all folds in which it is
unpaired) × (highest –∆
unpaired). Thus, MIs predict the relative mutation or
reversion frequencies of unpaired bases in SLSs. MIs
for 16 mutable bases in auxotrophs, selected during
starvation in derepressed genes, are compared with
70 background mutations in lacI and ebgR that were
not derepressed during mutant selection. All the
results are consistent with the location of known
mutable bases in SLSs. Specific conclusions are: (i)
Of 16 mutable bases in transcribing genes, 87% have
higher MIs than the average base of the sequence
analysed, compared with 50% for the 70 background
mutations. (ii) In 15 of the mutable bases of transcribing genes, the correlation between MIs and relative
mutation frequencies determined experimentally is
good. There is no correlation for 35 mutable bases in
the lacI gene. (iii) In derepressed auxotrophs, 100% of
the codons containing the mutable bases are within
one codon's length of a stem, compared with 53% for
Gene-specific transcription-induced mutations have been
described in growing cells of prokaryotes (Brock, 1971;
Herman and Dworkin, 1971; Beletskii and Bhagwat, 1996)
and eukaryotes (Datta and Jinks-Robertson, 1995), as
well as in starving cells of Escherichia coli (Lipschutz
et al., 1965; Wright, 1996; 1997; Wright and Minnick,
1997; Rudner et al., 1999; Wright et al., 1999). Recent
reviews have discussed the mechanisms by which starvation and gene derepression can enhance non-random
mutations (Foster, 1999; Wright, 2000; Bridges, 2001;
Rosenberg, 2001). Previous studies have demonstrated a
correlation between mutation rates and transcription as
determined by measuring levels of specific mRNA (Wright
et al., 1999). However, accurate rates of mRNA synthesis
can only be determined when both mRNA concentration
and half-life are known. Current studies on mRNA turnover (J. M. Reimers et al., submitted) have confirmed the
conclusions of the earlier investigations (Wright et al.,
1999).
There are at least two mechanisms by which transcription can increase mutation rates in localized areas
of the genome: (i) separation of the two DNA strands
exposes the non-transcribed strand, which is then vulnerable to mutations (Lindahl and Nyberg, 1972; 1974;
Coulondre et al., 1978; Singer and Sterns, 1982; Fix and
Glickman, 1987; Frederico et al., 1990; 1993; Skandalis
et al., 1994); and (ii) the advancing RNA polymerase
complex drives localized supercoiling (Balke and Gralla,
1987; Liu and Wang, 1987; Dayn et al., 1991; Dayn
et al., 1992; Rahmouni and Wells, 1992; Krasilnikov
et al., 1999), which creates and stabilizes stem–loop
structures (SLSs) containing vulnerable unpaired and
mispaired bases (Ripley, 1982; Todd and Glickman,
1982; Pananicolaou and Ripley, 1991). Thus, these two
mechanisms, in concert, are ideal for obtaining the variants most likely to survive each kind of stress while
minimizing random genomic damage (Wright, 2000).
Accepted 13 December, 2002. *For correspondence. E-mail
[email protected]; Tel. (+1) 406 243 6676; Fax (+1) 406 243
4184.
© 2003 Blackwell Publishing Ltd
430 B. E. Wright et al.
Evidence from many laboratories has demonstrated that
transcription drives localized supercoiling and that stem–
loop conformations contain mutable bases. However, no
direct evidence has yet been presented to document a
correlation in specific genes between increased rates
of transcription and enhanced mutation localized in
unpaired bases of SLSs.
The highest concentration of supercoiling and SLSs
occurs in the wake (5′) of the transcription bubble, and
SLSs can form in either or both strands (cruciforms). DNA
secondary structures not only expose mutable bases, but
also increase the time of their exposure by periodically
blocking the progress of transcribing RNA polymerase
complexes during transcript elongation. Transcriptionenhanced mutations should therefore occur preferentially
at pause sites, and mutation rates should be increased by
activators such as guanosine tetraphosphate (ppGpp),
which has been shown to lengthen the duration of these
pauses (Kingston et al., 1981). Specific effects of stress
on transcription are essential to the mechanisms of nonrandom mutations in evolution (see Discussion; Wright,
2000).
To explore the possible correlation of secondary structure formation and sites of mutation, Zuker's DNA folding
computer program (Benham, 1992; SantaLucia, 1998;
Zuker et al., 1999) was used. This program folds a segment of ssDNA, predicts the most thermodynamically stable secondary structure that could form and calculates
the negative energy of folding (–∆G) for each structure
(http://www.bioinfo.rpi.edu/applications/mfold/old/dna).
For example, a 200 nucleotide (nt) sequence containing
a known mutable base is copied as input to the program.
The program is then instructed to fold a series of 30 nt
segments that include the mutable base, and then to show
the structure with the highest –∆G for that base. The
theoretical methods used in screening DNA sequences
for sites susceptible to superhelical strand separation and
transition to alternative structures have been verified
experimentally in other systems (Benham, 1992). A computer program that interfaces with Zuker's program has
been developed in this laboratory in order to obtain additional information regarding the mutability of individual
bases. This new program calculates the extent to which
each base is unpaired during transcription (see Experimental procedures).
In four Escherichia coli auxotrophs (leuB, argH, malT
and lacZ) studied in our laboratory, mutant sequences and
number of mutations have been determined. Sequences
and mutation frequencies of seven trpA auxotrophs were
found in the literature (Isbell and Fowler, 1989; Bhamre
et al., 2001). As controls, data were also obtained from
the literature for background (‘spontaneous’) mutations
not isolated in derepressed genes, namely lacI and ebgR
(Schaaper et al., 1986; Hall, 1999). The presence of loss-
of-function mutations in these repressor genes is revealed
indirectly by allowing the constitutive expression of βgalactosidase.
Results
The location of mutable bases in high and low –∆G areas
of a gene
In order to correlate the location of mutable bases in SLSs
with the –∆G of their structures, about 130 nt of the nontranscribed strand, including the mutation(s) in leuB, argH,
malT or lacZ, were folded in successive, overlapping 30 nt
segments, beginning at each fifth nt (Fig. 1). Thus, each
nucleotide was included in six successive folds. The magnitudes of the –∆G values indicate the relative stability of
the structures formed by each segment: the greater the –
∆G value, the more likely a stable secondary structure will
form and create a pause site. Figure 1 shows that mutations generally occur in structures with high –∆Gs. That
is, in lacZ, leuB, argH, malT and codon 49 of trpA, mutations occur in the highest –∆G peaks present. In the other
three codons of interest in trpA, mutations occur in three
of the four highest peaks. It should be pointed out that a
high –∆G results primarily from the stability of the stem(s),
and does not necessarily indicate the presence of an
unpaired base vulnerable to a chemical change (see Discussion). Presumably, mutations usually occur when an
unpaired base exists in its highest –∆G structure. The
SLSs that form at peak –∆G values in selected genes are
described below.
The DNA folding program appears to describe a biochemical process that is robust. That is, although the
structures formed in the 30 nt window are not as large and
stable as those in 40 nt and 50 nt windows (not shown),
all three analyses show a peak of high negative energy
values and structures with a similar conformation at the
site of the mutable base. This suggests a reliable and
buffered system for the dynamic interconversion of secondary structures. Thus, despite probable continuous
variations in the length of segments folded during transcription, mutation and (presumably) pause sites generally coincide with peak –∆G values. The most stringent
condition used, folding 30 nt (compared with 40 and 50 nt)
shows the most specific correlation of maximum peak –
∆G values with the presence of each mutant nucleotide,
reflecting, perhaps, the average number of nucleotides
folded in vivo (Fig. 1). Although the 50 nt foldings had
peaks at the mutation sites, they also had higher peaks
elsewhere, and were therefore not as specifically correlated with the mutation sites. Very large windows generate
very complex structures but, in choosing a smaller window, we attempted to simulate transcription conditions in
silico. Therefore, a 30 nt window was chosen for analysing
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Predicting mutations in stem–loop structures 431
Fig. 1. The location and –∆G values of 16 known mutable bases in E. coli auxotrophs. Beginning at each fifth nucleotide, 130 nt (lacZ, including
codon 63; leuB, including codon 286; argH, including codon 4; malT, including codon 545; and trpA, including codon 49) or 260 nt (trpA, including
codons 183, 211 and 234) of the non-transcribed strands were folded in successive, overlapping 30 nt windows. Each bar indicates the beginning
of a window, and the number of mutable bases in each fold appears above each bar. The entire trpA gene was included in the analysis, but a
340 nt segment without mutable bases was omitted from the following sequence and is indicated by dotted lines. Codon numbers are in
parenthesis, and mutable bases are capitalized.
…gct aat tga agc cgg tgc tga gcg ctg GAg (49) tta ggc atc ccc ttc tcc gac cca ctg gca gat ggc ccg acg att caa … gca ggc gtg aCc (183) ggc gca
gaa aac cgc gcc gcg tta ccc ctc aat cat ctg gtt gcg aag ctg aaa gag tac aac gct gca cct cca ttg cag GGa (211) ttt ggt att tcc gcc ccg gat cag gta
aaa gca gcg att gat gca gga gct gcg ggc gcg att tct GGt (234) tcg ccc…
the mutant sequences. However, in the analysis of lacZ,
it was also found necessary to use a 40 nt window in order
to understand the mechanism of the triple mutation that
occurred (see below).
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
In the trpA gene (Yanofsky et al., 1981; Isbell and
Fowler, 1989; Bhamre et al., 2001), the locations of seven
mutable bases in four codons are known, and the
sequences containing these bases are shown (Fig. 1).
432 B. E. Wright et al.
Fig. 2. The location and –∆G values of mutable bases in the E. coli repressors, lacI and ebgR. Beginning every fifth nt, 30 nt windows were
folded. Each bar indicates the nucleotide at the beginning of the window, and the number of mutable bases in each fold appears above each
bar. Wild type lacI, 57 mutations (mutable bases are capitalized) have been found:
…gtg aaa cca gta ACg tta tac gat gTc GCa gag tat Gcc ggt gTc tCt Tat CAG ACc GTt TCc CGc GTg gTg Aac CAg gcc agc cac GTt tCt gcg
aaa Acg cgg gaa aaa GTg Gaa gcg GCg atg Gcg gag ctg aat TAC att cCc aAC cgc gTg GCa CAA CAa cTg gCg Ggc Aaa Cag tCg ttg ctg att
ggc gtt gcc acc tcc agt cTg gcc…
Wild-type ebgR sequence, 13 mutation sites (mutable bases are capitalized) have been found:
…gac atc gca atc gaa gCt ggc gta Tcc ctg gcg aca Gta tcc AGg gtc tta aat gac gAT ccg aca ttg Aat Gtg aaa gaa gag aCg aaa cat cgc att ctc
Gag atc gcc Gaa aag ctg gag taC aag acc…
These seven mutations are also located in high –∆G
peaks. In contrast, Fig. 2 shows an apparently random
distribution of mutable bases in lacI and ebgR in which
transcription was not induced. Note, especially in the large
lacI data set (Fig. 2 legend), that most of the 414 mutations that were sequenced are in the first and second
positions of the codons (see Discussion).
Calculation of a mutability index for mutable bases
DNA folding analyses have suggested that mutable bases
in selected, derepressed genes are preferentially located
in high –∆G SLSs, in contrast to mutable bases in noninduced genes. The mutability of a base should depend
both on the stability (–∆G) of its SLS and on whether the
base is unpaired in these structures (Lindahl and Nyberg,
1972; 1974; Coulondre et al., 1978; Singer and Sterns,
1982; Fix and Glickman, 1987; Frederico et al., 1990;
1993; Skandalis et al., 1994). Therefore, a program has
been developed that will provide this information for each
base (see Experimental procedures). To describe and
evaluate these assumed causes of base mutability, we
have defined a new term, called the mutability index (MI),
and express it as an absolute value:
MI =(% total folds in which the base is unpaired)
× (highest –∆G of all folds in which it is unpaired).
Thus, MIs indicate the relative mutation frequencies of
bases in predicted SLSs. The highest –∆G term gives the
most weight to the most stable fold containing the
unpaired base. That is, primary importance is given to the
structure in which the base is exposed for the longest
period of time and therefore has the highest probability of
mutation. In addition, the average –∆G and MI of all the
bases included in the sequence analysed are calculated,
for comparison with individual base –∆Gs and MIs. This
information is necessary, as mutable bases exist in different –∆G regions. For example, the average –∆G value in
the lacZ analysis is 4.5, whereas this value for leuB is 2.4
(Fig. 1).
The percentage of each wild-type mutable base MI
above or below the average MI in the –∆G region of each
base was calculated for the 16 mutable bases in auxotrophs and for the 70 mutable bases in control genes. The
percentage above average MI values for the auxotroph
bases is 87%, and this value for lacI and ebgR is 50%.
Comparing MIs with mutation frequencies
If mutable bases are located in unpaired bases of SLSs
predicted by the Zuker DNA folding program, MI values
may predict relative mutation frequencies. Sequences and
reversion frequencies have been determined for bases in
the same codons of seven trpA mutant alleles (Isbell and
Fowler, 1989; Bhamre et al., 2001). In Fig. 3, the SLSs
with the highest –∆G for wild type and three mutant alleles
in codon 49 are shown. In the wild type, a strong stem
with five C*G basepairs is formed. As the G in the first
position of the codon is in the stem, its MI is very low (0.3)
in this 30 nt SLS, as well as in smaller windows (not
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Predicting mutations in stem–loop structures 433
Fig. 3. Predicted SLSs at peak –∆G values are shown for wild type and three mutant alleles of codon 49 of trpA. The predicted MI values for
the mutations (arrows) are compared in the table inset with mutation frequencies determined experimentally in two data sets. See Table 1 and
text for details .
shown). However, in higher –∆G folds of 35 or 40 nt, more
stable stems occur, and the MI is zero, i.e. this G is paired
in all folds. Therefore, the mutation of this G to an A or T,
giving rise to alleles 11 and 88, respectively, is predicted
to occur in relatively small SLSs. These latter SLSs have
only three C*G pairs in their stems and thus lower –∆G
values (see bar graph in Fig. 3). However, as they are
unpaired much of the time, their MIs are much higher (2.3
and 2.5) than that of the wild-type G (0.3). The relative MI
values calculated for the three mutable bases in these
mutants compare favourably with their relative reversion
frequencies (Isbell and Fowler, 1989; Bhamre et al., 2001;
see Fig. 3, table inset). Mutation of the middle base in
codon 49 (from an A to a T) retains the strong stem (allele
3), and the MI of that T reverting to wild-type A is the
highest of all (3.5). These relative MIs predict that the
reversion frequency of allele 3 should be higher than that
of 11 and 88, and that the latter alleles should have
comparable reversion frequencies. The table inset compares MIs with reversion frequencies. Figure 4 summarizes similar data for four more mutant alleles in codons
211 and 234 of trpA. Again, the relative MIs compare well
with their relative reversion frequencies, and MIs of the
mutant bases are higher than those of the wild-type base
to which they revert.
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Figures 3 and 4 also show that the highest –∆G SLSs
of wild-type sequences are often the same as that of their
mutant alleles. For example, in codon 49, the most stable
SLS of allele A 3 is the same as that of wild type. Similarly,
A 23 and A 26 in codon 211 and A 78 and A 58 in codon
234 have the same SLSs as their respective wild types.
Transcription should enhance the mutation frequency of
all mutable bases, including those of wild type, by localized increases in the number of SLSs.
The most probable SLSs formed by the non-transcribed
strands (NTSs) at peak –∆G values were compared with
those formed by the transcribed strands (TSs) and usually
found to mirror one another. In such genes, therefore,
mutations are equally probable in the NTS and TS. However, in the argH gene (and, to a lesser extent, leuB),
different structures form in 30 nt folds of the two strands,
as shown in Fig. 5A. This occurs because of nearest
neighbour interactions, the asymmetry of the base stacking rules and because a stem structure containing a G*T
bond forms in the TS (SantaLucia, 1998; Zuker et al.,
1999). This stem cannot form in the NTS, which has a C
and an A in a comparable region of the SLS. The longer
stem present in the TS bestows a higher –∆G and higher
MIs to that SLS compared with the NTS, and predicts that
mutations are more likely in the TS. In both the TS and
434 B. E. Wright et al.
Fig. 4. Predicted SLSs at peak negative energy values for wild type and two mutant alleles for codons 211 and 234 of trpA. Predicted MIs and
mutation frequencies determined experimentally are compared in the table inset. See Table 1 and text for details.
the NTS, the MIs of the mutable bases are somewhat
higher than the MIs of the wild-type bases (not shown). In
the NTS, folding 40 nt gives similar MIs for the first two
codon positions but increases the MI in the third position
from 1.5 to 2.8. In all folds above 20 nt, the third codon
position has the highest MI value, consistent with the
number of mutations determined by sequencing 23 revertants (Fig. 5A, table inset).
In Fig. 5B, a qualitative comparison of MIs with the
number of mutation events is shown for the leuB
mutant. In most cases, 30 nt SLSs have higher –∆G values than 20 nt SLSs but, in this case, the stems are the
same, and the long sequences at the ends of the 30 nt
SLSs may well be paired with their complements. As
shown in the table inset of Fig. 5B, the MIs of the mutable bases are comparable and consistent with the
Fig. 5. Predicted SLSs of argH (A) and leuB
(B) genes. The predicted MI of each mutable
base is indicated by an arrow. Inset table shows
the MI of each mutable base compared with the
number of mutations (No.) determined by
sequencing 23 revertants in the case of argH
and 36 in the case of leuB. See text for
discussion.
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Predicting mutations in stem–loop structures 435
results of sequencing 36 revertants (Wright and Minnick, 1997).
Table 1 summarizes the comparisons in the same
codon between calculated MIs and reversion frequencies
for the seven trpA auxotrophs, and Table 2 summarizes
sequencing data for the argH, leuB and lacZ auxotrophs.
Data are also available for a comparable analysis of 414
background mutable bases in lacI (Schaaper and Dunn,
1991; Fig. 6). In this sequence, all mutable bases within
a codon in which at least one base had three or more
mutational events were analysed, giving a total of 35. The
number of mutants shown above each base was counted,
and the results are summarized in Table 3. Mutation frequencies (trpA) and number of mutations (lacI) are plotted
against MIs to obtain linear regression analyses of the
data in Tables 1 and 3 (Fig. 7). The R2 value for the trpA
alleles is 0.57 and for the lacI data set is 0.002. Although
the qualitative comparisons of MIs and mutation events
are good for the mutable bases in Table 2, there are too
few data points for this type of analysis.
The triple mutant
The lacZ stop codon mutant and its triple revertant are
shown in Fig. 8. As mentioned previously, a 30 nt window was chosen for the analysis of SLSs. However,
when the very long and stable stem was observed, it
was necessary to increase the window to 32 nt in order
to include the TAG codon (Fig. 8). In the course of
sequencing 47 of the revertants, we observed many
types of single base revertants, a four codon deletion
and, on nine occasions, a triple mutation was isolated
(Table 2). As seen in Table 2, a number of amino acid
substitutions can result in an active enzyme, although
different growth rates are observed. This triple mutation
must have arisen by a templating mechanism. The reversion frequency of this stop codon is ≈ 10−9, and three
Table 1. Comparisons in the same codons between MIs and reversion frequencies in seven mutable alleles of the derepressed trpA
gene.
Revertants/108 cells
Codon
Mutant allele
MI
Expt Aa
Expt Bb
49
49
49
211
211
234
234
trpA
trpA
trpA
trpA
trpA
trpA
trpA
3.5
2.3
2.5
3.6
3.1
2.8
3.4
0.4
0.06
NDc
1.1
0.8
0.3
ND
1.3
0.2
0.3
ND
ND
0.3
1.8
3
11
88
23
46
58
78
a. Taken from Table 3 in Bhamre et al. (2001). For allele 58, the
frequencies of A to G and to C were summed.
b. Taken from Table 4 in Isbell and Fowler (1989). Only class I full
revertant frequencies were used.
c. No data available.
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Table 2. Comparisons between MIs and relative numbers of mutations determined by sequencing in eight mutable alleles of three
derepressed genes.
Mutant
Codon
Base change
MI
No. of mutations
argH
4
4
4
286
286
286
63
63
63
63
T to G, C, or A
G to T or C
A to C, G, or T
T to A or G
T to C
G to A, C or T
T to C, G or A
A to G or C
G to C or T
TAG to AGC
0.8
0.8
1.5
2.6
2.6
–
13.9
13.9
13.9
Triple
8
2a
13
20
16
Not observed
7b
19
12
9
leuB
lacZ
a. This number of revertants is lower than expected, but mutation of
G to A results in a stop codon that would not be observed.
b. Theoretically, mutations in the first position should occur as frequently as those in the second and third positions. This value may be
low due to (observed) lower growth rates of these revertants, just as
the A to G revertants (wild type) in the second position (wild type)
may be high because these revertants have the highest growth rate.
simultaneous independent mutations would have the
unlikely frequency of (10−9)3. Inspection of the sequences
at the end of the 32 nt SLS revealed that, were the structure extended to 40 nt, a bubble would appear in which
templating from the complementary strand could change
the TAG stop codon to the triple mutant shown in Fig. 8.
There is precedence for this mechanism in the literature
(Ripley, 1982), and the existence of this templated triple
mutation provides compelling evidence for the role of
SLSs in vivo.
Discussion
Evidence in the literature indicates that DNA sequence
can determine the location of a background mutation, but
the frequency with which it occurs depends primarily upon
various kinds of DNA-destabilizing metabolic events.
When growth is inhibited, DNA replication is minimized
and transcription of specific derepressed genes in
response to an inhibitor or condition of stress may then
be a major cause of mutations (Wright, 2000). In 11
strains of leuB and argH auxotrophs, isogenic except for
relA, a correlation was found between mutation rates,
ppGpp and mRNA levels (Wright, 1996; Wright and Minnick, 1997; Longacre et al., 1999; Wright et al., 1999).
Similar results have been obtained in B. subtilis, in which
reversion rates of three different amino acid auxotrophs
were higher in strains able to produce the transcriptional
activator guanosine tetraphosphate (Rudner et al., 1999).
Current studies with malT and lacZ auxotrophs also indicate that those strains able to activate transcription (cya+,
crp+) have higher reversion rates than the strains (cya and
crp mutants) unable to activate transcription (unpublished
data). The presence of these mutations in unpaired bases
436 B. E. Wright et al.
Fig. 6. The number of background mutations in lacI (Schaaper and Dunn, 1991).
of predicted SLSs implicates supercoiling and is consistent with the literature.
Background point mutations in unpaired bases occur by
known chemical mechanisms having finite, significant
activation energies under physiological conditions, and
such lesions can subsequently be immortalized by replication. The thermodynamic properties of hydrolytic reactions that occur in nucleic acids under these conditions
are such that the deamination of C is much greater than
that of A, and the depurination of G or A is much greater
than the depyrimidation of C or T (reviewed by Singer and
Sterns, 1982). Also, the oxidation potential of a G located
5′ to another G is greater than it is next to a C or T
(reviewed by Sugden and Stearns, 2000). Because of its
size, an A is more likely than a C or a T to replace a G.
The lacI data set (Fig. 6) is sufficiently large to document
these statements, i.e. 75% of the G mutations are to A,
and 88% of the C mutations are to T. A recent analysis
(Wright et al., 2002) of 14 000 mutations in the human p53
gene shows a similar pattern, with values of 74% and 91%
respectively.
A significant source of unpaired and mispaired bases
subject to the kinds of mutations described above are the
loops of SLSs that can often elude detection and repair
(Moore et al., 1999). These secondary DNA structures are
naturally present because of the frequent occurrence of
inverted complements in DNA sequences (Lilley, 1980;
Papyotatos and Wells, 1981). It should be pointed out that,
in general, specific positions in these SLSs (e.g. a base
near a stem) are uniquely vulnerable to mutation, regardless of which base is present at that position. During
transcript elongation and the probable continuous variations in the length of the nucleotide segments folded in
vivo, the stems, which create the SLSs, are the invariant
part. Thus, the most reliable location for an unpaired base
that evolved to be mutable is near a stem. In prokaryotes,
the most mutable bases are in codons that are within a
codon's length of a stem (Figs 3–5 and 8). In the eukaryotic p53 tumour suppressor gene, the most mutable bases
are located immediately next to stems in stable SLSs.
The predicted MIs correlate well (R 2 = 0.76) with those
observed for 14 000 human cancers, whereas no such
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Predicting mutations in stem–loop structures 437
Table 3. Comparisons in the same codon between MIs and the number of background mutations in 35 mutable bases of lacI.
Allele
MI
No. of mutations
41
42
56
57
80
81
82
83
84
86
87
89
90
92
93
95
96
104
105
116
117
140
141
149
150
167
168
169
177
178
185
186
188
189
190
3.6
0.4
0.6
0.6
3.9
3.2
3.2
0.8
0.8
1.4
4.3
4.7
4.7
4.7
3.4
2.7
0.3
3.1
3.4
1.5
0.1
4.2
3.5
0.6
2.2
3.5
3.5
3.3
3.8
1.3
2.0
3.5
2.9
2.8
3.1
2
3
6
10
3
5
1
17
5
7
3
1
4
7
16
5
10
29
5
4
1
1
12
3
2
15
6
1
6
1
9
15
6
3
3
correlation (R 2 = 0.0005) is seen for nearby control bases
(Wright et al., 2002).
Do measured mutation rates really represent mutation
rates in vivo? An appropriate gene to consider in discussing the true frequency of background mutations (whether
or not they are silent) is lacI, in which 25% of the bases
are known to be mutable (Fig. 6). However, 90% of these
mutations occur in the first two codon positions, as
changes in the third position are usually silent, i.e. do not
alter the encoded amino acid. As mutations are also
undoubtedly occurring in the third position, these can be
added to the 57 observed mutated bases. Thus, about
40% of the bases in lacI are vulnerable to mutation. This
does not include all the other silent mutations that do
not inactivate this repressor. Thus, the majority of all
bases appear to be mutable, but many of these are not
detectable.
Although SLSs are similar for mutable bases in the
auxotrophs and in lacI, a number of differences have been
noted. In lacI, mutable bases occur more frequently at a
greater distance from a stem, or in an unlooped, single
strand 8–11 bases from a stem. The existence of such a
structure may be unlikely, as that length of ssDNA would
probably be bonded to its complement as dsDNA. The
same mechanisms that give rise to the lacI mutations
must of course occur at some frequency in derepressed
auxotroph genes.
The analysis of background mutable bases in lacI indicates that as many mutations occur in high as in low –∆G
SLSs. In this system, all mutations have equal probabilities of being detected and sequenced. However, this
is not the case with revertants of auxotrophs selected
in derepressed genes during starvation. Transcription
enhances localized supercoiling and increases the concentration of SLSs that harbour unpaired bases susceptible to mutation. Therefore, these vulnerable bases located
in high –∆G SLSs are selected during starvation. Such
bases will mutate the most frequently, enrich the mutant
population with the most progeny and have the highest
probability of being isolated.
Many kinds of environmental challenge can target specific genes for supercoiling and enhanced mutagenesis.
Data summarized in this article indicate that nutritional
stress can localize enhanced mutability in SLSs resulting
from selective transcription. These circumstances can
Fig. 7. Linear regression analyses for the data in Tables 1 and 3. Mutation frequencies or number of mutations are plotted against MIs for seven
mutable alleles in the derepressed trpA gene (Isbell and Fowler, 1989; Bhamre et al., 2001), and for 35 background mutable alleles in lacI
(Schaaper and Dunn, 1991).
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
438 B. E. Wright et al.
Fig. 8. The 32 nt and 40 nt SLSs of lacZ. When
40 nt are folded, it can be seen that the triple
mutation, AGC, could arise by templating to the
complementary TCG sequence in the opposite
strand.
result in ‘forward’ mutations beneficial to evolution. For
example, during starvation for ribitol in the presence of
xylitol, two existing genes are modified to regulate the
transport and use of xylose as a new carbon source
(Lerner et al., 1964; Wright, 2000).
Other stressors, such as extreme osmotic conditions,
heat shock, host–pathogen interactions or metal toxicity,
could direct mutations to related genes potentially able
to alleviate these problems. (Borowiec and Gralla, 1987;
Higgins et al., 1988; Pruss and Drlica, 1989; Ansari
et al., 1992; Dorman, 1995; Jordi et al., 1995; Lefstin and
Yamamoto, 1998; Massey et al., 1999).
Specific areas of the genome are targeted for localized
transcription and supercoiling, increasing the concentration of SLSs and unpaired bases vulnerable to mutation.
This mechanism provides a continuous feedback loop
between the ever-changing environment, productive targets for localized enhanced mutation frequencies and
selection of the fittest. As most mutations are detrimental,
directing mutations to those genes regulating the cell's
response to each type of stress minimizes genome-wide
genetic damage while creating the most appropriate variants for accelerating evolution.
Just as DNA secondary structures may be considered
an additional dimension of gene expression (Wells et al.,
1980), so may the mutations that occur because of their
location in these structures. These structures are present
many thousands of times more frequently than expected
by chance (Lilley, 1980) and, in viral DNAs, they are overrepresented in regions that appear to have regulatory
significance (Müller and Fitch, 1982). In higher organisms, these hypermutable sequences have apparently
become localized to increase variants in genes of critical
importance to the survival of the organism (Forsdyke,
1995). However, the same mechanisms by which stress
can cause mutations and accelerate evolution in microorganisms may cause cancer in multicellular organisms
(Rady et al., 1992; Ionov et al., 1993; Skandalis et al.,
1994; Panagopoulos et al., 1997; Colman et al., 2000),
where survival of the fittest mutant can result in abnormal
growth.
There is reason to believe that the evolution of the DNA
world was paralleled by the development of systems such
as supercoiling to help maintain the integrity, geometry
and functionality of DNA (López-García, 1999). The
essential role of negative supercoiling in the regulation of
all aspects of DNA behaviour, especially in response to
environmental stress, is becoming increasingly apparent.
If specific conditions of stress target particular areas of
the genome for negative supercoiling, the formation of
secondary structures will be facilitated; if stress first activates transcription, this will in turn drive supercoiling. By
either scenario, the resulting increase in concentration of
SLSs will result in localized mutations, and a number
of circumstances can apparently converge to make
responses to particular adverse conditions quite specific.
In bases within the same codon, differential effects of
other variables affecting mutability should be minimal.
These contiguous bases obviously have a very similar
microenvironment with respect to supercoiling domains
(Sinden and Pettijohn, 1981; Krasilnikov et al., 1999),
ssDNA binding proteins, accessory proteins involved in
transcription, repair systems and so on. Therefore, comparing their relative mutation frequencies or events is ideal
for evaluating the relevance of the factors used in calculating MI, and for testing the validity of our proposed model
for predicting mutability. In fact, all our results are consistent with the location of mutable bases in SLSs. Based on
this assumption, MIs for mutable bases in selected genes
have, with unanticipated success, predicted 15 relative
mutation frequencies (trpA) or events (argH, leuB and
lacZ). This correlation occurred in spite of the fact that an
MI depends upon only two variables: (i) the highest –∆G
of all the SLSs in which it is unpaired; and (ii) the extent
to which the base is unpaired in all its foldings during
transcription.
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Predicting mutations in stem–loop structures 439
Experimental procedures
The analysis of DNA secondary structures
A new software tool called ‘mfg’ was developed for use in this
paper for predicting mutation frequencies. Given an input
sequence, the set of all subsequences containing that base
is generated. From this set, mfg calculates the singlestranded percentage of the base and the energy of the most
stable structure containing the base. These are multiplied to
produce the MI value for that base. A full description of
this program and instructions for its use can be found at
http://biology.dbs.umt.edu/wright/upload/mfg.html. The program currently runs on the Windows platform only, but source
code is available for porting to other platforms.
Mutation rate conditions used to obtain revertants for
sequencing
Mutation rates for argH and leuB were determined as
described previously (Wright and Minnick, 1997). The reversion rate of codon 63 of lacZ was determined as follows: E.
coli CSH2 cells from a 12- to 24-h-old rich media plate were
diluted in 75 ml of minimal medium, consisting of 50 mM
sodium phosphate buffer, pH 6.5, 1.0 g l−1 (NH4)2SO4, 1.0 g
l−1 MgSO4, 0.3 mM glucose and 0.3 mM tryptophan to a cell
density of 5 × 10−4 ml−1. This inoculum was divided into 45 2cm-diameter test tubes at 1.5 ml per tube and grown for 18–
28 h with shaking, at a 45° angle at 37°C. Cell growth was
followed by hourly sampling companion tubes at a 1:10 dilution in buffered saline. When the cell concentration reached
the end of log growth, serial dilutions were made from two
companion cultures, and 10−6 and 10−7 dilutions were plated
on rich nutrient agar plates and incubated for 24 h at 37°C
to determine viable counts. The contents of 40 culture tubes
were distributed on selective [5 g l−1 lactose, salts (as above),
0.3 mM tryptophan, 0.1 mM NaCl] agar plates. Plates were
incubated for 40–48 h at 37°C and evaluated for revertants
(Luria and Delbrück, 1943). Although mutation rates per se
are not reported in this paper, the above procedures were
used to obtain the revertants sequenced (Tables 2 and 3).
Sequencing of revertants
Revertants were isolated from mutation rate plates containing
a single colony at 72 h, restruck on selective plates and
grown for 24 h at 37°C. An inoculum was transferred to 2 ml
of Luria–Bertani (LB) broth and grown for 8 h with shaking at
37°C. The culture was transferred to a microcentrifuge tube
and pelleted at 10 000 g for 10 min. A Qiagen DNeasy tissue
kit was used to extract chromosomal DNA. The lacZ DNA
was amplified by polymerase chain reaction (PCR) in a
50 µl volume using 500 ng of template and the primer pair:
LacZ5′ (5′-ATGACCATGATTACGGATTC-3′) and LacZ3′ (5′TTATTTTTGACACCAGACCAAC-3′) used at a concentration
of 50 pmol per reaction. Other components of the PCR, all
from New England BioLabs, were 10× polymerase buffer,
VentR DNA polymerase, MgSO4 and dNTPs. After 30 cycles,
2 µl of the product was visualized on a 1% agarose gel
containing 0.5 µg ml−1 of ethidium bromide, and the remainder was cleaned with a Qiaquick PCR purification kit
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
(Qiagen). Sequencing was performed at the Murdock Molecular Biology Facility using a BigDye Terminator ready reaction
kit with an Applied Biosystems Model 373A stretch DNA
sequencer.
Acknowledgements
We are indebted to Drs Robert G. Fowler, Bryn Bridges, Roel
M. Schaaper and Scott Samuels for valuable criticisms of and
suggestions for the manuscript. We thank Dr George Card
for the CSH2 E. coli strain. This work was supported by
National Institutes of Health grants R15CA88893 and
R55CA99242 and the Stella Duncan Memorial Research
Institute.
References
Ansari, A.Z., Chael, M.L., and O'Halloran, T.V. (1992) Allosteric underwinding of DNA is a critical step in positive
control of transcription by Hg-MerR. Nature 355: 87–89.
Balke, V.L., and Gralla, J.D. (1987) Changes in the linking
number of supercoiled DNA accompany growth transitions
in Escherichia coli. J Bacteriol 169: 4499–4506.
Beletskii, A., and Bhagwat, A.S. (1996) Transcription-induced
mutations: increase in C-to-T mutations in the nontranscribed strand during transcription in Escherichia coli.
Proc Natl Acad Sci USA 93: 13919–13924.
Benham, C.J. (1992) Energetics of the strand separation
transition in superhelical DNA. J Mol Biol 225: 835–847.
Bhamre, S., Bedrick, B.G., Koyama, C.A., White, S.J., and
Fowler, R.G. (2001) An aerobic recA-, umuC-dependent
pathway of spontaneous base-pair substitution mutagenesis in Escherichia coli. Mutat Res 473: 229–247.
Borowiec, J.A., and Gralla, J.D. (1987) All three elements of
the lac ps promoter mediate its transcriptional response to
DNA supercoiling. J Mol Biol 195: 89–97.
Bridges, B.A. (2001) Hypermutation in bacteria and other
cellular systems. Phil Trans R Soc London B 356: 29–39.
Brock, R.D. (1971) Differential mutation of the β-galactosidase gene of Escherichia coli. Mutat Res 11: 181–186.
Colman, M.S., Afshari, C.A., and Barrett, J.C. (2000) Regulation of p53 stability and activity in response to genotoxic
stress. Mutat Res 462: 179–188.
Coulondre, C., Miller, J.H., Farabaugh, P.J., and Gilbert, W.
(1978) Molecular basis of base substitution hotspots in
Escherichia coli. Nature 274: 775–780.
Datta, A., and Jinks-Robertson, S. (1995) Association of
increased spontaneous mutation rates with high levels of
transcription in yeast. Science 268: 1616–1619.
Dayn, A., Malkhosyan, S., Duzhy, D., Lyamichev, V.,
Panchenko, Y., and Mirkin, S. (1991) Formation of (dA-dT)n
cruciforms in Escherichia coli cells under different environmental conditions. J Bacteriol 173: 2658–2664.
Dayn, A., Malkhosyan, S., and Mirkin, S.M. (1992) Transcriptionally driven cruciform formation in vivo. Nucleic Acids
Res 20: 5991–5997.
Dorman, C.J. (1995) DNA topology and the global control of
bacterial gene expression: implications for the regulation
of virulence gene expression. Microbiology 141: 1271–
1280.
Fix, D.F., and Glickman, B.W. (1987) Asymmetric cytosine
440 B. E. Wright et al.
deamination revealed by spontaneous mutational specificity in an UngG– strain of Escherichia coli. Mol Gen Genet
209: 78–82.
Forsdyke, D.R. (1995) Conservation of stem-loop potential in
introns of snake venom phospholipase A2 genes: an application of FORS-D analysis. Mol Evol Biol 12: 1157–1165.
Foster, P.L. (1999) Mechanisms of stationary phase mutation: a decade of adaptive mutation. Annu Rev Genet 33:
57–88.
Frederico, L.A., Kunkel, T.A., and Shaw, B.R. (1990) A sensitive genetic assay for the detection of cytosine deamination: determination of rate constants and the activation
energy. Biochemistry 29: 2532–2537.
Frederico, L.A., Kunkel, T.A., and Shaw, B.R. (1993)
Cytosine deamination in mismatched base pairs. Biochemistry 32: 6523–6530.
Hall, B.G. (1999) Spectra of spontaneous growth-dependent
and adaptive mutations at ebgR. J Bacteriol 181: 1149–
1155.
Herman, R.K., and Dworkin, N.B. (1971) Effect of gene
induction on the rate of mutagenesis by ICR-191 in Escherichia coli. J Bacteriol 106: 543–550.
Higgins, C.F., Dorman, C.J., Stirling, D.A., Waddell, L.,
Booth, I.R., May, G., and Bremer, E. (1988) A physiological
role for DNA supercoiling in the osmotic regulation of gene
expression in S. typhimurium and E. coli. Cell 52: 569–584.
Ionov, Y., Peinado, M.A., Malkhosyan, S., Shibata, D., and
Perucho, M. (1993) Ubiquitous somatic mutations in simple
repeated sequences reveal a new mechanism for colonic
carcinogenesis. Nature 363: 558–561.
Isbell, R.J., and Fowler, R.G. (1989) Temperature-dependent
mutational specificity of an Escherichia coli mutator,
dnaQ49, defective in 3′→5′ exonuclease (proofreading)
activity. Mutat Res 213: 149–156.
Jordi, B.J.A.M., Owen-Hughes, T.A., Hulton, C.S.J., and
Higgins, C.F. (1995) DNA twist, flexibility and transcription
of the osmoregulated proU promoter of Salmonella typhimurium. EMBO J 14: 5690–5700.
Kingston, R.E., Nierman, W.C., and Chamberlin, M.J. (1981)
A direct effect of guanosine tetraphosphate on pausing of
Escherichia coli RNA polymerase during RNA chain elongation. J Biol Chem 256: 2787–2797.
Krasilnikov, A.S., Podtelezhnikov, A., Vologodskii, A., and
Mirkin, S.M. (1999) Large-scale effects of transcriptional
DNA supercoiling in vivo. J Mol Biol 292: 1149–1160.
Lefstin, J.A., and Yamamoto, K.R. (1998) Allosteric effects of
DNA on transcriptional regulators. Nature 392: 885–888.
Lerner, S.A., Wu, T.T., and Lin, E.C.C. (1964) Evolution of a
catabolic pathway in bacteria. Science 146: 1313–1315.
Lilley, D.M.J. (1980) The inverted repeat as a recognizable
structural feature in supercoiled molecules. Proc Natl Acad
Sci USA 77: 6468–6472.
Lindahl, T., and Nyberg, B. (1972) Rate of depurination of
native deoxyribonucleic acid. Biochemistry 11: 3610–3618.
Lindahl, T., and Nyberg, B. (1974) Heat-induced deamination
of cytosine residues in deoxyribonucleic acid. Biochemistry
13: 3405–3410.
Lipschutz, R., Falk, R., and Avigad, G. (1965) Mutation rates
of the induced and repressed locus of alkaline phosphatase in Escherichia coli. Isr J Med Sci 1: 23–24.
Liu, L.F., and Wang, J.C. (1987) Supercoiling and the DNA
template during transcription. Proc Natl Acad Sci USA 84:
7024–7027.
Longacre, A., Reimers, J.M., and Wright, B.E. (1999) Specificity of transcription-enhanced mutations. Ann NY Acad
Sci 870: 383–385.
López-García, P. (1999) DNA supercoiling and temperature
adaptation: a clue to early diversification of life? J Mol Evol
49: 439–452.
Luria, S., and Delbrück, M. (1943) Mutations of bacteria from
virus sensitivity to virus resistance. Genetics 28: 491–504.
Massey, R.C., Rainey, P.B., Sheehan, B.J., Keane, O.M.,
and Dorman, C.J. (1999) Environmentally constrained
mutation and adaptive evolution in Salmonella. Curr Biol 9:
1477–1480.
Moore, H., Greenwell, P.W., Liu, C.-P., Arnheim, N., and
Petes, T.D. (1999) Triplet repeats form secondary structures that escape DNA repair in yeast. Proc Natl Acad Sci
USA 96: 1504–1509.
Müller, U.R., and Fitch, W.M. (1982) Evolutionary selection
for perfect hairpin structures in viral DNAs. Nature 928:
582–585.
Panagopoulos, I., Lasen, C., Isaksson, M., Mitelman, F.,
Mandahl, N., and Aman, P. (1997) Characteristic
sequence motifs at the breakpoints of the hybrid genes
FUS/CHOP, EWS/CHOP and FUS/ERG in myxoid liposarcoma and acute myeloid leukemia. Oncogene 15: 1357–
1362.
Pananicolaou, C., and Ripley, L.S. (1991) An in vitro approach to identifying specificity determinants of mutagenesis
mediated by DNA misalignments. J Mol Biol 221: 805–821.
Papyotatos, N., and Wells, R.D. (1981) Cruciform structures
in superhelical DNA. Nature 289: 466–470.
Pruss, G.J., and Drlica, K. (1989) DNA supercoiling and
prokaryotic transcription. Cell 56: 521–523.
Rady, P., Scinicariello, F., Wagner, F., and King, S. (1992)
p53 mutations in basal cell carcinoma. Cancer Res 52:
3804–3806.
Rahmouni, A.R., and Wells, R.D. (1992) Direct evidence for
the effect of transcription on local DNA supercoiling in vivo.
J Mol Biol 223: 131–144.
Ripley, L.S. (1982) Model for the participation of quasipalindromic DNA sequences in frameshift mutation. Proc
Natl Acad Sci USA 79: 4128–4132.
Rosenberg, S.M. (2001) Evolving responsively: adaptive
mutation. Nature Rev Genet 2: 504–515.
Rudner, R., Murray, A., and Huda, N. (1999) Is there a link
between mutation rates and the stringent response in
Bacillus subtilis? Ann NY Acad Sci 870: 418–422.
SantaLucia, J., Jr (1998) A unified view of polymer, dumbbell,
and oligonucleotide DNA nearest-neighbor thermodynamics. Proc Natl Acad Sci USA 95: 1460–1465.
Schaaper, R.M., and Dunn, R.L. (1991) Spontaneous mutation in the Escherichia coli lacI gene. Genetics 129: 317–
326.
Schaaper, R.M., Danforth, B.N., and Glickman, B.L. (1986)
Mechanisms of spontaneous mutagenesis: an analysis of
the spectrum of spontaneous mutation in the Escherichia
coli lacI gene. J Mol Biol 189: 273–284.
Sinden, R.R., and Pettijohn, D.E. (1981) Chromosomes in
living Escherichia coli are segregated into domains of
supercoiling. Proc Natl Acad Sci USA 78: 224–228.
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Predicting mutations in stem–loop structures 441
Singer, K.D., and Sterns, D.M. (1982) Chemical Mutagenesis. Annu Rev Biochem 52: 655–693.
Skandalis, A., Ford, B.N., and Glickman, B.W. (1994) Strand
bias in mutation involving 5-methylcytosine deamination in
the human hprt gene. Mutat Res 314: 21–26.
Sugden, K.D., and Stearns, D.M. (2000) The role of chromium (V) in the mechanism of chromate-induced oxidative
DNA damage and cancer. J Environ Pathol Toxicol Oncol
19: 215–230.
Todd, P.A., and Glickman, B.W. (1982) Mutational specificity
of UV light in Escherichia coli: indications for a role of
secondary structure. Proc Natl Acad Sci USA 79: 4123–
4127.
Wells, R.D., Goodman, T.C., Hillen, W., Horn, G.T., Klein,
R.D., Larson, J.E., et al. (1980) DNA structure and gene
regulation. Prog Nucleic Acid Res Mol Biol 24: 167–267.
Wright, B.E. (1996) The effect of the stringent response on
mutations in Escherichia coli K12. Mol Microbiol 19: 213–
219.
Wright, B.E. (1997) Does selective gene activation direct
evolution? FEBS Lett 402: 4–8.
© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441
Wright, B.E. (2000) A biochemical mechanism for nonrandom
mutations and evolution. J Bacteriol 182: 2993–3001.
Wright, B.E., and Minnick, M.F. (1997) Reversion rates in a
leuB auxotroph of Escherichia coli K12 correlate with
ppGpp levels during exponential growth. Microbiology 143:
847–854.
Wright, B.E., Longacre, A., and Reimers, J.M. (1999) Hypermutation in derepressed operons of Escherichia coli K12.
Proc Natl Acad Sci USA 96: 5089–5094.
Wright, B.E., Reimers, J.M., Schmidt, K.H., and Reschke,
D.K. (2002) Hypermutable bases in the p53 cancer gene
are at vulnerable positions in DNA secondary structures.
Cancer Res 62: 5641–5644.
Yanofsky, C.T., Platt, T., Crawford, I.P., Nichols, B.P.,
Christie, G.E., Horowitz, H., et al. (1981) The complete
nucleotide sequence of the tryptophan operon of Escherichia coli. Nucleic Acids Res 9: 6647–6668.
Zuker, M., Mathews, D.H., and Turner, D.H. (1999) A Practical Guide. In RNA Biochemistry and Biotechnology.
Barciszewski, J., and Clark, B.F.C. (eds). Dordrecht:
Kluwer Academic Publishers, pp. 11–43.