Protein Engineering vol.15 no.8 pp.669–675, 2002
Predicting the structure of protein cavities created by mutation
Claudia Machicado, Marta Bueno and Javier Sancho1
Departamento de Bioquímica y Biología Molecular y Celular, Facultad de
Ciencias, Universidad de Zaragoza, 50009 Zaragoza, Spain
1To whom correspondence should be addressed. E-mail: [email protected]
To assist in the efficient design of protein cavities, we have
developed a minimization strategy that can predict with
accuracy the fate of cavities created by mutation. We first
modelled, under different conditions, the structures of six
T4 lysozyme and cytochrome c peroxidase mutants of
known crystal structure (where long, hydrophobic, buried
side chains have been replaced by shorter ones) by minimizing the virtual structures derived from the corresponding
wild-type co-ordinates. An unconstrained pathway together
with an all-atom representation and a steepest descents
minimization yielded modelled structures with lower root
mean square deviations (r.m.s.d.) from the crystal structures
than other conditions. To test whether the method
developed was generally applicable to other mutations of
the kind, we have then modelled eighteen additional T4
lysozyme, barnase and cytochrome c peroxidase mutants
of known crystal structure. The models of both cavity
expanding and cavity collapsing mutants closely fit their
crystal structures (average r.m.s.d. 0.33 ± 0.25 Å, with
only one poorer prediction: L121A). The structure of
protein cavities generated by mutation can thus be confidently simulated by energy minimization regardless of the
tendency of the cavity to collapse or to expand. We think
this is favoured by the fact that the typical response
observed in these proteins to cavity-creating mutations is
to experience only a limited rearrangement.
Keywords: barnase/cytochrome c/lysozyme/peroxidase/
protein cavity/protein design/protein stability
Introduction
The design of protein ligands usually concentrates on achieving
a satisfactory chemical and shape complementarity between a
small molecule and a region of the protein surface (Amzel,
1998; Wlodawer and Vondrasek, 1998; Gane and Dean, 2000).
Towards that end, protein flexibility, which is highest at the
surface of proteins, poses a serious problem (Jones et al.,
1997). Because the X-ray or NMR structure of a given protein
target may be just one of the many possible with similar
energy, ligand binding may lead to a significant decrease in
the protein conformational entropy, with a concomitant loss
of binding energy. Experimental resolution of the structure of
complexes between designed ligands and their targets sometimes shows that the binding has occurred in a protein
conformation significantly different from that used for the
design (Schoichet et al., 1993). Coping with protein surface
flexibility is in fact a major issue in docking strategies
(Abagyan and Totrov, 2001).
© Oxford University Press
Unlike protein surfaces, the interior of proteins, far more
rigid and thus lacking the aforementioned problem, has so far
received little attention as a suitable scenario for ligand binding.
There are several reasons for this. First, proteins are very
compact and few cavities, usually small, are found in them
(Brunori and Gibson, 2001); second, hosting ligands in protein
cavities may face an important kinetic barrier compared with
binding at the surface; and, third, the hydrophobic protein
interior seems to offer little potential for specificity. However,
protein cavities can be easily made by truncation mutations;
the kinetic barrier can be overcome, if required, by refolding
the protein in the presence of excess ligand; and there is
recent mounting evidence that certain interactions involving
hydrophobic residues, such as the cation–π interaction
(Fernández-Recio et al., 1999; Gallivan and Dougherty, 1999)
and hydrogen bonding to π clouds (Steiner and Koellner,
2001), can be specific enough. In addition, conventional
hydrogen bonding to internal polar groups (such as the peptide
bond) can also be used to confer specificity to protein–ligand
complexes. The pioneering work of Matthews and co-workers
showed that small organic molecules could be hosted in protein
cavities created by mutation (Eriksson et al., 1992a). Further
work by this group indicates, however, that proteins do not
always act passively upon cavity-creating mutations but that
they often tend either to collapse or to expand around the
newly created cavity (Eriksson et al., 1992b; Xu et al., 1998).
This is certainly inconvenient if ligand binding sites are to be
created by rational side chain deletions inside a protein since
it introduces the need for a subsequent protein structure
determination in order to know the real structure of the cavity.
We show here that the fate (collapse, expansion or simply
no change) of protein cavities created by substituting apolar,
buried side chains by apolar smaller ones can be accurately
predicted by a simple method that involves energy minimization
under certain conditions. Our method predicts the coordinates
of the atoms that surround cavities created in T4 lysozyme,
barnase and cytochrome c peroxidase and therefore constitutes
a useful tool for designing ligand binding sites inside proteins.
Materials and methods
Protein X-ray structures
High-resolution crystal structures of T4 lysozyme, barnase and
cytochrome c peroxidase mutants, where buried, hydrophobic
side chains have been mutated to shorter ones, were used as
modelling targets (Goodin and McRee, 1993; Buckle et al.,
1996; Baldwin et al., 1998). Most mutants contain Leu, Val
or Ile to Ala substitutions, but mutants with Phe or Met to
Ala and also with Trp to Gly and Arg to Ala replacements
were also considered. The Protein Data Bank codes of the
proteins used are, for T4 lysozyme 2LZM [WT], 1L63 [pseudo
WT with C54T and C97A mutations], 1L67 [L46A], 1L90
[L99A], 1L69 [L133A], 200L [L121A], 1L85 [F153A], 226L
[L133G], 222L [M102A], 238L [V103A], 252L [M102A/M106A],
241L [I29A], 1L89 [L99A/F153A], 244L [I100A],
245L [M6A], 239L [I17A], 236L [V87A], 235L [V111A],
237L [V149A], 243L [I58A], 246L [F67A]; for barnase 1A2P
[WT], 1BRI [I76A], 1BRJ [I88A], 1BRK [I96A]; and for
cytochrome c peroxidase 1CCA [WT], 1CMQ [W191G], 1DJ1
[R48A]. The lysozyme L133A and L133G mutations are
introduced into the wild-type gene whereas all other lysozyme
mutants contain, in addition, C54T and C97A mutations.

Table I. Optimization of the energy minimization method for calculating the structure of lysozyme truncation mutants

                      Simulation conditions                       R.m.s.d. (Å)(a)
Algorithm             Steps   Gradient tolerance  Dielectric      Cavity-creating  Non-cavity-creating
                              (kcal/mol.Å)        method(d)       mutants(b)       mutants(c)
Steepest descents     10000   0.01                Constant        0.5–0.6          0.4–0.4
                       5000   0.01                Constant        0.4–0.5          0.2–0.3
                       5000   0.10                Constant        0.4–0.5          0.3–0.4
                       2000   0.10                Constant        0.3–0.5          0.2–0.6
                       2000   0.10                D–D             0.2–0.4          0.5–0.9
                       1000   0.10                Constant        0.3–0.4          0.4–0.5
                       1000   0.10                D–D             0.3–0.4          0.5–0.8
                        500   0.10                D–D             0.4–0.5          0.4–0.6
Conjugate gradients   10000   0.01                Constant        0.7–0.8
                       5000   0.01                Constant        0.6–0.8
                       5000   0.10                Constant        0.6–0.7
                       2000   0.10                Constant        0.6–0.7
                       2000   0.10                D–D             0.4–0.6
                       1000   0.10                Constant        0.6–0.7
                       1000   0.10                D–D             0.5–0.6
                        500   0.10                D–D             0.6–0.7

(a) Cavity r.m.s.d. were calculated from superimposed crystal and model structures, using the cavity surface side chain atoms (for cavity-creating mutants) or side chain atoms closer than 4 Å to the side chain of the mutated residue (for non-cavity-creating mutants). The r.m.s.d. refer to the atoms used for superposition. Overall r.m.s.d. of the protein structures were calculated after superposing the structures using all the atoms.
(b) L99A, M102A and L133A mutants.
(c) I76A, I17A and R48A mutants.
(d) A dielectric constant of 1 (Constant) or a distance-dependent dielectric constant (D–D) was used.
Energy minimization
All minimizations were carried out using the CHARMm force
field, as implemented in InsightII (MSI) (Brooks et al., 1983).
Explicit solvent molecules, as present in the crystal structure,
and hydrogen atoms were considered. Non-bond terms were
truncated at 11 Å (smoothing from 8 Å), with a switching
function for van der Waals and electrostatic terms (Brooks
et al., 1983). Since crystal water molecules were explicitly
included, a constant dielectric of 1 was used throughout the
minimizations. The non-bonded list, which defines the groups
of atoms included in the calculation of non-bond energies (van
der Waals and electrostatic), was updated every 10 steps.
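The non-bond treatment described above can be illustrated with a minimal sketch. It assumes the standard CHARMM-style switching function between an inner cut-off (8 Å) and an outer cut-off (11 Å) and a simple Coulomb term; the function names, the conversion constant and the `distance_dependent` flag are illustrative choices, not part of the InsightII/CHARMm interface used in this work.

```python
# Illustrative sketch only: names and constants are assumptions,
# not the InsightII/CHARMm API.
COULOMB_KCAL = 332.0716  # kcal*Å/(mol*e^2), usual conversion factor


def switch(r, r_on=8.0, r_off=11.0):
    """CHARMM-style switching function: 1 below r_on, 0 beyond r_off."""
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    num = (r_off**2 - r**2) ** 2 * (r_off**2 + 2.0 * r**2 - 3.0 * r_on**2)
    return num / (r_off**2 - r_on**2) ** 3


def coulomb(qi, qj, r, distance_dependent=False):
    """Switched Coulomb energy (kcal/mol) between two point charges.

    With distance_dependent=True the dielectric is taken as eps = r
    (a distance-dependent dielectric); otherwise eps = 1.
    """
    eps = r if distance_dependent else 1.0
    return COULOMB_KCAL * qi * qj / (eps * r) * switch(r)
```

With these definitions, interactions closer than 8 Å are unscaled, those beyond 11 Å vanish, and the region in between is smoothly damped, which avoids a discontinuity in the energy at the cut-off.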
Two thousand steps of conjugate gradients or steepest
descents were applied to each structure in an unconstrained
path (Fletcher and Reeves, 1964). Minimizations were started
from the X-ray structures of the wild-type T4 lysozyme,
barnase and cytochrome c peroxidase after having implemented
the appropriate in silico mutations. Nineteen T4 lysozyme,
three barnase and two cytochrome c peroxidase truncation
mutants of available X-ray structure were modelled. The
optimized protocol consisted of 2000 steps of steepest descents,
a distance-dependent dielectric constant (1 × r), a gradient
tolerance of 0.1 kcal/mol.Å, no constraints in the system, cut-offs
of 8 and 11 Å with a switching function to evaluate non-bond
interactions and updating of the non-bonded list every 10 steps.
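The stopping rules of the protocol (a maximum of 2000 steps and a gradient tolerance of 0.1 kcal/mol.Å) can be sketched on a toy energy function. This is only an illustration of an unconstrained steepest descents loop: the fixed step size, the use of the r.m.s. gradient as convergence measure and the harmonic test function are assumptions, and the actual minimizer in InsightII may adapt its step length differently.

```python
def steepest_descents(grad, x0, step=0.005, max_steps=2000, grad_tol=0.1):
    """Unconstrained steepest descents with a fixed step (assumed).

    grad: function returning the gradient (list of floats) at x.
    Stops when the r.m.s. gradient falls below grad_tol or after
    max_steps iterations. Returns (coordinates, steps_taken).
    """
    x = list(x0)
    for n in range(max_steps):
        g = grad(x)
        rms = (sum(gi * gi for gi in g) / len(g)) ** 0.5
        if rms < grad_tol:
            return x, n
        # Move every coordinate downhill along the negative gradient.
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x, max_steps


# Toy example: harmonic well E = 0.5 * k * sum(x_i^2), with k = 100.
def harmonic_gradient(x, k=100.0):
    return [k * xi for xi in x]


x_min, steps_taken = steepest_descents(harmonic_gradient, [1.0, -2.0, 0.5])
```

In this toy case the loop stops on the gradient criterion well before the step limit; for a protein the 2000-step limit would typically be the operative one.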
Measurement of cavity volume and cavity collapse or
expansion
Cavity volume was calculated using a probe radius of 1.4 Å
using the method implemented in Swisspdb Viewer (Guex and
Peitsch, 1996). Percentages of collapse of the modelled (after
minimization) and real cavities (as seen in the X-ray structures)
with respect to the theoretical cavities (in silico mutations
before minimization) were calculated as 100(Vt – Vm)/Vt and
100(Vt – Vc)/Vt, respectively, where Vt is the volume of the
cavity created by replacing in silico a given side chain by Ala
(or Gly) before any minimization is performed, Vm is the
volume of that cavity after minimization and Vc is the cavity
volume in the mutant X-ray structure. Negative percentages
of collapse indicate cavity expansion.
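The percentage of collapse defined above is simple arithmetic and can be written as a short helper. The function name is hypothetical; the example volumes are those listed for L46A and I100A in Table II, and the zero-volume guard reflects the 'not applicable' entries of that table.

```python
def percent_collapse(v_theoretical, v_observed):
    """Return 100 * (Vt - V) / Vt, or None when Vt is zero.

    v_theoretical: cavity volume after in silico mutation (Vt, Å^3).
    v_observed: cavity volume after minimization (Vm) or in the
    crystal structure (Vc). Negative values mean cavity expansion.
    """
    if v_theoretical == 0:
        return None  # no theoretical cavity: percentage not applicable
    return 100.0 * (v_theoretical - v_observed) / v_theoretical


collapse_l46a = percent_collapse(51, 32)    # crystal cavity of L46A
collapse_i100a = percent_collapse(19, 36)   # crystal cavity of I100A
```

A positive result corresponds to a cavity that shrinks relative to the theoretical one, a negative result to a cavity that expands.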
Comparison of modelled and X-ray structures and
calculation of solvent accessibility and of a flexibility index
To compare model and crystal structures, root mean square
deviations (r.m.s.d) were calculated. First, the model and
X-ray structures were superimposed in all-atom mode using
Swisspdb Viewer. Then, r.m.s.d. values for the atoms of surface
cavity residues, or of their side chains, were calculated.
Percentages of solvent exposure for the mutated side chains
in the wild-type structure were calculated using the Connolly
algorithm, with a 1.4 Å probe radius (Connolly, 1983). A
mean flexibility index of the protein structure at the mutation
region was calculated as the average of the B-values of the
atoms of the surface cavity residues.
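As an illustration of the two quantities just described, the sketch below computes an r.m.s.d. over matched atoms of two already superimposed structures and a mean B-value. It assumes that the superposition has been done beforehand (here it was done in Swisspdb Viewer) and that coordinates are supplied as matched (x, y, z) tuples; the function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """R.m.s.d. (Å) between two matched lists of (x, y, z) tuples.

    The structures are assumed to be superimposed already; no fitting
    is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and matched")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def mean_b_factor(b_values):
    """Flexibility index: average B-value of the selected atoms."""
    return sum(b_values) / len(b_values)
```

For example, `rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0.3), (1, 0.4, 0)])` gives the square root of 0.125, about 0.35 Å.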
Results and discussion
Optimization of the energy minimization protocol with a subset
of mutants
The minimization method was established by probing a range
of different conditions (see Table I). The energy minimization
algorithm was the first variable analysed. To do that we
modelled three cavity-forming T4 lysozyme mutants (L99A,
M102A, L133A) using either conjugate gradients or steepest
descents. The steepest descents algorithm consistently gave
better results, that is, lower r.m.s.d. values than conjugate
gradients for the three mutants analysed (Table I). Next, we
determined the number of iteration steps required to reproduce
best the crystal structures, using three T4 lysozyme
cavity-forming mutants (L99A, L133A and M102A) and three mutants
where no cavity appeared after mutation (I17A, I76A and
R48A). The best overall results for the steepest descents
algorithm were obtained with 2000 iteration steps, although
using 1000 steps yielded comparable results. The influence of
the number of iteration steps on the performance of the
conjugate gradients algorithm was similar. We then optimized
some other variables of the energy minimization protocol such
as the gradient tolerance value, update frequency and cut-off
values for non-bonded interactions. The most accurate results
were obtained using a gradient of 0.1 kcal/mol.Å as the
convergence criterion, updating every 10 steps and cut-offs of
8 and 11 Å, with a switching function, for non-bonded
interactions. Finally, a radius-dependent dielectric method was
selected to minimize the effects of long-range force truncation
(Brooks et al., 1985).

Table II. Cavity mutants modelling

                                                                  Theor./crystal cavity        Conjugate gradients          Steepest descents
Protein        PDB   Mutant       Struct.a    Acc.b   Flex.c    Vt(d)    Vc(e)  Red.(f)    Vm(g)  Red.(f)  Rmsd(h)    Vm(g)  Red.(f)  Rmsd(h)
T4 Lys         252L  M102A/M106A  Helix          15       Na        0        0       Na        0       Na     1.88        0       Na     0.48
               241L  I29A         Helix          15       Na       40       42       –5        0      100     1.03       31       23     0.23
               239L  I17A         Sheet           6       Na       34        0      100        0      100     0.67        0      100     0.16
               1L90  L99A         Helix           2    18.96      155      174      –12      167       –8     0.71      162       –5     0.24
               222L  M102A        Helix           5    20.01      121       94       22       95       21     0.82      105       13     0.44
               226L  L133G        Helix           5    19.98      191      182        5      163       15     0.87      177        7     0.31
               1L69  L133A        Helix           5    20.92      171      150       12      132       21     0.81      147       14     0.34
               1L85  F153A        Helix           0    16.78      161      110       32      112       30     0.70      149        7     0.28
               1L89  L99A/F153A   Helix           2    17.59      275      265        4      255        7     0.65      268        3     0.29
               1L67  L46A         Helix           0    21.18       51       32       37       19       63     0.61       43       16     0.23
               244L  I100A        Helix           0    16.36       19       36      –89       25      –32     0.67       31      –63     0.38
               236L  V87A         Helix           0    19.19       32       45      –41       29        9     0.35       40      –25     0.22
               235L  V111A        Helix           0    19.12       57       46       19       22       63     0.28       50       12     0.22
               237L  V149A        Helix           0    17.33       40       47      –18       30       25     0.25        0      100     0.23
               245L  M6A          Helix           0    21.62       74       58       22       52       30     0.70       52       30     0.23
               238L  V103A        Helix           8    17.61       14       14        0       23      –64     0.85        0      100     0.23
               246L  F67A         Helix           0    19.19       40       40        0       50      –25     0.38       83     –108     0.39
               243L  I58A         Sheet           0    21.09       92       67       27       34       63     0.57       40       57     0.32
               200L  L121A        Helix           2    17.68      172       50       71       87       49     1.43      148       14     1.46
Barnase        1BRI  I76A         Sheet           0       Na       32        0      100       14       56     0.58        0      100     0.19
               1BRJ  I88A         Sheet           0    15.69       23       21        9       13       43     0.44       15       35     0.30
               1BRK  I96A         Sheet           0    22.37       39       40       –3       35       10     0.99       48      –23     0.32
Cyt. c perox.  1DJ1  R48A         Helix           5       Na        0        0       Na        0       Na     0.59        0       Na     0.31
               1CMQ  W191G        Turn            0    13.48      164      184      –12      202      –23     0.78      204      –24     0.19

Struct., secondary structure; Acc., residue accessibility (%); Flex., surface cavity residues flexibility; Vt, theoretical cavity volume; Vc, crystal cavity volume; Vm, model cavity volume; Red., cavity volume reduction (%); Rmsd, r.m.s.d. of surface cavity side chains; Na, not applicable.
(a) As stated in the crystallographic coordinates files.
(b) Calculated from the crystallographic file of the psWT-lysozyme (1L63), WT barnase (1A2P) and WT cytochrome c peroxidase (1CCA) using InsightII (MSI).
(c) Average B-factors of the corresponding side chains in psWT-lysozyme (1L63), WT barnase (1A2P) and WT cytochrome c peroxidase (1CCA).
(d) Calculated after in silico mutagenesis, without minimization; represents the volume of the newly created cavities before relaxing to their final conformation (Å³).
(e) Calculated from the crystal coordinates of each mutant structure (Å³).
(f) Calculated as 100(Vt – Vx)/Vt, where Vt is the theoretical cavity volume and Vx is the volume calculated from the crystallographic coordinates or the modelled coordinates of the mutant after minimization.
(g) Calculated after minimization, as described in Materials and methods (Å³).
(h) Root mean square deviations (Å) for the side chain atoms were calculated after superimposing those side chains of the mutant crystal and model structures using InsightII (MSI). In mutants not forming cavities, the r.m.s.d. were calculated for residues in a 4 Å radius from the mutated side chain.
Prediction of final cavity volume and of the cavity tendency
to reduce or to expand
The volumes of the cavities present in the X-ray structures of
the truncation mutants, calculated using a probe radius of
1.4 Å, are compared in Table II with those of the theoretical
cavities that would have arisen if no protein rearrangements
had occurred as a consequence of the mutations. The volumes
of these virtual or theoretical cavities were calculated from
the coordinates of the wild-type proteins modified so that the
side chains mutated to alanine appeared truncated to their
β-carbons (to the α-carbon in the L133G and W191G mutants).
As Table II shows, many of these virtual cavities tend to
collapse to some extent, some markedly, as a consequence
of the protein rearrangements that lead to the most stable
conformation available in the mutants. In some cases, however,
the most stable conformation is attained by enlarging the
virtual cavity, as reflected by a larger cavity volume in the
crystal structure. To quantify cavity expansion or collapse, we
calculate percentages of volume reduction from the volumes
of the virtual and X-ray cavities (Eriksson et al., 1992b). Our
values differ slightly from those quoted by Eriksson et al.
(1992b) and by Buckle et al. (1996) because they used for the
probe a radius of 1.2 Å and for the β-carbon 2.02 Å, whereas
in our calculations we used a probe radius of 1.4 Å and for
the β-carbon 2.10 Å (Xu et al., 1998).
To test the performance of the minimization procedure
in predicting the volumes of protein cavities originating in
truncation mutations, we modelled the structure of 24 truncation
mutants by both steepest descents and conjugate gradients and
then calculated the volumes of the modelled cavities (Table II).
These modelled volumes are compared with the experimental
values in Figure 1. As shown, both conjugate gradients and
steepest descents algorithms yield cavity volumes in excellent
agreement with the experimental values.
A more stringent test of the minimization procedure can be
made by assessing its ability to predict the individual fate of
each cavity. To assess the ability of our minimization procedure
to capture the intrinsic tendency of engineered protein cavities
to collapse or to expand, we calculated the percentages of cavity
reduction predicted from the models, which are compared in
Figure 2 with the corresponding experimental values. It can
be seen that both cavity expansions and cavity collapses are
well predicted by the two algorithms. The performance of
the minimization procedure is remarkable when the steepest
descents algorithm is used: in only two cases is a cavity that
expands predicted to collapse (I29A and V149A) and in no
case is a collapsing cavity predicted to expand (see Table II).

Fig. 1. Comparison of crystal cavity volumes and modelled cavity volumes.
The volumes were calculated as described in Materials and methods. Top,
conjugate gradients; bottom, steepest descents. Circles, lysozyme mutants;
triangles, barnase; squares, cytochrome c peroxidase.

Fig. 2. Percentage of volume reduction in the modelled and crystal
structures relative to the theoretical cavity volume (see Materials and
methods). Top, conjugate gradients; bottom, steepest descents. Circles,
lysozyme mutants; triangles, barnase; squares, cytochrome c peroxidase.
Two mutants with either crystal or model cavity volumes of zero, leading to
artefactually large percentages of volume reduction (V149A, V103A), and
one mutant with no change of cavity volume (F67A) are not considered in
the fit (open symbols).

Fig. 3. Representative modelling of cavity expansion and reduction in T4
lysozyme mutants using steepest descents energy minimization. (A) Cavity
expansion in the L99A T4 lysozyme mutant; (B) cavity reduction in the
V111A T4 lysozyme mutant.

Table III. R.m.s.d. (Å) of theoretical and of steepest descents-modelled cavities

Mutant         Theoretical/crystal   Model/crystal
M102A/M106A    0.52                  0.48
I29A           0.37                  0.23
I17A           0.45                  0.16
L99A           0.19                  0.24
M102A          0.39                  0.44
L133G          0.50                  0.31
L133A          0.51                  0.34
F153A          0.23                  0.28
L99A/F153A     0.29                  0.29
L46A           0.22                  0.23
I100A          0.36                  0.38
V87A           0.54                  0.22
V111A          0.17                  0.22
V149A          0.15                  0.23
M6A            0.37                  0.23
V103A          0.29                  0.23
F67A           0.19                  0.39
I58A           0.33                  0.32
L121A          1.45                  1.46
I76A           0.25                  0.19
I88A           0.27                  0.30
I96A           0.28                  0.32
R48A           0.25                  0.31
W191G          0.30                  0.19
Average        0.37 ± 0.26           0.33 ± 0.25
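The comparison of predicted and observed cavity fates discussed in this section amounts to comparing the sign of two percentages of volume reduction. A minimal sketch follows, using four entries whose crystal and steepest descents values are taken from Table II; the function name, the dictionary layout and the zero tolerance are illustrative assumptions.

```python
def fate(reduction_pct, tol=0.0):
    """Classify a percentage of volume reduction as a cavity fate."""
    if reduction_pct > tol:
        return "collapse"
    if reduction_pct < -tol:
        return "expansion"
    return "no change"


# (crystal reduction %, steepest descents model reduction %), from Table II
reductions = {
    "L99A": (-12, -5),
    "V111A": (19, 12),
    "I29A": (-5, 23),
    "V149A": (-18, 100),
}

# Mutants whose modelled fate differs from the crystallographic one.
mismatches = sorted(
    name for name, (crystal, model) in reductions.items()
    if fate(crystal) != fate(model)
)
```

For this subset the mismatches are I29A and V149A, the two expanding cavities that the steepest descents models predict to collapse.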
Accurate prediction of cavity structure
The main goal of the present study, however, was to predict
in detail the structural response of proteins to cavity-creating
mutations; in other words, to be able to calculate accurately
the structure of the protein around the cavity without having
to determine it from X-ray or NMR experiments. In this
respect, our energy minimization procedure is able to predict
not only the fate of cavities (i.e. to reduce or to expand), but
also their actual shape. To assess the similarity between the
modelled structures and their corresponding crystal structures,
they were superimposed and the r.m.s.d. of the atoms in cavity
surface residues were calculated (Table II). Only the atoms in
the side chains were computed as they are expected to show
a greater mobility and therefore to depart more from the
theoretical structure than main chain atoms. The conjugate
gradients method yields modelled structures with r.m.s.d. from
0.25 to 1.88 Å for cavity surface side chain atoms, with typical
values around 0.7 Å. The best performance is again offered
by the steepest descents; 96% of the structures calculated (23
out of 24) show, at cavity surface side chains, r.m.s.d. from
0.16 to 0.48 Å. The only model that is predicted with lower
accuracy is that of L121A, whose cavity experiences by far
the largest volume reduction of all the cavities studied, perhaps
because the leucine is slightly exposed (Table II) and in
a sequence segment rich in residues with high B-factors
(not shown).

Fig. 4. Rearrangement of cavity surface side chains in the L99A lysozyme
mutant accurately modelled by steepest descents minimization. The figure
shows the movement of F114, F153 and Y88 at the cavity surface from the
theoretical structure (green) to the crystal (blue) and modelled (red)
structures.
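The summary figures quoted for the steepest descents models can be recomputed from the model/crystal r.m.s.d. values of Table III. The sketch below uses Python's standard statistics module and assumes that the quoted spread is a sample standard deviation; the variable names are illustrative.

```python
import statistics

# Model/crystal r.m.s.d. (Å) of the 24 steepest descents models (Table III)
sd_rmsd = [
    0.48, 0.23, 0.16, 0.24, 0.44, 0.31, 0.34, 0.28, 0.29, 0.23, 0.38, 0.22,
    0.22, 0.23, 0.23, 0.23, 0.39, 0.32, 1.46, 0.19, 0.30, 0.32, 0.31, 0.19,
]

# Models falling in the 0.16-0.48 Å range quoted in the text.
within_range = [v for v in sd_rmsd if 0.16 <= v <= 0.48]
fraction_pct = 100.0 * len(within_range) / len(sd_rmsd)

mean_rmsd = statistics.mean(sd_rmsd)
spread_rmsd = statistics.stdev(sd_rmsd)  # sample standard deviation (assumed)
```

Only one value (1.46 Å, the L121A model) falls outside the range, so 23 of the 24 models, about 96%, are within it.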
Thus, the minimization protocol presented here (using steepest descents) allows the calculation of the coordinates of side
chains facing internal protein cavities created by truncation
mutations with high accuracy for most of the cases tested
[incidentally, for all the cases where the mutated side chain is
completely buried (13 structures; see Table II)]. Although the
minimization protocol presented here follows an unconstrained
path, no significant distortion of the protein structures occurs
outside the cavity regions (not shown). Therefore, minimizing
a protein mutant bearing an internal side chain deletion by
this procedure essentially yields the protein X-ray structure in
a very short time. Two examples (one of cavity expansion and
one of cavity collapse) where the excellent prediction of cavity
surface atoms coordinates can be seen are shown in Figure 3.
Suitable mutations for cavity prediction
To assess the extent to which the performance of the minimization is compromised in particular types of mutations, we
analysed whether the quality of the models (as judged from
the model/crystal cavity side chains r.m.s.d. in Table III) is
related to a number of structural characteristics of the cavities
created by the mutations. Neither the average B-factor of
cavity surface side chain atoms (taken as a measure of local
flexibility), nor the solvent accessibility of the mutated residue
(within our 0–15% window), nor its secondary structure
location, nor the theoretical volume of the cavity is related
to the accuracy of the models (Table II). This indicates that
the minimization can be applied in principle to any buried
(>85%) apolar residue in a given protein. We note, however,
that the mutant with the highest r.m.s.d. (L121A) is slightly
exposed and close to crystal water molecules. An obvious
limitation of the method is that it does not attempt to predict
whether a given mutation will yield a thermodynamically
stable protein or whether water will bind to the cavity; these
issues have to be determined by experiment.
The reaction of proteins to cavity-creating mutations
Our analysis of the rearrangements experienced by 24
cavity-creating mutants from three different proteins allows an
interesting conclusion to be drawn: for mutations involving apolar,
buried side chains, the rearrangements experienced by these
proteins are almost invariably small. In many cases (13 in 24),
the protein collapses slightly by an average of 29 ± 31 Å³
(leaving L121A aside: 21 ± 13 Å³); in four cases there is no
significant volume change; and in seven cases there is an
average volume increase of 11 ± 10 Å³. This is a particularly
favourable scenario for ligand binding design because it
suggests that, in many instances, the structure of the cavity
mutant could be taken, as a first approximation, as that of the
theoretical structure resulting from the in silico implementation
of the mutation. To test this, we compare in Table III the
similarity between the modelled and crystal structures with
that between the theoretical and crystal structures. As Table
III shows, despite the fact that the energy minimization
consistently approximates cavity volumes to those appearing
in the crystal structures and predicts the tendency of the cavity
to expand or to collapse (Table II and Figures 1 and 2), the
model/crystal r.m.s.d. are only slightly lower on average than
the theoretical/crystal r.m.s.d. However, this direct comparison
is complicated by the fact that, unlike the modelled structures,
wild-type crystal structures (from which all theoretical
structures are calculated) and mutant structures have presumably
been subjected to the same minimization procedure in the
laboratories where the structures were solved, which can
increase the model/crystal r.m.s.d. relative to the theoretical/
crystal values. Thus, a first approximation to the structure of
cavity-creating mutants can be obtained by simply performing
the in silico mutation, while a more highly refined structure is
obtained when this theoretical structure is subjected to the
optimized minimization procedure reported here, which captures
the tendency of the cavity to expand or to collapse. The
minimization is especially important in some instances where
significant displacements of surface-located side chains occur
upon mutation because the minimized structures do reveal
those movements (Figure 4). We offer in Table IV, for
experimental testing, a list of predicted cavity sizes of not yet
reported mutants of the three proteins.

Table IV. Predicted cavities

Protein            Mutant        Theoretical cavity   Model cavity   Model cavity volume
                                 volume (Å³)          volume (Å³)    reduction (%)
T4 lysozyme        L99A/I78A     205                  214            –4.4
                   I3A           14                   21             –50
                   L7A/I100A     85                   93             –9.4
                   V71A          14                   0              100
                   F153A/L133A   284                  265            6.7
                   M102A/L133A   250                  223            10.8
Barnase            I96A/I88A     95                   106            –11.6
                   I76A/I88A     61                   66             –8.2
                   L89A          62                   36             41.9
                   L20A          27                   0              100
                   I51A          40                   33             17.5
Cyt. c peroxidase  L206A         89                   99             –11.2
                   V47A          32                   40             –25
                   F158A         167                  185            –10.8
                   I53A          50                   20             60
                   V47A/L177A    153                  132            13.7
                   L144A         190                  142            25.3
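The counts and average volume changes quoted in this section can be recomputed from the theoretical and crystal cavity volumes of Table II. The sketch below does this with the standard statistics module; it assumes sample standard deviations and exact equality of the tabulated (rounded) volumes as the criterion for 'no change', so small differences from the quoted spreads may arise from rounding.

```python
import statistics

# (theoretical volume, crystal volume) in Å^3 for the 24 mutants of Table II
volumes = [
    (0, 0), (40, 42), (34, 0), (155, 174), (121, 94), (191, 182),
    (171, 150), (161, 110), (275, 265), (51, 32), (19, 36), (32, 45),
    (57, 46), (40, 47), (74, 58), (14, 14), (40, 40), (92, 67),
    (172, 50), (32, 0), (23, 21), (39, 40), (0, 0), (164, 184),
]

changes = [vt - vc for vt, vc in volumes]       # positive: collapse
collapses = [c for c in changes if c > 0]
expansions = [-c for c in changes if c < 0]
unchanged = [c for c in changes if c == 0]

mean_collapse = statistics.mean(collapses)
sd_collapse = statistics.stdev(collapses)
# Same statistics leaving aside the largest collapse (L121A, 122 Å^3).
without_l121a = [c for c in collapses if c != 122]
mean_without = statistics.mean(without_l121a)
sd_without = statistics.stdev(without_l121a)
mean_expansion = statistics.mean(expansions)
```

The split into 13 collapsing, 4 unchanged and 7 expanding cavities follows directly from the sign of each volume change.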
Conclusion
X-ray analysis of protein response to cavity-creating mutations
had shown that replacement of buried, bulky, hydrophobic
side chains by alanine leads to slight side chain adjustments
rather than to substantial repacking of protein cores. Perhaps
for this reason the simple minimization procedure implemented
here can predict with high accuracy the structure of the mutated
proteins so that their coordinates can be obtained from those
of the corresponding wild-type proteins without having to
perform X-ray or NMR studies. We hope that this will stimulate
the consideration of protein cores as suitable scenarios for
binding site design and the use of proteins as small molecule
carriers and deliverers.
References
Abagyan,R. and Totrov,M. (2001) Curr. Opin. Chem. Biol., 5, 375–382.
Amzel,L.M. (1998) Curr. Opin. Biotechnol., 9, 366–369.
Baldwin,E., Baase,W., Zhang,X., Feher,V. and Matthews,B. (1998) J. Mol.
Biol., 277, 467–485.
Brooks,B., Bruccoleri,R., Olafson,B., States,D., Swaminathan,S. and
Karplus,M. (1983) J. Comput. Chem., 4, 187–217.
Brooks,C., Brunger,A. and Karplus,M. (1985) Biopolymers, 24, 843–865.
Brunori,M. and Gibson,Q. (2001) EMBO Rep., 8, 674–679.
Buckle,A., Cramer,P. and Fersht,A. (1996) Biochemistry, 35, 4298–4305.
Connolly,M. (1983) Science, 221, 709–713.
Eriksson,A., Baase,W., Wozniak,J. and Matthews,B. (1992a) Nature, 355,
371–373.
Eriksson,A., Baase,W., Zhang,X., Heinz,D., Blaber,M., Baldwin,E. and
Matthews,B. (1992b) Science, 255, 178–183.
Fernández-Recio,J., Romero,A. and Sancho,J. (1999) J. Mol. Biol., 290,
319–330.
Fletcher,R. and Reeves,C. (1964) Comput. J., 7, 148–154.
Gallivan,J. and Dougherty,D. (1999) Proc. Natl Acad. Sci. USA, 96, 9459–9464.
Gane,P. and Dean,P. (2000) Curr. Opin. Struct. Biol., 10, 401–404.
Goodin,D. and McRee,D. (1993) Biochemistry, 32, 3313–3324.
Guex,N. and Peitsch,M. (1996) Protein Data Bank Q. Newsl., 77, 7.
Jones,G., Willett,P., Glen,R., Leach,A. and Taylor,R. (1997) J. Mol. Biol.,
267, 727–748.
Schoichet,B., Stroud,R., Santi,D., Kuntz,I. and Perry,K. (1993) Science, 259,
1445–1450.
Steiner,T. and Koellner,G. (2001) J. Mol. Biol., 305, 535–557.
Wlodawer,A. and Vondrasek,J. (1998) Annu. Rev. Biophys. Biomol. Struct.,
27, 249–284.
Xu,J., Baase,W., Baldwin,E. and Matthews,B. (1998) Protein Sci., 7, 158–177.
Received December 3, 2001; revised April 19, 2002; accepted May 21, 2002