Electrostatic Complementarity at Protein/ Protein Interfaces

J. Mol. Biol. (1997) 268, 570±584
Electrostatic Complementarity at Protein/
Protein Interfaces
Airlie J. McCoy*, V. Chandana Epa and Peter M. Colman
Biomolecular Research Institute
343 Royal Parade, Parkville
Victoria 3052, Australia
Calculation of the electrostatic potential of protein± protein complexes
has led to the general assertion that protein ±protein interfaces display
``charge complementarity'' and ``electrostatic complementarity''. In this
study, quantitative measures for these two terms are developed and used
to investigate protein± protein interfaces in a rigorous manner. Charge
complementarity (CC) was de®ned using the correlation of charges on
nearest neighbour atoms at the interface. All 12 protein ± protein interfaces studied had insigni®cantly small CC values. Therefore, the term
charge complementarity is not appropriate for the description of protein ± protein interfaces when used in the sense measured by CC. Electrostatic complementarity (EC) was de®ned using the correlation of surface
electrostatic potential at protein ±protein interfaces. All twelve protein ±
protein interfaces studied had signi®cant EC values, and thus the assertion that protein ±protein association involves surfaces with complementary electrostatic potential was substantially con®rmed. The term
electrostatic complementarity can therefore be used to describe protein ±
protein interfaces when used in the sense measured by EC. Taken
together, the results for CC and EC demonstrate the relevance of the
long-range effects of charges, as described by the electrostatic potential at
the binding interface. The EC value did not partition the complexes by
type such as antigen ±antibody and proteinase ±inhibitor, as measures of
the geometrical complementarity at protein± protein interfaces have done.
The EC value was also not directly related to the number of salt bridges
in the interface, and neutralisation of these salt bridges showed that
other charges also contributed signi®cantly to electrostatic complementarity and electrostatic interactions between the proteins. Electrostatic complementarity as de®ned by EC was extended to investigate the
electrostatic similarity at the surface of in¯uenza virus neuraminidase
where the epitopes of two monoclonal antibodies, NC10 and NC41, overlap. Although NC10 and NC41 both have quite high values of EC for
their interaction with neuraminidase, the similarity in electrostatic potential generated by the two on the overlapping region of the epitopes is
insigni®cant. Thus, it is possible for two antibodies to recognise the electrostatic surface of a protein in dissimilar ways.
# 1997 Academic Press Limited
*Corresponding author
Keywords: charge complementarity; continuum electrostatics model;
electrostatic complementarity; protein interaction; protein interfaces
Present address: A. J. McCoy: Medical Research Council Laboratory of Molecular Biology, Hills Road, Cambridge
CB2 2QH, UK
Abbreviations used: HEL, hen egg white lysozyme; D1.3, anti-HEL antibody D1.3; hGH, human growth hormone;
hGHR, human growth hormone receptor; HyHEL10, anti-HEL antibody HyHEL10; HyHEL5, anti-HEL antibody
HyHEL5; NA (N9), in¯uenza virus neuraminidase (subtype 9); NC10, anti-NA antibody NC10; NC41, anti-NA
antibody NC41; Fv, variable domain of antibody; Fab, antigen-binding fragment of antibody; OMKTY3, turkey
ovomucoid third domain; VH, variable domain of the heavy chain of antibody.
0022±2836/97/170570±15 $25.00/0/mb970987
# 1997 Academic Press Limited
571
Electrostatic Complementarity
Introduction
The nature of the electrostatic interaction between proteins has long been a subject of study.
Elementary analysis has involved simply counting
the number of charged residues and salt bridges at
protein ±protein interfaces (Janin & Chothia, 1990;
Jones & Thornton, 1995). With the development of
the continuum electrostatics model for proteins,
it has become routine to calculate and display
the electrostatic potential of protein structures
by numerically solving the Poisson-Boltzmann
equation for the protein± solvent system (as implemented in the program DelPhi; Gilson & Honig,
1988; Nicholls & Honig, 1991). The electrostatic
potential is generally visualised by colour coding
regions red for negative potential and blue for
positive potential, and is indicated either on the
molecular surface of the protein or by drawing
equipotential contours in the space around the protein. Such visualisations have been important in
gaining insight into the docking of proteins and
their small molecule ligands (for example, superoxide and copper, zinc superoxide dismutase; Getzoff
et al., 1983) and predicting sites of protein ±protein
association (for example, nerve growth factor;
McDonald et al., 1991). Where the structure of a
protein ±protein complex has been determined, continuum electrostatics models can be used to visualise the electrostatic potential at the surface buried in
the interface and to calculate the contribution to the
free energy of binding made by the electrostatic
union of the proteins (Gilson & Honig, 1988).
Various analyses of the electrostatic nature of
protein ±protein interfaces using these techniques
has led to the general assertion that proteins and
protein surfaces that interact with one another have
``charge complementarity'' (for example, Novotny
& Sharp, 1992; Roberts et al., 1991) or ``electrostatic
complementarity'' (for example, Braden & Poljak,
1995; Demchuk et al., 1994; Hendsch & Tidor, 1994;
Karshikov et al., 1992; Lescar et al., 1995; Novotny &
Sharp, 1992), and that this property of protein ±protein interaction is important for de®ning speci®city.
However, to date there has not been a rigorous
study of the complementarity of the charges and
electrostatic potential at protein ±protein interfaces.
The aim of this study is to de®ne quantitative
measures for charge complementarity and electrostatic complementarity in agreement with their
qualitative descriptions as measurements of the
complementarity of the charges and electrostatic
potential, respectively, at the protein ±protein interface itself. These measures for charge complementarity and electrostatic complementarity are then
calculated for a variety of protein ±protein complexes, and conclusions drawn about the validity of
the measurements and the nature of protein ±protein interactions. We examine how electrostatic
complementarity is affected by the various parameters used for modelling the electrostatic potential with the ®nite difference Poisson-Boltzmann
method, in order to select the most appropriate
parameters for the comparative study. We show
that while the electrostatic potential at protein± protein interfaces can exhibit quite signi®cant electrostatic complementarity, none of the protein
complexes selected shows any signi®cant charge
complementarity. Salt bridges across the interface
can play an important role in determining the magnitude of complementarity, but because of the effect
of all the other charges there can be signi®cant electrostatic complementarity even when the salt
bridges are computationally neutralised. Finally,
we examine the similarity of the electrostatic potential generated by antibodies NC10 and NC41 at the
surface of in¯uenza virus neuraminidase protein
(NA) where their epitopes overlap and ®nd that it is
insigni®cant, even though each antibody has high
electrostatic complementary with NA at the overlapping epitopes.
Theory
The measures of charge and electrostatic complementarity developed here are based on the correlation coef®cients used by Chau & Dean (1994) to
analyse complementarity at the interfaces between
proteins and their small molecule ligands (for
example, retinol-binding protein and retinol). The
study also uses the interaction energy, Gint, and
the Coulombic energy, GCoul, which are components of the total electrostatic free energy Gelec,
calculated by two different thermodynamic pathways for protein association as modelled by continuum electrostatics (Gilson & Honig, 1988).
Measure of charge complementarity: CC
First, all nearest neighbour contacts made between the proteins in the complex are determined.
Where an atom is in contact with more than one
atom in the other protein, each contact is counted
separately. Charges are then assigned to the atoms
in contact, and the correlation coef®cient r between
the lists of charges calculated. The charge complementarity (CC) is then de®ned as:
CC ˆ ÿr
…1†
The more positive the value of CC, the more
complementary are the charges on the surface of
the interface, and the more negative the CC
value, the more similar is the distribution of
charges on the interface. Pearson's correlation test
or Spearman's rank correlation test (see Materials
and Methods) are used for calculating the correlation. Thus, two CC values are calculated:
CCP ˆ ÿrP
…2†
CCS ˆ ÿrS
…3†
where the superscript indicates the type of correlation calculated: P refers to Pearson's correlation
coef®cient and S refers to Spearman's rank correlation coef®cient.
572
Electrostatic Complementarity
As the charge of each atom is not given by X-ray
diffraction data, the charges must be assigned from
a parameter set. Formal and PARSE (Sitkoff et al.,
1994) charge parameter sets were used in this
study. The Formal charge set only distributes partial charges to the peptide main chain (N ˆ ÿ0.35;
H ˆ ‡0.25; Ca ˆ ‡0.1; C ˆ ‡0.55; O ˆ ÿ0.55), formally charged residues (Lys Nz ˆ ‡1.0; Arg
NZ1,Z2 ˆ ‡0.5; Glu Oe1,e2 ˆ ÿ0.5; Asp Od1,d2 ˆ ÿ0.5),
and the amino (N ˆ ‡1.0) and carboxy (O ˆ ÿ1.0)
termini. The PARSE charges and radii have been
parameterised to reproduce the experimental solvation energies of amino acids when computed
with DelPhi (Sitkoff et al., 1994). Partial charges are
assigned to atoms in all amino acid side-chains
even if the net charge on the residue is zero.
Measure of electrostatic complementarity: EC
The molecular surface on each of the proteins
(proteins 1 and 2) buried upon complex formation
is calculated to give two interface surfaces (surfaces
1 and 2). For the electrostatic ®elds generated by
proteins 1 and 2 of the complex individually, the
values of each of the electrostatic potential at each
point of the two interface surfaces 1 and 2 are determined, generating four lists of electrostatic potential
(one for each combination of electrostatic ®eld and
surface). We use the ®nite difference Poisson-Boltzmann method (Gilson & Honig, 1988), as implemented in DelPhi, to compute this electrostatic
potential. In this continuum electrostatic model, the
interior of the molecule is considered to have a low
dielectric constant while the surrounding solvent is
taken to be a continuum of high dielectric constant.
Correlations are next calculated between the
electrostatic potential generated by proteins 1 and
2 on the buried molecular surface of protein 1 (r1)
and also between the electrostatic potentials generated by proteins 1 and 2 on the buried molecular
surface of protein 2 (r2). Electrostatic complementarity (EC) is then de®ned as:
EC ˆ ÿ…r1 ‡ r2 †=2
…4†
EC is thus positive when the electrostatic potentials
on the surface are complementary and negative
when the electrostatic potentials on the surface are
similar.
The electrostatic potential depends on the position of the dielectric boundaries used to generate
the electrostatic ®eld. The electrostatic potential
generated by either fully solvated proteins
(Figure 1(a)), or proteins partially desolvated by
the volume of the other protein in the complex
(Figure 1(b)), can be used. Thus, the following two
EC values are calculated:
ECFS ˆ ÿ…r1FS ‡ r2FS †=2
…5†
ECPS ˆ ÿ…r1PS ‡ r2PS †=2
…6†
where the subscript indicates the type of electro-
static potential: FS refers to electrostatic potential
generated by fully solvated proteins, and PS refers
to electrostatic potential generated by partially solvated proteins.
The value of EC also depends on the type of correlation calculated, Pearson's correlation or Spearman's rank correlation (see Materials and
Methods). Thus, when combined with the two
types of electrostatic potential, four EC values are
calculated:
2;P
ECPFS ˆ ÿ…r1;P
FS ‡ rFS †=2
…7†
2;S
ECSFS ˆ ÿ…r1;S
FS ‡ rFS †=2
…8†
2;P
ECPPS ˆ ÿ…r1;P
PS ‡ rPS †=2
…9†
2;S
ECSPS ˆ ÿ…r1;S
PS ‡ rPS †=2
…10†
where the superscript indicates the type of correlation calculated: P refers to Pearson's correlation,
and S refers to Spearman's rank correlation.
Electrostatic energies
Gint is part of a thermodynamic pathway in
which Gelec is divided into partial desolvation
and interaction components (Gilson & Honig,
1988). In the ®rst step, each molecule is partially
desolvated by removing the high dielectric medium (i.e. the solvent) from the region that the
other molecule occupies after binding and replacing it with the low dielectric medium (i.e. the
uncharged protein). The charge ± solvent inter1
action energies of this step are Gprotein
and
solv
protein 2
Gsolv
. In the second step, the interaction
energy, Gint, between the proteins in the presence of solvent is calculated by charging the low
dielectric cavity now present:
Gint ˆ i qi i
…11†
where qi are the newly created charges, and the
potential i at qi is generated by the already
charged, partially desolvated molecule. The value
of Gint is the same regardless of which protein
is initially charged, so that:
1
2
‡ Gprotein
‡ Gint …12†
Gelec ˆ Gprotein
solv
solv
GCoul is part of a thermodynamic pathway in
which Gelec is divided into full desolvation and
Coulombic energies (Gilson & Honig, 1988). First,
each protein is fully desolvated by removing the
high dielectric medium (i.e. the solvent) and
replacing it with the low dielectric medium (i.e.
same as that of the protein), giving the desolva1
tion energy of each protein, Gprotein
and
solv
protein 2
. Secondly, the Coulombic energy,
Gsolv
GCoul, between the proteins is calculated as the
difference between the energy needed to bring all
charges from in®nity to the positions de®ned by
the structure of the complex, and the sum of the
573
Electrostatic Complementarity
Figure 1. (a) Correlations between electrostatic potential on buried molecular surfaces of proteins 1 and 2 used in EC
calculation. Electrostatic potential generated by fully solvated proteins. Buried molecular surfaces indicated with thick
black line. (b) Correlations between electrostatic potential on buried molecular surfaces of proteins 1 and 2 used in
EC calculation. Electrostatic potential generated by partially solvated proteins. Buried molecular surfaces indicated
with thick black line. FS
1 , electrostatic potential generated by charges on protein 1 when protein 1 is fully solvated;
PS
FS
2 , electrostatic potential generated by charges on protein 2 when protein 2 is fully solvated; 1 , electrostatic potenPS
tial generated by charges on protein 1 when protein 1 is partially desolvated; 2 , electrostatic potential generated by
charges on protein 2 when protein 2 is partially desolvated.
energies required to bring the charges of each
protein individually from in®nity to the same
location, in the dielectric medium of the protein.
Finally, the solvation energy of the complex,
1‡2
Gprotein
, is calculated, so that:
solv
1
2
Gelec ˆ ÿ Gprotein
ÿ Gprotein
solv
solv
protein 1‡2
‡ GCoul ‡ Gsolv
…13†
Results and Discussion
Charge complementarity
The calculation of CC requires the identi®cation
of atoms neighbouring one another in the interface.
The distance chosen as the cut-off for including
nearby atoms as neighbours is somewhat arbitrary
because the distance over which charge interactions occur is not ®xed: the electrostatic potential
between two charges approaches zero asymptoti-
574
Electrostatic Complementarity
Table 1. Charge complementarity at protein ± protein interfaces
PDB
1CHOa
1FDLb
1NCAc
1NMBd
2HFLe
2PTCf
2SECg
2SNIg
3HFMh
3HHRi
3HHRi
4HTCj
Formal
Protein 1
Protein 2
CCP
Chymotrypsin
HEL
NA
NA
HEL
Trypsin
Subtilisin
Subtilisin
HEL
hGH
hGH- hGHR1
Thrombin
OMKTY3
D1.3
NC41
NC10
HyHEL5
BPTI
Eglin c
CI-2
HyHEL10
hGHR1
hGHR2
Hirudin
ÿ0.06
ÿ0.08
0.04
0.01
0.01
0.01
ÿ0.08
ÿ0.05
0.01
0.02
0.01
0.00
CCS
CCP
ÿ0.05
ÿ0.04
0.06
0.00
0.00
0.01
ÿ0.04
ÿ0.03
0.04
ÿ0.02
0.00
0.02
ÿ0.02
ÿ0.07
0.02
ÿ0.01
0.03
0.03
ÿ0.03
ÿ0.07
ÿ0.01
0.03
0.00
0.02
PARSE
CCS
0.01
ÿ0.06
0.03
0.00
0.02
0.03
ÿ0.01
ÿ0.08
ÿ0.01
0.04
0.00
0.01
a
Fujinaga et al. (1987). b Bhat et al. (1990). c Tulip et al. (1992). d Malby et al. (1994). e Sheriff et al. (1987). f Huber
et al. (1974). g McPhalen & James, (1988). h Padlan et al. (1989). i de Vos et al. (1992). The co-ordinates were used
to represent two complexes: the binding of hGH to the monomeric hGHR (hGHR1), and the binding of the second hGHR molecule (hGHR2) to the preformed hGH/hGHR complex. These complexes re¯ect the sequential
mode of binding of hGH to its receptor. j Rydel et al. (1991).
cally with the inverse of their separation. In this
study the cut-off distances considered for nearest
Ê , 3.5 A
Ê and 4.0 A
Ê,
neighbours are taken to be 3.0 A
Ê is the preferred distance for the separwhere 3.0 A
ation of charged groups involved in ion pairs
Ê
within proteins (Barlow & Thornton, 1983), 3.5 A
the distance often taken as the maximum length of
a hydrogen bond (Baker & Hubbard, 1984; Rashin
Ê the distance often taken
& Honig, 1984), and 4.0 A
as the maximum separation of charged groups
involved in ``ion pairs'' within proteins (Barlow &
Thornton, 1983).
Charge complementarity was calculated at each
of a variety of protein ±protein interfaces chosen
from the PDB, using both the Formal and PARSE
charge parameter sets. Table 1 shows the calculated CC values for these interfaces using a cut-off
Ê . It can be seen that none of the
distance of 3.5 A
complexes has signi®cant CCP or CCS values for
either of the charge parameter sets. Similarly insigni®cant values are obtained at cut-off distances of
Ê and 4.0 A
Ê (results not shown). Therefore, pro3.0 A
tein ±protein interfaces do not exhibit charge complementarity as measured by CC. A similar
conclusion was drawn by Chau & Dean (1994) for
protein ±small molecule ligand interfaces.
For all but one of the 24 proteins involved in the
12 complexes considered, the sum of the PARSE
assigned charges on atoms making an interface
Ê is becontact within a cut-off distance of 3.0 A
tween 0 and ÿ7.5. The exception is OMKTY3 in
the chymotrypsin/OMKTY3 complex (‡0.2). The
presence of negative charge on the solvent-exposed
surface of proteins has been noted (Novotny &
Sharp, 1992) and attributed to the high partial
charge of carbonyl oxygens (qˆ ÿ 0.55), which are
commonly oriented towards the surface of the protein. The propensity for the interface surfaces to be
negatively charged decreases with increasing cutÊ , four interoff distance: when the cut-off is 3.5 A
faces carry a net positive charge, and when it is
Ê , eight interfaces carry a net positive charge.
4.0 A
Most of the trend towards positive values is due to
a reduction in the dominance of the negatively
charged carbonyl oxygen atoms by the inclusion of
the positively charged backbone carbon atoms to
which they are bonded, within the larger cut-off
distances.
Is CC a good measure of charge complementarity? The CC values are robust to changes in the
nearest-neighbour cut-off distance and the charge
parameter set. However, an analysis of the chargemediated interaction between proteins based solely
on charges at the interface ignores the long-range
effects of other charges, which have been shown to
be important by several mutagenesis studies. Random mutagenesis of a single chain Fv fragment of
D1.3 produced two mutants with improved af®nity and on rates for lysozyme, each with three
amino acid substitutions (Pro40Leu, Asp58Ala and
Asn83Asp in mutant 1; Glu3Lys, Ile29Thr and
Thr97Ser in mutant 2) (Hawkins et al., 1993). In
each case, two of the three changes occur well outside the interface (Pro40Leu and Asn83Asp in the
heavy chain of mutant 1; Glu3Lys and Thr97Ser in
the light chain of mutant 2) while the other
changes either remove one hydrogen bond
(Asp58Ala in mutant 1) or are of a residue not involved in hydrogen bonding (Ile29Thr in mutant
2). The improvement in af®nity is therefore
mediated by residues outside the interface. Longrange interactions have also been observed for the
HyHEL10 ±HEL complex, where mutation of
Asp101 and Lys49 to neutral residues has a signi®cant effect on binding to a range of lysozymes,
although neither residue is in direct contact with
the antigen (Lavoie et al., 1992). Similarly, nine
mutations arising from af®nity maturation of the
esterase activity of antibody 48G7 are not at sites
in direct contact with the hapten (Patten et al.,
1996). These long-range electrostatic effects can be
accounted for by using the electrostatic potential
rather than the charges as the basis for the analysis
of the interaction between proteins.
575
Electrostatic Complementarity
Table 2. Electrostatic complementarity at selected protein ± protein interfaces
PDB
Protein 1
a
1CHO
1FDLb
1FDLb
1NCAc
1NCAc
1NMBd
2HFLe
2HFLe
2PTCf
2SECg
2SNIg
3HFMh
3HFMh
3HHRi
3HHRi
4HTCj
Protein 2
Chymotrypsin OMKTY3
HEL
D1.3 Fab
HEL
D1.3 Fv
NA
NC41 Fab
NA
NC41 Fv
NA
NC10
HEL
HyHEL5 Fab
HEL
HyHEL5 Fv
Trypsin
BPTI
Subtilisin
Eglin c
Subtilisin
CI-2
HEL
HyHEL10 Fab
HEL
HyHEL10 Fv
hGH
hGHR1
hGH/hGHR1
hGHR2
Hirudin
Thrombin
ECPFS
ECSFS
ECPPS
ECSPS
0.16 3
0.14 2
0.11 1
0.21 3
0.19 4
0.15 3
0.22 3
0.20 4
0.07 5
0.15 6
0.11 0
0.08 4
0.07 3
0.20 1
0.21 8
0.21 0
0.53 5
0.39 4
0.40 3
0.44 6
0.44 6
0.28 3
0.59 7
0.61 6
0.13 6
0.67 0
0.51 2
0.22 6
0.22 6
0.61 6
0.37 1
0.70 0
0.52 1
0.57 1
0.57 0
0.71 0
0.71 1
0.61 2
0.76 2
0.73 2
0.28 3
0.56 4
0.43 2
0.27 4
0.28 2
0.66 0
0.62 2
0.61 2
0.47 3
0.65 3
0.65 3
0.61 2
0.62 2
0.54 3
0.69 6
0.69 6
0.40 2
0.68 1
0.54 0
0.42 7
0.44 4
0.65 3
0.51 1
0.76 2
Gint
ÿ41
ÿ36
ÿ36
ÿ102
ÿ100
ÿ52
ÿ87
ÿ81
ÿ55
ÿ49
ÿ38
ÿ33
ÿ33
ÿ71
ÿ87
ÿ107
GCoul
ÿ95
84
112
ÿ176
ÿ169
ÿ86
ÿ195
ÿ209
206
ÿ31
ÿ68
ÿ169
ÿ152
ÿ55
105
ÿ579
Values are given as EC , where EC‡ and EC ÿ are correlations r1 and r2; given in the last decimal place. Energies in kcal
molÿ1 at 25 C.
a
Fujinaga et al. (1987). Ka ˆ 1.8 1011 Mÿ1 (Fujinaga et al., 1987).
b
Bhat et al. (1990). Ka ˆ 2.7 108 Mÿ1 (Bhat et al., 1994).
c
Tulip et al. (1992). Where the Fab was truncated to an Fv fragment, residues 1 to 109 of the variable domain light chain and residues 1 to 113 of the variable domain heavy chain were included. Ka ˆ 7.0 107 Mÿ1 (L. C. Gruen & A. A. Kortt, unpublished
results).
d
Malby et al. (1994). The constant domain of the Fab was not modelled. Ka ˆ 1.1 106 Mÿ1 (L.C. Gruen & A. A. Kortt, unpublished results).
e
Sheriff et al. (1987). The non-standard amino acid at the amino terminus of the heavy chain of the antibody was deleted as there
were no charge parameters for this residue. Where the Fab was truncated to an Fv fragment, residues 1 to 109 of the variable
domain light chain and residues 1 to 113 of the variable domain heavy chain were included. Ka ˆ 4.3 1010 Mÿ1 (Janin, 1995).
f
Huber et al. (1974).
g
McPhalen & James, 1988. Ka ˆ 4.0 109 Mÿ1 (Janin, 1995).
h
Padlan et al. (1989). Where the Fab was truncated to an Fv fragment, residues 1 to 109 of the variable domain light chain and
residues 1 to 113 of the variable domain heavy chain were included. Ka ˆ 1.5 109 Mÿ1 (Padlan et al. 1989).
i
de Vos et al. (1992). The co-ordinates were used to represent two complexes: the binding of hGH to the monomeric hGHR
(hGHR1), and the binding of the second hGHR molecule (hGHR2) to the preformed hGH ±hGHR complex. These complexes re¯ect
the sequential mode of binding of hGH to its receptor. Ka ˆ 2.5 109 Mÿ1 for the hGHR1± hGH complex (Cunningham et al., 1991).
j
Rydel et al. (1991). Ka ˆ 4.3 1012 Mÿ1 (Braun et al., 1988).
Electrostatic complementarity
EC was calculated for a selection of proteinase ±
inhibitor, ligand ± receptor and antigen ±antibody
complexes (see Table 2). Gint and GCoul were
also calculated for each of the protein ±protein
complexes for comparison with EC. Gelec was not
calculated because it includes the (unfavourable)
desolvation energy of each protein, which is a
property of the interaction of the protein with
solvent rather than the interaction between the
proteins.
All EC values shown in Table 2 are signi®cantly
positive. It thus appears that protein± protein interfaces have anti-correlated (complementary) surface
electrostatic potentials. Chau & Dean (1994) drew a
similar conclusion for protein-small molecule
ligand interfaces. For each complex, ECPFS values
are much lower than ECPPS values. The ECPPS and
ECSPS values are highly correlated with each other
so that the ordering of complexes by either parameter is similar. The patterns of electrostatic potential on the buried molecular surfaces for the
complexes with the highest and lowest ECPPS values
(HEL ± HyHEL5 Fab and HEL ± HyHEL10 Fab) are
shown in Figure 2.
Values of EC for the complexes between antigens
and Fab fragments are seen to be very similar to
the EC value for the complex between the antigens
and Fv fragments. This implies that charges and
Ê from the interface
dielectric boundaries over 20 A
do not signi®cantly affect the electrostatic potential
at the interface.
The EC values shown in Table 2 do not partition
the complexes by type (proteinase ± inhibitor, antibody ± antigen or ligand ±receptor). This is in contrast to the partitioning of antigen ±antibody
complexes from other protein ±protein complexes
by shape complementarity coef®cient, SC
(Lawrence & Colman, 1993), and gap volume
index (Jones & Thornton, 1995). The lower shape
complementarity of antigen ± antibody interfaces
has been attributed to the differences in evolutionary history between these complexes and other
protein± protein complexes (Lawrence & Colman,
1993). The different evolutionary histories of the
complexes do not manifest themselves in the electrostatic complementarity of the systems.
It is important to note that EC concerns the electrostatic potential at the interface and as such is
not directly related to either Gint or GCoul. EC is
high when the peaks in the values of the electrostatic potential generated by one protein match the
Figure 2. The electrostatic potential on the molecular surfaces buried in the interface between the two proteins of the
complex. The labelling of the surfaces and type of electrostatic potential used follows the nomenclature introduced in
Figure 1. The views of surfaces in each interface are related by a 180 rotation about an axis oriented north ± south on
the page. For the purposes of the Figures, buried molecular surfaces were calculated using GRASP (written by A.
Nicholls & B. Honig), although for the calculation of EC described in the text, the buried molecular surfaces were calculated using the MS program suite. Electrostatic potential was calculated using DelPhi (as implemented in GRASP)
as for the calculation of EC described in the text, except that for the Figures, the grid size was limited to a box of
dimensions 65 65 65. Correlation coef®cients for the calculation of EC were calculated between the sets of electrostatic potential illustrated across the page in each of the panels. The most negative electrostatic potential throughout
all space for each electrostatic ®eld is indicated in red and the most positive electrostatic potential throughout all
space for each electrostatic ®eld is indicated in blue. (a) Electrostatic potential on the molecular surfaces buried at the
interface between HEL (protein 1) and antibody HyHEL5 (protein 2); PDB entry 2HFL. Electrostatic potential generÿ1
at 25 C. Right column: FS
ated by fully solvated proteins. Left column: FS
1 , range ÿ15 to 28 kcal mol
2 , range ÿ27
ÿ1
to 18 kcal mol at 25 C. Top row: molecular surface of HEL buried by HyHEL5. Bottom row: molecular surface of
HyHEL5 buried by HEL. (b) Electrostatic potential on the molecular surfaces buried at the interface between HEL
(protein 1) and antibody HyHEL5 (protein 2); PDB entry 2HFL. Electrostatic potential generated by partially solvated
ÿ1
ÿ1
at 25 C. Right column: PS
at
proteins. Left column: PS
1 , range ÿ15 to 40 kcal mol
2 , range ÿ84 to 16 kcal mol
25 C. Top row: molecular surface of HEL buried by HyHEL5. Bottom row: molecular surface of HyHEL5 buried by
HEL. (c) Electrostatic potential on the molecular surfaces buried at the interface between HEL (protein 1) and antibody HyHEL10 (protein 2), PDB entry 3HFM. Electrostatic potential generated by fully solvated proteins. Left colÿ1
ÿ1
at 25 C. Right column: FS
at 25 C. Top row:
umn: FS
1 , range ÿ22 to 22 kcal mol
2 , range ÿ27 to 16 kcal mol
molecular surface of HEL buried by HyHEL10. Bottom row: molecular surface of HyHEL10 buried by HEL. (d) Electrostatic potential on the molecular surfaces buried at the interface between HEL (protein 1) and antibody HyHEL10
(protein 2); PDB entry 3HFM. Electrostatic potential generated by partially solvated proteins. Left column: PS
1 ,
ÿ1
at 25 C. Top row: molecular surrange ÿ47 to 58 kcal molÿ1 at 25 C. Right column: PS
2 , range ÿ38 to 16 kcal mol
face of HEL buried by HyHEL10. Bottom row: molecular surface of HyHEL10 buried by HEL.
577
Electrostatic Complementarity
Table 3. Variation of EC, Gint, and GCoul with the parameters of the model for complex NA/NC10, PDB entry
1NMB (Malby et al., 1994)
ECPFS
ECSFS
ECPPS
ECSPS
Gint
GCoul
Reference
0.15 3
0.28 3
0.61 2
0.54 3
ÿ52
ÿ86
Sampling of buried molecular surface
Ê2
Ê 2 to 16 grids/A
increased from 4 grids/A
MS surface probe radius increased from
Ê to 1.7 A
Ê.
1.4 A
DelPhi grid decreased from 201 201 201
to 65 65 65 points/side
DelPhi surface probe radius increased
Ê to 1.7 A
Ê
from 1.4 A
Hydrogen positions changed to those not
minimised in the electrostatic ®eld
Dielectric used to model the interior of
protein increased from 2 to 4
Charge distribution changed to Formal
parameters set
0.14 3
0.29 2
0.61 1
0.54 1
ÿ52
ÿ86
0.13 1
0.29 5
0.61 3
0.53 4
ÿ52
ÿ86
0.17 3
0.28 1
0.58 3
0.50 1
ÿ56
ÿ83
0.15 3
0.29 3
0.59 2
0.54 1
ÿ59
ÿ86
0.12 3
0.27 2
0.57 3
0.50 4
ÿ47
ÿ78
0.19 2
0.29 3
0.60 2
0.54 3
ÿ27
ÿ43
0.10 1
0.25 3
0.36 5
0.41 1
NCb
‡37
Parameter variation
a
Values given as EC , where EC‡ and EC ÿ are correlations r1 and r2; given in the last decimal place. Energy in kcal
molÿ1 at 25 C.
a
Parameter values described in Materials and Methods.
b
NC; energy had not converged.
troughs in the electrostatic potential generated by
the other protein, irrespective of the absolute values of the electrostatic potential. Gint is calculated using the electrostatic potential throughout
the space of the docked protein and is dependent
on the absolute values of the electrostatic potential
at each of the atoms. Despite these fundamental
differences, EC and Gint are correlated. This arises
because the distribution of electrostatic potential
for the protein± protein interfaces has a mean of
zero and approximately the same standard deviation, so that the distribution of absolute values of
electrostatic potential is similar for all complexes.
GCoul is not correlated with EC, but is large and
negative when the proteins carried large net
charges of opposite sign, and is correlated with the
product of the net charges on the two proteins. It
appears that the complementarity of the surface
electrostatic potential at the site of interaction of
two proteins is a feature of the docking even when
GCoul is positive (i.e. is unfavourable). In particular EC, Gint and GCoul are not necessarily correlated with the af®nity of interaction, since total G
of binding includes contributions from other enthalpic and entropic terms.
Values of the four measures of EC (ECPPS, ECSPS,
ECPFS, and ECSFS), are more stable under changes in
the electrostatic parameters of the model than are
the electrostatic energies Gint and GCoul (Table 3;
see Materials and Methods for a discussion of the
dependence of electrostatic complementarity on
various parameters used in the model). In particular, ECPFS and ECSFS, (being correlations) are much
more stable than Gint and GCoul (being sums) to
changes in the probe radius used for generating
the dielectric boundary and the value of the
interior dielectric constant. EC is thus a robust
measure of electrostatic complementarity.
Salt bridges
Previous qualitative studies of electrostatic complementarity have been mainly concerned with the
presence or absence of salt bridges in the interface
of protein± protein complexes. For example, the
HEL ±HyHEL5 complex has been regarded as a
complex with high electrostatic complementarity
because there are three salt bridges in the interface,
while the HEL ± HyHEL10 and HEL ± D1.3 complexes have been regarded as having less electrostatic complementarity as they have only one weak
salt bridge, and no salt bridges, respectively (Slagle
et al., 1994). However, the calculations of EC, and
indeed Gint and GCoul, take into account all the
charges, and not just the net charge on amino acids
forming salt bridges in the interface, nor even just
all the surface charges. Thus, the relative values of
EC for the HEL ± HyHEL5, HEL ±HyHEL10 and
HEL ±D1.3 complexes (see Table 2) are not proportional to the number of salt bridges, although
the HEL ± HyHEL5 complex does have the highest
EC value of the three.
In order to determine the in¯uence of salt
bridges, EC was recalculated after neutralising the
salt bridges at the interface of those protein ±protein complexes that have intermolecular salt
bridges, i.e. the side-chains involved in the salt
bridges were assigned the partial charges of the
uncharged amino acid as given in the PARSE
charge parameter set. These results are shown in
Table 4. The changes in the values of EC vary: for
the NA ±NC10, NA ± NC41, HEL ± HyHEL5 and
hGH ±hGHR1 complexes, EC is markedly reduced;
for the thrombin ±hirudin complex, EC is somewhat reduced; for the trypsin ±BPTI, HEL ±
HyHEL10 and hGH/hGHR1 ±hGHR2 complexes,
EC is only slightly reduced. Similar variation in the
578
Electrostatic Complementarity
Table 4. Electrostatic complementarity at selected protein ± protein interfaces with salt bridges neutralised
PDB
Protein 1
a
1NCA
1NMBb
2HFLc
2PTCd
3HFMe
3HHRf
3HHRg
4HTCh
Protein 2
NA
NC41 Fab
NA
NC10
HEL
HyHEL5 Fab
Trypsin
BPTI
HEL
HyHEL10 Fab
hGH
hGHR1
hGH/hGHR1
hGHR2
Hirudin
Thrombin
ECPFS
ECSFS
ECPPS
ECSPS
Gint
GCoul
0.16 2
0.09 1
0.09 2
0.07 5
0.08 4
0.10 2
0.22 9
0.16 2
0.22 3
0.16 5
0.22 6
0.07 9
0.12 5
0.35 6
0.25 3
0.57 2
0.55 3
0.39 4
0.17 2
0.26 6
0.27 2
0.37 1
0.58 3
0.51 2
0.37 3
0.20 7
0.10 4
0.32 4
0.34 4
0.38 1
0.47 1
0.54 2
ÿ61
ÿ28
ÿ13
ÿ44
ÿ24
ÿ24
ÿ68
ÿ74
ÿ105
ÿ40
ÿ30
194
ÿ71
62
179
ÿ195
Residues involved in salt bridges were taken as those in which the distance for the separation of their charged groups was less
Ê.
than 4.0 A
Values are given as EC where EC‡ and EC ÿ are correlations r1 and r2, given in the last decimal place. Energies in
kcal molÿ1 at 25 C.
a
Tulip et al. (1992). Salt bridge Lys432±Asp97(VH) neutralised.
b
Malby et al. (1994). Salt bridge Lys432±Asp56(VH).
c
Sheriff et al. (1987). Salt bridges Arg45±Glu50, Arg68±Glu35 and Arg68± Glu50 neutralised.
d
Huber et al. (1974). Salt bridge Asp189± Lys15 neutralised.
e
Padlan et al. (1989). Salt bridge Lys97±Asp32 neutralised.
f
de Vos et al. (1992). Salt bridges Lys41±Glu127, Arg64±Asp164, Arg167±Glu127 and Asp171±Arg43 neutralised.
g
de Vos et al. (1992). Salt bridges Arg8±Asp126, and Glu44±Arg16 neutralised.
h
Rydel et al. (1991). Salt bridges Asp50 ± Arg221A, Glu170 ±Arg173, Glu490 ±Lys60F, Asp550 ±Arg73, Asp550 ±Lys149E, Glu580 ±
Arg77A and Gln650 (carboxy terminus)±Lys36 neutralised.
amount of increase in values of Gint (towards less
negative values) is observed. These variations do
not correlate with the number of salt bridges neutralised. Thus, the charges not included in salt
bridges also make a signi®cant contribution to the
electrostatic interaction between the two proteins
and to the electrostatic complementarity at their interface. This is seen, for example, in the case of
the thrombin ±hirudin complex, which involves
the interaction between ®ve negatively charged
residues on hirudin (Asp550 , Glu570 , Glu580 ,
Glu610 and Glu620 ) with three positively charged
residues at the ®brinogen-recognition exosite of
thrombin (Lys149E, Arg73 and Arg77A). Biologically, this interaction involves the ordering of the
carboxy terminus of hirudin upon association
(Folkers et al., 1989; Haruyama & WuÈthrich, 1989;
Rydel et al., 1991) and is re¯ected in a very negative GCoul and relatively high electrostatic complementarity. Two of these negatively charged
residues and the three positively charged residues
also form salt bridges. Removing these salt
bridges, and four other salt bridges that are also
present in the complex does not dramatically reduce EC or increase Gint, and GCoul remains
quite negative.
Electrostatics at the common epitopes of two
NA ± antibody complexes
The two neuraminidase ± antibody complexes
1NMB (NA ± NC10) and 1NCA (NA ± NC41) have
overlapping epitopes on NA. The antibodies NC41
Ê 2 and 716 A
Ê 2, respectively, of
and NC10 bury 899 A
Ê 2 of which is
surface on the neuraminidase, 594 A
common to both interfaces (Malby et al., 1994). The
electrostatic complementarity at the interfaces in
these two antigen ± antibody complexes is represen-
tative of the other protein ±protein complexes
studied here (see Table 2). Furthermore, this complementarity is also expressed on the subset of the
two interfaces that is common to both. Over the
common interface the value of EC (Pearson's correlation for partially solvated proteins) is 0.79 for the
NA ±NC41 complex and 0.65 for the NA ±NC10
complex. We have used the methods described
here to investigate similarities in the electrostatic
potentials on the common interface surfaces in
these two complexes.
The neuraminidase antigens in the two complexes are not identical. In 1NCA, NC41 is complexed with NA from an avian in¯uenza virus
(tern N9) and in 1NMB the NC10 is complexed
with NA from an in¯uenza virus found in a whale
(whale N9). Twelve sequence differences distinguish the two antigens, and whilst none of these
are in either the NC41 or the NC10 binding sites,
three of the differences involve changes in charge.
Furthermore, the three-dimensional structures at
the common interface for the two complexes are
not identical because of differences in conformation
of interface amino acid-side chains. Also, the antibodies NC41 and NC10 have very different sequences in their complementarity determining
regions. Malby et al. (1994) have shown that the
chemical environments of particular antigen residues in the common binding site are quite different
in the two complexes. There are two questions to
be addressed: how similar are the electrostatic potentials of the two antigen structures on the common binding site, and are there similarities in the
potentials of the two quite different antibodies on
this common surface?
The two complexes were aligned by overlapping
the Ca positions of the two neuraminidases. The
Ê . The molecular surfaces
r.m.s. deviation was 0.3 A
579
Electrostatic Complementarity
Table 5. Electrostatic complementarity at overlapping area of epitopes on NA for NC41 and NC10
Protein 1
TN9 ±1NCAa
NC41±1NCAc
Protein 2
ECPFS
ECSFS
ECPPS
ECSPS
WN9±1NMBb
NC10±1NMBd
ÿ0.34 1
ÿ0.10 10
ÿ0.62 4
ÿ0.10 8
ÿ0.55 4
ÿ0.06 6
ÿ0.64 1
ÿ0.10 8
a
Tern N9 of the 1NCA (tern N9 ±NC41) complex.
Whale N9 of the 1NMB (whale N9 ±NC10) complex.
c
NC41 of the 1NCA (tern N9±NC41) complex.
d
NC10 of the 1NMB (whale N9 ±NC10) complex.
b
of whale N9 and tern N9 buried in the interfaces
with NC10 and NC41, respectively, were calculated. The overlapped region of these two epitopes
were identi®ed as those points on the two surfaces
Ê of each other. There were thus two
within 1 A
(very similar) surfaces de®ning the overlapped region, one following the molecular surface of whale
N9 and the other following the molecular surface
of tern N9. The values of the correlation between
the electrostatic potential on these two surfaces
were averaged for the calculation of EC, as described earlier.
The ®rst line of Table 5 shows the values of EC
when the electrostatic potential on the overlapping
epitope is generated by tern N9 and whale N9. The
EC values are negative, implying that the electrostatic surfaces are similar. (This can also be seen by
comparing the top left and bottom left surfaces of
Figure 3.) However, the EC values of greater than
ÿ1.0 (exact similarity) show that the movement of
side-chain and main-chain atoms, and the net
charge difference between tern and whale N9,
combine to have a measurable effect on the pattern
of electrostatic potential.
The second line of Table 5 shows that the EC values between the potentials generated by the antibodies at the overlapping epitope are insigni®cant.
Thus, although the two complexes display electrostatic complementarity, and the common antigenic
sites display electrostatic similarity, there is no
measurable similarity in the electrostatic potentials
of the two antibodies across the common binding
site. The mathematical correlations that exist between the potentials of each antigen ±antibody pair
must derive from different subsets of the potentials
at the common binding site. This becomes evident
from comparing the top right and bottom right
surfaces in Figure 3. i.e. the dominant contribution
to the complementarity in the common epitope interface is due to different spatial regions of the interface.
The ®nding that electrostatic complementarity
may exist over only a subset of the interacting surface at a protein ±protein interface is reminiscent of
other experimental (Nuss et al., 1993; Clackson &
Wells, 1995) and computational (Novotny et al.,
1989) studies, which suggest that the contributions
to the free energy of binding are not uniformly
distributed across all interface amino acids. In
particular, Tulip et al. (1994) calculated the (electro-
static) contribution to the total G of binding
made by each NA residue in the interface and concluded that different residues made the greatest
contributions to the NA ± NC41 and NA ±NC10
complex stabilities. They concluded that in the tern
N9 ±NC41 complex, tern N9 Lys432 and Lys463
and NC41 VH Glu56, Glu96 and Asp97 contributed
most to binding. Of these, tern N9 Lys432 and
NC41 VH Glu96 and Asp97 are in the region that
overlaps with the NC10 epitope, and correspond
to the segments of the surface that dominate the
electrostatic complementarity of the overlapping
epitope (Figure 3). In the whale N9 ±NC10 complex, Tulip et al. (1994) concluded that whale N9
Lys432 and NC10 VH Asp56 contributed most to
binding. Both of these are in the region that overlaps with the NC41 epitope, and again these correspond to the segments of the surface that
dominate the electrostatic complementarity of the
overlapping epitope. In particular, residue NC10
VL Asp91, which dominates the (negative) electrostatic potential generated by NC10 at one region of the surface (i.e. the rightmost dark red
patch in the bottom right surface of Figure 3),
was not found to be a large electrostatic contributor to the binding by Tulip et al. (1994), and indeed this region does not have a complementary
region of (positive) electrostatic potential generated by whale N9.
Conclusions
Values of CCP and CCS are insigni®cantly small
for all the protein± protein complexes studied, and
similar results are obtained for a variety of cut-off
distances and choice of charge parameters. These
results substantially disprove the qualitative assertion often made in the literature that intermolecular interactions involve associations between
surfaces with complementary surface charge distributions. Therefore, the term charge complementarity is not appropriate to describe protein ±protein
interfaces, when used in the sense quantitatively
measured by CC.
On the other hand, EC values for all the selected protein± protein complexes are positive.
This substantially con®rms the qualitative assertion made in the literature that intermolecular
interactions involve associations between surfaces
with complementary electrostatic potential. The
580
Electrostatic Complementarity
Figure 3. The electrostatic potential generated by proteins of the tern N9 ±NC41 and whale N9 ± NC10 complexes on
the molecular surface of tern N9 that is buried in the tern N9 ± NC41 complex and overlaps the whale N9 molecular
surface buried in the whale N9 ± NC10 complex. The calculation of the electrostatic potential is as described for
Figure 2. The most negative electrostatic potential throughout all space for each electrostatic ®eld is indicated in dark
red and the most positive electrostatic potential throughout all space for each electrostatic ®eld is indicated in dark
blue. Top left: electrostatic potential generated by partially solvated tern N9 ( range ÿ20 to 27 kcal molÿ1 at 25 C).
Top right: electrostatic potential generated by partially solvated NC41 ( range ÿ77 to 6 kcal molÿ1 at 25 C). Bottom
left: electrostatic potential generated by partially solvated whale N9 ( range ÿ15 to 20 kcal molÿ1 at 25 C). Bottom
right: electrostatic potential generated by partially solvated NC10 ( range ÿ56 to 5 kcal molÿ1 at 25 C). The region
of dark blue (positive electrostatic potential) produced on the surface by both whale and tern N9 is predominantly
generated by residue Lys432, which is adjacent to this region of the surface. The regions of dark red (negative electrostatic potential) produced on the surface by NC10 and NC41 can also be predominantly attributed to negatively
charged residues immediately facing the surface; in the top right surface the red region in the centre corresponds to
NC41 VH Glu96 and VH Asp97, and in the bottom right surface the red region at the left corresponds to NC10 VH
Asp56 and that at the right corresponds to NC10 VL Asp91.
term electrostatic complementarity is thus appropriate to describe protein± protein interfaces. All
protein charges, and not just the salt bridges or
surface charges, contribute towards the value of
electrostatic complementarity at the interface.
However, charges of atoms more distant from
the interface contribute less, so that antigen ±Fab
and antigen ± Fv complexes have similar EC values.
For all the protein ±protein complexes studied,
ECPPS is greater than ECPFS. The greater complementarity of the partially solvated surface electrostatic
potentials over the fully solvated ones is also
clearly evident from viewing the electrostatic potential (Figure 2). ECPPS also converges more
quickly with increasing grid resolution than ECPFS.
The greater signi®cance of ECPPS values may be because the partially solvated electrostatic potential
more accurately represents the bound state of the
proteins than the fully solvated electrostatic potential, since the dielectric boundary (although not the
charge distribution) is that of the complex. Since
ECPPS and ECSPS are correlated, there is little difference in the ordering of the complexes by either
measure. Future studies of protein± protein interaction should therefore consider ECPPS and ECSPS to
be robust and rigorous measures of electrostatic
complementarity.
EC can also be used to measure electrostatic
similarity. This principle was applied to compare
the electrostatic potential of NA-binding antibodies NC10 and NC41 where their epitopes
overlap. The resulting EC value was insigni®cant,
even though NC10 and NC41 both have quite
high values of EC for their interaction with NA
individually. Thus two surfaces with no measurable similarity in electrostatic potential are able to
bind a common target surface. Electrostatic com-
Electrostatic Complementarity
plementarity in the two complexes is achieved
because it derives from different subsets of the
common binding site.
Materials and Methods
Models
Except where stated, all protein atoms and bound
calcium ions reported in the crystal structure were
included, while crystallographic water molecules and
carbohydrate moieties were not. Water molecules and
carbohydrate moieties on the surface of the proteins
were thus modelled as bulk solvent. Hydrogen atoms
were added and the positions determined by energy
minimisation over 100 steps of steepest descent using
the CVFF (consistent valence force ®eld; DauberOsguthorpe et al., 1988) in Discover (Biosym/MSI Technologies Inc.) version 2.95. No Morse or cross terms
were included in the force ®eld. A distance-dependent
dielectric of 1.0 r was used. All aspartate, glutamate,
arginine and lysine residues, and both amino and
carboxy-terminal groups were considered ionised.
Amino and carboxyl-terminal groups, which arose
from discontinuity in the co-ordinates where there
was disorder in the protein structure, were not considered ionised.
Method for calculation of CC
Atoms within the speci®ed cut-off distance of each
other on the proteins forming the complex were identi®ed using the program X-PLOR version 3.1 (BruÈnger,
1992). Charges were assigned to the atoms in the list
using either the PARSE (Sitkoff et al., 1994) or the Formal
charge parameter sets. Correlations were then calculated
using Pearson's and Spearman's rank correlation tests as
described below.
Method for calculation of electrostatic energies
and EC
The co-ordinates of the points mapping the interface
surfaces were determined by MS (Connolly, 1983). DelPhi was then used to generate the electrostatic potential
at each point on these buried molecular surfaces. The
protein or protein ± protein complex was modelled as a
low dielectric medium (dielectric constant of 2) and the
surrounding solvent as a high dielectric medium, (dielectric constant of 80). The nature of the dielectric variation
and the charge distribution were determined by the
atomic radii and partial charge parameters assigned to
each of the protein atoms. The PARSE charge and radii
parameter set (Sitkoff et al., 1994) was used. Ionic
strength was zero, as increasing it to physiological
strength has been found to have little effect on the resulting solution (Jackson & Sternberg, 1995). The dielectric
boundary and charges were mapped onto a cubic grid of
size of 201 201 201 points/side, with a percentage
grid ®ll of 90%. The electrostatic potential on the boundary of the grid was given by the Debye-HuÈckel electrostatic potential approximation. Dummy atoms for the
uncharged protein were used to maintain the scaling
and orientation on the grid of the complex when only
581
one protein was charged. The molecular surface of the
Ê . The
proteins was de®ned using a probe radius of 1.4 A
linear Poisson-Boltzmann equation was then solved
iteratively until convergence was reached using the
QDIFFXS algorithm of version 3.0 of DelPhi (Gilson &
Honig, 1988; Nicholls & Honig, 1991). The number of
cycles to convergence was automatically determined by
the program, and monitored by examining a plot of
the convergence in the output ®le. DelPhi was run on
a Convex-C3220, each calculation took up to 1.5 hours
to complete. DelPhi outputs the electrostatic energy
GCoul for a protein or protein ± protein complex, and
enables the calculation of Gint by listing the electrostatic potential generated by the charged, partially desolvated protein at the positions of the charges of the
other protein.
Chau & Dean (1994) also used Pearson's and
Spearman's correlation coef®cients to analyse the electrostatic complementarity (in protein-small molecule ligand
complexes). However, our method of calculation differs
in several signi®cant aspects. The EC value in this
work is an average of correlations of the electrostatic
potential generated by the proteins on the two interface surfaces while Chau & Dean (1994) only considered the correlation of the electrostatic potential on
one surface, that of the ligand. The method for calculating the electrostatic potential of the proteins was
also notably different: Chau & Dean (1994) using
CNDO/2 charges and constant and distance-dependent
dielectrics to calculate the electrostatic potential with
the VSS program. PARSE charges are more appropriate
for investigation of electrostatic properties than are the
Mulliken point charges from very approximate semiempirical molecular orbital theory that Chau & Dean
(1994) used. Furthermore, systems where the dielectric
constant and the charge density vary from one region
of space to another, such as the protein complexes in
aqueous solution studied here, are better modelled by
the Poisson-Boltzmann equation. It has been shown
(for example, see Honig & Nicholls, 1995), that the
form of the electrostatic potential is quite dependent on
the shape of the dielectric boundary separating the low
dielectric used to model the protein from the high dielectric used to model the solvent. Also, in this work
we calculate the electrostatic potential surrounding the
proteins with and without the dielectric modelling the
other protein of the complex present, to give the ``partially'' and ``fully'' solvated electrostatic potentials, respectively. Finally, the molecular surface used here was
the buried molecular surface as de®ned by Connolly
(1983), rather than an algorithm that took an interface
point as any position on the ligand surface that was
Ê of a site van der Waals surface atom, as
within 2.8 A
used by Chau & Dean (1994).
Dependence of electrostatic complementarity on the
parameters of the model
The co-ordinates of the anti-NA antibody NC10 in
complex with NA subtype N9 (PDB entry 1NMB; Malby
et al., 1994) were used to examine the dependence of EC,
Gint and GCoul on the various electrostatic parameters
of the model, and the results of this examination are
given in Table 3.
The electrostatic energies are not altered by increasing
the sampling of the buried molecular surface from 4
Ê 2. Sampling the interface at the
Ê 2 to 16 grids/A
grids/A
higher resolution does not appreciably change EC,
although the time to calculate the electrostatic potential
582
almost doubles and the time to calculate the correlation
increases by an order of magnitude. The grid buried
Ê2
molecular surface was therefore sampled at 4 grids/A
for this study.
The probe used for molecular surface calculation re¯ects the size of the solvent molecules. For water, this is
taken as the van der Waals radius of oxygen, as the
bonds to hydrogen are subsumed beneath its surface.
Ê (Pauling, 1960), 1.5 A
Ê (Bondi, 1968), 1.6 A
Ê
Values of 1.4 A
Ê (Ramachandran &
(Brooks et al., 1983) and 1.7 A
Sasisekharan, 1968) have been used for the van der
Waals radius of oxygen. For the calculation of the buried
molecular surface, a larger probe sphere includes more
residues in the interaction and also changes the shape of
the interface. The radius of the probe used to calculate
the dielectric boundary in DelPhi will affect not only the
volume of the low dielectric medium, but also the shape
of the boundary, which in¯uences the solution of the
Poisson-Boltzmann equation through the derivative of
Ê and 1.7 A
Ê for the
the dielectric boundary. Values of 1.4 A
probe radii were chosen for the comparison. EC and
GCoul values are stable when the surface probe radii for
the calculation of the buried molecular surface is varied
Ê to 1.7 A
Ê . EC is also stable when the radius of
from 1.4 A
the probe used to calculate the dielectric boundary in
DelPhi is varied by the same amount; however, Gint is
somewhat affected by this parameter. This result is in
agreement with the conclusion drawn by Hendsch &
Tidor (1994), who showed that the destabilising effects of
a salt bridge in bacteriophage T4 lysozyme could be
altered by the set of atomic radii used to de®ne the molecular boundary between the high dielectric solvent and
the low dielectric protein.
Large protein ± protein complexes require ®ne grids in
order for the mapping of the charges and molecular
boundary to be accurate. EC, Gint and GCoul were calculated on a series of grids ranging from 0.34 to 2.01
Ê (33 33 33 to 201 201 201 points/side).
grids/A
ECSFS, ECPPS and ECSPS values are within 0.05 of the conÊ , while ECPFS does not
verged value even at 0.34 grids/A
Ê
converge to the same extent until 1.28 grids/A
(123 123 123 points/side). GCoul converges to
Ê , and Gint
within 3 kcal/mol (at 25 C) by 0.34 grid/A
Ê
converges to the same extent by 1.4 grids/A
(133 133 133 points/side). A cubic grid of size of
65 65 65 points/side, the standard for DelPhi calculations, is suf®cient to ensure its convergence of ECSFS,
ECPPS and ECSPS, although ®ner grid sampling is required
to ensure convergence of ECPFS, Gint, and GCoul.
X-ray diffraction does not locate the positions of hydrogens unless extremely high-resolution data are colÊ or better). Polar hydrogens were added to the
lected (1 A
protein structures for electrostatics calculations by using
known hydrogen bonding geometry and then re®ning
the positions by energy minimisation in the local electrostatic ®eld, as implemented in InsightII version 2.95 (Biosym/MSI Technologies Inc.). The re®nement process is
computationally intensive, but necessary for the calculation of realistic interaction energies (Jackson &
Sternberg, 1994). The values of EC, Gint and GCoul
were compared when calculated using the co-ordinates
obtained before and after hydrogen positions had been
energetically minimised. The effect of using unminimised
hydrogen positions is to depress the values of EC, and
make the values of Gint and GCoul less negative, as
might be expected.
The correct choice of the dielectric constant with
which to model the protein interior is not well established. Values of 2 and 4 are in general use. Higher di-
Electrostatic Complementarity
electrics model motion of the charges under the
in¯uence of the applied electrostatic ®eld, and decrease
the favourable effects of direct point charge interactions
by increasing the charge screening. Changing the dielectric of the protein from 2 to 4 has little effect on EC; however, the energies are halved. This result is consistent
with that of Hendsch & Tidor (1994), who found that
changing the interior dielectric from 2 to 4 halved the
Gint of several salt bridges. Thus, modelling the protein
with a higher dielectric constant results in a less favourable electrostatic contribution to the binding, but does
not signi®cantly alter EC.
The dependence of the results on the choice of charge
and radii parameters used for solution of the PoissonBoltzmann equation was determined by modelling the
proteins with Formal charge parameters. All EC values
are reduced signi®cantly with the use of Formal charges.
With Formal charge parameters, Gint does not converge
with increasing grid resolution, while although GCoul
converges, it is to a positive rather than a negative value.
The charge set used in the electrostatic calculation thus
affects EC, Gint and GCoul; however, the in¯uence on
the electrostatic energies is much more severe.
For the comparative study of the electrostatic complementarity (EC) of the different protein ± protein interfaces
in this work (Table 2), buried molecular surface sampling
Ê , grid size of
Ê 2, surface probe radii of 1.4 A
at 4 grids/A
201 201 201, energy minimised hydrogens atoms,
interior dielectric of 2, and the PARSE charge parameter
set were used.
Statistical analysis
Pearson's correlation coefficient
Pearson's correlation coef®cient (rP) is also known as
the linear correlation coef®cient, or product-moment correlation coef®cient. For pairs of charges or electrostatic
potential a and b, where ai and bi are the charges or electrostatic potential at point i, for iˆ1 . . . n then:
Pn
†…bi ÿ b†
P
iˆ1 …ai ÿ a
q
r ˆ qP

Pn
n
2
…ai ÿ a †2
…bi ÿ b†
iˆ1
iˆ1
where a is the mean of ai, iˆ1 . . . n, and b is the mean of
bi, iˆ1 . . . n.
An rP value of ‡1 means that the variance of one set
of charges or electrostatic potential is perfectly correlated
with the variance of the other and rP value of ÿ1 means
that the variances are perfectly anticorrelated. Pearson's
correlations between the charges and electrostatic potential were calculated using the S-Plus package (Statistical
Sciences Inc.) version #1990, run on a Convex-C3220
computer.
Spearman's rank correlation coefficient
Spearman's rank correlation coef®cient (rS) is a
measure of the similarity in pattern between two sets (1
and 2) of charges or electrostatic potential. Each value of
the charge or electrostatic potential at the point i,
iˆ1 . . . n, is calculated and ranked. Let Ri be the rank of
the charge or electrostatic potential at point i among the
charges or electrostatic potential in set 1, and Si be the
rank of the charge or electrostatic potential at point i in
set 2. These numbers form a uniform distribution of integers between 1 and n. Where there are ties in the ranking, the charges or electrostatic potential is assigned the
583
Electrostatic Complementarity
mean of the rank that they would have had if their
values had been slightly different. This rank will either
be in integer or a half-integer. The rank correlation is
then given by:
ÿ 3
P
2
ÿ 6 niˆ1 …Ri ÿ Si †2
n ÿ n ÿ T1 ÿT
S
2
r ˆ pp
…n3 ÿ n ÿ T1 † …n3 ÿ n ÿ T2 †
where
T1 ˆ
X
X
…r3k ÿ rk †; T2 ˆ
…s3m ÿ sm †
k
m
and rk is the number of ties in the kth group of ties
among the Ri values and sm is the number of ties in the
mth group of ties amongst the Si values. The measure is
only dependent on the rank of the values (not the actual
values), and will be high when the trend is monotonic
but not necessarily linear. An rS value of ‡1 indicates
correlation and a value of ÿ1 indicates anticorrelation.
Spearman's rank correlation was calculated using subroutine G02BQF of the NAG (Numerical Algorithms
Group) Subroutine Fortran Library, version 15, run on a
Cray Y-MP 4E/464.
Acknowledgements
We thank Michael C. Lawrence and Brian J. Smith for
valuable discussions.
References
Baker, E. N. & Hubbard, R. E. (1984). Hydrogen bonding in globular proteins. Prog. Biophys. Mol. Biol. 44,
97± 109.
Barlow, D. J. & Thornton, J. M. (1983). Ion-pairs in
proteins. J. Mol. Biol. 168, 867± 885.
Bhat, T. N., Bentley, T. G., Fischmann, T. O., Boulot,
G. & Poljak, R. J. (1990). Small rearrangements
in structures of Fv and Fab fragments of antibody D1.3 on antigen binding. Nature, 347,
483 ± 485.
Bhat, T. N., Bentley, T. G., Boulot, G., Greene, M. I.,
Tello, D., Dall'Acqua, W., Souchon, H., Schwatrz,
F. P., Mariuzza, R. A. & Poljak, R. J. (1994). Bound
water molecules and conformational stabilization
help mediate an antigen-antibody association. Proc.
Natl Acad. Sci. USA, 91, 1089± 1093.
Bondi, A. (1968). In Molecular Crystals, Liquids and
Glasses, John Wiley & Sons, Inc., New York.
Braden, B. C. & Poljak, R. J. (1995). Structural features of
the reactions between antibodies and protein
antigens. FASEB J. 9, 9 ±16.
Braun, P. J., Dennis, S., Hofsteenge, J. & Stone, S. R.
(1988). Use of site-directed mutagenesis to investigate the basis for the speci®city of hirudin. Biochemistry, 27, 6517± 6522.
Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States,
D. J., Swaminathan, S. & Karplus, M. (1983).
CHARMM: A program for macromolecular energy,
minimisation, and dynamics calculations. J. Comp.
Chem. 4, 187 ±217.
BruÈnger, A. T. (1992). X-PLOR version 3.1: A system for
X-ray Crystallography and NMR, Yale University
Press.
Chau, P.-L. & Dean, P. M. (1994). Electrostatic complementarity between proteins and ligands. 1. Charge
disposition, dielectric and interface effects.
J. Comput.-Aided Mol. Des. 8, 513 ±525.
Clackson, T. & Wells, J. A. (1995). A hot spot of binding
energy in a hormone-receptor interface. Science, 267,
383± 386.
Connolly, M. L. (1983). Analytical molecular surface
recognition. J. Appl. Crystallog. 16, 548 ± 558.
Cunningham, B. C., Ultsch, M., de Vos, A. M.,
Mulkerrin, M. G., Clausen, K. C. & Wells, J. A.
(1991). Dimerisation of the extracellular domain of
the human growth hormone receptor by a single
hormone molecule. Science, 254, 821 ± 825.
Dauber-Osguthorpe, P., Roberts, V. A., Osguthorpe, D. J.,
Wolff, J., Genest, M. & Hagler, A. T. (1988). Structure and function of ligand binding to proteins:
Escherichia coli dihydrofolate reductase-trimethoprim, a drug receptor system. Proteins: Struct. Funct.
Genet. 4, 31± 47.
Demchuck, E., Mueller, T., Oschkinat, H., Sebald, W. &
Wade, R. C. (1994). Receptor binding properties of
four-helix bundle growth factors deduced from electrostatic analysis. Protein Sci. 3, 920 ± 935.
de Vos, A. M., Ultsch, M. & Kossiakoff, A. A. (1992).
Human growth hormone and extracellular domain
of its receptor: crystal structure of the complex.
Science, 255, 306 ± 312.
Folkers, P. J. M., Clore, G. M., Driscoll, P. C., Dodt, J.,
Kohler, S. & Gronenborn, A. M. (1989). Solution
structure of recombinant hirudin and the Lys47-Glu
mutant: a nuclear magnetic resonance and hybrid
geometry-dynamical simulated annealing study.
Biochemistry, 28, 2601± 2617.
Fujinaga, M., Sielecki, R., Read, R. J., Ardelt, W.,
Laskawski, M. Jr & James, M. N. G. (1987). Crystal
and molecular structure of a-chymotrypsin with its
Ê
inhibitor turkey ovomucoid third domain at 1.8 A
resolution. J. Mol. Biol. 195, 397± 418.
Getzoff, E. D., Tainer, J. A., Weiner, P. K., Kollman,
P. A., Richardson, J. S. & Richardson, D. C. (1983).
Electrostatic recognition between superoxide and
copper, zinc superoxide dismutase. Nature, 306,
287 ± 290.
Gilson, M. K. & Honig, B. (1988). Calculation of the
total electrostatic energy of a macromolecular system: solvation energies, binding energies, and conformational analysis. Proteins: Struct. Funct. Genet. 4,
7 ±18.
Haruyama, H. & WuÈthrich, K. (1989). Conformation of
recombinant desulfato-hirudin in aqueous solution
determined by nuclear magnetic resonance. Biochemistry, 28, 4301± 4312.
Hawkins, R. E., Russell, S. J., Baier, M. & Winter, G.
(1993). The contribution of contact and non-contact
residues of antibody in the af®nity of binding to
antigen: the interaction of mutant D1.3 antibodies
with lysozyme. J. Mol. Biol. 234, 958 ± 964.
Honig, B. & Nicholls, A. (1995). Classical electrostatics
in biology and chemistry. Science, 268, 1144± 1149.
Hendsch, Z. S. & Tidor, B. (1994). Do salt bridges stabilise proteins? A continuum electrostatics analysis.
Protein Sci. 3, 211 ± 226.
Huber, R., Kukla, D., Bode, W., Schwager, P., Bartels,
K., Deisenhofer, J. & Steigemann, W. (1974). Structure of the complex formed by bovine pancreatic
trypsin inhibitor. II Crystallographic re®nement at
Ê resolution. J. Mol. Biol. 89, 73± 101.
1.9 A
584
Electrostatic Complementarity
Jackson, R. M. & Sternberg, M. J. E. (1994). Application
of scaled particle theory to model the hydrophobic
effect: implications for molecular association and
protein stability. Protein Eng. 7, 371 ± 383.
Jackson, R. M. & Sternberg, M. J. E. (1995). A continuum
model for protein-protein interactions: application
to the docking problem. J. Mol. Biol. 250, 258 ± 275.
Janin, J. (1995). Elusive af®nities. Proteins: Struct. Funct.
Genet. 21, 30 ± 39.
Janin, J. & Chothia, C. (1990). The structure of proteinprotein recognition sites. J. Biol. Chem. 265, 16027±
16030.
Jones, S. J. & Thornton, J. M. (1995). Protein-protein
interactions: a review of protein dimer structures.
Prog. Biophys. Mol. Biol. 63, 31± 65.
Karshikov, A., Bode, W., Tulinsky, A. & Stone, S. R.
(1992). Electrostatic interactions in the association of
proteins: analysis of the thrombin-hirudin complex.
Protein Sci. 1, 727± 735.
Lavoie, T. B., Drohan, W. N. & Smith-Gill, S. J. (1992).
Experimental analysis by site-directed mutagenesis
of somatic mutation effects on af®nity and ®ne
speci®city in antibodies speci®c for lysozyme.
J. Immunol. 148, 503± 513.
Lawrence, M. C. & Colman, P. M. (1993). Shape complementarity at protein/protein interfaces. J. Mol. Biol.
234, 946± 950.
Lescar, J., Pellegrini, M., Souchon, H., Tello, D., Poljak,
R., Peterson, N., Greene, M. & Alzari, P. (1995).
Crystal structure of a cross reaction complex
between Fab F9.13.7 and guinea-fowl lysozyme.
J. Biol. Chem. 270, 18067± 18076.
Malby, R. L., Tulip, W. R., Harley, V. R., McKimmBreschkin, J. L., Laver, W. G., Webster, R. G. &
Colman, P. M. (1994). The structure of a complex
between the NC10 antibody and in¯uenza virus
neuraminidase and comparison with the overlapping binding site of the NC41 antibody. Structure,
2, 733± 746.
McDonald, N. Q., Lapatto, R., Murray-Rust, J., Gunning,
J., Wlodawer, A. & Blundell, T. L. (1991). New proÊ resolution crystal
tein fold revealed by a 2.3 A
structure of nerve growth factor. Nature, 354, 411 ±
414.
McPhalen, C. A. & James, M. N. G. (1988). Crystal and
molecular structure of the serine proteinase inhibitor (CI-2) from barley seeds. Biochemistry, 26, 261 ±
269.
Nicholls, A. & Honig, B. (1991). A rapid ®nite difference
algorithm, utilizing successive over-relaxation to
solve the Poisson-Boltzmann equation. J Comput.
Chem. 12± 435.
Novotny, J. & Sharp, K. (1992). Electrostatic ®elds in
antibodies and antibody/antigen complexes. Prog.
Biophys. Mol. Biol. 58, 203± 224.
Novotny, J., Bruccoleri, R. E. & Saul, F. A. (1989). On
the attribution of binding energy in antigen-antibody complexes MPC603, D1.3, and HyHEL-5. Biochemistry, 28, 4735± 4749.
Nuss, J. M., Bossart, P. J. & Air, G. M. (1993). Identi®cation of critical contact residues in the NC41 epitope of a subtype N9 in¯uenza neuraminidase.
Proteins: Struct. Funct. Genet. 15, 121± 132.
Padlan, E. A., Silverton, E. W., Sheriff, S., Cohen, G. H.,
Smith-Gill, S. J. & Davies, D. R. (1989). Structure of
an antibody-antigen complex: Crystal structure of
the HyHEL10 Fab-lysozyme complex. Proc. Natl.
Acad. Sci. USA, 86, 5938± 5942.
Patten, P. A., Gray, N. S., Yang, P. L., Marks, C. B.,
Wedemayer, G. J., Bonface, J. J., Stevens, R. C. &
Schultz, P. G. (1996). The immunological evolution
of catalysis. Science, 271, 1086± 1091.
Pauling, L. (1960). The Nature of the Chemical Bond,
Cornell University Press, Ithaca, N.Y..
Ramachandran, G. N. & Sasisekharan, V. (1968). Conformation of polypeptides and proteins. Advan. Protein
Chem. 23, 283±437.
Rashin, A. A. & Honig, B. (1984). On the environment
of ionizable groups in globular proteins. J. Mol. Biol.
173, 515 ± 521.
Roberts, V. A., Freeman, H. C., Olson, A. J., Tainer,
J. A. & Getzoff, E. D. (1991). Electrostatic orientation
of the electron-transfer complex between plastocyanin and cytochrome c. J. Biol. Chem. 266, 13431±
13441.
Rydel, T. J., Tulinsky, A., Bode, W. & Huber, R. (1991).
The re®ned structure of the hirudin-thrombin
complex. J. Mol. Biol. 221, 583± 601.
Sheriff, S., Silverton, E. W., Padlan, E. A., Cohen, G. H.,
Smith-Gill, S. J., Finzel, B. C. & Davies, D. R. (1987).
Three dimensional structure of an antibody-antigen
complex. Proc. Natl. Acad. Sci. USA, 84, 8075± 8079.
Sitkoff, D., Sharp, K. A. & Honig, B. (1994). Accurate
calculation of hydration free energies using macroscopic solvent models. J. Phys. Chem. 98, 1978± 1988.
Slagle, S. P., Kozack, R. E. & Subramaniam, S. (1994).
Role of electrostatics in antibody-antigen association: anti-hen egg lysozyme/lysozyme complex
(HyHEL-5/HEL). J. Biomol. Struct. Dynam. 12, 439 ±
456.
Tulip, W. R., Varghese, J. N., Laver, W. G., Webster,
R. G. & Colman, P. M. (1992). Re®ned crystal structure of the in¯uenza virus N9 neuraminidase NC41 Fab complex. J. Mol. Biol. 227, 122± 148.
Tulip, W. R., Harley, V. R., Webster, R. G. & Novotny,
J. (1994). N9 neuraminidase complexes with antibodies NC41 and NC10: empirical free energy calculations capture speci®city trends observed with
mutant binding data. Biochemistry, 33, 7986 ±7997.
Edited by B. Honig
(Received 31 October 1996; received in revised form 19 February 1997; accepted 19 February 1997)