Core-directed protein design Derek N Woolfson

For various reasons, it seems sensible to redesign or design
proteins from the inside out. Past approaches in this field have
involved iterations of mutagenesis and characterisation to
‘evolve’ designs. Increasingly, combinatorial approaches are
being taken to select ‘fit’ sequences from libraries of variant
proteins. In particular, in silico methods have been used to
good effect. More recently, experimental methods have been
developed and improved. We are now in a position to redesign
stability and function into natural protein frameworks confidently
and to attempt de novo designs for more ambitious targets.
Addresses
Centre for Biomolecular Design and Drug Development, School of
Biological Sciences, University of Sussex, Falmer BN1 9QG, UK;
e-mail: [email protected]
Current Opinion in Structural Biology 2001, 11:464–471
0959-440X/01/$ — see front matter
© 2001 Elsevier Science Ltd. All rights reserved.
Abbreviations
PDB  Protein Data Bank
TIM  triose phosphate isomerase
WT   wild-type

Introduction
The organisation of a hydrophobic core provides the main
driving force for protein folding and stabilisation and, in
some cases, native-state specification. It seems reasonable,
therefore, to design new proteins from the inside out.
Increasingly, protein designers are taking this approach,
which I refer to as core-directed design.

In certain cases, core-directed design is protein design
heaven; for example, stability and specificity can be built
into simple coiled-coil structures using a few knowledge-based
rules that can be applied without involving
computers [1]. Unfortunately, this understanding does not
extend to globular proteins. Early attempts to design
globular proteins took iterative approaches, in which related
sequences were sequentially tested for stability and
structural uniqueness, effectively evolving the designs.
Now, combinatorial methods are being applied. In these
approaches, many core sequences that are potentially
compatible with a target structure are tested simultaneously
and winners selected. Selection can be done in silico or via
wet experiments; the latter are generally referred to as
directed-evolution methods. The main computational
methods are amply reviewed elsewhere [2,3]. This review
focuses largely on recent experimental approaches to the
problem of core-directed design.

Rule-based or wholly rational design and special cases
It is convenient to consider two broad approaches in protein
redesign and de novo design; namely, wholly rational design
and combinatorial design. I discuss the latter in detail later.
By wholly rational design, I mean the direct application of
sequence-to-structure rules to achieve a specific target
structure. Preferably, the rules should be understood in
physicochemical terms. The rules may be positive, that is,
to design towards the target, or negative, to disfavour and
design away from alternative structures [4–6].

Not surprisingly, current successes in wholly rational
approaches are limited to special cases. From the perspective
of core-directed design, the best examples are the rules for
oligomer-state selection in coiled coils, which are two-,
three-, four- or five-stranded helical bundles. The seminal
studies of Harbury et al. [7] on mutants of the leucine zipper
show that the oligomer state can be distinguished (at
least between parallel dimer, trimer and tetramer) using
appropriate combinations of isoleucine and leucine at
the a and d positions of the abcdefg (heptad) sequence
repeat. The resulting rules have been, and doubtless will
continue to be, improved [1,8,9]. Nonetheless, the current
rules provide clear guidelines for constructing specified
coiled-coil oligomers and form the basis of more ambitious
designs [1,10•,11–14].

Iterative experimental design processes
For non-coiled-coil proteins, design is not so prescribed
and alternative routes to correctly folded, stable structures
are needed. One approach is to design iteratively, adding
small positive and negative design features step-by-step
and testing the intermediates experimentally.

The design of α3D illustrates this process [15]. α3D is a
single-chain protein designed to form a mixed parallel/antiparallel
three-helix bundle. The starting point is Coil-Ser, a
previously described three-stranded coiled coil [16]. This
is used as a template to design α3C [17], which has
shortened helices, helix-capping motifs and a repacked
hydrophobic core to introduce heterogeneity and disfavour
coiled-coil-type packing. The NMR structure of α3D (a
variant of α3C) agrees reasonably with the design model. A
noteworthy point is the use of negative design: interhelix
electrostatic interactions are used to orientate the helices in
an anticlockwise manner and disfavour alternative topologies;
this principle also works in a canonical coiled-coil
system [10•,18]. The iterative redesign and characterisation
of α3D is ongoing [19]. Incidentally, Coil-Ser has been
used as a template to make another single-chain
three-helix bundle [20].

Many iterative designs and redesigns have focused on
four-helix bundles, which offer a step up in complexity
from coiled coils. The classic example is DeGrado’s evolution
of four-helix-bundle designs, which was recently reviewed
by Hill et al. [6].
Figure 1. Orthogonal views of various four-helix-bundle structures.
(a) WT ROP (PDB code 1rop; [60]). (b) The Ala2Ile2-6 core mutant of
ROP (PDB code 1f4m; [27•]). (c) The A31P mutant of ROP (PDB code 1b6q;
[28•]). (d) The de novo design α2D (PDB code 1qp6; [29]). Different
chains of each structure are coloured blue and red; N termini are
highlighted black.
Gibney et al. [21] describe an iterative approach to map out
sequence space and the associated free-energy landscape
of a previously designed four-helix-bundle maquette [22,23].
This is done on a modest scale, limited by the level of
characterisation undertaken. The parent peptide has
histidine at two a sites to promote haem binding and three
leucines at d sites. The apoform is unstable to guanidine
denaturation and displays structural heterogeneity. In an
attempt to improve this design, single, double and triple
mutants were made to introduce isoleucine, valine and
phenylalanine at the d positions. The most-promising
mutants (in terms of stability and structural uniqueness) at
each stage are taken to the next stage; the first and second
iterations returned improved designs, but the third iteration
was disappointing as the peptides lost conformational
uniqueness. The maquette is being used as a template for
the iterative redesign of haem-binding pockets [24].
Interesting but cautionary tales from the
repacking of four-helix bundles
As highlighted for the coiled coils [1,7,25], repacking a
hydrophobic core can have consequences other than changing
stability and conformational heterogeneity. Four-helix
bundles also alter in response to mutation.
The four-helix bundle ROP is a dimer of an antiparallel
helical hairpin (Figure 1a). The active RNA-binding site is
on the face formed by the two copies of helix 1, which are
antiparallel; this provides a convenient probe for the native
structure. The structure has a core of a and d layers from
coiled-coil-like heptad repeats. Regan’s group [26] has
systematically repacked this core. For example, in
Ala2Leu2-6, the middle six a and d sites are exchanged for
alanine and leucine, respectively. Theoretically, this mutant
maintains the wild-type (WT) core volume. Consistent
with this, the mutant is active but thermally stabilised.
Willis et al. [27•] characterise a related mutant, Ala2Ile2-6,
which is also thermally stabilised but inactive. The crystal
structure explains the loss of activity: compared with the
WT structure, one protomer in the mutant is rotated 180°
around the dimer interface (Figure 1b). In this new topology,
the two copies of helix 1 are juxtaposed diagonally rather
than adjacent, which splits the active face.
A more perplexing structural rearrangement occurs in the
ROP mutant A31P [28•], which is a helical dimer with
reduced stability but some activity. The crystal structure
reveals a remarkable architectural transformation to a
‘bisecting U’ motif, in which the helical hairpins intercalate
to form a right-handed four-helix bundle (Figure 1c). The
term ‘bisecting U’ is introduced by Hill to describe the
structure of the designed four-helix bundle α2D (Figure 1d)
[6,29]. Presumably, ROP A31P retains activity because the
two copies of helix 1 remain adjacent, although they are
parallel and not antiparallel as in the WT structure. This
dramatic rearrangement is particularly worrisome because it
results from a single amino acid change to the ROP sequence.
Thus, simple rules like those for packing coiled coils are not
as forthcoming in ROP and similar systems. This probably
reflects the fact that, firstly, ROP is more complicated than
the leucine zippers on which the coiled-coil studies are
based, as ROP is twice the size and has less-regular sequence
repeats. Secondly, alternative four-helix-bundle topologies
and architectures are possibly more similar in energy than the
alternative oligomer states of coiled coils [6].
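By way of contrast, the coiled-coil rules of Harbury et al. [7] are simple enough to write as a lookup table. The following Python sketch is illustrative only. It assumes that the heptad register of the first residue is already known, that the core is dominated by one residue type at a and one at d, and it encodes only the three Ile/Leu combinations reported for the GCN4 leucine-zipper mutants (Ile at a with Leu at d for dimers, Ile at both for trimers, Leu at a with Ile at d for tetramers). The function names and the test sequences are hypothetical.

```python
from collections import Counter

HEPTAD = "abcdefg"

# (a residue, d residue) -> parallel oligomer state, as reported for
# GCN4 leucine-zipper mutants by Harbury et al. [7].
OLIGOMER_RULES = {
    ("I", "L"): "dimer",
    ("I", "I"): "trimer",
    ("L", "I"): "tetramer",
}


def core_residues(sequence, register="a"):
    """Return the residues at the a and d positions of a heptad repeat.

    `register` is the heptad position of the first residue; it is assumed
    to be known rather than predicted.
    """
    offset = HEPTAD.index(register)
    a_sites, d_sites = [], []
    for i, residue in enumerate(sequence):
        position = HEPTAD[(i + offset) % 7]
        if position == "a":
            a_sites.append(residue)
        elif position == "d":
            d_sites.append(residue)
    return a_sites, d_sites


def predict_oligomer(sequence, register="a"):
    """Look up the oligomer state from the most common a and d residues."""
    a_sites, d_sites = core_residues(sequence, register)
    if not a_sites or not d_sites:
        return None
    top_a = Counter(a_sites).most_common(1)[0][0]
    top_d = Counter(d_sites).most_common(1)[0][0]
    return OLIGOMER_RULES.get((top_a, top_d))  # None when no rule applies


# Hypothetical four-heptad peptides, written in register a.
print(predict_oligomer("IEALKKE" * 4))
print(predict_oligomer("LEAIKKE" * 4))
```

No comparably compact table can be written for ROP, which is the point of the cautionary tales above.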
Combinatorial design and the general case
How can one design without specific rules to relate sequence
and structure? The answer is to take a combinatorial
approach. This can be done either in silico or using wet
experiments. Both processes involve selecting fit variants from
libraries of sequences for the targeted structural scaffold.
Computational approaches in combinatorial design
and redesign
Various computational methods have been developed for
combinatorial core redesign and design. In essence,
sequence and (to differing extents) conformational spaces
are searched using methods such as simulated annealing,
dead-end elimination, Metropolis Monte Carlo sampling
and genetic algorithms. Sequences are then scored on the
basis of the physicochemical attributes of proteins, such as
van der Waals contacts, solvation terms, secondary structure
propensities, electrostatic energies and hydrogen-bond
potentials, which are parameterised with varying degrees
of approximation and sophistication. The developments
and successes in this area have been considerable. The
field is extremely well reviewed elsewhere [2,3] and, with
a few exceptions, I will not dwell on it.
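To make the search step concrete, here is a minimal Python sketch of Metropolis-driven simulated annealing over a small core sequence. It is not any published algorithm. The scoring function is a deliberately crude stand-in (mismatch between the total number of side-chain heavy atoms and a target size), and the cooling schedule, step count and all names are assumptions chosen for illustration. A real scoring function would combine the packing, solvation and other terms listed above.

```python
import math
import random

# Side-chain heavy-atom counts, used here as a crude proxy for size.
SIDE_CHAIN_HEAVY_ATOMS = {"A": 1, "V": 3, "I": 4, "L": 4, "M": 4, "F": 7}


def toy_score(sequence, target_size):
    """Toy penalty: mismatch between total side-chain size and a target."""
    return abs(sum(SIDE_CHAIN_HEAVY_ATOMS[r] for r in sequence) - target_size)


def anneal_core(start, target_size, steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Search sequence space by single-site mutation with Metropolis acceptance."""
    rng = random.Random(seed)
    alphabet = sorted(SIDE_CHAIN_HEAVY_ATOMS)
    current = list(start)
    current_score = toy_score(current, target_size)
    best, best_score = current[:], current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        trial = current[:]
        trial[rng.randrange(len(trial))] = rng.choice(alphabet)
        delta = toy_score(trial, target_size) - current_score
        # Metropolis criterion: always accept improvements and neutral moves;
        # accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, current_score + delta
            if current_score < best_score:
                best, best_score = current[:], current_score
    return "".join(best), best_score


sequence, score = anneal_core("AAAAAAAA", target_size=32)
print(sequence, score)
```

Accepting some uphill moves at high temperature is what lets the search escape local minima; dead-end elimination and genetic algorithms explore the same space by different means.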
New parameters and algorithms for in silico
core-directed design
In silico design requires scoring functions to rank the
compatibility of the sequences searched with the target
structure. The functions must be quick to implement and,
therefore, must make assumptions about interactions within
proteins. The development of parameters that make
scoring functions faster and/or more realistic, therefore, has
clear benefits for protein design.
Parameterising the part of the hydrophobic interaction that
stabilises protein structures is one issue. In an attempt to
understand the energetics of core deletions from protein-engineering
studies, Vlassi et al. [30•] introduce ∆nh. Here, nh
reflects the number of methylene and methyl contacts
made within 6 Å of the site of mutation, and it is calculated
from high-resolution structures of the WT and mutant
proteins as the weighted sum of the atom–atom contacts
within that region. Two weightings are included for
distance and solvent accessibility, which dampen contributions
from distantly spaced atoms and from surface-exposed
residues, respectively. ∆nh is the difference between the
nh of the mutant and WT structures. Given the limited
experimental data available, ∆∆G and ∆nh correlate reasonably, and the slopes of ∆∆G versus ∆nh plots essentially
provide values for the energy cost per contact lost. In this
respect, ∆nh may prove useful in quickly assessing the
relative quality of in silico generated design models. Similar,
though less-sophisticated parameters have been introduced
by others, which may also be useful in this regard [31].
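The bookkeeping behind such a contact parameter can be sketched in a few lines of Python. The text above specifies only that contacts within 6 Å are summed with weights for distance and solvent accessibility; the functional forms used below (linear damping in both) are assumptions made for illustration and are not the published weightings. The names `contact_weight`, `n_h` and `delta_n_h` are hypothetical, and each contact is represented as a (distance in Å, fractional accessibility) pair.

```python
def contact_weight(distance, accessibility, cutoff=6.0):
    """Weight one methyl/methylene contact.

    Both damping terms are illustrative assumptions: the weight falls
    linearly to zero at the cutoff distance and linearly with exposure.
    """
    if distance >= cutoff:
        return 0.0
    distance_term = 1.0 - distance / cutoff
    burial_term = 1.0 - accessibility  # accessibility as a fraction, 0 to 1
    return distance_term * burial_term


def n_h(contacts, cutoff=6.0):
    """Weighted sum over (distance, accessibility) contacts near the site."""
    return sum(contact_weight(d, acc, cutoff) for d, acc in contacts)


def delta_n_h(wt_contacts, mutant_contacts, cutoff=6.0):
    """Difference between the mutant and WT values of n_h."""
    return n_h(mutant_contacts, cutoff) - n_h(wt_contacts, cutoff)


# Hypothetical example: a truncation that removes one buried contact.
wild_type = [(3.0, 0.0), (3.0, 0.0)]
mutant = [(3.0, 0.0)]
print(delta_n_h(wild_type, mutant))
```

With real structures, the slope of a plot of ∆∆G against such a quantity is what gives the energy cost per contact lost.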
What about assessing structural specificity? Fleming and
Richards [32•] apply an occluded surface algorithm to
calculate packing efficiencies in high-resolution protein
structures. A striking variation of approximately 20% in packing
parameters is noted across these structures. Briefly, packing
efficiency increases with protein size, α-helix content and
content of aromatic and small residues. The higher packing
density of α-helical structures is a result of good intrahelix
packing, primarily between backbone atoms. In terms of
intersecondary structure packing, β strand–β strand interactions show the highest occluded surfaces, whereas β
strand–α helix interactions are poorer and only marginally
better than intrastrand packing efficiencies; this fits neatly
with recent experimental findings [33••]. Finally, the
packing efficiencies of proteins from the same structural
family are similar — which presumably reflects the sum of
the above correlations — and the authors suggest that
these calculations will be of use in benchmarking and
validating homology and other models. Presumably, this
includes design models.
Jiang et al. [34••] present a new algorithm (CORE) for in silico
combinatorial design. Hydrophobic core residues are mutated
on fixed backbone structures. Metropolis-driven simulated
annealing and Monte Carlo sampling are combined with a
novel scoring function to find sequence and rotamer combinations for potentially hyperstable proteins. The scoring
function selects combinations with the best compromise
of minimal atom–atom clashes, maximal burial of hydrocarbon and lowest sidechain conformational entropy. The
assessment of steric compatibility is straightforward and
does not evaluate van der Waals interactions explicitly.
The thermodynamic term ∆CP, which is an experimental
parameter that reflects the amount of hydrophobic surface
buried during protein folding, is implemented to drive
towards sequences with maximum burial of hydrocarbon.
The entropy term is introduced to select combinations of
residues that ‘freeze out’ more conformations upon packing.
The latter seems counterintuitive, but is rationalised in that
structurally unique and cooperatively folded states require
relatively fixed sidechains to make specific interactions. In
a sense, the entropy term is an attempt to parameterise
negative design by selecting complementary fits. The
disadvantage of the current version of CORE is that the
polar residue and backbone contexts of the target are
constrained, which may explain why it returns core sequences
that are closely related to the WT. This aside, CORE’s
ability to cope with large structures is impressive and its
potential to relate in silico and experimental parameters is
promising. The group has used CORE to design a hyperthermophilic protein [35].

Figure 2. Phage-display selection of stable, folded
proteins. (a) Selectively infective phage (SIP)
takes advantage of the three-domain structure
of the minor coat protein (g3p) of phage. The
C-terminal domain anchors the protein in the
viral coat, whereas the N-terminal domains are
responsible for binding and infection in E. coli.
Cloning a library into the flexible linker before
the C-terminal domain allows protease-based
selection because proteolysis of the insert
removes the N-terminal domains and prevents
infection in E. coli. This selects against
unstable inserts [38,39]. (b) Alternatively, an
uninterrupted g3p can be used as follows.
His-tag–target–g3p–phage allows intact
protein–phage fusions to be tethered to
nickel-coated surfaces, which can be washed
with protease to remove phage harbouring
unstable linkers [40•]. In this case, selection
can be monitored directly by surface plasmon
resonance in BIACORE, which allows many
conditions to be tested quickly and individual
clones to be compared. Alternatively, Ni-NTA-agarose
beads can be used for large-scale
selections. The domains of g3p are
represented by shaded ovals and the targeted
protein inserts are represented by rectangles.

Experimental approaches in combinatorial design
and redesign
The first approaches to the combinatorial redesign of protein
cores were experimental [36,37]. These used function-based
selections; however, true experimental counterparts of the
aforementioned in silico methods require selections that do
not rely on function, but reflect only structure and stability.
Such methods would complement in silico methods and
find applications, firstly, in optimising stability under specific
conditions, secondly, in de novo design or redesign
where selectable functions are not available and, thirdly, in
establishing sequence-to-structure/stability rules where
structure/stability and structure/function must be uncoupled.
Three groups have succeeded in selecting stable proteins
without a functional screen or selection [38,39,40•]. All
combine phage display and proteolysis to recover stable
proteins from mutant libraries. The underlying principle is
straightforward: poorly folded mutants are proteolysed more
rapidly than competently folded and stable variants. But how
are stably folded (intact) variants rescued? In phage display,
a target gene is fused to that for a phage coat protein. This
leads to display of the target on the phage surface, where it
can be subjected to various selections. Because the gene
for the fusion is encased within the phage, phenotype
and genotype are linked, and selected proteins can be
identified by DNA sequencing. This system relies on the
compliance of Escherichia coli to become infected by and
then to propagate the phage. Traditional phage-display
selection relies on the displayed proteins binding something, which is a function. The selection of stable proteins
without resorting to functional selection has been achieved
in two ways. Kristensen and Winter [38] and Sieber et al. [39]
employ selectively infective phage (SIP) [41], whereas my
colleagues and I use an alternative approach, which
involves more traditional protein–phage fusions, to select
stable target proteins (Figure 2) [40•].
The proof-of-principle studies for these methods use a
variety of control inserts and/or relatively small protein
libraries. All three groups have now presented more
ambitious applications of the new technology: we rescued
stable ubiquitin variants from a library of hydrophobic core
mutants [42••]; Riechmann and Winter [43•] generated
stable protein chimeras by complementing half of the CspA
protein with fragments generated from genomic E. coli
DNA; and Martin et al. [44•] described the selection of
hyperstable variants of a mesophilic CspB.
Oil-drop versus jigsaw-puzzle models for
core packing
The issue of whether complementary core packing is
necessary for folding to a stable, unique state has also been
addressed during the period of review. The influential
work of DeGrado and co-workers [6] on the evolution of
designed four-helix bundles emphasises that achieving
stability and structural uniqueness are distinct; some of the
earlier designs were stable to denaturation, but showed
structural heterogeneity. Achieving structural uniqueness
using negative design is now recognised as key in protein
design. Is negative design necessary, however, in
hydrophobic core design or can structural specificity be
achieved elsewhere in the sequence and structure?
There are two extreme models for core packing. In the oil-drop
model, partitioning of hydrophobic and polar residues
is paramount and the precise fit in the core secondary. In
the jigsaw-puzzle model, however, shape and chemical
complementarity of residues in the core are all-important
in defining the structural uniqueness. These concepts and
models are more fully reviewed elsewhere [2,4].
Until recently, combinatorial mutagenesis studies of protein
cores lent force to the oil-drop model; within the restraints
of maintaining ballpark hydrophobicity and volume, the
cores of λ-repressor [36], barnase [37] and T4 lysozyme [45]
tolerate amino-acid substitutions. Recent experimental,
bioinformatics and theoretical work, however, suggests that,
for other proteins and even for groups of structurally related
proteins, this is not necessarily the case.

Evidence highlighting the need for specific constellations of
residues within a protein core comes from combinatorial
mutagenesis and selection of ubiquitin [42••]. A library has
been created in which the first eight core positions are
substituted with combinations of phenylalanine, isoleucine,
leucine, methionine and valine. (Multiple amino acids can be
encoded at a single position in a protein by introducing
degenerate codons into the synthetic oligonucleotides used
for mutagenesis. For example, {AGT}T{CG} encodes the
hydrophobic residues phenylalanine, isoleucine, leucine,
methionine and valine in the ratio 1:1:1:1:2, whereas
{ACG}A{ACGT} encodes the polar subset aspartic acid,
glutamic acid, histidine, lysine, asparagine and glutamine in
equal numbers.) The stable ubiquitin selectants show three
surprises. Firstly, most have only two, three or four differences
from WT, whereas random selection would have mostly
returned sequences with seven substitutions. Secondly, their
consensus sequence differs from WT at only one site (V26L).
Thirdly, none are as stable as WT. Thus, after selection, the
library becomes more like WT, although WT stability is not
matched. These results concur with earlier computational
studies that use design algorithms that either repack the
ubiquitin core [46,47] or create simulated sequences for
ubiquitin-like architectures [48]. Together, these studies
suggest that specific constellations of residues, or folding
nuclei, may be important for the ubiquitin-like superfamily.

More experimental evidence comes from excellent work
by Silverman et al. [33••]. This describes a combinatorial
dissection of structural residues important in defining and
maintaining an active conformation of triose phosphate
isomerase (TIM), the archetypal (β/α)₈ barrel. Effectively,
this structure has two hydrophobic cores. The main conclusion
regarding core packing is that the two cores react differently
to mutation: the core between the outer α helices and
the inner β barrel is tolerant, whereas the inner core of the
β barrel is extremely sensitive.

Where do these studies leave the protein designer? Recent
computational studies on other systems [49–51] lend
support to the experimental work on ubiquitin and TIM;
that is, for certain proteins, the jigsaw-puzzle model for
core packing might be appropriate. In one respect, this is
encouraging. If stable sequences do cluster in sequence space,
such regions might be homed in on or otherwise targeted
in computational and experimental combinatorial design.
Indeed, such approaches are underway [33••,50,52]. On
the other hand, if the stable regions of sequence space are
highly focused, locating them could prove difficult. The
problem could be particularly difficult for true de novo
design of novel structures. Nonetheless, Kuhlman and
Baker [51] inject some optimism here: sequence simulations
using NMR-derived templates (compared with using X-ray
structures) illustrate that introducing backbone flexibility
widens the net of sequences compatible with the target.
Furthermore, even for ubiquitin sequences with half
the core sites altered, structures that do fold correctly can
be selected computationally and experimentally [42••,46,53].

Experimental approaches in combinatorial de novo design
The selection studies described above use natural scaffolds;
they are redesigns. What is the scope for the design of
novel structures using combinatorial approaches?

Keefe and Szostak [54••] derive ATP-binding peptides
from a library of 80-residue randomers displayed on
mRNA. Cycles of selection and a round of mutagenesis are
combined to increase the ATP-binding fraction of the
library. Many of the selectants have a similar 45-residue
hub with a CXXC zinc-binding motif, which is responsible
for activity. Unfortunately, none of the selectants are
isolated or characterised in structural detail. The authors
suggest that approximately 1 in 10¹¹ randomly generated
sequences should have some targetable function and,
although this bodes well for directed evolution approaches,
it highlights the gargantuan task facing de novo design.

Keefe and Szostak’s study is wonderful and I respect their
views; however, I feel that more targeted approaches are
also needed. The difficulties here are, firstly, to choose or
design a starting template shrewdly and, secondly, to
restrict sequence space to permit an experiment while
allowing enough freedom to encounter stable proteins.

Binary patterns of hydrophobic and polar residues (HP
patterns) simplify protein sequences, and offer one approach
to deriving templates and limiting amino acid usage [55].
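Restricting sequence space in practice means restricting codons. As a check on the two degenerate codons quoted for the ubiquitin library, the following Python sketch enumerates every codon each one can produce and translates it with the standard genetic code. The helper name `expand` is illustrative; the codon table is built from the conventional TCAG ordering.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for first, second and third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand(degenerate_codon):
    """Count the amino acids encoded by a degenerate codon.

    `degenerate_codon` holds three strings, each listing the bases allowed
    at that codon position, for example ("AGT", "T", "CG").
    """
    return Counter(
        CODON_TABLE["".join(codon)] for codon in product(*degenerate_codon)
    )


hydrophobic = expand(("AGT", "T", "CG"))    # {AGT}T{CG}
polar = expand(("ACG", "A", "ACGT"))        # {ACG}A{ACGT}
print(sorted(hydrophobic.items()))
print(sorted(polar.items()))
```

The first codon yields six codons for five residues, with valine encoded twice, and the second yields twelve codons shared equally among six polar residues, in agreement with the ratios given in the text.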
Roy and Hecht [56••] use HP patterns to make a library of
potential four-helix-bundle structures: amphipathic helical
segments — with PHPPHHPPHPPHHP patterns of polar
(lysine, histidine, glutamate, glutamine, aspartate and
asparagine) and hydrophobic (phenylalanine, isoleucine,
leucine, methionine and valine) residues generated using the
degenerate codons described above — are linked by glycine,
proline and polar-based turns. Although the sequences
cannot all be sampled, the library size is potentially
5 × 10⁴¹, which is approximately 54 orders of magnitude
smaller than a completely random library. In this study,
proteins are not ‘actively selected’; monoclonals are simply
expressed. Most of the 26 variants analysed are monomeric
and half of them unfold with sigmoidal thermal unfolding
curves and measurable enthalpies. The authors argue that
this should be true for the majority of the library, although
it is likely that some in vivo selection for competently
folded, expressible and nontoxic proteins is at play. By
comparison, sequences generated from truly random libraries
are generally not so well behaved.
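The quoted library size can be reproduced from the pattern itself. The Python sketch below counts the sequences allowed by one PHPPHHPPHPPHHP segment with six polar and five hydrophobic choices, and raises that to the fourth power for four helices. It assumes that only the patterned helical positions vary and that the turns are fixed, which is a simplification. The comparison with a fully random library depends on the total chain length, which is not stated here; the length of 74 residues used in the last line is purely an illustrative assumption.

```python
import math

PATTERN = "PHPPHHPPHPPHHP"  # helical segment pattern quoted in the text
N_POLAR = 6                 # K, H, E, Q, D, N
N_HYDROPHOBIC = 5           # F, I, L, M, V
N_HELICES = 4


def segment_size(pattern):
    """Number of distinct sequences allowed by one binary-patterned segment."""
    return N_POLAR ** pattern.count("P") * N_HYDROPHOBIC ** pattern.count("H")


def orders_below_random(library_size, length):
    """log10 gap between a library and all 20**length random sequences."""
    return length * math.log10(20) - math.log10(library_size)


per_helix = segment_size(PATTERN)
library = per_helix ** N_HELICES
print(f"per helix: {per_helix:.3e}, four helices: {library:.3e}")
# Chain length of 74 is an assumption for illustration only.
print(f"orders of magnitude below random: {orders_below_random(library, 74):.1f}")
```

Under these assumptions, the four-helix product comes out close to the 5 × 10⁴¹ given above.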
A cautionary tale for this approach comes from the same
group. To promote amphipathic β-structures, West et al. [57•]
generate semi-random sequences with six segments of
alternating HP patterns separated by turn-promoting
tetrapeptides. Several of the expressed proteins
reversibly form amyloid-like β-structured fibres. The
group follow this up with an analysis of natural peptide
sequences and find that simple alternating HP patterns are
not favoured, but are actually under-represented [58].
They argue that Nature disfavours alternating HP
sequences because of the possibility of amyloidogenesis.
How does one assign an HP pattern to a more formal model
for a design target? Marshall and Mayo [59•] introduce
Genclass, which automatically defines a binary (HP) pattern
for a target structure. Based on the solvent accessibility of a
generic sidechain placed at a position in a sequence,
Genclass assigns the site as buried, surface or boundary.
Appropriate cut-offs are gleaned from known structures; for
example, approximately 20 Å² predicts approximately 75%
of the HP patterns. The cut-off has also been optimised
experimentally through redesign cycles on the homeodomain
fold. Based on thermal stability and correct folding, the best
designs equate to those HP patterns that would be selected
using a cut-off of approximately 40 Å²; in effect, more of
the boundary sites are made hydrophobic. The results
show that optimisation of HP patterns can improve protein
design stability considerably. The discrepancy with natural
mesophilic sequences possibly reflects Nature’s lack of
interest in superstable proteins and the potential role of
surface hydrocarbon (or buried polar residues) in specifying
structure and function.
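The effect of moving the cut-off can be illustrated with a short Python sketch. This is not Genclass: it collapses the buried, boundary and surface classes into a single threshold on solvent-accessible area, and the per-site areas below are invented values chosen only to show the behaviour. The function name `hp_pattern` is hypothetical.

```python
def hp_pattern(accessible_areas, cutoff=20.0):
    """Assign H to sites at or below the cut-off (in Å²) and P otherwise."""
    return "".join("H" if area <= cutoff else "P" for area in accessible_areas)


# Hypothetical accessible areas (Å²) for a generic side chain at seven sites.
areas = [5, 30, 80, 15, 38, 60, 10]

natural_like = hp_pattern(areas, cutoff=20.0)
stabilised = hp_pattern(areas, cutoff=40.0)
print(natural_like, stabilised)
```

Raising the threshold from 20 Å² to 40 Å² converts the two intermediate sites from polar to hydrophobic, which is the sense in which more of the boundary sites are made hydrophobic.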
An alternative method for assigning HP patterns is presented by Silverman et al. [33••]. These workers used
sequence alignments to class residues as phylogenetically
hydrophobic, polar, conserved or variable. The classes are
used to guide the design of libraries for combinatorial
experiments on TIM. The selected (functional) sequences
indicate that approximately five times as many of the
phylogenetically hydrophobic sites show amino-acid
preferences compared with the polar sites.
Conclusions
Good progress is being made in experimental core-directed
design to complement in silico approaches. Iterative
approaches are being formalised as a continued means to
test design principles and hone specific designs. Interesting,
though cautionary, results are still emerging from the
redesign and design of four-helix-bundle structures. Protein
engineering experiments continue to be rationalised to
relate stability changes to new structural parameters, which
could be of value in improving scoring functions for in silico
design. The most encouraging signs are in experimental
combinatorial approaches. Here, methods are being developed to recover stable and correctly folded proteins from
combinatorial libraries without functional selections or
screens. In addition, because of the vastness of sequence
and structural space, the design of libraries for such work is
being rationalised and focused. In short, we are in a strong
position to redesign stability into existing protein frameworks with confidence and we are better placed to tackle
true de novo design of novel sequences and structures. The
difficulties here will be to make sensible choices for design
templates; to guide these using positive and negative
design principles; and to make focused combinatorial
libraries using reduced amino-acid alphabets, which,
nonetheless, contain sequences compatible with a competent
structure. The next step will be to append and tailor
functions onto such structures.
References and recommended reading
Papers of particular interest, published within the annual period of review,
have been highlighted as:
• of special interest
•• of outstanding interest
1. Kohn WD, Hodges RS: De novo design of alpha-helical coiled
coils and bundles: models for the development of protein-design
principles. Trends Biotechnol 1998, 16:379-389.
2. Lazar GA, Handel TM: Hydrophobic core packing and protein
design. Curr Opin Chem Biol 1998, 2:675-679.
3. Street AG, Mayo SL: Computational protein design. Structure
1999, 7:R105-R109.
4. Beasley JR, Hecht MH: Protein design: the choice of de novo
sequences. J Biol Chem 1997, 272:2031-2034.
5. Hellinga HW: Rational protein design: combining theory and
experiment. Proc Natl Acad Sci USA 1997, 94:10015-10017.
6. Hill RB, Raleigh DP, Lombardi A, DeGrado WF: De novo design of
helical bundles as models for understanding protein folding and
function. Accounts Chem Res 2000, 33:745-754.
7. Harbury PB, Zhang T, Kim PS, Alber T: A switch between
2-stranded, 3-stranded and 4-stranded coiled coils in GCN4
leucine-zipper mutants. Science 1993, 262:1401-1407.
8. Woolfson DN, Alber T: Predicting oligomerization states of coiled
coils. Protein Sci 1995, 4:1596-1607.
9. Walshaw J, Woolfson DN: SOCKET: a program for identifying and
analysing coiled-coil motifs within protein structures. J Mol Biol
2001, 307:1427-1450.
10. • Nautiyal S, Alber T: Crystal structure of a designed, thermostable,
heterotrimeric coiled coil. Protein Sci 1999, 8:84-90.
The authors present a crystal structure that confirms a previous design of a
novel heterotrimeric coiled coil (see [18]) that employs positive and negative
design features to specifically orient the helices.
11. Harbury PB, Plecs JJ, Tidor B, Alber T, Kim PS: High-resolution
protein design with backbone freedom. Science 1998,
282:1462-1467.
12. Sharma VA, Logan J, King DS, White R, Alber T: Sequence-based
design of a peptide probe for the APC tumor suppressor protein.
Curr Biol 1998, 8:823-830.
13. Pandya MJ, Spooner GM, Sunde M, Thorpe JR, Rodger A,
Woolfson DN: Sticky-end assembly of a designed peptide fiber
provides insight into protein fibrillogenesis. Biochemistry 2000,
39:8728-8734.
14. Ogihara NL, Ghirlanda G, Bryson JW, Gingery M, DeGrado WF,
Eisenberg D: Design of three-dimensional domain-swapped
dimers and fibrous oligomers. Proc Natl Acad Sci USA 2001,
98:1404-1409.
15. Walsh STR, Cheng H, Bryson JW, Roder H, DeGrado WF: Solution
structure and dynamics of a de novo designed three-helix bundle
protein. Proc Natl Acad Sci USA 1999, 96:5486-5491.
16. Lovejoy B, Choe S, Cascio D, McRorie DK, DeGrado WF, Eisenberg D:
Crystal-structure of a synthetic triple-stranded alpha-helical
bundle. Science 1993, 259:1288-1293.
17. Bryson JW, Desjarlais JR, Handel TM, DeGrado WF: From coiled
coils to small globular proteins: design of a native-like three-helix
bundle. Protein Sci 1998, 7:1404-1414.
18. Nautiyal S, Woolfson DN, King DS, Alber T: A designed
heterotrimeric coiled-coil. Biochemistry 1995, 34:11645-11651.
19. Walsh STR, Sukharev VI, Betz SF, Vekshin NL, DeGrado WF:
Hydrophobic core malleability of a de novo designed three-helix
bundle protein. J Mol Biol 2001, 305:361-373.
20. Johansson JS, Gibney BR, Skalicky JJ, Wand AJ, Dutton PL:
A native-like three-alpha-helix bundle protein from
structure-based redesign: a novel maquette scaffold. J Am Chem
Soc 1998, 120:3881-3886.
21. Gibney BR, Rabanal F, Skalicky JJ, Wand AJ, Dutton PL: Iterative
protein redesign. J Am Chem Soc 1999, 121:4952-4960.
22. Robertson DE, Farid RS, Moser CC, Urbauer JL, Mulholland SE,
Pidikiti R, Lear JD, Wand AJ, DeGrado WF, Dutton PL: Design and
synthesis of multi-heme proteins. Nature 1994, 368:425-431.
23. Skalicky JJ, Gibney BR, Rabanal F, Urbauer RJB, Dutton PL, Wand AJ:
Solution structure of a designed four-alpha-helix bundle
maquette scaffold. J Am Chem Soc 1999, 121:4941-4951.
24. Gibney BR, Dutton PL: Histidine placement in de novo-designed
heme proteins. Protein Sci 1999, 8:1888-1898.
25. Lupas A: Coiled coils: new structures and new functions. Trends
Biochem Sci 1996, 21:375-382.
26. Munson M, Balasubramanian S, Fleming KG, Nagi AD, O’Brien R,
Sturtevant JM, Regan L: What makes a protein a protein?
Hydrophobic core designs that specify stability and structural
properties. Protein Sci 1996, 5:1584-1593.
27. • Willis MA, Bishop B, Regan L, Brunger AT: Dramatic structural and
thermodynamic consequences of repacking a protein’s
hydrophobic core. Structure 2000, 8:1319-1328.
A topological rearrangement of ROP is described that accompanies multiple
mutations within the hydrophobic core. See also [28•,29].
28. • Glykos NM, Cesareni G, Kokkinidis M: Protein plasticity to the
extreme: changing the topology of a 4-alpha-helical bundle with a
single amino acid substitution. Structure 1999, 7:597-603.
A new protein architecture is described for a point mutant of ROP. See
also [27•,29].
29. Hill RB, DeGrado WF: Solution structure of alpha D-2, a
nativelike de novo designed protein. J Am Chem Soc 1998,
120:1138-1145.
30. • Vlassi M, Cesareni G, Kokkinidis M: A correlation between the loss
of hydrophobic core packing interactions and protein stability.
J Mol Biol 1999, 285:817-827.
A new parameter is introduced for rationalising the effects on stability of
making deletions in protein cores.
31. Main ERG, Fulton KF, Jackson SE: Context-dependent nature of
destabilizing mutations on the stability of FKBP12. Biochemistry
1998, 37:6145-6153.
32. • Fleming PJ, Richards FM: Protein packing: dependence on protein
size, secondary structure and amino acid composition. J Mol Biol
2000, 299:487-498.
The packing efficiencies of protein structures are quantified and the origins
of noted differences are explored.
33. •• Silverman JA, Balakrishnan R, Harbury PB: Reverse engineering the
(beta/alpha)(8) barrel fold. Proc Natl Acad Sci USA 2001,
98:3092-3097.
A wonderful combinatorial dissection of the residues important for
specifying the TIM barrel is described. A variety of conclusions are made.
Of note here is the finding that the two distinct hydrophobic cores respond
differently to mutation: the outer core is permissive, whereas the inner
core is less tolerant (see also [42••]).
34. •• Jiang X, Farid H, Pistor E, Farid RS: A new approach to the design
of uniquely folded thermally stable proteins. Protein Sci 2000,
9:403-416.
A new computational approach to core design is introduced. A new scoring
function includes an entropy term, which effectively selects residues and
rotamers that ‘freeze out’ more conformational degrees of freedom. The aim
is to produce cores with complementary fits of residues.
35. Jiang X, Bishop EJ, Farid RS: A de novo designed protein with
properties that characterize natural hyperthermophilic proteins.
J Am Chem Soc 1997, 119:838-839.
36. Lim WA, Sauer RT: Alternative packing arrangements in the
hydrophobic core of lambda-repressor. Nature 1989, 339:31-36.
37. Axe DD, Foster NW, Fersht AR: Active barnase variants with
completely random hydrophobic cores. Proc Natl Acad Sci USA
1996, 93:5590-5594.
38. Kristensen P, Winter G: Proteolytic selection for protein folding
using filamentous bacteriophages. Fold Des 1998, 3:321-328.
39. Sieber V, Pluckthun A, Schmid FX: Selecting proteins with improved
stability by a phage-based method. Nat Biotechnol 1998,
16:955-960.
40. • Finucane MD, Tuna M, Lees JH, Woolfson DN: Core-directed protein
design. I. An experimental method for selecting stable proteins
from combinatorial libraries. Biochemistry 1999, 38:11604-11612.
A phage-display selection method for rescuing stably folded proteins from
combinatorial libraries (see also [38,39]).
41. Jung S, Arndt KM, Muller KM, Pluckthun A: Selectively infective
phage (SIP) technology: scope and limitations. J Immunol
Methods 1999, 231:93-104.
42. •• Finucane MD, Woolfson DN: Core-directed protein design. II.
Rescue of a multiply mutated and destabilized variant of
ubiquitin. Biochemistry 1999, 38:11613-11623.
This paper describes the selection of stable ubiquitin variants from a library
of hydrophobic mutants. The selectants show a clear consensus for the WT
sequence, although none match WT stability. This provides evidence for the
requirement for a restricted constellation of residues to specify and cement
this particular core (see also [33••]).
43. • Riechmann L, Winter G: Novel folded protein domains generated
by combinatorial shuffling of polypeptide segments. Proc Natl
Acad Sci USA 2000, 97:10068-10073.
This paper describes an interesting attempt to select stable protein chimeras
formed by combining the N-terminal half of CspA and fragmented genomic
E. coli DNA.
44. • Martin M, Sieber V, Schmid FX: In-vitro selection of highly
stabilized protein variants with optimized surface. J Mol Biol
2001, 309:717-726.
This paper provides an alternative view of stabilising proteins: hyperstable
variants are selected from a library in which only surface residues of a
mesophilic form of CspB are mutagenised.
45. Gassner NC, Baase WA, Matthews BW: A test of the ‘jigsaw
puzzle’ model for protein folding by multiple methionine
substitutions within the core of T4 lysozyme. Proc Natl Acad Sci
USA 1996, 93:12155-12158.
46. Lazar GA, Desjarlais JR, Handel TM: De novo design of the
hydrophobic core of ubiquitin. Protein Sci 1997, 6:1167-1178.
47. Wernisch L, Hery S, Wodak SJ: Automatic protein design with all
atom force-fields by exact and heuristic optimization. J Mol Biol
2000, 301:713-736.
48. Michnick SW, Shakhnovich E: A strategy for detecting the
conservation of folding-nucleus residues in protein superfamilies.
Fold Des 1998, 3:239-251.
49. Koehl P, Levitt M: De novo protein design. I. In search of stability
and specificity. J Mol Biol 1999, 293:1161-1181.
50. Koehl P, Levitt M: De novo protein design. II. Plasticity in sequence
space. J Mol Biol 1999, 293:1183-1193.
51. Kuhlman B, Baker D: Native protein sequences are close to
optimal for their structures. Proc Natl Acad Sci USA 2000,
97:10383-10388.
52. Voigt CA, Mayo SL, Arnold FH, Wang ZG: Computational method to
reduce the search space for directed protein evolution. Proc Natl
Acad Sci USA 2001, 98:3778-3783.
53. Johnson EC, Lazar GA, Desjarlais JR, Handel TM: Solution structure
and dynamics of a designed hydrophobic core variant of ubiquitin.
Structure 1999, 7:967-976.
54. •• Keefe AD, Szostak JW: Functional proteins from a
random-sequence library. Nature 2001, 410:715-718.
An excellent study is described in which ATP-binding polypeptides are
selected from a starting library of near-random 80-mers.
55. Kamtekar S, Schiffer JM, Xiong HY, Babik JM, Hecht MH: Protein
design by binary patterning of polar and nonpolar amino-acids.
Science 1993, 262:1680-1685.
56. •• Roy S, Hecht MH: Cooperative thermal denaturation of proteins
designed by binary patterning of polar and nonpolar amino acids.
Biochemistry 2000, 39:4603-4607.
An alternative view of core packing and protein design. Characterisations of
variants expressed from a HP library for a four-helix-bundle template are
described. It is estimated that about half of the proteins fold with some
degree of cooperativity and independence, which is a considerable improvement on completely random libraries.
57. • West MW, Wang WX, Patterson J, Mancias JD, Beasley JR, Hecht MH:
De novo amyloid proteins from designed combinatorial libraries.
Proc Natl Acad Sci USA 1999, 96:11211-11216.
A cautionary note on working with HP-patterned templates in combinatorial
design. This work describes a library targeted at six-stranded β structures.
Expressed variants self-assemble and reversibly form amyloid.
58. Broome BM, Hecht MH: Nature disfavors sequences of alternating
polar and non-polar amino acids: implications for
amyloidogenesis. J Mol Biol 2000, 296:961-968.
59. • Marshall SA, Mayo SL: Achieving stability and conformational
specificity in designed proteins via binary patterning. J Mol Biol
2001, 305:619-631.
A computational method for assigning HP patterns to a structural template
is described. This is tested on the structures in the PDB and optimised
through cycles of protein redesign.
60. Banner DW, Kokkinidis M, Tsernoglou D: Structure of the ColE1
Rop protein at 1.7 Å resolution. J Mol Biol 1987, 196:657-675.