Intrinsically unstructured proteins

YourAuthorPDFforBiochemicalSocietyTransactions
We are pleased to provide a copy of the Version of Record of your article. This PDF is
provided for your own use and is subject to the following terms and conditions:



You may not post this PDF on any website, including your personal website or your
institution’s website or in an institutional or subject-based repository (e.g. PubMed
Central).
You may make print copies for your own personal use.
You may distribute copies of this PDF to your colleagues provided you make it clear
that these are for their personal use only.
Permission requests for re-use or distribution outside of the terms above, or for commercial
use, should be sent to [email protected].
A Biochemical Society Focused Meeting held at University of York, U.K., 26–27 March 2012. Organized and Edited by Jennifer Potts (York, U.K.) and Mike
y
Williamson (Sheffield, U.K.).
C
op
Mike P. Williamson*1 and Jennifer R. Potts†
*Department of Molecular Biology and Biotechnology, University of Sheffield, Firth Court, Western Bank, Sheffield S10 2TN, U.K., and †Department of Biology,
University of York, Heslington, York YO10 5DD, U.K.
Abstract
IDPs (intrinsically disordered proteins) are common in eukaryotic genomes and have regulatory roles. In
the cell, they are disordered, although not completely random. They bind weakly, but specifically, often
remaining partially disordered even when bound. Whereas folded globular proteins have ‘executive’ roles
in the cell, IDPs have an essential administrative function, making sure that the executive functions are
properly co-ordinated. This makes them a good target for pharmaceutical intervention.
Au
th
or
Biochemical Society Transactions
www.biochemsoctrans.org
Intrinsically disordered proteins: administration
not executive
Introduction
As soon as the first crystal structures of proteins became available, it became clear that most proteins are
globular, composed of regular secondary structures (αhelices and β-sheets), with intervening irregular loops,
sometimes of considerable length. This is the basis of
the classic primary/secondary/tertiary/quaternary structure
classification of proteins. It also became clear early on that
some proteins have loops or termini that are so unstructured
or mobile that they cannot be seen in crystal structures. The
fact that well-defined protein folds are both conceptually
and experimentally more accessible probably contributed to
disordered regions of proteins being entirely overlooked or
to the assumption that they have no function or are merely
linkers. More recently, particularly with the availability of
eukaryotic genome sequences, the realization has come that
these unstructured parts are more than just linkers; they often
have a function.
It is currently estimated that roughly one-third of
protein sequence in the human proteome is intrinsically
disordered [1], that almost half of all human proteins contain
stretches of at least 30 intrinsically disordered residues,
and that roughly 25 % of human proteins are entirely
disordered [2]. Disordered regions are much more common
in eukaryotes than in prokaryotes, and are found particularly
in transcription factors and other proteins associated with
Key words: flexibility, globular protein, intrinsically disordered protein (IDP), protein evolution,
secondary structure.
Abbreviation used: IDP, intrinsically disordered protein.
1
To whom correspondence should be addressed (email m.williamson@sheffield.ac.uk).
Biochem. Soc. Trans. (2012) 40, 945–949; doi:10.1042/BST20120188
Intrinsically Disordered Proteins
Intrinsically Disordered Proteins
signalling and regulation. It is therefore clear that intrinsically
disordered regions are not ‘junk’, but that this is an evolved
property and thus functionally important.
It is becoming clear that there are several different types of
IDP (intrinsically disordered protein) [3], and that ‘disorder’
represents a continuum from complete and permanent
disorder to sequence that, under some circumstances (for
example, when bound to a partner), is fully ordered. A
relatively small number of proteins function as genuinely
disordered entropic chains, essentially as expected for a classic
disordered polymer [4]. These include some very interesting
proteins. To take four examples: resilin is an elastic protein
that is found in several insects which allows them to jump
several times their body length [5]; salivary proline-rich
proteins are abundant proteins in saliva and function to
wrap up dietary polyphenols and escort them out of the
body [6]; the C-terminal ends of neurofilaments form a
hydrogel, their lengths acting as spacers [7]; and nucleoporins
form a stiff curtain that covers the nuclear pore and hinders
the free passage of proteins across the pore, unless such
proteins interact strongly enough with the (Phe-Gly) repeats
of nucleoporins to allow them to move through the curtain,
pushing it aside as it goes [8]. However, most IDPs become
fully or partially ordered on binding, as discussed in more
detail below.
The structure of IDPs
Protein secondary structure can be classified as α-helix, βsheet or random coil. Helix and sheet are regular secondary
C The
Authors Journal compilation
C 2012
Biochemical Society
945
Biochemical Society Transactions (2012) Volume 40, part 5
Figure 1 Backbone dihedral angles
Backbone dihedral angles ϕ and ψ (Ramachandran plot) for (a) all
amino acid residues, (b) residues in regular secondary structure only, (c)
residues not in regular secondary structure, i.e. random coil/intrinsically
disordered. Each contour includes approximately 15 % more residues.
Data taken from [9], courtesy of Professor S. Hovmöller (Stockholm
University).
y
structures, stabilized by a regular pattern of hydrogen bonds,
and amino acid residues in regular secondary structures
occupy a small fraction of the (ϕ,ψ) plane (Figure 1). We
can therefore define random coil as the structure that a
polypeptide chain adopts when it is not in a regular secondary
structure. This can be described rather precisely, by, for
example, analysing the backbone dihedral angles of proteins
for all residues not in regular secondary structure. The result
is shown in Figure 1(c) [9]. More careful analysis shows
that each amino acid has a slightly different distribution,
because of the different restrictions posed by the side chains
(except, of course, for glycine and proline, which are very
different). We can then calculate the random coil structure for
a given polypeptide sequence by sampling randomly from this
residue-specific distribution. To represent the population,
we need to take many different samples: random coil is
an ensemble and it is meaningless to try to represent it by
any kind of ‘average’. Experimental NMR studies based on
residual dipolar couplings, and confirmed by SAXS (smallangle X-ray scattering), have shown that such ensembles
are an accurate representation of truly disordered proteins
[10,11]. In particular, although they are disordered, this does
not mean that they are random; there remain conformational
preferences that are sequence-specific. The random coil (ϕ,ψ)
plane of Figure 1(b) predominantly contains conformations
around the β-sheet or polyproline II helix region, implying
that, in general, IDPs have fairly extended structures.
The conformation of an amino acid is necessarily
constrained by the nature and conformation of the residues
on either side. This represents a sophistication of the approach
described in the previous paragraph: to generate an accurate
picture of a random coil ensemble, we need to filter in nearestneighbour effects. However, we need go no further than the
nearest neighbour, since steric effects beyond one residue are
unlikely. This gives rise to an important conclusion: that,
for a truly random coil polypeptide, there should be no
interactions between non-adjacent side chains. In particular,
there should be no hydrophobic interactions. Hydrophobic
interactions give rise to ordering or clustering of side chains,
and thus to non-random-coil structures. This provides a
useful insight into the reason IDPs contain a very low
proportion of hydrophobic (leucine, isoleucine and valine) or
aromatic (phenylalanine, tyrosine and tryptophan) residues,
and a very high proportion of small hydrophilic residues
(glycine, serine, glutamate, glutamine and proline) [2]: if
they did contain significant amounts of hydrophobic or
aromatic amino acids, they would no longer be disordered.
However, functionally disordered proteins, e.g. α-synuclein,
can contain detectable long-range interactions and are more
compact than would be predicted in the absence of such
interactions [12].
It is now agreed that the cytoplasmic milieu is very
different from the dilute aqueous solutions normally used
by biochemists. The cytoplasm is crowded, with high protein
concentrations [13]. This tends to make proteins associate
together more, and fold more readily. It is therefore a
very reasonable question to ask whether proteins which
Au
th
or
C
op
946
C The
C 2012 Biochemical Society
Authors Journal compilation are disordered in dilute solution genuinely are disordered
in the cell. A number of recent studies have shown that
they are. Technical improvements have made it possible to
observe NMR spectra of proteins expressed inside cells at
only a little more than their physiological concentrations
[14]. Most proteins expressed in this way are not visible in
NMR spectra, implying that they bind too tightly to other
cellular components to be seen. However, IDPs are much
more likely to be visible, and have NMR spectra very similar
to those in dilute solution. The strong implication is that there
is evolutionary pressure on IDPs not to interact strongly
Intrinsically Disordered Proteins
The functions of IDPs
Figure 2 The overall free energy change G for the binding of an
IDP to its partner can be considered as the sum of two terms: the
unfavourable free energy for folding of the IDP (Gf ), and the
favourable energy for binding of the folded protein to its partner
(Gb )
The value of Gb is more or less fixed by the structure of the complex,
y
but Gf can have a wide range of values, and therefore the value of
G can also vary widely, allowing the binding affinity to have whatever
value is required by the cell.
C
op
with cellular components: this is consistent with the largely
hydrophilic amino acid composition described above.
It is relevant to note that many IDPs go further than
the amino acid bias described above, by also having lowcomplexity sequences, i.e. sequences formed by approximate
repeats of single or multiple amino acids. In many cases,
it is likely that the low complexity is a consequence of
recent evolutionary history, in which the sequence arose by
duplication (repeat expansion) of shorter existing sequences
[15]. An important implication of this conclusion is that many
IDPs are recent proteins. This is consistent with the point
made above that many IDPs are eukaryotic proteins involved
in the regulation of transcription or signalling pathways; such
functions generally arise only as modifications to a pathway,
after the main function has evolved [16].
Weak binding as described in the present paper is often
carried out by short sequences that are poor versions of
a consensus binding sequence; their weak binding is an
evolved property, which enables them to bind rapidly but
be displaced easily [18]. The inherently weak binding of
such sequences is often enhanced by intramolecular binding,
which strengthens the interaction considerably, by greatly
increasing the probability of two sequences coming together,
and thus increasing the on-rate [19]. These sequences are often
very short, no more than eight or ten residues, which makes
them easily created and easily lost by random mutations.
The long unstructured segments of many signalling proteins
and transcription factors probably contain many such weak
interaction sites, acting as volume control rather than on/off
switches to assemble the cellular machinery appropriately.
In this context, it is worth noting that alternative splicing
has the effect of introducing or removing short peptide
sequences. Alternative splicing occurs in over 90 % of human
genes, and over 50 % of these events are tissue-specific [20].
Many of these tissue-specific events involve sequences that
are intrinsically disordered [21]. It therefore seems likely
that these alternative splicing events introduce or remove
weak binding interactions, and thereby modulate function.
The very high frequency of such events means that this is
probably an important regulatory mechanism.
Weak binding has a very important thermodynamic
benefit: because the energies involved are small, binding and
folding can be pushed in one direction or the other by rather
small external triggers. This makes it possible to create a
system that is ‘poised’ in a metastable state, in which a small
event, such as ligand binding, can give rise to large-scale
change. This idea has been developed by Hilser and colleagues
[22,23], and examples can be found in many of the papers in
this issue of Biochemical Society Transactions; for example, see
[24] and [25], which show how dynein function is modulated
by competing binding events.
Au
th
or
We noted above that most IDPs fold on binding, although
not all fold to fully defined conformations, maintaining some
‘fuzziness’ even when folded [17]. In this section, we consider
why such behaviour has evolved. Unfolded proteins in vitro
are generally degraded more quickly than folded proteins
and are more likely to aggregate. The cell would presumably
be ‘happier’ if all proteins were properly folded. Why then
are IDPs so common? There is no single answer to this
question; IDPs are useful in several different ways, a number
of which are discussed elsewhere in this issue of Biochemical
Society Transactions and a few are highlighted below.
Weak, but specific, binding
Given equivalent interactions and solvation in the bound
state, intrinsically disordered regions will always bind to
their partners more weakly than an equivalent ready-folded
domain, owing to the unfavourable entropy associated with
folding (Figure 2). Weak, but specific, binding is particularly
important in regulatory systems, in which the binding of an
IDP acts to modulate the activity of its partner. This is a very
common observation, the modulation being either to enhance
activity, for example by creating a new interaction surface
for additional proteins, or to reduce activity, for example by
autoinhibition, i.e. the shutting down of activity by closing a
protein in on itself. There are many examples of such function
in the other papers in this issue of Biochemical Society
Transactions. This role for IDPs is emerging as probably
their most important cellular function. As an analogy, we
can picture classical globular proteins as being the taskfocused executives of the cell, intent only on getting the job
done as quickly as possible. In contrast, IDPs are the administration or management of the cell: they step in to
calm things down, maintain relationships, bring partners
together, revitalize dormant activity and ensure the integrity
of complex processes. As in any big organization, good
administration is essential for the smooth running of the
cell, and IDPs do their job well because they have been
selected and refined by evolution for this role.
C The
C 2012 Biochemical Society
Authors Journal compilation 947
Biochemical Society Transactions (2012) Volume 40, part 5
Conclusions
A fascinating feature of some IDPs is that they are able to
bind different target proteins in different conformations. An
excellent example is the cell-cycle-regulatory protein p21,
which binds in different conformations to different Cdk
(cyclin-dependent kinase)–cyclin complexes [26], and others
are described in [27].
Although some IDPs fold fully when bound to their
partners, others seem to remain partially disordered or ‘fuzzy’
[17]. As discussed in the Introduction, there is a continuum of
behaviour, even when binding to a common target [28]. Fuzzy
binding permits a fascinating range of biological effects,
many of which are as yet undiscovered. Some are discussed
elsewhere in this issue of Biochemical Society Transactions
[29,30]. An interesting further example is the IDP Sic1, which
is phosphorylated at up to six positions, each of which can
bind to the yeast cell cycle protein Cdc4, presenting a highly
dynamic complex [31].
IDPs have almost the maximum possible solvent-accessible
surface area per residue, and are therefore able to make good
use of their surface properties and flexibility to grab on to
their more rigid partners, a property which has been called
a ‘sticky arm’ [32]. The ability to reach out and grab hold
of binding partners has also been aptly described as flycasting [33], although it remains unclear how important such a
mechanism is in speeding up recognition under physiological
conditions.
An important special case is when the intrinsically
disordered partner contains a high proportion of proline
residues. Proline-rich regions are common in proteins, and
impart rigidity, as well as leading to conformations close to a
polyproline II helix, a very extended conformation. Prolinerich regions therefore have the ‘sticky arm’ of an IDP, but
with less entropy to lose on binding, which means that they
can bind more tightly to their target. They are often found
in signalling systems, for example as the target of SH3 (Src
homology 3) and WW domains [34].
Globular proteins are the executives of the cell, getting
on with essential functions such as enzymatic catalysis as
quickly as possible. However, the complex organization of
the eukaryotic cell would grind to a halt without the constant
intervention of the management structure provided by IDPs.
Neither can function without the other. Although genetic
errors in the executive enzymes often lead to fatal metabolic
problems, errors in the management IDPs more often
produce less severe problems, implying that intervening in
IDP function could be a good target for new pharmaceuticals
[27,35]. A better understanding of what IDPs do is the key to
understanding the regulation of eukaryotic cell function, and
thus to designing ways of altering cellular pathways in highly
specific ways without unwanted side effects.
y
Advantages of flexibility
Au
th
or
C
op
948
Distance markers
Given its ubiquity, it seems unlikely that all intrinsically
disordered sequence is involved in binding interactions, and
studies of sequences suggest that much is unlikely to be able to
fold to a stable tertiary structure even after interaction with
a binding partner. A good example of the potential (nonbinding) roles that could be played by disordered sequences
is provided by scaffold proteins. Such proteins bring together
a series of proteins required for a sequential process, such as
a kinase cascade or a signalling pathway. Scaffold proteins
contain extensive amounts of disordered sequence, and it is
a reasonable hypothesis that the length of such sequence acts
mainly as a spacer, to regulate the probability of interaction
of the proteins attached to the scaffold. As discussed above,
there must also be short sequences within these disordered
regions that regulate scaffold function by binding to other
proteins or to folded domains within the same protein.
C The
C 2012 Biochemical Society
Authors Journal compilation References
1 Fukuchi, S., Hosoda, K., Homma, K., Gojobori, T. and Nishikawa, K. (2011)
Binary classification of protein molecules into intrinsically disordered and
ordered segments. BMC Struct. Biol. 11, 29
2 Dunker, A.K., Silman, I., Uversky, V.N. and Sussman, J.L. (2008) Function
and structure of inherently disordered proteins. Curr. Opin. Struct. Biol.
18, 756–764
3 Tompa, P. (2005) The interplay between structure and function in
intrinsically unstructured proteins. FEBS Lett. 579, 3346–3354
4 Flory, P.J. (1988) Statistical Mechanics of Chain Molecules, Hanser
Publishers, New York
5 Gosline, J., Lillie, M., Carrington, E., Guerette, P., Ortlepp, C. and Savage,
K. (2002) Elastic proteins: biological roles and mechanical properties.
Philos. Trans. R. Soc. London Ser. B 357, 121–132
6 Luck, G., Liao, H., Murray, N.J., Grimmer, H.R., Warminski, E.E.,
Williamson, M.P., Lilley, T.H. and Haslam, E. (1994) Polyphenols,
astringency and proline-rich proteins. Phytochemistry 37, 357–371
7 Beck, R., Deek, J. and Safinya, C.R. (2012) Structures and interactions in
‘bottlebrush’ neurofilaments: the role of charged disordered proteins in
forming hydrogel networks. Biochem. Soc. Trans. 40, 1027–1031
8 Ader, C., Frey, S., Maas, W., Schmidt, H.B., Goerlich, D. and Baldus, M.
(2010) Amyloid-like interactions within nucleoporin FG hydrogels. Proc.
Natl. Acad. Sci. U.S.A. 107, 6281–6285
9 Hovmöller, S., Zhou, T. and Ohlson, T. (2002) Conformations of amino
acids in proteins. Acta Crystallogr. Sect. D Biol. Crystallogr. 58, 768–776
10 Bernadó, P., Blanchard, L., Timmins, P., Marion, D., Ruigrok, R.W.H. and
Blackledge, M. (2005) A structural model for unfolded proteins from
residual dipolar couplings and small-angle X-ray scattering. Proc. Natl.
Acad. Sci. U.S.A. 102, 17002–17007
11 Sibille, N. and Bernadó, P. (2012) Structural characterization of
intrinsically disordered proteins by the combined use of NMR and SAXS.
Biochem. Soc. Trans. 40, 955–962
12 Bernadó, P., Bertoncini, C.W., Griesinger, C., Zweckstetter, M. and
Blackledge, M. (2005) Defining long-range order and disorder in native
α-synuclein using residual dipolar couplings. J. Am. Chem. Soc. 127,
17968–17969
13 Zimmerman, S.B. and Minton, A.P. (1993) Macromolecular crowding:
biochemical, biophysical, and physiological consequences. Annu. Rev.
Biophys. Biomol. Struct. 22, 27–65
14 Binolfi, A., Theillet, F.-X. and Selenko, P. (2012) Bacterial in-cell NMR of
human α-synuclein: a disordered monomer by nature? Biochem. Soc.
Trans. 40, 950–954
15 Tompa, P. (2003) Intrinsically unstructured proteins evolve by repeat
expansion. BioEssays 25, 847–855
16 Kuriyan, J. and Eisenberg, D. (2007) The origin of protein interactions and
allostery in colocalization. Nature 450, 983–990
17 Tompa, P. and Fuxreiter, M. (2008) Fuzzy complexes: polymorphism and
structural disorder in protein–protein interactions. Trends Biochem. Sci.
33, 2–8
18 Neduva, V. and Russell, R.B. (2005) Linear motifs: evolutionary
interaction switches. FEBS Lett. 579, 3342–3345
Intrinsically Disordered Proteins
y
29 Kovacs, D. and Tompa, P. (2012) Diverse functional manifestations of
intrinsic disorder in molecular chaperones. Biochem. Soc. Trans. 40,
963–968
30 Hincha, D.K. and Thalhammer, A. (2012) LEA proteins: IDPs with versatile
functions in cellular dehydration tolerance. Biochem. Soc. Trans. 40,
1000–1003
31 Mittag, T., Marsh, J., Grishaev, A., Orlicky, S., Lin, H., Sicheri, F., Tyers, M.
and Forman-Kay, J.D. (2010) Structure/function implications in a
dynamic complex of the intrinsically disordered Sic1 with the Cdc4
subunit of an SCF ubiquitin ligase. Structure 18, 494–506
32 Williamson, M.P. (2011) How Proteins Work, Garland Science, New York
33 Shoemaker, B.A., Portman, J.J. and Wolynes, P.G. (2000) Speeding
molecular recognition by using the folding funnel: the fly-casting
mechanism. Proc. Natl. Acad. Sci. U.S.A. 97, 8868–8873
34 Kay, B.K., Williamson, M.P. and Sudol, P. (2000) The importance of being
proline: the interaction of proline-rich motifs in signaling proteins with
their cognate domains. FASEB J. 14, 231–241
35 Cuchillo, R. and Michel, J. (2012) Mechanisms of small molecule
binding to intrinsically disordered proteins. Biochem. Soc. Trans. 40,
1004–1008
C
op
19 Zhou, H.-X. (2006) Quantitative relation between intermolecular and
intramolecular binding of Pro-rich peptides to SH3 domains. Biophys. J.
91, 3170–3181
20 Buljan, M., Chalancon, G., Eustermann, S., Wagner, G.P., Fuxreiter, M.,
Bateman, A. and Babu, M.M. (2012) Tissue-specific splicing of disordered
segments that embed binding motifs rewires protein interaction
networks. Mol. Cell 46, 871–883
21 Fukuchi, S., Homma, K., Minezaki, Y. and Nishikawa, K. (2006)
Intrinsically disordered loops inserted into the structural domains of
human proteins. J. Mol. Biol. 355, 845–857
22 Motlagh, H.N., Li, J., Thompson, E.B. and Hilser, V.J. (2012) Interplay
between allostery and intrinsic disorder in an ensemble. Biochem. Soc.
Trans. 40, 975–980
23 Motlagh, H.N. and Hilser, V.J. (2012) Agonism/antagonism switching in
allosteric ensembles. Proc. Natl. Acad. Sci. U.S.A. 109, 4134–4139
24 Gontero, B. and Maberly, S.C. (2012) An intrinsically disordered protein,
CP12: jack of all trades and master of the Calvin cycle. Biochem. Soc.
Trans. 40, 995–999
25 Barbar, E. (2012) Native disorder mediates binding of dynein to NudE
and dynactin. Biochem. Soc. Trans. 40, 1009–1013
26 Yoon, M.-K., Mitrea, D.M., Ou, L. and Kriwacki, R.W. (2012) Cell cycle
regulation by the intrinsically disordered proteins p21 and p27. Biochem.
Soc. Trans. 40, 981–988
27 Dyson, H.J. (2011) Expanding the proteome: disordered and alternatively
folded proteins. Q. Rev. Biophys. 44, 467–518
28 Choy, M.S., Page, R. and Peti, W. (2012) Regulation of protein
phosphatase 1 by intrinsically disordered proteins. Biochem. Soc. Trans.
40, 969–974
Au
th
or
Received 19 July 2012
doi:10.1042/BST20120188
C The
C 2012 Biochemical Society
Authors Journal compilation 949