SHOWCASE ON RESEARCH
Combining Theoretical and Experimental
Approaches to Enhance our Understanding
of Biological Macromolecules
David Hoke1 and Itamar Kass1,2*
1Department of Biochemistry and Molecular Biology and 2Victorian Life Sciences Computation
Initiative Life Sciences Computation Centre, Monash University, Clayton, VIC 3800
*Corresponding author: [email protected]
Biomolecules in Motion
The dynamics and flexibility of macromolecules play
an important role in a vast range of biological processes.
Proteins, DNA and lipids are all flexible and often undergo
conformational changes while carrying out their activities
within cells. Indeed, there are many examples in which the
conformational changes involved have been elucidated in
detail, but these have generally come as the result of decades
of experimental research. Modern computational methods
can complement such laboratory-based research, enabling
researchers to generate detailed models of protein dynamics
at an atomic-level resolution that was impossible even ten
years ago.
The oxygen-carrying protein hemoglobin is a case in
point. Hemoglobin has been the subject of intense research
for over 60 years and flexibility is central to its biological
function. Hemoglobin binds oxygen in an allosteric
manner, with the affinity for oxygen regulated by its
alternative taut and relaxed conformations (1). Although
hemoglobin was one of the first proteins for which a 3D
structure was determined by X-ray crystallography (2), it
took an additional eight years to appreciate that hemoglobin
existed in alternate conformational states (3). Furthermore,
deciphering the conformational pathways that hemoglobin
undergoes to transform between taut and relaxed forms is
still an active area of research (4,5). Understanding
how hemoglobin alternates between these two states was
the focus of decades of experimental work and today
hemoglobin is a paradigm of allosteric regulation of
proteins. Even so, intensive computational modelling is
still leading us to an even more sophisticated understanding
of how hemoglobin function is linked to its dynamics.
Use of Theoretical Methods to Study Protein Dynamics
Theoretical models of protein motions require an atomic
level structural model as a starting point. While NMR is a
powerful tool for studying biomolecule dynamics (as has
been reviewed extensively (6)), the majority of atomic-level
structural information on biomolecules corresponds to static
models derived from X-ray diffraction data. However, by
combining crystallographic data with theoretical methods
such as molecular dynamics (MD) and normal mode
analysis (NMA), one can expand our understanding of
protein dynamics. This review will discuss the application
of such theoretical methods and how they can be used in
conjunction with experimental methods to understand
the role dynamics and flexibility play in determining the
function of biological macromolecules.
Finding Dynamics in Structural Data Derived by X-ray
Crystallography
While a crystal structure of a protein may be thought of
as a single static model, it does contain some information
on dynamics. For example, in many protein crystal
structures, residues are missing from the final model. This
stems from limitations in the diffraction data. Diffraction is
dependent upon the atoms in a biomolecule being ordered
in an exquisite 3D array. When this condition is met, it is
possible to get a high-intensity diffraction pattern that is
mathematically translated into the 3D position of each
atom within the molecule. Regions that are flexible or have
static disorder result in poor diffraction and an inability
to translate the data into the 3D model. Many atoms,
however, have intermediate flexibility or disorder, so that
some regions of the model carry more uncertainty than
others. This is measured by the so-called ‘B-factor’ reported
in the PDB structure file. Thus, missing data or the degree
of uncertainty provides clues about local dynamics within
the protein.
While crystal structures can give hints regarding dynamic
processes within a protein, ultimately crystallography is
reliant upon immobilising the protein in a lattice. Even
proteins that seem well ordered under the conditions used
in crystallographic experiments (usually at temperatures
of 100 kelvin) may be highly flexible under physiological
conditions and dynamics may play a major role in their
biological function (7).
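As a rough illustration of how these clues can be read from a coordinate file, the sketch below averages the temperature factor per residue and lists gaps in the residue numbering. It assumes standard fixed-column PDB ATOM records; the function names and the `atom_line` helper that builds the toy input are hypothetical, introduced only for this example. Gaps in numbering can also arise from the numbering scheme itself, so the output is a starting point rather than proof of disorder.

```python
from collections import defaultdict


def atom_line(serial, name, res_name, chain, res_num, b_factor):
    """Hypothetical helper: build a fixed-column PDB ATOM record (toy coordinates)."""
    return "ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        serial, name, res_name, chain, res_num, 0.0, 0.0, 0.0, 1.0, b_factor
    )


def residue_b_factors(pdb_text):
    """Return {(chain, residue_number): mean B-factor} from ATOM records."""
    per_residue = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain = line[21]               # column 22: chain identifier
        res_num = int(line[22:26])     # columns 23-26: residue sequence number
        b_factor = float(line[60:66])  # columns 61-66: temperature factor
        per_residue[(chain, res_num)].append(b_factor)
    return {key: sum(vals) / len(vals) for key, vals in per_residue.items()}


def missing_residues(b_by_residue, chain):
    """List residue numbers absent between the first and last modelled residue."""
    nums = sorted(n for (c, n) in b_by_residue if c == chain)
    present = set(nums)
    return [n for n in range(nums[0], nums[-1] + 1) if n not in present]


# Toy input: residues 12 and 13 are not modelled, residue 14 has a high B-factor.
toy_pdb = "\n".join([
    atom_line(1, "CA", "ALA", "A", 10, 15.0),
    atom_line(2, "N", "GLY", "A", 11, 20.0),
    atom_line(3, "CA", "GLY", "A", 11, 30.0),
    atom_line(4, "CA", "SER", "A", 14, 60.0),
])
b_values = residue_b_factors(toy_pdb)
gaps = missing_residues(b_values, "A")
```

Here `gaps` lists the unmodelled residues and `b_values` can be sorted to rank the modelled residues by apparent mobility.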
Molecular Dynamics (MD)
MD is a classical mechanics method that allows atomic-level insights into the dynamics of molecules. In a typical
MD protocol, one describes the system of study using
semi-empirical sets of rules, known as force fields. Such
force fields use simple chemical and physical concepts
to describe the potential energy of the system in terms of
the Cartesian coordinates of atoms. Propagation in time is
then achieved by iteratively solving Newton’s equations
of motion. Applying concepts from statistical mechanics,
the resulting trajectory can be used to evaluate various
time-dependent structural, dynamic and thermodynamic
properties of the system (reviewed in (8)).
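The iterative solution of Newton's equations can be sketched with a toy system. The example below is a minimal sketch, not a production MD engine: it assumes a single particle in one dimension, a harmonic potential standing in for a full force field, reduced units, and the velocity Verlet integrator (one common choice; the text above does not specify which scheme is used). All function names are illustrative.

```python
def harmonic_force(x, k=1.0, x0=0.0):
    """Force derived from the toy potential U(x) = 0.5 * k * (x - x0)**2."""
    return -k * (x - x0)


def velocity_verlet(x, v, mass, dt, n_steps, force=harmonic_force):
    """Propagate position and velocity in time; return the full trajectory."""
    xs, vs = [x], [v]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        xs.append(x)
        vs.append(v)
    return xs, vs


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# 1000 steps of dt = 0.01 in reduced units, starting at rest from x = 1.
xs, vs = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=1000)
energies = [total_energy(x, v) for x, v in zip(xs, vs)]
energy_spread = max(energies) - min(energies)
```

The trajectory `xs` should follow the analytical cosine closely, and `energy_spread` gives a quick check that the time step is small enough for the total energy to stay nearly constant.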
Normal Mode Analysis (NMA)
NMA is a computational method used for studying
large-amplitude (resonant) molecular motions. NMA of
a molecule results in a set of harmonic oscillations, each
describing an intrinsic motion in which some parts of
the molecule move with the same frequency in a
‘harmonic’ way. NMA may be applied to the study
of large-scale functional motions in macromolecules
that are too large to be simulated effectively using
MD. NMA has been shown to describe biologically
relevant motions in a range of systems (as reviewed
in (9)).
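To make the idea of harmonic modes concrete, the following sketch diagonalises the mass-weighted Hessian of a toy model. It assumes a free one-dimensional chain of beads joined by identical springs, which is far simpler than any protein model; the function names are illustrative and NumPy is assumed to be available.

```python
import numpy as np


def chain_hessian(n_beads, k=1.0):
    """Hessian of U = 0.5 * k * sum((x[i+1] - x[i] - d)**2) for a free 1D chain."""
    h = np.zeros((n_beads, n_beads))
    for i in range(n_beads - 1):
        h[i, i] += k
        h[i + 1, i + 1] += k
        h[i, i + 1] -= k
        h[i + 1, i] -= k
    return h


def normal_modes(hessian, masses):
    """Return angular frequencies and mode vectors (in mass-weighted coordinates)."""
    inv_sqrt_m = 1.0 / np.sqrt(np.asarray(masses, dtype=float))
    mass_weighted = hessian * np.outer(inv_sqrt_m, inv_sqrt_m)
    eigvals, eigvecs = np.linalg.eigh(mass_weighted)  # eigenvalues in ascending order
    freqs = np.sqrt(np.clip(eigvals, 0.0, None))      # clip tiny negative round-off
    return freqs, eigvecs


n = 6
freqs, modes = normal_modes(chain_hessian(n), np.ones(n))
```

The first mode has zero frequency and corresponds to rigid translation of the whole chain; the next few low-frequency modes are the collective, large-amplitude motions that NMA is typically used to examine.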
Linking Theoretical Models of Dynamics to
Experimental Methods and Vice Versa
The importance of computational methods in structural biology lies in the fact that they can direct experiments and allow the interpretation of experimental data at atomic-level resolution. Therefore, this review will focus on the utility of studies that employ a combination of theoretical and experimental methods.
At the most basic level, MD can be used to study flexibility implied in crystallographic data and the effect of mutations on protein dynamics and stability. As discussed previously, many crystal structures have weak and missing electron density in mobile regions. Using MD, the flexibility can be modeled, thus extending our understanding of protein structure beyond that available from the crystallographic data. In addition, MD can be used to study the effect mutations have on protein dynamics and stability. This can be used to determine the effect of disease-causing mutations as well as to direct the rational design of point mutations to study the dynamic mechanism of protein function. Lastly, the effect of specific mutations on protein dynamics can be estimated, allowing protein engineering for the purpose of stabilising proteins to make them more amenable to crystallisation.
MD can be used to predict areas of flexibility and NMA can be used to predict new conformations not available to X-ray crystallography and NMR. The predictions can then be validated by performing additional experiments. Theoretical models of large-scale conformational change are a powerful tool when applied to small angle X-ray scattering (SAXS) and NMR data. Since these two methods yield atomic distance restraints, they are perfect datasets to be analysed by atomic resolution computational models.
Lastly, MD/NMA can be used in the analysis of data from experiments that infer motion but cannot alone give atomic-level detail. Experimental methods could include limited proteolysis and hydrogen/deuterium (H/D) exchange to validate the stability of secondary structural elements, solvent accessibility and molecular flexibility. Other experimental techniques could include Förster resonance energy transfer (FRET) for validating regions that undergo large conformational changes. Lastly, data from classic biophysical techniques such as tryptophan fluorescence and quenching can be interpreted by predicting changes in surface exposure which can then be used to predict new areas for the introduction of tryptophan point mutants.
Fig. 1. A workflow schematic for using theoretical modelling in combination with experiment. The GAD65 structure was solved by X-ray crystallography with two loop regions that failed to yield electron density. Missing residues were modeled into the structure and the dynamics modeled by MD. Based on the dynamics and NMA, models were developed for GAD65 flexibility and conformational change. These models were tested and validated using experimental techniques such as SAXS and limited proteolysis (cut sites shown in red in the third panel from the top). This experimental data was incorporated back into the dynamics models and further refined. Finally, this allowed the development of a hypothesis for the basis of GAD65 autoantigenicity in type 1 diabetes. Adapted from ref. 17.
[Figure labels: Structural data; Missing residues; MD model; Increasing dynamics; Catalytic loop; 180° H13-loop-H14; H1-loop-H2; Experiments; holoGAD65; apoGAD65; Glu; GABA; PMP + SSA; PLP; Ab; Biological Model]
In summary, MD allows the researcher to combine chemical and structural data to further develop models of biomolecular activity and generate a detailed theory for further testing.
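One common way to place simulated flexibility on the same scale as crystallographic data is to convert per-atom root-mean-square fluctuations (RMSF) from a trajectory into equivalent isotropic B-factors using B = (8π²/3)·RMSF². The sketch below illustrates that conversion under stated assumptions: the trajectory is a plain list of frames that have already been superimposed, coordinates are in ångströms, and the function names are hypothetical rather than taken from any MD package.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF; trajectory is a list of frames, each a list of (x, y, z)."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for a in range(n_atoms):
        # Mean position of atom a over all frames
        mean = [sum(frame[a][d] for frame in trajectory) / n_frames for d in range(3)]
        # Mean squared deviation from that mean position
        msd = sum(
            sum((frame[a][d] - mean[d]) ** 2 for d in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


def rmsf_to_b_factor(rmsf_value):
    """Isotropic B-factor (in squared ångströms) equivalent to a given RMSF."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_value ** 2


# Toy trajectory: atom 0 is rigid, atom 1 swings between x = -1 and x = +1.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
fluctuations = rmsf(toy_trajectory)
b_equivalent = [rmsf_to_b_factor(value) for value in fluctuations]
```

Comparing `b_equivalent` with the experimental B-factors residue by residue gives a simple check of whether the simulation reproduces the pattern of mobility implied by the crystal structure.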
Vol 45 No 2 August 2014
AUSTRALIAN BIOCHEMIST
The Dynamical Effect of a Point Mutation on the
Polymerisation Rate of Alpha-1 Antitrypsin (α1-AT)
The human serine protease inhibitor (serpin) α-1
antitrypsin (α1-AT) protects tissues from the proteases of
inflammatory cells, especially elastase (10). Several
disease-causing mutations in α1-AT have been identified, the
most common being the Z-mutation (E342K), which results
in an increased propensity of α1-AT to polymerise in the
ER of hepatocytes, leading to a lack of secretion into the
circulation (11). The mutation, located at the top of the A
β-sheet, increases the aggregation propensity of folding
intermediates, ultimately resulting in misfolding and
aggregation (11). Despite the vast range of structural and
biochemical data related to α1-AT activity, the molecular
nature of this process has remained elusive. In order to
provide insight into the conformational properties of the
Z variant, a comparative MD investigation of the native
states of wildtype and Z α1-AT was carried out (12). The
results reveal a striking contrast between their structures
and dynamics. This is characterised by greater flexibility of
the breach region in Z, which in turn leads to the opening
of strands 3 and 5 at the top of the A β-sheet. Moreover,
the theoretical study suggests that electrostatic and H-bond
interactions play a vital role in the stability and activity of
α1-AT.
Based on those findings, and in order to further study
the effect of mutations at position 342, E342Q and E342R
mutants were expressed and their structure and activity
were studied. Fluorescence and polymerisation data for
E342Q and E342R mutants clearly show that they behave in
a similar fashion to the Z variant.
Taken together, this demonstrates that the MD simulations
successfully identified regions of the Z-variant that
have greater flexibility and may contribute to its greater
propensity to misfold, aggregate and thus cause disease.
Experimental studies designed on the basis of this finding imply
that polymerisation can be the result of mutations at residue
342 that either stabilise an open form of the top of the A
β-sheet (E342K) or increase the local flexibility in this region
(E342Q and E342R). These findings show that local changes
in dynamics can affect protein folding and function and
suggest the broader applicability of theoretical methods in
the study of disease-causing mutations.
The Dynamical Interplay between Activity and
Autoinactivation that Leads to Autoimmunity
Glutamic-acid decarboxylase (GAD) is a pyridoxal 5’-phosphate (PLP) dependent enzyme that catalyses
the synthesis of γ-aminobutyric acid (GABA), the chief
inhibitory neurotransmitter in mammals, from glutamic
acid (13). Distinct from most other biosynthetic processes,
this reaction is catalysed by two GAD isoforms, GAD67
and GAD65, named according to their respective molecular
weights (14). While cytosolic GAD67 is more saturated
with the co-factor PLP and is constitutively active,
producing basal levels of GABA, the membrane-associated
GAD65 exists mainly without PLP, as an autoinactivated
apo-enzyme (15). ApoGAD65 can be reactivated by PLP
to form holoGAD65, when additional GABA is required,
for example in response to stress. GAD65, but not GAD67,
is a major autoantigen and autoantibodies to GAD65 are
detected at high frequency in patients with type 1 diabetes
and other autoimmune disorders (16). The series of events
responsible for catalytic variance and initiation of these
autoimmune responses are unknown, despite the fact that
the structures of both enzymes have been characterised at
atomic-level resolution by X-ray crystallography.
Since these enzymes share high sequence and structural
homology, it was hypothesised that the differential
autoantigenicity and propensity to lose PLP are due to
profoundly different dynamics. By applying MD analysis
to GAD65/67 crystal structures and cross-referencing these
models with areas of implied flexibility from the crystal
structures, it was possible to determine a plausible set of
motions for areas with high B-factors or missing electron
density (Fig. 1) (17). The modeled motions were quite
different between the 65/67 isoforms, with 65 showing
greater degrees of motion. Furthermore, using NMA,
models of large-scale movements that mimicked
large-scale conformational changes in a related enzyme
were developed. Once again, NMA suggested that
the intrinsic motions of the two isoforms were different,
with GAD67 being more constrained than the GAD65
isoform. These models were then tested using SAXS, which
did show altered values that were dependent upon PLP
status and gave atomic-level constraints to the motions that
were examined in the context of our molecular models (Fig.
1). Lastly, limited proteolysis indicated greater flexibility
in GAD65 that was dependent upon PLP status and the
proteolysed sites were modeled onto the structure (Fig.
1). Since the proteolysed sites corresponded to areas of
flexibility that were proposed by MD and NMA, we were
satisfied our computational models reflected the actual
motions seen in GAD65/67.
Since these structural changes altered the interaction
of GAD65 with antibodies, we are closer to determining
the structural basis of the differential autoantigenicity
for GAD65 and GAD67. These findings thus provide
insights into how structural flexibility governs protein
immunogenicity in autoimmune diabetes, and have
implications for therapeutic antibody and vaccine design
and improved diagnostics. Lastly, these studies show how
a combined experimental and theoretical approach can be
used to elucidate biologically relevant molecular motions.
Summary
Just as an experiment without theory does not advance
human knowledge, theory without experiment does no
better. The modern structural biologist has a vast array
of theoretical methods to complement experimental
techniques. By combining the two, we are able to develop
the most mature, sophisticated models of biomolecular
structure and function to date. This combined approach will
allow researchers to unravel some of the most challenging
problems in health. Topics such as how mutations cause
disease, why proteins aggregate in amyloid diseases and
why some proteins escape immune tolerance are within the
reach of the combined application of computational and
experimental biology. The future is now!
References can be found on page 15
In Silico Enzyme Modelling
enzymologists are common, and are proving to be
extremely beneficial in addressing mechanistic puzzles
that cannot otherwise be conclusively resolved. With the
development of computational methods and the ever-increasing availability of computing resources, the field of
computational enzymology will continue to flourish and
play an increasingly important role in various aspects of
biochemistry and medicinal chemistry.
References from page 18
1. Lukin, J.A., and Ho, C. (2004) Chem. Rev. 104, 1219-1230
2. Perutz, M.F., Rossmann, M.G., Cullis, A.F., Muirhead,
H., Will, G., and North, A.C. (1960) Nature 185, 416-422
3. Perutz, M.F., Muirhead, H., Cox, J.M., and Goaman, L.C.
(1968) Nature 219, 131-139
4. Shibayama, N., Sugiyama, K., Tame, J.R., and Park, S.Y.
(2014) J. Am. Chem. Soc. 136, 5097-5105
5. Liu, J., Dong, Y., Zheng, J., He, Y., and Sheng, Q. (2013)
Anal. Sci. 29, 1075-1081
6. Kay, L.E. (2005) J. Magn. Reson. 173, 193-207
7. Tilton, R.F. Jr, Dewan, J.C., and Petsko, G.A. (1992)
Biochemistry 31, 2469-2481
8. van Gunsteren, W.F., Bakowies, D., Baron, R.,
Chandrasekhar, I., Christen, M., Daura, X., Gee, P.,
Geerke, D.P., Glättli, A., Hünenberger, P.H., Kastenholz,
M.A., Oostenbrink, C., Schenk, M., Trzesniak, D., van
der Vegt, N.F., and Yu, H.B. (2006) Angew. Chem. Int. Ed.
Engl. 45, 4064-4092
9. Ma, J. (2005) Structure 13, 373-380
10. Crystal, R.G. (1989) Trends Genet. 5, 411-417
11. Lomas, D.A., Evans, D.L., Finch, J.T., and Carrell, R.W.
(1992) Nature 357, 605-607
12. Kass, I., Knaupp, A.S., Bottomley, S.P., and Buckle, A.M.
(2012) Biophys. J. 102, 2856-2865
13. Bu, D.F., Erlander, M.G., Hitz, B.C., Tillakaratne, N.J.,
Kaufman, D.L., Wagner-McPherson, C.B., Evans, G.A.,
and Tobin, A.J. (1992) Proc. Natl. Acad. Sci. USA 89, 2115-2119
14. Erlander, M.G., Tillakaratne, N.J., Feldblum, S., Patel, N.,
and Tobin, A.J. (1991) Neuron 7, 91-100
15. Martin, D.L., and Rimvall, K. (1993) J. Neurochem. 60,
395-407
16. Baekkeskov, S., Aanstoot, H.J., Christgau, S., Reetz, A.,
Solimena, M., Cascalho, M., Folli, F., Richter-Olesen, H.,
and De Camilli, P. (1990) Nature 347, 151-156
17. Kass, I., Hoke, D.E., Costa, M.G., et al. (2014) Proc. Natl.
Acad. Sci. USA 111, E2524-E2529