Coarse-grained models of dynamic allostery in proteins

Rhoda Joy Hawkins

Submitted in accordance with the requirements for the degree of Doctor of Philosophy

The University of Leeds
School of Physics and Astronomy
November 2005

The candidate confirms that the work submitted is her own and that appropriate credit has been given where reference has been made to the work of others. This copy has been supplied on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement.

Abstract

Allosteric proteins are involved in many signalling processes in molecular biology. Increasingly, the functional importance of dynamics as well as the static structure of biological molecules is being recognised. The aim of this thesis is to investigate dynamic mechanisms of ‘allostery’ in proteins, using physical coarse-grained models, treated by the methods of statistical mechanics. We explore how changes in protein flexibility on ligand binding are used to transmit information to a distant site across an allosteric protein. We analytically calculate changes in the intramolecular vibrational free energy on ligand binding of coarse-grained models that capture the low frequency global modes of vibration. By parameterisation from more detailed simulations and experimental data, we use these models to make quantitative predictions in the specific test case of the lac repressor. We find that realistic, experimentally comparable values for the allosteric free energy arise from changes in the fluctuation spectrum of the proteins, without the need for large conformational changes in the mean static structure. Consideration of the case in which a set of fast localised modes is coupled to slow global modes leads to compensating dynamic entropic and enthalpic contributions to the allosteric free energy. This allows the application of our methods to the case of the met repressor.
Allostery in alpha-helical coiled-coils is also examined. Coiled-coils are found in proteins such as the molecular motor dynein and bacterial chemotaxis receptors. We treat the specific application to dynein in detail.

Acknowledgements

Firstly, I thank my supervisor Tom McLeish for all his support and guidance. Tom is the most encouraging and inspirational supervisor anyone could wish for. I thank all my colleagues in the Astbury Centre for Structural Molecular Biology in Leeds, particularly Peter Stockley for many enlightening discussions about the lac and met repressors; Steve Homans and Simon Phillips for advice on NMR and crystallography; and Stan Burgess and Peter Knight for fruitful discussions on dynein. Thanks to Tannie Liverpool, from the department of Applied Mathematics in Leeds, for advice on mathematical modelling of coiled-coils.

Thanks to numerous people I have met and enjoyed talking with at conferences. In particular, I learnt much about science and research by engaging with many people during my stay at the Isaac Newton Institute for Mathematical Sciences in Cambridge for the “Statistical Mechanics of Molecular and Cellular Biological Systems” programme. I wish to especially thank David Dryden, Robijn Bruinsma, Fred Mackintosh and Wilson Poon for stimulating conversations. I am grateful to Dennis Bray and all his group in the department of Anatomy in Cambridge, for introducing me to bacterial chemotaxis receptors.

Thank you to all my friends and colleagues in the polymer IRC in Leeds School of Physics and Astronomy. Thanks to Peter Olmsted for advice throughout my PhD and for agreeing to be my internal examiner. I thank my office mates Richard Graham, Bhavin Khatri, Jayne Wallace and Dan West for putting up with me, supporting me and answering my many questions. Thank you to Maureen Thompson for vital organisational and communication support. I am grateful to Richard England for proof-reading my thesis.
Thanks to my non-physicist friends and my family for coping with me and keeping me sane. Finally, I also acknowledge financial support from the EPSRC.

This year the UK is celebrating ‘Einstein Year’ as part of the World Year of Physics, coinciding with the 100th anniversary of Einstein’s three famous papers. In acknowledgement of this the reader will find a quote from Einstein in each chapter.

Contents

Abstract
Acknowledgements
List of figures
List of tables
Abbreviations
1 Introduction
  1.1 Overview
  1.2 Physics at the life science interface
  1.3 Proteins
  1.4 Allostery in proteins
    1.4.1 The Monod-Wyman-Changeux model
    1.4.2 The Koshland-Nemethy-Filmer model
  1.5 Protein dynamics
  1.6 Thermodynamics of proteins
    1.6.1 Introduction to thermodynamics and kinetics
    1.6.2 Isothermal Titration Calorimetry
  1.7 Experimental techniques probing protein dynamics
    1.7.1 X-ray crystal diffraction
    1.7.2 Nuclear Magnetic Resonance
    1.7.3 Neutron scattering
    1.7.4 Fluorescence Resonance Energy Transfer
    1.7.5 Other emerging experimental techniques
  1.8 Dynamic allostery
  1.9 Biological systems studied in this thesis
    1.9.1 Repressor proteins
    1.9.2 Proteins with coiled-coils
  1.10 Open research questions
  1.11 Thesis outline
2 Levels of coarse-graining in protein modelling
  2.1 Overview
  2.2 Computational techniques
    2.2.1 Quantum level
    2.2.2 Molecular Dynamics
    2.2.3 Principal component analysis
    2.2.4 Calculating entropies
    2.2.5 Normal mode analysis
    2.2.6 Elastic Network Model
    2.2.7 Graph theoretic methods
  2.3 Our methodology
  2.4 Localisation
  2.5 Motivation for our level of coarse-graining
  2.6 Timescales
3 Dimers of rigid monomers
  3.1 Overview
  3.2 Model
    3.2.1 Rigid rods model
    3.2.2 Rigid plates model
    3.2.3 Example theoretical inter-monomer potential
  3.3 Parameterisation
    3.3.1 Experimental B-factors
    3.3.2 Atomistic simulation
  3.4 Results and comparison to experiments
    3.4.1 Comparison to biochemical data
    3.4.2 Significance of the vibrational component
    3.4.3 Comparison to elastic network model simulations
  3.5 Lac repressor mutations
  3.6 Extensions to the basic model
    3.6.1 Sequential binding
    3.6.2 Bending modes
  3.7 Future applications to other repressor proteins
  3.8 Summary
4 Coiled-coils
  4.1 Overview
  4.2 Method
    4.2.1 A coarse-grained elastic model of coiled-coils
    4.2.2 Parameterisation
  4.3 Parallel rigid rods: slide only
    4.3.1 Model
    4.3.2 Results
  4.4 Parallel rods: slide and bend
    4.4.1 Model
    4.4.2 Results
  4.5 Coiled geometry: slide and bend
    4.5.1 Model
    4.5.2 Results
  4.6 Coiled geometry: slide, bend and twist
    4.6.1 Model
    4.6.2 Results
  4.7 Parameterisation
  4.8 Discussion
5 Coupling of global and local modes
  5.1 Overview
  5.2 Allostery amplified by enslaved fast modes
  5.3 Dynamic enthalpic allostery
  5.4 Application to the met repressor
    5.4.1 Comparing theory with calorimetry experiments
    5.4.2 Coarse-grained model of the met repressor
  5.5 Anharmonicity
  5.6 Discussion
6 Conclusions
  6.1 Future Work
    6.1.1 Improvements and extensions to the repressor proteins work
    6.1.2 Chemotaxis receptor clustering
    6.1.3 G-protein-coupled receptor transmembrane proteins
    6.1.4 Allostery around a ring
  6.2 Summary

List of Figures

1.1 Cartoon to describe allostery
1.2 The MWC model
1.3 The KNF model
1.4 Protein dynamics at different frequencies
1.5 Michaelis-Menten and cooperative kinetics
1.6 A typical ITC experiment
1.7 Example of a FRET experiment
1.8 Transcription
1.9 Lac repressor binding DNA only in the absence of inducer
1.10 X-ray crystal structure of the lac repressor tetramer
1.11 DNA binding motifs
1.12 Example coiled-coil
1.13 Structure and power stroke of dynein
2.1 Quantum and classical simple harmonic oscillator entropy
2.2 Elastic network model
2.3 A typical elNémo output
2.4 FIRST ‘pebble game’
2.5 Levels of coarse-graining
2.6 Timescales of protein dynamics
3.1 Lac apo and holorepressor tetramers
3.2 Lac repressor DNA looping
3.3 FIRST rigid cluster decomposition of the lac repressor
3.4 Rods and springs model
3.5 Allosteric free energy against spring constants of two rods model
3.6 Plates and springs model
3.7 Relative displacement parameters in the plates model
3.8 Effective springs in the plates model
3.9 Lennard-Jones potential
3.10 Potential wells with and without inducer
3.11 Fraction of DNA sites bound by repressors against inducer concentration
3.12 Comparison of GNM, elNémo and our model
3.13 Lac mutants
3.14 Energy level diagram of sequential inducer binding
3.15 Bending angle of a rod
3.16 Allosteric free energy against bending modulus
3.17 Vibrational free energy change on ligand binding against ligand bound bending modulus
4.1 Mean and distribution of dynein stalk positions
4.2 Model of coiled-coil as two classical flexible rods
4.3 Allosteric free energy against clamping for the model of rigid parallel rods
4.4 Allosteric free energy against Young’s modulus for the model of parallel flexible rods
4.5 Allosteric free energy against clamping for the model of flexible parallel rods
4.6 Geometry of two coiled rods
4.7 Geometry of two coiled rods unrolled and end on
4.8 Writhe in a bar
4.9 Allosteric free energy against shear modulus for coiled geometry with slide and bend fluctuations
4.10 Allosteric free energy against Young’s modulus for coiled geometry with slide and bend fluctuations
4.11 Allosteric free energy against clamping for coiled geometry with slide and bend fluctuations
4.12 Allosteric free energy against shear modulus for coiled geometry with slide, bend and twist fluctuations for different clamping
4.13 Allosteric free energy against Young’s modulus for coiled geometry with slide, bend and twist fluctuations
4.14 Allosteric free energy against clamping for coiled geometry with slide, bend and twist fluctuations
4.15 Allosteric free energy against shear modulus for coiled geometry with slide, bend and twist fluctuations for different Young’s moduli
4.16 Geometry of the lowest normal mode (bend)
4.17 Geometry of the lowest twist normal mode
4.18 Allosteric free energy against number of turns
5.1 Scissor model
5.2 Crystal structure and coarse-grained model of met repressor
5.3 Met aporepressor modes calculated by elNémo
5.4 Amputated Feynman diagrams
6.1 Chemotaxis pathway in a bacterium
6.2 Receptor clusters
6.3 E-coli Tsr dimer
6.4 Crystal structure of aspartate receptor periplasmic domain with and without bound aspartate
6.5 Verotoxin-1 B subunit pentamer

List of Tables

3.1 Fitting parameters for lac repressor plates model
3.2 Spring constants for lac repressor plates model
3.3 Predicted allosteric free energy for various lac repressor mutants
5.1 Eigenvalues of the lowest six modes of the met repressor calculated by elNémo

Abbreviations

AAA      ATPases Associated with various cellular Activities
ADP      adenosine diphosphate
amu      atomic mass units
ANM      Anisotropic Network Model
Asp      aspartate (also known as aspartic acid)
ATCase   aspartate transcarbamoylase
ATP      adenosine triphosphate
ATPase   adenosine triphosphatase
bp       base pair
cAMP     cyclic adenosine monophosphate
CAP      Catabolite Activator Protein
CBF      Core Binding Factor
C        cysteine
CFP      Cyan Fluorescent Protein
CNS      Crystallography and NMR System
CPU      central processing unit
CRP      cyclic adenosine monophosphate receptor protein
DCM      Distance Constraint Model
DFT      Density Functional Theory
E        glutamic acid
E-coli   Escherichia coli
elNémo   Elastic Network Model web server
ENM      Elastic Network Model
FID      Free Induction Decay
FIRST    Floppy Inclusion and Rigid Substructure Topography
FliM     flagellar motor switch protein
F        phenylalanine
FRET     Fluorescence Resonance Energy Transfer
FRODA    Framework Rigidity Optimised Dynamic Algorithm
GFP      Green Fluorescent Protein
G        glycine
GNM      Gaussian Network Model
GPCR     G-protein-coupled receptor
GTP      guanosine triphosphate
ID       identification
IPTG     isopropyl-beta-D-thiogalactopyranoside
ITC      Isothermal Titration Calorimetry
KNF      Koshland-Nemethy-Filmer
lac      lactose
L        leucine
MC       Monte Carlo
MD       Molecular Dynamics
met      methionine
MWC      Monod-Wyman-Changeux
NMA      Normal Mode Analysis
NMR      Nuclear Magnetic Resonance
NOE      Nuclear Overhauser Effect
NSE      Neutron Spin Echo
PCA      Principal Component Analysis
PDB      Protein Data Bank
Pi       phosphoric acid (H3PO4)
Q        glutamine
R        relaxed
rmsd     root mean square deviations
RNA      ribonucleic acid
RTB      Rotations-Translations of Blocks
SAM      S-adenosyl methionine
S        serine
T        tense
Tar      aspartate receptor
tet      tetracycline
trFRET   time-resolved Fluorescence Resonance Energy Transfer
trp      tryptophan
Tsr      serine receptor
TTDS     Terahertz Time Domain Spectroscopy
tyr      tyrosine
VER      Vibrational Energy Relaxation
Vi       inorganic vanadate
YFP      Yellow Fluorescent Protein

Chapter 1 Introduction

1.1 Overview

The story of this thesis is set in part of our world where nanometres are large and nanoseconds are slow. In biological systems complex networks of information transfer rely on amazing protein macromolecules. Communication, known as ‘allostery’, through protein macromolecules allows them to act as biochemical logic gates. We suggest a new mechanism by which such communication works, by appealing to Brownian motion. Biology utilises thermal motion, to which its molecules are continuously subjected, for its functional benefit. We describe how a protein could harness these inherent dynamics to communicate information, using various physical mechanisms to couple small localised changes in its structure to delocalised global modes of vibration.

1.2 Physics at the life science interface

Einstein said “the eternal mystery of the world is its comprehensibility”. This is particularly impressive when applied to life sciences. Can we actually comprehend some of the mysteries of life? The challenges of understanding the inner workings of biological organisms are huge, due to their great complexity. Do physicists have anything to offer in such a quest? Schrödinger [1944] wrote a short book entitled What is Life?, which was very influential in inspiring physicists and biologists to address the physical basis of life. Since then, the new science of molecular biology has made revolutionary progress (such as the discovery of the DNA double helix [Watson and Crick, 1953]). It appears that when scientists from different disciplines work together, we can achieve greater understanding of the world around us [Rhoten and Parker, 2004].
Many biological approaches aim to observe and qualitatively describe the system being studied in great detail. Physical approaches, however, often ignore the details to simplify the system and aim to quantitatively predict and understand the simplified model system. By combining such different approaches, can physics help us to understand living material? Can we discover general physical principles in the midst of such diversity and specificity? The challenge for the physicist is to devise the simplest model that still captures the fundamental physics underlying the workings of the biological system in question. Such models aim to explain and predict data from the vast wealth of information obtained by biologists and biochemists. In this thesis we take up this challenge for a particular function of proteins.

1.3 Proteins

Proteins are the macromolecules produced by cells from the genetic information stored on DNA. They are heteropolymers made up of combinations of 20 different amino acids (also called residues). Combinations of three DNA base pairs on a gene code for one amino acid. Each of the thousands of types of protein is highly specific, being composed of its own sequence of amino acids and folding into its unique ‘native’ structure. Proteins carry out the main workings of the cell as well as providing important elements of the cell structure. Understanding protein function is of great biological interest and is important for potential future advances in medicine as well as involving fascinating physics. In this thesis we are interested in understanding an aspect of protein function. More specifically, we are concerned with the crucial signalling role of many proteins such as those involved in delicately balanced regulatory systems responding to environmental stimuli.

1.4 Allostery in proteins

An allosteric protein is one in which events at one site affect the activity at a distant site in the protein.
Figure 1.1 shows a simple cartoon example of this, in which a small (red) molecule binds at the top of the large (blue) protein, affecting whether the protein binds the other (green) molecule at the bottom of the protein. This ‘action at a distance’ implies there is some mechanism by which the protein transmits information through its structure.

Figure 1.1: Cartoon to describe allostery. The red ball at the top represents a small molecule whose binding to the blue protein affects whether the protein binds the green molecule at a distant site.

A common type of allosteric interaction is allosteric cooperativity. Cooperativity is the effect whereby the binding of one ligand changes the affinity for binding another molecule. If the second ligand binds more easily due to the presence of the first, the cooperativity is positive and conversely, if the presence of the first ligand inhibits the binding of the second, the cooperativity is negative. For example, competitive binding of two different ligands at the same site produces negative cooperativity. Competitive binding to the same site is an example of cooperativity that is not allosteric. However, many systems in molecular biology are both cooperative and allosteric. Binding of a ligand to one part of a protein can increase or decrease the affinity for ligand binding to another site. Such cooperativity is allosteric in that information is transmitted between sites on the protein. The terms ‘allostery’ and ‘cooperativity’ are used somewhat interchangeably in the literature but strictly they are distinct, overlapping concepts.

Allosteric proteins can act as many different functional devices in a cell.
For example, bacterial chemotaxis receptors are allosteric sensors [Kim et al., 1999]; gene regulatory proteins such as the lac repressor are allosteric switches [Matthews, 1996]; some allosteric proteins integrate multiple ligand binding inputs, acting as nanochips, such as the Wiskott-Aldrich syndrome protein (WASP) involved in control of the actin cytoskeleton [Buck et al., 2004]; allosteric molecular motors such as dynein generate force [Burgess et al., 2003]; and ion channels act as allosteric pumps transporting ions through the membrane [Hogg et al., 2005].

Traditionally it was thought that only multi-domain proteins could be allosteric. However, many monomeric proteins are now known to be allosteric. For example, the classic protein myoglobin was previously thought to be non-allosteric but recent work by Frauenfelder et al. [2001] implies it is allosteric. Such observations have led to the suggestion that allostery is an intrinsic property of all proteins [Gunasekaran et al., 2004]. It has also been suggested that some proteins contain previously unknown allosteric sites that could be enhanced by mutations, leading to a wealth of allosteric proteins for drug design [Gunasekaran et al., 2004]. Evolution may have optimised proteins to work allosterically.

Classically allostery has been explained by static conformational change. Ligand binding causes the protein to change conformation (shape). This changes its activity (usually affinity for another ligand) in one state compared to the other state. It will only ‘fit’ into the binding site when it is in one of the states. The classic theories for allosteric proteins by Monod et al. [1963, 1965] (the MWC model) and Koshland et al. [1966] (the KNF model) set out the idea that allosteric transitions involve changes in quaternary structure. The quaternary structure of a protein is the arrangement of polypeptide chains (subunits) within a multi-domain protein.
Since these theories were suggested, numerous x-ray crystal and NMR (nuclear magnetic resonance) structures have shown such conformational changes. In sections 1.7.1 and 1.7.2 we introduce x-ray crystallography and NMR. In sections 1.4.1 and 1.4.2 we review these early theories of allostery and discuss their usefulness and limitations.

1.4.1 The Monod-Wyman-Changeux model

Monod et al. [1965] developed the first model of allostery (the MWC model). They wished to address how ligands could regulate an enzyme by binding to a different site from the active site. Previous explanations of enzyme regulation assumed ligands cause steric hindrance by binding to the enzyme active site. Since Monod et al. [1965] proposed their model, the concepts they introduced (of allostery by switching between discrete conformational states) have been used to explain allostery found in many protein systems. Here we introduce their original theory. In section 3.2 we relate our work to this early model.

Figure 1.2: Cartoon of the MWC model. All the subunits are changed from the (pink) T state to the (green) R state when one (blue) ligand binds to one of the subunits. Figure taken from Gunasekaran et al. [2004].

Monod et al. [1965] describe the general properties of allosteric systems as oligomers (made up of more than one subunit) that change their quaternary structure. They consider only subunits that are identical and seek to explain cooperativity on symmetry considerations. In their model, allosteric proteins are oligomers with equivalent subunits giving at least one axis of symmetry (see figure 1.2). Each subunit has one ligand binding site. The conformation of each subunit is constrained by the other subunits. Two (or more) conformationally different states exist, which can be reversibly accessed, each with different ligand affinity. The two states, called relaxed (R) and tense (T), are in equilibrium. The ligand will bind one state with higher affinity. Monod et al.
[1965] assume the symmetry of the whole protein is conserved, meaning that all the subunits are in the same state. This is depicted in figure 1.2 where the subunits are all pink hexagons or all green squares. When one subunit switches state (by ligand binding) the other subunits simultaneously switch state too. Ligand binding to one subunit therefore causes all subunits to switch to their bound conformation even if they do not have a ligand bound. Such ligand-free subunits in the bound conformation will have high affinity for ligand. This produces positive cooperativity since subsequent ligands bind more easily to subunits already in the bound conformation.

From their model Monod et al. [1965] calculate theoretical binding curves. The cooperativity depends on two equilibrium constants of the model: the equilibrium constant between the R and T states in the absence of ligand, and the ratio of the ligand binding constants of the T and R states. Complex cooperative kinetics can arise from the displacements in the R ⇌ T equilibrium. The assumption of conserved symmetry means a transition from R to T in one subunit forces the same transition in the other subunits. This accounts for positive cooperativity (the second ligand binds more easily because the first ligand is bound) but fails to address negative cooperativity (the second ligand does not bind if the first ligand is bound).

The idea of allostery as two state transitions with conformational change has been very influential. Changeux and Edelstein [2005] give a recent review of evidence supporting the MWC model. Allostery has been found in many proteins, not just the enzymes that the MWC model originally considered. X-ray crystal and NMR structures have shown conformational changes for many allosteric proteins (e.g. the lac repressor [Lewis et al., 1996]).
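The MWC binding curves described above can be illustrated with a minimal sketch. It assumes the standard textbook form of the MWC saturation function; the symbols alpha, L, c and n and the function name mwc_saturation are notation chosen here for illustration and are not taken from this thesis. alpha is the ligand concentration in units of the R state dissociation constant, L is the T to R population ratio in the absence of ligand, c is the ratio of the R and T state dissociation constants, and n is the number of identical subunits.

```python
def mwc_saturation(alpha, L, c, n):
    """Fraction of binding sites occupied in the concerted (MWC) model.

    alpha: ligand concentration divided by the R-state dissociation constant
    L:     [T]/[R] equilibrium constant in the absence of ligand (assumed symbol)
    c:     K_R / K_T, ratio of R- and T-state dissociation constants (assumed symbol)
    n:     number of identical subunits, one binding site each
    """
    # Statistical weights of the all-R and all-T proteins summed over occupancies
    r_weight = (1 + alpha) ** n
    t_weight = L * (1 + c * alpha) ** n
    # Mean number of bound ligands per site, weighted over both states
    bound = (alpha * (1 + alpha) ** (n - 1)
             + L * c * alpha * (1 + c * alpha) ** (n - 1))
    return bound / (r_weight + t_weight)


if __name__ == "__main__":
    # Illustrative parameter values only, not fitted to any protein
    for alpha in (0.1, 1.0, 10.0, 100.0):
        coop = mwc_saturation(alpha, L=1000.0, c=0.01, n=4)
        indep = mwc_saturation(alpha, L=0.0, c=1.0, n=4)
        print(f"alpha={alpha:6.1f}  MWC={coop:.3f}  non-cooperative={indep:.3f}")
```

With L = 0, or with c = 1, the expression reduces to the simple hyperbolic curve alpha/(1 + alpha), so any sigmoidal shape comes entirely from the shift of the R and T populations on binding. In this form the model yields only positive (or zero) cooperativity, in line with the limitation noted in the text.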
Conformational change has become a textbook explanation of allostery [Alberts, 2002], leading structural molecular biologists to expect and look for conformational changes. However, there are allosteric proteins that show no conformational change (e.g. met repressor [Rafferty et al., 1989]). X-ray crystallography has shown that most signalling proteins are oligomers made up of identical subunits. However, not all allosteric proteins have identical subunits (e.g. many transmembrane receptors are made up of different subunits). The most controversial aspect of the MWC model, even at the time [Koshland et al., 1966], is the concerted nature of all subunits switching together. Physically there is no reason why the subunits have to remain symmetrical and such a concerted transition seems unrealistic. This aspect of the MWC model is criticised by Koshland et al. [1966], who propose an alternative sequential model (which we review in section 1.4.2).

1.4.2 The Koshland-Nemethy-Filmer model

Koshland et al. [1966] challenge the MWC assumption of symmetry conservation. In their (KNF) model Koshland et al. [1966] allow for different subunits to be in different states. They consider interactions between subunits in different states. They allow sequential changes, unlike the MWC model which insists on each subunit being in the same state. Figure 1.3 gives examples of possible sequential transitions from all subunits in one state to all in another state.

Figure 1.3: Cartoon of the KNF model showing sequential changes in state. (a) Tetrahedral model, (b) square model, (c) linear model, (d) concerted model.

Koshland et al. [1966] outline four possible arrangements of four subunits. The ‘tetrahedral’ model has each subunit interacting with all the others (see figure 1.3a). The ‘square’ model has each subunit interacting with two others (see figure 1.3b).
The ‘linear’ model has each end subunit interacting with one other subunit, while the middle two subunits interact with two subunits each (see figure 1.3c). The ‘concerted’ model is like the MWC model in that all the subunits change simultaneously (see figure 1.3d). The KNF model allows for intermediate states between the two states in the MWC model. This allows for both negative and positive cooperativity depending on the nature of the subunit interactions. While the KNF model addresses criticisms of the MWC concerted transition, it still relies on conformational changes and so cannot explain allostery without conformational change (such as in the met repressor [Rafferty et al., 1989]). Both the MWC and the KNF models have been helpful in considering mechanisms for allostery but their picture is of switching between discrete states. We will see in section 1.5 that such a static view of proteins is now being revised.

1.5 Protein dynamics

In this thesis we study dynamic mechanisms for allostery. Throughout this thesis we use the word ‘dynamics’ in the loose manner that biochemists use the word to mean movement. Traditionally the goal of molecular biology has been structure determination but now it is increasingly recognised that dynamics are important as well as static structure. One example of the crucial role of dynamics in drug design is work by Ferrari et al. [2003]. They show that differences between the dynamics of the bacterial and human forms of the essential enzyme thymidylate synthase lead the antibacterial agent α156 to specifically inhibit the bacterial form but not the human form. In his review of NMR relaxation methods Wand [2001b] says “in the midst of the rush to pursue structural genomics, a quieter revolution ... to characterise internal protein motion has also begun”. Weber [1972] was maybe the first to encourage a move away from the view of proteins in discontinuous static states.
He pointed out the stochastic nature of proteins as macromolecules in ensembles of many conformations, only the average of which is seen by an experimental measurement. He suggests the binding of a ligand affects equilibria between such conformations within the ensemble. More recently, NMR hydrogen exchange data on lysozyme [Freire, 1999] has inspired the ‘dynamic population shift’ model. This model, also known as redistribution of conformational states, has built on the pre-existing equilibrium concept. These theoretical ideas have changed the way people think about binding mechanisms from the ‘lock and key’ model of Fischer [1894]. It is even a move away from the ‘induced fit’ model of Koshland [1958] (a small structural remoulding in order to fit a binding site) since it suggests the conformational state favoured by ligand binding already exists as part of the equilibrium ensemble of states. Within the pre-existing equilibrium there may exist conformations that bind different ligands with high affinity. This allows proteins to maintain specificity whilst also binding a range of ligands [Ma et al., 2002]. Such studies are forcing the drug design community to broaden their searches by considering protein dynamics. Bioinformatics studies by Dunker et al. [2002] have shown that a significant percentage of proteins contain disordered regions (of ≥ 40 consecutive residues). Radivojac et al. [2004] find disordered and flexible regions correlate with certain amino acid patterns, implying disorder and flexibility are encoded in the sequence. Interestingly, the predicted percentage of proteins with disorder increases along the evolutionary chain (6 − 33% in bacteria to 35 − 51% in eukaryotes) implying evolutionary selection of proteins with disordered regions. Such widespread disorder in protein structures adds evidence for the importance of disorder and dynamics (since disordered regions tend to be dynamic) for protein function. 
Kurzyński [1998] makes clear the distinction between different kinds of motions within proteins and their different time scales. The spectrum of time scales ranges from $\sim 10^{-14}$ s to $\sim 10^{5}$ s. Kurzyński [1998] distinguishes between vibrational modes ($\sim 10^{-14}$ s to $\sim 10^{-11}$ s) and stochastic conformational transitions ($\sim 10^{-11}$ s to $\sim 10^{5}$ s). Vibrational modes can be described by damped harmonic oscillations following the fluctuation-dissipation theorem. Stochastic conformational transitions are described by master equations for the activated kinetic processes. He defines transitions between conformational states to be those divided by barriers greater than $\sim 4 k_B T$. Biochemical reactions (such as dissociation or association) and large scale changes such as protein folding or unfolding fall into this category.

The high frequency modes correspond to localised stretching of N-H or C-H bonds. High frequency vibrational modes (time scales shorter than $\sim 10^{-13}$ s) correspond to energies greater than $k_B T$ at physiological temperatures of 300K. Therefore they are, in principle, not thermally excited. Frequencies that are thermally excited in the higher frequency region correspond to the vibrations of sidechains (see figure 1.4a). The low frequency vibrational modes ($\sim 10^{-11}$ s) are (overdamped) collective modes, which are correlated over whole domains (see figure 1.4b).

Figure 1.4: Cartoon of protein dynamics at different frequencies. (a) High frequencies ($\sim 10^{-12}$ s) excite vibrations in sidechains such as those depicted here. (b) Low frequencies ($\sim 10^{-11}$ s) correspond to global motions of whole domains such as shown here (for the lac repressor, PDB ID (protein data bank identification) 1LBI [Lewis et al., 1996]).

It is the contribution of these low frequency vibrations to the entropy of proteins that primarily interests us in this thesis.
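The statement that vibrations faster than about $10^{-13}$ s are not thermally excited at 300K can be checked with a short calculation. The sketch below is a rough estimate only: it compares the quantum $h\nu$ of a vibration of a given period with $k_B T$, using CODATA values for the constants. The function names are hypothetical and chosen for this illustration.

```python
H = 6.62607015e-34   # Planck constant in J s
KB = 1.380649e-23    # Boltzmann constant in J/K


def quantum_over_thermal(period_s, temperature_k=300.0):
    """Ratio h*nu / (kB*T) for a vibration with the given period in seconds."""
    frequency_hz = 1.0 / period_s
    return H * frequency_hz / (KB * temperature_k)


def crossover_period(temperature_k=300.0):
    """Period (s) at which h*nu equals kB*T; faster modes exceed kB*T."""
    return H / (KB * temperature_k)


if __name__ == "__main__":
    print("crossover period at 300 K (s):", crossover_period())
    for period in (1e-14, 1e-13, 1e-12, 1e-11):
        print(period, quantum_over_thermal(period))
```

The crossover period at 300K comes out between $10^{-13}$ s and $2 \times 10^{-13}$ s, consistent with the threshold quoted above, while modes with periods of $\sim 10^{-11}$ s have quanta far below $k_B T$.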
We will explore the hypothesis that a key to understanding protein function lies in the intramolecular dynamics. Thermal fluctuations not only affect the diffusion of the whole protein molecule, but excite a whole spectrum of internal vibrations. One hundred years after Einstein’s paper on Brownian motion there are still many unsolved issues and applications of the statistical mechanical tools developed. See Frey and Kroy [2005] for a recent review of Brownian motion and its applications in biological physics. We introduce, in section 1.7, various experimental techniques that probe protein dynamics. Computational techniques can also be used to probe protein dynamics and we introduce these in section 2.2. However, before we discuss such direct experimental or computational probes we review the thermodynamics of proteins.

1.6 Thermodynamics of proteins

1.6.1 Introduction to thermodynamics and kinetics

Throughout this thesis we will be using the language of thermodynamics as we investigate changes in free energy, enthalpy and entropy. Conservation of energy governs all processes. All molecular activity within a cell involves transfer of energy of different types. Allostery relies critically on changes in free energies. For a chemical reaction to happen spontaneously the change in free energy ∆G of that reaction must be negative. That is, the final state must have a lower free energy than the initial state. The free energy change of any chemical process (such as the binding of a ligand) involves both an enthalpic, ∆H, and an entropic, ∆S, component;

$$\Delta G = \Delta H - T \Delta S. \tag{1.1}$$

The free energy change $\Delta G^o$ at standard conditions can be readily obtained from the equilibrium constant K for the reaction;

$$\Delta G^o = -RT \ln K. \tag{1.2}$$

The equilibrium constant for a reaction of the form $A + B \rightleftharpoons AB$ is given by,

$$K = \frac{[AB]}{[A][B]} = \frac{k_a}{k_d}, \tag{1.3}$$

where $k_a$ and $k_d$ are the rate constants for the association and dissociation reactions respectively and the square brackets mean the concentration of that species.

An indirect method of measuring enthalpies of reactions is van’t Hoff analysis of the temperature dependence of the equilibrium constant. A van’t Hoff plot (analogous to an Arrhenius plot) is a plot of ln K against reciprocal temperature. It is obtained by combining equations (1.1) and (1.2) to give $\ln K = -\frac{\Delta H}{R}\left(\frac{1}{T}\right) + \frac{\Delta S}{R}$. Thus the gradient of the plot gives the enthalpy (as $-\Delta H/R$) and the y-intercept gives the entropy (as $\Delta S/R$). This method assumes there is no significant temperature dependence of ∆H or ∆S.

Enzyme kinetics in steady state sometimes follow ‘Michaelis-Menten’ kinetics [Michaelis and Menten, 1913]. For an enzyme E binding a substrate S and reacting to give a product P, in a reaction of the form $E + S \rightleftharpoons ES \rightarrow E + P$, the rate of the reaction is $V = k_{cat}[ES]$. Here $k_{cat}$ is the rate of the catalysis $ES \rightarrow E + P$, which is the number of substrate molecules processed per enzyme molecule per second. At steady state [ES] is constant. This leads us to the Michaelis-Menten equation;

$$V = \frac{k_{cat}[E_0][S]}{K_m + [S]} = \frac{V_{max}[S]}{K_m + [S]}, \tag{1.4}$$

where $[E_0]$ is the initial concentration of enzyme and $K_m = \frac{k_{-1} + k_{cat}}{k_1}$, where $k_1$ and $k_{-1}$ are the forward and backward rates of the reaction $E + S \rightleftharpoons ES$. $K_m$ is equal to the concentration of substrate when the rate is half maximum, $V = V_{max}/2$. At small substrate concentrations the reaction rate is substrate-limited but at high concentrations the reaction rate is enzyme-limited. In most systems $k_{-1} \gg k_{cat}$ so $K_m$ is approximately the dissociation constant for the enzyme-substrate binding. The characteristic curve of the Michaelis-Menten equation is shown in figure 1.5.
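To make the Michaelis-Menten relation concrete, the following minimal Python sketch evaluates equation (1.4) and confirms that the rate is half maximal when the substrate concentration equals $K_m$. The rate constants and concentrations are hypothetical values chosen only for illustration; they do not correspond to any enzyme discussed in this thesis.

```python
def michaelis_constant(k1, k_minus1, kcat):
    """Km = (k_-1 + kcat) / k1 for the scheme E + S <-> ES -> E + P."""
    return (k_minus1 + kcat) / k1


def michaelis_menten_rate(s, e0, kcat, km):
    """Steady-state rate V = kcat*[E0]*[S] / (Km + [S]), equation (1.4)."""
    return kcat * e0 * s / (km + s)


if __name__ == "__main__":
    # Hypothetical parameters: k1 in 1/(M s), k_-1 and kcat in 1/s, [E0] in M.
    k1, k_minus1, kcat, e0 = 1.0e7, 1.9e3, 1.0e2, 1.0e-6
    km = michaelis_constant(k1, k_minus1, kcat)
    vmax = kcat * e0
    print("Km (M):", km)
    print("V at [S] = Km, in units of Vmax:",
          michaelis_menten_rate(km, e0, kcat, km) / vmax)
    print("V at [S] = 1000 Km, in units of Vmax:",
          michaelis_menten_rate(1000.0 * km, e0, kcat, km) / vmax)
```

With $k_{-1} \gg k_{cat}$, as in the chosen values, $K_m$ is close to the dissociation constant $k_{-1}/k_1$, in line with the approximation stated above.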
Allosteric enzymes, however, do not obey Michaelis-Menten kinetics. They show sigmoidal curves as shown in figure 1.5.

Figure 1.5: The fraction of enzyme sites bound to the substrate Y as a function of the substrate concentration [S]. The solid red curve is a typical Michaelis-Menten curve (plot of equation (1.4)) and dashed green is a sigmoidal curve showing cooperativity (equation (1.5)).

Such sigmoidal curves are described by the Hill equation;

$$Y = \frac{[S]^n}{K + [S]^n}, \tag{1.5}$$

where n is the Hill coefficient. The Hill coefficient is the slope at the midpoint of the Hill plot (log[Y/(1 − Y)] against log[S]) and is a measure of the degree of cooperativity. For positively cooperative systems it gives an estimate of the number of binding sites but in general it is not equal to the number of binding sites and does not represent a physically possible reaction scheme. It is best treated as an empirical equation for the cooperativity of the enzyme. A Hill coefficient of n = 1 means each successive ligand binds with an affinity independent of the number of ligands already bound (i.e. the Michaelis-Menten case). n > 1 indicates positive cooperativity and n < 1 indicates negative cooperativity. $Y = V/V_{max}$ is equal to the fraction of enzyme sites bound.

Talk of ‘enthalpy-entropy compensation’ is common in the literature. If ∆H and ∆S have the same sign they will give opposite contributions to the free energy change ∆G = ∆H − T∆S. Reactions that are enthalpically favourable (∆H < 0) often involve an entropic cost (∆S < 0). Enthalpically favourable (∆H < 0) means the internal energy of the products is less than that of the reactants. An entropic cost (∆S < 0) is an increase in order. Reactions that are enthalpically unfavourable may occur due to being highly entropically favourable. In section 5.3 we develop a model for compensating dynamic entropy and enthalpy.
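Returning briefly to the Hill equation (1.5), the sketch below illustrates how the Hill coefficient controls the shape of the binding curve. It evaluates Y for a given n and recovers n numerically as the slope of log[Y/(1 − Y)] against log[S]. The helper names and the parameter values are hypothetical and serve only as an illustration.

```python
import math


def hill_fraction(s, k, n):
    """Fraction of sites bound, Y = [S]^n / (K + [S]^n), equation (1.5)."""
    return s ** n / (k + s ** n)


def hill_plot_slope(s, k, n, rel_step=1e-4):
    """Numerical slope of log[Y/(1-Y)] against log[S] around concentration s."""
    def logit(x):
        y = hill_fraction(x, k, n)
        return math.log(y / (1.0 - y))

    s_lo, s_hi = s * (1.0 - rel_step), s * (1.0 + rel_step)
    return (logit(s_hi) - logit(s_lo)) / (math.log(s_hi) - math.log(s_lo))


if __name__ == "__main__":
    k = 1.0  # hypothetical constant, in units of concentration**n
    for n in (0.5, 1.0, 2.5):
        midpoint = k ** (1.0 / n)  # concentration at which Y = 1/2
        print(n, hill_fraction(midpoint, k, n), hill_plot_slope(midpoint, k, n))
```

For n = 1 the function reduces to the hyperbolic Michaelis-Menten form [S]/(K + [S]), while n > 1 gives the sigmoidal shape characteristic of positive cooperativity.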
A little care is needed as some of the apparent effects referred to as enthalpy-entropy compensation are artifacts of the analysis technique used (such as extrapolation of Arrhenius plots or van’t Hoff plots) [Cornish-Bowden, 2002]. Whenever enthalpy-entropy compensation is found from calorimetric data, where the heat transfer has been directly measured, it can be trusted to be a real effect [Cooper, 1999]. Calorimetric techniques often measure the heat capacity. Heat capacity $C_p$ is the rate of heat absorbed with change in temperature (at constant pressure). It is related to the enthalpy and entropy by;

$$\Delta H = \int \Delta C_p \, dT + \Delta H(0), \tag{1.6}$$

$$\Delta S = \int \frac{\Delta C_p}{T} \, dT. \tag{1.7}$$

Unfavourable enthalpy of breaking bonds is often compensated by increased molecular flexibility giving a favourable entropy. Conversely, favourable enthalpy of forming bonds can be compensated by an unfavourable entropy due to the resulting restriction of motion. Consequently quite large changes in enthalpy and entropy can result in the small subtle free energies often required by biological function (e.g., signalling when reactions need to reverse with changes in the environment).

1.6.2 Isothermal Titration Calorimetry

Isothermal Titration Calorimetry (ITC) is a direct method for measuring the enthalpy change of a reaction. In sections 3.4.1 and 5.4.1 we compare our calculation results to data obtained by ITC. ITC measures the binding affinity of interactions by detecting the heat absorbed or released during the reaction, so giving the enthalpy change. ITC can measure affinities between millimolar and nanomolar scales. The main limitation is the need for high concentrations of reactants to detect the heat [Doyle, 1997]. Leavitt and Freire [2001] give a review of modern ITC techniques. A typical ITC experiment involves injecting ∼ 10 µl of ligand solution into a cell containing ∼ 1 ml of protein solution at regular intervals in time (see inset of figure 1.6).
The instrument maintains a constant temperature difference between the reaction cell and a reference cell. The power needed to achieve this temperature maintenance is recorded. At each injection heat is released or absorbed in the reaction leading to the recorded peak in the power (see plot in figure 1.6). The heat associated with the ith injection is $q_i = V \Delta H \Delta L_i$ where V is the volume of the cell, $\Delta L_i$ is the increase in concentration of bound ligand and ∆H is the enthalpy of binding. $\Delta L_i$ is a function of the known total ligand concentration and the binding equilibrium constant $K_a$. In the simplest case of only one binding site

$$\Delta L_i = [P] K_a \left( \frac{[L]_i}{1 + K_a [L]_i} - \frac{[L]_{i-1}}{1 + K_a [L]_{i-1}} \right)$$

where [P] is the concentration of protein. Analysis of the data gives ∆H and $K_a$, which gives ∆G from equation (1.2). The entropy is then obtained from equation (1.1). Repeating the titration at different temperatures gives the heat capacity $\Delta C_p = \partial \Delta H / \partial T$.

Figure 1.6: A typical ITC experiment. The inset shows an ITC reaction cell containing protein (red) and the injection syringe containing ligand (green). The graph shows the measured signal - the electric power needed to maintain a constant temperature difference between the reaction and reference cells. Each injection gives a peak, the area under which is the heat associated with the process. Figure reproduced from Leavitt and Freire [2001].

Thermodynamic measurements, such as binding equilibria and calorimetry, measure energy changes. The enthalpic component is primarily due to static conformational changes. The entropic component however is due to disorder. Entropic changes can occur by changes in the ordering of solvent molecules due to the hydrophobic effect. However, the entropic component is largely due to changes in the disorder of the protein structure (i.e., changes to the protein flexibility). In this way thermodynamical measurements are indirectly measuring protein dynamics.
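The single-site ITC expression given earlier in this section, $q_i = V \Delta H \Delta L_i$, can be illustrated with a short numerical sketch. It is a simplified picture under stated assumptions: $[L]_i$ is treated as the free ligand concentration after injection i, no ligand is present before the first injection, and dilution effects are ignored. All numerical values and function names are hypothetical.

```python
def bound_ligand(p_total, ka, l_free):
    """Bound ligand concentration [P]*Ka*[L] / (1 + Ka*[L]) for one site."""
    return p_total * ka * l_free / (1.0 + ka * l_free)


def injection_heats(cell_volume, delta_h, p_total, ka, l_free_series):
    """Heat q_i = V * dH * dL_i for each injection in the series.

    l_free_series holds the (assumed) free ligand concentration after each
    injection; the concentration before the first injection is taken as zero.
    """
    heats = []
    previous = 0.0
    for l_free in l_free_series:
        current = bound_ligand(p_total, ka, l_free)
        heats.append(cell_volume * delta_h * (current - previous))
        previous = current
    return heats


if __name__ == "__main__":
    # Hypothetical values: 1 ml cell, dH = -40 kJ/mol, 10 uM protein, Ka = 1e6 /M.
    ligand = [i * 1.0e-6 for i in range(1, 11)]
    for i, q in enumerate(injection_heats(1.0e-3, -4.0e4, 1.0e-5, 1.0e6, ligand), 1):
        print(i, q)
```

The heats shrink in magnitude as the binding sites saturate, which is the behaviour seen in the successive peaks of a typical ITC trace.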
We now introduce experimental methods that probe protein dynamics more directly.

1.7 Experimental techniques probing protein dynamics

1.7.1 X-ray crystal diffraction

Throughout this thesis we use protein structures determined by x-ray crystallography. We use such structures as a starting point to guide the formation of coarse-grained models. We also use the atomic level detail in crystal structures to parameterise our coarse-grained models. In this section we introduce the methods of determining such crystal structures and assess what dynamical information can be obtained from such techniques.

Since Kendrew et al. [1958] and Perutz [1963] obtained the first x-ray crystal structures of a protein (myoglobin and haemoglobin respectively) thousands of structures have been solved. X-ray crystallography remains the most common method used, with 90% of the ∼ 20 thousand structures on the Protein Data Bank (PDB) [Berman et al., 2000] being x-ray crystal structures. The major challenge in x-ray structure determination is to crystallise the protein in a pure enough form to diffract the x-rays. Crystals (> 0.5 mm) are needed since diffraction from single molecules is too weak. Suitable crystals contain $10^{13} - 10^{15}$ complexes and the model obtained is an average of these over the time of the experiment (several days). Protein crystallography is difficult because the crystals are held together by hydrogen bonds between water molecules, the interactions of which are weak, causing the crystal to be very fragile. There are large numbers of ordered and disordered H$_2$O molecules (about one per amino acid) in the aqueous crystal. Further drying or dehydration destroys the structure.

All structures are quoted with a resolution in Å. A resolution of 2 Å indicates that reflections from a distance up to 1/(2 Å) in reciprocal space have been taken into consideration (i.e., the diffraction from 2 Å spaced parallel planes).
This means only atoms 2 Å or more apart can be resolved. The position of an atom is accurate to about 10% of the quoted distance [Rhodes, 1993]. There is concern as to how much the crystallisation of a protein alters its structure. Functionality retention can be checked in some cases (e.g., enzymes giving the product though at a reduced rate due to a reduced availability of active sites [Rhodes, 1993]). As NMR structures are gradually solved the x-ray structures can be compared to these (see section 1.7.2).

The main aim of molecular biology over the past half century has been structure determination. X-ray crystallography has opened our eyes to many wonders of the protein world. However, it does have the disadvantage of encouraging a static view of proteins as we marvel at the multicoloured model structures freely downloadable from the protein data bank (PDB) [Berman et al., 2000]. It is important to remember the x-ray crystal structure model obtained is an average static picture of a dynamic system.

Information about the dynamics of the protein can be inferred from x-ray crystallography. For example, many loop regions in proteins are missing from the crystal structure because the electron density is not sharp enough to determine the atomic positions. This is usually explained by the loop region being so flexible that the atomic positions cannot be resolved. Most crystal structures have ‘B-factors’ or ‘Debye-Waller factors’ quoted for each atom. These are often interpreted as being an indication of dynamics. We need to be careful with such interpretations as the uncertainty in atomic position includes refinement protocol effects and static disorder (due to the breakdown of crystal symmetry) as well as actual fluctuation in atomic positions. The measured B-factors can be related to the mean square displacements of the atomic positions $\langle \bar{u}^2 \rangle$ in the crystal by $B = 8\pi^2 \langle \bar{u}^2 \rangle$ [Rhodes, 1993].
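As a quick worked illustration of the relation $B = 8\pi^2 \langle \bar{u}^2 \rangle$, the sketch below converts a B-factor in Å$^2$ into a mean square and a root mean square displacement. The example B-factor of 20 Å$^2$ is a hypothetical value, and the conversion inherits the assumption that the B-factor reflects thermal fluctuations only.

```python
import math


def mean_square_displacement(b_factor):
    """<u^2> in Angstrom^2 from a B-factor in Angstrom^2, B = 8*pi^2*<u^2>."""
    return b_factor / (8.0 * math.pi ** 2)


def rms_displacement(b_factor):
    """Root mean square displacement in Angstrom for the given B-factor."""
    return math.sqrt(mean_square_displacement(b_factor))


if __name__ == "__main__":
    b = 20.0  # hypothetical B-factor in Angstrom^2
    print("<u^2> (A^2):", mean_square_displacement(b))
    print("rms displacement (A):", rms_displacement(b))
```

A B-factor of 20 Å$^2$ therefore corresponds to a root mean square displacement of roughly half an ångström.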
This expression is within the harmonic approximation and assumes the B-factors are a measure of the thermal fluctuations only. The thermal fluctuations in a crystal are likely to be a lower limit of in vivo protein movement, since the crystal lattice packing most likely constrains otherwise more flexible proteins. For small molecules anisotropic temperature factors can be obtained. These give the preferred direction of vibration of each atom. However, these are unattainable for most proteins due to the amount of data needed for such analysis exceeding resolution limits. The B-factors give no time scale information. Crystal structures are often an essential starting point for computational techniques probing dynamics (such as molecular dynamics simulations). In section 3.3.1 we use B-factors for an initial parameterisation of our coarse-grained model for the lac repressor.

1.7.2 Nuclear Magnetic Resonance

We will be referring to NMR (nuclear magnetic resonance) structure and dynamics studies during this thesis (for example, in section 5.4 for the met repressor). We will be comparing our calculations of flexibility changes with dynamics measured by NMR experiments. Here we introduce NMR methods for structure determination and dynamics.

Over the past 20 years NMR has been used to determine over three thousand protein structures. NMR as a technique for determining protein structures has many advantages over x-ray crystallography. The protein does not have to be crystallised so even proteins that cannot be crystallised can be studied. NMR has none of the problems, associated with x-ray diffraction, of crystallisation changing the native structure. However, the most important advantage for us is that the dynamics of the protein can be more directly probed with NMR than x-ray crystallography.
As Jardetzky and Lefevre [1994] put it, x-ray diffraction provides accurate geometric information and indirect evidence of dynamics but NMR provides direct dynamics and approximate geometric information. The two techniques are therefore complementary. However, only small protein structures can be solved by NMR, the current upper limit being of the order of 100 kDa [Alberts, 2002] (1000 residues). The major limitations of NMR are to do with its low sensitivity and the challenging analysis of the highly complex information contained in the spectra.

Here we introduce the physical concepts behind NMR experiments. Atomic nuclei with an odd number of nucleons (such as hydrogen) possess a spin magnetic moment $\mu_I = \gamma I$ where $I$ is the spin angular momentum and $\gamma$ is the nuclear gyromagnetic ratio. This magnetic moment couples to an external magnetic field $B_0$, splitting the quantum energy levels into a hyperfine structure of $2I + 1$ levels by a perturbation of $-\mu_I \cdot B_0$ (the Zeeman effect). For nuclei such as $^{1}$H, $^{13}$C or $^{15}$N with spin $\frac{1}{2}$ the energy gap between the two energy levels is $E = \gamma \hbar B_0$. Electromagnetic radiation pulses of this energy $E = h\nu_0$ excite transitions between the energy levels, where $\nu_0$ is the resonant frequency (which is equal to the Larmor frequency of the semi-classical description).

In the semi-classical description of the nuclear quantum spin each nucleus has a magnetic moment in the xy plane, which rotates at the Larmor frequency $\nu_0$ given by $E = h\nu_0 = \gamma \hbar B_0$. At equilibrium in an applied magnetic field $B_0$ in the z direction, the xy components of all the nuclei average to zero, but there will be a net magnetisation in the z direction $M_z = \frac{1}{2} \gamma \hbar (N_+ - N_-)$ due to the difference in Boltzmann populations $(N_+ - N_-)$ of the different spin states (more nuclear spins will be in the lower state).
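The relation $h\nu_0 = \gamma \hbar B_0$ and the small Boltzmann population difference behind $M_z$ can be made concrete with a short calculation. The sketch below is only indicative: it uses the CODATA proton gyromagnetic ratio and a field of 11.74 T, a hypothetical but typical value chosen because it corresponds to a proton frequency near 500 MHz.

```python
import math

H = 6.62607015e-34            # Planck constant in J s
HBAR = H / (2.0 * math.pi)    # reduced Planck constant in J s
KB = 1.380649e-23             # Boltzmann constant in J/K
GAMMA_1H = 2.6752218744e8     # proton gyromagnetic ratio in rad/(s T)


def larmor_frequency_hz(gamma, b0_tesla):
    """Larmor frequency nu_0 = gamma*B0 / (2*pi), from h*nu_0 = gamma*hbar*B0."""
    return gamma * b0_tesla / (2.0 * math.pi)


def population_ratio(gamma, b0_tesla, temperature_k):
    """Boltzmann ratio N_upper/N_lower = exp(-gamma*hbar*B0 / (kB*T))."""
    return math.exp(-gamma * HBAR * b0_tesla / (KB * temperature_k))


if __name__ == "__main__":
    b0 = 11.74  # hypothetical field strength in tesla
    print("proton Larmor frequency (MHz):", larmor_frequency_hz(GAMMA_1H, b0) / 1e6)
    print("N_upper / N_lower at 300 K:", population_ratio(GAMMA_1H, b0, 300.0))
```

The population ratio differs from one by less than one part in ten thousand, which is one way of seeing why NMR has the low sensitivity mentioned above.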
By applying a radio-frequency pulse at the Larmor frequency the net magnetisation $M_z$ is reduced or inverted due to the nuclei excitations increasing the population of the higher energy level. The transitions bring the spins into phase with the pulse applied. This causes a non-zero magnetisation $M_{xy}$ in the xy plane due to the phase coherence of the excited spins. The length of time the pulse is applied for determines the resulting values of $M_z$ and $M_{xy}$. NMR experimentalists refer to the length of the pulse by the angle the resultant magnetisation M is tilted away from the z direction. So a 90° pulse results in equalising the populations leading to $M_z = 0$ and $M = M_{xy}$. A 180° pulse inverts the populations leading to reversing the sign of $M_z$ and $M_{xy} = 0$ [Rattle, 1995]. After a radio-frequency pulse the excited states relax back to equilibrium over time.

The absorption and relaxation spectrum may be detected in an NMR experiment. The precessing magnetisation $M_{xy}$ induces an oscillating current in a nearby wire coil. The intensity of this current is measured over time. This signal is known as the free induction decay (FID). The frequency of the detected oscillating current is the Larmor frequency. The Fourier transform of the FID will give the spectrum of different frequencies in the signal. Nuclei in different chemical environments will have different Larmor frequencies and so different peak positions in the spectrum. On application of $B_0$, moving electrons induce an opposing magnetic moment (the diamagnetic effect). This causes a reduction in the local field seen by the nucleus. The nucleus is shielded by the electrons. Each nucleus in a different chemical environment will be shielded by a different extent giving rise to the different peak positions in the spectrum. Structural information can therefore be extracted from the spectra.
The linewidth of the peaks in the spectrum is dependent on the relaxation time measured from the decay of the current intensity. The shorter the relaxation time the broader the peak, due to Heisenberg’s uncertainty principle.

In order to determine the structure of a protein from NMR experiments the Nuclear Overhauser Effect (NOE) is measured. This is the detected change in the intensity of the resonance from one nucleus when a different nearby (< 5 Å apart) nucleus is perturbed. In practice the perturbation is usually ‘saturation’, meaning the populations of the spin states are equalised by the applied radio-frequency. Irradiation to saturate one nucleus (for example a C nucleus) enhances the NMR signal of a second nearby nucleus (for example a proton). The change in signal intensity is dependent on the rates of energy transfer between the nuclei. Note this is different to spin-spin coupling (or J-coupling) causing splitting of the resonance peaks. The latter is caused by the orientation of a neighbouring bonded nucleus affecting the local magnetic field via the electrons in the bond. In large molecules such as proteins J-coupling is usually hidden by the line broadening due to short relaxation times. The NOE is due to dipolar interactions between the magnetic dipoles of close nuclei. It can therefore give information about nuclei that are adjacent but not connected by bonds. The rates of energy transfer in NOE depend on the distance between the nuclei and the correlation time of the molecule (the time it takes to translate one molecular diameter or rotate one radian). That is, the presence of a detected NOE shows two protons are close in space (< 5 Å apart) [Rattle, 1995]. The NOE data provide distance restraints that are used to form the NMR model structures.

We now consider NMR experiments that tell us about the protein dynamics. Different NMR techniques have been developed that probe different timescale windows of protein dynamics.
Nuclear spin relaxation probes ps-ns; backbone dynamics are obtained from $^{15}$N and $^{13}$C spin relaxation experiments and sidechain dynamics from $^{2}$H spin relaxation [Yang and Kay, 1996]. Lineshape analysis and rotating frame nuclear spin relaxation probe µs-ms. Magnetisation exchange spectroscopy probes milliseconds to seconds and hydrogen-deuterium exchange probes seconds to hours [Palmer, 2001].

Relaxation measurements provide quantitative measurements of transverse and longitudinal relaxations of protons or $^{15}$N or $^{13}$C nuclei. The transverse relaxation time $T_2$ is measured by spin-echo experiments. Firstly a 90° pulse tips the magnetisation into the xy plane. Then, this $M_{xy}$ is allowed to exponentially decay for a certain time. The decay is due partly to magnetic noise and spin exchange governed by the dynamics we want to probe, but partly due to loss in phase coherence due to field inhomogeneities. This latter effect is reversed by applying a 180° pulse followed by the same time for relaxation (the spin echo). This pulse sequence (spin echo) is then repeated leading to exponential decay in $M_{xy}$ with a decay time equal to the true transverse relaxation time. The longitudinal relaxation time $T_1$ is measured by inversion recovery. Here a pulse sequence of 180° followed by 90° after a short time τ is applied for varying values of τ. The 180° pulse inverts $M_z$, which decays over time τ, and is then tipped into the xy plane by the 90° pulse where it is detected. The peak heights decay over τ with a relaxation time equal to $T_1$.

Solid-state NMR relaxation probes similar time scales to solution NMR. It has the advantage that the sample is not affected by overall rotational diffusion. This allows extra information from anisotropic quadrupolar and chemical shift interactions, which average to zero in solution, to be obtained.
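The spin-echo and inversion recovery measurements described above can be summarised by two idealised signal expressions. The sketch below assumes perfect 90° and 180° pulses and single-exponential relaxation, which are standard textbook simplifications rather than results from this thesis. The relaxation times used are hypothetical.

```python
import math


def spin_echo_amplitude(t, m0, t2):
    """Echo amplitude M_xy(t) = M0 * exp(-t/T2) for ideal refocusing pulses."""
    return m0 * math.exp(-t / t2)


def inversion_recovery_signal(tau, m0, t1):
    """M_z(tau) = M0 * (1 - 2*exp(-tau/T1)) after an ideal 180 degree pulse."""
    return m0 * (1.0 - 2.0 * math.exp(-tau / t1))


if __name__ == "__main__":
    t1, t2 = 1.0, 0.1  # hypothetical relaxation times in seconds
    for tau in (0.0, 0.5, t1 * math.log(2.0), 2.0, 5.0):
        print(tau, inversion_recovery_signal(tau, 1.0, t1))
    for t in (0.0, 0.05, 0.1, 0.3):
        print(t, spin_echo_amplitude(t, 1.0, t2))
```

In this idealised picture the inversion recovery signal starts at $-M_0$, passes through zero at $\tau = T_1 \ln 2$ and recovers towards $+M_0$, so fitting the measured peak heights against τ yields $T_1$.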
However, solid-state NMR usually requires specific isotope labelling, restricting the number of sites that can be measured.

The NMR relaxation parameters can be expressed in terms of a generalised order parameter $S_{LZ}^2$, which is related to the amplitude and timescale of a given bond vector’s motion. Lipari and Szabo [1982] define the generalised order parameter as $S_{LZ}^2 = C_I(\infty)$ where $C_I(t)$ is the correlation function of internal motions. Yang and Kay [1996] show how this order parameter $S_{LZ}^2$ can be related to the conformational entropy $S_p$ per residue on the ps-ns timescale. This is because both the entropy of bond vector rotation and the order parameter are dependent on the distribution of bond vector orientations. Yang and Kay [1996] calculate the relation as $S_p/k = A + \ln[3 - (1 + 8 S_{LZ})^{1/2}]$ where A is a constant depending on the model of motion used.

As Jardetzky and Lefevre [1994] point out, the dynamical information extracted from NMR measurements is local. However, it is the global dynamics that are important for allosteric function. While large entropy changes due to alterations in the fast sidechain dynamics are seen by $^{2}$H relaxation (for example, in calmodulin peptide binding by Lee et al. [2000]) we expect such local dynamics not to be effective in allosteric free energies. The slower backbone dynamics reflect more global dynamics. Mäler et al. [2000] measured $^{15}$N relaxation and found changes in the ps-ns backbone dynamics clearly showing a dynamic allosteric effect in the cooperative binding of successive Ca$^{2+}$ ions to a calcium binding protein. Their study shows NMR dynamics experiments can say something about protein dynamics that affect allostery. The comment in a recent review by Kern and Zuiderweg [2003]: “the dynamic NMR community is just beginning to uncover the tip of the entropic iceberg” sums up the exciting stage dynamic NMR experiments have reached in being able to investigate dynamic allostery.
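The relation between order parameter and entropy quoted above from Yang and Kay [1996] lends itself to a short numerical illustration. Because the constant A cancels in a difference, the sketch below computes only the entropy change, in units of k, when the order parameter of a bond vector changes between two states. It takes $S_{LZ}$ to be the square root of $S_{LZ}^2$, following the form of the expression as written in the text; the order parameter values are hypothetical.

```python
import math


def entropy_term(s_lz):
    """ln[3 - sqrt(1 + 8*S_LZ)], i.e. Sp/k without the model constant A."""
    if not 0.0 <= s_lz < 1.0:
        raise ValueError("S_LZ must lie in [0, 1) for a finite entropy")
    return math.log(3.0 - math.sqrt(1.0 + 8.0 * s_lz))


def entropy_change_over_k(s_before, s_after):
    """Entropy change per residue in units of k; the constant A cancels."""
    return entropy_term(s_after) - entropy_term(s_before)


if __name__ == "__main__":
    # Hypothetical example: a bond vector becomes more ordered on binding.
    print("dS/k for S_LZ 0.8 -> 0.9:", entropy_change_over_k(0.8, 0.9))
    print("dS/k for S_LZ 0.9 -> 0.8:", entropy_change_over_k(0.9, 0.8))
```

An increase in the order parameter gives a negative entropy change, and the logarithm diverges as $S_{LZ} \to 1$, so the estimate is very sensitive to measurement error for highly ordered bond vectors.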
1.7.3 Neutron scattering

Neutron scattering experiments can measure the thermal mean square atomic fluctuations and frequencies in the 0.1 − 100 ps timescale. Neutron scattering is a useful technique for studying protein dynamics including allosteric proteins. Neutrons are scattered strongly from hydrogen atoms and less so by deuterium. Therefore, by selective deuterium labelling, flexibility of specific parts of a protein can be studied. Neutron scattering has the advantage over x-ray crystallography that it can be performed on non-crystalline or monodisperse samples (maybe even in vivo in the future). Neutrons of energy $\sim 1 k_B T$ have a wavelength of ∼ 1 Å, which is the range of thermal energy atomic fluctuations in proteins, making neutrons ideal to study protein dynamics [Zaccai, 2000].

Mean square displacements are calculated from the angular dependence of the scattered intensity (from the dynamic structure factor assuming a two state model of two free energy wells). Pseudo force constants can be obtained from the slopes of the root mean square fluctuations plotted against temperature. Protein dynamical transitions at T ∼ 200K have been observed by sharp changes in such a plot (first seen in myoglobin by Doster et al. [1989]). The protein dynamical transition is analogous to the glass transition. At low temperatures only harmonic vibrations are excited but above the dynamic transition anharmonic motions are switched on (corresponding to there being enough thermal energy to jump between wells in the free energy landscape). Such a dynamical transition seen in neutron scattering experiments has been linked to the onset of activity above the transition temperature (for example, in bacteriorhodopsin [Ferrand et al., 1993]). However, other studies by Daniel et al. [1998] on a glutamate dehydrogenase enzyme show activity well below the transition temperature measured by 100 ps resolution neutron scattering.
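The statement above that neutrons of energy $\sim k_B T$ have wavelengths of order 1 Å follows from the de Broglie relation $\lambda = h / \sqrt{2 m E}$. The sketch below is a rough order of magnitude check using CODATA constants; the function names are hypothetical.

```python
import math

H = 6.62607015e-34            # Planck constant in J s
KB = 1.380649e-23             # Boltzmann constant in J/K
M_NEUTRON = 1.67492749804e-27  # neutron mass in kg


def de_broglie_wavelength_angstrom(kinetic_energy_j):
    """Wavelength lambda = h / sqrt(2*m*E) of a neutron, in Angstrom."""
    momentum = math.sqrt(2.0 * M_NEUTRON * kinetic_energy_j)
    return H / momentum * 1.0e10


def thermal_neutron_wavelength_angstrom(temperature_k):
    """Wavelength of a neutron whose kinetic energy equals kB*T."""
    return de_broglie_wavelength_angstrom(KB * temperature_k)


if __name__ == "__main__":
    for temperature in (100.0, 200.0, 300.0):
        print(temperature, thermal_neutron_wavelength_angstrom(temperature))
```

At 300K the result is a little under 2 Å, confirming that thermal neutrons match the ångström length scale of atomic fluctuations in proteins.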
The dynamical transition is now thought to be timescale dependent [Daniel et al., 2003]. Comparison of neutron scattering at 5 ns resolution with that at 100 ps resolution on a glutamate dehydrogenase enzyme suggests that anharmonic motions faster than 100 ps are not required for enzyme function [Daniel et al., 1999]. Principal component analysis (introduced in section 2.2.3) of molecular dynamics simulations by Tournier and Smith [2003] suggests that the dynamics activated at the dynamical transition can be explained by a few global modes. Such studies are important in the quest to understand the role of different time scale dynamics in allostery.

1.7.4 Fluorescence Resonance Energy Transfer

Förster [1948] formulated the Fluorescence Resonance Energy Transfer (FRET) technique. FRET enables the measurement of molecular distances. Binding or conformational change can be observed in real time (on the order of seconds). Time resolved FRET can be used to infer flexibility (on a µs-ms time scale). Such techniques have now been developed for single molecule studies and are being used to monitor conformational changes [Weiss, 2000]. These emerging techniques may therefore be useful for observing allosteric mechanisms.

The FRET phenomenon occurs when two fluorescent atoms that have overlapping emission/absorption spectra are sufficiently close (10-100 Å). One fluorophore is the donor and the other the acceptor (see figure 1.7). When the donor is excited by the right frequency, dipole-dipole couplings allow the acceptor to also be excited. The emission from the donor or acceptor can be measured. The intensity of the acceptor is enhanced, to the detriment of the donor emission intensity, due to the energy transfer.
The energy transfer efficiency $E$ is dependent on the distance $R$ between the fluorophores according to the dipole-dipole coupling mechanism,

$$E = \frac{R_0^6}{R^6 + R_0^6}, \tag{1.8}$$

(where $R_0$ is the Förster distance or critical transfer distance characteristic of the donor-acceptor pair used) as seen in figure 1.7.

Figure 1.7: Example of FRET experiments. Left: green: donor emission spectrum, pink: acceptor absorbance spectrum, dotted: the overlap. Right: distance dependence of the FRET efficiency. The dotted line indicates $R_0$ and the highlighted green area shows the experimentally accessible distance range for this donor-acceptor pair. Figure taken from Klostermeier and Millar [2001].

The natural jellyfish green fluorescent protein (GFP) is used to label proteins for FRET measurements. Mutants of GFP, cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP), are also used. Protein-protein interactions can be studied in vivo by labelling different proteins with fluorophores. Conformational changes can be studied by fusing the donor and acceptor to the same protein. See Truong and Ikura [2001] for a recent review of FRET.

Time-resolved FRET (trFRET) can be used to study global flexibility of biomolecules with a ms time resolution. The nanosecond emission decay of the donor is measured after a short excitation pulse. In the presence of an acceptor at distance $R$ the exponential donor emission intensity decay is affected. If a range of distances is present the decay will depend on the probability distribution $P(R)$;

$$I(t) = I_0 \int P(R) \exp\left[-\frac{t}{\tau_D}\left(1 + \left(\frac{R_0}{R}\right)^6\right)\right] dR, \tag{1.9}$$

where $\tau_D$ is the intrinsic lifetime of the donor excited state. The width of this measured profile then reflects the conformational flexibility of the molecule in that it gives the distribution of conformations [Klostermeier and Millar, 2001].

FRET studies and trFRET have been used to investigate allosteric proteins.
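Equations (1.8) and (1.9) can be illustrated with a short numerical sketch. This is a minimal illustration under stated assumptions, not the analysis procedure of any cited study: the function names, the Gaussian form chosen for $P(R)$, the discrete sum used in place of the integral, and all numerical values are hypothetical.

```python
import math


def fret_efficiency(r, r0):
    """Equation (1.8): transfer efficiency at donor-acceptor distance r."""
    return r0**6 / (r**6 + r0**6)


def distance_from_efficiency(e, r0):
    """Invert equation (1.8) to recover the distance from a measured efficiency."""
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)


def gaussian_weights(distances, mean, width):
    """Unnormalised Gaussian P(R) on a grid (an assumed form for P(R))."""
    return [math.exp(-0.5 * ((r - mean) / width) ** 2) for r in distances]


def donor_decay(t, tau_d, r0, distances, weights, i0=1.0):
    """Equation (1.9) with the integral replaced by a normalised discrete sum."""
    total = sum(weights)
    return i0 * sum(
        (w / total) * math.exp(-(t / tau_d) * (1.0 + (r0 / r) ** 6))
        for r, w in zip(distances, weights)
    )


# Hypothetical values: R0 = 50 Angstrom, donor lifetime 4 ns, P(R) centred at 45 Angstrom.
grid = [30.0 + 0.5 * k for k in range(81)]
narrow = gaussian_weights(grid, mean=45.0, width=2.0)
broad = gaussian_weights(grid, mean=45.0, width=8.0)
for t in (0.0, 1.0, 2.0, 4.0):
    print(t, donor_decay(t, 4.0, 50.0, grid, narrow), donor_decay(t, 4.0, 50.0, grid, broad))
```

Comparing the decay curves for the narrow and broad distributions shows how the shape of the donor decay carries information about the width of $P(R)$, which is the sense in which trFRET reports on flexibility.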
For example, conformational and dynamical changes have been observed by Polit et al. [2003] in cyclic AMP receptor protein (CRP). Their FRET results indicate a static conformational change (N and C terminal domains closer together) on cAMP binding. The trFRET measurements indicate an increased mobility in the N terminal region on ligand binding to the apo-CRP.

1.7.5 Other emerging experimental techniques

Here we mention a few emerging techniques that may become useful in the future for studying the protein dynamics of allosteric systems. Neutron spin echo (NSE) has been used to study the ps-ns time scale dynamics of polymers. There is potential for this technique to be used for probing protein dynamics [Bellissent-Funel et al., 1998].

Various fluorescence techniques, which utilise naturally occurring fluorophores (aromatic tyr and trp residues), have been used. One example is time-resolved fluorescence anisotropy decay, which is analogous to NMR spin relaxation. Jimenez et al. [2004] describe three-pulse photon echo shift spectroscopy. In such an experiment a chromophore is photo-excited and the decay of echo peak shifts over time is measured. The spectral density $\rho(\omega)$ is obtained, which describes the amplitude of motion as a function of frequency. This technique is analogous to spin echo NMR.

Terahertz time domain spectroscopy (TTDS) has been shown to probe ps time scale protein motion. This technique uses ultra-fast far infra-red lasers. The absorption spectrum is then measured and can be compared to computational normal mode calculations [Markelz et al., 2002].

1.8 Dynamic allostery

Since the classic 1960s MWC and KNF theories set out the idea of allostery by conformational changes (as discussed in section 1.4), numerous x-ray crystal and NMR structures have shown such conformational changes [Changeux and Edelstein, 2005]. However, as can be seen from the experimental techniques discussed in section 1.7, it is now becoming clear that dynamics are important in protein function. A recent NMR relaxation study by Volkman et al. [2001] found the first evidence of allostery in a single domain protein. This challenged the traditional models, which only considered oligomeric proteins as potentially allosteric. Their study gives experimental evidence for dynamic origins of allostery. In this section we review theories suggesting dynamic allostery in the literature and examples of experimental evidence of dynamic allostery.

Cooper and Dryden [1984] set out a theory of “Allostery without conformational change” in which changes in protein dynamics produce allostery. Cooper and Dryden [1984] considered two classes of dynamical modes of proteins that might contribute to allostery: high frequency, short wavelength modes and low frequency, longer wavelength modes. The former have the advantage of a high density of states, so that small modifications to each mode become amplified by their large number. However, there is evidence from neutron scattering experiments of the protein dynamical transition [Daniel et al., 2003] that motions faster than 100 ps are not required for enzyme function (see section 1.7.3). Volkman et al. [2001] show allosteric regulation by backbone dynamics on a µs-ms timescale by NMR. They argue that ps time scale sidechain motion does not contribute to allostery. Which time scale dynamics allosteric communication utilises is debated (for example see Wand [2001a]). Since high frequency modes are predominantly localised (vibrations affecting only a few atoms) [Kurzyński, 1998], one would expect only the global modes to contribute to allosteric communication across the protein.

Despite these criticisms of their treatment of the high frequency modes, Cooper and Dryden [1984] provide valid in-principle calculations of the low frequency global modes. Their work lays down many of the thermodynamical ideas on which we build in this thesis.
If ligand binding increases the frequencies of vibrations (stiffens the protein), the entropy cost of binding a second ligand is reduced. In their paper Cooper and Dryden [1984] do not consider physically how normal mode vibrations may change, nor do they attempt to apply the ideas to specific proteins. This thesis provides just such physical modelling.

Some early work following Cooper and Dryden [1984] approached the dynamic hypothesis by considering the entropic component of allosteric changes. Reinhart et al. [1989] measured equilibrium constants for the allosteric inhibition of phosphofructokinase at different temperatures in order to find the enthalpic and entropic contributions. They found that the inhibition is entropically determined with the enthalpic contribution compensating (favouring activation). They attribute such differences in entropy of the different states to the dynamical properties discussed by Cooper and Dryden [1984]. Laatikainen and Tuppurainen [1988] suggest a physical origin of torsional entropic allostery arising from changes to the freedom of rotational groups in a molecule. They suggest utilising this to make synthetic allosteric systems.

Today dynamic allostery is more commonly accepted. Ming and Wall [2005] define the ‘allosteric potential’ as the Kullback and Leibler [1951] divergence between conformational distributions. They identify three contributing terms: changes in the mean conformation, eigenvalue spectrum and eigenvectors. They calculate these terms for lysozyme with its ligand tri-N-acetyl-D-glucosamine using normal mode analysis, finding that the mean conformation term was small compared to the changes in the eigenmodes.

Here we highlight a few recent interesting examples of entropic allosteric systems whose thermodynamics have been measured experimentally. Schymkowitz et al. [2001] measure the equilibrium constants for dimerisation of suc1 (suppressor of cyclin dependent kinase 1), which dimerises by ‘domain swapping’.
In domain swapping two identical monomers exchange a domain in order to dimerise. Schymkowitz et al. [2001] find differences in the energetics throughout the protein but no static conformational changes, leading them to the conclusion that it is differences in the dynamics that affect the phosphate binding site. They identify a strained hinge loop that relaxes on dimerisation. Braxton et al. [1996] find that the allosteric enzyme carbamoyl phosphate synthase is entropy driven. After estimating the effect of the change in hydrophobic surface area, they conclude that this cannot account for the measured thermodynamics. They therefore propose that this entropy driven allostery occurs due to changes in internal dynamics. Thermodynamic studies by Liu et al. [1998] concluded that the entropy component is significant in determining the nature and magnitude of the allostery in ATCase (aspartate transcarbamoylase). Recent electrochemical studies by Liu et al. [2005] on myoglobin show an entropic allosteric effect of phosphate dependent imidazole binding with minimal conformational change. Another allosteric protein with minimal conformational change is the HIV-1 protease. Allosteric inhibitors of HIV-1 protease are drug targets since inhibiting HIV-1 protease combats HIV infection [Bowman et al., 2005].

The examples mentioned here show something of the importance and breadth of dynamic allostery effects in proteins. Further evidence of dynamic allostery, from the many computational techniques available to study protein dynamics, will be discussed in section 2.2. Now, however, we introduce some specific biological allosteric systems studied in this thesis.

1.9 Biological systems studied in this thesis

We use the biological systems introduced in this section as test case studies for investigating dynamic allostery through our theoretical modelling. Here we review the biological function of these allosteric systems.

1.9.1 Repressor proteins

Chapter 3 focuses on repressor proteins and specifically on the lac repressor. As a preliminary we review here the biological function of repressor proteins.

Regulation of gene expression determines which genes are activated in different circumstances leading, for example, to the striking differences between different types of cells in eukaryotes despite every cell containing the same DNA. Gene expression can be controlled at many levels. Regulation can occur at the level of protein activation or one step further back at the translation stage. Translation is the process by which proteins are made by decoding messenger RNA. The messenger RNA itself can be degraded in a controlled manner to regulate protein expression. Regulation can also occur by transport and localisation of RNA and during splicing (splicing is removing non-gene encoded regions from the RNA to form the messenger RNA). However, the most important level of regulation occurs at the level of transcription itself [Alberts, 2002, ch 7]. Transcription is the production of an RNA molecule by RNA polymerase using DNA as a template (see figure 1.8) [Thain and Hickman, 2001]. Regulation of transcription is economical since it prevents unwanted products from being synthesised in the first place. Such regulation is seen even in our single celled ancestors, bacteria.

Figure 1.8: Cartoon to show transcription by RNA polymerase. (Labels: ribosome, protein, mRNA, DNA, RNA polymerase.)

This control is achieved by using DNA-binding gene regulatory proteins. These proteins act as gene switches by binding to specific sites on DNA. Repressor proteins bind to DNA to ‘turn off’ genes when the cell does not require their expression while transcriptional activator proteins turn genes ‘on’.
The Jacob and Monod [1961] theory descriptively explained prokaryotic (bacterial) gene expression using the concept of an ‘operon’, which is a section of DNA that is the unit of transcription. It includes the genes (coded DNA) to be expressed plus the ‘operator’, which is the section of DNA that binds to a DNA-binding protein. The repressor protein itself is coded for elsewhere in the chromosome. The repressor bound to the operator prevents the RNA polymerase reaching the ‘promoter’ to initiate transcription by physically blocking it. The promoter is about 100 base pairs of DNA positioned adjacent to the operator and before the coding region. Activators bind to DNA near the promoter site, increasing the probability of RNA polymerase binding to the promoter and initiating transcription.

DNA binding proteins recognise specific binding sites on the DNA by interactions with the bases exposed on the DNA surface in the major and minor grooves, but also by recognising local distortions in the helical geometry of DNA caused by the specific base sequences. Specificity is crucial since regulatory proteins must switch on/off the correct genes. Strength and specificity of binding are achieved by the combination of $\sim 10$ residue contacts to the DNA (by e.g., hydrogen bonds, ionic bonds, hydrophobic interactions). However, the binding must not be so strong that the interaction is permanent, since regulatory proteins need to dissociate in response to changes in the environmental signals they receive.

The binding of repressor proteins to DNA is cooperative and usually allosteric since it is activated depending on the presence of effectors (small ligands that bind to the protein) at sites distant to the DNA binding site. For example, the lac repressor only binds to DNA in the absence of the inducer lactose. When the inducer molecules bind, the lac repressor unbinds from DNA as shown in the cartoon figure 1.9.
Figure 1.9: Cartoon to show lac repressor dimer binding to DNA only in the absence of its inducer lactose.

Some repressors (e.g. the lac repressor) allow gene expression (by unbinding from the operator DNA) only in the presence of an inducer ligand. Other repressors, however, (e.g. the trp repressor) work conversely. In these latter systems the genes are expressed only in the absence of their effector, which in this case is often called a ‘corepressor’ since the repressor only binds to DNA (repressing the genes) when bound to this ligand. The repressor protein with its bound effector is called the ‘holorepressor’ and the protein without a bound effector the ‘aporepressor’. In some systems it is the aporepressor that binds to DNA and in others the holorepressor.

Commonly DNA binding proteins are dimers or tetramers (as a dimer of dimers) [Alberts, 2002]. For example, the lac repressor tetramerises. The crystal structure of the lac repressor tetramer is shown in figure 1.10.

Figure 1.10: X-ray crystal structure of the lac repressor tetramer shown as a ribbon display. Each monomer is shown in a different colour.

Of the hundreds of DNA binding proteins with known structures, the DNA binding regions have been classified into a small number of structural motifs [Alberts, 2002, ch 7]. A motif is a distinctive region of the three-dimensional structure associated with a particular function. Figure 1.11 shows the main known DNA binding motifs. Figure 1.11a shows the helix-turn-helix motif. This consists of two alpha helices at a fixed angle to each other. One example of this motif is found in the lac repressor (see section 3.1). Figure 1.11b shows an example of a zinc finger. Zinc fingers contain an alpha helix and a beta sheet (or two alpha helices) stuck together with zinc atoms. One example is a mouse gene regulatory protein. Figure 1.11c shows a beta sheet binding motif, which consists of two beta strands.
An example of a beta sheet motif is the met repressor. Figure 1.11d shows a leucine zipper. This consists of two alpha helices (one from each monomer of a dimer) that form a short coiled-coil held together by hydrophobic residues, which are often leucines. One example is the yeast Gcn4 protein. Figure 1.11e shows a helix-loop-helix. This is a short alpha helix connected to a longer alpha helix by a flexible loop.

Figure 1.11: Cartoons of different known DNA binding motifs. Figures taken from Alberts [2002, ch 7]. (a) Helix-turn-helix (b) Zinc finger (c) Beta sheet (d) Leucine zipper (e) Helix-loop-helix

As we have seen, the binding of repressor proteins to DNA is cooperative since the presence of the inducer affects the affinity for DNA. They are often allosteric too since the inducer binding site is distant from the DNA binding site. We are interested in understanding and predicting how biology achieves such allosteric cooperativity. One of the key unanswered questions for the lac repressor and other proteins is the mechanism of cooperativity of ligand binding [Matthews, 1996]. We address such questions in chapter 3.

1.9.2 Proteins with coiled-coils

In chapter 4 we consider dynamic allostery in coiled-coils with a particular focus on the molecular motor dynein. In this section we introduce coiled-coils in biology and the molecular motor dynein.

Figure 1.12: An example of a coiled-coil (taken from PDB ID: 1C1G [Whitby and Phillips, 2000]).

Coiled-coils of alpha helices appear to be a common motif in molecular biology. An example of this structural motif is shown in figure 1.12. Coiled-coils are found in several diverse allosteric proteins. For example, the coiled-coil in figure 1.12 is from the muscle protein tropomyosin. Another example of a protein containing coiled-coils is the chaperone ClpB. Chaperones are proteins that assist other proteins to fold correctly.
A different example of proteins containing coiled-coils is the bacterial chemotaxis receptors, which are transmembrane proteins. These receptors transmit information into the cell, allowing the bacteria to sense (‘smell’) chemicals in their environment. The bacteria respond by swimming towards or away from the sensed chemicals. Coiled-coils are also found in the molecular motor dynein, which we now introduce.

Dynein molecular motor

Dynein is a molecular motor with a coiled-coil motif that transmits an allosteric signal for microtubule binding. Cytoplasmic dynein transports vesicles along microtubules and axonemal dynein is responsible for the beating of cilia and flagella. Dyneins are the largest and among the fastest known molecular motors (moving microtubules in vitro at 14 µm/s [Alberts, 2002]). Dynein is made up of one to three ‘head(s)’ with a ‘stalk’ (or ‘B-link’), which binds the (‘B’) microtubule, and a ‘stem’ (or ‘tail’), which binds the cargo (or the ‘A’ microtubules in the case of flagella). As shown in figure 1.13b the head is the central circular region made up of a ring of 6 ‘AAA’ subunits. The stalk is a 15.5 nm coiled-coil, which binds a microtubule at its end. The coiled-coil region is structurally conserved in all known dynein sequences [Gee et al., 1997].

The mechanism for force generation is not entirely understood but it is thought that a conformational change due to ATP binding and hydrolysis causes a ‘power stroke’ associated with release of ADP and phosphate (P$_i$). The pre-power stroke state is the ADP·P$_i$ bound state and the post-power stroke state the free (apo) one. Sliding of microtubules in flagella is also thought to be due to coordinated activity of dynein heads along the microtubules with several dynein molecules working together [Gee and Vallee, 1998]. The ATP binding site in the head is $\sim 20$ nm away from the microtubule binding site at the end of the stalk, yet the binding of the microtubule is ATP sensitive.
This raises the allosteric question of how the ATP binding site and the microtubule binding site communicate [Burgess et al., 2004a; Gee et al., 1997; Lindemann and Hunt, 2003; Gee and Vallee, 1998].

Electron microscopy images of dynein-c by Burgess et al. [2003] show differences in the conformations of the two states. They suggest an origin for the power stroke and, significantly for this thesis, suggest that the stem and stalk are flexible and that the stiffness of the stalk changes depending on the ATP binding state. Figure 1.13a shows the two states of dynein with and without ADP·V$_i$ (thought to mimic the ADP·P$_i$ bound state pre-power stroke). Alignment of the tails in the images suggests a mean static displacement of 15 nm of the tip of the stalk. As well as this static conformational change, the flexibility of the tail and stalk was investigated. These flexibility studies can be seen in cartoon movies the group generated, which can be viewed on their website http://www.leeds.ac.uk/bms/research/muscle/dynein.

Figure 1.13: Structure and power stroke of dynein. a, Electron microscopy averaged images of the ADP·V$_i$ bound and apo states of dynein (left views). b, Speculative structure of the molecule suggesting an origin of the power stroke. c, undocked right view. Figure from Burgess et al. [2003].

1.10 Open research questions

It is clear that proteins are rich in functionally important dynamics, which are beginning to be probed by a wide variety of experimental techniques. It is therefore an important and exciting time to seek to understand theoretically the relationship between protein dynamics and function. Allosteric effects, which regulate or facilitate function, are being discovered in numerous proteins from vastly different families. It has even been suggested that all proteins may be potentially allosteric [Gunasekaran et al., 2004].
It is therefore of great interest and importance to understand the mechanisms of allostery used by biological proteins. Such understanding could enhance drug design and biomimetics. For example, Choi et al. [2005] have artificially controlled the allostery of a hybrid protein-DNA molecule by increasing the tension exerted on the protein by binding a complementary strand to the single stranded DNA, affecting the statics and dynamics of the protein.

What are the fundamental physical principles driving allostery in proteins? How do these macromolecules communicate across such large molecular distances? The evidence is mounting for an important role for dynamic allostery but which parts of the vast spectrum of motions are used? What time scale motions are important for allostery? How are dynamics affected by ligand binding? How can we decode the rich and complex protein dynamical data in an illuminating and quantitative manner? What level of coarse-graining is appropriate for theoretical models of allostery?

In this thesis we seek to understand the fundamental physical principles underlying dynamic allosteric mechanisms. We explore degrees of coarse-graining with the aim of finding biologically relevant physical models for specific proteins that capture, explain and predict allosteric mechanisms.

1.11 Thesis outline

In chapter 2 we introduce various computational methods used to study protein dynamics at different levels of coarse-graining. We discuss appropriate levels of coarse-graining for different questions. We introduce our methodology and show where our approach fits into this wider picture of the field. In chapter 3 we present our basic model of a dimer of rigid monomers. We develop this model using the lac repressor as a test case and relate the coarse-grained model to atomistic details. In chapter 4 we describe our elastic rods model of allostery in alpha helical coiled-coils.
As an example system we focus specifically on the molecular motor dynein. In chapter 5 we explore a model of allostery in which the localised fast modes of vibration are coupled to the delocalised slow allosteric modes. We explore applications of these concepts to the met repressor protein. We conclude in chapter 6 and discuss ideas for future work including applications to bacterial chemotaxis receptors.

Chapter 2

Levels of coarse-graining in protein modelling

2.1 Overview

Einstein expressed the principle that guides coarse-graining by “Everything should be made as simple as possible but not simpler”. In this chapter we review the various computational techniques used to study dynamics in biomolecules, including those used in this thesis. We discuss what levels of coarse-graining are appropriate for different questions. We introduce our methodology, showing where our approach fits into the wider field. As we saw in section 1.7, different experimental techniques often probe different time scales. In this chapter we shall see that different levels of modelling can also probe dynamics on different time scales. We start by considering the finest grain studies at the quantum mechanical level and then consider methods in order of increasing coarseness.

2.2 Computational techniques

2.2.1 Quantum level

The highest resolution calculations that have been done on proteins are at the quantum level. Very detailed, accurate quantum calculations (ab initio molecular dynamics) using, for example, density functional theory (DFT) or Hartree-Fock, can be carried out on small molecules (fewer than 20 atoms) at fast time scales (sub-picosecond). Such calculations are too computationally expensive to perform for most biomolecules, which are macromolecules consisting of hundreds to thousands of atoms. However, such calculations on small systems are used to test and improve existing molecular dynamics force fields.
Larger systems can be studied by using the quantum mechanics/molecular mechanics (QM/MM) method. This partitions the system into a chemically active region that is treated at the quantum level, and the rest that is treated with empirical molecular dynamics (MD) potentials. The main advantage of such ab initio methods is that the interatomic forces are calculated from the electronic structure rather than relying on approximate empirical models such as those of conventional molecular dynamics.

Quantum level calculations are appropriate when studying certain protein functions where electronic structure is important. Examples of such functions include: ion channels, metal ions in enzymes, bond breaking or formation and proton or electron transport [Carloni et al., 2002]. Vibrational energy relaxation (VER) is an example of protein dynamics requiring a quantum mechanical approach. VER occurs on a sub-picosecond timescale after a protein is excited by ATP, ligand binding or photons. VER is the transfer of this energy from the local site to the rest of the protein and the solvent surrounding it. For example, quantum mechanical perturbation theory and non-equilibrium methods have been used to calculate a VER time of $\sim 1$ ps for a CD bond stretching mode in the protein cytochrome-c [Fujisaki and Straub, 2005].

Allosteric protein function, however, is expected to depend on timescales that are too long to be simulated by ab initio quantum mechanical calculations. Therefore, such a high resolution technique is not appropriate for the questions that interest us in this thesis.

2.2.2 Molecular Dynamics

Molecular dynamics (MD) is the most common computational technique used for studying protein dynamics. Since quantum mechanical calculations are too time consuming, MD simulations use an empirical potential function (or force field) for interactions between the atoms in the protein.
Such a potential function is parameterised by fitting to experimental data for a known molecule. The limitation of setting the parameters empirically is that they are only suitable for the fitted system, so new parameters need to be introduced for different types of molecules (e.g., ligands or nucleic acids). The input set of atomic coordinates is provided by a known crystal or NMR structure. The empirical function is then used to calculate the multidimensional potential energy hypersurface of the protein (the potential energy as a function of the coordinates of all the atoms in the protein). The dynamics are simulated by differentiating the potential energy hypersurface to obtain interatomic forces and solving Newton's laws of motion to give the positions and velocities of the atoms as a function of time. A typical potential function [Levitt, 1983] can be written as,

$$U = \sum_{\text{1,2 pairs}} \frac{1}{2} K_b (b - b_0)^2 + \sum_{\text{bond angles}} \frac{1}{2} K_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedral angles}} \sum_n \frac{1}{2} K_{\phi n} \left[1 - \cos(n\phi - \delta_n)\right] + \sum_{\text{non-bonded } i,j \text{ pairs}} 4\epsilon_{ij} \left[\left(\frac{\sigma_{ij}}{r}\right)^{12} - \left(\frac{\sigma_{ij}}{r}\right)^{6}\right] + \frac{q_i q_j}{\epsilon r}.$$

The dihedral potential is expressed as a cosine series of $n$ modes, each with an offset $\delta_n$, in order to describe a discrete number of allowed dihedral angle states (usually two: cis and trans). Newton's equations of motion;

$$m_i \frac{d^2 \mathbf{r}_i}{dt^2} = -\nabla_i U(\mathbf{r}_1, \mathbf{r}_2, ..., \mathbf{r}_N), \tag{2.1}$$

are solved for each atom $i = 1...N$ with mass $m_i$ and position $\mathbf{r}_i$. $U(\mathbf{r}_1, \mathbf{r}_2, ..., \mathbf{r}_N)$ is the potential function, which depends on the positions of the $N$ particles in the system. Initially, the atoms are assigned velocities from a Maxwell distribution at a low temperature. The system is equilibrated by integrating the equation of motion over $\sim 10-50$ ps, during the first stage of which the temperature is increased to the required value by incrementally increasing the velocities of all the atoms. The system is said to be equilibrated once properties such as the kinetic energy remain constant over further time steps.
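The integration of equation (2.1) can be sketched for the simplest term of the potential, a single harmonic bond $\frac{1}{2}K_b(b - b_0)^2$ between two atoms moving in one dimension. This is a minimal sketch only: the velocity Verlet scheme is an assumed choice of integrator (the text above does not specify one), and the function names, reduced units and parameter values are hypothetical.

```python
def bond_energy_and_forces(x, kb, b0):
    """Harmonic bond term U = 0.5*Kb*(b - b0)^2 and the forces -dU/dx on each atom."""
    b = x[1] - x[0]
    energy = 0.5 * kb * (b - b0) ** 2
    f_on_second = -kb * (b - b0)
    return energy, [-f_on_second, f_on_second]


def velocity_verlet(x, v, masses, kb, b0, dt, n_steps):
    """Integrate m_i d^2x_i/dt^2 = -dU/dx_i (velocity Verlet, an assumed integrator)."""
    x, v = list(x), list(v)
    _, f = bond_energy_and_forces(x, kb, b0)
    total_energies = []
    for _ in range(n_steps):
        # Half-step velocity update, full-step position update.
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, masses)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # Recompute forces at the new positions and finish the velocity update.
        potential, f = bond_energy_and_forces(x, kb, b0)
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, masses)]
        kinetic = sum(0.5 * mi * vi * vi for mi, vi in zip(masses, v))
        total_energies.append(kinetic + potential)
    return x, v, total_energies


# Hypothetical reduced units: a bond stretched by 0.2 from its rest length.
x_end, v_end, energies = velocity_verlet(
    x=[0.0, 1.2], v=[0.0, 0.0], masses=[1.0, 1.0], kb=1.0, b0=1.0, dt=0.01, n_steps=2000
)
print(min(energies), max(energies))
```

Monitoring the total energy along the trajectory is one simple check that the time step is small enough for the fastest vibration in the system.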
The trajectory (the positions of the atoms at each time step) is calculated over a period of time. Times of ps-ns are possible, depending on the size of the protein. Ideally, solvent atoms are included explicitly but larger proteins can be studied using an implicit solvent model. Such a model simulates the effect of solvent on the protein by treating the solvent as a bulk medium with dielectric constant $\epsilon$. A commonly used model is that due to Born, in which the electrostatic solvation energy is given by the work done in transferring a charge $q$ (with radius $a$) from vacuum to a solvent of dielectric constant $\epsilon$. This is given by $U = -\frac{q^2}{2a}\left(1 - \frac{1}{\epsilon}\right)$ [Leach, 2001].

MD simulations can provide information about motions on time scales up to ns (depending on the size of the system being studied). One drawback of MD simulations is that many functional motions occur on timescales at the limit or beyond the limit of what can be calculated in a reasonable amount of time. Even for small systems, which can be run for a long time, we cannot be sure that the lowest frequency motions have been captured. For example, the lowest mode seen in a 500 ps MD simulation of acetylcholinesterase by Wlodek et al. [1997] is ambiguous. The slight tightening motion of the dimer may be due to relaxation from the crystal structure. Asymmetric motion of the monomers implied the simulation was not long enough. The functionally important opening of a ligand binding gorge appeared wide enough only 5% of the time. However, a later 10 ns simulation by Shen et al. [2002] showed a wider distribution of gorge radii and revealed two major substructures. This longer simulation showed that at least half the protein is involved in this motion, with local sidechain motions riding on collective dynamics. It is these low frequency, global modes that are important for allosteric function of proteins.
The time step required to obtain accurate MD trajectories is $\sim 10^{-15}$ s, leading to CPU times $\sim 10^{6}$ s for a ∼10 ns trajectory. Another limitation of MD is that the quality of the results is dependent on the quality of the force-field used (quantum level simulations are being used to improve such force-fields, as discussed in section 2.2.1). A further, theoretical problem with molecular dynamics is that the potential is pairwise. No many-body system can be completely described by pair potentials. Coarse-graining a many-body interaction by pair potentials can lead to density dependent potentials that correctly reproduce the structure but not the internal energy [Louis, 2002].

Vast amounts of dynamics information are contained in the MD trajectories. Due to the timescale best probed by MD, the technique is especially suited to finding the minimised structure and to docking ligands. Lindorff-Larsen et al. [2005] present a method they call dynamic ensemble refinement (DER) that combines MD simulations and NMR data. They suggest this method will improve understanding of the effects of dynamics on function. However, as we have discussed, MD is not necessarily able to capture the functionally important dynamics for allosteric proteins.

2.2.3 Principal component analysis

Whilst watching the animation produced from an MD simulation can be illuminating, extracting meaningful quantities from the simulation data is not straightforward. In this section we introduce one method of extracting physically meaningful quantities from MD. Principal component analysis (PCA) (also called quasiharmonic normal mode analysis) uses the quasiharmonic approximation to extract dynamic modes from MD trajectories. The covariance matrix $\sigma$ with elements

\[
\sigma_{ij} = \langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle) \rangle, \tag{2.2}
\]

is calculated, where $x_1, \ldots, x_{3N}$ are the Cartesian coordinates of the atoms from an MD trajectory. The mass weighted covariance matrix, $\sigma' = \mathbf{M}^{1/2} \sigma \mathbf{M}^{1/2}$, is then obtained.
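The covariance matrix of equation (2.2) and its mass weighting can be sketched in a few lines of plain Python. The two-coordinate toy trajectory, the masses and the function names `covariance` and `mass_weight` below are hypothetical and serve only to make the definitions concrete; a real analysis would use the $3N$ coordinates of every frame of an MD trajectory.

```python
def covariance(traj):
    """Covariance sigma_ij = <(x_i - <x_i>)(x_j - <x_j>)> over the frames of traj.

    traj is a list of frames; each frame is a list of coordinates.
    """
    n_frames = len(traj)
    n_coords = len(traj[0])
    mean = [sum(frame[i] for frame in traj) / n_frames for i in range(n_coords)]
    sigma = [[0.0] * n_coords for _ in range(n_coords)]
    for frame in traj:
        dev = [frame[i] - mean[i] for i in range(n_coords)]
        for i in range(n_coords):
            for j in range(n_coords):
                sigma[i][j] += dev[i] * dev[j] / n_frames
    return sigma


def mass_weight(sigma, masses):
    """sigma' = M^(1/2) sigma M^(1/2) for a diagonal mass matrix M."""
    n = len(sigma)
    return [[(masses[i] * masses[j]) ** 0.5 * sigma[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: two perfectly correlated coordinates, four frames.
toy_traj = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [1.0, 1.0]]
toy_masses = [1.0, 4.0]   # one mass per coordinate (assumed values)
sigma = covariance(toy_traj)
sigma_mw = mass_weight(sigma, toy_masses)
```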
The inertial matrix, $\mathbf{M}$, is diagonal with elements $M_{1,1} = M_{2,2} = M_{3,3} = m_1$, $M_{4,4} = M_{5,5} = M_{6,6} = m_2$, ..., $M_{3N-2,3N-2} = M_{3N-1,3N-1} = M_{3N,3N} = m_N$, where $m_i$ is the mass of atom $i$. The mass weighted covariance matrix is diagonalised to obtain the eigenvalues $\lambda_i$, giving the $3N$ quasiharmonic frequencies $\omega_i = \sqrt{k_B T / \lambda_i}$. Six of these frequencies (corresponding to three translational and three rotational modes of the whole molecule) will be zero. The advantage of PCA over normal mode analysis (described in section 2.2.5) is that the PCA modes take into account the full anharmonic motion seen in the MD simulation without using the full harmonic approximation. Note however that the MD potential itself, equation (2.1), is a combination of anharmonic terms and terms in the harmonic approximation.

The main disadvantage of PCA is that it is not clear whether a particular MD simulation is long enough for PCA to capture the slowest modes. Also, since full MD needs to be run, this limits the size of the systems that can be studied by this method. Therefore, PCA has all the disadvantages of MD when it comes to addressing allosteric questions. However, for small systems MD and PCA can be useful. For example, Tournier and Smith [2003] use PCA to study the protein dynamical transition in myoglobin seen by neutron scattering experiments (as reviewed in section 1.7.3). They find that the onset of the dynamical transition is characterised by the appearance of a double-well principal component.

2.2.4 Calculating entropies

The physical quantity we are most interested in is the entropy. In this section we introduce methods used to calculate entropy from MD simulations. Enthalpy calculations follow from the evaluation of the potential function at specific configurations. However, the entropy is harder to calculate.
The standard one dimensional quantum mechanical harmonic oscillator vibrational partition function, $Z = e^{-\hbar\omega/(2k_BT)} / (1 - e^{-\hbar\omega/(k_BT)})$, gives

\[
S_q = \frac{\hbar\omega/T}{e^{\hbar\omega/(k_BT)} - 1} - k_B \ln\left(1 - e^{-\hbar\omega/(k_BT)}\right). \tag{2.3}
\]

MD is a classical approach. In the classical limit, $\hbar\omega \ll k_BT$, the entropy is

\[
S_c = k_B \left(1 - \ln\frac{\hbar\omega}{k_BT}\right). \tag{2.4}
\]

However, the classical limit breaks down for the higher frequencies, such as hydrogen bond stretching. Some frequencies calculated by MD simulations are in this range. This leads to aberrant contributions to the entropy. Figure 2.1 shows that the quantum mechanical equation (2.3) tends to zero for high frequencies, but the classical equation (2.4) gives unphysical negative values for high frequencies.

[Figure 2.1: Solid red curve: quantum mechanical simple harmonic oscillator entropy, equation (2.3). Dashed green curve: classical simple harmonic oscillator entropy, equation (2.4). Axes: $TS/k_BT$ against $hf/k_BT$.]

Schlitter [1993] introduced a simple equation allowing an estimate of the configurational entropy to be calculated from an MD simulation. The equipartition theorem, $m\omega^2\langle x^2\rangle = k_BT$, allows the entropy to be written in terms of the covariance matrix $\sigma$, which can be calculated from an MD trajectory. However, the covariance matrix is singular due to the zero frequency whole body translations and rotations. Schlitter [1993] introduced an approximation to deal with this and with the problems with the classical expression, equation (2.4), discussed above. Generalised to multiple dimensions the Schlitter equation is,

\[
S_s = \frac{k_B}{2} \ln\left| \mathbf{I} + \frac{k_BT e^2}{\hbar^2} \mathbf{M}\sigma \right|, \tag{2.5}
\]

where $\mathbf{I}$ is the identity matrix and $\mathbf{M}$ is the mass matrix. Schlitter showed this equation gives an upper limit to the configurational entropy. The Schlitter equation ensures the higher frequencies have a negligible effect by introducing a smooth cutoff function.
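The behaviour of equations (2.3) to (2.5) for a single oscillator can be checked numerically. The sketch below works in the dimensionless variable $x = \hbar\omega/(k_BT)$ and returns entropies in units of $k_B$; for the one-dimensional Schlitter expression it uses the equipartition result $m\langle x^2\rangle = k_BT/\omega^2$, so that $S_s/k_B = \frac{1}{2}\ln(1 + e^2/x^2)$. The function names and the sample values of $x$ are illustrative assumptions.

```python
import math


def s_quantum(x):
    """Quantum oscillator entropy S_q/k_B, equation (2.3), x = hbar*omega/(k_B*T)."""
    return x / math.expm1(x) - math.log(-math.expm1(-x))


def s_classical(x):
    """Classical oscillator entropy S_c/k_B, equation (2.4)."""
    return 1.0 - math.log(x)


def s_schlitter(x):
    """One-dimensional Schlitter entropy S_s/k_B, equation (2.5) with m*<x^2> = k_B*T/omega^2."""
    return 0.5 * math.log(1.0 + math.e ** 2 / x ** 2)


# Compare the three expressions from low to high frequency (assumed sample points).
sample_x = (0.01, 1.0, 3.0, 5.0)
table = [(x, s_quantum(x), s_classical(x), s_schlitter(x)) for x in sample_x]
for x, sq, sc, ss in table:
    print(f"x = {x:5.2f}  S_q = {sq:8.4f}  S_c = {sc:8.4f}  S_s = {ss:8.4f}")
```

At low frequency all three expressions agree, the classical form becomes negative once $x > e$, and the Schlitter form stays above the quantum result while decaying smoothly towards zero.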
The Schlitter equation is thus a simple model of the quantum mechanical cutoff without using the full quantum equation (2.3). Harris et al. [2001] use the Schlitter approach to calculate entropy changes of a drug binding to DNA. Two drug molecules bind cooperatively to non-adjacent sites on DNA. Harris et al. [2001] show that the cooperativity of the system is entropically driven. They plot the calculated entropy as a function of the simulation time. This shows the increase in entropy as the time window is increased, allowing the inclusion of lower frequency motions, and seems to tend to a limit $S_\infty$. They suggest an empirical relationship, $S(t) = S_\infty - \alpha / t^{2/3}$, and use this to extrapolate the calculated entropy from their 5 ns MD simulations to the expected result for $S_\infty$.

Andricioaei and Karplus [2001] have shown that the Schlitter approximation is not needed if PCA is run and the $3N - 6$ nonzero quasiharmonic frequencies $\omega_i = \sqrt{k_BT/\lambda_i}$ are put straight into equation (2.3). This method does not experience the problems the classical approach has with the high frequency modes. This is because the full quantum mechanical harmonic oscillator equation (2.3) is used, which reduces to the classical limit for small $\omega$ and tends to zero for high $\omega$.

Jusuf et al. [2003] use a 1 ns MD simulation followed by quasiharmonic normal mode analysis (PCA) to calculate the configurational entropies of cooperative glycopeptide antibiotics (using equation (2.3)). These antibiotics bind to ligands in the membranes of bacteria. Ligand binding induces dimerisation. The process is interesting in that the antibiotics do not show significant structural changes. Jusuf et al. [2003] study three such antibiotics and find that both ligand binding and dimerisation reduce the amplitude of fluctuations. Their calculations show that the entropy changes are sufficient to account for the experimentally observed cooperativity.
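The extrapolation to infinite simulation time described above for the Schlitter entropies can be sketched as a straight-line fit, assuming the empirical relationship is read as $S(t) = S_\infty - \alpha\, t^{-2/3}$ (the form that tends to a limit at long times). The data below are synthetic, generated from assumed values $S_\infty = 10$ and $\alpha = 2$ in arbitrary units, and are not results from Harris et al. [2001]; the function name `fit_entropy_limit` is likewise hypothetical.

```python
def fit_entropy_limit(times, entropies):
    """Least-squares fit of S(t) = S_inf - alpha * t**(-2/3); returns (S_inf, alpha)."""
    u = [t ** (-2.0 / 3.0) for t in times]      # S is linear in u = t^(-2/3)
    n = len(u)
    u_mean = sum(u) / n
    s_mean = sum(entropies) / n
    slope = (sum((ui - u_mean) * (si - s_mean) for ui, si in zip(u, entropies))
             / sum((ui - u_mean) ** 2 for ui in u))
    s_inf = s_mean - slope * u_mean             # intercept at u = 0, i.e. t -> infinity
    return s_inf, -slope


# Synthetic entropies (arbitrary units) for time windows of 1 to 5 ns.
times_ns = [1.0, 2.0, 3.0, 4.0, 5.0]
synthetic_s = [10.0 - 2.0 * t ** (-2.0 / 3.0) for t in times_ns]
s_inf, alpha = fit_entropy_limit(times_ns, synthetic_s)
```

With real data the scatter about the fitted line gives some indication of how well the assumed power law describes the convergence.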
2.2.5 Normal mode analysis

However, many allosteric proteins are too large for MD and PCA techniques. A further level of coarse-graining is to use the harmonic approximation to the potential. Normal mode analysis (NMA) decomposes complex motion into a sum of independent vibrational modes within the harmonic approximation. NMA assumes the displacements of the atoms about their equilibrium positions are small enough for the potential $U$ (equation (2.1)) to be approximated by a sum of quadratic terms in the atomic coordinates $x_i$ (by performing a Taylor series expansion about the equilibrium). This allows the harmonic approximation of Newton's equations of motion to be written as

\[
\mathbf{M}\ddot{\mathbf{x}} = -\mathbf{K}\mathbf{x}, \tag{2.6}
\]

where $\mathbf{x}$ is the vector of the $3N$ atomic coordinate displacements, $\mathbf{M}$ is the mass matrix and $\mathbf{K}$ is the matrix of force constants given by $K_{ij} = \partial^2 U(x_1, \ldots, x_{3N}) / \partial x_i \partial x_j$ evaluated at equilibrium. In the harmonic approximation these second derivatives will be constants. As long as Cartesian coordinates are used, $\mathbf{M}$ is diagonal with elements $M_{1,1} = M_{2,2} = M_{3,3} = m_1$, $M_{4,4} = M_{5,5} = M_{6,6} = m_2$, ..., $M_{3N-2,3N-2} = M_{3N-1,3N-1} = M_{3N,3N} = m_N$, where $m_i$ is the mass of atom $i$. Since equation (2.6) is harmonic it can be expressed as $(\omega^2\mathbf{M} - \mathbf{K})\mathbf{x} = 0$, where $\omega$ is the eigenfrequency and $\mathbf{x}$ is the eigenvector. In NMA computations the mass weighted Hessian matrix,

\[
\mathbf{H} = \mathbf{M}^{-1/2}\mathbf{K}\mathbf{M}^{-1/2}, \tag{2.7}
\]

is calculated. The eigenvalues $\lambda_i$ and eigenvectors are then calculated by solving $|\mathbf{H} - \lambda\mathbf{I}| = 0$. Diagonalising the mass weighted Hessian matrix $\mathbf{H}$ gives $\mathbf{U}$ such that $\mathbf{D} = \mathbf{U}^{-1}\mathbf{H}\mathbf{U}$, where $\mathbf{D}$ is the diagonal matrix of eigenvalues and $\mathbf{U}$ is the matrix of eigenvectors. Six of the eigenvalues will be zero, corresponding to the translational and rotational modes of the whole system. The frequency $\nu_i$ of each of the $3N - 6$ normal modes is calculated from the non-zero eigenvalues,

\[
\nu_i = \frac{\sqrt{\lambda_i}}{2\pi}. \tag{2.8}
\]

These frequencies can then be used to calculate the vibrational entropy from the quantum or classical simple harmonic oscillator partition function (see equation (2.3)). NMA can predict x-ray crystal B-factors, $B = 8\pi^2\langle\Delta r_n^2\rangle$, for each atom, $n$, by using,

\[
\langle\Delta r_n^2\rangle = \frac{k_BT}{m_n} \sum_{i=7}^{3N} \left| \frac{\alpha_{ni}}{\omega_i} \right|^2, \tag{2.9}
\]

where $\alpha_{ni}$ is the projection of the $i$th normal mode, with frequency $\omega_i$, on the displacement vector for the $n$th atom [Brooks III et al., 1988].

NMA has been used to study allosteric proteins, for example, the comparison of normal modes of ATCase (aspartate transcarbamylase) in the R and T states [Thomas et al., 1996]. These authors find modes in the direction of the allosteric conformational change and differences in the flexibility of the two states. Ma and Karplus [1998] studied the allosteric mechanisms in GroEL by normal mode analysis. They find ATP binding causes an increase in flexibility and suggest how different modes could cause structural changes leading to the positive and negative cooperativity seen in GroEL.

NMA has advantages over MD in that it can be performed on larger systems. Within the harmonic approximation used, NMA can be calculated exactly and captures the lowest frequencies, without the problem MD has of knowing whether it has been run for long enough to capture the lowest modes. Theoretical limitations of NMA arise from the harmonic approximation (which can break down for 'floppy' proteins) and from neglecting solvent. For systems small enough to be studied by MD, PCA (see section 2.2.3) can capture more anharmonic motion. Although not as computationally expensive as MD, NMA also has computational limitations. NMA of large molecules can take up a large amount of memory (e.g., 2 GB for a 4000 atom system using the Amber MD package [Case et al., 2004]). The structure needs to be minimised by MD algorithms before running NMA. This minimisation can take a long time for large proteins.
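The NMA recipe of equations (2.6) to (2.8) can be followed by hand for the smallest possible example: two masses joined by one spring and restricted to one dimension. This toy system, the parameter values and the function name `normal_modes_two_masses` are assumptions for illustration only. In one dimension there is a single zero eigenvalue (the overall translation) rather than the six of a three-dimensional molecule.

```python
import math


def normal_modes_two_masses(k, m1, m2):
    """Eigenvalues of H = M^(-1/2) K M^(-1/2) for two masses on a spring in 1D.

    The force constant matrix is K = [[k, -k], [-k, k]].
    """
    h11 = k / m1
    h22 = k / m2
    h12 = -k / math.sqrt(m1 * m2)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = 0.5 * (h11 + h22)
    half_gap = math.sqrt(0.25 * (h11 - h22) ** 2 + h12 ** 2)
    return mean - half_gap, mean + half_gap


# Assumed reduced units: k = 2, m1 = 1, m2 = 3.
lam_zero, lam_vib = normal_modes_two_masses(k=2.0, m1=1.0, m2=3.0)
nu_vib = math.sqrt(lam_vib) / (2.0 * math.pi)   # equation (2.8)
```

The non-zero eigenvalue equals $k(1/m_1 + 1/m_2)$, the familiar reduced-mass result for a diatomic vibration.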
2.2.6 Elastic Network Model

In this section we introduce further coarse-graining techniques that overcome the computational problems experienced with the techniques described so far. These methods simplify the interactions within a protein and strip away residue specific details. In section 3.4.3 we compare our calculations to results using the elastic network model (ENM).

The Tirion potential

Tirion [1996] started a revolution in normal mode analysis by replacing the full MD potential (equation (2.1)) by a single-parameter potential, which has come to be known as the ‘Tirion potential’. This is a pairwise Hookean potential given by,

\[
U = \sum_{(a,b)} \frac{C}{2} \left( |\mathbf{r}_{a,b}| - |\mathbf{r}^0_{a,b}| \right)^2, \tag{2.10}
\]

where $\mathbf{r}_{a,b}$ is the vector connecting atoms $a$ and $b$. The superscript 0 refers to the initial configuration, which need not be minimised. $C$ is a phenomenological constant that is assumed to be the same for all interacting pairs. The sum is over all interacting pairs of atoms, defined as those closer than an arbitrary cutoff. Fits to NMA using the full potential (equation (2.1)) give appropriate values for the cutoff distance $R_c \sim 2$ Å and $C \sim 0.3\,k_BT$ Å$^{-2}$, giving a universal ‘bond strength’ of $CR_c^2 \sim k_BT$. Tirion [1996] shows that such a simple potential reproduces the normal modes and theoretical B-factors calculated from the full potential (equation (2.1)) remarkably well. The Tirion potential is most accurate for the lowest frequency modes, becoming inappropriate for calculating the high frequency modes where the details of the potential become important. This surprisingly simple potential is sufficient for the low frequency modes since these involve coherent motions of large groups of atoms. Due to the central limit theorem, the sum of the detailed interactions of large numbers of atoms approaches a Gaussian form. With no minimisation and the use of such a simple potential, huge savings in central processing unit (CPU) times are made.
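A minimal sketch of equation (2.10) is given below: interacting pairs are selected once from the reference structure using a distance cutoff, and the energy of any displaced configuration is then a sum of identical Hookean terms. The three-atom reference structure, the cutoff, the value of $C$ (all in reduced units) and the function names are hypothetical choices for illustration.

```python
import math


def tirion_pairs(ref, cutoff):
    """Pairs (a, b) of atoms closer than cutoff in the reference structure."""
    return [(a, b)
            for a in range(len(ref))
            for b in range(a + 1, len(ref))
            if math.dist(ref[a], ref[b]) < cutoff]


def tirion_energy(coords, ref, pairs, c=1.0):
    """U = sum over pairs of (C/2) * (|r_ab| - |r0_ab|)**2, equation (2.10)."""
    u = 0.0
    for a, b in pairs:
        stretch = math.dist(coords[a], coords[b]) - math.dist(ref[a], ref[b])
        u += 0.5 * c * stretch ** 2
    return u


# Hypothetical reference: three collinear atoms with unit spacing.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pairs = tirion_pairs(ref, cutoff=1.5)           # only nearest neighbours interact
moved = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.2, 0.0, 0.0)]
u_ref = tirion_energy(ref, ref, pairs)
u_moved = tirion_energy(moved, ref, pairs)
```

Because the reference structure is by construction the minimum of the potential, no energy minimisation is needed before a normal mode calculation.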
Tirion [1996] compares the CPU times for several proteins, finding her approach gives results two to three orders of magnitude quicker than standard NMA. Consequently, larger proteins may be studied using such a simplified potential.

Alternative simple potential models

Hinsen [1998] presents a similar harmonic pair potential to equation (2.10), which uses a distance dependent force constant,

\[
k(\mathbf{r}^0_{a,b}) = c \exp\left( -\frac{|\mathbf{r}^0_{a,b}|^2}{r_0^2} \right), \tag{2.11}
\]

instead of $C/2$ in equation (2.10). The sum in equation (2.10) is then performed over all pairs, with an exponential decay instead of Tirion's sharp cutoff. A value of $r_0 = 0.3$ nm gives the best agreement with standard NMA results. Hinsen [1998] notes that the results do not depend strongly on the value of $r_0$ chosen. The value of $c$ is arbitrary and scales all the frequencies uniformly. Hinsen [1998] shows that using one point mass for each residue gives low frequency modes in good agreement with standard NMA. The value of $r_0$ is increased to 0.7 nm for this simplified model protein. Thus, coarse-graining the protein from an atomistic model to considering each residue as a point mass seems appropriate for questions involving the low frequency modes.

Definitions

Here we define some useful quantities mentioned in studies of protein modes. Such quantities are used to compare different computational and experimental techniques, as well as illuminating properties of the protein. Hinsen [1998] suggests that a measure of local deformation $D_i$ of atom $i$ in a particular mode is the energy density due to the deformation,

\[
D_i = \frac{N}{2} \sum_{j=1}^{N} k(\mathbf{r}^0_{i,j}) \frac{|(\mathbf{d}_i - \mathbf{d}_j) \cdot \mathbf{r}^0_{i,j}|^2}{|\mathbf{r}^0_{i,j}|^2 \, |\mathbf{d}_j|^2}, \tag{2.12}
\]

where $\mathbf{d}_i$ is the displacement of atom $i$ in the mode being analysed. The results from several modes can be averaged to obtain a measure of the combined deformation.
Such an analysis allows the identification of rigid regions in the protein by defining rigidity as deformation smaller than a chosen value of $D_i$.

Two sets of normal modes can be compared using the concept of ‘overlap’. Hinsen [1998] defines the overlap matrix as the scalar products of each eigenvector $\mathbf{v}_i$ from one set with each eigenvector $\mathbf{w}_j$ from the other set,

\[
O_{ij} = \mathbf{v}_i \cdot \mathbf{w}_j. \tag{2.13}
\]

Tama and Sanejouand [2001] define an overlap, $I_j$, that is a measure of the similarity between the direction of the $j$th normal mode vector $\mathbf{a}_j$ and the direction of the conformational change $\Delta\mathbf{r}$ observed by x-ray crystallography. It is given by

\[
I_j = \frac{\mathbf{a}_j \cdot \Delta\mathbf{r}}{|\mathbf{a}_j||\Delta\mathbf{r}|}. \tag{2.14}
\]

An overlap of one means the direction of mode $j$ is identical with the direction of the observed conformational change. The correlation coefficient $c_j$ measures the similarity between the amplitudes of the atomic displacements in the $j$th mode and in the observed conformational change. It is given by

\[
c_j = \frac{\langle \mathbf{a}_j \cdot \Delta\mathbf{r} \rangle}{\left[ \langle \mathbf{a}_j \cdot \mathbf{a}_j \rangle \langle \Delta\mathbf{r} \cdot \Delta\mathbf{r} \rangle \right]^{1/2}}. \tag{2.15}
\]

The degree of collectivity $\kappa$ of a mode $j$ is related to the number of atoms significantly affected by the movement. Brüschweiler [1995] defines the collectivity as,

\[
\kappa_j = \frac{1}{N} \exp\left\{ -\sum_{i}^{N} \alpha u_{j,i}^2 \ln \alpha u_{j,i}^2 \right\}, \tag{2.16}
\]

where $\alpha$ is a normalisation constant such that $\sum_i^N \alpha u_{j,i}^2 = 1$. If the displacements, $u_{j,i}$, for every atom are the same then $\kappa = 1$ and the mode is maximally collective. When only one atom moves $\kappa = 1/N$ and the mode is in the extreme local limit.

Gaussian network model

Bahar et al. [1997] present an elastic network model using the Tirion potential. Their elastic network model, which has come to be known as the Gaussian network model (GNM), has been used to study protein functional dynamics including allostery. It follows the classic theory by Flory [1976] for rubber elasticity. Figure 2.2 shows an example of such an elastic network model.
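The overlap of equation (2.14) and the collectivity of equation (2.16), defined earlier in this section, are straightforward to evaluate once a mode is available. The sketch below treats a mode and a conformational change as flat lists of Cartesian components, and takes the collectivity from a list of per-atom displacement amplitudes; the function names and the toy vectors are assumptions for illustration.

```python
import math


def overlap(mode, change):
    """I_j of equation (2.14): cosine of the angle between a mode and a displacement."""
    dot = sum(a * d for a, d in zip(mode, change))
    norm_mode = math.sqrt(sum(a * a for a in mode))
    norm_change = math.sqrt(sum(d * d for d in change))
    return dot / (norm_mode * norm_change)


def collectivity(amplitudes):
    """kappa of equation (2.16) from the per-atom displacement amplitudes u_i."""
    squared = [u * u for u in amplitudes]
    total = sum(squared)
    weights = [s / total for s in squared]      # alpha * u_i^2, which sums to one
    # Terms with zero weight contribute nothing (0 * ln 0 is taken as 0).
    entropy = -sum(w * math.log(w) for w in weights if w > 0.0)
    return math.exp(entropy) / len(weights)


# Toy checks of the two limits quoted for the collectivity (N = 4 atoms).
kappa_uniform = collectivity([1.0, 1.0, 1.0, 1.0])
kappa_local = collectivity([1.0, 0.0, 0.0, 0.0])
i_parallel = overlap([1.0, 0.0, 0.0], [2.0, 0.0, 0.0])
i_orthogonal = overlap([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```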
Junctions in an elastic network undergo Gaussian fluctuations under the potential of the pendant chains.

[Figure 2.2: The elastic network model. Top: the LAO binding protein shown in ribbon representation. Bottom: the elastic network model for the LAO binding protein. Pairs of atoms < 8 Å apart are connected by harmonic springs. Figure from Tama and Sanejouand [2001].]

Bahar et al. [1997] identify the junctions of the network as the C$_\alpha$ atoms in the protein. The C$_\alpha$ of a residue is the central carbon atom in the backbone connected to the sidechain. The junctions in an elastic network fluctuate under the influence of their nearest neighbours. Thus, their model is coarse-grained to the same level as suggested by Hinsen [1998]. Their model accounts for the effect of chain connectivity by automatically including the constraints of the backbone, by the use of an appropriate cutoff. In the protein elastic network model the fluctuations $\Delta R_{ij}$ in the separation between C$_\alpha$ atoms $i$ and $j$ follow a Gaussian distribution,

\[
P(\Delta R_{ij}) = \left( \frac{\gamma^*}{\pi} \right)^{3/2} e^{-\gamma^* \Delta R_{ij}^2}. \tag{2.17}
\]

The Hookean force constant parameter $C$ used by Tirion [1996] is equal to $2k_BT\gamma^*$ in GNM. According to Flory [1976] the partition function of the network is

\[
Z_N = K e^{-\Delta\mathbf{R}\,\Gamma\,\Delta\mathbf{R}}, \tag{2.18}
\]

where $K$ is a constant and $\Delta\mathbf{R}$ is the vector for the fluctuations $\Delta R_i$ of the $N$ C$_\alpha$ atoms. $\Gamma$ is the Kirchhoff matrix of contacts with elements

\[
\Gamma_{ij} =
\begin{cases}
-\gamma^* & \text{if } i \neq j \text{ and } R_{ij} \leq r_c \\
0 & \text{if } i \neq j \text{ and } R_{ij} > r_c \\
-\sum_{i \neq j} \Gamma_{ij} & \text{if } i = j
\end{cases} \tag{2.19}
\]

where $r_c$ is the cutoff separation defining the range of non-bonded interactions. The neighbouring C$_\alpha$ atoms (∼3.8 Å apart) are automatically included, as are all non-bonded interacting pairs $\leq r_c = 7$ Å apart. The Kirchhoff matrix can be split into a sum of contributions from the chain connectivity and non-bonded interactions, $\Gamma = \Gamma_{nb} + \Gamma_{cc}$, where $[\Gamma_{cc}]_{ij} = \gamma^*(2\delta_{ij} - \delta_{i,j+1} - \delta_{i,j-1})$ is the Rouse matrix for polymer chains [Rouse, 1953].
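Constructing the Kirchhoff matrix of equation (2.19) from a set of C$_\alpha$ coordinates takes only a few lines. The sketch below uses a hypothetical straight chain of four C$_\alpha$ positions spaced 3.8 Å apart, with $\gamma^*$ set to one; the function name `kirchhoff` and the test structure are illustrative assumptions rather than part of the GNM software.

```python
import math


def kirchhoff(coords, r_c=7.0, gamma=1.0):
    """Kirchhoff matrix of equation (2.19) for a list of C-alpha coordinates."""
    n = len(coords)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= r_c:
                g[i][j] = g[j][i] = -gamma      # pair in contact
    for i in range(n):
        # Diagonal element: minus the sum of the off-diagonal elements in the row.
        g[i][i] = -sum(g[i][j] for j in range(n) if j != i)
    return g


# Hypothetical fully extended chain: only sequence neighbours fall inside r_c.
chain = [(3.8 * i, 0.0, 0.0) for i in range(4)]
gamma_matrix = kirchhoff(chain)
row_sums = [sum(row) for row in gamma_matrix]
```

For this extended chain the next-nearest neighbours are 7.6 Å apart, so only the chain connectivity contributes and the matrix is tridiagonal. Every row sums to zero, which reflects the zero eigenvalue associated with rigid translation of the whole network.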
The inverse of the Kirchhoff matrix gives the correlations of the fluctuations, $\langle \Delta\mathbf{R}_k \cdot \Delta\mathbf{R}_l \rangle = [\Gamma^{-1}]_{kl}$. Bahar et al. [1997] calculate the B-factors $B_k = 8\pi^2 \langle \Delta R_k^2 \rangle / 3 = 8\pi^2 [\Gamma^{-1}]_{kk} / 3$ for 12 proteins. Their results compare well with experimental B-factors using $\gamma^*$ as the only fitting parameter. The GNM is related to NMA by $\langle \Delta r_i^2 \rangle = 1/\lambda_i$, where $\lambda_i$ are the eigenvalues of the Kirchhoff matrix $\Gamma$. It should be noted, however, that these eigenvalues are not the same as the eigenvalues of NMA, since the latter are mass weighted. The main advantage of this GNM approach over NMA is that the Kirchhoff matrix, $\Gamma$, is obtained directly from the input experimental protein structure (i.e., there is no need to specify the potential). Computationally, all that is required is inverting the $N \times N$ Kirchhoff matrix for a protein with $N$ residues. The force constant $\gamma^*$ may be eliminated by normalising the results, comparing the relative amplitudes of different atoms and relative frequencies of different modes. Such normalised results can be compared directly to normalised experimental results with no fitting required.

Tama and Sanejouand [2001] studied twenty proteins that undergo conformational changes determined by x-ray crystallography, using the Tirion [1996] potential and coarse-graining the proteins by using only the C$_\alpha$ atoms, as proposed by Bahar et al. [1997]. Tama and Sanejouand [2001] find this level of coarse-graining is sufficient for studying the low frequency modes. The aim of their study is to assess the degree to which individual low frequency modes describe observed conformational changes. They found that in most cases the changes are described well by a single low frequency normal mode, provided the conformational change has a large collectivity, $\kappa > 0.18$. They also compare individual modes calculated using the Tirion [1996] potential with those calculated by standard NMA, finding that the overlaps obtained are almost equivalent.
Clearly such methods are suitable for studying protein allostery.

Anisotropic network model

Atilgan et al. [2001] present an extension to the GNM called the anisotropic network model (ANM). This relaxes the GNM assumption of isotropic fluctuations and addresses the directions of collective motions. The fluctuation vectors for each residue are determined as well as the magnitudes of fluctuations. The forces acting on a residue due to neighbouring residues in the network must sum to zero in equilibrium ($\mathbf{B}\mathbf{f} = 0$). The forces are given by Hooke's law in the GNM, $\mathbf{f} = \gamma\mathbf{I}\Delta\mathbf{s}$. The fluctuations of the springs in the network, $\Delta\mathbf{s}$, are related to the fluctuations of the residues, $\Delta\mathbf{R}$, by the matrix $\mathbf{B}$ of cosines of the angles between the springs and each of the Cartesian axes, $\Delta\mathbf{s} = \mathbf{B}^T\Delta\mathbf{R}$. The ANM equivalent of the GNM $N$ dimensional Kirchhoff matrix $\Gamma$ is then the $3N$ dimensional matrix $\mathbf{B}\mathbf{B}^T$. This larger matrix leads to correspondingly longer computational times, of the order of hours rather than seconds. However, the advantage of ANM over GNM is that it provides the full $3N - 6$ directional modes, which can be compared to traditional NMA, whilst still being computationally much faster than standard NMA.

Xu et al. [2003] use the GNM and ANM to study the classic allosteric protein haemoglobin. They show that the slowest mode calculated by elastic network models predicts the transition between tense (T) and relaxed (R) states of haemoglobin. This implies the protein is inherently predisposed to undergo this allosteric transition and highlights the importance of vibrational entropy in driving such a process. They find the oxygen binding site is at the site of a hinge in the global motion and that the hinge motion is restricted on oxygen binding. Thus their study shows that purely mechanical, entropic effects play a significant role in the allosteric transition.

Rotations-translations of blocks

The rotations-translations of blocks (RTB) method is a way of speeding up NMA for large proteins by coarse-graining the protein into blocks made up of one or a few consecutive amino acid residues. It assumes such blocks move as rigid bodies. As we have seen in section 2.2.5, standard NMA of a protein with $N$ atoms requires diagonalising a $3N \times 3N$ Hessian matrix $\mathbf{H}$ (second derivatives of the potential energy). The RTB method expresses $\mathbf{H}$ in a new basis set,

\[
\mathbf{H}_b = \mathbf{P}^T\mathbf{H}\mathbf{P}. \tag{2.20}
\]

$\mathbf{P}$ is an orthogonal $3N \times 6n_b$ matrix made up of the translation and rotation vectors of each of the $n_b$ blocks. The normal modes are then calculated by diagonalising $\mathbf{H}_b$, which is a $6n_b \times 6n_b$ matrix and therefore computationally far easier to diagonalise than the original $3N \times 3N$ matrix $\mathbf{H}$. The normal mode vectors are then given by $\mathbf{A}_p = \mathbf{P}\mathbf{A}_b$, where $\mathbf{A}_b$ are the eigenvectors of $\mathbf{H}_b$. Tama et al. [2000] compare results using the RTB method with blocks of 1, 2, 3 and 5 residues with standard NMA for 12 proteins of different sizes. They find the low frequency modes compare very well and the RTB method computations are performed 20 to 30 times faster than standard NMA. This method allows very large systems to be studied by including more residues within a block to speed up the calculations. Another advantage of this approach is that it uses the standard MD potential without needing a simplified potential.

Proteinquakes

Miyashita et al. [2003] draw together the various techniques discussed in section 2.2.6 and the idea of ‘proteinquakes’ to consider models of allostery. The term ‘proteinquake’ was first suggested by Ansari et al. [1985] to describe the return to equilibrium of myoglobin after stress induced by ligand binding or photodissociation. Miyashita et al. [2003] use the term to refer to local partial unfolding (‘cracking’) and refolding that may occur during transitions between states.
Normal modes are vibrations about a local minimum within the energy landscape of the protein, which could contain many such local minima. In order for a protein to access a different conformation it has to undergo some anharmonic motion to overcome the barrier between the minima of different conformations. Miyashita et al. [2003] estimate the free energy barriers between states of adenylate kinase by iterative anharmonic interpolation between normal mode calculations. Their cracking model predicts an unexpected dependence of transition rate on folding stability that has been seen experimentally. They predict that an increase in denaturant concentration will increase the rate of the transition, by decreasing the barrier height due to enhanced local unfolding.

ENM web servers

Recently, various groups have developed databases of protein dynamics and online servers using elastic network model calculations. Such websites provide free access to data and allow users to submit their own structures for simulation. We make use of two of these (elNémo and iGNM) in section 3.4.3 to compare to our results.

[Figure 2.3: An example of a typical elNémo output. Figure from Suhre and Yves-Henri [2004].]

Suhre and Yves-Henri [2004] have developed a web server, ‘elNémo’, for the Elastic Network Model. ElNémo uses the Tirion [1996] potential and the RTB method (see section 2.2.6), leading to virtually no upper limit to the size of protein that can be studied. ElNémo automatically determines the number of residues to group together in the RTB method based on the number of residues in the protein. All the atom masses are set to the same fixed value (an approximation Suhre and Yves-Henri [2004] show to have little influence on the low frequency modes). For this reason, elNémo frequencies are normalised to the lowest, which is set to one.
ElNémo computes the 100 lowest frequency modes, the degree of collectivity, mean square displacements, distance fluctuation maps, the overlap, the correlation between observed and theoretical B-factors and animations of selected normal modes. The animations produced are series of perturbations to the original structure along the normal mode of interest, with an arbitrary amplitude to allow visualisation of the mode.

Gerstein and Krebs [1998] attempt to classify protein and nucleic acid movements in their macromolecular motions database (http://molmovdb.org). The movement they consider is that between two different structures of similar sequence (from the PDB). Their database now includes a ‘morph movie’ server [Krebs and Gerstein, 2000]. A morph movie is made by dividing the Cartesian transformation between the two structures into steps and minimising the intermediate structure at each step. The minimisation ensures each intermediate structure is chemically realistic. The server will calculate morph movies of user submitted structures. Krebs et al. [2002] carried out a database investigation calculating the normal modes of ∼4000 pairs of protein structures. They used one point mass for each residue at the C$_\alpha$ positions as outlined by Hinsen [1998]. For each pair of protein structures they determined the linear combination of modes that best describes the direction of the conformational transition between the observed structures. Their data is deposited on their macromolecular motions database.

Yang et al. [2005] have created a database called iGNM (available at the website: http://ignm.ccbb.pitt.edu/) of GNM results for protein structures in the PDB. The results of running the GNM calculations on all protein structures in the PDB up to September 2003 are stored on the database. Newer PDB entries can be submitted to the online server.
The group are preparing additions to iGNM that will automatically check the PDB, download new entries and run GNM calculations on them. The output files calculated and stored on iGNM include theoretical and experimental B-factors, cross correlations, contact topology (the number of neighbours in contact with each residue in the elastic network model), eigenvalues and the twenty fastest and twenty slowest eigenvectors.

Such databases are now providing large amounts of data on the dynamics of proteins. With this wealth of new data, the role of dynamics in protein function such as allostery can be investigated for large numbers of proteins.

2.2.7 Graph theoretic methods

FIRST

We now introduce a recently developed coarse-grained method of predicting rigid and flexible regions in a protein. This is useful, for example, in deciding how best to divide up a protein in the RTB method. By predicting rigid regions this method would be useful in guiding the formation of the type of coarse-grained models we use (see section 2.3).

Jacobs et al. [2001] introduce a computational procedure, which predicts rigid and flexible regions, called floppy inclusion and rigid substructure topography (FIRST). The algorithm is based on graph theory, in which network rigidity depends on the connectivity of the network. A protein is modelled as a three dimensional bond-bending network, in which the atoms (vertices) are connected by bonds (edges) and every angle between the bonds (edges) is defined (see figure 2.4). Covalent bonds and hydrogen bonds are treated as edges, but the weaker van der Waals and hydrophobic interactions are ignored. Atomic coordinates are not required; it is the connectivity of the network that defines the rigidity.

[Figure 2.4: A bond-bending network showing the ‘pebble game’ in FIRST. Free pebbles [◦] on vertices represent degrees of freedom. Pebbles covering edges [•] represent distance constraints. The arrows show a possible pebble rearrangement.]

Laman's [1970] theorem states that rigidity within a two dimensional network can be characterised by counting the constraints to all the subgraphs within the network. Jacobs and Thorpe [1995] developed an algorithm, known as the ‘pebble game’, that applies the three dimensional generalisation of Laman's theorem recursively. The pebble game counts the number of degrees of freedom available to a network. Starting with a set of unconstrained vertices (the atoms of the protein structure with no bonds defined), new distance constraints (bonds) are added one by one and the degrees of freedom counted each time. To start with, each vertex has three free pebbles (representing three degrees of freedom). Each edge must be covered by one pebble. This means a network of two vertices with one edge between them will have five free pebbles, representing five degrees of freedom. Each time a new edge is added the pebbles are rearranged to cover this edge. If there is not an available free pebble to cover the edge, the distance constraint is redundant. A redundant bond does not further increase the rigidity of the region. Once all the distance constraints (bonds in the protein structure) have been placed in this way, the number of free pebbles remaining on the vertices is the number of degrees of freedom of the network.

After the pebble game is complete FIRST performs rigid cluster decomposition to identify the rigid regions in the network. A rigid region has only six or fewer free pebbles. The search for such regions selects a vertex and two of its nearest bonded neighbours. Three free pebbles are collected on the chosen vertex, two on one selected neighbour and one on the other selected neighbour. Next, all bonded nearest neighbours to these vertices are checked for free pebbles. If no free pebble can be obtained, that vertex is part of the same rigid cluster as the original selected vertex. Rigid regions will be over-constrained due to the redundant bonds contained in them. Flexible hinge joints (rotatable dihedral angles) are found between two bonded vertices that belong to different rigid clusters. Not all the free dihedral angles are independent. Imposing an external torsional constraint (specifying a dihedral angle) that is redundant indicates the angle is predetermined as part of a collective motion. By systematically testing dihedral angle constraints FIRST determines such under-constrained regions.

The FIRST algorithm is fast, taking seconds to run. The validity of FIRST depends on the accuracy of the hydrogen bonds to be included in the network, and it is a purely mechanical model, effectively at $T = 0$. Jacobs et al. [2001] compare the results of FIRST with the B-factors and NMR order parameters. Since ligand binding changes the hydrogen bonds, FIRST captures changes in flexibility on ligand binding (as Jacobs et al. [2001] show for adenylate kinase) and can therefore be useful in studying allosteric systems. Identifying the rigid regions within a protein structure can help determine and justify the coarse-grained models we develop in this thesis (as we show in section 3.2).

Distance constraint model

Jacobs et al. [2003] present a distance constraint model (DCM) that builds on FIRST. DCM allows the distance constraints to fluctuate with thermal energy, allowing a finite temperature study of rigidity. FIRST treats all distance constraints as fixed, defining hydrogen bonds above a cutoff energy as constraints. In reality hydrogen bonds have a continuum of energies and continuously break and re-form with thermal fluctuations. In DCM each distance constraint represents a free energy contribution and the partition function for the system is calculated.
Enthalpy components are additive but entropy components are not since not all the degrees of freedom are independent. Network rigidity determines which degrees of freedom are independent. In DCM the constraints are added in order of increasing assigned entropies (i.e., the strongest interactions are added first). Thus, unlike FIRST, the order of edge placing in the pebble game matters. The enthalpy is obtained by adding the enthalpy components of all the constraints added. The entropy is obtained by adding the entropy components of just the constraints that are independent. Since DCM works with free energies the computation is fast (of the order of hours). Livesay et al. [2004] use DCM to reproduce experimental heat capacity curves for various proteins such as lysozyme and bovine pancreas trypsin inhibitor (BPTI), with remarkably good fits.

FRODA

Wells et al. [2005] have recently developed a computational method known as framework rigidity optimised dynamic algorithm (FRODA) as an alternative to MD simulation. FRODA simulations capture the mobility resulting from the flexibility of a protein structure. Mobility is the collective movement of a rigid region. As an alternative to the Newtonian MD approach, FRODA uses a constraint based Lagrangian approach. In such a method, for ease of calculation, a system with more coordinates than degrees of freedom is used and then constraints are imposed on it. FRODA uses FIRST to generate constraints of fixed regions, which are treated as ‘ghost templates’. The atoms are allowed to move randomly (by Monte Carlo moves). The ghost template is then moved to fit with the new atom positions and the atoms refitted to the ghost template. These two steps are repeated iteratively until all the constraints are satisfied within a specified tolerance. In this way many new conformers of the protein are generated and the root mean square deviations (RMSD) from the original are recorded.
Once the RMSD saturates, the full range of conformational space has been explored. The set of structures generated make up a trajectory in conformational space that is equivalent to the trajectory in time produced by MD simulations. The FRODA algorithm is fast and scales linearly with molecular weight, so simulations that would take MD weeks can be carried out in minutes. Wells et al. [2005] compare FRODA results for barnase with the ensemble of states obtained from NMR and show that FRODA captures the main features of the RMSD profile measured by NMR. This new technique has much potential to simulate proteins at functional time scales.

2.3 Our methodology

Having discussed various levels of coarse-graining used in simulations of protein dynamics we now introduce our methods. In this thesis we develop coarse-grained models of proteins to investigate dynamic allostery in the binding of substrates. By a ‘dynamic’ mechanism for allostery we mean changes to the equilibrium vibrational thermodynamics. The level of coarse-graining we use is larger than any of the models introduced in section 2.2. We treat whole domains or subdomains as rigid (or elastic) bodies. An example is shown in chapter 3, figure 3.6 for the lac repressor dimer modelled as two rigid plates. Such simple models capture the few lowest frequency, global vibrational modes. Due to the simplicity of the models, the vibrational modes of the model can be calculated analytically. These calculated modes correspond to the lowest frequency modes of the protein in question. In such a model the motion is governed by a minimal set of harmonic potentials representing the interactions between the domains. These model potentials, drawn as springs in figure 3.6, contain local information that changes on ligand binding. The effective spring near to the ligand binding site will change its strength on ligand binding. Atomic level detail from experiment or simulation may be used to parameterise these coarse-grained interactions. For example, we use experimental B-factors (section 3.3.1), NMA (section 4.7), and energy summations using a full MD force-field (section 3.3.2) to parameterise our models.

Here we give an outline of the analytical calculations we perform. The model Hamiltonian is given by

\[ H = \frac{1}{2}\mathbf{p}^T \mathbf{M}^{-1} \mathbf{p} + \frac{1}{2}\mathbf{x}^T \mathbf{K} \mathbf{x}, \tag{2.21} \]

where $\mathbf{M}$ is the inertial matrix, $\mathbf{x}$ ($\mathbf{p}$) is the vector of all the fluctuation variables (momenta) and $\mathbf{K}$ is the interaction or elasticity matrix. The elements of the interaction matrix $\mathbf{K}$ are parameterised as discussed above and change on ligand binding. The partition function is given by

\[ Z = \int_{-\infty}^{\infty} e^{-H/(k_B T)} \, d\mathbf{x} \, d\mathbf{p} = (2\pi k_B T)^d \left(|\mathbf{M}^{-1}||\mathbf{K}|\right)^{-1/2}, \tag{2.22} \]

where $d$ is the number of degrees of freedom (fluctuation variables) in the model. Note that since we are interested in free energy changes $\Delta G$, any ‘phase-space density of states’ will cancel so they are omitted in the calculations here. The free energy is then

\[ G = -k_B T \ln Z = \frac{1}{2} k_B T \ln\left(|\mathbf{K}||\mathbf{M}^{-1}|\right) - d\, k_B T \ln(2\pi k_B T). \tag{2.23} \]

In equation (2.23) we have split the logarithmic term into two resulting terms, breaking the convention of dimensionless arguments to logarithms. This is intended to highlight the term involving $\mathbf{K}$ that will change on ligand binding. The second term is constant on ligand binding. We do this at points throughout this thesis in order to clarify the physics. However, in all final results of changes in free energies the argument of logarithms will be dimensionless.

In order to answer allosteric questions the allosteric free energy, $\Delta\Delta G$, is calculated:

\[ \Delta\Delta G = \Delta G_{+-}(\mathrm{bind}) - \Delta G_{--}(\mathrm{bind}). \tag{2.24} \]

$\Delta G_{+-}(\mathrm{bind})$ is the change in free energy of the protein in its ligand-bound state ($+-$) on binding the molecule of interest (e.g., DNA for repressor proteins). Similarly $\Delta G_{--}(\mathrm{bind})$ is the change in free energy of the protein in its ligand-free state ($--$) on binding the molecule of interest.
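As a numerical illustration of equations (2.23) and (2.24), the following minimal sketch evaluates the free energy from a given interaction matrix and combines four liganded states into an allosteric free energy. The helper `toy_K`, its two-coordinate form, the spring values, the masses and the temperature of 300 K are all hypothetical choices made for illustration only; they are not the models or parameters used in this thesis.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant, J/K


def free_energy(K, M, T):
    """Equation (2.23): G = 1/2 kT ln(|K||M^-1|) - d kT ln(2 pi kT), in joules."""
    d = K.shape[0]
    kT = K_B * T
    sign_k, logdet_k = np.linalg.slogdet(K)
    sign_m, logdet_m = np.linalg.slogdet(M)
    if sign_k <= 0 or sign_m <= 0:
        raise ValueError("K and M must have positive determinants")
    # ln(|K||M^-1|) = ln|K| - ln|M|
    return 0.5 * kT * (logdet_k - logdet_m) - d * kT * np.log(2 * np.pi * kT)


def allosteric_free_energy(K_pp, K_pm, K_mp, K_mm, M, T):
    """Equation (2.24) written as (G++ - G+-) - (G-+ - G--), in joules."""
    def g(K):
        return free_energy(K, M, T)
    return (g(K_pp) - g(K_pm)) - (g(K_mp) - g(K_mm))


def toy_K(k_site1, k_site2, k_coupling):
    """Hypothetical 2x2 interaction matrix: one local spring per site plus a coupling."""
    return np.array([[k_site1 + k_coupling, -k_coupling],
                     [-k_coupling, k_site2 + k_coupling]], dtype=float)


T = 300.0                     # assumed temperature, K
M = np.diag([1e-23, 1e-23])   # assumed inertial matrix, kg (unchanged on binding)


def ddg_in_kT(k_coupling):
    """Allosteric free energy in units of kT when each binding event doubles its local spring."""
    K_mm = toy_K(1.0, 1.0, k_coupling)  # nothing bound
    K_pm = toy_K(2.0, 1.0, k_coupling)  # ligand bound at site 1
    K_mp = toy_K(1.0, 2.0, k_coupling)  # molecule of interest bound at site 2
    K_pp = toy_K(2.0, 2.0, k_coupling)  # both bound
    return allosteric_free_energy(K_pp, K_pm, K_mp, K_mm, M, T) / (K_B * T)


ddg_uncoupled = ddg_in_kT(0.0)  # the two sites fluctuate independently
ddg_coupled = ddg_in_kT(1.0)    # the two sites share a coupling spring
```

In this toy example the inertial matrix and the constant term of equation (2.23) drop out of the double difference, and the allosteric free energy vanishes when the two sites are uncoupled.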
Since we are interested in changes in free energy, we only need consider $G = \frac{1}{2} k_B T \ln |\mathbf{K}| + \text{constant}$ (assuming isothermal changes and that any change in the protein mass is negligible). Therefore,

\[ \Delta\Delta G = \frac{1}{2} k_B T \ln\left( \frac{|\mathbf{K}|_{++}\,|\mathbf{K}|_{--}}{|\mathbf{K}|_{-+}\,|\mathbf{K}|_{+-}} \right), \tag{2.25} \]

where the subscripts refer to the different liganded states. It is worth noting that the mass will change on ligand binding by the mass of the ligand itself. However, for the global modes the resulting momentum-space term will be small since the mass of the ligand is small compared to the mass of the protein.

2.4 Localisation

In justifying coarse-graining methods it is often mentioned that high frequency modes are not as functionally important as low frequency modes. This is because the high frequency modes are localised, not collective. It is the global, delocalised modes that are important in conformational changes and allosteric communication. The low frequency modes are the ones that are delocalised and therefore functional. In this section we provide a brief introduction to the theory of localisation and its role in protein dynamics.

It is worth noting that the low frequency modes are overdamped and not vibrational. However, due to continual bombardment by solvent molecules they will be excited. That is, what we often loosely call the low frequency vibrational modes are more faithfully understood as continual random excitations and decays rather than oscillations. Thermodynamically we can treat such damped motions in the same way as oscillations, since the entropy of a damped harmonic oscillator is the same as for an undamped one.

Following work on localisation of spins due to disorder in random lattices by Anderson [1958], Biroli and Monasson [1999] show that eigenmodes in amorphous materials (such as glasses) are localised by geometric defects. The degree of localisation of an eigenvector $\phi_{i,l}$ is defined as $\sum_i |\phi_{i,l}|^4$. This is maximum, 1, if all entries are zero except one. If all the components of the eigenvector are equal, $\sum_i |\phi_{i,l}|^4$ is minimum, corresponding to full delocalisation. Biroli and Monasson [1999] find that for random stiffness matrices (e.g. of glassy systems), low and high frequencies are localised but intermediate frequencies are delocalised.

Micheletti et al. [2002] study contact maps, calculated by a finite temperature extension of GNM, of ∼30 proteins. A contact map is a matrix, $\Delta$, with elements $\Delta_{ij} = 1$ if $i$ and $j$ are in contact (separated by less than the cutoff 7.5 Å) and zero otherwise. They compare the contact maps for these proteins with disordered contact maps obtained by reshuffling the entries whilst preserving the site connectivity. The high frequency modes of both the proteins and the reshuffled cases are localised on sites with high connectivity. This is because sites with many neighbours are locally stiffened, leading to higher frequencies centred there. The low frequencies, however, are more delocalised in the proteins than in the reshuffled case. It is such delocalised low frequency modes that are necessary for biological function. They also find that the proteins have a greater number of low frequency modes than the reshuffled maps. They conclude that these indicators of the enhanced flexibility of natural proteins compared to random structures are due to the hierarchical organisation of native contacts in proteins (secondary and tertiary structure).

The high frequency modes correspond to small amplitudes of fluctuation so make negligible difference to the root mean square fluctuations, $\langle \Delta r^2 \rangle = k_B T/(m\omega^2)$. Consequently such modes are insignificant when calculating B-factors. As can be seen from the plot (figure 2.1) of equation (2.3), each high frequency mode gives a vanishingly small contribution to the entropy.
However, the density of states for the high frequency modes is high (there are many high frequency modes) leading to significant contributions to the entropy. On ligand binding to binding site 1 the low frequency global modes $\omega_g^{--}$ will be changed to $\omega_g^{+-}$. However, all the high frequency modes $\omega_l$ and $\omega_{l_2}$ that are not localised near binding site 1 will remain constant. Only the high frequency modes that are localised near binding site 1 will be affected (changed from $\omega_{l_1}$ to $\omega_{l'_1}$).

\[ G_{--} = k_B T \sum_{\omega} \left( \ln \omega_{l_1} + \ln \omega_{l_2} + \ln \omega_l + \ln \omega_g^{--} \right) + C \tag{2.26} \]

\[ G_{+-} = k_B T \sum_{\omega} \left( \ln \omega_{l'_1} + \ln \omega_{l_2} + \ln \omega_l + \ln \omega_g^{+-} \right) + C \tag{2.27} \]

where $C$ is constant. Similarly, on ligand binding to site 2, only the global modes $\omega_g$ and the high frequency modes localised near binding site 2 ($\omega_{l_2}$) will be affected:

\[ G_{+-} = k_B T \sum_{\omega} \left( \ln \omega_{l'_1} + \ln \omega_{l_2} + \ln \omega_l + \ln \omega_g^{+-} \right) + C \tag{2.28} \]

\[ G_{++} = k_B T \sum_{\omega} \left( \ln \omega_{l'_1} + \ln \omega_{l'_2} + \ln \omega_l + \ln \omega_g^{++} \right) + C \tag{2.29} \]

In the allosteric vibrational free energy, the localised modes do not contribute, as seen by

\[ G_{--} = k_B T \sum_{\omega} \left( \ln \omega_{l_1} + \ln \omega_{l_2} + \ln \omega_l + \ln \omega_g^{--} \right) + C \tag{2.30} \]

\[ G_{-+} = k_B T \sum_{\omega} \left( \ln \omega_{l_1} + \ln \omega_{l'_2} + \ln \omega_l + \ln \omega_g^{-+} \right) + C \tag{2.31} \]

\[ G_{+-} = k_B T \sum_{\omega} \left( \ln \omega_{l'_1} + \ln \omega_{l_2} + \ln \omega_l + \ln \omega_g^{+-} \right) + C \tag{2.32} \]

\[ G_{++} = k_B T \sum_{\omega} \left( \ln \omega_{l'_1} + \ln \omega_{l'_2} + \ln \omega_l + \ln \omega_g^{++} \right) + C \tag{2.33} \]

\[ \Delta\Delta G = \left( (G_{++} - G_{+-}) - (G_{-+} - G_{--}) \right) = k_B T \sum_{\omega} \ln \frac{\omega_g^{++}\,\omega_g^{--}}{\omega_g^{+-}\,\omega_g^{-+}}. \tag{2.34} \]

2.5 Motivation for our level of coarse-graining

The aim of our approach (as outlined in section 2.3) is to develop models that are simple enough to be understood and solved analytically but still capture the fundamental physics of allostery. The criticism of such coarse-grained models is that by stripping away all the details we are left with an abstract model of little relevance to biological systems, which have high specificity. In order to address this issue the parameterisation of our models takes into account finer details, sometimes even at the atomic scale (as in section 3.3).
In this section we consider the appropriateness of the level of coarse-graining we use. Elastic network models (as described in section 2.2.6) strip away much of the biochemical detail and yet describe low frequency modes well. By including only the positions of the Cα atoms and using a harmonic potential with a uniform force constant, all residues are treated equivalently. It is therefore surprising that such coarse-graining so accurately describes the dynamics on functionally important timescales. However, these low frequency modes are delocalised modes that are not dependent on the atomic details.

Can the lowest frequency global modes be described by coarse-graining even further? Doruker et al. [2002] consider different levels of coarse-grained elastic network model structures. They perform ANM calculations on a series of coarse-grained structures of a transmembrane protein, hemagglutinin A, which has N = 1509 residues. They compare the mean square fluctuations calculated for the standard GNM, which uses all N residues (just the Cα atoms) as the network junctions, with those of models using only selected residues. They coarse-grain the network model to use only N/2 residues (every other residue along the backbone). They investigate a series of such models containing only N/10, N/20 and N/40 residues. Figure 2.5 shows some of these coarse-grained models.

Figure 2.5: Figure to show levels of coarse-graining. X-ray structure of hemagglutinin showing Cα atoms of (a) all residues (N = 1509), (b) N/10 and (c) N/40 residues. Figure from Doruker et al. [2002].

By repeating the calculations for such coarse-grained models with shifted sets of residues they calculated the mean square fluctuations for every residue. The cutoff values used (determined by fits to the experimental B-factors) increase substantially with the levels of coarse-graining (up to 60 Å for the N/40 model).
Larger cutoff values remove unrealistic behaviour in the more coarse-grained models due to missing parts of the structure. The correlations of the coarse-grained models with the all residue model are impressive. Even the highest level of coarse-graining they used (N/40) has a correlation of 0.79 for all modes and 0.98 for the lowest mode. They show that the lower resolution models lose more high frequency modes but that this has minimal effect on the low frequency modes. Such a high level of coarse-graining allows rapid calculation of the modes of motion for arbitrarily large proteins. Grouping residues along the backbone, as Doruker et al. [2002] do and as is done in the RTB method (described in section 2.2.6), is not necessarily very realistic for higher levels of coarse-graining. A more physical method may be to group residues that are close in space rather than sequentially along the backbone. The lowest resolution model Doruker et al. [2002] consider (N/40) still has ∼40 junctions in the network due to the large protein they studied.

Is it physically meaningful to coarse-grain further? Given that some of the elastic network models and graph theoretic methods described in this chapter are computationally fast enough to study arbitrarily large proteins, is there anything to gain from further coarse-graining? Elastic network and graph theoretic models, though not atomistic, are still made up of discrete particles. The aim of such coarse-graining is pragmatic: to speed up computationally expensive simulations. The models we develop, however, are more continuous in nature. The aim of our approach is physical: to illuminate mechanisms of allosteric communication. Our models are highly coarse-grained but physically intuitive. By providing physical insight into dynamical mechanisms, our models are complementary to the more ‘blind’ computational techniques described in this chapter.
Since our coarse-grained models are not reliant on existing PDB structures, they can be used to establish ‘design rules’ for general classes of proteins including synthetic ones. Our models can be described as globally coarse-grained. This is appropriate for studying the global modes, but is inappropriate for higher frequencies. Allosteric communication is a global phenomenon and therefore coarse-graining at a global level is appropriate for investigating such mechanisms.

2.6 Timescales

Figure 2.6 summarises the experimental and computational methods described in chapters 1 and 2. It indicates which techniques are appropriate for investigating dynamics in each time window. The figure highlights what protein functions occur at each timescale.

[Figure 2.6 content: a time axis from 10^-15 s to 10^-3 s with three rows. Experimental technique: NMR 2H spin relaxation, NMR 15N spin relaxation, neutron scattering, terahertz spectroscopy, NMR lineshape analysis, NMR exchange, trFRET. Computational method: quantum, MD, NMA, ENM, FRODA, our models. Protein motion: atomic bond stretching, sidechain bond rotation, backbone, domain, folding and unfolding.]

Figure 2.6: Figure to show the different timescales of protein dynamics. Experimental (red text) and computational (blue text) techniques are indicated at the timescales they probe. Note the figure is schematic not accurate.

Chapter 3

Dimers of rigid monomers

3.1 Overview

In this chapter we present a coarse-grained model of dimeric proteins. We use the model to investigate dynamic allostery in repressor proteins, focusing on the lac repressor as a test case. We compare our results with experimental and computational data and discuss extensions to the basic model. Finally, we discuss the applicability of this model to other repressor proteins. Of course “The truth of a theory can never be proven, for one never knows if future experience will contradict its conclusions” [Einstein]. The basic model has been published in Hawkins and McLeish [2004].
As discussed in section 2.4 the low frequency modes will dominate allosteric function [Bahar et al., 1998]. Although higher frequency modes are more numerous, they are spatially localised due to elastic disorder [Micheletti et al., 2002]. Ligand binding at sites where high frequency modes have significant amplitude will therefore generally have only local effects. Long distance allosteric signalling will be exponentially suppressed beyond the localisation length of the mode. Focusing on the slower, global modes motivates spatially coarse-grained models. In particular, the dimeric nature of many repressor proteins suggests the level of coarse-graining for our model. The binding of a ligand changes the flexibility of the protein, affecting dynamics at sites far from the ligand binding site due to alterations in the global modes. Changes in these modes give rise to a change in vibrational free energy on ligand binding that affects the affinity of the repressor for DNA.

Repressor proteins, as reviewed in section 1.9.1, bind to DNA to turn genes off. Some repressor proteins display negative cooperativity, only binding to DNA as an aporepressor (in the absence of bound inducer). A classic example of such a repressor is the E. coli lac repressor [Alberts, 2002; Lehninger et al., 2000; Bruinsma, 2002; Lewis et al., 1996; Bell and Lewis, 2000]. When bound to DNA the lac repressor suppresses the genes for the metabolism of the sugar lactose (β-galactosidase enzyme, galactoside permease and thiogalactoside transacetylase). The inducer is allolactose in vivo (in the natural setting of the cell) and IPTG (isopropyl-beta-D-thiogalactopyranoside) in vitro (in experimental conditions). In the absence of lactose the repressor binds to DNA, shutting down lactose metabolism. In this way, the bacterium saves energy by avoiding attempts to eat absent sugar.
The gene regulation of the lac operon is actually more complex in that the genes are only expressed when lactose is present and glucose is not. A second DNA binding protein, CAP (catabolite activator protein), works together with the lac repressor to enable the use of lactose instead of glucose for metabolism [Alberts, 2002; Lehninger et al., 2000]. The lac repressor forms our representative case for one class of repressors. The other class of repressors are those that only bind to DNA as a ‘holorepressor’ (with a bound ligand). The ligand in this case is known as a ‘corepressor’ (since the protein will only repress the genes with its partner corepressor). An example of this class is the E. coli tryptophan (trp) repressor [Otwinowski et al., 1988; Lawson and Carey, 1993; Zhao et al., 1993]. On binding to DNA it prevents the expression of the gene for tryptophan synthesis. The corepressor is L-Trp, so the synthesis of tryptophan does not occur when it is already present in sufficient abundance in the cell.

Introduction to the lac repressor

Figure 3.1 shows the lac repressor tetramer. Two lac repressor dimers bind to form a tetramer of four repeat units. The lac repressor is a large protein with 360 amino acids in each monomer, giving a dimer mass of ∼77 kDa.

Figure 3.1: The lac apo and holorepressor tetramers [Lewis et al., 1996; Friedman et al., 1995], showing the static conformational change. Each monomer of the apo repressor is a different colour (blue, grey, cyan and green). The holorepressor is shown with every monomer in red for comparison.

The monomer of the lac repressor is divided into several domains [Bell and Lewis, 2001]. The headpiece domain (residues 1-49) binds to DNA by inserting its helix-turn-helix motif (see figure 1.11a for this common DNA binding motif) into the DNA major groove [Slijper et al., 1997]. Residues 50-58 form a hinge helix that binds to the minor groove in the specific complex [Lewis et al., 1996].
However, when free or bound to non-specific DNA this hinge helix is disordered [Kalodimos et al., 2004a]. The core domain is split into an N subdomain and a C subdomain with the inducer binding pocket between them. Finally, the C terminal alpha helix (residues 340-357) forms a helix bundle with other monomers to form the dimer or tetramer.

The DNA sequence of the lac operon has three lac repressor recognition sites. One lac repressor tetramer binds simultaneously to two sites on the DNA, with one dimer on each of two operator sites. This causes the DNA to bend to form a loop (see figure 3.2). Maximum repression occurs only when the tetramer is bound to the primary operator site $O_1$ and an ancillary site $O_2$ (401 bp (base pairs) up from $O_1$) or $O_3$ (93 bp down from $O_1$) [Levandoski et al., 1996; Oehler et al., 1994]. It is not known whether the tetramer is formed before binding or whether the dimers bind individually to the respective operators and then form the tetramer.

Figure 3.2: Lac repressor tetramer complexed with DNA showing DNA looping. The DNA helix is modelled as an elastic rod. The figure is taken from Villa et al. [2005].

The x-ray diffraction crystal structures of the apo and holorepressors [Lewis et al., 1996] reveal a static conformational change on inducer binding (shown in figure 3.1). Binding to DNA results in a rotation of about 10° of the N-terminal subdomains of two monomers in a dimer. The tethered dimers in the tetramer move apart, broadening the V-shaped structure by about 12° on DNA binding. Inducer lactose binding (one per monomer) results in structural changes tending to destabilise this DNA bound structure. The aporepressor dimers in figure 3.1 can be seen to be further apart than the holorepressor dimers, favouring the DNA bound conformation. This altered orientation of the N subdomains on inducer binding displaces the hinge helices from the minor groove, so predisposing the holorepressor to unbind DNA.
A higher resolution study by Bell and Lewis [2000] has led to the idea that the allosteric transition involves the reorientation of the N subdomains disrupting the network of interactions between the N subdomains and the DNA binding domains. The disruption of interactions can affect not only the static conformation but also the vibrational dynamics of the protein. In this chapter we investigate how disruptions to local interactions that increase the protein flexibility lead to a dynamic contribution to the allosteric signal. Dynamics studies of the NMR solution structure for the lac repressor headpiece domain complexed to DNA [Slijper et al., 1997; Kaptein et al., 1995] have revealed mobility changes on binding to DNA. A loop in the DNA binding region shows a remarkable decrease in backbone flexibility on complex formation. However, there is still flexibility of the sidechains at the protein-DNA interface. Kalodimos et al. [2004a] compare the structure and dynamics of the DNA binding region of the lac repressor dimer complexed with specific and non-specific DNA. They find that the binding interface of the non-specific complex is highly flexible on the µs-ms timescale and that these motions are completely quenched on binding to the lac operator DNA. It is suggested that such motion in the non-specific complex is useful in exploring the DNA to find the target site. We briefly consider specificity in section 6.1.1. Kalodimos et al. [2004b] suggest this difference in the dynamics between the free (or non-specifically bound) and operator-bound lac repressor may provide a method of allosteric control. In this chapter we present a model of an allosteric mechanism that invokes such dynamics. Bell and Lewis [2001] point out that “despite the wealth of genetic, biochemical and structural studies of the lac repressor, a complete understanding of the allosteric transition at the mechanistic level is still not at hand”. 
In this chapter we develop a model of dynamic allostery to address this issue. We investigate a vibrational entropic contribution to the allosteric free energy that does not depend on the static conformational change. Such dynamic contributions have often been neglected in previous explanations of allosteric mechanisms.

3.2 Model

In this section we introduce our basic model for allostery in repressor proteins. We explain in detail the theory outlined in section 2.3 and describe the coarse-grained model. One of the statements of the MWC model (reviewed in section 1.4.1) is that allosteric proteins are oligomers (contain a finite number of identical subunits) with at least one axis of symmetry. The MWC model states that the different states of the protein differ in the distribution or energy of the bonds between protomers. This is usually interpreted in terms of static conformational changes but can also lead to changes in the vibrational dynamics. We develop this point quantitatively by modelling a repressor dimer as two symmetric rigid slabs representing the two subunits. The plate-like morphology is motivated by the typical extended surface of interaction between monomer units. We investigate the relative motion of these slabs arising from the interactions between them. This coarse-grained approach strips away all the molecular detail of the underlying atomic structure of a particular protein, leaving a simple model that can be treated analytically with classical statistical mechanics. Such a coarse-grained model is shown in figure 3.6 superimposed on the x-ray crystal structure of the lac repressor dimer.

Figure 3.3 shows the rigid cluster decomposition results of a FIRST simulation we carried out on the apo lac repressor crystal structure (using PDB ID: 1LBI [Lewis et al., 1996]). It clearly shows the two most rigid clusters (in blue and red) make up a large percentage of each monomer. This provides justification for our treatment of each monomer as a rigid body.

Figure 3.3: Rigid cluster decomposition results of a FIRST simulation of the lac repressor (using PDB ID: 1LBI [Lewis et al., 1996] and a hydrogen bond cutoff energy of −0.5 kcal/mol). Each different coloured region is a different rigid cluster. The grey regions are flexible.

A recent molecular dynamics simulation by Villa et al. [2005] shows the different domains within the lac repressor tetramer move as rigid bodies. The N and C subdomains of the core monomer move together as in our model. They also find that the DNA-binding headpiece domains move with respect to the core, absorbing tension due to the DNA loop between the dimers. In order to make the simulation tractable Villa et al. [2005] use a coarse-grained elastic rod model for the DNA as presented by Balaeff et al. [2004].

Despite the simplicity of our coarse-grained approach it is useful in helping us to understand how these proteins achieve their remarkable allosteric regulatory function. In section 3.3 we show how atomic level detail of the structure of specific proteins can be used to compute the parameters used in the coarse-grained model. In this way, the coarse-grained models are related to simulations at atomic detail rather than remaining aloof from molecular detail. Such models enable the dynamic consequences of the slow modes to be calculated quantitatively far beyond the timescale accessible by MD.

3.2.1 Rigid rods model

As an approach to explain the theory outlined in section 2.3, we first describe a toy model consisting of two one-dimensional rods with three harmonic springs between them as shown in figure 3.4. Then, in section 3.2.2 we extend this to account for the two dimensional surface between monomers.

Figure 3.4: Rods and springs model of the two interacting domains of a repressor protein dimer. The dashed line is the equilibrium position of the right hand rod. The relative translation displacement is $x$ and the mutual rotation fluctuation is $\theta$.

In the rods model (figure 3.4) the three springs provide locally specific information about the interaction potential between the two domains. This toy rods model has two degrees of freedom, which we describe by the coordinates $x$ and $\theta$ defined in figure 3.4. We assume that ligand binding affects the spring constant of the spring local to the binding site but does not affect the other springs. We calculate the free energies $G_{++}$, $G_{+-}$, $G_{-+}$, $G_{--}$ due to the modes of vibration of this model. Finally, calculation of $\Delta\Delta G$ shows that information of ligand binding to the site of spring $\lambda_1$ is transmitted to the site of spring $\lambda_{-1}$.

The potential energy for this model in terms of the coordinates $(x, \theta)$ and the spring constants $\lambda_i$ is given by

\[ V = \frac{1}{2}\lambda_1 \left(x + \frac{l\theta}{2}\right)^2 + \frac{1}{2}\lambda_0 x^2 + \frac{1}{2}\lambda_{-1} \left(x - \frac{l\theta}{2}\right)^2. \tag{3.1} \]

This leads to the Hamiltonian

\[ H = \frac{1}{2}\mathbf{p}^T \mathbf{M}^{-1} \mathbf{p} + \frac{1}{2}\mathbf{x}^T \mathbf{K} \mathbf{x}, \tag{3.2} \]

where the interaction matrix is

\[ \mathbf{K} = \begin{pmatrix} \lambda_1 + \lambda_0 + \lambda_{-1} & \lambda_1 - \lambda_{-1} \\ \lambda_1 - \lambda_{-1} & \lambda_1 + \lambda_{-1} \end{pmatrix}, \]

the inertial matrix is

\[ \mathbf{M} = \begin{pmatrix} m & 0 \\ 0 & \frac{4I}{l^2} \end{pmatrix}, \]

the displacements are

\[ \mathbf{x} = \begin{pmatrix} x \\ \frac{l\theta}{2} \end{pmatrix}, \]

and the $\mathbf{p}$ are the momenta conjugate to the displacements ($p_i = m\dot{x}_i$). $I$ is the moment of inertia. For $d$ degrees of freedom, the partition function

\[ Z = \int_{-\infty}^{\infty} e^{-H/(k_B T)} \, d\mathbf{x} \, d\mathbf{p} = (2\pi k_B T)^d \left(|\mathbf{M}^{-1}||\mathbf{K}|\right)^{-1/2} \tag{3.3} \]

gives the free energy

\[ G = -k_B T \ln Z = \frac{1}{2} k_B T \ln |\mathbf{K}| + \text{constant} \tag{3.4} \]

\[ \phantom{G} = \frac{1}{2} k_B T \ln\left(4\lambda_1\lambda_{-1} + \lambda_0(\lambda_1 + \lambda_{-1})\right) + \text{constant}. \tag{3.5} \]

The classical limit we take is valid for $\hbar\omega \ll k_B T$. A rough estimate of the frequencies of our modes is given by taking a dimer mass $m \sim 1 \times 10^{-23}$ kg and spring constant $\lambda \sim 1$ J m$^{-2}$ (estimated from crystal B-factors as described in section 3.3.1), giving $\omega = (\lambda/m)^{1/2} \sim 10^{11}$ s$^{-1}$. This corresponds to time scales of ∼10 ps and gives $\hbar\omega/(k_B T) \sim 10^{-3} \ll 1$; hence the classical limit is clearly valid.
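The closed form inside the logarithm of equation (3.5) and the order-of-magnitude estimate for the classical limit can be checked numerically. The sketch below is illustrative only: the spring constant values and the temperature of 300 K are assumptions chosen for the example, and the function names are hypothetical.

```python
import numpy as np

HBAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def rods_interaction_matrix(lam1, lam0, lam_m1):
    """Interaction matrix K of the rods model in the coordinates (x, l*theta/2)."""
    return np.array([[lam1 + lam0 + lam_m1, lam1 - lam_m1],
                     [lam1 - lam_m1, lam1 + lam_m1]], dtype=float)


def rods_determinant(lam1, lam0, lam_m1):
    """Closed form of |K| appearing in equation (3.5)."""
    return 4.0 * lam1 * lam_m1 + lam0 * (lam1 + lam_m1)


# Illustrative spring constants in J m^-2 (assumed values)
lam1, lam0, lam_m1 = 1.2, 1.0, 0.1
det_numeric = np.linalg.det(rods_interaction_matrix(lam1, lam0, lam_m1))
det_closed = rods_determinant(lam1, lam0, lam_m1)

# Rough mode frequency and classical-limit ratio, using the order-of-magnitude
# inputs quoted in the text (m ~ 1e-23 kg, lambda ~ 1 J m^-2) and an assumed T.
mass = 1e-23         # kg
lam = 1.0            # J m^-2
temperature = 300.0  # K, assumed
omega = np.sqrt(lam / mass)                        # angular frequency, s^-1
quantum_ratio = HBAR * omega / (K_B * temperature)  # should be much less than 1
```

With these inputs the numerical determinant agrees with the closed form, and the ratio $\hbar\omega/(k_B T)$ is well below one, in line with the classical treatment.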
We assume that the spring constant $\lambda_1$ changes to $\lambda_1'$ on inducer or corepressor binding and $\lambda_{-1}$ changes to $\lambda_{-1}'$ on DNA binding. The central spring $\lambda_0$ remains constant and acts as an anchor. The allosteric free energy (the free energy of the holorepressor binding DNA minus that of the aporepressor binding DNA), in units of $k_BT$, is

\[ \Delta\Delta G = \frac{1}{2}\ln\left(\frac{\left(4\lambda_1'\lambda_{-1}' + \lambda_0(\lambda_1' + \lambda_{-1}')\right)\left(4\lambda_1\lambda_{-1} + \lambda_0(\lambda_1 + \lambda_{-1})\right)}{\left(4\lambda_1'\lambda_{-1} + \lambda_0(\lambda_1' + \lambda_{-1})\right)\left(4\lambda_1\lambda_{-1}' + \lambda_0(\lambda_1 + \lambda_{-1}')\right)}\right). \qquad (3.6) \]

Allosteric communication is indicated by a non-zero allosteric free energy ($\Delta\Delta G \neq 0$). Interestingly, if only two springs are used no allostery is possible: setting $\lambda_0 = 0$ gives no allosteric communication, $\Delta\Delta G = 0$. This is because each spring, $\lambda_1$ and $\lambda_{-1}$, acts independently. However, with a non-zero 'anchoring' spring, $\lambda_0$, the movements of springs $\lambda_1$ and $\lambda_{-1}$ are coupled due to the third spring. An infinitely stiff third spring results in a 'scissor' model (such as that used in section 5.2 and shown in figure 5.1).

Positive cooperativity (the holorepressor binds to DNA with higher affinity than the aporepressor) is signified by $\Delta\Delta G < 0$ (for example, the trp repressor). Conversely, negative cooperativity (the aporepressor binds to DNA with higher affinity than the holorepressor) is signified by $\Delta\Delta G > 0$ (for example, the lac repressor). Each case of positive or negative cooperativity is captured by this simple model, with the following rule determining which case arises:

\[ \Lambda_1\Lambda_{-1} + 1 \;\begin{cases} > \Lambda_1 + \Lambda_{-1} & \text{positive cooperativity} \\ < \Lambda_1 + \Lambda_{-1} & \text{negative cooperativity} \end{cases} \qquad (3.7) \]

where $\Lambda_1 = \lambda_1'/\lambda_1$ and $\Lambda_{-1} = \lambda_{-1}'/\lambda_{-1}$ are the ratios of bound to unbound spring constants. The positive cooperativity case occurs when both spring constants increase ($\Lambda_1 > 1$, $\Lambda_{-1} > 1$) or both decrease ($\Lambda_1 < 1$, $\Lambda_{-1} < 1$) on ligand binding. The negative cooperativity case occurs when one spring constant increases and the other decreases ($\Lambda_1 > 1$ and $\Lambda_{-1} < 1$, or $\Lambda_1 < 1$ and $\Lambda_{-1} > 1$).
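Equation (3.6) and the sign rule (3.7) can be evaluated directly. The sketch below is illustrative Python with hypothetical function names; it is not code from the thesis. As a usage example it takes the B-factor ratios quoted in section 3.3.1 ($\lambda_1/\lambda_0 \approx 1.2$, $\lambda_{-1}/\lambda_0 \approx 0.1$, $\Lambda_1 \approx 0.07$, $\Lambda_{-1} \approx 6.7$) and applies the factors of three and two described there.

```python
import math

def det_k(lam1: float, lam0: float, lam_m1: float) -> float:
    """|K| of the two-rod model, the argument of the logarithm in eq. (3.5)."""
    return 4.0 * lam1 * lam_m1 + lam0 * (lam1 + lam_m1)

def allosteric_ddg(lam1, lam0, lam_m1, big1, big_m1):
    """Eq. (3.6) in units of k_B T; big1 and big_m1 are bound/unbound ratios."""
    b1, bm1 = lam1 * big1, lam_m1 * big_m1      # bound spring constants
    num = det_k(b1, lam0, bm1) * det_k(lam1, lam0, lam_m1)
    den = det_k(b1, lam0, lam_m1) * det_k(lam1, lam0, bm1)
    return 0.5 * math.log(num / den)

def cooperativity(big1: float, big_m1: float) -> str:
    """Rule (3.7): compare Lambda_1*Lambda_-1 + 1 with Lambda_1 + Lambda_-1."""
    lhs, rhs = big1 * big_m1 + 1.0, big1 + big_m1
    if lhs > rhs:
        return "positive"
    if lhs < rhs:
        return "negative"
    return "none"

# Usage: lac repressor ratios from section 3.3.1, with lambda_0 set to 1.
per_mode = allosteric_ddg(1.2, 1.0, 0.1, 0.07, 6.7)
estimate = 3 * 2 * per_mode   # three planes of vibration, two dimers
print(cooperativity(0.07, 6.7), round(estimate, 2))   # negative, about 1.4
```

With $\lambda_0 = 0$ the numerator and denominator of equation (3.6) coincide, so the function returns zero, in line with the statement that two springs alone cannot transmit an allosteric signal.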
The decrease in force constants will, however, be limited by the physical requirement for overall stability of the complex.

Figure 3.5: Graph showing $\Delta\Delta G$ (in units of $k_BT$) against $\Lambda_1 = \lambda_1'/\lambda_1$ and $\Lambda_{-1} = \lambda_{-1}'/\lambda_{-1}$ (plot of equation (3.6)). The contour plot on the base shows regions where lac-type negative cooperativity ($\Delta\Delta G > 0$) (blue, magenta) and trp-type positive cooperativity ($\Delta\Delta G < 0$) (red, green) behaviour is optimised.

Figure 3.5 plots the function $\Delta\Delta G(\Lambda_1, \Lambda_{-1})$ (equation (3.6)), showing the regions of positive and negative cooperativity for a case where $\lambda_1/\lambda_0 > \lambda_{-1}/\lambda_0$ (i.e., the potential is stronger at the inducer binding site than at the DNA binding site). This case of $\lambda_1 \neq \lambda_{-1}$ leads to the asymmetry seen in figure 3.5. Biologically relevant values for the unbound spring constants were estimated using experimental B-factors for the relevant crystal structure regions of the lac repressor. The B-factors are related to the mean-square deviations of atomic positions $\langle\bar{u}^2\rangle$ by $B = 8\pi^2\langle\bar{u}^2\rangle$. The spring constants are given by the expression $\lambda \sim k_BT/\langle\bar{u}^2\rangle$. For the lac repressor we obtained $\lambda_1/\lambda_0 \approx 1.2$ and $\lambda_{-1}/\lambda_0 \approx 0.1$ (see section 3.3.1).

3.2.2 Rigid plates model

Figure 3.6: Plates and springs model for the interaction of the two domains of a repressor dimer (plates of length $l$ and width $w$, with springs $\lambda^x_{10}$, $\lambda^x_{0-1}$, $\lambda^x_{00}$, $\lambda^x_{01}$ and $\lambda^x_{-10}$). The X-ray structure (PDB ID: 1EFA [Lewis et al., 1996]) of the lac repressor dimer is shown behind the model (DNA in green).

A more realistic model of a repressor protein dimer extends the rods model to include the two-dimensional interaction surface between the monomers. Such a model has six degrees of freedom rather than the two described in section 3.2.1. It therefore models the six most global modes of the protein. In the basis we use, these modes are relative translation displacements $(x, y, z)$ along the respective axes and rotation angles $(\theta_x, \theta_y, \theta_z)$ about the respective axes.
These translation and rotation displacements are shown in figure 3.7. In our method we do not need to diagonalise the interaction matrix to work with normal modes, since all we need is the determinant of the interaction matrix, $|\mathbf{K}|$.

Figure 3.7: Diagram showing the relative displacement parameters, translation $x$, $y$, $z$ and rotation $\theta_x$, $\theta_y$, $\theta_z$, of a plate.

The protein subunits are represented by rigid slabs with length $l$ and width $w$, as shown in figure 3.6. The interactions at the dimer interface arise from hydrophobic, sidechain, and electrostatic forces. We model these interactions within the harmonic approximation with effective spring constants $\lambda_i$. In principle, these spring constants could be calculated from the details of the interactions between the protein subunits. In section 3.3 we show how this is done computationally using an atomistic force field.

The effective spring constants, $\lambda_i$, come from integrating the interaction strength density $k(\mathbf{r}_1, \mathbf{r}_2)$ between the plates over the surfaces $S_1$, $S_2$ of each plate. Treating the interaction strength density as a continuous function of the positions $\mathbf{r}_1$, $\mathbf{r}_2$ on each plate, the total force between the plates is given by

\[ \mathbf{F} = -\int\!\!\int dS_1\, dS_2\, k(\mathbf{r}_1, \mathbf{r}_2)\, \Delta\mathbf{r}(\mathbf{r}_1, \mathbf{r}_2), \qquad (3.8) \]

where $\Delta\mathbf{r}(\mathbf{r}_1, \mathbf{r}_2)$ is the mutual displacement of the plates, which is a function of position on each plate. By dividing the integral in equation (3.8) into suitable regions of the interaction surface, effective springs in each region can be determined. A discrete sum is more appropriate for proteins, where one can imagine a spring between every pair of atoms across the interface. In this case,

\[ \mathbf{F} = -\sum_{ij} k_{ij}\, \Delta\mathbf{r}_j, \qquad (3.9) \]

where the elements $k_{ij}$ are the force constants between atoms $i$ and $j$, and $\Delta\mathbf{r}_j$ are the displacements of atoms $j$ assuming atoms $i$ are fixed.
Because the potentials in proteins are short ranged (even electrostatic ones, due to screening), it would be possible to determine the subsets of springs, $k_{ij}$, the sums of which make up the effective springs in our model. The total force is equivalent to $\mathbf{F} = -\mathbf{K}\mathbf{x}$, where $\mathbf{x}$ is the displacement vector and $\mathbf{K}$ is the interaction matrix of the spring constants in our model. Resolving this force in the $x$, $y$ and $z$ directions, it can be written as

\[ \mathbf{F} = -\begin{pmatrix} \sum_i \lambda^x_i\, x \\ \sum_i \lambda^y_i\, y \\ \sum_i \lambda^z_i\, z \end{pmatrix}, \qquad (3.10) \]

where $\lambda^x_i$ are the effective force constants in the $x$ direction. Note that the effective springs in the $y$ and $z$ directions, parallel to the plates, are the $y$, $z$ components of summations of diagonal forces between atoms in the respective plates. As such, these effective springs are part of the interaction between the plates despite their direction parallel to the interface. This representation (equation (3.10)) is equivalent to equation (3.9).

In our model we characterise the interactions between the subunits by a minimal set of such effective springs, shown in figure 3.8. Only those in the $x$ direction (perpendicular to the plates), $\lambda^x_{ij}$, are drawn in figure 3.6.

Figure 3.8: Diagram showing the springs characterising the resolved $x$ (solid), $y$ (dotted) and $z$ (dashed) displacements.

We specify enough springs to allow the potentials between the protein monomers to acquire locally specific values. The two rods model in section 3.2.1 required three springs. In the two plates model we therefore include three springs for each rotational direction. We use the approximation that the rotation about $x$ is governed by the force constants in the $z$ direction only (valid for $l \gg w$). Therefore, only one spring constant in the $y$ direction is required to characterise the translation in the $y$ direction, which in our approximation is independent of the other motions.
On ligand binding we presuppose that the interactions are disturbed locally. Since the interactions are short ranged, we assume that only interactions local to the ligand binding site are affected on binding. In our model this means we allow the effective springs local to the ligand binding site to change on the binding of an inducer or corepressor. In most repressor proteins, including the lac repressor, there is one effector binding site on each subunit. Since the binding of successive effector ligands is usually strongly positively cooperative, we simplify the binding of two effectors to a dimer to a single bound holorepressor state. Section 3.6.1 investigates successive ligand binding. Just as effective springs local to the inducer or corepressor binding site are affected by effector binding, springs local to the DNA binding site will be affected by DNA binding. The other springs act as anchoring potentials. Each of the modes of vibration will be governed by a linear combination of the effective spring constants. Together they determine the intramolecular vibrational free energy. This will change on effector and DNA binding, thus giving us a vibrational contribution to the allosteric free energy.

The six degrees of freedom of the two plates model, with spring constants $\lambda_i$ as defined in figure 3.8, lead to a $6\times6$ interaction matrix $\mathbf{K}$ with non-zero components

\[ \begin{aligned}
K_{11} &= \lambda^y_{00} \\
K_{22} &= \lambda^z_{01} + \lambda^z_{0-1} + \lambda^z_{00} \\
K_{23} &= K_{32} = \lambda^z_{01} - \lambda^z_{0-1} \\
K_{33} &= \lambda^z_{01} + \lambda^z_{0-1} \\
K_{44} &= \lambda^x_{10} + \lambda^x_{0-1} + \lambda^x_{00} + \lambda^x_{01} + \lambda^x_{-10} \\
K_{45} &= K_{54} = \lambda^x_{10} - \lambda^x_{-10} \\
K_{46} &= K_{64} = \lambda^x_{01} - \lambda^x_{0-1} \\
K_{55} &= \lambda^x_{10} + \lambda^x_{-10} \\
K_{66} &= \lambda^x_{01} + \lambda^x_{0-1}.
\end{aligned} \qquad (3.11) \]

The inertial matrix and the displacement vector are

\[ \mathbf{M} = \begin{pmatrix}
m & 0 & 0 & 0 & 0 & 0 \\
0 & m & 0 & 0 & 0 & 0 \\
0 & 0 & I_x/w^2 & 0 & 0 & 0 \\
0 & 0 & 0 & m & 0 & 0 \\
0 & 0 & 0 & 0 & I_z/l^2 & 0 \\
0 & 0 & 0 & 0 & 0 & I_y/w^2
\end{pmatrix}, \qquad
\mathbf{x} = \begin{pmatrix} y \\ z \\ w\theta_x/2 \\ x \\ l\theta_z/2 \\ w\theta_y/2 \end{pmatrix}, \qquad (3.12) \]

where $m$ is the reduced mass and $I_i$ is the reduced moment of inertia about the $i$-axis.
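Because $\mathbf{K}$ in equation (3.11) is block diagonal (a $1\times1$ block for $y$, a $2\times2$ block for $z$ and $\theta_x$, and a $3\times3$ block for $x$, $\theta_z$ and $\theta_y$), its determinant factorises into a product of three terms, which is the form that appears inside the logarithms of equation (3.13). The following Python sketch builds $\mathbf{K}$ and compares a brute-force determinant with that product. The dictionary keys, function names and spring values are hypothetical choices made for this illustration only.

```python
def build_k(s: dict) -> list:
    """Assemble the 6x6 interaction matrix of eq. (3.11) from spring constants."""
    k = [[0.0] * 6 for _ in range(6)]
    k[0][0] = s["y00"]
    k[1][1] = s["z01"] + s["z0-1"] + s["z00"]
    k[1][2] = k[2][1] = s["z01"] - s["z0-1"]
    k[2][2] = s["z01"] + s["z0-1"]
    k[3][3] = s["x10"] + s["x0-1"] + s["x00"] + s["x01"] + s["x-10"]
    k[3][4] = k[4][3] = s["x10"] - s["x-10"]
    k[3][5] = k[5][3] = s["x01"] - s["x0-1"]
    k[4][4] = s["x10"] + s["x-10"]
    k[5][5] = s["x01"] + s["x0-1"]
    return k

def det(m: list) -> float:
    """Determinant by Laplace expansion along the first row (fine for 6x6)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        if a != 0.0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * a * det(minor)
    return total

def factorised_det(s: dict) -> float:
    """Product of the y, z and x block determinants."""
    z_block = 4 * s["z01"] * s["z0-1"] + s["z00"] * (s["z01"] + s["z0-1"])
    x_block = (4 * (s["x10"] * s["x-10"] * (s["x01"] + s["x0-1"])
                    + s["x01"] * s["x0-1"] * (s["x10"] + s["x-10"]))
               + s["x00"] * (s["x10"] + s["x-10"]) * (s["x01"] + s["x0-1"]))
    return s["y00"] * z_block * x_block

# Arbitrary illustrative spring constants (not fitted values).
springs = {"x10": 2.0, "x-10": 3.0, "x01": 1.5, "x0-1": 0.5, "x00": 4.0,
           "y00": 1.0, "z01": 0.7, "z0-1": 1.3, "z00": 2.0}
print(det(build_k(springs)), factorised_det(springs))
```

The two printed numbers should agree to rounding error, which is a useful guard against transcription mistakes in the matrix elements.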
The free energy is then given by equation (3.4):

\[ G = \frac{1}{2}k_BT\Big( \ln\big[\lambda^y_{00}\big(4\lambda^z_{01}\lambda^z_{0-1} + \lambda^z_{00}(\lambda^z_{01} + \lambda^z_{0-1})\big)\big] + \ln\big[4(\lambda^x_{10}\lambda^x_{-10}\lambda^x_{01} + \lambda^x_{10}\lambda^x_{-10}\lambda^x_{0-1} + \lambda^x_{01}\lambda^x_{10}\lambda^x_{0-1} + \lambda^x_{01}\lambda^x_{-10}\lambda^x_{0-1}) + \lambda^x_{00}(\lambda^x_{10} + \lambda^x_{-10})(\lambda^x_{01} + \lambda^x_{0-1})\big] \Big). \qquad (3.13) \]

As for the toy rod model (section 3.2.1), it is the change in vibrational free energy on binding that we are interested in.

3.2.3 Example theoretical inter-monomer potential

The vibrational free energy of a protein may increase or decrease on ligand binding, depending on the details of the inter-subunit interactions at the binding site and how these are modified by the ligand. To illustrate that it is possible for a ligand to influence this free energy in both directions, we consider a very simple example potential, namely a Lennard-Jones potential. This models the van der Waals attraction and the short range repulsion between the subunits. Here, for simplicity, we consider this in one dimension:

\[ V_0 = \epsilon_{\text{L-J}}\left(\frac{1}{\hat{d}^{12}} - \frac{2}{\hat{d}^{6}}\right), \qquad (3.14) \]

where the energy scale parameter $\epsilon_{\text{L-J}}$ corresponds to the well depth and $\hat{d} = d/d_0$ is the separation between the monomers $d$ normalised by the equilibrium separation $d_0$, such that the minimum is at $\hat{d} = 1$. The effective spring constant is then given by the curvature at the minimum of this well,

\[ \lambda_0 = \frac{1}{d_0^2}\left.\frac{\partial^2 V_0}{\partial\hat{d}^2}\right|_{\hat{d}=1} = \frac{72\,\epsilon_{\text{L-J}}}{d_0^2}. \]

We suppose that ligand binding adds an extra 'hard-core' term (a model of a local conformational change in which the subunits are separated by the ligand):

\[ V = \epsilon_{\text{L-J}}\left(\frac{1}{\hat{d}^{12}} - \frac{2}{\hat{d}^{6}} + \frac{A}{(\hat{d} - 1)^{10}}\right), \qquad (3.15) \]

where $A$ is the dimensionless strength of the ligand binding perturbation. The new effective spring constant $\lambda_B$ is given by the curvature at the new minimum.

Figure 3.9: Graph showing the two cases of an increase and decrease in curvature at the minimum for the potential $V$ given in equation (3.15), plotted as $V/\epsilon_{\text{L-J}}$ against $d/d_0$. The solid red curve is the unbound case $A = 0$, dashed green shows an increase in curvature for the strength of binding $A = 10^{-10}$ and dotted blue shows a decrease in curvature for a higher binding strength $A = 10^{-5}$. The potentials are plotted against the ratio of subunit separation to the equilibrium separation, $d/d_0$, as a dimensionless measure of conformational change.

As figure 3.9 shows, a small perturbation from the ligand (small value of $A$) increases the curvature at the minimum point, but a larger perturbation (larger $A$) causes the curvature to decrease. In this model, therefore, the bound spring constant $\lambda_B$ may increase or decrease on ligand binding depending on the degree of conformational change. An increase in the spring constant ($\lambda_B > \lambda_0$) gives an entropy decrease on binding, $\Delta S < 0$, and the protein is 'stiffer'. The converse is a decrease in the spring constant on binding ($\lambda_B < \lambda_0$), which gives an entropy increase on binding, $\Delta S > 0$, and the protein is more flexible. The strength of the ligand binding hard-core term at the crossover between the two cases is $A_c = 5\times10^{-8}$. The new minimum at this point is $\hat{d} = 1.25$ (the dimensionless measure of conformational change as the ratio of the subunit separation to the equilibrium separation). Note that the change in depth of the minimum gives the static enthalpy change.

3.3 Parameterisation

In this section we develop methods of parameterisation of the coarse-grained models using atomic details. We evaluate our model in the real example of the lac repressor. We show how parameterisation of the coarse-grained model results in a realistic value for the lac repressor allosteric free energy.

3.3.1 Experimental B-factors

Initially, we obtain a quick, rough parameterisation of the toy rods model (described in section 3.2.1) to see whether a fuller parameterisation of the rigid plates model (of section 3.2.2) is worth embarking on.
For this initial rough parameterisation of the two rods model we used X-ray B-factor data. The B-factors of the atoms in the appropriate protein regions for the aporepressor [Lewis et al., 1996], holorepressor [Friedman et al., 1995] and complex with DNA [Bell and Lewis, 2000] were converted into atomic mean-square displacements $\langle\bar{u}^2\rangle$ using $B = 8\pi^2\langle\bar{u}^2\rangle$. The spring constants $\lambda_i$ (in this case averaged over the vibrations in the different planes) were estimated from the expression $\lambda = k_BT/\langle\bar{u}^2\rangle$, where $\langle\bar{u}^2\rangle$ was taken as the average over atoms in the relevant region. This gave $\lambda_1/\lambda_0 \approx 1.2$, $\lambda_{-1}/\lambda_0 \approx 0.1$ (estimated from Slijper et al. [1997]), $\Lambda_1 \approx 0.07$ and $\Lambda_{-1} \approx 6.7$. Substituting these values into equation (3.6) gives an estimate of the vibrational contribution to $\Delta\Delta G$. Including a factor of three due to the three planes of vibration and a factor of two to compare our dimer results with experimental tetramer results, we obtain $\Delta\Delta G \sim 1.4k_BT$. Since the experimental values for the change in binding energy between holo and aporepressor binding to DNA are $\Delta\Delta G \sim 6k_BT$ [Barkley and Bourgeois, 1980; Horton et al., 1997; Levandoski et al., 1996], this indicates that the vibrational contribution is likely to be significant. The crystal structure dynamics are likely to give a lower bound for the amplitudes of vibration, due to the constrained nature of the crystal compared to the solution environment.

3.3.2 Atomistic simulation

To improve this rough estimate of the spring parameterisation we used a computational approach based on fully atomistic MD potentials to parameterise the rigid plates model (of section 3.2.2). In the spirit of the rigid monomer approximation, we treat each protein subunit as a rigid body by fixing its atomic coordinates within the subunit. Keeping one subunit fixed, we move the other subunit relative to it. At each incremental displacement the energy change from summing all interactions in the potential is calculated.
By fixing the atoms within a subunit we freeze out many higher frequency modes, looking only at the lower frequency global modes. We assume the entropy of each subunit does not change on collective transformation of all the subunit atoms. This approximation is justified for our purposes since we are not interested in the higher modes but only in the lower collective modes. In section 5.2 we consider a case where the entropy of the sidechains is affected by the collective modes.

For our test case of the lac repressor we used the X-ray crystal atomic coordinates PDB ID: 1LBI [Lewis et al., 1996] for the aporepressor without the inducer bound and PDB ID: 1TLF [Friedman et al., 1995] for the holorepressor with the inducer bound. We used the software 'Crystallography and NMR System' (CNS) [Brunger et al., 1998], which uses molecular dynamics force fields. We used the 'protein-allhdg.top' and 'protein-allhdg.param' topology and parameter files. Using the CNS package we generated the hydrogen atoms that were missing in the crystal structures and performed 1000 steps of Powell minimisation. Then we fixed the coordinates of one subunit and transformed the coordinates of the other, recalculating the interaction energy between them at each increment. The potential energy calculation is a sum of the interactions between all the atoms in the two subunits. We repeated this subunit transformation for a sufficient number of different modes to obtain all the spring constants. To determine the nine spring constants in the model, nine independent modes are needed, involving linear combinations of the spring constants. The different modes we looked at were: translation along each axis, rotation about x and z (each with the rotation point at the top and bottom of the crystal structure) and finally, rotation about the y axis from points at the left and right of the structure.
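The rigid-body scan described above can be illustrated on a toy system. The Python sketch below is not the CNS calculation: it replaces the atomistic force field with a few harmonic pair springs across an interface (an assumption made purely so that the answer is known in advance), translates one rigid set of points along x, and extracts the curvature of the resulting energy well from a least-squares quadratic fit. All names, coordinates and spring values are hypothetical, and the curvature is taken here as the second derivative $2C$ of the fitted $A + Bx + Cx^2$.

```python
import math

def interface_energy(fixed, moving, shift, pairs):
    """Sum harmonic pair energies after rigidly translating `moving` by `shift`."""
    total = 0.0
    for i, j, k, r0 in pairs:
        moved = [c + s for c, s in zip(moving[j], shift)]
        r = math.sqrt(sum((a - b) ** 2 for a, b in zip(fixed[i], moved)))
        total += 0.5 * k * (r - r0) ** 2
    return total

def det3(a):
    """Determinant of a 3x3 matrix."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def quadratic_fit(xs, ys):
    """Least-squares fit y = A + B*x + C*x**2 via the normal equations."""
    s = [sum(x ** p for x in xs) for p in range(5)]               # power sums
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    m = [[s[r + c] for c in range(3)] for r in range(3)]
    d = det3(m)
    coeffs = []
    for col in range(3):                                           # Cramer's rule
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = t[r]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)

# Toy interface: three facing point pairs, 4 units apart along x.
fixed = [(0.0, 0.0, 0.0), (0.0, 5.0, 0.0), (0.0, 0.0, 10.0)]
moving = [(4.0, 0.0, 0.0), (4.0, 5.0, 0.0), (4.0, 0.0, 10.0)]
pairs = [(0, 0, 300.0, 4.0), (1, 1, 500.0, 4.0), (2, 2, 200.0, 4.0)]

steps = [0.01 * n for n in range(-5, 6)]                           # displacements
energies = [interface_energy(fixed, moving, (dx, 0.0, 0.0), pairs) for dx in steps]
a, b, c = quadratic_fit(steps, energies)
curvature = 2.0 * c    # equals the sum of the three spring constants here
print(curvature)
```

For a pure translation every spring is stretched by the same amount, so the fitted curvature is simply the sum of the spring constants; rotations about an off-centre point weight the springs by the square of their lever arm instead.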
The standard orientation we use is with the DNA binding site at the bottom, with the axes as shown in figure 3.6. The data generated for the energy at each incremental transformation give us potential energy wells for the interaction between the monomers in the different modes. An example of these curves is given in figure 3.10. By fitting the bottom of these wells to a quadratic (the harmonic approximation) we were able to extract the curvature of the potential well for each transformation investigated. The curve fits obtained are given in table 3.1.

Figure 3.10: Graph showing an example of a change in spring constant with and without the inducer for the lac repressor (potential/$k_BT$ against displacement/Å). The mode shown is rotation about the y-axis (rotation point (0,0,20) Å). The blue squares are for PDB ID: 1LBI, the aporepressor, and the red stars are for PDB ID: 1TLF, the holorepressor. The lines are quadratic fits.

| Transformation | potential min/$k_BT$ | position min/Å | curvature/$k_BT$ Å$^{-2}$ |
|---|---|---|---|
| 1LBI rot t x | -454 | 0.03 | 618 |
| 1LBI rot b x | -454 | 0.06 | 353 |
| 1LBI rot t z | -455 | -0.03 | 1933 |
| 1LBI rot b z | -460 | -0.15 | 674 |
| 1LBI rot l y | -461 | -0.13 | 1153 |
| 1LBI rot r y | -454 | -0.03 | 1368 |
| 1LBI trans 0 x | -457 | -0.08 | 1030 |
| 1LBI trans 0 y | -453 | 0.01 | 296 |
| 1LBI trans 0 z | -454 | 0.05 | 405 |
| 1TLF rot t x | -403 | 0.24 | 674 |
| 1TLF rot b x | -389 | 0.11 | 736 |
| 1TLF rot t z | -388 | 0.06 | 1693 |
| 1TLF rot b z | -391 | 0.13 | 764 |
| 1TLF rot l y | -387 | 0.07 | 1038 |
| 1TLF rot r y | -391 | 0.11 | 1222 |
| 1TLF trans 0 x | -389 | 0.10 | 1122 |
| 1TLF trans 0 y | -389 | 0.11 | 932 |
| 1TLF trans 0 z | -398 | 0.24 | 479 |

Table 3.1: Table of fitting parameters for the lac repressor without and with the inducer (PDB ID: 1LBI and 1TLF) for translations (trans) along each axis and rotations (rot) centred at points (top (t), bottom (b), left (l), right (r)) about each axis. The fits are quadratic fits of the form $A + Bx + Cx^2$. The table shows the minimum potential and the point relative to the initial crystal structure at which this occurs. The curvature values obtained are used to parameterise the effective springs in the model.

The curvatures of each mode are linear combinations of the effective spring constants in our model:

\[ \begin{aligned}
\kappa^x_{\text{trans}} &= \lambda^x_{-10} + \lambda^x_{00} + \lambda^x_{10} + \lambda^x_{0-1} + \lambda^x_{01} \\
\kappa^z_{\text{rot-t}} &= 4\lambda^x_{-10} + \lambda^x_{01} + \lambda^x_{00} + \lambda^x_{0-1} \\
\kappa^z_{\text{rot-b}} &= 4\lambda^x_{10} + \lambda^x_{01} + \lambda^x_{00} + \lambda^x_{0-1} \\
\kappa^y_{\text{rot-r}} &= 4\lambda^x_{01} + \lambda^x_{10} + \lambda^x_{00} + \lambda^x_{-10} \\
\kappa^y_{\text{rot-l}} &= 4\lambda^x_{0-1} + \lambda^x_{10} + \lambda^x_{00} + \lambda^x_{-10} \\
\kappa^y_{\text{trans}} &= \lambda^y_{00} \\
\kappa^z_{\text{trans}} &= \lambda^z_{0-1} + \lambda^z_{00} + \lambda^z_{01} \\
\kappa^x_{\text{rot-t}} &= 4\lambda^z_{0-1} + \lambda^z_{00} \\
\kappa^x_{\text{rot-b}} &= 4\lambda^z_{01} + \lambda^z_{00}.
\end{aligned} \qquad (3.16) \]

The effective spring constants are therefore given by

\[ \begin{aligned}
\lambda^x_{-10} &= (3\kappa^z_{\text{rot-t}} + \kappa^z_{\text{rot-b}} - 4\kappa^x_{\text{trans}})/8 \\
\lambda^x_{00} &= (6\kappa^x_{\text{trans}} - \kappa^z_{\text{rot-t}} - \kappa^z_{\text{rot-b}} - \kappa^y_{\text{rot-l}} - \kappa^y_{\text{rot-r}})/2 \\
\lambda^x_{10} &= (3\kappa^z_{\text{rot-b}} + \kappa^z_{\text{rot-t}} - 4\kappa^x_{\text{trans}})/8 \\
\lambda^x_{0-1} &= (3\kappa^y_{\text{rot-l}} + \kappa^y_{\text{rot-r}} - 4\kappa^x_{\text{trans}})/8 \\
\lambda^x_{01} &= (3\kappa^y_{\text{rot-r}} + \kappa^y_{\text{rot-l}} - 4\kappa^x_{\text{trans}})/8 \\
\lambda^y_{00} &= \kappa^y_{\text{trans}} \\
\lambda^z_{0-1} &= (3\kappa^x_{\text{rot-t}} + \kappa^x_{\text{rot-b}} - 4\kappa^z_{\text{trans}})/8 \\
\lambda^z_{00} &= (4\kappa^z_{\text{trans}} - \kappa^x_{\text{rot-t}} - \kappa^x_{\text{rot-b}})/2 \\
\lambda^z_{01} &= (3\kappa^x_{\text{rot-b}} + \kappa^x_{\text{rot-t}} - 4\kappa^z_{\text{trans}})/8.
\end{aligned} \qquad (3.17) \]

In this way, we obtained the spring constants of both the effector bound and unbound states of repressors. These effective spring constants are given in table 3.2.

| spring constant/$k_BT$ Å$^{-2}$ | 1LBI (apo) | 1TLF (holo) |
|---|---|---|
| $\lambda^x_{-10}$ | 294 | 169 |
| $\lambda^x_{00}$ | 527 | 1010 |
| $\lambda^x_{10}$ | -21 | -63 |
| $\lambda^x_{0-1}$ | 88 | -19 |
| $\lambda^x_{01}$ | 142 | 27 |
| $\lambda^y_{00}$ | 296 | 932 |
| $\lambda^z_{0-1}$ | 73 | 105 |
| $\lambda^z_{00}$ | 324 | 253 |
| $\lambda^z_{01}$ | 7.1 | 121 |

Table 3.2: Table of values of the spring constants $\lambda^k_{ij}$ (shown in figure 3.8) calculated from the quadratic fits in table 3.1.

The combined effect of this parameterisation is indeed, on average, an increased flexibility of the inducer bound repressor, leading to unfavourable DNA binding due to an increased vibrational entropy. However, the values of the $\lambda_i$ do not show clear constant 'anchor' and changing 'ligand binding' springs. Some $\lambda_i$ are negative, which implies a locally unstable interaction rather than the expected attractive interaction. This is indicative of motion in certain directions being over-constrained due to the crude method of spring positioning in the model.
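The conversion from fitted curvatures to effective springs in equation (3.17) is a small linear inversion that can be reproduced from the curvature column of table 3.1. The Python sketch below does this and then evaluates the change in equation (3.13) between the holo and apo parameter sets. It is our own illustration with hypothetical names, not the code used for the thesis; the mapping of table rows to the $\kappa$ symbols (for example 'rot t z' to $\kappa^z_{\text{rot-t}}$) is our reading of the table labels.

```python
import math

# Curvatures from table 3.1 in k_B T per square angstrom.
APO = {"rot_t_x": 618, "rot_b_x": 353, "rot_t_z": 1933, "rot_b_z": 674,
       "rot_l_y": 1153, "rot_r_y": 1368,
       "trans_x": 1030, "trans_y": 296, "trans_z": 405}
HOLO = {"rot_t_x": 674, "rot_b_x": 736, "rot_t_z": 1693, "rot_b_z": 764,
        "rot_l_y": 1038, "rot_r_y": 1222,
        "trans_x": 1122, "trans_y": 932, "trans_z": 479}

def springs_from_curvatures(k: dict) -> dict:
    """Invert the linear relations (3.16) using the expressions in (3.17)."""
    return {
        "x-10": (3 * k["rot_t_z"] + k["rot_b_z"] - 4 * k["trans_x"]) / 8,
        "x00": (6 * k["trans_x"] - k["rot_t_z"] - k["rot_b_z"]
                - k["rot_l_y"] - k["rot_r_y"]) / 2,
        "x10": (3 * k["rot_b_z"] + k["rot_t_z"] - 4 * k["trans_x"]) / 8,
        "x0-1": (3 * k["rot_l_y"] + k["rot_r_y"] - 4 * k["trans_x"]) / 8,
        "x01": (3 * k["rot_r_y"] + k["rot_l_y"] - 4 * k["trans_x"]) / 8,
        "y00": float(k["trans_y"]),
        "z0-1": (3 * k["rot_t_x"] + k["rot_b_x"] - 4 * k["trans_z"]) / 8,
        "z00": (4 * k["trans_z"] - k["rot_t_x"] - k["rot_b_x"]) / 2,
        "z01": (3 * k["rot_b_x"] + k["rot_t_x"] - 4 * k["trans_z"]) / 8,
    }

def free_energy(s: dict) -> float:
    """Eq. (3.13) in units of k_B T."""
    yz = s["y00"] * (4 * s["z01"] * s["z0-1"]
                     + s["z00"] * (s["z01"] + s["z0-1"]))
    x = (4 * (s["x10"] * s["x-10"] * (s["x01"] + s["x0-1"])
              + s["x01"] * s["x0-1"] * (s["x10"] + s["x-10"]))
         + s["x00"] * (s["x10"] + s["x-10"]) * (s["x01"] + s["x0-1"]))
    return 0.5 * (math.log(yz) + math.log(x))

apo = springs_from_curvatures(APO)
holo = springs_from_curvatures(HOLO)
delta_g = free_energy(holo) - free_energy(apo)   # per dimer, in k_B T
print({name: round(value, 1) for name, value in apo.items()})
print(round(delta_g, 2))                          # about -1.2
```

The apo values printed agree with the 1LBI column of table 3.2 to the precision quoted there. The free energy change on inducer binding comes out at roughly $-1.2k_BT$ per dimer; doubling it and changing the sign, as done in section 3.4, gives about $2.4k_BT$, close to the $\sim 2.5k_BT$ quoted there. Note that the logarithms remain defined only because the negative springs are outweighed by the positive ones in each block.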
Because the springs are placed arbitrarily, the actual parameters do not easily correspond to simple meaningful interactions. However, since the network of springs we work with is equivalent to summations over all atomic pairwise interactions, the resulting calculated free energies are meaningful. The springs need to be placed where they will not cause unphysical steric clashes with other parts of the protein structure. As long as this condition is satisfied, the exact configuration of springs is not important, since their values will be consistent with the total energy of the atomistic protein structure. It can be seen from table 3.2 that the effective springs at the bottom of figure 3.8, far from the effector binding site, $\lambda^x_{-10}$ and $\lambda^z_{0-1}$, are less affected by effector binding. The actual position of the inducer binding site in the lac repressor is near the centre of the protein subunits, so we expect the effective springs at the centre to be affected by inducer binding. This is clearly the case in table 3.2.

A more physically enlightening parameterisation method would pick out the local interactions directly affected by ligand binding, giving an effective spring local to the binding site. This could be achieved by including only atoms within a certain cutoff distance of the ligand binding site. Each spring could be calculated in a similar manner, including atoms local to the effective spring in question. The resulting effective springs would then have more physically identifiable values. The challenge with a method like this would be in dividing the protein into reasonable regions.

Since we wish to probe the lowest frequency modes, a more realistic simulation would include minimisation steps after each incremental transformation, before the energy calculation. This would allow the relaxation over fast time scales that would occur in a real protein.
Such a method would include the entropy of the fast modes, therefore probing the free energy landscape. However, including such minimisation steps would increase the computational time needed. In chapter 5 we investigate theoretically the case of such fast modes being affected by the position of the slow mode.

Another method of computational parameterisation would be to use the normal mode eigenvalues calculated by an elastic network model method (introduced in section 2.2.6). However, since we have more springs than degrees of freedom, the lowest six eigenvalues would not completely define our set of effective springs (see section 6.1.1 for further discussion of this).

3.4 Results and comparison to experiments

We put the effective spring constant parameters, calculated as described in section 3.3.2, into equation (3.13). We make the simplification that the calculated allosteric free energy $\Delta\Delta G \approx -\Delta G(\text{bind inducer})$. This is valid for large $\Lambda_{-1}$ (i.e., the springs in the locality of the DNA binding site greatly stiffen on DNA binding). The result for the dimer is multiplied by two in order to compare with experimental data for two dimers that form a tetramer. In total this gives an estimate of $\Delta\Delta G \sim 2.5k_BT$. Interestingly, the softest mode contributing most to the allostery, rotation about the y-axis, is the one that shifts the DNA read heads away from the DNA (see table 3.1). Physically, such 'entropic allostery' allows the inducer binding to communicate, via the large amplitudes of the global internal modes of the protein, with the headpiece binding regions near the DNA, which, as a result, move too much to be inserted into the DNA.

3.4.1 Comparison to biochemical data

Donner et al. [1982] measure the binding of the inducer IPTG to the lac repressor by calorimetry. Their results give $\Delta G(\text{bind inducer}) = -12k_BT$, $\Delta H(\text{bind inducer}) = -6k_BT$ and $T\Delta S(\text{bind inducer}) = 6k_BT$.
Equilibrium measurements give the allosteric free energy $\Delta\Delta G \sim 6k_BT$ [Barkley and Bourgeois, 1980; Horton et al., 1997; Levandoski et al., 1996]. Horton et al. [1997] quote IPTG increasing the DNA dissociation constant 1000-fold, from $10^{-11}$ to $10^{-8}$ M. This value of $\Delta\Delta G \sim 6k_BT$ includes the static effect as well as the vibrational effect that we have calculated.

Entropic changes on binding are often interpreted as being due to the hydrophobic effect [Spolar and Record, 1994]. This arises from the change in hydrophobic surface area exposed to solvent. Kalodimos et al. [2004a] calculate the change in hydrophobic surface area of the lac repressor on binding to DNA as 1400 Å$^2$. However, this is the same for specific and non-specific DNA binding, so it cannot explain the difference in affinities. Similarly, the apo and holorepressors binding to DNA will bury the same hydrophobic surface area, so this static entropic effect cannot explain the allosteric entropy. Kalodimos et al. [2004a] suggest that contributions to entropy changes on binding from changes in the flexibility are therefore crucial. Kalodimos et al. [2004b] suggest that the hinge helix (residues 50-58), which is disordered in the free and non-specifically bound states but structured in the specific complex, is important in communicating the allosteric signal. In terms of our model this is the likely position of the spring local to the DNA binding site. Unfortunately, however, the apo and holorepressor crystal structures that we have used to parameterise our model do not contain these residues.

Falcon and Matthews [2001] find that introducing a disulphide bond between the hinge helix of each monomer in a lac repressor dimer increases the affinity for DNA whilst reducing specificity. The engineered lac repressor binds strongly to both operator and non-specific DNA. Allosteric response to the inducer is lost. These observations can be neatly explained by our theory.
The disulphide bond introduces a very large DNA local spring constant that remains constant. Thus the vibrational entropy cost of binding is greatly reduced, removing the allosteric control and leaving a high affinity repressor. Putting $\lambda_{-1} = \lambda_{-1}' \to \infty$ into equation (3.6) leads to $\Delta\Delta G = 0$.

3.4.2 Significance of the vibrational component

To obtain an idea of the significance of the calculated vibrational component compared to the total $\Delta\Delta G \sim 6k_BT$, we calculated the probability curve of DNA sites occupied by repressors against lactose concentration with and without the vibrational component (following Yildirim and Mackey [2003]) (see figure 3.11).

Figure 3.11: Graph showing the fraction, $f$, of DNA sites bound by repressors against the concentration of inducer lactose $[L]$ (number of molecules). The solid red curve includes the vibrational component we have calculated and the dashed green curve is without this vibrational component.

The equilibrium constants are given by

\[ R + nL \rightleftharpoons RL_n, \qquad K_{\text{lac}} = \frac{[RL_n]}{[R][L]^n} = e^{-\Delta G_{\text{lac}}/k_BT} \qquad (3.18) \]
\[ D + R \rightleftharpoons DR, \qquad K_{\text{DNA}} = \frac{[DR]}{[D][R]} = e^{-\Delta G_{\text{DNA}}/k_BT} \qquad (3.19) \]

where $[R]$ is the concentration of free repressor, $[RL_n]$ the concentration of repressor with $n$ inducers bound, $[L]$ the concentration of inducer lactose, $[DR]$ the concentration of repressor occupied DNA sites and $[D]$ the concentration of free DNA repressor binding sites. The fraction of DNA sites occupied by repressor proteins is given by

\[ f = \frac{[DR]}{[D] + [DR]} = \frac{K_{\text{DNA}}[R]}{1 + K_{\text{DNA}}[R]}. \qquad (3.20) \]

Assuming the amount of repressor bound to DNA is small compared to free repressor gives $[R_{\text{tot}}] = [R] + [RL_n] + [DR] \approx [R] + K_{\text{lac}}[R][L]^n$. This leads to

\[ f = \frac{K_{\text{DNA}}[R_{\text{tot}}]}{1 + K_{\text{DNA}}[R_{\text{tot}}] + [L]^n e^{-\Delta G_{\text{lac}}/k_BT}}. \qquad (3.21) \]

We take $n = 2$ for two inducers binding and $-\Delta G_{\text{lac}} = \Delta\Delta G$, with and without our calculated vibrational contributions, for the two respective curves.
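Equation (3.21) is simple enough to explore numerically. The Python sketch below evaluates the occupied fraction and inverts it for the inducer level needed to reach a target occupancy. The function names are hypothetical, and the value used for the product $K_{\text{DNA}}[R_{\text{tot}}]$ is an arbitrary placeholder, since the text does not quote it; only the comparison between the two values of $\Delta\Delta G$ is meaningful here.

```python
import math

def occupied_fraction(lactose: float, k_dna_rtot: float, ddg: float, n: int = 2) -> float:
    """Eq. (3.21) with -dG_lac = ddG, both in units of k_B T."""
    return k_dna_rtot / (1.0 + k_dna_rtot + lactose ** n * math.exp(ddg))

def lactose_for_fraction(f: float, k_dna_rtot: float, ddg: float, n: int = 2) -> float:
    """Invert eq. (3.21): inducer level at which the occupied fraction equals f."""
    rhs = k_dna_rtot * (1.0 - f) / f - 1.0
    if rhs <= 0.0:
        raise ValueError("target fraction is not reachable for this K_DNA*[R_tot]")
    return (rhs * math.exp(-ddg)) ** (1.0 / n)

K_DNA_RTOT = 1000.0   # placeholder value, assumed for illustration only
DDG_WITH = 6.0        # total allosteric free energy quoted in the text
DDG_WITHOUT = 3.5     # same with the ~2.5 k_B T vibrational part removed

# Inducer level for 95% activation (5% of operators still bound).
l_with = lactose_for_fraction(0.05, K_DNA_RTOT, DDG_WITH)
l_without = lactose_for_fraction(0.05, K_DNA_RTOT, DDG_WITHOUT)
print(l_without / l_with)   # equals exp((6.0 - 3.5) / 2) for n = 2
```

In this form the threshold inducer level scales as $e^{-\Delta\Delta G/n}$, so removing the vibrational contribution always raises the amount of inducer needed, whatever value is assumed for $K_{\text{DNA}}[R_{\text{tot}}]$.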
For 95% activation (operators not bound), 18 lactose molecules are required with the vibrational contribution but 50 would be required if there were no vibrational component. Note that there are only of order 10 repressors in the cell [Kalodimos et al., 2004b]. This would imply that the vibrational contribution to allostery is significant in controlling the level of lactose concentration at which the gene expression is turned on.

3.4.3 Comparison to elastic network model simulations

As a test of our method we compare our results with recently developed, more finely coarse-grained computational normal mode analysis methods. We calculated the change in vibrational free energy from eigenvalues obtained using elastic network models. We obtained normal mode eigenvalues using the software iGNM [Yang et al., 2005] and elNémo [Suhre and Yves-Henri, 2004] (introduced in section 2.2.6), with lac repressor dimers from the crystal structures PDB ID: 1LBI and 1TLF [Lewis et al., 1996; Friedman et al., 1995] as inputs. We calculated the free energy component from each mode with eigenvalue $\lambda_i$, in units of $k_BT$, using $\Delta G_i = \frac{1}{2}\ln(\lambda_i[\text{holo}]/\lambda_i[\text{apo}])$. The results are plotted in figure 3.12. The free energy change on inducer binding including the components from the first $n$ modes is plotted against the number of modes $n$ included. As a comparison we diagonalised our interaction matrix $\mathbf{K}$ (equation (3.11)). Using the spring constant parameters obtained as in section 3.3.2, we calculated the equivalent eigenvalues and free energy contributions for the six modes of our coarse-grained model.

Figure 3.12: Comparison of GNM (red circles), elNémo (green squares) and our model (blue triangles) cumulative free energy change components for each mode against the mode number (vertical axis: cumulative entropy change/$k_BT$).

As can be seen from figure 3.12, breaking down our model into components from individual modes does not compare very well to the lowest modes calculated by GNM or elNémo. This is hardly surprising given the highly coarse-grained nature of our model and the crudity of the parameterisation we have used. The unsatisfactory shape of the curve produced by our method supports our intuition that including fewer than the six most global modes is a poor approximation. It can clearly be seen from the elNémo results in figure 3.12 that including each successive mode has a decreasing effect, such that the allosteric free energy seems to saturate at around 30 modes. As expected, the lowest modes have the greatest effect. This calculation implies that more than the six modes we include are required for the full allosteric effect. The fact that the value we get from the six modes of our simple model is of the same order as that obtained by a more detailed model indicates, however, that our treatment is reasonable. This implies that the parameterisation of the plates model has projected some of the higher modes onto our intuitive low modes.

3.5 Lac repressor mutations

In order to test our model we have predicted the effect certain point mutations will have on the allostery of the lac repressor. ITC on these mutant proteins will be a test of these predictions. Suckow et al. [1996] and Pace et al. [1997] analyse over 4000 point mutations (mutations of single amino acids) in the lac repressor. They observe the effects of these mutations on the resulting E. coli bacteria. Interestingly, they find that, as well as amino acids close to the inducer binding site, residues more than 8 Å away from the inducer binding site but at the dimerisation interface are crucial to induction. Bacteria with such mutations are unresponsive to the inducer, implying either that the inducer is unable to bind to the mutated lac repressor, or that it can bind but the allosteric mechanism is lost.
Our model suggests that mutations at the dimer interface would indeed affect the dynamic allosteric mechanism. Swint-Kruse et al. [2003] measure the DNA and inducer binding equilibria for three point mutations in the core near the inducer binding site. They find different mutations in this region can strengthen or weaken DNA binding affinity with opposing effects on inducer affinity.

We made in silico mutations by changing the residue in the PDB file. The backbone atoms for the old residue were relabelled as the new residue and the sidechain atoms deleted. New sidechain atoms were generated using CNS and 1000 steps of Powell minimisation were run. In order to calculate estimates for the effective spring constants in the model, the potential between the monomers for incremental transformations was calculated, as described in section 3.3.2. Table 3.3 shows the predicted change in allosteric free energy for various lac repressor mutants. Some mutants show positive cooperativity rather than the wild type negative cooperativity. Interestingly, all the mutants tested showed a decrease in predicted allosteric free energy compared to the wild type. This may be an indication that evolution has optimised the interfacial residues for maximum allosteric effect.

The residues chosen for study are expected to affect the spring constants in our model since they lie at the interface between the two monomers. To decide which residues to investigate, we calculated (using the software CNS) the interaction energy between each successive residue of one monomer with the other subunit to find which residues have the largest interfacial interactions.

Protein      $\Delta\Delta G(\mathrm{holo}-\mathrm{apo})/k_BT$
wild type    2.5
L251S        1.1
L251Q        -1.8
L251C        0.57
L251G        -0.61
L251F        1.2
E100L        1.6
E100S        1.6

Table 3.3: Predicted allosteric free energy for various lac repressor mutants.

Leucine 251 and glutamic acid (E)
100 were chosen. The position of these residues is shown in figure 3.13. In table 3.3, the notation L251S means the leucine (L) residue 251 in the wild type (natural) protein is changed to a serine (S).

Figure 3.13: Lac holorepressor (PDB ID: 1TLF [Friedman et al., 1995]) with inducer (yellow), L251 and E100 (green) highlighted as spacefill.

Leucine has a non-polar sidechain so mutations to polar residues such as serine (S) and glutamine (Q) were expected to make a difference to the electrostatic interactions. L251C is predicted to make a disulphide bond across the interface due to the proximity of the sulphur atoms in the cysteine residues introduced. Glutamic acid (E) has a large acidic sidechain so mutations to the non-polar leucine or to the small polar serine were thought to make a difference. The mutants L251Q, L251C, L251G and E100L may be made by Stockley et al. [2005] and used for such experiments.

3.6 Extensions to the basic model

3.6.1 Sequential binding

So far we have considered a single holorepressor state. However, most repressor proteins bind one inducer or corepressor per monomer, so two to a dimer. The case of multiple, sequential ligand binding will lead to additional structure. The binding of successive ligands will be cooperative if the binding of a second inducer is affected by the presence or absence of the first. If so, the quantity $\Delta\Delta G_{\mathrm{seq}} = \Delta G_{\mathrm{1st}}(\text{2nd bind}) - \Delta G_{\mathrm{apo}}(\text{1st bind})$ will be non-zero. We assume that the ligand bindings are symmetric, that is, $\Delta G_{\mathrm{apo}}(\text{1st bind}) = \Delta G_{\mathrm{apo}}(\text{2nd bind})$. First we do not consider binding to DNA, just the cooperativity of sequential inducer or corepressor binding to the repressor. For simplicity we consider the case of two rods and three springs as discussed in section 3.2.1.
By extending this to consider sequential inducer or corepressor binding, $\Delta\Delta G_{\mathrm{seq}} = \Delta G_{\mathrm{1st}}(\text{2nd bind}) - \Delta G_{\mathrm{apo}}(\text{1st bind})$ is obtained from equation (3.5), where $\lambda'_1$ is the spring constant of the form with two inducers or corepressors bound and $\lambda^I_1$ is that of the intermediate form with only one inducer or corepressor bound:

$$\Delta\Delta G_{\mathrm{seq}} = \frac{k_BT}{2}\ln\left(\frac{(4\lambda'_1\lambda_{-1}+\lambda'_1+\lambda_{-1})(4\lambda_1\lambda_{-1}+\lambda_1+\lambda_{-1})}{(4\lambda^I_1\lambda_{-1}+\lambda^I_1+\lambda_{-1})^2}\right). \tag{3.22}$$

We can obtain the following rule determining the cooperativity of inducer binding:

$$\Lambda^I_1 \begin{Bmatrix} > \\ < \end{Bmatrix} \left((\alpha+1)(\alpha+\Lambda_1)\right)^{1/2} - \alpha \qquad \begin{Bmatrix} \text{positive cooperativity} \\ \text{negative cooperativity} \end{Bmatrix} \tag{3.23}$$

where $\alpha = \dfrac{\lambda_0\lambda_{-1}}{4\lambda_{-1}\lambda_1+\lambda_0\lambda_1}$. $\Lambda_1 = \lambda'_1/\lambda_1$ is the spring constant when two inducers are bound, $\lambda'_1$, normalised by the unbound spring constant $\lambda_1$. $\Lambda^I_1 = \lambda^I_1/\lambda_1$ is the intermediate bound spring constant ratio - the spring constant when only one inducer is bound, $\lambda^I_1$, normalised by the unbound spring constant $\lambda_1$. Positive cooperative behaviour means the binding of the second inducer is easier if the first is present and negative cooperativity means the binding of the second is harder if the first is present.

We now investigate the case of the spring constants being equally divided, that is, the binding of a single inducer changes the spring constant by half that of binding both inducers; $\lambda^I_1 = \frac{1}{2}(\lambda_1 + \lambda'_1)$. This is a reasonable assumption for symmetric inducer binding. In this case equation (3.23) becomes

$$(\Lambda^I_1 - 1)^2 \begin{Bmatrix} > \\ < \end{Bmatrix} 0 \qquad \begin{Bmatrix} \text{positive cooperativity} \\ \text{negative cooperativity} \end{Bmatrix}. \tag{3.24}$$

Since $(\Lambda^I_1 - 1)^2$ is always positive, the binding of the second inducer is always easier than the first (positive cooperativity) if the spring constants are equally divided. It is interesting to note that, apart from in the trivial case of no change in spring constant $\Lambda^I_1 = 1$, even if the spring constants are equally divided the energy splittings are not equal and therefore there is non-zero cooperativity, that we have shown to be positive.
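As a quick numerical check of equations (3.22) to (3.24), the following Python sketch evaluates the sequential-binding free energy and the threshold rule for equally divided spring constants. It assumes spring constants normalised so that $\lambda_0 = 1$ by default, as equation (3.22) is written; the function names and the numerical values are illustrative choices, not part of the original calculation.

```python
import math

def stiffness_factor(lam1, lam_m1, lam0=1.0):
    """Determinant factor of the rigid two-rod, three-spring model:
    4*lam1*lam_m1 + lam0*(lam1 + lam_m1)."""
    return 4.0 * lam1 * lam_m1 + lam0 * (lam1 + lam_m1)

def ddG_seq(lam1, lam1_int, lam1_bound, lam_m1, lam0=1.0):
    """Sequential-binding cooperativity of eq. (3.22), in units of kB*T.
    lam1: unbound, lam1_int: one ligand bound, lam1_bound: two ligands bound."""
    num = stiffness_factor(lam1_bound, lam_m1, lam0) * stiffness_factor(lam1, lam_m1, lam0)
    den = stiffness_factor(lam1_int, lam_m1, lam0) ** 2
    return 0.5 * math.log(num / den)

def threshold(lam1, lam1_bound, lam_m1, lam0=1.0):
    """Right-hand side of eq. (3.23): the value of Lambda_1^I above which
    binding is positively cooperative."""
    alpha = lam0 * lam_m1 / (4.0 * lam_m1 * lam1 + lam0 * lam1)
    big_lambda1 = lam1_bound / lam1
    return math.sqrt((alpha + 1.0) * (alpha + big_lambda1)) - alpha

def equally_divided(lam1, lam1_bound):
    """Intermediate spring constant when each ligand contributes half the change."""
    return 0.5 * (lam1 + lam1_bound)

# Illustrative (assumed) values, in units of lam0.
lam1, lam1_bound, lam_m1 = 1.2, 0.084, 0.1
lam1_int = equally_divided(lam1, lam1_bound)
print("ddG_seq / kB T =", ddG_seq(lam1, lam1_int, lam1_bound, lam_m1))
print("Lambda_I =", lam1_int / lam1, "threshold =", threshold(lam1, lam1_bound, lam_m1))
```

For any equally divided intermediate spring constant the returned free energy is zero or negative, which corresponds to the positive cooperativity stated in the text.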
Positive cooperativity for equally divided spring constants fits with the observation that most repressor protein inducers or corepressors bind with positive cooperativity.

As well as the cooperativity of multiple inducer binding, we are also interested in the effect of sequential inducer binding on the allosteric function of repressor proteins binding to DNA. As an example, we now investigate the binding to DNA of a repressor with one inducer compared to a repressor with two inducers. Often biology makes use of an effect known as molecular proofreading. This mechanism, designed to reduce errors, ensures the system is not activated until the second inducer is bound. Such molecular proofreading is attained if the fraction of the total allosteric free energy change occurring if only one inducer is bound is less than half;

$$f = \frac{\Delta\Delta G_{\text{int-apo}}(\mathrm{DNA})}{\Delta\Delta G_{\text{holo-apo}}(\mathrm{DNA})} < \frac{1}{2}. \tag{3.25}$$

This is shown in figure 3.14. If $|\Delta G_{\mathrm{holo}} - \Delta G_{\mathrm{apo}}| > 2|\Delta G_{\mathrm{int}} - \Delta G_{\mathrm{apo}}(\mathrm{DNA})|$ the system is making use of molecular proofreading, ensuring the external signal is strong enough before responding with its alternative behaviour.

Figure 3.14: Energy level diagram showing $|\Delta G_{\mathrm{holo}} - \Delta G_{\mathrm{apo}}| > 2|\Delta G_{\mathrm{int}} - \Delta G_{\mathrm{apo}}|$ indicating molecular proofreading for a) a negative cooperative (lac-type) system and b) a positive cooperative (trp-type) system. (Each panel shows the levels $\Delta G_{\mathrm{holo}}$, $\Delta G_{\mathrm{int}}$ and $\Delta G_{\mathrm{apo}}$ on an axis of affinity for DNA.)

For the lac repressor, using the values estimated from the x-ray B-factors ($\Lambda_1 \approx 0.07$, $\Lambda_{-1} \approx 6.7$, $\lambda_1 \approx 1.2$, $\lambda_{-1} \approx 0.1$) and including a factor three for the other planes, we obtain $\Delta\Delta G_{\text{holo-apo}} \sim 1.4k_BT$ and $\Delta\Delta G_{\text{int-apo}} \sim 0.18k_BT$, giving a fraction $f \sim 0.13$. This is clearly biased such that the first inducer binding has a comparatively small effect, so the repressor is unlikely to release the DNA until the second inducer binds. This can be seen on the energy level diagram drawn in figure 3.14.
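The proofreading criterion of equation (3.25) is simple arithmetic, and can be checked with a minimal sketch such as the one below. The function names are hypothetical; the two free energies are the lac repressor estimates quoted in the text, in units of $k_BT$.

```python
def proofreading_fraction(ddG_int_apo, ddG_holo_apo):
    """Fraction f of eq. (3.25): the share of the total allosteric free energy
    change that is already delivered when only one inducer is bound."""
    return ddG_int_apo / ddG_holo_apo

def uses_proofreading(f):
    """Molecular proofreading requires the first ligand to deliver less than half."""
    return f < 0.5

# Values quoted in the text for the lac repressor (units of kB*T).
f_lac = proofreading_fraction(0.18, 1.4)
print(f"f = {f_lac:.2f}, proofreading: {uses_proofreading(f_lac)}")  # f = 0.13, proofreading: True
```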
E. coli prefers to eat glucose and will only metabolise lactose if there is enough available, so the lac repressor only releases the DNA if it is bound by two inducer molecules. Recent binding experiments by Swint-Kruse et al. [2005] on mutants with slower kinetics than the wild type imply both inducer sites in the dimer must be occupied to release DNA. Molecular proofreading in the trp case would mean that E. coli normally synthesises tryptophan and only stops doing so if there is an excess of trp, that is, the trp repressor will only bind if there are two trp corepressors bound. It is known that, in nature, the trp repressor binds to DNA when both effectors are bound.

3.6.2 Bending modes

In this section we consider a more realistic extension of the rigid monomer model that includes the lowest mode of monomer flexibility (bending). In reality the protein monomers are far from rigid plates - they are themselves flexible, dynamic entities. Consequently there will be large bending modes delocalised across the protein, and so able to contribute to the Brownian signalling mechanism we propose. If such bending modes are altered on ligand binding, addition of such bending modes would modify the values of the predicted $\Delta\Delta G$. For simplicity we consider the toy two rods model (as described in section 3.2.1). If we allow the rods to bend by angle $\alpha$ as shown in figure 3.15 the potential energy becomes;

Figure 3.15: Diagram to show bending angle $\alpha$ of rod length $l$ bent with radius of curvature $R$.

$$V = \frac{1}{2}\lambda_1\left(x + \frac{l\theta}{2} + \frac{l\alpha}{2}\right)^2 + \frac{1}{2}\lambda_0 x^2 + \frac{1}{2}\lambda_{-1}\left(x - \frac{l\theta}{2} + \frac{l\alpha}{2}\right)^2 + \frac{1}{2}\kappa_b\left(\frac{l\alpha}{2}\right)^2 \tag{3.26}$$

where $\kappa_b = 16YI/l^3$. $Y$ is the Young's modulus and $I$ is the moment of inertia. We write the interaction matrix as a $3 \times 3$ matrix that includes a third coordinate; $\alpha$, the bending angle;

$$K = \begin{pmatrix} \lambda_1+\lambda_{-1}+\lambda_0 & \lambda_1-\lambda_{-1} & \lambda_1+\lambda_{-1} \\ \lambda_1-\lambda_{-1} & \lambda_1+\lambda_{-1} & \lambda_1-\lambda_{-1} \\ \lambda_1+\lambda_{-1} & \lambda_1-\lambda_{-1} & \lambda_1+\lambda_{-1}+\kappa_b \end{pmatrix}.$$
The inertial matrix is given by

$$M = \begin{pmatrix} m & 0 & 0 \\ 0 & I/l^2 & 0 \\ 0 & 0 & I/l^2 \end{pmatrix}, \quad \text{and} \quad \mathbf{x} = \begin{pmatrix} x \\ l\theta/2 \\ l\alpha/2 \end{pmatrix},$$

where $m$ is the reduced mass and $I$ is the reduced moment of inertia. The vibrational free energy now becomes

$$G = \frac{1}{2}k_BT\ln\kappa_b + \frac{1}{2}k_BT\ln\left(4\lambda_1\lambda_{-1} + \lambda_0(\lambda_1+\lambda_{-1}) + \frac{4\lambda_0\lambda_1\lambda_{-1}}{\kappa_b}\right) + \text{constant}. \tag{3.27}$$

For very floppy rods, $\kappa_b \to 0$, the vibrational free energy reduces to

$$G = \frac{1}{2}k_BT\ln\lambda_0\lambda_1\lambda_{-1} + \text{constant}. \tag{3.28}$$

This gives no allosteric free energy, $\Delta\Delta G = 0$. Physically there is no coupling between the effective springs, since the plates are infinitely floppy. For $\kappa_b \to \infty$ the $4\lambda_0\lambda_1\lambda_{-1}/\kappa_b$ term in the argument of the logarithm of equation (3.27) tends to zero, giving the same form as for rigid rods plus an additive term $\ln\kappa_b$. In this case the cooperative behaviour is the same as for no bending modes if the bending modulus $\kappa_b$ is constant. With a constant finite $\kappa_b$ we obtain the same general cooperativity rule as for the case with no bending, that is,

$$\Lambda_1\Lambda_{-1} + 1 \begin{Bmatrix} > \\ < \end{Bmatrix} \Lambda_1 + \Lambda_{-1} \qquad \begin{Bmatrix} \text{trp} \\ \text{lac} \end{Bmatrix}. \tag{3.29}$$

However, we get a different $\Delta\Delta G$ when bending modes are considered:

$$\Delta\Delta G = \frac{1}{2}k_BT\ln\left(\frac{\left(4\left(1+\frac{\lambda_0}{\kappa_b}\right)\lambda'_1\lambda'_{-1} + \lambda_0(\lambda'_1+\lambda'_{-1})\right)\left(4\left(1+\frac{\lambda_0}{\kappa_b}\right)\lambda_1\lambda_{-1} + \lambda_0(\lambda_1+\lambda_{-1})\right)}{\left(4\left(1+\frac{\lambda_0}{\kappa_b}\right)\lambda'_1\lambda_{-1} + \lambda_0(\lambda'_1+\lambda_{-1})\right)\left(4\left(1+\frac{\lambda_0}{\kappa_b}\right)\lambda_1\lambda'_{-1} + \lambda_0(\lambda_1+\lambda'_{-1})\right)}\right). \tag{3.30}$$

The graph of $\Delta\Delta G$, equation (3.30), as a function of $\kappa_b$ (figure 3.16) clearly shows that there is no maximum for finite $\kappa_b$. A finite, as opposed to infinite, $\kappa_b$ value reduces, rather than increases, the allosteric free energy. This is because the coupling between the different effective springs will be greatest for an infinite bending stiffness and any bending will serve only to absorb the vibrations locally rather than transmit them.

Figure 3.16: Allosteric free energy $\Delta\Delta G$ against the normalised bending modulus $\kappa_b/\lambda_0$ of the plates for effective spring values for the lac repressor estimated from experimental B-factors. (Axes: $\Delta\Delta G/k_BT$ against $\kappa_b/\lambda_0$.)
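The dependence of equation (3.30) on the bending modulus can be explored with a short Python sketch. It is only an illustration: the spring constants below are assumed values chosen for clarity (in units of $\lambda_0$), not the lac repressor parameters, and the function names are hypothetical.

```python
import math

def det_factor(lam1, lam_m1, lam0, kappa_b):
    """|K| / kappa_b for the bending model, as in the logarithm of eq. (3.27):
    4*lam1*lam_m1 + lam0*(lam1 + lam_m1) + 4*lam0*lam1*lam_m1/kappa_b."""
    return (4.0 * lam1 * lam_m1 + lam0 * (lam1 + lam_m1)
            + 4.0 * lam0 * lam1 * lam_m1 / kappa_b)

def ddG_bend(lam1, lam_m1, lam1_b, lam_m1_b, lam0, kappa_b):
    """Allosteric free energy of eq. (3.30) in units of kB*T.
    lam1_b and lam_m1_b are the ligand-bound (primed) spring constants;
    kappa_b is held constant on binding."""
    num = (det_factor(lam1_b, lam_m1_b, lam0, kappa_b)
           * det_factor(lam1, lam_m1, lam0, kappa_b))
    den = (det_factor(lam1_b, lam_m1, lam0, kappa_b)
           * det_factor(lam1, lam_m1_b, lam0, kappa_b))
    return 0.5 * math.log(num / den)

# Assumed illustrative values: one spring softens and the other stiffens on binding.
lam0, lam1, lam_m1 = 1.0, 1.0, 0.1
lam1_b, lam_m1_b = 0.1, 1.0

for kappa_b in (0.01, 0.1, 1.0, 10.0, 100.0):
    value = ddG_bend(lam1, lam_m1, lam1_b, lam_m1_b, lam0, kappa_b)
    print(f"kappa_b/lam0 = {kappa_b:7.2f}  ddG/kBT = {value:.4f}")
```

With these inputs the result rises monotonically from zero for very floppy rods towards the rigid-rod value, in line with the statement that a finite bending modulus can only reduce the allosteric free energy.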
The loss of coupling on bending gives a lower limit to the bending rigidity for effective allosteric communication. However, the bending of the plates may increase the allosteric communication if ligand binding changes the bending rigidity $\kappa_b$. If $\kappa_b$ becomes $\kappa'_b$ on ligand binding the change in free energy on ligand binding becomes;

$$\Delta G = \frac{1}{2}k_BT\ln\left(\frac{4(\kappa'_b+\lambda_0)\lambda'_1\lambda_{-1} + \lambda_0\kappa'_b(\lambda'_1+\lambda_{-1})}{4(\kappa_b+\lambda_0)\lambda_1\lambda_{-1} + \lambda_0\kappa_b(\lambda_1+\lambda_{-1})}\right). \tag{3.31}$$

The case where $\kappa_b$ changes on ligand binding is plotted in figure 3.17. We plot the case where $\lambda_1$ remains constant to highlight the additional free energy change due to the alteration in bending stiffness only. It can clearly be seen that if the bending modulus increases on ligand binding, $\kappa'_b/\kappa_b > 1$, the vibrational free energy increases (due to the stiffening). Conversely if the bending modulus decreases on ligand binding, $\kappa'_b/\kappa_b < 1$, the vibrational free energy change is negative.

Figure 3.17: Vibrational free energy change on ligand binding, $\Delta G$, against the normalised ligand bound bending modulus $\kappa'_b/\kappa_b$ for the case where $\kappa_b$ changes on ligand binding but the spring constant $\lambda_1$ remains constant. The plot is for $\kappa_b = \lambda_1 = \lambda_{-1} = \lambda_0$. (Axes: $\Delta G/k_BT$ against $\kappa'_b/\kappa_b$.)

3.7 Future applications to other repressor proteins

In this section we briefly discuss examples of other allosteric repressor proteins that would be suitable to investigate using the methods we have presented in this chapter. Homodimer proteins are suitable for a rigid dimer coarse-grained model as long as the monomers in the dimer are not intertwined. There are 20 repressors in the lac repressor family [Nagadoi et al., 1995]. Proteins in this family contain a helix-turn-helix DNA binding domain and a separate effector (inducer or corepressor) binding domain. It may be that many of the proteins in this family are suitable for coarse-grained modelling as we have presented in this chapter.
We discuss a few specific examples of repressor proteins that may be suitable for such studies. The tetracycline (tet) repressor is a homodimer that functions in a similar way to the lac repressor in that it binds operator DNA only in the absence of the inducer tetracycline. NMR backbone dynamics studies by Vergani et al. [2000] show, on inducer binding, an increased mobility of a loop region. They conclude that entropic factors may be important in the allosteric mechanism. However, the B-factors in the crystal structures [Orth et al., 1998; Kisker et al., 1995] suggest the apo protein has a greater flexibility. The mobile loop studied by NMR is not resolved in these crystal structures.

The gal repressor is another lac-type allosteric repressor protein that binds DNA in the absence of the inducer D-galactose with tenfold higher affinity ($\Delta\Delta G = 2.3k_BT$) [Chatterjee et al., 1997]. Like the lac repressor, the gal repressor dimers bind to two operator sites and tetramerise with the DNA forming a loop.

The purine repressor binds operator DNA when bound to its corepressor guanine. It has a similar structure to the lac repressor, being a homodimer with a helix-turn-helix DNA binding motif on each monomer. Thermodynamic studies by Xu et al. [1998] give the allosteric free energy $\Delta\Delta G = -5.2k_BT$. NMR studies by Nagadoi et al. [1995] show helices in the DNA binding region are disordered in the free repressor but ordered in the complex.

Another interesting allosteric repressor protein is the biotin repressor. On binding of bio-5′-AMP, biotin repressor dimerisation and DNA binding are favoured. This process is thought to be facilitated by the ordering of disordered regions on ligand binding, lowering the entropic cost of dimerisation [Streaker et al., 2002]. The allosteric free energy is $\Delta\Delta G = -7.6k_BT$, which is the same as the change in free energy of dimerisation of holo compared to aporepressors. Streaker et al.
[2002] therefore suggest the allosteric mechanism is mediated purely through the monomer-monomer interface. Our theory supports such a concept. Brown and Beckett [2005] compare ITC results of binding various allosteric ligands to the biotin repressor. They find the binding is enthalpically favourable for all four tested ligands, but the two weakly binding ligands are entropically unfavourable. Bio-5′-AMP binding is found to be entropically slightly favourable, which is inconsistent with the known disorder to order transition and implies that static solvent effects may also have an entropic component in this reaction. These studies highlight the subtle nature of such allosteric mechanisms.

The E. coli trp repressor is an example of a repressor that may need a modified coarse-grained model. The trp repressor binds to operator DNA, when bound to its corepressor L-trp, preventing the expression of genes that synthesise the amino acid tryptophan. Two repressor dimers bind DNA in tandem forming a dimer of dimers [Lawson and Carey, 1993]. The trp repressor dimer is made up of three domains: the central core and two DNA-reading heads. Alpha helices from each monomer in the dimer are interlocked in the core domain [Zhang et al., 1987]. This structure results in a rigid core domain that may move as one unit rather than as separate monomers in the lowest frequency modes. Several studies of the trp repressor indicate that dynamics are important in its allosteric function. For example, the NMR solution structures reveal that the aporepressor is less rigid than the holorepressor [Zhao et al., 1993; Zhang et al., 1994]. Calorimetry by Jin et al. [1993] shows a decrease in entropy on corepressor binding of $T\Delta S = -14.7k_BT$.

As can be seen from the above examples there are many dimeric allosteric repressor proteins that may benefit from coarse-grained modelling such as we have presented in this chapter. However, not all repressor proteins are suitable for such treatment.
For example, proteins that are intertwined dimers cannot be represented as rigid monomers with interactions between them. In some proteins intra-monomer motions will be significant. For proteins with intra-monomer modes that contribute to allostery a finer level of coarse-graining is needed that captures these modes. One example of a repressor protein that cannot be treated in the way described in this chapter is the met repressor. We take up the challenge of proteins in this class in section 5.4.

3.8 Summary

In this chapter we have presented a global coarse-grained model of dynamic allostery in dimeric repressor proteins. The model predicts both positive and negative vibrational contributions to the allosteric free energy. This simple physical model explains how local changes on ligand binding can be communicated over long distances via the global vibrational modes. We have described a plausible method of parameterisation for the test case of the lac repressor that gives realistic results.

The allosteric vibrational free energy is purely entropic if the changes on ligand binding are isothermal. The entropy is obtained from the partition function $Z = (2\pi k_BT)^d\left(|M^{-1}||K|\right)^{-1/2}$ by;

$$S = k_B\ln Z + k_BT\frac{\partial\ln Z}{\partial T} = dk_B\ln 2\pi k_BT - \frac{1}{2}k_B\ln|M^{-1}||K| + dk_B - \frac{1}{2}k_BT\frac{\partial|K|/\partial T}{|K|}. \tag{3.32}$$

When considering changes in entropy on ligand binding, $\Delta S$, the constant terms cancel. If $\partial|K|/\partial T$ is zero or constant (unaffected by ligand binding), the final term of equation (3.32) is constant. In this case the allosteric free energy is purely entropic; $\Delta\Delta G = -T\Delta\Delta S$. If changes on ligand binding are not isothermal however, the enthalpic term,

$$H = k_BT^2\frac{\partial\ln Z}{\partial T} = dk_BT - \frac{1}{2}k_BT^2\frac{\partial|K|/\partial T}{|K|}, \tag{3.33}$$

will not be constant. It will therefore contribute to the vibrational free energy. Micheletti et al. [2002] suggest peptide bonds are independent of temperature but noncovalent interactions have temperature dependence.
In section 5.6 we discuss further such a temperature dependent enthalpic term.

In this chapter we have shown that significant contributions to the allosteric free energy can arise from changes in the low frequency vibrational modes of a protein. We have demonstrated the usefulness and applicability of simple global-level coarse-grained modelling in understanding and predicting such dynamic allostery in dimeric repressor proteins.

Chapter 4

Coiled-coils

4.1 Overview

Coiled-coil motifs are found in a wide variety of allosteric proteins, as introduced in section 1.9.2. In this chapter we present and discuss a model of dynamic allostery in alpha-helical coiled-coils. In a coarse-grained approach, we treat coiled-coils as classical elastic rods, using biologically realistic parameters. As an example system, we apply the model in detail to the coiled-coil in the molecular motor dynein, which we reviewed in section 1.9.2. We develop the model and highlight experimentally testable predictions it makes. This work has been published in Hawkins and McLeish [2005].

At present, the allosteric mechanism of dynein is unexplained. Electron microscopy images of dynein-c by Burgess et al. [2003] show differences in the conformations of the two states. They suggest that the stem and stalk are flexible and that the stiffness of the stalk changes depending on the ATP binding state. Figure 4.1 shows the two states of dynein with and without ADP·Vi (thought to mimic the ADP·Pi bound state pre-power stroke). As well as the static conformational change, they investigated the apparent flexibility of the stem and stalk. The stalk chord angle standard deviation in the ADP·Vi bound state is 20° compared with 11° in the apo (without ADP·Vi) state. This apo state is the state that has the higher affinity for the microtubule.
One interpretation of these observed changes in standard deviation is a change in flexibility of the stalk (though there may be contributions to the scatter from artifacts of the adsorption onto the carbon surface). Such suggested changes in flexibility support our hypothesis that the allostery is dominated by changes in the vibrational free energy of the coiled-coil.

Figure 4.1: A Mean conformations of ADP·Vi and apo dynein-c molecules. B Distribution of stalk tip positions. Figure reprinted from Burgess et al. [2004a] with permission from Elsevier.

We develop coarse-grained models, which we solve analytically to calculate these changes in vibrational free energy. We consider the relative slide of the helices, their bend and twist modes and the coupling between them. We model the binding of a ligand by a local attractive interaction (or ‘clamping’) between the two helices that restricts the degree of mutual sliding motion. The calculation shows that this increases the effective stiffness of the whole coiled-coil. In this way a small local conformational change is communicated across the long coiled-coil structure. Recent evidence supporting the idea of such a sliding mode communication has emerged from studies of the alignment of the two strands of the stalk [Gibbons et al., 2005].

It is worth noting that there are thought to be four possible ATP binding sites around the dynein head. We model the effect of ATP binding as influencing the slip at the end of the coiled-coil. In reality this is likely to be the result of an allosteric communication around the dynein head ring from the actual site of ATP binding to the end of the coiled-coil. See section 4.8 for further discussion of this.

We use known geometrical parameters for dynein, and employ the AMBER package [Case et al., 2004] to perform computational normal mode analysis (NMA) on a representative alpha helix to calculate values for the elastic moduli in our models.
Throughout this chapter we treat the elastic dynamics of the model in increasing detail, calculating the allosteric free energy $\Delta\Delta G$ in terms of the strengths of local substrate binding and the elastic properties of the helices. Section 4.2 introduces the general method and model. Each of sections 4.3 - 4.6 introduces another level of complexity to the problem. In sections 4.3 and 4.4 we consider the simple model of two parallel rods and in subsequent sections we consider the two rods coiled round each other. Section 4.3 considers sliding motion only. In sections 4.4 and 4.5 we consider bending and sliding modes of vibration and in section 4.6 we add a twisting mode. Details of parameterisation are given in section 4.7. At the end of section 4.8 we conclude by drawing out predictions and consequences of this type of modelling for experimental work without referring to the detailed mathematics. Remember “Do not worry about your difficulties in mathematics; I can assure you that mine are still greater” [Einstein to a junior high school student].

4.2 Method

We follow the method of calculating the vibrational free energy of the lowest frequency modes of a coarse-grained model as described in section 2.3. We require the elastic internal energy induced in the rods due to the strain imposed on the system by thermal fluctuations. We write this energy as $H = \frac{1}{2}\mathbf{x}^TK\mathbf{x}$, where $\mathbf{x}$ is a vector of all the fluctuation variables in the problem and $K$ is a generalised elasticity matrix. Standard equations of statistical mechanics give the partition function of the fluctuating coiled-coil, equation (2.22); the free energy, equation (2.23), repeated here for convenience

$$G = -k_BT\ln Z = \frac{1}{2}k_BT\ln|K| - \frac{d}{2}k_BT\ln(2\pi k_BT); \tag{4.1}$$

and the allosteric free energy, equation (2.25), repeated here for convenience

$$\Delta\Delta G = \Delta G_{+-}(\mathrm{bind}) - \Delta G_{--}(\mathrm{bind}) = \frac{1}{2}k_BT\ln\left(\frac{|K|_{++}|K|_{--}}{|K|_{-+}|K|_{+-}}\right), \tag{4.2}$$

where the subscripts refer to the different liganded states defined in section 4.2.1.
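Equation (4.2) only needs the determinants of the elasticity matrix in the four liganded states, so it can be evaluated with a few lines of Python. The sketch below is illustrative only: the helper names and the toy stiffness values are assumptions, and a small Laplace-expansion determinant is used so that nothing beyond the standard library is required.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for the
    small matrices that arise in coarse-grained models)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def allosteric_free_energy(K_pp, K_mm, K_mp, K_pm):
    """Eq. (4.2) in units of kB*T: half the log of |K|++ |K|-- / (|K|-+ |K|+-)."""
    return 0.5 * math.log(det(K_pp) * det(K_mm) / (det(K_mp) * det(K_pm)))

# Toy case 1 (assumed): two independent coordinates, one stiffened by each ligand.
# With no coupling between them the allosteric free energy vanishes.
def uncoupled(stiff_a, stiff_b):
    return [[stiff_a, 0.0], [0.0, stiff_b]]

ddG_uncoupled = allosteric_free_energy(
    uncoupled(4.0, 4.0), uncoupled(1.0, 1.0), uncoupled(1.0, 4.0), uncoupled(4.0, 1.0))

# Toy case 2 (assumed): a single shared coordinate whose stiffness is a background
# term (1) plus a contribution (3) from each bound ligand.
ddG_shared = allosteric_free_energy([[7.0]], [[1.0]], [[4.0]], [[4.0]])

print("uncoupled:", ddG_uncoupled)
print("shared coordinate:", ddG_shared)
```

The contrast between the two toy cases illustrates why a shared, delocalised mode is needed: when each ligand affects only its own independent coordinate the determinants factorise and the ratio in equation (4.2) is exactly one.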
4.2.1 A coarse-grained elastic model of coiled-coils

We use a coarse-grained model for a coiled-coil, consisting of two alpha helices coiled round each other, that treats each alpha helix as a classical flexible rod as shown in the diagram in figure 4.2. Much work has been done on the geometry and writhe of helical DNA (for example, see Marko [1998]; Jülicher [1994]; Moroz and Nelson [1998]; Rossetto and Maggs [2003]) and there are clear parallels here. In each of the following sections we write the paths of the two rods, $\mathbf{r}_{\pm1}(s)$, as a function of the path length $s$ along the central axis of the coiled-coil. Each rod has a Young's modulus $Y$ and a shear modulus $\mu$ and we take the mean perpendicular separation of the centres of the helices to be $2\rho$.

Figure 4.2: Model of coiled-coil alpha helices as two classical flexible rods with paths $\mathbf{r}_{\pm1}(s)$ and radii $\rho$.

We model the adhesive resistance to mutual sliding of the two helices with a distributed localising harmonic potential of strength $k_{\mathrm{slide}}(s)$. We restrict the form of $k_{\mathrm{slide}}(s)$ to a general background constant interaction, $k_0$, plus extra interactions at the two ligand binding sites $s_i$, whose changes model ATP binding and hydrolysis (dephosphorylation ATP + H$_2$O → ADP + P$_i$) at one end or microtubule binding at the other. So the force constant per unit length is given by

$$k_{\mathrm{slide}}(s) = k_0/l_0 + \sum_{i=-1,1} k_i\,\delta(s - s_i). \tag{4.3}$$

The force constants $k_i$ change depending on the ligand binding state in the following way,

$$\begin{array}{lll} ++: & k_{-1} = \kappa_{-1}k_0, & k_1 = \kappa_1k_0 \\ +-: & k_{-1} = \kappa_{-1}k_0, & k_1 = 0 \\ -+: & k_{-1} = 0, & k_1 = \kappa_1k_0 \\ --: & k_{-1} = 0, & k_1 = 0. \end{array} \tag{4.4}$$

For simplicity we set $\kappa_{-1} = \kappa_1 = \kappa$. Note that a ligand binding may provide the clamping or it may release the clamp, depending on the details of the protein and ligand interaction in question. In the case of dynein, microtubule binding clamps and ATP binding unclamps.
The ligand binding states are defined such that ‘++’ means both ends are clamped (tightened) with $k_{\pm1} \neq 0$, corresponding in our case to microtubule bound at the tip and ATP unbound at the other end. ‘+−’ means only the tip is clamped, $k_1 \neq 0$ (microtubules bound), and $k_{-1} = 0$ due to ATP bound. ‘−+’ means only the end attached to the head is clamped, $k_{-1} \neq 0$ (ATP unbound), but $k_1 = 0$ (microtubules unbound). Finally ‘−−’ means neither end is clamped (microtubules unbound, ATP bound). In the case of dynein, two sites at $-l_0/2$ and $+l_0/2$ that are affected by ligand binding give

$$k_{\mathrm{slide}}(s) = k_0/l_0 + k_{-1}\delta(s + l_0/2) + k_1\delta(s - l_0/2). \tag{4.5}$$

4.2.2 Parameterisation

We parameterise the model using values for the geometry known from electron microscopy [Burgess et al., 2003, 2004a]. Estimation of the Young's modulus and shear modulus was performed by normal mode analysis of a simple polyalanine alpha helix using the Nmode program in AMBER [Case et al., 2004] (details are given in section 4.7). We estimate the adhesive resistance to mutual sliding $k_0$ as being of the order typical of the hydrophobic effect. To estimate the order of magnitude of $\kappa_{\pm1}$ we enhance the background hydrophobic interaction by an additional electrostatic attraction at the binding sites. The parameterisation is meant to be realistic, if not necessarily exact for dynein, since details of the local interactions between the coils are not known. The goal is to calculate in principle attainable values for the allosteric free energy $\Delta\Delta G$ in a coiled-coil system with realistic interactions.

4.3 Parallel rigid rods: slide only

4.3.1 Model

We start very simply by considering two inextensible, rigid rods, which lie parallel, side by side and are not coiled round each other. The equilibrium paths of the upper rod, $\mathbf{r}_1(s)$, and the lower rod, $\mathbf{r}_{-1}(s)$, for two parallel rods are given below; $\mathbf{r}_{\pm1}(s) = \pm\rho\,\hat{\mathbf{x}} + s\,\hat{\mathbf{z}}.$
(4.6)

Each rod is rigid (possessing infinite bending Young's modulus), but we allow a finite localising potential $k_{\mathrm{slide}}(s)$ between the two rods to account for adhesive resistance to relative sliding. The only fluctuation variable in this problem is the relative slide between the rods, which we call $\zeta$. On sliding, equation (4.6) deforms to;

$$\mathbf{r}_{\pm1}(s) = \pm\rho\,\hat{\mathbf{x}} + \left(\pm\frac{\zeta}{2} + s\right)\hat{\mathbf{z}}. \tag{4.7}$$

Assuming a linear stress-strain relationship the general classical elastic internal energy of the two rods is

$$H = \frac{1}{2}\int_{-l_0/2}^{l_0/2} k_{\mathrm{slide}}(s)(\Delta s)^2\,ds, \tag{4.8}$$

describing the energy due to relative sliding displacement $\Delta s(s)$ between the two rods. The integral is over $s$ along the path length of the coiled-coil axis. The potential due to mutual sliding is modelled by $k_{\mathrm{slide}}(s)$ described in section 4.2 (equation (4.5)). We calculate the slide parallel to the rods by taking the difference between the paths for each rod. This gives the relative local displacement between the contingent sites, which in this case is everywhere $\Delta s = \Delta r_z(s) = \zeta$. We substitute equation (4.5) and $\Delta s = \zeta$ into equation (4.8) and integrate over the path length $-l_0/2 < s < l_0/2$. After performing the integration we obtain

$$H = \frac{1}{2}(k_0 + k_1 + k_{-1})\,\zeta^2. \tag{4.9}$$

The free energy is given by equation (4.1), where in this case of just one degree of freedom $|K| = k_0 + k_1 + k_{-1}$,

$$G = \frac{k_BT}{2}\left(\ln(k_0 + k_1 + k_{-1}) - \ln(2\pi k_BT)\right). \tag{4.10}$$

We calculate the allosteric free energy from equation (4.2), using the liganded conditions (equation (4.4)),

$$\Delta\Delta G = \frac{1}{2}k_BT\ln\left(\frac{1 + 2\kappa}{(1 + \kappa)^2}\right). \tag{4.11}$$

4.3.2 Results

$\Delta\Delta G$ for parallel rigid rods (equation (4.11)) is drawn as a function of $\kappa$ in figure 4.3, keeping all other parameters fixed for dynein from section 4.7. Provided that substrate binding affects the mutual sliding potential by introducing a delta function clamp of strength $\kappa = 100$, significant allosteric free energies can be generated by the sliding mode alone.
The electrostatic estimation of binding site attraction gives $\kappa \sim 100$ and $\Delta\Delta G \approx -2.0k_BT$.

Figure 4.3: Allosteric free energy $\Delta\Delta G$ against clamping $\kappa = k_1/k_0 = k_{-1}/k_0$ showing the effect of the clamping on the allosteric free energy for the model of rigid parallel rods. The values of the parameters used are those given for dynein in section 4.7. (Axes: $\Delta\Delta G/k_BT$ against $\kappa$, logarithmic in $\kappa$.)

4.4 Parallel rods: slide and bend

4.4.1 Model

Now, as well as the finite potential between the rods, we allow them to bend. Each rod has a bending Young's modulus, $Y$. We impose a relative slide $\zeta$ of the two rods parallel to the central axis by introducing the additional term $\pm\zeta/2$ in the $z$ direction (as in section 4.3). We now also impose a bend of curvature $c$ in the positive $x$ direction. This adds an additional term of $cs^2/2$ in the $x$ direction. We now have two fluctuation variables, $c$ and $\zeta$. Bending also induces a relative slide ($-2\rho cs$) between the rods. We include this slide induced by bending by adding $\mp\rho cs$ to the $z$ component, giving

$$\mathbf{r}_{\pm1}(s) = \left(\pm\rho + \frac{1}{2}cs^2\right)\hat{\mathbf{x}} + \left(\pm\frac{\zeta}{2} + (1 \mp \rho c)s\right)\hat{\mathbf{z}}, \tag{4.12}$$

to linear order in the fluctuation variables $c$ and $\zeta$. In general there is also a bending mode in the $y$ direction. However, in this simple case, since it is not coupled to the sliding, it does not affect $\Delta G$ so we omit it here. Note this is not true in the case of coiled geometry treated in section 4.5. Bending in the $y$ direction is not coupled to the sliding for parallel rods, since the two rods are at the same position in $y$ (one is above the other in $x$). The classical elastic internal energy of the rods will be made up of the energy due to bending of each rod and the energy of relative slide between the rods.
We combine these energies to give

$$H = \frac{1}{2}\int_{-l_0/2}^{l_0/2}\Big(2YI\,|\mathbf{r}''|^2 + k_{\rm slide}(s)\,(\Delta s)^2\Big)\,ds, \qquad (4.13)$$

where the first term describes the energy due to bending and the second term describes the energy due to relative sliding of the two rods as in section 4.3. $Y$ is the Young’s modulus of the rod and $I$ is the moment of inertia about the $y$-axis, which for a circular cross section of radius $\rho$ is $I = \frac{1}{4}\pi\rho^4$. $|\mathbf{r}''_i| = |\partial^2\mathbf{r}_i/\partial s^2|$ is the curvature of the rod of path $\mathbf{r}_i(s)$. The factor of two accounts for the bending energy of the two rods. $k_{\rm slide}(s)$ is given by equation (4.5). We take the origin of bending at the centre of mass of the rod, so the rod path runs over $-l_0/2 < s < l_0/2$. We calculate the curvature from the deformed paths (equation (4.12)) and the relative slide of the rods by taking the difference between the paths of the two rods, $\Delta s = \Delta r_z = r_{1z} - r_{-1z}$, thus obtaining

$$|\mathbf{r}''_{\pm1}| = \left|\frac{\partial^2\mathbf{r}_{\pm1}}{\partial s^2}\right| = c \qquad (4.14)$$

$$\Delta s = \Delta r_z = \zeta - 2\rho c s \qquad (4.15)$$

to linear order in $c$ and $\zeta$. The slide parallel to the rods, $\Delta s$, is now the sum of the relative slide between the rods induced by the bend, $-2\rho c s$, and that due to the slide mode itself, $\zeta$, giving $\Delta s = \zeta - 2\rho c s$. The term due to the slide induced by bend accounts for the coupling between bending and sliding motion. We substitute equations (4.5), (4.14) and (4.15) into equation (4.13) and integrate over the path length $-l_0/2 < s < l_0/2$:

$$H = \frac{1}{2}\left((k_0 + k_{-1} + k_1)\,\zeta^2 + 2\rho l_0 (k_{-1} - k_1)\,\zeta c + \frac{1}{2}\pi\rho^4 Y l_0\,c^2 + \rho^2 l_0^2\left(\frac{k_0}{3} + k_1 + k_{-1}\right)c^2\right).$$

Writing this Hamiltonian in the form $H = \frac{1}{2}\mathbf{x}^T K \mathbf{x}$, where $\mathbf{x} = (\zeta, c)$, gives $K$, the $2\times2$ elasticity matrix:

$$K = \begin{pmatrix} k_0 + k_{-1} + k_1 & \rho l_0 (k_{-1} - k_1) \\ \rho l_0 (k_{-1} - k_1) & \rho^2 l_0^2\left(\frac{k_0}{3} + k_1 + k_{-1}\right) + \frac{\pi\rho^4 l_0 Y}{2} \end{pmatrix}. \qquad (4.16)$$

We then obtain the free energy from equation (4.1),

$$G = \frac{k_BT}{2}\ln\left(\frac{\rho^2 l_0^2\left(\frac{k_0}{3}\big(k_0 + 4(k_1 + k_{-1})\big) + 4k_1k_{-1}\right)}{(2\pi k_BT)^2} + \frac{\frac{1}{2}\pi\rho^4 l_0 Y\,(k_0 + k_1 + k_{-1})}{(2\pi k_BT)^2}\right).$$

Therefore the allosteric free energy is given from equation (4.2) by

$$\Delta\Delta G = \frac{k_BT}{2}\left\{\ln\left(\frac{l_0\big(1 + 4(\kappa_1 + \kappa_{-1}) + 12\kappa_1\kappa_{-1}\big) + \frac{3}{2}\pi\rho^2\frac{Y}{k_0}(1 + \kappa_1 + \kappa_{-1})}{l_0(1 + 4\kappa_1) + \frac{3}{2}\pi\rho^2\frac{Y}{k_0}(1 + \kappa_1)}\right) - \ln\left(\frac{l_0(1 + 4\kappa_{-1}) + \frac{3}{2}\pi\rho^2\frac{Y}{k_0}(1 + \kappa_{-1})}{l_0 + \frac{3}{2}\pi\rho^2\frac{Y}{k_0}}\right)\right\}. \qquad (4.17)$$

4.4.2 Results

The dependence of $\Delta\Delta G$ on the Young’s modulus $Y$ is given in figure 4.4. The functional form (equation (4.17)) interpolates between the limit of $|\Delta\Delta G| \ll k_BT$ for very floppy rods ($Y \to 0$) and the parallel rigid rods result from section 4.3 ($\Delta\Delta G = -2k_BT$) for very stiff rods ($Y \to \infty$). These limits themselves are independent of the geometrical parameters and depend only on the ratio of the clamping potentials, which we have taken as $\kappa = k_{\pm1}/k_0 = 100$. We find that, for this non-coiled structure, a significant value of $\Delta\Delta G \sim -k_BT$ is achieved if $Y$ is greater than $\sim 10^5$ MPa. The small curvature approximation of equation (4.12) breaks down for $Y < 10^3$ MPa in this parallel rod model (the persistence length of the dimerised parallel helices becomes of order 10 nm $< l_0$), so significance should not be read into the maximum at $Y \approx 10^3$ MPa in figure 4.4.

Figure 4.4: Allosteric free energy $\Delta\Delta G$ against the Young’s modulus of each rod $Y$ showing the effect of bending on the allosteric free energy for the model of parallel flexible rods free to slide and bend. The values of the parameters used are those given for dynein in section 4.7.

The value for the Young’s modulus we estimate for alpha helices (see section 4.7) is $Y = 2.5\times10^{9}\,\mathrm{J\,m^{-3}} = 2500$ MPa, which corresponds to a persistence length that is a factor of three longer than the minimum required for the small curvature approximation. Interestingly, the value $Y = 2500$ MPa is right at the top of the steep slope in figure 4.4.
This implies that if biology is able to shift this value slightly, large changes in allosteric free energy may result. Coiling the alpha helices may provide just such a strategy. By allowing the rods to bend, the allosteric free energy has been reduced to a negligible value because $Y \sim \dfrac{4 l_0 k_0}{3\pi\rho^2}$.

Figure 4.5: Allosteric free energy $\Delta\Delta G$ against clamping $\kappa = k_1/k_0 = k_{-1}/k_0$ showing the effect of the clamping on the allosteric free energy for the model of flexible parallel rods free to slide and bend. The values of the parameters used are those given for dynein in section 4.7.

Finite $Y$ also means that $\Delta\Delta G$ saturates as a function of $\kappa$ (figure 4.5), in contrast to the stiff result (equation (4.11), figure 4.3). For the parallel rigid rods the binding of the first ligand restricts the slide mode. The vibrations are already restricted when the second ligand binds, leading to the observed divergence in $\Delta\Delta G$ (figure 4.3). However, for the flexible parallel rods, though the first ligand binding restricts the slide mode, the second ligand restricts the bend mode, leading to the saturation behaviour for $\Delta\Delta G$ seen in figure 4.5. Figure 4.5 shows the effect of different values for the clamping potentials $\kappa_{\pm1} = k_{\pm1}/k_0$, where we have set $\kappa_1 = \kappa_{-1}$. Recall that $k_{-1}$ is switched on by binding to microtubules and off by unbinding, and $k_1$ is switched on when there is no ATP bound and switched off by ATP binding. Trivially, for $\kappa \to 0$ the allosteric free energy $\Delta\Delta G \to 0$. For large values of $\kappa_{\pm1} \to \infty$ the small $\Delta\Delta G = -0.03\,k_BT$ is approached for physical values of $Y$. The value of $\kappa = 100$ is in this saturation region. Clearly, to move beyond the poor allosteric properties of the simple parallel helices we need to account for the full coiled-coil geometry.
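The slide and bend calculation of this section can be sketched numerically as follows. This is a minimal illustration, not the code used for the figures: it assumes that equation (4.1) has the form $G = \frac{1}{2}k_BT\ln|K| + \text{const}$ (as equation (4.10) suggests), so that $\Delta\Delta G$ reduces to a ratio of determinants of the matrix in equation (4.16) for the four clamping states. The function names are ours, and the default parameters are the values quoted in section 4.7.

```python
import math

# Parameters quoted in section 4.7 (SI units).
L0 = 1.55e-8    # coiled-coil length l_0 / m
RHO = 5.0e-10   # rod radius rho / m
K0 = 0.1        # sliding stiffness k_0 / J m^-2


def det_K(k1, km1, young, k0=K0, l0=L0, rho=RHO):
    """Determinant of the 2x2 elasticity matrix of equation (4.16)."""
    k_zz = k0 + k1 + km1
    k_zc = rho * l0 * (km1 - k1)
    k_cc = (rho**2 * l0**2 * (k0 / 3.0 + k1 + km1)
            + 0.5 * math.pi * rho**4 * l0 * young)
    return k_zz * k_cc - k_zc**2


def ddg_slide_bend(kappa, young, k0=K0):
    """ddG / k_B T from the four clamping states (both, either end, none).

    Assumes G = (k_B T / 2) ln|K| + const, so the constants cancel.
    """
    k = kappa * k0
    both = det_K(k, k, young)
    one_end = det_K(k, 0.0, young)
    other_end = det_K(0.0, k, young)
    none = det_K(0.0, 0.0, young)
    return 0.5 * math.log(both * none / (one_end * other_end))


if __name__ == "__main__":
    for young in (0.0, 2.5e9, 1e11, 1e13):
        print(f"Y = {young:8.1e} J/m^3   ddG = {ddg_slide_bend(100.0, young):8.4f} k_BT")
```

In the limit of very stiff rods the sketch recovers the rigid rod value of equation (4.11), and for $Y \to 0$ it depends on $\kappa$ only. Near $Y \sim 4l_0k_0/(3\pi\rho^2)$ the result is small in magnitude and sensitive to the exact parameter values, so the sketch should not be expected to reproduce figure 4.5 digit for digit.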
4.5 Coiled geometry: slide and bend

4.5.1 Model

We now introduce the helical geometry of two rods coiled round each other, as shown in figure 4.6. In order to work out the position vector of the individual rods $\mathbf{r}_{\pm1}$ in the new geometry it is helpful to refer to figures 4.6 and 4.7.

Figure 4.6: Diagram showing the geometry of two rods coiled round each other. $\alpha$ is the angle between the central axis and the path length along an individual rod. $h$ is the helical pitch of the two rods coiled round each other. The distance between the centres of the rods is $2\rho$.

The $x$, $y$ and $z$ components of the position vector of an individual rod $\mathbf{r}_{\pm1}(s)$ can be calculated from figures 4.7a and 4.7b, where $s$ is the path length along the central axis of the coiled-coil (the neutral axis):

$$\mathbf{r}_{\pm1,0}(s) = \pm\rho\cos(\gamma_0 s)\,\hat{\mathbf{x}} \pm \rho\sin(\gamma_0 s)\,\hat{\mathbf{y}} + s\,\hat{\mathbf{z}}, \qquad (4.18)$$

where $\gamma_0$ is the helicity of the coil. It can be seen from figure 4.7 that the coil helicity, which is the rate of change of global twist angle $\phi$, is $\gamma_0 = \partial\phi/\partial s = 2\pi/h$, where $h$ is the helical pitch length. The subscripts 0 refer to these being the equilibrium paths with zero bend fluctuation. Setting $\gamma_0 = 0$ reproduces the paths for the parallel geometry in section 4.4. We consider the deformation to these paths under bending and relative slide $\zeta$. Due to the helical geometry the relative slide induced by bend is now $(\mp\rho c\cos\gamma_0 s)s$.

Figure 4.7: (a) Diagram showing the coil in figure 4.6 unrolled. $\rho$ is the radius of the circle that the centre of one rod travels in the coil structure. Therefore the circumference is $2\pi\rho$, which makes the vertical side. The horizontal side is in the direction of the arc length of the centre of the coil $s$ ($\hat{\mathbf{z}}$ if there is no bend) and the pitch $h$. The individual rod arc length, $s_1$, is along the diagonal in the unrolled geometry. (b) Diagram showing the coil of the two rods in figure 4.6 end on. Again $\rho$ is the radius the centre of one rod travels. The arc length $s_1\sin\alpha$ marked is the component of individual rod arc length $s_1$ along the vertical side of the unrolled geometry in figure 4.7a. $\phi$ is the global twist angle defined as the angle between the line joining the centres of the two rods and the $x$-axis.

We include the two perpendicular bending modes as curvature $c_x$ in the $x$ direction and curvature $c_y$ in the $y$ direction. These modes will be nearly but not perfectly degenerate due to the coiled geometry, so we include them explicitly. Therefore we obtain $r_{\pm1z} = s(1 \mp \rho c_x\cos\gamma_0 s \mp \rho c_y\sin\gamma_0 s)$. Including the relative slide $\zeta$ ($\zeta\cos\alpha$ along $z$) gives the $z$ component in the deformed path:

$$\mathbf{r}_{\pm1}(s) = \left(\pm\rho\cos\gamma_0 s + \frac{c_x}{2}s^2\right)\hat{\mathbf{x}} + \left(\pm\rho\sin\gamma_0 s + \frac{c_y}{2}s^2\right)\hat{\mathbf{y}} + \left(\pm\frac{\zeta\cos\alpha}{2} + (1 \mp \rho c_x\cos\gamma_0 s \mp \rho c_y\sin\gamma_0 s)\,s\right)\hat{\mathbf{z}}. \qquad (4.19)$$

We generalise the bending energy for a non-zero equilibrium curvature. We also include the twist energy of each rod, since there will be a twist induced by bend due to the coiled geometry. This gives us the internal elastic energy:

$$H = \frac{1}{2}\int_{-l_0/2}^{l_0/2}\Big(2YI\big(|\mathbf{r}''| - |\mathbf{r}''_0|\big)^2 + k_{\rm slide}(s)\,(\Delta s)^2 + 2k_t\,t_{\rm rod}^2\Big)\,ds. \qquad (4.20)$$

For a rod of circular cross section, radius $\rho$, $k_t = \frac{1}{2}\mu\pi\rho^4$, where $\mu$ is the shear modulus. The factor of two is to account for the twisting of the two rods. $t_{\rm rod}$ is the local elastic energy-storing twist of each rod. To obtain the curvature of the deformed paths we calculate

$$\mathbf{r}'_{\pm1}(s) = (\mp\rho\gamma_0\sin\gamma_0 s + c_x s)\,\hat{\mathbf{x}} + (\pm\rho\gamma_0\cos\gamma_0 s + c_y s)\,\hat{\mathbf{y}} + 1\,\hat{\mathbf{z}}$$

$$\mathbf{r}''_{\pm1}(s) = (\mp\rho\gamma_0^2\cos\gamma_0 s + c_x)\,\hat{\mathbf{x}} + (\mp\rho\gamma_0^2\sin\gamma_0 s + c_y)\,\hat{\mathbf{y}}$$

$$\mathbf{r}'''_{\pm1}(s) = \pm\rho\gamma_0^3\sin\gamma_0 s\,\hat{\mathbf{x}} \mp \rho\gamma_0^3\cos\gamma_0 s\,\hat{\mathbf{y}} \qquad (4.21)$$

where we have taken the approximation $\rho c_x\cos\gamma_0 s + \rho c_y\sin\gamma_0 s \ll 1$, simplifying the $z$ component of equation (4.19). We then obtain the curvature of the deformed and equilibrium paths:

$$|\mathbf{r}''_{\pm1}| = \big(\rho^2\gamma_0^4 \mp 2\rho\gamma_0^2(c_x\cos\gamma_0 s + c_y\sin\gamma_0 s) + c_x^2 + c_y^2\big)^{1/2}$$

$$|\mathbf{r}''_{\pm1,0}| = \rho\gamma_0^2$$

$$|\mathbf{r}''_{\pm1}| - |\mathbf{r}''_{\pm1,0}| \approx \mp(c_x\cos\gamma_0 s + c_y\sin\gamma_0 s).$$
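The linearisation of the curvature change can be checked numerically. The sketch below is illustrative only: it compares the exact magnitude of the second derivative of the deformed path with the linear expression, using the geometry of section 4.7 (rod radius 0.5 nm, pitch 13 nm). The bend curvatures passed in the example are hypothetical values chosen to be small compared with $\rho\gamma_0^2$, and the function names are ours.

```python
import math

RHO = 5.0e-10                      # rod radius / m (section 4.7)
GAMMA0 = 2.0 * math.pi / 13e-9     # coil helicity / m^-1 (pitch 13 nm)


def curvature_change_exact(s, cx, cy, sign=+1):
    """|r''| - |r''_0| for rod `sign` (+1 or -1), from the second derivative of the path."""
    rx = -sign * RHO * GAMMA0**2 * math.cos(GAMMA0 * s) + cx
    ry = -sign * RHO * GAMMA0**2 * math.sin(GAMMA0 * s) + cy
    return math.hypot(rx, ry) - RHO * GAMMA0**2


def curvature_change_linear(s, cx, cy, sign=+1):
    """Linear-order approximation -/+ (c_x cos(gamma_0 s) + c_y sin(gamma_0 s))."""
    return -sign * (cx * math.cos(GAMMA0 * s) + cy * math.sin(GAMMA0 * s))


if __name__ == "__main__":
    cx, cy = 1.0e5, 5.0e4          # hypothetical bend curvatures / m^-1
    for s in (-7.0e-9, -3.0e-9, 0.0, 3.0e-9, 7.0e-9):
        exact = curvature_change_exact(s, cx, cy)
        linear = curvature_change_linear(s, cx, cy)
        print(f"s = {s:+.1e} m   exact = {exact:+.4e}   linear = {linear:+.4e}")
```

The residual between the two expressions is of order $(c_x^2 + c_y^2)/(2\rho\gamma_0^2)$, which is the quadratic term dropped in the harmonic approximation.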
We have expanded $|\mathbf{r}''_{\pm1}|$ to linear order in $c_x$ and $c_y$ only, so that the Hamiltonian is in the harmonic approximation. The curvature induced by bending the coiled-coil is dependent on $\cos\gamma_0 s$. This means that, at points where $r_x$ is maximum, the bending decreases the curvature. However, at points where $r_x$ is minimum the bend induces an increase in curvature, as expected intuitively.

To obtain the mechanical twist energy we calculate the energy-storing twist of each rod, $t_{\rm rod} = \gamma_{\rm rod} - \tau$, where $\tau$ is the torsion and $\gamma_{\rm rod}$ is the helicity of a rod. Imagine a straight groove marked along a straight rod. The helicity $\gamma_{\rm rod}$ of such a rod is the rate of change of angle $\phi_{\rm rod}$ of this groove as the rod is deformed from a straight path. Part of $\gamma_{\rm rod}$ is due to the mechanical twist of the rod but part is from the geometrical torsion. Only the part due to the mechanical twist ($\gamma_{\rm rod} - \tau$) costs elastic energy. The torsion $\tau$, however, is a property of a curve in three dimensional space, which costs no energy. The torsion is a measure of the departure of a curve from the plane of its tangent $\mathbf{t}$ and curvature (which is along the normal $\mathbf{n}$). The osculating plane is the plane perpendicular to the binormal $\mathbf{b} = \mathbf{t}\times\mathbf{n}$. The torsion $\tau$ is defined as the rate of change of the osculating plane in the normal direction, $d\mathbf{b}/ds = -\tau\mathbf{n}$. It can be calculated from the geometry of the paths by

$$\tau = \frac{\mathbf{r}'\cdot(\mathbf{r}''\times\mathbf{r}''')}{|\mathbf{r}''|^2} \qquad (4.22)$$

[Struik, 1961]. We substitute equations (4.21) into (4.22) to obtain:

$$\tau = \frac{\gamma_0 \mp \dfrac{1}{\rho\gamma_0}(c_x\cos\gamma_0 s + c_y\sin\gamma_0 s)}{1 \mp \dfrac{2}{\rho\gamma_0^2}(c_x\cos\gamma_0 s + c_y\sin\gamma_0 s) + \dfrac{c_x^2 + c_y^2}{\rho^2\gamma_0^4}}. \qquad (4.23)$$

We keep terms up to quadratic order only, obtaining

$$\tau \approx \gamma_0 \pm \frac{1}{\rho\gamma_0}(c_x\cos\gamma_0 s + c_y\sin\gamma_0 s) \qquad (4.24)$$

$$\tau_0 = \gamma_0$$

$$\tau - \tau_0 \approx \pm\frac{1}{\rho\gamma_0}(c_x\cos\gamma_0 s + c_y\sin\gamma_0 s). \qquad (4.25)$$

For two rods that can slide relative to each other, the energy induced by a twist fluctuation is made up of the mechanical twist of the individual rods and the relative slide induced perpendicular to the neutral axis. We allow the system freedom to choose the optimal balance between mechanical twist and slide, depending on the relative energy costs of each. This balance is governed by the conservation of linking number, $Lk$ (White’s [1969] theorem). The linking number, $Lk$, is made up of the writhe, $Wr$, and the twist, $Tw$:

$$Lk = Wr + Tw. \qquad (4.26)$$

The linking number is conserved ($\Delta Lk = 0$) for a particular topology. The linking number can be thought of as the number of times the rod would cross itself if it were laid in the plane with the mechanical twist and torsion all relaxed. The microtubule binding tip of dynein is a closed loop, forming the antiparallel coiled-coil. The other end of the coiled-coil is attached to the dynein head, which is large in comparison with the stalk. We argue that the coiled-coil topology is preserved, due to the large rotational diffusion constant of the dynein head. We therefore take $\Delta Lk = 0$ for the fluctuations we consider for the dynein coiled-coil. The twist is $Tw = \int(\gamma_{\rm rod} - \gamma_0)\,ds$, where $\gamma_{\rm rod} - \gamma_0$ is the fluctuational change in helicity of a rod. The twist $Tw$ is therefore made up of the mechanical twist of a rod and the geometric torsion. Note that at equilibrium, since there is no mechanical twist, the helicity of an individual rod is the same as that of the coiled-coil ($\gamma_0$). The writhe is $Wr = \int w\,ds$, where $w$ is the change in writhe per unit length. The writhe is the change in angle of the normal to the rod surface due to the path of the rod in space. Figure 4.8 shows an example of writhe in which the normal $\mathbf{n}$ is rotated by 90° but the tangent $\mathbf{t}$ returns to the same direction. The conservation of linking number, $Lk$, therefore leads to the calculation

$$\Delta Lk = 0 = \int w\,ds + \int(\gamma_{\rm rod} - \gamma_0)\,ds$$

$$\gamma_{\rm rod} = \gamma_0 - w$$

$$t_{\rm rod} = \gamma_0 - w - \tau.$$
(4.27)

The writhe will affect the relative slide between the rods. The slide parallel to the neutral axis is given by calculating $\Delta s = \Delta r_z/\cos\alpha$, where $\cos\alpha = (1 + \gamma_0^2\rho^2)^{-1/2}$. The slide perpendicular to the neutral axis is along the tangent of the cylinder of radius $\rho$ around the neutral axis through the centres of each rod, shown in figure 4.7b. This slide is given by the writhe (change in angle of the normal to the rod surface) multiplied by the radius, giving $\rho\int w\,ds$. The relative slide is therefore

$$\Delta\mathbf{s} = \Big(\zeta - 2\rho(1 + \gamma_0^2\rho^2)^{1/2}(c_x\cos\gamma_0 s + c_y\sin\gamma_0 s)\,s\Big)\,\hat{\mathbf{s}}_{\parallel} + \rho w s\,\hat{\mathbf{s}}_{\perp}. \qquad (4.28)$$

Figure 4.8: Diagram to show writhe in a bar. The bar is bent three times. The normal to the bar $\mathbf{n}$ is rotated by 90° but the tangent $\mathbf{t}$ returns to the same direction. The rotation is due to the writhe of the bar in three dimensions. Figure taken from Maggs [2001].

We substitute the curvature (equation (4.22)), slide (equation (4.28)), mechanical twist (equation (4.27)) and torsion (equation (4.25)) into equation (4.20), to obtain the elastic energy to quadratic order. We then minimise this total elastic energy with respect to $w$ to allow the system to choose its optimal balance between writhe and twist. This gives

$$w_{\min} = \frac{-24\mu\pi\rho\,c_x\sin(l_0\gamma_0/2)}{\gamma_0^2 l_0\big(k_0 l_0 + 3l_0(k_1 + k_{-1}) + 12\mu\pi\rho^2\big)}. \qquad (4.29)$$

Substituting this value back into the Hamiltonian gives $H = \frac{1}{2}\mathbf{x}^T K\mathbf{x}$, where $\mathbf{x} = (\zeta, c_x, c_y)$ and the components of $K$ are given as

$$K_{\zeta\zeta} = k_0 + k_{-1} + k_1$$

$$K_{\zeta c_x} = K_{c_x\zeta} = a_1(k_1 - k_{-1})$$

$$K_{\zeta c_y} = K_{c_y\zeta} = a_2(k_1 + k_{-1}) + a_3k_0$$

$$K_{c_xc_x} = a_4Y + a_5\mu + a_1^2(k_1 + k_{-1}) + a_6k_0 + \frac{a_7\mu^2}{a_8\big(3(k_1 + k_{-1}) + k_0\big) + a_9\mu}$$

$$K_{c_xc_y} = K_{c_yc_x} = a_1a_2(k_1 - k_{-1})$$

$$K_{c_yc_y} = a_{-4}Y + a_{-5}\mu + a_2^2(k_1 + k_{-1}) + a_{-6}k_0 \qquad (4.30)$$

where

$$a_1 = -\rho l_0(1 + \gamma_0^2\rho^2)^{1/2}\cos\left(\frac{\gamma_0 l_0}{2}\right)$$

$$a_2 = -\rho l_0(1 + \gamma_0^2\rho^2)^{1/2}\sin\left(\frac{\gamma_0 l_0}{2}\right)$$

$$a_3 = 2\rho l_0(1 + \gamma_0^2\rho^2)^{1/2}\left(\frac{1}{l_0\gamma_0}\cos\left(\frac{\gamma_0 l_0}{2}\right) - \frac{2}{l_0^2\gamma_0^2}\sin\left(\frac{\gamma_0 l_0}{2}\right)\right)$$

$$a_{\pm4} = \frac{\pi\rho^4}{4}\left(l_0 \pm \frac{1}{\gamma_0}\sin(\gamma_0 l_0)\right)$$

$$a_{\pm5} = \frac{\pi\rho^2}{\gamma_0^2}\left(l_0 \pm \frac{1}{2\gamma_0}\sin(\gamma_0 l_0)\right)$$

$$a_{\pm6} = \rho^2 l_0^2(1 + \gamma_0^2\rho^2)\left(\frac{1}{6} \pm \left(\frac{\sin(\gamma_0 l_0)}{2\gamma_0 l_0} + \frac{\cos\gamma_0 l_0}{l_0^2\gamma_0^2} - \frac{\sin\gamma_0 l_0}{l_0^3\gamma_0^3}\right)\right)$$

$$a_7 = -48\pi^2\rho^4\sin^2\left(\frac{\gamma_0 l_0}{2}\right)$$

$$a_8 = \gamma_0^4 l_0^2$$

$$a_9 = 12\pi\rho^2\gamma_0^4 l_0. \qquad (4.31)$$

We then obtain the free energy and allosteric free energy from equations (4.1) and (4.2). For comparison we calculated the writhe explicitly from the equation

$$Wr = \frac{1}{4\pi}\int\!\!\int ds\,ds'\;\frac{d\mathbf{r}(s)}{ds}\times\frac{d\mathbf{r}(s')}{ds'}\cdot\frac{\mathbf{r}(s) - \mathbf{r}(s')}{|\mathbf{r}(s) - \mathbf{r}(s')|^3} \qquad (4.32)$$

[Maggs, 2001]. Calculating the writhe in this way does not require our simplifying assumption of conserved topology. We found that to two significant figures the results for the allosteric free energy were not changed.

4.5.2 Results

The dependence of $\Delta\Delta G$ for slide and bend modes with coiled geometry on the shear modulus $\mu$ and Young’s modulus $Y$ is given in figures 4.9 and 4.10 respectively. For rotationally stiff rods ($\mu \to \infty$) the parallel rigid rods result (equation (4.11)) is approached. Similarly, rods that are stiff to bending ($Y \to \infty$) approach this same result. The small curvature approximation of equation (4.19) breaks down for $Y < 10^3$ MPa and $\mu < 10^3$ MPa (the persistence length becomes of order 10 nm $< l_0$). Figure 4.11 shows that, for coiled, flexible, inextensible rods with slide and bend fluctuations, using the parameters of section 4.7, $\kappa = 100$ is in the saturation region, giving a free energy of

$$\Delta\Delta G = -k_BT\ln(Z_{\rm holo}/Z_{\rm apo}) = -0.7\,k_BT. \qquad (4.33)$$

Figure 4.9: Allosteric free energy $\Delta\Delta G$ against the shear modulus of each rod $\mu$ showing the effect on the allosteric free energy for coiled geometry for slide and bend fluctuations.
Figure 4.10: Allosteric free energy $\Delta\Delta G$ against the Young’s modulus of each rod $Y$ showing the effect on the allosteric free energy for coiled geometry for slide and bend fluctuations.

From this it is clear that the coiled geometry partially restores the allosteric communication seen for rigid rods. This may be understood from the effective increase in the bending modulus achieved by coupling bending to twist of the helices by the coiled geometry. Thus, a coiled geometry provides a way of increasing the value of the allosteric free energy from that discussed in section 4.4.2.

Figure 4.11: Allosteric free energy $\Delta\Delta G$ against clamping $\kappa = k_1/k_0 = k_{-1}/k_0$ showing the effect of clamping on the allosteric free energy for coiled geometry for slide and bend fluctuations.

4.6 Coiled geometry: slide, bend and twist

4.6.1 Model

We now also include a twisting mode. By a fluctuation, $t$, in the twist we mean a fluctuation in the helicity, $\gamma_{\rm rod}$, which is made up of some mechanical twist and some torsion, and is governed by the conservation of linking number. The condition

$$\Delta Lk = 0 = \int w\,ds + \int(\gamma_{\rm rod} + t - \gamma_0)\,ds$$

$$0 = w + \gamma_{\rm rod} + t - \gamma_0 \qquad (4.34)$$

shows us that the writhe $w$ must decrease by $t$. We therefore repeat the calculation as in section 4.5, but after we have minimised with respect to $w$ we substitute in not $w = w_{\min}$ but $w = w_{\min} - t$.
Then we obtain a Hamiltonian of the form $H = \frac{1}{2}\mathbf{x}^T K\mathbf{x}$, where $\mathbf{x} = (\zeta, c_x, c_y, t)$ and $K$ is a $4\times4$ matrix with components

$$K_{\zeta\zeta} = k_0 + k_{-1} + k_1$$

$$K_{\zeta c_x} = K_{c_x\zeta} = a_1(k_1 - k_{-1})$$

$$K_{\zeta c_y} = K_{c_y\zeta} = a_2(k_1 + k_{-1}) + a_3k_0$$

$$K_{\zeta t} = 0$$

$$K_{c_xc_x} = a_4Y + a_5\mu + a_1^2(k_1 + k_{-1}) + a_6k_0 + \frac{a_7\mu^2}{a_8\big(3(k_1 + k_{-1}) + k_0\big) + a_9\mu}$$

$$K_{c_xc_y} = K_{c_yc_x} = a_1a_2(k_1 - k_{-1})$$

$$K_{c_xt} = 0$$

$$K_{c_yc_y} = a_{-4}Y + a_{-5}\mu + a_2^2(k_1 + k_{-1}) + a_{-6}k_0$$

$$K_{c_yt} = 0$$

$$K_{tt} = a_{10}\big(3(k_1 + k_{-1}) + k_0\big) + a_{11}\mu \qquad (4.35)$$

where $a_1$ to $a_9$ are given by equations (4.31) and

$$a_{10} = \frac{\rho^2 l_0^2}{12}, \qquad a_{11} = \pi\rho^4 l_0. \qquad (4.36)$$

We then obtain the free energy and allosteric free energy from equations (4.1) and (4.2).

4.6.2 Results

The dependence of $\Delta\Delta G$ for slide, bend and twist modes for coiled geometry on the shear modulus $\mu$ and Young’s modulus $Y$ is given in figures 4.12 and 4.13 respectively. For rotationally stiff rods ($\mu \to \infty$) the parallel rigid rods result (equation (4.11)) is approached. For rods stiff to bending ($Y \to \infty$), however, the limit approached is this parallel rigid rods result plus an additional term, which is non-zero for finite $\mu$. Figure 4.14 shows that including this twist mode restores the non-saturating behaviour at high $\kappa$ seen for the parallel rigid rods. This is due to the absence of slide-twist coupling, so twisting is allowed without sliding. The second ligand binding therefore does not restrict the twisting, so it has a much smaller effect than the first ligand binding, thereby increasing the allosteric effect. Figure 4.12 shows the $\mu$ dependence of $\Delta\Delta G$ for three different values of $\kappa = 100, 500, 1000$, showing the increased allosteric signal for increased values of $\kappa$. Figure 4.15 shows the $\mu$ dependence of $\Delta\Delta G$ for two different values of $Y \sim 10^3, 10^4$ MPa. The low $\mu$ behaviour is altered but the high $\mu$ saturation is unaffected by $Y$. For our physically relevant parameters (section 4.7), including the twisting mode for coiled flexible rods restores the allosteric communication to the same as the rigid rods result to two significant figures ($\Delta\Delta G = -2.0\,k_BT$).

Figure 4.12: Allosteric free energy $\Delta\Delta G$ against the shear modulus of each rod $\mu$ showing the effect on the allosteric free energy for coiled geometry for slide, bend and twist fluctuations. Solid red line is $\kappa = 100$, dashed green line is $\kappa = 500$ and dotted blue line is $\kappa = 1000$.

Figure 4.13: Allosteric free energy $\Delta\Delta G$ against Young’s modulus $Y$ showing the effect on the allosteric free energy for coiled geometry for slide, bend and twist fluctuations.

Figure 4.14: Allosteric free energy $\Delta\Delta G$ against clamping $\kappa = k_1/k_0 = k_{-1}/k_0$ showing the effect of the clamping on the allosteric free energy for coiled geometry for slide, bend and twist fluctuations.

4.7 Parameterisation

The geometry of dynein is known from electron microscopy imaging by Burgess et al. [2003, 2004a], giving the values in equations (4.39) and (4.40). We take the pitch length to be 13 nm from Offer and Sessions [1995], giving the value in equation (4.41). The bulk elasticity Young’s modulus (the ratio of stress to strain for deformation along a single axis) has typical values of $Y \sim 10^{9}\,\mathrm{J\,m^{-3}}$ for non-crystalline soft matter [Boal, 2002]. The persistence length of a rod of isotropic elasticity and transverse moment of inertia $I$ is $l_p = YI/(k_BT)$. For long alkanes $l_p \sim 0.5$ nm, for F-actin $l_p \sim 10\,\mu$m, and for microtubules $l_p \sim 1$ to $6$ mm [Boal, 2002]. Therefore $Y \sim 10^{9}\,\mathrm{J\,m^{-3}}$ for most filaments [Boal, 2002]. We expect $l_p$ and $Y$ of an alpha helix to be less than that for microtubules and actin but more than long alkanes.
Figure 4.15: Allosteric free energy $\Delta\Delta G$ against the shear modulus of each rod $\mu$ showing the effect on the allosteric free energy for coiled geometry for slide, bend and twist fluctuations. Solid red line is $Y \sim 10^3$ MPa and dashed green line is $Y \sim 10^4$ MPa.

We obtained an estimate of the persistence length of an alpha helix by considering the normal modes of a simple polyalanine alpha helix with 100 residues (since the coiled-coil helices in dynein are about this long). We used the Nmode program in AMBER [Case et al., 2004] with a distance dependent dielectric constant to model solvent implicitly. We set the mass matrix to the identity to calculate the non-mass-weighted eigenvalues. This gave a frequency of $\nu_0 = 1.56\,\mathrm{amu^{1/2}\,cm^{-1}}$ for the lowest mode (bend). The eigenvalue is therefore $\lambda_0 = 2\pi(\nu_0)^2 = 2.3\times10^{-5}\,\mathrm{kg\,s^{-2}}$, which is equal to the effective spring constant for the mode, $k_{\rm sp} = 2.3\times10^{-5}\,\mathrm{J\,m^{-2}}$. Reading off the amplitude of displacement of the end atom from the eigenvector calculated by Nmode we find $\delta_e = 0.066$ Å, and the displacement of the middle atom is $\delta_m = 0.037$ Å. From geometry (see figure 4.16) the radius of maximally excited curvature is $R \approx \dfrac{l^2}{8(\delta_e + \delta_m)}$. We use this to find $R = 2.9\,\mu$m. Using $H = \frac{1}{2}\lambda_0|x|^2 = \frac{1}{2}YI\,l\left(\frac{1}{R}\right)^2$, where $|x|^2 = 1$ Å$^2$ (the unit eigenvector of the force constant matrix in AMBER has units of Å), we obtain $Y = 2.5\times10^{9}\,\mathrm{J\,m^{-3}}$, in line with our expectations. This corresponds to a persistence length $l_p = YI/(k_BT) = 30$ nm. A regular polyalanine helix may provide an upper bound to $Y$ for the less regular dynein helix.

Figure 4.16: Diagram to show geometry of the lowest normal mode (bend) used to calculate the Young’s modulus $Y$ of an alpha helix from the NMA using AMBER.

We expect the shear modulus $\mu$ to be the same order of magnitude as $Y$.
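The arithmetic of the Young’s modulus and persistence length estimate can be retraced with a short script. This is a sketch under stated assumptions: the temperature of 300 K is our assumption (it is not given in the text), the helix length $l$ is taken equal to $l_0 = 15.5$ nm, and the bending energy of the mode is taken as $\frac{1}{2}YIl/R^2$, that is the energy per unit length integrated over the helix. The variable names are illustrative.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant / J K^-1
T = 300.0             # assumed temperature / K (not stated in the text)

rho = 5.0e-10         # helix radius / m
length = 1.55e-8      # helix length l / m (assumed equal to l_0)
lam0 = 2.3e-5         # lowest bend-mode eigenvalue / kg s^-2
x_sq = 1.0e-20        # |x|^2 = 1 Angstrom^2, expressed in m^2
delta_e = 0.066e-10   # end-atom displacement / m
delta_m = 0.037e-10   # middle-atom displacement / m

# Second moment of area of a circular cross section.
area_moment = math.pi * rho**4 / 4.0

# Radius of curvature of the bend mode from the end and middle displacements.
radius = length**2 / (8.0 * (delta_e + delta_m))

# Equate (1/2) lam0 |x|^2 with (1/2) Y I l / R^2 and solve for Y.
young = lam0 * x_sq * radius**2 / (area_moment * length)

# Persistence length l_p = Y I / (k_B T).
l_p = young * area_moment / (K_B * T)

if __name__ == "__main__":
    print(f"R   = {radius * 1e6:.2f} micrometres")
    print(f"Y   = {young:.2e} J m^-3")
    print(f"l_p = {l_p * 1e9:.1f} nm")
```

With these inputs the script gives a radius of curvature of about 2.9 μm, a Young’s modulus close to the quoted $2.5\times10^{9}\,\mathrm{J\,m^{-3}}$ and a persistence length of roughly 30 nm; small differences from the quoted values reflect rounding of the inputs.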
We used the polyalanine normal mode analysis to estimate $\mu$ from the lowest twisting mode. The non-mass-normalised constant gave us a frequency of $\nu_0 = 6.5\,\mathrm{cm^{-1}\,amu^{1/2}}$. The end atom is found to be displaced by $\delta = 0.066$ Å for the lowest twisting mode. From geometrical considerations (see figure 4.17), $\dfrac{\partial\phi}{\partial s} \approx \dfrac{2\delta}{\rho l} = 1.7\times10^{6}\,\mathrm{m^{-1}}$. From $H = \frac{1}{2}\lambda_0|x|^2 = \frac{1}{4}\mu\pi\rho^4 l\left(\dfrac{\partial\phi}{\partial s}\right)^2$ we obtain $\mu = \dfrac{\lambda_0|x|^2 l}{2\pi\delta^2\rho^2} = 9.1\times10^{8}\,\mathrm{J\,m^{-3}}$.

Figure 4.17: Diagram to show geometry of the lowest twist normal mode used to calculate the shear modulus $\mu$ of an alpha helix from the NMA using AMBER. (a) The lowest twist mode of a rod of length $l$. The straight groove (dashed line) becomes twisted (solid line). (b) Looking at the rod end on (radius $\rho$). The twist is described by the angle $\phi$ and the displacement of the end atom is $\delta$.

The adhesive resistance to mutual sliding $k_0$ will be due to the hydrophobic effect that holds the two alpha helices together in a coiled-coil. To estimate the magnitude of this we take the surface tension of an oil water interface, giving typically $T = 5\times10^{-2}\,\mathrm{J\,m^{-2}}$ [Boal, 2002]. The change in energy due to sliding $\Delta s$ is then $\Delta E = Tw\Delta s = \frac{1}{2}k_0\Delta s^2$, where $w$ is the width of the hydrophobic stripe, which we take to be of the order of $\Delta s$, so $k_0 \sim 0.1\,\mathrm{J\,m^{-2}}$ (equation (4.44)). To estimate the order of magnitude of $\kappa_{\pm1}$ we calculate the effect of introducing charges that cause the clamping by the Coulomb interaction energy $\dfrac{q^2}{4\pi\epsilon_0 r}$, where $q$ is the charge. We take the separation between charges, $r$, to be $\rho$. We equate the change in Coulomb interaction energy to the energy of sliding $\Delta s$ to obtain

$$\frac{1}{2}k_1(\Delta s)^2 = \frac{q^2}{4\pi\epsilon_0}\left(\frac{1}{(\rho^2 + (\Delta s)^2)^{1/2}} - \frac{1}{\rho}\right) \qquad (4.37)$$

$$k_1 \approx \frac{q^2}{4\pi\epsilon_0\rho^3}. \qquad (4.38)$$

Koonce and Tikhonenko [2000] investigate the effect of alanine substitutions of conserved charged residues in the microtubule binding region of dynein. They find there are 4 charged residues that affect the ATP-stimulated release of dynein from microtubules.
It may be that these charges are the ones that form our $k_1$. We therefore take $q = 4e$. This gives us an estimate of $\kappa_1 = k_1/k_0 = 590$. As a conservative estimate we take $\kappa_{\pm1} = 100$, so $k_{\pm1}$ is two orders of magnitude larger than $k_0$.

We write here for convenience all the parameters used:

$$l_0 = 15.5\,\mathrm{nm} = 1.55\times10^{-8}\,\mathrm{m} \qquad (4.39)$$

$$\rho = 0.5\,\mathrm{nm} = 5.00\times10^{-10}\,\mathrm{m} \qquad (4.40)$$

$$\gamma_0 = \frac{2\pi}{h_0} = \frac{2\pi}{13\,\mathrm{nm}} = 4.8\times10^{8}\,\mathrm{m^{-1}} \qquad (4.41)$$

$$Y \sim 2.5\times10^{9}\,\mathrm{J\,m^{-3}} \qquad (4.42)$$

$$\mu \sim 9.1\times10^{8}\,\mathrm{J\,m^{-3}} \qquad (4.43)$$

$$k_0 \sim 0.1\,\mathrm{J\,m^{-2}} \qquad (4.44)$$

$$\kappa_{\pm1} = \frac{k_{\pm1}}{k_0} = 100. \qquad (4.45)$$

4.8 Discussion

We compare our calculated values for $\Delta\Delta G$ with experimental values for dynein affinity for microtubules. Kon et al. [2004] measure the kinetics of single headed cytoplasmic dynein binding microtubules in ATP. The wild type gives an association constant of $K = 3\times10^{4}\,\mathrm{M^{-1}}$. From $\Delta G = -RT\ln K$ we obtain $\Delta G \sim -10\,k_BT$. A mutated form that prevents ATP binding to the ‘P1’ site gives $K = 5\times10^{6}\,\mathrm{M^{-1}}$ (giving $\Delta G \sim -15\,k_BT$). Assuming the wild type is the ATP bound form and the mutant is the free form of dynein, we obtain $\Delta\Delta G(\text{free} - \text{ATP bound}) \sim -5\,k_BT$. Earlier, less direct work by Porter and Johnson [1983a,b] and Omoto and Johnson [1986] on a three headed dynein, under the simple assumption that the heads are independent, leads to a similar but lower value than Kon et al. [2004] for the allosteric free energy. This value is consistent with the expectation that the successive binding of the three heads will actually be cooperative. Porter and Johnson [1983a] obtain a lower limit for the association constant, $K \sim 10^{7}\,\mathrm{M^{-1}}$, from titrations of free three-headed Tetrahymena dynein binding to bovine brain microtubules. This gives $\Delta G \sim -16\,k_BT$ and, assuming the heads are independent, we expect $\Delta G \sim -5.4\,k_BT$ for a single headed dynein. The same authors [Porter and Johnson, 1983b] estimate the lower limit of the dissociation rate constant of the dynein with ATP bound from the microtubules to be $k_d \sim 1000\,\mathrm{s^{-1}}$, from stopped flow light scattering methods.
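The conversions from association constants to binding free energies used in this comparison follow directly from $\Delta G = -RT\ln K$. The following minimal sketch reproduces the numbers quoted above; it assumes a standard state of 1 M, so that $K$ in units of $\mathrm{M^{-1}}$ can be used directly inside the logarithm, and the function name is ours.

```python
import math


def binding_free_energy(k_assoc: float) -> float:
    """Binding free energy in units of k_B T (RT per mole) for K in M^-1.

    Assumes a 1 M standard state, so K is used as a pure number.
    """
    return -math.log(k_assoc)


# Kon et al. [2004]: wild type (taken as ATP bound) and P1 mutant (taken as free).
dg_atp_bound = binding_free_energy(3e4)
dg_free = binding_free_energy(5e6)
ddg_kon = dg_free - dg_atp_bound

# Porter and Johnson [1983a]: three-headed dynein, heads assumed independent.
dg_three_heads = binding_free_energy(1e7)
dg_per_head = dg_three_heads / 3.0

if __name__ == "__main__":
    print(f"dG (ATP bound)   = {dg_atp_bound:6.1f} k_BT")
    print(f"dG (free)        = {dg_free:6.1f} k_BT")
    print(f"ddG (free - ATP) = {ddg_kon:6.1f} k_BT")
    print(f"dG (three heads) = {dg_three_heads:6.1f} k_BT")
    print(f"dG (per head)    = {dg_per_head:6.1f} k_BT")
```

The script gives roughly $-10$, $-15$ and $-5\,k_BT$ for the Kon et al. values, and $-16$ and $-5.4\,k_BT$ for the three-headed and single-headed estimates.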
Omoto and Johnson [1986] give an association rate constant for ADP bound dynein of $k_a = 1.2\times10^{4}\,\mathrm{M^{-1}\,s^{-1}}$. Combining these values gives an equilibrium constant of $K = k_a/k_d \sim 12\,\mathrm{M^{-1}}$. Since dynein binding to microtubules is unfavourable in this state, we assume this value corresponds to the affinity of one head (Porter and Johnson [1983b] say that the kinetics they measure is that expected for one ATP needed to dissociate the dynein). This gives $\Delta G \sim -2.5\,k_BT$. Combining these gives us a value of $\Delta\Delta G(\text{free} - \text{ADP}\cdot\text{V}_i) \sim -2.9\,k_BT$, which is lower but of the same order as that obtained from Kon et al. [2004] of $\Delta\Delta G(\text{free} - \text{ATP bound}) \sim -5\,k_BT$. Comparing this with our calculations, we note that if the ATP unclamps the end so that $k_{-1} = 0$ for the bound form, compared to $k_{-1}/k_0 = \kappa_{-1}$ for the free form, this corresponds to our value of $\Delta\Delta G \sim -2.0\,k_BT$. This value uses an estimate of $\kappa = 100$. If we use $\kappa = 1000$ we obtain $\Delta\Delta G \sim -3.1\,k_BT$. Our calculated values are sufficiently close to the experimental values, when a physical range of binding forces is assumed, for us to take this as quantitative evidence for our hypothesis: that dynein allostery is dominated by changes in the vibrational dynamics of the coiled-coil. There exist static contributions to binding affinities that are unaffected by changes in the flexibility of the coiled-coil. These were investigated by Mizuno et al. [2004], who measure the dissociation constant of the microtubule binding domain at the tip of the stalk (the dynein stalk head, DSH) binding to microtubules, giving an association constant of $K = 6\times10^{5}\,\mathrm{M^{-1}}$. This gives an indication of the static (mainly enthalpic) contribution $\Delta G \sim -13\,k_BT$ to the binding of each state. This value is consistent with the lower wild type ATP bound association due to the large entropic cost of binding this flexible form. Interestingly, biochemical studies have also shown that there is allosteric communication in the other direction in dynein.
Namely, as well as the presence of ATP determining microtubule release, after ATP hydrolysis the binding of microtubules accelerates the release of products ADP and P$_i$ from dynein, completing the ATPase cycle [Johnson, 1985]. This product release is thought to be coupled to the net movement of the motor [Porter and Johnson, 1989]. ADP release is thought to be the rate limiting step in the dynein ATPase cycle [Holzbaur and Johnson, 1989a]. ADP release in the absence of microtubules has $K_d = 0.085$ mM from Holzbaur and Johnson [1989a]. ADP release from microtubule bound dynein has an equilibrium constant of $K_d = 0.37$ mM [Holzbaur and Johnson, 1989b]. This gives us a value for the allosteric free energy of ADP release of $\Delta\Delta G(\text{microtubules} - \text{free}) \sim -1.5\,k_BT$. The microtubule bound dynein in our model has $k_1$ on (clamped), reducing the vibrational free energy cost required to obtain the $k_{-1}$ clamped (free from ATP) state. Thus by tuning the values of $\kappa_1$ and $\kappa_{-1}$ our model can explain this back communication too. To illustrate how our model can account for this ‘reverse allostery’ quantitatively, we can allow ADP to partially unclamp $k_{-1}$ to a small value $k_{\rm ADP}$ rather than zero, compared to the effect of ATP fully unclamping $k_{-1} = 0$. If we use $\kappa_1 = \kappa_{-1} = 1000$ we reproduce the allostery from the microtubules to the ADP (back communication), $\Delta\Delta G = -1.5\,k_BT$, if $\kappa_{\rm ADP} = k_{\rm ADP}/k_0 = 1.2$.

To further test our hypothesis of this vibrational allosteric mechanism we compare calculations of the effective Young’s modulus of the composite coiled-coil bending modes with that obtained from the observations of the changes in distribution of stalk tip positions from electron microscopy images of dynein-c by Burgess et al. [2003, 2004b].
We calculate the effective Young’s modulus of the composite coiled-coil for the bending mode in one plane from the $K_{c_x c_x}$ element of our elasticity tensor, $Y_{\mathrm{eff}} = K_{c_x c_x}/(l_0 I)$, giving $Y_{\mathrm{eff}}(\mathrm{ADP}) = 3.8 \times 10^{10}\,\mathrm{J\,m^{-3}}$ and $Y_{\mathrm{eff}}(\mathrm{apo}) = 3.3 \times 10^{11}\,\mathrm{J\,m^{-3}}$. The effective Young’s modulus due to the bending mode in the plane perpendicular to this is given by $Y_{\mathrm{eff}} = K_{c_y c_y}/(l_0 I)$, giving $Y_{\mathrm{eff}}(\mathrm{ADP}) = 3.0 \times 10^{10}\,\mathrm{J\,m^{-3}}$ and $Y_{\mathrm{eff}}(\mathrm{apo}) = 1.6 \times 10^{11}\,\mathrm{J\,m^{-3}}$. From the standard deviations of the angles (variance $\langle\Delta\theta^2\rangle$) quoted by Burgess et al. [2003], Lindemann and Hunt [2003] calculated effective spring constants for each state from the equipartition theorem, $k_{\mathrm{eff}} = k_BT/\langle\Delta x^2\rangle$, where they took $\langle\Delta x^2\rangle = l_0^2\langle\Delta\theta^2\rangle$. Alternatively, the distribution of curvatures of the stalk can be calculated, $\langle c^2\rangle = 4\langle\Delta\theta^2\rangle/l_0^2$, and used to obtain an effective Young’s modulus of the composite coiled-coil structure for each state, $Y_{\mathrm{eff}} = k_BT/(l_0 I \langle c^2\rangle)$, giving $Y_{\mathrm{eff}}(\mathrm{ADP}) = 2.7 \times 10^{9}\,\mathrm{J\,m^{-3}}$ and $Y_{\mathrm{eff}}(\mathrm{apo}) = 8.9 \times 10^{9}\,\mathrm{J\,m^{-3}}$. These values are lower than our calculated values and show slightly less contrast between the different states. This is consistent with our parameterisation of the Young’s modulus for a single helix from the polyalanine calculation (section 4.7), which may provide an upper bound for the less regular dynein helices, and also with the expectation that some of the apparent flexibility observed in both states is due to artifacts of the experimental method (e.g. the adsorption of dynein onto the carbon surface and the projection onto the two-dimensional image). Note that the observed two-dimensional images do show some information about the out of plane bending, since scatter along the length of the stalk is seen in figure 4.1, which may be interpreted as out of plane bending.

We also note that the allosteric signal has a significant dependence on the number of turns, that is, the ratio of the length to the pitch ($h = 2\pi/\gamma_0$).
Figure 4.18: Allosteric free energy $\Delta\Delta G/k_BT$ against the length in units of number of turns $l_0/h$ (where $h$ is the pitch $h = 2\pi/\gamma_0$, which we fix) for coiled geometry for slide, bend and twist fluctuations.

As figure 4.18 shows, the phase of the coil controls the degree of coupling between twist and bend. This appears in the oscillatory nature of $\Delta\Delta G$ when plotted as a function of the number of turns $l_0/h$. Integral numbers of turns give the minimum allosteric effect, with the maximum effect at half-integral numbers of turns. This is because, for integral numbers of turns, the allosteric effect due to the bend-slide coupling is zero, but it is maximum for half turns. The parameters we have used (section 4.7) in our calculations give the number of turns $l_0/h = 1.2$, which interestingly does not correspond to a maximum in $|\Delta\Delta G|$ (figure 4.18). It may turn out that the length of the stalk is different from that previously assumed, since the precise boundaries of the coiled-coil are hard to predict [Gibbons et al., 2005]. Mutant forms of dynein might be used to explore this prediction by varying the length of the coiled-coil.

The predictions of this work suggest a number of possible biochemical investigative experiments. The predicted allosteric free energy of $\sim 2\,k_BT$ may be investigated by calorimetry, which would show the entropic and enthalpic contributions. The predicted changes in effective Young’s modulus of the dynein stalk could be investigated more accurately using cryo-electron microscopy, which would avoid artifacts of adsorbing to a surface. Such flexibility could also be studied by molecular dynamics simulations, subject to the availability of suitable crystal structures. Mutations that alter the interactions between the helices in the coiled-coil are predicted to affect the allosteric communication due to their modulation of the slide mode.
The dependence of the allosteric free energy on the number of turns suggests that coiled-coil mutants of varied lengths would show different allosteric free energies. In particular, mutants adding 25% to the stalk length (taking $l_0/h$ from 1.2 to 1.5, a half-integral number of turns) are predicted to substantially increase the allostery. The slide mode may cause a rotation of the binding site at the tip of the stalk, further reducing its affinity for microtubules. This would lead to an enthalpic contribution to $\Delta\Delta G$ emerging at this level of modelling.

In the present model the binding of ATP releases the clamp at the base of the stalk. From observing a model structure based on homologous AAA domains [Mocz and Gibbons, 2001] it is conceivable that the binding of ATP could pull one helix away from the other, reducing the interaction between them that is present in the absence of ATP. However, the primary ATP binding site in the dynein head (the ‘P1’ site) is not the ATP binding site closest to the base of the stalk [Kon et al., 2004]. The exact mechanism of the AAA ring is not known, but it has been suggested that the AAA domains are cooperative, causing ATP induced conformational and dynamic changes at the interface between the first two domains to propagate round the ring to the site of the base of the stalk [Vale, 2000]. The recent biochemical results on the role of the different ATP sites support the idea that the P2, P3 and P4 sites work cooperatively with the primary site (maybe with regulatory roles) [Kon et al., 2004]. A more sophisticated model would combine the allostery intrinsic to the coiled-coil developed here with a similar treatment of these allosteric effects within the AAA ring. We consider this in section 6.1.4.

In conclusion, we find that a dynamic model of allosteric response is able to account for observed structural and thermodynamic data of the microtubule binding stalk of dynein.
Furthermore, it suggests that a significant allosteric free energy of $\sim 2\,k_BT$ can be achieved quite generally by coiled-coils of 10 to 20 nm in length. Significantly, the coiled rather than simply parallel configuration of the helices proves essential for their allosteric function.

Chapter 5
Coupling of global and local modes

5.1 Overview

In this chapter we present a model for a potential amplification of entropic allostery due to coupling of fast, localised modes to slow, global modes. We show how such coupling can give rise to large compensating entropy and enthalpy terms. These ideas may explain calorimetry data from experiments on the met repressor.

“It is always a delight when a great and beautiful idea proves to be consonant with reality.” [Einstein]

5.2 Allostery amplified by enslaved fast modes

Calorimetry data on the met repressor [Cooper et al., 1994] indicate large compensatory entropic and enthalpic allosteric energies. The entropic term is too large to be accounted for by the slow modes alone. There is also a large enthalpic component, yet this is unlikely to be due to major static conformational change, since the x-ray crystal structures show no structural change on effector binding [Rafferty et al., 1989]. Therefore, we devise a model to check whether, in principle, the idea of ‘enslaving’ fast modes to slow modes will account for the effect in a way that retains the dynamic nature of the allosteric signal.

In this section we present a calculation of the vibrational allosteric free energy of a system that has fast, localised modes coupled to slow, global modes. The idea is that, by coupling to the delocalised modes, the fast modes contribute to the allosteric communication despite their localised nature (unlike in the models addressed earlier in this thesis). This contribution may result in an amplification of the allosteric signal.
To make the idea more quantitative we begin with the scissor model drawn in figure 5.1 (the simplest case of ‘slow-mode’ dynamic allostery). The model consists of two rods representing, for example, alpha helices, with sidechains now drawn in black. In this model there is one slow, delocalised mode (the scissor motion between the rods). There are several fast localised modes (the motions of the sidechains of the alpha helices). When the global mode is stiff, the localised sidechain modes are also stiff. This is because, in that case, the sidechains experience their native-state surroundings (a deep narrow native potential between the alpha helices). However, as the alpha helices move due to the flexible global mode, the enslaved sidechains experience a different (and weaker) potential environment.

Figure 5.1: Single mode scissor model. The rods (brown) represent alpha helices. The sidechains of such alpha helices are shown in black. Springs with spring constants $k_1$ and $k_{-1}$ are drawn in red.

We assume therefore that the potential seen by the fast modes is dependent on the instantaneous position $x_s$ of the slow mode (or superposition of the slow modes for models with more than one enslaving slow mode). We assume that at all non-zero positions of the slow mode ($x_s \neq 0$) the fast modes will experience a flatter potential than the native one seen at the minimum of the slow mode ($x_s = 0$). Increased motion of the slow mode, therefore, results in increased flexibility of the fast modes. On the time scale of the slow mode, the average motion of the fast modes appears increased. This is due to the fact that most of the time the sidechains see a flatter potential, since the motion of the helices has displaced the sidechains away from their native positions. At certain times within the timescale of the slow mode the sidechains will see their native potential and stiffen. This stiffening will be only temporary, before the global mode moves on and the sidechains see a flatter potential again. Therefore, on the time scale of the slow mode, the average amplitude of the fast modes’ motion will be correlated with the amplitude of the average motion of the slow mode.

Here we provide a calculation of this effect using the very simple scissor model shown in figure 5.1. This has just one slow global mode controlled by the two springs, with spring constants $k_1$, $k_{-1}$ that are affected by ligand binding local to them. This is an allosteric model since the stiffening of one spring affects the vibrations of the other, due to the anchoring effect of the pivot point (which can be thought of as a spring of infinite strength). We assume that the system has $N$ fast modes corresponding to the vibrations of the sidechains shown in black in figure 5.1. $N$ such fast modes will arise from $\lesssim N$ sidechains, since each sidechain will contribute one or more fast mode degrees of freedom. To calculate the general partition function, we integrate over all the fast coordinates $x_{f_i}$ and the slow coordinate $x_s$,

\[
Z = \int dx_s \int \ldots \int dx_{f_i} \exp\left[ -\frac{1}{k_BT} \left( V_s(x_s) + \sum_{i}^{N} V_{f_i}(x_{f_i}, x_s) \right) \right], \tag{5.1}
\]

for a single slow mode and $N$ fast modes that are coupled to the slow mode. The quadratic approximation to the slow mode potential $V_s$ is

\[
V_s(x_s) = -V_{s_0} + \frac{1}{2} k_s x_s^2 = -V_{s_0} + \frac{1}{2}(k_1 + k_{-1}) x_s^2. \tag{5.2}
\]

The effective spring constant for the slow mode is $k_s = k_1 + k_{-1}$ (as in figure 5.1), where, as in section 3.2.1, the effective spring constants $k_1$, $k_{-1}$ are affected by the binding of ligands local to them. $-V_{s_0}$ is the minimum of the slow potential.
If there is no coupling or the slow mode is infinitely stiff, the fast mode potentials are, in the same harmonic approximation,

\[
V_{f_i}(x_{f_i}) = -V_{f_0} + \frac{1}{2} k_f x_{f_i}^2, \tag{5.3}
\]

where $-V_{f_0}$ is the minimum of the fast potential and $k_f$ is the curvature, which at this level of calculation we assume to be the same for each fast mode. For a fast mode coupled to a finite slow mode, we modify the potential (5.3) to

\[
V_{f_i}(x_{f_i}, x_s) = -V_{f_0} + \frac{1}{2} \left( \frac{k_f}{1 + \frac{k_k x_s^2}{2k_BT}} \right) x_{f_i}^2, \tag{5.4}
\]

where $k_k$ is the coupling strength, with dimensions of a force constant. Here, the choice of coupling function is arbitrary and chosen for analytical simplicity. By comparing equations (5.3) and (5.4) it can be seen that the instantaneous position of the slow mode, $x_s$, smoothly decreases the curvature of the enslaved mode potential (as $x_s \to \infty$ the curvature tends to zero). In this case we assume the depth of the enslaved mode potential, $-V_{f_0}$, remains constant. In section 5.3 we will consider the case where the depth is also affected by the position of the slow mode.

Substituting equations (5.2) and (5.4) into equation (5.1) and integrating over the fast modes we obtain

\[
\begin{aligned}
Z &= \int dx_s \int \ldots \int dx_{f_i} \exp\left[ -\frac{1}{k_BT} \left( -V_{s_0} + \frac{1}{2} k_s x_s^2 + \sum_{i}^{N} \left( -V_{f_0} + \frac{1}{2} \frac{k_f x_{f_i}^2}{1 + \frac{k_k x_s^2}{2k_BT}} \right) \right) \right] \\
&= \exp\left[ \frac{N V_{f_0}}{k_BT} + \frac{V_{s_0}}{k_BT} \right] \left( \frac{2\pi k_BT}{k_f} \right)^{N/2} \int dx_s \left( 1 + \frac{k_k x_s^2}{2k_BT} \right)^{N/2} \exp\left[ -\frac{k_s x_s^2}{2k_BT} \right] \\
&= \exp\left[ \frac{N V_{f_0}}{k_BT} + \frac{V_{s_0}}{k_BT} \right] \left( \frac{2\pi k_BT}{k_f} \right)^{N/2} \int dx_s\, e^{f(x_s)},
\end{aligned} \tag{5.5}
\]

where

\[
f(x_s) = -\frac{k_s x_s^2}{2k_BT} + \frac{N}{2} \ln\left( 1 + \frac{k_k x_s^2}{2k_BT} \right). \tag{5.6}
\]

The integral in equation (5.5) can be evaluated using the method of steepest descent, in which the approximation

\[
f(x_s) \approx f(\bar{x}_s) + \frac{1}{2} (x_s - \bar{x}_s)^2 f''(\bar{x}_s) \tag{5.7}
\]

is made, where $\bar{x}_s$ is the maximum point of the function $f(x_s)$. For $\frac{N}{2} > \frac{k_s}{k_k}$, setting $f'(\bar{x}_s) = 0$ gives $\bar{x}_s^2 = k_BT \left( \frac{N}{k_s} - \frac{2}{k_k} \right)$, leading to

\[
f(x_s) \approx -\left( \frac{N}{2} - \frac{k_s}{k_k} \right) + \frac{N}{2} \ln\left( \frac{N k_k}{2 k_s} \right) - \frac{k_s}{k_BT} \left( 1 - \frac{2 k_s}{N k_k} \right) (x_s - \bar{x}_s)^2. \tag{5.8}
\]

The coupling of the fast modes to the slow mode therefore renormalises the slow mode spring constant to an effective spring constant $k_s \left( 2 - \frac{4 k_s}{N k_k} \right)$. Substituting this into equation (5.5) and integrating gives

\[
Z = \exp\left[ \frac{N V_{f_0}}{k_BT} + \frac{V_{s_0}}{k_BT} - \left( \frac{N}{2} - \frac{k_s}{k_k} \right) \right] \left( \frac{2\pi k_BT}{k_f} \right)^{N/2} \left( \frac{2\pi k_BT}{k_s \left( 2 - \frac{4 k_s}{N k_k} \right)} \right)^{1/2} \left( \frac{N k_k}{2 k_s} \right)^{N/2}. \tag{5.9}
\]

From this partition function we can find the free energy of the system:

\[
\begin{aligned}
G &= -k_BT \ln Z \\
G &= -V_{s_0} - N V_{f_0} + \left( \frac{N}{2} - \frac{k_s}{k_k} \right) k_BT - \frac{1}{2}(N+1) k_BT \ln 2\pi k_BT + \frac{1}{2} N k_BT \ln k_f \\
&\quad + \frac{1}{2}(N+1) k_BT \ln k_s + \frac{1}{2} k_BT \ln\left( 2 - \frac{4 k_s}{N k_k} \right) - \frac{1}{2} N k_BT \ln\left( \frac{N k_k}{2} \right)
\end{aligned}
\]

\[
G = \frac{1}{2}(N+1) k_BT \ln k_s + \left( \frac{N}{2} - \frac{k_s}{k_k} \right) k_BT + \frac{1}{2} k_BT \ln\left( 2 - \frac{4 k_s}{N k_k} \right) + \text{constant} \tag{5.10}
\]

\[
G = \frac{1}{2}(N+1) k_BT \ln(k_1 + k_{-1}) + \text{constant}. \tag{5.11}
\]

Equation (5.10) lists explicitly only the terms that change on ligand binding. In equation (5.11) we have taken the approximation $\frac{N}{2} \gg \frac{k_s}{k_k}$ for large $N$ and $k_k$, and substituted $k_s = k_1 + k_{-1}$ for the simple scissor model. Comparing this to the form with no enslaved modes, $G = \frac{1}{2} k_BT \ln(k_1 + k_{-1})$, it is clear that enslaving $N$ modes provides an amplification of the allosteric free energy. Interestingly, provided $\frac{N}{2} \gg \frac{k_s}{k_k}$, the free energy is independent of $k_k$. For the simple scissor model, for a ligand binding in the locality of $k_{-1}$ and stiffening it to $k_{-1}'$ (e.g., binding to DNA), in the two cases of apo $k_1$ and holo $k_1'$, the allosteric signal $\Delta\Delta G = \Delta G_{\mathrm{holo}} - \Delta G_{\mathrm{apo}}$ is given by

\[
\Delta\Delta G = \frac{(N+1)}{2} k_BT \ln\left( \frac{(k_1' + k_{-1}')(k_1 + k_{-1})}{(k_1' + k_{-1})(k_1 + k_{-1}')} \right). \tag{5.12}
\]

For isothermal changes this free energy change is purely entropic, $\Delta\Delta G = -T\Delta\Delta S$. This large, purely entropic, allosteric free energy cannot account for the allostery of the met repressor, since the latter shows large compensating entropic and enthalpic terms. The potentially very large amplification of the entropic allostery (equation (5.12)) could modify the calculation of $\Delta\Delta G$ for the lac repressor (chapter 3).
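The amplification in equation (5.12) can be illustrated numerically. The following is a minimal sketch only: the function name and the spring constants are hypothetical values chosen for illustration, not parameters fitted to any protein, and energies are in units of $k_BT$.

```python
import math

def ddg_scissor(k1, k1_holo, km1, km1_bound, n_fast=0):
    """Allosteric free energy of the scissor model, equation (5.12),
    in units of k_B T. n_fast = 0 recovers the case with no enslaved modes.

    k1, k1_holo    : effector-site spring constant without / with effector
    km1, km1_bound : DNA-site spring constant without / with DNA
    """
    ratio = ((k1_holo + km1_bound) * (k1 + km1)) / (
        (k1_holo + km1) * (k1 + km1_bound))
    return 0.5 * (n_fast + 1) * math.log(ratio)

# Hypothetical spring constants (arbitrary units; only ratios matter).
springs = dict(k1=1.0, k1_holo=2.0, km1=1.0, km1_bound=10.0)

for n in (0, 6, 12):
    print(n, round(ddg_scissor(n_fast=n, **springs), 3))
```

With these illustrative values the signal is negative (stiffening at the effector site lowers the entropic cost of stiffening at the DNA site) and grows in proportion to $N + 1$, as equation (5.12) states.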
In light of this amplification, all ‘slow-mode only’ calculations give a lower bound to the magnitude of $\Delta\Delta G$. In section 6.1.1 we consider a parameterisation of the coarse-grained model presented in chapter 3 that would account for such modifications.

5.3 Dynamic enthalpic allostery

In this section we repeat the calculations in section 5.2, allowing the minimum of the potential seen by the fast modes to be affected by the position of the slow mode as well as the curvature. This is a much more realistic model because the depth, as well as the curvature, of the fast mode potential is likely to be affected (to a different extent) by the slow mode. In this case we model the potential for the fast modes using two coupling functions of $x_s$, and equation (5.3) becomes

\[
V_{f_i}(x_{f_i}, x_s) = \frac{-V_{f_0}}{1 + \frac{k_v x_s^2}{2k_BT}} + \frac{1}{2} \left( \frac{k_f}{1 + \frac{k_k x_s^2}{2k_BT}} \right) x_{f_i}^2 \tag{5.13}
\]

rather than that in equation (5.4). Substituting equations (5.2) and (5.13) into equation (5.1) and integrating over the fast modes gives

\[
\begin{aligned}
Z &= \int dx_s \int \ldots \int dx_{f_i} \exp\left[ -\frac{1}{k_BT} \left( -V_{s_0} + \frac{1}{2} k_s x_s^2 + \sum_{i}^{N} \left( \frac{-V_{f_0}}{1 + \frac{k_v x_s^2}{2k_BT}} + \frac{1}{2} \left( \frac{k_f}{1 + \frac{k_k x_s^2}{2k_BT}} \right) x_{f_i}^2 \right) \right) \right] \\
&= \exp\left[ \frac{V_{s_0}}{k_BT} \right] \left( \frac{2\pi k_BT}{k_f} \right)^{N/2} \int dx_s \left( 1 + \frac{k_k x_s^2}{2k_BT} \right)^{N/2} \exp\left[ -\frac{k_s x_s^2}{2k_BT} + \frac{N V_{f_0}/k_BT}{1 + \frac{k_v x_s^2}{2k_BT}} \right] \\
&= \exp\left[ \frac{V_{s_0}}{k_BT} \right] \left( \frac{2\pi k_BT}{k_f} \right)^{N/2} \int e^{f(x_s)}\, dx_s,
\end{aligned}
\]

where

\[
f(x_s) = -\frac{k_s x_s^2}{2k_BT} + \frac{N V_{f_0}/k_BT}{1 + \frac{k_v x_s^2}{2k_BT}} + \frac{N}{2} \ln\left( 1 + \frac{k_k x_s^2}{2k_BT} \right). \tag{5.14}
\]

As in section 5.2 we use the method of steepest descent (equation (5.7)). In this case, the maximum is at $\bar{x}_s = 0$ for $\frac{V_{f_0} k_v}{k_BT} \geq \frac{k_k}{2}$ (assuming $\frac{N}{2} \gg \frac{k_s}{k_k}$), leading to

\[
f(x_s) \approx \frac{N V_{f_0}}{k_BT} - \frac{x_s^2}{2k_BT} \left( k_s + N \left( \frac{V_{f_0} k_v}{k_BT} - \frac{k_k}{2} \right) \right). \tag{5.15}
\]

Substituting this into equation (5.14) and integrating gives

\[
Z = \exp\left[ \frac{N V_{f_0}}{k_BT} + \frac{V_{s_0}}{k_BT} \right] \left( \frac{2\pi k_BT}{k_f} \right)^{N/2} \left( \frac{2\pi k_BT}{k_s + N \left( \frac{V_{f_0} k_v}{k_BT} - \frac{k_k}{2} \right)} \right)^{1/2}. \tag{5.16}
\]

The free energy $G = -k_BT \ln Z$ is therefore

\[
G = -V_{s_0} - N V_{f_0} - \frac{1}{2}(N+1) k_BT \ln 2\pi k_BT + \frac{N}{2} k_BT \ln k_f + \frac{1}{2} k_BT \ln\left( k_s + N \left( \frac{V_{f_0} k_v}{k_BT} - \frac{k_k}{2} \right) \right)
\]

\[
G = \frac{1}{2} k_BT \ln\left( k_s + N \left( \frac{V_{f_0} k_v}{k_BT} - \frac{k_k}{2} \right) \right) + \text{constant}. \tag{5.17}
\]

As can be seen from equation (5.17), the free energy is not linear in $N$, so it does not show the amplification obtained in equation (5.11). In fact, in this case, for finite $N$ and $\frac{V_{f_0} k_v}{k_BT} > \frac{k_k}{2}$ the allosteric free energy is less than that for no enslaved modes, $N = 0$. In the simplifying case of $\frac{V_{f_0} k_v}{k_BT} = \frac{k_k}{2}$ we obtain the non-enslaved result. However, this free energy hides enthalpic and entropic terms that are affected by enslaving modes. We therefore calculate the enthalpic and entropic terms separately (assuming isothermal changes):

\[
\begin{aligned}
H &= k_BT^2 \frac{\partial \ln Z}{\partial T} \\
H &= -V_{s_0} - N V_{f_0} + \frac{1}{2}(N+1) k_BT + \frac{N k_v V_{f_0}/2}{k_s + N \left( \frac{V_{f_0} k_v}{k_BT} - \frac{k_k}{2} \right)} \\
H &= \frac{N k_v V_{f_0}/2}{k_s + N \left( \frac{V_{f_0} k_v}{k_BT} - \frac{k_k}{2} \right)} + \text{constant}.
\end{aligned} \tag{5.18}
\]

This gives a large (linear in $N$) dynamic enthalpy contribution to the allosteric free energy, where the enthalpy change is favourable for a stiffening of $k_s$. The entropy, $TS = k_BT \ln Z + k_BT^2 \frac{\partial \ln Z}{\partial T}$, includes this same term $k_BT^2 \frac{\partial \ln Z}{\partial T}$. Thus this term cancels in the free energy $\Delta G = \Delta H - T\Delta S$. In this way, the favourable enthalpy of binding pays for the entropic cost of stiffening the protein.

If the enslaved fast localised modes are sidechains at the DNA binding site, the potential that is affected by the slow mode vibrations is the potential between the protein and the DNA (i.e., the static enthalpy of binding DNA). In this way the vibrational dynamics of the slow modes affects the static enthalpy of binding, causing a ‘dynamic’ enthalpy contribution to the allostery.
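The step from equation (5.16) to equation (5.18) can be checked numerically by differentiating $\ln Z$ with respect to temperature. The sketch below is illustrative only: it works in units with $k_B = 1$, the parameter values are hypothetical (chosen to satisfy $V_{f_0}k_v/k_BT \geq k_k/2$), and the function names are ours.

```python
import math

def ln_z(T, N, Vs0, Vf0, ks, kf, kv, kk):
    """ln Z from equation (5.16), in units with k_B = 1."""
    k_eff = ks + N * (Vf0 * kv / T - kk / 2.0)
    return ((N * Vf0 + Vs0) / T
            + 0.5 * N * math.log(2.0 * math.pi * T / kf)
            + 0.5 * math.log(2.0 * math.pi * T / k_eff))

def enthalpy_closed_form(T, N, Vs0, Vf0, ks, kf, kv, kk):
    """Second line of equation (5.18); kf is accepted for a uniform
    signature but does not appear in the result."""
    k_eff = ks + N * (Vf0 * kv / T - kk / 2.0)
    return -Vs0 - N * Vf0 + 0.5 * (N + 1) * T + 0.5 * N * kv * Vf0 / k_eff

def enthalpy_numeric(T, params, h=1e-5):
    """H = T^2 d(ln Z)/dT evaluated by a central finite difference."""
    return T**2 * (ln_z(T + h, **params) - ln_z(T - h, **params)) / (2.0 * h)

# Hypothetical parameters: Vf0*kv/T = 1 >= kk/2 = 0.5.
params = dict(N=12, Vs0=5.0, Vf0=1.0, ks=1.0, kf=50.0, kv=1.0, kk=1.0)
print(enthalpy_closed_form(1.0, **params), enthalpy_numeric(1.0, params))
```

The two numbers agree to within the finite-difference error, which supports the reconstruction of the dynamic enthalpy term in equation (5.18).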
When the enslaved modes are sidechains at the DNA binding site, the enthalpy of binding to DNA, from equation (5.18) in the simplifying case of $\frac{V_{f_0} k_v}{k_BT} = \frac{k_k}{2}$, is

\[
\Delta H = -V_{s_0} - N V_{f_0} \left( 1 - \frac{k_v}{2 k_s} \right), \tag{5.19}
\]

where $-V_{s_0} - N V_{f_0}$ is the static enthalpy of binding DNA. We have made the assumption that the potential minimum seen by the DNA binding site in the free repressor is zero and that seen in the complex with DNA is $-V_{s_0} - N V_{f_0}$. It can be clearly seen from equation (5.19) that the stiffer the slow mode $k_s$, the more enthalpically favourable the DNA binding. The allosteric enthalpy has a dynamic component:

\[
\Delta\Delta H = -\frac{N V_{f_0} k_v (k_s' - k_s)}{2 k_s' k_s}. \tag{5.20}
\]

To obtain this equation we have assumed no static enthalpy change ($-V_{s_0} - N V_{f_0}$ is unchanged on effector binding). The dynamic allostery is enthalpically favourable if the holorepressor is stiffer than the aporepressor. Since equation (5.20) is linear in the number of enslaved modes $N$, this contribution can be very large. We can easily extend this result to more than one slow mode coupled to the fast modes by replacing $k_s$ with $|K_s|$, the modulus of the matrix of slow mode stiffnesses. There will be a compensating unfavourable entropy component to this allostery, given by

\[
T\Delta\Delta S = -\frac{1}{2} k_BT \ln\left( \frac{k_s^{\prime\,\mathrm{DNA}}\, k_s}{k_s'\, k_s^{\mathrm{DNA}}} \right) - \frac{N V_{f_0} k_v (k_s' - k_s)}{2 k_s' k_s}. \tag{5.21}
\]

The allosteric free energy will be the much smaller value

\[
\Delta\Delta G = \frac{1}{2} k_BT \ln\left( \frac{k_s^{\prime\,\mathrm{DNA}}\, k_s}{k_s'\, k_s^{\mathrm{DNA}}} \right), \tag{5.22}
\]

which is the same as that for no enslaved modes. This exact cancellation of the enslaving effect between the enthalpic and entropic terms is true for $\frac{V_{f_0} k_v}{k_BT} \geq \frac{k_k}{2}$. For the other case, $\frac{V_{f_0} k_v}{k_BT} < \frac{k_k}{2}$, the result will be more like that discussed in section 5.2 for $k_v = 0$. So the condition $\frac{V_{f_0} k_v}{k_BT} \gtrless \frac{k_k}{2}$ actually separates two classes of qualitatively different behaviour, giving zero or finite amplification to the net allosteric free energy.
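The compensation expressed by equations (5.20) to (5.22) can be made concrete with a short numerical sketch. The function names and parameter values below are hypothetical and serve only to show the scaling with $N$; energies are in units of $k_BT$.

```python
import math

def ddh(n_fast, vf0, kv, ks_apo, ks_holo):
    """Dynamic allosteric enthalpy, equation (5.20)."""
    return -n_fast * vf0 * kv * (ks_holo - ks_apo) / (2.0 * ks_holo * ks_apo)

def ddg(ks_apo, ks_holo, ks_apo_dna, ks_holo_dna):
    """Allosteric free energy, equation (5.22); independent of n_fast."""
    return 0.5 * math.log(ks_holo_dna * ks_apo / (ks_holo * ks_apo_dna))

def t_dds(n_fast, vf0, kv, ks_apo, ks_holo, ks_apo_dna, ks_holo_dna):
    """Allosteric entropy term T*ddS, equation (5.21)."""
    log_term = math.log(ks_holo_dna * ks_apo / (ks_holo * ks_apo_dna))
    return (-0.5 * log_term
            - n_fast * vf0 * kv * (ks_holo - ks_apo) / (2.0 * ks_holo * ks_apo))

# Hypothetical stiffnesses: the holo form is twice as stiff as the apo form,
# and both forms have the same stiffness once bound to DNA.
stiff = dict(ks_apo=1.0, ks_holo=2.0, ks_apo_dna=5.0, ks_holo_dna=5.0)

for n in (0, 10, 20):
    h = ddh(n, 1.0, 2.0, stiff["ks_apo"], stiff["ks_holo"])
    s = t_dds(n, 1.0, 2.0, **stiff)
    print(n, round(h, 3), round(s, 3), round(h - s, 3))
```

In this illustration the enthalpy and entropy terms both grow linearly with $N$, while their difference stays fixed at the value given by equation (5.22).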
Large compensatory $\Delta\Delta H$ and $\Delta\Delta S$ values, leaving a modest $\Delta\Delta G$, are precisely the behaviour seen in the calorimetry of the met repressor.

5.4 Application to the met repressor

In this section we apply the theory described in section 5.3 to the met repressor as an example case. After introducing this protein we deduce the parameters that reproduce the calorimetry data and suggest a coarse-grained model.

Introduction to the met repressor

The E. coli met repressor binds DNA only when bound to its corepressor SAM (S-adenosyl methionine), repressing the genes for the synthesis of the amino acid methionine [He et al., 2002; Cooper et al., 1994]. The met repressor is a dimer of two intertwined monomers (seen in figure 5.2a), each with 104 amino acids, giving a dimer mass of 24 kDa. One SAM binds to each monomer, and the dimer binds DNA with a β-strand binding motif [Somers and Phillips, 1992]. The met operator contains two to five ‘met boxes’ (tandem repeats of 8 bp of similar sequence). Two dimers bind to DNA and form a dimer of dimers [Phillips and Stockley, 1996]. The crystal structures of the apo and holorepressors show no significant conformational change on corepressor binding [Rafferty et al., 1989]. One theory to explain the activation of this allostery without conformational change invokes long range electrostatic interactions between the positive effector SAM and negative DNA phosphate groups [Phillips and Phillips, 1994]. However, due to screening, electrostatic interactions are usually localised in proteins. An alternative suggested allosteric mechanism invokes changes in the flexibility of the repressor on ligand binding [Cooper et al., 1994]. This is supported by crystal structure B-factors [Rafferty et al., 1989], which show a stiffening of the protein on corepressor binding. The NMR structure and dynamics for met have not yet been published, but initial results indicate changes in dynamics [Knowles et al., 2005].
5.4.1 Comparing theory with calorimetry experiments

The met repressor has large compensating enthalpic and entropic contributions to the allostery. Equilibrium measurements by Phillips and Phillips [1994] give the allosteric free energy $\Delta\Delta G = -5.5\,k_BT$. ITC experiments by Cooper et al. [1994] determined the allosteric enthalpy $\Delta\Delta H = \Delta H_{\mathrm{holo}}(\mathrm{DNA}) - \Delta H_{\mathrm{apo}}(\mathrm{DNA}) = -32\,k_BT$. This clearly shows a very large enthalpically favourable term in the allosteric free energy. This is compensated by a large unfavourable entropy contribution $T\Delta\Delta S = -26\,k_BT$.

To reproduce the observed calorimetry from the dynamic effect (equation (5.20)) we assume $V_{f_0} \sim k_BT$ (the strength of one hydrogen bond) and $k_v \sim k_s'$. The latter assumption is equivalent to saying that at the mean displacement of the slow mode, the fast mode potential is modified by order one changes to $\sim -\frac{2}{3} V_{f_0}$. There are 14 sidechains within 6 Å of DNA in the crystal structure complex (PDB ID: 1CMA [Rafferty et al., 1989]). Six of these sidechains point towards the DNA and we therefore assume the fast modes associated with these sidechains are enslaved. Assuming each of these sidechains contributes two fast modes gives an estimate of $N = 12$. We multiply equation (5.20) for a met repressor dimer by two, to obtain results to compare with the calorimetry for two dimers binding to DNA. With these values we may reproduce the experimental value of $\Delta\Delta H = -32\,k_BT$ in the case that $k_s'/k_s \sim 3.7$. In this case, the stiffening of the slow mode on effector binding results in a reduction in the amplitude $x_s$ by a factor of $1/\sqrt{3.7}$, which is expected to be realistically attainable biologically. If we assume $k_s^{\prime\,\mathrm{DNA}}/k_s^{\mathrm{DNA}} = 1$ (the slow mode stiffnesses of the apo and holorepressors when complexed with DNA are the same) and substitute $k_s'/k_s \sim 3.7$ into equation (5.22), we require four such slow modes to reproduce the observed $\Delta\Delta G = -5.5\,k_BT$.
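The estimates of this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative and relies on the assumptions stated in the text ($V_{f_0} \sim k_BT$, $k_v \sim k_s'$, $N = 12$, two dimers, equal stiffness of the DNA-bound states), together with one further assumption of ours: that each of the four slow modes stiffens by the same factor $k_s'/k_s$. Function names are hypothetical.

```python
import math

N_FAST = 12        # two fast modes for each of six DNA-facing sidechains
N_DIMERS = 2       # two met repressor dimers bind the operator
STIFFENING = 3.7   # ks'/ks required by the calorimetry

def ddh_met(ratio, n_fast=N_FAST, n_dimers=N_DIMERS):
    """Equation (5.20) with Vf0 = kBT and kv = ks', in units of k_B T:
    ddH = -N (ks'/ks - 1) / 2 per dimer."""
    return -n_dimers * n_fast * (ratio - 1.0) / 2.0

def ddg_met(ratio, n_slow, n_dimers=N_DIMERS):
    """Equation (5.22) with ks'DNA/ksDNA = 1, summed over n_slow slow modes
    that are assumed to stiffen by the same ratio, in units of k_B T."""
    return -n_dimers * n_slow * 0.5 * math.log(ratio)

ddh = ddh_met(STIFFENING)
ddg = ddg_met(STIFFENING, n_slow=4)
print(f"ddH   = {ddh:.1f} kBT")
print(f"ddG   = {ddg:.1f} kBT")
print(f"T ddS = {ddh - ddg:.1f} kBT")
print(f"slow-mode amplitude ratio = {1.0 / math.sqrt(STIFFENING):.2f}")
```

Under these assumptions the sketch gives an enthalpy close to the measured $-32\,k_BT$ and a free energy of about $-5\,k_BT$, to be compared with the measured $-5.5\,k_BT$.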
Four slow modes is not an unreasonable number: we recall that the coarse-grained model discussed in section 3.2.2 for the lac repressor contained six global modes.

5.4.2 Coarse-grained model of the met repressor

In this section we suggest and discuss a suitable coarse-grained model for the met repressor. Since the met repressor is a dimer of intertwined monomers, a model of separate rigid monomers, such as that developed in chapter 3, is not suitable. The FIRST rigid cluster decomposition (introduced in section 2.2.7) for the met repressor identifies each helix as a rigid cluster. This suggests that the met repressor (crystal structure shown in figure 5.2a) can be coarse-grained as shown in figure 5.2b.

Figure 5.2: (a) Crystal structure of the met holorepressor dimer complexed to DNA [Somers and Phillips, 1992] (drawn in ‘ribbon’ representation). Repressor monomers are in red and blue; DNA in green and SAM corepressor in yellow. (b) Coarse-grained model of the met repressor dimer, shown in the same orientation as figure 5.2a.

This coarse-grained model consists of two rods in a scissor orientation (corresponding to the alpha helices of this orientation in the met repressor). Each rod is from one of the monomers in the dimer. The corepressor binds at the end of these rods (helices) distant from the DNA binding site. At the end of these rods local to the DNA binding site the coarse-grained model has a flexible sheet (shown in purple in figure 5.2b). This corresponds to a beta strand from each monomer (that together make the beta sheet DNA binding motif) and a long alpha helix from each monomer (see figure 5.2). The beta strands from each monomer interact so strongly that they are likely to move concertedly in the low frequency modes. This flexible sheet is shown in purple in figure 5.2b to indicate that it contains residues from both monomers, which are shown in red and blue in figure 5.2a.
The lowest mode calculated by elNémo (elastic network model software introduced in section 2.2.6) is a cleft-opening mode, shown in figure 5.3a. In terms of the coarse-grained model drawn in figure 5.2b this is a rotation of the rods about the x-axis (rotation perpendicular to the plane of the paper). This opens a cleft between the rods and bends the flexible sheet that is coupled to the rods. The second lowest mode is a scissor mode (shown in figure 5.3b). The rods rotate about the z-axis and the flexible sheet twists to accommodate this motion. The third lowest mode is a rocking mode (shown in figure 5.3c). The rods rotate about the y-axis and the flexible sheet shears to accommodate this.

Figure 5.3: Met aporepressor modes calculated by elNémo. The equilibrium structure is shown in blue and the structure perturbed along the (a) lowest (‘cleft-opening’), (b) second lowest (‘scissor’) and (c) third lowest (‘rocking’) frequency mode is shown in green.

In the coarse-grained model these modes will be controlled by springs, such as the ones shown in black on figure 5.2b, that are affected by the binding of the corepressor SAM, and by moduli of the flexible sheet that are affected by DNA binding. The flexible sheet has two bending moduli (the mean and Gaussian) and one shear modulus. In principle these parameters could be determined by atomistic simulation.

As an initial parameterisation we used the six lowest collective modes from an elNémo calculation using the crystal structures PDB ID: 1CMB (met aporepressor) and PDB ID: 1CMC (met holorepressor) [Rafferty et al., 1989]. The ratios of the holo to aporepressor eigenvalues for the lowest six modes are given in table 5.1.

Mode       1     2     3     4     5     6
ks'/ks     1.44  1.49  1.64  2.07  1.27  1.17

Table 5.1: Ratio of eigenvalues for the lowest six modes of the met holo and aporepressor calculated by elNémo.

The modes
calculated by elNémo are normal modes, so to obtain $|K_s'|/|K_s|$ we simply multiply the values in table 5.1. Substituting this into equation (5.22) and multiplying by two for the two dimers gives $\Delta\Delta G \sim -2.4\,k_BT$ (assuming $k_s^{\prime\,\mathrm{DNA}}/k_s^{\mathrm{DNA}} = 1$). This value is clearly of the same order as the observed $\Delta\Delta G = -5.5\,k_BT$. Assuming the lowest three modes are coupled to the 12 fast modes of the beta sheet sidechains, we obtain from equation (5.20) $\Delta\Delta H \sim -30\,k_BT$, which is of the same order as the observed $\Delta\Delta H = -32\,k_BT$.

5.5 Anharmonicity

Another way, in principle, of introducing compensating entropic and enthalpic effects is to include the anharmonic contributions to the global modes. Throughout this thesis we have worked within the harmonic approximation for protein normal modes. In reality protein dynamics are not completely harmonic. In this section we briefly consider anharmonic effects.

Including cubic $\Lambda_{ijk}$ and quartic $\Gamma_{ijkl}$ terms in the partition function as well as harmonic terms $K_{ij}$ gives

\[
Z = \int dx\, \exp\left[ -\frac{1}{2k_BT} \left( K_{ij} x_i x_j + \Lambda_{ijk} x_i x_j x_k + \Gamma_{ijkl} x_i x_j x_k x_l \right) \right] \tag{5.23}
\]

in terms of the fluctuation coordinates $x_i$, with implicit summations over repeated indices. We expand equation (5.23) assuming small anharmonic terms, keeping two-point correlations only:

\[
\begin{aligned}
Z &= \int dx\, e^{-\frac{1}{2k_BT} K_{ij} x_i x_j} \sum_{n=0}^{\infty} \frac{1}{n!} \left( \frac{-1}{2k_BT} \Lambda_{ijk} x_i x_j x_k \right)^{n} \sum_{m=0}^{\infty} \frac{1}{m!} \left( \frac{-1}{2k_BT} \Gamma_{ijkl} x_i x_j x_k x_l \right)^{m} \\
&= \int dx\, e^{-\frac{1}{2k_BT} K_{ij} x_i x_j} \left( 1 - \frac{1}{2k_BT} \Lambda_{ijk} x_i x_j x_k - \frac{1}{2k_BT} \Gamma_{ijkl} x_i x_j x_k x_l + \frac{1}{8(k_BT)^2} \Lambda_{ijk}^2 (x_i x_j x_k)^2 \right) \\
&= Z_0 \left( 1 - \frac{1}{2k_BT} \Lambda_{ijk} \langle x_i x_j x_k \rangle - \frac{1}{2k_BT} \Gamma_{ijkl} \langle x_i x_j x_k x_l \rangle + \frac{1}{8(k_BT)^2} \Lambda_{ijk}^2 \langle (x_i x_j x_k)^2 \rangle \right) \\
&= Z_0 \left( 1 - \frac{1}{2k_BT} \Gamma_{ijkl} \langle x_i x_j \rangle \langle x_k x_l \rangle + \frac{1}{8(k_BT)^2} \Lambda_{ijk}^2 \langle x_i x_j \rangle \langle x_j x_k \rangle \langle x_k x_i \rangle \right).
\end{aligned} \tag{5.24}
\]

The subscript 0 refers to the harmonic case. We use the equipartition theorem $\langle x_i x_j \rangle = k_BT K_{ij}^{-1}$ and assume a symmetric system ($\langle x_i x_j \rangle \langle x_k x_l \rangle = \langle x_i x_k \rangle \langle x_j x_l \rangle$ etc).
The combinations of two-point correlations are counted with the aid of amputated Feynman diagrams (shown in figure 5.4). The lines in the diagrams represent the fluctuation coordinates with subscripts $i$, $j$, $k$ and $l$. The number of combinations is given by the number of different routes round a diagram back to the starting point. The number of possible paths at the start is multiplied by the number of possible forward paths at each vertex until the starting point is re-encountered. This leads to

\[
Z = Z_0 \left( 1 - 12 k_BT\, \Gamma_{ijkl} K_{ij}^{-1} K_{kl}^{-1} + \frac{45}{4} k_BT\, \Lambda_{ijk}^2 K_{ij}^{-1} K_{jk}^{-1} K_{ki}^{-1} \right). \tag{5.25}
\]

Figure 5.4: Amputated Feynman diagrams to show combinations of two-point correlations leading to equation (5.25). (a)-(c) correspond to terms of second order in two-point correlations, with coefficient $\Gamma_{ijkl}$. (d)-(h) correspond to terms of third order in two-point correlations, with coefficient $\Lambda_{ijk}^2$. The numbers of combinations corresponding to the diagrams are as follows: (a) 8 (b) 8 (c) 8 (d) 6 (e) 12 (f) 12 (g) 12 (h) 48.

From this we obtain the expansions of the thermodynamic potentials

\[
G = G_0 + (k_BT)^2 \left( 12\, \Gamma_{ijkl} K_{ij}^{-1} K_{kl}^{-1} - \frac{45}{4} \Lambda_{ijk}^2 K_{ij}^{-1} K_{jk}^{-1} K_{ki}^{-1} \right) \tag{5.26}
\]

\[
H = H_0 - (k_BT)^2 \left( 12\, \Gamma_{ijkl} K_{ij}^{-1} K_{kl}^{-1} - \frac{45}{4} \Lambda_{ijk}^2 K_{ij}^{-1} K_{jk}^{-1} K_{ki}^{-1} \right) \tag{5.27}
\]

\[
TS = TS_0 - 2 (k_BT)^2 \left( 12\, \Gamma_{ijkl} K_{ij}^{-1} K_{kl}^{-1} - \frac{45}{4} \Lambda_{ijk}^2 K_{ij}^{-1} K_{jk}^{-1} K_{ki}^{-1} \right) \tag{5.28}
\]

As can be seen from equations (5.26) - (5.28), there is partial compensation between entropy and enthalpy. The anharmonic terms will be small compared to the harmonic contribution. Due to the large number of elements of the anharmonic tensors, parameterisation of an anharmonic model is not very practical. As an initial investigation into the size of such anharmonic terms we extended the parameterisation simulations described in section 3.3.2. We pulled the monomers of the lac repressor apart by a larger separation, into the region where anharmonicity is observed.
We then fitted the resulting curve to a quartic. For rotation about the y-axis we obtained $\Gamma k_BT/K^2 \sim 10^{-3}$ and $\Lambda^2 k_BT/K^3 \sim 10^{-5}$ for the lac aporepressor. The lac holorepressor showed more anharmonicity (as expected for this more flexible state) giving $\Gamma k_BT/K^2 \sim 10^{-3}$ and $\Lambda^2 k_BT/K^3 \sim 10^{-4}$. If we assume every element of the tensors Γ and Λ is the same, this corresponds to an anharmonic term in the aporepressor free energy $G_{\mathrm{anharm}}/k_BT \sim 0.01$ and a free energy change on ligand binding $\Delta G_{\mathrm{anharm}}/k_BT \sim -10^{-3}$. The enthalpic contribution is $\Delta H_{\mathrm{anharm}}/k_BT \sim 1 \times 10^{-3}$ and the entropic contribution is $\Delta S_{\mathrm{anharm}}/k_BT \sim 2 \times 10^{-3}$. These are clearly small perturbations, indicating that anharmonic effects are not significant in the lac repressor. Proteins such as the met repressor may have larger anharmonic terms; however, these are still likely to be perturbations to the harmonic terms. The functional forms (equations (5.26) - (5.28)) show the anharmonic correction to G is the same as that to H and half that to S. Such behaviour is not that seen in the calorimetry of the met repressor. Therefore we conclude anharmonic effects are unlikely to explain the large, compensatory entropic and enthalpic effects observed in repressor proteins such as the met repressor.

5.6 Discussion

In this chapter we have presented a dynamic allosteric model that produces large compensating entropic and enthalpic terms. Applying this to the met repressor allostery reasonably explains the observed calorimetry. We have shown in principle how large amplifications of entropic allostery can be obtained if fast modes are enslaved to slow modes. Furthermore, this mechanism is much more effective than merely invoking anharmonicity of the global modes.

For isothermal changes, the allosteric enthalpy and entropy (equations (5.20) and (5.21)) are independent of temperature. The enthalpies measured by Cooper et al.
[1994] show no significant temperature dependence, in contrast to many other protein-DNA systems. Partially compensating enthalpies and entropies that depend linearly on temperature are thought to be due to the hydrophobic effect [Fisicaro et al., 2004]. Static conformational changes that bury or expose hydrophobic surface area result in such temperature dependent terms. The hydrophobic effect is primarily an entropic effect but there is also a smaller compensating enthalpic effect. One classic explanation of the entropic term in the hydrophobic effect is that the solvation of hydrophobic molecules requires the formation of a cage of ordered water molecules around the hydrophobic molecule [Kauzmann, 1959]. This ordering of solvent molecules is associated with a decrease in entropy. This entropy change on exposure of more hydrophobic surface area becomes more positive with increasing temperature, since the water becomes less ordered. The crystal structures of the met repressor show no static conformational changes, so no change in exposed hydrophobic surface area, consistent with the lack of temperature dependence of the measured thermodynamics. Our calculated dynamic enthalpy and entropy terms are independent of temperature, so are candidate explanations for the observed calorimetry.

From our calculations and using the calorimetry data we can make some predictions concerning NMR experimental observations of backbone and sidechain dynamics. We expect the backbone amplitudes across the protein structure $x_s$ to decrease by $\sim 1/\sqrt{3.7} \sim 0.5$ in the holorepressor compared to the aporepressor (see section 5.4.1). The sidechain amplitudes near the DNA binding site $x_f$, however, will reduce by $\sim \sqrt{3/5.7} \sim 0.7$. This value is obtained by comparing the holo and apo values of the effective fast mode spring constant $k_f/\left(1 + \frac{k_k x_s^2}{2k_BT}\right)$ at long times (for which $x_s^2 = k_BT/k_s$) and using the assumption $k_k \sim k_s'$.
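The two amplitude ratios quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming that the factor 3.7 is the holo-to-apo ratio of the slow spring constant, $k_s'/k_s$, that $k_k$ is set equal to $k_s'$, and that amplitudes scale as the inverse square root of the relevant spring constant; the function names are hypothetical.

```python
import math


def backbone_amplitude_ratio(stiffening):
    """Holo/apo ratio of the slow (backbone) amplitude x_s.

    Uses x_s^2 = kB*T / k_s, so the ratio is sqrt(k_s / k_s'),
    where `stiffening` is the assumed ratio k_s' / k_s.
    """
    return 1.0 / math.sqrt(stiffening)


def sidechain_amplitude_ratio(stiffening):
    """Holo/apo ratio of the fast (sidechain) amplitude x_f.

    The effective fast spring constant is k_f / (1 + k_k x_s^2 / (2 kB T)).
    At long times x_s^2 = kB*T / k_s, so x_f^2 is proportional to
    1 + k_k / (2 k_s).  With the assumption k_k ~ k_s':
      apo  (slow constant k_s ): 1 + stiffening / 2
      holo (slow constant k_s'): 1 + 1 / 2
    """
    apo = 1.0 + stiffening / 2.0
    holo = 1.0 + 0.5
    return math.sqrt(holo / apo)


if __name__ == "__main__":
    print(round(backbone_amplitude_ratio(3.7), 2))   # about 0.5
    print(round(sidechain_amplitude_ratio(3.7), 2))  # about 0.7
```

With a stiffening factor of 3.7 the second function evaluates $\sqrt{1.5/2.85}$, which is the same number as $\sqrt{3/5.7}$.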
Both these predicted reductions will be smaller if the number of enslaved modes is larger than our assumed value of N = 12. We expect non-enslaved sidechains that are distant from the corepressor and DNA binding sites to show no change in dynamics on effector binding.

Another example of a protein displaying compensating allosteric enthalpy and entropy is the core binding factor (CBF). NMR and ITC experiments on CBF by Yan et al. [2004] have shown that dynamics play a crucial role in the allosteric mechanism of this DNA-binding protein. The allosteric free energy is dominated by enthalpy. NMR $^{15}$N backbone relaxation measurements indicate this is a dynamic enthalpy, not a static enthalpy due to conformational change. On binding CBFβ to CBFα, the Runt domain of CBFα becomes more stable, leading to an increased enthalpy of binding to DNA. The ITC results give $\Delta\Delta H = -7.6$ and $\Delta\Delta S = -4.6$. The entropy opposes allostery and the enthalpy favours it. The theory presented in this chapter may be applicable to this protein and others displaying similar behaviour, as well as to the met repressor.

The distinctive characteristics of a system that we expect to use an enslaving mechanism are compensating, temperature independent, dynamic entropic and enthalpic contributions to the allosteric free energy. We expect repressor proteins, such as the met repressor, that have calorimetry showing these characteristics, to use an enslaving mechanism. For repressor proteins, such as the lac repressor, that do not show these characteristics we expect negligible enslaving effects. There may be repressor proteins that display behaviour between these two limiting cases, that is, small but not negligible enslaving effects.

Chapter 6

Conclusions

The aim of this thesis was to investigate physical mechanisms by which proteins may use intramolecular dynamics to communicate signals across large molecular distances.
Our goal was to understand quantitatively and predict such vibrational contributions to allostery in proteins, using coarse-grained models. We aimed to develop coarse-grained models that are simple enough to understand intuitively and be analysed analytically, but still capture the underlying physics, with the additional goal that they can be parameterised from atomistic detail and so made quantitative. The motivation is that such models may be generally applicable to a wide range of allosteric proteins and yet, through atomic level parameterisation, specific to the protein in question.

We conclude that ligand binding that modifies the flexibility of a protein can influence binding at a distant site. Our calculations indicate that changes to the vibrational free energy due to the lowest frequency modes can provide a significant contribution to allostery. This may be the primary mechanism responsible for the function of allosteric proteins that show no conformational change. Calculated values for the contribution to the allosteric free energy arising from intramolecular fluctuations are of the order of a few $k_BT$, which is within the experimentally observed range.

We have found possible ‘design rules’ for allosteric proteins. For example, to communicate via dynamic allostery the protein needs to be flexible but still rigid enough for low frequency motion to be coordinated across the structure. Ligand binding that stiffens or disrupts local interactions then alters these global modes of vibration, resulting in allosteric communication. The various parameters in the models we have investigated may have been optimised by evolution to give the required allosteric effects.

In chapter 3 we presented a global coarse-grained model of dynamic allostery in dimeric repressor proteins. We modelled the monomers as rigid bodies moving with respect to each other, restrained by interactions between the monomers modelled as harmonic springs.
By analytically calculating the vibrational modes of this model, using equilibrium statistical mechanics, we estimated the free energy due to the low frequency, global modes of the protein. This simple physical theory explains how local changes on ligand binding (affecting local spring constants) can be communicated over long distances via the global vibrational modes. The model predicts vibrational contributions to both positive and negative allosteric free energy, depending on whether the effector increases or decreases the stiffness of the protein. We suggested a computational method of parameterisation using atomic-level detail. This produced realistic results for the test case of the lac repressor. We used the model to make experimentally testable predictions such as the effect of point mutations in the lac repressor. We therefore demonstrated the usefulness and applicability of simple global-level coarse-grained modelling in understanding and predicting dynamic allostery in dimeric repressor proteins.

In chapter 4, we considered allosteric communication across alpha helical coiled-coil motifs. The coiled geometry of these domains required a different model from the rigid monomers model of dimeric repressor proteins considered in chapter 3. Continuing in a coarse-grained approach, we modelled the alpha helices in a coiled-coil as elastic rods with biologically realistic parameters. We calculated analytically the relative slide of the rods, their bend and twist modes and the coupling between these motions. Modelling ligand binding as a local attractive interaction between the rods that restricts their sliding motion results in a restriction of the coupled bend and twist modes, stiffening the whole structure. In an analogous way to the theory developed for the dimeric repressor proteins, such stiffening entropically favours binding to a ligand at the other end of the coiled-coil motif.
In this way, information is communicated across long molecular distances. Coiled-coil domains are found in a wide range of allosteric proteins including trans-membrane receptor proteins and motor proteins. In chapter 4 we applied our model to the molecular motor dynein as an example. Our theory provides an explanation of the allosteric nature of force generation in the dynein molecular motor. This dynamic model of allostery is able to account for the observed structural and thermodynamic data of the microtubule binding stalk of dynein. Significantly, the coiled rather than simply parallel configuration of the helices proves essential for their allosteric function. The model provides experimentally testable predictions such as the dependence of the allosteric free energy on the phase of the coil (number of turns).

We extended our work on the lowest frequency modes to consider the case of high frequency modes that are coupled to these global modes (chapter 5). This resulted in a dynamic model of compensating entropic and enthalpic terms. Applying this to the example of the met repressor reasonably explains the observed temperature independent compensating enthalpic and entropic contributions to the allosteric free energy. We showed, in principle, how large amplifications of entropic allostery can be obtained if fast localised modes are enslaved to slow global modes.

The theoretical work presented in this thesis contributes to understanding the allosteric function of many proteins. Comprehending allosteric mechanisms is of great biological interest and has the potential to enhance drug design and the design of biological single molecule logic gates in biomimetics. We have highlighted the importance of protein dynamics, as well as structure, in function. We have begun to develop a novel approach to studying protein dynamics and function, using simple coarse-grained models, combining analytical statistical mechanics with atomistic parameterisation.
This methodology has promising potential to complement and interpret existing and emerging computational and experimental techniques. In conclusion, using coarse-grained models we have shown that modifications to low frequency fluctuations within proteins provide a significant contribution to allosteric function.

6.1 Future Work

The work presented in this thesis clearly raises many questions for further study. Much future research is needed to further develop the methods presented and to unravel more about the fascinating physical mechanisms used by biology in allosteric protein function. As Einstein put it, “A theory is the more impressive the greater the simplicity of its premises, the more different kinds of things it relates and the more extended its area of applicability.”

6.1.1 Improvements and extensions to the repressor proteins work

There is scope for improving the method of parameterisation of the rigid-plate coarse-grained model for dimers of rigid monomers. Improvements could be made on one of two levels. Firstly, the effective springs in the model could be replaced by more physically accurate springs, reflecting the detailed geometry and interactions of the specific protein in question. For example, interactions between atoms within a specified distance of the ligand binding site could be summed to give the spring that is affected by effector binding. Similarly, interactions between atoms within a specified distance of the DNA binding site could be summed to give the spring that is affected by DNA binding. The remaining atoms of the protein could be divided into regions within the specified distance of points where the anchor springs would be placed. Secondly, the atomistic simulations used to parameterise the springs could be extended to include a short minimisation procedure at each incremental separation of the monomers. This would simulate integrating over the fast, internal modes of the sidechains of each monomer.
This would take into consideration the effect of the instantaneous position of the slow global mode on the fast modes. Thus, effects such as those considered in chapter 5 would be taken care of within the simple rigid monomers model of chapter 3. An alternative simulation method would be to use the results for the six lowest frequency modes of a normal mode analysis (using an elastic network model for large proteins). The normal mode results could be used directly to obtain a value for the allosteric free energy. However, if a physical explanatory model is required, combinations of these normal modes would need to be found that give the springs in the model. Since there are more springs than modes in the model outlined in chapter 3, fewer springs or extra modes would need to be used. Extra modes for such a parameterisation could be generated as combinations of the six lowest frequency modes. Note it would not be physical to include higher modes from a normal mode analysis in a coarse-grained model with rigid monomers since the higher modes reflect deformations within the monomers. Such modes could of course be used to parameterise an extension of the rigid monomers model that accounts for the lowest bending modes of the monomers.

Throughout this thesis we have plotted our results as functions of model parameters, giving indications of the parameter space the models are in. In many cases the value of the allosteric free energy is sensitive to the values of the parameters. Further testing of the robustness of our models with respect to the values of the parameters used and the positioning of effective springs could be carried out to check the range of validity of our models. For example, alternative configurations of springs in the model for the lac repressor could be parameterised to check the results obtained. Similarly, the actual parameter values could be varied to test robustness.
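The kind of parameter scan suggested above can be illustrated on a deliberately small toy system. The sketch below is not the rigid-monomer model of chapter 3: it assumes a hypothetical two-coordinate harmonic model in which an effector-site spring `k1` and a DNA-site spring `k2` are linked by a coupling spring `kc`, and it uses the classical harmonic result that the vibrational free energy is $(k_BT/2)\ln|K|$ up to a constant. All names and parameter values are illustrative.

```python
import math


def log_det(k1, k2, kc):
    """ln|K| for the toy stiffness matrix K = [[k1+kc, -kc], [-kc, k2+kc]]."""
    return math.log((k1 + kc) * (k2 + kc) - kc * kc)


def allosteric_free_energy(k1, k2, kc, d1, d2):
    """Toy allosteric free energy in units of kB*T.

    d1 is the stiffening of the effector-site spring on effector binding,
    d2 the stiffening of the DNA-site spring on DNA binding.  The result is
    [G(both bound) - G(effector only)] - [G(DNA only) - G(none bound)].
    """
    return 0.5 * (
        log_det(k1 + d1, k2 + d2, kc)
        - log_det(k1 + d1, k2, kc)
        - log_det(k1, k2 + d2, kc)
        + log_det(k1, k2, kc)
    )


def scan_coupling(k1, k2, d1, d2, kc_values):
    """Tabulate the toy allosteric free energy against the coupling spring."""
    return [(kc, allosteric_free_energy(k1, k2, kc, d1, d2)) for kc in kc_values]


if __name__ == "__main__":
    # Illustrative values only: all spring constants in the same arbitrary units.
    for kc, ddg in scan_coupling(1.0, 1.0, 1.0, 1.0, [0.0, 0.5, 1.0, 2.0, 5.0]):
        print(f"kc = {kc:4.1f}   ddG/kBT = {ddg:+.4f}")
```

In this toy the allosteric term vanishes when the coupling spring is removed and is negative when both ligands stiffen their local springs, so a scan of this type shows directly how strongly the result depends on a single parameter.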
In order to explore the potential of the coarse-grained modelling technique presented in chapter 3 it is necessary to apply it to other suitable proteins. A few examples of suitable repressor proteins for such a study are outlined in section 3.7.

In all the work on repressor proteins presented in this thesis we have not considered the dynamics of the DNA molecule itself. This is likely to be significant. We have assumed that holo and apo repressors have the same effect on DNA. However, this is unlikely to be the case. For example, NMR measurements by Yan et al. [2004] imply that dynamics and conformational changes in the DNA as well as the protein contribute to the allostery. Balaeff et al. [2004] suggest a coarse-grained model of DNA as an elastic rod. This is at the same level of coarse-graining as our models. Allowing the stiffness of the bending modes of the elastic rod to be altered by repressor binding would contribute to the calculated free energy changes of repressor binding to DNA.

It would be interesting to also calculate the allosteric free energy due to static conformational changes in those proteins that show significant changes in the mean static structure. Calculating this effect would allow direct comparison with experimental thermodynamics data that cannot distinguish between static and dynamic effects. Enthalpy changes due to static conformational changes could be calculated using MD potentials provided the protein is small enough. Each structure would need to be minimised prior to a total energy calculation of the interactions between all the atoms in the structure. The enthalpy change on effector binding would then be given by the differences in calculated energies of the holorepressor and aporepressor and the free effector. Entropy changes would not be included in this, however, since MD potentials do not model hydrophobic effects.
The static entropy change, due to the hydrophobic effect, could be estimated by calculating the change in exposed hydrophobic surface area, as outlined by Spolar and Record [1994]. Enthalpy changes due to static conformational changes could, in principle, be calculated analytically from coarse-grained models as the change in the potential minimum of the springs of the model.

At present the coarse-grained models considered take no account of the effect of solvent. Hence, the change in entropy due to the change in hydrophobic effect of static conformational changes is not accounted for. Solvent will also affect the protein dynamics by damping the motion and exciting modes due to Brownian kicks, and these effects are likewise not yet included in our coarse-grained modelling.

A connected area of study is to consider the local changes on ligand binding and how these change the effective spring constant. Ligands that bind to the effector binding site but do not affect the repressor in the same way as the natural effector provide useful comparisons. For example, neutral analogues of the positive corepressor (SAM) of the met repressor fail to increase the affinity of the met repressor for DNA. A ligand binding will introduce new local interactions (such as the electrostatic interactions introduced by SAM binding to the met repressor). This may induce local conformational change but also may affect the high frequency modes localised near the binding site. Cao et al. [2003] use normal mode analysis to study localised $10^{12}$ to $10^{13}$ Hz frequencies that are coupled to collective modes of the protein. They investigate a number of proteins including the trp repressor. For the trp repressor they find two localised modes linked to the shear motion of one of the alpha helices in the DNA binding region.

Large static conformational changes tend to be activated processes.
It would be interesting to investigate the speed of such structural rearrangements compared to the speed of the local conformational changes that modify the dynamics. It may be that dynamic allostery has a faster switching time and is favoured in biological systems needing to respond to signals rapidly. This may be an advantage of dynamic over static allostery.

Another possible advantage of dynamic over static allostery is the role dynamics play in specificity. Specificity is essential for repressor proteins binding to DNA. They need to bind strongly to the correct operator and only weakly to non-specific DNA. However, the affinity for specific DNA cannot be too high since it needs to be sufficiently modified by effector binding. Protein flexibility provides a mechanism by which repressors can obtain high specificity but low affinity. An entropically costly reduction in flexibility on specific DNA binding can result in low affinity but high specificity as the stiff structure only binds to the operator [Szwajkajzer and Carey, 1997]. Flexibility within the nonspecifically bound complex may aid the search for the target operator site. Models of changes in flexibility on ligand binding could be helpfully applied to the optimisation problem of affinity and specificity with implications for drug design.

Met repressor

A full comparison of our calculations in section 5.3 to NMR dynamics data for the met repressor is required when the data is available. A parameterisation (using a method such as that described in section 3.3.2) of the coarse-grained model outlined in section 5.4.2 would give the model an atomistic basis. This parameterised model could then be used to predict the calorimetry data discussed in section 5.4.1.

An investigation into how extensive enslaving is in allosteric proteins would be interesting. Other repressor proteins showing thermodynamics characteristic of this mechanism could be modelled.
Examples showing different levels of enslaving may be found. The met repressor may be one of a class of repressor proteins using such a mechanism.

6.1.2 Chemotaxis receptor clustering

Chemotaxis receptors are allosteric transmembrane proteins with long (tens of nm) coiled-coils spanning the membrane. The model for allostery in coiled-coils presented in chapter 4 can be applied to these transmembrane proteins.

Bacterial chemotaxis is the mechanism by which bacteria sense chemicals in their environment and respond by swimming up or down the chemical gradient. The swimming motion is achieved by means of molecular motors rotating flagella. The flagella (distributed across the bacterium surface) form a bundle when rotated anticlockwise, causing the bacterium to swim forwards in a state known as a ‘run’. When one or more flagella are rotated clockwise the flagella fly apart, leading to a ‘tumble’ in which the direction of the bacterium is changed randomly. In steady state, with no signals (no attractant or repellent molecules in the environment), the direction of rotation reverses stochastically every few seconds. In the presence of attractants (repellents) the time of a run is increased (decreased). This allows the bacteria to swim towards food or away from toxins. The signalling mechanism is very sensitive over a large range of concentrations. The bacteria can detect 0.1 to 0.2% changes in concentration over five orders of magnitude of concentration [Bray et al., 1998]. This sensitivity and range have long been a subject of much interest.

Figure 6.1: Cartoon of a bacterium showing the chemotaxis pathway [Alberts, 2002].

Attractants or repellents bind to the extracellular end of transmembrane chemotaxis receptor proteins. The signal is transmitted to the cytoplasmic end of the proteins, affecting reactions initiating the chemotaxis pathway inside the cell. Figure 6.1 shows a cartoon of this signalling pathway.
A transmembrane receptor activated by repellent binding allows the kinase protein, CheA, which is bound to it, to be self-phosphorylated by the chemical reaction ATP → ADP + P$_i$. The phosphate group is then transferred to another protein, CheY. This phosphorylated CheY is then able to dissociate from the complex and diffuse across the cell to the flagella motor. Here it causes the motor to rotate clockwise, inducing the tumble, causing the bacterium to change direction away from the repellent. Conversely, the binding of an attractant decreases the activity of the receptor, leaving the motor rotating anticlockwise and allowing smooth forward swimming towards the attractant.

A further impressive aspect of this system is that the bacterium detects changes in concentration, not the static concentration, of the attractants and repellents, allowing the bacteria to swim up or down chemical concentration gradients to find the source of food or retreat from the source of toxin. If bacteria are placed in a higher concentration they will respond with the appropriate swimming. However, if the higher level is maintained, the bacteria will adapt (within a few minutes), returning their swimming to the steady state pattern, ready to detect further changes in concentration. This adaptation is achieved by means of methylation that increases the activity of the receptor. The receptor is methylated by the protein, CheR. The binding of an attractant, which decreases the receptor activity, is compensated by methylation in the adaptation process. Methylation neutralises the negative charge on glutamate (COO$^-$ to COOCH$_3$). Liu et al. [1997] find, from electron microscopy, that higher kinase (CheA) activity for higher levels of methylation correlates with an increased stability of active complexes. They explain this by the fact that the demethylated state has negative charges that are likely to disrupt the coiled-coil structure.
Methylation removes these negative charges, stabilising the structure. Our model could neatly explain this adaptation by methylation. In our model methylation introduces a local clamp on the coiled-coil at the methylation site. This increases the resistance to sliding between the rods in the coiled-coil. In turn, this stiffens the whole structure, reducing the entropic cost of binding to CheA and neighbouring receptors. The stiffening induced by methylation compensates the increased flexibility induced by attractant binding (modelled as the removal of a clamp such as that seen in figure 6.4).

Figure 6.2: Cartoon of receptor clusters from Kim et al. [2002].

The receptors cluster at the cell pole. This suggests some functional advantage of clustering, since this increases the distance the CheY has to diffuse to the flagella motors, which are distributed across the cell. It has been suggested that this clustering of receptors amplifies the signal, explaining the high sensitivity [Kim et al., 2002; Albert et al., 2004]. Kim et al. [2002] suggest this amplification is due to the propagation of changes in the dynamics of the receptor on ligand binding. Motivated by the trimeric crystal structure of the receptor cytoplasmic domain, Kim et al. [2002] suggest the biological model of receptor clustering shown in figure 6.2. The receptors trimerise at the cytoplasmic end and trimerise with different receptors at the periplasmic end, forming the cluster array shown. This speculative biological model needs to be backed up with an explanatory physical model.

Shi and Duke [1998] present an Ising model to describe this ‘conformational spread’. Duke and Bray [1999] implement this in Monte Carlo simulations. In this Ising model the states of the receptors are modelled as ‘spins’, $S_i = 2(V_i - V_0)/\Delta V - 1$, where $V_i$ is a variable determining the state of receptor i and $\Delta V = V_1 - V_0$. They assume a two-state model where $V_i$ is either $V_0$ or $V_1$.
The Ising model is then given by

$$H = -\sum_{\langle ij\rangle} J_{ij}S_iS_j - \sum_i B_iS_i + H_1 \qquad (6.1)$$

where $J_{ij} = T_{ij}\Delta V^2/4$ is the coupling between spins; $B_i = H_i\Delta V/2$ is the effect of ligand binding to receptor i; $H_1$ is the equilibrium energy in the absence of ligand binding. Mello and Tu [2003] expand on this to include coupling between different types of receptors and different methylation states. Referring to these models, Webre et al. [2004] state “Although all of the above models provide interesting and arguably plausible mathematical descriptions of receptor signalling, none provides mechanistic details of how these interactions could actually occur.” We propose that this could be provided by an extension to the model we presented in chapter 4 for allostery in coiled-coils. Each receptor coiled-coil can be modelled using the model we presented in chapter 4. Extending this model to include the clustering of receptors into trimers would serve to predict the coupling constant $J_{ij}$ of Shi and Duke’s Ising model. However, since $J_{ij}$ needs to be close to criticality, ideally an accurate simulation is required. Equivalently, our model would predict the affinity parameters in Rao et al.’s [2004] allosteric trimer of dimers model. Predicting such coupling would provide a quantitative physical model of dynamic clustering of receptors and may explain the signal amplification in this intriguing biological system.

The individual receptors are themselves dimeric proteins made up of four long alpha helices (two from each monomer), which form a bundle through the membrane. Figure 6.3 shows a model of an individual receptor dimer made up from the crystal structures of the cytoplasmic and periplasmic domains, each of which has been crystallised separately. The transmembrane section, which has not been crystallised, has been determined by computer modelling. Two alpha helices in each monomer form an antiparallel coiled-coil. These coiled-coils coil round each other in the dimer.
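To make the role of the coupling constant in equation (6.1) concrete, the Ising description can be sketched in a few lines of code. This is a minimal illustration, not the implementation of Shi and Duke or of Duke and Bray: it assumes a uniform nearest-neighbour coupling `J` on a small periodic square lattice (at least 3 by 3), a site-dependent field `B`, and a standard single-spin-flip Metropolis update; all function names are hypothetical.

```python
import math
import random


def spin_from_state(V, V0, V1):
    """Map a two-state variable V (equal to V0 or V1) onto a spin of -1 or +1."""
    return 2.0 * (V - V0) / (V1 - V0) - 1.0


def ising_energy(S, J, B, H1=0.0):
    """Energy of equation (6.1) for a uniform coupling J on a periodic lattice.

    S and B are lists of rows; each nearest-neighbour bond is counted once
    by pairing every site with its 'down' and 'right' neighbours.
    """
    n, m = len(S), len(S[0])
    E = H1
    for i in range(n):
        for j in range(m):
            E -= J * S[i][j] * (S[(i + 1) % n][j] + S[i][(j + 1) % m])
            E -= B[i][j] * S[i][j]
    return E


def metropolis_sweep(S, J, B, kT, rng):
    """Attempt n*m single-spin flips with the Metropolis acceptance rule."""
    n, m = len(S), len(S[0])
    for _ in range(n * m):
        i, j = rng.randrange(n), rng.randrange(m)
        nn = (S[(i + 1) % n][j] + S[(i - 1) % n][j]
              + S[i][(j + 1) % m] + S[i][(j - 1) % m])
        dE = 2.0 * S[i][j] * (J * nn + B[i][j])  # energy change if S[i][j] flips
        if dE <= 0.0 or rng.random() < math.exp(-dE / kT):
            S[i][j] = -S[i][j]
    return S


if __name__ == "__main__":
    rng = random.Random(1)
    spins = [[rng.choice([-1, 1]) for _ in range(8)] for _ in range(8)]
    field = [[0.0] * 8 for _ in range(8)]
    for _ in range(200):
        metropolis_sweep(spins, 0.4, field, 1.0, rng)
    print(sum(map(sum, spins)) / 64.0)  # mean spin (receptor activity proxy)
```

In such a sketch `J` and `B` would be supplied from $J_{ij} = T_{ij}\Delta V^2/4$ and $B_i = H_i\Delta V/2$; a physical model of the receptor coupling would replace the hand-chosen value of `J` used here.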
Figure 6.3: Model of an E. coli serine receptor Tsr dimer [Kim et al., 1999].

There is experimental evidence from fluorescence and NMR studies by Seeley et al. [1996] that the cytoplasmic fragment of the aspartate receptor is globally flexible. They suggest this flexibility may be modulated in transmembrane signalling. Our model relies on such modification of protein flexibility. Figure 6.4 shows the crystal structure of the periplasmic domain of Tar (the aspartate receptor) with and without the attractant aspartate (Asp) bound [Falke et al., 1997]. Without Asp, helix α4 is immobilised by helix α1 by a ‘latch’. When Asp binds, helix α4 is forced away from helix α1, unbolting the ‘latch’ and allowing the observed ‘piston’ motion, a 1.6 Å vertical shift of helix α4 [Ottemann et al., 1999]. Such a local conformational change supports our model of attractant binding unclamping the coiled-coil, allowing greater sliding, thereby increasing the flexibility of the receptor. This increased flexibility would increase the entropic cost of binding CheA and neighbouring receptors, decreasing the activity of the kinase CheA. Further indication of aspartate induced increased flexibility is a decrease of interdimer disulphide cross-linking on aspartate binding found by Homma et al. [2004].

Figure 6.4: Crystal structure of aspartate receptor (Tar) periplasmic (outside membrane) domain with and without bound aspartate showing ‘piston’ conformational change [Falke et al., 1997]. The round and square helices show the 2 subunits of the dimer. Helix α4 is shifted down by 1.6 Å with Asp bound (dark grey) compared to without (light grey).

Models of transmembrane proteins should consider the interaction with the membrane. A change in angle of the protein with respect to the membrane will expose or bury hydrophobic surface area. The dynamics of the protein may couple with the dynamics of the lipids within the membrane.

6.1.3 G-protein-coupled receptor transmembrane proteins

G-protein-coupled receptors (GPCRs) are monomers with seven transmembrane helices. (Recently it has been recognised that some classes of GPCRs can oligomerise [Pin et al., 2005].) These allosteric proteins bind to an intracellular G-protein depending on the presence of an extracellular ligand. This causes the G-protein to hydrolyse GTP. Klein-Seetharaman et al. [2004] show by NMR backbone and sidechain dynamics that the activation by light of the GPCR rhodopsin may involve dynamics. The methodology we use in this thesis may help to understand the allosteric mechanism of this important class of transmembrane proteins.

6.1.4 Allostery around a ring

Allosteric rings of protein domains are found in a wide range of biological systems. The sizes of rings vary from, for example, the four subdomains of haemoglobin to the ring of 34 FliM proteins in the bacterial flagella motor. We commented on the AAA ATPase ring in the dynein molecular motor in section 4.8. ATP binding increases the interactions between AAA domains in the (usually hexameric) ring and this accelerates the ATPase reaction [Vale, 2000]. Duke et al. [2001] perform Monte Carlo (MC) simulations of an Ising-type model of an idealised protein ring. Graham and Duke [2005] study dynamic hysteresis in an Ising model of a ring of proteins, suggesting that a coupling constant close to a critical value will allow the protein to respond quickly to signals whilst being robust to fluctuations in ligand concentration. Clearly a model that predicts this coupling value would be useful in determining if biological proteins have evolved to optimise this behaviour.

It may be that allosteric communication around a ring of proteins uses alterations in the dynamics instead of, or as well as, static conformational changes. For example, verotoxin is a bacterial toxin that causes diarrhoeal diseases and contains a pentameric ring structure (shown in figure 6.5). NMR dynamics studies by Yung et al.
[2003] of the verotoxin imply the system is in dynamic equilibrium between a ring and a lock-washer structure. Binding of a carbohydrate dimer to one of the five binding sites fixes the ring structure. ITC measurements by Yung et al. [2003] indicate entropic cooperativity between the carbohydrate binding sites in neighbouring verotoxin subunits in the pentamer. Such a system is in urgent need of theoretical modelling. Coarse-grained models of rings of proteins, similar to those developed in this thesis, could be used to investigate the allosteric mechanism of these intriguing systems.

Figure 6.5: Verotoxin-1 B subunit pentamer with carbohydrate ligand (red) bound (crystal structure PDB ID: 1QNU [Kitov et al., 2000]).

6.2 Summary

This thesis presents our attempt at applying a physicist’s coarse-grained approach to an important area in biology (allostery in proteins). Embarking on such an endeavour has involved fascinating physics and, with the use of atomistic parameterisation, led towards experimentally testable biological predictions.

Bibliography

Albert, R., Chiu, Y.-W., and Othmer, H. G. (2004). Dynamic receptor team formation can explain the high signal transduction gain in Escherichia coli. Biophysical Journal, 86(5):2650–2659. url: http://www.biophysj.org/cgi/content/full/86/5/2650.
Alberts, B. (2002). Molecular Biology of the Cell. Garland Science, 4th edition.
Anderson, P. W. (1958). Absence of diffusion in certain random lattices. Physical Review, 109:1492–1505. doi:10.1103/PhysRev.109.1492.
Andricioaei, I. and Karplus, M. (2001). On the calculation of entropy from covariance matrices of the atomic fluctuations. The Journal of Chemical Physics, 115(14):6289–6292. doi:10.1063/1.1401821.
Ansari, A., Berendzen, J., Bowne, S., Frauenfelder, H., Iben, I., Sauke, T., Shyamsunder, E., and Young, R. (1985). Protein states and proteinquakes. PNAS, 82(15):5000–5004. url: http://www.pnas.org/cgi/reprint/82/15/5000.
Atilgan, A., Durell, S., Jernigan, R., Demirel, M., Keskin, O., and Bahar, I. (2001). Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophysical Journal, 80(1):505–515. url: http://www.biophysj.org/cgi/content/full/80/1/505.
Bahar, I., Atilgan, A., and Erman, B. (1997). Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Folding and Design, 2(3):173–181. doi:10.1016/S1359-0278(97)00024-2.
Bahar, I., Atilgan, A. R., Demirel, M. C., and Erman, B. (1998). Vibrational dynamics of folded proteins: Significance of slow and fast motions in relation to function and stability. Physical Review Letters, 80:2733–2736. doi:10.1103/PhysRevLett.80.2733.
Balaeff, A., Koudella, C., Mahadevan, L., and Schulten, K. (2004). Modelling DNA loops using continuum and statistical mechanics. Philosophical Transactions: Mathematical, Physical and Engineering Sciences, 362(1820):1355–1371. doi:10.1098/rsta.2004.1384.
Barkley, M. D. and Bourgeois, S. (1980). The Operon, chapter 7. Cold Spring Harbour.
Bell, C. E. and Lewis, M. (2000). A closer view of the conformation of the lac repressor bound to operator. Nature Structural Biology, 7(3):209–214. PDB ID: 1EFA. doi:10.1038/73317.
Bell, C. E. and Lewis, M. (2001). The lac repressor: a second generation of structural and functional studies. Current Opinion in Structural Biology, 11(1):19–25. doi:10.1016/S0959-440X(00)00180-9.
Bellissent-Funel, M.-C., Daniel, R., Durand, D., Ferrand, M., Finney, J. L., Pouget, S., Reat, V., and Smith, J. C. (1998). Nanosecond protein dynamics: First detection of a neutron incoherent spin-echo signal. Journal of the American Chemical Society, 120(29):7347–7348. doi:10.1021/ja981329b.
Berman, H., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T., Weissig, H., Shindyalov, I., and Bourne, P. (2000). The protein data bank. Nucleic Acids Research, 28(1):235–242.
url: http://nar.oxfordjournals.org/cgi/content/abstract/28/1/235.
Biroli, G. and Monasson, R. (1999). A single defect approximation for localized states on random lattices. Journal of Physics A: Mathematical and General, 32(24):L255–L261. doi:10.1088/0305-4470/32/24/101.
Boal, D. H. (2002). Mechanics of the cell. Cambridge University Press, 1st edition.
Bowman, M. J., Byrne, S., and Chmielewski, J. (2005). Switching between allosteric and dimerization inhibition of HIV-1 protease. Chemistry & Biology, 12(4):439–444. doi:10.1016/j.chembiol.2005.02.004.
Braxton, B. L., Mullins, L. S., Raushel, F. M., and Reinhart, G. D. (1996). Allosteric effects of carbamoyl phosphate synthetase from Escherichia coli are entropy-driven. Biochemistry, 35:11918–11924. doi:10.1021/bi961305m.
Bray, D., Levin, M. D., and Morton-Firth, C. J. (1998). Receptor clustering as a cellular mechanism to control sensitivity. Nature, 393:85–88. doi:10.1038/30018.
Brooks III, C. L., Karplus, M., and Pettitt, B. M. (1988). Proteins: A theoretical perspective of dynamics, structure, and thermodynamics, volume LXXI of Advances in Chemical Physics. John Wiley & Sons, Inc.
Brown, P. H. and Beckett, D. (2005). Use of binding enthalpy to drive an allosteric transition. Biochemistry, 44(8):3112–3121. doi:10.1021/bi047792k.
Bruinsma, R. F. (2002). Physics of protein-DNA interaction. Physica A, 313:211–237. doi:10.1016/S0378-4371(02)01038-5.
Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., and Warren, G. L. (1998). Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. Sect. D-Biol. Crystallogr., 54:905–921. url: http://cns.csb.yale.edu.
Brüschweiler, R. (1995). Collective protein dynamics and nuclear spin relaxation. The Journal of Chemical Physics, 102(8):3396–3403. doi:10.1063/1.469213.
Buck, M., Xu, W., and Rosen, M. K. (2004). A two-state allosteric model for autoinhibition rationalizes WASP signal integration and targeting. Journal of Molecular Biology, 338(2):271–285. doi:10.1016/j.jmb.2004.02.036.
Burgess, S., Walker, M., Sakakibara, H., Oiwa, K., and Knight, P. (2004a). The structure of dynein-c by negative stain electron microscopy. Journal of Structural Biology, 146(1-2):205–216. doi:10.1016/j.jsb.2003.10.005.
Burgess, S. A., Walker, M. L., Sakakibara, H., Knight, P. J., and Oiwa, K. (2003). Dynein structure and power stroke. Nature, 421(6924):715–718. doi:10.1038/nature01377.
Burgess, S. A., Walker, M. L., Thirumurugan, K., Trinick, J., and Knight, P. J. (2004b). Use of negative stain and single-particle image processing to explore dynamic properties of flexible macromolecules. Journal of Structural Biology, 147(3):247–258. doi:10.1016/j.jsb.2004.04.004.
Cao, Z. W., Chen, X., and Chen, Y. Z. (2003). Correlation between normal modes in the 20-200 cm−1 frequency range and localized torsion motions related to certain collective motions in proteins. Journal of Molecular Graphics and Modelling, 21:309–319. doi:10.1016/S1093-3263(02)00185-7.
Carloni, P., Rothlisberger, U., and Parrinello, M. (2002). The role and perspective of ab initio molecular dynamics in the study of biological systems. Accounts of Chemical Research, 35(6):455–464. doi:10.1021/ar010018u.
Case, D., Darden, T., Cheatham III, T., Simmerling, C., Wang, J., Duke, R., Luo, R., Merz, K., Wang, B., Pearlman, D., Crowley, M., Brozell, S., Tsui, V., Gohlke, H., Mongan, J., Hornak, V., Cui, G., Beroza, P., Schafmeister, C., Caldwell, J., Ross, W., and Kollman, P. (2004). AMBER 8. University of California, San Francisco.
Changeux, J.-P. and Edelstein, S. J. (2005). Allosteric mechanisms of signal transduction. Science, 308(5727):1424–1428. doi:10.1126/science.1108595.
Chatterjee, S., Zhou, Y., Roy, S., and Adhya, S. (1997).
Interaction of Gal repressor with inducer and operator: induction of gal transcription from repressor-bound DNA. PNAS, 94(7):2957–2962. url: http://www.pnas.org/cgi/content/full/94/7/2957.
Choi, B., Zocchi, G., Canale, S., Wu, Y., Chan, S., and Perry, L. J. (2005). Artificial allosteric control of maltose binding protein. Physical Review Letters, 94:038103. doi:10.1103/PhysRevLett.94.038103.
Cooper, A. (1999). Thermodynamic analysis of biomolecular interactions. Current Opinion in Chemical Biology, 3:557–563. doi:10.1016/S1367-5931(99)00008-3.
Cooper, A. and Dryden, D. T. F. (1984). Allostery without conformational change - a plausible model. The European Biophysics Journal with Biophysics Letters, 11:103–109.
Cooper, A., McAlpine, A., and Stockley, P. G. (1994). Calorimetric studies of the energetics of protein-DNA interactions in the Escherichia-coli methionine repressor (MetJ) system. FEBS Letters, 348:41–45. doi:10.1016/0014-5793(94)00579-6.
Cornish-Bowden, A. (2002). Enthalpy-entropy compensation: a phantom phenomenon. Journal of Biosciences, 27(2):121–126. url: http://www.ias.ac.in/jbiosci/mar2002/121.
Daniel, R., Dunn, R., Finney, J., and Smith, J. (2003). The role of dynamics in enzyme activity. Annual Review of Biophysics and Biomolecular Structure, 32:69–92. doi:10.1146/annurev.biophys.32.110601.142445.
Daniel, R., Finney, J., Réat, V., Dunn, R., Ferrand, M., and Smith, J. (1999). Enzyme dynamics and activity: time-scale dependence of dynamical transitions in glutamate dehydrogenase solution. Biophysical Journal, 77(4):2184–90. url: http://www.biophysj.org/cgi/content/full/77/4/2184.
Daniel, R., Smith, J., Ferrand, M., Héry, S., Dunn, R., and Finney, J. (1998). Enzyme activity below the dynamical transition at 220 K. Biophysical Journal, 75(5):2504–2507. url: http://www.biophysj.org/cgi/content/abstract/75/5/2504.
Donner, J., Caruthers, M., and Gill, S. (1982). A calorimetric investigation of the interaction of the lac repressor with inducer.
Journal of Biological Chemistry, 257:14826–14829. url: http://www.jbc.org/cgi/content/abstract/257/24/14826.
Doruker, P., Jernigan, R. L., and Bahar, I. (2002). Dynamics of large proteins through hierarchical levels of coarse-grained structures. Journal of Computational Chemistry, 23:119–127. doi:10.1002/jcc.1160.
Doster, W., Cusack, S., and Petry, W. (1989). Dynamical transition of myoglobin revealed by inelastic neutron scattering. Nature, 337(6209):754–756. doi:10.1038/337754a0.
Doyle, M. L. (1997). Characterization of binding interactions by isothermal titration calorimetry. Current Opinion in Biotechnology, 8:31–35. doi:10.1016/S0958-1669(97)80154-1.
Duke, T. A. J. and Bray, D. (1999). Heightened sensitivity of a lattice of membrane receptors. PNAS, 96:10104–10108. url: http://0-www.pnas.org.wam.leeds.ac.uk/cgi/reprint/96/18/10104.
Duke, T. A. J., Le Novere, N., and Bray, D. (2001). Conformational spread in a ring of proteins: A stochastic approach to allostery. Journal of Molecular Biology, 308:541–553. doi:10.1006/jmbi.2001.4610.
Dunker, A. K., Brown, C. J., Lawson, J. D., Iakoucheva, L. M., and Obradović, Z. (2002). Intrinsic disorder and protein function. Biochemistry, 41(21):6573–6582. doi:10.1021/bi012159+.
Einstein, A. (1879-1955). The new Quotable Einstein, 2005. Princeton University Press. Enlarged commemorative edition published in the 100th anniversary of the special theory of relativity.
Falcon, C. and Matthews, K. (2001). Engineered disulfide linking the hinge regions within lactose repressor dimer increases operator affinity, decreases sequence selectivity, and alters allostery. Biochemistry, 40(51):15650–15659. doi:10.1021/bi0114067.
Falke, J. J., Bass, R. B., Butler, S. L., Chervitz, S. A., and Danielson, M. A. (1997). The two-component signaling pathway of bacterial chemotaxis: A molecular view of signal transduction by receptors, kinases, and adaptation enzymes. Annual Review of Cell and Developmental Biology, 13:457–512.
doi:10.1146/annurev.cellbio.13.1.457.
Ferrand, M., Dianoux, A., Petry, W., and Zacca, G. (1993). Thermal motions and function of bacteriorhodopsin in purple membranes: effects of temperature and hydration studied by neutron scattering. PNAS, 90(20):9668–9672. url: http://www.pnas.org/cgi/content/abstract/90/20/9668.
Ferrari, S., Costi, P. M., and Wade, R. C. (2003). Inhibitor specificity via protein dynamics: Insights from the design of antibacterial agents targeted against thymidylate synthase. Chemistry & Biology, 10:1183–1193. doi:10.1016/j.chembiol.2003.11.012.
Fischer, E. (1894). Einfluss der Configuration auf die Wirkung der Enzyme. Ber. Dt. Chem. Ges., 27:2985–2993.
Fisicaro, E., Compari, C., and Braibanti, A. (2004). Entropy/enthalpy compensation: hydrophobic effect, micelles and protein complexes. Phys. Chem. Chem. Phys., 6(16):4156–4166. doi:10.1039/b404327h.
Flory, P. J. (1976). Statistical thermodynamics of random networks. Proceedings of the Royal Society of London. Series A, 351(1666):351–378. url: http://www.jstor.org/view/00804630/ap000147/00a00070/0.
Förster, T. (1948). Intermolecular energy migration and fluorescence. Annalen der Physik, 2:55–75.
Frauenfelder, H., McMahon, B., Austin, R., Chu, K., and Groves, J. (2001). The role of structure, energy landscape, dynamics, and allostery in the enzymatic function of myoglobin. PNAS, 98(5):2370–2374. doi:10.1073/pnas.041614298.
Freire, E. (1999). The propagation of binding interactions to remote sites in proteins: analysis of the binding of the monoclonal antibody D1.3 to lysozyme. PNAS, 96(18):10118–10122. url: http://www.pnas.org/cgi/content/full/96/18/10118.
Frey, E. and Kroy, K. (2005). Brownian motion: a paradigm of soft matter and biological physics. Annalen der Physik - Berlin, 14:20–50. doi:10.1002/andp.200410132.
Friedman, A. M., Fischmann, T. O., and Steitz, T. A. (1995). Crystal-structure of lac repressor core tetramer and its implications for DNA looping.
Science, 268:1721–1727. PDB ID: 1TLF.
Fujisaki, H. and Straub, J. E. (2005). Vibrational energy relaxation in proteins. PNAS, 102(19):6726–6731. doi:10.1073/pnas.0409083102.
Gee, M., Heuser, J., and Vallee, R. (1997). An extended microtubule-binding structure within the dynein motor domain. Nature, 390(6660):636–639. doi:10.1038/37663.
Gee, M. and Vallee, R. (1998). The role of the dynein stalk in cytoplasmic and flagellar motility. European Biophysics Journal with Biophysics Letters, 27:466–473. doi:10.1007/s002490050157.
Gerstein, M. and Krebs, W. (1998). A database of macromolecular motions. Nucleic Acids Research, 26(18):4280–4290. url: http://nar.oxfordjournals.org/cgi/content/abstract/26/18/4280.
Gibbons, I., Garbarino, J. E., Tan, C. E., Reck-Peterson, S. L., Vale, R. D., and Carter, A. P. (2005). The affinity of the dynein microtubule-binding domain is modulated by the conformation of its coiled-coil stalk. Journal of Biological Chemistry, 280(25):23960–23965. doi:10.1074/jbc.M501636200.
Graham, I. and Duke, T. A. J. (2005). Dynamic hysteresis in a one-dimensional Ising model: application to allosteric proteins. Physical Review E, 71(6):061923. doi:10.1103/PhysRevE.71.061923.
Gunasekaran, K., Ma, B., and Nussinov, R. (2004). Is allostery an intrinsic property of all dynamic proteins? Proteins, 57(3):433–443. doi:10.1002/prot.20232.
Harris, S., Gavathiotis, E., Searle, M., Orozco, M., and Laughton, C. (2001). Cooperativity in drug-DNA recognition: a molecular dynamics study. Journal of the American Chemical Society, 123(50):12658–12663. doi:10.1021/ja016233n.
Hawkins, R. J. and McLeish, T. C. B. (2004). Coarse-grained model of entropic allostery. Physical Review Letters, 93:098104. doi:10.1103/PhysRevLett.93.098104.
Hawkins, R. J. and McLeish, T. C. B. (2005). Dynamic allostery of protein alpha helical coiled-coils. Journal of The Royal Society Interface, FirstCite Early Online Publishing:68. doi:10.1098/rsif.2005.0068.
He, Y.
Y., Garvie, C. W., Elworthy, S., Manfield, I. W., McNally, T., Lawrenson, I. D., Phillips, S. E. V., and Stockley, P. G. (2002). Structural and functional studies of an intermediate on the pathway to operator binding by Escherichia coli MetJ. Journal of Molecular Biology, 320:39–53. PDB ID: 1MJM. doi:10.1016/S0022-2836(02)00423-0.
Hinsen, K. (1998). Analysis of domain motions by approximate normal mode calculations. Proteins, 33(3):417–429. doi:10.1002/(SICI)1097-0134(19981115)33:3<417::AID-PROT10>3.0.CO;2-8.
Hogg, R. C., Buisson, B., and Bertrand, D. (2005). Allosteric modulation of ligand-gated ion channels. Biochemical Pharmacology, 70:1267–1276. doi:10.1016/j.bcp.2005.06.010.
Holzbaur, E. and Johnson, K. (1989a). ADP release is rate limiting in steady-state turnover by the dynein adenosinetriphosphatase. Biochemistry, 28(13):5577–5585. url: http://pubs.acs.org/cgi-bin/sample.cgi/bichaw/1989/28/i13/pdf/bi00439a036.pdf.
Holzbaur, E. and Johnson, K. (1989b). Microtubules accelerate ADP release by dynein. Biochemistry, 28(17):7010–7016. url: http://pubs.acs.org/cgi-bin/sample.cgi/bichaw/1989/28/i17/pdf/bi00443a034.pdf.
Homma, M., Shiomi, D., Homma, M., and Kawagishi, I. (2004). Attractant binding alters arrangement of chemoreceptor dimers within its cluster at a cell pole. PNAS, 101:3462–3467. doi:10.1073/pnas.0306660101.
Horton, N., Lewis, M., and Lu, P. (1997). Escherichia coli lac repressor-lac operator interaction and the influence of allosteric effectors. Journal of Molecular Biology, 265:1–7. doi:10.1006/jmbi.1996.0706.
Jacob, F. and Monod, J. (1961). Genetic regulatory mechanisms in the synthesis of proteins. Journal of Molecular Biology, 3:318–356.
Jacobs and Thorpe (1995). Generic rigidity percolation: The pebble game. Physical Review Letters, 75(22):4051–4054. doi:10.1103/PhysRevLett.75.4051.
Jacobs, D., Rader, A., Kuhn, L., and Thorpe, M. (2001). Protein flexibility predictions using graph theory. Proteins, 44(2):150–165.
doi:10.1002/prot.1081.
Jacobs, D. J., Dallakyan, S., Wood, G., and Heckathorne, A. (2003). Network rigidity at finite temperature: relationships between thermodynamic stability, the nonadditivity of entropy, and cooperativity in molecular systems. Physical Review E, 68(6):061109. doi:10.1103/PhysRevE.68.061109.
Jardetzky, O. and Lefevre, J. F. (1994). Protein dynamics. FEBS Letters, 338:246–250. doi:10.1016/0014-5793(94)80277-7.
Jimenez, R., Salazar, G., Yin, J., Joo, T., and Romesberg, F. E. (2004). Protein dynamics and the immunological evolution of molecular recognition. PNAS, 101:3803–3808. doi:10.1073/pnas.0305745101.
Jin, L., Yang, J., and Carey, J. (1993). Thermodynamics of ligand binding to trp repressor. Biochemistry, 32:7302–7309. url: http://pubs.acs.org/cgi-bin/searchRedirect.cgi/bichaw/1993/32/i28/pdf/bi00079a029.pdf.
Johnson, K. A. (1985). Pathway of the microtubule-dynein ATPase and the structure of dynein: a comparison with actomyosin. Annual review of biophysics and biophysical chemistry, 14:161–188.
Jülicher, F. (1994). Supercoiling transitions of closed DNA. Physical Review E, 49(3):2429–2435. doi:10.1103/PhysRevE.49.2429.
Jusuf, S., Loll, P. J., and Axelsen, P. H. (2003). Configurational entropy and cooperativity between ligand binding and dimerization in glycopeptide antibiotics. Journal of the American Chemical Society, 125(13):3988–3994. doi:10.1021/ja027780r.
Kalodimos, C. G., Biris, N., Bonvin, A. M. J. J., Levandoski, M. M., Guennuegues, M., Boelens, R., and Kaptein, R. (2004a). Structure and flexibility adaptation in nonspecific and specific protein-DNA complexes. Science, 305:386–389. doi:10.1126/science.1097064.
Kalodimos, C. G., Boelens, R., and Kaptein, R. (2004b). Toward an integrated model of protein-DNA recognition as inferred from NMR studies on the lac repressor system. Chemical Reviews, 104:3567–3586. doi:10.1021/cr0304065.
Kaptein, R., Slijper, M., and Boelens, R. (1995).
Structure and dynamics of the lac repressor-operator complex as determined by NMR. Toxicology Letters, 82:591–599.
Kauzmann, W. (1959). Some factors in the interpretation of protein denaturation. Advances in Protein Chemistry, 14:1–63.
Kendrew, J., Bodo, G., Dintzis, H., Parrish, R., Wyckoff, H., and Phillips, D. (1958). A three-dimensional model of the myoglobin molecule obtained by x-ray analysis. Nature, 181(4610):662–666.
Kern, D. and Zuiderweg, E. R. P. (2003). The role of dynamics in allosteric regulation. Current Opinion in Structural Biology, 13:748–757. doi:10.1016/j.sbi.2003.10.008.
Kim, K. K., Yokota, H., and Kim, S. H. (1999). Four-helical-bundle structure of the cytoplasmic domain of a serine chemotaxis receptor. Nature, 400:787–792. doi:10.1038/23512.
Kim, S. H., Wang, W. R., and Kim, K. K. (2002). Dynamic and clustering model of bacterial chemotaxis receptors: Structural basis for signaling and high sensitivity. PNAS, 99(18):11611–11615. doi:10.1073/pnas.132376499.
Kisker, C., Hinrichs, W., Tovar, K., Hillen, W., and Saenger, W. (1995). The complex formed between Tet repressor and tetracycline-Mg2+ reveals mechanism of antibiotic resistance. Journal of Molecular Biology, 247(2):260–280. PDB ID: 2TCT. doi:10.1006/jmbi.1994.0138.
Kitov, P. I., Sadowska, J. M., Mulvey, G., Armstrong, G. D., Ling, H., Pannu, N. S., Read, R. J., and Bundle, D. R. (2000). Shiga-like toxins are neutralized by tailored multivalent carbohydrate ligands. Nature, 403:669–672. PDB ID: 1QNU. doi:10.1038/35001095.
Klein-Seetharaman, J., Yanamala, N. V. K., Javeed, F., Reeves, P. J., Getmanova, E. V., Loewen, M. C., Schwalbe, H., and Khorana, H. G. (2004). Differential dynamics in the G protein-coupled receptor rhodopsin revealed by solution NMR. PNAS, 101(10):3409–3413. doi:10.1073/pnas.0308713101.
Klostermeier, D. and Millar, D. (2001). Time-resolved fluorescence resonance energy transfer: a versatile tool for the analysis of nucleic acids. Biopolymers, 61(3):159–179.
doi:10.1002/bip.10146.
Knowles, T. J., Homans, S. W., and Stockley, P. G. (2005). Structure and dynamics of met aporepressor and holorepressor. Private communication.
Kon, T., Nishiura, M., Ohkura, R., Toyoshima, Y. Y., and Sutoh, K. (2004). Distinct functions of nucleotide-binding/hydrolysis sites in the four AAA modules of cytoplasmic dynein. Biochemistry, 43(35):11266–11274. doi:10.1021/bi048985a.
Koonce, M. and Tikhonenko, I. (2000). Functional elements within the dynein microtubule-binding domain. Molecular Biology of the Cell, 11(2):523–529. url: http://www.molbiolcell.org/cgi/content/full/11/2/523.
Koshland, D., Nemethy, G., and Filmer, D. (1966). Comparison of experimental binding data and theoretical models in proteins containing subunits. Biochemistry, 5:365–385.
Koshland, D. E. (1958). Application of a theory of enzyme specificity to protein synthesis. PNAS, 44:98–104. url: http://www.pnas.org/cgi/reprint/44/2/98.
Krebs, W., Alexandrov, V., Wilson, C., Echols, N., Yu, H., and Gerstein, M. (2002). Normal mode analysis of macromolecular motions in a database framework: developing mode concentration as a useful classifying statistic. Proteins, 48:682–695. doi:10.1002/prot.10168.
Krebs, W. and Gerstein, M. (2000). The morph server: a standardized system for analyzing and visualizing macromolecular motions in a database framework. Nucleic Acids Research, 28(8):1665–1675. url: http://nar.oxfordjournals.org/cgi/reprint/28/8/1665.
Kullback, S. and Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22:79–86. url: http://www.jstor.org/view/00034851/di983635/98p0574d/0.
Kurzyński, M. (1998). A synthetic picture of intramolecular dynamics of proteins. Towards a contemporary statistical theory of biochemical processes. Progress in Biophysics & Molecular Biology, 69(1):23–82. doi:10.1016/S0079-6107(97)00033-3.
Laatikainen, R. and Tuppurainen, K. (1988).
Torsional entropy as co-operator - a device for synthetic allostery. Tetrahedron Letters, 29:5021–5024. doi:10.1016/S0040-4039(00)80669-3.
Laman, G. (1970). Graphs and rigidity of plane skeletal structure. Journal of Engineering Mathematics, 4(4):331.
Lawson, C. L. and Carey, J. (1993). Tandem binding in crystals of a trp repressor/operator half-site complex. Nature, 366(6451):178–182. PDB ID: 1TRR.
Leach, A. R. (2001). Molecular modelling principles and applications. Pearson Education Limited, 2nd edition.
Leavitt, S. and Freire, E. (2001). Direct measurement of protein binding energetics by isothermal titration calorimetry. Current Opinion in Structural Biology, 11(5):560–566. doi:10.1016/S0959-440X(00)00248-7.
Lee, A., Kinnear, S., and Wand, A. (2000). Redistribution and loss of side chain entropy upon formation of a calmodulin-peptide complex. Nature Structural & Molecular Biology, 7(1):72–77. doi:10.1038/71280.
Lehninger, A. L., Nelson, D. L., and Cox, M. M. (2000). Lehninger Principles of Biochemistry. Worth, 3rd edition. url: www.worthpublishers.com/lehninger3D.
Levandoski, M. M., Tsodikov, O. V., Frank, D. E., Melcher, S. E., Saecker, R. M., and Record Jr, M. T. (1996). Cooperative and anticooperative effects in binding of the first and second plasmid osym operators to a lac1 tetramer: Evidence for contributions of non-operator DNA binding by wrapping and looping. Journal of Molecular Biology, 260(5):697–717. doi:10.1006/jmbi.1996.0431.
Levitt, M. (1983). Molecular dynamics of native protein. I. Computer simulation of trajectories. Journal of Molecular Biology, 168(3):595–617.
Lewis, M., Chang, G., Horton, N. C., Kercher, M. A., Pace, H. C., Schumacher, M. A., Brennan, R. G., and Lu, P. (1996). Crystal structure of the lactose operon repressor and its complexes with DNA and inducer. Science, 271(5253):1247–1254. PDB ID: 1LBI. url: http://www.sciencemag.org/cgi/content/abstract/271/5253/1247.
Lindemann, C. B. and Hunt, A. J. (2003).
Does axonemal dynein push, pull, or oscillate? Cell Motility and the Cytoskeleton, 56:237–244. doi:10.1002/cm.10148.
Lindorff-Larsen, K., Best, R. B., Depristo, M. A., Dobson, C. M., and Vendruscolo, M. (2005). Simultaneous determination of protein structure and dynamics. Nature, 433(7022):128–132. doi:10.1038/nature03199.
Lipari, G. and Szabo, A. (1982). Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. Journal of the American Chemical Society, 104:4546–4559. url: http://pubs.acs.org/cgi-bin/sample.cgi/jacsat/1982/104/i17/pdf/ja00381a009.pdf.
Liu, L., Wales, M., and Wild, J. (1998). Temperature effects on the allosteric responses of native and chimeric aspartate transcarbamoylases. Journal of Molecular Biology, 282(4):891–901. doi:10.1006/jmbi.1998.2054.
Liu, X., Huang, Y., Zhang, W., Fan, G., Fan, C., and Li, G. (2005). Electrochemical investigation of redox thermodynamics of immobilized myoglobin: ionic and ligation effects. Langmuir, 21(1):375–378. doi:10.1021/la047928f.
Liu, Y., Levit, M., Lurz, R., Surette, M., and Stock, J. (1997). Receptor-mediated protein kinase activation and the mechanism of transmembrane signaling in bacterial chemotaxis. The EMBO Journal, 16(24):7231–7240. doi:10.1093/emboj/16.24.7231.
Livesay, D., Dallakyan, S., Wood, G., and Jacobs, D. (2004). A flexible approach for understanding protein stability. FEBS Letters, 576(3):468–476. doi:10.1016/j.febslet.2004.09.057.
Louis, A. A. (2002). Beware of density dependent pair potentials. Journal of Physics: Condensed Matter, 14(40):9187–9206. doi:10.1088/0953-8984/14/40/311.
Ma, B., Shatsky, M., Wolfson, H. J., and Nussinov, R. (2002). Multiple diverse ligands binding at a single protein site: a matter of pre-existing populations. Protein Science, 11(2):184–197. url: http://www.proteinscience.org/cgi/content/full/11/2/184.
Ma, J. and Karplus, M. (1998). The allosteric mechanism of the chaperonin GroEL: a dynamic analysis.
PNAS, 95(15):8502–8507. url: http://www.pnas.org/cgi/reprint/95/15/8502.
Maggs, A. C. (2001). Writhing geometry at finite temperature: Random walks and geometric phases for stiff polymers. The Journal of Chemical Physics, 114(13):5888–5896. doi:10.1063/1.1353545.
Mäler, L., Blankenship, J., Rance, M., and Chazin, W. (2000). Site-site communication in the EF-hand Ca2+-binding protein calbindin D9k. Nature Structural & Molecular Biology, 7(3):245–250. doi:10.1038/73369.
Markelz, A., Whitmire, S., Hillebrecht, J., and Birge, R. (2002). THz time domain spectroscopy of biomolecular conformational modes. Physics in Medicine and Biology, 47(21):3797–3805. doi:10.1088/0031-9155/47/21/318.
Marko, J. F. (1998). DNA under high tension: Overstretching, undertwisting, and relaxation dynamics. Physical Review E, 57:2134–2149. doi:10.1103/PhysRevE.57.2134.
Matthews, K. S. (1996). The whole lactose repressor. Science, 271(5253):1245–1246. url: http://www.sciencemag.org/cgi/content/summary/271/5253/1245.
Mello, B. A. and Tu, Y. H. (2003). Quantitative modeling of sensitivity in bacterial chemotaxis: The role of coupling among different chemoreceptor species. PNAS, 100:8223–8228. doi:10.1073/pnas.1330839100.
Michaelis, L. and Menten, M. (1913). Die kinetik der invertinwirkung. Biochem. Z., 49:333–369.
Micheletti, C., Lattanzi, G., and Maritan, A. (2002). Elastic properties of proteins: Insight on the folding process and evolutionary selection of native structures. Journal of Molecular Biology, 321:909–921. doi:10.1016/S0022-2836(02)00710-6.
Ming, D. and Wall, M. E. (2005). Quantifying allosteric effects in proteins. Proteins, 59(4):697–707. doi:10.1002/prot.20440.
Miyashita, O., Onuchic, J. N., and Wolynes, P. G. (2003). Nonlinear elasticity, proteinquakes, and the energy landscapes of functional transitions in proteins. PNAS, 100:12570–12575. doi:10.1073/pnas.2135471100.
Mizuno, N., Toba, S., Edamatsu, M., Watai-Nishii, J., Hirokawa, N., Toyoshima, Y.
Y., and Kikkawa, M. (2004). Dynein and kinesin share an overlapping microtubule-binding site. The EMBO Journal, 23(13):2459–2467. doi:10.1038/sj.emboj.7600240.
Mocz, G. and Gibbons, I. (2001). Model for the motor component of dynein heavy chain based on homology to the AAA family of oligomeric ATPases. Structure (Camb), 9(2):93–103. doi:10.1016/S0969-2126(00)00557-8.
Monod, J., Changeux, J., and Jacob, F. (1963). Allosteric proteins and cellular control systems. Journal of Molecular Biology, 6:306–329.
Monod, J., Wyman, J., and Changeux, J.-P. (1965). On the nature of allosteric transitions: a plausible model. Journal of Molecular Biology, 12:88–118.
Moroz, J. D. and Nelson, P. (1998). Entropic elasticity of twist-storing polymers. Macromolecules, 31:6333–6347. doi:10.1021/ma971804a.
Nagadoi, A., Morikawa, S., Nakamura, H., Enari, M., Kobayashi, K., Yamamoto, H., Sampei, G., Mizobuchi, K., Schumacher, M., and Brennan, R. (1995). Structural comparison of the free and DNA-bound forms of the purine repressor DNA-binding domain. Structure, 3(11):1217–1224. doi:10.1016/S0969-2126(01)00257-X.
Oehler, S., Amouyal, M., Kolkhof, P., von Wilcken-Bergmann, B., and Müller-Hill, B. (1994). Quality and position of the three lac operators of E. coli define efficiency of repression. The EMBO Journal, 13(14):3348–3355. url: http://embojournal.npgjournals.com/cgi/content/abstract/13/14/3348.
Offer, G. and Sessions, R. (1995). Computer modelling of the alpha-helical coiled coil: packing of side-chains in the inner core. Journal of Molecular Biology, 249(5):967–987. doi:10.1006/jmbi.1995.0352.
Omoto, C. and Johnson, K. (1986). Activation of the dynein adenosinetriphosphatase by microtubules. Biochemistry, 25(2):419–427. url: http://pubs.acs.org/cgi-bin/sample.cgi/bichaw/1986/25/i02/pdf/bi00350a022.pdf.
Orth, P., Cordes, F., Schnappinger, D., Hillen, W., Saenger, W., and Hinrichs, W. (1998).
Conformational changes of the Tet repressor induced by tetracycline trapping. Journal of Molecular Biology, 279(2):439–447. PDB ID: 1A6I. doi:10.1006/jmbi.1998.1775.
Ottemann, K. M., Xiao, W. Z., Shin, Y. K., and Koshland, D. E. (1999). A piston model for transmembrane signaling of the aspartate receptor. Science, 285:1751–1754. doi:10.1126/science.285.5434.1751.
Otwinowski, L., Schevitz, R. W., Zhang, R.-G., Lawson, C. L., Joachimak, A., Marmorstein, R. Q., Luisi, B. F., and Sigler, P. (1988). Crystal structure of trp repressor/operator complex at atomic resolution. Nature, 335:321.
Pace, H. C., Kercher, M. A., Lu, P., Markiewicz, P., Miller, J. H., Chang, G., and Lewis, M. (1997). Lac repressor genetic map in real space. Trends in Biochemical Sciences, 22:334–339. doi:10.1016/S0968-0004(97)01104-3.
Palmer, A. G. (2001). NMR probes of molecular dynamics: overview and comparison with other techniques. Annu Rev Biophys Biomol Struct, 30:129–155. doi:10.1146/annurev.biophys.30.1.129.
Perutz, M. (1963). X-ray analysis of hemoglobin. Science, 140:863–869.
Phillips, K. and Phillips, S. E. V. (1994). Electrostatic activation of Escherichia coli methionine repressor. Structure, 2:309–316.
Phillips, S. E. and Stockley, P. G. (1996). Structure and function of Escherichia coli met repressor: similarities and contrasts with trp repressor. Phil. Trans. R. Soc. Lond. B, 351:527–535.
Pin, J.-P., Kniazeff, J., Liu, J., Binet, V., Goudet, C., Rondard, P., and Prézeau, L. (2005). Allosteric functioning of dimeric class C G-protein-coupled receptors. FEBS Journal, 272(12):2947–2955. doi:10.1111/j.1742-4658.2005.04728.x.
Polit, A., Błaszczyk, U., and Wasylewski, Z. (2003). Steady-state and time-resolved fluorescence studies of conformational changes induced by cyclic AMP and DNA binding to cyclic AMP receptor protein from Escherichia coli. European Journal of Biochemistry, 270(7):1413–1423. url: http://content.febsjournal.org/cgi/content/full/270/7/1413.
Porter, M. and Johnson, K. (1983a). Characterization of the ATP-sensitive binding of Tetrahymena 30 S dynein to bovine brain microtubules. Journal of Biological Chemistry, 258(10):6575–6581. url: http://www.jbc.org/cgi/content/abstract/258/10/6575.
Porter, M. and Johnson, K. (1983b). Transient state kinetic analysis of the ATP-induced dissociation of the dynein-microtubule complex. Journal of Biological Chemistry, 258(10):6582–6587. url: http://www.jbc.org/cgi/reprint/258/10/6582.
Porter, M. and Johnson, K. (1989). Dynein structure and function. Annual Review of Cell Biology, 5:119–151. doi:10.1146/annurev.cb.05.110189.001003.
Radivojac, P., Obradovic, Z., Smith, D. K., Zhu, G., Vucetic, S., Brown, C. J., Lawson, J. D., and Dunker, A. K. (2004). Protein flexibility and intrinsic disorder. Protein Science, 13:71–80. url: http://www.proteinscience.org/cgi/content/abstract/13/1/71.
Rafferty, J. B., Somers, W. S., Saint-Girons, I., and Phillips, S. E. V. (1989). Three-dimensional crystal structures of Escherichia coli met repressor with and without corepressor. Nature, 341(6244):705–710. PDB ID: 1CMC. doi:10.1038/341705a0.
Rao, C. V., Frenklach, M., and Arkin, A. P. (2004). An allosteric model for transmembrane signaling in bacterial chemotaxis. Journal of Molecular Biology, 343:291–303. doi:10.1016/j.jmb.2004.08.046.
Rattle, H. (1995). An NMR primer for life scientists. Partnership Press.
Reinhart, G. D., Hartleip, S. B., and Symcox, M. M. (1989). Role of coupling entropy in establishing the nature and magnitude of allosteric response. PNAS, 86(11):4032–4036. url: http://www.pnas.org/cgi/content/abstract/86/11/4032.
Rhodes, G. (1993). Crystallography made crystal clear. Academic Press Inc., San Diego, California.
Rhoten, D. and Parker, A. (2004). Risks and rewards of an interdisciplinary research path. Science, 306:2046. doi:10.1126/science.1103628.
Rossetto, V. and Maggs, A. C. (2003). Writhing geometry of open DNA. Journal of Chemical Physics, 118:9864–9874. doi:10.1063/1.1569905.
Rouse, P. E. (1953). A theory of the linear viscoelastic properties of dilute solutions of coiling polymers. The Journal of Chemical Physics, 21(7):1272–1280. doi:10.1063/1.1699180.
Schlitter, J. (1993). Estimation of absolute and relative entropies of macromolecules using the covariance-matrix. Chemical Physics Letters, 215(6):617–621. doi:10.1016/0009-2614(93)89366-P.
Schrödinger, E. (1944). What is Life? Cambridge University Press, 1st edition.
Schymkowitz, J. W. H., Rousseau, F., Wilkinson, H. R., Friedler, A., and Itzhaki, L. S. (2001). Observation of signal transduction in three-dimensional domain swapping. Nature Structural Biology, 8:888–892. doi:10.1038/nsb1001-888.
Seeley, S. K., Weis, R. M., and Thompson, L. K. (1996). The cytoplasmic fragment of the aspartate receptor displays globally dynamic behavior. Biochemistry, 35:5199–5206. doi:10.1021/bi9524979.
Shen, T., Tai, K., Henchman, R. H., and McCammon, J. A. (2002). Molecular dynamics of acetylcholinesterase. Accounts of Chemical Research, 35(6):332–340. doi:10.1021/ar010025i.
Shi, Y. and Duke, T. (1998). Cooperative model of bacterial sensing. Physical Review E, 58:6399–6406. doi:10.1103/PhysRevE.58.6399.
Slijper, M., Boelens, R., Davis, A. L., Konings, R. N. H., van der Marel, G. A., van Boom, J. H., and Kaptein, R. (1997). Backbone and side chain dynamics of lac repressor headpiece (1-56) and its complex with DNA. Biochemistry, 36:249–254. PDB ID: 1LQC. doi:10.1021/bi961670d.
Somers, W. S. and Phillips, S. E. V. (1992). Crystal structure of the met repressor-operator complexes at 2.8Å resolution reveals DNA recognition by β-strands. Nature, 359:387–393. PDB ID: 1CMA. doi:10.1038/359387a0.
Spolar, R. and Record, M. (1994). Coupling of local folding to site-specific binding of proteins to DNA. Science, 263(5148):777–784.
Stockley, P. G., Turnbull, B., Hawkins, R. J., and McLeish, T. C. (2005). Calorimetry of lac repressor mutants. Discussion meetings.
Streaker, E. D., Gupta, A., and Beckett, D. (2002). The biotin repressor: thermodynamic coupling of corepressor binding, protein assembly, and sequence-specific DNA binding. Biochemistry, 41(48):14263–14271. doi:10.1021/bi0203839.
Struik, D. J. (1961). Lectures on Classical Differential Geometry. Dover Publications Inc., New York, 2nd edition.
Suckow, J., Markiewicz, P., Kleina, L., Miller, J., Kisters-Woike, B., and Müller-Hill, B. (1996). Genetic studies of the Lac repressor. XV: 4000 single amino acid substitutions and analysis of the resulting phenotypes on the basis of the protein structure. Journal of Molecular Biology, 261(4):509–523. doi:10.1006/jmbi.1996.0479.
Suhre, K. and Sanejouand, Y.-H. (2004). ElNémo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement. Nucleic Acids Research, 32:W610–W614. doi:10.1093/nar/gkh368.
Swint-Kruse, L., Zhan, H., and Matthews, K. (2005). Integrated Insights from Simulation, Experiment, and Mutational Analysis Yield New Details of LacI Function. Biochemistry, 44(33):11201–11213. doi:10.1021/bi050404+.
Swint-Kruse, L., Zhan, H. L., Fairbanks, B. M., Maheshwari, A., and Matthews, K. S. (2003). Perturbation from a distance: Mutations that alter LacI function through long-range effects. Biochemistry, 42:14004–14016. doi:10.1021/bi035116x.
Szwajkajzer, D. and Carey, J. (1997). Molecular and biological constraints on ligand-binding affinity and specificity. Biopolymers, 44(2):181–198. doi:10.1002/(SICI)1097-0282(1997)44:2<181::AID-BIP5>3.0.CO;2-R.
Tama, F., Gadea, F. X., Marques, O., and Sanejouand, Y. H. (2000). Building-block approach for determining low-frequency normal modes of macromolecules. Proteins, 41:1–7. doi:10.1002/1097-0134(20001001)41:1<1::AID-PROT10>3.0.CO;2-P.
Tama, F. and Sanejouand, Y. H. (2001). Conformational change of proteins arising from normal mode calculations. Protein Engineering, 14:1–6. url: http://peds.oxfordjournals.org/cgi/content/full/14/1/1.
Thain, M. and Hickman, M. (2001). Dictionary of Biology. Penguin Books.
Thomas, A., Field, M., and Perahia, D. (1996). Analysis of the low-frequency normal modes of the R state of aspartate transcarbamylase and a comparison with the T state modes. Journal of Molecular Biology, 261(3):490–506. doi:10.1006/jmbi.1996.0478.
Tirion, M. M. (1996). Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Physical Review Letters, 77:1905–1908. doi:10.1103/PhysRevLett.77.1905.
Tournier, A. L. and Smith, J. C. (2003). Principal components of the protein dynamical transition. Physical Review Letters, 91:208106. doi:10.1103/PhysRevLett.91.208106.
Truong, K. and Ikura, M. (2001). The use of FRET imaging microscopy to detect protein-protein interactions and protein conformational changes in vivo. Current Opinion in Structural Biology, 11(5):573–578. doi:10.1016/S0959-440X(00)00249-9.
Vale, R. (2000). AAA proteins. Lords of the ring. Journal of Cell Biology, 150(1):F13–F19. doi:10.1083/jcb.150.1.F13.
Vergani, B., Kintrup, M., Hillen, W., Lami, H., Pimont, E., Bombarda, E., Alberti, P., Doglia, S., and Chabbert, M. (2000). Backbone dynamics of Tet repressor α8 ∩ α9 loop. Biochemistry, 39(10):2759–2768. doi:10.1021/bi9912591.
Villa, E., Balaeff, A., and Schulten, K. (2005). Structural dynamics of the lac repressor-DNA complex revealed by a multiscale simulation. PNAS, 102(19):6783–6788. doi:10.1073/pnas.0409387102.
Volkman, B., Lipson, D., Wemmer, D., and Kern, D. (2001). Two-state allosteric behavior in a single-domain signaling protein. Science, 291(5512):2429–2433. doi:10.1126/science.291.5512.2429.
Wand, A. (2001a). On the dynamic origins of allosteric activation. Science, 293(5534):1395. doi:10.1126/science.293.5534.1395a.
Wand, A. J. (2001b). Dynamic activation of protein function: A view emerging from NMR spectroscopy. Nature Structural Biology, 8(11):926–931. doi:10.1038/nsb1101-926.
Watson, J. and Crick, F. (1953). The structure of DNA. Cold Spring Harbor Symposia on Quantitative Biology, 18:123–131.
Weber, G. (1972). Ligand binding and internal equilibria in proteins. Biochemistry, 11(5):864–878. url: http://pubs.acs.org/cgi-bin/sample.cgi/bichaw/1972/11/i05/pdf/bi00755a028.pdf.
Webre, D. J., Wolanin, P. M., and Stock, J. B. (2004). Modulated receptor interactions in bacterial transmembrane signaling. Trends in Cell Biology, 14:478–482. doi:10.1016/j.tcb.2004.07.015.
Weiss, S. (2000). Measuring conformational dynamics of biomolecules by single molecule fluorescence spectroscopy. Nature Structural Biology, 7(9):724–729. doi:10.1038/78941.
Wells, S., Menor, S., Hespenheide, B., and Thorpe, M. F. (2005). Constrained geometric simulation of diffusive motion in proteins. Physical Biology, 2:1–10. url: http://physics.asu.edu/homepages/mfthorpe/237.pdf.
Whitby, F. and Phillips, G. (2000). Crystal structure of tropomyosin at 7 Angstroms resolution. Proteins, 38(1):49–59. doi:10.1002/(SICI)1097-0134(20000101)38:1<49::AID-PROT6>3.0.CO;2-B.
White, J. H. (1969). Self-linking and the Gauss integral in higher dimensions. American Journal of Mathematics, 91:693–728. url: http://www.jstor.org/view/00029327/di994388/99p0127x/0.
Wlodek, S. T., Clark, T. W., Scott, L. R., and McCammon, J. A. (1997). Molecular dynamics of acetylcholinesterase dimer complexed with tacrine. Journal of the American Chemical Society, 119(40):9513–9522. doi:10.1021/ja971226d.
Xu, C., Tobi, D., and Bahar, I. (2003). Allosteric changes in protein structure computed by a simple mechanical model: Hemoglobin T↔R2 transition. Journal of Molecular Biology, 333:153–168. doi:10.1016/j.jmb.2003.08.027.
Xu, H., Moraitis, M., Reedstrom, R., and Matthews, K. (1998). Kinetic and thermodynamic studies of purine repressor binding to corepressor and operator DNA. Journal of Biological Chemistry, 273(15):8958–8964. url: http://www.jbc.org/cgi/content/abstract/273/15/8958.
Yan, J., Liu, Y., Lukasik, S. M., Speck, N. A., and Bushweller, J. H. (2004). CBF β allosterically regulates the runx1 runt domain via a dynamic conformational equilibrium. Nature Structural and Molecular Biology, 11:901–906. doi:10.1038/nsmb819.
Yang, D. and Kay, L. E. (1996). Contributions to conformational entropy arising from bond vector fluctuations measured from NMR-derived order parameters: Application to protein folding. Journal of Molecular Biology, 263(2):369–382. doi:10.1006/jmbi.1996.0581.
Yang, L.-W., Liu, X., Jursa, C. J., Holliman, M., Rader, A., Karimi, H. A., and Bahar, I. (2005). iGNM: a database of protein functional motions based on Gaussian Network Model. Bioinformatics, 21(13):2978–2987. doi:10.1093/bioinformatics/bti469.
Yildirim, N. and Mackey, M. C. (2003). Feedback regulation in the lactose operon: A mathematical modeling study and comparison with experimental data. Biophysical Journal, 84:2841–2851. url: http://www.biophysj.org/cgi/content/full/84/5/2841.
Yung, A., Turnbull, W. B., Kalverda, A. P., Thompson, G. S., Homans, S. W., Kitov, P., and Bundle, D. R. (2003). Large-scale millisecond intersubunit dynamics in the B subunit homopentamer of the toxin derived from Escherichia coli O157. Journal of the American Chemical Society, 125(43):13058–13062. doi:10.1021/ja0367288.
Zaccai, G. (2000). How soft is a protein? A protein dynamics force constant measured by neutron scattering. Science, 288(5471):1604–1607. doi:10.1126/science.288.5471.1604.
Zhang, H., Zhao, D., Revington, M., Lee, W., Jia, X., Arrowsmith, C., and Jardetzky, O. (1994). The solution structures of the trp repressor-operator DNA complex. Journal of Molecular Biology, 238:592–614. PDB ID: 1RCS. doi:10.1006/jmbi.1994.1317.
Zhang, R. G., Joachimiak, A., Lawson, C. L., Schevitz, R. W., Otwinowski, Z., and Sigler, P. B. (1987). The crystal structure of trp aporepressor at 1.8Å shows how binding tryptophan enhances DNA affinity. Nature, 327:591–597. doi:10.1038/327591a0.
Zhao, D., Arrowsmith, C. H., Jia, X., and Jardetzky, O. (1993). Refined solution structures of the Escherichia coli trp holo- and aporepressor. Journal of Molecular Biology, 229:735–746. PDB ID: 1WRS and 1WRT. doi:10.1006/jmbi.1993.1076.