BioSystems 85 (2006) 126–136

Molecular flexibility in protein–DNA interactions

Stefan Günther∗, Kristian Rother, Cornelius Frömmel

Institute of Biochemistry Charité, Monbijoustrasse 2, 10117 Berlin, Germany

Received 23 July 2005; received in revised form 7 September 2005; accepted 13 December 2005

∗ Corresponding author. Tel.: +49 30 450 528 375; fax: +49 30 450 528 942. E-mail address: [email protected] (S. Günther).

0303-2647/$ – see front matter. doi:10.1016/j.biosystems.2005.12.007

Abstract

In living cells protein–DNA interactions are fundamental processes. Here, we compare the 3D structures of several DNA-binding proteins whose structures have frequently been determined with and without bound DNA. We studied the global structure (backbone traces) as well as the local structure (binding sites) by pairwise comparison of corresponding atoms. The DNA-interaction sites of uncomplexed proteins show conspicuously high local structural flexibility. Binding to DNA results in specific local conformations, which are clearly distinct from the unbound states. In none of the proteins studied can the adaptation of the binding site to DNA be described by the lock and key model; in all cases it follows the induced fit model. Conformational changes in the seven protein backbone traces take place in different ways. Two of them dock onto DNA without a significant change, while the other five proteins are characterized by a backbone conformation change caused by DNA docking. For three proteins of the latter group the DNA-complexed conformation also occurs in a few uncomplexed structures. This behavior can be described by a conformational ensemble, which is narrowed down by DNA docking until only a single DNA-complexed conformation occurs. Different docking models are discussed and each of the seven proteins is assigned to one of them.

© 2006 Elsevier Ireland Ltd. All rights reserved.

Keywords: Conformational changes; Structure/function studies; Protein nucleic acid interactions; Computational analysis of protein structure; Conformational equilibrium

Abbreviations: PDB, The Protein Data Bank; MetJ, methionine repressor; CAP, catabolite activator protein; CBF, core-binding factor; DtxR, diphtheria toxin repressor; PvuII, PvuII endonuclease

1. Introduction

At the atomic level protein–DNA binding is characterized by non-specific interactions with the DNA backbone and DNA-sequence-specific interactions with individual bases. Several aspects of these interactions have already been elucidated using the protein–DNA complexes collected in the Protein Data Bank (Berman et al., 2000). Studies have addressed questions such as: How large are the interfaces and which hydrogen bonds occur (Nadassy et al., 1999)? How do protein mutations affect the binding specificity (Luscombe and Thornton, 2002)? How do the various chemical interactions influence the protein–DNA binding (Mandel-Gutfreund and Margalit, 1998)? Which packing density do the atoms adopt in the interface (Nadassy et al., 2001)? However, there is no detailed analysis of how proteins adapt to DNA structure.

Several models have been proposed to describe molecular recognition processes of proteins. In 1894, Emil Fischer formulated the “lock and key principle” (Fischer, 1894). It implies that the binding site is inflexible, and the appropriate ligand fits it perfectly. Although it is incontestable that the function of a protein depends on its structure, the lock and key principle fails to explain several observations. Taking into account the flexibility of proteins, Koshland introduced the “induced fit model” (Koshland, 1958), which assumes the binding sites will be structurally influenced by the ligand. Later Monod et al.
(1965) postulated two or more pre-existing conformational states for allosteric proteins. This model, based on the “equilibrium hypothesis” (Tsai et al., 1999), is related to the funnel-shaped energy landscape of a protein, used to describe protein folding (Frauenfelder et al., 1991). The protein has an ensemble of local and global conformations located at the bottom of the energy funnel. Low barriers allow switching through these conformations, one of which corresponds to the ligand-complexed form, and ligand binding shifts the equilibrium to this state (Fig. 1(a)).

Fig. 1. (a) Schematic description of local structural alteration of the binding site. Three different models are illustrated: (I) “Lock and key model”: The binding site of the free protein has a shape complementary to DNA. (II) Induced conformational change: The DNA docking is associated with a structural adaptation of the binding site. (III) “Conformational Equilibrium model”: The free binding site appears in an ensemble of conformations. DNA binds to the one that is best suited for docking. (b) Schematic illustration of different docking models describing conformational change of the backbone. (I) The backbone remains unchanged during DNA docking. (II) The DNA docking is associated with an induced conformational change. (III) Conformational diversity: The free protein appears in an ensemble of conformations. The DNA binds to the one that is best suited for docking. (IV) “Dynamic shift model”: DNA binding causes a change in the probability distribution of the ensemble of native states.

An extension of the equilibrium hypothesis is the “dynamic population shift model” (Freire, 1999). It is based on the assumption that ligand binding causes a change in the probability distribution of the native global state ensemble. The stabilization of a distinct structure by ligand binding results in a conformational change throughout the protein.
This model can adequately describe proteins that exhibit allosteric behavior.

In protein–protein interactions as well as in the case of DNA-binding proteins, the different docking models for local binding site structure (Fig. 1(a)) and global backbone conformation (Fig. 1(b)) do not contradict each other; rather, particular protein–ligand interactions correspond more or less closely to one of them. Structural shifts induced by ligands, or structural differences between pre-established states of an equilibrium, vary in scale. They range from small changes of side-chain atoms to movements of the complete backbone of a particular protein. In the following we use “conformational change” to describe changes of the protein backbone, measured as rmsd of Cα atoms, and “local structural alteration” to describe the adaptation of binding site atoms to DNA, measured as rmsd of atoms located in the interface area.

Structural shifts have been analyzed for protein–protein interactions (Goh et al., 2004; Echols et al., 2003). Some interactions could be assigned to the induced fit model, others to the equilibrium and the dynamic population shift models. The data support the hypothesis that proteins exist in an ensemble of conformational states. A small fraction of the population adopts the active state even in the absence of a second protein or ligand. This hypothesis can explain a low level of activity of unphosphorylated proteins, which are normally described as completely inactive (Barak and Eisenbach, 1992).

Owing to the large number of 3D structures of free and complexed DNA-binding proteins, a similar comparative study is also possible for protein–DNA interactions. Here, we analyze conformational diversity within seven DNA-binding proteins that have frequently been crystallized in DNA-complexed and free states.
While several analyses point out that DNA binding is often coupled with conformational changes at least within the binding domain (Spolar and Record, 1994; Garvie and Wolberger, 2001), here local changes of the binding site are also considered in order to estimate its flexibility. The superposition results indicate that movements of atoms in the binding site region, as well as of the backbone trace, due to binding of DNA are detectable within all seven proteins. We were able to assign each of the seven proteins to one of the models given in Fig. 1(a) and (b).

2. Materials and methods

2.1. Dataset

First, we collected all protein–DNA complexes from the PDB with a resolution better than 3.2 Å. Second, using the cd-hit algorithm (Li et al., 2001) and a sequence similarity threshold of at least 90%, we searched for free homologues. Only complexes co-crystallized with double-stranded DNA longer than four nucleotides were used in the analysis. Small proteins with fewer than 20 amino acids were removed from the dataset. Only sets of structures including at least six DNA-bound and six DNA-free examples were considered. The resulting seven protein groups are given in Table 1; for the complete list of PDB codes see Table 3, published as supporting information (http://bioinf.charite.de/protein DNA/index.html).

2.2. Definition of backbone trace and binding site

For superposition, two different atom subsets of each protein structure were selected: the relevant backbone trace (Cα atoms) and the protein atoms directly involved in DNA interaction. The backbone trace was defined as all Cα atoms of the protein chain containing the DNA-binding site. Only Cα atoms present in all backbones of a structure group were considered. The DNA-binding region is defined as all heavy atoms of the protein chain within a range of 5 Å around any atom of the DNA. This procedure yields binding sites with different sizes within the same protein group.
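The 5 Å contact criterion defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say which tool was used for the selection, and the function name `binding_site_atoms`, the (label, coordinates) tuple layout, the inclusive cutoff and the toy coordinates are all hypothetical.

```python
import math


def binding_site_atoms(protein_atoms, dna_atoms, cutoff=5.0):
    """Return labels of protein atoms within `cutoff` Å of any DNA atom.

    Each atom is a (label, (x, y, z)) tuple. Only heavy atoms should be
    passed in; hydrogens are not filtered here (assumption).
    """
    selected = []
    for label, xyz in protein_atoms:
        # An atom belongs to the binding site if at least one DNA atom
        # lies within the cutoff distance (inclusive, by assumption).
        if any(math.dist(xyz, dna_xyz) <= cutoff for _, dna_xyz in dna_atoms):
            selected.append(label)
    return selected


# Toy coordinates (hypothetical, not taken from any PDB entry).
protein = [
    ("ARG180:NH1", (0.0, 0.0, 3.0)),
    ("THR10:OG1", (4.0, 0.0, 0.0)),
    ("LEU50:CD1", (9.0, 0.0, 0.0)),
]
dna = [("DG5:OP1", (0.0, 0.0, 0.0)), ("DC6:N4", (1.0, 1.0, 0.0))]

site = binding_site_atoms(protein, dna)
print(site)
```

Applied to each DNA-bound structure separately, a selection of this kind generally returns a different number of atoms for each structure of a group.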
The main cause of these size differences is that within each group a few structures contain an incomplete DNA molecule. The associated DNA-binding proteins show fewer atoms in direct contact with DNA. To allow pairwise comparisons, only those atoms present within the contact area of all DNA-bound structures were used for superposition. Selecting the largest common subset decreases the number of atoms, but comparing atom sets of identical size prevents statistical difficulties with 3D comparisons (Stark et al., 2003). The atomic sets for superposition of the backbone traces were selected in an analogous manner. In five of the seven protein groups the restriction to the largest common subset decreases the number of Cα atoms slightly. The resulting data set and its characteristics are summarized in Table 1.

Table 1
The seven structural groups of DNA-binding proteins with the size of the binding sites (in atoms) and the length of the backbones (in amino acid residues)

Protein                         Number of available structures   Binding site   Backbone
                                DNA-complexed   Uncomplexed
Transcription factors
  Methionine-repressor          20              8                45–82 (31)     104–104 (102)
  Catabolite-gene-activator     15              8                67–104 (30)    197–208 (193)
  Runx1 runt domain             7               13               73–89 (41)     110–125 (106)
  Diphtheria-toxin-repressor    12              16               85–100 (61)    118–219 (111)
Enzymes
  DNA-polymerase-β              82              11               82–215 (58)    241–331 (223)
  Fragment of DNA-polymerase I  7               6                175–290 (86)   525–828 (496)
  PvuII endonuclease            10              9                124–146 (62)   151–158 (148)

The numbers in parentheses indicate the size of the consensus subset common to all structures, which was used for superposition.

2.3. Determination of conformational diversity by superposition

All superpositions were performed using the ‘pair fit’ routine implemented in the molecular graphics system ‘PyMOL’ (PyMOL version 0.93, DeLano Scientific, Castro City, CA). The algorithm calculates the minimal three-dimensional deviation (rmsd) of two superimposed sets of atom coordinates. The atoms from both sets were assigned pairwise to each other such that only superposition of the same amino acid and atom type was allowed. No atoms were omitted. Within the seven protein groups all Cα sets and all binding-site atom sets were superimposed with those of every other member of the same group. In total, 5726 binding site and 5726 backbone superpositions were calculated. The measured pairwise rmsd values result in two similarity matrices per structure group: one for the similarity of the binding regions and the other for the backbone traces. The elements of the 14 resulting similarity matrices were clustered using the ‘Ward Method’ (Ward, 1963). Besides the rmsd, the maximal deviation of two superimposed sets of atoms was calculated to test the robustness of the selected method of measurement. Analogous to the rmsd values, the resulting maximal deviations were transferred to similarity matrices and are published as supporting information.

The calculated structure similarities were the basis for assignment to the above-mentioned docking models. If the structural ensemble of a protein is separated into the DNA-complexed and the uncomplexed states, that protein was assigned to the induced fit model.
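The count of 5726 superpositions stated in Section 2.3 can be reproduced from the structure counts in Table 1, assuming every structure is compared once with every other member of its group, i.e. n(n − 1)/2 pairs for a group of n structures. The Python sketch below checks this arithmetic and also shows how rmsd and maximal deviation could be computed for two atom sets that are already superimposed and paired atom by atom. It does not perform the fit itself (the paper used PyMOL's pair fit routine for that); the function names and the toy coordinates are assumptions made for illustration.

```python
import math

# (DNA-complexed, uncomplexed) structure counts per group, read from Table 1.
STRUCTURE_COUNTS = {
    "Methionine-repressor": (20, 8),
    "Catabolite-gene-activator": (15, 8),
    "Runx1 runt domain": (7, 13),
    "Diphtheria-toxin-repressor": (12, 16),
    "DNA-polymerase-beta": (82, 11),
    "Fragment of DNA-polymerase I": (7, 6),
    "PvuII endonuclease": (10, 9),
}


def pair_count(n):
    """Number of unordered pairs among n structures."""
    return n * (n - 1) // 2


# All-against-all comparisons within each group, summed over the groups.
total_pairs = sum(
    pair_count(bound + free) for bound, free in STRUCTURE_COUNTS.values()
)


def rmsd_and_max_deviation(coords_a, coords_b):
    """Return (rmsd, maximal deviation) for paired, pre-superimposed atoms.

    coords_a[i] and coords_b[i] must be the same atom in both structures;
    no fitting is done here (assumption: superposition happened earlier).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(a, b) for a, b in zip(coords_a, coords_b)]
    rmsd = math.sqrt(sum(d * d for d in distances) / len(distances))
    return rmsd, max(distances)


print(total_pairs)

# Toy example: one atom coincides, the other is displaced by 2 Å.
rmsd, max_dev = rmsd_and_max_deviation(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],
)
print(rmsd, max_dev)
```

The polymerase β group alone contributes 4278 of the pairs, so it dominates the total. The remaining docking models were assigned from the same similarity matrices.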
If little or no structural change between the complexed and uncomplexed protein structures was detected, the assigned docking model is the lock and key principle. An assignment to the equilibrium hypothesis is characterised by two or more structural clusters: one of them contains all bound protein instances and at least one additional unbound state, while all other unbound proteins are located within further clusters. A precondition for assignment to the last docking model, the dynamic population shift model, is the presence of at least two different DNA-bound conformations of the same protein.

3. Results and discussion

The two methods of measurement, rmsd and maximal deviation, resulted in similar cluster maps. As examples, the similarity matrices based on the rmsd values for two of the seven analyzed DNA-binding proteins (diphtheria toxin repressor and polymerase β) are shown in Fig. 2(a) and (b). The data for all seven analyzed proteins are given as supporting information (http://bioinf.charite.de/protein DNA/index.html).

3.1. Size of binding region and length of Cα-backbone

Constraining the binding region to the largest common subset (see Section 2) of each protein group notably reduces the number of atoms, but allows straightforward 3D comparisons and ensures that detected differences between DNA-complexed and free forms are related to DNA binding (compare Table 1).

The Cα traces of two of the seven protein groups, diphtheria toxin repressor (DtxR) and polymerase I, show large differences in length. In the case of DtxR most structures do not contain the C-terminal SH3-like domain. Within the polymerase I group some of the structures contain the whole enzyme, while others contain only the large polymerase fragment (Klenow fragment).

3.2. Conformational diversity of protein backbones

3.2.1. Diphtheria toxin repressor

Diphtheria toxin repressor (DtxR) recognizes several iron-regulated promoter/operator sequences.
Activation of DtxR occurs at high iron concentrations. Activated DtxR binds to the tox regulatory region, toxO, located upstream of the tox structural gene, preventing expression of diphtheria toxin. When iron becomes a growth-limiting nutrient, DtxR dissociates from DNA, allowing the transcription of toxin mRNA. Different forms of DtxR are available within the PDB: apo-DtxR, holo-DtxR and holo-DtxR complexed with DNA.

Fig. 2. (a) Diphtheria toxin repressor. Similarity matrices (rmsd values) of backbone and binding site structure. DNA-complexed structures are labeled green, the unbound structures are marked grey (top line). High similarity values are marked red, yellow cells indicate more discriminative conformations. (b) DNA-polymerase β.

Fig. 3. Structure of diphtheria toxin repressor. The activated state of the dimer is colored green and is complexed with DNA (PDB code: 1C0W, chains a and b). Cobalt ions (grey spheres) induce a conformation change of six amino-terminal residues (colored blue). To illustrate the conformational switch clearly, the superposition of an unactivated repressor (PDB code: 1BI2, chain b) is also shown, colored red. Without the change of conformation, the N-terminal part of the backbone would clash with a DNA phosphate group.

Activation of the N-terminal domain by metal ions results in a conformational change of the protein backbone that contains part of the DNA-binding site (White et al., 1998). Upon activation with the co-repressor, the six amino-terminal residues of DtxR undergo a helix-to-coil transition (Fig. 3). This conformational change is only partly reflected by clustering in the similarity matrix of the DtxR superposition results (Fig. 2(a)). All protein structures attached to the DNA show a similar local and global conformation and are activated by metal ions.
We found one DNA-free structure (2TDX), which shows a conformation similar to the DNA-complexed forms, indicating that this conformation is not only DNA induced. The repressor’s other structures are divided into four distinct global conformational variants. All of these are observed with the ligand-free protein, while the metal-activated forms are limited to two of them. The results indicate that metal binding limits the ensemble of ligand-free conformational variants. Such behavior is described by the equilibrium model: the conformation suitable for DNA binding is not only induced by DNA, but also appears within the conformational ensemble of the DNA-free protein.

3.2.2. Methionine repressor

In the presence of the co-repressor S-adenosylmethionine (SAM) the methionine repressor (MetJ) binds to tandem eight base pair DNA recognition sites of the met regulon. SAM binds at a distance from the DNA-binding site of the repressor (shortest distance to DNA: 11.5 Å). Bound SAM increases the repressor’s affinity for operator DNA by a factor of up to 1000. However, the protein undergoes no obvious conformational change. It is suggested that the co-repressor effect may be electrostatic (Phillips and Phillips, 1994).

The conformational diversity of the backbone (maximal rmsd = 2.67 Å) can be divided into two to five groups. Greater conformational deviations within this structure group are restricted to two loop regions (residues 13–19 and 76–84), while most parts of the backbone trace retain a similar conformation. SAM binding correlates with the position of the two flexible loops. Within most ternary complexes (MetJ:SAM:DNA) these two parts of the backbone have an equivalent position, and binding of DNA and SAM surely favors this conformation compared to the other loop forms. Nevertheless, the low backbone movement of the methionine repressor mediated by DNA binding is best described by the lock and key model.

3.2.3. Catabolite gene activator

The catabolite activator protein (CAP) activates transcription of the catabolite gene by facilitating the binding of RNA polymerase (RNAP) to the promoter. CAP possesses two binding sites for the C- and N-terminal subunits of RNAP, αCTD and αNTD (Lawson et al., 2004). A single structure of a CAP:αCTD:DNA complex is present within the dataset (PDB code: 1LB2A), but the binding of αCTD has no effect on the backbone conformation of CAP.

Two distinct backbone variants occur amongst the structures of CAP. While one of them is only observed in DNA-free structures, the other conformational group arises in both DNA-complexed and uncomplexed structures. Fig. 4 illustrates the conformational ensemble by a separation of the backbone rmsd values into two groups. All pairwise rmsds of DNA-complexed structures fall below 1.5 Å. Superposition values involving additional uncomplexed states are partitioned into low and high deviations.

Fig. 4. Structural diversity of catabolite gene activator protein (CAP). All pairwise superpositions within the structure group of CAP are divided into three groups: rmsd values between DNA-complexed and uncomplexed structures and rmsd values resulting from superpositions within both groups. The plot illustrates the correlation between structural diversity of the backbone traces (X-axis) and those of the binding sites (Y-axis).

Differences between both conformations are not limited to a few loop regions but involve the entire backbone. Each chain of the dimer consists of two domains, a small C-terminal DNA-binding domain and the larger N-terminal domain. The two domains are displaced relative to each other in the two conformational variants. CAP is activated by cyclic AMP molecules (cAMP), which bind to the larger N-terminal domain. The cAMP molecule induces an allosteric transition preceding the DNA binding process (Passner et al., 2000).
All known structures of CAP are similar to the activated, cAMP-complexed states, so the conformational variances detected are not caused by cAMP binding. The two distinct conformations observed are well explained by an ensemble that DNA binding reduces to a single conformation. Such behavior explains why DNA-complexed and uncomplexed structures with a similar conformation of CAP occur within this structural group, and CAP can be assigned to the equilibrium docking model.

3.2.4. Runt related transcription factor

The runt related transcription factor/core-binding factor (CBFα) regulates the transcription of different genes by binding to DNA in the proximal promoter region (Matsuo et al., 2003). The binding of CBFβ stabilizes a conformation suitable for the DNA docking process (Tahirov et al., 2001; Backstrom et al., 2002; Yan et al., 2004). The backbone superpositions of the transcription factor reveal three groups of conformations. Two of them are observed in DNA-free structures; the third contains the DNA-complexed protein but is also observed in one DNA-free structure (PDB code: 1E50 A). The existence of this structure demonstrates that the conformation suitable for DNA binding also exists without DNA. This structure is also complexed with CBFβ, supporting the theory that the subunit promotes DNA docking. The results again suggest a behavior described by the equilibrium model.

3.2.5. DNA-polymerase β

DNA-polymerase β (pol β) adds new complementary deoxynucleotides to a growing DNA chain. The enzyme has two domains. The nucleotidyl transfer reaction involves a large movement of the 8 kD domain and part of the 31 kD domain from a closed to an open conformation (Pelletier et al., 1994, 1996; Sawaya et al., 1997). Both types of conformation are present in the 82 enzyme–DNA complexes: 75 structures were crystallized in the open conformation and 7 represent the closed conformation. Both variants deviate from the 11 DNA-free structures (Fig. 2(b)), the closed conformation more so than the open state. The presence of two distinct DNA-complexed conformations indicates a bidirectional structural change between the two bound forms while the enzyme catalyses the synthesis of DNA. The shift of the uncomplexed conformation to the different complexed states is best described by the dynamic population shift model.

3.2.6. Fragment of DNA-polymerase I

DNA-polymerase I catalyzes the addition of deoxynucleotides to a primer RNA chain. The reaction requires a template chain which directs the enzyme to select a specific nucleotide. The catalysis is coupled with rotation of a subdomain of the polymerase or Klenow fragment (Li and Waksman, 2001). The superposition results show three distinct DNA-complexed backbone conformations reflecting different stages of nucleotide incorporation. One of them is represented by a single structure (1TAU) and is quite similar to the uncomplexed states (rmsd 1TAU ↔ 1TAQ = 0.84 Å). Analogous to pol β, DNA-polymerase I is characterised by a structural shift from the uncomplexed conformation to the DNA-complexed states. Therefore, DNA-polymerase I is also assigned to the dynamic population shift model.

3.2.7. Restriction enzyme PvuII

The PvuII endonuclease recognizes the double-stranded DNA sequence 5′-CAGCTG-3′ and hydrolyzes the phosphodiester bond in unmethylated duplex DNA between the central GC nucleotides (Gingeras et al., 1981). The restriction enzyme is a homodimer, and metal ions are essential cofactors for DNA cleavage (Pingoud and Jeltsch, 1997). The structures of crystallized PvuII endonucleases are clearly partitioned into two different clusters, one containing the DNA-bound enzyme, the other the uncomplexed structures. The presence of metal ions has no obvious effect on enzyme conformation, and DNA docking is also possible without them. The clear separation of the DNA-complexed and uncomplexed conformations indicates a structural change induced by DNA (“induced fit”).

3.3. Structural diversity of the binding sites

Protein–DNA interfaces contain an above-average proportion of polar and positively charged amino acids compared to protein–protein interfaces or protein surfaces (Jones et al., 1999). The residue composition of the seven binding sites analyzed corresponds to the interface amino acid composition specified by Jones et al. and reflects the polar character and negative charge of the DNA. The positively charged arginine residue is most frequent, followed by polar threonine, asparagine and serine residues, and the positively charged lysine residue.

Except for one structure of the polymerase I group, all DNA-bound binding sites are dissimilar to the unbound states. This is indicated by a clear separation within all structure groups shown by the cluster analysis. In DNA-bound chains the atoms in direct contact with DNA are located in a similar position. The bound structures of DtxR, MetJ, CBFα and PvuII do not exceed an rms deviation of 1.1 Å within each group. The uncomplexed binding sites differ clearly from them and are less similar amongst themselves.

The observed local changes in binding site structure are attributed to structural adaptation of side chains for optimal interaction with DNA. Often the side-chain conformation changes significantly while the Cα trace remains in the same conformation (Fig. 5). Between the binding sites of CAP with and without DNA (1CGP and 1G6N), residue Arg180 moves significantly (rmsd = 3.6 Å). Structurally similar pairs of free binding sites are rarely present within the dataset, even in proteins whose structures have been determined frequently. This observation reflects the wide range of possible local conformations while interacting with the solvent. Similarly, a DNA-adapted local conformation is hardly ever adopted without DNA interaction. An example is illustrated in Fig. 4 for the binding site structures of CAP.
In contrast to the similarity values within both the DNA-complexed and uncomplexed structure groups, no rmsd between the two groups falls below 1.2 Å.

Substantial alterations of side-chain structures located in the interaction site are also observed in protein–protein associations (Betts and Sternberg, 1999). Significant side-chain movement was found in all proteins considered by Betts and Sternberg, but greater backbone conformational changes were only observed in three of the eight complexes.

With the exception of the two polymerases, the other five proteins analyzed bind to specific DNA target sites. Sequence-specific DNA binding implies that proteins discriminate between, and thus have different affinities for, individual bases. All five sets contain variants of the DNA target sites. A complete list of the DNA target sites within each of the five protein groups is given in Table 4, published as supporting information. There is no conspicuous relationship between target-site variant and the local structure of the binding site. Even if a structural adaptation to different DNA target sites exists, its scale is small (e.g. DtxR, rmsd: 0.15–0.99 Å) compared to the large conformational change from the free to the DNA-complexed state (DtxR, rmsd: 1.17–2.29 Å).

There is one exception within the structure group of polymerase I. The cluster algorithm does not separate one complexed binding site (PDB code: 1TAU A) from the structures uninfluenced by DNA. However, this can be explained by its strong deviation from the other DNA-bound structures, which represent different activation states (see DNA-polymerase I section above). Nevertheless, adaptation of the binding site conformation to DNA can also be observed within this structure.

For the local structure of the interaction site all proteins analyzed can be assigned to the induced fit model.
Due to the large number of possible isoenergetic binding site structures (see below) we cannot exclude that a DNA-free protein adopts a local conformation similar to the DNA-bound form from time to time.

Fig. 5. Example of a local conformation change. Two structures of CAP are superimposed. One structure is colored green (PDB code: 1CGP) and is complexed with DNA (blue); the second structure is uncomplexed and is colored red (PDB code: 1G6N). One exemplary residue of the binding site (ARG 180) is shown in stick representation; the other parts of the protein are shown in ribbon representation.

Table 2
Assignment of the DNA-binding proteins to different docking models

Protein                             Docking model for the binding site   Docking model for backbone conformation
Methionine-repressor                Induced fit                          Inflexible docking
Runt-related-transcription factor   Induced fit                          Equilibrium
Catabolite-gene-activator           Induced fit                          Equilibrium
Diphtheria-toxin-repressor          Induced fit                          Equilibrium
DNA-polymerase β                    Induced fit                          Dynamic shift
Fragment of DNA-polymerase I        Induced fit                          Dynamic shift
PvuII endonuclease                  Induced fit                          Induced fit

The classification is based on the binding site and backbone similarity within each structural group.

3.4. Conformational change of DNA

In contrast to proteins, the local flexibility of double-stranded DNA is more restricted. However, helical axis bending can occur due to external factors including small molecule ligands or proteins (Dickerson, 1998; Perez-Martin and de Lorenzo, 1997; Otwinowski et al., 1988). Among the seven proteins analyzed here, only CAP interacts with obviously bent double-stranded DNA. DNA complexed to CAP forms a right angle caused by two successive kinks (Dickerson, 1998). Apart from this, only the single-stranded DNA regions within the polymerase complexes show conspicuous structural differences compared to DNA strands organized in a straight double helix.
However, the influence of protein binding on DNA conformation is often difficult to estimate because the structure of the unbound DNA is usually unknown. For the CAP operator sequence it was estimated by MD simulations that 40% of the bending in the complex is intrinsic to the DNA sequence, whereas 60% is induced by protein binding (Dixit et al., 2005). DNA bending does not result in a new conformation of CAP, since such a conformation would be seen only in DNA-complexed structures of the protein (see Section 3.2.3).

4. Conclusions

4.1. Backbone conformation of the binding domain

Diversity in protein backbone conformation was observed within each of the seven proteins, but the degree of conformational change varies and is not always related to function. Low structural variation is observed within the structure group of the methionine repressor protein. A model describing the docking of MetJ should most probably assume an inflexible backbone. The proteins catabolite gene activator, runt related transcription factor and diphtheria toxin repressor are all characterized by the fact that binding of the co-activator or co-repressor narrows down the number of alternative conformations, and thus facilitates DNA docking. This is best described by the equilibrium model. DNA-polymerase I and DNA-polymerase β exhibit additional conformations while docked with DNA. Therefore, the shift of the structural ensemble is best reflected by the dynamic population shift model. The DNA-bound and unbound structures of PvuII differ considerably. Binding of the co-activator has no effect on backbone conformation, but is necessary for cutting DNA. This behavior is described by the induced fit model.

4.2. The local structure of DNA binding sites

The local structure of the DNA-binding sites of all seven proteins is influenced by DNA. DNA binding is combined with a change of local structure to a precise DNA-complementary binding site shape.
This conformation is hardly ever observed in proteins which are not attached to DNA. The flexible side chains located at the protein surface adopt a number of alternative local conformations. Similar to the backbone docking model for the transcription factors, DNA docking is associated with a severe reduction in the protein’s conformational ensemble. Finally, it is reduced to nearly one variant that allows discrimination between different bases. Although some protein sets contain variant DNA target sites, little effect on local binding site structure was detected. Table 2 gives an overview of the assignments to the different docking models.

4.3. Structure of DNA molecules

Structural change in the double-stranded DNA molecules analyzed is mainly limited to a global bending of the axis within the complexed structures of CAP. The bending is partly induced by the protein. Local structural changes of single nucleotides are prevented by the strong association of complementary bases.

4.4. Consequences for computer-based docking

The local flexibility of binding sites constitutes a frontier problem for shape-based prediction models. The number of possible ligands increases substantially even upon small conformational variations of the binding site (Ferrari et al., 2004). For small organic compounds, Yang et al. (2004) showed that binding sites complexed with their ligands assume a low energy conformation of the free protein. This assumption could provide an opportunity to identify binding sites by geometric criteria. However, for DNA binding, the interactions are more complex, as larger interfaces are involved and the negative charges of the DNA backbone constrain the binding region on the protein side such that it is no longer equal to a low energy conformation of the free state. The side chains of the binding site adapt dynamically to the structure they bind.
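As a rough illustration of how quickly side-chain flexibility enlarges the conformational space of a binding site, the following Python sketch counts rotamer combinations. It assumes, purely for illustration, that every residue independently adopts one of a fixed number of rotamers; the function name and the example values are hypothetical and are not taken from the analysis above.

```python
import math


def rotamer_combinations(n_residues: int, rotamers_per_residue: int) -> int:
    """Number of side-chain arrangements when each residue independently
    adopts one of `rotamers_per_residue` rotamers (illustrative assumption)."""
    if n_residues < 0 or rotamers_per_residue < 1:
        raise ValueError("need n_residues >= 0 and rotamers_per_residue >= 1")
    return rotamers_per_residue ** n_residues


# Hypothetical example values: 20 binding-site residues, 5 rotamers each.
count = rotamer_combinations(20, 5)
print(count)                        # 95367431640625
print(round(math.log10(count), 1))  # 14.0
```

Under these assumptions, each additional residue multiplies the count by the number of rotamers, so the space grows exponentially with the size of the binding site.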
It follows that the conformational space to be considered for a potential binding site is enormous: even if a binding site comprises only 20 residues and as few as five rotamers per residue, the search space would comprise 5^20, or about 10^14, combinations. Thus, a more promising way of predicting protein–DNA interactions is to combine geometric criteria with additional physical parameters that narrow down the conformational space by several orders of magnitude.

References

Backstrom, S., Wolf-Watz, M., Grundstrom, C., Hard, T., Grundstrom, T., Sauer, U.H., 2002. The RUNX1 Runt domain at 1.25 Å resolution: a structural switch and specifically bound chloride ions modulate DNA binding. J. Mol. Biol. 322, 259–272.
Barak, R., Eisenbach, M., 1992. Correlation between phosphorylation of the chemotaxis protein CheY and its activity at the flagellar motor. Biochemistry 31 (6), 1821–1826.
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E., 2000. The Protein Data Bank. Nucleic Acids Res. 28, 235–242.
Betts, M.J., Sternberg, M.J., 1999. An analysis of conformational changes on protein–protein association: implications for predictive docking. Protein Eng. 12 (4), 271–283.
Dickerson, R.E., 1998. DNA bending: the prevalence of kinkiness and the virtues of normality. Nucleic Acids Res. 26 (8), 1906–1926.
Dixit, S.B., Andrews, D.Q., Beveridge, D.L., 2005. Induced fit and the entropy of structural adaptation in the complexation of CAP and lambda-repressor with cognate DNA sequences. Biophys. J. 88 (5), 3147–3157.
Echols, N., Milburn, D., Gerstein, M., 2003. MolMovDB: analysis and visualization of conformational change and structural flexibility. Nucleic Acids Res. 31, 478–482.
Ferrari, A.M., Wei, B.Q., Costantino, L., Shoichet, B.K., 2004. Soft docking and multiple receptor conformations in virtual screening. J. Med. Chem. 47, 5076–5084.
Fischer, E., 1894. Einfluss der Configuration auf die Wirkung der Enzyme. Chem. Ber. 27, 2985–2993.
Frauenfelder, H., Sligar, S.G., Wolynes, P.G., 1991. The energy landscapes and motions of proteins. Science 254, 1598–1603.
Freire, E., 1999. The propagation of binding interactions to remote sites in proteins: analysis of the binding of the monoclonal antibody D1.3 to lysozyme. Proc. Natl. Acad. Sci. USA 96, 10118–10122.
Garvie, C.W., Wolberger, C., 2001. Recognition of specific DNA sequences. Mol. Cell 8, 937–946.
Gingeras, T.R., Greenough, L., Schildkraut, I., Roberts, R.J., 1981. Two new restriction endonucleases from Proteus vulgaris. Nucleic Acids Res. 9, 4525–4536.
Goh, C.S., Milburn, D., Gerstein, M., 2004. Conformational changes associated with protein–protein interactions. Curr. Opin. Struct. Biol. 14, 104–109.
Jones, S., van Heyningen, P., Berman, H.M., Thornton, J.M., 1999. Protein–DNA interactions: a structural analysis. J. Mol. Biol. 287, 877–896.
Koshland, D., 1958. Application of a theory of enzyme specificity to protein synthesis. Proc. Natl. Acad. Sci. USA 44, 98–104.
Lawson, C., Swigon, D., Murakami, K.S., Darst, S.A., Berman, H.M., Ebright, R.H., 2004. Catabolite activator protein: DNA binding and transcription activation. Curr. Opin. Struct. Biol. 14 (1), 10–20.
Li, W., Jaroszewski, L., Godzik, A., 2001. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 17, 282–283.
Li, Y., Waksman, G., 2001. Crystal structures of a ddATP-, ddTTP-, ddCTP, and ddGTP-trapped ternary complex of Klentaq1: insights into nucleotide incorporation and selectivity. Protein Sci. 10, 1225–1233.
Luscombe, N.M., Thornton, J.M., 2002. Protein–DNA interactions: amino acid conservation and the effects of mutations on binding specificity. J. Mol. Biol. 320, 991–1009.
Mandel-Gutfreund, Y., Margalit, H., 1998. Quantitative parameters for amino acid-base interaction: implications for prediction of protein–DNA binding sites. Nucleic Acids Res. 26, 2306–2312.
Matsuo, N., Yu-Hua, W., Sumiyoshi, H., Sakata-Takatani, K., Nagato, H., Sakai, K., Sakurai, M., Yoshioka, H., 2003. The transcription factor CCAAT-binding factor CBF/NF-Y regulates the proximal promoter activity in the human alpha 1(XI) collagen gene (COL11A1). J. Biol. Chem. 278, 32763–32770.
Monod, J., Wyman, J., Changeux, J.P., 1965. On the nature of allosteric transitions: a plausible model. J. Mol. Biol. 12, 88–118.
Nadassy, K., Tomas-Oliveira, I., Alberts, I., Janin, J., Wodak, S.J., 2001. Standard atomic volumes in double-stranded DNA and packing in protein–DNA interfaces. Nucleic Acids Res. 29, 3362–3376.
Nadassy, K., Wodak, S.J., Janin, J., 1999. Structural features of protein–nucleic acid recognition sites. Biochemistry 38, 1999–2017.
Otwinowski, Z., Schevitz, R.W., Zhang, R.G., Lawson, C.L., Joachimiak, A., Marmorstein, R.Q., Luisi, B.F., Sigler, P.B., 1988. Crystal structure of trp repressor/operator complex at atomic resolution. Nature 335 (6188), 321–329.
Passner, J.M., Schultz, S.C., Steitz, T.A., 2000. Modeling the cAMP-induced allosteric transition using the crystal structure of CAP-cAMP at 2.1 Å resolution. J. Mol. Biol. 304, 847–859.
Pelletier, H., Sawaya, M.R., Kumar, A., Wilson, S.H., Kraut, J., 1994. Structures of ternary complexes of rat DNA polymerase beta, a DNA template-primer, and ddCTP. Science 264, 1891–1903.
Pelletier, H., Sawaya, M.R., Wolfle, W., Wilson, S.H., Kraut, J., 1996. Crystal structures of human DNA polymerase beta complexed with DNA: implications for catalytic mechanism, processivity, and fidelity. Biochemistry 35, 12742–12761.
Perez-Martin, J., de Lorenzo, V., 1997. Clues and consequences of DNA bending in transcription. Annu. Rev. Microbiol. 51, 593–628.
Phillips, K., Phillips, S.E., 1994. Electrostatic activation of Escherichia coli methionine repressor. Structure 2, 309–316.
Pingoud, A., Jeltsch, A., 1997. Recognition and cleavage of DNA by type-II restriction endonucleases. Eur. J. Biochem. 246, 1–2.
Sawaya, M.R., Prasad, R., Wilson, S.H., Kraut, J., Pelletier, H., 1997. Crystal structures of human DNA polymerase beta complexed with gapped and nicked DNA: evidence for an induced fit mechanism. Biochemistry 36, 11205–11215.
Spolar, R.S., Record Jr., M.T., 1994. Coupling of local folding to site-specific binding of proteins to DNA. Science 263, 777–784.
Stark, A., Sunyaev, S., Russell, R.B., 2003. A model for statistical significance of local similarities in structure. J. Mol. Biol. 326 (5), 1307–1316.
Tahirov, T.H., Inoue-Bungo, T., Morii, H., Fujikawa, A., Sasaki, M., Kimura, K., Shiina, M., Sato, K., Kumasaka, T., Yamamoto, M., Ishii, S., Ogata, K., 2001. Structural analyses of DNA recognition by the AML1/Runx-1 Runt domain and its allosteric control by CBFbeta. Cell 104, 755–767.
Tsai, C.J., Kumar, S., Ma, B., Nussinov, R., 1999. Folding funnels, binding funnels, and protein function. Protein Sci. 8, 1181–1190.
Ward, J.H., 1963. Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58, 236–244.
White, A., Ding, X., vanderSpek, J.C., Murphy, J.R., Ringe, D., 1998. Structure of the metal-ion-activated diphtheria toxin repressor/tox operator complex. Nature 394, 502–506.
Yan, J., Liu, Y., Lukasik, S.M., Speck, N.A., Bushweller, J.H., 2004. CBFbeta allosterically regulates the Runx1 Runt domain via a dynamic conformational equilibrium. Nat. Struct. Mol. Biol. 11, 901–906.
Yang, A.Y., Kallblad, P., Mancera, R.L., 2004. Molecular modelling prediction of ligand binding site flexibility. J. Comput. Aided Mol. Des. 18 (4), 235–250.