doi:10.1016/j.jmb.2009.04.011 J. Mol. Biol. (2009) 389, 619–636 Available online at www.sciencedirect.com Desolvation Barrier Effects Are a Likely Contributor to the Remarkable Diversity in the Folding Rates of Small Proteins Allison Ferguson, Zhirong Liu and Hue Sun Chan⁎ Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8 Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8 Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A7 Received 3 March 2009; received in revised form 1 April 2009; accepted 6 April 2009 Available online 9 April 2009 The variation in folding rate among single-domain natural proteins is tremendous, but common models with explicit representations of the protein chain are either demonstrably insufficient or unclear as to their capability for rationalizing the experimental diversity in folding rates. In view of the critical role of water exclusion in cooperative folding, we apply native-centric, coarse-grained chain modeling with elementary desolvation barriers to investigate solvation effects on folding rates. For a set of 13 proteins, folding rates simulated with desolvation barriers cover ∼ 4.6 orders of magnitude, spanning a range essentially identical to that observed experimentally. In contrast, folding rates simulated without desolvation barriers cover only ∼ 2.2 orders of magnitude. Following a Hammond-like trend, the folding transition-state ensemble (TSE) of a protein model with desolvation barriers generally has a higher average number of native contacts and is structurally more specific, that is, less diffused, than the TSE of the corresponding model without desolvation barriers. Folding is generally significantly slower in models with desolvation barriers because of their higher overall macroscopic folding barriers as well as slower conformational diffusion speeds in the TSE that are ≈1/50 times those in models without desolvation barriers. 
Nonetheless, the average root-mean-square deviation between the TSE and the native conformation is often similar in the two modeling approaches, a finding suggestive of a more robust structural requirement for the folding rate-limiting step. The increased folding rate diversity in models with desolvation barriers originates from the tendency of these microscopic barriers to cause more heightening of the overall macroscopic folding free-energy barriers for proteins with more nonlocal native contacts than those with fewer such contacts. Thus, the enhancement of folding cooperativity by solvation effects is seen as positively correlated with a protein's native topological complexity.
0022-2836/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
Edited by C. R. Matthews
Keywords: contact order; Gō model; transition state; Kramers' theory; conformational diffusion
*Corresponding author. E-mail address: [email protected]. Present address: Z. Liu, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China.
Abbreviations used: TSE, transition-state ensemble; RCO, relative contact order; db, desolvation barrier; PMF, potential of mean force; LRO, long-range order; cm, contact minimum; ssm, solvent-separated minimum; CI2, chymotrypsin inhibitor 2.
Introduction
Theoretical studies of protein folding kinetics afford a conceptual framework for deciphering from experimental data the physicochemical interactions underlying protein behaviors.1–11 Much progress has been attained recently by investigating small, single-domain natural globular proteins whose folding/unfolding thermodynamics and kinetics are two-state-like. For these proteins, although folding speed has not been extensively optimized by evolution,12,13 no appreciable accumulation of folding or unfolding
intermediates has been observed.14–16 This hallmark feature is in contrast with the more complex, multiphasic folding kinetics of somewhat larger proteins, which were often subjects of earlier experiments.17–20 Two-state-like folding is also clearly set apart from the noncooperative thermodynamics and kinetics exhibited by those proteins recently found to be likely global downhill folders.21,22 With the folding data for an increasing number of such single-domain proteins becoming available since the early 1990s14 (reviewed in Refs. 23–25), the very simplicity of their folding processes has led researchers to take a panoramic view of biophysically important trends across many two-state-like folders. In a seminal discovery by Plaxco et al., a significant correlation was found to exist between logarithmic folding rates of two-state folders and the values of a simple parameter termed relative contact order (RCO) derived from the residue–residue contacts in a given protein's native structure; the pattern of these contacts is commonly referred to as native topology.26 (For a perspective on this usage of the term “topology” vis-à-vis that in other biomolecular contexts, see Section 1.1 of Ref. 27.) It was recognized immediately that this simple empirical rate–topology correlation should offer important clues to the energetics of protein folding. The correlation actually represented a fundamental conceptual challenge because several protein chain models embodying common notions of protein energetics at the time failed to reproduce a similar trend.28 One apparent exception was a two-dimensional square-lattice-model study conducted 20 years earlier by Taketomi and Gō.29 These early researchers concluded that local interactions speed up folding kinetics, whereas nonlocal interactions lead to more cooperative folding transitions.
However, although this earlier finding was consistent with the discovery of Plaxco et al., it did not directly address the multiple-protein rate–topology correlation because only a single 49mer native structure was considered in Ref. 29. Ising-like constructs offered the first rationalizations for the rate–topology correlation.30,31 Their success provided critical physical insights regarding the different contributions of local and nonlocal interactions to the free-energy barrier to folding. Nonetheless, these constructs are not self-contained heteropolymer models because they lack an explicit representation of the protein chain.32 As such, the relationship between Ising-like constructs and explicit-chain models, which clearly bear a more direct resemblance to real proteins, remains to be better elucidated.33 The rate–topology correlation was subsequently addressed using explicit-chain, continuum Gō-like modeling34–36 by Koga and Takada, who simulated folding rates of 18 natural proteins.37 Consistent with experiment,26 a positive correlation between simulated rates and RCO was found, albeit with a weaker correlation coefficient. These results showed that the empirical rate–topology correlation can be captured, at least to a degree, by native-centric models. The results also revealed, however, a fundamental limitation of the common Gō-like potential because it fell short in accounting for the diversity of folding rates among natural two-state-like folders.
As noted by the authors, the simulated folding rates spanned ≈1.5 orders of magnitude, which is much narrower than the 6 orders of magnitude spanned by the corresponding experimental folding rates.37 It was soon realized38,39 that a likely origin for this shortcoming is that continuum Gō-like models with pairwise-additive Lennard–Jones-like potentials as well as common lattice Gō models fold less cooperatively than real two-state-like proteins.40,41 Lattice modeling efforts inspired by this realization showed more diverse folding rates when cooperativity was enhanced by nonadditive many-body energy terms.38,39 These studies further suggested that the remarkable diversity of experimental folding rates among two-state-like proteins is probably underpinned by specific, rather than generic, forms of many-body interactions, for there are substantial variations in folding rate diversity when different many-body interaction schemes were applied. For example, under a physically plausible nonadditive scheme that coupled local conformational propensity with nonlocal contact interactions, folding rates among a set of 27mer three-dimensional cubic lattice model proteins spanned ≈ 2.6 orders of magnitude.39 (See Ref. 
42 for a recent analytical model based on a similar local–nonlocal coupling mechanism.39,43) In contrast, under a different nonadditive scheme,38 the corresponding computed folding rates spanned only ≈ 1.8 orders of magnitude.8 A subsequent application of the idea about many-body effect to continuum explicit-chain models showed that adding a three-body term to the common pairwise-additive Gō potential could increase model folding rate diversity among 18 proteins from ≈ 2.0 to ≈ 3.0 orders of magnitude, even though the improvement was insufficient to match the ≈ 4.5–6.0 orders of magnitude spanned by the corresponding experimental rates.44 Analytical modeling has explored the boosting effect of many-body terms on folding cooperativity,45 and a recent nonexplicit-chain variational model study also indicated that folding rate diversity was enhanced by many-body effects.46 Together, these findings have convincingly demonstrated that folding cooperativity is a crucial ingredient in the physical accounting of the empirical rate–topology correlation. A case in point is an earlier lattice-model study that has insightfully concluded that folding cooperativity increases with nonlocality of native contacts.47 However, because the models in that study were insufficiently cooperative (see Ref. 48 and Section 16.4.4, pp. 422–427, of Ref. 49 for an assessment of the 20-letter interaction scheme used in Ref. 
47), severe chevron rollovers50,51 led to the simulation results that “under conditions at which each native conformation was stable, the structure with mostly nonlocal contacts folded 2 orders of magnitude faster than the one with mostly local contacts”.47 This prediction was opposite to the experimental rate–topology trend.26 In contrast, more cooperative native-centric models have milder chevron rollovers, and thus a positive correlation between simulated folding rate and RCO can be maintained even under strongly folding conditions.52 What are the physical origins of a high degree of folding cooperativity? Biophysical properties of proteins are governed by water-mediated interactions; and one key physical contributor to folding cooperativity is the energetic effects of water expulsion (desolvation).53 Recent simulations of coarse-grained protein chain models with a physics-based desolvation/solvation barrier (referred to simply as “desolvation barrier,” or “db” below) in their native-centric potential41,52,54–58 showed that microscopic db's could significantly enhance folding/unfolding cooperativity.54,57,59 In this respect, db's afford protein chain models with more realistic thermodynamics and kinetics than those stipulated by common Gō-like models with no db's.
In view of the positive impact of desolvation effects on folding cooperativity and the above-discussed relationship between folding cooperativity and rate–topology correlation, we deemed it worthwhile to investigate the extent to which desolvation effects may account for the remarkable diversity in folding rates among two-state-like proteins.60 Our group has underlined the role of conformational entropy in the rate–topology correlation using common continuum Gō-like models with no db's.27 Building upon that advance and motivated by the prospect of gaining a deeper understanding about the role of solvation/desolvation in protein folding, in the present work we used db models, which have more realistic folding cooperativity, to address two main questions: (1) How are the folding rates of models of small, single-domain proteins impacted by the introduction of db's, and do they exhibit a stronger correspondence with experimental rates? (2) What is the physical basis behind the changes in the overall free-energy barrier to folding and in the speed of conformational sampling that lead to this alteration in the rates?
Theory
The present investigation adopts the coarse-grained native-centric approach in our previous studies.41,57–59 This modeling approach is appropriate for our purpose. Although nonnative interactions occur in the folding of some proteins,61–65 their effects are not dominant in two-state-like folders and can be treated as a perturbation on a native-centric background.66 Hence, our native-centric approach may be seen as a zeroth approximation in a more general modeling framework.66–68 As in our previous efforts, for computational tractability, we use an implicit-water effective db potential54,59 in place of explicit-water simulations.69,70 The effective db potential embodies the collective effects of many water molecules, and thus, in this sense, represents a “many-body” contribution.
However, it should be recognized that our pairwise-additive form of the effective db potential involves an approximation because it neglects the nonadditivity of the water-mediated effective interactions themselves8,71–73 (see below). Despite these limitations, results from recent coarse-grained native-centric db modeling have provided useful physical insights into molecular recognition74 and mechanical stability of proteins in pulling experiments.75 Moreover, our simulations showed that db's could significantly reduce native-state conformational fluctuations,52,59 a notable feature consistent with the experimental view that db's are a main factor in the kinetic stability of proteins.76,77 As already noted above, db's enhance kinetic cooperativity; that is, they entail a more extended linear chevron regime.52,57,59 For the issue at hand regarding folding rate diversity, this chevron property, by itself, means that db's tend to increase the diversity of folding rates of a given protein under different folding conditions. Given this trend, it is not unreasonable to expect that db's would also increase the diversity of folding rates across different native structures. This is indeed the case, as will be detailed below.
Here we use a native-centric potential with an implicit-water desolvation barrier54 that our group has applied in previous investigations.41,57 Following the notation in the detailed formulation in Ref. 59, the potential is given by

\[
U(r; r_{\mathrm{cm}}, \varepsilon, \varepsilon_{\mathrm{db}}, \varepsilon_{\mathrm{ssm}}) =
\begin{cases}
\varepsilon Z(r)\,[Z(r) - 2] & \text{for } r < r_{\mathrm{cm}} \\
C\,Y(r)^{n}\,[Y(r)^{n}/2 - (r_{\mathrm{db}} - r_{\mathrm{cm}})^{2n}]/2n + \varepsilon_{\mathrm{db}} & \text{for } r_{\mathrm{cm}} \le r < r_{\mathrm{db}} \\
-B\,[Y(r) - h_1]/[Y(r)^{m} + h_2] & \text{for } r \ge r_{\mathrm{db}}
\end{cases}
\tag{1}
\]

where rcm is the contact-minimum (cm) separation, ɛ is the magnitude of energy at cm, ɛdb is the db height, ɛssm is the depth of the energy well at the solvent-separated minimum (ssm), as illustrated in Fig. 1a; and

\[
\begin{aligned}
Z(r) &= (r_{\mathrm{cm}}/r)^{k} \\
Y(r) &= (r - r_{\mathrm{db}})^{2} \\
C &= 4n(\varepsilon + \varepsilon_{\mathrm{db}})/(r_{\mathrm{db}} - r_{\mathrm{cm}})^{4n} \\
B &= m\,\varepsilon_{\mathrm{ssm}}\,(r_{\mathrm{ssm}} - r_{\mathrm{db}})^{2(m-1)} \\
h_1 &= (1 - 1/m)(r_{\mathrm{ssm}} - r_{\mathrm{db}})^{2}/(\varepsilon_{\mathrm{ssm}}/\varepsilon_{\mathrm{db}} + 1) \\
h_2 &= (m - 1)(r_{\mathrm{ssm}} - r_{\mathrm{db}})^{2m}/(1 + \varepsilon_{\mathrm{db}}/\varepsilon_{\mathrm{ssm}})
\end{aligned}
\tag{2}
\]

In the above Eq. (2), rssm = rcm + 3 Å, which followed from the consideration that 3 Å is approximately equal to the diameter of a water molecule, and rdb = (rssm + rcm)/2, as in the original work of Cheung et al.54 We use k = 6, m = 3, and n = 2 as before.41,57,58 With these definitions, U(rcm) = −ɛ, U(rdb) = ɛdb, and U(rssm) = −ɛssm.
Fig. 1. Effective (implicit-water) potential with desolvation barrier (db). (a) The db potential energy41,54,58,59 (continuous curve) is given by the expression for U(r; rcm, ɛ, ɛdb, ɛssm) in the text [which is identical to that in Eqs. (2) and (3) of Ref. 59]. Here U(r) is plotted in units of the depth ɛ of the minimum energy (= −ɛ) at the cm separation r = rcm [U(rcm) = −1 in this plot]. Included for comparison is the PMF of two methane molecules at 25 °C computed by atomic simulation using the TIP4P model of water (dashed curve, data from Ref. 71). The schematic molecular drawings illustrate the distances between the methane molecules (full circles) at the cm, db, and ssm positions vis-à-vis the size of a water molecule (dashed circles). For the example in this figure, the rcm distance, the db height ɛdb, and the ssm depth ɛssm in U(r) (continuous curve) are shown with values equal to those in the methane–methane PMF from atomic simulation (dashed curve). In general, the contact distance rcm in the U(r) potential for a pair of natively contacting residues i and j is set equal to the Cα–Cα distance r_ij^n between the residues in the PDB structure, whereas ɛdb and ɛssm may take values similar (see the text) but not necessarily identical to that shown in this figure. (Effects of varying ɛdb and ɛssm were explored in Refs. 52 and 59.) (b) PMF computed by explicit-water atomic simulation (dashed curve) for two 20-residue polyalanine α-helices versus an implicit-water potential for the same system (continuous curve), where r is the distance between the centers of mass of the two helices. The PMF shown was simulated at 25 °C for two essentially rigid helices at a fixed crossing angle using the TIP4P model of water (dashed curve; data from Ref. 78). The implicit-water potential here was constructed by assuming that water-mediated interactions were pairwise additive, as follows. First, “native” contacts between residues along the two helices were determined by applying the criterion for native contacts to the helices' cm configuration at r ≈ 0.75 nm. Second, the potential U for each such pair was taken to be the U(r) function in (a) except r_ij^n was set equal to the given residue pair's distance in the helices' cm configuration (r_ij^n can be different for different contacts). The overall implicit-water potential energy function shown by the continuous curve in (b) was then calculated as the sum of all such U's.
The form of this potential was motivated by the general behavior of two nonpolar solutes in water. In order for the solutes to be in contact, water molecules must be pushed out of the space between them (Fig. 1a). The finite size of the water molecules thus leads to an energetic cost, manifested as a barrier in the effective pair potential between the solutes, that is, the potential of mean force (PMF) with the water degrees of freedom averaged.79 Evidently, a similar effect is likely to have a significant impact on the folding process of a globular protein as well, because most, if not all, water molecules must be excluded from the hydrophobic core before the native folded structure can be formed.
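As an illustration of the pair potential U(r) defined in the text, the following is a minimal Python sketch, not the authors' simulation code. It assumes Y(r) = (r − rdb)², the form under which the three branches join continuously, together with the stated choices rssm = rcm + 3 Å, rdb = (rssm + rcm)/2, k = 6, m = 3, n = 2, and the values ɛdb = 0.1ɛ and ɛssm = 0.2ɛ adopted in this work. The function name `db_potential`, its argument names, and the example contact distance of 6 Å are hypothetical.

```python
def db_potential(r, r_cm, eps=1.0, eps_db=0.1, eps_ssm=0.2,
                 k=6, m=3, n=2, water_diameter=3.0):
    """Pairwise desolvation-barrier potential U(r); lengths in angstroms.

    Illustrative sketch only: r_cm is the contact-minimum separation,
    eps the well depth at r_cm, eps_db the barrier height, and eps_ssm
    the depth of the solvent-separated minimum.
    """
    r_ssm = r_cm + water_diameter      # solvent-separated minimum
    r_db = 0.5 * (r_cm + r_ssm)        # position of the barrier peak

    if r < r_cm:
        # Repulsive wall and contact well.
        z = (r_cm / r) ** k
        return eps * z * (z - 2.0)

    y = (r - r_db) ** 2
    if r < r_db:
        # Rise from the contact minimum (-eps) to the barrier top (+eps_db).
        c = 4.0 * n * (eps + eps_db) / (r_db - r_cm) ** (4 * n)
        return c * y**n * (0.5 * y**n - (r_db - r_cm) ** (2 * n)) / (2.0 * n) + eps_db

    # Beyond the barrier: solvent-separated minimum, decaying to zero.
    a = (r_ssm - r_db) ** 2
    b = m * eps_ssm * a ** (m - 1)
    h1 = (1.0 - 1.0 / m) * a / (eps_ssm / eps_db + 1.0)
    h2 = (m - 1.0) * a**m / (1.0 + eps_db / eps_ssm)
    return -b * (y - h1) / (y**m + h2)


# Hypothetical native contact with r_cm = 6 A, so r_db = 7.5 A and r_ssm = 9 A.
u_cm = db_potential(6.0, 6.0)    # contact minimum
u_db = db_potential(7.5, 6.0)    # barrier peak
u_ssm = db_potential(9.0, 6.0)   # solvent-separated minimum
```

Evaluating the function at rcm, rdb, and rssm is a quick consistency check on the reconstruction, since the three values should equal −ɛ, +ɛdb, and −ɛssm, respectively.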
In general, the water-mediated PMF is temperature dependent.73,80 Therefore, to account for the temperature dependence of protein folding,81,82 some form of temperature dependence would have to be introduced into the effective potential, as in our group's recent attempt to rationalize58,59 the common yet intriguing feature of isostable intrinsic enthalpic folding barriers.83 In the present investigation, however, we use only a temperature-independent native-centric potential with db, as in most previous studies,41,52,54–56 because our main goal here is to address the diversity of experimental folding rates of different proteins measured at essentially the same temperature. The present focus on temperature-independent interactions also serves well to ensure that any entropic effect observed in our model must necessarily originate from conformational entropy, whose role in the rate–topology correlation27 is an issue we aim to further elucidate.
Approximate additivity of the db potential
As in previous applications of db potentials41,54,59 with functional form similar to that in Fig. 1a, the total native-centric interaction energy in a protein chain model is the sum of db potentials between pairs of residues. Figure 1b provides an assessment of this additivity assumption. Here, the PMF between two 20-residue α-helices simulated using an explicit-water model78 is contrasted with an effective potential constructed for the same many-body system based on assuming pairwise additivity of our db potential. Figure 1b shows that the overall barrier to helix–helix association (at separation ∼1 nm) computed from explicit-water simulation (dashed curve) is lower than that calculated by a simple summation of contributions from our implicit-water potential for individual residue pairs (continuous curve). Nonetheless, features of the two potentials in Fig. 1b are quite similar, including the position and depth of the solvent (water)-separated minimum at ≈1.2 nm.
This similarity suggests that one may expect pairwise additivity of the db potential to be a reasonable first approximation for coarse-grained modeling of desolvation/solvation effects in protein folding.
Activated volume in pressure-dependent folding as a db effect
A noteworthy physical implication of desolvation effects is how they contribute to the volumetric signatures of protein folding.84 Recent explicit-water simulations of two-helix systems78 have revealed an intimate relationship between the enthalpic contribution to the overall folding barrier and the activation volume of the folding transition state determined from pressure-based experimental methods.85 The helix simulations in Ref. 78 highlighted the creation of a void volume when the two helices (as a model for two parts of a folding protein) were separated by a distance too small to accommodate water molecules in between them (a process termed “steric dewetting”). Thus, formation of the helix dimer entails surmounting an “activation volume” (peak of volume increase as the two helices approach each other from large separation) of ≈ 55 mL/mol and ≈ 150 mL/mol, respectively, for a pair of 20-residue polyalanine and polyleucine helices (Fig. 3 of Ref. 78). Interestingly, a pressure-based experiment by Mitra et al. showed that the folding activation volume of wild-type staphylococcal nuclease is ≈ 56 mL/mol (Table 1 of Supporting Information for Ref. 85), suggesting that the extent of dehydration at the folding rate-limiting step of this protein may be similar to that typified by the dimerization of two rigid 20-residue polyalanine helices. This comparison between activation volume data from pressure experiments and from explicit-water simulation of many-body hydrophobic interactions provided further support to the hypothesis that the rate-limiting step of folding for some proteins likely involves large-scale, near-simultaneous hydrophobic burial.
If so, the height of the enthalpic folding barrier as well as the size of activation volume may be closely related to the degree of folding cooperativity of a given protein.58,78 How these many-body, nonadditive effects might be captured and elucidated by coarse-grained modeling is beyond the scope of the present work but is a question that would be extremely interesting to explore in the future.
db's lead to a higher overall folding free-energy barrier
As in most of our previous studies,41,59 we adopt ɛdb = 0.1ɛ and ɛssm = 0.2ɛ for the native-centric db potential (Fig. 1a). We focus on the 13 proteins in the previous study by Wallin and Chan27 (Fig. 2). The set of native contacts used for modeling a protein is obtained by applying the same 4.5 Å side chain–side chain separation criterion as that in Refs. 27, 58, and 59 on the given protein's Protein Data Bank (PDB) structure. Folding kinetics and equilibrium sampling are conducted by Langevin dynamics.86 As before, bias potentials are introduced to facilitate sampling52,87–89 when necessary. The parameters for Langevin dynamics simulations are identical to those in our previous works. In particular, the simulation time step δt = 0.02 and the friction coefficient γ = 0.0125, as in Refs. 27 and 59. During Langevin dynamics simulation, a pair of residues belonging to the native contact set is considered to be in contact (and thus contributing to the fractional native contact number Q) if the distance between their Cα positions is not larger than that at the db peak of their native-centric db potential.41,58,59 As illustrated by the example in Fig. 3a, free-energy profiles ΔG(Q)/kBT (kBT is Boltzmann constant times absolute temperature) for the models with db's we studied have higher overall free-energy barriers than the profiles for their corresponding no-db models. This is part of the above-noted general trend that folding/unfolding transitions are more cooperative in with-db models than in corresponding no-db models.41,54,57
Fig. 2. Ribbon diagrams of the PDB structures of the set of 13 proteins used in the present investigation (labeled below each structure by its PDB id). The same set was used in a previous study by Wallin and Chan.27 Drawings were created by RasMol.
Fig. 3. Free-energy barriers and folding rates. (a) Typical Q-based one-dimensional free-energy profiles, shown here for the with-db (continuous curve) and no-db (dashed curve) models of the 6–85 fragment of λ-repressor (1lmb). Each curve was simulated at approximately the transition midpoint of the given model; ΔG(Q)/kBT = −ln P(Q) + constant, where P(Q) is the conformational population as a function of Q. (b) Free-energy barrier height ΔG‡ (in units of kBT) versus logarithmic midpoint folding rate kfsim determined from simulations of with-db (filled circles) and no-db (open squares) models of the 13 proteins we studied. Straight lines were determined from linear regression with correlation coefficient r = −0.98 for both cases. The x-intercepts of the straight lines provide the preexponential (front) factors in Kramers theory for the with-db (F^db) and no-db (F^(0)) models. Data for the no-db models were taken from Fig. 4 of Ref. 27.
db's significantly reduce conformational diffusion at the peak of overall folding free-energy barrier
Using the computational setup outlined above, we have determined the folding rates of the with-db models for the 13 proteins in Fig. 2 at or near each with-db model protein's transition midpoint. We have also determined the folding activation free energy, ΔG‡, at the corresponding model temperatures for the progress variable Q. Our ΔG‡'s are determined from Q-based free-energy profiles as exemplified by that in Fig.
3a, wherein ΔG‡ is an overall barrier height defined as the ΔG value at the peak of the overall free-energy barrier minus the ΔG value at the unfolded (or denatured, low-Q) free-energy minimum. For the 13 with-db models, Fig. 3b shows that, to a very good approximation, there is a linear relationship, with slope −1, between logarithmic simulated folding rate ln kfsim and ΔG‡/kBT (circles). As noted previously,27 a similar linear relationship holds for the corresponding no-db models as well (squares in Fig. 3b). These trends indicate that the relationship

\[
k_f = F \exp\left(-\frac{\Delta G^{\ddagger}}{k_B T}\right)
\tag{3}
\]

in the conventional transition-state picture or Kramers theory of protein folding90,91 holds approximately for our model midpoint folding rates, with F denoting the preexponential front factor92 or prefactor93 estimated by the x-intercepts of the linear fits in Fig. 3b. The formulation in Eq. (3) provides an analysis of model folding rates in terms of a product of two contributions: The front factor F characterizes the rate of conformational diffusion at the overall folding free-energy barrier, whereas the folding barrier height ΔG‡/kBT is determined by the population of conformations at the same overall barrier relative to that at the unfolded minimum. The ensemble of conformations at the overall barrier constitutes a putative folding transition state27,36 because, dynamically, the value of Q can only undergo essentially continuous variation. Hence, a chain en route to the native state must pass through one of the conformations with Q values corresponding to that of this putative transition-state ensemble (TSE) at the overall barrier. This ensemble acts as a folding bottleneck when ΔG‡/kBT is large because then the conformations it encompasses have low probabilities relative to those belonging to the unfolded state. Following a similar argument put forth in an earlier lattice protein model study (Fig. 2 of Ref. 94), Fig. 4 here shows that during one simulation time step δt (which is short by construction), the largest change in the number of native contacts is ±1, which is the minimum nonzero increase or decrease possible. Thus, as expected, Q is seen as varying in a quasicontinuous manner in our model dynamics. Accordingly, properties of the transition state, such as its average potential energy, conformational entropy, and average root-mean-square deviation (RMSD)95 from the native structure, are determined from conformations sampled within a narrow range of Q values at the peak of the overall free-energy barrier as in Ref. 27.
Fig. 4. Time evolution of native contact number in Langevin dynamics. P[Q̃(t), Q̃(t + δt)] is the probability, among all possible dynamic transitions effected by a Langevin dynamic time step δt, that the number of native contacts is Q̃(t) at time t and Q̃(t + δt) at a subsequent time t + δt. Results shown are for the with-db model of CI2 (2ci2) simulated at ɛ = 1.172 (T = 1). Q̃ = Q·Qn, where Qn is the number of native contacts in the PDB structure and Qn = 131 for 2ci2. The transition probabilities were determined from 2 × 10^9 time steps of sampling. Probabilities for different changes in Q̃, denoted here as δQ̃ ≡ Q̃(t + δt) − Q̃(t), are depicted in different colors for clarity: the black, red, and blue curves are for δQ̃ = 0, −1, and +1, respectively. Probabilities for all other transitions were zero in our simulation (P[Q̃(t), Q̃(t + δt)] = 0 for δQ̃ > 1 or δQ̃ < −1).
Figure 3b shows that folding rates in the with-db models are substantially slower than the corresponding no-db models. However, the with-db models' higher ΔG‡/kBT values (Fig. 3a) account only partly for the slower folding rates in these models. The analysis in Fig. 3b shows that the other major reason for their slower folding rates is that conformational diffusion is slower in the with-db models. In Fig.
3b, the intercepts of the linear fits show that the front factor F^db ≈ 1.7 × 10^−5 for the with-db models is ∼50 times smaller than the front factor F^(0) ≈ 9.0 × 10^−4 for the no-db models. In general, the rate of conformational diffusion along a single progress variable Q has been found to depend on the progress variable.96,97 Results from one study suggested that the variation across the middle of the range of Q may be mild.96 Using a different model, another study concluded that the rate of conformational diffusion decreases “with respect to the progression of folding toward the native state, which is caused by the collapse to a compact state constraining the configurational space for exploration”.97 Remarkably, in light of likely variations of conformational diffusion rate with respect to Q as proposed in these prior theoretical studies, our results in Fig. 3b show that the rate of transition-state conformational diffusion, as embodied by the front factor F, is approximately uniform among a class of models for different proteins constructed using the same native-centric interaction scheme (with-db or no-db), even though it can be very different for different classes of models (with-db versus no-db). The observation here that transition-state conformational diffusion is significantly slower in the with-db model is physically reasonable because the presence of repulsive interactions in the db potential creates a more bumpy energy landscape, entailing more channeled and meandering microscopic folding paths that would take longer times to traverse. Evidently, the rate of conformational diffusion is dependent upon solvent viscosity.98 The present simulations were conducted under low viscosity for computational tractability.
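The two-factor reading of Eq. (3) used above can be sketched in a few lines of Python. This is an illustrative example rather than the analysis code behind Fig. 3b: the front factors are the values quoted in the text, whereas the barrier heights are hypothetical inputs chosen only to show that a straight-line fit of ln kf against ΔG‡/kBT returns slope −1 and the front factor F. Here ln kf is regressed on ΔG‡/kBT, so F comes from the y-intercept; that is the same quantity read from the x-intercepts in Fig. 3b, where the axes are interchanged. The function names are assumptions introduced for this sketch.

```python
import math


def kramers_rate(prefactor, barrier_kT):
    """Folding rate k_f = F exp(-dG/kBT); barrier given in units of kBT."""
    return prefactor * math.exp(-barrier_kT)


def fit_prefactor(barriers_kT, rates):
    """Least-squares line ln k = ln F + slope * (dG/kBT); returns (F, slope)."""
    xs = list(barriers_kT)
    ys = [math.log(k) for k in rates]
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), slope


F_DB = 1.7e-5   # with-db front factor quoted in the text
F_0 = 9.0e-4    # no-db front factor quoted in the text

# Hypothetical barrier heights (units of kBT), for illustration only.
barriers = [4.0, 6.0, 8.0, 10.0]
rates_db = [kramers_rate(F_DB, b) for b in barriers]

F_fit, slope = fit_prefactor(barriers, rates_db)

# Slowdown attributable to conformational diffusion alone, and the barrier
# increase (in kBT) that would produce the same slowdown.
prefactor_slowdown = F_0 / F_DB
equivalent_barrier_kT = math.log(prefactor_slowdown)
```

With the quoted front factors, the prefactor ratio is about 53, which is equivalent to roughly 4 kBT of additional barrier height. Because F is approximately uniform within each class of models, this factor shifts all with-db rates by a similar amount; the spread of rates across proteins comes from the ΔG‡ term.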
Nonetheless, a recent result showing that model chevron plots maintain their shape over a wide range of Langevin friction coefficients52 and the above general physical consideration both suggest that a significant difference in the rate of transition-state conformational diffusion between with-db and no-db models should persist in Langevin dynamics with higher, more water-like friction coefficients.86

Results and Discussion

db's significantly increase the diversity in folding rates among model proteins of different native topologies

Applying the with-db modeling approach described above to the 13 proteins in Fig. 2, we show in Fig. 5a the simulated folding rates, kfsim, of the with-db protein models at their respective transition midpoints and compare the kfsim values with experimental rates (see Ref. 27 and references therein)†. At the model transition midpoint, folding and unfolding rates are equal and the kinetic relaxation is well approximated by a single exponential,41 and thus kfsim = 1/MFPT, where MFPT is the mean first-passage time of folding. As in the previous no-db model study27 (Fig. 5b), we focus on kfsim at the model transition midpoint because the behaviors of no-db and with-db models are kinetically more cooperative, that is, two-state-like, at midpoint

† In Ref. 27, for Colicin E9 immunity protein (PDB id 1imq), the chain length N and folding rate kf given in Table 1 of that reference should instead be listed, respectively, as N = 86 and kf = 1.5 × 10³ s⁻¹. This is merely a typographical error that did not affect other results on 1imq in Ref. 27.

which include three other two-state proteins (with N = 36, 43, and 115) and three three-state proteins, their simulated no-db model folding rates cover ≈ 4.7 orders of magnitude, whereas the corresponding experimental folding rates cover ≈ 8.8 orders of magnitude (see Table 1 and upper plot in Fig. 1 of Ref. 100)‡.
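Two quantities recur throughout this comparison: the midpoint folding rate obtained as kfsim = 1/MFPT, and the span of a set of rates expressed in orders of magnitude. A minimal Python sketch of both is given below. The function names and all numerical values are hypothetical and chosen only for illustration; they are not simulation data from this study.

```python
import math


def rate_from_first_passage_times(first_passage_times):
    """k_f = 1 / MFPT, appropriate when relaxation is single-exponential."""
    mfpt = sum(first_passage_times) / len(first_passage_times)
    return 1.0 / mfpt


def span_in_orders_of_magnitude(rates):
    """log10 of the ratio between the fastest and the slowest rate in a set."""
    return math.log10(max(rates) / min(rates))


# Hypothetical first-passage times of four folding trajectories (in time steps).
fpts = [2.0e6, 3.5e6, 1.5e6, 5.0e6]
k_f = rate_from_first_passage_times(fpts)

# Hypothetical rates for a small set of model proteins.
rates = [1.0e-4, 3.0e-6, 2.5e-9]
span = span_in_orders_of_magnitude(rates)

print(f"k_f = {k_f:.3e} per time step, span = {span:.2f} orders of magnitude")
```

The span depends only on the fastest and slowest members of a set, so it measures rate diversity independently of how well individual simulated rates correlate with experiment.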
db's tend to increase Q of the folding transition state but leave transition-state RMSD from native essentially unchanged

Fig. 5. Experimental folding rates (kfexp) versus simulated folding rates (kfsim) of the 13 proteins studied here for (a) the with-db model and (b) the no-db model. The no-db data in (b) were from Fig. 2 of Ref. 27 and are included here to facilitate comparison with the new results in (a).

than when the models are under strongly folding conditions.41,57,99 Although the correlation between simulated and experimental folding rates (kfsim and kfexp in Fig. 5) in the with-db models is comparable with that of the no-db models (Pearson correlation coefficient r = 0.66 and 0.69, respectively), the with-db models exhibit a remarkable improvement over the no-db models in matching the experimental diversity in folding rates. In Fig. 5a, kfsim spans a range of ≈ 4.6 orders of magnitude, almost identical to the kfexp range of ≈ 4.5 orders of magnitude. To our knowledge, such a match over 4 orders of magnitude between the range of folding rates from direct kinetic simulations of explicit-chain models and that from experiments is unprecedented. By comparison, the range of folding rates in Fig. 5b simulated using no-db Gō-like models of the same proteins spans only ≈ 2.2 orders of magnitude. Interestingly, the kfsim range of ≈ 2.2 orders of magnitude from our no-db Gō-like models is almost identical to the range of ≈ 2.1 orders of magnitude obtained previously by Chavez et al.100 using the same no-db Gō-like constructs for a somewhat different set of 13 proteins (9 of which overlap with our set) with chain lengths within the range N = 56–98 as in our set. In contrast to our with-db model folding rates (Fig. 5a) but similar to our no-db model folding rates (Fig. 5b), the no-db model folding rates of Chavez et al.
also fall short of matching the corresponding range of experimental folding rates: For their aforementioned 13 proteins with N = 56–98, the experimental folding rates span ≈ 7.0 orders of magnitude; for the set of all proteins in their study, The match between the ranges of simulated and experimental folding rates in Fig. 5a suggests convincingly that barrier effects originating from desolvation energetics are a significant contributor to folding rate diversity. As noted above, both the increase in overall folding barrier height ΔG‡ and the slower transition-state conformational diffusion (smaller front factor F ) contribute to slower folding in the with-db models than that in the no-db models. However, because F db is approximately constant among the with-db models, at least for the 13 proteins studied here (Fig. 3b), the larger diversity in folding rates among the with-db models vis-à-vis that among the no-db models is underpinned almost entirely by a larger diversity in ΔG‡ values for the with-db models. Below we provide rationalization for both the with-db models' higher ΔG‡ values as well as the larger dispersion of the ΔG‡ values. The example in Fig. 3a indicates that the peak of the with-db model's higher overall folding barrier is situated at Q ≈ 0.67, which is significantly higher than the Q ≈ 0.53 value for the peak of the overall folding barrier in the no-db model. Motivated by this observation, we show in Fig. 6 the relationship between the overall folding barrier height ΔG‡ and the corresponding change in fractional native contact Q from the denatured-state (low-Q) minimum (Q = QD) to the transition-state peak (Q = Q‡). As seen in Fig. 6, ΔG‡ is well correlated with ΔQ‡ = Q‡ − QD for both the with-db and no-db models for 10 of the proteins we study. For these model proteins, db's produce a shift in the Q-value of the peak location, leading to larger values of ΔQ‡. 
The Q-value of the denatured state minimum, on the other hand, remains roughly the same for a given protein in both models.

‡ We note that the rescaling procedure proposed by Chavez et al. in Eq. (C.4) in Supporting Information of Ref. 100 is unwarranted. The proposed procedure resulted in approximately 4 orders of magnitude increase in the range of their no-db model folding rates after rescaling. However, even if the model native-centric energy strength ɛ may be different for different proteins when measured in physical energy units, this consideration cannot affect model midpoint folding rate because kfsim at midpoint temperature Tm is controlled by the dimensionless quantity ΔG‡/kBTm that, therefore, is invariant with respect to change in unit for ɛ.

Fig. 6. Activation free energy (ΔG‡, in units of kBT) versus “activation” Q value (ΔQ‡ = Q‡ – QD). For both the with-db (filled circles) and no-db (open squares) models, the correlation is significant for 10 of the proteins studied (3 outliers not plotted, see the text). The straight lines are least-squares linear regression fits; correlation coefficient r = 0.73 and 0.75, respectively, for the with- and no-db models plotted.

Thus, the larger ΔG‡ in the with-db models may be viewed as resulting from a larger ΔQ‡. This feature was noted previously for with-db models of chymotrypsin inhibitor 2 (CI2) and barnase.59 The more general result in Fig. 6 showing a substantial increase in ΔQ‡ for the with-db models over that for the no-db models is physically reasonable because db tends to decrease the stability of partially ordered conformations. As a result, folding does not proceed until a sufficiently high number of contacts have formed; that is, larger portions of the protein are ordered into native-like structure. Additionally, Fig. 6 shows for both the with-db and no-db models that an approximate linear relationship exists between ΔG‡ and ΔQ‡.
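The two quantities plotted in Fig. 6, ΔG‡ and ΔQ‡ = Q‡ − QD, can both be read off a one-dimensional free-energy profile along Q. The Python sketch below shows one simple way to do this. It assumes a two-state profile tabulated on a grid of Q values in units of kBT; the local-minimum search is an illustrative choice, not necessarily the procedure used in this study, and the profile values are synthetic.

```python
def barrier_from_profile(q, g):
    """Return (Q_D, Q_ts, dG, dQ) from a free-energy profile g(q) in units of kBT.

    Assumes a two-state profile with one denatured (low-Q) minimum and one
    native (high-Q) minimum separated by a single overall barrier.
    """
    minima = [i for i in range(1, len(g) - 1)
              if g[i] < g[i - 1] and g[i] <= g[i + 1]]
    if len(minima) < 2:
        raise ValueError("profile does not show two interior minima")
    # Keep the two deepest minima, ordered by Q: denatured first, native second.
    denatured, native = sorted(sorted(minima, key=lambda i: g[i])[:2])
    # The transition state is the highest point between the two minima.
    peak = max(range(denatured, native + 1), key=lambda i: g[i])
    return (q[denatured], q[peak],
            g[peak] - g[denatured], q[peak] - q[denatured])


# Synthetic double-well profile (illustrative values only).
q = [i / 10 for i in range(11)]
g = [3.0, 1.0, 0.0, 1.0, 3.0, 5.0, 6.0, 4.0, 1.0, 0.5, 2.0]
q_d, q_ts, dG, dQ = barrier_from_profile(q, g)

print(f"Q_D = {q_d}, Q_ts = {q_ts}, dG = {dG} kBT, dQ = {dQ:.2f}")
```

Applying such an extraction to the with-db and no-db profiles of the same protein would give the pair of (ΔQ‡, ΔG‡) points that Fig. 6 compares.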
We consider this trend a Hammond-like behavior,101 because it shows that the extent of structural reorganization of the transition state from that of the reactant (denatured state in our case) is negatively correlated with reaction (folding) speed, and therefore positively correlated with overall barrier height (Fig. 3b), as in the Hammond hypothesis. The underlying principle of this trend is similar to that enunciated by Hammond, although his original study of chemical reactions considered potential energy as a function of reaction coordinate101 rather than the free-energy profile used in the study of protein folding. It should be noted, however, that Hammond-like behavior does not apply to all of our protein folding models. The behaviors of three outliers, namely the models for twitchin (1wit), spliceosomal protein U1A (1urn), and acylphosphatase (1aps), suggest that once the overall folding barrier ΔG‡ becomes sufficiently high, its relationship with ΔQ‡ does not follow the trend exhibited in Fig. 6 for models with comparatively lower ΔG‡ values (outlier data not shown in Fig. 6). For with-db models of the outliers, ΔG‡/kBT = 9.8, 10.8, and 14.9, and ΔQ‡ = 0.40, 0.51, and 0.45, respectively. For their no-db counterparts, ΔG‡/kBT = 4.7, 5.3, and 6.1, and ΔQ‡ = 0.31, 0.28, and 0.30, respectively. Nonetheless, these no-db ΔG‡ and ΔQ‡ values are lower than those for the corresponding with-db models. In this respect, they are similar to the results for the proteins shown in Fig. 6. We next turn to the increased diversity in ΔG‡ values in the with-db model. What causes some proteins to experience a larger shift in simulated ΔQ‡ value than others when modeling is switched from the no-db to the with-db interaction scheme? To address the issue, we consider RMSD from the native structure as a function of Q (shown for two proteins in Fig. 7). In all cases, including those for the remainder of the protein set not shown in Fig.
7, RMSD is a decreasing function of Q, wherein for Q values intermediate between the denatured and native states (Q ∼ 0.2–0.8) the RMSD at a given Q is higher for the with-db than for the no-db model. Remarkably, the RMSD values at the barrier peak locations of the two models (marked by vertical lines in Fig. 7) are essentially the same. This near-invariance of transition-state RMSD with respect to the change from the no-db to with-db interaction scheme provides a perspective for understanding the corresponding shift in ΔG‡. It appears that adding db's shifts the ΔG peak to a higher Q‡ position because in the presence of the unfavorable interactions at the db's, a larger number of native contacts are necessary to achieve a given RMSD threshold required at the rate-limiting step of folding, and this shift in Q‡ leads to a higher ΔG‡ following a Hammond-like trend. However, the magnitude of this Q‡ shift is sensitive to native topology. Comparing the results for 1lmb and 1ris in Fig. 7, for example, indicates that on average a protein with higher native topological complexity would require a larger Q‡ shift to maintain an essentially model-independent RMSD threshold.

Fig. 7. RMSD from the native PDB conformation as a function of fractional number of native contacts Q. Results are shown for the examples of (a) λ-repressor (1lmb) and (b) S6 (1ris). In each panel, filled circles (upper curve) are for the with-db model, whereas open squares (lower curve) are for the no-db model. Vertical lines in each plot mark the locations of the overall barrier peaks along the free-energy profiles for the with-db (continuous line) and no-db (dashed line) models.

Folding routes with db's are more channeled

To gain further insight into db effects, we examine the distribution of individual native contacts along the model folding trajectories.
At each value of Q, there are conformations with different sets of native contacts that are consistent with the given total number of native contacts, such that $Q = \sum_{k=1}^{\tilde{Q}_n} P(c_k|Q)/\tilde{Q}_n$, where P(ck∣Q) is the probability of contact ck in the set of conformations each of which has a given Q value, the contact label $k = 1, 2, \ldots, \tilde{Q}_n$, with $\tilde{Q}_n$ denoting the total number of contacts in the native (PDB) structure. For an individual conformation, a contact ck can either be formed or not formed (with slight variation when a smooth criterion is used instead52). But in an ensemble, P(ck) typically takes on fractional values because it involves averaging over different conformations with different contact sets.

Fig. 8. Comparing the transition states in the with-db and no-db models. (a) Contact maps showing contact probabilities (color coded as indicated) in the transition states of 2ci2 in the with-db model (upper triangle, same simulation conditions as in Fig. 4 above) and in the no-db model (lower triangle), simulated at each model's respective transition midpoint. Transition states are defined from Q-based free-energy profiles as discussed in the text. The bottom drawings illustrate conformational variations in the transition states for the with-db (b) and no-db (c) models. In (b) and (c), the thick black traces represent the backbone of the native PDB 2ci2 structure, whereas thin red traces depict representative transition-state conformations optimally superimposed on the native structure. These drawings were constructed using the method in Ref. 10.

We now take a closer look at the distribution of P(ck). The contact maps in Fig. 8a show probabilities of individual contacts at the peak of the free-energy profile in both the with-db and no-db models for CI2 [P(ck) for a narrow range of Q centered at Q‡; see Ref. 27]. Transition-state contact maps such as Fig.
8a provide a useful visualization of the distribution in contact probabilities.66 The distribution of contact probabilities is more heterogeneous in the with-db model (upper triangle) than in the no-db model (lower triangle). This trend is consistent with results from the other proteins in our data set (contact maps not shown), indicating that one effect of db's is to make certain contacts more favorable in the TSE than they are in the no-db case. Reflecting the higher Q‡ in the folding transition state in the with-db model than that in the no-db model (Figs. 6 and 7), the chain representations in Fig. 8b and c show a discernibly tighter conformational ensemble for the with-db model (Fig. 8b) than for the no-db model (Fig. 8c). One parameter that has been used to quantify the heterogeneity of native contacts along the free-energy profile is the route measure

$$R(Q) = \frac{1}{\tilde{Q}_n\, Q(1-Q)} \sum_{k=1}^{\tilde{Q}_n} \left[ P(c_k|Q) - Q \right]^2 \qquad (4)$$

introduced by Plotkin and Onuchic102 and applied subsequently by Chavez et al.100 to analyze simulated data obtained from no-db Gō-like models. R(Q) is essentially the second moment of the contact probability distribution normalized by the maximum possible spread (0 ≤ R(Q) ≤ 1). Detailed discussions of the meaning of R(Q) are provided in Refs. 100 and 102. Briefly, if R(Q) takes the maximum value of unity, it indicates that only one specific set of native contacts is found at Q, in which case the protein can traverse very few possible conformational routes through the given Q value. At the other extreme, if R(Q) = 0, it means that all native contacts are equally probable at Q, and as a result many different conformational routes are available for the protein to pass through the given Q value.
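The two limiting cases of the route measure can be checked numerically. The Python sketch below estimates P(ck∣Q) from a set of conformations that all share the same Q and then evaluates R(Q) as the normalized second moment described above. The function names and the toy contact sets are hypothetical; only the formula itself is taken from Eq. (4).

```python
def contact_probabilities(contact_sets, n_native):
    """P(c_k|Q): fraction of conformations in which native contact k is formed.

    contact_sets holds one set of formed contact labels (0 .. n_native-1)
    per conformation; all conformations are assumed to share the same Q.
    """
    counts = [0] * n_native
    for formed in contact_sets:
        for k in formed:
            counts[k] += 1
    return [c / len(contact_sets) for c in counts]


def route_measure(p):
    """R(Q) = sum_k (P(c_k|Q) - Q)^2 / (Qn * Q * (1 - Q)), with Q the mean of P."""
    n = len(p)
    q = sum(p) / n
    if q <= 0.0 or q >= 1.0:
        raise ValueError("R(Q) is undefined at Q = 0 or Q = 1")
    return sum((pk - q) ** 2 for pk in p) / (n * q * (1.0 - q))


# One route: every conformation forms the same two of four native contacts.
single_route = contact_probabilities([{0, 1}, {0, 1}, {0, 1}], 4)

# Many routes: each contact is formed in half of the conformations.
many_routes = contact_probabilities([{0, 1}, {2, 3}, {0, 2}, {1, 3}], 4)

print(route_measure(single_route), route_measure(many_routes))
```

Both toy ensembles have Q = 0.5, so the difference in R(Q) reflects only how the formed contacts are distributed, which is the property the route measure is designed to isolate.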
It follows that the value of R(Q) indicates whether there are many [small R(Q)→0] or few [large R(Q)→1] folding/unfolding routes at a given Q value.100,102 Figure 9 shows the route measure in both the with-db and no-db models for the same two proteins studied in Fig. 7. R(Q) has been computed before for several no-db Gō-like model proteins in Ref. 100. For those no-db model proteins that were considered in both that work and the present study, we obtain agreement between the two sets of results. For all proteins considered in our study, R(Q) for the with-db models (Fig. 9, filled circles) is typically larger than that for the corresponding no-db models (Fig. 9, open squares) at virtually all Q values, indicating that there are generally fewer folding routes in the with-db models. This result is consistent with our expectation that the number of accessible conformations that are partially folded is substantially reduced by the repulsive part of db interactions.

Fig. 9. Route measure. Results are shown for the two proteins in Fig. 7. In each panel, route measure for the with-db model is plotted using filled circles (upper curve), whereas that for the no-db model is plotted using open squares (lower curve). Vertical lines mark the locations of overall barrier peaks along the free-energy profiles for the with-db (continuous line) and no-db (dashed line) models, as in Fig. 7.

R(Q)'s for with-db models also exhibit substantially more structure. Whereas R(Q)'s for the no-db models are mostly monotonically decreasing functions with possibly a low maximum, R(Q)'s for the with-db models often have one or more prominent maxima. This feature implies that the action of db's to narrow routing possibilities along the folding trajectory is significantly more pronounced at certain values of Q.
Folding may be characterized as encountering conformational entropic folding bottlenecks at these Q values.100 Thus, considering the above arguments together, with desolvation effects more appropriately accounted for by the present with-db models, folding is seen as more channeled than that predicted by no-db Gō-like models.

of the harsher restrictions imposed by db's on the TSE conformational freedom of topologically more complex proteins. Figures 11 and 12 turn attention to the relationship between native topology and simulated folding rate. Since the predictive power of RCO was discovered,26 several measures of native topological complexity have been devised to

Fig. 10. Energetic and entropic components of the free-energy barrier to folding. ΔE‡ is activation energy and ΔS‡ is activation entropy. Activation free energies (ΔG‡) in units of kBTm at the Tm's of the 13 with-db model proteins as well as their energetic (ΔE‡/kBTm) and entropic (−ΔS‡/kB) components are plotted as functions of logarithmic simulated folding rate. Straight lines are results of least-squares linear regression.

Rate–topology correlation likely driven by conformational activation entropy

To gain further insight into the biophysics of rate–topology correlation, we resolve simulated activation free energy ΔG‡ into its energetic (ΔE‡) and entropic (−TmΔS‡) components, where ΔE‡ is activation energy and ΔS‡ is activation conformational entropy.27 Figure 10 shows ΔG‡ (same data as that in Fig. 3b), ΔE‡, and activation conformational entropic free energy −TmΔS‡ for the 13 with-db model proteins. As for the no-db models studied before,27 there are large entropy-energy compensations. For example, both ΔE‡ and −TmΔS‡ have magnitudes ∼130kBT for the slowest folding with-db model in Fig. 10, but they combine to yield a ΔG‡ of only ∼15kBT.
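The compensation can be written out explicitly. The relation below is only a schematic restatement: the decomposition is the one used in this section, the magnitudes are the approximate values quoted for the slowest-folding with-db model, and the signs (energetic term negative, entropic term positive) follow the discussion of Fig. 10 in this section.

```latex
\frac{\Delta G^{\ddagger}}{k_B T_m}
  = \frac{\Delta E^{\ddagger}}{k_B T_m}
  + \left( -\frac{\Delta S^{\ddagger}}{k_B} \right),
\qquad
\left| \frac{\Delta E^{\ddagger}}{k_B T_m} \right|
  \sim \left| \frac{\Delta S^{\ddagger}}{k_B} \right| \sim 130,
\qquad
\frac{\Delta G^{\ddagger}}{k_B T_m} \sim 15 .
```

With these magnitudes the net barrier is roughly 15/130 ≈ 12% of either component, so a modest relative change in one component would correspond to a much larger relative change in ΔG‡.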
Figure 10 shows that logarithmic model folding rate kfsim correlates quite well with ΔE‡ (negative correlation, r = −0.79) and also with −TmΔS‡ (positive correlation, r = 0.74). Simulation data in Fig. 10 indicate further that the sign of ΔG‡ is identical to that of its conformational entropic component, −TmΔS‡, but opposite to that of its energetic component, ΔE‡. In other words, the conformational entropic component of ΔG‡ dominates over its energetic component. Because the variation in logarithmic folding rate across different model proteins is underpinned by the corresponding variation in ΔG‡ (Fig. 3b), the observation of entropic dominance in Fig. 10 implies that the rate–topology correlation is driven mainly by conformational entropy of the folding transition state in the with-db models, as in our previously studied no-db models.27 This trend, which has now been obtained from two explicit-chain simulation studies, is also consistent with the conclusion from an earlier nonexplicit-chain investigation103 and recent advances in elucidating principles of loop closure.104 Taken together, the robustness of the finding leads us to conclude that the rate–topology correlation in real proteins is likely a consequence of similar conformational entropic effects at the folding rate-limiting step. From this vantage point, the increased folding rate diversity in with-db models is a manifestation

Fig. 11. Topological parameters versus simulated logarithmic folding rate (kfsim). Results are shown for both with-db (filled circles) and no-db (open squares) models. Folding rates for the no-db models were from Ref. 27. Straight lines are results of least-squares linear regression. The correlation coefficients for with-db and no-db models are, respectively, (a) r = −0.64, −0.59 for CO; (b) r = −0.73, −0.72 for RCO; and (c) r = −0.80, −0.84 for LRO. In (c), the ln kfsim versus LRO data for no-db models were taken from Fig. 7b of Ref. 27 and are included here for comparison.
Fig. 12. Transition-state topological parameters versus entropic component of activation free energy (see Fig. 10). Present results for the with-db models (filled circles) are compared against previous results27 for the no-db models (open squares). Straight lines are results of least-squares linear regression. The correlation coefficients for the with-db and no-db models are, respectively, (a) r = 0.34, r = 0.24 for CO‡, and (b) r = 0.70, r = 0.53 for LRO‡. Data for the no-db models were taken from Figs. 8b and 9b of Ref. 27.

rationalize protein folding rates.105–109 Here we focus on RCO,26 long-range order (LRO),105 and a measure we termed27 CO:

$$\mathrm{CO} = \frac{1}{N \tilde{Q}_n} \sum_{i<j-3} l_{ij} \qquad (5)$$

where N is the chain length (number of residues) of the given protein, i and j are residue labels, lij = ∣j − i∣, and the summation is over residue–residue contacts in the native structure. This measure was motivated by, but differs somewhat from, the original definition of RCO. CO was introduced for Cα chain model studies27 for its similarity with RCO. But unlike RCO, once the native contact set is determined, calculation of CO does not require knowledge about side-chain positions (Fig. 11a). The RCO values in Fig. 11b, however, are calculated by the original definition:

$$\mathrm{RCO} = \frac{1}{N N_a} \sum_{\text{atomic contacts}} l_{ij} \qquad (6)$$

where the summation is over contacts between nonhydrogen atoms of contacting residues, and Na is the total number of such atomic contacts.26,107 We also provide in Fig. 11c the dependence of simulated folding rate on LRO:105

$$\mathrm{LRO} = \frac{1}{N} \sum_{i<j-l_c} n_{ij} \qquad (7)$$

where nij = 1 if residues i and j are in contact; otherwise, nij = 0. Unlike CO and RCO, the terms for LRO are not weighted by the loop length lij, and LRO counts only long-range contacts satisfying a sequence cutoff lc criterion. Here we use lc = 12 as in Ref. 27.
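The residue-level measures CO and LRO need only the native contact list and the chain length, so they are straightforward to evaluate. The Python sketch below follows the definitions in Eqs. (5) and (7): CO is the sum of sequence separations over native contacts divided by N times the number of native contacts, and LRO is the number of contacts with sequence separation greater than lc divided by N. The contact list is a hypothetical example, and it is assumed to have been filtered already to pairs with j − i > 3.

```python
def contact_order(contacts, n_residues):
    """CO: sum of |j - i| over native contacts, divided by N * (number of contacts)."""
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (n_residues * len(contacts))


def long_range_order(contacts, n_residues, l_c=12):
    """LRO: number of native contacts with |j - i| > l_c, divided by N."""
    n_long_range = sum(1 for i, j in contacts if abs(j - i) > l_c)
    return n_long_range / n_residues


# Hypothetical native contact list (residue index pairs) for a 20-residue chain.
contacts = [(1, 5), (2, 8), (3, 18), (4, 20), (6, 10)]
N = 20

co = contact_order(contacts, N)
lro = long_range_order(contacts, N)  # l_c = 12, the cutoff used in the text

print(f"CO = {co:.2f}, LRO = {lro:.2f}")
```

In this toy example two of the five contacts exceed the cutoff, so LRO counts only those two, whereas CO weights every contact by its sequence separation. RCO is not sketched because it requires atomic side-chain contacts that a residue-level contact list does not contain.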
Figure 11 shows for the 13 studied proteins that the correlation between logarithmic kfsim and the topological complexity parameters are reasonably good. Introduction of db's leads to an improved correlation with CO (− r increases from 0.59 for the no-db models to 0.64 for the with-db model, Fig. 11a). But db's have little effect on the correlation of log kfsim with RCO and LRO. As noted above, the lij terms in RCO are weighted by the number of side-chain atomic contacts, whereas those in CO are not [Eqs. (5) and (6)]. Interestingly, even though kfsim is computed using a Cα chain model with a uniform strength for favorable native-centric energies, the correlation of log kfsim with the RCO measure (r ≈ − 0.73) is significantly stronger than that with the CO measure. Among the topological complexity parameters considered, the simulated logarithmic folding rates correlate most strongly with LRO (Fig. 11c). The r ≈ − 0.8 value for the correlation between log kfsim and LRO is comparable to that for the dependence of experimental log kfexp on LRO,105 despite the weaker correlation between log kfsim and log kfexp for the set of proteins we study (r = 0.66, Fig. 5). Combining the results from Figs. 10 and 11, Fig. 12 explores the relationship between activation conformational entropy ΔS‡ and the transition-state topological complexity parameters CO‡ and LRO‡ of the with-db models, an analysis that has been performed for the corresponding no-db models.27 The activation quantities CO‡ and LRO‡ are the CO and LRO values computed for the TSE instead of for the native structure; that is, they are obtained by applying Eqs. (5) and (7) but with the summation replaced by one that sums over contacts in each of the TSE conformations and then averaged over the TSE. [Note that RCO‡ cannot be computed using a Cα chain model because the side-chain information required in Eq. (6) is lacking.] 
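The transition-state quantity LRO‡ described above can be sketched in the same way: the LRO count is evaluated for the contacts formed in each TSE conformation and the results are averaged over the ensemble. The Python example below is a minimal illustration under that reading. The function names and the three toy conformations are hypothetical, and CO‡ is omitted because its per-conformation normalization is not spelled out in this section.

```python
def lro_of_conformation(formed_contacts, n_residues, l_c=12):
    """LRO of one conformation: contacts with |j - i| > l_c, divided by N."""
    n_long_range = sum(1 for i, j in formed_contacts if abs(j - i) > l_c)
    return n_long_range / n_residues


def lro_transition_state(tse_contact_lists, n_residues, l_c=12):
    """LRO of the TSE: per-conformation LRO averaged over the ensemble."""
    values = [lro_of_conformation(c, n_residues, l_c) for c in tse_contact_lists]
    return sum(values) / len(values)


# Hypothetical TSE of three conformations for a 20-residue chain; each entry
# lists the native contacts (residue index pairs) formed in that conformation.
tse = [
    [(1, 5), (3, 18)],
    [(1, 5), (3, 18), (4, 20)],
    [(2, 8)],
]
N = 20

lro_ts = lro_transition_state(tse, N)
print(f"LRO (transition state) = {lro_ts:.3f}")
```

Dividing such an ensemble average by the native-structure LRO gives the kind of ratio that a scaling relation between transition-state and native topological parameters would summarize.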
Figure 12a shows that there is not much correlation between ΔS‡ and CO‡ among both the with-db and no-db models. This is not too surprising because although log kfsim correlates reasonably well with ΔS‡ (Fig. 10), the correlation between log kfsim and native CO is weaker (Fig. 11a). Nonetheless, it is interesting to note that the range of CO‡ values spanned by the 13 model proteins is 60–80% of the corresponding range of native CO values. This trend appears to be consistent with recent results based on ψ-value analyses and other experimental techniques, indicating that transition states of several small proteins achieve approximately 60–80% (∼70%) of the RCO of their respective native structures.110,111 A somewhat lower ∼50% of native RCO, however, was found in putative TSE's simulated using experimental ϕ-values as constraints112 (see also comment in Ref. 113 on the method in Ref. 112). Figure 12b shows that activation conformational entropy correlates much better with LRO, and that db significantly improves the correlation, viz., r for LRO‡ versus −ΔS‡/kB increases from 0.53 for the no-db models to 0.70 for the with-db models. Contrasting this behavior with that in Fig. 12a, our finding suggests that the conformational entropic consequence of the topological complexity in the TSE may be better characterized by the LRO measure than by the RCO measure. It is clear from Fig. 12b that db's promote nonlocal contacts in the TSE, with the maximum LRO‡ among the 13 proteins studied shifting from ∼0.65 for the no-db models to ∼0.8 for the with-db models. Results in Figs. 11c and 12b indicate that LRO‡ ∼ 0.4 (LRO). Experimental testing of this predicted scaling should provide useful topological information about the TSE in addition to the insight gained from the CO‡ ∼ 0.7 (CO) relation discussed above.
Concluding Remarks

In summary, we have shown that incorporating physics-inspired pairwise db's into native-centric coarse-grained explicit-chain models of a set of natural proteins can lead to a remarkable diversity in folding rates almost identical to that observed experimentally, a feat not achievable by common Gō-like models without db's. db's give rise to more ruggedness on the energy landscape. This ruggedness enhances rather than diminishes folding cooperativity because db's serve to eliminate many partially folded conformations that are prone to kinetic trapping. In other words, energy landscapes with db are rugged with barriers but not rugged with traps, a distinction that has been pointed out in a lattice modeling context.92 Consequently, we found that folding with physical db's is more cooperative, slower, and more channeled than that stipulated by no-db modeling. In our models, the slowing of folding rate by db's as well as the concomitant enhancement of folding cooperativity is seen as mainly a transition-state conformational entropic effect. Broadly speaking, this effect is more prominent for proteins with more complex native topologies. The correlation between simulated and experimental folding rates is fair for the set of proteins studied. Although the match between the range of simulated and experimental folding rates improves dramatically with the incorporation of db's, the degree of correlation between simulated and experimental folding rates is practically the same, and is not very high, for our with-db and no-db models.
This means that much needs to be learned about the relationship between the energetics of db models and that of real proteins, as well as the possible connections between our with-db models and many-body interaction schemes that have been invoked, with varying degrees of success, to rationalize rate–topology correlation.38,39,52 In this respect, it should be noted that the native-centric db model has recently been applied productively to rationalize non-two-state protein folding.67 The modeling approach also appears capable, at least for two members of the peripheral subunit-binding domain family with available PDB structures, of capturing the rank order of folding cooperativity of homologous proteins. However, it may not always reproduce quantitatively the full divergence in folding rates among homologues.52,114 Experiments have shown that even a single mutation can significantly change folding speed,13 and folding rates of proteins with the same architecture, such as those for the spectrin domains, can differ by more than 3 orders of magnitude.115 Several circular permutants'116 nonconformity to the usual rate–topology correlation117 also raised questions as to the generality of any simple theoretical treatment based on native topology. To what degree native-centric approaches such as the present db model can rationalize these intriguing findings remains to be ascertained. These potential limitations of the present model notwithstanding, the fact that a simple addition of db's is sufficient to essentially reproduce the large range of experimental folding rates in this study suggests strongly that db effects are a main physical origin of the remarkable diversity in the folding rates of natural proteins.
A further tantalizing suggestion from our results is that once a protein sequence is designed to specifically favor a folded conformation (as in our native-centric models) that has an appropriate topology,52,67,118 most of folding cooperativity and folding rate diversity might simply follow from the physics of desolvation. This is an attractive prospect that deserves further investigation.

Acknowledgements

We thank Artem Badasyan, Justin MacCallum, Cathy Royer, Peter Tieleman, and Stefan Wallin for helpful discussions. A.F. is a postdoctoral trainee of the Canadian Institutes of Health Research (CIHR) Training Program in “Protein Folding: Principles and Diseases” at the University of Toronto and thanks the Program for stipend support. We thank also CIHR (grant MOP-84281 to H.S.C.) and the Canada Research Chairs Program for funding this research.

References

1. Levitt, M. & Warshel, A. (1975). Computer simulation of protein folding. Nature, 253, 694–698. 2. Taketomi, H., Ueda, Y. & Gō, N. (1975). Studies on protein folding, unfolding and fluctuations by computer simulation. 1. The effect of specific amino acid sequence represented by specific inter-unit interactions. Int. J. Pept. Protein Res. 7, 445–459. 3. Bryngelson, J. D., Onuchic, J. N., Socci, N. D. & Wolynes, P. G. (1995). Funnels, pathways, and the energy landscape of protein folding: a synthesis. Proteins Struct. Funct. Genet. 21, 167–195. 4. Dill, K. A., Bromberg, S., Yue, K., Fiebig, K. M., Yee, D. P., Thomas, P. D. & Chan, H. S. (1995). Principles of protein folding—a perspective from simple exact models. Protein Sci. 4, 561–602. 5. Thirumalai, D. & Woodson, S. A. (1996). Kinetics of folding of proteins and RNA. Acc. Chem. Res. 29, 433–439. 6. Dill, K. A. & Chan, H. S. (1997). From Levinthal to pathways to funnels. Nat. Struct. Biol. 4, 10–19. 7. Mirny, L. & Shakhnovich, E. (2001). Protein folding theory: from lattice to all-atom models. Annu. Rev. Biophys.
Biomol. Struct. 30, 361–396. 8. Chan, H. S., Shimizu, S. & Kaya, H. (2004). Cooperativity principles in protein folding. Methods Enzymol. 380, 350–379. 9. Onuchic, J. N. & Wolynes, P. G. (2004). Theory of protein folding. Curr. Opin. Struct. Biol. 14, 70–75. 10. Wallin, S. & Chan, H. S. (2005). A critical assessment of the topomer search model of protein folding using a continuum explicit-chain model with extensive conformational sampling. Protein Sci. 14, 1643–1660. 11. Shakhnovich, E. (2006). Protein folding thermodynamics and dynamics: where physics, chemistry, and biology meet. Chem. Rev. 106, 1559–1588. 12. Kim, D. E., Gu, H. & Baker, D. (1998). The sequences of small proteins are not extensively optimized for rapid folding by natural selection. Proc. Natl Acad. Sci. USA, 95, 4982–4986. 13. Northey, J. G. B., Di Nardo, A. A. & Davidson, A. R. (2002). Hydrophobic core packing in the SH3 domain folding transition state. Nat. Struct. Biol. 9, 126–130. 14. Jackson, S. E. & Fersht, A. R. (1991). Folding of chymotrypsin inhibitor 2. 1. Evidence for a two-state transition. Biochemistry, 30, 10428–10435. 15. Sosnick, T. R., Mayne, L., Hiller, R. & Englander, S. W. (1994). The barriers in protein folding. Nat. Struct. Biol. 1, 149–156. 16. Jacob, J., Krantz, B., Dothager, R. S., Thiyagarajan, P. & Sosnick, T. R. (2004). Early collapse is not an obligate step in protein folding. J. Mol. Biol. 338, 369–382. 17. Matthews, C. R. & Hurle, M. R. (1987). Mutant sequences as probes of protein folding mechanisms. BioEssays, 6, 254–257. 18. Kuwajima, K. (1989). The molten globule state as a clue for understanding the folding and cooperativity of globular protein structure. Proteins: Struct. Funct. Genet. 6, 87–103. 19. Kim, P. S. & Baldwin, R. L. (1990). Intermediates in the folding reactions of small proteins. Annu. Rev. Biochem. 59, 631–660. 20. Matthews, C. R. (1993). Pathways of protein folding. Annu. Rev. Biochem. 62, 653–683. 21. Sadqi, M., Fushman, D. & Muñoz, V. (2006). 
Atom-by-atom analysis of global downhill protein folding. Nature, 442, 317–321. 22. Liu, F. & Gruebele, M. (2007). Tuning λ6–85 towards downhill folding at its melting temperature. J. Mol. Biol. 370, 574–584. 23. Jackson, S. E. (1998). How do small single-domain proteins fold? Folding Des. 3, R81–R91. 24. Baker, D. (2000). A surprising simplicity to protein folding. Nature, 405, 39–42. 25. Barrick, D. (2009). What have we learned from the studies of two-state folders, and what are the unanswered questions about two-state protein folding? Phys. Biol. 6, 015001. 26. Plaxco, K. W., Simons, K. T. & Baker, D. (1998). Contact order, transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 277, 985–994. 27. Wallin, S. & Chan, H. S. (2006). Conformational entropic barriers in topology-dependent protein folding: perspectives from a simple native-centric polymer model. J. Phys.: Condens. Matter, 18, S307–S328. 28. Chan, H. S. (1998). Matching speed and locality. Nature, 392, 761–763. 29. Gō, N. & Taketomi, H. (1978). Respective roles of short- and long-range interactions in protein folding. Proc. Natl Acad. Sci. USA, 75, 559–563. 30. Alm, E. & Baker, D. (1999). Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures. Proc. Natl Acad. Sci. USA, 96, 11305–11310. 31. Muñoz, V. & Eaton, W. A. (1999). A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc. Natl Acad. Sci. USA, 96, 11311–11316. 32. Chan, H. S. (2000). Modeling protein density of states: additive hydrophobic effects are insufficient for calorimetric two-state cooperativity. Proteins: Struct. Funct. Genet. 40, 543–571. 33. Karanicolas, J. & Brooks, C. L. (2003). The importance of explicit chain representation in protein folding models: an examination of Ising-like models. Proteins: Struct. Funct. Genet. 53, 740–747. 34. Micheletti, C., Banavar, J. R., Maritan, A. & Seno, F. (1999). 
Protein structures and optimal folding from a geometrical variational principle. Phys. Rev. Lett. 82, 3372–3375. 35. Shea, J.-E., Onuchic, J. N. & Brooks, C. L. (1999). Exploring the origins of topological frustration: design of a minimally frustrated model of fragment B of protein A. Proc. Natl Acad. Sci. USA, 96, 12512–12517. 36. Clementi, C., Nymeyer, H. & Onuchic, J. N. (2000). Topological and energetic factors: what determines the structural details of the transition state ensemble and “en-route” intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol. 298, 937–953. 37. Koga, N. & Takada, S. (2001). Roles of native topology and chain-length scaling in protein folding: a simulation study with a Gō-like model. J. Mol. Biol. 313, 171–180. 38. Jewett, A. I., Pande, V. S. & Plaxco, K. W. (2003). Cooperativity, smooth energy landscapes and the origins of topology-dependent protein folding rates. J. Mol. Biol. 326, 247–253. 39. Kaya, H. & Chan, H. S. (2003). Contact order dependent protein folding rates: kinetic consequences of a cooperative interplay between favorable nonlocal interactions and local conformational preferences. Proteins: Struct. Funct. Genet. 52, 524–533. 40. Kaya, H. & Chan, H. S. (2000). Polymer principles of protein calorimetric two-state cooperativity. Proteins: Struct. Funct. Genet. 40, 637–661; [Erratum: 43, 523 (2001)]. 41. Kaya, H. & Chan, H. S. (2003). Solvation effects and driving forces for protein thermodynamic and kinetic cooperativity: how adequate is native-centric topological modeling? J. Mol. Biol. 326, 911–931; [Corrigendum: 337, 1069–1070 (2004)]. 42. Ghosh, K. & Dill, K. A. (2009). Theory for protein folding cooperativity: helix bundles. J. Am. Chem. Soc. 131, 2306–2312. 43. Kaya, H. & Chan, H. S. (2005). Explicit-chain model of native-state hydrogen exchange: implications for event ordering and cooperativity in protein folding. Proteins: Struct. Funct. Bioinf. 58, 31–44. 44. Ejtehadi, M. 
R., Avall, S. P. & Plotkin, S. S. (2004). Three-body interactions improve the prediction of rate and mechanism in protein folding models. Proc. Natl Acad. Sci. USA, 101, 15088–15093. 45. Wang, J., Lee, C. & Stell, G. (2005). The cooperative nature of hydrophobic forces and protein folding kinetics. Chem. Phys. 316, 53–60. 46. Qi, X. & Portman, J. J. (2007). Excluded volume, local structural cooperativity, and the polymer physics of protein folding rates. Proc. Natl Acad. Sci. USA, 104, 10841–10846. 47. Abkevich, V. I., Gutin, A. M. & Shakhnovich, E. I. (1995). Impact of local and nonlocal interactions on thermodynamics and kinetics of protein folding. J. Mol. Biol. 252, 460–471. 48. Chan, H. S. (1999). Folding alphabets. Nat. Struct. Biol. 6, 994–996. 49. Chan, H. S., Kaya, H. & Shimizu, S. (2002). Computational methods for protein folding: scaling a hierarchy of complexities. In Current Topics in Computational Molecular Biology (Jiang, T., Xu, Y. & Zhang, M. Q., eds), pp. 403–447, The MIT Press, Cambridge, MA; chapt. 16. 50. Kaya, H. & Chan, H. S. (2003). Origins of chevron rollovers in non-two-state protein folding kinetics. Phys. Rev. Lett. 90, 258104. 51. Zhou, Y., Zhang, C., Stell, G. & Wang, J. (2003). Temperature dependence of the distribution of the first passage time: results from discontinuous molecular dynamics simulations of an all-atom model of the second β-hairpin fragment of protein G. J. Am. Chem. Soc. 125, 6300–6305. 52. Badasyan, A., Liu, Z. & Chan, H. S. (2008). Probing possible downhill folding: native contact topology likely places a significant constraint on the folding cooperativity of proteins with ∼40 residues. J. Mol. Biol. 384, 512–530. 53. Rank, J. A. & Baker, D. (1997). A desolvation barrier to hydrophobic cluster formation may contribute to the rate-limiting step in protein folding. Protein Sci. 6, 347–354. 54. Cheung, M. S., García, A. E. & Onuchic, J. N. (2002). 
Protein folding mediated by solvation: water expulsion and formation of the hydrophobic core occur after the structural collapse. Proc. Natl Acad. Sci. USA, 99, 685–690. 55. Karanicolas, J. & Brooks, C. L. (2002). The origins of asymmetry in the folding transition states of protein L and protein G. Protein Sci. 11, 2351–2361. 56. Sessions, R. B., Thomas, G. L. & Parker, M. J. (2004). Water as a conformational editor in protein folding. J. Mol. Biol. 343, 1125–1133. 57. Kaya, H., Liu, Z. & Chan, H. S. (2005). Chevron behavior and isostable enthalpic barriers in protein folding: successes and limitations of simple Gō-like modeling. Biophys. J. 89, 520–535. 58. Liu, Z. & Chan, H. S. (2005). Desolvation is a likely origin of robust enthalpic barriers to protein folding. J. Mol. Biol. 349, 872–889. 59. Liu, Z. & Chan, H. S. (2005). Solvation and desolvation effects in protein folding: native flexibility, kinetic cooperativity, and enthalpic barriers under isostability conditions. Phys. Biol. 2, S75–S85. 60. Ferguson, A., Liu, Z. & Chan, H. S. (2007). Desolvation effects and topology-dependent protein folding. 2007 American Physical Society March Meeting Abstract BAPS.2007.MAR.D26.3. http://meetings.aps.org/link/BAPS.2007.MAR.D26.3. 61. Capaldi, A. P., Kleanthous, C. & Radford, S. E. (2002). Im7 folding mechanism: misfolding on a path to the native state. Nat. Struct. Biol. 9, 209–216. 62. Viguera, A. R., Vega, C. & Serrano, L. (2002). Unspecific hydrophobic stabilization of folding transition states. Proc. Natl Acad. Sci. USA, 99, 5349–5354. 63. Feng, H., Takei, J., Lipsitz, R., Tjandra, N. & Bai, Y. (2003). Specific non-native hydrophobic interactions in a hidden folding intermediate: implications for protein folding. Biochemistry, 42, 12461–12465. 64. Cho, J. H., Sato, S. & Raleigh, D. P. (2004). 
Thermodynamics and kinetics of non-native interactions in protein folding: a single point mutant significantly stabilizes the N-terminal domain of L9 by modulating non-native interactions in the denatured state. J. Mol. Biol. 338, 827–837. 65. Gu, Z., Rao, M. K., Forsyth, W. R., Finke, J. M. & Matthews, C. R. (2007). Structural analysis of kinetic folding intermediates for a TIM barrel protein, indole-3-glycerol phosphate synthase, by hydrogen exchange mass spectrometry and Gō model simulation. J. Mol. Biol. 374, 528–546. 66. Zarrine-Afsar, A., Wallin, S., Neculai, A. M., Neudecker, P., Howell, P. L., Davidson, A. R. & Chan, H. S. (2008). Theoretical and experimental demonstration of the importance of specific nonnative interactions in protein folding. Proc. Natl Acad. Sci. USA, 105, 9999–10004. 67. Zhang, Z. & Chan, H. S. (2009). Native topology of the designed protein Top7 is not conducive to cooperative folding. Biophys. J. 96, L25–L27. 68. Chan, H. S. & Zhang, Z. (2009). Liaison amid disorder: non-native interactions may underpin long-range coupling in proteins. J. Biol. 8, 27. 69. Sheinerman, F. B. & Brooks, C. L. (1998). Molecular picture of folding of a small α/β protein. Proc. Natl Acad. Sci. USA, 95, 1562–1567. 70. Rhee, Y. M., Sorin, E. J., Jayachandran, G., Lindahl, E. & Pande, V. S. (2004). Simulations of the role of water in the protein-folding mechanism. Proc. Natl Acad. Sci. USA, 101, 6456–6461. 71. Shimizu, S. & Chan, H. S. (2002). Anti-cooperativity and cooperativity in hydrophobic interactions: three-body free energy landscapes and comparison with implicit-solvent potential functions for proteins. Proteins: Struct. Funct. Genet. 48, 15–30; [Erratum: 49, 294 (2002)]. 72. Shimizu, S. & Chan, H. S. (2002). Origins of protein denatured state compactness and hydrophobic clustering in aqueous urea: inferences from nonpolar potentials of mean force. Proteins: Struct. Funct. Genet. 49, 560–566. 73. Moghaddam, M. 
S., Shimizu, S. & Chan, H. S. (2005). Temperature dependence of three-body hydrophobic interactions: potential of mean force, enthalpy, entropy, heat capacity, and nonadditivity. J. Am. Chem. Soc. 127, 303–316; [Correction: 127, 2363 (2005)]. 74. Levy, Y. & Onuchic, J. N. (2006). Water mediation in protein folding and molecular recognition. Annu. Rev. Biophys. Biomol. Struct. 35, 389–415. 75. Best, R. B. & Hummer, G. (2008). Protein folding kinetics under force from molecular simulation. J. Am. Chem. Soc. 130, 3706–3707. 76. Rodríguez-Larrea, D., Minning, S., Borchert, T. V. & Sanchez-Ruiz, J. M. (2006). Role of solvation barriers in protein kinetic stability. J. Mol. Biol. 360, 715–724. 77. Costas, M., Rodríguez-Larrea, D., De Maria, L., Borchert, T. V., Gómez-Puyou, A. & Sanchez-Ruiz, J. M. (2009). Between-species variation in the kinetic stability of TIM proteins linked to solvation-barrier free energies. J. Mol. Biol. 385, 924–937. 78. MacCallum, J. L., Sabaye Moghaddam, M., Chan, H. S. & Tieleman, D. P. (2007). Hydrophobic association of α-helices, steric dewetting and enthalpic barriers to protein folding. Proc. Natl Acad. Sci. USA, 104, 6206–6210; [Correction: 105, 19561 (2008)]. 79. Pratt, L. R. & Chandler, D. (1977). Theory of the hydrophobic effect. J. Chem. Phys. 67, 3683–3704. 80. Shimizu, S. & Chan, H. S. (2000). Temperature dependence of hydrophobic interactions: a mean force perspective, effects of water density, and nonadditivity of thermodynamic signatures. J. Chem. Phys. 113, 4683–4700; [Erratum: 116, 8636 (2002)]. 81. Chen, B.-L., Baase, W. A. & Schellman, J. A. (1989). Low-temperature unfolding of a mutant of phage-T4 lysozyme. 2. Kinetic investigations. Biochemistry, 28, 691–699. 82. Oliveberg, M., Tan, Y.-J. & Fersht, A. R. (1995). Negative activation enthalpies in the kinetics of protein folding. Proc. Natl Acad. Sci. USA, 92, 8926–8929. 83. Scalley, M. L. & Baker, D. (1997). 
Protein folding kinetics exhibit an Arrhenius temperature dependence when corrected for the temperature dependence of protein stability. Proc. Natl Acad. Sci. USA, 94, 10636–10640. 84. Chalikian, T. V. (2003). Volumetric properties of proteins. Annu. Rev. Biophys. Biomol. Struct. 32, 207–235. 85. Mitra, L., Hata, K., Kono, R., Maeno, A., Isom, D., Rouget, J.-B. et al. (2007). Vi-value analysis: a pressure-based method for mapping the folding transition state ensemble of proteins. J. Am. Chem. Soc. 129, 14108–14109. 86. Veitshans, T., Klimov, D. & Thirumalai, D. (1997). Protein folding kinetics: timescales, pathways and energy landscapes in terms of sequence-dependent properties. Folding Des. 2, 1–22. 87. Valleau, J. P. & Torrie, G. M. (1977). A guide to Monte Carlo for statistical mechanics: 2. Byways. In Statistical Mechanics, Part A: Equilibrium Techniques (Berne, B. J., ed.), pp. 169–194, Plenum Press, New York; chapt. 5. 88. Beveridge, D. L. & DiCapua, F. M. (1989). Free energy via molecular simulation – Applications to chemical and biomolecular systems. Annu. Rev. Biophys. Biophys. Chem. 18, 431–492. 89. Voter, A. F. (1997). Hyperdynamics: accelerated molecular dynamics of infrequent events. Phys. Rev. Lett. 78, 3908–3911. 90. Fersht, A. R., Matouschek, A. & Serrano, L. (1992). The folding of an enzyme. I. Theory of protein engineering analysis of stability and pathway of protein folding. J. Mol. Biol. 224, 771–782. 91. Bilsel, O. & Matthews, C. R. (2000). Barriers in protein folding reactions. Adv. Protein Chem. 53, 153–207. 92. Chan, H. S. & Dill, K. A. (1998). Protein folding in the landscape perspective: chevron plots and non-Arrhenius kinetics. Proteins: Struct. Funct. Genet. 30, 2–33. 93. Portman, J. J., Takada, S. & Wolynes, P. G. (2001). Microscopic theory of protein folding rates. II. Local reaction coordinates and chain dynamics. J. Chem. Phys. 114, 5082–5096. 94. Kaya, H. & Chan, H. S. (2002). 
Towards a consistent modeling of protein thermodynamic and kinetic cooperativity: how applicable is the transition state picture to folding and unfolding? J. Mol. Biol. 315, 899–909. 95. Coutsias, E. A., Seok, C. & Dill, K. A. (2004). Using quaternions to calculate RMSD. J. Comput. Chem. 25, 1849–1857. 96. Best, R. B. & Hummer, G. (2006). Diffusive model of protein folding dynamics with Kramers turnover in rate. Phys. Rev. Lett. 96, 228104. 97. Chahine, J., Oliveira, R. J., Leite, V. B. P. & Wang, J. (2007). Configuration-dependent diffusion can shift the kinetic transition state and barrier height of protein folding. Proc. Natl Acad. Sci. USA, 104, 14646–14651. 98. Jacob, M. & Schmid, F. X. (1999). Protein folding as a diffusional process. Biochemistry, 38, 13773–13779. 99. Kaya, H. & Chan, H. S. (2003). Simple two-state protein folding kinetics requires near-Levinthal thermodynamic cooperativity. Proteins: Struct. Funct. Genet. 52, 510–523. 100. Chavez, L. L., Onuchic, J. N. & Clementi, C. (2004). Quantifying the roughness on the free energy landscape: entropic bottlenecks and protein folding rates. J. Am. Chem. Soc. 126, 8426–8432. 101. Hammond, G. S. (1955). A correlation of reaction rates. J. Am. Chem. Soc. 77, 334–338. 102. Plotkin, S. S. & Onuchic, J. N. (2002). Structural and energetic heterogeneity in protein folding. I. Theory. J. Chem. Phys. 116, 5263–5283. 103. Bai, Y., Zhou, H. & Zhou, Y. (2004). Critical nucleation size in the folding of small apparently two-state proteins. Protein Sci. 13, 1173–1181. 104. Weikl, T. R. (2008). Loop-closure principles in protein folding. Arch. Biochem. Biophys. 469, 67–75. 105. Gromiha, M. M. & Selvaraj, S. (2001). Comparison between long-range interactions and contact order in determining the folding rate of two-state proteins: application of long-range order to folding rate prediction. J. Mol. Biol. 310, 27–32. 106. Zhou, H. & Zhou, Y. (2002). Folding rate prediction using total contact distance. Biophys. J. 
82, 458–463. 107. Ivankov, D. N., Garbuzynskiy, S. O., Alm, E., Plaxco, K. W., Baker, D. & Finkelstein, A. V. (2003). Contact order revisited: influence of protein size on the folding rate. Protein Sci. 12, 2057–2062. 108. Micheletti, C. (2003). Prediction of folding rates and transition-state placement from native-state geometry. Proteins: Struct. Funct. Genet. 51, 74–84. 109. Gong, H., Isom, D. G., Srinivasan, R. & Rose, G. D. (2003). Local secondary structure content predicts folding rates for simple, two-state proteins. J. Mol. Biol. 327, 1149–1154. 110. Pandit, A. D., Jha, A., Freed, K. F. & Sosnick, T. R. (2006). Small proteins fold through transition states with native-like topologies. J. Mol. Biol. 361, 755–770. 111. Baxa, M. C., Freed, K. F. & Sosnick, T. R. (2008). Quantifying the structural requirements of the folding transition state of protein A and other systems. J. Mol. Biol. 381, 1362–1381. 112. Paci, E., Lindorff-Larsen, K., Dobson, C. M., Karplus, M. & Vendruscolo, M. (2005). Transition state contact orders correlate with protein folding rates. J. Mol. Biol. 352, 495–500. 113. Hubner, I. A., Shimada, J. & Shakhnovich, E. I. (2004). Commitment and nucleation in the protein G transition state. J. Mol. Biol. 336, 745–761. 114. Badasyan, A., Liu, Z. & Chan, H. S. (2009). Interplaying roles of native topology and chain length in marginally cooperative and noncooperative folding of small protein fragments. Int. J. Quantum Chem. In press. doi:10.1002/qua.22272. 115. Scott, K. A., Batey, S., Hooton, K. A. & Clarke, J. (2004). The folding of spectrin domains I: wild-type domains have the same stability but very different kinetic properties. J. Mol. Biol. 344, 195–205. 116. Lindberg, M., Tangrot, J. & Oliveberg, M. (2002). Complete change of the protein folding transition state upon circular permutation. Nat. Struct. Biol. 9, 818–822. 117. Miller, E. J., Fischer, K. F. & Marqusee, S. (2002). 
Experimental evaluation of topological parameters determining protein-folding rates. Proc. Natl Acad. Sci. USA, 99, 10359–10363. 118. Cho, S. S., Weinkam, P. & Wolynes, P. G. (2008). Origins of barriers and barrierless folding in BBL. Proc. Natl Acad. Sci. USA, 105, 118–123.