Gradient-Augmented Harmonic Fourier Beads Method for Quantitative Studies of Reaction Path Ensembles

Ilja V. Khavrutskii§$a) and Charles L. Brooks III§a)

§ The Scripps Research Institute, Department of Molecular Biology, TPC6, 10550 North Torrey Pines Road, La Jolla, California 92037

$ Present address: Howard Hughes Medical Institute, Center for Theoretical Biological Physics, Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, California 92093-0365

a) E-mail: [email protected]; [email protected]

Abstract

We present a crucial improvement of the recently published Harmonic Fourier Beads (HFB) method [I. V. Khavrutskii, K. Arora, and C. L. Brooks III, J. Chem. Phys. 125 (17), 174108:1-7 (2006)] for locating minimum free energy transition path ensembles and minimum potential energy paths in molecular systems with rugged energy landscapes. The improvement comes from computing the gradients of either the free energy or the potential energy directly from the harmonic biasing potential. Using the respective energy gradients speeds up path optimization by a factor of approximately 2-2.5 relative to the previous method. Most importantly, the computed gradients allow reconstruction of accurate potential energy and free energy profiles along the paths in multidimensional coordinate space. Thus, the enhanced HFB method greatly expands our capabilities in quantitative studies of rare events associated with processes such as ligand binding, protein folding and enzyme catalysis. The utility of this extension is demonstrated with an application to the conformational isomerization of the alanine dipeptide and early unfolding events in a 20-residue α-helical peptide.

I. Introduction

Chemical and conformational reorganizations of molecules during important biomolecular processes, for example, enzyme catalysis, drug binding to a protein target and protein folding, often require activation energies in excess of the thermally available $k_B T$ value. These rare reorganizations govern the mechanisms of the biomolecular processes. Experiments can provide thermodynamic and/or kinetic information about the corresponding transitions, complemented by structural ensemble information about the involved metastable states of the molecular systems. However, such convoluted descriptions may miss important details of structural and energetic reorganizations involved in barrier crossings between nascent intermediates that are signatures of the underlying mechanisms.1-3 Therefore, the ability to describe transitions between the metastable endpoints in atomic detail and to make quantitative predictions of the energetics involved is essential for elucidating the mechanisms of the important biological processes. Studying these rare reorganizations is difficult both experimentally and computationally.

Nevertheless, advances in computational techniques gradually make it possible to uncover the detailed mechanisms of these important transitions on the atomic level, and therefore bring closer our understanding of and control over the biological machines.4,5 These techniques fall into two main categories: the methods that compute paths on an adiabatic potential energy surface and those that compute paths, or rather path ensembles, on a free energy surface. Methods like the nudged elastic band,6 line-integral,7-11 and the conjugate peak refinement12 compute paths on the potential energy surface and thus lack temperature effects. In contrast, action-based dynamics,13-16 transition path sampling,17,18 and the finite temperature string19-21 methods compute
path ensembles on the free energy surface at a given temperature. The free energy description is important as it governs the dynamics of molecular machines at ambient conditions. However, computing accurate free energy surfaces requires considerable conformational sampling and is therefore challenging.22

Previously, we introduced the HFB method, which provides a simple yet very robust means to compute minimum free energy path ensembles and does not require explicit knowledge of either the free energy or its gradient. To compute the final free energy profile along the optimized path ensemble we devised a simple 2D umbrella sampling procedure. However, because this procedure requires reduction of the multidimensional coordinate space to only two generalized dimensions, it is not generally applicable to large systems with complex paths.

In this paper, we substantially improve the original HFB method by noticing that the umbrella potential employed in the HFB optimization23 provides a straightforward way to compute accurate energy gradients in Cartesian coordinates. In turn, the computed gradients allow us to significantly speed up the path optimization and, most importantly, to reconstruct the corresponding energy profile in the multidimensional coordinate space, thus avoiding the need to reduce the path dimensionality. Therefore, the gradient augmentation extends the applicability of the HFB method to systems of arbitrarily many dimensions.

The paper is organized as follows. We first describe a very simple way to compute the gradients of either the free energy or the potential energy and then show how the corresponding gradients can be used to speed up path optimization and to compute the corresponding energy profiles. We then apply the gradient-augmented HFB method to a three-state conformational transition in the alanine dipeptide, and finally to a few steps of unfolding of the 20-residue α-helical alanine-based peptide.24

II. Theory and Methodology

A. The HFB Method Overview

The HFB method optimizes the Fourier path23,13

$$q_i(\alpha) = q_i(0) + \left(q_i(1) - q_i(0)\right)\alpha + \sum_{n=1}^{P} b_n^i \sin(n\pi\alpha) \qquad (1)$$

that is an analytical function of a progress variable $\alpha \in [0, 1]$, which defines a curve in multidimensional coordinate space between a given reactant ($\alpha = 0$) and product ($\alpha = 1$) configurations. In equation (1) $q_i$ is the $i$th component of the configuration vector $\mathbf{R} = (q_1, \ldots, q_{3N})$, $i = 1, \ldots, 3N$, also called a bead, describing the positions of all $N$ atoms in Cartesian space; $\{b_n^i\}_{n=1,P}$ are the Fourier amplitudes; and $P$ is the series truncation index.

To optimize the path, the HFB method discretizes it into a finite set of beads and then evolves each bead $k$ independently of all the other beads by running either relatively short molecular dynamics simulations or energy optimization with Cartesian harmonic restraints (2):

$$V(\mathbf{R}; \alpha_k^{ref}) = \frac{f}{M_S} \sum_{j=1}^{S} m_j \sum_{i=3j-2}^{3j} \left(q_i - q_i(\alpha_k^{ref})\right)^2 \qquad (2)$$

The umbrella potential (2) confines the absolute positions $q_i$ of only the $S$ atoms comprising the reaction coordinate subspace (RCS) to the $k$th reference configuration $q_i(\alpha_k^{ref})$, leaving the $N-S$ atoms in the spectator coordinate subset (SCS) free. Here, $f$ is the spring force constant; $m_j$ and $M_S$ are the mass of the $j$th atom and the total mass of the atoms in the RCS. Thus, for each reference bead $k$:

$$\mathbf{R}(\alpha_k^{ref}) = \left(q_1(\alpha_k^{ref}), \ldots, q_{3N}(\alpha_k^{ref})\right) \qquad (3)$$

the evolution returns either the averaged bead

$$\langle \mathbf{R} \rangle_{b,k} = \left(\langle q_1 \rangle_{b,k}, \ldots, \langle q_{3N} \rangle_{b,k}\right) \qquad (4)$$

or the corresponding energy minimized bead

$$[\mathbf{R}]_{b,k} = \left([q_1]_{b,k}, \ldots, [q_{3N}]_{b,k}\right). \qquad (5)$$

The subscript $b,k$ indicates that the evolution is performed for the $k$th bead in the presence of the bias (2). Following the Fourier transform of the evolved beads to obtain new sets of the amplitudes,23 redistribution of the beads along the evolved path provides new reference beads.
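The Fourier representation of equation (1) can be illustrated with a short sketch. It is not the CHARMM implementation: it assumes beads spaced uniformly in the progress variable, $\alpha_k = k/(K-1)$, so that the amplitudes follow from a discrete sine transform, and the function names `sine_amplitudes` and `path_point` are hypothetical.

```python
import math

def sine_amplitudes(beads, P):
    """Amplitudes b[n-1][i] of eq. (1) for K uniformly spaced beads.

    beads: list of K configurations (each a list of D coordinates) taken at
    alpha_k = k/(K-1). Assumption of this sketch: P <= K-2, so that the
    discrete sine functions are orthogonal on the bead grid.
    """
    K, D = len(beads), len(beads[0])
    M = K - 1
    if not 1 <= P <= K - 2:
        raise ValueError("need 1 <= P <= K-2")
    q0, q1 = beads[0], beads[-1]
    b = []
    for n in range(1, P + 1):
        row = []
        for i in range(D):
            s = 0.0
            for k in range(1, M):
                alpha = k / M
                # deviation of bead k from the straight line q0 -> q1
                dev = beads[k][i] - (q0[i] + (q1[i] - q0[i]) * alpha)
                s += dev * math.sin(n * math.pi * alpha)
            row.append(2.0 * s / M)
        b.append(row)
    return b

def path_point(alpha, q0, q1, b):
    """Evaluate eq. (1) at a progress variable alpha in [0, 1]."""
    return [
        q0[i] + (q1[i] - q0[i]) * alpha
        + sum(b[n][i] * math.sin((n + 1) * math.pi * alpha) for n in range(len(b)))
        for i in range(len(q0))
    ]
```

Once the amplitudes are known, `path_point` can be called at any set of progress values, which is one simple way to picture how new reference beads are placed along an evolved path.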
This procedure is iterated until convergence, as measured by the cessation of path displacement.

B. Computing the Energy Gradients using Harmonic Restraints

To improve the HFB method we augment it with the energy gradients. To compute the free energy gradient, we follow a simple approximation due to Kästner and Thiel25 that surpasses the previously used stiff-spring approximation22,25,26 and also naturally suits the HFB method. In particular, for a biasing potential of the form equivalent to (2),

$$V(x) = k_v (x - x_v)^2, \qquad (6)$$

where $k_v$ is the force constant associated with the harmonic restraint of a particular bead (or restraint window) and $x_v$ is the equilibrium position of the restraint, these authors arrive at the following estimate of the unbiased free energy gradient25

$$\frac{\partial W_u(x)}{\partial x} \approx \frac{\partial \tilde{W}_u(x)}{\partial x} = k_B T \, \frac{x - \langle x \rangle_{x,b}}{\sigma_b^2} - 2 k_v (x - x_v). \qquad (7)$$

Here $W_u(x)$ is the corresponding unbiased potential of mean force (PMF) for the variable $x$, $k_B$ is the Boltzmann constant, $T$ is the simulation temperature, and the tilde indicates an approximate quantity; $\langle x \rangle_{x,b}$ and $\sigma_b^2$ are the corresponding mean and variance, respectively, of an assumed Gaussian distribution of the coordinate. To compute optimal estimates of the quantities in equation (7), these authors used histograms of the $x$ coordinate.25 To avoid histograms, which are impractical with high dimensional reaction coordinates,23 we further simplify their result by substituting $x = \langle x \rangle_{x,b}$ into equation (7), which then reduces to:

$$\left.\frac{\partial W_u(x)}{\partial x}\right|_{x=\langle x \rangle_{x,b}} \approx \left.\frac{\partial \tilde{W}_u(x)}{\partial x}\right|_{x=\langle x \rangle_{x,b}} = -2 k_v \left(\langle x \rangle_{x,b} - x_v\right). \qquad (8)$$

For the justification that this simplification is optimal refer to the Appendix. Furthermore, it is trivial to demonstrate that the gradient of the potential energy $U(x)$ can also be derived from the harmonic restraint (6):

$$\left.\frac{\partial U(x)}{\partial x}\right|_{x=[x]_b} = -\left.\frac{\partial V(x)}{\partial x}\right|_{x=[x]_b} = -2 k_v \left([x]_b - x_v\right). \qquad (9)$$

Equation (9) holds exactly at the equilibrium position $[x]_b$ (i.e., the local minimum of the biased potential energy surface).
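As a minimal sketch of equations (8) and (9), the gradient at an evolved bead needs only the bead coordinates, the reference coordinates and the force constant. The helper names below are hypothetical, and the mass weighting of restraint (2) is omitted for brevity (a single force constant `k_v` is assumed for all coordinates).

```python
def mean_position(samples):
    """Average a list of sampled configurations coordinate by coordinate."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]

def restraint_gradient(evolved, reference, k_v):
    """Gradient estimate -2 k_v (x_evolved - x_v) of eq. (8) or (9).

    evolved: averaged bead (free energy case) or minimized bead (potential
    energy case); reference: restraint centre x_v; k_v: force constant.
    """
    return [-2.0 * k_v * (e - r) for e, r in zip(evolved, reference)]

# Consistency check on a quadratic model U(x) = a (x - x0)^2, for which the
# minimum of U + V is known in closed form.
a, x0, k_v, x_v = 1.5, 1.0, 10.0, 0.2
x_b = (a * x0 + k_v * x_v) / (a + k_v)       # minimum of the biased surface
exact = 2.0 * a * (x_b - x0)                 # dU/dx evaluated at x_b
estimate = restraint_gradient([x_b], [x_v], k_v)[0]
assert abs(exact - estimate) < 1e-12
```

For this quadratic model the estimate coincides with the exact derivative, which mirrors the statement that equation (9) is exact at the biased minimum.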
Thus, in the context of the HFB method, equations (8) and (9) provide the multidimensional Cartesian energy gradients at the evolved beads on the fly. In what follows, to save space, we only demonstrate the use of the gradients of the free energy, noting that all of the same methodology directly applies to the gradients of the potential energy.

C. Gradient Directed HFB Optimization

With the energy gradients at hand we can now significantly improve the convergence rate of the path optimization compared to that in the original HFB implementation. Specifically, we can step some distance away from the evolved beads along either the full energy gradients or their components orthogonal to the path. The former is beneficial in cases where the endpoints are allowed to evolve, or if the density of beads is small.

To achieve the maximum accuracy in computing the corresponding energy gradients, we substitute the "as is" coordinates of the original reference and of the corresponding evolved beads in the analogues of equation (8) or (9) for the restraint (2):

$$\nabla \tilde{W}_u\left(\langle \mathbf{R} \rangle_{b,k}\right) = \left(\frac{\partial \tilde{W}_u\left(\langle \mathbf{R} \rangle_{b,k}\right)}{\partial \langle q_1 \rangle_{b,k}}, \ldots, \frac{\partial \tilde{W}_u\left(\langle \mathbf{R} \rangle_{b,k}\right)}{\partial \langle q_{3S} \rangle_{b,k}}\right). \qquad (10)$$

To accurately compute the component of the force orthogonal to the evolved path, we first analytically compute its tangent vector following the Fourier transform of the evolved beads:

$$\vec{n}(\alpha) = \left(q_1'(\alpha), \ldots, q_{3S}'(\alpha)\right), \qquad (11)$$

and then utilize standard projection techniques:27

$$\nabla_{\perp} \tilde{W}_u\left(\langle \mathbf{R} \rangle_{b,k}\right) = \nabla \tilde{W}_u\left(\langle \mathbf{R} \rangle_{b,k}\right) - \frac{\vec{n}(\alpha_k) \cdot \nabla \tilde{W}_u\left(\langle \mathbf{R} \rangle_{b,k}\right)}{\vec{n}(\alpha_k) \cdot \vec{n}(\alpha_k)} \, \vec{n}(\alpha_k). \qquad (12)$$

To step along the estimates of the mean force further away from the evolved beads, we use a steepest descent (SD) like approach. Note, however, that because we apply the step to the evolved bead and not the reference bead, this procedure corresponds to an enhanced SD method. Thus the enhanced SD optimization step using either the full or the orthogonal component of the gradient is as follows:

$$\mathbf{R}_k^{SD} = \langle \mathbf{R} \rangle_{b,k} - \gamma_k \nabla \tilde{W}_u\left(\langle \mathbf{R} \rangle_{b,k}\right) \qquad (13)$$

or

$$\mathbf{R}_{\perp k}^{SD} = \langle \mathbf{R} \rangle_{b,k} - \gamma_k \nabla_{\perp} \tilde{W}_u\left(\langle \mathbf{R} \rangle_{b,k}\right). \qquad (14)$$

Here $\gamma_k$ is the parameter that controls the step size for the $k$th bead and the superscript SD indicates that the configuration corresponds to the bead generated with the enhanced SD step. In the present paper we use a uniform step size parameter $\gamma$ for all the beads.

The SD step provides substantially more evolved beads in comparison to the original HFB implementation, where we used just the evolved beads to generate the next path.23 From these new enhanced SD beads we generate the next path and the corresponding set of reference beads in exactly the same way as from the evolved beads in the original HFB implementation.23 Notably, the HFB optimization provides a very useful general approach to the tough problem of optimizing saddle points or transition states28 because the harmonic biasing potential renders the modified energy surfaces along the path strongly convex, even in the vicinity of transition states.

Finally, we would like to note that within the Finite Temperature String (FTS) method there has been some recent effort to move away from constraints and use restraints. However, these authors use a rather poor stiff-spring approximation to compute the mean forces. Furthermore, they choose complex sets of reaction coordinates that, unlike the Cartesian coordinates used in our HFB method, require additional computations of complex tensors and Jacobians and are not easily extendable to arbitrarily many coordinates.29 We provide a straightforward comparison of the performance of the FTS and HFB methods in Results and Discussion.

D. Computing the Energy Profiles along the Fourier Path

Importantly, with the help of the free energy gradients, we can now compute accurate estimates of the PMF along the Fourier path in multidimensional RCS, which was not possible with the original HFB method.
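One way to picture how gradients at the beads yield an energy profile is to accumulate the reversible work bead by bead. The sketch below is a simplification under stated assumptions: it uses a plain trapezoidal rule over discrete beads rather than a global Fourier interpolation of path and forces, and `profile_along_path` is a hypothetical helper name.

```python
def profile_along_path(beads, gradients):
    """Cumulative reversible work W[k] along a discrete path.

    beads[k] and gradients[k] are lists of equal length holding the RCS
    coordinates of bead k and the energy gradient estimated at that bead.
    Each segment contributes 0.5 * (g_k + g_{k-1}) . (R_k - R_{k-1}).
    """
    W = [0.0]
    for k in range(1, len(beads)):
        step = 0.0
        for i in range(len(beads[k])):
            dq = beads[k][i] - beads[k - 1][i]
            step += 0.5 * (gradients[k][i] + gradients[k - 1][i]) * dq
        W.append(W[-1] + step)
    return W

# Illustration on a model surface W(x, y) = x^2 + 2 y^2 along the curve (t, t^2).
ts = [k / 200 for k in range(201)]
model_beads = [[t, t * t] for t in ts]
model_grads = [[2.0 * x, 4.0 * y] for x, y in model_beads]
model_profile = profile_along_path(model_beads, model_grads)
```

The profile is defined relative to the first bead, so `model_profile[0]` is zero and the last entry approximates the energy difference between the path endpoints.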
To this end we find it optimal to perform a Fourier transform of the computed forces in exactly the same way as for the corresponding evolved beads. With the continuous Fourier representation of the forces and of the path tangent we can now trivially compute the corresponding reversible work along the path threading the evolved beads as the generalized line integral of the second kind in the RCS

$$\tilde{W}_u(\alpha) = \sum_{i=1}^{3S} \int_0^{\alpha} \left(\frac{\partial \tilde{W}_u(\alpha')}{\partial q_i} \, q_i'(\alpha')\right) d\alpha'. \qquad (15)$$

This procedure differs from a previously proposed method30 in that the global Fourier interpolation of both the path and the forces is performed prior to integration, thus providing the energy profile as an analytical function of the progress variable. Note that the analytical form of the energy profile and that of the corresponding path makes pinpointing the energy extrema with their accurate RCS coordinates particularly simple. When using the potential energy gradient as opposed to the free energy gradient this procedure should give exact potential energy profiles and thus provides a perfect opportunity for benchmarking the enhanced HFB method.

III. Computational Details

The gradient augmented HFB method was implemented into the c34a1 version of the CHARMM program under the TREK module.31 Langevin dynamics (LD) was employed with the leap-frog integrator using a 2 fs time step at T = 298 K and with a friction coefficient of 10 ps⁻¹ for all heavy atoms. All bonds involving hydrogen atoms were constrained using SHAKE32-34 with a tolerance of 10⁻⁹ Å. For the alanine dipeptide we employed the CHARMM2235 all-atom force field without CMAP in the gas phase, whereas for the 20-residue α-helical peptide we employed the CHARMM1936 united-atom force field with the GBORn37 implicit water model.35

For the alanine dipeptide the RCS/SCS partitioning was the same as before.23 Both electrostatic and vdW interactions used a 21 Å non-bonded list cutoff and were truncated with switching functions over the range from 16 Å to 18 Å.
An initial path was generated with linear interpolation between the C5 and C7ax conformations in the full Cartesian coordinate space. The number of beads K was set to 32 and the Fourier series truncation index P was set to one of the following values: 18, 24 or 32.

For the 20-residue α-helical peptide (A4K)3(A4Y),24 the RCS comprised 60 (out of 152 total) atoms of the C, Cα and N type that are essential to define backbone dihedral angles. To speed up energy calculations the non-bonded interactions employed a 16 Å list cutoff, truncating the interactions with the switching functions over the 12 Å to 13 Å range. With such a short cutoff the three lysine sidechains effectively do not feel each other's electrostatic field. The initial path was generated with a series of 64 temperature jumps of 10 K each starting at 298.0 K to induce unfolding. During each constant temperature run the RCS atoms were restrained to the configuration averaged over the preceding temperature run of 25,000 steps with a force constant of 0.005 kcal/mol. During the dynamics the coordinates were saved every 50 LD steps. The averaged structures at each temperature (total of 64) provided the initial reference path for the free energy path optimization at 298.0 K. The number of beads used K was 64 and the truncation number P varied between 24 and 42.

A. HFB Path Optimization

For the alanine dipeptide, the HFB evolution dynamics employed an equilibration run of 20 ps and a production run of 40 ps. Note the much shorter run lengths than used in the original HFB paper.23 During the equilibration, the system was heated to the final temperature using velocity reassignment. The production employed exclusively a Langevin thermostat with T = 298 K. Averaging RCS coordinates over the production trajectory yields the evolved beads. The production restart files were saved to initiate dynamics for the next HFB step.
Ten regular HFB optimization steps were performed before turning the SD option on. The SD option used different step size parameters for different force constants. In particular, for the force constants 5.0, 10.0 and 20.0 kcal/mol·Å⁻² the corresponding step size parameters were 5.0×10⁻⁵, 1.25×10⁻⁵ and 3.125×10⁻⁶ Å²·(kcal/mol)⁻¹ unless noted otherwise. However, in the case of the force constant of 20.0 kcal/mol·Å⁻² the step size parameter was set to 6.25×10⁻⁶ Å²·(kcal/mol)⁻¹ for the 25 SD steps starting at step 10, and then was reduced to 3.125×10⁻⁶ Å²·(kcal/mol)⁻¹ due to development of an instability in the path optimization.

Path optimization toward the minimum potential energy valley employed the minimization-based evolution step23 with the same force constants and the corresponding step sizes as in the LD-based evolution. In this case the SD option was turned on from the first step.

The path convergence was monitored by computing the root-mean-square (RMS) of the pair-wise root-mean-square deviations (RMSDs) between the corresponding reference beads in the newly evolved and the chosen comparison path in the RCS. We note that a useful way to monitor path convergence is to follow the 2D-RMSD path projection23 that can be easily visualized. For the purposes of this paper optimization was stopped after 300 cycles. Convergence of all the paths was attained within 200 steps, as is seen from the RMS curves leveling off in Figure 1. Given the residual noise in the free energy paths (Figure 1b), leveling off of the RMS curves might be used as a path convergence criterion.

For the 20-residue α-helical peptide we only performed free energy path ensemble optimization to further prove the concept. Initially, the very coarse 64 bead path from the temperature induced unfolding simulations was optimized for 1X20 steps, at the end using the force constant 2.5 kcal/mol·Å⁻² and step size parameter 5.0×10⁻⁵ Å²·(kcal/mol)⁻¹ with P
= 42. Then the number of beads was expanded to 1024 and the first 64 beads were chosen for further optimization (which corresponds to the first four beads in the coarse path). Additional [,500 HFB optimization steps with the final force constant of 10.0 kcal/mol·Å⁻² and gamma of 1.25×10⁻⁵ Å²·(kcal/mol)⁻¹ using P = 48, and between 10,000 and 125,000 LD steps per evolution were performed until a satisfactory convergence. In this refined optimization, the endpoints of the α-helical peptide were fixed to allow reconstruction of the PMF along the full unfolding path.

B. HFB Energy Profiles

The RCS Free Energy Profile

The free energy profiles were computed using the HFB method with equation (15). The data collection procedure for the energy profile reconstruction is identical to the HFB evolution step, except that it uses longer production runs and evolves around the final reference path.

To achieve sufficient precision in the computed free energy gradient for the alanine dipeptide we used a 4 ns long production LD run. We also assessed the effect of the force constant and the Fourier series truncation index P on the quality of the free energy profiles, by comparing results with three different force constants, namely 5.0, 10.0 and 20.0 kcal/mol·Å⁻², with truncation indices P of 24 and 32. Starting from the paths that correspond to step 200 of SD HFB optimization, we performed 10 HFB optimization-collection steps (generating a new reference path after each long collection run) with P = 24, followed by ten steps with P = 32 for each force constant. Based on the ten consecutive HFB optimization-collection cycles we computed the mean of the relative energies for all of the identified extrema and the corresponding standard deviations for each force constant and each truncation index P.

Because in the α-helical peptide the SCS space contains K and Y amino acids with
relatively large sidechains, much longer LD collection runs were necessary to average the SCS completely and converge the PMF. Using the same force constant as in the final optimization, we ran a total of 21 collection runs, each 8 ns long, for the same final reference path. In the end all the data were combined to give 168 ns long averages yielding the final perfectly converged PMF (computed using P = 62).

Free Energy Profile in a 2D RMSD Space

For the alanine dipeptide, a comparison benchmark for the HFB free energy profile was constructed using 2D umbrella sampling following a protocol similar to that described previously.23 As the reference we used the path optimized with P = 24 and force constant of 10.0 kcal/mol·Å⁻². The number of beads was doubled from 32 to 64, with each bead defining a single sampling window. For each window a 20 ns LD run was performed. Three different force constants were used with the best-fit RMSD restraints, namely 5.0, 10.0 and 20.0 kcal/mol·Å⁻². The data from all three sets of simulations were combined and converted into the corresponding 2D free energy profile using the weighted histogram analysis method (WHAM) for better statistics.38,39

IV. Results and Discussion

A. The alanine dipeptide

To demonstrate the utility of the gradient augmented HFB method, we examined a three-state conformational transition of the alanine dipeptide in gas phase that connects C5 at (-151.4°, 170.6°) (E = 0.9 kcal/mol) and C7ax at (69.7°, -67.6°) (E = 2.1 kcal/mol) via an intermediate C7eq at (-81.4°, 70.5°) (E = 0.0 kcal/mol). In contrast, the previous HFB study concerned a two-state transition between the C7eq and C7ax minima.23 The efficiency of the path optimization with the gradient augmented HFB method is compared with the earlier approach for various force constants, step size parameters (γ) and series truncation index (P). We gauge the accuracy of the HFB potential energy profiles against the exact energies obtained using standard optimization techniques. On the other hand, the accuracy of the HFB free energy profiles is evaluated by comparison against the free energies computed with the 2D umbrella sampling procedure.23

A. 1. Efficiency of the gradient augmented HFB path optimization

The progress of the path optimization was monitored using benchmark reference paths passing through the corresponding minimum energy valleys. Specifically, we used the best paths obtained during potential energy and free energy HFB optimizations with the force constant 10.0 kcal/mol·Å⁻² and P = 24 as the benchmark paths. The RMS similarity measure between current and benchmark paths described in the Methods section and plotted in Figure 1 was used to monitor the optimization progress.

Figures 1(a, b) depict the progress of the HFB optimization toward the minimum potential energy path and the minimum free energy transition path ensemble, respectively, both with and without the enhanced SD step. In all cases a significant speed up (up to 2-2.5 times) of the optimization is observed with the enhanced SD step relative to the original HFB method. As expected, the optimization rate depends on the force constant, with larger force constants requiring more optimization steps.

The step size parameters used in this study are near optimal, as increasing the step size parameters twofold renders the HFB optimization unstable and quickly degrades the path. Although developing the instability with larger step sizes can be considered a disadvantage, it is useful for determining the optimal step size parameters. Further research is necessary to explore other optimization strategies based on the free energy gradient approximation to gain additional increases in optimization efficiency.
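For reference, the enhanced SD step of equations (12) to (14), whose step size is discussed above, can be sketched in a few lines. This is an illustration under stated assumptions rather than the CHARMM implementation: the function names are hypothetical, the mass weighting of restraint (2) is omitted, and the step is taken down the estimated gradient, that is, along the mean force.

```python
def orthogonal_component(grad, tangent):
    """Eq. (12): remove from grad its projection on the path tangent."""
    nn = sum(t * t for t in tangent)
    if nn == 0.0:
        raise ValueError("tangent must be non-zero")
    scale = sum(g * t for g, t in zip(grad, tangent)) / nn
    return [g - scale * t for g, t in zip(grad, tangent)]

def enhanced_sd_step(evolved, reference, k_v, gamma, tangent=None):
    """Step an evolved bead further along the mean force.

    The gradient is estimated as -2 k_v (evolved - reference), cf. eq. (8).
    With a tangent supplied, only the orthogonal component is used.
    """
    grad = [-2.0 * k_v * (e - r) for e, r in zip(evolved, reference)]
    if tangent is not None:
        grad = orthogonal_component(grad, tangent)
    return [e - gamma * g for e, g in zip(evolved, grad)]

# Full-gradient step versus a step restricted to the direction
# orthogonal to a path tangent pointing along the first coordinate.
full_step = enhanced_sd_step([1.0, 0.5], [0.0, 0.0], k_v=5.0, gamma=0.01)
orth_step = enhanced_sd_step([1.0, 0.5], [0.0, 0.0], k_v=5.0, gamma=0.01,
                             tangent=[1.0, 0.0])
```

Because the gradient estimate scales with the force constant, the displacement produced by a given `gamma` grows with `k_v`, which is consistent with smaller step size parameters being paired with larger force constants.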
It might be of interest to compare the computational intensity of the HFB method with that of the recent variant of the FTS method that also employed restraints.29 The FTS method used 20 beads to discretize the two-state path between C7eq and C7ax, compared to 32 beads in our earlier paper (note that we use 32 beads for the much longer three-state transition in this paper). The FTS method used 500,000 MD steps where we only use 30,000 LD steps per bead evolution. Note that these parameters have not been optimized in either FTS or HFB and do not provide a fair comparison of the two methods. However, the comparison of the number of total path optimization steps is quite fair as both methods start from the linear interpolation initial path. Interestingly, it takes the FTS method 100 to 250 steps to converge the path, whereas even with the original HFB method, without the SD step, it takes only "0 steps to converge the path. Needless to say, in our HFB method we never have to compute any of the Jacobians or metric tensors that add additional computational overhead.

A. 2. Accuracy of the HFB energy profiles

The HFB Potential Energy Profile

We tested the effect of the Fourier series truncation index P
on the fidelity of the optimized paths by comparing the corresponding energy profiles to the best benchmark values obtainable. In the case of the minimum potential energy paths, we compared the relative energies computed using an equation similar to (15) with the exact relative energies computed using standard optimization techniques. In particular, starting from the HFB optimized path, we used the conjugate peak refinement method (CPR) to optimize the transition states, and the adapted basis Newton-Raphson (ABNR) method to optimize the local minima. This comparison is the most stringent test of our methodology because the HFB method should, in principle, give the exact relative potential energies.

Tables 1 and 2 summarize the mean relative energies computed over 10 consecutive HFB runs, with corresponding standard deviations given in parentheses. As seen from Table 1, the relative potential energies computed with the HFB method deviate from the exact energies only in the second decimal place, even for paths with P = 18.

The effect of the force constants on the relative potential energies is also quite small. The largest absolute deviation of 0.0D kcal/mol is observed for the highest transition state (8.48 kcal/mol) computed with P = 24 and a force constant of 10.0 kcal/mol·Å⁻². As expected, the relative energy of the transition state is slightly underestimated. The second largest deviation of 0.04 kcal/mol is observed for the local minimum C7ax with P = 18 and a force constant of 10.0 kcal/mol·Å⁻². Increasing the force constant and the truncation index P improves agreement with the exact result.
We conclude that the minimum potential energy paths optimized with the HFB method and the corresponding potential energy profiles are sufficiently accurate, and are fully justified to be used in quantitative studies of adiabatic transitions, including those with QM and QM/MM potentials. A note of caution is in place here: in optimizing adiabatic transition paths, the RCS/SCS partition must be chosen carefully to prevent path discontinuities due to isomerization in the SCS. The safest choice is to include all the unique heavy atoms of the system into the RCS. For additional discussion regarding RCS/SCS partitioning see our previous paper.23

For the minimum free energy transition path ensembles the situation is more complicated, as the exact relative free energies for all the minima and transition states are not known. Therefore, the relative free energies computed using equation (15) were compared against the relative free energies computed using the umbrella sampling procedure along the path projection in two generalized dimensions.23

Free Energy from 2D Umbrella Sampling

As mentioned in the Methods section, in contrast to the original HFB method, here we employ Langevin dynamics to collect data with the two simultaneous RMSD restraints.23 The corresponding 2D free energy profiles are depicted in Figure 2. Notably, substituting C5 (current reactant) for C7eq (former reactant) significantly changes the coordinate system. Unfortunately, because of an overlap with more energetically favorable configurations, the change in the coordinate system hides the real transition state, formerly well resolved,23 and creates an illusion of a transition state with a much lower free energy than anticipated. Thus it is not surprising that the HFB path does not pass through the artefactual transition state. Caution must be exercised while interpreting free energy surfaces obtained in a few generalized dimensions.
Similar observations have been made previously; in particular, apparent transition states from the reduced dimensionality free energy surfaces commonly used in studies of small peptides do not correspond to the actual transition states.40

An additional problem related to computing the 2D free energy profiles is the systematic error due to the combination of the best-fit RMSD restraints with LD. In particular, the restraint forces are computed during dynamics after superposing the restraint reference structure onto the structure from which the dynamics propagator makes its next step. Because LD adds a random force on top of the other forces, the next-step configuration moves further away from the reference coordinate system, and no longer satisfies the best-fit criteria. While computing the best-fit RMSDs during data processing, we are forced to superpose each snapshot onto the corresponding reference to compute the value of the RMSD-based coordinates. This value will always be smaller than the true value during dynamics since the coordinate system is being reset to the reference by the best-fit procedure. The situation could be improved if the structure immediately preceding the recorded snapshot were available.

To demonstrate the effect of the random force on the relative free energies we reran the 2D-RMSD simulations for the previously obtained path between C7eq and C7ax using Langevin dynamics under otherwise identical conditions.23 The corresponding relative free energies for C7ax and the transition state with respect to C7eq are 2.27 and 8.16 kcal/mol, respectively. These should be compared with 2.55 and 8.51 kcal/mol obtained previously with the MD simulations utilizing velocity reassignment23 instead of Langevin dynamics. The relative free energies obtained using the weighted histogram analysis method (WHAM) from the data generated by 2D-RMSD umbrella sampling with Langevin dynamics are provided in Table 2 as the benchmark.
The HFB Free Energy Profile

Because the HFB method uses the averaged restraint forces and because LD adds a small random force on top of the potential energy and the bias forces, computing accurate free energy profiles requires longer simulation times than the routine HFB evolution to ensure that the random forces average to zero. In practice, we find that the best way to compute the minimum free energy profiles along the path ensembles is to run multiple simulations with different initial conditions for the same reference Fourier path. The only other requirement is that the corresponding trajectories have the same number of snapshots with the same statistical weight, so that their averages can be combined into one cumulative average structure that can then be used in place of the evolved bead to compute a much more accurate free energy profile using equation (15).

In the case of the alanine dipeptide the free energy profiles are summarized in Table 2. The relative free energies for all the points along the path are given relative to C7eq. It is clear from Table 2 that the relative free energies computed with the HFB method have nonzero standard deviations, with the largest deviation being 0.11 kcal/mol. As expected, the force constant has a small effect on the relative free energies, with higher force constants giving higher relative free energies for the transition states. The truncation parameter P also has a small effect on the relative free energies, with higher P giving overall slightly higher relative energies.

Comparison of the relative free energies computed with the HFB method with those from the 2D-RMSD umbrella sampling/WHAM is quite favorable, and suggests that the accuracy of the HFB method with the approximate free energy gradient is sufficiently high. We feel that the HFB method could be used to provide benchmark calculations in the future.
!"#$%&#!'%&(i*

For the 20-residue α-helical protein, we first optimized the coarse 64-bead unfolding path and then refined only the path between the first 4 beads, keeping the endpoint beads fixed. The final free energy profile was reconstructed from 21 collection steps, each 8 ns long (see Figure 3). Figure 3 shows three cumulative profiles at 152, 160 and 168 ns that are practically identical, indicating near-perfect convergence. As seen in Figure 3, the PMF has four minimum free energy basins. Inspecting the corresponding path trajectory, we can relate conformational changes to the observed free energy changes. Over the range of the whole transformation, four backbone hydrogen bonds are broken. Note that the total number of hydrogen bonds for an ideal helix of this size is 16; however, one of the bonds (A16 to Y20) has already been broken in the reactant configuration. The next hydrogen bond to break is between residues A1 and K5. Next, the bond between K15 and A19 breaks, immediately followed by the breaking of the bond between A2 and A6. Concluding the transformation is the breaking of the bond between A14 and A18. Some C- and/or N-terminal rearrangements often immediately follow the H-bond breaking event. Clearly, the protein folding/unfolding landscape at 298 K is quite rough, with the highest barrier of 5.4 kcal/mol for folding and about 10 kcal/mol for unfolding. Interestingly, breaking the first hydrogen bond of the helix leads to a substate with a comparable free energy. Breaking additional hydrogen bonds is thermodynamically unfavorable.

V. Concluding Remarks

The present paper crucially enhances the original HFB method23 by incorporating energy gradients into the calculations. The energy gradients computed on the fly enable the HFB method to perform more efficient path optimizations. Most significantly, the gradient-augmented HFB method can now reconstruct accurate energy profiles along the path ensemble in multidimensional reaction coordinate spaces, which was not possible before.
This alleviates the need to use the 2D RMSD-based umbrella sampling procedure, which can be problematic in certain cases. Importantly, both minimum free energy transition path ensembles and the adiabatic potential energy paths are given analytically, along with their energy profiles, by the HFB method. Thus, the enhanced HFB method can now provide complete information about the reaction path.

Appendix

Consider the quantity
\[ \left\langle \frac{\partial U(x)}{\partial x} \right\rangle_b + \left\langle \frac{\partial V(x)}{\partial x} \right\rangle_b = \frac{\int \left[ \frac{\partial U(x)}{\partial x} + \frac{\partial V(x)}{\partial x} \right] e^{-\beta [U(x) + V(x)]} \, dx}{\int e^{-\beta [U(x) + V(x)]} \, dx} \]
(A1)
Importantly, this quantity should identically equal zero at the equilibrium or stationary state, to ensure that the configuration averaged over the biased trajectory is stationary. Using the standard definition of the potential of mean force41 it can be shown that the unbiased free energy enters an isomorphous equation:
\[ \left\langle \frac{\partial W_u(x)}{\partial x} \right\rangle_b + \left\langle \frac{\partial V(x)}{\partial x} \right\rangle_b = \frac{\int \left[ \frac{\partial W_u(x)}{\partial x} + \frac{\partial V(x)}{\partial x} \right] e^{-\beta [W_u(x) + V(x)]} \, dx}{\int e^{-\beta [W_u(x) + V(x)]} \, dx} \]
(A2)
Suppose that the PMF of the unrestrained ensemble at the point $x$ can be approximated by a harmonic function centered at another point, say, $x_u$, and with its own force constant $k_u$,
\[ W_u(x) \approx k_u (x - x_u)^2 + C \equiv \tilde{W}_u(x). \]
(A3)
The values $x_u$ and $k_u$ are unknown; $C$ is an unknown offset of the unbiased PMF that relates the PMF to the biasing potential; and the tilde indicates the approximate PMF. Even though we have just introduced three unknown constants, they do not enter the final result. Thus, the mean force is approximated as a linear function, derived from the corresponding quadratic potential:
\[ \frac{\partial W_u(x)}{\partial x} \approx \frac{\partial \tilde{W}_u(x)}{\partial x} = 2 k_u (x - x_u). \]
(A4)
Substituting (A3) and (A4) into (A2) and taking the corresponding Gaussian integrals analytically, one obtains a simple approximation for the mean force:
\[ \left. \frac{\partial \tilde{W}_u(x)}{\partial x} \right|_{x = \langle x \rangle_b} \approx \left\langle \frac{\partial W_u(x)}{\partial x} \right\rangle_b = -\left\langle \frac{\partial V(x)}{\partial x} \right\rangle_b = -\left. \frac{\partial V(x)}{\partial x} \right|_{x = \langle x \rangle_b} = -2 k_v \left( \langle x \rangle_b - x_v \right). \]
(A5)
This equation focuses all the information collected by the biased simulation at one particular point in configuration space, namely $x = \langle x \rangle_b$, and therefore does not require the use of histograms.

Acknowledgement

We would like to acknowledge the NIH for support of this work (GM48807). The authors are grateful to Dr. Ayori Mitsutake and Dr. In-Ho Lee for their suggestions and for proofreading the manuscript, and to Prof. Sheena Ratford for suggesting the α-helical peptide for this study.

References

1 S. Williams, T. P. Causgrove, R. Gilmanshin, K. S. Fang, R. H. Callender, W. H. Woodruff, and R. B. Dyer, Biochemistry 35 (3), 691 (1996).
2 K. A. Dill and S. Chan, Nature (London) 4 (1), 10 (1997).
3 A. N. Naganathan, U. Doshi, A. Fung, M. Sadqi, and V. Munoz, Biochemistry 45 (28), 8466 (2006).
4 M. Karplus and J. Kuriyan, Proc. Natl. Acad. Sci. USA 102, 6679 (2005).
5 T. Schlick, Molecular Modeling and Simulation: An Interdisciplinary Guide (Springer-Verlag, New York, 2002).
6 G. Henkelman and H. Jonsson, J. Chem. Phys. 113 (22), 9978 (2000).
7 R. Elber and M. Karplus, Chem. Phys. Lett. 139 (5), 375 (1987).
8 S. Huo and J. E. Straub, J. Chem. Phys. 107 (13), 5000 (1997).
9 S. Huo and J. E. Straub, Proteins 36 (2), 249 (1999).
10 G. Domotor, L. Stacho, and M. I. Ban, J. Mol. Struct. (Theochem) 501-502, 509 (2000).
11 I. V. Khavrutskii, R. H. Byrd, and C. L. Brooks III, J. Chem. Phys. 124 (19), 194903:1 (2006).
12 S. Fischer and M. Karplus, Chem. Phys. Lett. 194 (3), 252 (1992).
13 A. E. Cho, J. D. Doll, and D. L. Freeman, Chem. Phys. Lett. 229 (3), 218 (1994).
14 R. Elber, Curr. Opin. Struct. Biol. 15, 151 (2005).
15 K. Arora and T. Schlick, Chem. Phys. Lett. 378, 1 (2003).
16 R. Elber, A. Cárdenas, A. Ghosh, and H. Stern, Adv. Chem. Phys. 126, 93 (2003).
17 P. G. Bolhuis, D. Chandler, C. Dellago, and P. L. Geissler, Annu. Rev. Phys. Chem. 53, 291 (2002).
18 R. Radhakrishnan and T. Schlick, Proc. Natl. Acad. Sci. USA 101, 5970 (2004).
19 E. Weinan, W. Ren, and E. Vanden-Eijnden, Phys. Rev. B 66, 052301:1 (2002).
20 B. Peters, A. Heyden, A. T. Bell, and A. Chakraborty, J. Chem. Phys. 120 (17), 7877 (2004).
21 W. Ren, Comm. Math. Sci. 1 (2), 377 (2003).
22 W. F. van Gunsteren, in Computer Simulation of Biomolecular Systems, edited by W. F. van Gunsteren and P. K. Weiner (ESCOM Science Publishers B. V., Leiden, The Netherlands, 1989), Vol. 1, pp. 27.
23 I. V. Khavrutskii, K. Arora, and C. L. Brooks III, J. Chem. Phys. 125 (17), 174108:1 (2006).
24 C.-Y. Huang, Z. Getahun, T. Wang, W. F. DeGrado, and F. Gai, J. Am. Chem. Soc. 123 (48), 12111 (2001).
25 W. Thiel and J. Kastner, J. Chem. Phys. 123 (14), 144104 (2005).
26 J. van Eerden, W. J. Briels, S. Harkema, and D. Feil, Chem. Phys. Lett. 164 (4), 370 (1989).
27 L. N. Trefethen and D. Bau III, in Numerical Linear Algebra (SIAM, Philadelphia, 1997), pp. 41.
28 J. Nocedal and S. Wright, Numerical Optimization (Springer, 2005).
29 L. Maragliano, A. Fischer, E. Vanden-Eijnden, and G. Ciccotti, J. Chem. Phys. 125 (2), 024106:1 (2006).
30 H. L. Woodcock, M. Hodoscek, P. Sherwood, Y. S. Lee, H. F. Schaefer III, and B. R. Brooks, Theor. Chem. Acc. 109 (3), 140 (2003).
31 A. D. MacKerell, Jr., B. R. Brooks, C. L. Brooks III, L. Nilsson, B. Roux, Y. Won, and M. Karplus, in The Encyclopedia of Computational Chemistry, edited by P. v. R. Schleyer, P. R. Schreiner, N. L. Allinger, T. Clark, J. Gasteiger, P. Kollman, and H. F. Schaefer III (John Wiley & Sons, Chichester, 1998).
32 J.-P. Ryckaert, G. Ciccotti, and H. J. C. Berendsen, J. Comp. Phys. 23 (3), 327 (1977).
33 T. Lazaridis, D. J. Tobias, C. L. Brooks III, and M. E. Paulaitis, J. Chem. Phys. 95 (10),
7612 (1991).
34 D. J. Tobias and C. L. Brooks III, J. Chem. Phys. 89 (8), 5115 (1988).
35 A. D. MacKerell, Jr., D. Bashford, M. Bellott, R. L. Dunbrack Jr., J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher III, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, J. Phys. Chem. B 102 (8), 3586 (1998).
36 E. Neria, S. Fischer, and M. Karplus, J. Chem. Phys. 105 (5), 1902 (1996).
37 B. N. Dominy and C. L. Brooks III, J. Phys. Chem. B 103 (18), 3765 (1999).
38 S. Kumar, D. Bouzida, R. H. Swendsen, P. A. Kollman, and J. M. Rosenberg, J. Comp. Chem. 13 (8), 1011 (1992).
39 E. M. Boczko and C. L. Brooks III, J. Phys. Chem. 97 (17), 4509 (1993).
40 P. G. Bolhuis, Biophys. J. 88 (1), 50 (2005).
41 T. P. Straatsma and J. A. McCammon, J. Chem. Phys. 101 (6), 5032 (1994).

Table 1. The relative potential energies (kcal/mol) computed with the HFB method

! f f f C/0@ " # A 1C#C M S A 1C#C M S A 2C#C M S A 2C#C ! A 1D ! A 2E ! A 1D ! A 2E Fxa5+ () C#H23C#CC8 C#H13C#CC8 C#H23C#CC8 C#H13C#CC8 C#H1 !S' 1#5C3C#CC8 1#EH3C#CC8 1#5C3C#CC8 1#EH3C#CC8 1#5C (+%, C#CC3C#CC8 C#CC3C#CC8 C#CC3C#CC8 C#CC3C#CC8 C#CC !S- D#EI3C#CC8 D#E23C#CC8 D#E73C#CC8 D#E53C#CC8 D#ED (+". 2#CH3C#C18 2#CK3C#CC8 2#CD3C#CC8 2#CK3C#CC8 2#C5

The relative energies are averaged over 10 consecutive HFB runs, with the numbers in parentheses representing the corresponding standard deviations. The HFB relative energies are computed using the force of the restraint in the RCS via the reversible work line integral (1K), whereas the exact relative energies are computed with the conjugate peak refinement (CPR) method for the transition states, and the adaptive basis Newton-Raphson method for local minima.

Table 2. The relative free energies (kcal/mol) computed with the HFB method

Conf.   f_RMS = 5.0   f_RMS = 5.0   f_RMS = 10.0  f_RMS = 10.0  f_RMS = 20.0  f_RMS = 20.0  2D-RMSD
        P = 24        P = 32        P = 24        P = 32        P = 24        P = 32        WHAM
C5      0.51(0.04)    0.52(0.04)    0.57(0.05)    0.60(0.07)    0.68(0.09)    0.70(0.08)    0.54
TS1     1.13(0.02)    1.14(0.02)    1.19(0.03)    1.20(0.04)    1.26(0.05)    1.26(0.06)    0.91
C7eq    0.00(0.00)    0.00(0.00)    0.00(0.00)    0.00(0.00)    0.00(0.00)    0.00(0.00)    0.00 [0.00]
TS2     7.68(0.04)    7.76(0.04)    7.96(0.07)    7.98(0.06)    8.06(0.08)    8.10(0.11)    n/a [8.16]
C7ax    2.22(0.04)    2.23(0.03)    2.21(0.07)    2.22(0.06)    2.18(0.04)    2.23(0.09)    2.07 [2.27]

The HFB relative free energies are computed using the averaged force of the restraint in the RCS via the reversible work line integral (15), and averaged over 10 consecutive runs. The numbers in parentheses represent the corresponding standard deviations. The last column provides benchmark values computed using 2D-RMSD umbrella sampling along the corresponding path projection with WHAM between C5 and C7ax, with numbers in square brackets corresponding to the path between C7eq and C7ax (see text for details).

Figure Captions

Figure 1. The HFB path optimization progress with (SD) and without the enhanced optimization step for a) adiabatic transition path (T = 0.0 K), and b) free energy transition path ensemble (T = 298.0 K). The values of the force constant used in the biasing potential are shown in the legend. The Fourier series truncation index P was set at 18, 24 and 32 at HFB steps 1, 100 and 200, respectively. See text for the corresponding step size parameters and further details.

Figure 2. A free energy strip along the minimum free energy path at T = 298.0 K. The line with the circles depicts the centers of the actual windows employed during the umbrella sampling procedure in the generalized 2D-RMSD space and corresponds to the projection of
the HFB derived PMF onto this 2D-RMSD space. The free energy was computed using 2D constant temperature WHAM. The three local minima on the strip are labeled for clarity. See text for further details.

Figure 3. The PMF along the minimum free energy transition path ensemble for unfolding/folding of the α-helical peptide at T = 298.0 K. See text for description of the labels and further details.

[Figure 1: panels a) and b); x-axis: HFB Step; y-axis: Path RMS, Å; legend: force constants 5.0, 10.0 and 20.0, each with and without SD]

[Figure 2: x-axis: RMSD(reactant), Å; y-axis: RMSD(product), Å; color scale: Free Energy, kcal/mol; labeled minima: C5, C7eq, C7ax]

[Figure 3: x-axis: progress variable α; y-axis: PMF, kcal/mol; cumulative profiles at 152 ns, 160 ns and 168 ns; annotations: N-term A1-K5, C-term K15-A19, N-term A2-A6, C/N-term A14-A18]