b - J. Andrew McCammon

Gradient-Augmented Harmonic Fourier Beads Method for Quantitative
Studies of Reaction Path Ensembles
Ilja V. Khavrutskii§$a) and Charles L. Brooks III§a)
§
The Scripps Research Institute
Department of Molecular Biology, TPC6
10550 North Torrey Pines Road
La Jolla, California 92037
$
Present address: Howard Hughes Medical Institute,
Center for Theoretical Biological Physics,
Department of Chemistry and Biochemistry,
University of California at San Diego,
La Jolla, California 92093-0365
a)
E-mail: [email protected]; [email protected]
Abstract
We present a crucial improvement of the recently published Harmonic Fourier Beads method [I.
V. Khavrutskii, K. Arora, and C. L. Brooks III, J. Chem. Phys. 125 (17), 174108:1-7 (2006)] for
locating minimum free energy transition path ensembles and minimum potential energy paths in
molecular systems with rugged energy landscapes. The improvement of the HFB method is due
to computing the gradients of either the free energy or of the potential energy derived from the
harmonic biasing potential. Using the respective energy gradients speeds up path
optimization by a factor of approximately 2-2.5 over the previous method. Most importantly, the computed
corresponding gradients allow reconstruction of accurate energy and free energy profiles along
the paths in multidimensional coordinate space. Thus, the enhanced HFB method greatly
expands our capabilities in quantitative studies of rare events, associated with processes such as
ligand binding, protein folding and enzyme catalysis. The utility of this extension is
demonstrated with an application to the conformational isomerization of the alanine dipeptide
and early unfolding events in the 20-residue-long α-helical peptide.
I. Introduction
Chemical and conformational reorganizations of molecules during important
biomolecular processes, for example, enzyme catalysis, drug binding to a protein target and
protein folding often require activation energies in excess of the thermally available $k_B T$ value.
These rare reorganizations govern the mechanisms of the biomolecular processes. Experiments
can provide thermodynamic and/or kinetic information about the corresponding transitions
complemented by structural ensemble information about the involved metastable states of the
molecular systems. However, such convoluted descriptions may miss important details of
structural and energetic reorganizations involved in barrier crossings between nascent
intermediates that are signatures of the underlying mechanisms.1-3 Therefore, the ability to
describe transitions between the metastable endpoints in atomic detail and to make quantitative
predictions of the energetics involved is essential for elucidating the mechanisms of the
important biological processes. Studying these rare reorganizations is difficult both
experimentally and computationally.
Nevertheless, advances in computational techniques gradually make possible uncovering
the detailed mechanisms of these important transitions on the atomic level, and therefore bring
closer our understanding and control over the biological machines.4,5 These techniques fall into
two main categories, the methods that compute paths on an adiabatic potential energy surface
and those that compute paths, or rather path ensembles, on a free energy surface. Methods like
the nudged elastic band,6 line-integral,7-11 and the conjugate peak refinement12 compute paths on
the potential energy surface and thus lack temperature effects. In contrast, action-based
dynamics,13-16 transition path sampling,17,18 and the finite temperature string19-21 methods compute
path ensembles on the free energy surface at a given temperature. The free energy description is
important as it governs dynamics of molecular machines at ambient conditions. However,
computing accurate free energy surfaces requires considerable conformational sampling, and is,
therefore, challenging.22
Previously, we introduced the HFB method that provides the simplest yet very robust
means to compute the minimum free energy path ensembles, and does not require explicit
knowledge of either the free energy or its gradient. To compute the final free energy profile
along the optimized path ensemble we devised a simple 2D umbrella sampling procedure.
However, because this procedure requires reduction of the multidimensional coordinate space to
only two generalized dimensions, it is not generally applicable to large systems with complex
paths.
In this paper, we substantially improve the original HFB method by noting that the
umbrella potential employed in the HFB optimization23 provides a straightforward way to
compute accurate energy gradients in Cartesian coordinates. In turn, the computed gradients
make it possible to significantly speed up the path optimization and, most importantly, to reconstruct
the corresponding energy profile in the multidimensional coordinate space, thus avoiding the
need to reduce the path dimensionality. Therefore, the gradient augmentation extends the
applicability of the HFB method to systems of arbitrarily many dimensions.
The paper is organized as follows. We first describe a very simple way to compute the
gradients of either the free energy or of the potential energy and then show how the
corresponding gradients can be used to speed up path optimization and to compute the
corresponding energy profiles. We then apply the gradient-augmented HFB method to a three-state
conformational transition in the alanine dipeptide, and finally to a few steps of unfolding of
the 20-residue α-helical alanine-based peptide.24
II. Theory and Methodology
A. The HFB Method Overview
The HFB method optimizes the Fourier path23,13
$$q_i(\alpha) = q_i(0) + \left(q_i(1) - q_i(0)\right)\alpha + \sum_{n=1}^{P} b_n^i \sin(n \pi \alpha),$$
(1)
that is an analytical function of a progress variable $\alpha \in [0;1]$, which defines a curve in
multidimensional coordinate space between a given reactant ($\alpha = 0$) and product ($\alpha = 1$)
configurations. In equation (1) $q_i$ is the ith component of the configuration vector $R = (q_1, \ldots, q_{3N})$,
also called a bead, describing the positions of all N atoms in Cartesian space; $\{b_n^i\}_{n=1,P}$ are the
Fourier amplitudes; and P is the series truncation index.
To optimize the path, the HFB method discretizes it into a finite set of beads and then
evolves each bead k independently of all the other beads by running either relatively short
molecular dynamics simulations or energy optimization with Cartesian harmonic restraints (2):
$$V(R; \alpha_k^{ref}) = k_f \sum_{i=1}^{3S} \frac{m_i}{M_S} \left( q_i - q_i(\alpha_k^{ref}) \right)^2$$
(2)
The umbrella potential (2) confines the absolute positions $q_i$ of only S atoms comprising the
reaction coordinate subspace (RCS) to the k-th reference configuration $q_i(\alpha_k^{ref})$, leaving N-S
atoms in the spectator coordinate subset (SCS) free. Here, $k_f$ is the spring force constant; $m_i$ and
$M_S$ are the mass of the ith atom and the total mass of the atoms in the RCS. Thus, for each
reference bead k:
$$R(\alpha_k^{ref}) = \left( q_1(\alpha_k^{ref}), \ldots, q_{3N}(\alpha_k^{ref}) \right)$$
(3)
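the evolution returns either the averaged bead
$$\langle R \rangle_{b,k} = \left( \langle q_1 \rangle_{b,k}, \ldots, \langle q_{3N} \rangle_{b,k} \right)$$
(4)
or the corresponding energy minimized bead
$$[R]_{b,k} = \left( [q_1]_{b,k}, \ldots, [q_{3N}]_{b,k} \right).$$
(5)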
The subscript b,k indicates that the evolution is performed for the kth bead in the presence of the
bias (2). Following the Fourier transform of the evolved beads to obtain new sets of the
amplitudes,23 redistribution of the beads along the evolved path provides new reference beads.
This procedure is iterated until convergence, as measured by the cessation of path displacement.
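As a minimal sketch of the Fourier step in this cycle, the Python fragment below fits the sine amplitudes of equation (1) to one coordinate of a set of beads and re-evaluates the path. It assumes beads at uniformly spaced progress values with both endpoints included, in which case the amplitudes follow from a discrete sine transform. The function names `fourier_path` and `sine_amplitudes` are illustrative only and are not taken from the CHARMM implementation, which may treat bead spacing and redistribution differently.

```python
import math


def fourier_path(alpha, q_start, q_end, amplitudes):
    """Evaluate one coordinate of the Fourier path at progress value alpha."""
    linear = q_start + (q_end - q_start) * alpha
    sines = sum(b * math.sin(n * math.pi * alpha)
                for n, b in enumerate(amplitudes, start=1))
    return linear + sines


def sine_amplitudes(beads, n_terms):
    """Fit amplitudes b_1..b_P to beads placed at alpha_k = k / (K - 1).

    Assumes uniform spacing with both endpoints included, so the
    deviation from the straight line is recovered by a discrete sine
    transform (exact when n_terms <= K - 2).
    """
    m = len(beads) - 1  # number of intervals between beads
    if not 1 <= n_terms <= m - 1:
        raise ValueError("need 1 <= n_terms <= K - 2")
    q_start, q_end = beads[0], beads[-1]
    # Deviation of each interior bead from the linear interpolation.
    dev = [beads[k] - (q_start + (q_end - q_start) * k / m)
           for k in range(1, m)]
    return [2.0 / m * sum(d * math.sin(n * math.pi * k / m)
                          for k, d in enumerate(dev, start=1))
            for n in range(1, n_terms + 1)]


# Round trip: sample a known path at 9 beads and recover its amplitudes.
true_b = [0.5, -0.2]
beads = [fourier_path(k / 8, 1.0, 3.0, true_b) for k in range(9)]
fitted = sine_amplitudes(beads, 2)
```

In a full optimization the same fit would be repeated for every Cartesian coordinate in the RCS; bead redistribution along the fitted curve is not shown here.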
B. Computing the Energy Gradients using Harmonic Restraints
To improve the HFB method we augment it with the energy gradients. To compute the
free energy gradient, we follow a simple approximation due to Kästner and Thiel25 that surpasses
the previously used stiff-spring approximation22,25,26 and also naturally suits the HFB method. In
particular, for the biasing potential of the form equivalent to (2)
$$V(x) = k_v (x - x_v)^2,$$
(6)
where kv is the force constant associated with the harmonic restraint of a particular bead (or
restraint window) and xv is the equilibrium position of the restraint, these authors arrive at the
following estimates of the unbiased free energy gradient25
$$\frac{\partial W_u(x)}{\partial x} \approx \frac{\partial \tilde{W}_u(x)}{\partial x} = k_B T \, \frac{x - \langle x \rangle_{x,b}}{\sigma_b^2} - 2 k_v (x - x_v).$$
(7)
Here Wu(x) is the corresponding potential of the unbiased mean force (PMF) for the variable x, kB
is the Boltzmann constant, T is the simulation temperature, and tilde indicates an approximate
quantity; $\langle x \rangle_{x,b}$ and $\sigma_b^2$ are the corresponding mean and variance, respectively, of an assumed
Gaussian distribution of the coordinate. To compute optimal estimates of the quantities in
equation (7), these authors used histograms of the x coordinate.25
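To avoid histograms that are impractical with high dimensional reaction coordinates,23 we
further simplify their result by substituting $x = \langle x \rangle_{x,b}$ into equation (7), which then reduces to:
$$\left. \frac{\partial W_u(x)}{\partial x} \right|_{x = \langle x \rangle_{x,b}} \approx \left. \frac{\partial \tilde{W}_u(x)}{\partial x} \right|_{x = \langle x \rangle_{x,b}} = -2 k_v \left( \langle x \rangle_{x,b} - x_v \right).$$
(8)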
For the justification that this simplification is optimal refer to the Appendix.
Furthermore, it is trivial to demonstrate that the gradient of the potential energy U(x)
could also be derived from the harmonic restraint (6):
$$\left. \frac{\partial U(x)}{\partial x} \right|_{x = [x]_b} = - \left. \frac{\partial V(x)}{\partial x} \right|_{x = [x]_b} = -2 k_v \left( [x]_b - x_v \right).$$
(9)
Equation (9) holds exactly at the equilibrium position [ x ] b (i.e., local minimum of the biased
potential energy surface).
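The estimate in equation (8) can be checked on a one-dimensional model. The sketch below assumes a harmonic model PMF, W(x) = a x^2 / 2, for which the biased distribution is exactly Gaussian, so the restraint-based estimate should reproduce dW/dx at the biased mean. The model, the parameter values and the function names are assumptions made for illustration; for a general PMF the estimate is only approximate.

```python
import math


def biased_mean(free_energy, k_v, x_v, kT, lo=-5.0, hi=5.0, n=20001):
    """Boltzmann average of x under W(x) + k_v (x - x_v)^2 on a uniform grid."""
    step = (hi - lo) / (n - 1)
    num = den = 0.0
    for j in range(n):
        x = lo + j * step
        w = math.exp(-(free_energy(x) + k_v * (x - x_v) ** 2) / kT)
        num += x * w
        den += w
    return num / den


def gradient_estimate(mean_x, k_v, x_v):
    """Equation (8): free energy gradient evaluated at the biased mean."""
    return -2.0 * k_v * (mean_x - x_v)


a = 4.0               # curvature of the assumed model PMF
k_v, x_v = 10.0, 1.0  # restraint force constant and reference position
kT = 0.592            # roughly k_B T in kcal/mol near 298 K
mean_x = biased_mean(lambda x: 0.5 * a * x * x, k_v, x_v, kT)
estimate = gradient_estimate(mean_x, k_v, x_v)
exact = a * mean_x    # dW/dx of the model PMF at the biased mean
```

Because only the mean position of the restrained coordinate enters, no histogram of x is needed, which is the practical advantage claimed for high dimensional reaction coordinates.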
Thus, in the context of the HFB method, equations (8) and (9) provide the
multidimensional Cartesian energy gradients at the evolved beads on the fly. In what follows, we
only demonstrate the use of the gradients of the free energy to save space, noting that all of the
same methodology directly applies to the gradients of the potential energy.
C. Gradient Directed HFB Optimization
With the energy gradients at hand we can now significantly improve the convergence rate
of the path optimization compared to that in the original HFB implementation. Specifically, we
can step some distance away from the evolved beads along either the full energy gradients or
their components orthogonal to the path. The former is beneficial in cases where the endpoints
are allowed to evolve, or if the density of beads is small.
To achieve the maximum accuracy in computing the corresponding energy gradients, we
substitute the “as is” coordinates of the original reference and of the corresponding evolved
beads in the analogues of equation (8) or (9) for the restraint (2).
$$\nabla \tilde{W}_u(\langle R \rangle_{b,k}) = \left( \frac{\partial \tilde{W}_u(\langle R \rangle_{b,k})}{\partial \langle q_1 \rangle_{b,k}}, \ldots, \frac{\partial \tilde{W}_u(\langle R \rangle_{b,k})}{\partial \langle q_{3S} \rangle_{b,k}} \right).$$
(10)
To accurately compute the component of the force orthogonal to the evolved path, we first
analytically compute its tangent vector following the Fourier transform of the evolved beads:
$$\vec{n}(\alpha) = \left( q_1'(\alpha), \ldots, q_{3S}'(\alpha) \right),$$
(11)
and then utilize standard projection techniques:27
$$\nabla^{\perp} \tilde{W}_u(\langle R \rangle_{b,k}) = \nabla \tilde{W}_u(\langle R \rangle_{b,k}) - \frac{\vec{n}(\alpha_k) \cdot \nabla \tilde{W}_u(\langle R \rangle_{b,k})}{\vec{n}(\alpha_k) \cdot \vec{n}(\alpha_k)} \, \vec{n}(\alpha_k).$$
(12)
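The projection in equation (12) amounts to removing from the gradient its component along the path tangent. A minimal sketch in Python, with plain lists standing in for the 3S-dimensional vectors (the function name is illustrative):

```python
def orthogonal_component(gradient, tangent):
    """Remove from `gradient` its projection onto `tangent`, as in equation (12)."""
    tt = sum(t * t for t in tangent)
    if tt == 0.0:
        raise ValueError("tangent vector must be nonzero")
    scale = sum(g * t for g, t in zip(gradient, tangent)) / tt
    return [g - scale * t for g, t in zip(gradient, tangent)]


grad = [1.0, 2.0, 3.0]
tangent = [0.0, 0.0, 2.0]   # does not need to be normalized
perp = orthogonal_component(grad, tangent)
```

Dividing by the squared norm of the tangent means the tangent obtained from the Fourier representation can be used directly, without normalizing it first.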
To step along the estimates of the mean force further away from the evolved beads, we use the
steepest descent (SD) like approach. Note, however, that because we apply the step to the
evolved bead and not the reference bead, this procedure corresponds to an enhanced SD method.
Thus the enhanced SD optimization step using either the full or the orthogonal component of the
gradient is as follows,
$$R_k^{SD} = \langle R \rangle_{b,k} + \gamma_k \nabla \tilde{W}_u(\langle R \rangle_{b,k})$$
(13)
or
$$R_{\perp k}^{SD} = \langle R \rangle_{b,k} + \gamma_k \nabla^{\perp} \tilde{W}_u(\langle R \rangle_{b,k}).$$
(14)
Here $\gamma_k$ is the parameter that controls the step size for the kth bead and the superscript SD
indicates that the configuration corresponds to the bead generated with the enhanced SD step. In
the present paper we use the uniform step size parameter $\gamma$ for all the beads.
The SD step provides substantially more evolved beads in comparison to the original
HFB implementation where we used just the evolved beads to generate the next path.23 From
these new enhanced SD beads we generate the next path and the corresponding set of reference
beads the exact same way as in the original HFB implementation from the evolved beads.23
Notably, the HFB optimization provides a very useful general approach to the difficult
problem of optimizing saddle points or transition states28 because the harmonic biasing potential
renders the modified energy surfaces along the path strongly convex, even in the vicinity of
transition states.
Finally, we would like to note that within the Finite Temperature String method there has
been some recent effort to move away from constraints and use restraints. However, these authors
use a rather poor stiff-spring approximation to compute the mean forces. Furthermore, they
choose complex sets of reaction coordinates that, unlike the Cartesian coordinates used in our
HFB method, require additional computations of complex tensors and Jacobians and are not
easily extendable to arbitrarily many coordinates.29 We provide a straightforward comparison of
the performance of the FTS and HFB methods in Results and Discussion.
D. Computing the Energy profiles along the Fourier Path
Importantly, with the help of the free energy gradients, we can now compute accurate
estimates of the PMF along the Fourier path in multidimensional RCS, which was not possible
with the original HFB method. To this end we find it optimal to perform a Fourier transform of
the computed forces the exact same way as the corresponding evolved beads. With the
continuous Fourier representation of the forces and of the path normal we can now trivially
compute the corresponding reversible work along the path threading the evolved beads as the
generalized line integral of the second kind in the RCS
$$\tilde{W}_u(\alpha) = \sum_{i=1}^{3S} \int_0^{\alpha} \frac{\partial \tilde{W}_u(\alpha')}{\partial q_i} \, \frac{d q_i(\alpha')}{d \alpha'} \, d\alpha'.$$
(15)
This procedure differs from a previously proposed method30 in that the global Fourier
interpolation of both the path and the forces is performed prior to integration, thus providing the
energy profile as an analytical function of the progress variable. Note that the analytical form of
the energy profile and that of the corresponding path renders pinpointing the energy extrema
with their accurate RCS coordinates particularly trivial.
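The line integral of equation (15) can be illustrated numerically. The sketch below is not the analytical Fourier treatment described above: it assumes the gradient and the path tangent are available as callables of the progress variable and integrates their dot product with the trapezoid rule. The two-dimensional model surface W(q1, q2) = q1^2 + 2 q2 and the curved test path are assumptions chosen so that the result can be compared with the known energy difference.

```python
import math


def energy_profile(gradient, tangent, n_points=2001):
    """Cumulative trapezoid integral of gradient(alpha) . tangent(alpha).

    `gradient` and `tangent` are callables returning equal-length lists.
    Returns the progress grid and the profile, with profile[0] = 0.
    """
    h = 1.0 / (n_points - 1)
    alphas = [j * h for j in range(n_points)]
    integrand = [sum(g * t for g, t in zip(gradient(a), tangent(a)))
                 for a in alphas]
    profile = [0.0]
    for j in range(1, n_points):
        profile.append(profile[-1] + 0.5 * h * (integrand[j - 1] + integrand[j]))
    return alphas, profile


# Assumed model surface W(q1, q2) = q1^2 + 2 q2 along an assumed curved path.
def path(a):
    return [a + 0.3 * math.sin(math.pi * a), 2.0 * a]


def tangent(a):
    return [1.0 + 0.3 * math.pi * math.cos(math.pi * a), 2.0]


def gradient(a):
    q1, _ = path(a)
    return [2.0 * q1, 2.0]


alphas, profile = energy_profile(gradient, tangent)
```

For this model the profile at the end of the path should approach W(1, 2) - W(0, 0) = 5, and extrema of the profile can be located directly on the progress grid.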
When using the potential energy gradient as opposed to the free energy gradient this procedure
should give exact potential energy profiles and thus provides a perfect opportunity for
benchmarking the enhanced HFB method.
III. Computational Details
The gradient augmented HFB method was implemented in the c34a1 version of the
CHARMM program under the TREK module.31 Langevin dynamics (LD) was employed with the
leap frog integrator using a 2 fs time step at T = 298 K and with a friction coefficient of 10 ps^-1
for all heavy atoms. All bonds involving hydrogen atoms were constrained using SHAKE32-34
with a tolerance of 10^-9 Å. For the alanine dipeptide we employed CHARMM2235 all-atom force
field without CMAP in the gas phase, whereas for the 20-residue α-helical peptide we employed
CHARMM1936 united-atom force field with the GBORn37 implicit water model.35
For the alanine dipeptide the RCS/SCS partitioning was the same as before.23 Both
electrostatic and vdW interactions used 21 Å non-bonded list cutoff and were truncated with
switching functions over the range from 16 Å to 18 Å. An initial path was generated with linear
interpolation between the C5 and C7ax conformations in the full Cartesian coordinate space. The
number of beads K was set to 32 and the Fourier series truncation index P was set to one of the
following values 18, 24 or 32.
For the 20-residue α-helical peptide (A4K)3(A4Y),24 the RCS comprised 60 (out of 152
total) atoms of the C, Cα and N type that are essential to define backbone dihedral angles. To
speed up energy calculations the non-bonded interactions employed 16 Å list cutoff, truncating
the interactions with the switching functions over 12 Å to 13 Å range. With such short cutoff
the three lysine sidechains effectively do not feel each other's electrostatic field. The initial path
was generated with a series of 64 temperature jumps of 10 K each starting at 298.0 K to induce
unfolding. During each constant temperature run the RCS atoms were restrained to the
configuration averaged over the preceding temperature run of 25,000 steps with a force constant
of 0.005 kcal/mol. During the dynamics the coordinates were saved every 50 LD steps. The
averaged structures at each temperature (total of 64) provided the initial reference path for the
free energy path optimization at 298.0 K. The number of beads used K was 64 and the truncation
number P varied between 24 and 42.
A. HFB Path Optimization
For the alanine dipeptide, the HFB evolution dynamics employed an equilibration run of
20 ps and a production run of 40 ps. These run lengths are much shorter than those used in the original HFB
paper.23 During the equilibration, the system was heated to the final temperature using velocity
reassignment. The production employed exclusively a Langevin thermostat with T = 298 K.
Averaging RCS coordinates over the production trajectory yields the evolved beads. The
production restart files were saved to initiate dynamics for the next HFB step.
Ten regular HFB optimization steps were performed before turning the SD option on.
The SD option used different step size parameters for different force constants. In particular, for
the force constants 5.0, 10.0 and 20.0 kcal/mol·Å^-2 the corresponding step size parameters were
5.0×10^-5, 1.25×10^-5 and 3.125×10^-6 Å^2·(kcal/mol)^-1 unless noted otherwise. However, in the case
of the force constant of 20.0 kcal/mol·Å^-2 the step size parameter was set to 6.25×10^-6
Å^2·(kcal/mol)^-1 for the 25 SD steps starting at step 10, and then was reduced to 3.125×10^-6
Å^2·(kcal/mol)^-1 due to development of an instability in the path optimization.
Path optimization toward the minimum potential energy valley employed the minimization-based
evolution step23 with the same force constants and the corresponding step sizes as in the
LD-based evolution. In this case the SD option was turned on from the first step.
The path convergence was monitored by computing the root-mean-square (RMS) of the
pair-wise root-mean-square deviations (RMSDs) between the corresponding reference beads in
the newly evolved and the chosen comparison path in the RCS. We note that a useful way to
monitor path convergence is to follow the 2D-RMSD path projection23 that can be easily
visualized. For the purposes of this paper optimization was stopped after 300 cycles.
Convergence of all the paths was attained within 200 steps as is seen from the RMS curves
leveling off in Figure 1. Given the residual noise in the free energy paths (Figure 1b), leveling
off of the RMS curves might be used as path convergence criteria.
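A minimal sketch of this convergence measure: for two paths with the same number of beads, compute the RMSD between corresponding beads over the RCS atoms, then take the RMS of those values over all beads. The code assumes absolute Cartesian positions with no superposition and no mass weighting, which may differ from the actual implementation; the data layout and function names are illustrative.

```python
import math


def bead_rmsd(bead_a, bead_b):
    """RMSD between two beads given as lists of (x, y, z) atom positions.

    No superposition or mass weighting is applied (an assumption).
    """
    sq = sum((a - b) ** 2
             for atom_a, atom_b in zip(bead_a, bead_b)
             for a, b in zip(atom_a, atom_b))
    return math.sqrt(sq / len(bead_a))


def path_rms(path_a, path_b):
    """RMS over beads of the bead-to-bead RMSDs between two paths."""
    if len(path_a) != len(path_b):
        raise ValueError("paths must have the same number of beads")
    rmsds = [bead_rmsd(a, b) for a, b in zip(path_a, path_b)]
    return math.sqrt(sum(r * r for r in rmsds) / len(rmsds))


# Two toy paths with two beads of one atom each, displaced by 3 and 4 units.
reference = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]]
current = [[(3.0, 0.0, 0.0)], [(1.0, 4.0, 0.0)]]
rms = path_rms(current, reference)
```

Tracking this single number per optimization step against a fixed comparison path gives a curve whose leveling off can serve as the stopping signal.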
For the 20-residue α-helical peptide we only performed free energy path ensemble
optimization to further prove the concept. Initially, the very coarse 64 bead path from the
temperature induced unfolding simulations was optimized for 1X20 steps, at the end using the
force constant 2.5 kcal/mol·Å^-2 and step size parameter 5.0×10^-5 Å^2·(kcal/mol)^-1 with P = 42.
Then the number of beads was expanded to 1024 and the first 64 beads were chosen for further
optimization (which corresponds to the first four beads in the coarse path). Additional [,500
HFB optimization steps with the final force constant of 10.0 kcal/mol·Å^-2 and gamma of 1.25×10^-5
Å^2·(kcal/mol)^-1 using P = 48, and between 10,000 and 125,000 LD steps per evolution were
performed until a satisfactory convergence. In this refined optimization, the endpoints of the α-helical
peptide were fixed to allow reconstruction of the PMF along the full unfolding path.
B. HFB Energy Profiles
The RCS Free Energy Profile
The free energy profiles were computed using the HFB method with equation (15). The
data collection procedure for the energy profile reconstruction is identical to the HFB evolution
step, except that it uses longer production runs and evolves around the final reference path.
To achieve sufficient precision in the computed free energy gradient for the alanine
dipeptide we used a 4 ns long production LD run. We also assessed the effect of the force constant
and the Fourier series truncation index P on the quality of the free energy profiles, by comparing
results with three different force constants, namely 5.0, 10.0 and 20.0 kcal/mol·Å^-2 with
truncation indices P of 24 and 32. Starting from the paths that correspond to step 200 of SD HFB
optimization, we performed 10 HFB optimization-collection steps (generating a new reference
path after each long collection run) with P = 24, followed by ten steps with P = 32 for each force
constant. Based on the ten consecutive HFB optimization-collection cycles we computed the
mean of the relative energies for all of the identified extrema and the corresponding standard
deviations for each force constant and each truncation index P.
Because in the α-helical peptide the SCS space contains K and Y amino acids with
relatively large sidechains, much longer LD collection runs were necessary to average the SCS
completely and converge the PMF. Using the same force constant as in the final optimization, we
ran a total of 21 collection runs, each 8 ns long, for the same final reference path. In the end all the
data were combined to give 168 ns long averages yielding the final perfectly converged PMF
(computed using P = 62).
Free Energy Profile in a 2D RMSD Space
For the alanine dipeptide, a comparison benchmark for the HFB free energy profile was
constructed using 2D umbrella sampling following a protocol similar to that described
previously.23 As the reference we used the path optimized with P = 24 and force constant of 10.0
kcal/mol·Å^-2. The number of beads was doubled from 32 to 64, with each bead defining a single
sampling window. For each window 20 ns LD run was performed. Three different force
constants were used with the best-fit RMSD restraints, namely 5.0, 10.0 and 20.0 kcal/mol·Å^-2.
The data from all the three sets of simulations were combined and converted into the
corresponding 2D free energy profile using weighted histogram analysis method (WHAM) for
better statistics.38,39
IV. Results and Discussion
A. The alanine dipeptide
To demonstrate the utility of the gradient augmented HFB method, we examined a three-state
conformational transition of the alanine dipeptide in gas phase that connects C5 at (-151.4,
170.6) (E = 0.9 kcal/mol) and C7ax at (69.7, -67.6) (E = 2.1 kcal/mol) via an intermediate C7eq at
(-81.4, 70.5) (E = 0.0 kcal/mol). In contrast, previous HFB study concerned a two-state transition
between the C7eq and C7ax minima.23 The efficiency of the path optimization with the gradient
augmented HFB method is compared with the earlier approach for various force constants ($k_f$),
step size parameters ($\gamma$) and series truncation index (P). We gauge the accuracy of the HFB
potential energy profiles against the exact energies obtained using standard optimization
techniques. On the other hand, the accuracy of the HFB free energy profiles is evaluated by
comparison against the free energies computed with the 2D umbrella sampling procedure.23
A. 1. Efficiency of the gradient augmented HFB path optimization
The progress of the path optimization was monitored using benchmark reference paths
passing through the corresponding minimum energy valleys. Specifically, we used the best paths
obtained during potential energy and free energy HFB optimizations with the force constant 10.0
kcal/mol·Å^-2 and P = 24 as the benchmark paths. The RMS similarity measure between current
and benchmark paths described in the Methods section and plotted in Figure 1 was used to
monitor the optimization progress.
Figures 1(a, b) depict the progress of the HFB optimization toward the minimum
potential energy path and the minimum free energy transition path ensemble, respectively, both
with and without the enhanced SD step. In all cases significant speed up (up to 2-2.5 times) of
the optimization is observed with the enhanced SD step relative to the original HFB method. As
expected, the optimization rate depends on the force constant, with larger force constants
requiring more optimization steps.
The step size parameters used in this study are near optimal, as increasing the step size
parameters twofold renders the HFB optimization unstable and quickly degrades the path.
Although developing the instability with larger step sizes can be considered a disadvantage, it is
useful for determining the optimal step size parameters. Further research is necessary to explore
other optimization strategies based on the free energy gradient approximation to gain additional
increases in optimization efficiency.
It might be of interest to compare the computational intensity of the HFB method with
that of the recent variant of the FTS method that also employed restraints.29 The FTS method
used 20 beads to discretize the two-state path between (C7eq and C7ax) compared to 32 beads in
our earlier paper (note that we use 32 beads for the much longer three-state transition in this
paper). The FTS method used 500,000 MD steps where we only use 30,000 LD steps per bead
evolution. Note that these parameters have not been optimized in either FTS or HFB and do not
provide a fair comparison of the two methods. However, the comparison of the number of total
path optimization steps is quite fair as both methods start from the linear interpolation initial
path. Interestingly, it takes FTS method 100 to 250 steps to converge the path, whereas even with
the original HFB method, without the SD step, it takes only 60 steps to converge the path.
Needless to say in our HFB method we never have to compute any of the Jacobians or metric
tensors that add additional computational overhead.
A. 2. Accuracy of the HFB energy profiles
The HFB Potential Energy Profile
We tested the effect of the Fourier series truncation index P on the fidelity of the
optimized paths by comparing the corresponding energy profiles to the best benchmark values
obtainable. In the case of the minimum potential energy paths, we compared the relative energies
computed using equation similar to (15) with the exact relative energies computed using standard
optimization techniques. In particular, starting from the HFB optimized path, we used conjugate
peak refinement method (CPR) to optimize the transition states, and the adapted basis
Newton-Raphson (ABNR) method to optimize the local minima. This comparison is the most stringent
test of our methodology because the HFB method should, in principle, give the exact relative
potential energies.
Tables 1 and 2 summarize the mean relative energies computed over 10 consecutive HFB
runs, with corresponding standard deviations given in parenthesis. As seen from Table 1, the
relative potential energies computed with the HFB method deviate from the exact energies only
in the second decimal place, even for paths with P = 18.
The effect of the force constants on the relative potential energies is also quite small. The
largest absolute deviation of 0.0D kcal/mol is observed for the highest transition state !"# (8.48
kcal/mol) computed with P = 24 and a force constant of 10.0 kcal/mol·Å^-2. As expected, the
relative energy of the transition state is slightly underestimated. The second largest deviation of
0.04 kcal/mol is observed for the local minimum C7ax with P = 18 and a force constant 10.0
kcal/mol·Å^-2. Increasing the force constant and the truncation index P improves agreement with
the exact result.
We conclude that the minimum potential energy paths optimized with the HFB method
and the corresponding potential energy profiles are sufficiently accurate, and are fully justified to
be used in quantitative studies of adiabatic transitions, including those with QM and QM/MM
potentials. A note of caution is in place here: in optimizing adiabatic transition paths, the
RCS/SCS partition must be chosen carefully to prevent path discontinuities due to isomerization
in the SCS. The safest choice is to include all the unique heavy atoms of the system into the
RCS. For additional discussion regarding RCS/SCS partitioning see our previous paper.11
For the minimum free energy transition path ensembles the situation is more complicated
as the exact relative free energies for all the minima and transition states are not known.
Therefore, the relative free energies computed using equation (15) were compared against the
relative free energies computed using the umbrella sampling procedure along the path projection
in two generalized dimensions.23
Free Energy from 2D Umbrella Sampling
As mentioned in the Methods section, in contrast to the original HFB method, here we
employ Langevin dynamics to collect data with the two simultaneous RMSD restraints.23 The
corresponding 2D free energy profiles are depicted in Figure 2. Notably, substituting C5
(current reactant) for C7eq (former reactant) significantly changes the coordinate system.
Unfortunately, because of an overlap with more energetically favorable configurations, the
change in the coordinate system hides the real transition state &'(, formerly well resolved,23 and
creates an illusion of a transition state with a much lower free energy than anticipated. Thus it is
not surprising that the HFB path does not pass through the artefactual transition state. Caution
must be exercised while interpreting the free energy surfaces obtained in few generalized
dimensions. Similar observations have been made previously, in particular, apparent transition
states from the reduced dimensionality free energy surfaces commonly used in studies of small
peptides do not correspond to the actual transition states.40
An additional problem related to computing the 2D free energy profiles is the systematic
error due to the combination of the best-fit RMSD restraints with LD. In particular, the restraint
forces are computed during dynamics after superposing the restraint reference structure to the
structure from which the dynamics propagator makes its next step. Because LD adds a random
force on top of the other forces, the next-step configuration moves further away from the
reference coordinate system, and no longer satisfies the best-fit criteria. While computing the
best-fit RMSDs during data processing, we are forced to superpose each snapshot to the
corresponding reference to compute the value of the RMSD-based coordinates. This value will
always be smaller than the true value during dynamics since the coordinate system is being reset
to the reference by the best-fit procedure. The situation could be improved if the structure
immediately preceding the recorded snapshot was available.
To demonstrate the effect of the random force on the relative free energies we reran the
2D-RMSD simulations for the path between C7eq and C7ax obtained previously with the Langevin
dynamics but otherwise identical conditions.23 The corresponding relative free energies for the
C7ax and the '() with respect to the C7eq are 2.27 and 8.16 kcal/mol, respectively. These should
be compared with 2.55 and 8.51 kcal/mol obtained previously with the MD simulations utilizing
velocity reassignment23 instead of Langevin dynamics.
The relative free energies obtained using the weighted histogram analysis method (WHAM)
from the data generated by 2D-RMSD umbrella sampling with the Langevin MD are provided in
Table 2 as the benchmark.
The HFB Free Energy Profile
Because the HFB method uses the averaged restraint forces and because LD adds a small
random force on top of the potential energy and the bias forces, computing accurate free energy
profiles requires longer simulation times than during the routine HFB evolution to ensure that the
random forces average to zero.
In practice, we find that the best way to compute the minimum free energy profiles along
the path ensembles is to run multiple simulations with different initial conditions for the same
reference Fourier path. The only other requirement is that the corresponding trajectories have the
same number of snapshots with the same statistical weight to be able to combine their averages
into one cumulative average structure that can then be used in place of the evolved bead to
compute the much more accurate free energy profile using equation (15).
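The equal-weight requirement can be made concrete with a short sketch. It assumes synthetic Gaussian data in place of real bead trajectories; `combine_run_averages` is a hypothetical helper, and the run count, snapshot count and atom count are illustrative only.

```python
import numpy as np

def combine_run_averages(run_averages, n_snapshots):
    """Combine per-run average structures into one cumulative average.

    A plain mean over runs is only valid when every run contributes the
    same number of equally weighted snapshots, so this is checked first.
    """
    run_averages = np.asarray(run_averages, dtype=float)
    n_snapshots = np.asarray(n_snapshots)
    if np.any(n_snapshots != n_snapshots[0]):
        raise ValueError("all runs must contribute the same number of snapshots")
    return run_averages.mean(axis=0), run_averages.std(axis=0, ddof=1)

rng = np.random.default_rng(1)
n_runs, n_snap, n_atoms = 10, 500, 22
# Synthetic stand-in for 10 independent runs started from different initial conditions.
trajectories = rng.normal(loc=1.0, scale=0.2, size=(n_runs, n_snap, n_atoms, 3))

per_run = trajectories.mean(axis=1)                        # average structure of each run
cumulative, spread = combine_run_averages(per_run, [n_snap] * n_runs)
pooled = trajectories.reshape(-1, n_atoms, 3).mean(axis=0)  # average over all snapshots at once
print(np.abs(cumulative - pooled).max())
```

With equal run lengths the mean of the per-run averages coincides with the average over the pooled snapshots, which is why the per-run averages can be combined without storing the full trajectories.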
In the case of the alanine dipeptide the free energy profiles are summarized in Table 2.
The relative free energies for all the points along the path are given relative to C7eq. It is clear
from Table 2 that the relative free energies computed with the HFB method have nonzero
standard deviations, with the largest deviation being 0.11 kcal/mol. As expected, the force
constant has a small effect on the relative free energies, with higher force constants giving higher
relative free energies for the transition states. The truncation parameter P also has a small effect
on the relative free energies, with higher P giving overall slightly higher relative energies.
Comparison of the relative free energies computed with the HFB method with those from
the 2D-RMSD umbrella sampling/WHAM is quite favorable, and suggests that the accuracy of
the HFB method with the approximate free energy gradient is sufficiently high. We feel that the
HFB method could be used to provide benchmark calculations in the future.
!"#$%&#!'%&(i*
For the 20-residue α-helical protein, we first optimized the coarse 64-bead unfolding path
and then refined only the path between the first 4 beads keeping the endpoint beads fixed. The
final free energy profile was reconstructed from 21 collection steps, each 8 ns long (see Figure
3). Figure 3 shows three cumulative profiles at 152, 160 and 168 ns that are practically identical,
indicating near perfect convergence. As seen in Figure 3, the PMF has four minimum free energy
basins. Inspecting the corresponding path trajectory, we can relate conformational changes to the
observed free energy changes. Over the range of the whole transformation four backbone
hydrogen bonds are being broken. Note that the total number of hydrogen bonds for an ideal
helix of this size is 16; however, one of the bonds (A16 to Y20) has already been broken in the
reactant configuration. The next hydrogen bond to break is between residues A1 and K5. Next,
the bond between K15 and A19 breaks, immediately followed by the breaking of the A2-A6 bond.
Concluding the transformation is the bond breaking between A14 and A18. Some C- and/or
N-terminal rearrangements often immediately follow the H-bond breaking event. Clearly, the
protein folding/unfolding landscape at 298 K is quite rough, with the highest barrier for folding
of 5.4 kcal/mol and about 10 kcal/mol for unfolding. Interestingly, breaking the first hydrogen
bond of the helix leads to a substate with a comparable free energy. Breaking additional
hydrogen bonds is thermodynamically unfavorable.
V. Concluding Remarks
The present paper crucially enhances the original HFB method23 by incorporating energy
gradients into the calculations. The energy gradients computed on the fly enable the HFB method
to perform more efficient path optimizations. Most significantly, the gradient augmented HFB
method can now reconstruct accurate energy profiles along the path ensemble in
multidimensional reaction coordinate spaces, which was not possible before. This alleviates the
need to use the 2D RMSD-based umbrella sampling procedure, which can be problematic in certain cases.
Importantly, both minimum free energy transition path ensembles and the adiabatic potential
energy paths are given analytically along with their energy profiles by the HFB method. Thus,
the enhanced HFB method can now provide complete information about the reaction path.
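How gradients at discrete points along a multidimensional path yield an energy profile can be sketched as a reversible work line integral. The example below is only an illustration: it assumes a hypothetical two-dimensional quadratic energy function and a trapezoid rule between neighboring path points, and it is not the authors' implementation of the HFB profile reconstruction.

```python
import numpy as np

def profile_from_gradients(points, gradients):
    """Accumulate sum of 0.5*(g_i + g_{i+1}) . (x_{i+1} - x_i) along a discretized path."""
    points = np.asarray(points, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    steps = np.diff(points, axis=0)                       # displacement between neighbors
    mid_gradients = 0.5 * (gradients[1:] + gradients[:-1])
    increments = np.einsum("ij,ij->i", mid_gradients, steps)
    return np.concatenate(([0.0], np.cumsum(increments)))

def energy(x):
    """Hypothetical model surface used only to check the integral."""
    return x[..., 0] ** 2 + 2.0 * x[..., 1] ** 2 + x[..., 0] * x[..., 1]

def gradient(x):
    """Analytic gradient of the model surface."""
    return np.stack([2.0 * x[..., 0] + x[..., 1],
                     4.0 * x[..., 1] + x[..., 0]], axis=-1)

alpha = np.linspace(0.0, 1.0, 21)                          # progress variable
path = np.stack([np.cos(np.pi * alpha), np.sin(np.pi * alpha)], axis=-1)

profile = profile_from_gradients(path, gradient(path))
exact = energy(path) - energy(path[0])
print(np.abs(profile - exact).max())
```

For a quadratic surface the gradient is linear along each straight segment, so the trapezoid rule reproduces the energy differences to rounding error; on a real, anharmonic surface the agreement would depend on the spacing of the path points.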
Appendix
Consider the quantity
$$ \Delta = \left\langle \frac{\partial E(x)}{\partial x} + \frac{\partial V(x)}{\partial x} \right\rangle_b = \frac{\int \left( \frac{\partial E(x)}{\partial x} + \frac{\partial V(x)}{\partial x} \right) e^{-\beta [E(x) + V(x)]}\, dx}{\int e^{-\beta [E(x) + V(x)]}\, dx}. \tag{A1} $$
Importantly, this quantity should identically equal zero ($\Delta = 0$) at the equilibrium or stationary
state to ensure that the averaged configuration over the biased trajectory is stationary. Using the
standard definition of the potential of mean force41 it could be shown that the unbiased free
energy enters an isomorphous equation:
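$$ \left\langle \frac{\partial W_u(x)}{\partial x} + \frac{\partial V(x)}{\partial x} \right\rangle_b = \frac{\int \left( \frac{\partial W_u(x)}{\partial x} + \frac{\partial V(x)}{\partial x} \right) e^{-\beta [W_u(x) + V(x)]}\, dx}{\int e^{-\beta [W_u(x) + V(x)]}\, dx} = 0. \tag{A2} $$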
Suppose that the PMF of the unrestrained ensemble at the point $x$ can be approximated by a
harmonic function centered at another point, say, $x_u$, and with its own force constant $k_u$,
$$ W_u(x) \approx k_u (x - x_u)^2 + C \equiv \tilde{W}_u(x). \tag{A3} $$
The values $x_u$ and $k_u$ are unknown; $C$ is an unknown offset of the unbiased PMF that relates the
PMF to the biasing potential; and the tilde indicates the approximate PMF. Even though we have
just introduced three unknown constants they do not enter the final result. Thus, the mean force
is approximated as a linear function, derived from the corresponding quadratic potential
$$ \frac{\partial W_u(x)}{\partial x} \approx \frac{\partial \tilde{W}_u(x)}{\partial x} = 2 k_u (x - x_u). \tag{A4} $$
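Substituting (A3) and (A4) into (A2) and taking the corresponding Gaussian integrals
analytically one obtains a simple approximation for the mean force:
$$ \left. \frac{\partial \tilde{W}_u(x)}{\partial x} \right|_{x = \langle x \rangle_b} \approx \left\langle \frac{\partial W_u(x)}{\partial x} \right\rangle_b = - \left\langle \frac{\partial V(x)}{\partial x} \right\rangle_b = - \left. \frac{\partial V(x)}{\partial x} \right|_{x = \langle x \rangle_b} = -2 k_v \left( \langle x \rangle_b - x_v \right). \tag{A5} $$
This equation focuses all the information collected by the biased simulation at one particular
point in configuration space, namely $x = \langle x \rangle_b$, and, therefore, does not require the use of
histograms.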
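The harmonic result can be checked numerically in one dimension. The sketch below assumes a quadratic unbiased PMF of the form k_u (x - x_u)^2 and a quadratic bias k_v (x - x_v)^2 without a factor of 1/2; all parameter values are hypothetical, and the Boltzmann averages are evaluated by direct summation on a uniform grid.

```python
import numpy as np

beta = 1.0 / (0.0019872 * 298.0)    # 1/(kcal/mol) at 298 K
k_u, x_u = 3.0, 0.3                 # hypothetical unbiased PMF: k_u * (x - x_u)**2
k_v, x_v = 10.0, 1.0                # hypothetical bias:         k_v * (x - x_v)**2

x = np.linspace(-4.0, 6.0, 200001)  # uniform grid, so plain sums act as integrals
w_u = k_u * (x - x_u) ** 2
v = k_v * (x - x_v) ** 2
weights = np.exp(-beta * (w_u + v))  # Boltzmann weight of the biased ensemble
weights /= weights.sum()

mean_x = float((weights * x).sum())                           # <x>_b
mean_force_u = float((weights * 2.0 * k_u * (x - x_u)).sum())  # <dW_u/dx>_b
restraint_estimate = -2.0 * k_v * (mean_x - x_v)               # -dV/dx evaluated at <x>_b

print(mean_x, mean_force_u, restraint_estimate)
```

In this purely harmonic case the two estimates agree to rounding error, and the average position is the force-constant-weighted mean of x_u and x_v; for an anharmonic PMF the equality becomes the approximation discussed in the text.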
Acknowledgement
We would like to acknowledge the NIH for support of this work (GM48807). The authors
are grateful to Dr. Ayori Mitsutake and Dr. In-Ho Lee for their suggestions and proofreading the
manuscript, and to Prof. Sheena Ratford for suggesting the α-helical peptide for this study.
References
1. S. Williams, T. P. Causgrove, R. Gilmanshin, K. S. Fang, R. H. Callender, W. H. Woodruff, and R. B. Dyer, Biochemistry 35 (3), 691 (1996).
2. K. A. Dill and S. Chan, Nature (London) 4 (1), 10 (1997).
3. A. N. Naganathan, U. Doshi, A. Fung, M. Sadqi, and V. Munoz, Biochemistry 45 (28), 8466 (2006).
4. M. Karplus and J. Kuriyan, Proc. Natl. Acad. Sci. USA 102, 6679 (2005).
5. T. Schlick, Molecular Modeling and Simulation: An Interdisciplinary Guide (Springer-Verlag, New York, 2002).
6. G. Henkelman and H. Jonsson, J. Chem. Phys. 113 (22), 9978 (2000).
7. R. Elber and M. Karplus, Chem. Phys. Lett. 139 (5), 375 (1987).
8. S. Huo and J. E. Straub, J. Chem. Phys. 107 (13), 5000 (1997).
9. S. Huo and J. E. Straub, Proteins 36 (2), 249 (1999).
10. G. Domotor, L. Stacho, and M. I. Ban, J. Mol. Struct. (Theochem) 501-502, 509 (2000).
11. I. V. Khavrutskii, R. H. Byrd, and C. L. Brooks III, J. Chem. Phys. 124 (19), 194903-1 (2006).
12. S. Fischer and M. Karplus, Chem. Phys. Lett. 194 (3), 252 (1992).
13. A. E. Cho, J. D. Doll, and D. L. Freeman, Chem. Phys. Lett. 229 (3), 218 (1994).
14. R. Elber, Curr. Opin. Struct. Biol. 15, 151 (2005).
15. K. Arora and T. Schlick, Chem. Phys. Lett. 378, 1 (2003).
16. R. Elber, A. Cárdenas, A. Ghosh, and H. Stern, Adv. Chem. Phys. 126, 93 (2003).
17. P. G. Bolhuis, D. Chandler, C. Dellago, and P. L. Geissler, Annu. Rev. Phys. Chem. 53, 291 (2002).
18. R. Radhakrishnan and T. Schlick, Proc. Natl. Acad. Sci. USA 101, 5970 (2004).
19. E. Weinan, W. Ren, and E. Vanden-Eijnden, Phys. Rev. B 66, 052301-1 (2002).
20. B. Peters, A. Heyden, A. T. Bell, and A. Chakraborty, J. Chem. Phys. 120 (17), 7877 (2004).
21. W. Ren, Comm. Math. Sci. 1 (2), 377 (2003).
22. W. F. van Gunsteren, in Computer Simulation of Biomolecular Systems, edited by W. F. van Gunsteren and P. K. Weiner (ESCOM Science Publishers B. V., Leiden, The Netherlands, 1989), Vol. 1, pp. 27.
23. I. V. Khavrutskii, K. Arora, and C. L. Brooks III, J. Chem. Phys. 125 (17), 174108-1 (2006).
24. C.-Y. Huang, Z. Getahun, T. Wang, W. F. DeGrado, and F. Gai, J. Am. Chem. Soc. 123 (48), 12111 (2001).
25. W. Thiel and J. Kastner, J. Chem. Phys. 123 (14), 144104 (2005).
26. J. van Eerden, W. J. Briels, S. Harkema, and D. Feil, Chem. Phys. Lett. 164 (4), 370 (1989).
27. L. N. Trefethen and D. Bau III, in Numerical Linear Algebra (SIAM, Philadelphia, 1997), pp. 41.
28. J. Nocedal and S. Wright, Numerical Optimization (Springer, 2005).
29. L. Maragliano, A. Fischer, E. Vanden-Eijnden, and G. Ciccotti, J. Chem. Phys. 125 (2), 024106-1 (2006).
30. H. L. Woodcock, M. Hodoscek, P. Sherwood, Y. S. Lee, H. F. Schaefer III, and B. R. Brooks, Theor. Chem. Acc. 109 (3), 140 (2003).
31. A. D. MacKerell Jr., B. R. Brooks, C. L. Brooks III, L. Nilsson, B. Roux, Y. Won, and M. Karplus, in The Encyclopedia of Computational Chemistry, edited by P. v. R. Schleyer, P. R. Schreiner, N. L. Allinger, T. Clark, J. Gasteiger, P. Kollman, and H. F. Schaefer III (John Wiley & Sons, Chichester, 1998).
32. J.-P. Ryckaert, G. Ciccotti, and H. J. C. Berendsen, J. Comp. Phys. 23 (3), 327 (1977).
33. T. Lazaridis, D. J. Tobias, C. L. Brooks III, and M. E. Paulaitis, J. Chem. Phys. 95 (10), 7612 (1991).
34. D. J. Tobias and C. L. Brooks III, J. Chem. Phys. 89 (8), 5115 (1988).
35. A. D. MacKerell Jr., D. Bashford, M. Bellott, R. L. Dunbrack Jr., J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher III, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, J. Phys. Chem. B 102 (8), 3586 (1998).
36. E. Neria, S. Fischer, and M. Karplus, J. Chem. Phys. 105 (5), 1902 (1996).
37. B. N. Dominy and C. L. Brooks III, J. Phys. Chem. B 103 (18), 3765 (1999).
38. S. Kumar, D. Bouzida, R. H. Swendsen, P. A. Kollman, and J. M. Rosenberg, J. Comp. Chem. 13 (8), 1011 (1992).
39. E. M. Boczko and C. L. Brooks III, J. Phys. Chem. 97 (17), 4509 (1993).
40. P. G. Bolhuis, Biophys. J. 88 (1), 50 (2005).
41. T. P. Straatsma and J. A. McCammon, J. Chem. Phys. 101 (6), 5032 (1994).
Table 1. The relative potential energies (kcal/mol) computed with the HFB method

Conf.             k^f_RMS = 10.0  k^f_RMS = 10.0  k^f_RMS = 20.0  k^f_RMS = 20.0  Exact
                  P = 18          P = 24          P = 18          P = 24
C5                0.92(0.00)      0.91(0.00)      0.92(0.00)      0.91(0.00)      0.91
TS (C5/C7eq)      1.50(0.00)      1.49(0.00)      1.50(0.00)      1.49(0.00)      1.50
C7eq              0.00(0.00)      0.00(0.00)      0.00(0.00)      0.00(0.00)      0.00
TS (C7eq/C7ax)    8.43(0.00)      8.42(0.00)      8.47(0.00)      8.45(0.00)      8.48
C7ax              2.09(0.01)      2.06(0.00)      2.08(0.00)      2.06(0.00)      2.05

The relative energies are averaged over 10 consecutive HFB runs, with the numbers in
parentheses representing the corresponding standard deviations. The HFB relative energies are
computed using the force of the restraint in the RCS via the reversible work line integral (16),
whereas the exact relative energies are computed with the conjugate peak refinement (CPR) method
for the transition states, and the adaptive basis Newton-Raphson method for local minima.
Table 2. The relative free energies (kcal/mol) computed with the HFB method

Conf.             k^f_RMS = 5.0  k^f_RMS = 5.0  k^f_RMS = 10.0  k^f_RMS = 10.0  k^f_RMS = 20.0  k^f_RMS = 20.0  2D-RMSD
                  P = 24         P = 32         P = 24          P = 32          P = 24          P = 32          WHAM
C5                0.51(0.04)     0.52(0.04)     0.57(0.05)      0.60(0.07)      0.68(0.09)      0.70(0.08)      0.54
TS (C5/C7eq)      1.13(0.02)     1.14(0.02)     1.19(0.03)      1.20(0.04)      1.26(0.05)      1.26(0.06)      0.91
C7eq              0.00(0.00)     0.00(0.00)     0.00(0.00)      0.00(0.00)      0.00(0.00)      0.00(0.00)      0.00 [0.00]
TS (C7eq/C7ax)    7.68(0.04)     7.76(0.04)     7.96(0.07)      7.98(0.06)      8.06(0.08)      8.10(0.11)      n/a [8.16]
C7ax              2.22(0.04)     2.23(0.03)     2.21(0.07)      2.22(0.06)      2.18(0.04)      2.23(0.09)      2.07 [2.27]

The HFB relative free energies are computed using the averaged force of the restraint in the RCS
via the reversible work line integral (15), and averaged over 10 consecutive runs. The numbers in
parentheses represent the corresponding standard deviations. The last column provides
benchmark values computed using 2D-RMSD umbrella sampling along the corresponding path
projection with WHAM between C5 and C7ax, with numbers in square brackets corresponding to
the path between C7eq and C7ax (see text for details).
Figure Captions
Figure 1. The HFB path optimization progress with (SD) and without the enhanced optimization
step for a) adiabatic transition path (T = 0.0 K); and b) free energy transition path ensemble (T =
298.0 K). The values of the force constant used in the biasing potential are shown in the legend.
The Fourier series truncation index P was set at 18, 24 and 32 at HFB steps 1, 100 and
200, respectively. See text for the corresponding step size parameters and further details.
Figure 2. A free energy strip along the minimum free energy path at T = 298.0 K. The line with
the circles depicts the centers of the actual windows employed during the umbrella sampling
procedure in the generalized 2D-RMSD space and corresponds to the projection of the HFB
derived PMF onto this 2D-RMSD space. The free energy was computed using 2D constant
temperature WHAM. The three local minima on the strip are labeled for clarity. See text for
further details.
Figure 3. The PMF along the minimum free energy transition path ensemble for
unfolding/folding of the α-helical peptide at T = 298.0 K. See text for description of the labels
and further details.
[Figure 1: panels a) and b); Path RMS (Å) versus HFB Step; legend: 5.0, 5.0 SD, 10.0, 10.0 SD, 20.0, 20.0 SD]
[Figure 2: RMSD(product) (Å) versus RMSD(reactant) (Å); color scale: Free Energy (kcal/mol); labeled minima: C5, C7eq, C7ax]
[Figure 3: PMF (kcal/mol) versus progress variable α; cumulative profiles at 152, 160 and 168 ns; labels: A1-K5, K15-A19, A2-A6, A14-A18, N-term, C-term, C/N-term]