
Supporting Information
A New Class Of Enhanced Kinetic Sampling Methods For Building Markov State Models
Arti Bhoutekar#, Susmita Ghosh†, Swati Bhattacharya†#, Abhijit Chatterjee#*
#Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400076,
India, and †Department of Physics, Indian Institute of Technology Guwahati, Guwahati 781039,
India
*Email: [email protected] , [email protected]
S1. Flow chart for state constrained MD
S2. Flow chart for programmed state constrained MD (PSC-MD)
Note: As discussed in the main text, MD can be replaced with another dynamical method.
S3. Markov state model for alanine dipeptide
A. System setup
A single molecule of alanine dipeptide (N-acetyl-N-methyl-L-alanylamide) was placed
with 390 pre-equilibrated TIP3P water molecules in a periodic box of dimensions
2.3 × 2.3 × 2.3 nm³. We employed the CHARMM27 force field [S1].
B. Simulation protocols
Equilibration at a constant pressure of 1 atm was performed. Next, we performed energy
minimization for 1600 steps using the conjugate gradient method. MD calculations using a
Langevin thermostat were performed at 300 K with NAMD [S2]. Particle mesh Ewald
was employed for the electrostatic terms. The SETTLE [S4] and RATTLE [S3]
algorithms were applied to covalent bonds involving hydrogen in water and the peptide,
respectively. A time step of 2 fs was used. Independent MD trajectories were generated
using different random seeds. The system configuration was checked for a transition after
every 100 MD steps (0.2 ps) by comparing the structure to a database of states. Structures
were compared using the Kabsch algorithm with a tolerance of 1.5 Å. A match is said to
have occurred when the non-hydrogen atoms of the alanine dipeptide molecule lie within the
tolerance. A short MD calculation of duration τ = 0.8 ps is additionally performed to avoid
counting recrossing events as transitions. The MSM of Fig. 2 in the main text was constructed
using state-constrained MD calculations (see Supporting Information Section S1 for the flowchart).
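The structure-matching step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `kabsch_max_deviation` and `match_state` are hypothetical, and we assume that "within the tolerance" means every heavy atom lies within 1.5 Å of its counterpart after optimal superposition (the text does not say whether a per-atom or an RMSD criterion was used).

```python
import numpy as np

def kabsch_max_deviation(P, Q):
    """Superimpose P onto Q (both N x 3 heavy-atom coordinates) with the
    Kabsch algorithm and return the largest per-atom distance afterwards."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both coordinate sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Cross-covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the per-atom deviations.
    P_rot = Pc @ R.T
    return np.linalg.norm(P_rot - Qc, axis=1).max()

def match_state(config, database, tol=1.5):
    """Return the index of the first database state matching `config`
    within `tol` (assumed per-atom criterion, in Å), or None."""
    for index, reference in enumerate(database):
        if kabsch_max_deviation(config, reference) <= tol:
            return index
    return None
```

In a state-constrained run, a call such as `match_state(current_heavy_atoms, known_states)` would be made every 100 MD steps; a return value of `None` would indicate a configuration not yet present in the database.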
C. Comparison to previous literature studies

The free energy map in the (φ, ψ) space obtained using standard MD calculations at 300 K
with the setup mentioned earlier is shown below. The free energy map is in good
agreement with the one in the top-left panel of Fig. 1 of Ref. [S5].
Figure S1. Free energy map (in units of kBT) at 300 K from MD simulations. State 5 is not
accessible at the timescales accessed.
Based on ( , ) values state 1 and 3 correspond to the  R conformation, states 2 and 4
are associated with β/PII/C7eq structures of the system, while state 5 (see Fig. 2 of main
text) can be identified as the C7ax conformation (see Ref. [S4]).
The free energies of states 2-4 with respect to state 1, obtained by solving the MSM shown
in the main text, are −0.06, 2.28 and 1.596 kcal/mol, respectively. These values are in good
agreement with the free energy map shown above as well as the free energy differences
mentioned in Ref. [S6].
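One way to obtain relative free energies from an MSM is through its stationary distribution π, using ΔG_i = −kBT ln(π_i/π_1). The sketch below assumes this is what "solving the MSM" refers to; the function names and the rate-matrix layout (`rates[i, j]` is the rate constant for the move i → j) are illustrative choices, not taken from the main text.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def stationary_distribution(rates):
    """Stationary probabilities of a continuous-time Markov model.
    rates[i, j] is the rate constant for the move i -> j (i != j)."""
    Q = np.asarray(rates, dtype=float).copy()
    # Build the generator: off-diagonals are rates, rows sum to zero.
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))
    n = Q.shape[0]
    # Solve Q^T pi = 0 together with the normalization sum(pi) = 1.
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def relative_free_energies(pi, temperature=300.0, ref=0):
    """Free energy of each state relative to state `ref`, in kcal/mol."""
    pi = np.asarray(pi, dtype=float)
    return -KB_KCAL * temperature * np.log(pi / pi[ref])
```

For a hypothetical two-state model with rates 2 (state 0 → 1) and 1 (state 1 → 0), the stationary probabilities are 1/3 and 2/3, so state 1 lies kBT ln 2 below state 0.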
Transitions between β/PII/C7eq structures are known to be fast (less than ps at 300 K) as
reported in Ref. [S7], which is consistent with our results (see Fig. 2). The rate
for moves from state 1 to 2 (alpha-helix to beta-strand transitions) is estimated to be
0.045 ps⁻¹ (mean escape time of ∼22 ps), which is comparable to previous estimates (Ref. [S8]).
The activation barriers for the moves 2-5, 5-2, and 5-4 are 6.55, 5.7, and 5 kcal/mol,
respectively. The corresponding barriers in vacuum [S9] are less than 7, 5, and 5
kcal/mol, respectively.
S4. Additional results for stretched deca-alanine
A. Work done while stretching deca-alanine computed from MSMs
Work done in pulling the deca-alanine from a compact to a stretched configuration can be
calculated using the MSMs as follows. MSMs constructed for d=16, 18, 20, 22, 23, 24
and 25 Å provide average force and end-to-end distance for the molecule (see Fig. 11 in
main text for average force). We assume that the molecule is stretched by moving one
anchor point infinitely slowly, allowing us to calculate the work done using the MSMs
since the molecule reaches steady state for each value of anchor separation d.
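The quasi-static work described above amounts to integrating the average force along the stretching coordinate. The sketch below uses the trapezoidal rule over the average end-to-end distance; the function name, the choice of integration variable (end-to-end distance rather than anchor separation), and the sign convention (positive force means work is done on the molecule) are all assumptions, since the text does not specify them.

```python
import numpy as np

def cumulative_work(distance, mean_force):
    """Cumulative quasi-static work from per-MSM averages.
    distance: average end-to-end distance for each anchor separation (Å)
    mean_force: average force at the same points (kcal/mol/Å)
    Returns work (kcal/mol) relative to the first point."""
    x = np.asarray(distance, dtype=float)
    f = np.asarray(mean_force, dtype=float)
    # Trapezoidal rule on each interval between consecutive MSMs.
    increments = 0.5 * (f[1:] + f[:-1]) * np.diff(x)
    return np.concatenate([[0.0], np.cumsum(increments)])
```

Each MSM (one per anchor separation d) contributes one (distance, force) pair, so seven MSMs give seven points on the work curve.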
Figure S2: Work done while pulling deca-alanine, calculated using MSMs constructed in the
main text (shown as orange filled circles). Good agreement is observed with Ref. [S10]. Each
symbol is obtained from a separate MSM constructed for the anchor separation d mentioned in
the figure (d = 16, 18, 20, 22, 23, 24 and 25 Å). Axes: work (kcal/mol) versus end-to-end
distance (Å). Legend: MSM; data from Ref. [S10].
B. Rates along a folding pathway
Note that folding/unfolding can occur via parallel pathways, as shown in the MSMs in
Fig. 9 in the main text. Suppose we focus on the pathway through states 17 → 6 → 2 → 1.
Table S1 below shows the α-helicity and 3₁₀-helicity of states along this pathway, computed
from state-constrained MD simulations. State 2 has the largest 3₁₀-helicity, i.e.,
3₁₀-helical configurations form intermediates between the folded and unfolded states. The
rates of folding/unfolding depend on the anchor separation, as shown in Fig. S3 below.
Table S1. Average α-helicity and 3₁₀-helicity of states along a folding pathway
State    α-Helicity    3₁₀-Helicity
1        0.801         0.08
2        0.201         0.21
6        0.021         0.19
17       0.013         0.15
Figure S3. States of stretched deca-alanine a) state 1, b) state 2 and c) state 6. Rates of
folding and unfolding as a function of the anchor separation along path d) state 2 to 1,
and e) state 6 to 2.
S5. Test for Markovian behavior
A special property of discrete-state, continuous-time Markov processes is that the waiting
times t for transitions from a state S are exponentially distributed, i.e.,
p(t) = k exp(−kt).
Since the kinetic pathways from state S are supposed to be independent, the waiting times for
each kinetic pathway S ⇋ S′ between states S and S′ are also exponentially distributed, i.e.,
the pathways are first-order processes. Although we employ maximum likelihood estimates (MLE)
for the rate constant kf(d) in our MSMs, the probability density obtained from the MLE rate
constant should be in good agreement with the observed waiting time distribution [S11].
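The test can be illustrated with a short sketch. For exponentially distributed waiting times, the maximum likelihood estimate of the rate constant is k = n / Σt. The Kolmogorov-Smirnov distance used here to quantify the agreement is our choice for illustration; the text does not state which measure the authors used, and the function names are hypothetical.

```python
import numpy as np

def mle_rate(waiting_times):
    """MLE of the rate constant for exponential waiting times: n / sum(t)."""
    t = np.asarray(waiting_times, dtype=float)
    return t.size / t.sum()

def ks_distance(waiting_times, k):
    """Largest gap between the empirical CDF of the waiting times and the
    exponential CDF 1 - exp(-k t). Small values indicate good agreement."""
    t = np.sort(np.asarray(waiting_times, dtype=float))
    n = t.size
    model = 1.0 - np.exp(-k * t)
    # The empirical CDF jumps at each sample, so check both sides of the step.
    upper = np.arange(1, n + 1) / n - model
    lower = model - np.arange(0, n) / n
    return max(upper.max(), lower.max())
```

Applied to waiting times collected for a single pathway, a small `ks_distance(times, mle_rate(times))` would support first-order (Markovian) kinetics, while a large value would suggest that the state definition lumps together configurations with different escape rates.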
References
(S1) MacKerell, A. D., Jr.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M.
J.; Fischer, S.; Gao, J.; Guo, H.; et al. All-Atom Empirical Potential for Molecular
Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586–3616.
(S2) Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.;
Skeel, R. D.; Kalé, L.; Schulten, K. Scalable Molecular Dynamics with NAMD. J.
Comput. Chem. 2005, 26, 1781–1802.
(S3) Andersen, H. C. Rattle: A “velocity” Version of the Shake Algorithm for Molecular
Dynamics Calculations. J. Comput. Phys. 1983, 52, 24–34.
(S4) Miyamoto, S.; Kollman, P. A. SETTLE: An Analytical Version of the SHAKE and
RATTLE Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13, 952–962.
(S5) Smith, P. E. The Alanine Dipeptide Free Energy Surface in Solution. J. Chem. Phys.
1999, 111, 5568–5579.
(S6) Tobias, D. J.; Brooks III, C. L. Conformational Equilibrium in the Alanine Dipeptide in
the Gas Phase and Aqueous Solution: A Comparison of Theoretical Results. J. Phys.
Chem. 1992, 96, 3864–3870.
(S7) Strodel, B.; Wales, D. J. Implicit Solvent Models and the Energy Landscape for
Aggregation of the Amyloidogenic KFFE Peptide. J. Chem. Theory Comput. 2008, 4,
657–672.
(S8) Chekmarev, D. S.; Ishida, T.; Levy, R. M. Long-Time Conformational Transitions of
Alanine Dipeptide in Aqueous Solution: Continuous and Discrete-State Kinetic Models. J.
Phys. Chem. B 2004, 108, 19487–19495.
(S9) Ren, W.; Vanden-Eijnden, E.; Maragakis, P.; E, W. Transition Pathways in Complex
Systems: Application of the Finite-Temperature String Method to the Alanine Dipeptide.
J. Chem. Phys. 2005, 123, 1–12.
(S10) Park, S.; Khalili-Araghi, F.; Tajkhorshid, E.; Schulten, K. Free Energy Calculation from
Steered Molecular Dynamics Simulations Using Jarzynski’s Equality. J. Chem. Phys.
2003, 119, 3559–3566.
(S11) Chatterjee, A.; Bhattacharya, S. J. Chem. Phys. 2015, 143, 114109.