Supporting Information

A New Class of Enhanced Kinetic Sampling Methods for Building Markov State Models

Arti Bhoutekar#, Susmita Ghosh†, Swati Bhattacharya†#, Abhijit Chatterjee#*

# Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400076 India, and †Department of Physics, Indian Institute of Technology Guwahati, Guwahati 781039 India

*Email: [email protected] , [email protected]

S1. Flow chart for state-constrained MD

S2. Flow chart for programmed state-constrained MD (PSC-MD)

Note: As discussed in the main text, MD can be replaced with another dynamical method.

S3. Markov state model for alanine dipeptide

A. System setup

A single molecule of alanine dipeptide (N-acetyl-N-methyl-L-alanylamide) was placed with 390 pre-equilibrated TIP3P water molecules in a periodic box of dimensions 2.3 × 2.3 × 2.3 nm³. We employed the CHARMM27 force field [S1].

B. Simulation protocols

Equilibration was performed at a constant pressure of 1 atm. Next, we performed energy minimization for 1600 steps using the conjugate gradient method. MD calculations with a Langevin thermostat were performed at 300 K with NAMD [S2]. Particle mesh Ewald was employed for the electrostatic terms. The RATTLE [S3] and SETTLE [S4] algorithms were applied to covalent bonds involving hydrogen in the peptide and in water, respectively. A time step of 2 fs was used. Independent MD trajectories were generated using different random seeds.

The system configuration was checked for a transition after every 100 MD steps by comparing the structure to a database of states. Structures were compared using the Kabsch algorithm with a tolerance of 1.5 Å. A match is said to occur when the non-hydrogen atoms of the alanine dipeptide molecule lie within the tolerance. A short MD calculation of duration τ = 0.8 ps is additionally performed to avoid counting recrossing events as transitions. The MSM of Fig.
2 in the main text was constructed using state-constrained MD calculations (see Section S1 for the flow chart).

C. Comparison to previous literature studies

The free energy map in the (φ, ψ) space obtained using standard MD calculations at 300 K with the setup described above is shown below. The free energy map is in good agreement with the one in the top-left panel of Fig. 1 of Ref. [S5].

Figure S1. Free energy map (in units of kBT) at 300 K from MD simulations. State 5 is not accessible at the timescales accessed.

Based on their (φ, ψ) values, states 1 and 3 correspond to the αR conformation, states 2 and 4 are associated with the β/PII/C7eq structures of the system, while state 5 (see Fig. 2 of the main text) can be identified as the C7ax conformation (see Ref. [S4]). The free energies of states 2-4 with respect to state 1, obtained by solving the MSM shown in the main text, are −0.06, 2.28 and 1.596 kcal/mol, respectively. These values are in good agreement with the free energy map shown above as well as with the free energy differences reported in Ref. [S6]. Transitions between β/PII/C7eq structures are known to be fast (sub-picosecond at 300 K), as reported in Ref. [S7], and our results show the same behavior (see Fig. 2). The rate for moves from state 1 to state 2 (α-helix to β-strand transitions) is estimated to be 0.045 ps⁻¹ (mean escape time of ∼22 ps), which is comparable to previous estimates (Ref. [S8]). The activation barriers for the moves 2-5, 5-2, and 5-4 are 6.55, 5.7, and 5 kcal/mol, respectively. The corresponding barriers in vacuum [S9] are less than 7, 5, and 5 kcal/mol, respectively.

S4. Additional results for stretched deca-alanine

A. Work done while stretching deca-alanine computed from MSMs

The work done in pulling deca-alanine from a compact to a stretched configuration can be calculated using the MSMs as follows. MSMs constructed for d = 16, 18, 20, 22, 23, 24 and 25 Å provide the average force and end-to-end distance for the molecule (see Fig.
11 in the main text for the average force). We assume that the molecule is stretched by moving one anchor point infinitely slowly. Because the molecule then reaches steady state for each value of the anchor separation d, the work done can be calculated from the MSMs.

Figure S2. Work done while pulling deca-alanine, calculated using the MSMs constructed in the main text (orange filled circles), compared with data from Ref. [S10]. Good agreement is observed with Ref. [S10]. Each symbol is obtained from a separate MSM constructed for the anchor separation d indicated in the figure (d = 16, 18, 20, 22, 23, 24 and 25 Å). Axes: work (kcal/mol) versus end-to-end distance (Å).

B. Rates along a folding pathway

Note that folding/unfolding can occur via parallel pathways, as shown in the MSMs in Fig. 9 of the main text. Suppose we focus on the pathway through states 17 → 6 → 2 → 1. The table below shows the α-helicity and 3₁₀-helicity of the states along this pathway, computed from state-constrained MD simulations. State 2 has the largest 3₁₀-helicity, i.e., 3₁₀-helical configurations form intermediates between the folded and unfolded states. The rates of folding/unfolding depend on the anchor separation, as shown in Fig. S3 below.

Table S1. Average α-helicity and 3₁₀-helicity of states along a folding pathway

State   α-Helicity   3₁₀-Helicity
1       0.801        0.08
2       0.201        0.21
6       0.021        0.19
17      0.013        0.15

Figure S3. States of stretched deca-alanine: a) state 1, b) state 2 and c) state 6. Rates of folding and unfolding as a function of the anchor separation along the path d) state 2 to 1, and e) state 6 to 2.

S5. Test for Markovian behavior

A special property of discrete-state, continuous-time Markov processes is that the waiting times t for transitions from a state S are exponentially distributed, i.e., p(t) = k exp(−kt).
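The exponential waiting-time property can be checked numerically. The following is a minimal sketch under stated assumptions, not the procedure used for the MSMs in the main text: the waiting times are synthetic, the rate of 0.045 ps⁻¹ is borrowed from Section S3 purely for illustration, the function names `mle_rate` and `ks_distance` are hypothetical, and the Kolmogorov-Smirnov distance is one possible measure of agreement chosen here for simplicity.

```python
import math
import random


def mle_rate(waiting_times):
    """Maximum-likelihood rate constant for exponentially distributed waiting times."""
    if not waiting_times:
        raise ValueError("need at least one waiting time")
    # For p(t) = k exp(-k t), the likelihood is maximized at k = n / sum(t).
    return len(waiting_times) / sum(waiting_times)


def ks_distance(waiting_times, k):
    """Largest gap between the empirical CDF and the exponential CDF 1 - exp(-k t)."""
    ts = sorted(waiting_times)
    n = len(ts)
    worst = 0.0
    for i, t in enumerate(ts):
        model = 1.0 - math.exp(-k * t)
        # The empirical CDF jumps from i/n to (i+1)/n at t; check both sides.
        worst = max(worst, abs(model - i / n), abs((i + 1) / n - model))
    return worst


# Synthetic waiting times (ps), drawn with an assumed true rate (illustration only).
rng = random.Random(0)
true_k = 0.045
samples = [rng.expovariate(true_k) for _ in range(5000)]

k_hat = mle_rate(samples)
d_stat = ks_distance(samples, k_hat)
print(f"MLE rate: {k_hat:.4f} ps^-1, mean waiting time: {1.0 / k_hat:.1f} ps")
print(f"Kolmogorov-Smirnov distance to k exp(-k t): {d_stat:.4f}")
```

A small distance indicates that the density built from the estimated rate describes the observed waiting times well; a large one would suggest that the state is not behaving as a single first-order escape process.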
Since the kinetic pathways from state S are assumed to be independent, the waiting times for each kinetic pathway S ⇋ S′ between states S and S′ are also exponentially distributed, i.e., the pathways are first-order processes. Although we employ maximum likelihood estimates (MLE) for the rate constant kf(d) in our MSMs, the probability density obtained from the MLE rate constant should be in good agreement with the observed waiting time distribution [S11].

References

(S1) MacKerell, A. D., Jr.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; et al. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586–3616.
(S2) Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kalé, L.; Schulten, K. Scalable Molecular Dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781–1802.
(S3) Andersen, H. C. Rattle: A “Velocity” Version of the Shake Algorithm for Molecular Dynamics Calculations. J. Comput. Phys. 1983, 52, 24–34.
(S4) Miyamoto, S.; Kollman, P. A. SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13, 952–962.
(S5) Smith, P. E. The Alanine Dipeptide Free Energy Surface in Solution. J. Chem. Phys. 1999, 111, 5568–5579.
(S6) Tobias, D. J.; Brooks III, C. L. Conformational Equilibrium in the Alanine Dipeptide in the Gas Phase and Aqueous Solution: A Comparison of Theoretical Results. J. Phys. Chem. 1992, 96, 3864–3870.
(S7) Strodel, B.; Wales, D. J. Implicit Solvent Models and the Energy Landscape for Aggregation of the Amyloidogenic KFFE Peptide. J. Chem. Theory Comput. 2008, 4, 657–672.
(S8) Chekmarev, D. S.; Ishida, T.; Levy, R. M. Long-Time Conformational Transitions of Alanine Dipeptide in Aqueous Solution: Continuous and Discrete-State Kinetic Models. J. Phys. Chem. B 2004, 108, 19487–19495.
(S9) Ren, W.; Vanden-Eijnden, E.; Maragakis, P.; E, W.
Transition Pathways in Complex Systems: Application of the Finite-Temperature String Method to the Alanine Dipeptide. J. Chem. Phys. 2005, 123, 1–12.
(S10) Park, S.; Khalili-Araghi, F.; Tajkhorshid, E.; Schulten, K. Free Energy Calculation from Steered Molecular Dynamics Simulations Using Jarzynski’s Equality. J. Chem. Phys. 2003, 119, 3559–3566.
(S11) Chatterjee, A.; Bhattacharya, S. J. Chem. Phys. 2015, 143, 114109.