Molecular Docking
Tuomo Laitinen, PhD (Chem.)
School of Pharmacy, UEF

Flow of slides…
• Introduction
• Search algorithms
• Scoring functions
• Reliability / limitations
• Performance

University of Eastern Finland, School of Pharmacy

Molecular Docking
“Docking studies are computational techniques for the exploration of the possible binding modes of a substrate to a given receptor, enzyme or other binding site.” (PAC, 1997, 69, 1137 (IUPAC Recommendations 1997), on page 1142)
“Docking is a computational technique that samples conformations of small molecules in protein binding sites; scoring functions are used to assess which of these conformations best complements the protein binding site.” (Warren et al., J. Med. Chem. 2006)
Slide made by Heikki Salo.

In theory, there is no difference between theory and practice. But, in practice, there is. (Jan L. A. van de Snepscheut / Yogi Berra)

What is docking?
• Protein - small ligand
• Protein - protein
  – Protein-protein interaction is key to understanding cellular events

University of Kuopio, Pharmaceutical Chemistry

Key-lock theory
• The specific action of a protein with a single substrate can be explained using the lock-and-key analogy first postulated in 1894 by Emil Fischer.
  – The lock is the enzyme and the key is the substrate.
  – Only the correctly sized key (substrate) fits into the keyhole (active site) of the lock (enzyme).

Key-lock theory
[Figure slides; 9 June, 2016, Tuomo Laitinen, Molecular databases]

Induced Fit Theory
• Not all experimental evidence can be adequately explained by the key-lock theory.
• Assumes that the substrate plays a role in determining the final shape of the enzyme and that the enzyme is partially flexible.
  – Explains why certain compounds can bind to the enzyme but do not react: the enzyme has been distorted too much.
  – Other molecules may be too small to induce the proper alignment and therefore cannot react.
  – Only the proper substrate is capable of inducing the proper alignment of the active site.

Molecular recognition
• A central phenomenon in biochemistry
  – enzymes and their substrates
  – protein receptors and ligands
  – antigens and their antibodies
  – etc.
• Approaches to investigate molecular recognition
  – Molecular docking
  – Free energy calculations
    • MM-GBSA, FEP, TI, LIE, …
  – QM/MM methods
    • Part of the system is treated with QM and the rest of the system with MM.
    • ONIOM method developed by Morokuma et al., 1996

Some definitions:
• pKd measures tightness of binding
• pKi measures ability to inhibit
• Free energy of binding
  • ΔG = ΔH − TΔS (ΔH enthalpy, ΔS entropy)
• Mechanisms
  – competitive inhibition (the most typical case in docking)
  – allosteric inhibition
    • the inhibitor binds to a different pocket
  – allosteric activation
    • the activator first binds to another pocket and activates the enzyme

Energetics of binding
• Gibbs energy of binding: ΔG = ΔH − TΔS (ΔH enthalpy, ΔS entropy)
• and ΔG = RT ln Ki = ΔH − TΔS, with Ki taken as a dissociation-type constant in mol/L (equivalently ΔG = −RT ln Ka for the association constant Ka = 1/Ki)
• Molecular recognition depends on both enthalpy (ΔH) and entropy (ΔS)
• Enthalpy
  – Direct interactions between ligand, solvent, proteins, ions
    • Ligand-protein interaction
    • Ligand-solvent interaction
    • Solvent-protein interaction
    • Conformational changes during binding
• Entropy
  – Rotational and translational entropy
  – Conformational entropy
  – Solvent reorganization (hydrophobicity)
  – Vibrational entropy

Prediction of binding energetics
• How to estimate binding
  – Entropy is usually size-dependent (rotational, translational, conformational, vibrational)
• Are waters released from the cavity on binding? How tightly were they bound?
• Can be measured using calorimetric methods
  – Isothermal titration calorimetry (ITC)
  – Differential scanning calorimetry (DSC)
• Difficult to estimate computationally
  – Hydrophobicity is also connected to size
  – Enthalpy usually covers the direct binding effects PLUS solvent effects
• Conformational effects also affect the enthalpic component

What is docking?
• Finding the right docking pose
• AND telling right docking poses from wrong ones
• AND telling high-affinity compounds from not-so-high-affinity compounds

Molecular Docking: Three Tasks of Molecular Docking
[Figure: the three tasks illustrated with example docking scores −9.1, −9.3, −7.6 and −8.5]
• Binding mode prediction
• Binding affinity prediction: predicted affinity (e.g. ΔG prediction) vs. experimental measurements (e.g. ITC ΔG measurement)
• Relative binding affinity prediction
Slide made by Heikki Salo.

Usage of Molecular Docking:
• Reproducing the binding mode of an X-ray ligand
• Predicting the binding mode of known active ligands
• Predicting the binding affinities of related compounds from a known active series
• Identifying new ligands using virtual screening

Amy C. Anderson. The process of structure-based drug design. Chem. Biol. 10:787-797, 2003
[Flowchart boxes: Choose drug target; Determine 3D structure (XRC or NMR); Homology modeling; Analyze inhibitor binding sites; Dock compound database to selected sites; Select a subset for in vitro testing; Is the lead a µM inhibitor? (yes/no); Determine 3D structure of ligand-protein complex (XRC or NMR); Analyze interactions, in silico optimization; Can the lead be modified? (yes/no); Is the lead a nM inhibitor? (yes/no)]
Slide made by Heikki Salo.
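The nM and µM decision points in the flowchart can be translated into binding free energies with the standard relation ΔG = RT ln K for a dissociation-type constant K in mol/L (equivalently −RT ln Ka). The following is a minimal Python sketch under stated assumptions: T = 298.15 K, Ki treated as an equilibrium dissociation constant, and a function name (`binding_free_energy`) that is purely illustrative and does not come from the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(k_molar: float, temperature: float = 298.15) -> float:
    """Return dG in kcal/mol for a dissociation-type constant given in mol/L.

    Assumes a 1 mol/L standard state, so dG = R * T * ln(K).
    """
    if k_molar <= 0:
        raise ValueError("constant must be positive")
    return R_KCAL * temperature * math.log(k_molar)

# Illustrative thresholds only: a micromolar and a nanomolar inhibitor.
for label, k in [("1 uM", 1e-6), ("1 nM", 1e-9)]:
    print(f"{label}: dG = {binding_free_energy(k):.2f} kcal/mol")
```

Under these assumptions a 1 µM inhibitor corresponds to roughly −8.2 kcal/mol and a 1 nM inhibitor to roughly −12.3 kcal/mol; each tenfold change in Ki shifts ΔG by about 1.36 kcal/mol at this temperature.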
End point of the structure-based design cycle: a potential drug candidate is passed on to further drug development phases.

Approaches for molecular docking
• Rigid ligand - rigid protein
  – Historically the first approaches
  – Search for the relative orientation of the two molecules with the lowest energy
  – Conformational analysis is needed for both ligand and protein
  – FLOG (Flexible Ligands Oriented on Grid): each ligand is represented by up to 25 low-energy conformations.
• Flexible ligand - rigid protein
  – Several protein structures are needed
  – GOLD, AutoDock, Glide
• Both ligand and protein are flexible
  – More time-consuming
  – “Induced Fit” methods (Glide, MOE)
  – Side-chain flexibility: Surflex, GOLD, AutoDock, …

Crude workflow of molecular docking:
• Binding mode prediction
  – a search algorithm that finds the docking complex structure, as measured by the scoring function
  – CPU-time consumption is critical
  – local minima
• Binding affinity prediction, e.g. ranking
  – a scoring function that can discriminate the correct (experimentally observed) docking complex structure from incorrect ones
  – strict control of false positives
  – good correlation with pKd
  – (Note: pKd does not always correlate with activity)
  – No consensus
  – Multiple terms

Search algorithms:
• 1) Stochastic search
• 2) Incremental construction
• 3) Multiconformer
  – Generation of a set of low-energy conformers for ligands
  – Rigid docking
  – FRED, FLOG

Stochastic search:
• Simulated annealing, MC, genetic algorithm, tabu search
• GOLD (GA), Glide (MC), AutoDock 4.0 (Lamarckian genetic algorithm)
• Monte Carlo simulated annealing (MCSA)
  – Random
  – Outcome varies
  – Repeat to improve chances of success
  – Glide
    • an initial rough positioning
    • torsionally flexible energy optimization (OPLS-AA)
    • MC refinement for the energetically best poses
  – AutoDock 2.4
    • random changes in the ligand's orientation and conformation in each temperature cycle
    • the new state is accepted (1) if its energy is lower than that of the previous state; (2) otherwise it is accepted based on a probability expression

Genetic Optimization for Ligand Docking (GOLD)
• GA (a genetic algorithm)
  – mimics the process of evolution
  – initially a population of conformations is generated
  – a scoring algorithm evaluates the fitness of each conformation
    • => conformation = chromosome
  – genetic operations (crossover, mutation)
• GOLD has fitness functions:
  – GoldScore or ChemScore
  – Calculations based on chemical and physical theories
    • Geometrical properties
    • Bonding affinities
• Full ligand flexibility
• Partial protein flexibility,
  – including protein side-chain and backbone flexibility for up to ten user-defined residues
  – the ability to dock into multiple models of the same or different proteins, i.e. ensemble docking

Stochastic algorithms continued…:
• Tabu search: limits the conformational search space
  • imposes restrictions to help the search process negotiate difficult regions
  • difficult regions are listed, and the search is prevented from entering these regions again
[Flowchart boxes: Initial solution (random); Generate moves; Evaluate; Rank moves based on interaction energy; Examine lowest energy; Accept or not; Reject; Tabu list]

Incremental construction:
  – The ligand is divided into single fragments
  – Incrementally reconstructed inside the active site: preferred torsional angles
  – DOCK, FlexX, Surflex-Dock
Protocol
1. Fragmentation
2. Selection of anchor fragment
   - specificity
   - placeability
3. Anchor fragment placement
4. Incremental addition of other fragments
[Figure: example ligand fragments labeled SO2CH3, Cl, COOH]

Surflex-Dock (Sybyl)
• Surflex-Dock is developed by A. N. Jain (J. Med. Chem. (2003), 46, 499-511)
• Uses an empirical scoring function
  • Successful at eliminating false positives
• Note: Surflex-Dock results depend on having a properly typed input ligand!
• Scoring function:
  – based on the binding affinities of protein-ligand complexes and on their X-ray structures
  – terms: hydrophobic, polar, repulsive, entropic, solvation
  – scores are expressed in −log10(Kd) units to represent binding affinities

[Figure: Surflex-Dock workflow]
• Identify active site (cavity)
• Probe binding pocket: the protein's surface is coated with three types of probes
  – CH4 represents a steric, hydrophobic probe
  – N-H represents a hydrogen bond donor probe
  – C=O represents a hydrogen bond acceptor probe
• Protomol generation
• Aligns ligand fragments to the protomol
• But! -> the score is calculated inside the binding pocket using scoring functions.

Scoring functions
• Gibbs energy of binding
• ΔG = ΔH − TΔS (ΔH enthalpy, ΔS entropy)
• => exact calculation is time-consuming
• Scoring functions are used to estimate free energies of binding
• Force field scoring
  – GoldScore, DOCK, AutoDock
• Empirical scoring
  – ChemScore, Glide SP/XP
• Knowledge-based scoring
  – PMF, DrugScore
• Consensus scoring

Scoring functions continued…
• Force field scoring
  – fast and transferable
  – well studied, with a physical basis
  – disadvantages
    • only part of the relevant energies is included (electrostatics dominates, which causes systematic problems in ranking)
  – force field scoring is based on the idea of using only enthalpic contributions to estimate the binding free energy
  – for example, DOCK's force field score consists of the intermolecular terms of the AMBER energy function
  – intra-ligand interactions are also included in the score

Scoring functions continued…
• 2. Empirical scoring function
  – Fast, with good predictive power
  – multivariate regression method
  – FlexX uses an empirical scoring function generated by Böhm
  – the free energy of binding is estimated as a sum of terms:

$$\Delta G = \Delta G_0 + \Delta G_{rot} N_{rot} + \Delta G_{hb} \sum_{\text{H-bonds}} f(\Delta R, \Delta\alpha) + \Delta G_{io} \sum_{\text{ionic}} f(\Delta R, \Delta\alpha) + \Delta G_{aro} \sum_{\text{aromatic}} f(\Delta R, \Delta\alpha) + \Delta G_{lipo} \sum_{\text{lipophilic contacts}} f^{*}(\Delta R)$$

    • Nrot is the number of rotatable bonds that are immobilized in the complex.
    • ΔGhb, ΔGio, ΔGrot, and ΔG0 are adjustable parameters.
    • ΔGaro accounts for the interactions of aromatic groups and is set to −0.7 kJ/mol.
    • ΔGlipo is a modified term that is calculated as a pairwise sum over all atom-atom contacts.
    • f(ΔR, Δα) is a scaling function that penalizes deviations from ideal geometry.
    • ΔR = R − R0, where R is the distance between the atom centers and R0 is the ideal value, which is assumed to be the sum of both van der Waals radii plus 0.6 Å.

Scoring functions continued…
• 2. Empirical scoring functions
  – X-Score
  – includes van der Waals interaction, hydrogen bonding, hydrophobic effect and deformation effect
  – Disadvantages
    • the functions are trained solely on crystal structures of protein-ligand complexes (medium-strong affinity)
    • no effective penalty term for bad conformations
    • usage/accuracy is feasible only for compounds similar to those included in the training set
  – ChemScore

Scoring functions continued…
• 3. Knowledge-based scoring function
  • Based on information from known protein-ligand complexes (Protein Data Bank)
  • More general than empirical scoring functions
  • PMF-Score: the protein-ligand interaction energy is calculated as a sum of distance-dependent pairwise potentials over all heavy-atom pairs between protein and ligand. Both enthalpic and entropic effects are assumed to be included implicitly.

Scoring functions continued…
• 3. Knowledge-based scoring function
  • DrugScore: the total protein-ligand binding score is a combination of distance-dependent potentials and surface-dependent potentials. It uses combinations of mean-field terms and extra terms, e.g. for solvation.

Scoring functions continued…
• 4. Consensus scoring
  – In protein-ligand docking, the scoring function is responsible for identifying the correct pose of a particular ligand as well as separating ligands from nonligands.
  – Consensus scoring involves combining the results from several rescoring experiments.
  – “Consensus” hypothesis: rescoring is a way of combining results from two scoring functions such that only true positives are likely to score highly.
  – “Complementary” hypothesis: the scoring functions used in rescoring have complementary strengths; one is better at ranking actives with respect to inactives, while the other is better at ranking poses of actives.

[Figure: Venn diagrams contrasting the performance of the three scoring functions for different proteins.] The results are taken from a single repetition (the first repetition) of the docking experiments.
(a) The number of actives placed in the top-ranked position. For example, there are 25 actives (out of 85) which all three scoring functions correctly place in the top-ranked position.
(b) Poses correctly predicted; that is, where the top-ranked pose is within 2.0 Å RMSD of the crystal structure. For example, there are 53 proteins where all three scoring functions correctly predict the active pose.
Published in: Noel M. O’Boyle; John W. Liebeschuetz; Jason C. Cole; J. Chem. Inf. Model. 2009, 49, 1871-1878.
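The central region of such a Venn diagram is the set of compounds that every scoring function places near the top, which is also the simplest form of the consensus idea. The sketch below illustrates this with a rank-by-vote scheme in Python. Rank-by-vote is one common consensus variant chosen here as an assumption, not the scheme used in the cited study; the function name, scoring-function labels, compound names and scores are all hypothetical.

```python
def rank_by_vote(score_tables, top_fraction=0.5, min_votes=2):
    """Return compounds that at least `min_votes` scoring functions rank
    within their own top `top_fraction` (lower score = better)."""
    votes = {}
    for scores in score_tables.values():
        ranked = sorted(scores, key=scores.get)  # best (most negative) first
        n_top = max(1, int(len(ranked) * top_fraction))
        for compound in ranked[:n_top]:
            votes[compound] = votes.get(compound, 0) + 1
    return sorted(c for c, v in votes.items() if v >= min_votes)

# Hypothetical scores (arbitrary units, lower is better) from three
# hypothetical scoring functions for four hypothetical compounds.
toy_scores = {
    "func_a": {"cpd1": -9.3, "cpd2": -7.6, "cpd3": -8.5, "cpd4": -5.0},
    "func_b": {"cpd1": -8.0, "cpd2": -8.8, "cpd3": -6.1, "cpd4": -5.5},
    "func_c": {"cpd1": -7.9, "cpd2": -6.0, "cpd3": -8.2, "cpd4": -4.0},
}
consensus_hits = rank_by_vote(toy_scores)                   # majority vote
unanimous_hits = rank_by_vote(toy_scores, min_votes=3)      # Venn centre
print(consensus_hits, unanimous_hits)
```

Requiring all three votes (`min_votes=3`) corresponds to the intersection of the three circles; lowering the threshold trades false positives for recall.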
DOI: 10.1021/ci900164f. Copyright © 2009 American Chemical Society

Problems with Docking and Scoring functions
• The best docking methods predict the experimental pose about 70% of the time, although selecting the program that will give the best result for any given target is not straightforward.
• The most stringent test of docking is the accurate prediction of the binding affinities of a series of related compounds.
  – This goal is essentially beyond all of the current docking methods.

Virtual screening performance

Problems continue…
• Prediction of binding affinities for a diverse set of molecules is a difficult task
  – Scoring
  – The search space is high-dimensional
    • Both molecules are flexible: hundreds to thousands of degrees of freedom
    • The total number of possible poses is astronomical
• About 30 docking programs
• Calculations in the gas phase
  – accurate calculations in water are time-consuming
• The free energy difference between the best ligand (potency 50 nM) and the experimental detection limit (potency 100 µM) is only 4.5 kcal/mol
  – Conformational factors alone for ligands can be as large as that

Docking Accuracy
• Docking of 100 ligands to their cognate protein X-ray structure.
• Cumulative percentage of complexes as a function of the RMSD from the X-ray pose.
• (A) Docking accuracy: RMSD in Å of the best pose (nearest to the experimental binding mode) from the experimental solution.

Scoring accuracy
• (B) RMSD in Å of the top pose (best-scored solution) from the experimental solution.
• The current plots were obtained using the X-ray pose as the input conformation of the ligand to dock.
• Note the scale!
• Scoring… ☹

Docking Performance
[Figure: the three tasks of molecular docking, with example docking scores −9.1, −7.6, −9.3 and −8.5]
• Binding mode prediction: docking power
• Binding affinity prediction: scoring power; predicted affinity (e.g. ΔG prediction) vs. experimental measurements (e.g. ITC ΔG measurement)
• Relative binding affinity prediction: screening performance
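The difference between docking accuracy (best pose) and scoring accuracy (top-scored pose) can be made concrete with a small RMSD calculation. The following is a minimal sketch in plain Python under several assumptions: poses are lists of matched heavy-atom coordinates already in the crystal frame (no superposition, no symmetry correction), lower scores are better, and success means an RMSD of at most 2.0 Å, the threshold quoted in the Venn-diagram caption. The function names, coordinates and scores are invented for illustration.

```python
import math

def rmsd(pose, reference):
    """Root-mean-square deviation (in Å) between two equally ordered atom lists."""
    if len(pose) != len(reference):
        raise ValueError("atom counts differ")
    sq = sum((a - b) ** 2 for p, r in zip(pose, reference) for a, b in zip(p, r))
    return math.sqrt(sq / len(reference))

def docking_and_scoring_rmsd(poses, reference):
    """poses: list of (score, coordinates); lower score = better.
    Returns (best-pose RMSD, top-scored-pose RMSD)."""
    rmsds = [rmsd(coords, reference) for _, coords in poses]
    best = min(rmsds)  # docking accuracy: pose nearest to experiment
    top_index = min(range(len(poses)), key=lambda i: poses[i][0])
    return best, rmsds[top_index]  # scoring accuracy: best-scored pose

# Hypothetical 3-atom ligand and two hypothetical poses.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
poses = [
    (-9.1, [(0.0, 3.0, 0.0), (1.5, 3.0, 0.0), (3.0, 3.0, 0.0)]),  # shifted by 3 Å
    (-7.6, [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]),  # shifted by 0.5 Å
]
best_rmsd, top_rmsd = docking_and_scoring_rmsd(poses, crystal)
print(f"best pose: {best_rmsd:.2f} Å, top-scored pose: {top_rmsd:.2f} Å")
print("sampling success:", best_rmsd <= 2.0, "| scoring success:", top_rmsd <= 2.0)
```

In this toy case the search finds a near-native pose (0.5 Å), but the scoring function ranks a pose 3 Å away first, so sampling succeeds while scoring fails.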