CovDock for Covalent Docking
Dora Toledo Warshaviak
Senior Applications Scientist
Overview
• Introduction
– General background and challenges
• Schrödinger’s approaches to covalent docking
– CovDock, Pose Prediction mode (default mode)
• Pose Prediction
• SAR
– CovDock, Virtual Screening mode
• Virtual Screening
• CovDock job setup
• Conclusions and Further Work
Introduction: 100 Years Time-Line of Covalent Drugs
• Numerous examples of effective drugs that use a covalent mechanism of action
• More recently:
– Selinexor, which inhibits XPO1: Phase II clinical trials for a range of hematological and solid tumor malignancies
– Afatinib, which targets EGFR tyrosine kinase: Phase III clinical trials for metastatic non-small cell lung cancer
– CC-292 (AVL-292), which targets Bruton’s tyrosine kinase: Phase I/II clinical trials for non-Hodgkin lymphoma and rheumatoid arthritis
Singh et al., “The resurgence of covalent drugs” (2011) Nature Reviews Drug Discovery 10, 307-317
Analysis of 39 FDA Approved Covalent Drugs
Therapeutic areas (pie chart): anti-infective (33%), cancer (20%), gastrointestinal (15%), other (13%), central nervous system (10%), cardiovascular (5%), inflammation (3%)
Blockbuster covalent drugs:
• Clopidogrel inhibits P2Y purinergic receptor 12; an innovative treatment for vascular disorders
• Esomeprazole and Lansoprazole are proton pump inhibitors with a long duration of action
Adapted from “Singh et al., (2011) Nature Reviews Drug Discovery 10, 307-317”
Mechanism of Covalent Inhibition
– A covalent bond is formed between the target
and the inhibitor
– The inhibition can be either reversible or
irreversible
• Fig. a: EGFR protein bound to Neratinib (2JIV)
• Fig. b: Hepatitis C virus protease bound to a peptide-like ligand (3OPY)
– Protein backbone in green; ligand and Cys side chain shown as sticks
– Covalent bond between Cys side chain and inhibitor
Examples of Bond Formation in Covalent Inhibitors
A. Michael Reactions
– The ligand L is the electrophile with the Michael acceptor group (O=C-C=C); a Cysteine residue in the receptor R is the nucleophile
• EGFR inhibitor, Afatinib
B. Nucleophilic Addition
– The double bond is the electrophile and a Cysteine residue is the nucleophile
• HCV Protease inhibitor, Boceprevir
C. β-lactam Ring Opening
– The four-membered β-lactam is the electrophile, while the nucleophile is a Serine residue
• Penicillin Binding Protein 1B inhibitor, Penicillin
Challenges for Covalent Inhibitor Design
• Historically there have been many challenging issues mostly concerned with
safety
– Off target binding
– Off target reactivity
• Undergo metabolism to form highly reactive intermediates that covalently bind to
proteins, causing tissue damage or liver toxicity, or initiating an unwarranted immune
response
• Pre-existing reactive electrophilic functionality in the parent structure (penicillin)
• Further, measuring activity is difficult
– For irreversible inhibitors in particular, drug activity is governed by reaction kinetics
rather than conventional binding thermodynamics
• Residence time and percentage of receptor occupancy leading to a pharmacological
effect
Overcoming Challenges
• Nevertheless, covalent interaction with the target protein has
the benefit of prolonged duration of the biological effect and
potential for improved selectivity
– Use bioinformatics to identify proteins with rare nucleophiles
– Use warhead to target non-conserved residues
– Off target reactivity is reduced by ‘designing specific combinations of
non-covalent and covalent interactions’
• There is a clear need for a technology that supports structure-based design and screening of covalent drugs
CovDock Uses Glide & Prime
Main steps
• Conventional non-covalent docking of pre-reactive species (Glide)
• Formation of covalent attachment (via a number of different mechanisms)
• Structural refinement of the covalent complex (Prime)
More details in the following two Schrödinger papers:
• Pose Prediction and Scoring
- Zhu, K.; et al. Docking Covalent Inhibitors: A Parameter Free Approach to Pose
Prediction and Scoring, 2014, J. Chem. Inf. Model., 54(7):1932-40
• Virtual Screening
- Toledo Warshaviak, D.; et al. A Structure-Based Virtual Screening Approach for Discovery
of Covalently Bound Ligands, 2014, J. Chem. Inf. Model., 54(7):1941-50
Why are Prime Refinement Steps Used?
• Example of reactions:
– Michael Addition (Fig. A)
– Nucleophilic Addition (Fig. B)
• The change in hybridization can
drastically affect the relative
orientations of the three functional
groups (R1, R2 and R3)
• Prime is used to simultaneously
optimize the ligand pose and the
attachment residue to produce a good
geometry
• The covalent bond parameters are
taken from the OPLS force field
– non-physical bond distances, angles or
torsions are penalized
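[Figure: reaction schemes showing the Cys S- nucleophile, numbered ligand atoms 1-4, and substituents R1, R2, R3 before and after reaction]
A. Michael addition; ligand SMARTS: 1,[C]=[C]-[C]=[O]
B. Nucleophilic addition; ligand SMARTS: 4,[N-0X3][C-0X3](=[O-0X1])[C-0X3]=[O-0X1]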
Details: Mimicking Key Steps of Binding Process
• GOOD PROXIMITY of Receptor R, Ligand L
– A successful covalent inhibitor in its pre-reactive form must fit in
the binding site with a pose that brings the respective reactive
functional groups R, L, into close proximity
• ConfGen samples L conformations, saving 3 low energy ones, prior to Glide
docking
• To enable closer approach during docking, the reactive residue is mutated to ALA
• A positional constraint is used in Glide to keep the reactive moiety of L within 8 Å of
the C-beta atom of the reactive residue of R
• STABLE BINDING MODE
– Non-covalent interactions must be able to maintain an appropriate
stable binding mode for a sufficiently long time to allow formation
of the covalent bond
• All L poses with a Glide Score within 2.5 kcal/mol of lowest sampled score, are
retained
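The pose-retention rule above can be sketched in a few lines of Python. This is only an illustration, not CovDock's implementation: the `Pose` class, the `select_poses` function and the example scores are hypothetical; only the 2.5 kcal/mol window comes from the slide.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    name: str            # hypothetical pose label
    glide_score: float   # kcal/mol; more negative is better


def select_poses(poses, window=2.5):
    """Keep poses whose score lies within `window` kcal/mol of the lowest score."""
    if not poses:
        return []
    best = min(p.glide_score for p in poses)
    return [p for p in poses if p.glide_score - best <= window]


# Made-up scores, for illustration only.
poses = [Pose("a", -9.1), Pose("b", -7.0), Pose("c", -6.5), Pose("d", -8.3)]
kept = select_poses(poses)
print([p.name for p in kept])
```

Pose "c" is dropped here because it scores 2.6 kcal/mol above the best pose.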
Details: Mimicking Key Steps of Binding Process
• CONFORMATIONAL CHANGES, BOND FORMATION
– R & L may undergo conformational changes to facilitate the
covalent reaction, but throughout the reaction process the non-covalent interactions must keep L appropriately positioned
in the binding site for the reaction to proceed
• The reactive residue in R is mutated back, and its side-chain conformation is
sampled with a rotamer library
• Poses are checked against a distance criterion between the two atoms that will
form the bond. The initial limit is 5 Å; subsequent optimization steps reduce it to a
more physically realistic distance
– Covalent bond formed based on a number of possible reaction
types (allowing for bond order changes, protonation changes,
all chiral centres are kept)
• Additional types can be added via input file
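The distance check described above amounts to a simple Euclidean cutoff. The sketch below assumes plain (x, y, z) tuples in Å and a hypothetical helper name; only the 5 Å limit is taken from the slide.

```python
import math


def within_bonding_distance(ligand_atom, receptor_atom, cutoff=5.0):
    """Return True if the two bond-forming atoms are within `cutoff` Å."""
    return math.dist(ligand_atom, receptor_atom) <= cutoff


# Illustrative coordinates only (Å).
print(within_bonding_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))  # exactly 5 Å apart
print(within_bonding_distance((0.0, 0.0, 0.0), (3.0, 4.0, 1.0)))  # slightly beyond 5 Å
```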
Details: Mimicking Key Steps of Binding Process
• UNSTRAINED FINAL GEOMETRY
– The geometry of the L & R residue should exert minimal strain on the covalent bond and
the ligand pose, post reactive form.
• Complexes clustered to eliminate duplicates
• Using Prime, covalent complexes are minimized in vacuum, restoring normal bond lengths and relieving clashes,
which provides a good starting point for further optimization
• Selected poses are minimised using VSGB2.0. The Prime energy is used for ranking
and finding the most likely binding geometry
• SCORING
– Affinity is a combination of covalent and non-covalent
interactions
Affinity Score = average of initial GlideScore and post-reaction GlideScore
• does not include a direct calculation of the bond formation energy
(for ranking compounds with similar intrinsic reactivity)
– Additional scoring schemes are available, such as MMGBSA
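As a worked example of the affinity score defined above (the average of the pre-reaction and post-reaction GlideScore), here is a minimal sketch. The function name, ligand names and scores are placeholders, not CovDock output.

```python
def affinity_score(pre_reaction_gscore, post_reaction_gscore):
    """Average of the initial (pre-reaction) and post-reaction GlideScore."""
    return 0.5 * (pre_reaction_gscore + post_reaction_gscore)


# Hypothetical ligands with (pre-reaction, post-reaction) GlideScores in kcal/mol.
ligands = {"lig1": (-8.0, -6.0), "lig2": (-7.5, -7.9), "lig3": (-5.0, -6.2)}

# Rank ligands from best (most negative) to worst affinity score.
ranked = sorted(ligands, key=lambda name: affinity_score(*ligands[name]))
print(ranked)
```

Because no bond-formation energy is included, such a ranking is only meaningful among compounds with similar intrinsic reactivity, as the slide notes.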
Summary of CovDock Steps
CovDock – Pose Prediction Mode (default)
Workflow steps:
1. Mutate reactive residue to ALA
2. Sample ligand conformations with ConfGen
3. Glide docking with positional constraint
4. Apply pose selection criteria
5. Prime rotamer sampling of attachment residue + bond formation
6. Cluster poses by XYZ
7. Minimize poses with Prime VSGB2.0
8. Glide in place scoring of minimized poses
9. Scoring by affinity score
Pose counts shown along the workflow: 20-100 poses; 120-600 poses; 30-50 poses; 1 pose
• Automatic workflow
• SMARTS based definition of reactive groups
• No parameter fitting
• Easily customizable protocol
Application to Pose Prediction and Scoring
Protocol
• Three Areas of Interest: Pose Prediction; SAR Study; Virtual Screening
• Preparation
– Ligands prepared in their pre-reactive form with LigPrep, Epik
– Protein preparation with Protein Preparation Wizard
• Pose Prediction
– Self Docking: 38 PDB Complexes
• 27 Michael Addition; 11 Nucleophilic Addition
– Head to head: 76 complexes from Ouyang et al.*
• 13 Michael Addition; 63 acetylation beta-lactam
– Assessment based on heavy atom RMSD to reference crystal structure
*Ouyang, X. et al. (2012) Journal of Computational Chemistry, 34(4), 326–336
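The assessment metric can be sketched as follows. This assumes the docked and crystal ligands list the same heavy atoms in the same order and already share the receptor frame, so no superposition is done; the function names are hypothetical and the 2 Å success threshold follows the results slides.

```python
import math


def heavy_atom_rmsd(coords_a, coords_b):
    """In-place RMSD (Å) between two equally ordered lists of (x, y, z) heavy-atom coordinates."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def success_rate(rmsds, threshold=2.0):
    """Fraction of complexes predicted with RMSD below `threshold` Å."""
    return sum(r < threshold for r in rmsds) / len(rmsds)


# Toy coordinates: every atom is displaced by 1 Å along z.
docked = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(heavy_atom_rmsd(docked, crystal))
```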
Self Docking Results
• Results from 38 Complexes, Self Docking
− Protein families include; Cysteine and Serine Proteases, Kinases, Polymerase, Caspase
Step | Successful prediction (RMSD < 2 Å) | RMSD between lowest energy pose and reference X-tal structure (Å)
Glide initial | 63% | 2.06
Final prediction with Prime | 76% | 1.52
Best of 10 lowest energy poses | 84% | 1.1
• Challenges
− Prime refinement and ranking provides additional accuracy (e.g. Cysteine protease, 1u9q
cruzain)
− Largest prediction errors seen in ‘HCV NS5B Polymerase’, where pre- and post-reactive forms
are highly dissimilar
Challenges for Pose Prediction
Binding pose of 1u9q, Cysteine Protease
Showing Need of Final Prime Prediction
Green = X-tal structure
Purple = Top pose Glide initial (RMSD 2.17Å)
Blue = Final Prime Prediction (RMSD 0.97Å)
Binding pose of 2ax1 from ‘HCV NS5B
Polymerase’ Group. Large Conformational
Change Poorly Handled
Green = X-tal structure
Purple = Final Prime Prediction (RMSD 6.19Å)
Results for Head to Head Comparison
• Results from 76 Ouyang et al. complexes
— 13 Michael Addition; 63 acetylation beta-lactam
• Additional comparison made with AutoDock and GOLD
• RMSD is measured between the docked pose and the reference crystal
structure
Pose (RMSD, Å) | Schrödinger CovDock-default | CovalentDock* | Autodock* | GOLD*
Top scoring pose | 1.8 | 3.4 | 3.5 | 4.0
Best of 10 lowest energy poses | 1.4 | 1.9 | 2.5 | 3.4
*Ouyang, X. et al. (2012) Journal of Computational Chemistry, 34(4), 326–336
Overview of SAR Study: Series I
Three Areas of Interest: Pose Prediction; SAR Study for two congeneric series; Virtual Screening
• 11 acrylamide inhibitors of cSrc kinase (Michael reaction with Cys345)
• Correlation between calculated and experimental binding activities: R² = 0.62
• Br-phenyl (strongest, in green) fits the pocket better than tert-butyl (weakest, in purple),
which is pushed outside the pocket and therefore scores poorly
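For reference, an R² of this kind is typically the squared Pearson correlation between calculated scores and experimental activities. The sketch below assumes that definition; the numbers are placeholders, not the cSrc data.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Placeholder data: calculated affinity scores vs. experimental activities (e.g. pIC50).
calculated = [-7.2, -6.8, -8.1, -5.9, -7.7]
experimental = [6.9, 6.1, 7.8, 5.5, 6.8]
print(round(r_squared(calculated, experimental), 2))
```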
Overview of SAR Study: Series II
Second congeneric series
• 26 peptidyl ketoamides of HCV serine protease (nucleophilic addition to Ser139)
– Core constraints used due to the shallow, solvent-exposed pocket and the
size and flexibility of the ligands
– R² = 0.32
– Likely error sources: unaccounted entropic contributions from the long flexible
part of the ligands; accuracy of the measured IC50 values
– Worthy of further analysis with FEP
CovDock for Virtual Screening
• Limited VS applications/tools for covalent docking currently exist
and process is not well automated
– Across tools: limited auto preparation of ligands and protein, manual definition of
reactive atoms and reaction type
• CovDock “pose prediction” mode takes about 1-3 hours/ligand per
CPU. Need better speed to screen thousands of ligands efficiently
• The CovDock Virtual Screening (-VS) Mode specifically tailored to
address throughput needs, while retaining good pose-prediction
(VS mode is 10-fold faster than the pose prediction mode)
• Method for CovDock-VS varies slightly from default CovDock “Pose
Prediction” mode w.r.t. sampling and scoring
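A quick back-of-the-envelope estimate shows why throughput matters. The sketch uses the per-ligand timings quoted in this deck (1-3 hours for pose prediction mode, 3-14 min for VS mode); the library size of 5000 is just an example.

```python
def cpu_hours(n_ligands, minutes_per_ligand):
    """Total single-CPU time in hours for screening `n_ligands`."""
    return n_ligands * minutes_per_ligand / 60.0


n_ligands = 5000  # example library size

# (low, high) estimates in CPU hours for each mode.
pose_prediction = (cpu_hours(n_ligands, 60), cpu_hours(n_ligands, 180))
virtual_screening = (cpu_hours(n_ligands, 3), cpu_hours(n_ligands, 14))

print("Pose prediction mode:", pose_prediction)
print("Virtual screening mode:", virtual_screening)
```

Under these assumptions the VS mode needs roughly 250 to 1170 CPU hours, versus 5000 to 15000 CPU hours for the default mode.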
‘Virtual Screening Mode’ Varies in Sampling and Scoring from the
‘Pose Prediction Mode’
Pose Prediction Mode (default)
1. Mutate reactive residue to ALA
2. Sample ligand confs with ConfGen
3. Glide docking with positional constraint
4. Apply pose selection criteria
5. Prime rotamer sampling of attachment residue + bond formation
6. Cluster poses by XYZ; keep all (Kelly)
7. Minimize poses with Prime VSGB2.0
8. Glide in place scoring of minimized poses
9. Scoring by Affinity score
Virtual Screening Mode
1. Mutate reactive residue to ALA
2. …
3. Glide docking with positional constraint
4. Apply different pose selection criteria
5. Rotamer sampling of attachment residue + bond formation
6. Cluster poses by XYZ; keep all (ncluster 3)
7. …
8. …
9. Scoring by Glide Score
Virtual Screening Study on Four Targets
Dataset and Preparation
• Known covalent inhibitors and xtal structures for HCV NS3
protease, Cathepsin K, EGFR, XPO1
– Multiple conformations used for EGFR (active/inactive); structures prepared
with the Protein Preparation Wizard
• Decoy library of similar physicochemical properties to the
known actives, and same chemical warheads
– Various sources including ChemDiv, Chembridge, Enamine, LifeChem
– Known actives sourced from literature
– Both prepared with LigPrep
Toledo Warshaviak, D.; et al. A Structure-Based Virtual Screening Approach for Discovery of
Covalently Bound Ligands, 2014, J. Chem. Inf. Model., 54(7):1941-50
Virtual Screening Results
CovDock-VS used with protein-ligand interaction filtering, is effective in
retrieving known covalently bound actives quickly
– Found viable poses satisfying the expected P-L interactions for 72%, 81%, 77% of
the known actives of 3 systems shown below respectively
– XPO1 had no filters applied
Target | Potency Range | Known Actives | Decoys | EF 1% | EF 10% | BEDROC (α = 20)
HCV NS3 Protease | 2-4300 nM | 25 | 1562 | 52 | 7 | 0.70
Cathepsin K | 0.13 – 460 nM | 21 | 1562 | 9 | 8 | 0.48
EGFR | 0.5 – 1 uM | 34 | 5000 | 46 | 8 | 0.65
XPO1 | 25nM – 5 uM | 21 | 5000 | 33 | 7 | 0.52
Quality of Known Active Rankings wrt Decoys
• For 3 systems, all of known
actives that satisfied H-bond
filters were found in top 6% of
database
• In the XPO1 case, the curve is more gradual
– 95% actives retrieved within 30%
of decoy library
– 33% of known compounds are
within 1% of decoys library
(EF1% = 33)
[Enrichment plot legend: Cathepsin K, EGFR, HCV NS3 Protease, XPO1]
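The enrichment factor quoted above (33% of actives within the top 1% gives EF1% = 33) can be sketched as below. The function name and the toy ranking are hypothetical; the definition is the usual ratio of the active rate in the top fraction to the active rate in the whole database.

```python
def enrichment_factor(ranked_is_active, fraction):
    """EF at `fraction` of a ranked database.

    `ranked_is_active` lists booleans ordered from best to worst score,
    True for known actives and False for decoys.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    n_top = max(1, round(n_total * fraction))
    hits = sum(ranked_is_active[:n_top])
    return (hits / n_top) / (n_actives / n_total)


# Toy ranking: 1000 compounds, 10 actives, 3 of them ranked in the top 10.
ranking = [True] * 3 + [False] * 7 + [True] * 7 + [False] * 983
ef_1pct = enrichment_factor(ranking, 0.01)
print(round(ef_1pct, 1))
```

In this toy case 30% of the actives fall in the top 1%, so EF1% is about 30, mirroring the relationship stated for XPO1.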
Apply H-bond Filters in Virtual Screening
[Figure panels: Cathepsin K, HCV NS3 protease, EGFR, XPO1]
• CovDock-VS mode (orange); CovDock-default mode (green); crystal structure (gray)
• The known H-bond interactions used in post-docking filtering of the poses are shown as dashed lines
• A maximum of ten poses are generated for each ligand. Those that do not make the H-bond are filtered out.
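The post-docking filter described above can be sketched as follows. The data layout (a dict of pose lists with a precomputed `makes_hbond` flag) is an assumption made for illustration; detecting the H-bond itself is outside this sketch.

```python
def filter_poses(poses_by_ligand, max_poses=10):
    """Keep, per ligand, only those of the first `max_poses` poses that make the required H-bond.

    Ligands with no passing pose are dropped from the result.
    """
    kept = {}
    for ligand, poses in poses_by_ligand.items():
        passing = [p for p in poses[:max_poses] if p["makes_hbond"]]
        if passing:
            kept[ligand] = passing
    return kept


# Hypothetical poses; scores and flags are made up.
docked = {
    "active_1": [{"score": -8.2, "makes_hbond": True}, {"score": -7.9, "makes_hbond": False}],
    "decoy_1": [{"score": -8.5, "makes_hbond": False}],
}
filtered = filter_poses(docked)
print(sorted(filtered))
```

Here the decoy is removed despite its better score, which illustrates how the filter can reduce false positives.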
The Effects of the Filters
The use of H-bond filters is important to the quality of
enrichment results
– Filters improve enrichment results between 1-3 fold
– H-bond specificity not well modelled by Glide score
– Knowledge can be incorporated to reduce the false positives
Target | With Filters: EF1% | With Filters: EF10% | Without Filters: EF1% | Without Filters: EF10%
HCV NS3 Protease | 52 | 7.2 | 16 | 2.7
Cathepsin K | 9.4 | 8.1 | 4.7 | 1.4
EGFR | 46 | 7.7 | 38 | 6.2
XPO1 | 33 | 6.7 | 33 | 6.7
A Comparison of the Binding Mode Quality
• CovDock-default and CovDock-VS differ in their protocols
– Sampling and scoring are reduced in VS mode in order to save time
• CovDock-VS failed to generate poses that satisfy important H-bond
constraints, mostly in HCV NS3 protease
Mode | Time per CPU (per ligand) | % Failed | RMSD < 2 Å (with filters) | Mean RMSD (with filters)
CovDock-default | 1-3 hours | 11% | 71% (76%) | 1.6 Å (1.5 Å)
CovDock-VS | 3-14 min | 26% | 43% (64%) | 2.5 Å (1.9 Å)
[Bar chart: average RMSD (Å, axis 0 to 2.5) for Virtual Screening Mode and Pose Prediction Mode; categories: Cathepsin K, HCV NS3 protease, CRM1/XPO1, EGFR, All]
Based on self docking of 21 covalent complexes across 4 targets
A Comparison of Docking Pose Quality
Example of poses that did not satisfy post-docking H-bond filters
– Dashed lines = H-bond used in post-docking filtering
– CovDock-VS mode (orange)
– CovDock- default mode (green)
– Crystal structure (gray)
HCV NS3 Protease with Boceprevir
HCV NS3 Protease with Narlaprevir
HCV NS3 Protease with CVS4819
Conclusions and Further Work
• The binding pose prediction is successful and compares well with other
programs
• SAR can provide useful information for lead optimization work
– Current affinity score works reasonably well
– Need further datasets
– Develop better scoring scheme for affinity (e.g. MMGBSA, FEP)
– Use of QM to provide an estimate of reactivity
• Virtual Screening mode can handle thousands of ligands on desktop
computers and has been validated on various targets
– On average, 10 fold faster than pose prediction mode
– Further improvement on speed and scalability
• The protocol has no parameter fitting to any specific reaction type, is fully
automatic and easy to set up
– SMARTS based definition of reactive atoms
– Supports custom chemical reaction
Working with the Interface
Two modes of docking
Applications > Glide > Covalent Docking
1. Pose Prediction
2. Virtual Screening
Pose Prediction Mode
Set Up Docking
• Applications > Glide > Covalent Docking
1. Choose the ligand source as selected entries
2. Choose reaction type
3. Receptor reactive residue (pick CYS 797)
4. Centroid of active site for docking, using ligand
5. Choose docking mode
[Figure: panel callouts 1-5 for the steps above; example structure 2JIV]
Results for 2JIV Covalent Docking
Analyse “Results” in the Project Table (PT)
• Fig. 1: The covalent bond is correctly formed in the predicted structure (purple) between CYS797 and
the ligand
– The Cdock affinity score is used to rank across different ligands
– The Prime energy is used to rank across poses of the same ligand
• Fig. 2: The predicted binding pose shows some variation from the crystal structure (green)
• Run time on 1 CPU (Windows XP, release 2014-2): 1 h 15 min
Custom Reactions
• When the reaction of interest is not
pre-defined, a custom reaction can
be created
• For example:
– A cyano group instead of a keto group as the electron-withdrawing
group (EWG) in Michael acceptors
– If you try to set up the Covalent Docking panel with the “Michael
addition” reaction type, it will not recognize the [C]=[C]-[C]#[N]
pattern as a Michael acceptor
– Choose Custom, then browse for the <filename>.cdock file
Custom Reaction example
Custom Chemistry Definitions (Covalent Docking User Manual, Chapter 5 pg.22 ...)
– Location of reactive atoms: KEYWORD, ATOM#, SMARTS
– Formation of covalent complex: (pattern,("keyword",[value,]indices))
Modified Michael Addition (*cdock file)
# Options to set custom chemistry
# Select the first atom in this smarts pattern as the ligand attachment atom
LIGAND_SMARTS_PATTERN 1,[C]=[C]-[C]#[N]
# Select the second atom in this smarts pattern as the receptor attachment atom
RECEPTOR_SMARTS_PATTERN 2,[C]-[S,O;H1,-1]
# Set the charge on the first atom in the SMARTS pattern <1>
# (where <1> is replaced by the receptor attachment atom) to 0
CUSTOM_CHEMISTRY ("<1>",("charge",0,1))
# Set the order of the bond between the first and second atoms in the
# following SMARTS pattern to 1 (where <1> is the receptor attachment atom and
# <2> is the ligand attachment atom)
CUSTOM_CHEMISTRY ("<1>|<2>",("bond",1,(1,2)))
# Set the order of the bond between the 1st and 2nd atoms in the following
# smarts pattern to 1 (where <1> is the receptor attachment atom and <2>
# is the ligand attachment atom)
CUSTOM_CHEMISTRY ("<2>=[C]-[C]#[N]",("bond",1,(1,2)))
# Done options to set custom chemistry
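To make the file layout concrete, the sketch below reads a .cdock-style text into Python structures. It is not Schrödinger's parser: `parse_cdock` is a hypothetical helper that only assumes the "KEYWORD value" layout, "#" comments, "index,SMARTS" patterns and tuple-style CUSTOM_CHEMISTRY values shown above.

```python
import ast


def parse_cdock(text):
    """Parse '.cdock'-style lines into a dict (illustrative only)."""
    settings = {"CUSTOM_CHEMISTRY": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        keyword, _, value = line.partition(" ")
        value = value.strip()
        if keyword == "CUSTOM_CHEMISTRY":
            # The value is written as a Python-style nested tuple.
            settings[keyword].append(ast.literal_eval(value))
        elif keyword.endswith("_SMARTS_PATTERN"):
            # "<atom index>,<SMARTS>": split on the first comma only.
            index, _, smarts = value.partition(",")
            settings[keyword] = (int(index), smarts)
        else:
            settings[keyword] = value
    return settings


example = """
# Options to set custom chemistry
LIGAND_SMARTS_PATTERN 1,[C]=[C]-[C]#[N]
RECEPTOR_SMARTS_PATTERN 2,[C]-[S,O;H1,-1]
CUSTOM_CHEMISTRY ("<1>",("charge",0,1))
CUSTOM_CHEMISTRY ("<1>|<2>",("bond",1,(1,2)))
"""
parsed = parse_cdock(example)
print(parsed["LIGAND_SMARTS_PATTERN"])
```

Splitting on the first comma matters because SMARTS patterns such as [S,O;H1,-1] contain commas themselves.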
Defining Custom Reactions
The job can be run from the GUI, or the input (*.inp) file can be written out
and run from the command line:
$SCHRODINGER/covalent_docking <filename>.inp
covalent_docking_4D9U-mod.inp
REC_FILE covalent_docking_4D9U-mod_rec.maegz
LIG_FILE covalent_docking_4D9U-mod_lig.maegz
ATTACHMENT_RESIDUE A:436
CUSTOM_CHEMISTRY ('<1>', ('charge', 0, 1))
CUSTOM_CHEMISTRY ('<1>|<2>', ('bond', 1, (1, 2)))
CUSTOM_CHEMISTRY ('<2>=[C]-[C]#[N]', ('bond', 1, (1, 2)))
LIGAND_SMARTS_PATTERN 1,[C]=[C]-[C]#[N]
RECEPTOR_SMARTS_PATTERN 2,[C]-[S,O;H1,-1]
GRID_OPTION ACTXRANGE=18.251091
GRID_OPTION ACTYRANGE=18.251091
GRID_OPTION ACTZRANGE=18.251091
GRID_OPTION INNERBOX=10,10,10
GRID_OPTION GRID_CENTER=-28.080497,3.730237,-25.023411
GRID_OPTION OUTERBOX=18.251091,18.251091,18.251091
NPOSES 1
AFFINITY
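For scripted setups, an input file of this shape can be generated and the command line assembled programmatically. The sketch below is an assumption-laden illustration: `render_inp` and `build_command` are hypothetical helpers, only keywords that appear on this slide are used, and nothing is executed.

```python
import os


def render_inp(settings):
    """Render (keyword, value) pairs as 'KEYWORD value' lines.

    A list of pairs is used rather than a dict because keywords such as
    CUSTOM_CHEMISTRY and GRID_OPTION appear more than once.
    """
    return "\n".join(f"{keyword} {value}" for keyword, value in settings) + "\n"


def build_command(inp_path, schrodinger=None):
    """Build the argument list for '$SCHRODINGER/covalent_docking <filename>.inp'."""
    root = schrodinger or os.environ.get("SCHRODINGER", "$SCHRODINGER")
    return [f"{root}/covalent_docking", inp_path]


settings = [
    ("REC_FILE", "covalent_docking_4D9U-mod_rec.maegz"),
    ("LIG_FILE", "covalent_docking_4D9U-mod_lig.maegz"),
    ("ATTACHMENT_RESIDUE", "A:436"),
    ("LIGAND_SMARTS_PATTERN", "1,[C]=[C]-[C]#[N]"),
    ("GRID_OPTION", "INNERBOX=10,10,10"),
    ("NPOSES", 1),
]
inp_text = render_inp(settings)
print(inp_text)
# "/opt/schrodinger" is a placeholder install path.
print(build_command("covalent_docking_4D9U-mod.inp", schrodinger="/opt/schrodinger"))
```

The returned list could be passed to `subprocess.run` on a machine with a Schrödinger installation; that step is left out here.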
Acknowledgements
• Kai Zhu and CovDock development team
• Gali Golan and Ori Kalid @ Karyopharm Therapeutics
• Jas Bhachoo (applications science)
• Schrödinger developers, applications scientists and technical support
More info:
• Newsletter
– http://www.schrodinger.com/newsletter/26/183/
• Pose Prediction and Scoring
- Zhu, K.; et al. Docking Covalent Inhibitors: A Parameter Free Approach to Pose Prediction
and Scoring, 2014, J. Chem. Inf. Model., 54(7):1932-40
• Virtual screening
- Toledo Warshaviak, D.; et al. A Structure-Based Virtual Screening Approach for Discovery
of Covalently Bound Ligands, 2014, J. Chem. Inf. Model., 54(7):1941-50