The Conformation Search Problem

Jon Sutter
Senior Manager
Life Sciences R&D
[email protected]
Jiabo Li
Senior Scientist
Life Sciences R&D
[email protected]
CAESAR: Conformer Algorithm based on Energy Screening and Recursive Buildup
The Conformation Search Problem
• 3D conformation generation is important in many applications
– 3D pharmacophore generation
– Database building and searching
– Docking
– etc.
• Efficient search is a challenge
– Exponential explosion of conformer space
– Ring flexibility
– Removing duplicate conformations while considering topological
symmetry
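The exponential growth can be made concrete with a small calculation. This is a minimal sketch that assumes every rotatable bond is sampled independently on a uniform torsion grid; the function name and the default of 6 grid points per bond are illustrative choices, not part of CAESAR.

```python
def torsion_grid_size(n_rotatable_bonds: int, grid_points: int = 6) -> int:
    """Number of torsion-angle combinations on a uniform grid (assumed model)."""
    return grid_points ** n_rotatable_bonds


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, torsion_grid_size(n))
```

Ring flexibility and symmetry handling come on top of this count, which is why exhaustive enumeration is impractical for flexible molecules.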
Timeline of CAESAR
[Figure: timeline of CAESAR development. Labels: Initial Idea, Early Validation, Sabbatical, Discovery Studio 1.7 work begins, Validation, D.S. 1.7 Released, Now]
The CAESAR Algorithm
1. Recursively partition
2. Ring conformations
3. Recursively assemble
4. Remove symmetry duplicates
5. Quickly filter out bad clashes
E_{AB} = E_A + E_B + E_{A-B}
Split Molecule and Generate Ring Conformers
• Each tree node is either a ring or a rigid structure
• A molecule tree is recursively partitioned into the smallest units
– At the top level, a tree is divided into two sub-trees of approximately equal complexity
– Repeat for the two sub-trees until no further partitioning can be performed
• Compute ring conformations
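The recursive partitioning can be sketched as follows. This is an illustration under stated assumptions, not the CAESAR implementation: the molecule is given as a tree whose nodes are rigid units and whose edges are rotatable bonds, "complexity" is approximated by a per-unit weight such as atom count, and the names `Split`, `component`, `partition` and `leaves` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Split:
    units: frozenset                         # rigid units in this fragment
    bond: Optional[Tuple[str, str]] = None   # rotatable bond cut at this level
    left: Optional["Split"] = None
    right: Optional["Split"] = None


def component(start, units, adj, banned):
    """Units reachable from start inside `units` without crossing bond `banned`."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in units and v not in seen and {u, v} != banned:
                seen.add(v)
                stack.append(v)
    return frozenset(seen)


def partition(units, adj, weight):
    """Recursively cut the bond that best balances the two sides' total weight."""
    units = frozenset(units)
    if len(units) == 1:
        return Split(units)                  # smallest unit: ring or rigid group
    total = sum(weight[u] for u in units)
    best = None
    for u in sorted(units):
        for v in sorted(adj[u]):
            if v in units and u < v:         # visit each bond once
                side = component(u, units, adj, {u, v})
                imbalance = abs(total - 2 * sum(weight[x] for x in side))
                if best is None or imbalance < best[0]:
                    best = (imbalance, (u, v), side)
    _, bond, side = best
    return Split(units, bond,
                 partition(side, adj, weight),
                 partition(units - side, adj, weight))


def leaves(split):
    """Smallest units, in left-to-right order."""
    if split.left is None:
        return [split.units]
    return leaves(split.left) + leaves(split.right)


# Toy molecule: ringA - link1 - link2 - ringB (weights are assumed atom counts).
adj = {"ringA": ["link1"], "link1": ["ringA", "link2"],
       "link2": ["link1", "ringB"], "ringB": ["link2"]}
weight = {"ringA": 6, "link1": 1, "link2": 1, "ringB": 6}
tree = partition(adj.keys(), adj, weight)
```

In the toy example the top-level cut falls on the central bond, which gives two sub-trees of equal weight; each sub-tree is then split again until only single units remain.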
Recursive Conformer Assembling
[Figure: Confs (FragA) and Confs (FragB) are combined to assemble confs for FragAB]
• Step 1. Select ConfA, ConfB and a rotation from a pool of N_A × N_B × N_R combinations
• Step 2. Fast energy filtering
• Step 3. Repeat Steps 1 and 2 until enough conformations are generated for FragAB
• Step 4. Repeat Steps 1-3 for the upper level
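Steps 1 to 3 for a single pair of fragments might look like the sketch below. Several details are assumptions: the slides do not say how combinations are drawn from the pool (a seeded random shuffle is used here), the energy function is passed in by the caller, and `assemble`, `e_cutoff` and `toy_energy` are hypothetical names.

```python
import itertools
import random


def assemble(confs_a, confs_b, rotations, energy, max_confs, e_cutoff, seed=0):
    """Draw (ConfA, ConfB, rotation) combinations and keep those passing the filter."""
    pool = list(itertools.product(range(len(confs_a)),
                                  range(len(confs_b)),
                                  range(len(rotations))))
    random.Random(seed).shuffle(pool)        # assumed selection order
    accepted = []
    for ia, ib, ir in pool:                  # Step 1: select a combination
        e = energy(confs_a[ia], confs_b[ib], rotations[ir])
        if e <= e_cutoff:                    # Step 2: fast energy filtering
            accepted.append((ia, ib, ir, e))
            if len(accepted) >= max_confs:   # Step 3: stop when enough are found
                break
    return accepted


# Toy data: each conformer is represented only by its own energy.
def toy_energy(e_a, e_b, rotation):
    clash = 10.0 if rotation == 0 else 0.0   # pretend the eclipsed rotation clashes
    return e_a + e_b + clash


confs_a = [0.0, 1.0, 5.0]
confs_b = [0.0, 2.0]
rotations = [0, 60, 120, 180, 240, 300]
kept = assemble(confs_a, confs_b, rotations, toy_energy, max_confs=8, e_cutoff=3.0)
```

Step 4 would call the same routine again one level up the tree, with the accepted conformers of FragAB serving as the input conformers of the next join.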
Energy Screening
[Figure: Confs (FragA) and Confs (FragB) are combined to assemble confs for FragAB]
• Fast energy computation and filtering as follows:
E(ConfAB) = E(ConfA) + E(ConfB) + E(ConfA-ConfB)
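The practical benefit of this decomposition is that E(ConfA) and E(ConfB) are already known from the lower level, so only the inter-fragment term has to be evaluated for each candidate. The sketch below illustrates that bookkeeping. The slides do not specify the energy function, so the inter-fragment term here is a stand-in: a quadratic penalty on A-B atom pairs closer than an assumed distance `r_min`. All names and parameter values are hypothetical.

```python
import math


def cross_energy(coords_a, coords_b, r_min=2.0, penalty=10.0):
    """Toy E(ConfA-ConfB): penalise A-B atom pairs closer than r_min (assumed form)."""
    e = 0.0
    for a in coords_a:
        for b in coords_b:
            d = math.dist(a, b)
            if d < r_min:
                e += penalty * (r_min - d) ** 2
    return e


def screened_energy(e_a, coords_a, e_b, coords_b):
    """E(ConfAB) = E(ConfA) + E(ConfB) + E(ConfA-ConfB), reusing cached e_a and e_b."""
    return e_a + e_b + cross_energy(coords_a, coords_b)


frag_a = [(0.0, 0.0, 0.0)]
far_b = [(5.0, 0.0, 0.0)]    # no contact: cross term is zero
near_b = [(1.0, 0.0, 0.0)]   # 1.0 apart: penalty of 10 * (2 - 1)^2
print(screened_energy(1.0, frag_a, 2.0, far_b))
print(screened_energy(1.0, frag_a, 2.0, near_b))
```

Only pairs with one atom in each fragment enter the double loop, so the cost per candidate does not depend on how many intra-fragment pairs the two fragments contain.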
Removing Duplicate Conformations
• Duplicate removal is normally quick, but when topological symmetry is considered, enumerating all possibilities can be costly.
• There are potentially significant time savings if we can avoid creating the duplicates in the first place.
• Since we are assembling the molecules from pieces, we can do this in an intelligent manner to avoid creating duplicate conformers.
Symmetry Unique Rotations
• The most common cases are 2-fold symmetry such as phenyl groups and 3-fold symmetry such as t-butyl groups.
• If the default number of torsion grids is set to 6, then only one torsion angle is symmetry unique for this case.
[Figure: fragments A and B with positions labeled 1, 2, 3, shown at rotations of 60˚ and 180˚]
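The count of symmetry-unique grid angles can be worked out with a few lines of arithmetic. The sketch assumes that a fragment with n-fold symmetry about the joining bond maps onto itself under a rotation of 360/n degrees, so rotating by 360/lcm(fold_a, fold_b) degrees gives an equivalent conformer. One reading consistent with the "only one torsion angle" statement is a bond joining a 2-fold fragment to a 3-fold fragment; that reading and the function name are assumptions.

```python
import math
from fractions import Fraction


def symmetry_unique_angles(n_grid, fold_a=1, fold_b=1):
    """Grid torsion angles (degrees) that remain distinct under fragment symmetry."""
    lcm = fold_a * fold_b // math.gcd(fold_a, fold_b)
    period = Fraction(360, lcm)          # rotation that reproduces the conformer
    step = Fraction(360, n_grid)
    seen, unique = set(), []
    for k in range(n_grid):
        angle = k * step
        reduced = angle % period         # exact arithmetic, no float round-off
        if reduced not in seen:
            seen.add(reduced)
            unique.append(float(angle))
    return unique


print(symmetry_unique_angles(6))                    # no symmetry: all 6 grid angles
print(symmetry_unique_angles(6, fold_a=2))          # 2-fold fragment (e.g. phenyl)
print(symmetry_unique_angles(6, fold_a=3))          # 3-fold fragment (e.g. t-butyl)
print(symmetry_unique_angles(6, fold_a=2, fold_b=3))
```

Under these assumptions, a 6-point grid has three unique angles for a 2-fold fragment, two for a 3-fold fragment, and one when both symmetries act on the same bond.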
CAESAR flow diagram
1. Split molecule and generate ring conformations
2. Join each pair of fragments
3. Select a combination of Conformer A, Conformer B and a symmetry-unique torsion angle
4. Energy filtering
5. Enough? No: return to step 3. Yes: continue.
6. Top? No: return to step 2. Yes: done.
Performance and Validation
Questions …
• How well does it sample conformational space?
– Are database search results similar?
– Is CAESAR able to find conformations close to the
bioactive conformer?
• How fast is CAESAR?
Datasets
• 919 ligands extracted from PDB*
• 168 molecules from Derwent World Drug Index
• 10 sulfonamide molecules
• ~50,000-molecule database (Maybridge)
[Figure: histograms of molecule counts by molecular weight (x100) and by number of rotatable bonds]
* Thanks to Dr. Johannes Kirchmair and Professor Thierry Langer of the University of Innsbruck.
Similarity of Database Search Results
Query              Hits (FAST)*   Hits (CAESAR)*   Common hits   Similarity (%)
Pharm_3F           106            117              93            93
Pharm_5F           51             50               41            81
Shape              68             98               58            70
Pharm_5F + Shape   10             13               6             52
Total hits         235            278              198           77
* Two Maybridge databases were built with FAST and CAESAR.
The default settings were used.
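The slides do not state how the similarity percentage is defined. One definition that reproduces the Pharm_5F, Shape, Pharm_5F + Shape and Total rows is the number of common hits divided by the mean of the two hit counts. The sketch below uses that inferred definition, which should be treated as an assumption; `hit_similarity` is a hypothetical name.

```python
def hit_similarity(hits_a: int, hits_b: int, common: int) -> float:
    """Common hits as a percentage of the mean hit count (assumed definition)."""
    return 100.0 * common / ((hits_a + hits_b) / 2)


rows = {
    "Pharm_5F": (51, 50, 41),
    "Shape": (68, 98, 58),
    "Pharm_5F + Shape": (10, 13, 6),
    "Total hits": (235, 278, 198),
}
for name, (fast, caesar, common) in rows.items():
    print(name, round(hit_similarity(fast, caesar, common)))
```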
Bioactive Conformations?
• Extract protein-ligand complexes from the
Protein Data Bank (PDB)
• Generate conformations without knowledge of
crystal structure conformation
• Compare conformations to crystal structure
Protocol: Kirchmair, J., et al. J. Chem. Inf. Model. 45(2): 422-430 (2005).
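The comparison step reduces to finding, for each ligand, the generated conformer with the lowest RMS deviation from the crystal structure, then counting how many ligands fall under each threshold. The sketch below is a simplified illustration, not the cited protocol: it assumes the conformers are already superimposed on the crystal structure with atoms in the same order, and it ignores symmetry-equivalent atom mappings. All function names are hypothetical.

```python
import math


def rmsd(coords_1, coords_2):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_1) != len(coords_2):
        raise ValueError("coordinate lists must have the same length")
    sq = sum(math.dist(p, q) ** 2 for p, q in zip(coords_1, coords_2))
    return math.sqrt(sq / len(coords_1))


def best_rmsd(conformers, crystal):
    """Lowest RMSD of any generated conformer to the crystal conformation."""
    return min(rmsd(c, crystal) for c in conformers)


def fraction_below(best_values, threshold):
    """Fraction of ligands whose best conformer is closer than the threshold."""
    return sum(v < threshold for v in best_values) / len(best_values)


crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]   # every atom displaced by 1.0
exact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(best_rmsd([shifted, exact], crystal))
print(fraction_below([0.3, 0.8, 1.5, 2.5], 1.0))
```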
Bioactive conformations
PDB Data       CAESAR   FAST
Average RMS    1.07     1.07
RMS < 0.5      21%      19%
RMS < 1.0      49%      50%
RMS < 2.0      92%      93%
RMS < 3.0      99%      99%
Speed Test: CAESAR vs. Catalyst FAST
Data Set                          MaxConfs   Speed-up (x faster)
WDI168 (168 molecules)            100        4.9
                                  250        9.0
                                  500        17.7
Sulfonamide10 (10 sulfonamides)   100        6.0
                                  250        11.6
                                  500        15.8
PDB919 (919 ligands from PDB)     100        6.5
                                  250        13.6
                                  500        21.1
Note: A pre-built ring fragment database file is used.
Conclusions
• CAESAR provides conformational coverage
similar to Catalyst FAST conformer generation
• CAESAR is 5-20 times faster than FAST
Thanks …
C.M. Venkatachalam (Venkat)
Remy Hoffman
Marvin Waldman
Paul Flook
Samuel Toba
Xuan Hong
Shihka Varma
Tedman Ehlers
Johannes Kirchmair, University of Innsbruck
Thierry Langer, University of Innsbruck
Questions:
Jon Sutter Ph.D. [email protected]
Jiabo Li Ph.D. [email protected]