CrossMiner: Federated CSD and PDB 3D searching
Table of Contents
Introduction
CrossMiner Terminology
Overview of CrossMiner
Example 1: Searching with a Pharmacophore
Example 2: Editing a Pharmacophore
Example 3: Creating a Pharmacophore from a Reference Molecule & Scaffold Hopping
Exercise 4: Building your Own Feature Database
Version 1.0, July 2016
CrossMiner ver. 1.2
Introduction
CrossMiner can be thought of as a pharmacophore-based query tool. However, it is much more powerful than traditional pharmacophore query tools, as it allows you to query not only databases of ligands, but also proteins and protein-ligand interactions. CrossMiner includes a preconfigured database of biologically relevant subsets of the Cambridge Structural Database (CSD) and the Protein Data Bank (PDB). The pharmacophore used in the query is interactive, allowing you to edit it easily in a number of ways through a simple user interface. This delivers an interactive search experience, with application areas in interaction searching, scaffold hopping, and the identification of novel fragments for specific protein environments.
The subset of the CSD included with CrossMiner consists of structures which are not organometallic, have an R-factor of at most 10%, have 3D coordinates, have no disorder, and are not polymeric (about 234,400 structures in total). The included PDB database is a curated Proasis database subset of all co-crystallized ligands, with the binding site defined as all molecules with an atom within a 6 Å radius around the ligand (210,200 binding sites from 58,300 PDB entries).
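The CSD selection criteria listed above amount to a simple conjunction of filters. The following is a minimal sketch of that logic only: the `CsdEntry` record, its field names, and the placeholder refcodes are hypothetical and are not part of CrossMiner or any CCDC API.

```python
from dataclasses import dataclass


@dataclass
class CsdEntry:
    """Hypothetical record holding only the properties named in the text."""
    refcode: str
    is_organometallic: bool
    r_factor_percent: float
    has_3d_coordinates: bool
    has_disorder: bool
    is_polymeric: bool


def in_crossminer_subset(entry: CsdEntry, max_r_factor: float = 10.0) -> bool:
    """Return True if the entry passes every filter listed in the text."""
    return (
        not entry.is_organometallic
        and entry.r_factor_percent <= max_r_factor
        and entry.has_3d_coordinates
        and not entry.has_disorder
        and not entry.is_polymeric
    )


# Made-up placeholder entries, for illustration only.
entries = [
    CsdEntry("AAAAAA", False, 4.2, True, False, False),   # passes all filters
    CsdEntry("BBBBBB", True, 3.1, True, False, False),    # organometallic
    CsdEntry("CCCCCC", False, 12.5, True, False, False),  # R-factor above 10%
]
kept = [e.refcode for e in entries if in_crossminer_subset(e)]
print(kept)
```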
For further discussion, please refer to the CrossMiner documentation (in the installation directory) or the original paper:
Korb O, Kuhn B, Hert J, Taylor N, Cole J, Groom C & Stahl M, “Interactive and Versatile Navigation of Structural Databases”, J Med Chem, 2016, 59(9):4257, DOI: 10.1021/acs.jmedchem.5b01756
This tutorial is geared towards the novice CrossMiner user who has life science experience. It covers the primary features for searching across ligands, proteins, and ligand-protein interactions. Some of the results may vary, depending on your version of CrossMiner. When you have completed this tutorial, you should be able to build queries, execute searches, and create your own customized feature database.
CrossMiner Terminology
CrossMiner uses several terms, some common to the field of drug discovery and some not. For ease of reference, these terms are defined below:
Feature: a point (or vector) which represents a SMARTS query and, in the case of a vector, geometric rules. A feature could be in a protein, ligand, or pharmacophore.
Pharmacophore feature: a feature (point or vector) that has been selected to be part of a pharmacophore.
Exit vector: a feature that represents a bond of any type. This feature is represented as two mesh spheres. In CrossMiner, the directionality of an exit vector does not matter.
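One way to picture these terms is as small data structures. The sketch below is illustrative only: the class names, fields, SMARTS string, and coordinates are assumptions made for this example and do not reflect CrossMiner's internal representation or its actual feature definitions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Feature:
    """A point feature, or a vector feature when a virtual point is present."""
    name: str                        # e.g. "acceptor_projected"
    smarts: str                      # SMARTS query the feature represents
    base: Point                      # position of the base point
    virtual: Optional[Point] = None  # projected point, vector features only

    @property
    def is_vector(self) -> bool:
        return self.virtual is not None


@dataclass(frozen=True)
class PharmacophoreFeature:
    """A feature that has been selected to be part of a pharmacophore."""
    feature: Feature
    radius: float = 1.0  # tolerance radius in angstroms


@dataclass(frozen=True)
class ExitVector:
    """A bond of any type, held as two end points with no direction."""
    start: Point
    end: Point

    def same_as(self, other: "ExitVector") -> bool:
        # Comparing unordered sets of end points ignores directionality.
        return {self.start, self.end} == {other.start, other.end}


# Illustrative values only; "[#8]" (any oxygen atom) is a stand-in pattern.
acceptor = Feature("acceptor_projected", "[#8]", (0.0, 0.0, 0.0), (0.0, 0.0, 1.5))
selected = PharmacophoreFeature(acceptor)
forward = ExitVector((0.0, 0.0, 0.0), (1.5, 0.0, 0.0))
backward = ExitVector((1.5, 0.0, 0.0), (0.0, 0.0, 0.0))
print(acceptor.is_vector, forward.same_as(backward))
```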
Overview of CrossMiner
CrossMiner is a powerful tool with a simple user interface. This quick section will familiarise you with the basic functions and underlying data components before moving on to exploring some scientific questions.
1. Launch CrossMiner from Start Menu > CCDC (Windows) or Applications/CCDC (Mac). This tutorial will refer to several areas in the tool, such as the ‘results browser’ and the ‘pharmacophore and reference feature browser’ (referred to from here on simply as the ‘feature browser’). Note where these components are.
CrossMiner’s underlying data is a database of crystal structure feature annotations. For this tutorial, we will use the included database, which contains subsets of the CSD and the PDB. You will notice that nothing is loaded when you open CrossMiner; this is intentional, so that you can load a customized database instead of the default one. To load the default database:
2. Click File >> Load Feature Database, and select:
[CCDC installation]\CSD_CrossMiner1.2\databases\csd_pdb_crossminer.feat
By default, CCDC installation paths are:
Windows: C:\Program Files\CCDC
Mac: /Applications/CCDC
Loading will take a few minutes. Even once the progress bar reaches 100%, CrossMiner will need a moment to initialise the structures.
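If you need to locate the default feature database from a script (for example, to check that it exists before launching), the path can be assembled from the default installation directories given in step 2. This is a sketch only: the function name and platform keys are hypothetical, and the folder name `CSD_CrossMiner1.2` is reproduced exactly as printed in this tutorial, so it may differ on your installation.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Path components below the CCDC installation directory, as given in step 2.
RELATIVE_PARTS = ("CSD_CrossMiner1.2", "databases", "csd_pdb_crossminer.feat")


def default_feature_database(platform: str):
    """Build the default database path for 'windows' or 'mac'."""
    if platform == "windows":
        return PureWindowsPath(r"C:\Program Files\CCDC").joinpath(*RELATIVE_PARTS)
    if platform == "mac":
        return PurePosixPath("/Applications/CCDC").joinpath(*RELATIVE_PARTS)
    raise ValueError(f"unsupported platform: {platform!r}")


print(default_feature_database("windows"))
print(default_feature_database("mac"))
```

Pure path classes are used so the example behaves the same on any operating system; they only manipulate strings and never touch the file system.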
3. Once loaded, you will see the CSD and PDB subset databases listed in the database selection window. You can load multiple databases, and use the tick boxes to indicate which databases should be searched.
4. You will also see a list of features in the feature browser at the bottom right. These are the features used to generate these databases (reference column).
Example 1: Searching with a Pharmacophore
Human cathepsin L plays a major role in protein catabolism, and is implicated in a number of pathological processes. As such, it is a common research target and serves as a good example for novel drug discovery. For this example, we will use a pharmacophore included with CrossMiner to explore the CSD and PDB for possible hits.
1. Load the cathepsin L pharmacophore by clicking File >> Load Pharmacophore and selecting:
[CrossMiner installation]\example_pharmacophores\catl_s3.cm
2. This will load the pharmacophore into the viewing area. Take a moment to rotate the molecule and understand the different pharmacophore feature points:
P: Protein feature
S: Small molecule feature
Dashed line: intra- or intermolecular constraint. Constrained features must belong either to the same molecule as each other (intra, dashed green line) or to different molecules (inter, dashed red line).
Mesh sphere: the actual feature itself, where the sphere size represents the radius of tolerance.
Solid sphere: the projected virtual point representing the directionality of a hydrogen bond acceptor/donor. A feature can have more than one projected point. For example, an H-bond acceptor can have multiple potential lone pair preferred projections.
Note that the color coding is defined in the feature browser; e.g. hydrophobic features are green, hydrogen bond acceptors are red, and so on. The pharmacophore in your viewer has only one projected H-bond acceptor. These points correlate to the feature browser: B indicates the Base feature, and V indicates the accompanying Virtual point.
[Figure: the cathepsin L pharmacophore in the viewer. Labels: Protein: H-bond acceptor feature (mesh) with projected directionality (solid); Small molecule: heavy atoms; constraints, including an intramolecular constraint; Protein: H-bond donor feature (mesh) with projected directionality (solid); Protein: hydrophobic feature.]
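The intra and inter constraints described in the legend reduce to a same-molecule or different-molecule test on a pair of features. The sketch below illustrates that rule only; the `Constraint` class, the feature labels, and the molecule identifiers are hypothetical and are not CrossMiner's own data model.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Constraint:
    feature_a: str
    feature_b: str
    kind: str  # "intra" (same molecule) or "inter" (different molecules)


def satisfies(constraint: Constraint, molecule_of: Dict[str, str]) -> bool:
    """Check one constraint against a mapping of feature label to molecule."""
    same = molecule_of[constraint.feature_a] == molecule_of[constraint.feature_b]
    if constraint.kind == "intra":
        return same
    if constraint.kind == "inter":
        return not same
    raise ValueError(f"unknown constraint kind: {constraint.kind!r}")


# Hypothetical candidate hit: which molecule each matched feature sits in.
molecule_of = {"donor": "ligand", "hydrophobic": "ligand", "acceptor": "protein"}

print(satisfies(Constraint("donor", "hydrophobic", "intra"), molecule_of))
print(satisfies(Constraint("donor", "acceptor", "inter"), molecule_of))
print(satisfies(Constraint("donor", "acceptor", "intra"), molecule_of))
```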
3. Sphere size is directly correlated to tolerance: the smaller the sphere, the lower the tolerance for geometric matching. You can see the radius of each feature (in Å) in the feature browser. By default, the radius is 1 Å.
Note that the viewer can sometimes get quite crowded. An easy way to find a feature is to click its tick box in the feature browser off and on again.
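The link between sphere radius and tolerance in step 3 can be sketched as a distance check: a candidate point matches a feature if it lies inside the feature's sphere. This is a simplified illustration, not CrossMiner's matching code; the function name is hypothetical, and treating the sphere surface as inside is an assumption.

```python
import math


def within_tolerance(feature_centre, candidate, radius=1.0):
    """True if the candidate point lies inside the tolerance sphere (in Å)."""
    return math.dist(feature_centre, candidate) <= radius


centre = (0.0, 0.0, 0.0)
candidate = (0.6, 0.0, 0.0)  # 0.6 Å from the feature centre

print(within_tolerance(centre, candidate, radius=1.0))  # default radius
print(within_tolerance(centre, candidate, radius=0.5))  # tighter tolerance
```

The same candidate is accepted at the default 1 Å radius and rejected at 0.5 Å, which is why shrinking the spheres reduces the number of hits.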
4. Click the start button to begin searching across the databases for matches. As the search runs, you will see results populating the results browser window, as well as the progress bar with the total number of hits at the top right.
5. This particular search returns a high number of hits, and may take a while. The sphere next to the start search button is green while the search is running. Pause the search by clicking the pause button when you feel you have enough hits (a few hundred will do). Do not stop the search, as that ends the entire process and removes all hits.
6. By default, all results are overlaid, which makes it easy to appreciate the common motifs matching the pharmacophore. However, the overlay is cluttered. By clicking on each result in the browser, you can view them one at a time. Alternatively, hold down the Shift or Ctrl key to select multiple results.
7. Locate the result with the lowest RMSD by clicking on the rmsd column in the browser to sort in ascending order. The lowest RMSD result in this example is 2HJ_001_1; yours may be different, depending on when you stopped your search. These results all come back with a 2D diagram, with pharmacophore matches indicated. Select your lowest RMSD result by clicking on the row (it does not need to be ticked).
8. For ease of viewing, change the style in the upper left to Capped Sticks. Take a few moments to explore how the returned result matches the pharmacophore query. In particular, note:
• Feature matching is based on the size of the sphere (and thus the tolerance level). Tiny spheres with tight tolerance will result in hits with very close alignment to the centre of the sphere, while larger spheres have a wider area for alignment.
• There are no explicit hydrogens in the protein sites. Hydrogen bond donors and acceptors are defined in this database based on feature definitions, which come from expert knowledge about protonation and tautomer state. Thus, there is no way to display hydrogens, as they are inferred. See the reference paper listed in the introduction for more information.
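To make the RMSD ranking from step 7 concrete, the sketch below computes a root-mean-square deviation between pharmacophore feature centres and the matched points of each hit, then sorts the hits in ascending order. The tutorial does not state exactly how CrossMiner computes its RMSD, so this assumes a plain RMSD over already-aligned point pairs; the hit names and coordinates are made up.

```python
import math


def rmsd(query_points, hit_points):
    """Root-mean-square deviation between paired, pre-aligned 3D points."""
    if not query_points or len(query_points) != len(hit_points):
        raise ValueError("point lists must be non-empty and of equal length")
    squared = [math.dist(q, h) ** 2 for q, h in zip(query_points, hit_points)]
    return math.sqrt(sum(squared) / len(squared))


# Two pharmacophore feature centres and two hypothetical hits.
query = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
hits = {
    "hit_a": [(0.3, 0.0, 0.0), (3.0, 0.4, 0.0)],
    "hit_b": [(0.1, 0.0, 0.0), (3.0, 0.1, 0.0)],
}

# Ascending RMSD, as when sorting the rmsd column in the results browser.
ranked = sorted(hits, key=lambda name: rmsd(query, hits[name]))
for name in ranked:
    print(name, round(rmsd(query, hits[name]), 3))
```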
There are a lot of results, likely due to the large radius of the features. Lower the tolerance of the features to get a smaller result set with more precise alignment to the pharmacophore centres. To do this, you will need to edit the pharmacophore, which cannot be done while viewing results.
9. Click the stop button to clear the results and enable editing.
10. Ensure that Picking mode is set to Edit Pharmacophore.
11. In editing mode, you can double click on the radius sizes in the feature browser to change them. Change the radius of every feature to 0.5 to lower the tolerance. This will result in fewer hits, all of which are more closely aligned with the centres of the pharmacophore features.
Start the search on the new, tighter pharmacophore. It will take a few minutes to pick up hits, as there are far fewer. Let the search run to completion this time, resulting in 20 hits.
12. If Cluster hits is selected, the results are clustered based on the adjacent Tanimoto value. If clustered, you will see representatives of those similar groups in the results viewer (and thus fewer hits than were actually returned). Uncheck the box to see all hits, including those which are very similar to each other.
The lowest RMSD results here are 2XU1 and 2XU3, which are, unsurprisingly, cathepsin L PDB entries. If you explore the top result (2XU1_001), you’ll see that the hit matches the pharmacophore almost exactly.
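The effect of the Cluster hits option in step 12 can be illustrated with a Tanimoto similarity and a simple "leader" clustering, in which each hit joins the first representative it is similar enough to. The tutorial says only that clustering uses a Tanimoto value; the clustering scheme, the fingerprints, the threshold, and the hit names below are all assumptions made for illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints held as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_hits(fingerprints: dict, threshold: float) -> dict:
    """Leader clustering: map each representative to the hits it covers."""
    clusters = {}
    for name, fingerprint in fingerprints.items():
        for representative in clusters:
            if tanimoto(fingerprint, fingerprints[representative]) >= threshold:
                clusters[representative].append(name)
                break
        else:
            # No existing representative is similar enough: start a cluster.
            clusters[name] = [name]
    return clusters


# Made-up fingerprints for three hypothetical hits.
fingerprints = {
    "hit_1": frozenset({1, 2, 3, 4}),
    "hit_2": frozenset({1, 2, 3, 5}),  # Tanimoto 0.6 to hit_1
    "hit_3": frozenset({7, 8, 9}),     # nothing in common with the others
}
clusters = cluster_hits(fingerprints, threshold=0.5)
print(clusters)
```

With clustering on, only the representatives (the dictionary keys) would be listed, which is why the results browser shows fewer entries than the number of hits returned.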
13. To save the two lowest RMSD hits for further work outside CrossMiner, click the tick boxes to the left of 2XU1 and 2XU3 to mark them. Marking hits does not display them and, similarly, displaying them (via Shift or Ctrl) does not mark them.
14. Click File to display several save options, which will save the visible hits (in the display area), all hits, or the marked hits. For this step, click File >> Save Marked Hits. Save your hits as cathepsin_hits.mol2.
A mol2 file will contain protein residue information, whereas an SDF will not.
Example 2: Editing a Pharmacophore
One of the strengths of CrossMiner is the ability to manually interact with and edit a pharmacophore at any time (even while a search is running). For this example, you will be editing the cathepsin L pharmacophore. For ease, start with the default cathepsin L pharmacophore. Reload it as per step 1 in Example 1. If the search starts again, just click stop.
1. Make sure you are in Edit Pharmacophore mode. Right click on the hydrophobic feature to bring up the editing menu. Click Morph Into and select water. You will also need to click Any Molecule, as the water is technically not a part of the protein or small molecule.
2. You’ll notice that the name of the feature does not change. You will need to change it manually by right clicking on the feature and selecting Change Description. Call it water.
The right click menu of features is where a number of things are defined. Use this menu to:
• Define where the feature belongs: a protein, a small molecule, or any.
• Add constraints (Constrain to).
• Change the feature type (Morph Into).
• Change the label of the feature (Change Description).
• Delete a feature (Delete Feature).
3. Change the radius of the virtual ‘acceptor_projected’ (V) from 1.20 to 1.00, and the water radius from 1.50 to 1.00. Run the search.
4. Even with the reduced feature sizes, the search still returns a large number of hits. Select one of the low RMSD results and visualize it.
CrossMiner allows you to visually interact with the pharmacophore. You can edit the features numerically in the feature browser, and also manually in the viewer. While this is not a rigorous approach, it is helpful for exploring chemical space.
5. Mouse over the water feature, then click and drag using the middle mouse button. This will change the radius of the water sphere. A larger sphere allows for more flexibility in water placement around the ligand. If you have trouble getting this to work, ensure that the Interactive editing mode button is on (see step 7 below). Change the radius back to 1.00 in the feature browser.
Whenever you edit the pharmacophore, the current results will disappear (as this starts a new search). To edit a pharmacophore overlaid with a molecule in the viewing area, you will first need to set the molecule as a reference.
6. Right click on a hit in the results browser, and click Use as reference. This will load the molecule into the viewer, with all features defined. This means that projection lines between base and virtual features are shown, as well as centroids and projected aromatic virtual points.
7. Switch to interactive editing by clicking on the Interactive editing button. An open hand means editing is off, and a closed hand means editing is on.
8. Drag one of the heavy atom features to a nearby carbon atom. Note that the search starts as soon as you let it go. The reference molecule stays visible even while the search runs. To hide the reference molecule, click the tick box next to reference in the visibility bar. This bar controls the visibility of the reference molecule, hits, constraints, and labels.
Every CrossMiner user has accidentally grabbed a pharmacophore point and dragged it across the screen when they meant to rotate the molecule viewer. There are two major tricks to avoiding this:
• Be sure to turn off ‘Interactive editing’ mode when you are not using it.
• Set ‘Picking mode’ back to Pick atoms when you are done editing your pharmacophore.
Ctrl+Z (or Edit >> Undo) is very helpful. It will undo the last change made to the pharmacophore. However, the search will start over again, so form good CrossMiner habits early!
[Figure: reference molecule in the viewer with a centroid and projected virtual points labelled, and the Interactive editing button shown in its off and on states.]
9. You can also add new features from the feature browser. Scroll to fluorine in the feature browser and right click, then click Create fluorine. This will drop the new feature onto the canvas.
10. Pick a reference point you would like to align the fluorine to. With Interactive editing mode still on, drag the new fluorine near to that reference point. When the new fluorine feature is clearly closer to that reference point than to any other, right-click the feature and select Snap To Atom. This will align the feature to the nearest atomic reference point.
This new search probably won’t find any results. This step was just to demonstrate how to add and manoeuvre new features.
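The Snap To Atom behaviour in step 10 is, in essence, a nearest-neighbour lookup: the feature is moved onto whichever atomic reference point is closest. The sketch below shows that idea under the assumption of a plain Euclidean nearest-point search; the function name, atom labels, and coordinates are hypothetical.

```python
import math


def snap_to_atom(feature_position, atoms):
    """Return the (label, position) of the atom nearest to the feature."""
    return min(atoms.items(), key=lambda item: math.dist(feature_position, item[1]))


# Made-up reference atoms and a dragged feature position (in Å).
atoms = {
    "C1": (0.0, 0.0, 0.0),
    "C2": (1.5, 0.0, 0.0),
    "O1": (2.7, 0.5, 0.0),
}
dragged_fluorine = (1.3, 0.2, 0.0)

label, snapped_position = snap_to_atom(dragged_fluorine, atoms)
print(label, snapped_position)
```

This also shows why the feature should be dragged clearly closer to the intended atom than to any other before snapping: with two atoms at similar distances, the snap may land on the wrong one.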
11. Save your edited pharmacophore by clicking File >> Save Pharmacophore. Name the file pharmacophore_edited.cm.
There are two options for saving a pharmacophore, as defined below.
Save Pharmacophore: saves the current pharmacophore in the CrossMiner file format (.cm).
Save Pymol Pharmacophore: saves the current pharmacophore in a PyMOL (.py) file format. This is a purely visual format that contains no scientific information beyond labels (no atom typing, etc.).
Example 3: Creating a Pharmacophore from a Reference Molecule & Scaffold Hopping
A far more common use of CrossMiner is to create a pharmacophore from a reference molecule. This could mean building a pharmacophore from the ligand of a co-crystallized protein-ligand complex, or manually creating a CrossMiner pharmacophore from a set of atoms representing a pharmacophore created by another method.
If you already have work in your viewer, clear it by clicking Edit >> Clear Pharmacophore and Edit >> Clear Reference.
1. Click File >> Load Reference and select 2xu1_ligand.sdf from the workshop data folder.
Note that the features are automatically assigned. This is based on the atom types and feature definitions that are part of the loaded database.
2. Right click on features of the reference molecule to define pharmacophore feature points, as you did above. Define the pharmacophore as shown in the image at right.
3. We want all of these features to be in the same molecule, and thus all to have intra constraints. Rather than setting these up manually, just click the Intra button on the toolbar to automatically constrain all the features.
Start the search. Immediately, a lot of hits will be found in the CSD (as indicated by the 6-letter refcodes in the results browser). For this experiment, however, we are only interested in matching ligands from the PDB. Stop the search.
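If you export hit identifiers and want to separate CSD hits from PDB hits outside CrossMiner, the naming conventions can be used as a rough guide: CSD refcodes are six letters (optionally followed by two digits), while PDB IDs are four characters beginning with a digit. The sketch below is a heuristic, not a CrossMiner feature; the function name is hypothetical, "ABCDEF" is a made-up placeholder refcode, and ignoring everything after the first underscore is an assumption based on hit names such as 2XU1_001.

```python
import re

CSD_REFCODE = re.compile(r"[A-Z]{6}(\d{2})?")  # six letters, optional 2 digits
PDB_ID = re.compile(r"[0-9][A-Z0-9]{3}")       # four characters, digit first


def hit_source(identifier: str) -> str:
    """Guess whether a hit identifier comes from the CSD or the PDB."""
    stem = identifier.split("_", 1)[0].upper()  # drop suffixes such as _001
    if CSD_REFCODE.fullmatch(stem):
        return "CSD"
    if PDB_ID.fullmatch(stem):
        return "PDB"
    return "unknown"


for name in ["2XU1_001", "2XU3", "ABCDEF", "ABCDEF01"]:
    print(name, hit_source(name))
```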
4. Click the tick box next to csd536_crossminer in the database browser to disable searching the CSD. Restart the search and let it run to completion.
Scroll through the results, as sorted by RMSD. Clearly there are hits from the co-crystallized ligand, but there are also some interesting analogs.
[Figure: pharmacophore defined on the 2xu1 ligand. Labels: donor_projected; hydrophobic; planar_ring_projected (projected away from molecule); acceptor_projected (oxygen projecting away from molecule); toolbar button: Set all features to intra constraints.]
This demonstrates how easily scaffold hopping can be carried out with CrossMiner. As another example, you will experiment with scaffold hopping of PD180970, the co-crystallized ligand of the ABL kinase domain.
5. Load the file 2hzi_ligand.sdf as you did previously. Define the pharmacophore as indicated. For the acceptor_projected feature, two points are indicated; ensure that both projected virtual points are set as features. Ensure that the features are all set to intra constraints, as above.
For this search, leave both the CSD and PDB databases selected. You’ll see that there is no shortage of results from the CSD that match this query. Try narrowing this result set down, for example by restricting some of the radii or by adding a new feature that your chemistry experience says might be important.
[Figure: pharmacophore defined on the 2hzi ligand. Labels: planar_ring_projected; acceptor_projected (2); exit_vector; hydrophobic.]
Exercise 4: Building your Own Feature Database
For this exercise you will be building your own feature database from a set of ligands. These ligands do not have pre-configured features, and will rely on CrossMiner to detect and add the features.
1. Select Feature Database >> Create. Select ‘Add’, navigate to the tutorial data folder, and load fviia_ligands01.sdf.
2. You will also need to load the feature definitions. This step allows for flexibility in how features are defined, should you want to add custom features. Click on Add Substructures, and navigate to [CrossMiner Installation]\feature_definitions. Select every entry starting with features_, using the Shift key. Click OK.
3. Click Create Feature Database, not OK, to create the database. It will ask you for a save location; choose the tutorial data directory, and name the file fviia_ligands01.feat. Once the indexing has completed, click OK to close the Create Feature Database window.
Try out the new database. Load it by clicking File >> Load Feature Database as you have before. Note that in the database browser, only your database is listed.
4. The simplest test is to see if you can pull back a molecule from the original dataset. Load a reference molecule from the workshop data directory (File >> Load Reference >> 4YT6_4JY.SDF). Define two ‘planar_ring_projected’ pharmacophore points as indicated in the diagram at right. Make sure they have intra constraints. Start the search.
5. In CrossMiner version 1.2 you should get back 29 hits, which when clustered are shown as 5 results.
To define custom features (which are based on SMARTS), refer to the CrossMiner documentation.
[Figure: reference molecule 4YT6_4JY with the planar_ring_projected pharmacophore points indicated.]