Graphlet signature-based scoring method to estimate protein–ligand

Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
rsos.royalsocietypublishing.org
Research
Cite this article: Singh O, Sawariya K, Aparoy
P. 2014 Graphlet signature-based scoring
method to estimate protein–ligand binding
affinity. R. Soc. open sci. 1: 140306.
http://dx.doi.org/10.1098/rsos.140306
Received: 15 September 2014
Accepted: 31 October 2014
Subject category:
Structural biology and biophysics
Subject Areas:
bioinformatics/theoretical
biology/computational biology
Keywords:
graphlet signature, interaction network,
docking, binding affinity
Author for correspondence:
P. Aparoy
e-mail: [email protected];
[email protected]
Electronic supplementary material is available
at http://dx.doi.org/10.1098/rsos.140306 or via
http://rsos.royalsocietypublishing.org.
Graphlet signature-based
scoring method to estimate
protein–ligand binding
affinity
Omkar Singh, Kunal Sawariya and
Polamarasetty Aparoy
Centre for Computational Biology and Bioinformatics, Central University of Himachal
Pradesh, Dharamshala, Himachal Pradesh 176215, India
1. Summary
Over the years, various computational methodologies have
been developed to understand and quantify receptor–ligand
interactions. Protein–ligand interactions can also be explained in
the form of a network and its properties. The ligand binding at the
protein-active site is stabilized by formation of new interactions
like hydrogen bond, hydrophobic and ionic. These non-covalent
interactions when considered as links cause non-isomorphic subgraphs in the residue interaction network. This study aims to
investigate the relationship between these induced sub-graphs
and ligand activity. Graphlet signature-based analysis of networks
has been applied in various biological problems; the focus of
this work is to analyse protein–ligand interactions in terms of
neighbourhood connectivity and to develop a method in which
the information from residue interaction networks, i.e. graphlet
signatures, can be applied to quantify ligand affinity. A scoring
method was developed, which depicts the variability in signatures
adopted by different amino acids during inhibitor binding, and
was termed as GSUS (graphlet signature uniqueness score).
The score is specific for every individual inhibitor. Two wellknown drug targets, COX-2 and CA-II and their inhibitors, were
considered to assess the method. Residue interaction networks
of COX-2 and CA-II with their respective inhibitors were used.
Only hydrogen bond network was considered to calculate GSUS
and quantify protein–ligand interaction in terms of graphlet
signatures. The correlation of the GSUS with pIC50 was consistent
in both proteins and better in comparison to the Autodock
results. The GSUS scoring method was better in activity prediction
of molecules with similar structure and diverse activity and
vice versa. This study can be a major platform in developing
approaches that can be used alone or together with existing
methods to predict ligand affinity from protein–ligand complexes.
2014 The Authors. Published by the Royal Society under the terms of the Creative Commons
Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted
use, provided the original author and source are credited.
Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
2. Introduction
The aim of the present study was to evaluate the efficiency of graphlet signatures in inhibitor activity
prediction. For this study two well-known drug targets, COX-2 and CA-II, were considered. Thirty
inhibitors for each of the targets were chosen randomly. The crystal structures of CA-II complexed with
various inhibitors have been obtained from the Protein Data Bank (PDB) [10]. Similarly, COX-2 crystal
structure co-complexed with Celecoxib was selected for the studies [11].
For the inhibitors which are not co-crystallized with these enzymes, docking procedure was employed
to identify the favourable conformations at the active site. Structures in PDB, 2POU and 3LN1 were
employed for docking studies with CA-II and COX-2, respectively. Prior to docking, all the potential
ligands were prepared in DG-AMMOS [12] using AMMP force field sp4. Autodock program was used
for docking [13]. During the docking process, maximum number of conformer generation was set to 100
and other parameters were set to default values.
In this study, residue interaction networks for COX-2 (3LN1) and CA-II (2POU) [11,14] were obtained
from RING server [15]. Furthermore, the networks were visualized and analysed in Cytoscape [16].
The hydrogen bond pattern in protein structure was identified using the HB explore tool in RING
server [17] and was analysed in RINalyzer [18]. Further, graphlet counter was used to examine the
signature patterns made by different inhibitors with the active-site residues [19] (figure 1). Ligand
binding at the protein-active site is stabilized by the formation of non-covalent interactions. Hydrogen
bond interactions play a major role in proteins [20]. The residues present in protein-active site have
hydrogen bond connectivity with their neighbours and create a local hydrogen bond network. Ligand
binding leads to change in the graphlet signatures of the residues at the active site. In this study, only the
hydrogen bond network of the protein was considered and the effects of ligand binding on the signatures
at the active site were studied. The signature analysis method that provides an opportunity to analyse
the residues that are not in direct contact with inhibitors and are in secondary shell of active site is also
considered in this approach.
.................................................
3. Material and methods
rsos.royalsocietypublishing.org R. Soc. open sci. 1: 140306
Understanding and quantifying receptor–ligand interactions forms the core of computer-aided drug
discovery methods [1]. One such method, molecular docking, has been widely used owing to its
high speed and performance. The approximations in the scoring methods are one of the limitations
of these docking programs. Recently, to improve the performance of virtual screening experiments,
approaches like free energy perturbation methods, pharmacophore modelling, post docking consensus
scoring/tuned scoring, etc. are used in combination with docking methods [2]. There is a need for
development of more such methods that can improve the authenticity of virtual screening findings when
used alone or together with the existing methods.
System biology is an emerging discipline used to analyse and interpret various kinds of biological
networks [3]. Various graph theory methods, especially residue interaction networks, have been
extensively used to analyse protein structure and dynamics [4]. In the residue interaction network, amino
acids (AAs) are represented as nodes and edges represent the interactions among them. Reports suggest
that the protein–ligand interactions can also be explained in the form of network. The impact of a ligand
binding on protein network can be measured in terms of its network properties and it has substantial
effects on the closeness centrality of network [5].
The purpose of this study was to analyse the changes in local connectivity of each node (active
site residues) present in protein-active site after ligand binding in the residue interaction networks
and to relate these changes to compound activity. Each node present in protein-active site has its
own neighbourhood and local connectivity. Binding of ligand/substrate with active site residues is
mediated by the creation of new interactions. These new interactions will change the local connectivity
and neighbourhood of the active site residues and induce non-isomorphic sub-graphs in the proteinactive site with respect to an active site residue. Small connected non-isomorphic induced sub-graphs of
large network are termed as graphlets [6]. Graphlet signature-based analysis of biological networks has
been successfully applied extensively in various biological problems [7,8]. The importance of graphlet
signatures in networks led to the development of combinatorial approaches for graphlet counting
[9]. This study focuses on the identification of induced sub-graphs in the protein-active site after
ligand binding employing graphlet signature-based analysis of residue interaction networks and their
application to estimate binding affinity of various ligands.
2
Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
(a)
(b)
(c)
step I
step II
with inhibitor
without inhibitor
(c2)
step III
identification and selection of
active site residues (yellow)
step IV
(d1)
(d2)
AA1
AA1
AA2
addition of inhibitor ( j)
(blue)
no inhibitor
AA2
j
AA3
AA3
(e1)
step V
amino
acids
i = 1 to n
AA1
signatures at respective
AA
0
signatures formed at the active site with
every respective amino acid were studied
in the absence and presence of inhibitor
amino
acids
i = 1 to n
signatures at respective
AA
0
1
AA1
1
AA2
0
( f)
1
8
2
8
AA2
2
AA3
(e2)
2
step VI
determination of St, Sij, Si, Li
from the signatures identified
AA3
0
1
2
8
signature pool
collection of unique signatures made by ith amino
acid with ligand ‘j’ at respective AA’s 1 to n, Sij
collection of all the signatures identified as unique
signatures in the presence of all individual inhibitors, St
Sij / St
n
GSUS = S
(S /S + L / L)
i=1 i t i
Si = total number of unique signatures made by ith AA with all ligands
Li = total number of ligand making unique signature with ith AA
L = total number of ligands
Figure 1. Flowchart depicting the work plan: (a) crystal structure of CA-II retrieved from PDB; (b) residue interaction network of the
PDB structure with all types of non-covalent interactions generated by RING server; (c) residual interaction network of the hydrogen
bond interactions from RINanalyzer. Identification and selection of active site residue (yellow) and analysis of graphlet signature by using
GRAPHLET COUNTER in the absence of ligand (c1–e1) and in the presence of ligand (c2–e2) at the active site (yellow) and (f ) extraction
of new signatures and computation of GSUS.
The graphlet signatures corresponding to every AA in the active site were analysed before and after
ligand binding. Unique signatures, i.e. the signatures that are formed at the active site only after ligand
binding were identified. These are the signatures which exist only after ligand binding and are nonexistent in apoprotein structure.
A pool of all the unique signatures formed by the 30 inhibitors with respect to the residues present
in protein-active site was created. Furthermore, the information from these graphlet signatures was
employed to quantify affinity of every inhibitor in the dataset. Uniqueness score for each inhibitor was
.................................................
AA interaction network with
hydrogen bonds only
rsos.royalsocietypublishing.org R. Soc. open sci. 1: 140306
AA interaction network
with all type of interactions
three-dimensional structure
(c1)
3
Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
measured in terms of variability in signatures adopted by different AAs during inhibitor binding. The
uniqueness at these signatures was quantified as follows:
Sij /St
i=1
(Si /St + Li /L)
,
(3.1)
where
GSUS = graphlet signature uniqueness score for ligand j;
Sij = 73
k=1 max(0, Xijk − Xi∗ k ), where Xijk − Xi∗ k = {1 if Xijk = Xi∗ k , else 0}, Xi∗ k represent the total
number of signature in absence of ligand with respect to ith AA, Xijk represent the total number
of signature in presence of ligand with respect to ith AA in particular orbit k, and Sij represents
the number of unique signatures made by ith AA with inhibitor j;
total number of unique signatures made by all the ligands with all AAs (signature pool);
St = Si = Lj=1 Sij and it represents the total number of unique signatures made by ith AA with all the
ligands;
Li = Lj=1 min(Sij , 1) and it represents the number of ligands forming unique signatures with ith
AA; and
L = total number of ligands used in the dataset.
The correlation between the biological activity and predicted activity (GSUS and Autodock score with
default settings) for the compounds in the dataset was performed using Pearson’s correlation coefficient.
The pair-wise diversity of the compound dataset was measured. The dependency of the method on
diversity was illustrated to check if the methods hold good for various scaffolds. The similarity between
each pair of molecules was measured using Tanimoto coefficient in OpenBabel [21].
4. Results and discussion
Studies on the application of network properties to differentiate ligand selectivity have been reported
[22]. Graphlet signature methods have been successfully applied to identify functional residues in
protein-active site and ligand-binding site predictions [23,24]. This study focuses on the knowledge of
induced sub-graphs in protein-active site and their application to estimate and quantify protein–ligand
interactions is novel and promising.
4.1. Studies on COX-2
RING server was used to generate the residue interaction network of COX-2. All the AAs within 10 Å
radius from the centre of the active site were selected (electronic supplementary material, table S1). From
the hydrogen bond interaction networks formed, we analysed signatures with varying orbits present
at each selected residue. Collectively, these were termed as native signatures of active site as they are
present in the native structure of protein, i.e. non-inhibitor bound form. To analyse the hydrogen bond
interactions between inhibitor and enzyme, the complexes of 30 inhibitors with COX-2 were used. The
respective hydrogen bonds formed by each of the inhibitors at the active site were identified (electronic
supplementary material, table S2). Prior to the graphlet signature analysis, the individual inhibitors
were added to the protein–hydrogen bond network manually and the contacts were created between
the inhibitor and the respective hydrogen bond forming AAs. The graphlet signatures of each individual
inhibitor with the active site AAs were studied and unique signatures, i.e. the new signature formed
from non-existence at each of the AAs, were identified.
In some cases, it was observed that the same AA was involved in signatures with different orbits more
than once. Each of such situations was counted as unique signatures. All the unique signatures formed
with each of the selected AAs by every inhibitor were identified as listed in electronic supplementary
material, table S3. The total number of unique signatures for all the inhibitors with all the AAs was
found to be 761. This collection was termed as signature pool, St . All the quantified features were further
applied in the calculation of GSUS for each inhibitor using equation (3.1) (table 1). Calculation strategy
used for Celecoxib is shown in figure 2 with all the graphlet signature details. Valdecoxib had the highest
GSUS of 0.756, and lowest score was 0.0064 for Etodalac. Most of the active molecules were scored high
and least active were scored low. For all the 30 molecules, correlation coefficient was 0.55. The correlation
results clearly indicate positive association between biological activity and GSUS. The correlation of
the GSUS with pIC50 was almost the same as the correlation of Autodock scores with pIC50 , which
.................................................
n
rsos.royalsocietypublishing.org R. Soc. open sci. 1: 140306
GSUS =
4
Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
Table 1. Estimation of binding affinity of COX-2 inhibitors.
5
IC50 (nM)
pIC50
GSUS
Autodock score
1
6-methylnaphthylacetic acid
80 000
4.09691
0.16908121
−7.09
2
Piroxicam
70 000
4.154902
0.00913255
−8.13
3
Etodalac
60 000
4.221849
0.00639931
−7.49
4
Ibuprofen
40 000
4.39794
0.01738897
−7.04
5
flufenamic acid
20 000
4.69897
0.10528901
−7.1
6
ETYA
15 000
4.823909
0.02308672
−7.17
7
BW755C
10 000
5
0.07490419
−5.71
8
Lumiracoxib
7000
5.154902
0.02972949
−7.68
9
SC-560
6300
5.200659
0.0237849
−8.74
10
Etoricoxib
5000
5.30103
0.01738897
−11.16
11
Fenclofenac
4000
5.39794
0.09343407
−8.26
12
Ketoprofen
2500
5.60206
0.02308673
−8.71
13
Suprofen
2000
5.69897
0.01738897
−8.4
14
Naproxen
2000
5.69897
0.05585896
−7.15
15
Flurbiprofen
500
6.30103
0.01335906
−7.58
16
Nimuslide
500
6.30103
0.06857699
−8.98
17
Rofecoxib
500
6.30103
0.22399585
−10.79
18
Meloxicam
400
6.39794
0.49026752
−8.27
19
Licofelon
370
6.431798
0.03997682
−9.57
20
SC-58125
300
6.522879
0.01738897
−9.99
21
mefenamic acid
300
6.522879
0.018128
−7.56
22
Flosulide
130
6.886057
0.23717307
−8.85
23
CHEMBL257539
100
7
0.1167376
−8.65
24
Indisulam
100
7
0.12526685
−9.46
25
niflumic acid
100
7
0.23149516
−6.67
26
NS398
81
7.091515
0.05656345
−9.1
27
Celecoxib
50
7.30103
0.40196448
−10.35
28
Dichlofenac
9.4
8.02
0.44306776
−8.32
29
DUP-697
8.7
8.060481
0.01738894
−11.22
30
Valdecoxib
5
8.30103
0.7564579
−10.54
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
was −0.55. The major challenges for in silico binding affinity prediction methods are to differentiate
structurally similar molecules with different activities and structurally diverse molecules with similar
activity. To check the efficiency of GSUS method in such cases, subset of compounds were made based
on structure similarity (greater than 0.7) quantified by Tanimoto coefficient (electronic supplementary
material, table S5). Dichlofenac and Lumiracoxib have high similarity in structure but there is 700-fold
difference in their pIC50 values against COX-2. Similarly, four pairs of compounds, Ibuprofen/Naproxen,
Piroxicam/Meloxicam, SC-560/SC58125 and flufenamic acid/mefenamic acid have high structural
similarity and diverse pIC50 values. GSUS method was more accurate in differentiating active and
inactive molecules in the subsets. Autodock was unable to distinguish the activities of the molecules
with similar structures.
The performance of scoring method was also assessed for distinguishing pairs of inhibitors
with very low structural similarity and high activity similarity (electronic supplementary material,
table S6). COX-2 inhibitor pairs indomethacin and niflumic acid, SC58125 and mefenamic acid,
Flurbinprofen/Nimesulide, CHEMBL257539/indomethacin, etc. show very low structural similarity
.................................................
drug
rsos.royalsocietypublishing.org R. Soc. open sci. 1: 140306
no.
.........................................................................................................................................................................................................................
addition of Celecoxib
amino acids (shown in yellow) forming
unique signatures in the presence
of Celecoxib (i = 13)
active site
residues
(d)
Phe 504
Arg 499
Val 335
Gln 178
Gln 336
Leu 338
Ala 421
Tyr 334
Tyr 341
Arg 106
Ser 339
Ala 502
no. of unique signatures with Celecoxib, Sij
i=1
n
Sij / St
S (Si /St + Li / L) = 0.339
lle 503
GSUS =
total no. of ligands = 30
total no. of unique signatures identified, St = 761
Ser 339 = 18, Arg 106 = 7, Tyr 341 = 19, Ala 421= 4, val 335 = 18,
Leu 338 = 4, Arg 449 = 17, Tyr 334 = 4, Gln 336 = 10, Gln 178 = 5,
Ala 502 = 15, Ile 503 = 5, Phe 504 = 19
no. of ligands forming unique signatures with AAi, Li
Ser 339 = 73, Arg 106 = 47, Tyr 341 = 67, Ala 421= 5, val 335 = 36,
Leu 338 = 24, Arg 449 = 85, Tyr 334 = 7, Gln 336 = 23, Gln 178 = 38,
Ala 502 = 22, Ile 503 = 17, Phe 504 = 77
no. of unique signatures with all inhibitors, Si
Ser 339 = 6, Arg 106 = 10, Tyr 341 = 2, Ala 421=1, Val 335 = 3,
Leu 338 = 8, Arg 449 = 7, Tyr 334 = 4, Gln 336 = 3, Gln 178 = 8,
Ala 502 = 3, Ile 503 = 4, Phe 504 = 3
rsos.royalsocietypublishing.org R. Soc. open sci. 1: 140306
.................................................
identification of unique signatures formed at the 13 amino acid
residues in the presence of Celecoxib
Figure 2. Computation of GSUS of Celecoxib with COX-2: (a) AA interaction network; (b) selection of active site residues in hydrogen bond network; (c) Celecoxib induces unique graphlet signatures with respect to the AAs present
in the active site (yellow) and (d) various signature parameters formed with respect to individual AAs.
(c)
(b)
(a)
Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
6
Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
Table 2. Estimation of binding affinity of CA-II inhibitors.
7
IC50 (nM)
pIC50
GSUS
Autodock score
1
2-hydroxy-3-methylbenzoic acid
4 700 000
2.33
0.002865
−5.08
2
4-amino-2-hydroxybenzoic acid
750 000
3.12
0.044757
−4.6
3
2-hydroxy-5-sulfobenzoic acid
290 000
3.54
0.047693
−5.87
4
saccharin
5950
5.225483
0.009956
−4.32
5
(E)-6-oxo-3-(2-(4-(N-(pyridin-2yl)sulfamoyl)phenyl)hydrazono)
4490
5.35
0.177222
−6.97
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
cyclohexa-1,4-dienecarboxylic
acid
.........................................................................................................................................................................................................................
6
2-hydroxy-3,5-dinitrobenzoic acid
2800
5.55
0.016072
−5.33
7
3-(4-sulfamoylphenyl)propanoic
495
6.305395
0.044916
−6.18
.........................................................................................................................................................................................................................
acid
.........................................................................................................................................................................................................................
8
2-aminobenzenesulfonamide
295
6.530178
0.02644
−5.75
9
4-sulfamoylbenzoic acid
133
6.876148
0.105051
−5.58
10
2-hydrazinylbenzenesulfonamide
124
6.906578
0.098203
−6.21
11
4-amino-6-chlorobenzene-1,3disulfonamide
75
7.124939
0.089529
−7.07
4-amino-6-
63
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
12
7.200659
0.046369
−6.66
(trifluoromethyl)benzene-1,3disulfonamide
.........................................................................................................................................................................................................................
13
4-amino-3-
60
7.221849
0.02644
−5.24
fluorobenzenesulfonamide
.........................................................................................................................................................................................................................
14
4-amino-N-(4sulfamoylphenethyl)
50
7.30103
0.148955
−7.65
benzenesulfonamide
.........................................................................................................................................................................................................................
15
methazolamide
50
7.30103
0.114469
−4.79
16
4-amino-N-(4sulfamoylbenzyl)benzenesulfonamide
46
7.337242
0.137639
−7.15
17
sulpiride
40
7.39794
0.097432
−6.77
18
dichlorophenamide
38
7.420216
0.076751
−5.38
19
zonisamide
35
7.455932
0.055204
−6.95
20
4-((2-aminopyrimidin-4-
33
7.481486
0.162283
−5.79
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
yl)amino)benzenesulfonamide
.........................................................................................................................................................................................................................
21
Celecoxib
21
7.677781
0.088067
−6.56
22
5-imino-4-methyl-4,5-dihydro1,3,4-thiadiazole-2-sulfonamide
19
7.721246
0.095185
−5.35
23
indisulam
15
7.823909
0.082407
−6.83
24
acetazolamide
12
7.920819
0.0338
−4.66
25
topiramate
10
8
0.3007
−4.86
26
sulthiame
9
8.045757
0.03563
−4.39
27
benzolamide
9
8.045757
0.076824
−5.06
28
dorzolamide
9
8.045757
0.110511
−5.69
29
ethoxzolamide
8
8.09691
0.16463
−5.18
30
brinzolamide
3
8.522879
0.110511
−4.53
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.........................................................................................................................................................................................................................
.................................................
drug
rsos.royalsocietypublishing.org R. Soc. open sci. 1: 140306
no.
.........................................................................................................................................................................................................................
Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
The unique signature selection was performed using the same procedure as we used in COX-2. The
total number of unique signatures was found to be 1201 collectively for all the inhibitors (electronic
supplementary material, table S4). All the quantified features were further applied in the calculation
of GSUS for each inhibitor using equation (3.1) and it was observed that topiramate had the highest
GSUS of 0.3, and lowest value was 0.002 for 2-hydroxy-3-methylbenzoic acid (table 2). Correlation
coefficient has been calculated for pIC50 value and GSUS. Correlation coefficient was 0.40 for all the
30 molecules and was significant at the 0.05 level (two tailed). In the dataset of CA-II inhibitors
considered, three pairs of inhibitors, 2-aminobenzenesulfonamide/2-hydrazinylbenzenesulfonamide,
2-hydroxy-3-methylbenzoic acid/4-amino-2-hydroxybenzoic acid and 4-amino-6-chlorobenzene-1,3disulfonamide/dichlorophenamide, showed high structural similarity of Tanimoto coefficient greater
than 0.7 with activity difference of twofold, sixfold and twofold, respectively (electronic supplementary
material, table S7). The performance of GSUS was better in distinguishing the activities of these molecules
with similar structure.
The performance of GSUS method in CA-II was consistent with that in COX-2, unlike Autodock
which showed great difference in correlation coefficient in both of these enzymes. The results in this
study hint that GSUS method is promising, and it can be further improved into a more applicable
and reliable method. Its performance in distinguishing activities of structurally related and unrelated
compounds shows that it can be a part of virtual screening experiments employing multi-layered
screening methods.
Acknowledgements. The authors thank Central University of Himachal Pradesh for providing the necessary
computational facilities.
Funding statement. We duly acknowledge UGC for providing Rajiv Gandhi National fellowship to Omkar Singh.
Competing interests. We declare we have no competing interests.
References
1. Aparoy P, Reddy KK, Reddanna P. 2012 Structure and
ligand based drug design strategies in the
development of novel 5- LOX inhibitors. Curr. Med.
Chem. 19, 3763–3778.
(doi:10.2174/092986712801661112)
2. Cross JB, Thompson DC, Rai BK, Baber JC, Fan KY, Hu
Y, Humblet C. 2009 Comparison of several molecular
docking programs: pose prediction and virtual
screening accuracy. J. Chem. Inf. Model. 49,
1455–1474. (doi:10.1021/ci900056c)
3. Uetz P, Finley Jr RL. 2005 From protein networks to
biological systems. FEBS Lett. 579, 1821–1827.
(doi:10.1016/j.febslet.2005.02.001)
4. Bode C, Kovacs IA, Szalay MS, Palotai R, Korcsmaros
T, Csermely P. 2007 Network analysis of protein
dynamics. FEBS Lett. 581, 2776–2782.
(doi:10.1016/j.febslet.2007.05.021)
5. Hu Z, Bowen D, Southerland WM, del Sol A, Pan Y,
Nussinov R, Ma B. 2007 Ligand binding and circular
permutation modify residue interaction network in
DHFR. PLoS Comput. Biol. 3, e117.
(doi:10.1371/journal.pcbi.0030117)
6. Milenkovic T, Przulj N. 2008 Uncovering biological
network function via graphlet degree signatures.
Cancer Inform. 6, 257–273.
7. Wang XD, Huang JL, Yang L, Wei DQ, Qi YX, Jiang ZL.
2014 Identification of human disease genes from
interactome network using graphlet interaction.
PLoS ONE 9, e86142.
(doi:10.1371/journal.pone.0086142)
8. Milenkovic T, Memisevic V, Ganesan AK, Przulj N.
2010 Systems-level cancer gene identification from
protein interaction network topology applied to
melanogenesis-related functional genomics data.
J. R. Soc. Interface 7, 423–437. (doi:10.1098/rsif.
2009.0192)
9. Hocevar T, Demsar J. 2014 A combinatorial approach
to graphlet counting. Bioinformatics 30, 559–565.
(doi:10.1093/bioinformatics/btt717)
10. Bernstein FC, Koetzle TF, Williams GJ, Meyer Jr EF,
Brice MD, Rodgers JR, Kennard O, Shimanouchi T,
Tasumi M. 1978 The Protein Data Bank: a
computer-based archival file for macromolecular
structures. Arch. Biochem. Biophys. 185, 584–591.
(doi:10.1016/0003-9861(78)90204-7)
11. Wang JL et al. 2010 The novel benzopyran
class of selective cyclooxygenase-2 inhibitors.
Part 2. The second clinical candidate having a
shorter and favorable human half-life. Bioorg. Med.
Chem. Lett. 20, 7159–7163. (doi:10.1016/
j.bmcl.2010.07.054)
12. Lagorce D, Pencheva T, Villoutreix BO, Miteva MA.
2009 DG-AMMOS: a new tool to generate 3D
conformation of small molecules using distance
geometry and automated molecular mechanics
optimization for in silico screening. BMC Chem. Biol.
9, 6. (doi:10.1186/1472-6769-9-6)
13. Morris GM, Goodsell DS, Halliday RS, Huey R, Hart
WE, Belew RK, Olson AJ. 1998 Automated docking
using a Lamarckian genetic algorithm and an
14.
15.
16.
17.
18.
empirical binding free energy function. J. Comput.
Chem. 19, 1639–1662. (doi:10.1002/(SICI)1096-987X
(19981115)19:14<1639::AID-JCC10>3.0.CO;2-B)
Alterio V, De Simone G, Monti SM, Scozzafava A,
Supuran CT. 2007 Carbonic anhydrase inhibitors:
inhibition of human, bacterial, and archaeal
isozymes with
benzene-1,3-disulfonamides—solution and
crystallographic studies. Bioorg. Med. Chem. Lett. 17,
4201–4207. (doi:10.1016/j.bmcl.2007.05.045)
Alberto JM, Martin MV, Filippo B, Tomás Di D, Ian W,
Silvio CET. 2011 RING: networking interacting
residues, evolutionary information and energetics
in protein structures. Bioinformatics 27, 2003–2005.
(doi:10.1093/bioinformatics/btr191)
Smoot ME, Ono K, Ruscheinski J, Wang PL,
Ideker T. 2111 Cytoscape 2.8: new features for data
integration and network visualization.
Bioinformatics 27, 431–432. (doi:10.1093/
bioinformatics/btq675)
Lindauer K, Bendic C, Suhnel J. 1996
HBexplore—a new tool for identifying and
analysing hydrogen bonding patterns in
biological macromolecules. Comput. Appl. Biosci. 12,
281–289.
Doncheva NT, Assenov Y, Domingues FS, Albrecht M.
2012 Topological analysis and interactive
visualization of biological networks and protein
structures. Nat. Protoc. 7, 670–685.
(doi:10.1038/nprot.2012.004)
.................................................
4.2. Studies on CA-II
8
rsos.royalsocietypublishing.org R. Soc. open sci. 1: 140306
but their activity against COX-2 is almost the same. GSUS method was more accurate in the
activity prediction of these molecules and the results show clearly that GSUS is more efficient in
differentiating similar structure molecules with varied activity and diverse structure molecules with
similar activity.
Downloaded from http://rsos.royalsocietypublishing.org/ on July 12, 2017
23. Vacic V, Iakoucheva LM, Lonardi S, Radivojac P. 2010
Graphlet kernels for prediction of functional
residues in protein structures. J. Comput. Biol. 17,
55–72. (doi:10.1089/cmb.2009.0029)
24. Bertolazzi P, Guerra C, Liuzzi G. 2014 Predicting
protein-ligand and protein-peptide interfaces. Eur.
Phys. J. Plus 129, 132. (doi:10.1140/epjp/i201414132-1)
9
.................................................
21. O’Boyle NM, Banck M, James CA, Morley C,
Vandermeersch T, Hutchison GR. 2011 Open Babel:
an open chemical toolbox. J. Cheminform. 3, 33.
(doi:10.1186/1758-2946-3-33)
22. Lalanne J-B, François P. 2013 Principles of adaptive
sorting revealed by in silico evolution. Phys. Rev.
Lett. 110, 218102. (doi:10.1103/
PhysRevLett.110.218102)
rsos.royalsocietypublishing.org R. Soc. open sci. 1: 140306
19. Whelan C, Sonmez K. 2012 Computing graphlet
signatures of network nodes and motifs in
Cytoscape with GraphletCounter. Bioinformatics
28, 290–291. (doi:10.1093/bioinformatics/
btr637)
20. Hubbard RE, Kamran Haider M. 2001 Hydrogen
bonds in proteins: role and strength. New York, NY:
John Wiley & Sons, Ltd.