IJCB 44B(8) 1693-1707

Indian Journal of Chemistry
Vol. 44B, August 2005, pp. 1693-1707
QSPR with TAU indices: Molar refractivity of diverse functional acyclic
compounds
Kunal Roy* & Achintya Saha(a)
Drug Theoretics and Cheminformatics Lab, Division of Medicinal and Pharmaceutical Chemistry,
Department of Pharmaceutical Technology, Jadavpur University, Calcutta 700 032, India
(a) Department of Chemical Technology, University of Calcutta, 92, A P C Road, Calcutta 700 009, India
E-mail: [email protected], URL: http://www.geocities.com/kunalroy_in
Received 19 July 2004; accepted (revised) 17 November 2004
Molar refractivity of diverse functional acyclic compounds (n = 166) has been correlated with first order TAU indices
to unravel the diagnostic feature of the TAU scheme. TAU relations satisfactorily explain the
variance of the molar refractivity values of diverse functional compounds (up to 98.6% predicted variance and explained
variance for the composite set of 166 compounds), especially when the first order composite topochemical index is
partitioned into different components. Moreover, specific contributions of functionality, branching, shape and size terms to
the molar refractivity values can be derived from the relations involving TAU parameters. Molar
refractivity increases with molecular bulk. Further, branching makes a specific contribution to molar refraction
depending on the type of ramification: it contributes negatively to molar refractivity for compounds with the same molecular
bulk. A negative impact of hydroxy, amino and oxy groups and a positive impact of bromo and iodo functionalities on molar
refraction are also observed. The predicted MR values based on a selected TAU model are also compared with the
MR values calculated according to Crippen's fragmentation method.
Keywords. QSPR, TAU, VEM, Topological index, Molar refractivity
IPC: Int.Cl.7 C 07 C
Topological indices are two-dimensional descriptors
of molecular structure formulated in the graph theoretic
approach1,2, which defines chemical constitution by
the number and kind of atoms and the linkages among
them. These indices encode structural information like
atomic arrangements, size, shape, branching,
cyclicity, presence of hetero-atoms and unsaturation
in numerical form for correlation of
chemical structure with various physical properties,
chemical reactivity or biological activity3-12. Usually,
the numerical basis of topological indices is either the
adjacency matrix or the topological distance matrix13.
A plethora of such descriptors has been described in
the last three decades and their usefulness in
quantitative structure-activity relationship (QSAR)
and quantitative structure-property relationship
(QSPR) studies has been extensively examined5. Among
these, some of the most commonly used indices are
Wiener path number14, molecular connectivity
indices15, kappa shape indices16, electrotopological
state atom index17, 18, Balaban indices19, Basak
indices20, etc. Topological indices may be computed
for whole molecules, for fragments, or for atoms.
Apart from exploring suitable QSAR or QSPR
relations21-23, these indices may be used for
classification of bioactive molecules, carcinogens and
environmental pollutants24-26. Recently, application of
topological descriptors in drug design has been
reviewed by Estrada et al27.
An important task of a QSAR worker is to identify
appropriate descriptors that are representative of the
molecular features responsible for the relevant
activity or property. A single descriptor may be
insufficient in many cases in describing structure-activity/property relations as different molecular
features are encoded by different descriptors. Newer
topological descriptors are being reported by different
groups of workers and their usefulness in
QSAR/QSPR is being explored as evidenced from
some recent reports28-32.
Molar refractivity is a very important physicochemical parameter that has historically made a
significant contribution to the understanding of
bonding electrons in organic molecules. Although
molar refractivity has, to a great extent, been
displaced from the frontier of structure determination
by newer, more sophisticated techniques, it is now
being looked upon with resurgent interest as a
parameter for use in QSAR, especially in biological
systems. Molar refractivity, being an additive-constitutive property, can be calculated from
atom and bond contributions and various correction
factors33. In spite of this, attempts have been made by
different QSAR workers to model molar refractivity
using different indices to check the applicability of
such indices in modeling studies13, 34.
Topochemically arrived unique (TAU) descriptors
were introduced by Pal et al.35-38 in the late eighties.
However, these indices have not been thoroughly
explored to prove their utility in QSAR/QSPR studies.
In this paper, as a continuation of our recent efforts to
model various physicochemical and biological
properties using TAU indices39-45, an attempt has been
made to explore the diagnostic and predictive
potential of TAU descriptors using molar refractivity
(Rm) of diverse functional acyclic compounds as the
model data set. We have also compared the predicted
molar refractivity values based on TAU models with
the molar refractivity values calculated according to
Crippen's fragmentation method46.
Materials and Methods
Molar refractivity values of diverse functional
compounds were taken from reference 13. The
method of calculation of the TAU indices35-38 is
described below with an illustrative example.
In the TAU scheme, an atom is considered to be
composed of core (non-valence) and valence
electrons, and an indicator of core environment, core
count (λ), is defined as the ratio of number of core
electrons to that of valence electrons. Obviously, 1/λ
implies the strength of positive field of the atomic
core. The valence electrons are again partitioned into
two terms, localised and mobile. The mobile valence
electronic environment, identified as θ, is defined as
follows (see Eqn 1).
$\theta = 8 - (2h + 1.5\nu + 2l)$ … (1)
In the above Eqn 1, h, ν and l indicate the number
of hydrogen atoms attached, sigma bonds (other than
hydrogen) and lone pair of electrons of the atom
respectively. It is considered that the pair of electrons
forming a covalent bond with a hydrogen atom is
predominantly enjoyed (like lone pair of electrons) by
the atom to which it is bonded (and thus forms a self
loop on the molecular graph). Further, it is assumed,
as a simplifying condition, that an atom enjoys,
besides its own electron, fifty percent of the other
electron in a sigma bond with a non-hydrogen atom.
Thus, the mobile valence electron (VEM)
environment index (θ) is obtained by subtracting
these localised electronic contributions from 8, since
eight electrons make up the valence electronic
environment around a bonded atom according to the
octet rule.
In presence of π electrons, θ is defined as follows
(see Eqn 2).
$\theta = 0.5\nu + 2\pi$ … (2)
Thus, π electrons and σ electrons are given unequal
weights in the TAU scheme.
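As a minimal sketch of Eqns 1 and 2, the core count λ and the mobile valence electron environment θ can be evaluated with exact fractions. The function names are hypothetical, and taking π in Eqn 2 as the number of π bonds on the atom is an assumption made here for illustration.

```python
from fractions import Fraction


def core_count(core_electrons, valence_electrons):
    """Core count (lambda): ratio of core to valence electrons."""
    return Fraction(core_electrons, valence_electrons)


def theta_sigma(h, nu, lone_pairs):
    """Eqn 1: theta = 8 - (2h + 1.5*nu + 2l), for an atom without pi electrons."""
    return Fraction(8) - (2 * h + Fraction(3, 2) * nu + 2 * lone_pairs)


def theta_pi(nu, pi):
    """Eqn 2: theta = 0.5*nu + 2*pi (pi assumed to be the number of pi bonds)."""
    return Fraction(1, 2) * nu + 2 * pi


# Methyl carbon: 3 H, 1 sigma bond to a non-hydrogen atom, no lone pair.
theta_methyl = theta_sigma(h=3, nu=1, lone_pairs=0)
# Hydroxyl oxygen: 1 H, 1 sigma bond to carbon, 2 lone pairs.
theta_hydroxyl_o = theta_sigma(h=1, nu=1, lone_pairs=2)
# Olefinic =CH- carbon: 2 sigma bonds to non-hydrogen atoms, 1 pi bond.
theta_olefinic_ch = theta_pi(nu=2, pi=1)

lambda_carbon = core_count(2, 4)   # 2 core and 4 valence electrons
lambda_oxygen = core_count(2, 6)   # 2 core and 6 valence electrons
```

Exact fractions are used so that values such as 1/2 and 1/3 are not blurred by floating-point rounding.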
The VEM vertex weight (Vi) of the ith vertex and
VEM edge weight (Eij) of the edge formed by the ith
and jth vertices are defined as follows (see Eqns 3 and
4).
$V_i = \lambda_i / \theta_i$ … (3)
$E_{ij} = (V_i V_j)^{1/2}$ … (4)
Finally, the first order VEM molecular index (T) of
a molecular graph is defined by the algebraic sum of
all VEM edge weights (Eqn 5).
$T = \sum_{i<j} E_{ij}$ … (5)
The VEM edge weight of the edge incident on a
hetero-atom is assigned a negative value to account
for the difference of electronegativity between two
vertices of the edge. The first order composite
topochemical index (T) is partitioned into two
components, viz., functionality (F) and skeletal index
(TR). The first order skeletal index (TR) is obtained by
replacing the hetero-atoms with carbon and/or
removing multiple bonds, if present. TR is considered
as an index of intrinsic lipophilicity and the (overall)
functionality contribution (mainly electronic) is
represented by F35, 36, 39, 40. In case of monofunctional
compounds, functionality (F) calculated by the usual
method [as shown in Eqn (6) (vide infra)] represents
the contribution of the concerned functional group. TR
is again split into two components, viz., branching
index (B) and vertex count (NV). Branching index (B)
is obtained by subtracting TR from the first order
VEM index of the corresponding normal alkane (TN).
NV is a constitutional parameter and is indicative of
molecular bulk. The overall relation is represented by
Eqn (6).
$T = T_R - F = T_N - B - F$ … (6)
Vertex count (NV) of the hydrogen-suppressed
molecular formula is purely a constitutional parameter
as it is obtained directly from the molecular graph.
Obviously, any index showing better correlation with
physicochemical or biological activity than that
shown by NV will have significance in the context of
QSPR/QSAR studies. NV can further be partitioned
into NP, NI and NB denoting the numbers of primary,
secondary and branched carbons
respectively. NB is further split into NY (number of
tertiary carbons) and NX (number of quaternary
carbons). NP, NX and NY are considered as the shape
parameters38. Although such integer indices may have
been used by other workers also, these are
obtained in the TAU scheme by sequential
partitioning of the composite index. During
development of QSAR equations with TAU
parameters, the abovementioned hierarchical relations
are followed. For obvious reasons, B and NB (both
represent branching) or NP and NB (both have
interrelation)38 or NV and NI (NI may be considered
as trimmed counterpart of NV)38 are not used in the
same equation.
Calculation of TAU indices is illustrated in Table I
taking the example of 2-methyl-4-penten-3-ol.
Molar refractivity values (Rm)13 of diverse
functional acyclic compounds [Table II; compounds
1-32 (alcohols); 33-78 (alkanes); 79-117 (alkenes);
118-139 (amines); 140-149 (ethers); 150-166
(halocarbons)] were correlated by linear regression
with the first order VEM molecular index (T)
or combinations of its different component parts. The
objective of the study was to explore the information
extractable from the first order topochemical index after
its suitable partitioning into different terms.
A GW-BASIC program RRR98 developed by one
of the authors47 was used for multiple linear
regression analyses. Statistical quality of the
equations48 was judged by examining parameters
like Ra2 (adjusted R2, i.e., explained variance), r or R
(correlation coefficient), F (variance ratio) with df
(degrees of freedom), s (standard error of estimate) and
AVRES (average of absolute values of residuals).
Significance of the regression coefficients was judged
by the standard errors of the coefficients and the 't' test.
When the intercept of an equation was statistically
insignificant and its omission did not affect
the quality of the equation, the intercept was excluded
to give a statistically more acceptable equation. A
compound was considered an outlier for a
particular equation when its residual exceeded twice
the standard error of estimate of the equation.
Averages of signed values of residuals were also
Table I ⎯ Calculation of TAU indices: Example of 2-methyl-4-penten-3-ol
[Molecular graphs of 2-methyl-4-penten-3-ol, its reference alkane and the normal alkane, with vertices numbered 1-7 and edges labelled a-f]

Vertex Count
Graph  1  2  3  4  5  6  7
2-Methyl-4-penten-3-ol  1  1/3  1/3  1/6  1/5  1  2/3
Reference alkane  1  1/3  1/3  1/2  1  1  1
Normal alkane  1  1/2  1/2  1/2  1/2  1/2  1

Edge Count
Graph  a  b  c  d  e  f
2-Methyl-4-penten-3-ol  0.577  0.333  0.236  0.183  0.577  -0.471
Reference alkane  0.577  0.333  0.408  0.707  0.577  0.577
Normal alkane  0.707  0.500  0.500  0.500  0.500  0.707

T = 1.436; TR = 3.181; TN = 3.414; F = 1.745; B = 0.233
NV = 7; NP = 4; NI = 1; NB = 2; NY = 2; NX = 0
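The Table I example can be re-traced with a short script. This is a sketch under stated assumptions: the vertex weights are the values listed in Table I, while the connectivity (vertex 6 as the methyl branch on carbon 2, vertex 7 as the hydroxyl oxygen on carbon 3) and the function name `vem_index` are assumptions introduced here for illustration.

```python
from math import sqrt

# VEM vertex weights (V = lambda/theta) for vertices 1-7, as listed in Table I.
V_COMPOUND = {1: 1, 2: 1/3, 3: 1/3, 4: 1/6, 5: 1/5, 6: 1, 7: 2/3}
V_REFERENCE = {1: 1, 2: 1/3, 3: 1/3, 4: 1/2, 5: 1, 6: 1, 7: 1}
V_NORMAL = {1: 1, 2: 1/2, 3: 1/2, 4: 1/2, 5: 1/2, 6: 1/2, 7: 1}

# Assumed connectivity of the hydrogen-suppressed graphs.
BRANCHED_EDGES = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6), (3, 7)]
CHAIN_EDGES = [(i, i + 1) for i in range(1, 7)]
HETERO_ATOMS = {7}  # the hydroxyl oxygen in the parent compound


def vem_index(weights, edges, hetero=frozenset()):
    """Eqns 4 and 5: sum of edge weights (Vi*Vj)**0.5, with edges
    incident on a hetero-atom counted as negative."""
    total = 0.0
    for i, j in edges:
        edge = sqrt(weights[i] * weights[j])
        if i in hetero or j in hetero:
            edge = -edge
        total += edge
    return total


T = vem_index(V_COMPOUND, BRANCHED_EDGES, HETERO_ATOMS)
TR = vem_index(V_REFERENCE, BRANCHED_EDGES)
TN = vem_index(V_NORMAL, CHAIN_EDGES)
F = TR - T   # functionality, from Eqn 6
B = TN - TR  # branching index, from Eqn 6

print(f"T={T:.3f} TR={TR:.3f} TN={TN:.3f} F={F:.3f} B={B:.3f}")
```

Under these assumptions the computed indices agree with the Table I entries to within about 0.002; the small differences come from summing unrounded rather than three-decimal edge weights.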
Table II ⎯ Topochemical indices and observed and calculated molar refractions (Rm) of diverse functional aliphatic compounds
No.  Compound  T  TR  TN  Rm (Obs.a)  Rm (Calc.)  Rm (Calc.h)
1  2-Propanol  0.683  1.731  1.914  17.705  17.463b  17.757
2  1-Propanol  0.630  1.914  1.914  17.529  17.503b  17.210
3  2-Methyl-1-propanol  0.985  2.270  2.414  22.103  22.096b  21.793
4  1-Butanol  1.130  2.414  2.414  22.067  22.136b  21.850
5  2-Methyl-2-butanol  1.652  2.561  2.914  26.722  26.619b  27.073
6  2-Pentanol  1.721  2.769  2.914  26.680  26.729b  27.037
7  3-Methyl-1-butanol  1.485  2.769  2.914  26.904  26.729b  26.436
8  2-Methyl-1-butanol  1.523  2.807  2.914  26.697  26.729b  26.436
9  1-Pentanol  1.630  2.914  2.914  26.801  26.770b  26.490
10  3-Pentanol  1.759  2.807  2.914  26.618  26.729b  27.037
11  2-Methyl-2-pentanol  2.152  3.061  3.414  31.211  31.252b  31.713
12  3-Methyl-3-pentanol  2.213  3.121  3.414  31.183  31.252b  31.716
13  4-Methyl-2-pentanol  2.076  3.124  3.414  31.351  31.322b  31.623
14  2-Methyl-3-pentanol  2.131  3.179  3.414  31.138  31.322b  31.623
15  4-Methyl-1-pentanol  1.985  3.269  3.414  31.489  31.363b  31.076
16  2-Methyl-1-pentanol  2.023  3.307  3.414  31.164  31.363b  31.076
17  2-Ethyl-1-butanol  2.061  3.345  3.414  31.180  31.363b  31.076
18  1-Hexanol  2.130  3.414  3.414  31.429  31.403b  31.130
19  2,4-Dimethyl-3-pentanol  2.503  3.551  3.914  35.675  35.916b  36.209
20  3-Ethyl-3-pentanol  2.774  3.682  3.914  35.822  35.885b  36.356
21  2-Methyl-1-hexanol  2.523  3.807  3.914  35.931  35.996b  35.716
22  1-Heptanol  2.630  3.914  3.914  36.094  36.036b  35.770
23  2-Methyl-2-heptanol  3.152  4.061  4.414  40.899  40.518b  40.993
24  3-Methyl-3-heptanol  3.213  4.121  4.414  40.447  40.518b  40.996
25  4-Methyl-4-heptanol  3.213  4.121  4.414  40.439  40.518b  40.996
26  6-Methyl-1-heptanol  2.985  4.269  4.414  40.737  40.629b  40.356
27  2-Ethyl-1-hexanol  3.061  4.345  4.414  40.625  40.629b  40.356
28  n-Octanol  3.130  4.414  4.414  40.638  40.669b  40.410
29  2,6-Dimethyl-4-heptanol  3.469  4.517  4.914  45.521  45.182b  45.489
30  2-Methyl-2-octanol  3.652  4.561  4.914  45.207  45.151b  45.633
31  4-Ethyl-4-heptanol  3.774  4.682  4.914  44.920  45.151b  45.636
32  2,2-Dimethyl-1-butanol  1.837  3.121  3.414  31.269  31.252b  30.758
33  n-Pentane  2.414  2.414  2.414  25.267  25.365c  25.120
34  2-Methylbutane  2.269  2.269  2.414  25.294  25.284c  25.066
35  n-Hexane  2.914  2.914  2.914  29.981  29.963c  29.760
36  3-Methylpentane  2.807  2.807  2.914  29.949  29.882c  29.706
37  2-Methylpentane  2.769  2.769  2.914  29.804  29.882c  29.706
38  2,2-Dimethylbutane  2.561  2.561  2.914  29.938  29.882c  29.388
39  2,3-Dimethylbutane  2.641  2.641  2.914  29.813  29.801c  29.652
40  n-Heptane  3.414  3.414  3.414  34.555  34.561c  34.400
41  2-Methylhexane  3.269  3.269  3.414  34.595  34.480c  34.346
42  3-Methylhexane  3.307  3.307  3.414  34.464  34.480c  34.346
43  3-Ethylpentane  3.345  3.345  3.414  34.287  34.480c  34.346
Table II ⎯ Contd
No.  Compound  T  TR  TN  Rm (Obs.a)  Rm (Calc.)  Rm (Calc.h)
44  2,2-Dimethylpentane  3.061  3.061  3.414  34.621  34.456c  34.028
45  2,3-Dimethylpentane  3.179  3.179  3.414  34.328  34.399c  34.292
46  2,4-Dimethylpentane  3.124  3.124  3.414  34.623  34.399c  34.292
47  3,3-Dimethylpentane  3.121  3.121  3.414  34.336  34.456c  34.028
48  2,2,3-Trimethylbutane  2.943  2.943  3.414  34.378  34.375c  33.974
49  n-Octane  3.914  3.914  3.914  39.194  39.159c  39.040
50  2-Methylheptane  3.769  3.769  3.914  39.234  39.078c  38.986
51  3-Methylheptane  3.807  3.807  3.914  39.102  39.078c  38.986
52  4-Methylheptane  3.807  3.807  3.914  39.119  39.078c  38.986
53  3-Ethylhexane  3.845  3.845  3.914  38.946  39.078c  38.986
54  2,2-Dimethylhexane  3.561  3.561  3.914  39.255  39.054c  38.668
55  2,3-Dimethylhexane  3.679  3.679  3.914  38.983  38.997c  38.932
56  2,4-Dimethylhexane  3.662  3.662  3.914  39.132  38.997c  38.932
57  2,5-Dimethylhexane  3.624  3.624  3.914  39.261  38.997c  38.932
58  3,3-Dimethylhexane  3.621  3.621  3.914  39.011  39.054c  38.668
59  3,4-Dimethylhexane  3.717  3.717  3.914  38.864  38.997c  38.932
60  2-Methyl-3-ethylpentane  3.717  3.717  3.914  38.838  38.997c  38.932
61  3-Methyl-3-ethylpentane  3.682  3.682  3.914  38.719  39.054c  38.668
62  2,2,3-Trimethylpentane  3.481  3.481  3.914  38.927  38.973c  38.614
63  2,2,4-Trimethylpentane  3.416  3.416  3.914  39.264  38.973c  38.614
64  2,3,3-Trimethylpentane  3.503  3.503  3.914  38.764  38.973c  38.614
65  2,3,4-Trimethylpentane  3.551  3.551  3.914  38.870  38.916c  38.878
66  n-Nonane  4.414  4.414  4.414  43.846  43.757c  43.680
67  2,2,5-Trimethylhexane  3.916  3.916  4.414  43.939  43.571c  43.254
68  2,4,4-Trimethylhexane  3.976  3.976  4.414  43.663  43.571c  43.254
69  3,3-Diethylpentane  4.243  4.243  4.414  43.117  43.652c  43.308
70  2,2,3,3-Tetramethylpentane  3.811  3.811  4.414  43.218  43.546c  42.936
71  2,2,3,4-Tetramethylpentane  3.853  3.853  4.414  43.439  43.490c  43.200
72  2,2,4,4-Tetramethylpentane  3.707  3.707  4.414  43.878  43.546c  42.936
73  2,3,3,4-Tetramethylpentane  3.885  3.885  4.414  43.205  43.490c  43.200
74  2,4-Dimethyl-3-iso-propylpentane  4.461  4.461  4.914  47.913  48.031c  48.104
75  2,2,4,5-Tetramethylhexane  4.326  4.326  4.914  48.262  48.087c  47.840
76  2,2,5,5-Tetramethylhexane  4.207  4.207  4.914  48.575  48.144c  47.576
77  2,2,3,4,4-Pentamethylpentane  4.154  4.154  4.914  47.988  48.063c  47.522
78  2,2,3,3-Tetramethylhexane  4.311  4.311  4.914  47.905  48.144c  47.576
79  2-Methyl-2-butene  1.319  2.269  2.414  24.955  24.967d  25.066
80  1-Pentene  1.679  2.414  2.414  24.858  24.934d  25.120
81  3,3-Dimethyl-1-butene  1.887  2.561  2.914  29.598  29.556d  29.388
82  2,3-Dimethyl-2-butene  1.656  2.641  2.914  29.590  29.560d  29.652
83  2,3-Dimethyl-1-butene  1.920  2.641  2.914  30.063  29.560d  29.652
84  4-Methyl-1-pentene  2.034  2.769  2.914  29.542  29.526d  29.706
85  2-Methyl-1-pentene  2.022  2.769  2.914  29.398  29.526d  29.706
86  2-Methyl-2-pentene  1.907  2.769  2.914  29.754  29.526d  29.706
Table II ⎯ Contd
No.  Compound  T  TR  TN  Rm (Obs.a)  Rm (Calc.)  Rm (Calc.h)
87  3-Methyl-1-pentene  2.111  2.807  2.914  29.485  29.526d  29.706
88  2-Ethyl-1-butene  2.118  2.807  2.914  29.391  29.526d  29.706
89  1-Hexene  2.179  2.914  2.914  29.208  29.493d  29.760
90  2,3,3-Trimethyl-1-butene  2.236  2.943  3.414  33.980  34.149d  33.974
91  4,4-Dimethyl-1-pentene  2.325  3.061  3.414  34.233  34.115d  34.028
92  3,3-Dimethyl-1-pentene  2.448  3.121  3.414  34.011  34.115d  34.028
93  2,3-Dimethyl-2-pentene  2.252  3.179  3.414  34.203  34.119d  34.292
94  3,4-Dimethyl-1-pentene  2.483  3.179  3.414  33.919  34.119d  34.292
95  3,4-Dimethyl-2-pentene  2.314  3.179  3.414  33.900  34.119d  34.292
96  3-Methyl-2-ethyl-1-butene  2.516  3.179  3.414  34.024  34.119d  34.292
97  2,3-Dimethyl-1-pentene  2.458  3.179  3.414  34.005  34.119d  34.292
98  5-Methyl-1-hexene  2.534  3.269  3.414  34.139  34.085d  34.346
99  2-Methyl-2-hexene  2.407  3.269  3.414  34.395  34.085d  34.346
100  2-Methyl-1-hexene  2.522  3.269  3.414  34.114  34.085d  34.346
101  2-Methyl-3-hexene  2.553  3.269  3.414  34.223  34.085d  34.346
102  4-Methyl-1-hexene  2.572  3.307  3.414  34.078  34.085d  34.346
103  2-Ethyl-1-pentene  2.351  3.307  3.414  33.991  34.085d  34.346
104  3-Ethyl-1-pentene  2.649  3.345  3.414  34.039  34.085d  34.346
105  3-Ethyl-2-pentene  2.512  3.345  3.414  34.117  34.085d  34.346
106  1-Heptene  2.679  3.414  3.414  34.136  34.052d  34.400
107  2,4,4-Trimethyl-2-pentene  2.615  3.416  3.914  39.015  38.708d  38.614
108  2,4,4-Trimethyl-1-pentene  2.668  3.416  3.914  38.769  38.708d  38.614
109  2,3,4-Trimethyl-1-pentene  2.830  3.551  3.914  38.836  38.711d  38.878
110  3,3,4-Trimethyl-1-pentene  2.830  3.503  3.914  38.499  38.708d  38.614
111  2-iso-Propyl-3-methyl-1-butene  2.914  3.551  3.914  38.385  38.711d  38.878
112  5,5-Dimethyl-1-hexene  2.825  3.561  3.914  38.785  38.674d  38.668
113  4,4-Dimethyl-1-hexene  2.886  3.621  3.914  38.643  38.674d  38.668
114  3,3-Dimethyl-1-hexene  2.948  3.621  3.914  38.548  38.674d  38.668
115  2,5-Dimethyl-3-hexene  2.947  3.624  3.914  38.823  38.678d  38.932
116  3,5-Dimethyl-1-hexene  2.966  3.662  3.914  38.764  38.678d  38.932
117  3-Ethyl-4-methyl-1-pentene  3.021  3.717  3.914  38.591  38.678d  38.932
118  Trimethylamine  -1.550  1.731  1.914  19.595  19.775e  18.809
119  1-Aminopropane  0.575  1.914  1.914  19.401  20.100e  19.820
120  2-Amino-2-methylpropane  1.053  2.000  2.414  24.257  21.499e  24.282
121  1-Aminobutane  1.075  2.414  2.414  24.079  24.651e  24.460
122  1-Amino-2,2-dimethyl-propane  1.221  2.561  2.914  28.471  26.050e  28.728
123  1-Amino-3-methylbutane  1.430  2.769  2.914  28.672  28.877e  29.047
124  3-Aminopentane  1.714  2.807  2.914  28.617  28.877e  29.168
125  Dipropylamine  1.520  3.414  3.414  33.515  33.753e  33.467
126  1-Aminopentane  1.575  2.914  2.914  28.728  29.202e  29.101
127  3-Amino-2,2-dimethyl-butane  1.849  2.943  3.414  25.098  30.276e  33.435
128  Di-iso-propylamine  1.578  3.124  3.414  33.641  33.103e  33.530
129  Butyldimethylamine  0.308  3.269  3.414  33.816  33.428e  32.888
Table II ⎯ Contd
No.  Compound  T  TR  TN  Rm (Obs.a)  Rm (Calc.)  Rm (Calc.h)
130  Triethylamine  1.025  3.345  3.414  33.794  33.428e  33.203
131  Butylethylamine  1.520  3.414  3.414  33.452  33.753e  33.467
132  1-Aminohexane  2.075  3.414  3.414  33.290  33.753e  33.741
133  Dimethylpentylamine  0.808  3.769  3.914  38.281  37.979e  37.528
134  2-Aminoheptane  2.676  3.769  3.914  38.038  37.979e  38.448
135  1-Aminoheptane  2.575  3.914  3.914  38.003  38.304e  38.381
136  Di-iso-butylamine  2.230  4.124  4.414  42.920  42.205e  42.639
137  Dimethyl-iso-butylamine  0.163  3.124  3.414  33.852  33.103e  32.833
138  Tripropylamine  2.525  4.845  4.914  47.783  47.080e  47.123
139  1-Aminononane  3.575  4.914  4.914  47.277  47.405e  47.661
140  Methyl propyl ether  0.222  2.414  2.414  22.049  22.477f  22.117
141  Diethyl ether  0.598  2.414  2.414  22.493  22.477f  22.513
142  n-Butyl methyl ether  0.722  2.914  2.914  27.021  27.105f  26.613
143  sec-Butyl ethyl ether  1.658  3.307  3.414  31.560  31.756f  31.979
144  Di-n-propyl ether  1.598  3.414  3.414  32.226  31.733f  31.793
145  Butyl ethyl ether  1.598  3.414  3.414  31.734  31.733f  31.793
146  Butyl iso-propyl ether  2.120  3.769  3.914  36.027  36.384f  36.619
147  Di-n-butyl ether  2.598  4.414  4.414  40.987  40.989f  41.073
148  Ethyl iso-propyl ether  1.120  2.769  2.914  27.679  27.128f  27.339
149  Ethyl pentyl ether  2.098  3.914  3.914  36.364  36.361f  36.433
150  1-Chloropropane  0.012  1.914  1.914  20.847  20.872g  20.480
151  2-Chlorobutane  0.717  2.269  2.414  25.506  25.833g  25.066
152  1-Chloro-2-methylpropane  0.367  2.269  2.414  25.360  25.239g  25.066
153  1-Chlorobutane  0.512  2.414  2.414  25.441  25.239g  25.120
154  3-Chloropentane  1.255  2.807  2.914  30.161  30.200g  29.706
155  2-Bromopropane  -0.478  1.731  1.914  23.935  24.098g  23.435
156  1-Bromopropane  -0.793  1.914  1.914  23.679  24.098g  24.168
157  2-Bromobutane  0.060  2.269  2.414  28.651  28.465g  28.075
158  1-Bromo-2-methylpropane  -0.438  2.269  2.414  28.537  28.465g  28.753
159  1-Bromobutane  -0.293  2.414  2.414  28.347  28.465g  28.808
160  3-Bromopentane  0.598  2.807  2.914  33.068  32.832g  32.715
161  2-Iodobutane  -0.401  2.269  2.414  33.940  33.463g  32.722
162  3-Iodopentane  0.137  2.807  2.914  38.354  37.831g  37.362
163  2-Iodopentane  0.099  2.769  2.914  38.314  37.831g  37.362
164  1-Iodopentane  -0.358  2.914  2.914  38.264  38.958g  39.142
165  1-Iodohexane  0.142  3.414  3.414  42.891  43.325g  43.782
166  1-Iodoheptane  0.642  3.914  3.914  47.610  47.692g  48.422
a Ref. [13]; Obs. = Observed; Calc. = Calculated; b As per Eq. 9; c As per Eq. 11; d As per Eq. 13; e As per Eq. 15; f As per Eq. 17; g As per Eq. 21; h As per Eq. 26
noted in such cases. The robustness of the equations
under individual series was checked with PRESS
(predicted residual sum of squares) statistics obtained
by the "leave-one-out" (LOO) technique49-51 using
programs KRPRES1 and KRPRES247. Two LOO
parameters, Q2 (cross-validation R2 or predicted
variance) and SDEP (standard deviation of error of
predictions), were used to compare the equations. In
case of the composite set, "leave-50%-out" cross-validation was performed.
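The leave-one-out statistics can be illustrated with a small, self-contained sketch. It is not the authors' KRPRES1/KRPRES2 code: it fits a one-descriptor line to the five n-alkanes of Table II (compounds 33, 35, 40, 49 and 66) and assumes the usual definitions Q2 = 1 − PRESS/Σ(y − ȳ)² and SDEP = (PRESS/n)^1/2. The SDEP definition is consistent with the tabulated statistics (for example PRESS = 35.1 with n = 32 gives about 1.05); the Q2 formula is an assumption, and the function names are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (slope b, intercept a)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_statistics(x, y):
    """Leave-one-out PRESS, Q2 and SDEP for a one-descriptor model."""
    n = len(x)
    press = 0.0
    for k in range(n):
        # Refit without compound k, then predict the left-out value.
        slope, intercept = fit_line(x[:k] + x[k + 1:], y[:k] + y[k + 1:])
        press += (y[k] - (intercept + slope * x[k])) ** 2
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return {"PRESS": press, "Q2": 1.0 - press / ss_total, "SDEP": sqrt(press / n)}


# T and observed Rm of n-pentane to n-nonane, copied from Table II.
T_VALUES = [2.414, 2.914, 3.414, 3.914, 4.414]
RM_OBSERVED = [25.267, 29.981, 34.555, 39.194, 43.846]

slope, intercept = fit_line(T_VALUES, RM_OBSERVED)
stats = loo_statistics(T_VALUES, RM_OBSERVED)
print(f"Rm = {intercept:.3f} + {slope:.3f} T")
print({name: round(value, 4) for name, value in stats.items()})
```

A multi-descriptor model would need a general least-squares solver in place of `fit_line`; the leave-one-out loop itself would be unchanged.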
Results and Discussion
The calculated topological indices of 166
compounds are given in Table II. Tables III to IX
Table III ⎯ Relations of molar refraction (Rm) of alcohols with topochemical indices
Model equation, Rm = ∑ βi xi + α
Regression coefficients (standard errors)a
Eqn no.  α (s.e.)  β1 (s.e.)  β2 (s.e.)  β3 (s.e.)
7  12.336 (0.520)  9.035 T (0.217)
8  6.894 (1.287)  9.374 TR (0.186)  -5.220 F-OH (0.872)
9  8.237 (0.080)  4.633 NI (0.019)  13.749 NX (0.073)  9.226 NY (0.046)
Statistics
Eqn no.  Q2  PRESS  SDEP  Ra2  R  s  F (df)b  AVRES  n
7  0.980  35.1  1.048  0.982  0.991  1.012  1729.0 (1, 30)  0.798  32
8  0.988  22.0  0.829  0.989  0.995  0.793  1418.2 (2, 29)  0.625  32
9  0.999  1.0  0.177  1.000  1.000  0.154  25300.0 (3, 28)  0.108  32
a t values of the regression coefficients are significant at 95% level [df = n − np − i, np = no. of predictor variables; i = 1 if intercept is present; i = 0, otherwise]
b F values are significant at 99% level [df = np, n − np − i]

Table IV ⎯ Relations of molar refraction (Rm) of alkanes with topochemical indices
Model equation, Rm = ∑ βi xi + α
Regression coefficients (standard errors)a
Eqn no.  α (s.e.)  β1 (s.e.)  β2 (s.e.)  β3 (s.e.)
10  (no intercept)  10.822 T (0.074)
11  11.571 (0.117)  4.598 NI (0.032)  13.688 NX (0.070)  9.115 NY (0.056)
Statistics
Eqn no.  Q2  PRESS  SDEP  Ra2  R  s  F (df)b  AVRES  n
10  0.901  152.4  1.820  0.906  0.952  1.795  21400.0 (1, 45)  1.485c  46
11  0.999  2.0  0.209  0.999  0.999  0.199  12900.0 (3, 42)  0.145  46
c Average of signed values of residuals = 0.03

Table V ⎯ Relations of molar refraction (Rm) of alkenes with topochemical indices
Model equation, Rm = ∑ βi xi + α
Regression coefficients (standard errors)a
Eqn no.  α (s.e.)  β1 (s.e.)  β2 (s.e.)  β3 (s.e.)
12  12.038 (1.373)  9.050 T (0.561)
13  11.256 (0.135)  4.559 NI (0.043)  13.741 NX (0.072)  9.152 NY (0.103)
Statistics
Eqn no.  Q2  PRESS  SDEP  Ra2  R  s  F (df)b  AVRES  n
12  0.861  81.7  1.447  0.872  0.936  1.405  259.8 (1, 37)  1.111  39
13  0.998  1.4  0.187  0.998  0.999  0.175  6389.1 (3, 35)  0.128  39

Table VI ⎯ Relations of molar refraction (Rm) of amines with topochemical indices
Model equation, Rm = ∑ βi xi + α
Regression coefficients (standard errors)a
Eqn no.  α (s.e.)  β1 (s.e.)  β2 (s.e.)  β3 (s.e.)
14  3.460 (1.482)  9.057 TR (0.448)
15  10.999 (0.922)  4.551 NI (0.222)  10.501 NX (1.305)  8.777 NY (0.592)
Statistics
Eqn no.  Q2  PRESS  SDEP  Ra2  R  s  F (df)b  AVRES  n
14  0.945  68.9  1.770  0.951  0.976  1.712  408.3 (1, 20)  1.230  22
15  0.913  108.7  2.223  0.959  0.982  1.566  164.6 (3, 18)  0.818  22
show relations of molar refractivity (Rm) with
different topochemical indices. All regression
coefficients and variance ratios of the reported
equations are significant at 95% and 99% levels
respectively unless otherwise stated. Table II shows
the literature Rm values of the compounds13 and also
the calculated values according to the best equations
of individual series and the composite set (vide
footnote of Table II). Functionality contributions of
different groups like hydroxy, unsaturation (ene),
amino, oxy (ether), chloro, bromo and iodo are
represented by F-OH, F=, F-NH2, F-O-, F-Cl, F-Br and F-I
respectively.

Table VII ⎯ Relations of molar refraction (Rm) of ethers with topochemical indices
Model equation, Rm = ∑ βi xi + α
Regression coefficients (standard errors)a
Eqn no.  α (s.e.)  β1 (s.e.)  β2 (s.e.)
16  19.413 (0.777)  7.955 T (0.485)
17  8.593 (0.452)  4.628 NI (0.091)  9.278 NY (0.293)
Statistics
Eqn no.  Q2  PRESS  SDEP  Ra2  R  s  F (df)b  AVRES  n
16  0.951  16.6  1.287  0.968  0.985  1.099  269.5 (1, 8)  0.801  10
17  0.994  2.0  0.451  0.997  0.999  0.359  1295.9 (2, 7)  0.213  10

QSPR for alcohols (n = 32)
Table III shows relations of molar refractivity of
alcohols with different topochemical indices. First
order composite topochemical index (T) predicted
98.0% and explained 98.2% of the variance of molar
refractivity. When the composite index was
partitioned into skeletal index (TR) and functionality
(F-OH), predicted variance and explained variance rose
Table VIII ⎯ Relations of molar refraction (Rm) of halocarbons with topochemical indices
Model equation, Rm = ∑ βi xi + α
Regression coefficients (standard errors)a
Eqn no.  α (s.e.)  β1 (s.e.)  β2 (s.e.)  β3 (s.e.)
18  (no intercept)  12.352 TR (0.281)
19  (no intercept)  5.220 F (0.559)  5.139 NI (0.404)  12.236 NY (0.911)
20  14.488 (0.990)  8.420 TR (0.371)  -4.910 F-Cl (0.253)  -2.276 F-Br (0.184)
21  6.628 (0.703)  4.367 NV (0.126)  -1.696 F-Cl (0.130)  1.872 F-I (0.099)
Statistics
Eqn no.  Q2  PRESS  SDEP  Ra2  R  s  F (df)b  AVRES  n
18  0.822  162.0  3.083  0.841  0.917  3.003  1936.4 (1, 16)  2.418c  17
19  0.953  42.3  1.577  0.961  0.983  1.482  2669.9 (3, 14)  0.994d  17
20  0.991  7.9  0.682  0.993  0.997  0.608  813.6 (3, 13)  0.418  17
21  0.997  3.1  0.427  0.997  0.999  0.381  2080.2 (3, 13)  0.271  17
c Average of signed values of residuals = -0.015
d Average of signed values of residuals = -0.044
Table IX ⎯ Relations of molar refraction (Rm) of the composite set with topochemical indices
Model equation, Rm = ∑ βi xi + α
Regression coefficients (standard errors)a
Eqn no.  α (s.e.); β1 (s.e.); β2 (s.e.); ...
22  10.661 (0.613); 0.722 F (0.246); 4.417 NI (0.148); 9.172 NY (0.328); 13.665 NX (0.486)
23  3.791 (0.669); 9.682 TR (0.187); -3.422 F-OH (0.292); -0.753c F= (0.419); -1.050 F-NH2 (0.196); -2.553 F-O- (0.281); 0.898 F-Br (0.276); 2.276 F-I (0.218)
24  2.289 (0.346); -2.556 F-OH (0.146); -0.486 F-NH2 (0.100); -1.448 F-O- (0.153); 1.359 F-Br (0.142); 2.864 F-I (0.115); 4.632 NV (0.052); -0.136 NP (0.072)
25  2.044 (0.339); -2.547 F-OH (0.145); -0.476 F-NH2 (0.100); -1.431 F-O- (0.151); 1.362 F-Br (0.142); 2.871 F-I (0.114); -0.803c B (0.443); 4.623 NV (0.050)
26  11.200 (0.254); -2.547 F-OH (0.145); -0.493 F-NH2 (0.100); -1.436 F-O- (0.152); 1.362 F-Br (0.141); 2.867 F-I (0.114); 4.640 NI (0.052); 9.226 NY (0.110); 13.548 NX (0.156)
Statistics
Eqn no.  Q2d  PRESS  SDEPd  Ra2  R  s  F (df)b  AVRES  n
22  0.877  937.9  2.377  0.883  0.941  2.322  312.2 (4, 161)  1.613  166
23  0.947  404.0  1.560  0.951  0.976  1.504  457.5 (7, 158)  1.195  166
24  0.986  107.0  0.803  0.986  0.994  0.789  1722.1 (7, 158)  0.379  166
25  0.986  108.4  0.808  0.986  0.993  0.790  1719.0 (7, 158)  0.380  166
26  0.985  113.8  0.828  0.987  0.994  0.785  1521.7 (8, 157)  0.377  166
c Regression coefficient significant at 90% level
d Leave-50%-out cross-validation; compounds were deleted in two cycles as follows: 1, 3, 5, ..., 163, 165; 2, 4, 6, ..., 164, 166
to 98.8% and 98.9% respectively. The positive
coefficient of TR and the negative coefficient of F-OH
indicate positive contribution of lipophilicity and
negative contribution of functionality respectively. On
correlating Rm values with integer indices, a relation
with 99.9% predicted variance and 100% explained
variance was obtained. Standard deviation of error of
prediction value for this equation is 0.177. Variance
ratio (which is an indicator of stability of β coefficients) of this equation is 14 times the
corresponding variance ratio of the equation with first
order composite topochemical index (T). However,
compounds 23 (2-methyl-2-heptanol) and 29 (2,6-
dimethyl-4-heptanol) act as outliers but are included
in Eqn (9). The calculated molar refractivity values
according to Eqn (9) are shown in Table II.
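The outlier statement can be checked for a few alcohols with Eqn (9) and the rule given under Materials and Methods (residual larger than twice the standard error of estimate). This is a sketch under an assumption: the integer indices NI, NX and NY are counted here on the reference alkane (hydroxyl oxygen replaced by carbon), which reproduces the Calc. values of Table II for these compounds. The variable and function names are hypothetical.

```python
# Coefficients and standard error of estimate of Eqn (9), from Table III.
EQ9 = {"intercept": 8.237, "NI": 4.633, "NX": 13.749, "NY": 9.226}
S_EQ9 = 0.154

# (compound no., name, NI, NX, NY of the reference alkane, observed Rm)
ALCOHOLS = [
    (2, "1-Propanol", 2, 0, 0, 17.529),
    (5, "2-Methyl-2-butanol", 1, 1, 0, 26.722),
    (23, "2-Methyl-2-heptanol", 4, 1, 0, 40.899),
    (29, "2,6-Dimethyl-4-heptanol", 2, 0, 3, 45.521),
    (31, "4-Ethyl-4-heptanol", 5, 1, 0, 44.920),
]


def predict_rm(ni, nx, ny):
    """Calculated Rm from Eqn (9)."""
    return EQ9["intercept"] + EQ9["NI"] * ni + EQ9["NX"] * nx + EQ9["NY"] * ny


outliers = []
for number, name, ni, nx, ny, observed in ALCOHOLS:
    residual = observed - predict_rm(ni, nx, ny)
    flagged = abs(residual) > 2 * S_EQ9  # outlier rule: |residual| > 2s
    if flagged:
        outliers.append(number)
    print(f"{number:>3} {name:<26} residual = {residual:+.3f} outlier = {flagged}")
```

Within this subset only compounds 23 and 29 exceed the 2s threshold of 0.308, in line with the text.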
QSPR for alkanes (n = 46)
Table IV shows the relations of molar refractivity
values of alkanes with different topochemical indices.
First order composite topochemical index (T) could
predict 90.1% of the variance (explained variance
90.6%), while the relation involving integer indices
could predict 99.9% of the variance (explained
variance 99.9%). Both relations show variance ratios
of the order of 10^4. In case of the composite
Table X ⎯ Validation of the TAU model Eqn (26).
Set 1
No.  Compd  Obs.a  Calc.b  Pred.c
1  2-Propanol  17.705  17.53  17.768
3  2-Methyl-1-propanol  22.103  22.22  21.787
5  2-Methyl-2-butanol  26.722  26.67  27.256
7  3-Methyl-1-butanol  26.904  26.82  26.414
9  1-Pentanol  26.801  26.64  26.451
11  2-Methyl-2-pentanol  31.211  31.27  31.881
13  4-Methyl-2-pentanol  31.351  31.51  31.604
15  4-Methyl-1-pentanol  31.489  31.42  31.038
17  2-Ethyl-1-butanol  31.180  31.42  31.038
19  2,4-Dimethyl-3-pentanol  35.675  36.29  36.191
21  2-Methyl-1-hexanol  35.931  36.02  35.663
23  2-Methyl-2-heptanol  40.899  40.47  41.130
25  4-Methyl-4-heptanol  40.439  40.47  41.132
27  2-Ethyl-1-hexanol  40.625  40.62  40.287
29  2,6-Dimethyl-4-heptanol  45.521  45.49  45.440
31  4-Ethyl-4-heptanol  44.920  45.07  45.757
33  n-Pentane  25.267  25.27  25.110
35  n-Hexane  29.981  29.87  29.734
37  2-Methylpentane  29.804  30.05  29.697
39  2,3-Dimethylbutane  29.813  30.23  29.659
41  2-Methylhexane  34.595  34.65  34.321
43  3-Ethylpentane  34.287  34.65  34.321
45  2,3-Dimethylpentane  34.328  34.83  34.284
47  3,3-Dimethylpentane  34.336  34.45  34.205
49  n-Octane  39.194  39.07  38.983
51  3-Methylheptane  39.102  39.25  38.946
53  3-Ethylhexane  38.946  39.25  38.946
55  2,3-Dimethylhexane  38.983  39.43  38.908
57  2,5-Dimethylhexane  39.261  39.43  38.908
59  3,4-Dimethylhexane  38.864  39.43  38.908
61  3-Methyl-3-ethylpentane  38.719  39.05  38.829
63  2,2,4-Trimethylpentane  39.264  39.23  38.792
65  2,3,4-Trimethylpentane  38.870  39.61  38.871
67  2,2,5-Trimethylhexane  43.939  43.83  43.416
69  3,3-Diethylpentane  43.117  43.65  43.454
71  2,2,3,4-Tetramethylpentane  43.439  44.01  43.379
73  2,3,3,4-Tetramethylpentane  43.205  44.01  43.379
Set 2
No.  Compd  Obs.a  Calc.b  Pred.d
2  1-Propanol  17.529  17.44  17.172
4  1-Butanol  22.067  22.04  21.863
6  2-Pentanol  26.680  26.73  27.063
8  2-Methyl-1-butanol  26.697  26.82  26.469
10  3-Pentanol  26.618  26.73  27.063
12  3-Methyl-3-pentanol  31.183  31.27  31.401
14  2-Methyl-3-pentanol  31.138  31.51  31.669
16  2-Methyl-1-pentanol  31.164  31.42  31.160
18  1-Hexanol  31.429  31.24  31.246
20  3-Ethyl-3-pentanol  35.822  35.87  36.093
22  1-Heptanol  36.094  35.84  35.937
24  3-Methyl-3-heptanol  40.447  40.47  40.784
26  6-Methyl-1-heptanol  40.737  40.62  40.543
28  n-Octanol  40.638  40.44  40.629
30  2-Methyl-2-octanol  45.207  45.07  45.473
32  2,2-Dimethyl-1-butanol  31.269  31.22  30.455
34  2-Methylbutane  25.294  25.45  25.010
36  3-Methylpentane  29.949  30.05  29.702
38  2,2-Dimethylbutane  29.938  29.85  28.996
40  n-Heptane  34.555  34.47  34.478
42  3-Methylhexane  34.464  34.65  34.393
44  2,2-Dimethylpentane  34.621  34.45  33.688
46  2,4-Dimethylpentane  34.623  34.83  34.308
48  2,2,3-Trimethylbutane  34.378  34.63  33.602
50  2-Methylheptane  39.234  39.25  39.085
52  4-Methylheptane  39.119  39.25  39.085
54  2,2-Dimethylhexane  39.255  39.05  38.379
56  2,4-Dimethylhexane  39.132  39.43  38.999
58  3,3-Dimethylhexane  39.011  39.05  38.379
60  2-Methyl-3-ethylpentane  38.838  39.43  38.999
62  2,2,3-Trimethylpentane  38.927  39.23  38.294
64  2,3,3-Trimethylpentane  38.764  39.23  38.294
66  n-Nonane  43.846  43.67  43.861
68  2,4,4-Trimethylhexane  43.663  43.83  42.985
70  2,2,3,3-Tetramethylpentane  43.218  43.63  42.279
72  2,2,4,4-Tetramethylpentane  43.878  43.63  42.279
74  2,4-Dimethyl-3-iso-propylpentane  47.913  48.98  48.212
2,2,4,5-Tetramethylhexane 75
2,2,3,4,4-Pentamethylpentane 77
2-Methyl-2-butene 79
3,3-Dimethyl-1-butene 81
2,3-Dimethyl-1-butene 83
2-Methyl-1-pentene 85
3-Methyl-1-pentene 87
1-Hexene 89
48.262
47.988
24.955
29.598
30.063
29.398
29.485
29.208
48.61
48.41
26.27
29.90
29.74
29.56
30.10
29.92
48.004
47.887
25.072
29.580
29.659
29.697
29.697
29.734
1-Propanol 2
1-Butanol 4
2-Pentanol 6
2-Methyl-1-butanol 8
3-Pentanol 10
3-Methyl-3-pentanol 12
2-Methyl-3-pentanol 14
2-Methyl-1-pentanol 16
1-Hexanol 18
3-Ethyl-3-pentanol 20
1-Heptanol 22
3-Methyl-3-heptanol 24
6-Methyl-1-heptanol 26
n-Octanol 28
2-Methyl-2-octanol 30
2,2-Dimethyl-1-butanol 32
2-Methylbutane 34
3-Methylpentane 36
2,2-Dimethylbutane 38
n-Heptane 40
3-Methylhexane 42
2,2-Dimethylpentane 44
2,4-Dimethylpentane 46
2,2,3-Trimethylbutane 48
2-Methylheptane 50
4-Methylheptane 52
2,2-Dimethylhexane 54
2,4-Dimethylhexane 56
3,3-Dimethylhexane 58
2-Methyl-3-ethylpentane 60
2,2,3-Trimethylpentane 62
2,3,3-Trimethylpentane 64
n-Nonane 66
2,4,4-Trimethylhexane 68
2,2,3,3-Tetramethylpentane 70
2,2,4,4-Tetramethylpentane 72
2,4-Dimethyl-3-iso-propylpentane
74
2,2,5,5-Tetramethylhexane 76
2,2,3,3-Tetramethylhexane 78
1-Pentene 80
2,3-Dimethyl-2-butene 82
4-Methyl-1-pentene 84
2-Methyl-2-pentene 86
2-Ethyl-1-butene 88
2,3,3-Trimethyl-1-butene 90
48.575
47.905
24.858
29.590
29.542
29.754
29.391
33.980
48.23
48.23
25.33
30.51
30.10
30.87
29.56
34.14
46.971
46.971
25.096
29.617
29.702
29.702
29.702
33.602
⎯ Contd
INDIAN J. CHEM., SEC B, AUGUST 2005
1704
Table X ⎯ Validation of the TAU model Eqn (26) ⎯ Contd
Set 1
Set 2
Compd
Obs.a
Calc.b
Pred.c
Compd
Obs.a
Calc.b
Pred.d
4,4-Dimethyl-1-pentene 91
2,3-Dimethyl-2-pentene 93
3,4-Dimethyl-2-pentene 95
2,3-Dimethyl-1-pentene 97
2-Methyl-2-hexene 99
2-Methyl-3-hexene 101
2-Ethyl-1-pentene 103
3-Ethyl-2-pentene 105
2,4,4-Trimethyl-2-pentene 107
2,3,4-Trimethyl-1-pentene 109
2-iso-Propyl-3-methyl-1-butene
111
4,4-Dimethyl-1-hexene 113
2,5-Dimethyl-3-hexene 115
3-Ethyl-4-methyl-1-pentene 117
1-Aminopropane 119
1-Aminobutane 121
34.233
34.203
33.900
34.005
34.395
34.223
33.991
34.117
39.015
38.836
38.385
34.50
35.11
35.65
34.34
35.47
36.02
34.16
35.47
40.05
39.12
39.12
34.205
34.284
34.284
34.284
34.321
34.321
34.321
34.321
38.792
38.871
38.871
3,3-Dimethyl-1-pentene 92
3,4-Dimethyl-1-pentene 94
3-Methyl-2-ethyl-1-butene 96
5-Methyl-1-hexene 98
2-Methyl-1-hexene 100
4-Methyl-1-hexene 102
3-Ethyl-1-pentene 104
1-Heptene 106
2,4,4-Trimethyl-1-pentene 108
3,3,4-Trimethyl-1-pentene 110
5,5-Dimethyl-1-hexene 112
34.011
33.919
34.024
34.139
34.114
34.078
34.039
34.136
38.769
38.499
38.785
34.50
34.88
34.34
34.70
34.16
34.70
34.70
34.52
38.74
39.28
39.10
33.688
34.308
34.308
34.393
34.393
34.393
34.393
34.478
38.294
38.294
38.379
38.643
38.823
38.591
19.401
24.079
39.10
40.79
39.48
19.30
23.90
38.829
38.908
38.908
19.958
24.583
38.548
38.764
19.595
24.257
28.471
39.10
39.48
19.94
23.93
28.48
38.379
38.999
18.374
23.743
28.202
28.672
33.515
25.098
33.816
33.452
38.281
38.003
33.852
47.277
22.493
31.560
31.734
40.987
36.364
25.506
25.441
23.935
28.651
28.347
33.940
38.314
42.891
28.68
33.70
33.17
33.94
33.70
38.54
37.70
34.12
46.90
22.40
31.69
31.60
40.80
36.20
25.25
25.16
23.63
28.23
28.14
33.64
38.24
42.75
r2pred =
0.978
29.170
33.614
33.737
33.157
33.614
37.781
38.457
33.119
47.706
22.640
32.079
31.890
41.139
36.514
25.072
25.110
23.328
27.953
28.639
32.486
37.111
43.444
r2 =
0.976
3,3-Dimethyl-1-hexene 114
3,5-Dimethyl-1-hexene 116
Trimethylamine 118
2-Amino-2-methylpropane 120
1-Amino-2,2-dimethylpropane
122
3-Aninopentane 124
1-Aminopentane 126
Di-iso-propylamine 128
Triethylamine 130
1-Aminohexane 132
2-Aminoheptane 134
Di-iso-butylamine 136
Tripropylamine 138
Methyl propyl ether 140
n-Butyl methyl ether 142
Di-n-propyl ether 144
Butyl iso-propyl ether 146
Ethyl iso-propyl ether 148
1-Chloropropane 150
1-Chloro-2-methyl propane 152
3-Chloropentane 154
1-Bromopropane 156
1-Bromo-2-methyl propane 158
3-Bromopentane 160
3-Iodopentane 162
1-Iodopentane 164
1-Iodoheptane 166
28.617
28.728
33.641
33.794
33.290
38.038
42.920
47.783
22.049
27.021
32.226
36.027
27.679
20.847
25.360
30.161
23.679
28.537
33.068
38.354
38.264
47.610
28.59
28.50
33.88
34.34
33.10
37.79
43.26
48.14
22.20
26.80
31.60
36.29
27.09
20.56
25.34
29.85
23.54
28.32
32.83
38.24
38.15
47.35
r2pred =
0.997
29.054
28.993
33.392
33.018
33.685
38.437
42.568
47.092
21.848
26.385
31.660
36.525
27.142
20.404
25.010
29.702
24.338
28.944
32.912
37.655
39.534
48.917
r2 =
0.994
1-Amino-3-methylbutane 123
Dipropylamine 125
3-Amino-2,2-dimethylbutane 127
Butyldimethylamine 129
Butylethylamine 131
Dimethylpentylamine 133
1-Aminoheptane 135
Dimethyl-iso-butylamine 137
1-Aminononane 139
Diethyl ether 141
sec-Butyl ethyl ether 143
Butyl ethyl ether 145
Di-n-butyl ether 147
Ethyl pentyl ether 149
2-Chlorobutane 151
1-Chlorobutane 153
2-Bromopropane 155
2-Bromobutane 157
1-Bromobutane 159
2-Iodobutane 161
2-Iodopentane 163
1-Iodohexane 165
Statisticse
a
From Ref. 13
Calculated according to Crippen's fragmentation method
c
Calculated based on equation developed from set 2 compounds
d
Calculated based on equation developed from set 1 compounds
e
See text for details
b
ROY et al.: QSPR AND TAU INDICES OF ACYCLIC COMPOUNDS
topochemical (T) index, insignificant intercept was
dropped (set to zero) in the reported Eqn (10). In
addition to average of absolute values of residuals,
average of signed values of residuals also is
mentioned for this equation. Eqn (11) suggests that
molar refractivity of alkanes depends on branching as
evidenced from the coefficients of NX and NY and
impact of quaternary type carbon is more in
comparison to that of tertiary type. Further, molar
refractivity also increases with increase in bulk (as
evidenced from coefficient of NI). 3,3-Diethylpentane
(69) and 2,2,5,5-tetramethylhexane (76) act as outliers
but are included in Eqn (11). The calculated molar
refractivity values according to Eqn (11) are shown in
Table II.
QSPR for alkenes (n = 39)
Table V shows that the first order composite
topochemical index (T) predicted 81.1% of the
variance (explained variance 87.2%). However, with
integer indices, a statistically much superior relation
was obtained. Eqn (13) has predicted variance (Q²) of
99.8%, explained variance (Ra²) of 99.8%, variance
ratio (F) of 6389 (df 3, 35) and standard deviation of
error of prediction of 0.187. This equation shows
specific contributions of branching and bulk to molar
refractivity. 2,3-Dimethyl-1-butene (83) acts as an
outlier but is included in Eqn (13). The calculated
refractivity values according to Eqn (13) are given in
Table II.
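The kinds of statistics quoted for Eqn (13) can be computed for any fitted relation along the following lines. This is only a minimal sketch and not the authors' GW-BASIC programs (ref. 47): it assumes ordinary least squares with an intercept, leave-one-out deletion for PRESS, Q² = 1 − PRESS/SS_total and a standard deviation of error of prediction of sqrt(PRESS/n). The function name regression_stats and the synthetic demonstration data are illustrative and do not come from the paper.

```python
import numpy as np


def regression_stats(X, y):
    """Fit y ~ intercept + X by least squares and return summary statistics.

    Assumed definitions (not taken from the paper's own programs):
    Q2 = 1 - PRESS / SS_total and SDEP = sqrt(PRESS / n), with PRESS
    obtained by leave-one-out deletion.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef

    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    df_resid = n - p - 1
    r2 = 1.0 - ss_res / ss_tot
    ra2 = 1.0 - (1.0 - r2) * (n - 1) / df_resid    # adjusted (explained) variance
    f_ratio = (r2 / p) / ((1.0 - r2) / df_resid)   # variance ratio

    # Leave-one-out residuals via the hat matrix: e_i / (1 - h_ii)
    hat_diag = np.sum(A * np.linalg.pinv(A).T, axis=1)
    loo_resid = resid / (1.0 - hat_diag)
    press = float(loo_resid @ loo_resid)

    return {
        "coef": coef,
        "r2": r2,
        "ra2": ra2,
        "f": f_ratio,
        "df": (p, df_resid),
        "q2": 1.0 - press / ss_tot,
        "sdep": float(np.sqrt(press / n)),
    }


# Synthetic stand-in for 39 compounds described by three integer indices.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 6, size=(39, 3)).astype(float)
y_demo = 5.0 + X_demo @ np.array([4.6, 4.3, 4.0]) + rng.normal(0.0, 0.2, 39)
stats = regression_stats(X_demo, y_demo)
print(stats["df"], round(stats["q2"], 3), round(stats["ra2"], 3))
```

With 39 compounds and three descriptors the degrees of freedom come out as (3, 35), the same pair reported for Eqn (13).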
QSPR for amines (n = 22)
In exploring QSPR of amines (Table VI), the first
order VEM connectivity index (T) did not give an
acceptable relation, while the first order skeletal
topochemical index (TR) generated an equation with
predicted variance of 94.5% (95.1% explained
variance). Integer indices gave a marginally inferior
relation in terms of predicted variance (91.3%, with
explained variance of 95.9%). However, this equation showed
specific impact of branching and bulk on the molar
refractivity values. The calculated values according to
Eqn (15) are shown in Table II. 3-Amino-2,2-dimethylbutane
(127) is an outlier but is included in
Eqn (15).
QSPR for ethers (n = 10)
Table VII shows that the VEM topochemical
index (T) could predict 95.1% of the variance
(explained variance 96.8%). The relation involving
integer indices showed specific impact of branching
and bulk and predicted 99.4% of the variance (99.7%
explained variance). The variance ratio of Eqn (17) is
1295 (df 2, 7) which indicates stability of the
β-coefficients. Standard deviation of error of
prediction value for this equation is 0.451. The
calculated refractivity values according to Eqn (17)
are given in Table II. All compounds fitted well in
Eqn (17).
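The reported variance ratios and explained variances can be cross-checked against each other. The sketch below assumes that F is the usual regression variance ratio for a model with an intercept, so that its degrees of freedom are (p, n − p − 1), and that the explained variance is the adjusted Ra²; the helper names are illustrative only.

```python
def r2_from_f(f_ratio, n_predictors, n_compounds):
    """Recover R^2 from F, assuming F = (R^2/p) / ((1 - R^2)/(n - p - 1))."""
    df_resid = n_compounds - n_predictors - 1
    return f_ratio * n_predictors / (f_ratio * n_predictors + df_resid)


def adjusted_r2(r2, n_predictors, n_compounds):
    """Adjusted R^2 (explained variance) for n compounds and p predictors."""
    df_resid = n_compounds - n_predictors - 1
    return 1.0 - (1.0 - r2) * (n_compounds - 1) / df_resid


# Ethers, Eqn (17): F = 1295 with df (2, 7), i.e. p = 2 and n = 10
ethers_ra2 = adjusted_r2(r2_from_f(1295, 2, 10), 2, 10)
# Alkenes, Eqn (13): F = 6389 with df (3, 35), i.e. p = 3 and n = 39
alkenes_ra2 = adjusted_r2(r2_from_f(6389, 3, 39), 3, 39)
print(round(ethers_ra2, 3), round(alkenes_ra2, 3))  # 0.997 0.998
```

Under these assumptions the recovered values agree with the explained variances quoted in the text (99.7% for ethers and 99.8% for alkenes).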
QSPR for halocarbons (n = 17)
In case of halocarbons (Table VIII), first order
VEM skeletal index showed 82.2% predicted variance
and 84.1% explained variance. The relation with first
order VEM index (T) could not generate any
acceptable relation. However, when the composite
index was split into different components (F, NI and
NY), a statistically superior relation (predicted
variance 95.3% and explained variance 96.1%) was
obtained. This relation (Eqn 19) showed specific
contributions of functionality, bulk and branching as
evidenced from coefficients of F, NI and NY.
Insignificant intercepts were dropped in cases of Eqns
(18) and (19). In addition to averages of absolute
values of residuals, averages of signed values of
residuals also are mentioned for these equations. On
using functionality of individual halogen atoms (F-Cl,
F-Br and F-I), statistically excellent relations were
generated. Eqns (20) and (21) showed negative
impact of chlorine and bromine, and positive impact
of iodine on molar refraction (considering only
halocarbon compounds). The calculated refractivity
values according to Eqn (21) are listed in Table II.
All halocarbons fitted well in Eqn (21).
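When the intercept is dropped, as in Eqns (18) and (19), least squares no longer forces the residuals to sum to zero, which is presumably why the average of signed residuals is quoted alongside the average of absolute residuals. The following sketch illustrates the effect on made-up data; the helper residual_summary and the numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def residual_summary(X, y, intercept=True):
    """Least-squares fit with or without an intercept; summarize residuals."""
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X]) if intercept else X
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return {
        "mean_abs": float(np.mean(np.abs(resid))),
        "mean_signed": float(np.mean(resid)),
    }


# Illustrative data with a small true intercept (0.3).
x = np.arange(1.0, 11.0)
y = 0.3 + 2.0 * x

with_intercept = residual_summary(x, y, intercept=True)
through_origin = residual_summary(x, y, intercept=False)
print(with_intercept["mean_signed"], through_origin["mean_signed"])
```

With the intercept retained the signed average vanishes to rounding error, whereas the fit through the origin leaves a small systematic bias that the absolute average alone would not reveal.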
QSPR for all compounds (n = 166)
In case of the composite set (Table IX), first order
VEM topochemical index (T) could not generate any
acceptable relation, while the relation with first order
VEM skeletal index showed 82.9% predicted variance
and 83.3% explained variance (equation not shown).
However, when the first order composite index was
split into different components (F, NI, NX and NY),
a statistically more acceptable relation was obtained.
Moreover, this relation (Eqn 22) showed specific
contributions of functionality, bulk and branching
(tertiary and quaternary carbons) and all these are
obtained by suitable partitioning of first order
composite TAU index into different terms (without
considering higher order terms). On using
functionality of individual groups (alcohol, amine,
ether, bromo and iodo), statistically excellent relations
were generated. These relations (Eqns 23, 24, 25 and
26) show negative impact of hydroxy, amino and oxy
groups and positive impact of bromo and iodo
functionalities. On using vertex count NV, negative
coefficients of NP and B are observed (Eqns 24 and 25
respectively) though the latter term is significant at
90% level. In equations involving NI term (Eqns 22
and 26), shape parameters show positive coefficients
as these are components of NV which has positive
contribution to molar refractivity. It is to be noted
here that the effect of branching on the property
irrespective of molecular bulk can be found out only
when branching or shape parameters are
simultaneously used along with NV or bulk parameter
in an equation. Thus, Eqns (24) and (25) show
negative impact of branching on molar refractivity for
compounds with same molecular bulk. Further,
contributions of unsaturation and chlorine atoms were
not found statistically significant (and thus are absent
in the final relations). The calculated refractivity
values according to Eqn (26) are listed in Table II.
All compounds fitted well in Eqn (26).
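The odd/even split used for the leave-50%-out cross-validation and for the validation of Eqn (26) in Table X (set 1: compounds 1, 3, 5, ...; set 2: compounds 2, 4, 6, ...) can be sketched as follows. This is a hedged illustration rather than the authors' procedure: it assumes ordinary least squares with an intercept and the commonly used definition r²pred = 1 − Σ(Y_test − Ŷ_test)² / Σ(Y_test − mean(Y_training))², which the present section does not spell out. All function names are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients; the first element is the intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def r2_pred(y_test, y_hat, y_train_mean):
    """Assumed definition: 1 - PRESS(test) / SS(test about training mean)."""
    press = np.sum((y_test - y_hat) ** 2)
    ss = np.sum((y_test - y_train_mean) ** 2)
    return 1.0 - press / ss


def split_half_validation(X, y):
    """Train on one half (odd or even compound numbers), predict the other."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.arange(len(y))            # 0-based; compound number = idx + 1
    set1 = idx[idx % 2 == 0]           # compounds 1, 3, 5, ...
    set2 = idx[idx % 2 == 1]           # compounds 2, 4, 6, ...
    results = {}
    for label, train, test in (
        ("set2_from_set1", set1, set2),
        ("set1_from_set2", set2, set1),
    ):
        coef = fit_ols(X[train], y[train])
        results[label] = r2_pred(
            y[test], predict(coef, X[test]), y[train].mean()
        )
    return results
```

Here X would hold the descriptor columns of the chosen equation (one row per compound, in compound-number order) and y the observed molar refractivity values; both halves are used in turn as training and test sets.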
Validation of the TAU model Eqn (26)
In order to validate the TAU model Eqn (26), the
data set was divided into two subsets 1 and 2 (vide
Table X) and set 1 was used as the training set for
developing a model using the descriptors present in
Eqn (26); molar refractivity values of set 2
(test set) were then calculated using the developed model.
The same was repeated with sets 2 and 1 as
training and test sets respectively. In each case, r²pred
values were calculated and compared with the r²
values derived from observed and calculated values
(Crippen's fragmentation method⁴⁶). It was found that
in both cases, the r² and r²pred values were comparable.
Conclusions
This study shows that though the composite
topochemical index T does not always provide an
acceptable model for molar refractivity of
heterofunctional acyclic compounds, the TAU scheme
can generate statistically acceptable relations when
the first order composite index is partitioned into
different components like skeletal index, size and
shape terms, branching and functionality. Moreover,
TAU indices can unravel specific contributions of
molecular bulk (size), functionality, branching and
shape parameters to the molar refractivity of diverse
functional compounds. In general, molar refractivity
increases with increase in molecular bulk. Moreover,
it also depends on the type and number of
ramifications (as evidenced from contributions of
shape terms). The impact of quaternary carbons (more
branched) on refractivity values is greater than
that of tertiary carbons (less branched).
Further, branching shows negative impact on molar
refractivity for compounds with same molecular bulk.
Among the functionalities, hydroxy, amine and oxy
groups show negative impact and bromo and iodo
substitutions show positive impact on molar refraction
in comparison to that of the corresponding reference
alkane (non-functional compound). The impact of
unsaturation and chlorine atom on molar refractivity
in comparison to non-functional compound is not
significant. The usefulness of the TAU scheme lies in its
ability to model properties of heterofunctional
compounds and explore functionality, size and
branching contributions. From this study, it appears
that TAU indices may be used as a tool, in addition to
other indices, for exploring QSPR.
Acknowledgement
The authors are grateful to Sri Dipak Kumar Pal for
guidance and inspiration. A financial grant from J. U.
Research Fund is also thankfully acknowledged.
References
1 Trinajstic N, Chemical Graph Theory, (CRC Press, Boca
Raton, FL), 1983.
2 Hansen P J & Jurs P C, J Chem Educ, 65, 1988, 574.
3 Motoc I & Balaban A T, Rev Roum Chim, 26, 1981, 593.
4 Mercader A, Castro E A & Toropov A A, Int J Mol Sci, 2,
2001, 121, http://www.mdpi.org/ijms
5 Devillers J & Balaban A T (Eds), Topological Indices and
Related Descriptors in QSAR and QSPR, (Gordon and Breach
Science Publishers, Netherlands), 1999.
6 Randic M, J Mol Graphics Mod, 20, 2001, 19.
7 Randic M, Balaban A T & Basak S C, J Chem Inf Comput Sci,
41, 2001, 593.
8 Randic M & Zupan J, J Chem Inf Comput Sci, 41, 2001, 550.
9 Hosoya H, Internet Electron J Mol Des, 1, 2002, 428,
http://www.biochempress.com
10 Marino D J G, Peruzzo P J, Castro E A & Toropov A A,
Internet Electron J Mol Des, 1, 2002, 115,
http://www.biochempress.com
11 Gozalbes R, Doucet J P & Derouin F, Curr Drug Targets
Infect Disord, 2, 2002, 93.
12 Estrada E, J Phys Chem A, 106, 2002, 9085.
13 Kier L B & Hall L H, Molecular Connectivity in Chemistry
and Drug Research, (Academic Press, New York), 1976.
14 Wiener H, J Am Chem Soc, 69, 1947, 17.
15 Kier L B & Hall L H, Molecular Connectivity in Structure-Activity
Analysis, (Research Studies Press, Letchworth,
England), 1986.
16 Kier L B, Prog Clin Biol Res, 291, 1989, 105.
17 Kier L B & Hall L H, Pharm Res, 7, 1990, 801.
18 Hall L H, Mohney B K & Kier L B, Quant Struct-Act Relat,
12, 1993, 44.
19 Balaban A T, Catana C, Dawson M & Niculescu-Duvaz I, Rev
Roum Chim, 35, 1990, 997.
20 Basak S C & Gute B D, SAR QSAR Environ Res, 7, 1997, 1.
21 Duchowicz P, Sinani R G, Castro E A & Toropov A A, Indian
J Chem, 42A, 2003, 1354.
22 Pyka A & Bober K, Indian J Chem, 42A, 2003, 1360.
23 Pyka A, Kepczynska E & Bojarski J, Indian J Chem, 42A,
2003, 1405.
24 Kauffman G W & Jurs P C, J Chem Inf Comput Sci, 41, 2001,
1553.
25 Garcia-Domenech R, de Julian-Ortiz J V, Duart M J, Garcia-Torrecillas
J M, Anton-Fos G M, Rios-Santamarina I, de
Gregorio-Alapont C & Galvez J, SAR QSAR Environ Res, 12,
2001, 237.
26 Bakken G A & Jurs P C, J Med Chem, 43, 2000, 4534.
27 Estrada E, Patlewicz G & Uriarte E, Indian J Chem, 42A,
2003, 1315.
28 Torrens F, J Comput-Aided Mol Des, 15, 2001, 709.
29 Patel H & Cronin M T, J Chem Inf Comput Sci, 41, 2001,
1228.
30 Khadikar P V, Karmakar S, Singh S & Shrivastava A, Bioorg
Med Chem, 10, 2002, 3163.
31 Ren B, J Chem Inf Comput Sci, 42, 2002, 858.
32 Estrada E & Molina E, J Mol Graphics Mod, 50, 2001, 54.
33 Glasstone S, Textbook of Physical Chemistry, (MacMillan &
Co. Ltd, London), 1948, pp. 528-532.
34 Saxena A K, Quant Struct-Act Relat, 14, 1995, 142.
35 Pal D K, Sengupta C & De A U, Indian J Chem, 27B, 1988,
734.
36 Pal D K, Sengupta C & De A U, Indian J Chem, 28B, 1989,
261.
37 Pal D K, Sengupta M, Sengupta C & De A U, Indian J Chem,
29B, 1990, 451.
38 Pal D K, Purkayastha S K, Sengupta C & De A U, Indian J
Chem, 31B, 1992, 109.
39 Roy K, Pal D K, De A U & Sengupta C, Indian J Chem, 38B,
1999, 664.
40 Roy K, Pal D K, De A U & Sengupta C, Indian J Chem, 40B,
2001, 129.
41 Roy K & Saha A, J Mol Model, 9, 2003, 259.
42 Roy K & Saha A, Internet Electron J Mol Des, 2, 2003, 288,
http://www.biochempress.com
43 Roy K & Saha A, Internet Electron J Mol Des, 2, 2003, 475,
http://www.biochempress.com
44 Roy K, Chakraborty S, Ghosh C C & Saha A, J Indian Chem
Soc, 81, 2004, 115.
45 Roy K & Saha A, Indian J Chem, 43A, 2004, 1369.
46 Ghose A K & Crippen G M, J Chem Inf Comput Sci, 27,
1987, 21.
47 The GW-BASIC programs RRR98 (multiple regression),
KRPRES1 and KRPRES2 (PRESS statistics) were developed
by Kunal Roy (1998) and standardized using known data sets
48 Snedecor G W & Cochran W G, Statistical Methods; (Oxford &
IBH Publishing Co. Pvt Ltd, New Delhi), 1967, pp 381 – 418.
49 Kier L B & Hall L H, In Advances in Drug Research, edited by
B Testa, Vol 22, (Academic Press, New York), 1992, pp 1-38.
50 Wold S & Eriksson L, In Chemometric Methods in Molecular
Design, edited by H van de Waterbeemd, (VCH, Weinheim),
1995, p 312.
51 Debnath A K, In Combinatorial Library Design and
Evaluation, edited by A K Ghose & V N Viswanadhan,
(Marcel Dekker, Inc, New York), 2001, pp 73-129.