J. Chem. Phys. (Commun.) 119, 11501

JOURNAL OF CHEMICAL PHYSICS
VOLUME 119, NUMBER 22
8 DECEMBER 2003
COMMUNICATIONS
Combined first-principles calculation and neural-network correction
approach for heat of formation
LiHong Hu, XiuJun Wang, LaiHo Wong, and GuanHua Chen
Department of Chemistry, The University of Hong Kong, Hong Kong, China
共Received 17 March 2003; accepted 14 October 2003兲
Despite their success, the results of first-principles quantum mechanical calculations contain
inherent numerical errors caused by various intrinsic approximations. We propose here a
neural-network-based algorithm to greatly reduce these inherent errors. As a demonstration, this
combined quantum mechanical calculation and neural-network correction approach is applied to the
evaluation of standard heat of formation ⌬ fH 両 for 180 small- to medium-sized organic molecules
at 298 K. A dramatic reduction of numerical errors is clearly shown with systematic deviation being
eliminated. For example, the root-mean-square deviation of the calculated ⌬ fH 両 for the 180
molecules is reduced from 21.4 to 3.1 kcal mol⫺1 for B3LYP/6-311⫹G(d, p) and from 12.0 to
3.3 kcal mol⫺1 for B3LYP/6-311⫹G(3d f ,2p) before and after the neural-network correction.
© 2003 American Institute of Physics. 关DOI: 10.1063/1.1630951兴
One of the Holy Grails of computational science is to
quantitatively predict properties of matter prior to experiments. Despite the fact that the first-principles quantum mechanical calculation1,2 has become an indispensable research
tool and experimentalists have been increasingly relying on
computational results to interpret their experimental findings,
the practically used numerical methods by far are often not
accurate enough in particular for complex systems. This
limitation is caused by the inherent approximations adopted
in the first-principles methods. Because of computational
cost, electron correlation has always been a difficult obstacle
for first-principles calculations. Finite basis sets chosen in
practical computations are not able to cover entire physical
space and this inadequacy introduces also inherent computational errors. Effective core potential is frequently used to
approximate the relativistic effects, resulting inevitably in
errors for systems that contain heavy atoms. The accuracy of
a density functional theory 共DFT兲 calculation is mainly determined by the exchange-correlation 共XC兲 functional being
employed,1 whose exact form is however unknown.
Nevertheless, the results of first-principles quantum mechanical calculation can capture the essence of physics. For
instance, the calculated results, despite that their absolute
values may agree poorly with measurements, are usually of
the same tendency among different molecules as their experimental counterparts. The quantitative discrepancy between
the calculated and experimental results depends predominantly on the property of primary interest and, to a less extent, also on other related properties of the material. There
exists thus a sort of quantitative relation between the calculated and experimental results, as the aforementioned approximations to a large extent contribute to the systematic
errors of specified first-principles methods. Can we develop
general ways to eliminate the systematic computational errors and further to quantify the accuracies of numerical
0021-9606/2003/119(22)/11501/7/$20.00
methods used? It has been proven an extremely difficult task
to determine the calculation errors from the first-principles.
Alternatives must be sought.
We propose here a neural-network-based algorithm to
determine the quantitative relationship between the experimental data and the first-principles calculation results. The
determined relation will subsequently be used to eliminate
the systematic deviations of the calculated results and thus
reduce the numerical uncertainties. Since its beginning in the
late fifties, neural networks has been applied to various engineering problems, such as robotics, pattern recognition,
speech, etc.3,4 As the first application of neural networks to
quantum mechanical calculations of molecules, we choose
the standard heat of formation ⌬ fH 両 at 298.15 K as the
property of interest.
A total of 180 small- or medium-sized organic molecules, whose ⌬ fH 両 values are well documented in Refs.
5–7, are selected to test our proposed approach. The three
tabulated values of ⌬ fH 両 in the three references differ less
than 1.0 kcal mol⫺1 for any one of the 180 molecules. The
uncertainties of all ⌬ fH 両 values are less than 1.0 kcal mol⫺1
in Refs. 5–7. These selected molecules contain elements
such as H, C, N, O, F, Si, S, Cl, and Br. The heaviest molecule contains 14 heavy atoms, and the largest has 32 atoms.
We divide these molecules randomly into the training set
共150 molecules兲 and the testing set 共30 molecules兲. The geometries of 180 molecules are optimized via B3LYP/6-311
⫹G(d, p) 8 calculations, and the zero point energies 共ZPEs兲
are calculated at the same level. The enthalpy of each molecule is calculated at both B3LYP/6-311⫹G(d, p) and
B3LYP/6-311⫹G(3d f ,2p). 8
B3LYP/6-311⫹G(3d f ,2p)
employs a larger basis set than B3LYP/6-311⫹G(d, p). The
unscaled B3LYP/6-311⫹G(d, p) ZPE is employed in the
⌬ fH 両 calculations. The strategies in Ref. 9 are adopted to
calculate ⌬ fH 両 . The calculated ⌬ fH 両 s for B3LYP/6-311
11501
© 2003 American Institute of Physics
Downloaded Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp
11502
Hu et al.
J. Chem. Phys., Vol. 119, No. 22, 8 December 2003
FIG. 1. Experimental ⌬ fH 両 vs the
calculated ⌬ fH 両 for all 180 compounds. 共a兲 and 共b兲 are the comparisons of the experimental ⌬ fH 両 s to
their raw B3LYP/6-311⫹G(d,p) and
B3LYP/6-311⫹G(3d f ,2p)
results,
respectively. 共c兲 and 共d兲 are the comparisons of the experimental ⌬ fH 両 s to
the neural-network corrected B3LYP/
6-311⫹G(d,p) and B3LYP/6-311
⫹G(3d f ,2p) ⌬ fH 両 s, respectively. In
共c兲 and 共d兲, the triangles are for the
training set and the crosses for the
testing set. The correlation coefficients
of the linear fits are 0.998 and 0.994 in
共c兲 and 共d兲, respectively. Insets are the
histograms for the differences between
the experimental and calculated
⌬ fH 両 s 共before and after the neuralnetwork corrections兲. All values are in
the units of kcal mol⫺1 .
⫹G(d, p) and B3LYP/6-311⫹G(3d f ,2p) are compared to
their experimental data in Figs. 1共a兲 and 1共b兲. The horizontal
coordinates are the raw calculated data and the vertical coordinates are the experimental values. The dashed lines are
where the vertical and horizontal coordinates are equal, i.e.,
where the B3LYP calculations and experiments would have
the perfect match. The raw calculation values are mostly
below the dashed line, i.e., most raw ⌬ fH 両 s are larger than
the experimental data. In other words, there are systematic
deviations for B3LYP ⌬ fH 両 . Compared to the experimental
measurements, the root-mean-square 共RMS兲 deviations for
⌬ fH 両 are 21.4 and 12.0 kcal mol⫺1 for B3LYP/6-311
⫹G(d, p) and B3LYP/6-311⫹G(3d f ,2p) calculations, respectively. In Table I we compare the B3LYP and experimental ⌬ fH 両 s for 180 molecules with first seven small molecules reported in Ref. 9. Overall, B3LYP/6-311
⫹G(3d f ,2p) calculations yield better agreements with the
experiments than B3LYP/6-311⫹G(d, p). In particular, for
small molecules with few heavy elements B3LYP/6-311
⫹G(3d p,2p) calculations result in very small deviations
from the experiments. For instance, the ⌬ fH 両 deviations for
CH4 and CS2 are only ⫺0.3 and 0.4 kcal mol⫺1 , respectively. The results in Ref. 9 are slightly better than the
B3LYP/6-311⫹G(3d f ,2p) results for small molecules, and
this is because the ZPEs in Ref. 9 are scaled by an empirical
factor 0.98. For large molecules, both B3LYP/6-311
⫹G(d, p) and B3LYP/6-311⫹G(3d f ,2p) calculations yield
quite large deviations from their experimental counterparts.
To improve the comparison with the experiment, different
empirical scaling factors have been employed for large molecules.
Our neural network adopts a three-layer architecture
which has an input layer consisting of inputs from the physical descriptors and a bias, a hidden layer containing a number of hidden neurons, and an output layer that outputs the
corrected values for ⌬ fH 両 共see Fig. 2兲. The number of hid-
den neurons is to be determined. The most important issue is
to select the proper physical descriptors of our molecules,
which are to be used as the input for our neural network. The
calculated ⌬ fH 両 contains the essence of exact ⌬ fH 両 , and is
thus an obvious choice of the primary descriptor for correcting ⌬ fH 両 . We observe that the size of a molecule affects the
accuracies of calculations. We plot the difference between
the calculated ⌬ fH 両 and the measured ⌬ fH 両 versus the
number of atoms N t in Fig. 3. Roughly speaking, the more
atoms a molecule has, the larger the deviation. The deviation
is approximately proportional to N t , although the proportionality is by no means strict. This is consistent with the
general observations of others in the field.9 N t of a molecule
is thus chosen as the second descriptor for the molecule. ZPE
is an important parameter in calculating ⌬ fH 両 . Its calculated
value is often rescaled in evaluating ⌬ fH 両 , 9 and it is thus
FIG. 2. Structure of our neural network.
Downloaded Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp
J. Chem. Phys., Vol. 119, No. 22, 8 December 2003
Calculated heat of formation
FIG. 3. Deviations 共theory–expt.兲 vs the number of atoms 共a兲 for
B3LYP/6-311⫹G(d,p); 共b兲 for B3LYP/6-311⫹G(3d f ,2p).
taken as the third physical descriptor. Finally, the number of
double bonds, N db , is selected as the fourth and last descriptor to reflect the chemical structure of a molecule. To ensure
the quality of our neural network, a cross-validation procedure is employed to determine our neural network including
its structure and weights.10 We randomly divide further 150
training molecules into five subsets of equal size. Four of
them are used to train the neural network, and the fifth to
validate its predictions. This procedure is repeated five times
in rotation. The number of neurons in the hidden layer is
varied from 1 to 10 to decide the optimal structure of our
neural network. We find that the hidden layer containing two
neurons yields the best overall results. Therefore, the 5-2-1
structure is adopted for our neural network as depicted in
Fig. 2. The input values at the input layer, x 1 , x 2 , x 3 , x 4 ,
and x 5 , are scaled ⌬ fH 両 , N t , ZPE, N db , and bias, respectively. The bias x 5 is set to 1. The weights 兵 Wx i j 其 s connect
the input neurons 兵 x i 其 and the hidden neurons y 1 and y 2 , and
兵 Wy j 其 s connect the hidden neurons and the output Z which
is the scaled ⌬ fH 両 upon neural-network correction. The output Z is related to the input 兵 x i 其 as
Z⫽
兺
j⫽1,2
Wy j Sig
冉兺
i⫽1,5
冊
Wx i j x i ,
共1兲
where Sig( v )⫽ 1/关 1⫹exp(⫺␣v)兴 and ␣ is a parameter that
controls the switch steepness of Sigmoidal function Sig( v ).
An error back-propagation learning procedure4 is used to optimize the values of Wx i j and Wy j (i⫽1,2,3,4,5 and j
⫽1,2). In Figs. 1共c兲 and 1共d兲 we plot the comparison be-
11503
tween the neural-network corrected ⌬ fH 両 s and their experimental values 共with the vertical coordinates for the experimental values and the horizontal coordinates for the
calculated ⌬ fH 両 s). The triangles belong to the training set
and the crosses belong to the testing set. Compared to the
raw calculated results, the neural-network corrected values
are much closer to the experimental values for both training
and testing sets. More importantly, the systematic deviations
for ⌬ fH 両 in Figs. 1共a兲 and 1共b兲 are eliminated, and the resulting numerical deviations are reduced substantially. This
can be further demonstrated by the error analysis performed
for the raw and neural-network corrected ⌬ fH 両 s of all 180
molecules. In the insets of Fig. 1, we plot the histograms for
the deviations 共from the experiments兲 of the raw B3LYP
⌬ fH 両 s and their neural-network corrected values. Obviously,
the raw calculated ⌬ fH 両 s have large systematic deviations:
21.4 and 12.0 kcal mol⫺1 for ⌬ fH 両 s at B3LYP/6-311
⫹G(d, p) and B3LYP/6-311⫹G(3d f ,2p), respectively. On
the contrary, the neural-network corrected ⌬ fH 両 s have virtually no systematic deviations. Moreover, the remaining numerical deviations are much smaller. Upon the neuralnetwork corrections, the RMS deviations of ⌬ fH 両 s are
reduced from 21.4 to 3.1 kcal mol⫺1 and 12.0 to
3.3 kcal mol⫺1
for
B3LYP/6-311⫹G(d, p)
and
B3LYP/6-311⫹G(3d f ,2p), respectively. Note that the error
distributions after the neural-network correction are of approximate Gaussian distributions 关see Figs. 1共c兲 and 1共d兲兴.
Although the raw B3LYP/6-311⫹G(d, p) results have much
larger deviations than those of B3LYP/6-311⫹G(3d f ,2p),
the neural-network corrected values of both calculations
have deviations of the same magnitude. This implies that it
may be sufficient to employ the smaller basis set 6-311
⫹G(d, p) in our combined DFT calculation and neuralnetwork correction 共or DFT-NEURON兲 approach. The
neural-network-based algorithm can correct easily the deficiency of a small basis set. Therefore, the DFT-NEURON
approach can potentially be applied to much larger systems.
In Table I we distinguish the molecules of the testing sets.
The deviations of large molecules are of the same magnitude
as those of small molecules. Unlike other quantum mechanical calculations that usually yield worse results for larger
molecules than for small ones, the DFT-NEURON approach
does not discriminate against the large molecules.
In Table II we list the values of 兵 Wx i j 其 and 兵 Wy j 其 of the
two neural networks for correcting ⌬ fH 両 of B3LYP/6-311
⫹G(d, p) and B3LYP/6-311⫹G(3d f ,2p) calculations.
Analysis of our neural network reveals that the weights connecting the input for ⌬ fH 両 have the dominant contribution
in all cases. This confirms our fundamental assumption that
the calculated ⌬ fH 両 captures the essential value of exact
⌬ fH 両 . The bias contributes significantly to the correction of
systematic deviations in the raw calculated data, and always
has the second largest weights for all cases. The input for the
second physical descriptor, N t , has also large weights in all
cases, in particular for 6-311⫹G(d, p). This is because the
raw ⌬ fH 両 deviations are roughly proportional to N t as
shown in Fig. 3, which confirms the importance of N t as a
significant descriptor of our neural network. ZPE has been
often rescaled to account for the discrepancies between cal-
Downloaded Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp
11504
Hu et al.
J. Chem. Phys., Vol. 119, No. 22, 8 December 2003
TABLE I. Experimental and calculated ⌬ fH 両 共298 K兲 for 180 compounds 共all data are in the units of kcal mol⫺1 ).
⌬ fH 両 共298 K兲
Formula
CF2 O
CH2 Cl2
CH2 F2 g
CH4
CH4 Og
CS2
C6 H6 g
CBrCl3
CBrF3 g
CClF3 g,h
CClNg
CCl2 O
CF4
CHCl3
CHF3
CH2 O2
CH3 Br
CH3 NO2
CH3 NO2
CH4 Sg,h
CH5 N
COS
C2 H2
C2 H2 Cl2 g
C2 H2 F2
C2 H2 O4
C2 H3 Br
C2 H3 ClOh
C2 H3 ClO2
C2 H3 Cl3 g
C2 H3 F
C2 H4 g
C2 H4 Br2
C2 H4 Cl2
C2 H4 Cl2
C2 H4 F2
C2 H4 O
C2 H4 O2
C2 H4 S
C2 H5 Br
C2 H5 Cl
C2 H5 N
C2 H5 NOg,h
C2 H5 NO2
C2 H5 NO3
C2 H6
C2 H6 O
C2 H6 S
C2 H7 N
C2 H7 Nh
C2 H8 N2
C2 N2
C3 H3 NO
C3 H4 g
C3 H4 h
C3 H4 O3
C3 H5 Cl3
C3 H6 g
C3 H6 g
C3 H6 Br2
C3 H6 Cl2 h
C3 H6 O
C3 H6 O2 h
C3 H6 O2 g
C3 H6 Sh
Name
carbonyl fluoride
dichloromethane
difluoromethane
methane
methanol
carbon disulfide
benzene
bromotrichloromethane
bromotrifluoromethane
chlorotrifluoromethane
cyanogen chloride
phosgene
carbon tetrafluoride
chloroform
trifluoromethane
formic acid
methyl bromide
nitromethane
methyl nitrite
methyl mercaptan
methylamine
carbonyl sulfide
acetylene
1,1-dichloroethylene
1,1-difluoroethylene
oxalic acid
vinyl bromide
acetyl chloride
chloroacetic acid
1,1,1-trichloroethane
vinyl fluoride
ethylene
1,2-dibromoethane
1,1-dichloroethane
1,2-dichloroethane
1,1-difluoroethane
ethylene oxide
acetic acid
thiacyclopropane
bromoethane
ethyl chloride
ethyleneimine
acetamide
nitroethane
ethyl nitrate
ethane
dimethyl ether
dimethyl sulfide
dimethylamine
ethylamine
ethylenediamine
cyanogen
oxazole
methylacetylene
propadiene
ethylene carbonate
1,2,3-trichloropropane
cyclopropane
propylene
1,2-dibromopropane
1,2-dichloropropane
acetone
methyl acetate
propionic acid
thiacyclobutane
Expt.a
DFT1b
NN1c
DFT2d
NN2e
DFT3f
⫺152.7
⫺22.8
⫺108.2
⫺17.9
⫺48.1
28.0
19.8
⫺9.3
⫺155.1
⫺169.2
33.0
⫺52.3
⫺223.0
⫺24.2
⫺166.7
⫺90.5
⫺9.0
⫺17.9
⫺15.3
⫺5.5
⫺5.5
⫺33.1
54.2
0.6
⫺80.5
⫺173.0
18.7
⫺58.3
⫺104.3
⫺34.0
⫺33.2
12.5
⫺9.3
⫺31.0
⫺31.0
⫺118.0
⫺12.6
⫺103.9
19.7
⫺15.3
⫺26.7
29.5
⫺57.0
⫺24.2
⫺36.8
⫺20.2
⫺44.0
⫺9.0
⫺4.5
⫺11.0
⫺4.1
73.8
⫺3.7
44.3
45.9
⫺121.2
⫺44.4
12.7
4.9
⫺17.4
⫺39.6
⫺52.0
⫺98.0
⫺108.4
14.6
⫺132.9
⫺12.2
⫺100.1
⫺16.6
⫺42.9
36.5
36.2
11.3
⫺140.4
⫺150.9
40.3
⫺40.4
⫺203.2
⫺7.4
⫺153.2
⫺82.9
⫺6.8
⫺11.0
⫺7.9
1.4
⫺3.4
⫺25.8
59.8
15.0
⫺73.5
⫺152.7
22.6
⫺47.5
⫺97.3
⫺12.7
⫺27.7
16.3
⫺0.0
⫺17.6
⫺18.2
⫺109.3
⫺3.6
⫺91.9
30.3
⫺9.8
⫺18.1
36.8
⫺49.6
⫺14.4
⫺24.2
⫺15.9
⫺36.3
1.9
0.9
⫺6.3
3.4
78.4
8.9
51.5
49.4
⫺101.4
⫺17.5
21.6
11.9
⫺5.2
⫺20.4
⫺41.6
⫺83.9
⫺91.8
31.1
⫺146.0
⫺19.2
⫺107.2
⫺16.7
⫺46.2
31.2
21.1
⫺5.3
⫺156.9
⫺168.0
34.3
⫺49.8
⫺218.6
⫺18.8
⫺167.6
⫺89.2
⫺10.3
⫺20.1
⫺17.8
⫺4.0
⫺6.5
⫺29.4
54.2
4.4
⫺83.9
⫺177.3
15.6
⫺58.3
⫺112.6
⫺29.4
⫺33.8
13.5
⫺12.4
⫺29.7
⫺29.9
⫺121.9
⫺9.8
⫺103.8
22.4
⫺18.0
⫺26.0
29.4
⫺60.5
⫺28.8
⫺42.3
⫺20.2
⫺44.5
⫺7.9
⫺7.2
⫺14.2
⫺8.2
65.7
⫺1.6
42.5
41.5
⫺119.3
⫺38.3
14.1
4.4
⫺22.5
⫺37.2
⫺53.6
⫺100.3
⫺108.3
18.8
⫺144.2
⫺17.8
⫺107.5
⫺18.2
⫺47.2
28.4
26.7
1.2
⫺153.4
⫺165.0
32.7
⫺49.2
⫺218.9
⫺15.6
⫺164.5
⫺89.6
⫺9.2
⫺20.5
⫺17.4
⫺3.3
⫺7.3
⫺34.0
56.7
6.8
⫺83.5
⫺166.5
18.5
⫺55.0
⫺97.3
⫺22.0
⫺34.0
13.1
⫺4.3
⫺24.3
⫺24.6
⫺117.9
⫺10.3
⫺100.1
23.9
⫺13.2
⫺22.6
30.5
⫺57.5
⫺25.1
⫺38.4
⫺18.8
⫺42.6
⫺4.6
⫺4.5
⫺11.4
⫺4.0
70.7
⫺2.1
46.8
44.4
⫺115.9
⫺27.2
16.7
7.2
⫺10.5
⫺28.1
⫺48.5
⫺94.1
⫺101.3
23.9
⫺146.2
⫺17.9
⫺107.5
⫺16.8
⫺47.3
31.1
21.1
⫺1.1
⫺156.7
⫺169.1
33.8
⫺48.6
⫺220.1
⫺16.7
⫺168.2
⫺89.2
⫺8.4
⫺22.0
⫺19.1
⫺3.7
⫺8.0
⫺30.6
55.7
5.8
⫺84.7
⫺176.9
18.1
⫺57.0
⫺101.2
⫺26.7
⫺34.1
13.7
⫺7.8
⫺27.9
⫺28.1
⫺121.7
⫺11.7
⫺103.4
21.8
⫺15.9
⫺25.2
27.7
⫺61.1
⫺30.7
⫺45.1
⫺20.5
⫺46.2
⫺8.3
⫺8.6
⫺15.5
⫺10.5
66.9
⫺3.9
43.7
42.9
⫺122.7
⫺35.1
13.4
4.6
⫺17.5
⫺35.1
⫺53.2
⫺100.8
⫺108.1
18.6
⫺143.6
⫺18.2
⫺107.7
⫺19.5
⫺48.1
28.2
24.2
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
Downloaded Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp
J. Chem. Phys., Vol. 119, No. 22, 8 December 2003
Calculated heat of formation
11505
DFT3f
TABLE I 共Continued兲.
⌬ fH 両 共298 K兲
Formula
C3 H7 Br
C3 H7 Brh
C3 H7 Cl
C3 H7 Cl
C 3 H7 F
C3 H7 NOg
C3 H7 NO2
C3 H7 NO2
C3 H7 NO3
C3 H7 NO3
C3 H8 h
C3 H8 O
C 3 H8 S
C 3 H8 S
C 3 H8 S
C 3 H9 N
C 3 H9 N
C 3 H9 N
C3 H10N2 h
C4 H4 N2
C4 H6
C4 H6 Oh
C4 H8 h
C4 H8 O
C4 H8 O2
C4 H9 Br
C4 H9 Cl
C4 H10O
C4 H10O2 h
C4 H10S
C4 H10S
C4 H11N
C 5 H5 N
C5 H6 S
C5 H8
C5 H8 O2 h
C5 H10h
C5 H10
C5 H10
C5 H10
C5 H10
C5 H10
C5 H10h
C5 H10O
C5 H10O
C5 H10O2 g
C5 H10Sh
C5 H10Sg,h
C5 H11Br
C5 H11Cl
C5 H11N
C5 H12
C5 H12
C5 H12O
C5 H12Og,h
C5 H12O
C5 H12O
C5 H12O
C5 H12O
C5 H12O4
C5 H12Sg
C5 H12S
C 6 F6
C6 H4 Cl2
C6 H4 F2 g,h
Name
1-bromopropane
2-bromopropane
isopropyl chloride
n-propyl chloride
1-fluoropropane
N,N-dimethylformamide
1-nitropropane
2-nitropropane
propyl nitrate
isopropyl nitrate
propane
methyl ethyl ether
n-propyl mercaptan
isopropyl mercaptan
ethyl methyl sulfide
n-propylamine
isopropylamine
trimethylamine
1,2-propanediamine
succinonitrile
1,2-butadiene
divinyl ether
1-butene
isobutyraldehyde
ethyl acetate
1-bromobutane
tert-butyl chloride
sec-butanol
1,4-butanediol
isobutyl mercaptan
methyl propyl sulfide
tert-butylamine
pyridine
2-methylthiophene
trans-1,3-pentadiene
acetylacetone
cyclopentane
2-methyl-1-butene
2-methyl-2-butene
3-methyl-1-butene
1-pentene
cis-2-pentene
trans-2-pentene
2-pentanone
valeraldehyde
valeric acid
thiacyclohexane
cyclopentanethiol
1-bromopentane
1-chloropentane
piperidine
isopentane
n-pentane
2-methyl-1-butanol
3-methyl-1-butanol
3-methyl-2-butanol
2-pentanol
3-pentanol
ethyl propyl ether
pentaerythritol
n-pentyl mercaptan
butyl methyl sulfide
hexafluorobenzene
m-dichlorobenzene
p-difluorobenzene
Expt.a
DFT1b
NN1c
DFT2d
NN2e
⫺21.0
⫺23.2
⫺35.0
⫺31.1
⫺67.2
⫺45.8
⫺29.8
⫺33.5
⫺41.6
⫺45.6
⫺24.8
⫺51.7
⫺16.2
⫺18.2
⫺14.2
⫺17.3
⫺20.0
⫺5.7
⫺12.8
50.1
38.8
⫺3.3
⫺0.0
⫺51.5
⫺105.9
⫺25.6
⫺43.8
⫺69.9
⫺102.0
⫺23.2
⫺19.5
⫺28.6
33.5
20.0
18.6
⫺90.8
⫺18.5
⫺8.7
⫺10.2
⫺6.9
⫺5.0
⫺6.7
⫺7.6
⫺61.8
⫺54.4
⫺117.2
⫺15.1
⫺11.5
⫺30.9
⫺41.8
⫺11.7
⫺36.9
⫺35.0
⫺72.2
⫺72.2
⫺75.1
⫺75.0
⫺75.7
⫺65.1
⫺185.6
⫺25.9
⫺24.4
⫺228.6
6.3
⫺73.4
⫺10.7
⫺12.8
⫺21.7
⫺18.9
⫺58.9
⫺36.2
⫺15.0
⫺17.9
⫺24.9
⫺28.1
⫺16.8
⫺40.6
⫺1.6
⫺2.7
0.7
⫺7.0
⫺9.7
3.4
0.4
63.5
46.5
10.0
11.3
⫺35.6
⫺88.0
⫺11.4
⫺24.6
⫺52.4
⫺79.8
⫺2.4
⫺0.1
⫺12.1
46.2
44.3
31.7
⫺66.6
0.9
7.9
5.9
10.6
10.4
9.0
7.5
⫺43.8
⫺35.3
⫺94.4
11.4
14.7
⫺12.2
⫺20.5
9.7
⫺18.4
⫺18.4
⫺49.2
⫺48.2
⫺52.1
⫺52.0
⫺49.8
⫺45.4
⫺143.7
⫺3.2
⫺0.9
⫺194.8
32.8
⫺51.3
⫺23.7
⫺26.1
⫺34.7
⫺31.7
⫺71.5
⫺51.5
⫺33.7
⫺36.8
⫺47.7
⫺51.3
⫺25.9
⫺53.7
⫺16.5
⫺17.8
⫺13.8
⫺19.7
⫺22.7
⫺9.8
⫺16.1
44.5
34.4
⫺4.5
⫺0.8
⫺52.3
⫺109.4
⫺29.3
⫺42.9
⫺70.5
⫺101.3
⫺22.3
⫺19.4
⫺30.4
31.2
25.6
16.0
⫺91.4
⫺14.9
⫺9.2
⫺11.6
⫺6.5
⫺6.5
⫺8.1
⫺9.7
⫺65.5
⫺56.6
⫺120.6
⫺9.2
⫺7.3
⫺35.0
⫺43.0
⫺9.6
⫺37.4
⫺37.2
⫺71.9
⫺70.7
⫺75.1
⫺74.7
⫺72.6
⫺68.2
⫺179.6
⫺27.8
⫺25.1
⫺230.3
9.3
⫺74.8
⫺15.4
⫺17.5
⫺27.5
⫺24.8
⫺65.6
⫺45.9
⫺27.0
⫺29.6
⫺40.3
⫺43.4
⫺21.0
⫺48.1
⫺8.7
⫺9.7
⫺7.0
⫺13.4
⫺16.2
⫺3.6
⫺8.0
53.2
40.1
⫺0.5
5.3
⫺43.7
⫺99.4
⫺17.5
⫺31.7
⫺60.3
⫺90.3
⫺10.7
⫺9.1
⫺19.8
35.6
31.7
23.7
⫺78.8
⫺5.8
0.5
⫺1.7
3.3
3.1
1.6
0.1
⫺53.3
⫺44.7
⫺106.5
1.8
5.2
⫺19.6
⫺29.0
0.8
⫺25.3
⫺25.2
⫺58.4
⫺57.4
⫺61.4
⫺61.1
⫺59.0
⫺55.4
⫺159.3
⫺12.9
⫺11.2
⫺225.2
18.5
⫺67.1
⫺21.5
⫺23.6
⫺33.6
⫺30.8
⫺71.7
⫺52.8
⫺35.3
⫺38.1
⫺50.5
⫺53.7
⫺26.2
⫺55.1
⫺16.0
⫺17.0
⫺14.1
⫺21.0
⫺23.8
⫺11.2
⫺17.9
45.3
35.7
⫺5.6
⫺0.6
⫺51.7
⫺109.8
⫺27.0
⫺41.3
⫺70.7
⫺102.1
⫺21.4
⫺19.6
⫺31.0
30.6
24.6
16.7
⫺90.2
⫺15.5
⫺8.9
⫺11.2
⫺6.1
⫺6.2
⫺7.7
⫺9.3
⫺64.8
⫺56.1
⫺120.6
⫺9.9
⫺6.7
⫺32.5
⫺41.9
⫺11.3
⫺37.4
⫺37.3
⫺72.2
⫺71.1
⫺75.2
⫺74.9
⫺72.8
⫺69.2
⫺181.2
⫺27.0
⫺25.2
⫺232.6
11.1
⫺75.3
DownloadedRedistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
11506
Hu et al.
J. Chem. Phys., Vol. 119, No. 22, 8 December 2003
TABLE I 共Continued兲.
⌬ fH 両 共298 K兲
Formula
C6 H5 Cl
C6 H5 F
C6 H5 NO2
C6 H6 N2 O2
C6 H6 O
C6 H6 O2
C6 H7 N
C6 H8 N2
C6 H10g
C6 H10
C6 H10O3
C6 H11NOg
C6 H12
C6 H12O
C6 H12Og
C6 H14g,h
C6 H14S
C 7 H5 N
C7 H6 O
C 7 H8 g
C7 H8 O
C 7 H9 N
C7 H14
C7 H15Br
C7 H16h
C7 H16
C7 H16S
C8 H6 O4
C8 H8 O
C8 H10h
C8 H10Og
C8 H16
C8 H16h
C8 H16h
C8 H18
C8 H18
C8 H18
C8 H18g
C8 H18Og
C8 H18S2
C9 H12
C9 H12g
C9 H18Oh
C9 H20h
C9 H20
C10H14
C10H14
C10H18O4
C10H20O2
C12H10
Name
monochlorobenzene
fluorobenzene
nitrobenzene
m-nitroaniline
phenol
1,3-benzenediol
2-methylpyridine
adiponitrile
1-methylcyclopentene
1,5-hexadiene
propionic anhydride
e-caprolactam
trans-3-hexene
butyl vinyl ether
3-hexanone
3-methylpentane
methyl pentyl sulfide
benzonitrile
benzaldehyde
toluene
o-cresol
2,6-dimethylpyridine
cis-1,2-dimethylcyclopentane
1-bromoheptane
3,3-dimethylpentane
2,2,3-trimethylbutane
n-heptyl mercaptan
terephthalic acid
acetophenone
o-xylene
3,4-xylenol
cis-1,2-dimethylcyclohexane
trans-1,4-dimethylcyclohexane
2,4,4-trimethyl-2-pentene
2,3-dimethylhexane
3-ethylhexane
4-methylheptane
2,3,4-trimethylpentane
2-ethyl-1-hexanol
dibutyl disulfide
m-ethyltoluene
1,2,3-trimethylbenzene
diisobutyl ketone
3,3-diethylpentane
2,2,3,4-tetramethylpentane
sec-butylbenzene
isobutylbenzene
sebacic acid
n-decanoic acid
acenaphthene
Expt.a
DFT1b
NN1c
DFT2d
NN2e
12.4
⫺27.9
16.2
14.0
⫺23.0
⫺65.7
23.6
35.7
⫺1.3
20.1
⫺149.7
⫺58.8
⫺13.0
⫺43.7
⫺66.4
⫺41.0
⫺29.3
52.3
⫺8.8
11.9
⫺30.7
14.0
⫺31.0
⫺40.2
⫺48.2
⫺48.9
⫺35.8
⫺171.6
⫺20.7
4.5
⫺37.4
⫺41.1
⫺44.1
⫺25.1
⫺51.1
⫺50.4
⫺50.7
⫺52.0
⫺87.3
⫺37.9
⫺0.5
⫺2.3
⫺85.5
⫺55.4
⫺56.6
⫺4.0
⫺4.9
⫺220.3
⫺142.0
37.0
34.0
⫺8.3
37.1
38.2
⫺1.5
⫺39.3
40.7
59.3
20.9
38.7
⫺116.5
⫺32.1
6.9
⫺18.9
⫺43.8
⫺16.1
⫺1.7
72.1
13.2
32.9
⫺5.0
35.2
⫺1.0
⫺13.8
⫺17.4
⫺17.4
⫺4.1
⫺121.9
7.0
30.3
⫺7.1
⫺5.7
⫺4.7
6.8
⫺17.7
⫺16.6
⫺19.4
⫺14.0
⫺47.7
7.5
29.3
29.1
⫺45.4
⫺11.2
⫺14.1
31.5
30.4
⫺163.4
⫺92.9
79.0
14.8
⫺26.8
12.1
8.8
⫺20.3
⫺63.1
20.8
31.7
1.4
18.2
⫺153.0
⫺58.4
⫺15.1
⫺44.9
⫺70.2
⫺39.8
⫺30.7
48.4
⫺8.6
12.7
⫺29.1
10.2
⫺26.9
⫺46.2
⫺46.2
⫺46.6
⫺38.4
⫺169.0
⫺19.9
5.1
⫺36.7
⫺36.1
⫺35.0
⫺25.4
⫺51.3
⫺50.0
⫺52.8
⫺47.6
⫺84.4
⫺36.2
⫺0.8
⫺1.3
⫺86.3
⫺49.4
⫺52.8
⫺3.1
⫺4.2
⫺218.8
⫺142.2
41.6
22.1
⫺21.0
19.6
18.0
⫺14.0
⫺54.8
28.7
46.4
12.3
29.6
⫺133.6
⫺45.0
⫺1.8
⫺31.1
⫺54.6
⫺24.2
⫺13.3
58.7
⫺0.3
22.0
⫺18.8
21.9
⫺10.3
⫺23.8
⫺26.7
⫺26.8
⫺16.5
⫺144.6
⫺7.8
17.9
⫺22.4
⫺16.1
⫺15.0
⫺4.5
⫺28.5
⫺27.3
⫺30.1
⫺24.7
⫺60.8
⫺11.9
15.6
15.3
⫺60.0
⫺23.0
⫺26.1
16.7
15.5
⫺188.6
⫺112.1
60.5
15.7
⫺27.3
11.0
6.9
⫺21.1
⫺64.5
20.5
32.6
1.6
19.1
⫺153.9
⫺59.9
⫺14.6
⫺45.9
⫺69.5
⫺39.7
⫺30.7
49.2
⫺8.1
13.1
⫺29.5
10.3
⫺27.0
⫺43.6
⫺45.7
⫺45.9
⫺37.5
⫺169.6
⫺19.3
5.6
⫺36.8
⫺36.2
⫺35.1
⫺24.3
⫺50.9
⫺49.7
⫺52.5
⫺47.1
⫺84.6
⫺38.2
⫺0.1
⫺0.5
⫺85.3
⫺48.8
⫺52.0
⫺2.4
⫺3.5
⫺221.4
⫺144.6
42.7
DFT3f
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
–
a
The experimental data were taken from Ref. 5.
The calculated ⌬ fH 両 by using B3LPY/6-311⫹G(d,p) geometries, zero point energies and enthalpies.
c
The calculated ⌬ fH 両 by B3LYP/6-311⫹G(d,p)-neural networks approach.
d
The calculated ⌬ fH 両 by using the 6-311⫹G(d,p) geometries and zero point energies, and recalculated enthalpies by 6-311⫹G(3d f ,2p) basis.
e
The calculated ⌬ fH 両 by B3LYP/6-311⫹G(3d f ,2p)-neural networks approach.
f
The ⌬ fH 両 s were taken from Ref. 9, where the zero point energies were corrected by a scale factor 0.98.
g
The molecule belongs to the testing set in NN1 calculation.
h
The molecule belongs to the testing set in NN2 calculation.
b
culations and experiments,9 and it is thus expected to have
large weights. This is indeed the case, especially when the
smaller basis set 6-311⫹G(d, p) is adopted in the calculations. In all cases the number of double bonds, N db , has the
smallest but non-negligible weights.
Our DFT-NEURON approach has a RMS deviation of
⬃3 kcal mol⫺1 for the 180 small- to medium-sized organic
molecules. The physical descriptors adopted in our neural
network, the raw calculated ⌬ fH 両 , the number of atoms N t ,
the number of double bonds N db , and the ZPE are quite
Downloaded Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp
J. Chem. Phys., Vol. 119, No. 22, 8 December 2003
Calculated heat of formation
TABLE II. Weights of DFT-neural networks for ⌬ fH 両 .
NN1a
a
NN2b
Weights
y1
y2
y1
y2
Wx1 j
Wx2 j
Wx3 j
Wx4 j
Wx5 j
Wy j
⫺2.48
0.25
0.01
0.26
0.41
⫺0.21
0.79
⫺0.50
0.38
0.01
⫺0.49
1.41
0.85
⫺0.18
0.09
0.01
⫺0.56
1.43
⫺2.77
0.09
0.20
0.20
0.59
⫺0.20
NN1 refers B3LYP/6-311⫹G(d,p)-neural networks approach.
NN2 refers B3LYP/6-311⫹G(3d f ,2p)-neural networks approach.
b
general, and are not limited to special properties of organic
molecules. The DFT-NEURON approach developed here is
expected to yield a RMS deviation of ⬃3 kcal mol⫺1 for
⌬ fH 両 s of any small- to medium-sized organic molecules. G2
共Ref. 9兲 and G3 共Ref. 11兲 methods result in more accurate
⌬ fH 両 for small molecules. Like the DFT-NEURON approach, G2 and G3 methods have empirical fittings to the
high order electron correlations and ZPEs, and are thus not
entirely ab initio. Our approach is much more efficient and
can be applied to much larger systems. To improve the accuracy for the DFT-NEURON approach, we need more and
better experimental data, and possibly, more and better
physical descriptors for the molecules. Besides ⌬ fH 両 , the
DFT-NEURON approach can be generalized to calculate
other properties such as Gibbs free energy, ionization energy,
dissociation energy, absorption frequency, band gap, etc. The
raw first-principles property of interest contains the essence
of its exact value, and is thus always the primary descriptor.
As the raw calculation error accumulates with increasing molecular size, the number of atoms N t should thus be selected
for other DFT-NEURON calculations. Additional physical
descriptors should be chosen according to their relations to
the property of interest and to the physical and chemical
characteristics of the compounds. Others have used neural
networks to determine the quantitative relationship between
the experimental measured thermodynamic properties and
the structure parameters of the molecules.10 We distinguish
our work from them by utilizing specifically the firstprinciples methods and with the objective to improve quantum mechanical results. Since the first-principles calculations
capture readily the essences of the properties of interest, our
approach is more reliable and covers much a wider range of
molecules or compounds.
11507
To summarize, we have developed a promising new approach to improve the results of first-principles quantum mechanical calculations and to calibrate their uncertainties. The
accuracy of DFT-NEURON approach can be systematically
improved as more and better experimental data are available.
As the systematic deviations caused by small basis sets and
less sophisticated methods adopted in the calculations can be
easily corrected by neural networks, the requirements on
first-principles calculations are modest. Our approach is thus
highly efficient compared to much more sophisticated firstprinciples methods of similar accuracy, and more importantly, is expected to be applied to much larger systems. The
combined first-principles calculation and neural-network correction approach developed in this work is potentially a powerful tool in computational physics and chemistry, and may
open the possibility for first-principles methods to be employed practically as predictive tools in materials research
and design.
We thank Professor YiJing Yan for extensive discussion
on the subject and generous help in manuscript preparation.
Support from the Hong Kong Research Grant Council
共RGC兲 and the Committee for Research and Conference
Grants 共CRCG兲 of the University of Hong Kong is gratefully
acknowledged.
1
R. G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules 共Oxford University Press, New York, 1989兲, and references therein.
2
H. F. Schaefer III, Methods of Electronic Structure Theory 共Plenum, New
York and London, 1977兲, and references therein.
3
B. D. Ripley, Pattern Recognition and Neural Networks 共Cambridge University Press, New York, 1996兲.
4
D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Nature 共London兲 323,
533 共1986兲.
5
C. L. Yaws, Chemical Properties Handbook 共McGraw–Hill, New York,
1999兲.
6
D. R. Lide, CRC Handbook of Chemistry and Physics, 3rd electronic ed.
共CRC, Boca Raton, FL, 2000兲.
7
J. B. Pedley, R. D. Naylor, and S. P. Kirby, Thermochemical Data of
Organic Compounds, 2nd ed. 共Chapman and Hall, New York, 1986兲.
8
M. J. Frisch, G. W. Trucks, H. B. Schlegel et al., GAUSSIAN 98, Revision
A.11.3, Gaussian, Inc., Pittsburgh, PA, 2002.
9
L. A. Curtiss, K. Raghavachari, P. C. Redfern, and J. A. Pople, J. Chem.
Phys. 106, 1063 共1997兲.
10
X. Yao, X. Zhang, R. Zhang, M. Liu, Z. Hu, and B. Fan, Comput. Chem.
25, 475 共2001兲.
11
L. A. Curtiss, K. Raghavachari, P. C. Redfern, V. Rassolov, and J. A.
Pople, J. Chem. Phys. 109, 7764 共1998兲.
Downloaded Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp