Accurate Thermochemistry with Small Data Sets: A Bond Additivity Correction and Transfer Learning Approach
Author(s)
Grambow, Colin A.; Li, Yi-Pei; Green Jr, William H
Abstract
Machine learning provides promising new methods for accurate yet rapid prediction of molecular properties, including thermochemistry, which is an integral component of many computer simulations, particularly automated reaction mechanism generation. Often, very large data sets with tens of thousands of molecules are required for training the models, but most data sets of experimental or high-accuracy quantum mechanical quality are much smaller. To overcome these limitations, we calculate new high-level data sets and derive bond additivity corrections to significantly improve enthalpies of formation. We adopt a transfer learning technique to train neural network models that achieve good performance even with a relatively small set of high-accuracy data. The training data for the entropy model are carefully selected so that important conformational effects are captured. The resulting models are generally applicable thermochemistry predictors for organic compounds with oxygen and nitrogen heteroatoms that approach experimental and coupled cluster accuracy while only requiring molecular graph inputs. Due to their versatility and the ease of adding new training data, they are poised to replace conventional estimation methods for thermochemical parameters in reaction mechanism generation. Since high-accuracy data are often sparse, similar transfer learning approaches are expected to be useful for estimating many other molecular properties.
Date issued
2019-06
Department
Massachusetts Institute of Technology. Department of Chemical Engineering
Journal
Journal of Physical Chemistry A
Publisher
American Chemical Society (ACS)
Citation
Grambow, Colin A. et al. "Accurate Thermochemistry with Small Data Sets: A Bond Additivity Correction and Transfer Learning Approach." Journal of Physical Chemistry A 123, 27 (June 2019): 5826-5835 © 2019 American Chemical Society
Version: Author's final manuscript
ISSN
1089-5639
1520-5215
Keywords
Exxon Mobil Corporation (Grant EM09079)