MIT Libraries logoDSpace@MIT


Data-Driven Approaches Can Overcome the Cost–Accuracy Trade-Off in Multireference Diagnostics

Author(s)
Duan, Chenru; Liu, Fang; Nandy, Aditya; Kulik, Heather J
Download: Submitted version (11.65 MB)
Open Access Policy
Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract
Copyright © 2020 American Chemical Society.
High-throughput computational screening typically employs methods, e.g., density functional theory (DFT), that can fail to describe challenging molecules, such as those with strongly correlated electronic structure. In such cases, multireference (MR) correlated wavefunction theory (WFT) would be the appropriate choice, but it remains more challenging to carry out and automate than single-reference (SR) WFT or DFT. Numerous diagnostics have been proposed for identifying when MR character is likely to affect the predictive power of SR calculations, but studies on small data sets have reached conflicting conclusions about diagnostic performance. We compute 15 MR diagnostics, ranging from affordable DFT-based to more costly MR-WFT-based diagnostics, on a set of 3165 equilibrium and distorted small organic molecules containing up to six heavy atoms. Conflicting MR character assignments and low pairwise linear correlations among diagnostics are also observed over this set. We evaluate the ability of existing diagnostics to predict the percent recovery of the correlation energy, %Ecorr. None of the DFT-based diagnostics are nearly as predictive of %Ecorr as the best WFT-based diagnostics. To overcome the limitation of this cost–accuracy trade-off, we develop machine learning (ML) models, specifically kernel ridge regression, to predict WFT-based diagnostics from a combination of DFT-based diagnostics and a new, size-independent 3D geometric representation. The ML-predicted diagnostics correlate as well with MR effects as their computed (i.e., WFT) values, significantly improving over the DFT-based diagnostics on which the models were trained. These ML models thus provide a promising approach to improve upon DFT-based diagnostic accuracy while remaining suitably low cost for high-throughput screening.
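The abstract names kernel ridge regression as the model that maps inexpensive inputs to costly diagnostics. The following is a minimal sketch of that general technique, not the authors' implementation: the two input features are hypothetical stand-ins for DFT-based diagnostics, the target is a synthetic smooth function rather than a real WFT-based diagnostic, and the Gaussian kernel and the hyperparameters `gamma` and `alpha` are illustrative assumptions.

```python
import math
import random


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def krr_fit(features, targets, gamma, alpha):
    """Return dual weights w solving (K + alpha * I) w = y."""
    n = len(features)
    gram = [
        [rbf_kernel(features[i], features[j], gamma) + (alpha if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return solve_linear_system(gram, targets)


def krr_predict(train_features, weights, query, gamma):
    """Predict as a kernel-weighted sum over the training points."""
    return sum(w * rbf_kernel(x, query, gamma)
               for w, x in zip(weights, train_features))


def synthetic_target(features):
    """Hypothetical smooth target; NOT a real WFT-based diagnostic."""
    return math.sin(features[0]) + 0.5 * features[1] ** 2


# Synthetic data: two hypothetical "cheap" descriptors per sample.
rng = random.Random(0)
train_x = [[rng.random(), rng.random()] for _ in range(40)]
test_x = [[rng.random(), rng.random()] for _ in range(10)]
train_y = [synthetic_target(x) for x in train_x]
test_y = [synthetic_target(x) for x in test_x]

GAMMA, ALPHA = 1.0, 1e-3  # assumed hyperparameters, not tuned
weights = krr_fit(train_x, train_y, GAMMA, ALPHA)
predictions = [krr_predict(train_x, weights, x, GAMMA) for x in test_x]

# Compare against a trivial baseline that always predicts the training mean.
mean_y = sum(train_y) / len(train_y)
mae = sum(abs(p - y) for p, y in zip(predictions, test_y)) / len(test_y)
baseline_mae = sum(abs(mean_y - y) for y in test_y) / len(test_y)
print(f"KRR MAE: {mae:.4f}  mean-baseline MAE: {baseline_mae:.4f}")
```

In practice a library implementation with cross-validated hyperparameters would be the more usual choice; the hand-written solver here only keeps the sketch free of dependencies.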
Date issued
2020
URI
https://hdl.handle.net/1721.1/134401
Department
Massachusetts Institute of Technology. Department of Chemical Engineering; Massachusetts Institute of Technology. Department of Chemistry
Journal
Journal of Chemical Theory and Computation
Publisher
American Chemical Society (ACS)

Collections
  • MIT Open Access Articles