Multi-fidelity prediction of molecular optical peaks with deep learning

Greenman, Kevin P; Green Jr, William H; Gomez-Bombarelli, Rafael

dc.contributor.author	Greenman, Kevin P
dc.contributor.author	Green Jr, William H
dc.contributor.author	Gomez-Bombarelli, Rafael
dc.date.accessioned	2022-01-05T19:36:22Z
dc.date.available	2022-01-05T13:32:46Z
dc.date.available	2022-01-05T19:36:22Z
dc.date.issued	2022-01-04
dc.identifier.issn	2041-6520
dc.identifier.issn	2041-6539
dc.identifier.uri	https://hdl.handle.net/1721.1/138813.2
dc.description.abstract	Optical properties are central to molecular design for many applications, including solar cells and biomedical imaging. A variety of ab initio and statistical methods have been developed for their prediction, each with a trade-off between accuracy, generality, and cost. Existing theoretical methods such as time-dependent density functional theory (TD-DFT) are generalizable across chemical space because of their robust physics-based foundations but still exhibit random and systematic errors with respect to experiment despite their high computational cost. Statistical methods can achieve high accuracy at a lower cost, but data sparsity and unoptimized molecule and solvent representations often limit their ability to generalize. Here, we utilize directed message passing neural networks (DMPNNs) to represent both dye molecules and solvents for predictions of molecular absorption peaks in solution. Additionally, we demonstrate a multi-fidelity approach based on an auxiliary model trained on over 28,000 TD-DFT calculations that further improves accuracy and generalizability, as shown through rigorous splitting strategies. Combining several openly-available experimental datasets, we benchmark these methods against a state-of-the-art regression tree algorithm and compare the DMPNN solvent representation to several alternatives. Finally, we explore the interpretability of the learned representations using dimensionality reduction and evaluate the use of ensemble variance as an estimator of the epistemic uncertainty in our predictions of molecular peak absorption in solution. The prediction methods proposed herein can be integrated with active learning, generative modeling, and experimental workflows to enable the more rapid design of molecules with targeted optical properties.	en_US
dc.description.sponsorship	National Science Foundation (Grant 1745302)	en_US
dc.publisher	Royal Society of Chemistry (RSC)	en_US
dc.relation.isversionof	10.1039/d1sc05677h	en_US
dc.rights	Creative Commons Attribution 3.0 unported license	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/3.0/	en_US
dc.source	Kevin P. Greenman	en_US
dc.title	Multi-fidelity prediction of molecular optical peaks with deep learning	en_US
dc.type	Article	en_US
dc.identifier.citation	Greenman, Kevin P, Green, William H. and Gomez-Bombarelli, Rafael. 2022. "Multi-fidelity prediction of molecular optical peaks with deep learning." Chemical Science.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Chemical Engineering	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Materials Science and Engineering	en_US
dc.relation.journal	Chemical Science	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dspace.date.submission	2022-01-05T01:15:21Z
mit.license	PUBLISHER_CC
mit.metadata.status	Publication Information Needed	en_US

Files in this item

Name: d1sc05677h-combined.pdf
Size: 4.467Mb
Format: Unknown

This item appears in the following Collection(s)

MIT Open Access Articles

Version History

Version	Item	Date	Summary
2	1721.1/138813.2*	2022-01-05T19:30:46Z	Authority information verified/added.
1	1721.1/138813	2022-01-05T13:32:46Z

DSpace@MIT
