Dataset Design for Building Models of Chemical Reactivity

Author(s)
Raghavan, Priyanka; Haas, Brittany C; Ruos, Madeline E; Schleinitz, Jules; Doyle, Abigail G; Reisman, Sarah E; Sigman, Matthew S; Coley, Connor W
Terms of use
Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/
Abstract
Models can codify our understanding of chemical reactivity and serve a useful purpose in the development of new synthetic processes, for example by evaluating hypothetical reaction conditions or assessing substrate tolerance in silico. Perhaps the most decisive factor in how well such a model performs is the composition of its training data and whether that data is sufficient to train a model that can make accurate predictions over the full domain of interest. Here, we discuss the design of reaction datasets in ways that are conducive to data-driven modeling, emphasizing the idea that training set diversity and model generalizability rely on the choice of molecular or reaction representation. We additionally discuss the experimental constraints associated with generating common types of chemistry datasets and how these considerations should influence dataset design and model building.
Date issued
2023-12-27
URI
https://hdl.handle.net/1721.1/158178
Department
Massachusetts Institute of Technology. Department of Chemical Engineering; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
ACS Central Science
Publisher
American Chemical Society
Citation
Priyanka Raghavan, Brittany C. Haas, Madeline E. Ruos, Jules Schleinitz, Abigail G. Doyle, Sarah E. Reisman, Matthew S. Sigman, and Connor W. Coley. Dataset Design for Building Models of Chemical Reactivity. ACS Central Science 2023, 9 (12), 2196-2204.
Version: Final published version

Collections
  • MIT Open Access Articles
