DSpace@MIT (MIT Libraries)

Predictive Chemistry Augmented with Text Retrieval

Author(s)
Qian, Yujie; Li, Zhening; Tu, Zhengkai; Coley, Connor; Barzilay, Regina
Terms of use
Creative Commons Attribution-Noncommercial-ShareAlike http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract
This paper focuses on using natural language descriptions to enhance predictive models in the chemistry field. Conventionally, chemoinformatics models are trained with extensive structured data manually extracted from the literature. In this paper, we introduce TextReact, a novel method that directly augments predictive chemistry with texts retrieved from the literature. TextReact retrieves text descriptions relevant for a given chemical reaction, and then aligns them with the molecular representation of the reaction. This alignment is enhanced via an auxiliary masked LM objective incorporated in the predictor training. We empirically validate the framework on two chemistry tasks: reaction condition recommendation and one-step retrosynthesis. By leveraging text retrieval, TextReact significantly outperforms state-of-the-art chemoinformatics models trained solely on molecular data.
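The retrieve-then-align idea in the abstract can be illustrated with a toy sketch. This is not the TextReact implementation: the abstract does not specify the retriever, how the text is combined with the reaction, or the masking scheme, so everything below is an assumption. The function names (top_k, build_input, mask_tokens), the dot-product scoring, the [SEP] concatenation, the masking rate, the hand-written vectors, and the example sentences are all hypothetical placeholders chosen for illustration.

```python
import random

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k documents with the highest dot-product score."""
    scores = [sum(q * d for q, d in zip(query_vec, vec)) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]

def build_input(reaction_smiles, texts, sep="[SEP]"):
    """Join the reaction string and the retrieved texts into one model input.
    Plain concatenation is an assumption, not the paper's stated design."""
    return f" {sep} ".join([reaction_smiles, *texts])

def mask_tokens(tokens, rate=0.15, seed=0, mask="[MASK]"):
    """Randomly hide tokens; labels keep the original token at masked positions."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        hide = rng.random() < rate
        masked.append(mask if hide else tok)
        labels.append(tok if hide else None)
    return masked, labels

# Made-up toy corpus; the vectors are hand-written stand-ins for encoder outputs.
corpus = [
    "The mixture was stirred in THF at room temperature overnight.",
    "Palladium catalyst and potassium carbonate were added in dioxane.",
    "The crude product was purified by column chromatography.",
]
doc_vecs = [[0.1, 0.9], [0.9, 0.2], [0.3, 0.3]]
query_vec = [1.0, 0.1]  # stand-in embedding of the query reaction
reaction = "Brc1ccccc1.OB(O)c1ccccc1>>c1ccc(-c2ccccc2)cc1"

hits = top_k(query_vec, doc_vecs, k=2)
model_input = build_input(reaction, [corpus[i] for i in hits])
masked, labels = mask_tokens(model_input.split(), rate=0.3)
```

In this sketch, a predictor would read model_input to make its prediction, while an auxiliary masked-token loss would be computed only at the positions where labels is not None. How the two objectives are weighted and trained is described in the paper itself, not here.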
Date issued
2023-12
URI
https://hdl.handle.net/1721.1/165434
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Department of Chemical Engineering
Journal
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Publisher
Association for Computational Linguistics
Citation
Yujie Qian, Zhening Li, Zhengkai Tu, Connor Coley, and Regina Barzilay. 2023. Predictive Chemistry Augmented with Text Retrieval. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12731–12745, Singapore. Association for Computational Linguistics.
Version: Author's final manuscript

Collections
  • MIT Open Access Articles
