Show simple item record

dc.contributor.author: Qian, Yujie
dc.contributor.author: Li, Zhening
dc.contributor.author: Tu, Zhengkai
dc.contributor.author: Coley, Connor
dc.contributor.author: Barzilay, Regina
dc.date.accessioned: 2026-04-14T19:31:27Z
dc.date.available: 2026-04-14T19:31:27Z
dc.date.issued: 2023-12
dc.identifier.uri: https://hdl.handle.net/1721.1/165434
dc.description.abstract: This paper focuses on using natural language descriptions to enhance predictive models in the chemistry field. Conventionally, chemoinformatics models are trained with extensive structured data manually extracted from the literature. In this paper, we introduce TextReact, a novel method that directly augments predictive chemistry with texts retrieved from the literature. TextReact retrieves text descriptions relevant for a given chemical reaction, and then aligns them with the molecular representation of the reaction. This alignment is enhanced via an auxiliary masked LM objective incorporated in the predictor training. We empirically validate the framework on two chemistry tasks: reaction condition recommendation and one-step retrosynthesis. By leveraging text retrieval, TextReact significantly outperforms state-of-the-art chemoinformatics models trained solely on molecular data. [en_US]
dc.language.iso: en
dc.publisher: Association for Computational Linguistics [en_US]
dc.relation.isversionof: 10.18653/v1/2023.emnlp-main.784 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-ShareAlike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: author [en_US]
dc.title: Predictive Chemistry Augmented with Text Retrieval [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Yujie Qian, Zhening Li, Zhengkai Tu, Connor Coley, and Regina Barzilay. 2023. Predictive Chemistry Augmented with Text Retrieval. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12731–12745, Singapore. Association for Computational Linguistics. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Chemical Engineering [en_US]
dc.relation.journal: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2026-04-14T19:24:18Z
dspace.orderedauthors: Qian, Y; Li, Z; Tu, Z; Coley, C; Barzilay, R [en_US]
dspace.date.submission: 2026-04-14T19:24:19Z
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

