Show simple item record

dc.contributor.author: Qian, Yujie
dc.contributor.author: Guo, Jiang
dc.contributor.author: Tu, Zhengkai
dc.contributor.author: Coley, Connor W
dc.contributor.author: Barzilay, Regina
dc.date.accessioned: 2025-02-07T20:19:07Z
dc.date.available: 2025-02-07T20:19:07Z
dc.date.issued: 2023-07-10
dc.identifier.uri: https://hdl.handle.net/1721.1/158184
dc.description.abstract: Reaction diagram parsing is the task of extracting reaction schemes from a diagram in the chemistry literature. The reaction diagrams can be arbitrarily complex; thus, robustly parsing them into structured data is an open challenge. In this paper, we present RxnScribe, a machine learning model for parsing reaction diagrams of varying styles. We formulate this structured prediction task with a sequence generation approach, which condenses the traditional pipeline into an end-to-end model. We train RxnScribe on a dataset of 1378 diagrams and evaluate it with cross validation, achieving an 80.0% soft match F1 score, with significant improvements over previous models. Our code and data are publicly available at https://github.com/thomas0809/RxnScribe. [en_US]
dc.language.iso: en
dc.publisher: American Chemical Society (ACS) [en_US]
dc.relation.isversionof: 10.1021/acs.jcim.3c00439 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-ShareAlike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: arxiv [en_US]
dc.title: RxnScribe: A Sequence Generation Model for Reaction Diagram Parsing [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Yujie Qian, Jiang Guo, Zhengkai Tu, Connor W. Coley, and Regina Barzilay. Journal of Chemical Information and Modeling 2023 63 (13), 4030-4041. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Chemical Engineering [en_US]
dc.relation.journal: Journal of Chemical Information and Modeling [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2025-02-07T20:13:40Z
dspace.orderedauthors: Qian, Y; Guo, J; Tu, Z; Coley, CW; Barzilay, R [en_US]
dspace.date.submission: 2025-02-07T20:13:41Z
mit.journal.volume: 63 [en_US]
mit.journal.issue: 13 [en_US]
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

