| dc.contributor.author | Lin, Min H. | |
| dc.contributor.author | Tu, Zhengkai | |
| dc.contributor.author | Coley, Connor W. | |
| dc.date.accessioned | 2022-03-21T12:56:08Z | |
| dc.date.available | 2022-03-21T12:56:08Z | |
| dc.date.issued | 2022-03-15 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/141316 | |
| dc.description.abstract | Retrosynthesis is at the core of organic chemistry. Recently, the rapid growth of artificial intelligence (AI) has spurred a variety of novel machine learning approaches for data-driven synthesis planning. These methods learn complex patterns from reaction databases in order to predict, for a given product, sets of reactants that can be used to synthesise that product. However, their performance as measured by the top-N accuracy in matching published reaction precedents still leaves room for improvement. This work aims to enhance these models by learning to re-rank their reactant predictions. Specifically, we design and train an energy-based model to re-rank, for each product, the published reaction as the top suggestion and the remaining reactant predictions as lower-ranked. Using the standard USPTO-50k benchmark dataset, we show that re-ranking can significantly improve one-step models: RetroSim, a similarity-based method, from 35.7 to 51.8% top-1 accuracy, and NeuralSym, a deep learning method, from 45.7 to 51.3%. We also show that re-ranking the union of two models’ suggestions can lead to better performance than either alone. However, the state-of-the-art top-1 accuracy is not improved by this method. | en_US |
| dc.publisher | Springer International Publishing | en_US |
| dc.relation.isversionof | https://doi.org/10.1186/s13321-022-00594-8 | en_US |
| dc.rights | Creative Commons Attribution | en_US |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
| dc.source | Springer International Publishing | en_US |
| dc.title | Improving the performance of models for one-step retrosynthesis through re-ranking | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Journal of Cheminformatics. 2022 Mar 15;14(1):15 | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Chemical Engineering | |
| dc.identifier.mitlicense | PUBLISHER_CC | |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
| eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
| dc.date.updated | 2022-03-20T04:15:26Z | |
| dc.language.rfc3066 | en | |
| dc.rights.holder | The Author(s) | |
| dspace.embargo.terms | N | |
| dspace.date.submission | 2022-03-20T04:15:26Z | |
| mit.license | PUBLISHER_CC | |
| mit.metadata.status | Authority Work and Publication Information Needed | en_US |