MOFSimplify, machine learning models with extracted stability data of three thousand metal–organic frameworks
Author(s)
Nandy, Aditya; Terrones, Gianmarco; Arunachalam, Naveen; Duan, Chenru; Kastner, David W; Kulik, Heather J; ...
Publisher with Creative Commons License
Creative Commons Attribution
Abstract
We report a workflow and the output of a natural language processing (NLP)-based procedure to mine the extant metal–organic framework (MOF) literature describing structurally characterized MOFs and their solvent removal and thermal stabilities. We obtain over 2,000 solvent removal stability measures from text mining and 3,000 thermal decomposition temperatures from thermogravimetric analysis data. We assess the validity of our NLP methods and the accuracy of our extracted data by comparing to a hand-labeled subset. Machine learning (ML, i.e., artificial neural network) models trained on this data using graph- and pore-geometry-based representations enable prediction of stability on new MOFs with quantified uncertainty. Our web interface, MOFSimplify, provides users access to our curated data and enables them to harness that data for predictions on new MOFs. MOFSimplify also encourages community feedback on existing data and on ML model predictions for community-based active learning for improved MOF stability models.
Date issued
2022-12
Department
Massachusetts Institute of Technology. Department of Chemical Engineering; Massachusetts Institute of Technology. Department of Chemistry; Massachusetts Institute of Technology. Department of Biological Engineering
Journal
Scientific Data
Publisher
Springer Science and Business Media LLC
Citation
Nandy, Aditya, Terrones, Gianmarco, Arunachalam, Naveen, Duan, Chenru, Kastner, David W et al. 2022. "MOFSimplify, machine learning models with extracted stability data of three thousand metal–organic frameworks." Scientific Data, 9 (1).
Version: Final published version
ISSN
2052-4463