dc.contributor.author | Harper, Daniel R | |
dc.contributor.author | Nandy, Aditya | |
dc.contributor.author | Arunachalam, Naveen | |
dc.contributor.author | Duan, Chenru | |
dc.contributor.author | Janet, Jon Paul | |
dc.contributor.author | Kulik, Heather J | |
dc.date.accessioned | 2022-09-19T12:10:40Z | |
dc.date.available | 2022-09-19T12:10:40Z | |
dc.date.issued | 2022 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/145470 | |
dc.description.abstract | Strategies for machine-learning (ML)-accelerated discovery that are general
across materials composition spaces are essential, but demonstrations of ML
have been primarily limited to narrow composition variations. By addressing the
scarcity of data in promising regions of chemical space for challenging targets
like open-shell transition-metal complexes, general representations and
transferable ML models that leverage known relationships in existing data will
accelerate discovery. Over a large set (ca. 1000) of isovalent transition-metal
complexes, we quantify evident relationships for different properties (i.e.,
spin-splitting and ligand dissociation) between rows of the periodic table
(i.e., 3d/4d metals and 2p/3p ligands). We demonstrate an extension to the
graph-based revised autocorrelation (RAC) representation (i.e., eRAC) that
incorporates the effective nuclear charge alongside the nuclear charge
heuristic, which otherwise overestimates the dissimilarity of isovalent complexes. To
address the common challenge of discovery in a new space where data is limited,
we introduce a transfer learning approach in which we seed models trained on a
large amount of data from one row of the periodic table with a small number of
data points from the additional row. We demonstrate the synergistic value of
the eRACs alongside this transfer learning strategy to consistently improve
model performance. Analysis of these models highlights how the approach
succeeds by reordering the distances between complexes to be more consistent
with the periodic table, a property we expect to be broadly useful for other
materials domains. | en_US |
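A minimal, illustrative sketch of the transfer learning ("seeding") strategy described in the abstract, assuming a generic scikit-learn workflow: a model is first trained on abundant data from one row of the periodic table (e.g., 3d complexes) and then fine-tuned with a small number of points from another row (e.g., 4d complexes). The descriptor matrices, network size, and hyperparameters below are placeholders, not the authors' actual pipeline.

```python
# Hedged sketch of seeding a 3d-trained model with a few 4d data points.
# Feature matrices, targets, and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder descriptor matrices: rows are complexes, columns are RAC-style
# features assumed to include effective-nuclear-charge terms (eRACs).
X_3d, y_3d = rng.normal(size=(900, 64)), rng.normal(size=900)  # large 3d set
X_4d, y_4d = rng.normal(size=(30, 64)), rng.normal(size=30)    # small 4d seed

scaler = StandardScaler().fit(X_3d)

# Step 1: train on the abundant 3d data.
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                     warm_start=True, random_state=0)
model.fit(scaler.transform(X_3d), y_3d)

# Step 2: "seed" the model with a handful of 4d points by continuing
# optimization from the 3d-trained weights (warm_start=True) at a
# reduced learning rate.
model.set_params(max_iter=200, learning_rate_init=1e-4)
model.fit(scaler.transform(X_4d), y_4d)
```

In this sketch, warm_start=True makes the second fit call resume from the 3d-trained weights rather than reinitializing, which is one simple way to realize the seeding idea; the paper's own model architecture and training protocol are described in the article itself.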
dc.language.iso | en | |
dc.publisher | AIP Publishing | en_US |
dc.relation.isversionof | 10.1063/5.0082964 | en_US |
dc.rights | Creative Commons Attribution 4.0 International license | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
dc.source | American Institute of Physics (AIP) | en_US |
dc.title | Representations and strategies for transferable machine learning improve model performance in chemical discovery | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Harper, Daniel R, Nandy, Aditya, Arunachalam, Naveen, Duan, Chenru, Janet, Jon Paul et al. 2022. "Representations and strategies for transferable machine learning improve model performance in chemical discovery." The Journal of Chemical Physics, 156 (7). | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Chemistry | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Chemical Engineering | en_US |
dc.relation.journal | The Journal of Chemical Physics | en_US |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
dc.date.updated | 2022-09-19T11:59:37Z | |
dspace.orderedauthors | Harper, DR; Nandy, A; Arunachalam, N; Duan, C; Janet, JP; Kulik, HJ | en_US |
dspace.date.submission | 2022-09-19T11:59:42Z | |
mit.journal.volume | 156 | en_US |
mit.journal.issue | 7 | en_US |
mit.license | PUBLISHER_CC | |
mit.metadata.status | Authority Work and Publication Information Needed | en_US |