
dc.contributor.author: Janet, Jon Paul
dc.contributor.author: Kulik, Heather J.
dc.date.accessioned: 2020-02-20T18:25:58Z
dc.date.available: 2020-02-20T18:25:58Z
dc.date.issued: 2017-11
dc.date.submitted: 2017-10
dc.identifier.issn: 1089-5639
dc.identifier.issn: 1520-5215
dc.identifier.uri: https://hdl.handle.net/1721.1/123835
dc.description.abstract: Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry, where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships of heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard ACs to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate that standard ACs outperform other presently available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on set-aside test molecules as low as 6 kcal/mol. For inorganic chemistry, our RACs yield 1 kcal/mol ML MUEs on set-aside test molecules in spin-state splitting, in comparison to 15–20× higher errors for feature sets that encode whole-molecule structural information. Systematic feature selection methods including univariate filtering, recursive feature elimination, and direct optimization (e.g., random forest and LASSO) are compared. Random-forest- or LASSO-selected subsets 4–5× smaller than the full RAC set produce sub- to 1 kcal/mol spin-splitting MUEs, with good transferability to metal–ligand bond length prediction (0.004–0.005 Å MUE) and redox potential on a smaller data set (0.2–0.3 eV MUE). Evaluation of feature selection results across property sets reveals the relative importance of local, electronic descriptors (e.g., electronegativity, atomic number) in spin-splitting and distal, steric effects in redox potential and bond lengths. [en_US]
dc.description.sponsorship: United States. Office of Naval Research (Grant N00014-17-1-2956) [en_US]
dc.description.sponsorship: National Science Foundation (Grant ECCS-1449291) [en_US]
dc.description.sponsorship: National Science Foundation (Grant CBET-1704266) [en_US]
dc.publisher: American Chemical Society (ACS) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1021/acs.jpca.7b08750 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Prof. Kulik [en_US]
dc.title: Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure–Property Relationships [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Janet, Jon Paul and Heather J. Kulik. "Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure–Property Relationships." Journal of Physical Chemistry A 121, 46 (November 2017): 8939-8954 © 2017 American Chemical Society [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Chemical Engineering [en_US]
dc.relation.journal: Journal of Physical Chemistry A [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dspace.date.submission: 2020-02-13T02:28:04Z
mit.journal.volume: 121 [en_US]
mit.journal.issue: 46 [en_US]
mit.license: PUBLISHER_POLICY
mit.metadata.status: Complete

