Notice

This is not the latest version of this item. The latest version can be found at: https://dspace.mit.edu/handle/1721.1/139633.2

dc.contributor.author: Chung, Yunsie
dc.contributor.author: Vermeire, Florence H.
dc.contributor.author: Wu, Haoyang
dc.contributor.author: Walker, Pierre J.
dc.contributor.author: Abraham, Michael H.
dc.contributor.author: Green, William H.
dc.date.accessioned: 2022-01-20T13:35:35Z
dc.date.available: 2022-01-20T13:35:35Z
dc.date.issued: 2022-01-19
dc.identifier.issn: 1549-9596
dc.identifier.issn: 1549-960X
dc.identifier.uri: https://hdl.handle.net/1721.1/139633
dc.description.abstract: We present a group contribution method (SoluteGC) and a machine learning model (SoluteML) to predict the Abraham solute parameters, as well as a machine learning model (DirectML) to predict solvation free energy and enthalpy at 298 K. The proposed group contribution method uses atom-centered functional groups with corrections for ring and polycyclic strain while the machine learning models adopt a directed message passing neural network. The solute parameters predicted from SoluteGC and SoluteML are used to calculate solvation energy and enthalpy via linear free energy relationships. Extensive data sets containing 8366 solute parameters, 20,253 solvation free energies, and 6322 solvation enthalpies are compiled in this work to train the models. The three models are each evaluated on the same test sets using both random and substructure-based solute splits for solvation energy and enthalpy predictions. The results show that the DirectML model is superior to the SoluteML and SoluteGC models for both predictions and can provide accuracy comparable to that of advanced quantum chemistry methods. Yet, even though the DirectML model performs better in general, all three models are useful for various purposes. Uncertain predicted values can be identified by comparing the three models, and when the 3 models are combined together, they can provide even more accurate predictions than any one of them individually. Finally, we present our compiled solute parameter, solvation energy, and solvation enthalpy databases (SoluteDB, dGsolvDBx, dHsolvDB) and provide public access to our final prediction models through a simple web-based tool, software packages, and source code. [en_US]
dc.description.sponsorship: National Science Foundation (NSF) [en_US]
dc.description.sponsorship: Eni S.p.A., Machine Learning for Pharmaceutical Discovery and Synthesis Consortium (MLPDS), Belgian American Educational Foundation (BAEF), MolSSI [en_US]
dc.publisher: American Chemical Society (ACS) [en_US]
dc.relation.isversionof: 10.1021/acs.jcim.1c01103 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: Prof. Green [en_US]
dc.subject: Library and Information Sciences [en_US]
dc.subject: Computer Science Applications [en_US]
dc.subject: General Chemical Engineering [en_US]
dc.subject: General Chemistry [en_US]
dc.title: Group Contribution and Machine Learning Approaches to Predict Abraham Solute Parameters, Solvation Free Energy, and Solvation Enthalpy [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Chung, Yunsie, Vermeire, Florence H., Wu, Haoyang, Walker, Pierre J., Abraham, Michael H. et al. 2022. "Group Contribution and Machine Learning Approaches to Predict Abraham Solute Parameters, Solvation Free Energy, and Solvation Enthalpy." Journal of Chemical Information and Modeling.
dc.relation.journal: Journal of Chemical Information and Modeling [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.identifier.doi: 10.1021/acs.jcim.1c01103
dspace.date.submission: 2022-01-19T21:12:22Z
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]