dc.contributor.author: Frey, Nathan C
dc.contributor.author: Soklaski, Ryan
dc.contributor.author: Axelrod, Simon
dc.contributor.author: Samsi, Siddharth
dc.contributor.author: Gómez-Bombarelli, Rafael
dc.contributor.author: Coley, Connor W
dc.contributor.author: Gadepally, Vijay
dc.date.accessioned: 2025-02-11T21:05:33Z
dc.date.available: 2025-02-11T21:05:33Z
dc.date.issued: 2023
dc.identifier.uri: https://hdl.handle.net/1721.1/158195
dc.description.abstract: Massive scale, in terms of both data availability and computation, enables important breakthroughs in key application areas of deep learning such as natural language processing and computer vision. There is emerging evidence that scale may be a key ingredient in scientific deep learning, but the importance of physical priors in scientific domains makes the strategies and benefits of scaling uncertain. Here we investigate neural-scaling behaviour in large chemical models by varying model and dataset sizes over many orders of magnitude, studying models with over one billion parameters, pre-trained on datasets of up to ten million datapoints. We consider large language models for generative chemistry and graph neural networks for machine-learned interatomic potentials. We investigate the interplay between physical priors and scale and discover empirical neural-scaling relations for language models in chemistry with a scaling exponent of 0.17 for the largest dataset size considered, and a scaling exponent of 0.26 for equivariant graph neural network interatomic potentials.
dc.language.iso: en
dc.publisher: Springer Science and Business Media LLC
dc.relation.isversionof: 10.1038/s42256-023-00740-3
dc.rights: Creative Commons Attribution
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.source: Springer Science and Business Media LLC
dc.title: Neural scaling of deep chemical models
dc.type: Article
dc.identifier.citation: Frey, N.C., Soklaski, R., Axelrod, S. et al. Neural scaling of deep chemical models. Nat Mach Intell 5, 1297–1305 (2023).
dc.contributor.department: Lincoln Laboratory
dc.contributor.department: Massachusetts Institute of Technology. Department of Materials Science and Engineering
dc.contributor.department: Massachusetts Institute of Technology. Department of Chemical Engineering
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.relation.journal: Nature Machine Intelligence
dc.eprint.version: Final published version
dc.type.uri: http://purl.org/eprint/type/JournalArticle
eprint.status: http://purl.org/eprint/status/PeerReviewed
dc.date.updated: 2025-02-11T20:58:27Z
dspace.orderedauthors: Frey, NC; Soklaski, R; Axelrod, S; Samsi, S; Gómez-Bombarelli, R; Coley, CW; Gadepally, V
dspace.date.submission: 2025-02-11T20:58:30Z
mit.journal.volume: 5
mit.journal.issue: 11
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed