dc.contributor.author: Lütjens, Björn
dc.contributor.author: Ferrari, Raffaele
dc.contributor.author: Watson‐Parris, Duncan
dc.contributor.author: Selin, Noelle E
dc.date.accessioned: 2025-11-07T15:37:32Z
dc.date.available: 2025-11-07T15:37:32Z
dc.date.issued: 2025-08-26
dc.identifier.uri: https://hdl.handle.net/1721.1/163594
dc.description.abstract: Full-complexity Earth system models (ESMs) are computationally very expensive, limiting their use in exploring the climate outcomes of multiple emission pathways. More efficient emulators that approximate ESMs can directly map emissions onto climate outcomes, and benchmarks are being used to evaluate their accuracy on standardized tasks and data sets. We investigate a popular benchmark in data-driven climate emulation, ClimateBench, on which deep learning-based emulators are currently achieving the best performance. We compare these deep learning emulators with a linear regression-based emulator, akin to pattern scaling, and show that it outperforms the incumbent 100M-parameter deep learning foundation model, ClimaX, on 3 out of 4 regionally resolved climate variables, notably surface temperature and precipitation. While emulating surface temperature is expected to be predominantly linear, this result is surprising for emulating precipitation. Precipitation is a much more noisy variable, and we show that deep learning emulators can overfit to internal variability noise at low frequencies, degrading their performance in comparison to a linear emulator. We address the issue of overfitting by increasing the number of climate simulations per emission pathway (from 3 to 50) and updating the benchmark targets with the respective ensemble averages from the MPI-ESM1.2-LR model. Using the new targets, we show that linear pattern scaling continues to be more accurate on temperature, but can be outperformed by a deep learning-based technique for emulating precipitation. We publish our code and data at https://github.com/blutjens/climate-emulator. [en_US]
dc.language.iso: en
dc.publisher: Wiley [en_US]
dc.relation.isversionof: https://doi.org/10.1029/2024MS004619 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Wiley [en_US]
dc.title: The Impact of Internal Variability on Benchmarking Deep Learning Climate Emulators [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Lütjens, B., Ferrari, R., Watson-Parris, D., & Selin, N. E. (2025). The impact of internal variability on benchmarking deep learning climate emulators. Journal of Advances in Modeling Earth Systems, 17, e2024MS004619. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Earth, Atmospheric, and Planetary Sciences [en_US]
dc.contributor.department: MIT Institute for Data, Systems, and Society [en_US]
dc.relation.journal: Journal of Advances in Modeling Earth Systems [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2025-11-07T15:28:06Z
dspace.orderedauthors: Lütjens, B; Ferrari, R; Watson‐Parris, D; Selin, NE [en_US]
dspace.date.submission: 2025-11-07T15:28:11Z
mit.journal.volume: 17 [en_US]
mit.journal.issue: 8 [en_US]
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

