Show simple item record

dc.contributor.author: Tuckute, Greta
dc.contributor.author: Feather, Jenelle
dc.contributor.author: Boebinger, Dana
dc.contributor.author: McDermott, Josh H
dc.date.accessioned: 2026-04-22T17:50:27Z
dc.date.available: 2026-04-22T17:50:27Z
dc.date.issued: 2023-12-13
dc.identifier.uri: https://hdl.handle.net/1721.1/165640
dc.description.abstract: Models that predict brain responses to stimuli provide one measure of understanding of a sensory system and have many potential applications in science and engineering. Deep artificial neural networks have emerged as the leading such predictive models of the visual system but are less explored in audition. Prior work provided examples of audio-trained neural networks that produced good predictions of auditory cortical fMRI responses and exhibited correspondence between model stages and brain regions, but left it unclear whether these results generalize to other neural network models and, thus, how to further improve models in this domain. We evaluated model-brain correspondence for publicly available audio neural network models along with in-house models trained on 4 different tasks. Most tested models outpredicted standard spectrotemporal filter-bank models of auditory cortex and exhibited systematic model-brain correspondence: Middle stages best predicted primary auditory cortex, while deep stages best predicted non-primary cortex. However, some state-of-the-art models produced substantially worse brain predictions. Models trained to recognize speech in background noise produced better brain predictions than models trained to recognize speech in quiet, potentially because hearing in noise imposes constraints on biological auditory representations. The training task influenced the prediction quality for specific cortical tuning properties, with best overall predictions resulting from models trained on multiple tasks. The results generally support the promise of deep neural networks as models of audition, though they also indicate that current models do not explain auditory cortical responses in their entirety. [en_US]
dc.language.iso: en
dc.publisher: Public Library of Science (PLoS) [en_US]
dc.relation.isversionof: https://doi.org/10.1371/journal.pbio.3002366 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Public Library of Science (PLoS) [en_US]
dc.title: Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Tuckute G, Feather J, Boebinger D, McDermott JH (2023) Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions. PLoS Biol 21(12): e3002366. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences [en_US]
dc.contributor.department: McGovern Institute for Brain Research at MIT [en_US]
dc.contributor.department: Center for Brains, Minds, and Machines [en_US]
dc.relation.journal: PLOS Biology [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2026-04-22T17:43:30Z
dspace.orderedauthors: Tuckute, G; Feather, J; Boebinger, D; McDermott, JH [en_US]
dspace.date.submission: 2026-04-22T17:43:32Z
mit.journal.volume: 21 [en_US]
mit.journal.issue: 12 [en_US]
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

