A Computational Model for Combinatorial Generalization in Physical Perception from Sound

Wang, Yunyun; Gan, Chuang; Siegel, Max Harmon; Zhang, Zhoutong; Wu, Jiajun; Tenenbaum, Joshua B

dc.contributor.author	Wang, Yunyun
dc.contributor.author	Gan, Chuang
dc.contributor.author	Siegel, Max Harmon
dc.contributor.author	Zhang, Zhoutong
dc.contributor.author	Wu, Jiajun
dc.contributor.author	Tenenbaum, Joshua B
dc.date.accessioned	2021-12-10T21:12:51Z
dc.date.available	2021-12-07T13:44:38Z
dc.date.available	2021-12-10T21:12:51Z
dc.date.issued	2019
dc.identifier.uri	https://hdl.handle.net/1721.1/138340.2
dc.description.abstract	Humans possess the unique ability of combinatorial generalization in auditory perception: given novel auditory stimuli, humans perform auditory scene analysis and infer causal physical interactions based on prior knowledge. Could we build a computational model that achieves human-like combinatorial generalization? In this paper, we present a case study on box-shaking: having heard only the sound of a single ball moving in a box, we seek to interpret the sound of two or three balls of different materials. To solve this task, we propose a hybrid model with two components: a neural network for perception, and a physical audio engine for simulation. We use the outcome of the network as an initial guess and perform MCMC sampling with the audio engine to improve the result. Combining neural networks with a physical audio engine, our hybrid model achieves combinatorial generalization efficiently and accurately in auditory scene perception.	en_US
dc.language.iso	en
dc.publisher	Cognitive Computational Neuroscience	en_US
dc.relation.isversionof	10.32470/CCN.2019.1276-0	en_US
dc.rights	Creative Commons Attribution 3.0 unported license	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/3.0/	en_US
dc.source	Cognitive Computational Neuroscience	en_US
dc.title	A Computational Model for Combinatorial Generalization in Physical Perception from Sound	en_US
dc.type	Article	en_US
dc.identifier.citation	Wang, Yunyun, Gan, Chuang, Siegel, Max, Zhang, Zhoutong, Wu, Jiajun et al. 2019. "A Computational Model for Combinatorial Generalization in Physical Perception from Sound." 2019 Conference on Cognitive Computational Neuroscience.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.department	MIT-IBM Watson AI Lab	en_US
dc.contributor.department	Center for Brains, Minds, and Machines	en_US
dc.relation.journal	2019 Conference on Cognitive Computational Neuroscience	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2021-12-07T13:39:22Z
dspace.orderedauthors	Wang, Y; Gan, C; Siegel, M; Zhang, Z; Wu, J; Tenenbaum, J	en_US
dspace.date.submission	2021-12-07T13:39:24Z
mit.license	PUBLISHER_CC
mit.metadata.status	Publication Information Needed	en_US

Files in this item

Name: 0000751.pdf
Size: 1.445Mb
Format: Unknown
Description: Published version

This item appears in the following Collection(s)

MIT Open Access Articles

Version	Item	Date	Summary
2	1721.1/138340.2*	2021-12-10T21:00:31Z	Verified or entered authority metadata.
1	1721.1/138340	2021-12-07T13:44:38Z
