
dc.contributor.author: Zhang, Zhoutong
dc.contributor.author: Wu, Jiajun
dc.contributor.author: Li, Qiujia
dc.contributor.author: Huang, Zhengjia
dc.contributor.author: Traer, James
dc.contributor.author: McDermott, Josh H.
dc.contributor.author: Tenenbaum, Joshua B.
dc.contributor.author: Freeman, William T.
dc.date.accessioned: 2021-11-05T13:52:25Z
dc.date.available: 2021-11-05T13:52:25Z
dc.date.issued: 2017-10
dc.identifier.uri: https://hdl.handle.net/1721.1/137459
dc.description.abstract: © 2017 IEEE. Humans infer rich knowledge of objects from both auditory and visual cues. Building a machine of such competency, however, is very challenging, due to the great difficulty in capturing large-scale, clean data of objects with both their appearance and the sound they make. In this paper, we present a novel, open-source pipeline that generates audiovisual data, purely from 3D object shapes and their physical properties. Through comparison with audio recordings and human behavioral studies, we validate the accuracy of the sounds it generates. Using this generative model, we are able to construct a synthetic audio-visual dataset, namely Sound-20K, for object perception tasks. We demonstrate that auditory and visual information play complementary roles in object perception, and further, that the representation learned on synthetic audio-visual data can transfer to real-world scenarios. [en_US]
dc.language.iso: en
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.relation.isversionof: 10.1109/iccv.2017.141 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: MIT web domain [en_US]
dc.title: Generative Modeling of Audible Shapes for Object Perception [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Zhang, Zhoutong, Wu, Jiajun, Li, Qiujia, Huang, Zhengjia, Traer, James et al. 2017. "Generative Modeling of Audible Shapes for Object Perception."
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department: Center for Brains, Minds, and Machines
dc.contributor.department: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2019-05-28T15:06:20Z
dspace.date.submission: 2019-05-28T15:06:21Z
mit.metadata.status: Authority Work and Publication Information Needed [en_US]
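
The abstract above describes a pipeline that synthesizes the sound an object makes purely from its 3D shape and physical properties. As an illustration only, and not the paper's actual implementation, the sketch below shows the standard modal-synthesis idea behind physics-based impact audio: the sound is approximated as a sum of exponentially decaying sinusoids, whose frequencies, dampings, and amplitudes would in practice be derived from a modal analysis of the object's shape and material. All names and parameter values here are hypothetical.

    import numpy as np

    def synthesize_impact(freqs_hz, dampings, amplitudes, duration_s=1.0, sr=44100):
        # Modal-synthesis sketch: each vibration mode contributes one
        # exponentially decaying sinusoid; the impact sound is their sum.
        t = np.arange(int(duration_s * sr)) / sr
        signal = np.zeros_like(t)
        for f, d, a in zip(freqs_hz, dampings, amplitudes):
            signal += a * np.exp(-d * t) * np.sin(2.0 * np.pi * f * t)
        peak = np.max(np.abs(signal))
        return signal / peak if peak > 0 else signal

    # Hypothetical modal parameters for a small struck object (not from the paper):
    waveform = synthesize_impact(freqs_hz=[523.0, 1310.0, 2490.0],
                                 dampings=[6.0, 9.0, 14.0],
                                 amplitudes=[1.0, 0.6, 0.3])

In a full pipeline of the kind the abstract describes, the modal parameters would typically be computed from the object mesh and its material properties, and each synthesized waveform would be paired with rendered views of the same shape to form an audio-visual training example.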

