
dc.contributor.author	Isola, Phillip
dc.contributor.author	McDermott, Josh
dc.contributor.author	Adelson, Edward H.
dc.contributor.author	Freeman, William T.
dc.contributor.author	Torralba, Antonio
dc.contributor.author	Owens, Andrew Hale
dc.date.accessioned	2017-12-08T17:59:29Z
dc.date.available	2017-12-08T17:59:29Z
dc.date.issued	2016-06
dc.identifier.isbn	978-1-4673-8851-1
dc.identifier.issn	1063-6919
dc.identifier.uri	http://hdl.handle.net/1721.1/112659
dc.description.abstract	Objects make distinctive sounds when they are hit or scratched. These sounds reveal aspects of an object's material properties, as well as the actions that produced them. In this paper, we propose the task of predicting what sound an object makes when struck as a way of studying physical interactions within a visual scene. We present an algorithm that synthesizes sound from silent videos of people hitting and scratching objects with a drumstick. This algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We show that the sounds predicted by our model are realistic enough to fool participants in a "real or fake" psychophysical experiment, and that they convey significant information about material properties and physical interactions.	en_US
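The abstract's pipeline (predict per-frame sound features, then recover a waveform by example-based synthesis) can be sketched in miniature. This is a toy illustration, not the authors' implementation: the RNN is replaced by stand-in predicted features, and the example database (`example_feats`, `example_snippets`) is hypothetical random data. The only step modeled is the nearest-neighbor lookup that maps each predicted feature to an example sound snippet.

```python
import numpy as np

def example_based_synthesis(predicted_feats, example_feats, example_snippets):
    """Toy example-based synthesis: for each predicted sound feature,
    pick the nearest example feature and emit its waveform snippet."""
    out = []
    for f in predicted_feats:
        # Nearest neighbor in feature space (Euclidean distance).
        dists = np.linalg.norm(example_feats - f, axis=1)
        out.append(example_snippets[np.argmin(dists)])
    # Concatenate per-frame snippets into one waveform.
    return np.concatenate(out)

# Hypothetical stand-in data: 3 example sounds, each with a 4-D feature
# vector and a 100-sample waveform snippet. A real system would obtain
# predicted_feats from an RNN run on video frames.
rng = np.random.default_rng(0)
example_feats = rng.normal(size=(3, 4))
example_snippets = rng.normal(size=(3, 100))
# Predicted features: noisy copies of examples 2, 0, 1.
predicted_feats = example_feats[[2, 0, 1]] + 0.01 * rng.normal(size=(3, 4))

waveform = example_based_synthesis(predicted_feats, example_feats, example_snippets)
print(waveform.shape)  # (300,)
```

Each of the three predicted frames retrieves its nearest example, so the output is three 100-sample snippets joined end to end.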
dc.description.sponsorship	National Science Foundation (U.S.) (grant 6924450)	en_US
dc.description.sponsorship	National Science Foundation (U.S.) (grant 6926677)	en_US
dc.description.sponsorship	Shell Oil Company	en_US
dc.description.sponsorship	Microsoft Corporation	en_US
dc.language.iso	en_US
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)	en_US
dc.relation.isversionof	http://dx.doi.org/10.1109/CVPR.2016.264	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	arXiv	en_US
dc.title	Visually Indicated Sounds	en_US
dc.type	Article	en_US
dc.identifier.citation	Owens, Andrew, Phillip Isola, Josh McDermott, Antonio Torralba, Edward H. Adelson, and William T. Freeman. "Visually Indicated Sounds." 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2016). © 2016 Institute of Electrical and Electronics Engineers (IEEE)	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.contributor.mitauthor	Torralba, Antonio
dc.contributor.mitauthor	Owens, Andrew Hale
dc.relation.journal	IEEE Conference on Computer Vision and Pattern Recognition, 2016. CVPR 2016	en_US
dc.eprint.version	Original manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dspace.orderedauthors	Owens, Andrew; Isola, Phillip; McDermott, Josh; Torralba, Antonio; Adelson, Edward H.; Freeman, William T.	en_US
dspace.embargo.terms	N	en_US
dc.identifier.orcid	https://orcid.org/0000-0003-4915-0256
dc.identifier.orcid	https://orcid.org/0000-0001-9020-9593
mit.license	OPEN_ACCESS_POLICY	en_US
mit.metadata.status	Complete
