Show simple item record

dc.contributor.author: Davis, Abe
dc.contributor.author: Rubinstein, Michael
dc.contributor.author: Wadhwa, Neal
dc.contributor.author: Mysore, Gautham J.
dc.contributor.author: Durand, Fredo
dc.contributor.author: Freeman, William T.
dc.date.accessioned: 2015-11-24T14:22:15Z
dc.date.available: 2015-11-24T14:22:15Z
dc.date.issued: 2014-07
dc.identifier.issn: 0730-0301
dc.identifier.uri: http://hdl.handle.net/1721.1/100023
dc.description.abstract: When sound hits an object, it causes small vibrations of the object's surface. We show how, using only high-speed video of the object, we can extract those minute vibrations and partially recover the sound that produced them, allowing us to turn everyday objects---a glass of water, a potted plant, a box of tissues, or a bag of chips---into visual microphones. We recover sounds from high-speed footage of a variety of objects with different properties, and use both real and simulated data to examine some of the factors that affect our ability to visually recover sound. We evaluate the quality of recovered sounds using intelligibility and SNR metrics and provide input and recovered audio samples for direct comparison. We also explore how to leverage the rolling shutter in regular consumer cameras to recover audio from standard frame-rate videos, and use the spatial resolution of our method to visualize how sound-related vibrations vary over an object's surface, which we can use to recover the vibration modes of an object.
dc.description.sponsorship: Qatar Computing Research Institute
dc.description.sponsorship: National Science Foundation (U.S.) (CGV-1111415)
dc.description.sponsorship: National Science Foundation (U.S.). Graduate Research Fellowship (Grant 1122374)
dc.description.sponsorship: Massachusetts Institute of Technology. Department of Mathematics
dc.description.sponsorship: Microsoft Research (PhD Fellowship)
dc.language.iso: en_US
dc.publisher: Association for Computing Machinery (ACM)
dc.relation.isversionof: http://dx.doi.org/10.1145/2601097.2601119
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source: MIT web domain
dc.title: The visual microphone: Passive recovery of sound from video
dc.type: Article
dc.identifier.citation: Abe Davis, Michael Rubinstein, Neal Wadhwa, Gautham J. Mysore, Fredo Durand, and William T. Freeman. 2014. The visual microphone: passive recovery of sound from video. ACM Trans. Graph. 33, 4, Article 79 (July 2014), 10 pages.
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics
dc.contributor.mitauthor: Davis, Abe
dc.contributor.mitauthor: Rubinstein, Michael
dc.contributor.mitauthor: Wadhwa, Neal
dc.contributor.mitauthor: Durand, Fredo
dc.contributor.mitauthor: Freeman, William T.
dc.relation.journal: ACM Transactions on Graphics
dc.eprint.version: Author's final manuscript
dc.type.uri: http://purl.org/eprint/type/ConferencePaper
eprint.status: http://purl.org/eprint/status/NonPeerReviewed
dspace.orderedauthors: Davis, Abe; Rubinstein, Michael; Wadhwa, Neal; Mysore, Gautham J.; Durand, Fredo; Freeman, William T.
dc.identifier.orcid: https://orcid.org/0000-0002-3707-3807
dc.identifier.orcid: https://orcid.org/0000-0003-1469-2696
dc.identifier.orcid: https://orcid.org/0000-0002-2902-6752
dc.identifier.orcid: https://orcid.org/0000-0001-9919-069X
dc.identifier.orcid: https://orcid.org/0000-0002-2231-7995
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Complete

