MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Ambient Sound Provides Supervision for Visual Learning

Author(s)
Owens, Andrew Hale; Wu, Jiajun; McDermott, Joshua H.; Freeman, William T.; Torralba, Antonio
Thumbnail
DownloadAmbient sound provides.pdf (7.436Mb)
OPEN_ACCESS_POLICY

Open Access Policy

Creative Commons Attribution-Noncommercial-Share Alike

Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Metadata
Show full item record
Abstract
The sound of crashing waves, the roar of fast-moving cars – sound conveys important information about the objects in our surroundings. In this work, we show that ambient sounds can be used as a supervisory signal for learning visual models. To demonstrate this, we train a convolutional neural network to predict a statistical summary of the sound associated with a video frame. We show that, through this process, the network learns a representation that conveys information about objects and scenes. We evaluate this representation on several recognition tasks, finding that its performance is comparable to that of other state-of-the-art unsupervised learning methods. Finally, we show through visualizations that the network learns units that are selective to objects that are often associated with characteristic sounds.
Date issued
2016-09
URI
http://hdl.handle.net/1721.1/111172
Department
Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Lecture Notes in Computer Science
Publisher
Springer-Verlag
Citation
Owens, Andrew, et al. “Ambient Sound Provides Supervision for Visual Learning.” Lecture Notes in Computer Science 9905 (September 2016): 801–816. © 2016 Springer International Publishing AG
Version: Original manuscript
ISBN
978-3-319-46447-3
978-3-319-46448-0
ISSN
0302-9743
1611-3349

Collections
  • MIT Open Access Articles

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.