Show simple item record

dc.contributor.author: Harwath, David
dc.contributor.author: Torralba, Antonio
dc.contributor.author: Glass, James R.
dc.date.accessioned: 2020-03-31T18:36:20Z
dc.date.available: 2020-03-31T18:36:20Z
dc.date.issued: 2017
dc.date.submitted: 2016-12
dc.identifier.issn: 1049-5258
dc.identifier.uri: https://hdl.handle.net/1721.1/124455
dc.description.abstract: Humans learn to speak before they can read or write, so why can't computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data comprised of over 120,000 spoken audio captions for the Places image dataset and evaluate our model on an image search and annotation task. We also provide some visualizations which suggest that our model is learning to recognize meaningful words within the caption spectrograms. [en_US]
dc.language.iso: en
dc.publisher: Neural Information Processing Systems Foundation, Inc. [en_US]
dc.relation.isversionof: https://papers.nips.cc/paper/6186-unsupervised-learning-of-spoken-language-with-visual-context [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Neural Information Processing Systems (NIPS) [en_US]
dc.title: Unsupervised learning of spoken language with visual context [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Harwath, David et al. "Unsupervised Learning of Spoken Language with Visual Context." Advances in Neural Information Processing Systems 29 (NIPS 2016), December 2016, Barcelona, Spain, NIPS, 2017. © 2016 NIPS Foundation. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.relation.journal: Advances in Neural Information Processing Systems 29 (NIPS 2016) [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2019-07-11T16:09:41Z
dspace.date.submission: 2019-07-11T16:09:42Z
mit.journal.volume: 29 [en_US]
mit.metadata.status: Complete

