dc.contributor.author | Berzak, Yevgeni | |
dc.contributor.author | Barbu, Andrei | |
dc.contributor.author | Harari, Daniel | |
dc.contributor.author | Katz, Boris | |
dc.contributor.author | Ullman, Shimon | |
dc.date.accessioned | 2016-06-30T20:20:10Z | |
dc.date.available | 2016-06-30T20:20:10Z | |
dc.date.issued | 2016-06-10 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/103400 | |
dc.description.abstract | Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a sentence given a visual scene which depicts one of the possible interpretations of that sentence. To this end, we introduce a new multimodal corpus containing ambiguous sentences, representing a wide range of syntactic, semantic and discourse ambiguities, coupled with videos that visualize the different interpretations for each sentence. We address this task by extending a vision model which determines if a sentence is depicted by a video. We demonstrate how such a model can be adjusted to recognize different interpretations of the same underlying sentence, allowing sentences to be disambiguated in a unified fashion across the different ambiguity types. | en_US |
dc.description.sponsorship | This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. | en_US |
dc.language.iso | en_US | en_US |
dc.publisher | Center for Brains, Minds and Machines (CBMM), arXiv | en_US |
dc.relation.ispartofseries | CBMM Memo Series;051 | |
dc.rights | Attribution-NonCommercial-ShareAlike 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/us/ | * |
dc.subject | Computer Language | en_US |
dc.subject | Language understanding | en_US |
dc.subject | Computer vision | en_US |
dc.title | Do You See What I Mean? Visual Resolution of Linguistic Ambiguities | en_US |
dc.type | Technical Report | en_US |
dc.type | Working Paper | en_US |
dc.type | Other | en_US |
dc.identifier.citation | arXiv:1603.08079v1 [cs.CV] | en_US |