Seeing What You’re Told: Sentence-Guided Activity Recognition In Video
(Center for Brains, Minds and Machines (CBMM), arXiv, 2014-05-29)
We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, ...
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
(Center for Brains, Minds and Machines (CBMM), arXiv, 2016-06-10)
Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a ...