Seeing What You’re Told: Sentence-Guided Activity Recognition In Video

Siddharth, Narayanaswamy; Barbu, Andrei; Siskind, Jeffrey Mark (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-05-29)

We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, ...

Unsupervised learning of clutter-resistant visual representations from natural videos

Liao, Qianli; Leibo, Joel Z; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-04-27)

Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance, e.g., position, scale, viewing angle [1, 2, 3]. Though the learning ...

Robust Estimation of 3D Human Poses from a Single Image

Wang, Chunyu; Wang, Yizhou; Lin, Zhouchen; Yuille, Alan L.; Gao, Wen (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-10)

Human pose estimation is a key step to action recognition. We propose a method of estimating 3D human poses from a single image, which works in conjunction with an existing 2D pose/joint detector. 3D pose estimation is ...

The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex

Leibo, Joel Z; Liao, Qianli; Anselmi, Fabio; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), bioRxiv, 2015-04-26)

Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects, is only transferable to ...