Search
Now showing items 1-10 of 13
Robust Estimation of 3D Human Poses from a Single Image
(Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-10)
Human pose estimation is a key step to action recognition. We propose a method of estimating 3D human poses from a single image, which works in conjunction with an existing 2D pose/joint detector. 3D pose estimation is ...
Seeing What You’re Told: Sentence-Guided Activity Recognition In Video
(Center for Brains, Minds and Machines (CBMM), arXiv, 2014-05-29)
We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, ...
Unsupervised learning of clutter-resistant visual representations from natural videos
(Center for Brains, Minds and Machines (CBMM), arXiv, 2015-04-27)
Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance e.g., position, scale, viewing angle [1, 2, 3]. Though the learning ...
When Computer Vision Gazes at Cognition
(Center for Brains, Minds and Machines (CBMM), arXiv, 2014-12-12)
Joint attention is a core, early-developing form of social interaction. It is based on our ability to discriminate the third party objects that other people are looking at. While it has been shown that people can accurately ...
The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex
(Center for Brains, Minds and Machines (CBMM), bioRxiv, 2015-04-26)
Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects is only transferable to ...
Neural tuning size is a key factor underlying holistic face processing
(Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-14)
Faces are a class of visual stimuli with unique significance, for a variety of reasons. They are ubiquitous throughout the course of a person’s life, and face recognition is crucial for daily social interaction. Faces are ...
Can a biologically-plausible hierarchy e ectively replace face detection, alignment, and recognition pipelines?
(Center for Brains, Minds and Machines (CBMM), arXiv, 2014-03-27)
The standard approach to unconstrained face recognition in natural photographs is via a detection, alignment, recognition pipeline. While that approach has achieved impressive results, there are several reasons to be ...
Learning Real and Boolean Functions: When Is Deep Better Than Shallow
(Center for Brains, Minds and Machines (CBMM), arXiv, 2016-03-08)
We describe computational tasks - especially in vision - that correspond to compositional/hierarchical functions. While the universal approximation property holds both for hierarchical and shallow networks, we prove that ...
Representation Learning in Sensory Cortex: a theory
(Center for Brains, Minds and Machines (CBMM), 2014-11-14)
We review and apply a computational theory of the feedforward path of the ventral stream in visual cortex based on the hypothesis that its main function is the encoding of invariant representations of images. A key ...
Fast, invariant representation for human action in the visual system
(Center for Brains, Minds and Machines (CBMM), arXiv, 2016-01-06)
The ability to recognize the actions of others from visual input is essential to humans' daily lives. The neural computations underlying action recognition, however, are still poorly understood. We use magnetoencephalography ...