Browsing CBMM Memo Series by Subject "Artificial Intelligence"

Building machines that learn and think like people

Lake, Brenden M.; Ullman, Tomer D.; Tenenbaum, Joshua B.; Gershman, Samuel J. (Center for Brains, Minds and Machines (CBMM), arXiv, 2016-04-01)

Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object ...

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

Mao, Junhua; Xu, Wei; Yang, Yi; Wang, Jiang; Huang, Zhiheng; et al. (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-05-07)

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. ...

Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts

Chen, Xianjie; Mottaghi, Roozbeh; Liu, Xiaobai; Fidler, Sanja; Urtasun, Raquel; et al. (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-10)

Detecting objects becomes difficult when we need to deal with large shape deformation, occlusion and low resolution. We propose a novel approach to i) handle large deformations and partial occlusions in animals (as examples ...

The Genesis Story Understanding and Story Telling System: A 21st Century Step toward Artificial Intelligence

Winston, Patrick Henry (Center for Brains, Minds and Machines (CBMM), 2014-06-10)

Story understanding is an important differentiator of human intelligence, perhaps the most important differentiator. The Genesis system was built to model and explore aspects of story understanding using simply expressed, ...

Neural tuning size is a key factor underlying holistic face processing

Tan, Cheston; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-14)

Faces are a class of visual stimuli with unique significance, for a variety of reasons. They are ubiquitous throughout the course of a person’s life, and face recognition is crucial for daily social interaction. Faces are ...

Parsing Occluded People by Flexible Compositions

Chen, Xianjie; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-06-01)

This paper presents an approach to parsing humans when there is significant occlusion. We model humans using a graphical model which has a tree structure building on recent work [32, 6] and exploit the connectivity prior ...

Robust Estimation of 3D Human Poses from a Single Image

Wang, Chunyu; Wang, Yizhou; Lin, Zhouchen; Yuille, Alan L.; Gao, Wen (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-10)

Human pose estimation is a key step to action recognition. We propose a method of estimating 3D human poses from a single image, which works in conjunction with an existing 2D pose/joint detector. 3D pose estimation is ...

Semantic Part Segmentation using Compositional Model combining Shape and Appearance

Wang, Jianyu; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-06-08)

In this paper, we study the problem of semantic part segmentation for animals. This is more challenging than standard object detection, object segmentation and pose estimation tasks because semantic parts of animals often ...

Unsupervised learning of clutter-resistant visual representations from natural videos

Liao, Qianli; Leibo, Joel Z.; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-04-27)

Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance, e.g., position, scale, viewing angle [1, 2, 3]. Though the learning ...

When Computer Vision Gazes at Cognition

Gao, Tao; Harari, Daniel; Tenenbaum, Joshua; Ullman, Shimon (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-12-12)

Joint attention is a core, early-developing form of social interaction. It is based on our ability to discriminate the third party objects that other people are looking at. While it has been shown that people can accurately ...