Browsing Publications by Title
Now showing items 24-43 of 143

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
(Center for Brains, Minds and Machines (CBMM), arXiv, 2015-05-07) In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. ...
Deep compositional robotic planners that follow natural language commands
(Center for Brains, Minds and Machines (CBMM), Computation and Systems Neuroscience (Cosyne), 2020-05-31) We demonstrate how a sampling-based robotic planner can be augmented to learn to understand a sequence of natural language commands in a continuous configuration space to move and manipulate objects. Our approach combines ...
Deep Convolutional Networks are Hierarchical Kernel Machines
(Center for Brains, Minds and Machines (CBMM), arXiv, 2015-08-05) We extend i-theory to incorporate not only pooling but also rectifying nonlinearities in an extended HW module (eHW) designed for supervised learning. The two operations roughly correspond to invariance and selectivity, ...
Deep Nets: What have they ever done for Vision?
(Center for Brains, Minds and Machines (CBMM), 2018-05-10) This is an opinion paper about the strengths and weaknesses of Deep Nets. They are at the center of recent progress on Artificial Intelligence and are of growing importance in Cognitive Science and Neuroscience since they ...
Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning
(Center for Brains, Minds and Machines (CBMM), arXiv, 2017-03-01) While great strides have been made in using deep learning algorithms to solve supervised learning tasks, the problem of unsupervised learning (leveraging unlabeled examples to learn about the structure of a domain) remains ...
Deep Regression Forests for Age Estimation
(Center for Brains, Minds and Machines (CBMM), 2018-06-01) Age estimation from facial images is typically cast as a nonlinear regression problem. The main challenge of this problem is the facial feature space w.r.t. ages is inhomogeneous, due to the large variation in facial ...
A Deep Representation for Invariance And Music Classification
(Center for Brains, Minds and Machines (CBMM), arXiv, 2014-17-03) Representations in the auditory cortex might be based on mechanisms similar to the visual ventral stream; modules for building invariance to transformations and multiple layers for compositionality and selectivity. In this ...
Deep vs. shallow networks: An approximation theory perspective
(Center for Brains, Minds and Machines (CBMM), arXiv, 2016-08-12) The paper briefly reviews several recent results on hierarchical architectures for learning from examples, that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in ...
DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial Occlusion
(Center for Brains, Minds and Machines (CBMM), 2018-06-19) In this paper, we study the task of detecting semantic parts of an object, e.g., a wheel of a car, under partial occlusion. We propose that all models should be trained without seeing occlusions while being able to transfer ...
A Definition of General Problem Solving
(2020-07-13) What is general intelligence? What does it mean by general problem solving? We attempt to give a definition of general problem solving, characterize the common process of problem solving and provide a basic algorithm that ...
Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts
(Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-10) Detecting objects becomes difficult when we need to deal with large shape deformation, occlusion and low resolution. We propose a novel approach to i) handle large deformations and partial occlusions in animals (as examples ...
Detecting Semantic Parts on Partially Occluded Objects
(Center for Brains, Minds and Machines (CBMM), 2017-09-04) In this paper, we address the task of detecting semantic parts on partially occluded objects. We consider a scenario where the model is trained using non-occluded images but tested on occluded images. The motivation is ...
Discriminate-and-Rectify Encoders: Learning from Image Transformation Sets
(Center for Brains, Minds and Machines (CBMM), arXiv, 2017-03-13) The complexity of a learning task is increased by transformations in the input space that preserve class identity. Visual object recognition for example is affected by changes in viewpoint, scale, illumination or planar ...
Do Deep Neural Networks Suffer from Crowding?
(Center for Brains, Minds and Machines (CBMM), arXiv, 2017-06-26) Crowding is a visual effect suffered by humans, in which an object that can be recognized in isolation can no longer be recognized when other objects, called flankers, are placed close to it. In this work, we study the ...
Do Neural Networks for Segmentation Understand Insideness?
(Center for Brains, Minds and Machines (CBMM), 2020-04-04) The insideness problem is an image segmentation modality that consists of determining which pixels are inside and outside a region. Deep Neural Networks (DNNs) excel in segmentation benchmarks, but it is unclear that they ...
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
(Center for Brains, Minds and Machines (CBMM), arXiv, 2016-06-10) Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a ...
Double descent in the condition number
(Center for Brains, Minds and Machines (CBMM), 2019-12-04) In solving a system of n linear equations in d variables Ax=b, the condition number of the (n,d) matrix A measures how much errors in the data b affect the solution x. Bounds of this type are important in many inverse ...
Dreaming with ARC
(Center for Brains, Minds and Machines (CBMM), 2020-11-23) Current machine learning algorithms are highly specialized to whatever it is they are meant to do, e.g. playing chess, picking up objects, or object recognition. How can we extend this to a system that could solve a ...
The Effects of Image Distribution and Task on Adversarial Robustness
(Center for Brains, Minds and Machines (CBMM), 2021-02-18) In this paper, we propose an adaptation to the area under the curve (AUC) metric to measure the adversarial robustness of a model over a particular ε-interval [ε_0, ε_1] (interval of adversarial perturbation strengths) ...
Encoding formulas as deep networks: Reinforcement learning for zero-shot execution of LTL formulas
(Center for Brains, Minds and Machines (CBMM), The Ninth International Conference on Learning Representations (ICLR), 2020-10-25) We demonstrate a reinforcement learning agent which uses a compositional recurrent neural network that takes as input an LTL formula and determines satisfying actions. The input LTL formulas have never been seen before, ...