
    • Can a biologically-plausible hierarchy effectively replace face detection, alignment, and recognition pipelines?

      Liao, Qianli; Leibo, Joel Z; Mroueh, Youssef; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-03-27)
      The standard approach to unconstrained face recognition in natural photographs is via a detection, alignment, recognition pipeline. While that approach has achieved impressive results, there are several reasons to be ...
    • Classical generalization bounds are surprisingly tight for Deep Networks 

      Liao, Qianli; Miranda, Brando; Hidary, Jack; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2018-07-11)
      Deep networks are usually trained and tested in a regime in which the training classification error is not a good predictor of the test error. Thus the consensus has been that generalization, defined as convergence of the ...
    • Complexity of Representation and Inference in Compositional Models with Part Sharing 

      Yuille, Alan L.; Mottaghi, Roozbeh (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-05-05)
      This paper performs a complexity analysis of a class of serial and parallel compositional models of multiple objects and shows that they enable efficient representation and rapid inference. Compositional models are generative ...
    • The Compositional Nature of Event Representations in the Human Brain 

      Barbu, Andrei; Narayanaswamy, Siddharth; Xiong, Caiming; Corso, Jason J.; Fellbaum, Christiane D.; e.a. (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-07-14)
      How does the human brain represent simple compositions of constituents: actors, verbs, objects, directions, and locations? Subjects viewed videos during neuroimaging (fMRI) sessions from which sentential descriptions of ...
    • Computational role of eccentricity dependent cortical magnification 

      Poggio, Tomaso; Mutch, Jim; Isik, Leyla (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-06)
      We develop a sampling extension of M-theory focused on invariance to scale and translation. Quite surprisingly, the theory predicts an architecture of early vision with increasing receptive field sizes and a high resolution ...
    • Concepts in a Probabilistic Language of Thought 

      Goodman, Noah D.; Tenenbaum, Joshua B.; Gerstenberg, Tobias (Center for Brains, Minds and Machines (CBMM), 2014-06-14)
      Knowledge organizes our understanding of the world, determining what we expect given what we have already seen. Our predictive representations have two key properties: they are productive, and they are graded. Productive ...
    • Constant Modulus Algorithms via Low-Rank Approximation 

      Adler, Amir; Wax, Mati (Center for Brains, Minds and Machines (CBMM), 2018-04-12)
      We present a novel convex-optimization-based approach to the solutions of a family of problems involving constant modulus signals. The family of problems includes the constant modulus and the constrained constant modulus, ...
    • Contrastive Analysis with Predictive Power: Typology Driven Estimation of Grammatical Error Distributions in ESL 

      Berzak, Yevgeni; Reichart, Roi; Katz, Boris (Center for Brains, Minds and Machines (CBMM), arXiv, 2016-06-05)
      This work examines the impact of crosslinguistic transfer on grammatical errors in English as Second Language (ESL) texts. Using a computational framework that formalizes the theory of Contrastive Analysis (CA), we demonstrate ...
    • Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) 

      Mao, Junhua; Xu, Wei; Yang, Yi; Wang, Jiang; Huang, Zhiheng; e.a. (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-05-07)
      In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. ...
    • Deep Convolutional Networks are Hierarchical Kernel Machines 

      Anselmi, Fabio; Rosasco, Lorenzo; Tan, Cheston; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-08-05)
      We extend i-theory to incorporate not only pooling but also rectifying nonlinearities in an extended HW module (eHW) designed for supervised learning. The two operations roughly correspond to invariance and selectivity, ...
    • Deep Nets: What have they ever done for Vision? 

      Yuille, Alan L.; Liu, Chenxi (Center for Brains, Minds and Machines (CBMM), 2018-05-10)
      This is an opinion paper about the strengths and weaknesses of Deep Nets. They are at the center of recent progress on Artificial Intelligence and are of growing importance in Cognitive Science and Neuroscience since they ...
    • Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning 

      Lotter, William; Kreiman, Gabriel; Cox, David (Center for Brains, Minds and Machines (CBMM), arXiv, 2017-03-01)
      While great strides have been made in using deep learning algorithms to solve supervised learning tasks, the problem of unsupervised learning (leveraging unlabeled examples to learn about the structure of a domain) remains ...
    • Deep Regression Forests for Age Estimation 

      Shen, Wei; Guo, Yilu; Wang, Yan; Zhao, Kai; Wang, Bo; e.a. (Center for Brains, Minds and Machines (CBMM), 2018-06-01)
      Age estimation from facial images is typically cast as a nonlinear regression problem. The main challenge of this problem is the facial feature space w.r.t. ages is inhomogeneous, due to the large variation in facial ...
    • A Deep Representation for Invariance And Music Classification 

      Zhang, Chiyuan; Evangelopoulos, Georgios; Voinea, Stephen; Rosasco, Lorenzo; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-03-17)
      Representations in the auditory cortex might be based on mechanisms similar to the visual ventral stream; modules for building invariance to transformations and multiple layers for compositionality and selectivity. In this ...
    • Deep vs. shallow networks : An approximation theory perspective 

      Mhaskar, Hrushikesh; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2016-08-12)
      The paper briefly reviews several recent results on hierarchical architectures for learning from examples, that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in ...
    • DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial Occlusion 

      Zhang, Zhishuai; Xie, Cihang; Wang, Jianyu; Xie, Lingxi; Yuille, Alan L. (Center for Brains, Minds and Machines (CBMM), 2018-06-19)
      In this paper, we study the task of detecting semantic parts of an object, e.g., a wheel of a car, under partial occlusion. We propose that all models should be trained without seeing occlusions while being able to transfer ...
    • Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts 

      Chen, Xianjie; Mottaghi, Roozbeh; Liu, Xiaobai; Fidler, Sanja; Urtasun, Raquel; e.a. (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-10)
      Detecting objects becomes difficult when we need to deal with large shape deformation, occlusion and low resolution. We propose a novel approach to i) handle large deformations and partial occlusions in animals (as examples ...
    • Detecting Semantic Parts on Partially Occluded Objects 

      Wang, Jianyu; Xie, Cihang; Zhang, Zhishuai; Zhu, Jun; Xie, Lingxi; e.a. (Center for Brains, Minds and Machines (CBMM), 2017-09-04)
      In this paper, we address the task of detecting semantic parts on partially occluded objects. We consider a scenario where the model is trained using non-occluded images but tested on occluded images. The motivation is ...
    • Discriminate-and-Rectify Encoders: Learning from Image Transformation Sets 

      Tacchetti, Andrea; Voinea, Stephen; Evangelopoulos, Georgios (Center for Brains, Minds and Machines (CBMM), arXiv, 2017-03-13)
      The complexity of a learning task is increased by transformations in the input space that preserve class identity. Visual object recognition for example is affected by changes in viewpoint, scale, illumination or planar ...
    • Do Deep Neural Networks Suffer from Crowding? 

      Volokitin, Anna; Roig, Gemma; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2017-06-26)
      Crowding is a visual effect suffered by humans, in which an object that can be recognized in isolation can no longer be recognized when other objects, called flankers, are placed close to it. In this work, we study the ...