Search
Now showing items 1-7 of 7
Musings on Deep Learning: Properties of SGD
(Center for Brains, Minds and Machines (CBMM), 2017-04-04)
[previously titled "Theory of Deep Learning III: Generalization Properties of SGD"] In Theory III we characterize with a mix of theory and experiments the generalization properties of Stochastic Gradient Descent in ...
Theory II: Landscape of the Empirical Risk in Deep Learning
(Center for Brains, Minds and Machines (CBMM), arXiv, 2017-03-30)
Previous theoretical work on deep learning and neural network optimization tend to focus on avoiding saddle points and local minima. However, the practical observation is that, at least for the most successful Deep ...
Theory of Deep Learning III: explaining the non-overfitting puzzle
(arXiv, 2017-12-30)
THIS MEMO IS REPLACED BY CBMM MEMO 90
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrization and despite the large capacity demonstrated by zero training error on randomly ...
Exact Equivariance, Disentanglement and Invariance of Transformations
(2017-12-31)
Invariance, equivariance and disentanglement of transformations are important topics in the field of representation learning. Previous models like Variational Autoencoder [1] and Generative Adversarial Networks [2] attempted ...
Object-Oriented Deep Learning
(Center for Brains, Minds and Machines (CBMM), 2017-10-31)
We investigate an unconventional direction of research that aims at converting neural networks, a class of distributed, connectionist, sub-symbolic models into a symbolic level with the ultimate goal of achieving AI ...
3D Object-Oriented Learning: An End-to-end Transformation-Disentangled 3D Representation
(2017-12-31)
We provide more detailed explanation of the ideas behind a recent paper on “Object-Oriented Deep Learning” [1] and extend it to handle 3D inputs/outputs. Similar to [1], every layer of the system takes in a list of ...
Theory of Deep Learning IIb: Optimization Properties of SGD
(Center for Brains, Minds and Machines (CBMM), 2017-12-27)
In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolutional networks by Stochastic Gradient Descent. The main new result in this paper is theoretical and experimental evidence ...