Musings on Deep Learning: Properties of SGD
(Center for Brains, Minds and Machines (CBMM), 2017-04-04)
[previously titled "Theory of Deep Learning III: Generalization Properties of SGD"] In Theory III we characterize with a mix of theory and experiments the generalization properties of Stochastic Gradient Descent in ...
Theory of Deep Learning III: explaining the non-overfitting puzzle
(arXiv, 2017-12-30)
[This memo is replaced by CBMM Memo 90]
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrization and despite the large capacity demonstrated by zero training error on randomly ...
Theory of Deep Learning IIb: Optimization Properties of SGD
(Center for Brains, Minds and Machines (CBMM), 2017-12-27)
In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolutional networks by Stochastic Gradient Descent. The main new result in this paper is theoretical and experimental evidence ...