Theory IIIb: Generalization in Deep Networks
(Center for Brains, Minds and Machines (CBMM), arXiv.org, 2018-06-29)
The general features of the optimization problem for the case of overparametrized nonlinear networks have been clear for a while: SGD selects with high probability global minima vs local minima. In the overparametrized ...
Classical generalization bounds are surprisingly tight for Deep Networks
(Center for Brains, Minds and Machines (CBMM), 2018-07-11)
Deep networks are usually trained and tested in a regime in which the training classification error is not a good predictor of the test error. Thus the consensus has been that generalization, defined as convergence of the ...
Theory of Deep Learning III: explaining the non-overfitting puzzle
(arXiv, 2017-12-30)
THIS MEMO IS REPLACED BY CBMM MEMO 90
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrization and despite the large capacity demonstrated by zero training error on randomly ...