dc.contributor.author | Zhang, Chiyuan | |
dc.contributor.author | Liao, Qianli | |
dc.contributor.author | Rakhlin, Alexander | |
dc.contributor.author | Sridharan, Karthik | |
dc.contributor.author | Miranda, Brando | |
dc.contributor.author | Golowich, Noah | |
dc.contributor.author | Poggio, Tomaso | |
dc.date.accessioned | 2017-04-04T21:32:29Z | |
dc.date.available | 2017-04-04T21:32:29Z | |
dc.date.issued | 2017-04-04 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/107841 | |
dc.description.abstract | [previously titled "Theory of Deep Learning III: Generalization Properties of SGD"] In Theory III we characterize, with a mix of theory and experiments, the generalization properties of Stochastic Gradient Descent (SGD) in overparametrized deep convolutional networks. We show that SGD selects with high probability solutions that 1) have zero (or small) empirical error, 2) are degenerate, as shown in Theory II, and 3) have maximum generalization. | en_US |
dc.description.sponsorship | This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. H.M. is supported in part by ARO Grant W911NF-15-1-0385. | en_US |
dc.language.iso | en_US | en_US |
dc.publisher | Center for Brains, Minds and Machines (CBMM) | en_US |
dc.relation.ispartofseries | CBMM Memo Series;067 | |
dc.rights | Attribution-NonCommercial-ShareAlike 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/us/ | * |
dc.title | Musings on Deep Learning: Properties of SGD | en_US |
dc.type | Technical Report | en_US |
dc.type | Working Paper | en_US |
dc.type | Other | en_US |
dc.audience.educationlevel | | |