Notice
This is not the latest version of this item. The latest version can be found at: https://dspace.mit.edu/handle/1721.1/136301.2
Complexity control by gradient descent in deep networks
| Metadata field | Value |
| --- | --- |
| dc.contributor.author | Poggio, Tomaso |
| dc.contributor.author | Liao, Qianli |
| dc.contributor.author | Banburski, Andrzej |
| dc.date.accessioned | 2021-10-27T20:34:47Z |
| dc.date.available | 2021-10-27T20:34:47Z |
| dc.date.issued | 2020 |
| dc.identifier.uri | https://hdl.handle.net/1721.1/136301 |
| dc.description.abstract | © 2020, The Author(s). Overparametrized deep networks predict well, despite the lack of an explicit complexity control during training, such as an explicit regularization term. For exponential-type loss functions, we solve this puzzle by showing an effective regularization effect of gradient descent in terms of the normalized weights that are relevant for classification. |
| dc.language.iso | en |
| dc.publisher | Springer Science and Business Media LLC |
| dc.relation.isversionof | 10.1038/S41467-020-14663-9 |
| dc.rights | Creative Commons Attribution 4.0 International license |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
| dc.source | Nature |
| dc.title | Complexity control by gradient descent in deep networks |
| dc.type | Article |
| dc.relation.journal | Nature Communications |
| dc.eprint.version | Final published version |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle |
| eprint.status | http://purl.org/eprint/status/PeerReviewed |
| dc.date.updated | 2021-03-24T13:05:49Z |
| dspace.orderedauthors | Poggio, T; Liao, Q; Banburski, A |
| dspace.date.submission | 2021-03-24T13:05:50Z |
| mit.journal.volume | 11 |
| mit.journal.issue | 1 |
| mit.license | PUBLISHER_CC |
| mit.metadata.status | Authority Work and Publication Information Needed |
