Notice
This is not the latest version of this item. The latest version can be found at: https://dspace.mit.edu/handle/1721.1/136301.2
Complexity control by gradient descent in deep networks
Author(s)
Poggio, Tomaso; Liao, Qianli; Banburski, Andrzej
Terms of use
Creative Commons Attribution
Abstract
© 2020, The Author(s). Overparametrized deep networks predict well, despite the lack of an explicit complexity control during training, such as an explicit regularization term. For exponential-type loss functions, we solve this puzzle by showing an effective regularization effect of gradient descent in terms of the normalized weights that are relevant for classification.
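To make the abstract's claim concrete, the following is a minimal illustrative sketch, not the authors' code or experiments: plain gradient descent on an exponential loss for a linear classifier on toy separable data. The weight norm grows without bound, yet the normalized weights converge in direction, which is the kind of implicit complexity control of the classification-relevant (normalized) weights that the abstract refers to. The data, learning rate, and step counts below are invented for illustration.

# Illustrative sketch (assumed setup, not from the paper): gradient descent
# on the exponential loss L(w) = mean(exp(-y * <w, x>)) for a linear model.
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data: two Gaussian blobs with labels in {-1, +1}.
n = 200
X = np.vstack([rng.normal(+2.0, 1.0, size=(n // 2, 2)),
               rng.normal(-2.0, 1.0, size=(n // 2, 2))])
y = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])

w = rng.normal(size=2) * 0.01   # no explicit regularization term anywhere
lr = 0.1

for step in range(1, 20001):
    margins = y * (X @ w)
    # Gradient of mean(exp(-y * <w, x>)) with respect to w.
    grad = -(np.exp(-margins)[:, None] * (y[:, None] * X)).mean(axis=0)
    w -= lr * grad
    if step % 5000 == 0:
        # ||w|| keeps growing, but w / ||w|| (the weights relevant for
        # classification) settles to a fixed direction.
        print(f"step {step:6d}  ||w|| = {np.linalg.norm(w):8.3f}  "
              f"w/||w|| = {w / np.linalg.norm(w)}")

Running this prints a steadily increasing norm alongside an essentially unchanging normalized weight vector, which is the behavior the abstract describes as an effective regularization effect of gradient descent.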
Date issued
2020
Journal
Nature Communications
Publisher
Springer Science and Business Media LLC