dc.contributor.author | Poggio, Tomaso A | |
dc.contributor.author | Liao, Qianli | |
dc.contributor.author | Banburski, Andrzej | |
dc.date.accessioned | 2022-01-12T20:53:18Z | |
dc.date.available | 2021-10-27T20:34:47Z | |
dc.date.available | 2022-01-12T20:53:18Z | |
dc.date.issued | 2020 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/136301.2 | |
dc.description.abstract | © 2020, The Author(s). Overparametrized deep networks predict well, despite the lack of an explicit complexity control during training, such as an explicit regularization term. For exponential-type loss functions, we solve this puzzle by showing an effective regularization effect of gradient descent in terms of the normalized weights that are relevant for classification. | en_US |
dc.language.iso | en | |
dc.publisher | Springer Science and Business Media LLC | en_US |
dc.relation.isversionof | http://dx.doi.org/10.1038/s41467-020-14663-9 | en_US
dc.rights | Creative Commons Attribution 4.0 International license | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
dc.source | Nature Communications | en_US
dc.title | Complexity control by gradient descent in deep networks | en_US |
dc.type | Article | en_US |
dc.contributor.department | Center for Brains, Minds, and Machines | en_US |
dc.relation.journal | Nature Communications | en_US |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
dc.date.updated | 2021-03-24T13:05:49Z | |
dspace.orderedauthors | Poggio, T; Liao, Q; Banburski, A | en_US |
dspace.date.submission | 2021-03-24T13:05:50Z | |
mit.journal.volume | 11 | en_US |
mit.journal.issue | 1 | en_US |
mit.license | PUBLISHER_CC | |
mit.metadata.status | Publication Information Needed | en_US |