| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.advisor | Joshua B. Tenenbaum. | en_US |
| dc.contributor.author | Tenka, Samuel C. | en_US |
| dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
| dc.date.accessioned | 2021-01-06T18:33:00Z | |
| dc.date.available | 2021-01-06T18:33:00Z | |
| dc.date.copyright | 2020 | en_US |
| dc.date.issued | 2020 | en_US |
| dc.identifier.uri | https://hdl.handle.net/1721.1/129180 | |
| dc.description | Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020 | en_US |
| dc.description | Cataloged from student-submitted PDF version of thesis. | en_US |
| dc.description | Includes bibliographical references. | en_US |
| dc.description.abstract | We analyze stochastic gradient descent (SGD) at small learning rates. Unlike prior analyses based on stochastic differential equations, our theory models discrete time and hence non-Gaussian noise. We illustrate our theory by discussing four of its corollaries: we (A) generalize the Akaike information criterion (AIC) to a smooth estimator of overfitting, hence enabling gradient-based model selection; (B) show how non-stochastic GD with a modified loss function may emulate SGD; (C) prove that gradient noise systematically pushes SGD toward flatter minima; and (D) characterize when and why flat minima overfit less than other minima. | en_US |
| dc.description.statementofresponsibility | by Samuel C. Tenka. | en_US |
| dc.format.extent | 62 pages | en_US |
| dc.language.iso | eng | en_US |
| dc.publisher | Massachusetts Institute of Technology | en_US |
| dc.rights | MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. | en_US |
| dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
| dc.subject | Electrical Engineering and Computer Science. | en_US |
| dc.title | A perturbative analysis of stochastic descent | en_US |
| dc.type | Thesis | en_US |
| dc.description.degree | S.M. | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
| dc.identifier.oclc | 1227278188 | en_US |
| dc.description.collection | S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science | en_US |
| dspace.imported | 2021-01-06T18:32:59Z | en_US |
| mit.thesis.degree | Master | en_US |
| mit.thesis.department | EECS | en_US |