
dc.contributor.advisor: David Gamarnik (en_US)
dc.contributor.author: Emschwiller, Matt V. (en_US)
dc.contributor.other: Massachusetts Institute of Technology. Operations Research Center (en_US)
dc.date.accessioned: 2020-09-15T21:50:30Z
dc.date.available: 2020-09-15T21:50:30Z
dc.date.copyright: 2020 (en_US)
dc.date.issued: 2020 (en_US)
dc.identifier.uri: https://hdl.handle.net/1721.1/127290
dc.description: Thesis: S.M., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, May 2020 (en_US)
dc.description: Cataloged from the PDF version of the thesis. (en_US)
dc.description: Includes bibliographical references (pages 83-89). (en_US)
dc.description.abstract: We first study the sample complexity of one-layer neural networks, namely the number of training examples such models need in order to learn meaningful information out-of-sample. We empirically derive quantitative relationships between the sample complexity and the parameters of the network, such as its input dimension and its width. We then introduce polynomial regression as a proxy for neural networks, obtained through a polynomial approximation of their activation function. This method operates in the lifted space of tensor products of the input variables and is trained by optimizing a standard least-squares objective in that space (see the sketch after this record). We study the scalability of polynomial regression and design a bagging-type algorithm to train it successfully. The method achieves competitive accuracy on simple image datasets while being simpler, more robust, and more interpretable than existing approaches, and it offers stronger convergence guarantees during training. Finally, we empirically show that the widely used stochastic gradient descent algorithm makes the weights of trained neural networks converge to the optimal polynomial-regression weights. (en_US)
dc.description.statementofresponsibility: by Matt V. Emschwiller. (en_US)
dc.format.extent: 89 pages (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Massachusetts Institute of Technology (en_US)
dc.rights: MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. (en_US)
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 (en_US)
dc.subject: Operations Research Center. (en_US)
dc.title: Understanding neural network sample complexity and interpretable convergence-guaranteed deep learning with polynomial regression (en_US)
dc.type: Thesis (en_US)
dc.description.degree: S.M. (en_US)
dc.contributor.department: Massachusetts Institute of Technology. Operations Research Center (en_US)
dc.contributor.department: Sloan School of Management
dc.identifier.oclc: 1191900751 (en_US)
dc.description.collection: S.M. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center (en_US)
dspace.imported: 2020-09-15T21:50:29Z (en_US)
mit.thesis.degree: Master (en_US)
mit.thesis.department: Sloan (en_US)
mit.thesis.department: OperRes (en_US)
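
The abstract above describes fitting a polynomial model by least squares in a lifted space of tensor products of the input variables, trained at scale with a bagging-type algorithm. The following is a minimal, illustrative sketch of that idea, not the thesis's actual implementation: it assumes a degree-2 monomial lift, a small ridge term for numerical stability, and bootstrap averaging, and all names (lift, fit_polynomial_regression, fit_bagged) are hypothetical.

import numpy as np

def lift(X, degree=2):
    # Map inputs to a lifted feature space of monomials: for degree 2,
    # [1, x_i, x_i * x_j], a small stand-in for the tensor-product lift.
    n, d = X.shape
    feats = [np.ones((n, 1)), X]
    if degree >= 2:
        rows, cols = np.triu_indices(d)        # all pairs i <= j
        feats.append(X[:, rows] * X[:, cols])  # pairwise products
    return np.hstack(feats)

def fit_polynomial_regression(X, y, degree=2, reg=1e-6):
    # Standard least squares in the lifted space; the ridge term `reg`
    # (an assumption, not from the thesis) keeps the solve well posed.
    Z = lift(X, degree)
    return np.linalg.solve(Z.T @ Z + reg * np.eye(Z.shape[1]), Z.T @ y)

def predict(X, w, degree=2):
    return lift(X, degree) @ w

def fit_bagged(X, y, n_models=10, degree=2, reg=1e-6, seed=0):
    # Bagging-type training: fit each model on a bootstrap resample and
    # average predictions at test time, a sketch of the scalable scheme
    # the abstract mentions (the thesis's exact algorithm may differ).
    rng = np.random.default_rng(seed)
    n = len(X)
    return [fit_polynomial_regression(X[idx], y[idx], degree, reg)
            for idx in (rng.integers(0, n, size=n) for _ in range(n_models))]

def predict_bagged(X, ws, degree=2):
    return np.mean([predict(X, w, degree) for w in ws], axis=0)

# Toy usage: recover a noisy quadratic function of three inputs.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = X[:, 0] * X[:, 1] - 0.5 * X[:, 2] ** 2 + 0.1 * rng.normal(size=500)
ws = fit_bagged(X, y)
print("train MSE:", np.mean((predict_bagged(X, ws) - y) ** 2))

Note that the lifted dimension grows rapidly with the input dimension and the polynomial degree, which is presumably what motivates a scalable, bagging-type training procedure; this sketch only illustrates the mechanics on a toy problem.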

