
dc.contributor.advisor: David Gamarnik (en_US)
dc.contributor.author: Emschwiller, Matt V. (en_US)
dc.contributor.other: Massachusetts Institute of Technology. Operations Research Center (en_US)
dc.date.accessioned: 2020-09-15T21:50:30Z
dc.date.available: 2020-09-15T21:50:30Z
dc.date.copyright: 2020 (en_US)
dc.date.issued: 2020 (en_US)
dc.identifier.uri: https://hdl.handle.net/1721.1/127290
dc.description: Thesis: S.M., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, May 2020 (en_US)
dc.description: Cataloged from the PDF version of the thesis. (en_US)
dc.description: Includes bibliographical references (pages 83-89). (en_US)
dc.description.abstract: We first study the sample complexity of one-layer neural networks, namely the number of training examples such models need in order to learn meaningful information out-of-sample. We empirically derive quantitative relationships between the sample complexity and the parameters of the network, such as its input dimension and its width. We then introduce polynomial regression as a proxy for neural networks, obtained through a polynomial approximation of their activation function. This method operates in the lifted space of tensor products of the input variables and is trained by optimizing a standard least-squares objective in that space (see the sketch after this record). We study the scalability of polynomial regression and design a bagging-type algorithm to train it successfully. The method achieves competitive accuracy on simple image datasets while being simpler, more robust, and more interpretable than existing approaches, and it offers stronger convergence guarantees during training. Finally, we empirically show that the widely used stochastic gradient descent algorithm makes the weights of trained neural networks converge to the optimal polynomial-regression weights. (en_US)
dc.description.statementofresponsibility: by Matt V. Emschwiller. (en_US)
dc.format.extent: 89 pages (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Massachusetts Institute of Technology (en_US)
dc.rights: MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. (en_US)
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 (en_US)
dc.subject: Operations Research Center. (en_US)
dc.title: Understanding neural network sample complexity and interpretable convergence-guaranteed deep learning with polynomial regression (en_US)
dc.type: Thesis (en_US)
dc.description.degree: S.M. (en_US)
dc.contributor.department: Massachusetts Institute of Technology. Operations Research Center (en_US)
dc.contributor.department: Sloan School of Management
dc.identifier.oclc: 1191900751 (en_US)
dc.description.collection: S.M. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center (en_US)
dspace.imported: 2020-09-15T21:50:29Z (en_US)
mit.thesis.degree: Master (en_US)
mit.thesis.department: Sloan (en_US)
mit.thesis.department: OperRes (en_US)
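
The abstract above describes fitting a polynomial model by least squares in a lifted space of tensor products of the input variables, trained at scale with a bagging-type algorithm. The following is a minimal, illustrative sketch of that idea, not the thesis's actual implementation: it assumes a degree-2 monomial lift, a small ridge term for numerical stability, and bootstrap averaging, and all names (lift, fit_polynomial_regression, fit_bagged) are hypothetical.

import numpy as np

def lift(X, degree=2):
    # Map inputs to a lifted feature space of monomials: for degree 2,
    # [1, x_i, x_i * x_j], a small stand-in for the tensor-product lift.
    n, d = X.shape
    feats = [np.ones((n, 1)), X]
    if degree >= 2:
        rows, cols = np.triu_indices(d)        # all pairs i <= j
        feats.append(X[:, rows] * X[:, cols])  # pairwise products
    return np.hstack(feats)

def fit_polynomial_regression(X, y, degree=2, reg=1e-6):
    # Standard least squares in the lifted space; the ridge term `reg`
    # (an assumption, not from the thesis) keeps the solve well posed.
    Z = lift(X, degree)
    return np.linalg.solve(Z.T @ Z + reg * np.eye(Z.shape[1]), Z.T @ y)

def predict(X, w, degree=2):
    return lift(X, degree) @ w

def fit_bagged(X, y, n_models=10, degree=2, reg=1e-6, seed=0):
    # Bagging-type training: fit each model on a bootstrap resample and
    # average predictions at test time, a sketch of the scalable scheme
    # the abstract mentions (the thesis's exact algorithm may differ).
    rng = np.random.default_rng(seed)
    n = len(X)
    return [fit_polynomial_regression(X[idx], y[idx], degree, reg)
            for idx in (rng.integers(0, n, size=n) for _ in range(n_models))]

def predict_bagged(X, ws, degree=2):
    return np.mean([predict(X, w, degree) for w in ws], axis=0)

# Toy usage: recover a noisy quadratic function of three inputs.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = X[:, 0] * X[:, 1] - 0.5 * X[:, 2] ** 2 + 0.1 * rng.normal(size=500)
ws = fit_bagged(X, y)
print("train MSE:", np.mean((predict_bagged(X, ws) - y) ** 2))

Note that the lifted dimension grows rapidly with the input dimension and the polynomial degree, which is presumably what motivates a scalable, bagging-type training procedure; this sketch only illustrates the mechanics on a toy problem.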

