
dc.contributor.advisor: Sra, Suvrit
dc.contributor.advisor: Jadbabaie, Ali
dc.contributor.author: Zhang, Jingzhao
dc.date.accessioned: 2022-06-15T13:12:09Z
dc.date.available: 2022-06-15T13:12:09Z
dc.date.issued: 2022-02
dc.date.submitted: 2022-03-04T20:47:56.512Z
dc.identifier.uri: https://hdl.handle.net/1721.1/143318
dc.description.abstract: Machine learning is a technology developed for extracting predictive models from data so as to be able to generalize predictions to unobserved data. The process of selecting a good model based on a known dataset requires optimization. In particular, an optimization procedure generates a variable in a constraint set to minimize an objective. This process subsumes many machine learning pipelines including neural network training, which will be our main testing ground for theoretical analyses in this thesis. Among different kinds of optimization algorithms, gradient methods have become the dominant algorithms in deep learning due to their scalability to high dimensions and their natural bound to backpropagation. However, despite the popularity of gradient-based algorithms, our understanding of such algorithms in a machine learning context from a theory perspective seems far from sufficient. On one hand, within the current theory framework, most upper and lower bounds are closed, and the theory problems seem solved. On the other hand, the theoretical analyses hardly generate empirically faster algorithms than those found by practitioners. In this thesis, we review the theoretical analyses of gradient methods, and point out the discrepancy between theory and practice. We then provide an explanation for why the mismatch happens and propose some initial solutions by developing theoretical analyses driven by empirical observations.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Optimization Theory and Machine Learning Practice: Mind the Gap
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy

