MIT Libraries: DSpace@MIT


Understanding and Overcoming Optimization Barriers in Non-convex and Non-smooth Machine Learning

Author(s)
Gatmiry, Khashayar
Download: Thesis PDF (8.865Mb)
Advisor
Jegelka, Stefanie
Kelner, Jonathan A.
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
At their core, our machine learning systems are trained by solving an optimization problem, where the goal is to minimize a predefined objective function by adjusting model parameters based on the data. Despite the wealth of structure and prior knowledge present in the data and feedback, our training methods remain relatively simple and independent of this structure. In spite of, or perhaps because of, this simplicity, these methods are often lacking in theoretical guarantees. To design machine learning algorithms that are less data-hungry while ensuring theoretical guarantees on both computational efficiency and output validity, it is essential to better understand and leverage the rich structure within the learning setup and the data distribution, e.g. by altering the geometry of the solution space or adjusting the objective function to induce a more effective learning procedure. This approach moves beyond classical algorithm design, which focuses primarily on handling worst-case instances. This thesis investigates the optimization landscape of central learning problems and develops geometric and analytic schemes adapted to their structure, leading to algorithms with superior computational and statistical performance. In addition, it seeks to advance our mathematical understanding of the principles underlying the success of deep learning.
Date issued
2025-09
URI
https://hdl.handle.net/1721.1/164603
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses
