| dc.contributor.advisor | Jegelka, Stefanie | |
| dc.contributor.advisor | Kelner, Jonathan A. | |
| dc.contributor.author | Gatmiry, Khashayar | |
| dc.date.accessioned | 2026-01-20T19:48:03Z | |
| dc.date.available | 2026-01-20T19:48:03Z | |
| dc.date.issued | 2025-09 | |
| dc.date.submitted | 2025-09-15T14:40:36.700Z | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/164603 | |
| dc.description.abstract | At their core, our machine learning systems are trained by solving an optimization problem, where the goal is to minimize a predefined objective function by adjusting model parameters based on the data. Despite the wealth of structure and prior knowledge present in the data and feedback, our training methods remain relatively simple and independent of this structure. In spite of, or perhaps because of, this simplicity, these methods often lack theoretical guarantees. To design machine learning algorithms that are less data-hungry while ensuring theoretical guarantees on both computational efficiency and output validity, it is essential to better understand and leverage the rich structure within the learning setup and the data distribution, for example by altering the geometry of the solution space or adjusting the objective function to induce a more effective learning procedure. This approach moves beyond classical algorithm design, which focuses primarily on handling worst-case instances. This thesis investigates the optimization landscape of central learning problems and develops geometric and analytic schemes adapted to their structure, leading to algorithms with superior computational and statistical performance. In addition, it seeks to advance our mathematical understanding of the principles underlying the success of deep learning. | |
| dc.publisher | Massachusetts Institute of Technology | |
| dc.rights | In Copyright - Educational Use Permitted | |
| dc.rights | Copyright retained by author(s) | |
| dc.rights.uri | https://rightsstatements.org/page/InC-EDU/1.0/ | |
| dc.title | Understanding and Overcoming Optimization Barriers in Non-convex and Non-smooth Machine Learning | |
| dc.type | Thesis | |
| dc.description.degree | Ph.D. | |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
| mit.thesis.degree | Doctoral | |
| thesis.degree.name | Doctor of Philosophy | |