
dc.contributor.advisor      Ozdaglar, Asuman
dc.contributor.author       Pattathil, Sarath
dc.date.accessioned         2023-11-02T20:23:34Z
dc.date.available           2023-11-02T20:23:34Z
dc.date.issued              2023-09
dc.date.submitted           2023-09-21T14:26:38.320Z
dc.identifier.uri           https://hdl.handle.net/1721.1/152869
dc.description.abstract     This thesis explores minimax formulations of machine learning and multi-agent learning problems, focusing on the optimization and generalization performance of the associated algorithms. The first part of the thesis studies the smooth convex-concave minimax problem, providing a unified analysis of widely used algorithms such as Extra-Gradient (EG) and Optimistic Gradient Descent Ascent (OGDA), whose convergence behavior had not been systematically understood. We derive convergence rates for these algorithms in the convex-concave setting and show that they are effective because they approximate the Proximal Point (PP) method, which converges to the solution at a fast rate but is impractical to implement (illustrative update rules for these methods are sketched after this record).

The next chapter extends the study to nonconvex-nonconcave problems. These problems are challenging in general: a solution may not be well defined, and even when a solution exists, computing it may be intractable. We identify a class of nonconvex-nonconcave problems that do have well-defined and computationally tractable solutions, and, leveraging the concepts developed in the first chapter, we design algorithms that solve this special class efficiently.

The final part of the thesis addresses generalization. In many applications, such as GANs and adversarial training, the objective function for finding the saddle point can be written as an expected value over the data distribution. Since we often lack direct access to this distribution, we solve the empirical problem instead, which averages over the available dataset. The final chapter evaluates the quality of solutions to the empirical problem relative to the original population problem. Existing metrics used to assess generalization in the minimax setting, such as the primal risk, are shown to be inadequate for capturing the generalization of minimax learners. This motivates a new metric, the primal gap, which overcomes these limitations; this metric is then used to investigate the generalization performance of popular algorithms such as Gradient Descent Ascent (GDA) and Gradient Descent-Max (GDMax).
dc.publisher                Massachusetts Institute of Technology
dc.rights                   In Copyright - Educational Use Permitted
dc.rights                   Copyright retained by author(s)
dc.rights.uri               https://rightsstatements.org/page/InC-EDU/1.0/
dc.title                    Optimization and Generalization of Minimax Algorithms
dc.type                     Thesis
dc.description.degree       Ph.D.
dc.contributor.department   Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree           Doctoral
thesis.degree.name          Doctor of Philosophy
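
For concreteness, the algorithms named in the abstract can be summarized by their update rules. The following is an illustrative sketch in standard notation, not quoted from the thesis: writing the minimax problem as \min_x \max_y f(x, y), with z = (x, y), gradient operator F(z) = (\nabla_x f(x, y), -\nabla_y f(x, y)), and step size \eta > 0:

GDA:    z_{k+1}   = z_k - \eta F(z_k)
PP:     z_{k+1}   = z_k - \eta F(z_{k+1})        % implicit step, hence impractical
EG:     z_{k+1/2} = z_k - \eta F(z_k),  \quad  z_{k+1} = z_k - \eta F(z_{k+1/2})
OGDA:   z_{k+1}   = z_k - 2\eta F(z_k) + \eta F(z_{k-1})

Both EG and OGDA can be read as explicit approximations of the implicit PP step, which is the intuition the abstract invokes. For the generalization metrics, a plausible formalization in my own notation (the thesis's exact definitions may differ) is

primal risk:  r(x)      = \max_y \, \mathbb{E}_{\xi \sim \mathcal{D}} [ f(x, y; \xi) ]
primal gap:   \Delta(x) = r(x) - \min_{x'} r(x')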

