dc.contributor.advisor | Ozdaglar, Asuman | |
dc.contributor.advisor | Farina, Gabriele | |
dc.contributor.author | Liu, Mingyang | |
dc.date.accessioned | 2025-03-27T16:58:22Z | |
dc.date.available | 2025-03-27T16:58:22Z | |
dc.date.issued | 2025-02 | |
dc.date.submitted | 2025-03-04T17:28:56.371Z | |
dc.identifier.uri | https://hdl.handle.net/1721.1/158922 | |
dc.description.abstract | In this thesis, we explore, from a theoretical perspective, the design of algorithms capable of handling large games whose state space is too large for strategies to be stored in tabular form. Specifically, we focus on developing algorithms suitable for deep reinforcement learning in two-player zero-sum extensive-form games. Three properties are critical for effective deep multi-agent reinforcement learning: (last/best)-iterate convergence, efficient use of stochastic trajectory feedback, and theoretically sound avoidance of importance-sampling corrections. Chapter 3 introduces Regularized Optimistic Mirror Descent (Reg-OMD), which provably achieves linear last-iterate convergence to the Nash equilibrium (NE). Chapter 4 shows that algorithms based on regret decomposition enjoy best-iterate convergence to the NE. Chapter 5 proposes Q-value based Regret Minimization (QFR), which achieves all three properties simultaneously. | |
dc.publisher | Massachusetts Institute of Technology | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) | |
dc.rights | Copyright retained by author(s) | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.title | On Solving Larger Games: Designing New Algorithms Adaptable to Deep Reinforcement Learning | |
dc.type | Thesis | |
dc.description.degree | S.M. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
mit.thesis.degree | Master | |
thesis.degree.name | Master of Science in Electrical Engineering and Computer Science | |