On Solving Larger Games: Designing New Algorithms Adaptable to Deep Reinforcement Learning
Author(s)
Liu, Mingyang
Advisor
Ozdaglar, Asuman
Farina, Gabriele
Abstract
In this thesis, we study, from a theoretical perspective, the design of algorithms capable of handling large games whose state spaces are too large for strategies to be stored in tabular form. Specifically, we focus on developing algorithms suitable for deep reinforcement learning in two-player zero-sum extensive-form games. Three properties are critical for effective deep multi-agent reinforcement learning: (last/best) iterate convergence, efficient use of stochastic trajectory feedback, and theoretically sound avoidance of importance-sampling corrections. Chapter 3 introduces Regularized Optimistic Mirror Descent (Reg-OMD), which provably achieves linear last-iterate convergence to the Nash equilibrium (NE). Chapter 4 shows that algorithms based on regret decomposition enjoy best-iterate convergence to the NE. Chapter 5 proposes Q-value based Regret Minimization (QFR), which achieves all three properties simultaneously.
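To make the last-iterate convergence property concrete, the sketch below illustrates the general idea of combining optimism and regularization in mirror descent on a small two-player zero-sum matrix game. It is not the thesis's Reg-OMD algorithm for extensive-form games; the payoff matrix, step size eta, and regularization weight tau are hypothetical choices for illustration.

```python
import numpy as np

# Illustrative sketch only (not the thesis's Reg-OMD): optimistic
# multiplicative-weights updates on an entropy-regularized zero-sum
# matrix game min_x max_y x^T A y + tau * <x, log x> - tau * <y, log y>.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))      # hypothetical payoff matrix
eta, tau, T = 0.1, 0.05, 5000        # step size, regularization weight, iterations

x = np.ones(3) / 3                   # row player's mixed strategy (minimizer)
y = np.ones(3) / 3                   # column player's mixed strategy (maximizer)
gx_prev, gy_prev = A @ y, A.T @ x    # previous gradients, used by the optimistic step

def omd_step(p, grad, grad_prev, sign, eta, tau):
    # Optimistic mirror descent with the entropy mirror map:
    # the predicted gradient 2*grad - grad_prev replaces the plain gradient,
    # and the -eta*tau*log(p) term comes from the entropy regularization.
    logits = np.log(p) + sign * eta * (2 * grad - grad_prev) - eta * tau * np.log(p)
    w = np.exp(logits - logits.max())
    return w / w.sum()

for t in range(T):
    gx, gy = A @ y, A.T @ x
    x = omd_step(x, gx, gx_prev, sign=-1.0, eta=eta, tau=tau)  # minimizer descends
    y = omd_step(y, gy, gy_prev, sign=+1.0, eta=eta, tau=tau)  # maximizer ascends
    gx_prev, gy_prev = gx, gy

# Duality gap of the *last* iterate (not an averaged strategy); regularized
# optimistic updates drive this to roughly tau-level error at a fast rate.
gap = (A.T @ x).max() - (A @ y).min()
print("last-iterate duality gap:", gap)
```

The regularization term pulls each player toward the uniform strategy, which is what makes the last iterate itself converge (to the regularized equilibrium, within O(tau) of an NE) rather than only the time-averaged strategy, as in unregularized no-regret dynamics.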
Date issued
2025-02
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology