On Solving Larger Games: Designing New Algorithms Adaptable to Deep Reinforcement Learning
Author(s)
Liu, Mingyang
Advisor
Ozdaglar, Asuman
Farina, Gabriele
Abstract
In this thesis, we study, from a theoretical perspective, the design of algorithms capable of handling large games whose state spaces are too large for strategies to be stored in tabular form. Specifically, we focus on developing algorithms suitable for deep reinforcement learning in two-player zero-sum extensive-form games. Three properties are critical for effective deep multi-agent reinforcement learning: (last/best) iterate convergence, efficient use of stochastic trajectory feedback, and theoretically sound avoidance of importance-sampling corrections. Chapter 3 introduces Regularized Optimistic Mirror Descent (Reg-OMD), which provably achieves linear last-iterate convergence to the Nash equilibrium (NE). Chapter 4 shows that algorithms based on regret decomposition enjoy best-iterate convergence to the NE. Chapter 5 proposes Q-value based Regret Minimization (QFR), which achieves all three properties simultaneously.
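To make the last-iterate convergence property concrete, the sketch below illustrates the general idea of combining optimism and regularization in mirror descent on a small two-player zero-sum matrix game. It is not the thesis's Reg-OMD algorithm for extensive-form games; the payoff matrix, step size eta, and regularization weight tau are hypothetical choices for illustration.

```python
import numpy as np

# Illustrative sketch only (not the thesis's Reg-OMD): optimistic
# multiplicative-weights updates on an entropy-regularized zero-sum
# matrix game min_x max_y x^T A y + tau * <x, log x> - tau * <y, log y>.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))      # hypothetical payoff matrix
eta, tau, T = 0.1, 0.05, 5000        # step size, regularization weight, iterations

x = np.ones(3) / 3                   # row player's mixed strategy (minimizer)
y = np.ones(3) / 3                   # column player's mixed strategy (maximizer)
gx_prev, gy_prev = A @ y, A.T @ x    # previous gradients, used by the optimistic step

def omd_step(p, grad, grad_prev, sign, eta, tau):
    # Optimistic mirror descent with the entropy mirror map:
    # the predicted gradient 2*grad - grad_prev replaces the plain gradient,
    # and the -eta*tau*log(p) term comes from the entropy regularization.
    logits = np.log(p) + sign * eta * (2 * grad - grad_prev) - eta * tau * np.log(p)
    w = np.exp(logits - logits.max())
    return w / w.sum()

for t in range(T):
    gx, gy = A @ y, A.T @ x
    x = omd_step(x, gx, gx_prev, sign=-1.0, eta=eta, tau=tau)  # minimizer descends
    y = omd_step(y, gy, gy_prev, sign=+1.0, eta=eta, tau=tau)  # maximizer ascends
    gx_prev, gy_prev = gx, gy

# Duality gap of the *last* iterate (not an averaged strategy); regularized
# optimistic updates drive this to roughly tau-level error at a fast rate.
gap = (A.T @ x).max() - (A @ y).min()
print("last-iterate duality gap:", gap)
```

The regularization term pulls each player toward the uniform strategy, which is what makes the last iterate itself converge (to the regularized equilibrium, within O(tau) of an NE) rather than only the time-averaged strategy, as in unregularized no-regret dynamics.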
Date issued
2025-02
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology