DSpace@MIT
No-Regret Learning in General Games

Author(s)
Fishelson, Maxwell K.
Download
Thesis PDF (868.1Kb)
Advisor
Daskalakis, Constantinos
Terms of use
In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
This thesis investigates the regret performance of no-regret learning algorithms in the competitive, though not fully adversarial, environment of games. We establish exponential improvements on the previously best-known external and internal regret bounds for these settings. We show that Optimistic Hedge – a common variant of multiplicative-weights-updates with recency bias – attains poly(log T) regret in multi-player general-sum games. In particular, when every player of the game uses Optimistic Hedge to iteratively update her strategy in response to the history of play so far, then after T rounds of interaction, each player experiences total regret that is poly(log T). Our bound improves exponentially on the O(T¹ᐟ²) regret attainable by standard no-regret learners in games, the O(T¹ᐟ⁴) regret attainable by no-regret learners with recency bias [Syr+15], and the O(T¹ᐟ⁶) bound that was recently shown for Optimistic Hedge in the special case of two-player games [CP20]. A corollary of our bound is that Optimistic Hedge converges to coarse correlated equilibrium in general games at a rate of [formula]. We then extend this result from external regret to internal and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium at the rate of [formula]. This substantially improves over the prior best rate of convergence for correlated equilibria of O(T⁻³ᐟ⁴) due to Chen and Peng (NeurIPS ’20), and it is optimal up to polylogarithmic factors in T. The results presented here originate from my works [DFG21] and [Ana+22].
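The algorithm the abstract studies, Optimistic Hedge, is standard multiplicative-weights (Hedge) with a recency-bias twist: the most recent loss vector is counted twice, acting as an optimistic prediction of the next round's loss. The following is a minimal illustrative sketch, not the thesis's implementation; the function name, step size, and full-information loss feedback are assumptions made for the example.

```python
import numpy as np

def optimistic_hedge(losses, eta):
    """Illustrative sketch of Optimistic Hedge (full-information setting).

    losses : (T, n) array of per-action losses observed over T rounds
    eta    : step size (assumed constant here for simplicity)
    Returns the (T, n) sequence of mixed strategies played.
    """
    T, n = losses.shape
    cum = np.zeros(n)    # cumulative loss of each action so far
    last = np.zeros(n)   # recency bias: last round's loss, counted again
    plays = np.empty((T, n))
    for t in range(T):
        # Hedge weights on (cumulative + predicted next) loss,
        # computed via a numerically stabilized softmax
        logits = -eta * (cum + last)
        w = np.exp(logits - logits.max())
        plays[t] = w / w.sum()
        cum += losses[t]
        last = losses[t]
    return plays
```

With constant losses favoring one action, the strategy concentrates on that action, as any no-regret learner must; the thesis's results concern the much subtler regime where every player runs these dynamics simultaneously.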
Date issued
2023-02
URI
https://hdl.handle.net/1721.1/150227
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
