Mitigating Social Dilemmas in Multi-Agent Reinforcement Learning with Formal Contracting

Author(s)
Christoffersen, Phillip Johannes Kerr
Download
Thesis PDF (1.021 MB)
Advisor
Hadfield-Menell, Dylan
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
As society deploys more and more sophisticated artificial intelligence (AI) agents, it will be increasingly necessary for such agents, while pursuing their own objectives, to coexist in common environments in the physical or digital worlds. This may pose a challenge if the agents' objectives conflict with each other; in the worst case, this can prevent any given agent from fulfilling its own objectives (e.g., self-driving cars in a traffic jam). Situations such as these are termed social dilemmas. This thesis demonstrates that providing reinforcement learning (RL) agents with the software infrastructure to precommit to zero-sum incentive modifications (1) induces maximal social welfare in theory, and (2) when implemented with deep multi-agent reinforcement learning (MARL), also avoids social dilemmas in practice. Specifically, a novel algorithmic framework is proposed, termed formal contracting, which is formalized, studied game-theoretically, and investigated empirically. In formal contracting, before engaging in a given shared environment, agents are given the opportunity to negotiate a binding modification to all agents' objective functions, in order to provide incentives for the optimal use of shared resources. Within this framework, at all subgame-perfect equilibria (SPE), agents maximize social welfare, that is, the sum of all agents' objectives in the original environment. Moreover, studies in simple domains, such as the classic prisoner's dilemma, and in more complex ones, such as dynamic simulations of pollution management, show that this algorithmic framework can be implemented in MARL and does indeed lead to outcomes with superior welfare in social dilemmas. The thesis concludes with discussions of related work, limitations of the approach, and future work, particularly involving scaling this methodology to larger problem instances containing more agents than those studied.
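
A minimal sketch of the contracting idea on the classic prisoner's dilemma, under illustrative assumptions (the payoff values and transfer size below are not taken from the thesis): a zero-sum transfer, agreed on before play, penalizes unilateral defection and shifts the game's only pure equilibrium from mutual defection to mutual cooperation, while leaving total welfare unchanged in every outcome.

    # Sketch: a zero-sum transfer contract applied to a prisoner's dilemma.
    # Actions: 0 = cooperate, 1 = defect. Payoffs are (row player, column player).
    import itertools

    BASE = {
        (0, 0): (3, 3),
        (0, 1): (0, 5),
        (1, 0): (5, 0),
        (1, 1): (1, 1),
    }

    def contracted_payoffs(transfer):
        """Apply a zero-sum contract: whenever one agent defects against a
        cooperator, it pays `transfer` to the other agent. The sum of payoffs
        (social welfare) is unchanged in every outcome."""
        modified = {}
        for (a1, a2), (r1, r2) in BASE.items():
            t = transfer * ((a1 == 1 and a2 == 0) - (a2 == 1 and a1 == 0))
            modified[(a1, a2)] = (r1 - t, r2 + t)
        return modified

    def pure_nash_equilibria(payoffs):
        """Enumerate pure-strategy Nash equilibria of a 2x2 game."""
        eqs = []
        for a1, a2 in itertools.product((0, 1), repeat=2):
            r1, r2 = payoffs[(a1, a2)]
            best1 = all(r1 >= payoffs[(b, a2)][0] for b in (0, 1))
            best2 = all(r2 >= payoffs[(a1, b)][1] for b in (0, 1))
            if best1 and best2:
                eqs.append((a1, a2))
        return eqs

    print(pure_nash_equilibria(BASE))                   # [(1, 1)]: mutual defection
    print(pure_nash_equilibria(contracted_payoffs(3)))  # [(0, 0)]: mutual cooperation

The thesis studies this mechanism in the general MARL setting with learned policies and negotiated contracts; the snippet only illustrates why a zero-sum modification can remove the dilemma without changing the social welfare of any outcome.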
Date issued
2024-02
URI
https://hdl.handle.net/1721.1/153795
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
