Learning Effective and Human-like Policies for Strategic, Multi-Agent Games

Author(s)
Jacob, Athul Paul
Thesis PDF (1.951 MB)
Advisor
Brown, Noam
Andreas, Jacob
Terms of use
In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
We consider the task of building effective but human-like policies in multi-agent decision-making problems. Imitation learning (IL) is effective at predicting human actions but may not match the strength of expert humans, while reinforcement learning (RL) and search algorithms lead to strong performance but may produce policies that are difficult for humans to understand and coordinate with. We first study the problem of producing human-like communication in latent language policies (LLPs), in which high-level instructor and low-level executor agents communicate using natural language. While LLPs can solve long-horizon RL problems, past work has found that LLP training produces agents that use messages in ways inconsistent with their natural language meanings. We introduce a sample-efficient multitask training scheme that yields human-like communication in a complex real-time strategy game. We then turn to the problem of producing human-like decision-making in a more general class of policies. We develop a regret-minimization algorithm for imperfect-information games that can leverage human demonstrations. We show that using this algorithm for search in no-press Diplomacy yields a policy that matches the human-likeness of IL while achieving much higher reward.

This thesis is based on the papers "Multitasking Inhibits Semantic Drift," published at NAACL 2021, and "Modeling Strong and Human-Like Gameplay with KL-Regularized Search," currently under review for publication at ICML 2022. The contents of these papers are used with the permission of co-authors David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Mike Lewis, Noam Brown, and Jacob Andreas.
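To make the KL-regularized idea concrete, below is a minimal sketch of KL-regularized action selection of the general kind the abstract describes: given an anchor policy obtained from imitation learning and value estimates for each action, choose the policy that maximizes expected value minus a KL penalty toward the anchor. The closed-form solution pi(a) ∝ anchor(a) * exp(Q(a) / lambda) is a standard result for this objective; the anchor probabilities, Q-values, and lambda below are hypothetical, and the thesis's actual search procedure may differ in its details.

```python
import numpy as np

def kl_regularized_policy(q_values, anchor_probs, lam):
    """Return the policy maximizing E_pi[Q] - lam * KL(pi || anchor).

    The maximizer has the closed form pi(a) ∝ anchor(a) * exp(Q(a) / lam):
    a large lam keeps the policy close to the human-like anchor, while a
    small lam approaches the greedy, reward-maximizing policy.
    """
    logits = np.log(np.asarray(anchor_probs)) + np.asarray(q_values) / lam
    logits -= logits.max()            # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy example with three actions (all numbers are illustrative).
anchor = np.array([0.6, 0.3, 0.1])    # hypothetical imitation-learned prior
q = np.array([0.0, 1.0, 0.2])         # hypothetical expected returns
print(kl_regularized_policy(q, anchor, lam=1.0))   # blends anchor and values
print(kl_regularized_policy(q, anchor, lam=0.05))  # nearly greedy on Q
```

Varying lambda trades off strength against human-likeness: the same machinery interpolates between imitating the anchor exactly and ignoring it entirely.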
Date issued
2022-05
URI
https://hdl.handle.net/1721.1/144569
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
