Finding friend and foe in multi-agent games
Author(s)
Serrino, Jack; Parkes, DC; Kleiman-Weiner, Max; Tenenbaum, Joshua B
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
AI for multi-agent games such as Go, Poker, and Dota has made great strides in recent years. Yet none of these games addresses the real-life challenge of cooperating with teammates whose identities and intentions are unknown and uncertain. This challenge is a key game mechanism in hidden role games. Here we develop DeepRole, a multi-agent reinforcement-learning agent that we test on The Resistance: Avalon, the most popular hidden role game. DeepRole combines counterfactual regret minimization (CFR) with deep value networks trained through self-play. Our algorithm integrates deductive reasoning into vector-form CFR to reason about joint beliefs and deduce partially observable actions. We augment deep value networks with constraints that yield interpretable representations of win probabilities. These innovations enable DeepRole to scale to the full Avalon game. Empirical game-theoretic methods show that DeepRole outperforms other hand-crafted and learned agents in five-player Avalon. DeepRole also played with and against human players on the web in hybrid human-agent teams; we find that it outperforms human players as both a cooperator and a competitor.
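The abstract names two ingredients that are easy to illustrate in miniature: the regret-matching update at the core of CFR, and a deductive belief update that eliminates hidden-role assignments inconsistent with an observed public action. The sketch below is illustrative only and is not the authors' implementation; the names RegretMatcher, deductive_update, and the toy numbers are assumptions for the example.

```python
# Illustrative sketch (not the authors' code) of two ideas from the abstract:
# (1) regret matching, the per-information-set update used in CFR iterations, and
# (2) a deductive belief update that zeroes joint role assignments ruled out by
#     an observed public action and renormalizes the remaining probability mass.
import numpy as np


class RegretMatcher:
    """Cumulative-regret tracker for one information set."""

    def __init__(self, num_actions: int):
        self.cum_regret = np.zeros(num_actions)
        self.cum_strategy = np.zeros(num_actions)

    def strategy(self) -> np.ndarray:
        # Play in proportion to positive cumulative regret; uniform if none.
        pos = np.maximum(self.cum_regret, 0.0)
        total = pos.sum()
        return pos / total if total > 0 else np.full(len(pos), 1.0 / len(pos))

    def update(self, action_values: np.ndarray, reach_prob: float) -> None:
        # Accumulate counterfactual regret of each action vs. the current mix.
        sigma = self.strategy()
        node_value = float(sigma @ action_values)
        self.cum_regret += action_values - node_value
        self.cum_strategy += reach_prob * sigma

    def average_strategy(self) -> np.ndarray:
        total = self.cum_strategy.sum()
        if total == 0:
            return np.full_like(self.cum_strategy, 1.0 / len(self.cum_strategy))
        return self.cum_strategy / total


def deductive_update(belief: np.ndarray, consistent: np.ndarray) -> np.ndarray:
    """belief[i]: probability of joint role assignment i.
    consistent[i]: 1.0 if assignment i could have produced the observed action."""
    filtered = belief * consistent
    total = filtered.sum()
    return filtered / total if total > 0 else belief  # keep prior if all ruled out


if __name__ == "__main__":
    rm = RegretMatcher(num_actions=3)
    rm.update(action_values=np.array([1.0, 0.0, -1.0]), reach_prob=1.0)
    print("strategy after one update:", rm.strategy())

    # Toy belief over 4 role assignments (full 5-player Avalon has many more).
    belief = np.full(4, 0.25)
    consistent = np.array([1.0, 1.0, 0.0, 1.0])  # assignment 2 contradicts an observed vote
    print("belief after deduction:", deductive_update(belief, consistent))
```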
Date issued
2019-01
Department
Center for Brains, Minds, and Machines; Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Journal
Advances in Neural Information Processing Systems
Citation
Serrino, J., Parkes, D. C., Kleiman-Weiner, M., and Tenenbaum, J. B. 2019. "Finding friend and foe in multi-agent games." Advances in Neural Information Processing Systems, 32.
Version: Final published version