DSpace@MIT

Warm-Starting Networks for Sample-Efficient Continuous Adaptation to Parameter Perturbations in Multi-Agent Reinforcement Learning

Author(s)
Huang, Vivian
Download
Thesis PDF (3.259 MB)
Advisor
How, Jonathan P.
Terms of use
In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Deep reinforcement learning (RL) methods have made significant advances in recent years toward mastering challenging problems. Because many real-world systems involve multiple agents interacting in a shared environment, multi-agent reinforcement learning (MARL) is a particularly active subfield of RL. Learning robust multi-agent policies in real-time strategy games, such as StarCraft II, is an important objective. In particular, agents that can quickly adapt to perturbations in the game rules, and take advantage of those changes, can yield insights into properties such as game balance. However, progress in MARL research faces a major challenge: high sample complexity makes learning a complicated task from scratch computationally intensive. This thesis therefore details the design and implementation of a MARL framework for training robust agents that adapt to perturbations in a multi-agent, StarCraft II-based real-time strategy game, so that the features that most affect game balance can be identified. The framework also includes an incremental warm-start approach that reduces the computational cost of agent adaptation by initializing each adapted policy from a previously trained one rather than from scratch. The results show that our approach achieves up to 97% improvement in computational time compared to the standard approach of training the policy with a random initialization.
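The abstract gives no implementation details, so the following is only a minimal sketch of the general warm-start idea it describes: initializing the policy for a perturbed game from the weights of a policy already trained on the original game, instead of from a random initialization. Everything here (PolicyNet, cold_start, warm_start, the dimensions) is a hypothetical illustration in PyTorch, not the thesis's actual code.

```python
# Hypothetical sketch of warm-starting a policy for adaptation to a
# perturbed environment, versus retraining from scratch.
# Names and dimensions are illustrative, not taken from the thesis.
import copy

import torch
import torch.nn as nn


class PolicyNet(nn.Module):
    """A small actor network mapping observations to action logits."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.head(self.body(obs))


def cold_start(obs_dim: int, n_actions: int) -> PolicyNet:
    """Baseline: a freshly (randomly) initialized policy."""
    return PolicyNet(obs_dim, n_actions)


def warm_start(source: PolicyNet) -> PolicyNet:
    """Warm start: copy the weights of a policy already trained on the
    original game, then fine-tune the copy on the perturbed game.
    Assumes the perturbation changes game parameters (rules), not the
    observation or action spaces, so the old weights remain a valid
    initialization."""
    return copy.deepcopy(source)


if __name__ == "__main__":
    base = cold_start(obs_dim=32, n_actions=8)
    # ... train `base` to convergence on the original game here ...
    adapted = warm_start(base)
    # ... fine-tune `adapted` on the perturbed game; this typically
    # needs far fewer samples than training cold_start(...) anew.
```

Under these assumptions, fine-tuning the warm-started policy on each perturbed game replaces a full from-scratch training run, which is the kind of saving the reported 97% computational-time figure refers to.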
Date issued
2022-02
URI
https://hdl.handle.net/1721.1/143288
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
