Winning at Pokémon Random Battles Using Reinforcement Learning
Author(s)
Wang, Jett
DownloadThesis PDF (1.299Mb)
Advisor
Tenenbaum, Joshua
Terms of use
Metadata
Show full item recordAbstract
Pokémon battling is a challenging domain for reinforcement learning techniques, due to the massive state space, stochasticity, and partial observability. We demonstrate an agent which employs a Monte Carlo Tree Search informed by a actor-critic network trained using Proximal Policy Optimization with experience collected through self-play. The agent peaked at rank 8 (1693 Elo) on the official Pokémon Showdown gen4randombattles ladder, which is the best known performance by any non-human agent for this format. This strong showing lays the foundation for superhuman performance in Pokémon and other complex turn-based games of imperfect information, expanding the viability of methods which have historically been used in perfect-information games.
Date issued
2024-02Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer SciencePublisher
Massachusetts Institute of Technology