dc.contributor.advisor | Tenenbaum, Joshua | |
dc.contributor.author | Wang, Jett | |
dc.date.accessioned | 2024-03-21T19:13:48Z | |
dc.date.available | 2024-03-21T19:13:48Z | |
dc.date.issued | 2024-02 | |
dc.date.submitted | 2024-03-04T16:38:11.408Z | |
dc.identifier.uri | https://hdl.handle.net/1721.1/153888 | |
dc.description.abstract | Pokémon battling is a challenging domain for reinforcement learning techniques due to its massive state space, stochasticity, and partial observability. We demonstrate an agent which employs Monte Carlo Tree Search informed by an actor-critic network trained using Proximal Policy Optimization, with experience collected through self-play. The agent peaked at rank 8 (1693 Elo) on the official Pokémon Showdown gen4randombattles ladder, the best known performance by any non-human agent for this format. This strong showing lays the foundation for superhuman performance in Pokémon and other complex turn-based games of imperfect information, expanding the viability of methods which have historically been used in perfect-information games. | |
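The abstract describes a tree search whose action selection is guided by a policy network trained with PPO. As an illustration only, the sketch below shows one common way such guidance is wired in, a PUCT-style selection rule that mixes the network's prior over moves with the empirical value of each child; the names (Node, select_child, c_puct) and the specific formula are assumptions for this sketch, not details taken from the thesis.

    import math

    class Node:
        """One battle state in the search tree, holding the prior assigned by the policy head."""
        def __init__(self, prior):
            self.prior = prior          # P(s, a) from the actor-critic policy network
            self.visit_count = 0
            self.value_sum = 0.0
            self.children = {}          # action -> Node

        def value(self):
            # Mean backed-up value; 0 for unvisited children.
            return self.value_sum / self.visit_count if self.visit_count else 0.0

    def select_child(node, c_puct=1.5):
        """PUCT rule: trade off the network's prior against the empirical value estimate."""
        total_visits = sum(child.visit_count for child in node.children.values())
        best_score, best_action, best_child = -float("inf"), None, None
        for action, child in node.children.items():
            ucb = child.value() + c_puct * child.prior * math.sqrt(total_visits + 1) / (1 + child.visit_count)
            if ucb > best_score:
                best_score, best_action, best_child = ucb, action, child
        return best_action, best_child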
dc.publisher | Massachusetts Institute of Technology | |
dc.rights | In Copyright - Educational Use Permitted | |
dc.rights | Copyright retained by author(s) | |
dc.rights.uri | https://rightsstatements.org/page/InC-EDU/1.0/ | |
dc.title | Winning at Pokémon Random Battles Using Reinforcement Learning | |
dc.type | Thesis | |
dc.description.degree | M.Eng. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
mit.thesis.degree | Master | |
thesis.degree.name | Master of Engineering in Electrical Engineering and Computer Science | |