Inferring Structured World Models from Videos
Author(s)
Kapur, Shreyas
Thesis PDF (2.044 MB)
Advisor
Tenenbaum, Joshua B.
Abstract
Advances in reinforcement learning have allowed agents to learn a variety of board games and video games at superhuman levels. Unlike humans, who can generalize to a wide range of tasks with very little experience, these algorithms typically need a vast number of experience replays to perform at the same level. In this thesis, we propose a model-based reinforcement learning approach that represents the environment using an explicit symbolic model in the form of a domain-specific language (DSL), which describes the world as a set of discrete objects with underlying latent properties that govern their dynamical interactions. We present a novel, neurally guided, online inference technique that recovers this structured world representation from raw video observations, intended for use in downstream model-based planning. We qualitatively evaluate our inference performance on classical Atari games, as well as on physics-based mobile games.
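To make the abstract's notion of a structured world model concrete, the sketch below shows one plausible (hypothetical, not the thesis's actual DSL) encoding: the world as a set of discrete objects, each carrying observable state plus latent properties (e.g. mass) that drive a simple dynamics rule. All names and the toy update rule are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorldObject:
    """A discrete object with observable state and latent properties."""
    name: str
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0
    # Latent properties (e.g. mass) that govern dynamics; hypothetical.
    latents: Dict[str, float] = field(default_factory=dict)


@dataclass
class WorldState:
    """A symbolic world model: a set of objects plus a global latent."""
    objects: List[WorldObject]
    gravity: float = 0.1  # illustrative global latent parameter

    def step(self) -> "WorldState":
        """Advance every object one tick under a toy dynamics rule."""
        stepped = []
        for o in self.objects:
            mass = o.latents.get("mass", 1.0)
            vy = o.vy + self.gravity * mass  # latent mass shapes the update
            stepped.append(
                WorldObject(o.name, o.x + o.vx, o.y + vy, o.vx, vy,
                            dict(o.latents))
            )
        return WorldState(stepped, self.gravity)


# Usage: one object, one simulated tick of the symbolic model.
ball = WorldObject("ball", x=0.0, y=10.0, vx=1.0, latents={"mass": 2.0})
world = WorldState([ball])
nxt = world.step()
```

In a setup like this, inference from video would amount to recovering both the object set and each object's latents from pixels; planning then rolls the symbolic `step` forward instead of a learned pixel-level model.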
Date issued
2022-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology