Understanding Vision-based Dynamics Models
Recent developments in vision-based dynamics models have helped researchers achieve state-of-the-art results in a number of fields. For instance, in model-based reinforcement learning, vision-based methods perform extremely well on a variety of games and control tasks while using orders of magnitude less data than model-free methods. One example is GameGAN, which learns to simulate the dynamics of observed games solely from visual and action inputs. However, there is very little understanding of how these models work internally. To address this gap, we apply the Network Dissection framework to analyze vision-based dynamics prediction models. We inspect individual trained neurons in the convolutional layers of these models and modify neuron outputs to understand their effect on the learned representation. We also extend the Network Dissection framework theoretically, generalizing it from convolutional layers to fully connected layers. Overall, we provide insight into the node-level workings of dynamics models.
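The abstract does not give implementation details, but the kind of neuron-level intervention it describes (modifying the output of an individual convolutional unit to observe its effect) can be sketched minimally. The function name and the use of a constant-value ablation here are illustrative assumptions, not the thesis's actual procedure:

```python
import numpy as np

def ablate_channel(activations, channel, value=0.0):
    """Intervene on one convolutional unit by replacing its entire
    activation map with a constant.

    activations: array of shape (C, H, W) -- feature maps for one sample.
    channel: index of the unit (channel) to modify.
    value: constant to write into that channel (0.0 = ablation).
    """
    out = activations.copy()
    out[channel] = value
    return out

# Toy feature maps: 4 channels of 3x3 activations.
acts = np.arange(4 * 3 * 3, dtype=float).reshape(4, 3, 3)
ablated = ablate_channel(acts, channel=2)
# Channel 2 is now all zeros; the other channels are untouched.
```

In a real dissection study the modified activations would be fed through the remaining layers of the dynamics model, and the change in the predicted next frame would be attributed to the intervened unit.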
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology