Reinforcement learning with multi-fidelity simulators
Author(s)
Cutler, Mark Johnson; Walsh, Thomas J.; How, Jonathan P.
Terms of use
Creative Commons Attribution-Noncommercial-Share Alike
Abstract
We present a framework for reinforcement learning (RL) in a setting where multiple simulators are available, each with decreasing fidelity to the real-world learning scenario. Our framework is designed to limit the number of samples used in each successively higher-fidelity (and higher-cost) simulator by letting the agent choose to run trajectories at the lowest-fidelity level that still provides it with useful information. The approach transfers state-action Q-values from lower-fidelity models as heuristics for the “Knows What It Knows” (KWIK) family of RL algorithms, and is therefore applicable over a wide range of dynamics and reward representations. Theoretical bounds on the framework's sample complexity are proven, and empirical results are demonstrated on a remote-controlled car with multiple simulators. The approach allows RL algorithms to find near-optimal policies for the real world with fewer expensive real-world samples than previous transfer approaches or learning without simulators.
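The core transfer idea from the abstract can be sketched in a few lines. The snippet below is a hedged illustration, not the paper's algorithm: it uses plain tabular Q-learning in place of the KWIK model-based learners the authors analyze, and the chain environment, the function names (q_learn, multi_fidelity_q, make_chain), and all parameters are invented for this example. It shows only the stated mechanism: Q-values learned in a cheaper, lower-fidelity simulator seed the learner at the next fidelity level, replacing a naive optimistic (Rmax-style) initialization.

```python
import random

def q_learn(env_step, states, actions, q_init, episodes=200,
            alpha=0.1, gamma=0.95, eps=0.1, max_steps=50):
    """Tabular Q-learning seeded with Q-values from a lower-fidelity level."""
    Q = {(s, a): q_init(s, a) for s in states for a in actions}
    for _ in range(episodes):
        s = random.choice(states)                      # arbitrary start state
        for _ in range(max_steps):                     # bounded episode length
            if random.random() < eps:                  # epsilon-greedy choice
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(s, b)])
            s2, r, done = env_step(s, a)
            target = r if done else r + gamma * max(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
            if done:
                break
    return Q

def multi_fidelity_q(simulators, states, actions, rmax=1.0, gamma=0.95):
    """Run the learner through simulators ordered from lowest to highest
    fidelity, transferring each level's Q-values upward as the next
    level's initialization (the transfer step described in the abstract)."""
    prev = None
    for env_step in simulators:
        if prev is None:
            seed = lambda s, a: rmax / (1.0 - gamma)   # Rmax-style optimism
        else:
            seed = lambda s, a, q=prev: q[(s, a)]      # transfer from below
        prev = q_learn(env_step, states, actions, seed, gamma=gamma)
    return prev

# Toy usage: a 5-state chain; the "low-fidelity" model omits actuation noise.
STATES, ACTIONS = list(range(5)), [-1, +1]             # move left / right

def make_chain(noise):
    def step(s, a):
        move = -a if random.random() < noise else a    # fidelity gap: noise
        s2 = min(max(s + move, 0), 4)
        return s2, (1.0 if s2 == 4 else 0.0), s2 == 4
    return step

Q = multi_fidelity_q([make_chain(0.0), make_chain(0.2)], STATES, ACTIONS)
```

Unlike the full framework, this sketch moves strictly upward through the fidelity levels; the paper additionally lets the agent decide when a lower-fidelity level still has information to offer, which is how it limits samples drawn from the expensive levels.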
Date issued
2014-06
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Journal
IEEE International Conference on Robotics and Automation. Proceedings
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Cutler, Mark, Thomas J. Walsh, and Jonathan P. How. “Reinforcement Learning with Multi-Fidelity Simulators.” Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2014, pp. 3888–3895.
Version: Author's final manuscript
ISBN
978-1-4799-3685-4
ISSN
1050-4729