Reinforcement learning with multi-fidelity simulators
Author(s)
Cutler, Mark Johnson; Walsh, Thomas J.; How, Jonathan P.
Terms of use
Creative Commons Attribution-Noncommercial-Share Alike
Abstract
We present a framework for reinforcement learning (RL) in a setting where multiple simulators are available, each with decreasing fidelity to the real-world learning scenario. Our framework is designed to limit the number of samples used in each successively higher-fidelity (and higher-cost) simulator by letting the agent choose to run trajectories at the lowest-fidelity level that still provides it with useful information. The approach transfers state-action Q-values from lower-fidelity models as heuristics for the “Knows What It Knows” (KWIK) family of RL algorithms, and is therefore applicable over a wide range of dynamics and reward representations. Theoretical bounds on the framework's sample complexity are proven, and empirical results are demonstrated on a remote-controlled car with multiple simulators. The approach allows RL algorithms to find near-optimal policies for the real world with fewer expensive real-world samples than previous transfer approaches or learning without simulators.
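The core transfer idea from the abstract can be sketched in a few lines. The snippet below is a hedged illustration, not the paper's algorithm: it uses plain tabular Q-learning in place of the KWIK model-based learners the authors analyze, and the chain environment, the function names (q_learn, multi_fidelity_q, make_chain), and all parameters are invented for this example. It shows only the stated mechanism: Q-values learned in a cheaper, lower-fidelity simulator seed the learner at the next fidelity level, replacing a naive optimistic (Rmax-style) initialization.

```python
import random

def q_learn(env_step, states, actions, q_init, episodes=200,
            alpha=0.1, gamma=0.95, eps=0.1, max_steps=50):
    """Tabular Q-learning seeded with Q-values from a lower-fidelity level."""
    Q = {(s, a): q_init(s, a) for s in states for a in actions}
    for _ in range(episodes):
        s = random.choice(states)                      # arbitrary start state
        for _ in range(max_steps):                     # bounded episode length
            if random.random() < eps:                  # epsilon-greedy choice
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(s, b)])
            s2, r, done = env_step(s, a)
            target = r if done else r + gamma * max(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
            if done:
                break
    return Q

def multi_fidelity_q(simulators, states, actions, rmax=1.0, gamma=0.95):
    """Run the learner through simulators ordered from lowest to highest
    fidelity, transferring each level's Q-values upward as the next
    level's initialization (the transfer step described in the abstract)."""
    prev = None
    for env_step in simulators:
        if prev is None:
            seed = lambda s, a: rmax / (1.0 - gamma)   # Rmax-style optimism
        else:
            seed = lambda s, a, q=prev: q[(s, a)]      # transfer from below
        prev = q_learn(env_step, states, actions, seed, gamma=gamma)
    return prev

# Toy usage: a 5-state chain; the "low-fidelity" model omits actuation noise.
STATES, ACTIONS = list(range(5)), [-1, +1]             # move left / right

def make_chain(noise):
    def step(s, a):
        move = -a if random.random() < noise else a    # fidelity gap: noise
        s2 = min(max(s + move, 0), 4)
        return s2, (1.0 if s2 == 4 else 0.0), s2 == 4
    return step

Q = multi_fidelity_q([make_chain(0.0), make_chain(0.2)], STATES, ACTIONS)
```

Unlike the full framework, this sketch moves strictly upward through the fidelity levels; the paper additionally lets the agent decide when a lower-fidelity level still has information to offer, which is how it limits samples drawn from the expensive levels.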
Date issued
2014-06
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Journal
IEEE International Conference on Robotics and Automation. Proceedings
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Citation
Cutler, Mark, Thomas J. Walsh, and Jonathan P. How. “Reinforcement Learning with Multi-Fidelity Simulators.” Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2014, pp. 3888–3895.
Version: Author's final manuscript
ISBN
978-1-4799-3685-4
ISSN
1050-4729