Nonparametric Bayesian Policy Priors for Reinforcement Learning
Author(s)
Doshi-Velez, Finale P.; Wingate, David; Roy, Nicholas; Tenenbaum, Joshua B.
Terms of use
Open Access Policy: Creative Commons Attribution-Noncommercial-Share Alike
Abstract
We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and simple policies, resulting in improved policy and model learning.
Date issued
2010-12
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics; Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Journal
Proceedings of the 24th Annual Conference on Neural Information Processing Systems (NIPS 2010)
Publisher
Neural Information Processing Systems Foundation
Citation
Doshi-Velez, Finale, David Wingate, Nicholas Roy, and Joshua Tenenbaum. "Nonparametric Bayesian Policy Priors for Reinforcement Learning." Proceedings of the 24th Annual Conference on Neural Information Processing Systems, NIPS 2010, December 6-9, 2010, Vancouver, British Columbia.
Version: Author's final manuscript
ISBN
9781617823800