
dc.contributor.advisor: Joshua Tenenbaum
dc.contributor.author: Wingate, David [en_US]
dc.contributor.author: Diuk, Carlos [en_US]
dc.contributor.author: O'Donnell, Timothy [en_US]
dc.contributor.author: Tenenbaum, Joshua [en_US]
dc.contributor.author: Gershman, Samuel [en_US]
dc.contributor.other: Computational Cognitive Science [en]
dc.date.accessioned: 2013-04-18T00:45:04Z
dc.date.available: 2013-04-18T00:45:04Z
dc.date.issued: 2013-04-12
dc.identifier.uri: http://hdl.handle.net/1721.1/78573
dc.description.abstract: This paper describes a probabilistic framework for incorporating structured inductive biases into reinforcement learning. These inductive biases arise from policy priors, probability distributions over optimal policies. Borrowing recent ideas from computational linguistics and Bayesian nonparametrics, we define several families of policy priors that express compositional, abstract structure in a domain. Compositionality is expressed using probabilistic context-free grammars, enabling a compact representation of hierarchically organized sub-tasks. Useful sequences of sub-tasks can be cached and reused by extending the grammars nonparametrically using Fragment Grammars. We present Monte Carlo methods for performing inference, and show how structured policy priors lead to substantially faster learning in complex domains compared to methods without inductive biases. [en_US]
dc.description.sponsorship: This work was supported by AFOSR FA9550-07-1-0075 and ONR N00014-07-1-0937. SJG was supported by a Graduate Research Fellowship from the NSF. [en]
dc.format.extent: 17 p. [en_US]
dc.relation.ispartofseries: MIT-CSAIL-TR-2013-007
dc.title: Compositional Policy Priors [en_US]
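
The abstract above describes policy priors expressed as probabilistic context-free grammars, where nonterminals expand into sub-tasks and terminals are primitive actions. The sketch below is a minimal illustration of that idea, not the paper's implementation: the grammar symbols, actions, and probabilities are invented for this example, and a real agent would score such sampled policies against observed returns during Monte Carlo inference.

import random

# Illustrative PCFG over policies (all symbols and probabilities are
# hypothetical, not taken from the paper). Each nonterminal maps to a
# list of (right-hand side, probability) expansion rules; strings not
# appearing as keys are treated as primitive actions.
PCFG = {
    "Policy":   [(["GoToKey", "GoToDoor"], 0.7), (["Explore"], 0.3)],
    "GoToKey":  [(["left", "pickup"], 0.6), (["right", "pickup"], 0.4)],
    "GoToDoor": [(["right", "open"], 1.0)],
    "Explore":  [(["left"], 0.5), (["right"], 0.5)],
}

def sample(symbol):
    """Sample a flat action sequence by recursively expanding `symbol`."""
    if symbol not in PCFG:              # terminal: a primitive action
        return [symbol]
    expansions, weights = zip(*PCFG[symbol])
    rhs = random.choices(expansions, weights=weights)[0]
    actions = []
    for s in rhs:
        actions.extend(sample(s))
    return actions

# Draw one hierarchically structured policy from the prior,
# e.g. ['left', 'pickup', 'right', 'open'].
print(sample("Policy"))

In this toy version the grammar is fixed; the Fragment Grammar extension mentioned in the abstract would additionally cache useful sampled sub-sequences as new reusable rules, which is what lets the prior adapt its inductive bias to a domain.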

