Representation Discovery for Kernel-Based Reinforcement Learning

Zewdie, Dawit H.; Konidaris, George

dc.contributor.advisor	Leslie Kaelbling
dc.contributor.author	Zewdie, Dawit H.	en_US
dc.contributor.author	Konidaris, George	en_US
dc.contributor.other	Learning and Intelligent Systems	en
dc.date.accessioned	2015-11-30T19:30:04Z
dc.date.available	2015-11-30T19:30:04Z
dc.date.issued	2015-11-24
dc.identifier.uri	http://hdl.handle.net/1721.1/100053
dc.description.abstract	Recent years have seen increased interest in non-parametric reinforcement learning. There are now practical kernel-based algorithms for approximating value functions; however, kernel regression requires that the underlying function being approximated be smooth on its domain. Few problems of interest satisfy this requirement in their natural representation. In this paper we define the Value-Consistent Pseudometric (VCPM), the distance function corresponding to a transformation of the domain into a space where the target function is maximally smooth and thus well-approximated by kernel regression. We then present DKBRL, an iterative batch RL algorithm that interleaves steps of Kernel-Based Reinforcement Learning and distance metric adjustment. We evaluate its performance on Acrobot and PinBall, continuous-space reinforcement learning domains with discontinuous value functions.	en_US
dc.format.extent	16 p.	en_US
dc.relation.ispartofseries	MIT-CSAIL-TR-2015-032
dc.rights	Creative Commons Attribution-ShareAlike 4.0 International
dc.rights.uri	http://creativecommons.org/licenses/by-sa/4.0/
dc.subject	Metric learning	en_US
dc.title	Representation Discovery for Kernel-Based Reinforcement Learning	en_US
dc.date.updated	2015-11-30T19:30:04Z

Files in this item

Name: MIT-CSAIL-TR-2015-032.pdf
Size: 1.869Mb
Format: PDF

This item appears in the following Collection(s)

CSAIL Technical Reports (July 1, 2003 - present)