dc.contributor.advisor: Russ Tedrake (en_US)
dc.contributor.author: Rohanimanesh, Khashayar (en_US)
dc.contributor.author: Roy, Nicholas (en_US)
dc.contributor.author: Tedrake, Russ (en_US)
dc.contributor.other: Robot Locomotion Group (en_US)
dc.date.accessioned: 2007-11-13T14:45:30Z
dc.date.available: 2007-11-13T14:45:30Z
dc.date.issued: 2007-11-01 (en_US)
dc.identifier.other: MIT-CSAIL-TR-2007-051 (en_US)
dc.identifier.uri: http://hdl.handle.net/1721.1/39427
dc.description.abstract: Choosing features for the critic in actor-critic algorithms with function approximation is known to be a challenge. Too few critic features can lead to degeneracy of the actor gradient, while too many can slow the learner's convergence. In this paper, we show that a well-studied class of actor policies satisfies the known requirements for convergence when the actor features are selected carefully. We demonstrate that two popular representations from value-based methods, barycentric interpolators and graph-Laplacian proto-value functions, can be used to represent the actor so as to satisfy these conditions. A consequence of this work is a generalization of proto-value function methods to the continuous-action actor-critic domain. Finally, we analyze the performance of this approach using a simulation of a torque-limited inverted pendulum. (en_US)
dc.format.extent: 9 p. (en_US)
dc.relation: Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory (en_US)
dc.subject: reinforcement learning (en_US)
dc.title: Towards Feature Selection In Actor-Critic Algorithms (en_US)
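
To make the setting in the abstract concrete, the sketch below runs a standard actor-critic with a Gaussian policy and a linear TD(0) critic on a torque-limited inverted pendulum. It is a minimal illustration only: the report's specific feature constructions (barycentric interpolators, graph-Laplacian proto-value functions) are replaced here by simple radial-basis features, and every constant, helper name, and reward term is an assumption, not taken from the report.

```python
import numpy as np

# Pendulum and learning constants (illustrative values, not from the report).
G, L, M, DT = 9.8, 1.0, 1.0, 0.02
U_MAX = 2.0                       # torque limit
SIGMA = 0.5                       # std. dev. of the Gaussian actor's exploration
ALPHA_V, ALPHA_PI, GAMMA = 0.1, 0.01, 0.98

# Radial-basis features over (theta, theta_dot); the report instead studies
# barycentric interpolators and graph-Laplacian proto-value functions.
CENTERS = np.array([(t, td) for t in np.linspace(-np.pi, np.pi, 7)
                            for td in np.linspace(-8.0, 8.0, 7)])

def features(s):
    d = CENTERS - s
    return np.exp(-0.5 * np.sum(d**2 / np.array([0.5, 4.0]), axis=1))

def step(s, u):
    """Euler-integrate the torque-limited pendulum; reward favors upright."""
    th, thd = s
    u = np.clip(u, -U_MAX, U_MAX)
    thdd = (G / L) * np.sin(th) + u / (M * L**2)   # theta = 0 is upright
    th = ((th + DT * thd + np.pi) % (2 * np.pi)) - np.pi   # wrap angle
    thd = np.clip(thd + DT * thdd, -8.0, 8.0)
    reward = -(th**2 + 0.1 * thd**2 + 0.001 * u**2)
    return np.array([th, thd]), reward

rng = np.random.default_rng(0)
v = np.zeros(len(CENTERS))   # critic weights:  V(s)  = phi(s) . v
w = np.zeros(len(CENTERS))   # actor weights:   mu(s) = phi(s) . w

for episode in range(200):
    s = np.array([np.pi, 0.0])   # start hanging straight down
    for t in range(500):
        phi = features(s)
        mu = phi @ w
        u = mu + SIGMA * rng.standard_normal()     # sample action from policy
        s2, r = step(s, u)
        delta = r + GAMMA * (features(s2) @ v) - phi @ v   # TD error
        v += ALPHA_V * delta * phi                          # critic: TD(0)
        w += ALPHA_PI * delta * (u - mu) / SIGMA**2 * phi   # actor: score * TD error
        s = s2
```

Here the TD error stands in for the advantage in the actor's gradient step; the paper's contribution concerns which critic and actor feature sets make this kind of update provably convergent, a property the ad hoc radial-basis features above are not claimed to have.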

