Towards Feature Selection In Actor-Critic Algorithms

Rohanimanesh, Khashayar; Roy, Nicholas; Tedrake, Russ

Download

MIT-CSAIL-TR-2007-051.pdf (184.6Kb)

Additional downloads

MIT-CSAIL-TR-2007-051.ps (652.4Kb)

Other Contributors

Robot Locomotion Group

Advisor

Russ Tedrake

Abstract

Choosing features for the critic in actor-critic algorithms with function approximation is known to be a challenge. Too few critic features can lead to degeneracy of the actor gradient, and too many features may lead to slower convergence of the learner. In this paper, we show that a well-studied class of actor policies satisfies the known requirements for convergence when the actor features are selected carefully. We demonstrate that two popular representations for value methods, the barycentric interpolators and the graph Laplacian proto-value functions, can be used to represent the actor in order to satisfy these conditions. A consequence of this work is a generalization of the proto-value function methods to the continuous-action actor-critic domain. Finally, we analyze the performance of this approach using a simulation of a torque-limited inverted pendulum.

Date issued

2007-11-01

URI

http://hdl.handle.net/1721.1/39427

Other identifiers

MIT-CSAIL-TR-2007-051

Keywords

reinforcement learning

Collections

CSAIL Technical Reports (July 1, 2003 - present)

The following license files are associated with this item:

Creative Commons