A unified framework for temporal difference methods
Author(s): Bertsekas, Dimitri P.
We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
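As a rough illustration of the kind of projected equation the abstract refers to, the following Python sketch (not taken from the paper) treats the linear fixed point problem x = Ax + b. It assumes a feature matrix Phi whose columns span S and a positive weight vector xi that defines the projection norm; the function names and the three-state example are hypothetical. Under these assumptions, the projected equation Phi r = Pi(A Phi r + b) reduces to the low-dimensional system C r = d, with C = Phi' Xi (I - A) Phi and d = Phi' Xi b. The sketch forms C and d exactly rather than by simulation, so it shows only the equation being solved, not the simulation-based or TD-style algorithms.

```python
import numpy as np


def solve_projected_equation(Phi, xi, A, b):
    """Solve Phi r = Pi(A Phi r + b), with Pi the xi-weighted projection onto span(Phi).

    Illustrative helper (hypothetical name): builds C r = d directly from A and b.
    """
    Xi = np.diag(xi)
    n = A.shape[0]
    C = Phi.T @ Xi @ (np.eye(n) - A) @ Phi  # low-dimensional matrix, s x s
    d = Phi.T @ Xi @ b                      # low-dimensional right-hand side
    return np.linalg.solve(C, d)


def project(Phi, xi, x):
    """xi-weighted least-squares projection of x onto the column span of Phi."""
    Xi = np.diag(xi)
    return Phi @ np.linalg.solve(Phi.T @ Xi @ Phi, Phi.T @ Xi @ x)


# Assumed toy problem: discounted cost evaluation on a 3-state Markov chain.
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])      # transition matrix (doubly stochastic)
alpha = 0.9                          # discount factor
A = alpha * P
b = np.array([1.0, 0.0, 2.0])        # expected one-stage costs
xi = np.full(3, 1.0 / 3.0)           # stationary distribution of P
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0]])         # two basis functions (features)

r = solve_projected_equation(Phi, xi, A, b)
approx = Phi @ r                               # approximation within S
exact = np.linalg.solve(np.eye(3) - A, b)      # exact fixed point, for comparison
print("weights r:", r)
print("approximation:", approx)
print("exact solution:", exact)
```

In the toy problem the weights are taken to be the stationary distribution of the chain, a standard choice under which the projected mapping is a contraction for a discount factor below one. If Phi is replaced by the identity matrix, the solution coincides with the exact fixed point.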
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Institute of Electrical and Electronics Engineers
Bertsekas, D.P. “A unified framework for temporal difference methods.” Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on. 2009. 1-7. © 2009, IEEE
Final published version
INSPEC Accession Number: 10647004