dc.contributor.author | Bertsekas, Dimitri P. | |
dc.date.accessioned | 2010-10-01T18:17:46Z | |
dc.date.available | 2010-10-01T18:17:46Z | |
dc.date.issued | 2009-03 | |
dc.identifier.isbn | 978-1-4244-2761-1 | |
dc.identifier.other | INSPEC Accession Number: 10647004 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/58831 | |
dc.description.abstract | We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead. | en_US |
dc.description.sponsorship | National Science Foundation (U.S.) (NSF grant ECCS-0801549) | en_US |
dc.language.iso | en_US | |
dc.publisher | Institute of Electrical and Electronics Engineers | en_US |
dc.relation.isversionof | http://dx.doi.org/10.1109/ADPRL.2009.4927518 | en_US |
dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
dc.source | IEEE | en_US |
dc.title | A unified framework for temporal difference methods | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Bertsekas, D.P. “A unified framework for temporal difference methods.” Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on. 2009. 1-7. © 2009, IEEE | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Laboratory for Information and Decision Systems | en_US |
dc.contributor.approver | Bertsekas, Dimitri P. | |
dc.contributor.mitauthor | Bertsekas, Dimitri P. | |
dc.relation.journal | IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning | en_US |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
dspace.orderedauthors | Bertsekas, Dimitri P. | en |
dc.identifier.orcid | https://orcid.org/0000-0001-6909-7208 | |
mit.license | PUBLISHER_POLICY | en_US |
mit.metadata.status | Complete | |