Show simple item record

dc.contributor.author: Bertsekas, Dimitri P.
dc.date.accessioned: 2010-10-01T18:17:46Z
dc.date.available: 2010-10-01T18:17:46Z
dc.date.issued: 2009-03
dc.identifier.isbn: 978-1-4244-2761-1
dc.identifier.other: INSPEC Accession Number: 10647004
dc.identifier.uri: http://hdl.handle.net/1721.1/58831
dc.description.abstract: We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (NSF grant ECCS-0801549) [en_US]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/ADPRL.2009.4927518 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: IEEE [en_US]
dc.title: A unified framework for temporal difference methods [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Bertsekas, D.P. “A unified framework for temporal difference methods.” Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on. 2009. 1-7. © 2009, IEEE [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems [en_US]
dc.contributor.approver: Bertsekas, Dimitri P.
dc.contributor.mitauthor: Bertsekas, Dimitri P.
dc.relation.journal: IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dspace.orderedauthors: Bertsekas, Dimitri P. [en]
dc.identifier.orcid: https://orcid.org/0000-0001-6909-7208
mit.license: PUBLISHER_POLICY [en_US]
mit.metadata.status: Complete

