Show simple item record

dc.contributor.author: Yu, Huizhen
dc.contributor.author: Bertsekas, Dimitri P.
dc.date.accessioned: 2010-10-13T18:33:03Z
dc.date.available: 2010-10-13T18:33:03Z
dc.date.issued: 2009-05
dc.date.submitted: 2009-03
dc.identifier.isbn: 978-1-4244-2761-1
dc.identifier.other: INSPEC Accession Number: 10647014
dc.identifier.uri: http://hdl.handle.net/1721.1/59288
dc.description.abstract: We generalize a basis adaptation method for cost approximation in Markov decision processes (MDP), extending earlier work of Menache, Mannor, and Shimkin. In our context, basis functions are parametrized and their parameters are tuned by minimizing an objective function involving the cost function approximation obtained when a temporal difference (TD) or other method is used. The adaptation scheme involves only low-order calculations and can be implemented in a way analogous to policy gradient methods. In the generalized basis adaptation framework we provide extensions to TD methods for nonlinear optimal stopping problems and to alternative cost approximations beyond those based on TD. [en_US]
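The idea the abstract describes can be illustrated with a minimal sketch. This is not the paper's algorithm (the paper tunes the parameters with low-order gradient estimates inside a TD framework); it is only a toy instance of the general pattern: fit a target cost vector by least squares in the span of a parametrized basis, then adapt the basis parameter to reduce the resulting approximation error. All names, centers, and the target vector below are invented for illustration.

```python
import math

# Toy basis adaptation sketch (illustrative only, not the paper's method):
# tune a scalar basis-width parameter theta by monotone gradient descent
# on the squared error between a hypothetical target cost vector J and
# its least-squares fit in the span of the parametrized basis Phi(theta).

N = 20                                        # number of states (assumed)
J = [math.exp(-0.3 * s) for s in range(N)]    # hypothetical target cost vector
centers = [3.0, 12.0]                         # fixed basis centers (assumed)

def basis(theta):
    # Two Gaussian bumps whose common width is controlled by theta > 0.
    return [[math.exp(-(s - c) ** 2 / theta) for c in centers]
            for s in range(N)]

def objective(theta):
    # Least-squares fit of J in span(Phi(theta)) via the 2x2 normal
    # equations, then the squared approximation error ||J - Phi r||^2.
    Phi = basis(theta)
    a11 = sum(p[0] * p[0] for p in Phi)
    a12 = sum(p[0] * p[1] for p in Phi)
    a22 = sum(p[1] * p[1] for p in Phi)
    b1 = sum(p[0] * j for p, j in zip(Phi, J))
    b2 = sum(p[1] * j for p, j in zip(Phi, J))
    det = a11 * a22 - a12 * a12
    r1 = (a22 * b1 - a12 * b2) / det
    r2 = (a11 * b2 - a12 * b1) / det
    return sum((j - (p[0] * r1 + p[1] * r2)) ** 2
               for p, j in zip(Phi, J))

def adapt(theta, steps=50, lr=1.0, h=1e-4):
    # Finite-difference gradient descent on theta with backtracking,
    # so the objective never increases and theta stays positive.
    f = objective(theta)
    for _ in range(steps):
        grad = (objective(theta + h) - objective(theta - h)) / (2 * h)
        step = lr
        while step > 1e-8:
            cand = theta - step * grad
            if cand > 0 and objective(cand) < f:
                theta, f = cand, objective(cand)
                break
            step *= 0.5
    return theta

theta0 = 4.0
theta1 = adapt(theta0)
print(objective(theta0), objective(theta1))
```

The backtracking line search here just keeps the toy example well behaved; the paper's contribution is computing the required gradients with low-order calculations inside TD-type methods, which this sketch does not attempt.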
dc.description.sponsorship: Academy of Finland (grant 118653 (ALGODAN)) [en_US]
dc.description.sponsorship: IST Programme of the European Community (IST-2002-506778) [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Grant ECCS-0801549) [en_US]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/ADPRL.2009.4927528 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: MIT web domain [en_US]
dc.title: Basis Function Adaptation Methods for Cost Approximation in MDP [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Yu, Huizhen, and Dimitri P. Bertsekas. "Basis function adaptation methods for cost approximation in MDP." 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL '09), 2009, pp. 74-81. ©2009 Institute of Electrical and Electronics Engineers. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems [en_US]
dc.contributor.approver: Bertsekas, Dimitri P.
dc.contributor.mitauthor: Bertsekas, Dimitri P.
dc.relation.journal: IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09 [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dspace.orderedauthors: Yu, Huizhen; Bertsekas, Dimitri P. [en]
dc.identifier.orcid: https://orcid.org/0000-0001-6909-7208
mit.license: PUBLISHER_POLICY [en_US]

