
dc.contributor.author: Desai, Vijay V.
dc.contributor.author: Farias, Vivek F.
dc.contributor.author: Moallemi, Ciamac C.
dc.date.accessioned: 2019-02-21T15:11:11Z
dc.date.available: 2019-02-21T15:11:11Z
dc.date.issued: 2013
dc.identifier.isbn: 9781118453988
dc.identifier.uri: http://hdl.handle.net/1721.1/120518
dc.description.abstract: We consider the problem of producing lower bounds on the optimal cost-to-go function of a Markov decision problem. We present two approaches to this problem: one based on the methodology of approximate linear programming (ALP) and another based on the so-called martingale duality approach. We show that these two approaches are intimately connected. Exploring this connection leads us to the problem of finding "optimal" martingale penalties within the martingale duality approach which we dub the pathwise optimization (PO) problem. We show interesting cases where the PO problem admits a tractable solution and establish that these solutions produce tighter approximations than the ALP approach. © 2013 The Institute of Electrical and Electronics Engineers, Inc. [en_US]
dc.publisher: John Wiley & Sons, Inc. [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1002/9781118453988.ch20 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: Non-MIT author website [en_US]
dc.title: Bounds for Markov Decision Processes [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Desai, Vijay V., Vivek F. Farias, and Ciamac C. Moallemi. “Bounds for Markov Decision Processes.” Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, edited by Frank L. Lewis and Derong Liu, John Wiley & Sons, Inc., 2013, pp. 452–473. [en_US]
dc.contributor.department: Sloan School of Management [en_US]
dc.contributor.mitauthor: Farias, Vivek F.
dc.relation.journal: Reinforcement Learning and Approximate Dynamic Programming for Feedback Control [en_US]
dc.eprint.version: Original manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/BookItem [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2019-02-12T14:40:50Z
dspace.orderedauthors: Desai, Vijay V.; Farias, Vivek F.; Moallemi, Ciamac C. [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-5856-9246
mit.license: OPEN_ACCESS_POLICY [en_US]