
dc.contributor.author: Jaillet, Patrick
dc.contributor.author: Mastin, Dana Andrew
dc.date.accessioned: 2014-05-09T14:23:59Z
dc.date.available: 2014-05-09T14:23:59Z
dc.date.issued: 2012-12
dc.identifier.isbn: 978-1-4673-2066-5
dc.identifier.isbn: 978-1-4673-2065-8
dc.identifier.isbn: 978-1-4673-2063-4
dc.identifier.isbn: 978-1-4673-2064-1
dc.identifier.uri: http://hdl.handle.net/1721.1/86896
dc.description.abstract: We analyze losses resulting from uncertain transition probabilities in Markov decision processes with bounded nonnegative rewards. We assume that policies are precomputed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true transition probabilities. Given a bound on the total variation error of estimated transition probability distributions, we derive upper bounds on the loss of expected total reward. The approach analyzes the growth of errors incurred by stepping backwards in time while precomputing value functions, which requires bounding a multilinear program. Loss bounds are given for the finite horizon undiscounted, finite horizon discounted, and infinite horizon discounted cases, and a tight example is shown. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Grant 1029603) [en_US]
dc.description.sponsorship: National Science Foundation (U.S.). Graduate Research Fellowship Program [en_US]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/CDC.2012.6426504 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: MIT web domain [en_US]
dc.title: Loss bounds for uncertain transition probabilities in Markov decision processes [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Mastin, Andrew, and Patrick Jaillet. “Loss Bounds for Uncertain Transition Probabilities in Markov Decision Processes.” 2012 IEEE 51st IEEE Conference on Decision and Control (CDC) (n.d.). [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems [en_US]
dc.contributor.mitauthor: Mastin, Dana Andrew [en_US]
dc.contributor.mitauthor: Jaillet, Patrick [en_US]
dc.relation.journal: Proceedings of the 2012 IEEE 51st IEEE Conference on Decision and Control (CDC) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Mastin, Andrew; Jaillet, Patrick [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-8585-6566
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete

