dc.contributor.author | Ahmed, Asrar | |
dc.contributor.author | Varakantham, Pradeep | |
dc.contributor.author | Adulyasak, Yossiri | |
dc.contributor.author | Jaillet, Patrick | |
dc.date.accessioned | 2015-12-19T02:23:48Z | |
dc.date.available | 2015-12-19T02:23:48Z | |
dc.date.issued | 2013 | |
dc.identifier.issn | 1049-5258 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/100443 | |
dc.description.abstract | In this paper, we seek robust policies for uncertain Markov Decision Processes (MDPs). Most robust optimization approaches for these problems have focused on the computation of {\em maximin} policies which maximize the value corresponding to the worst realization of the uncertainty. Recent work has proposed {\em minimax} regret as a suitable alternative to the {\em maximin} objective for robust optimization. However, existing algorithms for handling {\em minimax} regret are restricted to models with uncertainty over rewards only. We provide algorithms that employ sampling to improve across multiple dimensions: (a) handling uncertainties over both transition and reward models; (b) capturing dependence of model uncertainties across state-action pairs and decision epochs; (c) scalability and quality bounds. Finally, to demonstrate the empirical effectiveness of our sampling approaches, we provide comparisons against benchmark algorithms on two domains from the literature. We also provide a Sample Average Approximation (SAA) analysis to compute a posteriori error bounds. | en_US |
dc.description.sponsorship | Singapore. National Research Foundation (Singapore-MIT Alliance for Research and Technology Center. Future Urban Mobility Program) | en_US |
dc.description.sponsorship | United States. Office of Naval Research (Grant N00014-12-1-0999) | en_US |
dc.language.iso | en_US | |
dc.publisher | Neural Information Processing Systems | en_US |
dc.relation.isversionof | http://papers.nips.cc/paper/4970-regret-based-robust-solutions-for-uncertain-markov-decision-processes | en_US |
dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
dc.source | NIPS | en_US |
dc.title | Regret Based Robust Solutions for Uncertain Markov Decision Processes | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Ahmed, Asrar, Pradeep Varakantham, Yossiri Adulyasak, and Patrick Jaillet. "Regret Based Robust Solutions for Uncertain Markov Decision Processes." Advances in Neural Information Processing Systems 26 (NIPS 2013). | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.contributor.mitauthor | Adulyasak, Yossiri | en_US |
dc.contributor.mitauthor | Jaillet, Patrick | en_US |
dc.relation.journal | Advances in Neural Information Processing Systems (NIPS) | en_US |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dspace.orderedauthors | Ahmed, Asrar; Varakantham, Pradeep; Adulyasak, Yossiri; Jaillet, Patrick | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-8585-6566 | |
mit.license | PUBLISHER_POLICY | en_US |
mit.metadata.status | Complete | |