Regret Based Robust Solutions for Uncertain Markov Decision Processes
Author(s)
Ahmed, Asrar; Varakantham, Pradeep; Adulyasak, Yossiri; Jaillet, Patrick
Download: Jaillet_Regret based.pdf (345.4 KB)
Terms of use
Publisher Policy: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
In this paper, we seek robust policies for uncertain Markov Decision Processes (MDPs). Most robust optimization approaches for these problems have focused on the computation of maximin policies, which maximize the value corresponding to the worst realization of the uncertainty. Recent work has proposed minimax regret as a suitable alternative to the maximin objective for robust optimization. However, existing algorithms for handling minimax regret are restricted to models with uncertainty over rewards only. We provide algorithms that employ sampling to improve across multiple dimensions: (a) they handle uncertainty over both transition and reward models; (b) they allow dependence of model uncertainties across state-action pairs and decision epochs; (c) they come with scalability and quality bounds. Finally, to demonstrate the empirical effectiveness of our sampling approaches, we provide comparisons against benchmark algorithms on two domains from the literature. We also provide a Sample Average Approximation (SAA) analysis to compute a posteriori error bounds.
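To make the objective contrasted in the abstract concrete: given a policy pi and a model M, regret(pi, M) = V*(M) - V(pi, M), where V*(M) is the best value attainable in M; a minimax regret policy minimizes the worst of these gaps over models, while a maximin policy maximizes the worst-case value directly. Below is a minimal, illustrative sketch of the sample-based analogue of this objective, namely empirical minimax regret over finitely many sampled (P, R) models and a finite candidate policy set. It is not the paper's algorithm; the names policy_value and minimax_regret_policy, the uniform start-state distribution, and the discounted infinite-horizon setting are our assumptions.

```python
import numpy as np

def policy_value(P, R, policy, gamma=0.95):
    """Expected discounted value (uniform start state) of a stationary
    deterministic policy under one sampled model.
    P: (S, A, S) transition tensor; R: (S, A) rewards;
    policy: length-S array of action indices."""
    S = R.shape[0]
    P_pi = P[np.arange(S), policy]   # (S, S) transitions induced by the policy
    r_pi = R[np.arange(S), policy]   # (S,) rewards induced by the policy
    # Solve the Bellman linear system (I - gamma * P_pi) v = r_pi.
    v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    return v.mean()

def minimax_regret_policy(models, policies, gamma=0.95):
    """Return the candidate policy with the smallest maximum regret across
    the sampled models (the empirical minimax-regret choice).
    models: list of (P, R) pairs; policies: list of length-S action arrays."""
    # Best value attainable in each sampled model, over the candidate set.
    best = [max(policy_value(P, R, pi, gamma) for pi in policies)
            for (P, R) in models]
    # Worst-case (over models) regret of each candidate policy.
    worst_regret = [max(best[q] - policy_value(P, R, pi, gamma)
                        for q, (P, R) in enumerate(models))
                    for pi in policies]
    return policies[int(np.argmin(worst_regret))]
```

Enumerating candidate policies as above is intractable in general; the paper's contribution is sampling-based formulations that scale, and the SAA analysis mentioned in the abstract bounds, a posteriori, the error introduced by using finitely many sampled models.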
Date issued
2013
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Advances in Neural Information Processing Systems (NIPS)
Publisher
Neural Information Processing Systems
Citation
Ahmed, Asrar, Pradeep Varakantham, Yossiri Adulyasak, and Patrick Jaillet. "Regret Based Robust Solutions for Uncertain Markov Decision Processes." Advances in Neural Information Processing Systems 26 (NIPS 2013).
Version: Final published version
ISSN
1049-5258