Mean-Variance Optimization in Markov Decision Processes

Mannor, Shie; Tsitsiklis, John

Author(s)

Mannor, Shie; Tsitsiklis, John N.

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/

Abstract

We consider finite-horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that computing a policy that maximizes the mean reward under a variance constraint is NP-hard in some cases and strongly NP-hard in others. Finally, we offer pseudo-polynomial exact and approximation algorithms.
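
As an illustration of the quantities the abstract refers to, the following minimal sketch evaluates the mean and variance of the cumulative reward of a fixed Markov policy by propagating first and second moments backward in time. It is not taken from the paper: the function name `reward_moments`, the dictionary-based encoding of the MDP, and the three-state toy instance are all assumptions made for this example, and rewards are assumed to be deterministic given the state and action.

```python
def reward_moments(P, r, policy, horizon, start):
    """Mean and variance of the cumulative reward of a fixed Markov policy.

    Hypothetical encoding (an assumption of this sketch, not from the paper):
    P[s][a] is a dict {next_state: probability}, r[s][a] is a deterministic
    reward, and policy[t][s] is the action taken in state s at step t.
    """
    states = list(P.keys())
    # m1[s], m2[s]: first and second moments of the reward-to-go from s.
    m1 = {s: 0.0 for s in states}
    m2 = {s: 0.0 for s in states}
    for t in reversed(range(horizon)):
        new_m1, new_m2 = {}, {}
        for s in states:
            a = policy[t][s]
            rew = r[s][a]
            e1 = sum(p * m1[s2] for s2, p in P[s][a].items())
            e2 = sum(p * m2[s2] for s2, p in P[s][a].items())
            # E[(rew + G)^2] = rew^2 + 2*rew*E[G] + E[G^2]
            new_m1[s] = rew + e1
            new_m2[s] = rew * rew + 2.0 * rew * e1 + e2
        m1, m2 = new_m1, new_m2
    mean = m1[start]
    return mean, m2[start] - mean * mean


# Toy instance: "safe" pays 1 for sure; "risky" pays 0 now and then
# 4 with probability 0.5 (via state "win") or 0 otherwise.
P = {
    "start": {"safe": {"end": 1.0}, "risky": {"win": 0.5, "end": 0.5}},
    "win": {"stay": {"end": 1.0}},
    "end": {"stay": {"end": 1.0}},
}
r = {
    "start": {"safe": 1.0, "risky": 0.0},
    "win": {"stay": 4.0},
    "end": {"stay": 0.0},
}
safe_policy = [{"start": "safe", "win": "stay", "end": "stay"}] * 2
risky_policy = [{"start": "risky", "win": "stay", "end": "stay"}] * 2

safe = reward_moments(P, r, safe_policy, horizon=2, start="start")
risky = reward_moments(P, r, risky_policy, horizon=2, start="start")
print(safe)   # (1.0, 0.0)
print(risky)  # (2.0, 4.0)
```

In this toy instance, a variance cap of 2 leaves "safe" as the only feasible deterministic choice at the start (mean 1). Choosing "risky" with probability 0.4 instead yields mean 1.4 and variance 5(0.4) - 0.4^2 = 1.84, which is consistent with the abstract's remark that randomization can improve performance.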

Date issued

2011-06

URI

http://hdl.handle.net/1721.1/79401

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Journal

Proceedings of the Twenty-Eighth International Conference on Machine Learning, ICML 2011

Publisher

International Machine Learning Society

Citation

Mannor, Shie and John Tsitsiklis. "Mean-Variance Optimization in Markov Decision Processes." In Twenty-Eighth International Conference on Machine Learning, ICML 2011, Jun. 28-Jul. 2, Bellevue, Washington. 2011.

Version: Author's final manuscript

ISBN

9781450306195

1450306195

Collections

MIT Open Access Articles

DSpace@MIT