Least Squares Temporal Difference Methods: An Analysis under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDPs) with the least squares temporal difference (LSTD) algorithm, LSTD($\lambda$), in an exploration-enhanced learning context, where policy costs are computed from observations of a Markov chain different from the one corresponding to the policy under evaluation. We establish for the discounted cost criterion that LSTD($\lambda$) converges almost surely under mild, minimal conditions. We also analyze other properties of the iterates involved in the algorithm, including convergence in mean and boundedness. Our analysis draws on theories of both finite space Markov chains and weak Feller Markov chains on a topological space. Our results can be applied to other temporal difference algorithms and MDP models. As examples, we give a convergence analysis of a TD($\lambda$) algorithm and extensions to MDPs with compact state and action spaces, as well as a convergence proof of a new LSTD algorithm with state-dependent $\lambda$-parameters.
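For context, the standard on-policy form of LSTD($\lambda$) can be sketched as follows. This is an illustrative implementation under the usual linear-architecture assumptions (feature vectors $\phi(s)$, discount factor $\gamma$, trace parameter $\lambda$); the exploration-enhanced, off-policy variant analyzed in the paper additionally corrects the eligibility traces for the mismatch between the behavior chain and the target policy's chain, which is not shown here. The function name and the regularization term are illustrative choices, not from the paper.

```python
import numpy as np

def lstd_lambda(features, costs, gamma=0.95, lam=0.5, reg=1e-6):
    """On-policy LSTD(lambda) sketch for approximate policy evaluation.

    features: (T+1, d) array of feature vectors phi(s_0), ..., phi(s_T)
              along a single sampled trajectory
    costs:    (T,) array of observed one-stage costs c_0, ..., c_{T-1}
    Returns theta such that phi(s)^T theta approximates the cost-to-go.
    """
    T, d = len(costs), features.shape[1]
    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)  # eligibility trace
    for t in range(T):
        # accumulate the trace, then the rank-one updates to A and b
        z = gamma * lam * z + features[t]
        A += np.outer(z, features[t] - gamma * features[t + 1])
        b += z * costs[t]
    # small regularization (an added convenience) guards against a
    # singular A on short trajectories
    return np.linalg.solve(A + reg * np.eye(d), b)
```

With tabular (one-hot) features on a deterministic chain, the solved $\theta$ recovers the exact discounted cost-to-go for any choice of $\lambda$, which makes the fixed point easy to sanity-check numerically.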
Department: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Journal: SIAM Journal on Control and Optimization
Publisher: Society for Industrial and Applied Mathematics
Citation: Yu, Huizhen. "Least Squares Temporal Difference Methods: An Analysis Under General Conditions." SIAM Journal on Control and Optimization 50.6 (2012): 3310–3343. © 2012, Society for Industrial and Applied Mathematics
Version: Final published version