Distributed Asynchronous Policy Iteration in Dynamic Programming

Bertsekas, Dimitri P.; Yu, Huizhen

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/

Abstract

We consider the distributed solution of dynamic programming (DP) problems by policy iteration. We envision a network of processors, each asynchronously updating a local policy and a local cost function defined on a portion of the state space. The computed values are communicated asynchronously between processors and are used to perform the local policy and cost updates. The natural algorithm of this type can fail even under favorable circumstances, as shown by Williams and Baird [WiB93]. We propose an alternative, almost equally simple algorithm that converges to the optimum under the most general conditions, including asynchronous updating by multiple processors using outdated local cost functions of other processors.
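For readers unfamiliar with the baseline, the following is a minimal sketch of classical, synchronous, single-processor policy iteration for a discounted-cost problem. It is not the distributed asynchronous algorithm proposed in the paper, whose details the abstract does not give. The three-state model, its costs, the discount factor `ALPHA`, and all function names are hypothetical and chosen only for illustration. Policy evaluation is done by repeated sweeps to a tolerance rather than by an exact linear solve.

```python
# Hypothetical toy problem (not from the paper): 3 states, 2 actions.
# P[s][a] lists (next_state, probability); G[s][a] is the one-stage cost.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 0.5), (2, 0.5)], 1: [(2, 1.0)]},
    2: {0: [(2, 1.0)], 1: [(0, 1.0)]},
}
G = {
    0: {0: 2.0, 1: 1.0},
    1: {0: 1.0, 1: 3.0},
    2: {0: 0.0, 1: 4.0},
}
ALPHA = 0.9  # assumed discount factor


def q_value(s, a, J):
    """Expected cost of taking action a in state s, then incurring cost-to-go J."""
    return G[s][a] + ALPHA * sum(p * J[t] for t, p in P[s][a])


def evaluate(policy, tol=1e-10):
    """Policy evaluation: sweep the fixed-policy update until it stops changing."""
    J = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            new = q_value(s, policy[s], J)
            delta = max(delta, abs(new - J[s]))
            J[s] = new
        if delta < tol:
            return J


def improve(J):
    """Policy improvement: choose a cost-minimizing action in every state."""
    return {s: min(P[s], key=lambda a: q_value(s, a, J)) for s in P}


def policy_iteration():
    """Alternate evaluation and improvement until the policy repeats."""
    policy = {s: 0 for s in P}
    while True:
        J = evaluate(policy)
        new_policy = improve(J)
        if new_policy == policy:
            return policy, J
        policy = new_policy


if __name__ == "__main__":
    policy, J = policy_iteration()
    print(policy, J)
```

In the setting described in the abstract, each processor would hold the policy and cost entries for only a portion of the states and would work with possibly outdated values received from other processors. The sketch above instead keeps everything in one place and updates in lockstep.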

Date issued

2010-09

URI

http://hdl.handle.net/1721.1/63169

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Journal

Allerton Conference on Communication, Control, and Computing. Proceedings, 48th, 2010

Publisher

University of Illinois at Urbana-Champaign

Citation

Bertsekas, Dimitri P. and Huizhen Yu. “Distributed asynchronous policy iteration in dynamic programming.” 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton). Monticello, IL, USA, 2010. 1368-1375.

Version: Author's final manuscript

Collections

MIT Open Access Articles

DSpace@MIT