Projected equation and aggregation-based approximate dynamic programming methods for Tetris

Hwang, Daw-sen

dc.contributor.advisor	Dimitri P. Bertsekas.	en_US
dc.contributor.author	Hwang, Daw-sen	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2011-09-27T18:34:48Z
dc.date.available	2011-09-27T18:34:48Z
dc.date.copyright	2011	en_US
dc.date.issued	2011	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/66033
dc.description	Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (p. 65-67).	en_US
dc.description.abstract	In this thesis, we survey approximate dynamic programming (ADP) methods and test the methods with the game of Tetris. We focus on ADP methods where the cost-to-go function J is approximated with [phi]r, where [phi] is some matrix and r is a vector with relatively low dimension. There are two major categories of methods: projected equation methods and aggregation methods. In projected equation methods, the cost-to-go function approximation [phi]r is updated by simulation using one of several policy-updated algorithms such as LSTD([lambda]) [BB96] and LSPE([lambda]) [BI96]. Projected equation methods generally may not converge. We define a pseudometric of policies and view the oscillations of policies in Tetris. Aggregation methods are based on a model approximation approach. The original problem is reduced to an aggregate problem with significantly fewer states. The weight vector r is the cost-to-go function of the aggregate problem and [phi] is the matrix of aggregation probabilities. In aggregation methods, the vector r converges to the optimal cost-to-go function of the aggregate problem. In this thesis, we implement aggregation methods for Tetris, and compare the performance of projected equation methods and aggregation methods.	en_US
dc.description.statementofresponsibility	by Daw-sen Hwang.	en_US
dc.format.extent	111 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Projected equation and aggregation-based approximate dynamic programming methods for Tetris	en_US
dc.title.alternative	Approximate dynamic programming : projected equation and aggregation methods for Tetris	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	752149312	en_US

Files in this item

Name: 752149312-MIT.pdf
Size: 4.429Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Graduate Theses
