MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Projected equation and aggregation-based approximate dynamic programming methods for Tetris

Author(s)
Hwang, Daw-sen
Thumbnail
DownloadFull printable version (4.429Mb)
Alternative title
Approximate dynamic programming : projected equation and aggregation methods for Tetris
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Dimitri P. Bertsekas.
Terms of use
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Metadata
Show full item record
Abstract
In this thesis, we survey approximate dynamic programming (ADP) methods and test the methods with the game of Tetris. We focus on ADP methods where the cost-to- go function J is approximated with [phi]r, where [phi] is some matrix and r is a vector with relatively low dimension. There are two major categories of methods: projected equation methods and aggregation methods. In projected equation methods, the cost-to-go function approximation [phi]r is updated by simulation using one of several policy-updated algorithms such as LSTD([lambda]) [BB96], and LSPE(A) [B196]. Projected equation methods generally may not converge. We define a pseudometric of policies and view the oscillations of policies in Tetris. Aggregation methods are based on a model approximation approach. The original problem is reduced to an aggregate problem with significantly fewer states. The weight vector r is the cost-to-go function of the aggregate problem and [phi] is the matrix of aggregation probabilities. In aggregation methods, the vector r converges to the optimal cost-to-go function of the aggregate problem. In this thesis, we implement aggregation methods for Tetris, and compare the performance of projected equation methods and aggregation methods.
Description
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.
 
Cataloged from PDF version of thesis.
 
Includes bibliographical references (p. 65-67).
 
Date issued
2011
URI
http://hdl.handle.net/1721.1/66033
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Graduate Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.