An approach for nonlinear control design via approximate dynamic programming
Author(s): Boussios, Constantinos I
Advisor: Munther A. Dahleh and John N. Tsitsiklis.
This thesis proposes and studies a methodology for designing controllers for nonlinear dynamic systems. We are interested in state feedback controllers (policies) that stabilize the state in a given region around an equilibrium point while minimizing a cost functional that captures the performance of the closed-loop system. The optimal control problem can be solved in principle using dynamic programming algorithms such as policy iteration. Exact policy iteration is computationally infeasible for systems of even moderate dimension, which leads us to consider methods based on Approximate Policy Iteration. In such methods, we first select an approximation architecture (i.e., a parametrized class of functions) that is used to approximate the cost-to-go function under a given policy, on the basis of cost samples that are obtained through simulation. The resulting approximate cost function is used to derive another, hopefully better policy, and the procedure is repeated iteratively. Several case studies have explored the use of this methodology, but they are of limited generality and lack much of a theoretical foundation. This thesis explores the soundness of Approximate Policy Iteration. We address the problem of improving the performance of a given stabilizing controller, as well as the problem of designing stabilizing controllers for unstable systems. For the first problem, we develop bounds on the approximation error that can be tolerated if we wish to guarantee that the resulting controllers are stabilizing and/or offer improved performance. We give bounds on the suboptimality of the resulting controllers, in terms of the assumed approximation errors. We also extend the methodology to the unstable case by introducing an appropriate modification of the optimal control problem. The computational burden of cost function approximation can often be reduced by exploiting special structure, thereby enhancing the practicality of the method. We illustrate this for a special class of nonlinear systems with fast linear and slow nonlinear dynamics. We also present an approximation based on state space gridding, whose performance can be evaluated via a systematic test. Finally, the analysis is supported by applying Approximate Policy Iteration to two specific problems, one involving a missile model and the other involving a beam-and-ball model.
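The loop described in the abstract (simulate cost samples under the current policy, fit a parametrized cost-to-go, derive a new policy from the fit, repeat) can be illustrated with a minimal sketch. This is not the thesis's implementation: the scalar linear plant, the quadratic stage cost, the one-parameter architecture J(x) ~ p*x^2, and all names and constants below are assumptions chosen so the example stays short and its result can be checked by hand. The thesis itself targets nonlinear systems.

```python
import numpy as np

# Hypothetical scalar plant and stage cost, chosen only for illustration:
#   x_next = A*x + B*u,   stage cost = x^2 + R*u^2
A, B, R = 1.2, 1.0, 1.0

def simulate_cost(x0, gain, horizon=200):
    """Cost-to-go sample from x0 under the policy u = -gain*x (truncated rollout)."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = -gain * x
        total += x**2 + R * u**2
        x = A * x + B * u
    return total

def fit_cost(gain, samples):
    """Least-squares fit of the assumed architecture J(x) ~ p*x^2 to simulated costs."""
    features = samples**2
    costs = np.array([simulate_cost(x0, gain) for x0 in samples])
    return float(features @ costs / (features @ features))

def improve_policy(p):
    """Gain of the policy that minimizes x^2 + R*u^2 + p*(A*x + B*u)^2 over u."""
    return A * B * p / (R + B**2 * p)

def approximate_policy_iteration(gain, iterations=10):
    """Alternate cost fitting and policy improvement, starting from a stabilizing gain."""
    samples = np.linspace(-1.0, 1.0, 21)  # initial states used for the cost samples
    for _ in range(iterations):
        p = fit_cost(gain, samples)       # approximate policy evaluation
        gain = improve_policy(p)          # policy improvement
    return gain, p

if __name__ == "__main__":
    gain, p = approximate_policy_iteration(gain=1.0)  # |A - B*1.0| < 1, so stabilizing
    print(f"gain = {gain:.4f}, fitted cost parameter p = {p:.4f}")
```

In this linear-quadratic special case the architecture represents the true cost-to-go exactly, so the only approximation error comes from truncating the rollouts. The questions the abstract raises, namely how much approximation error can be tolerated while still guaranteeing stability or improved performance, arise when the architecture cannot represent the cost exactly.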
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 1998.
Includes bibliographical references (p. 173-181).
Department: Massachusetts Institute of Technology. Department of Mechanical Engineering
Massachusetts Institute of Technology