Optimization of neural network feedback control systems using automatic differentiation

Rollins, Elizabeth, S.M. Massachusetts Institute of Technology

Other Contributors

Massachusetts Institute of Technology. Dept. of Aeronautics and Astronautics.

Advisor

Steven R. Hall and Christopher W. Dever.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Optimal control problems can be challenging to solve, whether by analytic or numerical methods. This thesis examines the application of an adjoint method for optimal feedback control, which combines various algorithmic techniques into an original numerical method. In the method investigated here, a neural network defines the control input in both trajectory and feedback control optimization problems. An unconstrained optimization routine determines the neural network weights that minimize a cost function. Applying automatic differentiation to the code that evaluates the cost function yields the gradient of the cost with respect to the weights, which is used in the gradient search phase of the optimization process. Automatic differentiation is less work for the user than differentiating the code by hand and provides exact gradients, allowing the optimization of the neural network weights to proceed more rapidly. Another benefit of this method comes from its use of neural networks, which can represent complex feedback control policies because they are general nonlinear function approximators. Neural networks also have the potential to generalize, meaning that a control policy found using a sufficiently rich training set will often work well for initial conditions outside of the training set. Finally, the software implementation is modular, so the user only needs to adjust a few pieces of code to set up the method for a specific problem. The application of the adjoint method to three control problems with known solutions demonstrates the ability of the method to determine neural networks that produce near-optimal trajectories and control policies.
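The workflow described in the abstract (a neural network supplies the control, code evaluates the cost, reverse-mode automatic differentiation supplies the gradient, and an optimizer updates the weights) can be illustrated with a minimal sketch. Nothing below is taken from the thesis: the scalar plant x' = u, the quadratic cost, the two-unit tanh network, the hand-rolled `Var` class, and the fixed-step gradient descent (standing in for the unspecified unconstrained optimization routine) are all assumptions chosen only to keep the example self-contained.

```python
import math

class Var:
    """Scalar graph node for reverse-mode (adjoint) differentiation. Illustrative only."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (input Var, local partial derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__

    def tanh(self):
        t = math.tanh(self.value)
        return Var(t, ((self, 1.0 - t * t),))

    def backward(self):
        # Build a topological order iteratively, then sweep adjoints backward.
        order, seen, stack = [], set(), [(self, False)]
        while stack:
            node, expanded = stack.pop()
            if expanded:
                order.append(node)
            elif id(node) not in seen:
                seen.add(id(node))
                stack.append((node, True))
                stack.extend((p, False) for p, _ in node.parents)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad

def rollout_cost(weights, x0=1.0, dt=0.1, steps=20):
    """Euler-simulate the assumed plant x' = u with u = NN(x); sum dt*(x^2 + u^2)."""
    hidden = len(weights) // 3
    w1 = weights[:hidden]
    b1 = weights[hidden:2 * hidden]
    w2 = weights[2 * hidden:]
    x, cost = Var(x0), Var(0.0)
    for _ in range(steps):
        u = Var(0.0)
        for a, b, c in zip(w1, b1, w2):
            u = u + c * (a * x + b).tanh()  # feedback policy u = NN(x)
        cost = cost + dt * (x * x + u * u)
        x = x + dt * u
    return cost

def gradient(values):
    """Return (cost, d cost / d weights) from a single backward sweep."""
    weights = [Var(v) for v in values]
    cost = rollout_cost(weights)
    cost.backward()
    return cost.value, [w.grad for w in weights]

def train(values, rate=0.05, iterations=200):
    """Fixed-step gradient descent, a simple stand-in for a real optimizer."""
    for _ in range(iterations):
        _, grad = gradient(values)
        values = [v - rate * g for v, g in zip(values, grad)]
    return values

if __name__ == "__main__":
    w0 = [0.5, -0.5, 0.0, 0.0, -0.1, 0.1]  # arbitrary initial weights
    print("initial cost:", gradient(w0)[0])
    print("trained cost:", gradient(train(w0))[0])
```

One backward sweep returns the derivative of the cost with respect to every weight at once, which is what makes the adjoint approach attractive as the number of weights grows. A practical implementation would presumably use an established automatic differentiation tool and a quasi-Newton or similar optimizer rather than this toy class and fixed step size.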

Description

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2009.

Cataloged from PDF version of thesis.

Includes bibliographical references (p. 95-97).

Date issued

2009

URI

http://hdl.handle.net/1721.1/59691

Department

Massachusetts Institute of Technology. Department of Aeronautics and Astronautics

Publisher

Massachusetts Institute of Technology

Keywords

Aeronautics and Astronautics.

Collections

Graduate Theses