Hazard avoidance alerting with Markov decision processes

Winder, Lee F. (Lee Francis), 1973-

Author(s)

Winder, Lee F. (Lee Francis), 1973-

DownloadFull printable version (12.75Mb)

Other Contributors

Massachusetts Institute of Technology. Dept. of Aeronautics and Astronautics.

Advisor

James K. Kuchar.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Metadata

Show full item record

Abstract

(cont.) (incident rate and unnecessary alert rate), the MDP-based logic can meet or exceed that of alternate logics.

This thesis describes an approach to designing hazard avoidance alerting systems based on a Markov decision process (MDP) model of the alerting process, and shows its benefits over standard design methods. One benefit of the MDP method is that it accounts for future decision opportunities when choosing whether or not to alert, or in determining resolution guidance. Another benefit is that it provides a means of modeling uncertain state information, such as unmeasurable mode variables, so that decisions are more informed. A mode variable is an index for distinct types of behavior that a system exhibits at different times. For example, in many situations normal system behavior tends to be safe, but rare deviations from the normal increase the likelihood of a harmful incident. Accurate modeling of mode information is needed to minimize alerting system errors such as unnecessary or late alerts. The benefits of the method are illustrated with two alerting scenarios where a pair of aircraft must avoid collisions when passing one another. The first scenario has a fully observable state and the second includes an uncertain mode describing whether an intruder aircraft levels off safely above the evader or is in a hazardous blunder mode. In MDP theory, outcome preferences are described in terms of utilities of different state trajectories. In keeping with this, alerting system requirements are stated in the form of a reward function. This is then used with probabilistic dynamic and sensor models to compute an alerting logic (policy) that maximizes expected utility. Performance comparisons are made between the MDP-based logics and alternate logics generated with current methods. It is found that in terms of traditional performance measures

Description

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2004.

Includes bibliographical references (p. 123-125).

Date issued

2004

URI

http://hdl.handle.net/1721.1/28860

Department

Massachusetts Institute of Technology. Department of Aeronautics and Astronautics

Publisher

Massachusetts Institute of Technology

Keywords

Aeronautics and Astronautics.

Collections

Doctoral Theses