Diverse Behavior Prediction through Deep Hybrid Models

Huang, Xin

Author(s)

Huang, Xin

Advisor

Williams, Brian C.

Terms of use

In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Predicting the future motions of agents is a crucial task for autonomous vehicles. This task is challenging because traffic agent behaviors, such as maneuvers, are multi-modal. In addition, predictors are often required to cover diverse behaviors with a limited number of prediction samples, because processing these samples in downstream tasks is computationally expensive. Existing model-based prediction methods leverage hybrid reasoning techniques to predict qualitatively representative agent motions from a large prediction space, yet they often assume simple agent dynamics and fail to account for scene context. Recently, learning-based approaches have demonstrated great success in learning complicated agent dynamics and scene context through deep neural networks to produce accurate trajectories in complex traffic scenes. However, they often rely on a black-box deep neural network and fail to explore the structure of the problem. In this thesis, we propose deep hybrid models that unify the power of model-based hybrid reasoning algorithms and learning-based models to predict a small set of accurate trajectory samples that cover qualitatively representative agent maneuvers. Our approach offers several advantages over existing model-based and learning-based predictors. First, it handles evolving intent over time by learning an accurate hybrid model that represents evolving discrete maneuvers and continuous trajectories, and by sampling a set of hybrid trajectory sequences through importance sampling based on learned proposal distributions. Second, it handles ambiguous maneuvers by learning a latent space of qualitative maneuvers that mimics human concepts of qualitatively representative maneuvers. Third, it generates samples that support multiple downstream tasks, including autonomous planning and driver warning, by adding a task-informed loss that leverages the task specification when additional task information is given.
We train and validate our models on large-scale public driving benchmarks, including the Argoverse forecasting dataset and the Waymo Open Motion Dataset. We present extensive qualitative and quantitative experimental results that demonstrate the advantage of our predictor over state-of-the-art model-based and learning-based baselines in terms of accuracy, diversity, and task performance.
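As an illustration only, the importance-sampling idea mentioned in the abstract can be sketched for the discrete maneuver part of a hybrid sequence. This is not the thesis implementation: the maneuver labels, the transition tables `TARGET` and `PROPOSAL`, and the function names are all hypothetical, the proposal here is a fixed uniform distribution rather than a learned one, and continuous trajectories are omitted.

```python
import random

# Hypothetical maneuver labels, chosen for illustration only.
MANEUVERS = ["keep_lane", "turn_left", "turn_right"]

# Hypothetical target model p(m_t | m_{t-1}); each row sums to 1.
TARGET = {
    "keep_lane": {"keep_lane": 0.80, "turn_left": 0.10, "turn_right": 0.10},
    "turn_left": {"keep_lane": 0.30, "turn_left": 0.65, "turn_right": 0.05},
    "turn_right": {"keep_lane": 0.30, "turn_left": 0.05, "turn_right": 0.65},
}

# Hypothetical proposal q(m_t | m_{t-1}): uniform, so that rare maneuvers
# are drawn more often than under the target model. A learned proposal
# would replace this table.
PROPOSAL = {m: {n: 1.0 / len(MANEUVERS) for n in MANEUVERS} for m in MANEUVERS}


def sample_sequence(start, horizon, rng):
    """Draw one maneuver sequence from the proposal and return it together
    with its unnormalized importance weight, the product of p / q per step."""
    sequence, weight, previous = [], 1.0, start
    for _ in range(horizon):
        options = PROPOSAL[previous]
        chosen = rng.choices(list(options), weights=list(options.values()))[0]
        weight *= TARGET[previous][chosen] / options[chosen]
        sequence.append(chosen)
        previous = chosen
    return sequence, weight


def importance_sample(start, horizon, num_samples, seed=0):
    """Return (sequence, normalized_weight) pairs; the weights sum to 1."""
    rng = random.Random(seed)
    samples = [sample_sequence(start, horizon, rng) for _ in range(num_samples)]
    total = sum(weight for _, weight in samples)
    return [(sequence, weight / total) for sequence, weight in samples]


if __name__ == "__main__":
    for sequence, weight in importance_sample("keep_lane", horizon=3, num_samples=5):
        print(sequence, round(weight, 3))
```

The normalized weights correct for the mismatch between proposal and target, so a small sample set can include unlikely maneuvers while still reflecting their probability under the target model.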

Date issued

2022-09

URI

https://hdl.handle.net/1721.1/147476

Department

Massachusetts Institute of Technology. Department of Aeronautics and Astronautics

Publisher

Massachusetts Institute of Technology

Collections

Doctoral Theses