Learning and control for interactions in mixed human-robot environments
Author(s)
Schwarting, Wilko.
Download: 1252062025-MIT.pdf (24.78 MB)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Daniela Rus.
Abstract
Autonomous robots will soon be a commonplace presence in our daily lives in environments such as homes, factories, and roads. In order to reap the tremendous benefits that these robots offer to society, we must ensure that they can interact with humans seamlessly and safely. In this dissertation, we study intelligent agents that learn to reason about human behavior and people's intentions. These agents predict others' intentions and implicitly communicate their own through human-like actions that people can understand. They also anticipate and leverage the effect of their actions on the actions of others in the environment. When their own interests and the interests of others are not aligned, the agents quantify people's willingness to cooperate or defect and negotiate through social behavior. The agents form beliefs by perceiving the world and the actions of others. They create plans to actively gather information about themselves, others, and the environment, while avoiding actions that lead to high uncertainty. They also reason about the beliefs of others, and can leverage how their actions influence those beliefs.

In part (I) of this thesis, we formulate social human-robot interactions as a best-response game in which each agent negotiates to maximize its utility, and we learn human reward functions from data. We measure Social Value Orientation (SVO) to quantify an agent's degree of selfishness or altruism and thereby better predict human behavior. In part (II), we additionally enable agents to leverage information gain and to reason about the beliefs of others in stochastic environments with partial observations, by combining game-theoretic and belief-space planning. In part (III), we present a multi-agent reinforcement learning algorithm that learns competitive visual control policies through self-play in imagination. The agent learns from competition by imagining multi-agent interaction sequences in the compact latent space of a learned world model that combines a joint transition function with opponent viewpoint prediction. Lastly, in part (IV), we introduce Parallel Autonomy, a Guardian system that uses uncertain predictions to provide safety in challenging driving scenarios while following people's desired actions as closely as is safely possible.
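As a minimal sketch of the Social Value Orientation idea above: an SVO angle trades off an agent's own reward against the reward of others, with a selfish agent at 0 radians and an altruistic agent at pi/2. The function name and exact formulation below are illustrative assumptions, not necessarily the thesis's precise definition:

```python
import math

def svo_utility(reward_self: float, reward_other: float, phi: float) -> float:
    """Weight own vs. other's reward by an SVO angle phi (radians).

    phi = 0      -> purely selfish (only own reward counts)
    phi = pi/4   -> prosocial (equal weighting)
    phi = pi/2   -> purely altruistic (only the other's reward counts)
    """
    return math.cos(phi) * reward_self + math.sin(phi) * reward_other

# A purely selfish agent ignores the other's reward entirely:
selfish = svo_utility(1.0, 5.0, phi=0.0)       # -> 1.0
# A prosocial agent weights both rewards equally:
prosocial = svo_utility(1.0, 1.0, phi=math.pi / 4)
```

In a best-response game, each agent would then plan against the others' predicted actions while maximizing such an SVO-weighted utility, with phi estimated from observed human behavior.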
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February 2021. Cataloged from the official PDF of the thesis. Includes bibliographical references (pages 217-235).
Date issued
2021
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.