Learning and control for interactions in mixed human-robot environments
Author(s)
Schwarting, Wilko.
Download: 1252062025-MIT.pdf (24.78 MB)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Daniela Rus.
Abstract
Autonomous robots will soon be a commonplace presence in our daily lives in environments such as homes, factories, and roads. In order to reap the tremendous benefits that these robots offer to society, we must ensure that they can interact with humans seamlessly and safely. In this dissertation, we study intelligent agents that learn to reason about human behavior and people's intentions. These agents predict others' intentions and implicitly communicate their own through human-like actions that people can understand. They also anticipate and leverage the effect of their actions on the actions of others in the environment. When their own interests and the interests of others are not aligned, the agents quantify people's willingness to cooperate or defect and negotiate through social behavior. The agents form beliefs by perceiving the world and the actions of others. They create plans to actively gather information about themselves, others, and the environment, while avoiding actions that lead to high uncertainty. They also reason about the beliefs of others, and can leverage how their actions influence those beliefs.

In part (I) of this thesis, we formulate social human-robot interactions as a best-response game in which each agent negotiates to maximize its utility, and we learn human reward functions from data. We measure Social Value Orientation (SVO) to quantify an agent's degree of selfishness or altruism and thereby better predict human behavior. In part (II), we additionally enable agents to leverage information gain and to reason about the beliefs of others in stochastic environments with partial observations, by combining game-theoretic and belief-space planning. In part (III), we present a multi-agent reinforcement learning algorithm that learns competitive visual control policies through self-play in imagination. The agent learns from competition by imagining multi-agent interaction sequences in the compact latent space of a learned world model that combines a joint transition function with opponent viewpoint prediction. Lastly, in part (IV), we introduce Parallel Autonomy, a Guardian system that uses uncertain predictions to provide safety in challenging driving scenarios while following people's desired actions as closely as is safely possible.
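As a minimal sketch of the Social Value Orientation idea above: an SVO angle trades off an agent's own reward against the reward of others, with a selfish agent at 0 radians and an altruistic agent at pi/2. The function name and exact formulation below are illustrative assumptions, not necessarily the thesis's precise definition:

```python
import math

def svo_utility(reward_self: float, reward_other: float, phi: float) -> float:
    """Weight own vs. other's reward by an SVO angle phi (radians).

    phi = 0      -> purely selfish (only own reward counts)
    phi = pi/4   -> prosocial (equal weighting)
    phi = pi/2   -> purely altruistic (only the other's reward counts)
    """
    return math.cos(phi) * reward_self + math.sin(phi) * reward_other

# A purely selfish agent ignores the other's reward entirely:
selfish = svo_utility(1.0, 5.0, phi=0.0)       # -> 1.0
# A prosocial agent weights both rewards equally:
prosocial = svo_utility(1.0, 1.0, phi=math.pi / 4)
```

In a best-response game, each agent would then plan against the others' predicted actions while maximizing such an SVO-weighted utility, with phi estimated from observed human behavior.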
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February 2021. Cataloged from the official PDF of the thesis. Includes bibliographical references (pages 217-235).
Date issued
2021
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.