Perturbation training for human-robot teams

Author(s)

Ramakrishnan, Ramya

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Julie A. Shah.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Today, robots are often deployed to work separately from people. Combining the strengths of humans and robots, however, can potentially lead to a stronger joint team. To have fluid human-robot collaboration, these teams must train to achieve high team performance and flexibility on new tasks. This requires a computational model that supports the human in learning and adapting to new situations. In this work, we design and evaluate a computational learning model that enables a human-robot team to co-develop joint strategies for performing novel tasks requiring coordination. The joint strategies are learned through "perturbation training," a human team-training strategy that requires practicing variations of a given task to help the team generalize to new variants of that task. Our Adaptive Perturbation Training (AdaPT) algorithm is a hybrid of transfer learning and reinforcement learning techniques and extends the Policy Reuse in Q-Learning (PRQL) algorithm to learn more quickly in new task variants. We empirically validate this advantage of AdaPT over PRQL through computational simulations. We then augment AdaPT with a co-learning framework and a computational bi-directional communication protocol so that the robot can work with a person in live interactions. These three features constitute our human-robot perturbation training model. We conducted human subject experiments as a proof of concept that our model enables a robot to draw from its library of prior experiences in a way that leads to high team performance. We compare our algorithm with a standard reinforcement learning algorithm, Q-learning, and find that AdaPT-trained teams achieved significantly higher reward on novel test tasks than Q-learning teams. This indicates that the robot's algorithm, rather than just the human's experience of perturbations, is key to achieving high team performance. We also show that our algorithm does not sacrifice performance on the base task after training on perturbations. Finally, we demonstrate that human-robot training in a simulation environment using AdaPT produced effective team performance with an embodied robot partner.
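
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of tabular Q-learning combined with a simple form of policy reuse: with some probability the learner follows a policy kept from an earlier task, otherwise it acts epsilon-greedily on its own Q-table. It is not AdaPT and is not taken from the thesis. The function names (`q_update`, `choose_action`, `train`), the five-state chain task, and every parameter value are hypothetical choices made only for illustration.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def choose_action(q, state, past_policy, reuse_prob, epsilon, rng):
    """Follow the past policy with probability reuse_prob;
    otherwise act epsilon-greedily on the current Q-table."""
    if past_policy is not None and rng.random() < reuse_prob:
        return past_policy[state]
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


def train(n_states=5, episodes=200, past_policy=None,
          psi=0.9, decay=0.95, epsilon=0.1, seed=0):
    """Learn a toy chain task (assumed for illustration only):
    action 0 moves left, action 1 moves right, and reaching the
    last state yields reward 1 and ends the episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, reuse_prob = 0, psi
        for _ in range(50):  # cap on episode length
            action = choose_action(q, state, past_policy,
                                   reuse_prob, epsilon, rng)
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            q_update(q, state, action, reward, next_state)
            # Rely less on the past policy as the episode goes on.
            state, reuse_prob = next_state, reuse_prob * decay
            if done:
                break
    return q


if __name__ == "__main__":
    # A past policy that always moves right, as if kept from a similar task.
    q_table = train(past_policy=[1, 1, 1, 1, 1])
    print(q_table)
```

In this sketch the past policy only biases exploration; the Q-table for the new task is still learned from scratch. How AdaPT selects among and adapts its library of prior experiences is described in the thesis itself and is not reproduced here.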

Description

Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 63-67).

Date issued

2015

URI

http://hdl.handle.net/1721.1/99845

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Graduate Theses