Learning for multi-robot cooperation in partially observable stochastic environments with macro-actions

Liu, Miao; Sivakumar, Kavinayan; Omidshafiei, Shayegan; Amato, Christopher; How, Jonathan P.

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume that the underlying Dec-POMDP model is known a priori or that a full simulator is available at planning time. Previous methods that aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations exist that involve a large team of heterogeneous robots and long planning horizons. This work addresses these gaps by proposing an iterative sampling-based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show that the algorithm achieves better solution quality than state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate that the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment.

Date issued

2017-12

URI

http://hdl.handle.net/1721.1/114739

Department

Massachusetts Institute of Technology. Department of Aeronautics and Astronautics; Massachusetts Institute of Technology. Department of Mechanical Engineering; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems

Journal

2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Citation

Liu, Miao, Kavinayan Sivakumar, Shayegan Omidshafiei, Christopher Amato, and Jonathan P. How. “Learning for Multi-Robot Cooperation in Partially Observable Stochastic Environments with Macro-Actions.” 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017, Vancouver, BC, Canada, 2017.

Version: Original manuscript

ISBN

978-1-5386-2682-5

978-1-5386-2681-8

978-1-5386-2683-2

ISSN

2153-0866

Collections

MIT Open Access Articles

DSpace@MIT