dc.contributor.author | Amato, Christopher | |
dc.contributor.author | Liu, Miao | |
dc.contributor.author | Sivakumar, Kavinayan P | |
dc.contributor.author | Omidshafiei, Shayegan | |
dc.contributor.author | How, Jonathan P | |
dc.date.accessioned | 2018-04-13T22:28:08Z | |
dc.date.available | 2018-04-13T22:28:08Z | |
dc.date.issued | 2017-12 | |
dc.date.submitted | 2017-09 | |
dc.identifier.isbn | 978-1-5386-2682-5 | |
dc.identifier.isbn | 978-1-5386-2681-8 | |
dc.identifier.isbn | 978-1-5386-2683-2 | |
dc.identifier.issn | 2153-0866 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/114739 | |
dc.description.abstract | This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or that a full simulator is available during planning time. Previous methods which aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations involving a large team of heterogeneous robots and long planning horizons exist. This work addresses these gaps by proposing an iterative sampling-based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show the algorithm is able to achieve better solution quality than state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate that the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment. | en_US |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US |
dc.relation.isversionof | http://dx.doi.org/10.1109/IROS.2017.8206001 | en_US |
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
dc.source | arXiv | en_US |
dc.title | Learning for multi-robot cooperation in partially observable stochastic environments with macro-actions | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Liu, Miao, Kavinayan Sivakumar, Shayegan Omidshafiei, Christopher Amato, and Jonathan P. How. “Learning for Multi-Robot Cooperation in Partially Observable Stochastic Environments with Macro-Actions.” 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, September 2017. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Mechanical Engineering | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Laboratory for Information and Decision Systems | en_US |
dc.contributor.mitauthor | Liu, Miao | |
dc.contributor.mitauthor | Sivakumar, Kavinayan P | |
dc.contributor.mitauthor | Omidshafiei, Shayegan | |
dc.contributor.mitauthor | How, Jonathan P | |
dc.relation.journal | 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) | en_US |
dc.eprint.version | Original manuscript | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2018-03-21T16:14:11Z | |
dspace.orderedauthors | Liu, Miao; Sivakumar, Kavinayan; Omidshafiei, Shayegan; Amato, Christopher; How, Jonathan P. | en_US |
dspace.embargo.terms | N | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-1648-8325 | |
dc.identifier.orcid | https://orcid.org/0000-0003-0903-0137 | |
dc.identifier.orcid | https://orcid.org/0000-0001-8576-1930 | |
mit.license | OPEN_ACCESS_POLICY | en_US |