| dc.contributor.author | Amato, Christopher | |
| dc.contributor.author | Liao, Xuejun | |
| dc.contributor.author | Carin, Lawrence | |
| dc.contributor.author | Liu, Miao | |
| dc.contributor.author | How, Jonathan P | |
| dc.date.accessioned | 2016-10-21T19:07:30Z | |
| dc.date.available | 2016-10-21T19:07:30Z | |
| dc.date.issued | 2015-07 | |
| dc.identifier.uri | http://hdl.handle.net/1721.1/104918 | |
| dc.description.abstract | Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to local maxima that are far from the optimal value. This paper represents the local policy of each agent using variable-sized FSCs that are constructed using a stick-breaking prior, leading to a new framework called decentralized stick-breaking policy representation (Dec-SBPR). This approach learns the controller parameters with a variational Bayesian algorithm without having to assume that the Dec-POMDP model is available. The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods. | en_US |
| dc.description.sponsorship | United States. Office of Naval Research. Multidisciplinary University Research Initiative (Award N000141110688) | en_US |
| dc.description.sponsorship | National Science Foundation (U.S.) (Award 1463945) | en_US |
| dc.language.iso | en_US | |
| dc.publisher | International Joint Conferences on Artificial Intelligence, Inc. | en_US |
| dc.relation.isversionof | http://ijcai-15.org/index.php/accepted-papers | en_US |
| dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
| dc.source | MIT web domain | en_US |
| dc.title | Stick-breaking policy learning in Dec-POMDPs | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Liu, Miao et al. "Stick-Breaking Policy Learning in Dec-POMDPs." International Joint Conference on Artificial Intelligence, July 25-31, 2015, Buenos Aires, Argentina. | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Laboratory for Information and Decision Systems | en_US |
| dc.contributor.mitauthor | Liu, Miao | |
| dc.contributor.mitauthor | How, Jonathan P | |
| dc.relation.journal | International Joint Conference on Artificial Intelligence | en_US |
| dc.eprint.version | Author's final manuscript | en_US |
| dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
| eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
| dspace.orderedauthors | Liu, Miao; Amato, Christopher; Liao, Xuejun; Carin, Lawrence; How, Jonathan P. | en_US |
| dspace.embargo.terms | N | en_US |
| dc.identifier.orcid | https://orcid.org/0000-0002-1648-8325 | |
| dc.identifier.orcid | https://orcid.org/0000-0001-8576-1930 | |
| mit.license | OPEN_ACCESS_POLICY | en_US |
| mit.metadata.status | Complete | |