dc.contributor.author: Amato, Christopher
dc.contributor.author: Liao, Xuejun
dc.contributor.author: Carin, Lawrence
dc.contributor.author: Liu, Miao
dc.contributor.author: How, Jonathan P
dc.date.accessioned: 2016-10-21T19:07:30Z
dc.date.available: 2016-10-21T19:07:30Z
dc.date.issued: 2015-07
dc.identifier.uri: http://hdl.handle.net/1721.1/104918
dc.description.abstract: Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to maxima that are far from the optimal value. This paper represents the local policy of each agent using variable-sized FSCs that are constructed using a stick-breaking prior, leading to a new framework called decentralized stick-breaking policy representation (Dec-SBPR). This approach learns the controller parameters with a variational Bayesian algorithm without having to assume that the Dec-POMDP model is available. The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods. [en_US]
dc.description.sponsorship: United States. Office of Naval Research. Multidisciplinary University Research Initiative (Award N000141110688) [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Award 1463945) [en_US]
dc.language.iso: en_US
dc.publisher: International Joint Conferences on Artificial Intelligence, Inc. [en_US]
dc.relation.isversionof: http://ijcai-15.org/index.php/accepted-papers [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: MIT web domain [en_US]
dc.title: Stick-breaking policy learning in Dec-POMDPs [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Liu, Miao et al. "Stick-Breaking Policy Learning in Dec-POMDPs." International Joint Conference on Artificial Intelligence, July 25-31, 2015, Buenos Aires, Argentina. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Aeronautics and Astronautics [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems [en_US]
dc.contributor.mitauthor: Liu, Miao
dc.contributor.mitauthor: How, Jonathan P
dc.relation.journal: International Joint Conference on Artificial Intelligence [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Liu, Miao; Amato, Christopher; Liao, Xuejun; Carin, Lawrence; How, Jonathan P. [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-1648-8325
dc.identifier.orcid: https://orcid.org/0000-0001-8576-1930
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete

