Show simple item record

dc.contributor.author: Pazis, Jason
dc.contributor.author: How, Jonathan P
dc.date.accessioned: 2018-03-26T14:30:15Z
dc.date.available: 2018-03-26T14:30:15Z
dc.date.issued: 2016
dc.identifier.issn: 1049-5258
dc.identifier.uri: http://hdl.handle.net/1721.1/114290
dc.description.abstract: We present the first application of the median of means in a PAC exploration algorithm for MDPs. Using the median of means allows us to significantly reduce the dependence of our bounds on the range of values that the value function can take, while introducing a dependence on the (potentially much smaller) variance of the Bellman operator. Additionally, our algorithm is the first algorithm with PAC bounds that can be applied to MDPs with unbounded rewards. [en_US]
dc.description.sponsorship: United States. Office of Naval Research (Grant N000141110688) [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Grant IIS-1218931) [en_US]
dc.publisher: Neural Information Processing Systems Foundation [en_US]
dc.relation.isversionof: https://papers.nips.cc/paper/6577-improving-pac-exploration-using-the-median-of-means [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Neural Information Processing Systems (NIPS) [en_US]
dc.title: Improving PAC exploration using the median of means [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Pazis, Jason et al. "Improving PAC Exploration Using the Median Of Means." Advances in Neural Information Processing Systems (NIPS 2016), 29 (2016) © 2016 NIPS Foundation - All Rights Reserved. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Aerospace Controls Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Aeronautics and Astronautics [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems [en_US]
dc.contributor.mitauthor: Pazis, Jason
dc.contributor.mitauthor: How, Jonathan P
dc.relation.journal: Advances in Neural Information Processing Systems (NIPS) [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2018-03-21T18:03:04Z
dspace.orderedauthors: Pazis, Jason; Parr, Ronald E.; How, Jonathan P. [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0001-8576-1930
mit.license: PUBLISHER_POLICY [en_US]

