dc.contributor.author | Pazis, Jason | |
dc.contributor.author | How, Jonathan P | |
dc.date.accessioned | 2018-03-26T14:30:15Z | |
dc.date.available | 2018-03-26T14:30:15Z | |
dc.date.issued | 2016 | |
dc.identifier.issn | 1049-5258 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/114290 | |
dc.description.abstract | We present the first application of the median of means in a PAC exploration algorithm for MDPs. Using the median of means allows us to significantly reduce the dependence of our bounds on the range of values that the value function can take, while introducing a dependence on the (potentially much smaller) variance of the Bellman operator. Additionally, our algorithm is the first algorithm with PAC bounds that can be applied to MDPs with unbounded rewards. | en_US |
dc.description.sponsorship | United States. Office of Naval Research (Grant N000141110688) | en_US |
dc.description.sponsorship | National Science Foundation (U.S.) (Grant IIS-1218931) | en_US |
dc.publisher | Neural Information Processing Systems Foundation | en_US |
dc.relation.isversionof | https://papers.nips.cc/paper/6577-improving-pac-exploration-using-the-median-of-means | en_US |
dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
dc.source | Neural Information Processing Systems (NIPS) | en_US |
dc.title | Improving PAC exploration using the median of means | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Pazis, Jason et al. "Improving PAC Exploration Using the Median of Means." Advances in Neural Information Processing Systems (NIPS 2016), 29 (2016) © 2016 NIPS Foundation - All Rights Reserved. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Aerospace Controls Laboratory | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Laboratory for Information and Decision Systems | en_US |
dc.contributor.mitauthor | Pazis, Jason | |
dc.contributor.mitauthor | How, Jonathan P | |
dc.relation.journal | Advances in Neural Information Processing Systems (NIPS) | en_US |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2018-03-21T18:03:04Z | |
dspace.orderedauthors | Pazis, Jason; Parr, Ronald E.; How, Jonathan P. | en_US |
dspace.embargo.terms | N | en_US |
dc.identifier.orcid | https://orcid.org/0000-0001-8576-1930 | |
mit.license | PUBLISHER_POLICY | en_US |