
dc.contributor.author: Gutin, Eli
dc.contributor.author: Farias, Vivek F.
dc.date.accessioned: 2020-11-12T20:39:49Z
dc.date.available: 2020-11-12T20:39:49Z
dc.date.issued: 2016-12
dc.identifier.uri: https://hdl.handle.net/1721.1/128464
dc.description.abstract: Starting with the Thompson sampling algorithm, recent years have seen a resurgence of interest in Bayesian algorithms for the Multi-armed Bandit (MAB) problem. These algorithms seek to exploit prior information on arm biases, and while several have been shown to be regret optimal, their design has not emerged from a principled approach. In contrast, if one cared about Bayesian regret discounted over an infinite horizon at a fixed, pre-specified rate, the celebrated Gittins index theorem offers an optimal algorithm. Unfortunately, the Gittins analysis does not appear to carry over to minimizing Bayesian regret over all sufficiently large horizons, and computing a Gittins index is onerous relative to essentially any incumbent index scheme for the Bayesian MAB problem. The present paper proposes a sequence of 'optimistic' approximations to the Gittins index. We show that the use of these approximations in concert with the use of an increasing discount factor appears to offer a compelling alternative to state-of-the-art index schemes proposed for the Bayesian MAB problem in recent years by offering substantially improved performance with little to no additional computational overhead. In addition, we prove that the simplest of these approximations yields frequentist regret that matches the Lai-Robbins lower bound, including achieving matching constants. [en_US]
dc.publisher: NIPS Foundation [en_US]
dc.relation.isversionof: https://papers.nips.cc/paper/6036-optimistic-gittins-indices [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Neural Information Processing Systems (NIPS) [en_US]
dc.title: Optimistic Gittins indices [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Gutin, Eli and Vivek F. Farias. "Optimistic Gittins indices." Advances in Neural Information Processing Systems 29 (NIPS 2016), Barcelona, Spain, NIPS Foundation, December 2016. © 2016 NIPS Foundation [en_US]
dc.contributor.department: Sloan School of Management [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Operations Research Center [en_US]
dc.relation.journal: Advances in Neural Information Processing Systems 29 (NIPS 2016) [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2019-02-12T15:33:33Z
dspace.embargo.terms: N [en_US]
dspace.date.submission: 2019-04-04T14:47:22Z
mit.license: PUBLISHER_POLICY [en_US]
mit.metadata.status: Complete

