dc.contributor.author: Foster, Dylan J
dc.contributor.author: Rakhlin, Alexander
dc.date.accessioned: 2021-12-03T15:09:50Z
dc.date.available: 2021-12-03T15:09:50Z
dc.date.issued: 2020
dc.identifier.uri: https://hdl.handle.net/1721.1/138306
dc.description.abstract: A fundamental challenge in contextual bandits is to develop flexible, general-purpose algorithms with computational requirements no worse than classical supervised learning tasks such as classification and regression. Algorithms based on regression have shown promising empirical success, but theoretical guarantees have remained elusive except in special cases. We provide the first universal and optimal reduction from contextual bandits to online regression. We show how to transform any oracle for online regression with a given value function class into an algorithm for contextual bandits with the induced policy class, with no overhead in runtime or memory requirements. We characterize the minimax rates for contextual bandits with general, potentially nonparametric function classes, and show that our algorithm is minimax optimal whenever the oracle obtains the optimal rate for regression. Compared to previous results, our algorithm requires no distributional assumptions beyond realizability, and works even when contexts are chosen adversarially. [en_US]
dc.language.iso: en
dc.relation.isversionof: https://proceedings.mlr.press/v119/foster20a [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Proceedings of Machine Learning Research [en_US]
dc.title: Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Foster, Dylan J and Rakhlin, Alexander. 2020. "Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles." INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 119.
dc.contributor.department: Statistics and Data Science Center (Massachusetts Institute of Technology)
dc.contributor.department: Massachusetts Institute of Technology. Institute for Data, Systems, and Society
dc.contributor.department: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
dc.contributor.department: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
dc.relation.journal: INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119 [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2021-12-03T15:06:05Z
dspace.orderedauthors: Foster, DJ; Rakhlin, A [en_US]
dspace.date.submission: 2021-12-03T15:06:07Z
mit.journal.volume: 119 [en_US]
mit.license: PUBLISHER_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

