Batched Bandit Problems

Perchet, Vianney; Rigollet, Philippe; Chassang, Sylvain; Snowberg, Erik

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

Date issued

2015-09-24

URI

http://hdl.handle.net/1721.1/98879

Department

Massachusetts Institute of Technology. Department of Mathematics

Journal

forthcoming in Annals of Statistics

Publisher

Institute of Mathematical Statistics

Citation

Perchet, Vianney, Philippe Rigollet, Sylvain Chassang, and Erik Snowberg. "Batched Bandit Problems." Annals of Statistics (2015).

Version: Author's final manuscript

ISSN

0090-5364

Collections

MIT Open Access Articles
