Batched Bandit Problems
Author(s): Perchet, Vianney; Rigollet, Philippe; Chassang, Sylvain; Snowberg, Erik
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
Department: Massachusetts Institute of Technology. Department of Mathematics
forthcoming in Annals of Statistics
Institute of Mathematical Statistics
Perchet, Vianney, Philippe Rigollet, Sylvain Chassang, and Erik Snowberg. "Batched Bandit Problems." Annals of Statistics (2015).
Author's final manuscript