Show simple item record

dc.contributor.advisor: Amin, Saurabh
dc.contributor.advisor: Jaillet, Patrick
dc.contributor.author: Guinet, Gauthier Marc Benoit
dc.date.accessioned: 2023-01-19T18:45:41Z
dc.date.available: 2023-01-19T18:45:41Z
dc.date.issued: 2022-09
dc.date.submitted: 2022-08-09T19:39:03.971Z
dc.identifier.uri: https://hdl.handle.net/1721.1/147326
dc.description.abstract: In this thesis, we study sequential decision-making models where the feedback received by the principal depends on strategic uncertainty (e.g., agents’ willingness to follow a recommendation) and/or random uncertainty (e.g., loss or delay in arrival of information). Such challenges often arise in AI-driven platforms, with applications in recommender systems, revenue management, or transportation. We model and study this class of problems through the lens of multi-armed and contextual bandits evolving in censored environments. Our goal is to estimate the performance loss due to censorship in the context of classical algorithms designed for uncensored environments. Our main contributions include the introduction of a broad class of censorship models and their analysis in terms of the effective dimension of the problem, a natural measure of its underlying statistical complexity and the main driver of the regret bound. In particular, the effective dimension allows us to maintain the structure of the original problem at first order, while embedding it in a bigger space, and thus naturally leads to results analogous to uncensored settings. Our analysis involves a continuous generalization of the Elliptical Potential Inequality, which we believe is of independent interest. We also discover an interesting property of decision-making under censorship: a transient phase, during which initial misspecification of censorship is self-corrected at an extra cost, followed by a stationary phase that reflects the inherent slowdown of learning governed by the effective dimension.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Bandit Problems under Censored Feedback
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Operations Research Center
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Operations Research


