| dc.contributor.advisor | Alex P. Pentland. | en_US |
| dc.contributor.author | Dubey, Abhimanyu. | en_US |
| dc.contributor.other | Program in Media Arts and Sciences (Massachusetts Institute of Technology) | en_US |
| dc.date.accessioned | 2020-01-23T17:01:54Z | |
| dc.date.available | 2020-01-23T17:01:54Z | |
| dc.date.copyright | 2019 | en_US |
| dc.date.issued | 2019 | en_US |
| dc.identifier.uri | https://hdl.handle.net/1721.1/123636 | |
| dc.description | Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2019 | en_US |
| dc.description | Cataloged from PDF version of thesis. | en_US |
| dc.description | Includes bibliographical references (pages 99-106). | en_US |
| dc.description.abstract | In this thesis, I consider the problem of designing optimal algorithms for two specific settings of the stochastic multi-armed bandit problem. The first setting considers rewards drawn from a family of extremely heavy-tailed distributions known as [alpha]-stable distributions. For this setting, I extended an existing upper confidence bound algorithm to create an optimal frequentist algorithm, titled [alpha]-UCB. Next, I developed a variant of the Bayesian Thompson Sampling algorithm for this setting, titled Robust [alpha]-TS, which involved developing an efficient pipeline for posterior inference. I also proved finite-time regret bounds for this algorithm that are optimal up to logarithmic factors. The second setting is the networked multi-agent problem, in which agents communicate locally and have unique preferences. This setting generalizes the cooperative multi-agent stochastic bandit problem and is closely related to the single-agent bandit setting with side observations. For this setting, I developed an optimal upper confidence bound algorithm, titled Net-UCB. I also proved finite-time regret bounds for this algorithm that are logarithmic in the number of rounds and sub-linear in the number of agents. For both settings, I conducted extensive experiments to verify the tightness of the established regret bounds and to compare performance with existing state-of-the-art algorithms. The algorithms proposed in this thesis obtain competitive regret and state-of-the-art performance across a variety of problem settings. | en_US |
| dc.description.statementofresponsibility | by Abhimanyu Dubey. | en_US |
| dc.format.extent | 106 pages | en_US |
| dc.language.iso | eng | en_US |
| dc.publisher | Massachusetts Institute of Technology | en_US |
| dc.rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. | en_US |
| dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
| dc.subject | Program in Media Arts and Sciences | en_US |
| dc.title | Robust sequential decision-making on networks | en_US |
| dc.type | Thesis | en_US |
| dc.description.degree | S.M. | en_US |
| dc.contributor.department | Program in Media Arts and Sciences (Massachusetts Institute of Technology) | en_US |
| dc.identifier.oclc | 1136490586 | en_US |
| dc.description.collection | S.M. Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences | en_US |
| dspace.imported | 2020-01-23T17:01:53Z | en_US |
| mit.thesis.degree | Master | en_US |
| mit.thesis.department | Media | en_US |