Approximating the crowd
Author(s)
Ertekin, Şeyda; Rudin, Cynthia; Hirsh, Haym
Terms of use
Open Access Policy; Creative Commons Attribution-Noncommercial-Share Alike
Abstract
The problem of “approximating the crowd” is that of estimating the crowd’s majority opinion by querying only a subset of it. Algorithms that approximate the crowd can intelligently stretch a limited budget for a crowdsourcing task. We present an algorithm, “CrowdSense,” that works in an online fashion where items come one at a time. CrowdSense dynamically samples subsets of the crowd based on an exploration/exploitation criterion. The algorithm produces a weighted combination of the subset’s votes that approximates the crowd’s opinion. We then introduce two variations of CrowdSense that make various distributional approximations to handle distinct crowd characteristics. In particular, the first algorithm makes a statistical independence approximation of the labelers for large crowds, whereas the second algorithm finds a lower bound on how often the current subcrowd agrees with the crowd’s majority vote. Our experiments on CrowdSense and several baselines demonstrate that we can reliably approximate the entire crowd’s vote by collecting opinions from a representative subset of the crowd.
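The online loop described in the abstract can be illustrated with a minimal Python sketch. It is not the published CrowdSense algorithm: the function name `approximate_crowd`, the `seed_size` and `smoothing` parameters, the smoothed agreement-rate weights, and the rule "top-weighted labelers plus one random labeler" are all assumptions chosen for illustration. Votes are assumed to be +1 or -1, with one vote per labeler per item.

```python
import random


def majority(item):
    """Majority vote of the full crowd for one item (ties go to +1)."""
    return 1 if sum(item) >= 0 else -1


def approximate_crowd(votes, seed_size=3, smoothing=1.0, rng=None):
    """Hypothetical sketch: estimate the crowd's majority vote per item
    while querying only `seed_size` labelers for each item.

    votes: list of items; each item is a list of +1/-1 votes, one per labeler.
    Returns (predictions, number_of_queries_used).
    """
    rng = rng or random.Random(0)
    n = len(votes[0])
    agree = [0.0] * n  # times a labeler matched the subset's weighted decision
    asked = [0.0] * n  # times a labeler was queried
    predictions = []
    queries = 0

    for item in votes:  # items arrive one at a time
        # Assumed weight: smoothed rate of past agreement with the decision.
        weight = [(agree[i] + smoothing) / (asked[i] + 2 * smoothing)
                  for i in range(n)]
        ranked = sorted(range(n), key=lambda i: weight[i], reverse=True)

        chosen = ranked[:seed_size - 1]      # exploitation: best so far
        rest = ranked[seed_size - 1:]
        if rest:
            chosen.append(rng.choice(rest))  # exploration: one random labeler

        # Weighted combination of the queried subset's votes.
        score = sum(weight[i] * item[i] for i in chosen)
        label = 1 if score >= 0 else -1

        for i in chosen:
            asked[i] += 1
            if item[i] == label:
                agree[i] += 1

        predictions.append(label)
        queries += len(chosen)

    return predictions, queries
```

Comparing the returned predictions with `majority(item)` for each item gives the fraction of items on which the queried subset reproduces the full crowd's vote, and the query count shows how much of the labeling budget was spent.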
Date issued
2014-06
Department
Massachusetts Institute of Technology. Center for Collective Intelligence; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Sloan School of Management
Journal
Data Mining and Knowledge Discovery
Publisher
Springer US
Citation
Ertekin, Şeyda, Cynthia Rudin, and Haym Hirsh. “Approximating the Crowd.” Data Min Knowl Disc 28, no. 5–6 (June 14, 2014): 1189–1221.
Version: Author's final manuscript
ISSN
1384-5810
1573-756X