dc.contributor.author: Berke, Alex
dc.contributor.author: Mahari, Robert
dc.contributor.author: Pentland, Sandy
dc.contributor.author: Larson, Kent
dc.contributor.author: Calacci, Dana
dc.date.accessioned: 2024-12-12T22:31:37Z
dc.date.available: 2024-12-12T22:31:37Z
dc.date.issued: 2024-11-08
dc.identifier.issn: 2573-0142
dc.identifier.uri: https://hdl.handle.net/1721.1/157844
dc.description.abstract: Data generated by users on digital platforms are a crucial resource for advocates and researchers interested in uncovering digital inequities, auditing algorithms, and understanding human behavior. Yet data access is often restricted. How can researchers both effectively and ethically collect user data? This paper shares an innovative approach to crowdsourcing user data to collect otherwise inaccessible Amazon purchase histories, spanning 5 years, from more than 5,000 U.S. users. We developed a data collection tool that prioritizes participant consent and includes an experimental study design. The design allows us to study multiple important aspects of privacy perception and user data sharing behavior, including how socio-demographics, monetary incentives and transparency can impact share rates. Experiment results (N=6,325) reveal both monetary incentives and transparency can significantly increase data sharing. Age, race, education, and gender also played a role, where female and less-educated participants were more likely to share. Our study design enables a unique empirical evaluation of the “privacy paradox”, where users claim to value their privacy more than they do in practice. We set up both real and hypothetical data sharing scenarios and find measurable similarities and differences in share rates across these contexts. For example, increasing monetary incentives had a 6 times higher impact on share rates in real scenarios. In addition, we study participants' opinions on how data should be used by various third parties, again finding that gender, age, education, and race have a significant impact. Notably, the majority of participants disapproved of government agencies using purchase data yet the majority approved of use by researchers. Overall, our findings highlight the critical role that transparency, incentive design, and user demographics play in ethical data collection practices, and provide guidance for future researchers seeking to crowdsource user generated data. (en_US)
dc.publisher: ACM (en_US)
dc.relation.isversionof: https://doi.org/10.1145/3687005 (en_US)
dc.rights: Creative Commons Attribution-Noncommercial (en_US)
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ (en_US)
dc.source: Association for Computing Machinery (en_US)
dc.title: Insights from an Experiment Crowdsourcing Data from Thousands of US Amazon Users: The importance of transparency, money, and data use (en_US)
dc.type: Article (en_US)
dc.identifier.citation: Berke, Alex, Mahari, Robert, Pentland, Sandy, Larson, Kent and Calacci, Dana. 2024. "Insights from an Experiment Crowdsourcing Data from Thousands of US Amazon Users: The importance of transparency, money, and data use." Proceedings of the ACM on Human-Computer Interaction, 8 (CSCW2).
dc.contributor.department: Massachusetts Institute of Technology. Media Laboratory (en_US)
dc.relation.journal: Proceedings of the ACM on Human-Computer Interaction (en_US)
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version (en_US)
dc.type.uri: http://purl.org/eprint/type/JournalArticle (en_US)
eprint.status: http://purl.org/eprint/status/PeerReviewed (en_US)
dc.date.updated: 2024-12-01T08:50:47Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2024-12-01T08:50:47Z
mit.journal.volume: 8 (en_US)
mit.journal.issue: CSCW2 (en_US)
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed (en_US)

