Simple item record

dc.contributor.author: Nguyen, Phu Tran
dc.contributor.author: Weir, Sarah
dc.contributor.author: Guo, Philip J.
dc.contributor.author: Miller, Robert C.
dc.contributor.author: Gajos, Krzysztof Z.
dc.contributor.author: Kim, Juho
dc.date.accessioned: 2014-09-26T18:02:11Z
dc.date.available: 2014-09-26T18:02:11Z
dc.date.issued: 2014-04
dc.identifier.isbn: 9781450324731
dc.identifier.uri: http://hdl.handle.net/1721.1/90410
dc.description.abstract: Millions of learners today use how-to videos to master new skills in a variety of domains. But browsing such videos is often tedious and inefficient because video player interfaces are not optimized for the unique step-by-step structure of such videos. This research aims to improve the learning experience of existing how-to videos with step-by-step annotations. We first performed a formative study to verify that annotations are actually useful to learners. We created ToolScape, an interactive video player that displays step descriptions and intermediate result thumbnails in the video timeline. Learners in our study performed better and gained more self-efficacy using ToolScape versus a traditional video player. To add the needed step annotations to existing how-to videos at scale, we introduce a novel crowdsourcing workflow. It extracts step-by-step structure from an existing video, including step times, descriptions, and before and after images. We introduce the Find-Verify-Expand design pattern for temporal and visual annotation, which applies clustering, text processing, and visual analysis algorithms to merge crowd output. The workflow does not rely on domain-specific customization, works on top of existing videos, and recruits untrained crowd workers. We evaluated the workflow with Mechanical Turk, using 75 cooking, makeup, and Photoshop videos on YouTube. Results show that our workflow can extract steps with a quality comparable to that of trained annotators across all three domains, with 77% precision and 81% recall. [en_US]
dc.language.iso: en_US
dc.publisher: Association for Computing Machinery (ACM) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1145/2556288.2556986 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: Other univ. web domain [en_US]
dc.title: Crowdsourcing step-by-step information extraction to enhance existing how-to videos [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Juho Kim, Phu Tran Nguyen, Sarah Weir, Philip J. Guo, Robert C. Miller, and Krzysztof Z. Gajos. 2014. Crowdsourcing step-by-step information extraction to enhance existing how-to videos. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '14). ACM, New York, NY, USA, 4017-4026. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.mitauthor: Kim, Juho [en_US]
dc.contributor.mitauthor: Nguyen, Phu Tran [en_US]
dc.contributor.mitauthor: Weir, Sarah [en_US]
dc.contributor.mitauthor: Miller, Robert C. [en_US]
dc.relation.journal: Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems (CHI '14) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Kim, Juho; Nguyen, Phu Tran; Weir, Sarah; Guo, Philip J.; Miller, Robert C.; Gajos, Krzysztof Z. [en_US]
dc.identifier.orcid: https://orcid.org/0000-0001-6348-4127
dc.identifier.orcid: https://orcid.org/0000-0002-0442-691X
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete
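
The abstract above notes that the Find-Verify-Expand pattern "applies clustering, text processing, and visual analysis algorithms to merge crowd output." As a minimal sketch of the temporal-merging idea only, here is an illustrative Python example that clusters crowd-submitted step timestamps; the function name, the 10-second tolerance, and the 3-worker support threshold are assumptions for illustration, not values or code from the paper.

    # Illustrative sketch only -- not the paper's implementation. Merges
    # crowd-submitted step timestamps (in seconds) by grouping nearby
    # points and keeping groups that enough workers agree on. The
    # tolerance and min_support defaults below are assumptions.

    def merge_step_times(timestamps, tolerance=10.0, min_support=3):
        clusters = []  # each cluster: timestamps judged to be one step
        for t in sorted(timestamps):
            if clusters and t - clusters[-1][-1] <= tolerance:
                clusters[-1].append(t)  # near the previous point: same step
            else:
                clusters.append([t])    # large gap: start a new candidate step
        # Keep only steps reported by at least min_support workers,
        # and return each surviving step's mean time.
        return [sum(c) / len(c) for c in clusters if len(c) >= min_support]

    # Example: three workers agree on steps near 12s and 45s; the lone
    # report at 80s lacks support and is dropped.
    print(merge_step_times([11.0, 12.5, 13.0, 44.0, 45.5, 46.0, 80.0]))
    # -> [12.166666666666666, 45.166666666666664]

Requiring agreement from several workers before accepting a step is one simple way untrained crowd output can approach trained-annotator quality, which is the effect the abstract reports (77% precision, 81% recall across cooking, makeup, and Photoshop videos).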

