EdVidParse : detecting people and content in educational videos

Pratusevich, Michele

dc.contributor.advisor	Robert C. Miller and Antonio Torralba.	en_US
dc.contributor.author	Pratusevich, Michele	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2016-01-04T20:01:57Z
dc.date.available	2016-01-04T20:01:57Z
dc.date.copyright	2015	en_US
dc.date.issued	2015	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/100647
dc.description	Thesis: M. Eng. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.	en_US
dc.description	This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.	en_US
dc.description	Cataloged from student-submitted PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 61-65).	en_US
dc.description.abstract	There are thousands of hours of educational content on the Internet, with services like edX, Coursera, Berkeley WebCasts, and others offering hundreds of courses to hundreds of thousands of learners. Consequently, researchers are interested in the effectiveness of video learning. While educational videos vary, they share two common attributes: people and textual content. People are presenting content to learners in the form of text, graphs, charts, tables, and diagrams. With an annotation of people and textual content in an educational video, researchers can study the relationship between video learning and retention. This thesis presents EdVidParse, an automatic tool that takes an educational video and annotates it with bounding boxes around the people and textual content. EdVidParse uses internal features from deep convolutional neural networks to estimate the bounding boxes, achieving a 0.43 AP score on a test set. Three applications of EdVidParse, including identifying the video type, identifying people and textual content for interface design, and removing a person from a picture-in-picture video, are presented. EdVidParse provides an easy interface for identifying people and textual content inside educational videos for use in video annotation, interface design, and video reconfiguration.	en_US
dc.description.statementofresponsibility	by Michele Pratusevich.	en_US
dc.format.extent	65 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	EdVidParse : detecting people and content in educational videos	en_US
dc.title.alternative	Detecting people and content in educational videos	en_US
dc.type	Thesis	en_US
dc.description.degree	M. Eng. in Computer Science and Engineering	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	933247843	en_US

Files in this item

Name: 933247843-MIT.pdf
Size: 15.57 MB
Format: PDF
Description: Full printable version


This item appears in the following Collection(s)

Graduate Theses
