HeadLock : wide-range head pose estimation for low resolution video

DeCamp, Philip (Philip James)

dc.contributor.advisor	Deb Roy.	en_US
dc.contributor.author	DeCamp, Philip (Philip James)	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Architecture. Program in Media Arts and Sciences.	en_US
dc.date.accessioned	2008-09-03T15:35:00Z
dc.date.available	2008-09-03T15:35:00Z
dc.date.copyright	2007	en_US
dc.date.issued	2008	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/42411
dc.description	Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, February 2008.	en_US
dc.description	Includes bibliographical references (p. 85-87).	en_US
dc.description.abstract	This thesis focuses on data mining technologies to extract head pose information from low-resolution video recordings. Head pose, as an approximation of gaze direction, is a key indicator of human behavior and interaction. Extracting head pose information from video recordings is a labor-intensive endeavor that severely limits the feasibility of using large video corpora to perform tasks that require analysis of human behavior. HeadLock is a novel head pose annotation and tracking tool. Pose annotation is formulated as a semiautomatic process in which a human annotator is aided by computationally generated head pose estimates, significantly reducing the human effort required to accurately annotate video recordings. HeadLock has been designed to perform head pose tracking on video from overhead, wide-angle cameras. The head pose estimation system used by HeadLock can perform pose estimation to arbitrary precision on images that reveal only the top or back of a head. This system takes a 3D model-based approach in which heads are modeled as 3D surfaces covered with localized features. The set of features used can be reliably extracted from both hair and skin regions at any resolution, providing better performance for images that may contain small facial regions and no discernible facial features. HeadLock is evaluated on video recorded for the Human Speechome Project (HSP), a research initiative to study human language development by analyzing longitudinal audio-video recordings of a developing child. Results indicate that HeadLock may enable annotation of head pose at ten times the speed of a manual approach. In addition to head tracking, this thesis describes the data collection and data management systems that have been developed for HSP, providing a comprehensive example of how very large corpora of video recordings may be used to research human development, health, and behavior.	en_US
dc.description.statementofresponsibility	by Philip DeCamp.	en_US
dc.format.extent	87 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Architecture. Program in Media Arts and Sciences.	en_US
dc.title	HeadLock : wide-range head pose estimation for low resolution video	en_US
dc.title.alternative	Wide-range head pose estimation for low resolution video	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Program in Media Arts and Sciences (Massachusetts Institute of Technology)
dc.identifier.oclc	237210074	en_US

Files in this item

Name: 237210074-MIT.pdf
Size: 11.60Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Graduate Theses
