HeadLock : wide-range head pose estimation for low resolution video
Author(s)
DeCamp, Philip (Philip James)
DownloadFull printable version (11.60Mb)
Alternative title
Wide-range head pose estimation for low resolution video
Other Contributors
Massachusetts Institute of Technology. Dept. of Architecture. Program in Media Arts and Sciences.
Advisor
Deb Roy.
Terms of use
Metadata
Show full item recordAbstract
This thesis focuses on data mining technologies to extract head pose information from low resolution video recordings. Head pose, as an approximation of gaze direction, is a key indicator of human behavior and interaction. Extracting head pose information from video recordings is a labor intensive endeavor that severely limits the feasibility of using large video corpora to perform tasks that require analysis of human behavior. HeadLock is a novel head pose annotation and tracking tool. Pose annotation is formulated as a semiautomatic process in which a human annotator is aided by computationally generated head pose estimates, significantly reducing the human effort required to accurately annotate video recordings. HeadLock has been designed to perform head pose tracking on video from overhead, wide-angle cameras. The head pose estimation system used by HeadLock can perform pose estimation to arbitrary precision on images that reveal only the top or back of a head. This system takes a 3D model-based approach in which heads are modeled as 3D surfaces covered with localized features. The set of features used can be reliably extracted from both hair and skin regions at any resolution, providing better performance for images that may contain small facial regions and no discernible facial features. HeadLock is evaluated on video recorded for the Human Speechome Project (HSP), a research initiative to study human language development by analyzing longitudinal audio-video recordings of a developing child. Results indicate that HeadLock may enable annotation of head pose at ten times the speed of a manual approach. In addition to head tracking, this thesis describes the data collection and data management systems that have been developed for HSP, providing a comprehensive example of how very large corpora of video recordings may be used to research human development, health and behavior.
Description
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, February 2008. Includes bibliographical references (p. 85-87).
Date issued
2008Department
Program in Media Arts and Sciences (Massachusetts Institute of Technology)Publisher
Massachusetts Institute of Technology
Keywords
Architecture. Program in Media Arts and Sciences.