A wearable system that learns a kinematic model and finds structure in everyday manipulation by using absolute orientation sensors and a camera
Author(s)Kemp, Charles C. (Charles Clark), 1972-
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
MetadataShow full item record
This thesis presents Duo, the first wearable system to autonomously learn a kinematic model of the wearer via body-mounted absolute orientation sensors and a head-mounted camera. With Duo, we demonstrate the significant benefits of endowing a wearable system with the ability to sense the kinematic configuration of the wearer's body. We also show that a kinematic model can be autonomously estimated offline from less than an hour of recorded video and orientation data from a wearer performing unconstrained, unscripted, household activities within a real, unaltered, home environment. We demonstrate that our system for autonomously estimating this kinematic model places very few constraints on the wearer's body, the placement of the sensors, and the appearance of the hand, which, for example, allows it to automatically discover a left-handed kinematic model for a left-handed wearer, and to automatically compensate for distinct camera mounts, and sensor configurations. Furthermore, we show that this learned kinematic model efficiently and robustly predicts the location of the dominant hand within video from the head-mounted camera even in situations where vision-based hand detectors would be likely to fail.(cont.) Additionally, we show ways in which the learned kinematic model can facilitate highly efficient processing of large databases of first person experience. Finally, we show that the kinematic model can efficiently direct visual processing so as to acquire a large number of high quality segments of the wearer's hand and the manipulated objects. Within the course of justifying these claims, we present methods for estimating global image motion, segmenting foreground motion, segmenting manipulation events, finding and representing significant hand postures, segmenting visual regions, and detecting visual points of interest with associated shape descriptors. We also describe our architecture and user-level application for machine augmented annotation and browsing of first person video and absolute orientations. Additionally, we present a real-time application in which the human and wearable cooperate through tightly integrated behaviors coordinated by the wearable's kinematic perception, and together acquire high-quality visual segments of manipulable objects that interest the wearable.
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.Includes bibliographical references (p. 215-220).
DepartmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.