A wearable system that learns a kinematic model and finds structure in everyday manipulation by using absolute orientation sensors and a camera

Author(s)

Kemp, Charles C. (Charles Clark), 1972-

Other Contributors

Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.

Advisor

Rodney Brooks.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

This thesis presents Duo, the first wearable system to autonomously learn a kinematic model of the wearer via body-mounted absolute orientation sensors and a head-mounted camera. With Duo, we demonstrate the significant benefits of endowing a wearable system with the ability to sense the kinematic configuration of the wearer's body. We also show that a kinematic model can be autonomously estimated offline from less than an hour of recorded video and orientation data from a wearer performing unconstrained, unscripted household activities within a real, unaltered home environment. We demonstrate that our system for autonomously estimating this kinematic model places very few constraints on the wearer's body, the placement of the sensors, and the appearance of the hand. This allows it, for example, to automatically discover a left-handed kinematic model for a left-handed wearer and to automatically compensate for distinct camera mounts and sensor configurations. Furthermore, we show that this learned kinematic model efficiently and robustly predicts the location of the dominant hand within video from the head-mounted camera, even in situations where vision-based hand detectors would be likely to fail.

Additionally, we show ways in which the learned kinematic model can facilitate highly efficient processing of large databases of first-person experience. Finally, we show that the kinematic model can efficiently direct visual processing so as to acquire a large number of high-quality segments of the wearer's hand and the manipulated objects. In the course of justifying these claims, we present methods for estimating global image motion, segmenting foreground motion, segmenting manipulation events, finding and representing significant hand postures, segmenting visual regions, and detecting visual points of interest with associated shape descriptors. We also describe our architecture and user-level application for machine-augmented annotation and browsing of first-person video and absolute orientations. Additionally, we present a real-time application in which the human and wearable cooperate through tightly integrated behaviors coordinated by the wearable's kinematic perception, and together acquire high-quality visual segments of manipulable objects that interest the wearable.

Description

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.

Includes bibliographical references (p. 215-220).

Date issued

2005

URI

http://hdl.handle.net/1721.1/33920

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses