A few days of a robot's life in the human's world : toward incremental individual recognition

Aryananda, Lijin, 1975-

Author(s)

Aryananda, Lijin, 1975-

Other Contributors

Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.

Advisor

Rodney Brooks.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

This thesis presents an integrated framework and implementation for Mertz, an expressive robotic creature for exploring the task of face recognition through natural interaction in an incremental and unsupervised fashion. The goal of this thesis is to advance toward a framework which would allow robots to incrementally "get to know" a set of familiar individuals in a natural and extendable way. This thesis is motivated by the increasingly popular goal of integrating robots in the home. In order to be effective in human-centric tasks, the robots must be able to not only recognize each family member, but also to learn about the roles of various people in the household. In this thesis, we focus on two particular limitations of the current technology. Firstly, most face recognition research concentrates on the supervised classification problem. Currently, one of the biggest problems in face recognition is how to generalize the system to be able to recognize new test data that vary from the training data. Thus, until this problem is solved completely, the existing supervised approaches may require multiple manual introduction and labelling sessions to include training data with enough variations. Secondly, there is typically a large gap between research prototypes and commercial products, largely due to a lack of robustness and scalability to different environmental settings.

In this thesis, we propose an unsupervised approach that would allow for a more adaptive system, one that can incrementally update the training set with more recent data or new individuals over time. Moreover, it gives the robots a more natural social recognition mechanism to learn not only to recognize each person's appearance, but also to remember some relevant contextual information that the robot observed during previous interaction sessions. Therefore, this thesis focuses on integrating an unsupervised and incremental face recognition system within a physical robot which interfaces directly with humans through natural social interaction. The robot autonomously detects, tracks, and segments face images during these interactions and automatically generates a training set for its face recognition system. Moreover, in order to motivate robust solutions and address scalability issues, we chose to put the robot, Mertz, in unstructured public environments to interact with naive passersby, instead of with only the researchers within the laboratory environment. While an unsupervised and incremental face recognition system is a crucial element toward our target goal, it is only a part of the story. A face recognition system typically receives either pre-recorded face images or a streaming video from a static camera.

As illustrated by an ACLU review of a commercial face recognition installation, a security application which interfaces with the latter is already very challenging. In this case, our target goal is a robot that can recognize people in a home setting. The interface between robots and humans is even more dynamic. Both the robots and the humans move around. We present the robot implementation and its unsupervised incremental face recognition framework. We describe an algorithm for clustering local features extracted from a large set of automatically generated face data. We demonstrate the robot's capabilities and limitations in a series of experiments at a public lobby. In a final experiment, the robot interacted with a few hundred individuals in an eight-day period and generated a training set of over a hundred thousand face images. We evaluate the clustering algorithm performance across a range of parameters on this automatically generated training data and also on the Honda-UCSD video face database. Lastly, we present some recognition results using the self-labelled clusters.

Description

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.

Includes bibliographical references (p. 234-244).

Date issued

2007

URI

http://hdl.handle.net/1721.1/40495

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses