Learning commonsense human-language descriptions from temporal and spatial sensor-network data
Author(s)
Morgan, Bo
Other Contributors
Massachusetts Institute of Technology. Dept. of Architecture. Program in Media Arts and Sciences
Advisor
Walter Bender.
Abstract
Embedded-sensor platforms are advancing toward such sophistication that they can differentiate between subtle actions. For example, when placed in a wristwatch, such platforms can tell whether a person is shaking hands or turning a doorknob. Sensors placed on objects in the environment now report many parameters, including object location, movement, sound, and temperature. A persistent problem, however, is describing these sense data in meaningful human language. This problem is important across domains ranging from organizational security surveillance to individual activity journaling. Previous models of activity recognition pigeonhole descriptions into small, formal categories specified in advance; for example, location is often categorized as "at home" or "at the office." These models have not been able to adapt to the wider range of complex, dynamic, and idiosyncratic human activities. We hypothesize that commonsense knowledge bases of semantically related assertions can be used to bootstrap learning algorithms for classifying and recognizing human activities from sensor data.

Our system, LifeNet, is a first-person commonsense inference model: a graph whose nodes are drawn from a large repository of commonsense assertions expressed as human-language phrases. LifeNet is used to construct a mapping between streams of sensor data and partially ordered sequences of events co-located in time and space. Further, by gathering sensor data in vivo, we are able to validate and extend the commonsense knowledge from which LifeNet is derived. LifeNet is evaluated by its performance on a sensor-network platform distributed in an office environment. We hypothesize that mapping sensor data into LifeNet will act as a "semantic mirror," interpreting sensory data as cohesive patterns that make human action understandable and predictable.
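The mapping the abstract describes, from sensor streams onto a graph of human-language assertions, can be illustrated with a small, hypothetical sketch. The Python below is not from the thesis: the phrase nodes, sensor-evidence table, edge weights, and the damped one-hop spreading-activation step are all illustrative assumptions about how sensor votes might be combined over a LifeNet-style graph to rank candidate descriptions.

```python
# Hypothetical sketch of a LifeNet-style ranking: nodes are human-language
# phrases, weighted edges link phrases that plausibly co-occur, and sensor
# readings vote for the phrases they evoke. All names and weights are
# illustrative, not taken from the thesis.

from collections import defaultdict

# Commonsense assertions as weighted edges between phrase nodes:
# (phrase_a, phrase_b, weight), higher weight = more plausible co-occurrence.
ASSERTIONS = [
    ("I turn a doorknob", "I enter a room", 0.9),
    ("I enter a room", "I sit at a desk", 0.6),
    ("I shake someone's hand", "I greet a colleague", 0.8),
    ("I sit at a desk", "I type on a keyboard", 0.7),
]

# Assumed mapping from raw sensor events to the phrases they evoke.
SENSOR_EVIDENCE = {
    "wrist_twist": {"I turn a doorknob": 0.7, "I shake someone's hand": 0.3},
    "door_motion": {"I enter a room": 0.8},
    "chair_pressure": {"I sit at a desk": 0.9},
}

def score_phrases(events):
    """Combine direct sensor votes with one damped step of spreading
    activation over the assertion graph, then rank candidate phrases."""
    scores = defaultdict(float)
    for event in events:                       # direct sensor votes
        for phrase, weight in SENSOR_EVIDENCE.get(event, {}).items():
            scores[phrase] += weight

    neighbors = defaultdict(list)              # undirected assertion edges
    for a, b, weight in ASSERTIONS:
        neighbors[a].append((b, weight))
        neighbors[b].append((a, weight))

    spread = defaultdict(float)                # one-hop propagation
    for phrase, score in scores.items():
        for other, weight in neighbors[phrase]:
            spread[other] += score * weight * 0.5   # damping factor
    for phrase, score in spread.items():
        scores[phrase] += score

    return sorted(scores.items(), key=lambda kv: -kv[1])

if __name__ == "__main__":
    # A morning arrival: door opens, wrist twists, chair registers weight.
    for phrase, score in score_phrases(
        ["door_motion", "wrist_twist", "chair_pressure"]
    ):
        print(f"{score:.2f}  {phrase}")
```

Under these assumptions, the highest-scoring phrases form the kind of partially ordered, human-readable event sequence the abstract attributes to LifeNet; the thesis itself should be consulted for the actual inference model.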
Description
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2006. Includes bibliographical references (p. 105-109) and index.
Date issued
2006
Department
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Publisher
Massachusetts Institute of Technology
Keywords
Architecture. Program in Media Arts and Sciences