Searching for commonsense

Author(s)

Eslick, Ian S. (Ian Scott)

Other Contributors

Massachusetts Institute of Technology. Dept. of Architecture. Program In Media Arts and Sciences

Advisor

Walter Bender, Hugh Herr and Rada Mihalcea.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Acquiring and representing the large body of "common sense" knowledge underlying ordinary human reasoning and communication is a long-standing problem in the field of artificial intelligence. This thesis addresses the question of whether a significant quantity of this knowledge may be acquired by mining natural language content on the Web. Specifically, this thesis emphasizes the representation of knowledge in the form of binary semantic relationships, such as cause, effect, intent, and time, among natural language phrases. The central hypothesis is that seed knowledge collected from volunteers enables automated acquisition of this knowledge from a large, unannotated, general corpus like the Web. A text mining system, ConceptMiner, was developed to evaluate this hypothesis. ConceptMiner leverages web search engines, Information Extraction techniques, and the ConceptNet toolkit to analyze Web content for textual evidence indicating common sense relationships.

Experiments are reported for three semantic relation classes: desire, effect, and capability. A Pointwise Mutual Information measure computed from Web hit counts is demonstrated to filter general common sense from instance knowledge that is true only in specific circumstances. A semantic distance metric is introduced which significantly reduces the number of negative instances among the extracted hypotheses. The results confirm that significant relational common sense knowledge exists on the Web and provide evidence that the algorithms employed by ConceptMiner can extract this knowledge with a precision approaching that provided by human subjects.
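
As an illustration of the Pointwise Mutual Information measure mentioned in the abstract, the following is a minimal sketch of the standard PMI definition applied to search-engine hit counts. This record does not give the exact formulation used by ConceptMiner, so the function name `pmi_from_hits`, its parameters, the estimate of the total page count, and the example numbers are all assumptions made for illustration only.

```python
import math


def pmi_from_hits(hits_xy: int, hits_x: int, hits_y: int, total_pages: int) -> float:
    """Standard PMI, log2(p(x, y) / (p(x) * p(y))), with each probability
    estimated as a hit count divided by an assumed total page count.

    Hypothetical helper: not taken from the thesis or from ConceptMiner.
    """
    if hits_x <= 0 or hits_y <= 0 or total_pages <= 0:
        raise ValueError("hits_x, hits_y and total_pages must be positive")
    if hits_xy < 0:
        raise ValueError("hits_xy must be non-negative")
    if hits_xy == 0:
        # The phrases never co-occur: PMI tends to negative infinity.
        return float("-inf")
    # Algebraically equal to (hits_xy / N) / ((hits_x / N) * (hits_y / N)).
    ratio = (hits_xy * total_pages) / (hits_x * hits_y)
    return math.log2(ratio)


# Made-up hit counts, purely illustrative.
score = pmi_from_hits(hits_xy=400, hits_x=1000, hits_y=1000, total_pages=10000)
print(score)
```

Under this definition, a score above zero means the two phrases co-occur more often than independence would predict, and a score near or below zero suggests the pairing is incidental. Any threshold used to separate general knowledge from instance knowledge would have to be chosen empirically.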

Description

Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2006.

Includes bibliographical references (leaves 97-101).

Date issued

2006

URI

http://hdl.handle.net/1721.1/37385

Department

Program in Media Arts and Sciences (Massachusetts Institute of Technology)

Publisher

Massachusetts Institute of Technology

Keywords

Architecture. Program In Media Arts and Sciences

Collections

Graduate Theses