Automatic classification of documents with an in-depth analysis of information extraction and automatic summarization

Author(s)

Hohm, Joseph Brandon, 1982-

Other Contributors

Massachusetts Institute of Technology. Dept. of Civil and Environmental Engineering.

Advisor

John R. Williams.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Today, annual information production per capita exceeds 250 megabytes. As the amount of data grows, classification and retrieval methods become increasingly necessary for finding relevant information. This thesis describes a .NET application, named I-Document, that establishes an automatic classification scheme in a peer-to-peer environment and allows free sharing of academic, business, and personal documents. It presents a Web service architecture for metadata extraction, information extraction, information retrieval, and text summarization. It also discusses the coding process, the competition, the business model, and the technology employed in the project.

Description

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2004.

Includes bibliographical references (leaves 78-80).

Date issued

2004

URI

http://hdl.handle.net/1721.1/29415

Department

Massachusetts Institute of Technology. Department of Civil and Environmental Engineering

Publisher

Massachusetts Institute of Technology

Keywords

Civil and Environmental Engineering.

Collections

Graduate Theses