| dc.contributor.advisor | Steven L. Rohall and Chris Schmandt. | en_US |
| dc.contributor.author | Lam, Derek Scott, 1979- | en_US |
| dc.contributor.other | Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. | en_US |
| dc.date.accessioned | 2005-05-19T15:00:27Z | |
| dc.date.available | 2005-05-19T15:00:27Z | |
| dc.date.copyright | 2002 | en_US |
| dc.date.issued | 2002 | en_US |
| dc.identifier.uri | http://hdl.handle.net/1721.1/16846 | |
| dc.description | Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. | en_US |
| dc.description | Includes bibliographical references (p. 77-81). | en_US |
| dc.description | This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. | en_US |
| dc.description.abstract | For this thesis, I designed and implemented a system to summarize e-mail messages. The system exploits two aspects of e-mail, thread reply chains and commonly found features, to generate summaries. The system uses existing software designed to summarize single text documents. Such software typically performs best on well-authored, formal documents. E-mail messages, however, are typically neither well-authored nor formal. As a result, existing summarization software typically gives a poor summary of e-mail messages. To remedy this poor performance, the system preprocesses e-mail messages to synthesize new input to this software, so that it outputs more useful summaries of e-mail. This preprocessing uses a lightweight, heuristics-based approach to filter e-mail, removing e-mail signatures, header fields, and quoted parent messages. I also present a heuristics-based approach to identifying and reporting names, dates, and companies found in e-mail messages. Lastly, I discuss findings from a pilot user study of my summarization system and conclude with areas for further investigation. | en_US |
| dc.description.statementofresponsibility | by Derek Scott Lam. | en_US |
| dc.format.extent | 81 p. | en_US |
| dc.format.extent | 310153 bytes | |
| dc.format.extent | 309910 bytes | |
| dc.format.mimetype | application/pdf | |
| dc.format.mimetype | application/pdf | |
| dc.language.iso | eng | en_US |
| dc.publisher | Massachusetts Institute of Technology | en_US |
| dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US |
| dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | |
| dc.subject | Electrical Engineering and Computer Science. | en_US |
| dc.title | Exploiting E-mail structure to improve summarization | en_US |
| dc.type | Thesis | en_US |
| dc.description.degree | M.Eng. | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
| dc.identifier.oclc | 51479527 | en_US |
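
The abstract describes a heuristics-based preprocessing step that removes header fields, quoted parent messages, and e-mail signatures before summarization. The sketch below illustrates what such a filter might look like. It is not the thesis's implementation: the function name `strip_email_noise`, the regular expressions, and the reliance on the conventional `-- ` signature delimiter are all assumptions made for illustration, and real mail would need more rules (folded headers, signatures without a delimiter, other quoting styles).

```python
import re

# Hypothetical patterns chosen for illustration; the thesis's actual
# heuristics are not given in this record.
HEADER_RE = re.compile(
    r"^(From|To|Cc|Bcc|Subject|Date|Reply-To|Message-ID):", re.IGNORECASE
)
QUOTE_RE = re.compile(r"^\s*>")                    # lines quoted with ">"
ATTRIBUTION_RE = re.compile(r"^On .+ wrote:\s*$")  # "On ..., X wrote:"


def strip_email_noise(text: str) -> str:
    """Return the message body with leading header fields, quoted
    parent text, and a trailing signature removed (assumed heuristics)."""
    kept = []
    in_headers = True
    for line in text.splitlines():
        if in_headers:
            # Skip leading header fields and the blank lines around them.
            if HEADER_RE.match(line) or line.strip() == "":
                continue
            in_headers = False
        # Assumption: a line holding only "--" starts the signature,
        # so everything from here on is dropped.
        if line.strip() == "--":
            break
        # Drop quoted parent text and the attribution line that introduces it.
        if QUOTE_RE.match(line) or ATTRIBUTION_RE.match(line):
            continue
        kept.append(line.rstrip())
    body = "\n".join(kept)
    # Collapse the runs of blank lines left behind by removed blocks.
    return re.sub(r"\n{3,}", "\n\n", body).strip()


sample = (
    "From: alice@example.com\n"
    "Subject: Lunch\n"
    "\n"
    "Let's meet at noon.\n"
    "\n"
    "On Monday, Bob wrote:\n"
    "> Are you free?\n"
    "> Let me know.\n"
    "\n"
    "See you then.\n"
    "-- \n"
    "Alice\n"
    "Example Corp\n"
)

print(strip_email_noise(sample))
```

The cleaned text could then be handed to a single-document summarizer in place of the raw message, which is the role the abstract assigns to preprocessing.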