dc.contributor.advisor | Peter Szolovits. | en_US |
dc.contributor.author | Chasin, Rachel (Rachel G.) | en_US |
dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
dc.date.accessioned | 2014-02-10T16:57:36Z | |
dc.date.available | 2014-02-10T16:57:36Z | |
dc.date.issued | 2013 | en_US |
dc.identifier.uri | http://hdl.handle.net/1721.1/84878 | |
dc.description | Thesis (M. Eng.)--Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. | en_US |
dc.description | Cataloged from PDF version of thesis. | en_US |
dc.description | Includes bibliographical references (pages 57-59). | en_US |
dc.description.abstract | Lexical ambiguity, the ambiguity arising from a string with multiple meanings, is pervasive in language of all domains. Word sense disambiguation (WSD) and word sense induction (WSI) are the tasks of resolving this ambiguity. Applications in the clinical and biomedical domain focus on the potential disambiguation has for information extraction. Most approaches to the problem are unsupervised or semi-supervised because of the high cost of obtaining enough annotated data for supervised learning. In this thesis we compare the application of a semi-supervised general domain state of the art WSI method to clinical text to the best known knowledge-based unsupervised methods in the clinical domain. We also explore making improvements to the general domain method, which is based on topic modeling, by adding features that incorporate syntax and information from knowledge bases, and investigate ways to mitigate the need for annotated data. | en_US |
dc.description.statementofresponsibility | by Rachel Chasin. | en_US |
dc.format.extent | 59 pages | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Massachusetts Institute of Technology | en_US |
dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US |
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
dc.subject | Electrical Engineering and Computer Science. | en_US |
dc.title | Word sense disambiguation in clinical text | en_US |
dc.title.alternative | WSD in clinical text | en_US |
dc.type | Thesis | en_US |
dc.description.degree | M.Eng. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.identifier.oclc | 868672786 | en_US |