Minimal characterization of linguistic phenomena for robust ternary expression construction
Author(s)
Tong, Jason Kar Chun
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Boris Katz.
Abstract
This thesis introduces Astroparse, a system that uses the output of a third-party neural-network-based dependency parser (spaCy) to construct semantic parses of sentences in the form of ternary expressions, as pioneered by the Start Natural Language system. Ternary expressions are a powerful representation for efficiently indexing, matching, and retrieving natural language. Because Start is a purely symbolic system, extending Start's parser, which produces ternary expressions from sentences, requires significant effort. Astroparse makes it far easier to extend Start's coverage. Learning from examples (pairs of sentences and ternary expressions), Astroparse automatically learns to associate the linguistic phenomenon corresponding to an example's ternary expression with a subtree of the example sentence's dependency tree and the token-level features (e.g., lemma, part-of-speech tags) of the subtree's nodes. Given unseen sentences, Astroparse recognizes the learned minimal characterizations of linguistic phenomena to construct ternary expressions from spaCy's parse of the sentence. By leveraging the output of a neural-network-based dependency parser with high efficiency and state-of-the-art accuracy, Astroparse offers a fast, high-recall, easy-to-train system to augment Start's current parser for constructing ternary expressions.
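To make the idea concrete, the following is a minimal sketch of matching a dependency subtree and emitting a ternary expression (a subject, relation, object triple). It is not Astroparse's actual algorithm, which the abstract says is learned from examples. The Token class, the hand-written parse, and the subject_verb_object pattern are all hypothetical and only illustrate the general shape; the field names loosely mirror the token-level features mentioned above (lemma, part-of-speech tag, dependency label), and the parse is typed in by hand so the sketch runs without any parser installed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    """Hypothetical stand-in for one node of a dependency parse."""
    text: str
    lemma: str
    pos: str   # coarse part-of-speech tag
    dep: str   # dependency label relative to the head
    head: int  # index of the head token; the root points to itself


# Hand-written parse of "Bill surprised Hillary" (an assumption for
# illustration, standing in for real dependency parser output).
SENTENCE = [
    Token("Bill", "Bill", "PROPN", "nsubj", 1),
    Token("surprised", "surprise", "VERB", "ROOT", 1),
    Token("Hillary", "Hillary", "PROPN", "dobj", 1),
]


def children(tokens: List[Token], head_index: int, dep: str) -> List[Token]:
    """Return the tokens attached to head_index with dependency label dep."""
    return [
        tok
        for i, tok in enumerate(tokens)
        if i != head_index and tok.head == head_index and tok.dep == dep
    ]


def subject_verb_object(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Hypothetical hand-coded pattern: a verb with nsubj and dobj children.

    Each match yields one (subject, relation, object) triple built from lemmas.
    """
    triples = []
    for i, tok in enumerate(tokens):
        if tok.pos != "VERB":
            continue
        for subj in children(tokens, i, "nsubj"):
            for obj in children(tokens, i, "dobj"):
                triples.append((subj.lemma, tok.lemma, obj.lemma))
    return triples


print(subject_verb_object(SENTENCE))  # [('Bill', 'surprise', 'Hillary')]
```

In this sketch the pattern is written by hand; the system described above instead learns such subtree-and-feature patterns from pairs of sentences and ternary expressions.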
Description
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 101-102).
Date issued
2018
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.