Ordering prenominal modifiers with a ranking approach
Author(s)
Liu, Jingyi, M. Eng. Massachusetts Institute of Technology
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Regina Barzilay.
Abstract
In this thesis we present a solution to the natural language processing task of ordering prenominal modifiers, a problem with applications ranging from machine translation to natural language generation. In machine translation, constraints on modifier orderings vary from language to language, so some reordering of modifiers may be necessary. In natural language generation, a representation of an object and its properties often needs to be formulated into a concrete noun phrase. We detail a novel approach that frames this task as a ranking problem among the permutations of a set of modifiers, admitting arbitrary features on each candidate permutation and exploiting hundreds of thousands of features in total. We compare our methods to a state-of-the-art class-based ordering approach and to a strong baseline that makes use of the Google n-gram corpus. Compared to the state-of-the-art approach, we attain a maximum error reduction of 69.8% and an average error reduction across all test sets of 59.1%; compared to our Google n-gram baseline, we attain a maximum error reduction of 68.4% and an average error reduction across all test sets of 41.8%. Finally, we present an analysis of our approach as compared to our baselines and describe several potential improvements to our system.
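The ranking formulation described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the pairwise "precedes" features, the function names, and the toy weights are all assumptions made up for illustration, and a real system would learn its weights from data over a much larger feature set.

```python
from itertools import permutations


def features(order):
    """Hypothetical feature map: one indicator per ordered pair of modifiers."""
    feats = {}
    for i in range(len(order)):
        for j in range(i + 1, len(order)):
            feats[("precedes", order[i], order[j])] = 1.0
    return feats


def score(order, weights):
    """Linear score: dot product of the feature vector and the weight vector."""
    return sum(weights.get(name, 0.0) * value
               for name, value in features(order).items())


def rank_orderings(modifiers, weights):
    """Enumerate every permutation of the modifiers and sort by score, best first."""
    candidates = list(permutations(modifiers))
    return sorted(candidates, key=lambda o: score(o, weights), reverse=True)


# Toy weights invented for this example; not taken from the thesis.
weights = {
    ("precedes", "big", "red"): 1.0,
    ("precedes", "red", "wooden"): 1.0,
    ("precedes", "big", "wooden"): 1.0,
}

ranked = rank_orderings(["wooden", "red", "big"], weights)
best = ranked[0]
print(" ".join(best))
```

Because each candidate permutation is scored as a whole, any feature computable from the full ordering could be added to `features` without changing the ranking step.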
Description
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 57-58).
Date issued
2011
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.