Decoding algorithms for complex natural language tasks

Deshpande, Pawan

dc.contributor.advisor	Regina Barzilay.	en_US
dc.contributor.author	Deshpande, Pawan	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2008-05-19T16:05:01Z
dc.date.available	2008-05-19T16:05:01Z
dc.date.copyright	2007	en_US
dc.date.issued	2007	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/41647
dc.description	Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.	en_US
dc.description	Includes bibliographical references (p. 79-85).	en_US
dc.description.abstract	This thesis focuses on developing decoding techniques for complex Natural Language Processing (NLP) tasks. The goal of decoding is to find an optimal or near optimal solution given a model that defines the goodness of a candidate. The task is challenging because in a typical problem the search space is large, and the dependencies between elements of the solution are complex. The goal of this work is two-fold. First, we are interested in developing decoding techniques with strong theoretical guarantees. We develop a decoding model based on the Integer Linear Programming paradigm which is guaranteed to compute the optimal solution and is capable of accounting for a wide range of global constraints. As an alternative, we also present a novel randomized algorithm which can guarantee an arbitrarily high probability of finding the optimal solution. We apply these methods to the task of constructing temporal graphs and to the task of title generation. Second, we are interested in carefully investigating the relations between learning and decoding. We build on the Perceptron framework to integrate the learning and decoding procedures into a single unified process. We use the resulting model to automatically generate tables-of-contents, structures with deep hierarchies and rich contextual dependencies. In all three natural language tasks, our experimental results demonstrate that theoretically grounded and stronger decoding strategies perform better than existing methods. As a final contribution, we have made the source code for these algorithms publicly available for the NLP research community.	en_US
dc.description.statementofresponsibility	by Pawan Deshpande.	en_US
dc.format.extent	85 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Decoding algorithms for complex natural language tasks	en_US
dc.type	Thesis	en_US
dc.description.degree	M.Eng.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	219709911	en_US

Files in this item

Name:: 219709911-MIT.pdf
Size:: 4.332Mb
Format:: PDF
Description:: Full printable version

View/Open

This item appears in the following Collection(s)

Graduate Theses

Show simple item record