Gaussian alignments in statistical translation models

Mohammad, Ali (Ali H.)

dc.contributor.advisor	Michael J. Collins.	en_US
dc.contributor.author	Mohammad, Ali (Ali H.)	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2007-01-10T16:48:01Z
dc.date.available	2007-01-10T16:48:01Z
dc.date.copyright	2006	en_US
dc.date.issued	2006	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/35611
dc.description	Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.	en_US
dc.description	Includes bibliographical references (leaves 52-53).	en_US
dc.description.abstract	Machine translation software has been under development almost since the birth of the electronic computer. Current state-of-the-art methods use statistical techniques to learn how to translate from one natural language to another from a corpus of hand-translated text. The success of these techniques comes from two factors: a simple statistical model and vast training data sets. The standard agenda for improving such models is to enable them to model greater complexity; however, it is a byword within the machine learning community that added complexity must be supported with more training data. Given that current models already require huge amounts of data, our agenda is instead to simplify current models before adding extensions. We present one such simplification, which results in fewer than 10% as many alignment model parameters and produces results competitive with the original model. An unexpected benefit of this technique is that it naturally gives a measure for how difficult it is to translate from one language to another given a data set. Next, we present one suggestion for adding complexity to model new behavior.	en_US
dc.description.statementofresponsibility	by Ali Mohammad.	en_US
dc.format.extent	53 leaves	en_US
dc.format.extent	2283102 bytes
dc.format.extent	2431327 bytes
dc.format.mimetype	application/pdf
dc.format.mimetype	application/pdf
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Gaussian alignments in statistical translation models	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	75292112	en_US

Files in this item

Name: 75292112-MIT.pdf
Size: 2.318 MB
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Graduate Theses
