Show simple item record

dc.contributor.advisorGlass, James R.
dc.contributor.advisorBelinkov, Yonatan
dc.contributor.authorManna, Rami
dc.date.accessioned2022-02-07T15:15:18Z
dc.date.available2022-02-07T15:15:18Z
dc.date.issued2021-09
dc.date.submitted2021-11-03T19:25:26.781Z
dc.identifier.urihttps://hdl.handle.net/1721.1/139953
dc.description.abstractThis thesis explores novel approaches to the Arabic-English speech-to-text translation task. First, we construct a novel Modern Standard Arabic speech and English text parallel dataset. Second, we propose a novel framework for leveraging unsupervised machine translation to improve speech-to-text translation, and apply this framework to the task of Arabic-English speech-to-text translation. In particular, we propose a 3-step cascade approach to speech-to-text translation. In step 1, we use a speech recognition model to transcribe the Arabic speech into Arabic text. In step 2, we leverage unsupervised machine translation to learn a mapping between the output of the speech recognition model (transcribed Arabic) and Modern Standard Arabic (formal written Arabic). In step 3, we use an Arabic-English machine translation model to translate the output of the unsupervised model to English. Our third contribution is an exploration of approaches to low-resource end-to-end speech-to-text translation. We present and compare two approaches for synthesizing parallel training data. Finally, we compare the end-to-end approach with the cascaded approach. We found that the 3-step cascaded speech-to-text did not perform as well as the 2-step cascaded speech-to-text baseline. We show that with the end-to-end approach trained with synthetic English text, we are able to achieve similar performance to the 2-step cascaded speech-to-text baseline.
dc.publisherMassachusetts Institute of Technology
dc.rightsIn Copyright - Educational Use Permitted
dc.rightsCopyright MIT
dc.rights.urihttp://rightsstatements.org/page/InC-EDU/1.0/
dc.titleConstructing Low Resource Approaches to Improve Speech-to-text Translation from Modern Standard Arabic to English
dc.typeThesis
dc.description.degreeM.Eng.
dc.contributor.departmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degreeMaster
thesis.degree.nameMaster of Engineering in Electrical Engineering and Computer Science


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record