dc.contributor.advisor | Glass, James R. | |
dc.contributor.advisor | Belinkov, Yonatan | |
dc.contributor.author | Manna, Rami | |
dc.date.accessioned | 2022-02-07T15:15:18Z | |
dc.date.available | 2022-02-07T15:15:18Z | |
dc.date.issued | 2021-09 | |
dc.date.submitted | 2021-11-03T19:25:26.781Z | |
dc.identifier.uri | https://hdl.handle.net/1721.1/139953 | |
dc.description.abstract | This thesis explores approaches to the Arabic-English speech-to-text translation task. First, we construct a new parallel dataset of Modern Standard Arabic speech and English text. Second, we propose a framework for leveraging unsupervised machine translation to improve speech-to-text translation, and apply this framework to Arabic-English speech-to-text translation. In particular, we propose a 3-step cascade approach. In step 1, we use a speech recognition model to transcribe the Arabic speech into Arabic text. In step 2, we leverage unsupervised machine translation to learn a mapping between the output of the speech recognition model (transcribed Arabic) and Modern Standard Arabic (formal written Arabic). In step 3, we use an Arabic-English machine translation model to translate the output of the unsupervised model into English. Our third contribution is an exploration of approaches to low-resource end-to-end speech-to-text translation, in which we present and compare two approaches for synthesizing parallel training data. Finally, we compare the end-to-end approach with the cascaded approach. We find that the 3-step cascaded approach does not perform as well as the 2-step cascaded speech-to-text baseline. We also show that the end-to-end approach, when trained with synthetic English text, achieves performance similar to that of the 2-step cascaded baseline. | |
dc.publisher | Massachusetts Institute of Technology | |
dc.rights | In Copyright - Educational Use Permitted | |
dc.rights | Copyright MIT | |
dc.rights.uri | http://rightsstatements.org/page/InC-EDU/1.0/ | |
dc.title | Constructing Low Resource Approaches to Improve Speech-to-text Translation from Modern Standard Arabic to English | |
dc.type | Thesis | |
dc.description.degree | M.Eng. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
mit.thesis.degree | Master | |
thesis.degree.name | Master of Engineering in Electrical Engineering and Computer Science | |