dc.contributor.advisor | James Glass and Suwon Shon. | en_US |
dc.contributor.author | Rivera, Gabrielle Cristina. | en_US |
dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
dc.date.accessioned | 2019-12-05T18:06:15Z | |
dc.date.available | 2019-12-05T18:06:15Z | |
dc.date.copyright | 2019 | en_US |
dc.date.issued | 2019 | en_US |
dc.identifier.uri | https://hdl.handle.net/1721.1/123151 | |
dc.description | This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. | en_US |
dc.description | Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019 | en_US |
dc.description | Cataloged from student-submitted PDF version of thesis. | en_US |
dc.description | Includes bibliographical references (pages 61-65). | en_US |
dc.description.abstract | Multilingual and multidialectal speakers commonly switch between languages and dialects while speaking, a linguistic phenomenon known as code-switching. Most acoustic systems, such as automatic speech recognition systems, are unable to robustly handle input with unexpected language or dialect switching. Generally, this results from both a lack of available corpora and the increased difficulty of the task when applied to code-switching data. This thesis focuses on constructing an acoustic-based model to gather code-switching information from utterances containing Modern Standard Arabic and dialectal Arabic. We utilize the multidialectal GALE Arabic dataset to classify the code-switching style of an utterance and later to detect the location of code-switching within an utterance. We discuss the failed classification schemes and detection methods and analyze why these approaches were unsuccessful. We also present an alignment-free classification scheme that detects locations within an utterance where dialectal Arabic is likely being spoken. This method markedly improves on the proposed baseline in average detection miss rate. Using this information, Arabic acoustic systems can be made more robust to dialectal shifts within a given input. | en_US |
dc.description.statementofresponsibility | by Gabrielle Cristina Rivera. | en_US |
dc.format.extent | 65 pages | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Massachusetts Institute of Technology | en_US |
dc.rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. | en_US |
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
dc.subject | Electrical Engineering and Computer Science. | en_US |
dc.title | Automatic detection of code-switching in Arabic dialects | en_US |
dc.type | Thesis | en_US |
dc.description.degree | M. Eng. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.identifier.oclc | 1128868510 | en_US |
dc.description.collection | M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science | en_US |
dspace.imported | 2019-12-05T18:06:14Z | en_US |
mit.thesis.degree | Master | en_US |
mit.thesis.department | EECS | en_US |