DSpace@MIT

Speech-Based Artificial Intelligence Emotion Biomarkers in Frontotemporal Dementia

Author(s)
Parllaku, Fjona
Thesis PDF (1.811 MB)
Advisor
Glass, James
Terms of use
In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Acoustic speech markers are well characterized in Frontotemporal Dementia (FTD), a heterogeneous spectrum of progressive neurodegenerative diseases that can affect speech production and comprehension as well as higher-order cognition, behavior, and motor control. Although profound apathy and deficits in emotion processing are also common symptoms, emotional content has yet to be explored in acoustic models of speech. We retrospectively analyze a dataset of standard elicited speech tasks from 69 FTD patients and 131 healthy elderly controls seen at the University of Melbourne. We develop two ResNet50 models to classify FTD vs. healthy elderly controls using spectrograms of speech samples: 1) a naive model, and 2) a model pretrained on an emotional speech dataset. We compare the validation accuracies of the two models across speech tasks. The pretrained model better classifies FTD vs. healthy elderly controls, and the behavioral variant of FTD (bvFTD) vs. healthy elderly controls, with validation accuracies of 79% and 84%, respectively, on the monologue task, and 93% and 90% on the picture description task. Considered individually, the ‘happy’ emotion discriminates best between FTD and healthy elderly controls compared to other latent emotions. Pretraining acoustic models on latent emotion thus increases classification accuracy for FTD, and we observe the greatest improvement in model performance on elicited speech tasks with greater emotional content. More broadly, our findings suggest that including latent emotion in acoustic classification models is beneficial for neurologic diseases that affect emotion.
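The models described in the abstract take spectrograms of speech samples as input. As a minimal sketch of the kind of front-end this implies (not the thesis's actual pipeline: the window length, hop size, and synthetic test tone below are illustrative assumptions), a log-magnitude spectrogram can be computed with a short-time Fourier transform in plain NumPy:

```python
import numpy as np

def stft_spectrogram(signal, n_fft=512, hop=128):
    """Log-magnitude spectrogram via a simple short-time Fourier transform.

    Frames the signal with a Hann window, takes the real FFT of each frame,
    and returns log-magnitudes with shape (n_frames, n_fft // 2 + 1).
    Window and hop sizes here are illustrative defaults.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([
        signal[i * hop : i * hop + n_fft] * window
        for i in range(n_frames)
    ])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitudes + 1e-10)  # small offset avoids log(0)

# Illustrative input: one second of a synthetic 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
spec = stft_spectrogram(tone)
print(spec.shape)  # (122, 257): 122 frames x 257 frequency bins
```

In a pipeline like the one described, each such 2-D array (or an image rendering of it) would then be fed to the ResNet50 classifier.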
Date issued
2022-09
URI
https://hdl.handle.net/1721.1/147525
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted.