Optimization of acoustic feature extraction from dysarthric speech
Author(s)
DiCicco, Thomas M., Jr. (Thomas Minotti)
DownloadFull printable version (16.85Mb)
Other Contributors
Harvard University--MIT Division of Health Sciences and Technology.
Advisor
Rupal Patel.
Terms of use
Metadata
Show full item recordAbstract
Dysarthria is a motor speech disorder characterized by weak or uncoordinated movements of the speech musculature. While unfamiliar listeners struggle to understand speakers with severe dysarthria, familiar listeners are often able to comprehend with high accuracy. This observation implies that although the speech produced by an individual with dysarthria may appear distorted and unintelligible to the untrained listener, there must be a set of consistent acoustic cues that the familiar communication partner is able to interpret. While dysarthric speech has been characterized both acoustically and perceptually, most accounts tend to compare dysarthric productions to those of healthy controls rather than identify the set of reliable and consistently controlled segmental cues. This work aimed to elucidate possible recognition strategies used by familiar listeners by optimizing a model of human speech recognition, Stevens' Lexical Access from Features (LAFF) framework, for ten individual speakers with dysarthria (SWDs). The LAFF model is rooted in distinctive feature theory, with acoustic landmarks indicating changes in the manner of articulation. The acoustic correlates manifested around landmarks provide the identity to articulator-free (manner) and articulator-bound (place) features. (cont.) SWDs created weaker consonantal landmarks, likely due to an inability to form complete closures in the vocal tract and to fully release consonantal constrictions. Identification of speaker-optimized acoustic correlate sets improved discrimination of each speaker's productions, evidenced by increased sensitivity and specificity. While there was overlap between the types of correlates identified for healthy and dysarthric speakers, using the optimal sets of correlates identified for SWDs adversely impaired discrimination of healthy speech. These results suggest that the combinations of correlates suggested for SWDs were specific to the individual and different from the segmental cues used by healthy individuals. Application of the LAFF model to dysarthric speech has potential clinical utility as a diagnostic tool, highlighting the fine-grain components of speech production that require intervention and quantifying the degree of impairment.
Description
Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, February 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 171-180).
Date issued
2010Department
Harvard University--MIT Division of Health Sciences and TechnologyPublisher
Massachusetts Institute of Technology
Keywords
Harvard University--MIT Division of Health Sciences and Technology.