dc.contributor.author | Narain, J | |
dc.contributor.author | Johnson, KT | |
dc.contributor.author | Ferguson, C | |
dc.contributor.author | O'Brien, A | |
dc.contributor.author | Talkar, T | |
dc.contributor.author | Weninger, YZ | |
dc.contributor.author | Wofford, P | |
dc.contributor.author | Quatieri, T | |
dc.contributor.author | Picard, Rosalind W. | |
dc.contributor.author | Maes, P | |
dc.date.accessioned | 2021-11-02T14:17:52Z | |
dc.date.available | 2021-11-02T14:17:52Z | |
dc.date.issued | 2020 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/137088 | |
dc.description.abstract | © 2020 Owner/Author. Nonverbal vocalizations contain important affective and communicative information, especially for those who do not use traditional speech, including individuals who have autism and are non- or minimally verbal (nv/mv). Although these vocalizations are often understood by those who know them well, they can be challenging to understand for the community-at-large. This work presents (1) a methodology for collecting spontaneous vocalizations from nv/mv individuals in natural environments, with no researcher present, and personalized in-the-moment labels from a family member; (2) speaker-dependent classification of these real-world sounds for three nv/mv individuals; and (3) an interactive application to translate the nonverbal vocalizations in real time. Using support-vector machine and random forest models, we achieved speaker-dependent unweighted average recalls (UARs) of 0.75, 0.53, and 0.79 for the three individuals, respectively, with each model discriminating between 5 nonverbal vocalization classes. We also present first results for real-time binary classification of positive- and negative-affect nonverbal vocalizations, trained using a commercial wearable microphone and tested in real time using a smartphone. This work informs personalized machine learning methods for non-traditional communicators and advances real-world interactive augmentative technology for an underserved population. | en_US |
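The abstract reports results as unweighted average recall (UAR), i.e. the mean of per-class recalls, which weights each vocalization class equally regardless of how often it occurs. A minimal sketch of that metric (illustrative only, not the authors' code; class names are hypothetical):

```python
from collections import defaultdict

def uar(y_true, y_pred):
    """Unweighted average recall: mean of per-class recalls
    over the classes that appear in y_true."""
    hits = defaultdict(int)    # correct predictions per class
    totals = defaultdict(int)  # true instances per class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Toy example with three hypothetical vocalization classes:
y_true = ["pos", "pos", "neg", "neg", "neutral", "neutral"]
y_pred = ["pos", "neg", "neg", "neg", "neutral", "pos"]
print(uar(y_true, y_pred))  # (0.5 + 1.0 + 0.5) / 3 ≈ 0.667
```

Because each class contributes equally, UAR is less forgiving than plain accuracy on imbalanced real-world recordings, which is why it is the standard metric for this kind of affect classification.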
dc.language.iso | en | |
dc.publisher | ACM | en_US |
dc.relation.isversionof | 10.1145/3382507.3418854 | en_US |
dc.rights | Creative Commons Attribution 4.0 International license | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
dc.source | ACM | en_US |
dc.title | Personalized Modeling of Real-World Vocalizations from Nonverbal Individuals | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Narain, J, Johnson, KT, Ferguson, C, O'Brien, A, Talkar, T et al. 2020. "Personalized Modeling of Real-World Vocalizations from Nonverbal Individuals." ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction. | |
dc.contributor.department | Massachusetts Institute of Technology. Media Laboratory | |
dc.contributor.department | Lincoln Laboratory | |
dc.relation.journal | ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction | en_US |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2021-06-30T18:18:52Z | |
dspace.orderedauthors | Narain, J; Johnson, KT; Ferguson, C; O'Brien, A; Talkar, T; Weninger, YZ; Wofford, P; Quatieri, T; Picard, R; Maes, P | en_US |
dspace.date.submission | 2021-06-30T18:18:55Z | |
mit.license | PUBLISHER_CC | |
mit.metadata.status | Authority Work and Publication Information Needed | en_US |