Show simple item record

dc.contributor.advisorRosalind W. Picard.en_US
dc.contributor.authorFernandez, Raulen_US
dc.contributor.otherMassachusetts Institute of Technology. Dept. of Architecture. Program in Media Arts and Sciences.en_US
dc.date.accessioned2005-05-17T14:42:00Z
dc.date.available2005-05-17T14:42:00Z
dc.date.copyright2004en_US
dc.date.issued2004en_US
dc.identifier.urihttp://hdl.handle.net/1721.1/16621
dc.descriptionThesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2004.en_US
dc.descriptionIncludes bibliographical references (p. 272-284).en_US
dc.descriptionThis electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.en_US
dc.description.abstractSpoken language, in addition to serving as a primary vehicle for externalizing linguistic structures and meaning, acts as a carrier of various sources of information, including back- ground, age, gender, membership in social structures, as well as physiological, pathological and emotional states. These sources of information are more than just ancillary to the main purpose of linguistic communication: Humans react to the various non-linguistic factors en- coded in the speech signal, shaping and adjusting their interactions to satisfy interpersonal and social protocols. Computer science, artificial intelligence and computational linguistics have devoted much active research to systems that aim to model the production and recovery of linguistic lexico-semantic structures from speech. However, less attention has been devoted to systems that model and understand the paralinguistic and extralinguistic information in the signal. As the breadth and nature of human-computer interaction escalates to levels previously reserved for human-to-human communication, there is a growing need to endow computational systems with human-like abilities which facilitate the interaction and make it more natural. Of paramount importance amongst these is the human ability to make inferences regarding the affective content of our exchanges. This thesis proposes a framework for the recognition of affective qualifiers from prosodic- acoustic parameters extracted from spoken language.en_US
dc.description.abstract(cont.) It is argued that modeling the affective prosodic variation of speech can be approached by integrating acoustic parameters from various prosodic time scales, summarizing information from more localized (e.g., syllable level) to more global prosodic phenomena (e.g., utterance level). In this framework speech is structurally modeled as a dynamically evolving hierarchical model in which levels of the hierarchy are determined by prosodic constituency and contain parameters that evolve according to dynamical systems. The acoustic parameters have been chosen to reflect four main components of speech thought to reflect paralinguistic and affect-specific information: intonation, loudness, rhythm and voice quality. The thesis addresses the contribution of each of these components separately, and evaluates the full model by testing it on datasets of acted and of spontaneous speech perceptually annotated with affective labels, and by comparing it against human performance benchmarks.en_US
dc.description.statementofresponsibilityby Raul Fernandez.en_US
dc.format.extent284 p.en_US
dc.format.extent2715103 bytes
dc.format.extent2685154 bytes
dc.format.mimetypeapplication/pdf
dc.format.mimetypeapplication/pdf
dc.language.isoengen_US
dc.publisherMassachusetts Institute of Technologyen_US
dc.rightsM.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.en_US
dc.rights.urihttp://dspace.mit.edu/handle/1721.1/7582
dc.subjectArchitecture. Program in Media Arts and Sciences.en_US
dc.titleA computational model for the automatic recognition of affect in speechen_US
dc.typeThesisen_US
dc.description.degreePh.D.en_US
dc.contributor.departmentProgram in Media Arts and Sciences (Massachusetts Institute of Technology)
dc.identifier.oclc55771160en_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record