Show simple item record

dc.contributor.author: Picard, Rosalind W.
dc.contributor.author: Hoque, Mohammed Ehasanul
dc.contributor.author: El Kaliouby, Rana
dc.date.accessioned: 2010-07-15T19:06:40Z
dc.date.available: 2010-07-15T19:06:40Z
dc.date.issued: 2009-09
dc.identifier.isbn: 978-3-642-04379-6
dc.identifier.uri: http://hdl.handle.net/1721.1/56633
dc.description.abstract: This paper describes the challenges of getting ground-truth affective labels for spontaneous video, and presents implications for systems such as virtual agents that have automated facial analysis capabilities. We first present a dataset from an intelligent tutoring application and describe the most prevalent approach to labeling such data. We then present an alternative labeling approach, which closely models how the majority of automated facial analysis systems are designed. We show that while participants, peers, and trained judges report high inter-rater agreement on expressions of delight, confusion, flow, frustration, boredom, surprise, and neutral when shown the entire 30 minutes of video for each participant, inter-rater agreement drops below chance when human coders are asked to watch and label short 8-second clips for the same set of labels. We also perform discriminative analysis of facial action units for each affective state represented in the clips. The results emphasize that human coders rely heavily on factors such as familiarity with the person and the context of the interaction to correctly infer a person's affective state; without this information, the reliability of both humans and machines in attributing affective labels to spontaneous facial-head movements drops significantly. [en_US]
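The abstract's central quantity is inter-rater agreement among multiple coders assigning categorical affect labels to clips. The record does not say which agreement statistic the authors used; as an illustration only, the sketch below computes Fleiss' kappa, one standard chance-corrected agreement measure for more than two raters. The `clips` matrix and category set are hypothetical, not data from the paper.

```python
import numpy as np

def fleiss_kappa(ratings: np.ndarray) -> float:
    """Fleiss' kappa for a matrix where ratings[i, j] is the number of
    raters who assigned clip i to category j. Every row must sum to the
    same number of raters r."""
    n, _ = ratings.shape
    r = int(ratings.sum(axis=1)[0])          # raters per clip
    assert (ratings.sum(axis=1) == r).all(), "unequal rater counts"
    # Observed agreement per clip, averaged over clips.
    p_i = (np.square(ratings).sum(axis=1) - r) / (r * (r - 1))
    p_bar = p_i.mean()
    # Expected chance agreement from overall category proportions.
    p_j = ratings.sum(axis=0) / (n * r)
    p_e = np.square(p_j).sum()
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical example: 4 clips, 3 coders, categories
# [delight, confusion, frustration].
clips = np.array([
    [3, 0, 0],   # all coders agree: delight
    [0, 2, 1],
    [1, 1, 1],   # complete disagreement
    [0, 0, 3],
])
print(f"kappa = {fleiss_kappa(clips):.3f}")
```

A kappa near 1 indicates agreement well above chance, a value near 0 indicates chance-level agreement, and negative values indicate the below-chance regime the abstract reports when coders label short 8-second clips without context.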
dc.language.iso: en_US
dc.publisher: Springer Berlin [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1007/978-3-642-04380-2_37 [en_US]
dc.rights: Attribution-Noncommercial-Share Alike 3.0 Unported [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/ [en_US]
dc.source: Alex Khitrik [akhitrik@media.mit.edu] after request by Rosalind Picard [en_US]
dc.title: When Human Coders (and Machines) Disagree on the Meaning of Facial Affect in Spontaneous Videos [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Hoque, M. E., R. El Kaliouby, and R. W. Picard. "When Human Coders (and Machines) Disagree on the Meaning of Facial Affect in Spontaneous Videos." Intelligent Virtual Agents, Proceedings 5773 (2009): 337-43. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Media Laboratory [en_US]
dc.contributor.department: Program in Media Arts and Sciences (Massachusetts Institute of Technology) [en_US]
dc.contributor.approver: Picard, Rosalind W.
dc.contributor.mitauthor: Picard, Rosalind W.
dc.contributor.mitauthor: Hoque, Mohammed Ehasanul
dc.contributor.mitauthor: El Kaliouby, Rana
dc.relation.journal: Intelligent Virtual Agents, 9th International Conference, IVA 2009, Amsterdam, The Netherlands, September 14-16, 2009, Proceedings [en_US]
dc.eprint.version: Original manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
dspace.orderedauthors: Hoque, Mohammed E.; El Kaliouby, Rana; Picard, Rosalind W. [en]
dc.identifier.orcid: https://orcid.org/0000-0002-5661-0022
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete

