dc.contributor.author | Song, Yale | |
dc.contributor.author | Morency, Louis-Philippe | |
dc.contributor.author | Davis, Randall | |
dc.date.accessioned | 2014-04-11T14:20:52Z | |
dc.date.available | 2014-04-11T14:20:52Z | |
dc.date.issued | 2012-10 | |
dc.identifier.isbn | 9781450314671 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/86099 | |
dc.description.abstract | Multimodal human behavior analysis is a challenging task due to the presence of complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data to a high-dimensional feature space and finds a new projection of the data that maximizes the correlation across modalities. We use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interaction across modalities explicitly. We evaluate our approach on a task of agreement and disagreement recognition from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA makes it easier to capture nonlinear hidden dynamics and that MV-HCRF helps learn interactions across modalities. | en_US |
dc.description.sponsorship | United States. Office of Naval Research (Grant N000140910625) | en_US |
dc.description.sponsorship | National Science Foundation (U.S.) (Grant IIS-1118018) | en_US |
dc.description.sponsorship | National Science Foundation (U.S.) (Grant IIS-1018055) | en_US |
dc.description.sponsorship | United States. Army Research, Development, and Engineering Command | en_US |
dc.language.iso | en_US | |
dc.relation.isversionof | http://dx.doi.org/10.1145/2388676.2388684 | en_US |
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
dc.source | MIT web domain | en_US |
dc.title | Multimodal human behavior analysis: Learning correlation and interaction across modalities | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Yale Song, Louis-Philippe Morency, and Randall Davis. 2012. Multimodal human behavior analysis: learning correlation and interaction across modalities. In Proceedings of the 14th ACM international conference on Multimodal interaction (ICMI '12). ACM, New York, NY, USA, 27-30. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.contributor.mitauthor | Song, Yale | en_US |
dc.contributor.mitauthor | Davis, Randall | en_US |
dc.relation.journal | Proceedings of the 14th ACM international conference on Multimodal interaction (ICMI '12) | en_US |
dc.eprint.version | Author's final manuscript | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dspace.orderedauthors | Song, Yale; Morency, Louis-Philippe; Davis, Randall | en_US |
dc.identifier.orcid | https://orcid.org/0000-0001-5232-7281 | |
mit.license | OPEN_ACCESS_POLICY | en_US |
mit.metadata.status | Complete | |