Multimodal human behavior analysis: Learning correlation and interaction across modalities
Author(s)
Song, Yale; Morency, Louis-Philippe; Davis, Randall
Terms of use
Open Access Policy: Creative Commons Attribution-Noncommercial-Share Alike
Abstract
Multimodal human behavior analysis is a challenging task due to the presence of complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data to a high-dimensional feature space and finds a new projection of the data that maximizes the correlation across modalities. We use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interaction across modalities explicitly. We evaluate our approach on the task of recognizing agreement and disagreement from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA simplifies capturing nonlinear hidden dynamics and that MV-HCRF helps learn interactions across modalities.
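The KCCA projection step described in the abstract can be sketched as follows. This is not the authors' implementation; it is a minimal illustration of regularized kernel CCA in the standard formulation (Hardoon et al.), using an RBF kernel to map the two views into feature space and a generalized eigenproblem to find projections that maximize cross-view correlation. The function names and the `gamma` and `reg` parameters are illustrative choices, not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF (Gaussian) kernel matrix between rows of A and rows of B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def center_kernel(K):
    """Center the kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(X, Y, gamma=1.0, reg=1e-2, n_components=2):
    """Regularized KCCA: project view X onto directions that maximize
    kernel correlation with view Y. Returns the projected X view."""
    n = X.shape[0]
    Kx = center_kernel(rbf_kernel(X, X, gamma))
    Ky = center_kernel(rbf_kernel(Y, Y, gamma))
    I = np.eye(n)
    # Solve (Kx + reg I)^{-1} Ky (Ky + reg I)^{-1} Kx a = rho^2 a,
    # whose leading eigenvectors give the dual coefficients for view X.
    M = np.linalg.solve(Kx + reg * I, Ky) @ np.linalg.solve(Ky + reg * I, Kx)
    vals, vecs = np.linalg.eig(M)
    order = np.argsort(-vals.real)[:n_components]
    alpha = vecs[:, order].real
    # Project the data: each column is one canonical direction.
    return Kx @ alpha
```

In the paper's pipeline, the projected representation of each modality would then be fed to the MV-HCRF, whose per-modality latent chains model view-specific dynamics while shared links capture cross-modal interaction.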
Date issued
2012-10
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Proceedings of the 14th ACM international conference on Multimodal interaction (ICMI '12)
Citation
Yale Song, Louis-Philippe Morency, and Randall Davis. 2012. Multimodal human behavior analysis: learning correlation and interaction across modalities. In Proceedings of the 14th ACM international conference on Multimodal interaction (ICMI '12). ACM, New York, NY, USA, 27-30.
Version: Author's final manuscript
ISBN
9781450314671