Multimodal human behavior analysis: Learning correlation and interaction across modalities
Author(s)
Song, Yale; Morency, Louis-Philippe; Davis, Randall
Terms of use
Open Access Policy: Creative Commons Attribution-Noncommercial-Share Alike
Abstract
Multimodal human behavior analysis is a challenging task due to the presence of complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data to a high-dimensional feature space and finds a new projection of the data that maximizes the correlation across modalities. We use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interaction across modalities explicitly. We evaluate our approach on the task of recognizing agreement and disagreement from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA simplifies capturing nonlinear hidden dynamics and that MV-HCRF helps learn interactions across modalities.
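The KCCA projection step described in the abstract can be sketched as follows. This is not the authors' implementation; it is a minimal illustration of regularized kernel CCA in the standard formulation (Hardoon et al.), using an RBF kernel to map the two views into feature space and a generalized eigenproblem to find projections that maximize cross-view correlation. The function names and the `gamma` and `reg` parameters are illustrative choices, not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF (Gaussian) kernel matrix between rows of A and rows of B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def center_kernel(K):
    """Center the kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(X, Y, gamma=1.0, reg=1e-2, n_components=2):
    """Regularized KCCA: project view X onto directions that maximize
    kernel correlation with view Y. Returns the projected X view."""
    n = X.shape[0]
    Kx = center_kernel(rbf_kernel(X, X, gamma))
    Ky = center_kernel(rbf_kernel(Y, Y, gamma))
    I = np.eye(n)
    # Solve (Kx + reg I)^{-1} Ky (Ky + reg I)^{-1} Kx a = rho^2 a,
    # whose leading eigenvectors give the dual coefficients for view X.
    M = np.linalg.solve(Kx + reg * I, Ky) @ np.linalg.solve(Ky + reg * I, Kx)
    vals, vecs = np.linalg.eig(M)
    order = np.argsort(-vals.real)[:n_components]
    alpha = vecs[:, order].real
    # Project the data: each column is one canonical direction.
    return Kx @ alpha
```

In the paper's pipeline, the projected representation of each modality would then be fed to the MV-HCRF, whose per-modality latent chains model view-specific dynamics while shared links capture cross-modal interaction.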
Date issued
2012-10
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Proceedings of the 14th ACM international conference on Multimodal interaction (ICMI '12)
Citation
Yale Song, Louis-Philippe Morency, and Randall Davis. 2012. Multimodal human behavior analysis: learning correlation and interaction across modalities. In Proceedings of the 14th ACM international conference on Multimodal interaction (ICMI '12). ACM, New York, NY, USA, 27-30.
Version: Author's final manuscript
ISBN
9781450314671