Permutation-based Significance Tests for Multi-modal Hierarchical Dirichlet Processes with Application to Audio-visual Data
Author(s)
Anderson, Madeline Loui
DownloadThesis PDF (8.504Mb)
Advisor
Fisher III, John W.
Terms of use
Metadata
Show full item recordAbstract
Complex underlying distributions in multi-modal data motivate the need for data fusion methods that integrate observations of different modalities in a meaningful way. We explore the multi-modal hierarchical Dirichlet process (mmHDP) mixture model as a Bayesian non-parametric approach to data fusion. In particular, we elaborate on its censored-data perspective, which aligns groups of observations at a group level to accommodate for missing data in any modality. To explore the model behavior, we develop a processing pipeline that applies the mmHDP to audio-visual data, a common and practical multi-modal system. We apply this pipeline to musical data with known audio-visual relationships and provide in-depth qualitative analyses on the learned model parameters. Because of its non-parametric and unsupervised clustering nature, it can be difficult to quantify the significance of the learned mmHDP structure. We propose a novel permutation testing framework that empirically measures the significance of the mmHDP structure and demonstrate its viability using both synthetic and real audio-visual data. The results convey that the mmHDP model captures meaningful structure in the audio-visual data and that the permutation testing framework is a viable method for quantifying model significance.
Date issued
2023-09Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer SciencePublisher
Massachusetts Institute of Technology