Permutation-based Significance Tests for Multi-modal Hierarchical Dirichlet Processes with Application to Audio-visual Data

Author(s)

Anderson, Madeline Loui

Advisor

Fisher III, John W.

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Complex underlying distributions in multi-modal data motivate the need for data fusion methods that integrate observations of different modalities in a meaningful way. We explore the multi-modal hierarchical Dirichlet process (mmHDP) mixture model as a Bayesian non-parametric approach to data fusion. In particular, we elaborate on its censored-data perspective, which aligns observations at the group level to accommodate missing data in any modality. To explore the model's behavior, we develop a processing pipeline that applies the mmHDP to audio-visual data, a common and practical multi-modal system. We apply this pipeline to musical data with known audio-visual relationships and provide in-depth qualitative analyses of the learned model parameters. Because the mmHDP is a non-parametric, unsupervised clustering model, it can be difficult to quantify the significance of its learned structure. We propose a novel permutation testing framework that empirically measures the significance of the mmHDP structure and demonstrate its viability using both synthetic and real audio-visual data. The results indicate that the mmHDP model captures meaningful structure in the audio-visual data and that the permutation testing framework is a viable method for quantifying model significance.
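
For readers unfamiliar with permutation testing, the general idea can be sketched as follows. This is a generic illustration, not the framework developed in the thesis: the test statistic (absolute correlation), the function names `permutation_p_value` and `abs_correlation`, and the synthetic "audio" and "video" features are all assumptions made for the example. The sketch shuffles the pairing between two modalities to build a null distribution and reports how often a shuffled pairing scores at least as high as the observed one.

```python
import numpy as np


def permutation_p_value(x, y, statistic, n_permutations=999, seed=0):
    """Estimate a p-value for statistic(x, y) by shuffling the pairing of y with x.

    Hypothetical helper for illustration; it is not taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    observed = statistic(x, y)
    count = 0
    for _ in range(n_permutations):
        # Break any cross-modal relationship by permuting one modality.
        permuted_y = y[rng.permutation(len(y))]
        if statistic(x, permuted_y) >= observed:
            count += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (count + 1) / (n_permutations + 1)


def abs_correlation(x, y):
    """Assumed test statistic: absolute Pearson correlation."""
    return abs(np.corrcoef(x, y)[0, 1])


# Synthetic stand-ins for per-group features from two modalities.
rng = np.random.default_rng(1)
audio = rng.normal(size=200)
video_linked = audio + 0.5 * rng.normal(size=200)  # depends on audio
video_unlinked = rng.normal(size=200)              # independent of audio

obs_linked, p_linked = permutation_p_value(audio, video_linked, abs_correlation)
obs_unlinked, p_unlinked = permutation_p_value(audio, video_unlinked, abs_correlation)
print(f"linked:   statistic={obs_linked:.3f}, p={p_linked:.3f}")
print(f"unlinked: statistic={obs_unlinked:.3f}, p={p_unlinked:.3f}")
```

A small p-value for the linked pair suggests that the observed cross-modal association is unlikely under random pairing. Applying the same idea to a learned mmHDP would require a statistic computed from the fitted model, which this sketch does not attempt.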

Date issued

2023-09

URI

https://hdl.handle.net/1721.1/152853

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses