Analyzing Multimodal Interactions through Improved Partial Information Decomposition Estimation
Author(s)
Balachandran, Adithya S.
Advisor
Liang, Paul Pu
Abstract
Multimodal AI aims to build comprehensive models by integrating information from diverse sensory inputs such as text, audio, and vision. However, significant challenges remain in understanding how these different modalities interact and contribute to downstream tasks. In particular, we seek to characterize how modalities complement each other, overlap in the information they convey, or contribute jointly to patterns that are not apparent from any single modality. To address this, we propose novel methods for quantifying these multimodal interactions using information-theoretic techniques. Specifically, we introduce an estimator for Partial Information Decomposition (PID) based on normalizing flows that scales well to high-dimensional data. We also develop a new framework for estimating pointwise PID, which provides insight into how individual data points contribute to information sharing and interactions across modalities, and show how to apply this framework to anomaly detection. We demonstrate the effectiveness of our methods on a variety of high-dimensional datasets, including both synthetic data and real-world multimodal data such as videos.
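As a minimal illustration of the kind of interaction the abstract describes, where two modalities jointly determine a pattern that neither reveals alone, the sketch below computes exact mutual information on a toy XOR task. This is not the thesis's normalizing-flow estimator; the XOR task, the helper name `mutual_information`, and the uniform-sample assumption are all introduced here purely for illustration.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(pairs):
    """Mutual information in bits between the two coordinates of (a, b) pairs.

    Assumes every pair in the list is equally likely (an empirical distribution).
    """
    n = len(pairs)
    joint = Counter(pairs)
    marginal_a = Counter(a for a, _ in pairs)
    marginal_b = Counter(b for _, b in pairs)
    return sum(
        (count / n)
        * math.log2((count / n) / ((marginal_a[a] / n) * (marginal_b[b] / n)))
        for (a, b), count in joint.items()
    )


# Hypothetical toy task: y = x1 XOR x2 with uniform binary inputs.
samples = [(x1, x2, x1 ^ x2) for x1, x2 in product([0, 1], repeat=2)]

i_x1_y = mutual_information([(x1, y) for x1, _, y in samples])
i_x2_y = mutual_information([(x2, y) for _, x2, y in samples])
i_joint_y = mutual_information([((x1, x2), y) for x1, x2, y in samples])

print(i_x1_y, i_x2_y, i_joint_y)  # 0.0 0.0 1.0
```

Each input alone carries no information about the label, while the pair carries one full bit. Under the usual two-source PID consistency equations with non-negative terms, this means the redundant and unique components are zero and the entire bit is synergistic.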
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology