Robust Inference via Optimal Transport Ambiguity Sets
Author(s)
Wang, Zheyu
Advisor
Marzouk, Youssef M.
Abstract
Uncertainty quantification is pivotal for ensuring the safety and reliability of predictive algorithms in high-stakes applications, ranging from cancer diagnosis to autonomous driving. This challenge is exacerbated by distribution shift, in which the true data-generating distribution diverges from the nominal distribution on which our statistical methods were trained. In this thesis, we formalize distribution shifts via ambiguity sets (metric neighborhoods in the space of probability measures, defined by distances such as the Wasserstein metric) and demonstrate that leveraging these ambiguity sets endows two widely used statistical algorithms with distributional robustness.
The Kalman filter enables accurate, real-time tracking of latent states by assimilating noisy, indirect measurements over time. Its performance relies on precise state-space models for both the evolution dynamics and the observation process. In practice, uncertainties in these models introduce errors that can significantly degrade filter accuracy. Here, we review two robust Kalman-filter variants that explicitly account for such errors via Wasserstein ambiguity sets.
Split conformal prediction, hereafter referred to as conformal prediction, offers a powerful framework for quantifying predictive uncertainty by constructing prediction intervals with finite-sample, distribution-free guarantees. Despite its widespread success, ensuring its validity under train-test distribution shifts remains a significant challenge. We model distribution shifts using ambiguity sets defined by two optimal transport-based metrics and propose two robust conformal prediction algorithms that preserve validity under these shifts. First, we consider ambiguity sets defined by a pseudo-divergence derived from the Lévy-Prokhorov (LP) metric, which captures both local and global data perturbations.
We provide a self-contained overview of LP ambiguity sets and their connections to widely used metrics such as the Wasserstein and total variation distances. We then establish a natural link between conformal prediction and LP ambiguity sets: by propagating the LP ambiguity set through the scoring function, we reduce complex high-dimensional distribution shifts to manageable one-dimensional shifts, enabling exact computation of the worst-case quantile and coverage. Building on this foundation, we develop valid robust conformal prediction intervals under distribution shifts, explicitly relating LP parameters to interval width and confidence levels. Experimental results on real-world datasets demonstrate the effectiveness of the proposed approach.
Next, we extend our analysis to robust conformal prediction over Wasserstein-2 ambiguity sets, deriving a theoretical characterization of the worst-case quantile. However, we find that this characterization is intractable to compute because it depends on the shape of the original score CDF, and we conclude with potential future directions.
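The following is a minimal sketch of how a worst-case quantile over a one-dimensional LP ambiguity set might enter split conformal prediction. It is an illustration under stated assumptions, not the thesis's algorithm. It assumes that the shifted score distribution may move at most a fraction `rho` of the probability mass arbitrarily far (global perturbation) and shift the remaining mass by at most `eps` (local perturbation), so that the worst-case quantile at level 1 - alpha is the calibration quantile at level 1 - alpha + rho, plus eps. The function names, the parameter names `eps` and `rho`, the absolute-residual score, and the way the finite-sample correction is applied to the inflated level are all hypothetical choices made for this sketch.

```python
import math


def conformal_quantile(scores, level):
    """Order statistic of rank ceil((n + 1) * level), the usual split-conformal quantile.

    Returns +inf when the requested rank exceeds the number of calibration scores.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    n = len(scores)
    rank = math.ceil((n + 1) * level)
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def lp_robust_threshold(scores, alpha, eps, rho):
    """Assumed worst-case score threshold over an LP ball with parameters (eps, rho).

    The global parameter rho inflates the quantile level; the local
    parameter eps is added to the resulting quantile.
    """
    level = 1.0 - alpha + rho
    if level >= 1.0:
        # Too much mass can be moved: only the trivial threshold is valid.
        return math.inf
    return conformal_quantile(scores, level) + eps


def prediction_interval(y_hat, threshold):
    """Interval for the absolute-residual score |y - y_hat| (an assumed score)."""
    return (y_hat - threshold, y_hat + threshold)
```

In this sketch, setting `eps = 0` and `rho = 0` recovers the standard split conformal threshold, and larger values of either parameter widen the interval, which mirrors the relationship between LP parameters and interval width described in the abstract.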
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Publisher
Massachusetts Institute of Technology