On the low-dimensional structure of Bayesian inference
Author(s)
Spantini, Alessio
Other Contributors
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics.
Advisor
Youssef M. Marzouk.
Abstract
The Bayesian approach to inference characterizes model parameters and predictions through the exploration of their posterior distributions, i.e., their distributions conditioned on available data. The Bayesian paradigm provides a flexible, principled framework for quantifying uncertainty, wherein heterogeneous and incomplete sources of information (e.g., prior knowledge, noisy observations, imperfect models) can be properly rationalized. Yet a major obstacle to deploying Bayesian inference in realistic applications is computational: characterizing the associated high-dimensional and non-Gaussian posterior distributions remains a challenging task. While the Bayesian formulation is quite general, essential features of a statistical model can bring additional structure to the Bayesian update. For instance, the prior distribution often encodes some kind of regularity in the parameters; observations might be sparse and corrupted by noise; observations might also be indirect, related to the parameters by a forward operator that filters out some information; the posterior distribution might satisfy conditional independence assumptions that reflect local probabilistic interactions; and in some cases we might be uninterested in the posterior distribution per se, but rather in specific prediction goals. In this thesis we: (1) provide a rigorous mathematical characterization of low-dimensional structures that enable efficient Bayesian inference in high-dimensional and continuous parameter spaces; and (2) exploit this characterization to devise new structure-exploiting and computationally efficient inference algorithms. Our contributions encompass multiple related topics. First we characterize optimal low-rank approximations of linear-Gaussian Bayesian inverse problems, and of their goal-oriented extensions. 
Then we turn to inference in the nonlinear non-Gaussian setting, analyzing the sparsity, decomposability, and low-rank structure of deterministic couplings between distributions. These couplings facilitate efficient computation of posterior expectations in generically non-Gaussian settings. Based on this analysis, we introduce a number of approaches for representing non-Gaussian Markov random fields and for exploiting their conditional independence structure in computation by means of sparse nonlinear transport maps. We also develop new variational algorithms for nonlinear smoothing and sequential parameter estimation. These algorithms can be understood as the natural generalization of the square-root Rauch-Tung-Striebel Gaussian smoother to the non-Gaussian case. Finally, we outline a new class of nonlinear filters induced by local couplings, for inference in high-dimensional spatiotemporal processes with chaotic dynamics.
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2017. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 239-266).
Date issued
2017
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Publisher
Massachusetts Institute of Technology
Keywords
Aeronautics and Astronautics.