On the low-dimensional structure of Bayesian inference

Spantini, Alessio

Author(s)

Spantini, Alessio

Other Contributors

Massachusetts Institute of Technology. Department of Aeronautics and Astronautics.

Advisor

Youssef M. Marzouk.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

The Bayesian approach to inference characterizes model parameters and predictions through the exploration of their posterior distributions, i.e., their distributions conditioned on available data. The Bayesian paradigm provides a flexible, principled framework for quantifying uncertainty, wherein heterogeneous and incomplete sources of information (e.g., prior knowledge, noisy observations, imperfect models) can be properly rationalized. Yet a major obstacle to deploying Bayesian inference in realistic applications is computational: characterizing the associated high-dimensional and non-Gaussian posterior distributions remains a challenging task. While the Bayesian formulation is quite general, essential features of a statistical model can bring additional structure to the Bayesian update. For instance, the prior distribution often encodes some kind of regularity in the parameters; observations might be sparse and corrupted by noise; observations might also be indirect, related to the parameters by a forward operator that filters out some information; the posterior distribution might satisfy conditional independence assumptions that reflect local probabilistic interactions; and in some cases we might be uninterested in the posterior distribution per se, but rather in specific prediction goals. In this thesis we: (1) provide a rigorous mathematical characterization of low-dimensional structures that enable efficient Bayesian inference in high-dimensional and continuous parameter spaces; and (2) exploit this characterization to devise new structure-exploiting and computationally efficient inference algorithms. Our contributions encompass multiple related topics. First we characterize optimal low-rank approximations of linear-Gaussian Bayesian inverse problems, and of their goal-oriented extensions. 
Then we turn to inference in the nonlinear non-Gaussian setting, analyzing the sparsity, decomposability, and low-rank structure of deterministic couplings between distributions. These couplings facilitate efficient computation of posterior expectations in generically non-Gaussian settings. Based on this analysis, we introduce a number of approaches for representing non-Gaussian Markov random fields and for exploiting their conditional independence structure in computation by means of sparse nonlinear transport maps. We also develop new variational algorithms for nonlinear smoothing and sequential parameter estimation. These algorithms can be understood as the natural generalization, to the non-Gaussian case, of the square-root Rauch-Tung-Striebel Gaussian smoother. Finally, we outline a new class of nonlinear filters induced by local couplings, for inference in high-dimensional spatiotemporal processes with chaotic dynamics.
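
Illustrative note (not taken from the thesis): the linear-Gaussian setting mentioned in the abstract can be sketched with a textbook Gaussian conditioning step. The example assumes a three-dimensional parameter with an identity prior covariance and a single scalar observation y = g·x + noise. All names (`gaussian_update`, `g`, `noise_var`) and all numbers are hypothetical choices made for this sketch. It shows one simple instance of low-dimensional structure: a single observation changes the prior covariance only by a rank-one term, so directions orthogonal to the prior-weighted observation vector are left untouched.

```python
# Minimal sketch: condition a Gaussian prior on one noisy linear observation.
# Standard conditioning formulas; the example values are hypothetical.

def mat_vec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def gaussian_update(prior_mean, prior_cov, g, noise_var, y):
    """Posterior of x ~ N(prior_mean, prior_cov) given y = g.x + e,
    with e ~ N(0, noise_var). prior_cov is assumed symmetric."""
    n = len(prior_mean)
    cov_g = mat_vec(prior_cov, g)            # prior covariance applied to g
    s = dot(g, cov_g) + noise_var            # variance of the predicted observation
    gain = [c / s for c in cov_g]            # gain vector
    residual = y - dot(g, prior_mean)        # data misfit at the prior mean
    post_mean = [m + k * residual for m, k in zip(prior_mean, gain)]
    # Posterior covariance = prior covariance minus a rank-one matrix.
    post_cov = [[prior_cov[i][j] - cov_g[i] * cov_g[j] / s for j in range(n)]
                for i in range(n)]
    return post_mean, post_cov

prior_mean = [0.0, 0.0, 0.0]
prior_cov = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
g = [1.0, 1.0, 0.0]      # the observation sees only the sum of the first two parameters
noise_var = 1.0
y = 3.0

post_mean, post_cov = gaussian_update(prior_mean, prior_cov, g, noise_var, y)

for row in post_cov:
    print(row)
```

In this example the variance of the third parameter stays at its prior value, because the observation carries no information about it. Only the direction aligned with `g` is updated.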

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2017.

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

Cataloged from student-submitted PDF version of thesis.

Includes bibliographical references (pages 239-266).

Date issued

2017

URI

http://hdl.handle.net/1721.1/113716

Department

Massachusetts Institute of Technology. Department of Aeronautics and Astronautics

Publisher

Massachusetts Institute of Technology

Keywords

Aeronautics and Astronautics.

Collections

Doctoral Theses