Show simple item record

dc.contributor.advisor  Marzouk, Youssef
dc.contributor.author  Baptista, Ricardo Miguel
dc.date.accessioned  2022-08-29T16:29:33Z
dc.date.available  2022-08-29T16:29:33Z
dc.date.issued  2022-05
dc.date.submitted  2022-06-09T16:13:56.781Z
dc.identifier.uri  https://hdl.handle.net/1721.1/145049
dc.description.abstract  Probabilistic modeling and Bayesian inference in non-Gaussian settings are pervasive challenges for science and engineering applications. Transportation of measure provides a principled framework for treating non-Gaussianity and for generalizing many methods that rest on Gaussian assumptions. A transport map deterministically couples a simple reference distribution (e.g., a standard Gaussian) to a complex target distribution via a bijective transformation. Finding such a map enables efficient sampling from the target distribution and immediate access to its density. Triangular maps comprise a general class of transports that are attractive from the perspectives of analysis, modeling, and computation. This thesis: (1) develops a general representation for monotone triangular maps, and adaptive methodologies for estimating such maps (and their associated pushforward densities) from samples; (2) uses triangular maps and their compositions to perform Bayesian computation in likelihood-free settings, including new ensemble methods for nonlinear filtering; and (3) proposes parameter and data dimension reduction techniques with error guarantees for high-dimensional inverse problems. The first part of the thesis explores the use of triangular transport maps for density estimation and for learning probabilistic graphical models. To construct triangular maps, we represent monotone functions as smooth transformations of unconstrained (non-monotone) functions. We show how certain structural choices for these transformations lead to smooth optimization problems with no spurious local minima, i.e., where all local minima are global minima. Given samples, we then propose an adaptive algorithm that estimates maps with sparse variable dependence. We demonstrate how this framework enables joint and conditional density estimation across a range of sample sizes, and how it can explicitly learn the Markov properties of a continuous non-Gaussian distribution.
To this end, we introduce a consistent estimator for the Markov structure based on integrated Hessian information from the log-density. We then propose an iterative algorithm for learning sparse graphical models by exploiting a corresponding sparsity structure in triangular maps. A core advantage of triangular maps is that their components expose conditionals of the target distribution. Hence, learning a map that depends on both parameters and observations enables efficient sampling from the posterior distribution in a Bayesian inference problem. Crucially, this can be done without evaluating the likelihood function, which is often inaccessible or computationally prohibitive in scientific applications (as with forward models given by stochastic partial differential equations, which we consider here). In the second part of this thesis, we propose and analyze a specific composition of transport maps that directly transforms prior samples into posterior samples. We show that this approach, termed the stochastic map (SM) algorithm, improves over other transport-based methods for conditional sampling by reducing the bias and variance of the associated posterior approximation. We then use the SM algorithm to sequentially estimate the state of a chaotic dynamical system given online observations, a nonlinear filtering problem known in geophysical applications as “data assimilation” (DA). We show that when the SM algorithm is restricted to linear maps, it reduces to the ensemble Kalman filter (EnKF), a workhorse algorithm for DA; with nonlinear updates, however, the SM algorithm substantially improves on the performance of the EnKF in challenging regimes. Finally, we extend the use of transport for high-dimensional inference problems by developing a joint dimension reduction strategy for parameters and observations. 
We identify relevant low-dimensional projections of these variables by minimizing an information-theoretic upper bound on the error in the posterior approximation. We show that this approach reduces to canonical correlation analysis in the linear–Gaussian setting, while outperforming standard dimension reduction strategies in a variety of nonlinear and non-Gaussian inference problems.
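The abstract's core construction, representing a monotone map component as a smooth transformation of an unconstrained (non-monotone) function, can be illustrated with a minimal sketch. The rectifier choice (softplus), the finite-difference derivative, the trapezoidal quadrature, and all function names below are illustrative assumptions for exposition, not the thesis's actual implementation:

```python
import numpy as np

def softplus(z):
    """Positive rectifier g(z) = log(1 + exp(z)); illustrative choice."""
    return np.logaddexp(0.0, z)

def monotone_component(f, x_prev, xk, n_quad=200):
    """Sketch of a map component that is monotone in its last argument:
        T(x_prev, xk) = f(x_prev, 0) + integral_0^xk g(df/dt(x_prev, t)) dt,
    where f is an arbitrary smooth (possibly non-monotone) function and
    g > 0 guarantees dT/dxk > 0. Derivative via central differences,
    integral via the trapezoidal rule (both illustrative)."""
    t = np.linspace(0.0, xk, n_quad)
    eps = 1e-4
    df = np.array([(f(x_prev, ti + eps) - f(x_prev, ti - eps)) / (2 * eps)
                   for ti in t])
    y = softplus(df)
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))
    return f(x_prev, 0.0) + integral

# A deliberately non-monotone base function f:
f = lambda x_prev, t: np.sin(3 * t) + 0.5 * x_prev * t
# T is nonetheless strictly increasing in its last argument:
vals = [monotone_component(f, 1.0, xk) for xk in (-1.0, 0.0, 1.0, 2.0)]
assert all(a < b for a, b in zip(vals, vals[1:]))
```

With a linearly parameterized f (e.g., polynomial or other basis-function expansions), plugging this representation into a sample-based objective yields the smooth optimization problems with no spurious local minima that the abstract refers to; here f is fixed only to demonstrate the monotonicity guarantee.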
dc.publisher  Massachusetts Institute of Technology
dc.rights  In Copyright - Educational Use Permitted
dc.rights  Copyright MIT
dc.rights.uri  http://rightsstatements.org/page/InC-EDU/1.0/
dc.title  Probabilistic modeling and Bayesian inference via triangular transport
dc.type  Thesis
dc.description.degree  Ph.D.
dc.contributor.department  Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
dc.identifier.orcid  http://orcid.org/0000-0002-0421-890X
mit.thesis.degree  Doctoral
thesis.degree.name  Doctor of Philosophy

