Approximate Bayesian Modeling with Embedded Gaussian Processes
Author: Chen, Rujian
Advisor: Fisher III, John W.

Abstract
Bayesian inference plays an important role in uncertainty quantification and decision making in science and engineering applications. However, the presence of complex model components can significantly impact both model accuracy and the computational feasibility of Bayesian inference methods. We propose the embedded Gaussian process framework to address these challenges. The embedded GP model captures the uncertainty of complex physical models and incorporates it into posterior inference, where a joint distribution over all uncertain quantities, including the complex physical models, is learned from data. We show that under appropriate conditions, inference can be performed efficiently using Gaussian process properties and related computational techniques. Compared to previous deterministic modeling approaches, our proposed framework leads to improved model fit and decision-making capabilities, which we demonstrate through an application to Bayesian inference and experimental design for a large-scale offshore oil production system featuring complex fluid dynamical processes.
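To make the core idea concrete, the following is a minimal illustrative sketch of the Gaussian process surrogate machinery the framework builds on: an expensive physical model is observed at a few design points, and the GP posterior supplies both a prediction and a quantified uncertainty that downstream inference can carry along. The squared-exponential kernel, the `np.sin` stand-in for the simulator, and all design points are assumptions for illustration, not the thesis's actual model.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0, variance=1.0):
    # Squared-exponential kernel k(x, x') = s^2 exp(-(x - x')^2 / (2 l^2)).
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X_train, y_train, X_test, noise=1e-4):
    # Standard GP regression: condition the GP prior on the observed runs.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                    # posterior predictive mean
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                    # posterior predictive covariance
    return mean, cov

# Treat the complex physical model as unknown; evaluate it at a few points.
simulator = np.sin                          # hypothetical stand-in simulator
X_train = np.linspace(0, 2 * np.pi, 8)
y_train = simulator(X_train)
X_test = np.linspace(0, 2 * np.pi, 50)
mean, cov = gp_posterior(X_train, y_train, X_test)
# The posterior standard deviation quantifies surrogate uncertainty, which
# an embedded-GP treatment propagates into the joint posterior rather than
# discarding it as a deterministic approximation would.
std = np.sqrt(np.clip(np.diag(cov), 0, None))
```

The key contrast with a deterministic surrogate is the last line: `std` is retained and enters inference, rather than the surrogate mean being plugged in as if it were exact.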
Gaussian processes are increasingly employed in science and engineering applications to provide efficient approximations to complex computational models. Recent literature continues to expand and generalize Gaussian process surrogate models, including extensions to generalized observations of real systems or simulators. Despite these growing applications, the theoretical properties and statistical implications of GP approximations for data analysis are not yet fully understood. We contribute advances on this front by studying asymptotic properties of the approximate posterior in GP surrogate models with generalized observations. We prove conditions and guarantees for consistent approximate inference in terms of posterior expectations and KL-divergence. Our convergence results provide a family of consistency guarantees for downstream prediction, estimation, and decision making.
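The shape of such a guarantee can be sketched in notation that is assumed here for illustration (the thesis's exact statements and conditions are not given in this abstract): if the KL-divergence between the exact posterior and the surrogate-based approximate posterior vanishes as the number of simulator evaluations grows, then approximate posterior expectations of bounded test functions converge to their exact counterparts, via Pinsker's inequality and a total-variation bound.

```latex
% \pi(\cdot \mid y): exact posterior over parameters \theta;
% \tilde\pi_n(\cdot \mid y): approximate posterior using a GP surrogate
% trained on n simulator evaluations (illustrative notation).
\mathrm{KL}\bigl(\pi(\cdot \mid y) \,\big\|\, \tilde\pi_n(\cdot \mid y)\bigr)
  \;\xrightarrow[n \to \infty]{}\; 0
\quad\Longrightarrow\quad
\bigl|\,\mathbb{E}_{\tilde\pi_n}[f(\theta)] - \mathbb{E}_{\pi}[f(\theta)]\,\bigr|
  \;\le\; 2\,\|f\|_\infty
  \sqrt{\tfrac{1}{2}\,\mathrm{KL}\bigl(\pi \,\big\|\, \tilde\pi_n\bigr)}
  \;\longrightarrow\; 0 .
```

Consistency in this sense is what licenses using the approximate posterior for downstream prediction, estimation, and decision making.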
Finally, we study the problem of hyperparameter optimization in probabilistic latent variable models. Although efficient algorithms are available for many classes of popular models, they cannot handle fully general models due to intractable quantities that must be computed. In general, hyperparameter optimization is well known to be a challenging problem for complex or high-dimensional models lacking special structure. This is the case, for example, for the embedded GP model, where GP hyperparameters can strongly affect inference properties. We develop a new particle-based approximate algorithm that is both simple to implement and able to handle general models, including hard cases that cannot be solved by the aforementioned specialized optimizers. We show that our algorithm outperforms state-of-the-art optimizers in a number of simulation experiments. We then demonstrate our method on the embedded GP model for the large-scale oil production system. Besides substantially improving model performance, our method allows us to infer simulator mismatch characteristics from data in greater detail, a capability not previously available.
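The flavor of a particle-based hyperparameter optimizer can be sketched on a toy latent-variable model. This is not the thesis's algorithm: the model (z ~ N(mu, 1), y ~ N(z, sigma_obs^2)), the use of Fisher's identity, and the self-normalized importance-sampling gradient estimator are all assumptions chosen to illustrate how particles substitute for an intractable expectation over the latent variable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: latent z ~ N(mu, 1), observation y ~ N(z, 0.25).
# The hyperparameter to optimize is the prior mean mu.
sigma_obs = 0.5
y = rng.normal(2.0, np.sqrt(1 + sigma_obs**2), size=200)  # synthetic data

def particle_grad(mu, y, n_particles=500):
    # Fisher's identity: d/dmu log p(y | mu) = E_{p(z | y, mu)}[d/dmu log p(z | mu)].
    # The intractable posterior expectation is approximated with particles
    # drawn from the prior p(z | mu), reweighted by the likelihood p(y | z).
    z = rng.normal(mu, 1.0, size=(len(y), n_particles))   # prior particles
    logw = -0.5 * (y[:, None] - z) ** 2 / sigma_obs**2    # log-likelihood weights
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                     # self-normalize
    # d/dmu log N(z | mu, 1) = z - mu, averaged under the particle posterior.
    return np.mean(np.sum(w * (z - mu), axis=1))

# Gradient ascent on the estimated marginal log-likelihood.
mu = 0.0
for _ in range(100):
    mu += 0.5 * particle_grad(mu, y)
# For this conjugate toy model the marginal MLE is the sample mean of y,
# so mu should settle near y.mean() up to Monte Carlo noise.
```

The appeal of such particle schemes is that nothing in the update exploits conjugacy: the same loop applies when the likelihood is a black-box simulator, at the price of Monte Carlo variance in the gradient.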
Date issued: 2023-09
Department: Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Publisher: Massachusetts Institute of Technology