Practical Methods for Scalable Bayesian and Causal Inference with Provable Quality Guarantees

Agrawal, Raj

dc.contributor.advisor	Broderick, Tamara
dc.contributor.advisor	Uhler, Caroline
dc.contributor.author	Agrawal, Raj
dc.date.accessioned	2022-01-14T15:05:47Z
dc.date.available	2022-01-14T15:05:47Z
dc.date.issued	2021-06
dc.date.submitted	2021-06-23T19:34:24.578Z
dc.identifier.uri	https://hdl.handle.net/1721.1/139350
dc.description.abstract	Many scientific and decision-making tasks require learning complex relationships between a set of 𝑝 covariates and a target response, from 𝑁 observed datapoints with 𝑁 ≪ 𝑝. For example, in genomics and precision medicine, there may be thousands or millions of genetic and environmental covariates but just hundreds or thousands of observed individuals. Researchers would like to (1) identify a small set of factors associated with diseases, (2) quantify these factors’ effects, and (3) test for causality. Unfortunately, in this high-dimensional data regime, inference is statistically and computationally challenging due to non-linear interaction effects, unobserved confounders, and the lack of randomized experimental data. In this thesis, I start by addressing the problems of variable selection and estimation when there are non-linear interactions and fewer datapoints than covariates. Unlike previous methods whose runtimes scale at least quadratically in the number of covariates, my new method (SKIM-FA) uses a kernel trick to perform inference in linear time by exploiting special interaction structure. While SKIM-FA identifies potential risk factors, not all of these factors need be causal. So next, I aim to identify causal factors to aid in decision making. To this end, I show when we can extract causal relationships from observational data, even in the presence of unobserved confounders, non-linear effects, and a lack of randomized controlled data. In the last part of my thesis, I focus on experimental design. Specifically, if the observational data is not adequate, how do we optimally collect new experimental data to test whether particular causal relationships of interest exist?
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright MIT
dc.rights.uri	http://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Practical Methods for Scalable Bayesian and Causal Inference with Provable Quality Guarantees
dc.type	Thesis
dc.description.degree	Ph.D.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Doctoral
thesis.degree.name	Doctor of Philosophy

Files in this item

Name: Agrawal-rajisme-PhD-EECS-2021- ...
Size: 5.869Mb
Format: PDF
Description: Thesis PDF


This item appears in the following Collection(s)

Doctoral Theses
