
dc.contributor.advisor: Broderick, Tamara
dc.contributor.advisor: Uhler, Caroline
dc.contributor.author: Agrawal, Raj
dc.date.accessioned: 2022-01-14T15:05:47Z
dc.date.available: 2022-01-14T15:05:47Z
dc.date.issued: 2021-06
dc.date.submitted: 2021-06-23T19:34:24.578Z
dc.identifier.uri: https://hdl.handle.net/1721.1/139350
dc.description.abstract: Many scientific and decision-making tasks require learning complex relationships between a set of 𝑝 covariates and a target response, from 𝑁 observed datapoints with 𝑁 ≪ 𝑝. For example, in genomics and precision medicine, there may be thousands or millions of genetic and environmental covariates but just hundreds or thousands of observed individuals. Researchers would like to (1) identify a small set of factors associated with diseases, (2) quantify these factors’ effects, and (3) test for causality. Unfortunately, in this high-dimensional data regime, inference is statistically and computationally challenging due to non-linear interaction effects, unobserved confounders, and the lack of randomized experimental data. In this thesis, I start by addressing the problems of variable selection and estimation when there are non-linear interactions and fewer datapoints than covariates. Unlike previous methods whose runtimes scale at least quadratically in the number of covariates, my new method (SKIM-FA) uses a kernel trick to perform inference in linear time by exploiting special interaction structure. While SKIM-FA identifies potential risk factors, not all of these factors need be causal. So next I aim to identify causal factors to aid in decision making. To this end, I show when we can extract causal relationships from observational data, even in the presence of unobserved confounders, non-linear effects, and a lack of randomized controlled data. In the last part of my thesis, I focus on experimental design. Specifically, if the observational data is not adequate, how do we optimally collect new experimental data to test whether particular causal relationships of interest exist?
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Practical Methods for Scalable Bayesian and Causal Inference with Provable Quality Guarantees
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy

