Algorithmic Approaches to Nonparametric Causal Inference
Author(s)
Cohen, Peter L.
Advisor
Fogarty, Colin B.
Abstract
This thesis presents procedures for inference on causal parameters across an array of contexts, including observational studies, completely randomized designs, paired experiments, and covariate-adaptive designs. First, we discuss an application of convex optimization to directional inference and sensitivity analysis in matched observational studies. We design an algorithm that maximizes the signal-to-noise ratio while accounting for unobserved confounding. We analyze the asymptotic distribution of the algorithm's output to develop asymptotically valid hypothesis tests for causal effects. The resulting procedure achieves the maximal design sensitivity over a broad class of procedures. Second, we examine the role of feature information in drawing high-precision inferences about effects in completely randomized experiments. We develop a calibration technique based on linear regression that yields imputation estimators with upper bounds on their asymptotic variance. We show that this calibration procedure applies to any imputation estimator, which may itself be semiparametrically efficient, and automatically certifies that the resulting nonlinear regression-adjusted estimator is at least as asymptotically precise as the difference in means, a property that was previously not guaranteed for nonlinear regression-adjusted estimators under model misspecification. Third, we introduce Gaussian prepivoting: an algorithmic technique for constructing test statistics for which randomization inference remains asymptotically valid even when the symmetries underlying the randomization hypothesis are violated under the null. We demonstrate that randomization tests based on prepivoted statistics are finite-sample exact under sharp nulls and asymptotically control the probability of false rejection under weak nulls.
Prepivoting allows for the formation of confidence regions for treatment effects that can be interpreted simultaneously as exact confidence regions for homogeneous additive treatment effects and as asymptotic confidence regions for heterogeneous additive effects, thereby unifying Fisherian and Neymanian inference for many experimental designs, including rerandomized experiments. Fourth, we construct a nested hierarchy of resampling algorithms that exploit probabilistic structure in superpopulation, fixed-covariate, and finite-population models to facilitate nonparametric inference for a wide variety of statistics in completely randomized designs. The resampling algorithms extend the classical bootstrap paradigm by leveraging modern results on regression adjustment and optimal transport to achieve significant gains under fixed-covariate and finite-population models.
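As a rough illustration of the Gaussian prepivoting idea described in the abstract, the sketch below treats only the simplest case: a scalar difference in means in a completely randomized design. It is one reading of the abstract, not the thesis's algorithm. The function names (`normal_cdf`, `prepivoted_stat`, `randomization_p_value`), the choice of the conservative Neyman variance estimator, the one-sided alternative, and the Monte Carlo approximation to the randomization distribution are all assumptions made for illustration. In this one-dimensional case the transformation is monotone, so the test coincides with a randomization test on the studentized difference in means; the abstract indicates that the thesis covers more general statistics and designs.

```python
import math
import random
from statistics import mean, variance


def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prepivoted_stat(y, z):
    """Difference in means passed through an estimated Gaussian CDF.

    Assumption: the limiting variance is estimated by the conservative
    Neyman estimator s1^2/n1 + s0^2/n0. Each arm needs at least two units.
    """
    treated = [yi for yi, zi in zip(y, z) if zi == 1]
    control = [yi for yi, zi in zip(y, z) if zi == 0]
    diff = mean(treated) - mean(control)
    var_hat = variance(treated) / len(treated) + variance(control) / len(control)
    if var_hat == 0.0:
        # Degenerate case: no within-arm variation, so fall back to the sign.
        return 0.5 if diff == 0 else (1.0 if diff > 0 else 0.0)
    return normal_cdf(diff / math.sqrt(var_hat))


def randomization_p_value(y, z, draws=2000, seed=0):
    """One-sided Monte Carlo randomization p-value.

    Outcomes are held fixed and treatment labels are reshuffled, which
    imputes potential outcomes under the sharp null of no effect.
    """
    rng = random.Random(seed)
    observed = prepivoted_stat(y, z)
    z_perm = list(z)
    exceed = 0
    for _ in range(draws):
        rng.shuffle(z_perm)  # new complete randomization with the same arm sizes
        if prepivoted_stat(y, z_perm) >= observed:
            exceed += 1
    # Adding one to numerator and denominator keeps the Monte Carlo test valid.
    return (1 + exceed) / (1 + draws)


# Toy data, invented purely for illustration.
y = [5.1, 6.3, 4.8, 7.0, 3.9, 4.2, 4.0, 3.5]
z = [1, 1, 1, 1, 0, 0, 0, 0]
p = randomization_p_value(y, z)
```

The variance estimate is recomputed inside every reshuffle rather than fixed at its observed value, so the reference distribution is that of the transformed statistic itself.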
Date issued
2022-05
Department
Massachusetts Institute of Technology. Operations Research Center
Publisher
Massachusetts Institute of Technology