Active learning for optimal intervention design in causal models

Zhang, Jiaqi; Cammarata, Louis; Squires, Chandler; Sapsis, Themistoklis P.; Uhler, Caroline

Download

s42256-023-00719-0.pdf (8.868Mb)

Terms of use

Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.

Abstract

Sequential experimental design to discover interventions that achieve a desired outcome is a key problem in various domains including science, engineering and public policy. When the space of possible interventions is large, making an exhaustive search infeasible, experimental design strategies are needed. In this context, encoding the causal relationships between the variables, and thus the effect of interventions on the system, is critical for identifying desirable interventions more efficiently. Here we develop a causal active learning strategy to identify interventions that are optimal, as measured by the discrepancy between the post-interventional mean of the distribution and a desired target mean. The approach employs a Bayesian update for the causal model and prioritizes interventions using a carefully designed, causally informed acquisition function. This acquisition function is evaluated in closed form, allowing for fast optimization. The resulting algorithms are theoretically grounded with information-theoretic bounds and provable consistency results for linear causal models with known causal graph. We apply our approach to both synthetic data and single-cell transcriptomic data from Perturb–CITE-sequencing experiments to identify optimal perturbations that induce a specific cell-state transition. The causally informed acquisition function generally outperforms existing criteria, allowing for optimal intervention design with fewer but carefully selected samples.

Date issued

2023-10-02

URI

https://hdl.handle.net/1721.1/154216

Department

Massachusetts Institute of Technology. Laboratory for Information and Decision Systems; Statistics and Data Science Center (Massachusetts Institute of Technology); Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Institute for Data, Systems, and Society

Journal

Nature Machine Intelligence

Publisher

Springer Science and Business Media LLC

Citation

Zhang, J., Cammarata, L., Squires, C. et al. Active learning for optimal intervention design in causal models. Nat Mach Intell 5, 1066–1075 (2023).

Version

Final published version

ISSN

2522-5839

Keywords

Artificial Intelligence, Computer Networks and Communications, Computer Vision and Pattern Recognition, Human-Computer Interaction, Software

Collections

MIT Open Access Articles