Scaling Bayesian inference for generative models via probabilistic programming
Author(s)
Loula Guimarães de Campos, João
Advisor
Tenenbaum, Joshua B.
Mansinghka, Vikash
Abstract
This thesis develops probabilistic programming methods that enable rational AI agents for data science. The work is organized into two parts. Part I introduces GenLM and Adaptive Weighted Rejection Sampling for translating natural-language instructions into structured programs subject to both syntactic and semantic constraints, outperforming existing approaches across several domains. Part II develops Bayesian generative models for tabular data that answer a wide range of questions, yield stable inferences across subpopulations of different sizes, and scale to hundreds of millions of rows on GPUs; it also presents early work on Large Population Models that unify heterogeneous datasets. Together, these contributions provide first steps toward a unified framework for creating AI agents that can rationally formalize and answer questions about data.
Date issued
2025-09
Department
Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Publisher
Massachusetts Institute of Technology