Scaling Bayesian inference for generative models via probabilistic programming
Author(s)
Loula Guimarães de Campos, João
Advisor
Tenenbaum, Joshua B.
Mansinghka, Vikash
Abstract
This thesis develops probabilistic programming methods that enable rational AI agents for data science. The work is organized into two parts. Part I introduces GenLM and Adaptive Weighted Rejection Sampling for translating natural-language instructions into structured programs subject to both syntactic and semantic constraints, outperforming existing approaches across several domains. Part II develops Bayesian generative models for tabular data that answer a wide range of questions, yield stable inferences across subpopulations of different sizes, and scale to hundreds of millions of rows on GPUs; it also presents early work on Large Population Models that unify heterogeneous datasets. Together, these contributions provide first steps toward a unified framework for creating AI agents that can rationally formalize and answer questions about data.
Date issued
2025-09
Department
Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Publisher
Massachusetts Institute of Technology