Mixed-Variable Bayesian Optimization using Prior-Data Fitted Networks
Author(s)
Qian, Janet
Advisor
Ahmed, Faez
Abstract
Bayesian optimization (BO) is a powerful framework for optimizing expensive black-box functions, widely used in domains such as materials science, engineering design, and hyperparameter tuning. Traditional BO relies on Gaussian processes (GPs) as surrogate models, but GPs face limitations in flexibility and scalability. Prior-Data Fitted Networks (PFNs) have recently emerged as a promising alternative, leveraging transformer architectures and in-context learning to approximate posterior predictive distributions (PPDs) in a single forward pass. By training on large amounts of synthetically generated data from sample-able function priors, PFNs can learn to rapidly predict PPDs across a wide range of function classes. In this thesis, we investigate the application of PFNs to mixed-variable BO, a particularly challenging setting due to the interplay between continuous and discrete inputs and the combinatorial complexity of the search space. We evaluate how PFNs perform when integrated with a range of mixed-variable BO strategies, including various encoding schemes and discrete-aware acquisition optimization. Additionally, we explore how fine-tuning PFNs on targeted function priors can enhance performance when prior knowledge about the objective is available. Our contributions include empirical evaluations of mixed-variable BO techniques, insights into PFN training, and a suite of mixed-variable benchmark problems.
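To make the abstract's ingredients concrete, the sketch below shows a minimal mixed-variable BO loop with the two pieces the thesis evaluates: a one-hot encoding scheme for the categorical input and discrete-aware acquisition optimization (enumerating categories while searching the continuous dimension). The objective, the category names, and the surrogate are all illustrative inventions for this sketch; in particular, a toy nearest-neighbour predictor stands in for the PFN's posterior predictive distribution, since the actual PFN is a trained transformer not reproduced here.

```python
import math

# Hypothetical mixed-variable objective: one categorical and one
# continuous input (names and form are illustrative, not from the thesis).
CATEGORIES = ["a", "b", "c"]

def objective(cat, x):
    offset = {"a": 0.0, "b": 0.5, "c": -0.3}[cat]
    return math.sin(3 * x) + offset

def encode(cat, x):
    # One-hot encode the categorical and append the continuous value --
    # the simplest encoding scheme for mixed-variable inputs.
    return [1.0 if c == cat else 0.0 for c in CATEGORIES] + [x]

def surrogate(history, z):
    # Toy stand-in for the PFN: nearest-neighbour mean with a
    # distance-inflated predictive std, returned as (mu, sigma).
    if not history:
        return 0.0, 1.0
    d, y = min((sum((a - b) ** 2 for a, b in zip(ze, z)), ye)
               for ze, ye in history)
    return y, 0.2 + math.sqrt(d)

def expected_improvement(mu, sigma, best):
    # Closed-form EI for a Gaussian predictive distribution (maximization).
    if sigma <= 0:
        return max(mu - best, 0.0)
    u = (mu - best) / sigma
    cdf = 0.5 * (1 + math.erf(u / math.sqrt(2)))
    pdf = math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
    return sigma * (u * cdf + pdf)

def propose(history, best, grid=21):
    # Discrete-aware acquisition optimization: enumerate every category,
    # grid-search the continuous dimension within each branch.
    cands = [(c, i / (grid - 1)) for c in CATEGORIES for i in range(grid)]
    return max(cands, key=lambda cx: expected_improvement(
        *surrogate(history, encode(*cx)), best))

history, best = [], -math.inf
for _ in range(15):
    cat, x = propose(history, best)
    y = objective(cat, x)
    history.append((encode(cat, x), y))
    best = max(best, y)
```

The design point worth noting is in `propose`: rather than relaxing the categorical variable into a continuous box, the acquisition is maximized exactly over the discrete choices, which is feasible whenever the number of categorical combinations is small. A real PFN surrogate would replace `surrogate` with a forward pass that conditions on `history` in-context.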
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology