Mixed-Variable Bayesian Optimization using Prior-Data Fitted Networks
Author(s)
Qian, Janet
Advisor
Ahmed, Faez
Abstract
Bayesian optimization (BO) is a powerful framework for optimizing expensive black-box functions, widely used in domains such as materials science, engineering design, and hyperparameter tuning. Traditional BO relies on Gaussian processes (GPs) as surrogate models, but GPs face limitations in flexibility and scalability. Prior-Data Fitted Networks (PFNs) have recently emerged as a promising alternative, leveraging transformer architectures and in-context learning to approximate posterior predictive distributions (PPDs) in a single forward pass. By training on large amounts of synthetically generated data from sample-able function priors, PFNs can learn to rapidly predict PPDs across a wide range of function classes. In this thesis, we investigate the application of PFNs to mixed-variable BO, a particularly challenging setting due to the interplay between continuous and discrete inputs and the combinatorial complexity of the search space. We evaluate how PFNs perform when integrated with a range of mixed-variable BO strategies, including various encoding schemes and discrete-aware acquisition optimization. Additionally, we explore how fine-tuning PFNs on targeted function priors can enhance performance when prior knowledge about the objective is available. Our contributions include empirical evaluations of mixed-variable BO techniques, insights into PFN training, and a suite of mixed-variable benchmark problems.
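To make the abstract's ingredients concrete, the sketch below shows a minimal mixed-variable BO loop with the two pieces the thesis evaluates: a one-hot encoding scheme for the categorical input and discrete-aware acquisition optimization (enumerating categories while searching the continuous dimension). The objective, the category names, and the surrogate are all illustrative inventions for this sketch; in particular, a toy nearest-neighbour predictor stands in for the PFN's posterior predictive distribution, since the actual PFN is a trained transformer not reproduced here.

```python
import math

# Hypothetical mixed-variable objective: one categorical and one
# continuous input (names and form are illustrative, not from the thesis).
CATEGORIES = ["a", "b", "c"]

def objective(cat, x):
    offset = {"a": 0.0, "b": 0.5, "c": -0.3}[cat]
    return math.sin(3 * x) + offset

def encode(cat, x):
    # One-hot encode the categorical and append the continuous value --
    # the simplest encoding scheme for mixed-variable inputs.
    return [1.0 if c == cat else 0.0 for c in CATEGORIES] + [x]

def surrogate(history, z):
    # Toy stand-in for the PFN: nearest-neighbour mean with a
    # distance-inflated predictive std, returned as (mu, sigma).
    if not history:
        return 0.0, 1.0
    d, y = min((sum((a - b) ** 2 for a, b in zip(ze, z)), ye)
               for ze, ye in history)
    return y, 0.2 + math.sqrt(d)

def expected_improvement(mu, sigma, best):
    # Closed-form EI for a Gaussian predictive distribution (maximization).
    if sigma <= 0:
        return max(mu - best, 0.0)
    u = (mu - best) / sigma
    cdf = 0.5 * (1 + math.erf(u / math.sqrt(2)))
    pdf = math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
    return sigma * (u * cdf + pdf)

def propose(history, best, grid=21):
    # Discrete-aware acquisition optimization: enumerate every category,
    # grid-search the continuous dimension within each branch.
    cands = [(c, i / (grid - 1)) for c in CATEGORIES for i in range(grid)]
    return max(cands, key=lambda cx: expected_improvement(
        *surrogate(history, encode(*cx)), best))

history, best = [], -math.inf
for _ in range(15):
    cat, x = propose(history, best)
    y = objective(cat, x)
    history.append((encode(cat, x), y))
    best = max(best, y)
```

The design point worth noting is in `propose`: rather than relaxing the categorical variable into a continuous box, the acquisition is maximized exactly over the discrete choices, which is feasible whenever the number of categorical combinations is small. A real PFN surrogate would replace `surrogate` with a forward pass that conditions on `history` in-context.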
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology