Programming with Neural Surrogates of Programs
Author: Renda, Alex
Advisor: Carbin, Michael
Abstract
Surrogate programming, the act of replacing programs with surrogate models of their behavior, is increasingly used to solve software development challenges. Surrogates are typically machine learning models trained on input-output examples of the program under consideration. With surrogate compilation, programmers train a surrogate that replicates the behavior of the original program and deploy it to end users in the program's place, with the goal of improving performance. With surrogate adaptation, programmers first train a surrogate of a program, then continue to train the surrogate on a downstream task, with the goal of improving the surrogate's accuracy on that task. With surrogate optimization, programmers train a surrogate of a program, then use the surrogate to optimize the program's inputs, with the goal of optimizing inputs more efficiently than is possible with the original program. These emerging design patterns represent an important new frontier of software development. However, we lack a coherent understanding of the applications and methodology underlying surrogate programming.
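To make the surrogate optimization pattern concrete, the following minimal sketch trains a surrogate of a toy black-box `program` (a hypothetical stand-in for something expensive like a CPU simulator, not an example from the thesis) on input-output samples, then optimizes the program's input by gradient descent on the cheap, differentiable surrogate. The quadratic model, sampling range, and step sizes are all illustrative assumptions.

```python
import numpy as np

# Hypothetical black-box program whose input we want to tune. We can
# only evaluate it, not differentiate it (true minimizer is x = 3).
def program(x):
    return (x - 3.0) ** 2 + 1.0

# Step 1: collect input-output examples of the program.
rng = np.random.default_rng(0)
xs = rng.uniform(-5.0, 10.0, size=200)
ys = np.array([program(x) for x in xs])

# Step 2: train a surrogate replicating the program's behavior
# (here a least-squares quadratic; in the thesis, a neural network).
a, b, c = np.polyfit(xs, ys, deg=2)

# Step 3 (surrogate optimization): optimize the *input* by gradient
# descent on the surrogate instead of the original program.
x_opt = 0.0
for _ in range(500):
    grad = 2 * a * x_opt + b  # d/dx of the surrogate a*x^2 + b*x + c
    x_opt -= 0.1 * grad

print(f"{x_opt:.2f}")  # converges near 3.0, the program's minimizer
```

The payoff in practice is that step 3 never calls the original program: once the surrogate is trained, input optimization runs against the model alone.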
In this thesis I investigate three hypotheses about surrogate programming: that surrogate programming can be used to achieve state-of-the-art results on large-scale programming tasks; that there is a small set of methodologically distinct design patterns that can be grouped into a single programming methodology, unifying existing uses of surrogates in the literature; and that we can guide surrogate design using facts derived from the modeled program to train surrogates more efficiently and achieve better performance on downstream tasks.
To support these hypotheses, I present four sets of contributions. I first present DiffTune, a surrogate-optimization-based approach to tuning the parameters of a large-scale CPU simulator. I next generalize this approach to identify the three design patterns above and lay out the common methodology underlying them. I then present Turaco, a program analysis that allows developers to reason about the training data distribution to use when training a surrogate of a given program. I conclude with Renamer, a neural network architecture that mirrors source programs' invariance to variable renaming in their inputs.
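The renaming invariance that Renamer targets can be illustrated without any neural machinery: two programs that differ only in variable names should map to the same representation. The sketch below (an illustrative alpha-renaming canonicalization, not the Renamer architecture itself, which builds the invariance into the network) renames identifiers by order of first occurrence; the token format and keyword set are assumptions for the example.

```python
import re

# Keywords are left untouched; everything else matching an identifier
# pattern is renamed. This tiny set is an assumption for the example.
KEYWORDS = {"return"}

def canonicalize(tokens):
    """Rename identifiers to v0, v1, ... by order of first occurrence,
    so alpha-equivalent token streams become identical."""
    mapping = {}
    out = []
    for tok in tokens:
        if re.fullmatch(r"[a-zA-Z_]\w*", tok) and tok not in KEYWORDS:
            if tok not in mapping:
                mapping[tok] = f"v{len(mapping)}"
            out.append(mapping[tok])
        else:
            out.append(tok)
    return out

# Two programs identical up to variable renaming.
p1 = ["x", "=", "y", "+", "1", ";", "return", "x"]
p2 = ["a", "=", "b", "+", "1", ";", "return", "a"]
print(canonicalize(p1) == canonicalize(p2))  # True: alpha-equivalent
```

A model fed canonicalized tokens is trivially invariant to renaming; Renamer instead aims for the same guarantee at the architecture level, without a preprocessing pass.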
Surrogate programming has the potential to change how developers program large-scale computer systems by delegating much of their complexity to machine learning algorithms. Together, the contributions in my thesis lay the groundwork for a principled understanding of the applications and methodology of surrogate programming.
Date issued: 2024-05
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher: Massachusetts Institute of Technology