Program Synthesis over Noisy Data
Author(s)
Handa, Shivam
Advisor
Rinard, Martin
Abstract
I present a new framework and associated synthesis algorithms for program synthesis over noisy data, i.e., data that may contain incorrect or corrupted input-output examples. I model the process that produced the noisy dataset in two steps: inputs and a hidden program are selected from an input source and a program source, and a noise source is then applied to the correct outputs of the hidden program to obtain the noisy dataset. This model makes it possible to formulate noisy program synthesis as an optimization problem over the loss of a candidate program on the noisy dataset and the complexity of the candidate program.
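The optimization view described above can be illustrated with a small sketch. This is not the thesis's algorithm: the candidate programs, their complexity scores, the 0/1 loss, the `weight` parameter, and the way loss and complexity are combined into one score are all assumptions chosen for illustration.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]                    # (input, possibly corrupted output)
Candidate = Tuple[str, Callable[[str], str], int]  # (name, program, complexity)

def first_word(s: str) -> str:
    parts = s.split()
    return parts[0] if parts else ""

# Hypothetical candidate space with hand-assigned complexity scores.
CANDIDATES: List[Candidate] = [
    ("identity", lambda s: s, 1),
    ("upper", lambda s: s.upper(), 2),
    ("first_word", first_word, 3),
    ("first_word_upper", lambda s: first_word(s).upper(), 4),
]

def zero_one_loss(program: Callable[[str], str], data: List[Example]) -> int:
    # Count the examples on which the program disagrees with the dataset.
    return sum(1 for x, y in data if program(x) != y)

def synthesize(candidates: List[Candidate], data: List[Example],
               weight: float = 0.1) -> Candidate:
    # Assumed objective: loss on the noisy data plus a weighted complexity term.
    def objective(c: Candidate) -> float:
        _, program, complexity = c
        return zero_one_loss(program, data) + weight * complexity
    return min(candidates, key=objective)

noisy_data: List[Example] = [
    ("hello world", "HELLO"),
    ("noisy data", "NOISY"),
    ("program synthesis", "PROGRAM"),
    ("tree automaton", "tree"),              # corrupted output
]

best_name = synthesize(CANDIDATES, noisy_data)[0]
print(best_name)
```

In this toy setting no candidate fits every example, so an exact-match synthesizer would fail, while the loss-minimizing search still returns the program that explains all the uncorrupted examples.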
I present a noisy program synthesis algorithm based on finite tree automata. Results from an implemented system running this algorithm on problems from the SyGuS 2018 benchmark suite highlight the algorithm’s ability to successfully synthesize programs in the face of noisy data.
I extend the noisy program synthesis framework to formally define the concepts of an optimal loss function and the convergence of a program synthesis algorithm to a correct program. Working with these concepts, I present optimal loss functions and convergence results for a wide range of program synthesis problems in the text manipulation domain, including results that characterize optimality and convergence properties of noise sources and loss functions used in experiments with the implemented synthesis algorithm. These results provide insight into the reasons for the success of the presented technique and can help enable the development of effective loss functions and noisy program synthesis algorithms in a range of contexts.
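To see why the choice of loss function matters for text manipulation, consider the following sketch. It is illustrative only and does not reproduce the thesis's definitions: the two loss functions (a 0/1 loss and a Levenshtein edit-distance loss), the two candidate programs, and the dataset, in which every output has one corrupted character, are all assumptions.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input, corrupted output)

def edit_distance(a: str, b: str) -> int:
    # Standard Levenshtein distance computed one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete ca
                            curr[j - 1] + 1,           # insert cb
                            prev[j - 1] + (ca != cb))) # substitute or match
        prev = curr
    return prev[-1]

def zero_one_loss(program: Callable[[str], str], data: List[Example]) -> int:
    return sum(1 for x, y in data if program(x) != y)

def edit_loss(program: Callable[[str], str], data: List[Example]) -> int:
    return sum(edit_distance(program(x), y) for x, y in data)

def intended(s: str) -> str:
    # Hypothetical hidden program: uppercase the first word.
    return s.split()[0].upper()

def wrong(s: str) -> str:
    # Hypothetical competing candidate: uppercase the whole string.
    return s.upper()

# Every output has exactly one corrupted character.
data: List[Example] = [
    ("hello world", "HELLX"),
    ("noisy data", "NOISZ"),
    ("program synthesis", "PROGRAN"),
]

print(zero_one_loss(intended, data), zero_one_loss(wrong, data))
print(edit_loss(intended, data), edit_loss(wrong, data))
```

Under the 0/1 loss both candidates miss every example and are indistinguishable, whereas the edit-distance loss assigns the intended program a strictly smaller loss. This is one informal way to picture how a loss function suited to the noise source can let a synthesizer recover the correct program.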
I also present a new noisy program synthesis algorithm that uses an optimization process based on abstraction refinement to synthesize programs. The presented experimental results demonstrate the significant performance improvements that this new technique can deliver. Building on this abstraction refinement technique, I present new noisy program synthesis algorithms that can work with both noisy inputs and noisy outputs, as well as with domain-specific languages that include infinite sets of constants.
Date issued
2022-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology