Using existing knowledge for transfer and regularization for program synthesis with genetic programming
Author(s)
Wick, Jordan (Jordan M.)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Erik Hemberg and Una-May O'Reilly.
Abstract
In standard Genetic Programming (GP), test case performance is the only signal the population has to improve on. Human programmers, however, use other signals to guide them: they know what "good" code looks like. They often reuse program patterns across multiple functions, showing Transferability of knowledge between programming tasks. Pieces of code also become more or less likely in different contexts; for example, for loops nested four layers deep are rare in codebases written by humans. In this thesis, Transferability is explored in the context of Grammatical Evolution (GE), looking at how a population of programs being optimized to solve one problem can be used to aid in solving a similar problem. To do this, methods were created to parse existing solutions from a codebase into the GE representation, and operators were implemented that switch the GE objective during the process of evolution. A "humanlike" objective was defined which takes into account the distribution of AST nodes within different program contexts. This was used as Regularization during GE, in that programs that strayed further from the humanlike distribution of nodes received a penalty. It was found that optimizing for one problem first can make it easier to find a solution to other, similar problems, especially when the solution to one problem is used in the other. However, the amount of pre-optimization and the choice of problem are of great importance. Additionally, optimizing directly for code to become more "humanlike" via the defined measure was not effective in allowing the population to solve test cases more efficiently, although selecting directly for these metrics did change their distribution in the resulting populations. This shows that while surrogate objectives can improve performance, they need to be chosen carefully.
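The regularization idea in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not say how "program context" is defined or how the penalty is computed. The sketch assumes that a node's context is simply its parent node type in Python's `ast` module, and that the penalty is the average smoothed negative log-likelihood of a candidate's (parent, child) pairs under counts gathered from reference code. The function names, the `smoothing` parameter, and the `weight` parameter are all hypothetical.

```python
import ast
import math
from collections import Counter


def node_context_counts(source: str) -> Counter:
    """Count (parent node type, child node type) pairs in Python source."""
    counts = Counter()
    for parent in ast.walk(ast.parse(source)):
        for child in ast.iter_child_nodes(parent):
            counts[(type(parent).__name__, type(child).__name__)] += 1
    return counts


def humanlike_penalty(candidate: str, reference: Counter,
                      smoothing: float = 1.0) -> float:
    """Average negative log-likelihood of the candidate's (parent, child)
    pairs under the reference counts. Higher means less like the reference.
    Assumption: this is one plausible measure, not the one from the thesis."""
    cand = node_context_counts(candidate)
    total_pairs = sum(cand.values())
    if total_pairs == 0:
        return 0.0

    # How often each parent type appears as a context in the reference code.
    parent_totals = Counter()
    for (parent, _child), n in reference.items():
        parent_totals[parent] += n

    # Child vocabulary for additive smoothing, so unseen pairs get p > 0.
    vocab = len({c for _p, c in reference} | {c for _p, c in cand})

    nll = 0.0
    for (parent, child), n in cand.items():
        p = (reference[(parent, child)] + smoothing) / (
            parent_totals[parent] + smoothing * vocab)
        nll -= n * math.log(p)
    return nll / total_pairs


def regularized_fitness(test_error: float, candidate: str,
                        reference: Counter, weight: float = 0.1) -> float:
    """Fitness to minimize: test-case error plus a weighted humanlike penalty."""
    return test_error + weight * humanlike_penalty(candidate, reference)


if __name__ == "__main__":
    human_code = (
        "def f(xs):\n"
        "    total = 0\n"
        "    for x in xs:\n"
        "        total += x\n"
        "    return total\n"
    )
    reference_counts = node_context_counts(human_code)
    print(regularized_fitness(0.0, human_code, reference_counts))
```

In a GE loop, `reference_counts` would be built once from an existing codebase, and `regularized_fitness` would stand in for the plain test-case error when ranking individuals. The `weight` value controls how strongly the surrogate objective competes with test performance, which is the trade-off the abstract warns has to be chosen carefully.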
Description
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 81-84).
Date issued
2020
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.