Program Inference and Regeneration via Active Learning
Author(s)
Shen, Jiasi
DownloadThesis PDF (3.118Mb)
Advisor
Rinard, Martin C.
Terms of use
Metadata
Show full item recordAbstract
Software now plays a central role in numerous aspects of human society. Current software development practices involve significant developer effort in all phases of the software life cycle, including the development of new software, detection and elimination of defects and security vulnerabilities in existing software, maintenance of legacy software, and integration of existing software into more contexts, with the quality of the resulting software still leaving much to be desired. The goal of my research is to improve software quality and reduce costs by automating tasks that currently require substantial manual engineering effort.
I present a novel approach for program inference and regeneration, which takes an existing program, learns its core functionality as a black box, builds a model that captures this functionality, and uses the model to generate a new program. The new program delivers the same core functionality but is potentially augmented or transformed to eliminate defects, systematically introduce safety or security checks, or operate successfully in different environments.
This research enables the rejuvenation and retargeting of existing software and provides a powerful way for developers to express program functionality that adapts flexibly to a variety of contexts. For instance, one benefit is enabling new development methodologies that work with simple prototype implementations as specifications, then use regeneration to automatically obtain clean, efficient, and secure implementations. Another benefit is automatically improving program comprehension and producing cleaner code, making the code more transparent and the developers more productive. A third benefit is automatically extracting the human knowledge crystallized and encapsulated in legacy software systems and retargeting it to new languages and platforms, including languages and platforms that provide more powerful features.
In this thesis, I present two systems that implement this approach for database-backed programs.
Date issued
2022-09Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer SciencePublisher
Massachusetts Institute of Technology