Code Summarization and Program Synthesis with Large Language Models
Author(s)
Lam, Kelly
Advisor
Cafarella, Michael
Abstract
Automatic source code summarization and generation are naturally complementary operations because they bridge the gap between natural-language text and executable programs, allowing users to move between the two modes. Although large language models have become increasingly popular, it is unclear how effective they are at code summarization and generation, especially for longer source code segments or more complicated generation prompts. In this thesis, we formalize the automatic code summarization and generation problems, identify cases where large language models perform poorly, propose techniques to correct the initial poor results, and evaluate our results against appropriate baselines using suitable evaluation metrics.
Date issued
2024-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology