LEO: an LLM-Powered EDA Overview
Author(s)
Zheng, Sophia
Advisor
Satyanarayan, Arvind
Abstract
Computational notebooks impose a linear structure that impedes data analysts’ sensemaking process with overwritten cells, dead-end code, and fragmented logic. This challenge is especially pronounced when analysts either encounter a notebook authored by someone else or revisit a self-authored notebook after significant time has passed. In both cases, understanding the analysis code becomes convoluted and laborious. To address these barriers, we introduce LEO, a computational notebook tool that operationalizes notebook summarization by leveraging large language models to (1) cluster analysis patterns and (2) trace variable use. LEO organizes code into a two-level hierarchy (General Level Sections and Code Level Actions) integrated with in-line textual summaries that can be filtered at the variable level, further supporting task-driven exploration. We evaluate the system’s effectiveness in a user study with five computational notebook users across two realistic use cases. Participants reported that LEO streamlined code comprehension and navigation of undocumented notebooks by allowing them to query variables and traverse code cells with greater ease.
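To make the idea of tracing variable use across cells concrete, here is a minimal sketch. It is not LEO's implementation: the abstract says LEO relies on large language models, whereas this sketch swaps in plain static analysis with Python's standard `ast` module. The function name `trace_variables` and the sample cells are hypothetical and exist only for illustration.

```python
import ast
from collections import defaultdict


def trace_variables(cells):
    """Map each name to the indices of the cells that assign it or read it.

    `cells` is a list of source strings, one per notebook code cell.
    This is a static-analysis stand-in, not the LLM-based approach
    described in the abstract.
    """
    trace = defaultdict(lambda: {"assigned": [], "read": []})
    for index, source in enumerate(cells):
        for node in ast.walk(ast.parse(source)):
            if not isinstance(node, ast.Name):
                continue
            if isinstance(node.ctx, ast.Store):
                kind = "assigned"
            elif isinstance(node.ctx, ast.Load):
                kind = "read"
            else:
                continue  # ignore `del name`
            if index not in trace[node.id][kind]:
                trace[node.id][kind].append(index)
    return dict(trace)


# Hypothetical notebook cells, including an overwritten variable (`df`).
cells = [
    "df = load()",
    "clean = df.dropna()",
    "df = clean.head(10)",
]
trace = trace_variables(cells)
print(trace["df"])  # {'assigned': [0, 2], 'read': [1]}
```

Querying `trace["df"]` shows that `df` is assigned in cells 0 and 2 and read in cell 1, which is the kind of overwritten-cell pattern the abstract identifies as a barrier to comprehension.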
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology