LEO: an LLM-Powered EDA Overview

Zheng, Sophia

Author(s)

Zheng, Sophia

DownloadThesis PDF (2.114Mb)

Advisor

Satyanarayan, Arvind

Terms of use

In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/

Metadata

Show full item record

Abstract

Computational notebooks impose a linear structure that impedes data analysts’ sensemaking process with overwritten cells, dead-end code, and fragmented logic. This challenge is especially pronounced when analysts either encounter a notebook authored by someone else or revisit a self-authored notebook after significant time has passed. In both cases, understanding the analysis code becomes convoluted and laborious. To address these barriers, we introduce LEO, a computational notebook tool that operationalizes notebook summarization by leveraging large language models to (1) cluster analysis patterns and (2) trace variable use. LEO organizes code into a two-level hierarchy–General Level Sections and Code Level Actions—integrated with in-line textual summaries filtered on the variable-level, further supporting task-driven exploration. We evaluate the system’s effectiveness in a user study with five computational notebook users across two realistic use cases. Participants reported that LEO streamlined code comprehension and navigation of undocumented notebooks by allowing them to query variables and traverse code cells with greater ease.

Date issued

2025-05

URI

https://hdl.handle.net/1721.1/162950

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses