Improving the Programmability of A Distributed Hardware Accelerator

Shwatal, Nathan A.

Author(s)

Shwatal, Nathan A.

Advisor

Sanchez, Daniel

Terms of use

Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0). Copyright retained by author(s). https://creativecommons.org/licenses/by-nc-nd/4.0/

Abstract

Sparse iterative matrix algorithms are critical to many scientific and engineering workloads, yet they perform poorly on conventional hardware. Ōmeteōtl, a new hardware accelerator with a distributed-memory, task-based execution model, aims to address these performance bottlenecks. However, programming Ōmeteōtl is low-level, error-prone, and far removed from the simplicity of typical iterative formulations. This thesis presents Lapis, a domain-specific language and compiler that lets users express sparse matrix algorithms in high-level Python code and automatically generates efficient C++ code for Ōmeteōtl. Lapis abstracts away data partitioning and task orchestration, reducing implementation complexity: for example, it cuts lines of code by 30× for conjugate gradients and 46× for power iteration. Despite this abstraction, generated code achieves 75.7% to 92.6% of the performance of manually written implementations across several benchmarks.
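
For readers unfamiliar with the "typical iterative formulations" the abstract refers to, the following is a minimal sketch of power iteration on a sparse matrix in plain Python. It is not Lapis syntax and is not taken from the thesis; the function names, the CSR (compressed sparse row) layout, and the small example matrix are assumptions chosen only to illustrate how compact the high-level form of such an algorithm can be.

```python
import math

def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR-format sparse matrix by a dense vector x."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        # Nonzeros of this row live in data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y

def power_iteration(indptr, indices, data, iters=200):
    """Estimate the dominant eigenvalue and eigenvector (illustrative helper)."""
    n = len(indptr) - 1
    x = [1.0 / math.sqrt(n)] * n  # unit-norm starting vector
    for _ in range(iters):
        y = csr_matvec(indptr, indices, data, x)
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            break  # x lies in the null space; nothing more to do
        x = [v / norm for v in y]
    # Rayleigh quotient x^T A x; x has unit norm, so no division is needed.
    ax = csr_matvec(indptr, indices, data, x)
    eigenvalue = sum(a * b for a, b in zip(x, ax))
    return eigenvalue, x

# Hypothetical 3x3 example: [[2, 1, 0], [1, 2, 0], [0, 0, 1]] stored in CSR.
indptr = [0, 2, 4, 5]
indices = [0, 1, 0, 1, 2]
data = [2.0, 1.0, 1.0, 2.0, 1.0]
eigenvalue, vector = power_iteration(indptr, indices, data)
```

The whole algorithm is one sparse matrix-vector product and one normalization per step. A hand-written version for a distributed-memory accelerator would additionally have to partition the matrix and vectors and orchestrate tasks explicitly, which is the work the abstract says Lapis abstracts away.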

Date issued

2025-05

URI

https://hdl.handle.net/1721.1/162938

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses