Spatial Accelerator Generation and Optimization for Tensor Applications

Author(s)

Zhang, Zhekai

Advisor

Han, Song

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Modern foundation models and generative AI applications require multiple input modalities (both vision and language), which increases the demand for flexible accelerator architectures. Existing frameworks trade design flexibility against the productivity of RTL generation: they are either limited to a handful of hand-written templates or cannot generate RTL automatically. To address this challenge, we propose the LEGO framework, which automatically generates and optimizes spatial architecture designs in the front end and outputs synthesizable RTL code in the back end without RTL templates. The LEGO front end finds all possible interconnections between function units and determines the shape of the memory system by solving integer linear equations. It then establishes the connections using a minimum-spanning-tree-based algorithm, together with a breadth-first-search-based heuristic algorithm for merging different spatial dataflow designs. The LEGO back end then translates the hardware into a primitive-level graph to perform lower-level optimizations, and applies a set of linear-programming algorithms to optimally insert pipeline registers and to reduce the overhead of unused logic when switching between spatial dataflows. Our evaluation demonstrates that LEGO achieves a 3.2× speedup and 2.4× better energy efficiency than the prior work Gemmini, and can generate a single architecture for diverse modern foundation models in generative AI applications.
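
As a rough illustration of the minimum-spanning-tree idea mentioned in the abstract, the sketch below runs a textbook Kruskal algorithm over a small set of candidate interconnections between function units. This is not the LEGO implementation: the function name, the integer unit indices, the edge costs, and the example graph are all hypothetical, and the thesis's actual cost model and graph construction are not described on this page.

```python
from typing import List, Tuple

# Hypothetical representation: (cost, unit_a, unit_b), where units are
# integer indices of function units and cost is an assumed wiring cost.
Edge = Tuple[int, int, int]


def minimum_spanning_tree(num_units: int, candidates: List[Edge]) -> List[Edge]:
    """Textbook Kruskal: pick the cheapest candidate links that connect all units."""
    parent = list(range(num_units))  # union-find forest, one root per unit

    def find(x: int) -> int:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen: List[Edge] = []
    for cost, a, b in sorted(candidates):  # cheapest candidates first
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the link joins two separate groups
            parent[root_a] = root_b
            chosen.append((cost, a, b))
    return chosen


# Made-up example: four function units and five candidate interconnections.
candidates = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (3, 2, 3), (5, 1, 3)]
tree = minimum_spanning_tree(4, candidates)
total_cost = sum(cost for cost, _, _ in tree)
print(tree, total_cost)
```

In this toy example the three cheapest non-redundant links are kept and the two links that would close a cycle are dropped, which is the general property that makes a spanning tree a natural starting point for connecting units with few wires.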

Date issued

2023-09

URI

https://hdl.handle.net/1721.1/152655

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses