LoopTree: Enabling Systematic and Flexible Exploration of Fused-layer Dataflow Accelerators

Author(s)

Gilbert, Michael

Advisor

Sze, Vivienne

Emer, Joel

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Deep neural network (DNN) accelerators exploit data reuse to reduce memory traffic. Typically, they exploit reuse within a layer, but there is also reuse between layers. To exploit this inter-layer reuse, fused-layer dataflow accelerators tile the intermediate data passed between layers and buffer it on-chip, which captures the reuse while keeping the buffer small. To reduce buffer space requirements further, some fused-layer dataflows leave part of each tile unbuffered, at the cost of recomputing the unbuffered data. The design space of fused-layer dataflow accelerators is large, but prior work considers only a subset of it. Prior work is limited in four ways: (1) tiling only in certain dimensions, leaving some designs unexplored; (2) offering limited reuse/recompute choices that are applied uniformly to all layers, leading to increased recomputation; (3) not exploring the interaction between tiling and reuse/recompute choices; and (4) applying the same design choices to all layers in the DNN even though diverse layer shapes call for different choices. To address these limitations, we propose (1) a more extensive design space, (2) a taxonomy that introduces structure into the design space, and (3) a fast, flexible, analytical model, called LoopTree, to evaluate the latency, energy consumption, buffer space requirements, and bandwidth requirements of designs in this design space. Finally, we present case studies enabled by LoopTree that show how exploring this larger space results in designs that require less buffer space (e.g., up to 7.6× buffer space reduction for the same off-chip transfers).
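The trade-off between retaining and recomputing intermediate data can be illustrated with a toy count. The sketch below is not LoopTree and is not taken from the thesis; it assumes a hypothetical chain of two 1D convolution layers in which the second layer has stride 1 and no padding, and it only counts intermediate elements. The names `FusedPair` and `tile_stats` are invented for this illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class FusedPair:
    """Hypothetical two-layer 1D chain; layer 2 has stride 1 and no padding."""
    n_out: int  # number of final outputs produced by layer 2
    k2: int     # kernel width of layer 2

    @property
    def intermediate_len(self) -> int:
        # Layer 2 reads n_out + k2 - 1 intermediate elements in total.
        return self.n_out + self.k2 - 1


def tile_stats(pair: FusedPair, tile: int, retain_overlap: bool) -> dict:
    """Count intermediate-data buffering and compute for one tiling choice."""
    if tile < 1 or pair.n_out < 1 or pair.k2 < 1:
        raise ValueError("tile, n_out, and k2 must be positive")
    tile = min(tile, pair.n_out)
    num_tiles = ceil(pair.n_out / tile)
    halo = pair.k2 - 1  # intermediate elements shared by adjacent tiles

    if retain_overlap:
        # The shared elements stay buffered, so each one is computed once.
        computed = pair.intermediate_len
        carried = halo if num_tiles > 1 else 0
    else:
        # Nothing is kept across tiles, so each tile recomputes its halo.
        sizes = [min(tile, pair.n_out - t * tile) for t in range(num_tiles)]
        computed = sum(size + halo for size in sizes)
        carried = 0

    return {
        "peak_tile_elems": tile + halo,      # live while one tile is processed
        "carried_between_tiles": carried,    # held across a tile boundary
        "computed": computed,                # intermediate elements produced
        "recomputed": computed - pair.intermediate_len,
    }


if __name__ == "__main__":
    pair = FusedPair(n_out=12, k2=3)
    print("unfused intermediate elements:", pair.intermediate_len)
    for retain in (True, False):
        print("retain_overlap =", retain,
              tile_stats(pair, tile=4, retain_overlap=retain))
```

In this toy setting, running the layers one after another would materialize all 14 intermediate elements, while a fused tile of 4 outputs keeps at most 6 live at a time. Dropping the 2 shared elements between tiles removes the need to carry them across tile boundaries but recomputes 4 elements over the whole run. Real designs tile several dimensions of multi-dimensional tensors, which is where the choices interact in the ways the abstract describes.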

Date issued

2023-09

URI

https://hdl.handle.net/1721.1/152743

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses