Systematic Modeling and Design of Sparse Deep Neural Network Accelerators
Author(s)
Wu, Yannan
Advisor
Emer, Joel S.
Sze, Vivienne
Abstract
Sparse deep neural networks (DNNs) are an important computation kernel in many data- and computation-intensive applications (e.g., image classification, speech recognition, and language processing). The sparsity in such kernels has motivated the development of many sparse DNN accelerators. However, despite the abundance of existing proposals, there has been no systematic way to understand, model, and develop the various sparse DNN accelerators.
To address these limitations, this thesis first presents a taxonomy of sparsity-related acceleration features to allow a systematic understanding of the sparse DNN accelerator design space. Based on the taxonomy, it proposes Sparseloop, the first analytical modeling tool for fast, accurate, and flexible evaluation of sparse DNN accelerators, enabling early-stage exploration of the large and diverse sparse DNN accelerator design space. Across representative accelerator designs and workloads, Sparseloop models over 2000× faster than cycle-level simulations while maintaining relative performance trends and achieving ≤ 8% average modeling error.
Employing Sparseloop, this thesis studies the design space and presents HighLight, an efficient and flexible sparse DNN accelerator. Specifically, HighLight accelerates DNNs with a novel sparsity pattern, called hierarchical structured sparsity, based on the key insight that diverse degrees of sparsity (including dense) can be accelerated efficiently by composing them hierarchically from simple sparsity patterns. Compared to prior work, HighLight achieves a geomean of up to 6.4× better energy-delay product (EDP) across workloads with diverse sparsity degrees, and always sits on the EDP-accuracy Pareto frontier for representative DNNs.
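To make the composition idea concrete, the sketch below builds an illustrative hierarchical sparsity mask: an outer G:H pattern that keeps G of every H blocks, composed with an inner N:M pattern that keeps N of every M elements inside each kept block. The function and parameter names are hypothetical illustrations of the general concept, not code or notation from the thesis.

```python
import numpy as np

def hierarchical_mask(length, outer=(1, 2), inner=(2, 4), block=4, seed=0):
    """Illustrative hierarchical structured sparsity mask (hypothetical
    helper, not from the thesis): an outer G:H pattern over blocks is
    composed with an inner N:M pattern over elements in each kept block."""
    g, h = outer
    n, m = inner
    rng = np.random.default_rng(seed)
    assert length % (block * h) == 0 and block % m == 0
    mask = np.zeros(length, dtype=bool)
    for grp in range(length // (block * h)):
        # Outer level: keep g of every h consecutive blocks.
        kept_blocks = rng.choice(h, size=g, replace=False)
        for b in kept_blocks:
            start = (grp * h + b) * block
            # Inner level: keep n of every m elements within a kept block.
            for seg in range(block // m):
                kept = rng.choice(m, size=n, replace=False)
                mask[start + seg * m + kept] = True
    return mask

mask = hierarchical_mask(16)
# Overall density is the product of the per-level densities:
# (g/h) * (n/m) = (1/2) * (2/4) = 25%
print(mask.sum() / mask.size)  # prints 0.25
```

Because the overall density is the product of simple per-level ratios, sweeping the outer and inner patterns covers many sparsity degrees (down to fully dense with 1:1 at both levels) while the hardware only ever has to handle the two simple component patterns.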
Date issued
2023-06
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology