dc.contributor.author  Wu, Yannan
dc.contributor.author  Tsai, Po-An
dc.contributor.author  Muralidharan, Saurav
dc.contributor.author  Parashar, Angshuman
dc.contributor.author  Sze, Vivienne
dc.contributor.author  Emer, Joel
dc.date.accessioned  2024-01-04T13:48:13Z
dc.date.available  2024-01-04T13:48:13Z
dc.date.issued  2023-10-28
dc.identifier.isbn  979-8-4007-0329-4
dc.identifier.uri  https://hdl.handle.net/1721.1/153277
dc.description.abstract  Due to complex interactions among various deep neural network (DNN) optimization techniques, modern DNNs can have weights and activations that are dense or sparse with diverse sparsity degrees. To offer a good trade-off between accuracy and hardware performance, an ideal DNN accelerator should have high flexibility to efficiently translate DNN sparsity into reductions in energy and/or latency without incurring significant complexity overhead. This paper introduces hierarchical structured sparsity (HSS), with the key insight that we can systematically represent diverse sparsity degrees by having them hierarchically composed from multiple simple sparsity patterns. As a result, HSS simplifies the underlying hardware since it only needs to support simple sparsity patterns; this significantly reduces the sparsity acceleration overhead, which improves efficiency. Motivated by such opportunities, we propose a simultaneously efficient and flexible accelerator, named HighLight, to accelerate DNNs that have diverse sparsity degrees (including dense). Due to the flexibility of HSS, different HSS patterns can be introduced to DNNs to meet different applications’ accuracy requirements. Compared to existing works, HighLight achieves a geomean of up to 6.4× better energy-delay product (EDP) across workloads with diverse sparsity degrees, and always sits on the EDP-accuracy Pareto frontier for representative DNNs.  en_US
dc.publisher  ACM|56th Annual IEEE/ACM International Symposium on Microarchitecture  en_US
dc.relation.isversionof  https://doi.org/10.1145/3613424.3623786  en_US
dc.rights  Creative Commons Attribution  en_US
dc.rights.uri  https://creativecommons.org/licenses/by/4.0/  en_US
dc.title  HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity  en_US
dc.type  Article  en_US
dc.identifier.citation  Wu, Yannan, Tsai, Po-An, Muralidharan, Saurav, Parashar, Angshuman, Sze, Vivienne et al. 2023. "HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity."
dc.contributor.department  Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.mitlicense  PUBLISHER_CC
dc.eprint.version  Final published version  en_US
dc.type.uri  http://purl.org/eprint/type/ConferencePaper  en_US
eprint.status  http://purl.org/eprint/status/NonPeerReviewed  en_US
dc.date.updated  2024-01-01T08:48:35Z
dc.language.rfc3066  en
dc.rights.holder  The author(s)
dspace.date.submission  2024-01-01T08:48:36Z
mit.license  PUBLISHER_CC
mit.metadata.status  Authority Work and Publication Information Needed  en_US
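
The abstract above describes the core idea of hierarchical structured sparsity (HSS): diverse overall sparsity degrees are expressed by composing multiple simple per-level sparsity patterns, so the hardware only has to support the simple patterns. The following is a minimal NumPy sketch of that composition idea only, not the paper's algorithm or interface; the function name, the (keep, group) parameterization, and the random choice of which units to keep are assumptions made for this sketch (real pruning would keep, e.g., the largest-magnitude values).

# Illustrative sketch of hierarchical structured sparsity: a 1-D weight
# vector is pruned by a stack of simple "keep K of every G" patterns, and
# the overall density is the product of the per-level densities.
# (Assumed parameterization for illustration; not the paper's API.)
import numpy as np

def hss_mask(length, levels, rng):
    """Build a 1-D keep-mask by composing simple structured patterns.

    levels: list of (keep, group) pairs, applied coarse to fine.
            e.g. [(2, 4), (1, 2)] keeps 2 of every 4 coarse blocks, then
            1 of every 2 size-2 sub-blocks inside each surviving block.
    Assumes length is divisible by the product of the group sizes.
    """
    mask = np.ones(length, dtype=bool)
    block = length
    for keep, group in levels:
        block //= group                       # unit size at this level
        units = mask.reshape(-1, group, block)
        for row in units:
            # Randomly pick which units to drop (illustration only).
            drop = rng.choice(group, size=group - keep, replace=False)
            row[drop] = False
        mask = units.reshape(length)
    return mask

rng = np.random.default_rng(0)
w = rng.standard_normal(16)
m = hss_mask(16, levels=[(2, 4), (1, 2)], rng=rng)
print(m.sum() / m.size)   # 0.25 = (2/4) * (1/2): per-level densities multiply
print(w * m)              # weights after applying the hierarchical mask

Running this yields an overall density of 0.25, i.e. (2/4) x (1/2): because the per-level densities multiply, a small menu of simple patterns can cover a wide range of overall sparsity degrees, which is the flexibility the abstract attributes to HSS.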

