A sparse iteration space transformation framework for sparse tensor algebra

Senanayake, Ryan; Hong, Changwan; Wang, Ziheng; Wilson, Amalee; Chou, Stephen; Kamil, Shoaib; Amarasinghe, Saman; Kjolstad, Fredrik

dc.contributor.author	Senanayake, Ryan
dc.contributor.author	Hong, Changwan
dc.contributor.author	Wang, Ziheng
dc.contributor.author	Wilson, Amalee
dc.contributor.author	Chou, Stephen
dc.contributor.author	Kamil, Shoaib
dc.contributor.author	Amarasinghe, Saman
dc.contributor.author	Kjolstad, Fredrik
dc.date.accessioned	2021-10-27T20:04:00Z
dc.date.available	2021-10-27T20:04:00Z
dc.date.issued	2020
dc.identifier.uri	https://hdl.handle.net/1721.1/134211
dc.description.abstract	© 2020 Owner/Author. We address the problem of optimizing sparse tensor algebra in a compiler and show how to define standard loop transformations (split, collapse, and reorder) on sparse iteration spaces. The key idea is to track the transformation functions that map the original iteration space to derived iteration spaces. These functions are needed by the code generator to emit code that maps coordinates between iteration spaces at runtime, since the coordinates in the sparse data structures remain in the original iteration space. We further demonstrate that derived iteration spaces can tile both the universe of coordinates and the subset of nonzero coordinates: the former is analogous to tiling dense iteration spaces, while the latter tiles sparse iteration spaces into statically load-balanced blocks of nonzeros. Tiling the space of nonzeros lets the generated code efficiently exploit heterogeneous compute resources such as threads, vector units, and GPUs. We implement these concepts by extending the sparse iteration theory implementation in the TACO system. The associated scheduling API can be used by performance engineers, or it can be the target of an automatic scheduling system. We outline one heuristic autoscheduling system, but other systems are possible. Using the scheduling API, we show how to optimize mixed sparse-dense tensor algebra expressions on CPUs and GPUs. Our results show that the sparse transformations are sufficient to generate code whose performance is competitive with hand-optimized implementations from the literature, while generalizing to all of tensor algebra.	en_US
dc.language.iso	en
dc.publisher	Association for Computing Machinery (ACM)	en_US
dc.relation.isversionof	10.1145/3428226	en_US
dc.rights	Creative Commons Attribution 4.0 International license	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.source	ACM	en_US
dc.title	A sparse iteration space transformation framework for sparse tensor algebra	en_US
dc.type	Article	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.relation.journal	Proceedings of the ACM on Programming Languages	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2021-09-27T14:41:22Z
dspace.orderedauthors	Senanayake, R; Hong, C; Wang, Z; Wilson, A; Chou, S; Kamil, S; Amarasinghe, S; Kjolstad, F	en_US
dspace.date.submission	2021-09-27T14:41:23Z
mit.journal.volume	4	en_US
mit.journal.issue	OOPSLA	en_US
mit.license	PUBLISHER_CC
mit.metadata.status	Authority Work and Publication Information Needed	en_US
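
Illustrative note (not part of the item record): the abstract's distinction between tiling the universe of coordinates and tiling the nonzeros can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the TACO scheduling API and not code generated by TACO: the matrix is stored in CSR form as the arrays pos, crd, and vals, and every function name below (coordinate_tiles, position_tiles, row_of, spmv_position_split) is hypothetical. The row_of helper stands in for the kind of runtime mapping the abstract describes, recovering an original-space row coordinate from a position in the nonzero array.

```python
from bisect import bisect_right

# A 4x4 sparse matrix in CSR form (hypothetical example data).
# Row i owns positions pos[i] .. pos[i+1]-1 of crd/vals; row 2 is empty.
pos = [0, 1, 4, 4, 6]
crd = [2, 0, 1, 3, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0, 4.0]

def spmv_reference(pos, crd, vals, x):
    """Plain row-by-row sparse matrix-vector product."""
    y = [0.0] * (len(pos) - 1)
    for i in range(len(pos) - 1):
        for p in range(pos[i], pos[i + 1]):
            y[i] += vals[p] * x[crd[p]]
    return y

def coordinate_tiles(n_rows, tile):
    """Split the universe of row coordinates into fixed-size tiles."""
    return [range(s, min(s + tile, n_rows)) for s in range(0, n_rows, tile)]

def position_tiles(nnz, tile):
    """Split the stored nonzeros (positions) into fixed-size tiles."""
    return [range(s, min(s + tile, nnz)) for s in range(0, nnz, tile)]

def row_of(pos, p):
    """Map a position back to its row: the i with pos[i] <= p < pos[i+1]."""
    return bisect_right(pos, p) - 1

def spmv_position_split(pos, crd, vals, x, tile):
    """SpMV that iterates over equal-sized blocks of nonzeros."""
    y = [0.0] * (len(pos) - 1)
    for block in position_tiles(len(vals), tile):
        for p in block:
            i = row_of(pos, p)  # recover the original-space coordinate
            y[i] += vals[p] * x[crd[p]]
    return y

# Nonzeros handled by each tile under the two strategies.
nnz_per_coordinate_tile = [
    sum(pos[i + 1] - pos[i] for i in rows)
    for rows in coordinate_tiles(len(pos) - 1, 2)
]
nnz_per_position_tile = [len(block) for block in position_tiles(len(vals), 3)]
```

In this sketch the coordinate tiles carry unequal numbers of nonzeros because the rows differ in density, while the position tiles are equal by construction, at the cost of a lookup to recover each row coordinate.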

Files in this item

Name: 3428226.pdf
Size: 1.492Mb
Format: PDF
Description: Published version

This item appears in the following Collection(s)

MIT Open Access Articles