WACO: Learning workload-aware co-optimization of the format and schedule of a sparse tensor program
Author(s)
Won, Jaeyeon
Advisor
Emer, Joel S.
Amarasinghe, Saman
Abstract
Leveraging the large number of zeros in sparse tensors offers a powerful way to solve complex problems efficiently in many applications. However, optimizing the performance of those applications poses a challenge: a sparse tensor program must strike the right balance between its data format and its implementation strategy to achieve optimal performance.
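To illustrate how the data format shapes the implementation, the following minimal sketch (not from the thesis; names and the CSR layout chosen here are illustrative assumptions) shows an SpMV kernel whose loop structure is dictated by the Compressed Sparse Row format — a different format, such as COO or a blocked layout, would require a different kernel and favor different schedules:

```python
import numpy as np

def spmv_csr(indptr, indices, data, x):
    """SpMV over a CSR matrix: the format's row-pointer layout
    dictates the two-level loop structure of the kernel."""
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):           # one iteration per row
        for k in range(indptr[i], indptr[i + 1]):  # nonzeros of row i
            y[i] += data[k] * x[indices[k]]
    return y

# 2x2 matrix [[1, 0], [0, 2]] in CSR form
y = spmv_csr([0, 1, 2], [0, 1], [1.0, 2.0], np.array([3.0, 4.0]))
```

Because the format fixes which loops exist and in what order the nonzeros are visited, scheduling decisions (tiling, parallelization, loop reordering) cannot be chosen independently of it — which is the coupling WACO co-optimizes.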
This thesis presents WACO, a novel method of co-optimizing the format and schedule of a given sparsity pattern in a sparse tensor program. A core challenge in this thesis is the design of a lightweight cost model that accurately predicts the runtime of a sparse tensor program by considering the sparsity pattern, the format, and the schedule. The key idea in addressing this is exploiting a sparse convolutional network to learn meaningful features of the sparsity pattern and embedding a coupled behavior between the format and the schedule using a specially designed schedule template. In addition, within the enormous search space of co-optimization, our novel search strategy, an approximate nearest neighbor search, efficiently and accurately retrieves the best format and schedule for a given sparsity pattern.
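The retrieval step described above can be sketched as follows. This is a toy stand-in, not WACO's implementation: the embeddings, configuration strings, and the exact (rather than approximate) nearest-neighbor lookup are all illustrative assumptions; the actual system learns pattern embeddings with a sparse convolutional network and uses an approximate nearest neighbor index over a far larger space:

```python
import numpy as np

# Hypothetical database mapping learned sparsity-pattern embeddings
# to their best-known (format, schedule) configurations.
db_embeddings = np.array([[0.9, 0.1],
                          [0.1, 0.8],
                          [0.5, 0.5]])
db_configs = ["CSR + row-split", "DCSR + tile 32", "COO + default"]

def retrieve_config(pattern_embedding):
    """Return the configuration of the nearest stored embedding.
    Exact nearest neighbor here stands in for the ANN search."""
    dists = np.linalg.norm(db_embeddings - pattern_embedding, axis=1)
    return db_configs[int(np.argmin(dists))]
```

The design idea is that a new sparsity pattern is first embedded, and the search then returns the format and schedule that performed best on the most similar previously seen pattern, avoiding an exhaustive scan of the co-optimization space.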
We evaluate WACO on four different algorithms (SpMV, SpMM, SDDMM, and MTTKRP) on a CPU using 726 different sparsity patterns. Our experimental results show that WACO outperforms four state-of-the-art baselines: Intel MKL, a format-only auto-tuner, TACO with a default schedule, and ASpT. Compared to the best of the four baselines, WACO achieves 1.43×, 1.18×, 1.14×, and 1.27× average speedups on SpMV, SpMM, SDDMM, and MTTKRP, respectively.
Date issued
2023-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology