Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing
Author(s)
Pellauer, Michael; Clemons, Jason; Balaji, Vignesh; Crago, Neal; Jaleel, Aamer; Lee, Donghyuk; O'Connor, Mike; Parashar, Angshuman; Treichler, Sean; Tsai, Po-An; Keckler, Stephen; Emer, Joel
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
Sparse tensor algorithms are becoming widespread, particularly in the domains of deep learning, graph and data analytics, and scientific computing. Current high-performance broad-domain architectures, such as GPUs, often suffer memory system inefficiencies by moving too much data or moving it too far through the memory hierarchy. To increase performance and efficiency, proposed domain-specific accelerators tailor their architectures to the data needs of a narrow application domain, but as a result cannot be applied to a wide range of algorithms or applications that contain a mix of sparse and dense algorithms. This paper proposes Symphony, a hybrid programmable/specialized architecture which focuses on the orchestration of data throughout the memory hierarchy to simultaneously reduce the movement of unnecessary data and data movement distances. Key elements of the Symphony architecture include (1) specialized reconfigurable units aimed not only at roofline floating-point computations, but at supporting data orchestration features such as address generation, data filtering, and sparse metadata processing; and (2) distribution of computation resources (both programmable and specialized) throughout the on-chip memory hierarchy. We demonstrate that Symphony can match non-programmable ASIC performance on sparse tensor algebra, and provide 31× improved runtime and 44× improved energy over a comparably provisioned GPU for these applications.
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Journal
ACM Transactions on Computer Systems
Publisher
ACM
Citation
Pellauer, Michael, Clemons, Jason, Balaji, Vignesh, Crago, Neal, Jaleel, Aamer et al. "Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing." ACM Transactions on Computer Systems.
Version: Final published version