Universal graph framework : achieving high-performance across algorithms, graph types, and architectures
Author(s)
Brahmakshatriya, Ajay Rajendra.
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Saman Amarasinghe.
Abstract
The performance of graph programs depends highly on the algorithm, the size and structure of the input graph, and the features of the underlying hardware. No single set of optimizations or single hardware platform works well across all applications. Currently, when switching to a different hardware platform, programmers must re-implement graph algorithms in a completely different language or framework, and use different optimizations to achieve high performance. We propose the Universal Graph Framework (UGF), a new graph processing framework that achieves high performance across CPUs, GPUs, and Domain-Specific Accelerators (DSAs) automatically, using the same algorithm specification. UGF achieves portability with reasonable effort by decoupling algorithm, schedule, and backend. We introduce a new domain-specific intermediate representation, GraphIR, that is key to this decoupling. GraphIR encodes high-level algorithm and optimization information needed for hardware-specific code generation, making it easy to develop different backends (GraphVMs) for diverse architectures spanning CPUs, GPUs, and DSAs. UGF builds on the GraphIt domain-specific language (DSL), over which it introduces a new extensible scheduling language that separates hardware-independent and hardware-specific transformations. The scheduling language enables combining load balancing, edge traversal direction, active vertex set creation, kernel fusion, and other optimizations on GPUs and can be extended to support other hardware backends, such as CPUs and DSAs. We also built an autotuner on top of UGF to automatically find the best schedules for different hardware platforms. We demonstrate that UGF's techniques enable high performance and portability across a wide range of architectures by building three backends that target highly diverse hardware platforms: GPUs, CPUs, and the Swarm DSA. We evaluate UGF on five algorithms and 9 input graphs on these architectures. 
UGF outperforms state-of-the-art frameworks by up to 5.1x, and is the fastest in 62 out of 90 experiments.
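The decoupling of algorithm and schedule described in the abstract can be illustrated with a minimal sketch. This is not UGF, GraphIR, or GraphIt code: the `Schedule` class, its `direction` field, and the `bfs_levels` function are hypothetical names invented for this example, and Python is used only for readability. The sketch assumes an undirected graph stored as adjacency lists and shows a single breadth-first search specification whose edge traversal direction (push or pull) is chosen by a separate schedule object, so the result is the same while the traversal strategy changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    # Hypothetical schedule knob for illustration; not UGF's scheduling language.
    direction: str = "push"  # "push" or "pull"


def bfs_levels(adj, source, schedule):
    """Level-synchronous BFS on an undirected graph given as adjacency lists.

    Returns a list where entry v is the BFS depth of vertex v, or -1 if
    v is unreachable from source.
    """
    n = len(adj)
    level = [-1] * n
    level[source] = 0
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        next_frontier = set()
        if schedule.direction == "push":
            # Push: every active vertex updates its unvisited neighbors.
            for u in frontier:
                for v in adj[u]:
                    if level[v] == -1:
                        level[v] = depth
                        next_frontier.add(v)
        elif schedule.direction == "pull":
            # Pull: every unvisited vertex looks for an active neighbor.
            for v in range(n):
                if level[v] == -1 and any(u in frontier for u in adj[v]):
                    level[v] = depth
                    next_frontier.add(v)
        else:
            raise ValueError(f"unknown direction: {schedule.direction}")
        frontier = next_frontier
    return level


# Small undirected example graph: 0-1, 0-2, 1-3, 2-3, 3-4.
adj = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3]]
push_result = bfs_levels(adj, 0, Schedule(direction="push"))
pull_result = bfs_levels(adj, 0, Schedule(direction="pull"))
```

Both calls compute the same levels; only the traversal strategy differs. In a real framework the choice between such strategies would depend on the graph and the hardware, which is the kind of decision the abstract attributes to the scheduling language and the autotuner.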
Description
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 63-70).
Date issued
2020
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.