Universal graph framework : achieving high-performance across algorithms, graph types, and architectures

Author(s)

Brahmakshatriya, Ajay Rajendra.

Download

1227521404-MIT.pdf (725.0Kb)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Saman Amarasinghe.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

The performance of graph programs depends highly on the algorithm, the size and structure of the input graph, and the features of the underlying hardware. No single set of optimizations or single hardware platform works well across all applications. Currently, when switching to a different hardware platform, programmers must re-implement graph algorithms in a completely different language or framework, and use different optimizations to achieve high performance. We propose the Universal Graph Framework (UGF), a new graph processing framework that achieves high performance across CPUs, GPUs, and Domain-Specific Accelerators (DSAs) automatically, using the same algorithm specification. UGF achieves portability with reasonable effort by decoupling algorithm, schedule, and backend. We introduce a new domain-specific intermediate representation, GraphIR, that is key to this decoupling.

GraphIR encodes high-level algorithm and optimization information needed for hardware-specific code generation, making it easy to develop different backends (GraphVMs) for diverse architectures spanning CPUs, GPUs, and DSAs. UGF builds on the GraphIt domain-specific language (DSL), over which it introduces a new extensible scheduling language that separates hardware-independent and hardware-specific transformations. The scheduling language enables combining load balancing, edge traversal direction, active vertex set creation, kernel fusion, and other optimizations on GPUs and can be extended to support other hardware backends, such as CPUs and DSAs. We also built an autotuner on top of UGF to automatically find the best schedules for different hardware platforms. We demonstrate that UGF's techniques enable high performance and portability across a wide range of architectures by building three backends that target highly diverse hardware platforms: GPUs, CPUs, and the Swarm DSA.

We evaluate UGF on five algorithms and nine input graphs on these architectures. UGF outperforms state-of-the-art frameworks by up to 5.1x and is the fastest in 62 out of 90 experiments.
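As a rough illustration of the algorithm/schedule decoupling the abstract describes, the sketch below keeps one breadth-first search specification and lets a separate schedule object choose the edge traversal direction. It is written in plain Python, not in GraphIt or GraphIR, and every name in it (`Schedule`, `bfs_levels`, the `direction` field) is hypothetical and chosen for illustration only; the thesis itself defines the actual scheduling language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """Hypothetical schedule: NOT the real UGF or GraphIt scheduling API."""
    direction: str = "push"  # "push" (from frontier) or "pull" (into unvisited)


def bfs_levels(adj, source, schedule):
    """Level-synchronous BFS over an adjacency list.

    The returned levels depend only on the graph and the source.
    The schedule changes how edges are traversed, not the result.
    """
    n = len(adj)
    # Incoming-edge lists, needed only by the pull direction.
    radj = [[] for _ in range(n)]
    for u, neighbours in enumerate(adj):
        for v in neighbours:
            radj[v].append(u)

    level = [-1] * n  # -1 marks an unvisited vertex
    level[source] = 0
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        next_frontier = set()
        if schedule.direction == "push":
            # Each active vertex pushes along its outgoing edges.
            for u in frontier:
                for v in adj[u]:
                    if level[v] == -1:
                        level[v] = depth
                        next_frontier.add(v)
        elif schedule.direction == "pull":
            # Each unvisited vertex looks for an active in-neighbour.
            for v in range(n):
                if level[v] == -1 and any(u in frontier for u in radj[v]):
                    level[v] = depth
                    next_frontier.add(v)
        else:
            raise ValueError(f"unknown direction: {schedule.direction}")
        frontier = next_frontier
    return level


# Small directed example graph; vertex 4 is unreachable from vertex 0.
example_graph = [[1, 2], [3], [3], [], []]
push_result = bfs_levels(example_graph, 0, Schedule("push"))
pull_result = bfs_levels(example_graph, 0, Schedule("pull"))
```

Both calls compute the same levels; only the traversal strategy differs. In this toy setting, an autotuner of the kind mentioned in the abstract would amount to timing each candidate schedule on a given graph and keeping the fastest.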

Description

Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020

Cataloged from student-submitted PDF version of thesis.

Includes bibliographical references (pages 63-70).

Date issued

2020

URI

https://hdl.handle.net/1721.1/129207

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Graduate Theses