Weld: A common runtime for high performance data analytics

Palkar, S; Thomas, JJ; Shanbhag, A; Narayanan, D; Pirk, H; Schwarzkopf, M; Amarasinghe, S; Zaharia, M

Notice

This is not the latest version of this item. The latest version can be found at: https://dspace.mit.edu/handle/1721.1/137425.2

Publisher with Creative Commons License

Terms of use

Creative Commons Attribution 3.0 unported license https://creativecommons.org/licenses/by/3.0/

Abstract

© 2017 Conference on Innovative Data Systems Research (CIDR). All rights reserved.

Modern analytics applications combine multiple functions from different libraries and frameworks to build increasingly complex workflows. Even though each function may achieve high performance in isolation, the performance of the combined workflow is often an order of magnitude below hardware limits due to extensive data movement across the functions. To address this problem, we propose Weld, a runtime for data-intensive applications that optimizes across disjoint libraries and functions. Weld uses a common intermediate representation to capture the structure of diverse data-parallel workloads, including SQL, machine learning and graph analytics. It then performs key data movement optimizations and generates efficient parallel code for the whole workflow. Weld can be integrated incrementally into existing frameworks like TensorFlow, Apache Spark, NumPy and Pandas without changing their user-facing APIs. We show that Weld can speed up these frameworks, as well as applications that combine them, by up to 30×.
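The data-movement cost the abstract describes can be illustrated with a small sketch. This is not Weld's API or its intermediate representation; the function names (`lib_a_scale`, `lib_b_sum_above`, `unfused`, `fused`) are hypothetical and stand in for two independent libraries. The first pipeline materializes a full intermediate list between the two calls, while the second computes the same result in a single pass over the input, which is the kind of cross-function loop fusion a common runtime could perform automatically.

```python
from typing import List


def lib_a_scale(xs: List[float], factor: float) -> List[float]:
    """Hypothetical 'library A' call: returns a fully materialized list."""
    return [x * factor for x in xs]


def lib_b_sum_above(xs: List[float], threshold: float) -> float:
    """Hypothetical 'library B' call: sums the values above a threshold."""
    return sum(x for x in xs if x > threshold)


def unfused(xs: List[float], factor: float, threshold: float) -> float:
    """Compose the two library calls; the intermediate list is written
    out by the first call and read back in by the second."""
    intermediate = lib_a_scale(xs, factor)
    return lib_b_sum_above(intermediate, threshold)


def fused(xs: List[float], factor: float, threshold: float) -> float:
    """Hand-fused equivalent: one loop over the input and no intermediate
    list, so each element is touched exactly once."""
    total = 0.0
    for x in xs:
        y = x * factor
        if y > threshold:
            total += y
    return total
```

Both functions return the same value for the same inputs; the difference lies only in how much intermediate data is written and re-read, which is the overhead the abstract attributes to combining independently optimized functions.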

Date issued

2017-01

URI

https://hdl.handle.net/1721.1/137425

Journal

CIDR 2017 - 8th Biennial Conference on Innovative Data Systems Research

Citation

Palkar, S, Thomas, JJ, Shanbhag, A, Narayanan, D, Pirk, H et al. 2017. "Weld: A common runtime for high performance data analytics." CIDR 2017 - 8th Biennial Conference on Innovative Data Systems Research.

Version: Final published version

Collections

MIT Open Access Articles

Version	Item	Date	Summary
2	1721.1/137425.2	2021-12-07T17:49:38Z	Verified or entered authority metadata.
1	1721.1/137425*	2021-11-05T11:48:09Z
