Distributed Halide

Denniston, Tyler; Kamil, Shoaib; Amarasinghe, Saman P

dc.contributor.author	Denniston, Tyler
dc.contributor.author	Kamil, Shoaib
dc.contributor.author	Amarasinghe, Saman P
dc.date.accessioned	2017-07-18T16:00:20Z
dc.date.available	2017-07-18T16:00:20Z
dc.date.issued	2016-08
dc.identifier.isbn	9781450340922
dc.identifier.uri	http://hdl.handle.net/1721.1/110762
dc.description.abstract	Many image processing tasks are naturally expressed as a pipeline of small computational kernels known as stencils. Halide is a popular domain-specific language and compiler designed to implement image processing algorithms. Halide uses simple language constructs to express what to compute and a separate scheduling co-language for expressing when and where to perform the computation. This approach has demonstrated performance comparable to or better than hand-optimized code. Until now, however, Halide has been restricted to parallel shared-memory execution, limiting its performance for memory-bandwidth-bound pipelines or large-scale image processing tasks. We present an extension to Halide to support distributed-memory parallel execution of complex stencil pipelines. These extensions compose with the existing scheduling constructs in Halide, allowing expression of complex computation and communication strategies. Existing Halide applications can be distributed with minimal changes, allowing programmers to explore the tradeoff between recomputation and communication with little effort. Approximately 10 new lines of code are needed even for a 200-line, 99-stage application. On nine image processing benchmarks, our extensions give up to a 1.4× speedup on a single node over regular multithreaded execution with the same number of cores, by mitigating the effects of non-uniform memory access. The distributed benchmarks achieve up to 18× speedup on a 16-node testing machine and up to 57× speedup on 64 nodes of the NERSC Cori supercomputer.	en_US
dc.description.sponsorship	United States. Department of Energy (award DE-SC0005288)	en_US
dc.description.sponsorship	United States. Department of Energy (award DE-SC0008923)	en_US
dc.description.sponsorship	National Science Foundation (U.S.) (XPS-1533753)	en_US
dc.language.iso	en_US
dc.publisher	Association for Computing Machinery	en_US
dc.relation.isversionof	http://dx.doi.org/10.1145/2851141.2851157	en_US
dc.rights	Creative Commons Attribution-NonCommercial-NoDerivs License	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	en_US
dc.source	ACM	en_US
dc.title	Distributed Halide	en_US
dc.type	Article	en_US
dc.identifier.citation	Denniston, Tyler, Shoaib Kamil, and Saman Amarasinghe. “Distributed Halide.” Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - PPoPP ’16 (2016).	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.contributor.mitauthor	Denniston, Tyler
dc.contributor.mitauthor	Kamil, Shoaib
dc.contributor.mitauthor	Amarasinghe, Saman P
dc.relation.journal	Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - PPoPP '16	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dspace.orderedauthors	Denniston, Tyler; Kamil, Shoaib; Amarasinghe, Saman	en_US
dspace.embargo.terms	N	en_US
dc.identifier.orcid	https://orcid.org/0000-0003-4400-8947
dc.identifier.orcid	https://orcid.org/0000-0002-7231-7643
mit.license	PUBLISHER_CC	en_US
mit.metadata.status	Complete

Files in this item

Name: Distributed halide.pdf
Size: 681.0Kb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles