Efficient automatic scheduling of imaging and vision pipelines for the GPU

Anderson, Luke; Adams, Andrew; Ma, Karima; Li, Tzu-Mao; Jin, Tian; Ragan-Kelley, Jonathan

dc.contributor.author	Anderson, Luke
dc.contributor.author	Adams, Andrew
dc.contributor.author	Ma, Karima
dc.contributor.author	Li, Tzu-Mao
dc.contributor.author	Jin, Tian
dc.contributor.author	Ragan-Kelley, Jonathan
dc.date.accessioned	2022-07-19T12:34:07Z
dc.date.available	2022-07-19T12:34:07Z
dc.date.issued	2021
dc.identifier.uri	https://hdl.handle.net/1721.1/143843
dc.description.abstract	We present a new algorithm to quickly generate high-performance GPU implementations of complex imaging and vision pipelines, directly from high-level Halide algorithm code. It is fully automatic, requiring no schedule templates or hand-optimized kernels. We address the scalability challenge of extending search-based automatic scheduling to map large real-world programs to the deep hierarchies of memory and parallelism on GPU architectures in reasonable compile time. We achieve this using (1) a two-phase search algorithm that first ‘freezes’ decisions for the lowest cost sections of a program, allowing relatively more time to be spent on the important stages, (2) a hierarchical sampling strategy that groups schedules based on their structural similarity, then samples representatives to be evaluated, allowing us to explore a large space with few samples, and (3) memoization of repeated partial schedules, amortizing their cost over all their occurrences. We guide the process with an efficient cost model combining machine learning, program analysis, and GPU architecture knowledge. We evaluate our method’s performance on a diverse suite of real-world imaging and vision pipelines. Our scalability optimizations lead to average compile time speedups of 49x (up to 530x). We find schedules that are on average 1.7x faster than existing automatic solutions (up to 5x), and competitive with what the best human experts were able to achieve in an active effort to beat our automatic results.	en_US
dc.language.iso	en
dc.publisher	Association for Computing Machinery (ACM)	en_US
dc.relation.isversionof	10.1145/3485486	en_US
dc.rights	Creative Commons Attribution 4.0 International license	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.source	ACM	en_US
dc.title	Efficient automatic scheduling of imaging and vision pipelines for the GPU	en_US
dc.type	Article	en_US
dc.identifier.citation	Anderson, Luke, Adams, Andrew, Ma, Karima, Li, Tzu-Mao, Jin, Tian et al. 2021. "Efficient automatic scheduling of imaging and vision pipelines for the GPU." Proceedings of the ACM on Programming Languages, 5 (OOPSLA).
dc.contributor.department	Koch Institute for Integrative Cancer Research at MIT
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal	Proceedings of the ACM on Programming Languages	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2022-07-19T12:27:01Z
dspace.orderedauthors	Anderson, L; Adams, A; Ma, K; Li, T-M; Jin, T; Ragan-Kelley, J	en_US
dspace.date.submission	2022-07-19T12:27:04Z
mit.journal.volume	5	en_US
mit.journal.issue	OOPSLA	en_US
mit.license	PUBLISHER_CC
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: 3485486.pdf
Size: 3.122Mb
Format: PDF
Description: Published version

This item appears in the following Collection(s)

MIT Open Access Articles