dc.contributor.author: Anderson, Luke
dc.contributor.author: Adams, Andrew
dc.contributor.author: Ma, Karima
dc.contributor.author: Li, Tzu-Mao
dc.contributor.author: Jin, Tian
dc.contributor.author: Ragan-Kelley, Jonathan
dc.date.accessioned: 2022-07-19T12:34:07Z
dc.date.available: 2022-07-19T12:34:07Z
dc.date.issued: 2021
dc.identifier.uri: https://hdl.handle.net/1721.1/143843
dc.description.abstract: We present a new algorithm to quickly generate high-performance GPU implementations of complex imaging and vision pipelines, directly from high-level Halide algorithm code. It is fully automatic, requiring no schedule templates or hand-optimized kernels. We address the scalability challenge of extending search-based automatic scheduling to map large real-world programs to the deep hierarchies of memory and parallelism on GPU architectures in reasonable compile time. We achieve this using (1) a two-phase search algorithm that first ‘freezes’ decisions for the lowest cost sections of a program, allowing relatively more time to be spent on the important stages, (2) a hierarchical sampling strategy that groups schedules based on their structural similarity, then samples representatives to be evaluated, allowing us to explore a large space with few samples, and (3) memoization of repeated partial schedules, amortizing their cost over all their occurrences. We guide the process with an efficient cost model combining machine learning, program analysis, and GPU architecture knowledge. We evaluate our method’s performance on a diverse suite of real-world imaging and vision pipelines. Our scalability optimizations lead to average compile time speedups of 49x (up to 530x). We find schedules that are on average 1.7x faster than existing automatic solutions (up to 5x), and competitive with what the best human experts were able to achieve in an active effort to beat our automatic results. [en_US]
dc.language.iso: en
dc.publisher: Association for Computing Machinery (ACM) [en_US]
dc.relation.isversionof: 10.1145/3485486 [en_US]
dc.rights: Creative Commons Attribution 4.0 International license [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: ACM [en_US]
dc.title: Efficient automatic scheduling of imaging and vision pipelines for the GPU [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Anderson, Luke, Adams, Andrew, Ma, Karima, Li, Tzu-Mao, Jin, Tian et al. 2021. "Efficient automatic scheduling of imaging and vision pipelines for the GPU." Proceedings of the ACM on Programming Languages, 5 (OOPSLA).
dc.contributor.department: Koch Institute for Integrative Cancer Research at MIT
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal: Proceedings of the ACM on Programming Languages [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2022-07-19T12:27:01Z
dspace.orderedauthors: Anderson, L; Adams, A; Ma, K; Li, T-M; Jin, T; Ragan-Kelley, J [en_US]
dspace.date.submission: 2022-07-19T12:27:04Z
mit.journal.volume: 5 [en_US]
mit.journal.issue: OOPSLA [en_US]
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]