
dc.contributor.author: Chaurasia, Gaurav
dc.contributor.author: Ragan-Kelley, Jonathan
dc.contributor.author: Paris, Sylvain
dc.contributor.author: Drettakis, George
dc.contributor.author: Durand, Fredo
dc.date.accessioned: 2021-11-05T16:59:16Z
dc.date.available: 2021-11-05T16:59:16Z
dc.date.issued: 2015
dc.identifier.uri: https://hdl.handle.net/1721.1/137548
dc.description.abstract: © 2015 ACM. Infinite impulse response (IIR), or recursive, filters are essential for image processing because they turn expensive large-footprint convolutions into operations that have a constant cost per pixel regardless of kernel size. However, their recursive nature constrains the order in which pixels can be computed, severely limiting both parallelism within a filter and memory locality across multiple filters. Prior research has developed algorithms that can compute IIR filters with image tiles. Using a divide-and-recombine strategy inspired by parallel prefix sum, they expose greater parallelism and exploit producer-consumer locality in pipelines of IIR filters over multidimensional images. While the principles are simple, it is hard, given a recursive filter, to derive a corresponding tile-parallel algorithm, and even harder to implement and debug it. We show that parallel and locality-aware implementations of IIR filter pipelines can be obtained through program transformations, which we mechanize through a domain-specific compiler. We show that the composition of a small set of transformations suffices to cover the space of possible strategies. We also demonstrate that the tiled implementations can be automatically scheduled in hardware-specific manners using a small set of generic heuristics. The programmer specifies the basic recursive filters, and the choice of transformation requires only a few lines of code. Our compiler then generates high-performance implementations that are an order of magnitude faster than standard GPU implementations, and outperform hand-tuned tiled implementations of specialized algorithms which require orders of magnitude more programming effort (a few lines of code instead of a few thousand lines per pipeline). [en_US]
dc.language.iso: en
dc.publisher: Association for Computing Machinery (ACM) [en_US]
dc.relation.isversionof: 10.1145/2790060.2790063 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: Other repository [en_US]
dc.title: Compiling High Performance Recursive Filters [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Chaurasia, Gaurav, Ragan-Kelley, Jonathan, Paris, Sylvain, Drettakis, George and Durand, Fredo. 2015. "Compiling High Performance Recursive Filters."
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2019-05-29T12:04:28Z
dspace.date.submission: 2019-05-29T12:04:30Z
mit.metadata.status: Authority Work and Publication Information Needed [en_US]
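
Illustrative note (not part of the record): the abstract describes computing recursive filters over tiles with a divide-and-recombine strategy inspired by parallel prefix sum. The Python sketch below shows that idea for the simplest possible case, a one-dimensional, first-order causal filter y[i] = x[i] + a * y[i-1]. It is a minimal sketch under stated assumptions, not the paper's compiler or its generated code. The function names `recursive_filter` and `tiled_recursive_filter`, the zero initial condition, and the three-pass structure are assumptions chosen for illustration.

```python
def recursive_filter(x, a):
    """Sequential reference: y[i] = x[i] + a * y[i-1], with y[-1] = 0 (assumed)."""
    y, prev = [], 0.0
    for v in x:
        prev = v + a * prev
        y.append(prev)
    return y


def tiled_recursive_filter(x, a, tile):
    """Same filter computed tile by tile (hypothetical helper, tile > 0)."""
    tiles = [x[s:s + tile] for s in range(0, len(x), tile)]

    # Pass 1 (independent per tile): filter each tile with a zero initial state.
    local = [recursive_filter(t, a) for t in tiles]

    # Pass 2 (sequential, but only one value per tile): compute the true value
    # of each tile's last sample, which is the state carried into the next tile.
    carries, carry = [], 0.0
    for loc in local:
        carries.append(carry)
        carry = loc[-1] + (a ** len(loc)) * carry

    # Pass 3 (independent per tile): add the incoming carry, decayed by
    # a^(j+1) at offset j inside the tile.
    out = []
    for loc, c in zip(local, carries):
        out.extend(v + (a ** (j + 1)) * c for j, v in enumerate(loc))
    return out
```

Passes 1 and 3 touch each tile independently, so they could run in parallel. Only pass 2 is sequential, and it handles one value per tile rather than one per sample. This works because the filter is linear: the effect of the previous tile's final state can be added after the fact.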

