Show simple item record

dc.contributor.author: Chan, Cy
dc.contributor.author: Wong, Yee Lok
dc.contributor.author: Edelman, Alan
dc.contributor.author: Ansel, Jason Andrew
dc.contributor.author: Amarasinghe, Saman P.
dc.date.accessioned: 2014-03-28T13:14:04Z
dc.date.available: 2014-03-28T13:14:04Z
dc.date.issued: 2009-11
dc.identifier.isbn: 9781605587448
dc.identifier.uri: http://hdl.handle.net/1721.1/85940
dc.description.abstract: Algorithmic choice is essential in any problem domain to realizing optimal computational performance. Multigrid is a prime example: not only is it possible to make choices at the highest grid resolution, but a program can switch techniques as the problem is recursively attacked on coarser grid levels to take advantage of algorithms with different scaling behaviors. Additionally, users with different convergence criteria must experiment with parameters to yield a tuned algorithm that meets their accuracy requirements. Even after a tuned algorithm has been found, users often have to start all over when migrating from one machine to another. We present an algorithm and autotuning methodology that address these issues in a near-optimal and efficient manner. The freedom of independently tuning both the algorithm and the number of iterations at each recursion level results in an exponential search space of tuned algorithms that have different accuracies and performances. To search this space efficiently, our autotuner utilizes a novel dynamic programming method to build efficient tuned algorithms from the bottom up. The results are customized multigrid algorithms that invest targeted computational power to yield the accuracy required by the user. The techniques we describe allow the user to automatically generate tuned multigrid cycles of different shapes targeted to the user's specific combination of problem, hardware, and accuracy requirements. These cycle shapes dictate the order in which grid coarsening and grid refinement are interleaved with both iterative methods, such as Jacobi or Successive Over-Relaxation, and direct methods, which tend to have superior performance for small problem sizes. The need to choose among all of these methods brings the issue of variable accuracy to the forefront. Not only must the autotuning framework compare different possible multigrid cycle shapes against each other, but it also needs the ability to compare tuned cycles against both direct and (non-multigrid) iterative methods. We address this problem by using an accuracy metric for measuring the effectiveness of tuned cycle shapes and making comparisons over all algorithmic types based on this common yardstick. In our results, we find that the flexibility to trade performance versus accuracy at all levels of recursive computation enables us to achieve excellent performance on a variety of platforms compared to algorithmically static implementations of multigrid. Our implementation uses PetaBricks, an implicitly parallel programming language where algorithmic choices are exposed in the language. The PetaBricks compiler uses these choices to analyze, autotune, and verify the PetaBricks program. These language features, most notably the autotuner, were key in enabling our implementation to be clear, correct, and fast. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Award CCF-0832997) [en_US]
dc.description.sponsorship: GigaScale Systems Research Center [en_US]
dc.language.iso: en_US
dc.publisher: Association for Computing Machinery (ACM) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1145/1654059.1654065 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: MIT web domain [en_US]
dc.title: Autotuning multigrid with PetaBricks [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Cy Chan, Jason Ansel, Yee Lok Wong, Saman Amarasinghe, and Alan Edelman. 2009. Autotuning multigrid with PetaBricks. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis (SC '09). ACM, New York, NY, USA, Article 5, 12 pages. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics [en_US]
dc.contributor.mitauthor: Chan, Cy [en_US]
dc.contributor.mitauthor: Ansel, Jason Andrew [en_US]
dc.contributor.mitauthor: Wong, Yee Lok [en_US]
dc.contributor.mitauthor: Amarasinghe, Saman P. [en_US]
dc.contributor.mitauthor: Edelman, Alan [en_US]
dc.relation.journal: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis (SC '09) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Chan, Cy; Ansel, Jason; Wong, Yee Lok; Amarasinghe, Saman; Edelman, Alan [en_US]
dc.identifier.orcid: https://orcid.org/0000-0001-7676-3133
dc.identifier.orcid: https://orcid.org/0000-0002-7231-7643
mit.license: OPEN_ACCESS_POLICY [en_US]
mit.metadata.status: Complete
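
The abstract above describes a bottom-up dynamic programming autotuner that, at each recursion level, chooses among direct solvers, iterative smoothers (Jacobi, SOR), and recursive multigrid cycles while trading accuracy against time. The following is a minimal Python sketch of that idea only, not the authors' PetaBricks implementation: the accuracy targets, contraction factors, and cost models (direct_cost, smoother, the per-cycle estimate) are illustrative assumptions standing in for the measured benchmark runs a real autotuner would perform on the target machine.

    # Illustrative sketch only -- not the authors' PetaBricks implementation.
    # It mimics the bottom-up dynamic programming described in the abstract:
    # for every grid level and every accuracy target, keep the fastest plan,
    # where a plan is a direct solve, an iterative smoother (Jacobi/SOR) with
    # a tuned sweep count, or a multigrid cycle that reuses a plan already
    # tuned for the next-coarser level. All cost/accuracy numbers below are
    # assumptions standing in for real timed runs on the target machine.
    import math

    ACCURACY_TARGETS = [1e-1, 1e-3, 1e-5, 1e-7]   # assumed accuracy levels
    MAX_LEVEL = 8                                 # level k has (2^k)^2 unknowns

    def direct_cost(level):
        # Assumed cost of a sparse direct solve; cheap on small grids.
        n = (2 ** level) ** 2
        return 1e-9 * n ** 1.5

    def smoother(level, base_rho, cost_per_sweep, target):
        # Assumed relaxation model: the contraction factor rho worsens on
        # finer grids, so more sweeps are needed to reach the same accuracy.
        rho = 1.0 - (1.0 - base_rho) / (4 ** (level - 1))
        sweeps = max(1, math.ceil(math.log(target) / math.log(rho)))
        return sweeps * cost_per_sweep * (2 ** level) ** 2, sweeps

    best = {}  # best[(level, target)] = (estimated time, plan)
    for level in range(1, MAX_LEVEL + 1):
        for target in ACCURACY_TARGETS:
            jac_time, jac_sweeps = smoother(level, 0.66, 5e-9, target)
            sor_time, sor_sweeps = smoother(level, 0.33, 7e-9, target)
            candidates = [
                (direct_cost(level), ("direct",)),
                (jac_time, ("jacobi", jac_sweeps)),
                (sor_time, ("sor", sor_sweeps)),
            ]
            if level > 1:
                # Recursive choice: a cycle at this level may call any of the
                # plans already tuned for the coarser level, at any accuracy.
                for coarse_target in ACCURACY_TARGETS:
                    coarse_time, coarse_plan = best[(level - 1, coarse_target)]
                    rho = 0.1 if coarse_target <= 1e-5 else 0.3   # assumed
                    cycles = max(1, math.ceil(math.log(target) / math.log(rho)))
                    per_cycle = 4e-9 * (2 ** level) ** 2 + coarse_time
                    candidates.append(
                        (cycles * per_cycle, ("multigrid", cycles, coarse_plan)))
            best[(level, target)] = min(candidates, key=lambda c: c[0])

    time, plan = best[(MAX_LEVEL, 1e-7)]
    print(f"tuned plan for level {MAX_LEVEL} at accuracy 1e-7: {plan} ({time:.3g}s)")

The property this toy mirrors is the one the abstract emphasizes: plans tuned for coarser levels are reused as subroutines of candidate cycles at finer levels, so the exponentially large space of cycle shapes is searched level by level rather than enumerated.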

