dc.contributor.author: Lee, I-Ting Angelina
dc.contributor.author: Schardl, Tao Benjamin
dc.contributor.author: Kuszmaul, Bradley C
dc.contributor.author: Leiserson, William Mitchell
dc.contributor.author: Leiserson, Charles E
dc.date.accessioned: 2018-01-10T19:15:43Z
dc.date.available: 2018-01-10T19:15:43Z
dc.date.issued: 2015-06
dc.identifier.issn: 978-1-4503-3588-1
dc.identifier.uri: http://hdl.handle.net/1721.1/113050
dc.description.abstract: Cilkprof is a scalability profiler for multithreaded Cilk computations. Unlike its predecessor Cilkview, which analyzes only the whole-program scalability of a Cilk computation, Cilkprof collects work (serial running time) and span (critical-path length) data for each call site in the computation to assess how much each call site contributes to the overall work and span. Profiling work and span in this way enables a programmer to quickly diagnose scalability bottlenecks in a Cilk program. Despite the detail and quantity of information required to collect these measurements, Cilkprof runs with only constant asymptotic slowdown over the serial running time of the parallel computation. As an example of Cilkprof's usefulness, we used Cilkprof to diagnose a scalability bottleneck in an 1800-line parallel breadth-first search (PBFS) code. By examining Cilkprof's output in tandem with the source code, we were able to zero in on a call site within the PBFS routine that imposed a scalability bottleneck. A minor code modification then improved the parallelism of PBFS by a factor of 5. Using Cilkprof, it took us less than two hours to find and fix a scalability bug which had, until then, eluded us for months. This paper describes the Cilkprof algorithm and proves theoretically using an amortization argument that Cilkprof incurs only constant overhead compared with the application's native serial running time. Cilkprof was implemented by compiler instrumentation, that is, by modifying the LLVM compiler to insert instrumentation into user programs. On a suite of 16 application benchmarks, Cilkprof incurs a geometric-mean multiplicative overhead of only 1.9 and a maximum multiplicative overhead of only 7.4 compared with running the benchmarks without instrumentation. [en_US]
dc.language.iso: en_US
dc.publisher: Association for Computing Machinery [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1145/2755573.2755603 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: Other univ. web domain [en_US]
dc.title: The Cilkprof Scalability Profiler [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Schardl, Tao B. et al. “The Cilkprof Scalability Profiler.” Proceedings of the 27th ACM on Symposium on Parallelism in Algorithms and Architectures - SPAA ’15 (2015), 13-15 June, 2015, Portland, Oregon, ACM Press, 2015 [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.mitauthor: Schardl, Tao Benjamin
dc.contributor.mitauthor: Kuszmaul, Bradley C
dc.contributor.mitauthor: Leiserson, William Mitchell
dc.contributor.mitauthor: Leiserson, Charles E
dc.relation.journal: Proceedings of the 27th ACM on Symposium on Parallelism in Algorithms and Architectures - SPAA '15 [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Schardl, Tao B.; Kuszmaul, Bradley C.; Lee, I-Ting Angelina; Leiserson, William M.; Leiserson, Charles E. [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0003-0198-3283
dc.identifier.orcid: https://orcid.org/0000-0002-0868-7121
mit.license: OPEN_ACCESS_POLICY [en_US]

