
dc.contributor.author: Leiserson, Charles E.
dc.contributor.author: Schardl, Tao Benjamin
dc.date.accessioned: 2016-01-19T18:21:30Z
dc.date.available: 2016-01-19T18:21:30Z
dc.date.issued: 2010-06
dc.identifier.isbn: 9781450300797
dc.identifier.uri: http://hdl.handle.net/1721.1/100925
dc.description.abstract: We have developed a multithreaded implementation of breadth-first search (BFS) of a sparse graph using the Cilk++ extensions to C++. Our PBFS program on a single processor runs as quickly as a standard C++ breadth-first search implementation. PBFS achieves high work-efficiency by using a novel implementation of a multiset data structure, called a "bag," in place of the FIFO queue usually employed in serial breadth-first search algorithms. For a variety of benchmark input graphs whose diameters are significantly smaller than the number of vertices -- a condition met by many real-world graphs -- PBFS demonstrates good speedup with the number of processing cores. Since PBFS employs a nonconstant-time "reducer" -- a "hyperobject" feature of Cilk++ -- the work inherent in a PBFS execution depends nondeterministically on how the underlying work-stealing scheduler load-balances the computation. We provide a general method for analyzing nondeterministic programs that use reducers. PBFS is also nondeterministic in that it contains benign races which affect its performance but not its correctness. Fixing these races with mutual-exclusion locks slows down PBFS empirically, but it makes the algorithm amenable to analysis. In particular, we show that for a graph $G=(V,E)$ with diameter $D$ and bounded out-degree, this data-race-free version of the PBFS algorithm runs in time $O((V+E)/P + D \lg^3(V/D))$ on $P$ processors, which means that it attains near-perfect linear speedup if $P \ll (V+E)/(D \lg^3(V/D))$. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Grant CNS-0615215) [en_US]
dc.language.iso: en_US
dc.publisher: Association for Computing Machinery (ACM) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1145/1810479.1810534 [en_US]
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en_US]
dc.source: MIT web domain [en_US]
dc.title: A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers) [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Charles E. Leiserson and Tao B. Schardl. 2010. A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers). In Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures (SPAA '10). ACM, New York, NY, USA, 303-314. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.mitauthor: Leiserson, Charles E. [en_US]
dc.contributor.mitauthor: Schardl, Tao Benjamin [en_US]
dc.relation.journal: Proceedings of the 22nd ACM symposium on Parallelism in algorithms and architectures (SPAA '10) [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dspace.orderedauthors: Leiserson, Charles E.; Schardl, Tao B. [en_US]
dc.identifier.orcid: https://orcid.org/0000-0003-0198-3283
mit.license: OPEN_ACCESS_POLICY [en_US]

