dc.contributor.author | Fraser, Sean | |
dc.contributor.author | Xu, Helen | |
dc.contributor.author | Leiserson, Charles E | |
dc.date.accessioned | 2022-07-14T18:27:54Z | |
dc.date.available | 2022-07-14T18:27:54Z | |
dc.date.issued | 2020 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/143740 | |
dc.description.abstract | © 2020 IEEE. Existing work-efficient parallel algorithms for floating-point prefix sums exhibit either good performance or good numerical accuracy, but not both. Consequently, prefix-sum algorithms cannot easily be used in scientific-computing applications that require both high performance and accuracy. We have designed and implemented two new algorithms, called CAST _BLK and PAIR_BLK, whose accuracy is significantly higher than that of the high-performing prefix-sum algorithm from the Problem Based Benchmark Suite, while running with comparable performance on modern multicore machines. Specifically, the root mean squared error of the PBBS code on a large array of uniformly distributed 64-bit floating-point numbers is 8 times higher than that of CAST _BLK and 5.8 times higher than that of PAIR_BLK. These two codes employ the PBBS three-stage strategy for performance, but they are designed to achieve high accuracy, both theoretically and in practice. A vectorization enhancement to these two scalar codes trades off a small amount of accuracy to match or outperform the PBBS code while still maintaining lower error. | en_US |
dc.language.iso | en | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US |
dc.relation.isversionof | 10.1109/HPEC43674.2020.9286240 | en_US |
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
dc.source | MIT web domain | en_US |
dc.title | Work-Efficient Parallel Algorithms for Accurate Floating-Point Prefix Sums | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Fraser, Sean, Xu, Helen and Leiserson, Charles E. 2020. "Work-Efficient Parallel Algorithms for Accurate Floating-Point Prefix Sums." 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | |
dc.relation.journal | 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020 | en_US |
dc.eprint.version | Author's final manuscript | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2022-07-14T17:57:19Z | |
dspace.orderedauthors | Fraser, S; Xu, H; Leiserson, CE | en_US |
dspace.date.submission | 2022-07-14T17:57:20Z | |
mit.license | OPEN_ACCESS_POLICY | |
mit.metadata.status | Authority Work and Publication Information Needed | en_US |