Work-Efficient Parallel Algorithms for Accurate Floating-Point Prefix Sums

Fraser, Sean; Xu, Helen; Leiserson, Charles E

dc.contributor.author	Fraser, Sean
dc.contributor.author	Xu, Helen
dc.contributor.author	Leiserson, Charles E
dc.date.accessioned	2022-07-14T18:27:54Z
dc.date.available	2022-07-14T18:27:54Z
dc.date.issued	2020
dc.identifier.uri	https://hdl.handle.net/1721.1/143740
dc.description.abstract	© 2020 IEEE. Existing work-efficient parallel algorithms for floating-point prefix sums exhibit either good performance or good numerical accuracy, but not both. Consequently, prefix-sum algorithms cannot easily be used in scientific-computing applications that require both high performance and accuracy. We have designed and implemented two new algorithms, called CAST_BLK and PAIR_BLK, whose accuracy is significantly higher than that of the high-performing prefix-sum algorithm from the Problem Based Benchmark Suite (PBBS), while running with comparable performance on modern multicore machines. Specifically, the root mean squared error of the PBBS code on a large array of uniformly distributed 64-bit floating-point numbers is 8 times higher than that of CAST_BLK and 5.8 times higher than that of PAIR_BLK. These two codes employ the PBBS three-stage strategy for performance, but they are designed to achieve high accuracy, both theoretically and in practice. A vectorization enhancement to these two scalar codes trades off a small amount of accuracy to match or outperform the PBBS code while still maintaining lower error.	en_US
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)	en_US
dc.relation.isversionof	10.1109/HPEC43674.2020.9286240	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_US
dc.source	MIT web domain	en_US
dc.title	Work-Efficient Parallel Algorithms for Accurate Floating-Point Prefix Sums	en_US
dc.type	Article	en_US
dc.identifier.citation	Fraser, Sean, Xu, Helen and Leiserson, Charles E. 2020. "Work-Efficient Parallel Algorithms for Accurate Floating-Point Prefix Sums." 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal	2020 IEEE High Performance Extreme Computing Conference, HPEC 2020	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2022-07-14T17:57:19Z
dspace.orderedauthors	Fraser, S; Xu, H; Leiserson, CE	en_US
dspace.date.submission	2022-07-14T17:57:20Z
mit.license	OPEN_ACCESS_POLICY
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: prefix-hpec.pdf
Size: 966.6Kb
Format: PDF
Description: Accepted version


This item appears in the following Collection(s)

MIT Open Access Articles
