Work-Efficient Parallel Algorithms for Accurate Floating-Point Prefix Sums

Fraser, Sean; Xu, Helen; Leiserson, Charles E

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

© 2020 IEEE. Existing work-efficient parallel algorithms for floating-point prefix sums exhibit either good performance or good numerical accuracy, but not both. Consequently, prefix-sum algorithms cannot easily be used in scientific-computing applications that require both high performance and accuracy. We have designed and implemented two new algorithms, called CAST_BLK and PAIR_BLK, whose accuracy is significantly higher than that of the high-performing prefix-sum algorithm from the Problem Based Benchmark Suite (PBBS), while running with comparable performance on modern multicore machines. Specifically, the root mean squared error of the PBBS code on a large array of uniformly distributed 64-bit floating-point numbers is 8 times higher than that of CAST_BLK and 5.8 times higher than that of PAIR_BLK. These two codes employ the PBBS three-stage strategy for performance, but they are designed to achieve high accuracy, both theoretically and in practice. A vectorization enhancement to these two scalar codes trades off a small amount of accuracy to match or outperform the PBBS code while still maintaining lower error.
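The abstract names a three-stage strategy but does not spell it out. As a rough illustration only, the sketch below assumes the conventional blocked scan: reduce each block, scan the block sums, then scan each block locally from its offset. It is sequential Python, not the PBBS code and not CAST_BLK or PAIR_BLK. The names `blocked_prefix_sum`, `rmse_vs_reference`, and `block_size` are hypothetical. Stages 1 and 3 are the parts that could run in parallel across blocks.

```python
import math
from itertools import accumulate


def blocked_prefix_sum(xs, block_size=4):
    """Inclusive prefix sum via an assumed three-stage blocked scheme."""
    blocks = [xs[i:i + block_size] for i in range(0, len(xs), block_size)]

    # Stage 1: reduce each block to one sum (independent per block).
    block_sums = []
    for block in blocks:
        total = 0.0
        for x in block:
            total += x  # plain left-to-right floating-point addition
        block_sums.append(total)

    # Stage 2: exclusive scan of the block sums gives each block's offset.
    offsets = [0.0] + list(accumulate(block_sums))[:-1]

    # Stage 3: scan each block locally, seeded with its offset
    # (independent per block).
    out = []
    for block, offset in zip(blocks, offsets):
        running = offset
        for x in block:
            running += x
            out.append(running)
    return out


def rmse_vs_reference(xs, result):
    """Root mean squared error against prefix sums computed with math.fsum.

    Quadratic time, so only suitable for small non-empty inputs.
    """
    reference = [math.fsum(xs[:i + 1]) for i in range(len(xs))]
    squared = sum((r - e) ** 2 for r, e in zip(result, reference))
    return math.sqrt(squared / len(xs))
```

Calling `blocked_prefix_sum([1.0, 2.0, 3.0, 4.0, 5.0], block_size=2)` returns `[1.0, 3.0, 6.0, 10.0, 15.0]`. On inputs that are not exactly representable, `rmse_vs_reference` gives one way to measure the kind of error the abstract reports, under the assumption that `math.fsum` is an acceptable reference.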

Date issued

2020

URI

https://hdl.handle.net/1721.1/143740

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory

Journal

2020 IEEE High Performance Extreme Computing Conference, HPEC 2020

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Citation

Fraser, Sean, Xu, Helen and Leiserson, Charles E. 2020. "Work-Efficient Parallel Algorithms for Accurate Floating-Point Prefix Sums." 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020.

Version: Author's final manuscript

Collections

MIT Open Access Articles