IcebergHT: High Performance Hash Tables Through Stability and Low Associativity
Author(s)
Pandey, Prashant; Bender, Michael A.; Conway, Alex; Farach-Colton, Martin; Kuszmaul, William; Tagliavini, Guido; Johnson, Rob; ... Show more Show less
Download3588727.pdf (783.7Kb)
Publisher with Creative Commons License
Publisher with Creative Commons License
Creative Commons Attribution
Terms of use
Metadata
Show full item recordAbstract
Modern hash table designs for DRAM and PMEM strive to minimize space while maximizing speed. The most important factor in speed is the number of cache lines accessed during updates and queries. On PMEM, there is an additional consideration, which is to minimize the number of writes, because on PMEM writes are more expensive than reads.
This paper proposes two design objectives, stability and low-associativity, that enable us to build hash tables that minimize cache-line accesses for all operations. A hash table is stable if it does not move items around, and a hash table has low associativity if there are only a few locations where an item can be stored. Low associativity ensures that queries need to examine only a few memory locations, and stability ensures that insertions write to very few cache lines. Stability also simplifies concurrency and, on PMEM, crash safety.
We present IcebergHT, a fast, concurrent, space-efficient, and crash-safe (for PMEM) hash table based on the design principles of stability and low associativity. IcebergHT combines in-memory metadata with a new hashing technique, iceberg hashing, that is (1) space efficient, (2) stable, and (3) supports low associativity. In contrast, existing hash-tables either modify numerous cache lines during insertions (e.g. cuckoo hashing), access numerous cache lines during queries (e.g. linear probing), or waste space (e.g. chaining). Moreover, the combination of (1)-(3) yields several emergent benefits: IcebergHT scales better than other hash tables, has excellent performance, and supports crash-safety on PMEM.
Our benchmarks show that IcebergHT has excellent performance both in DRAM and PMEM. In PMEM, IcebergHT insertions are 50% to 3× faster than state-of-the-art PMEM hash tables, such as Dash and CLHT, and queries are 20% to 2× faster. IcebergHT space overhead is 17%, whereas Dash and CLHT have space overheads of 2× and 3×, respectively. IcebergHT also scaled linearly throughout our experiments and is crash safe. In DRAM, IcebergHT outperforms state-of-the-art hash tables libcuckoo and CLHT by almost 2× on insertions while offering good query throughput and much better space efficiency.
Date issued
2023-05-30Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence LaboratoryJournal
Proceedings of the ACM on Management of Data
Publisher
ACM
Citation
Pandey, Prashant, Bender, Michael A., Conway, Alex, Farach-Colton, Martin, Kuszmaul, William et al. 2023. "IcebergHT: High Performance Hash Tables Through Stability and Low Associativity." Proceedings of the ACM on Management of Data, 1 (1).
Version: Final published version
ISSN
2836-6573
Collections
The following license files are associated with this item: