dc.contributor.author: Pandey, Prashant
dc.contributor.author: Bender, Michael A.
dc.contributor.author: Conway, Alex
dc.contributor.author: Farach-Colton, Martin
dc.contributor.author: Kuszmaul, William
dc.contributor.author: Tagliavini, Guido
dc.contributor.author: Johnson, Rob
dc.date.accessioned: 2023-06-02T14:04:20Z
dc.date.available: 2023-06-02T14:04:20Z
dc.date.issued: 2023-05-30
dc.identifier.issn: 2836-6573
dc.identifier.uri: https://hdl.handle.net/1721.1/150847
dc.description.abstract: Modern hash table designs for DRAM and PMEM strive to minimize space while maximizing speed. The most important factor in speed is the number of cache lines accessed during updates and queries. On PMEM, there is an additional consideration, which is to minimize the number of writes, because on PMEM writes are more expensive than reads. This paper proposes two design objectives, stability and low associativity, that enable us to build hash tables that minimize cache-line accesses for all operations. A hash table is stable if it does not move items around, and a hash table has low associativity if there are only a few locations where an item can be stored. Low associativity ensures that queries need to examine only a few memory locations, and stability ensures that insertions write to very few cache lines. Stability also simplifies concurrency and, on PMEM, crash safety. We present IcebergHT, a fast, concurrent, space-efficient, and crash-safe (for PMEM) hash table based on the design principles of stability and low associativity. IcebergHT combines in-memory metadata with a new hashing technique, iceberg hashing, that is (1) space efficient, (2) stable, and (3) supports low associativity. In contrast, existing hash tables either modify numerous cache lines during insertions (e.g., cuckoo hashing), access numerous cache lines during queries (e.g., linear probing), or waste space (e.g., chaining). Moreover, the combination of (1)-(3) yields several emergent benefits: IcebergHT scales better than other hash tables, has excellent performance, and supports crash safety on PMEM. Our benchmarks show that IcebergHT has excellent performance both in DRAM and PMEM. In PMEM, IcebergHT insertions are 50% to 3× faster than state-of-the-art PMEM hash tables, such as Dash and CLHT, and queries are 20% to 2× faster. IcebergHT's space overhead is 17%, whereas Dash and CLHT have space overheads of 2× and 3×, respectively. IcebergHT also scaled linearly throughout our experiments and is crash safe. In DRAM, IcebergHT outperforms state-of-the-art hash tables libcuckoo and CLHT by almost 2× on insertions while offering good query throughput and much better space efficiency. (en_US)
dc.publisher: ACM (en_US)
dc.relation.isversionof: https://doi.org/10.1145/3588727 (en_US)
dc.rights: Creative Commons Attribution (en_US)
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ (en_US)
dc.source: Association for Computing Machinery (en_US)
dc.title: IcebergHT: High Performance Hash Tables Through Stability and Low Associativity (en_US)
dc.type: Article (en_US)
dc.identifier.citation: Pandey, Prashant, Bender, Michael A., Conway, Alex, Farach-Colton, Martin, Kuszmaul, William et al. 2023. "IcebergHT: High Performance Hash Tables Through Stability and Low Associativity." Proceedings of the ACM on Management of Data, 1 (1).
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal: Proceedings of the ACM on Management of Data (en_US)
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version (en_US)
dc.type.uri: http://purl.org/eprint/type/JournalArticle (en_US)
eprint.status: http://purl.org/eprint/status/PeerReviewed (en_US)
dc.date.updated: 2023-06-01T07:47:28Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2023-06-01T07:47:28Z
mit.journal.volume: 1 (en_US)
mit.journal.issue: 1 (en_US)
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed (en_US)

