
dc.contributor.author: Gao, Jenny
dc.contributor.author: Ding, Jialin
dc.contributor.author: Sudhir, Sivaprasad
dc.contributor.author: Madden, Samuel
dc.date.accessioned: 2024-07-09T16:30:32Z
dc.date.available: 2024-07-09T16:30:32Z
dc.date.issued: 2024-06-09
dc.identifier.isbn: 979-8-4007-0680-6
dc.identifier.uri: https://hdl.handle.net/1721.1/155538
dc.description: aiDM ’24, June 14, 2024, Santiago, AA, Chile [en_US]
dc.description.abstract: To improve the performance of scanning and filtering, modern analytic data systems such as Amazon Redshift and Databricks Delta Lake give users the ability to sort a table using a Z-order, which maps each row to a "Z-value" by interleaving the binary representations of the row's attributes, then sorts rows by their Z-values. These Z-order layouts essentially sort the table by multiple columns simultaneously and can achieve superior performance to single-column sort orders when the user's queries filter over multiple columns. However, the user shoulders the burden of manually selecting the columns to include in the Z-order, and a poor choice of columns can significantly degrade performance. Furthermore, these systems treat all columns included in the Z-order as equally important, which often does not result in the best performance due to the unequal impact that different columns have on query performance. In this work, we investigate the performance impact of using Z-orders that place unequal importance on columns: instead of using an equal number of bits from each column in the Z-value interleaving, we allow unequal bit allocation. We introduce a technique that uses Bayesian optimization to automatically learn the best bit allocation for a Z-order layout on a given dataset and query workload. Z-order layouts using our learned bit allocations outperform equal-bit Z-orders by up to 1.6× in query runtime and up to 2× in rows scanned. [en_US]
dc.publisher: ACM [en_US]
dc.relation.isversionof: 10.1145/3663742.3663975 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: Learning Bit Allocations for Z-Order Layouts in Analytic Data Systems [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Gao, Jenny, Ding, Jialin, Sudhir, Sivaprasad and Madden, Samuel. 2024. "Learning Bit Allocations for Z-Order Layouts in Analytic Data Systems."
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2024-07-01T08:00:58Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2024-07-01T08:00:58Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]
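
Illustrative note on the abstract: the Z-value construction it describes (interleaving the bits of a row's attributes, optionally taking a different number of bits from each column) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `z_value` and `z_order`, the fixed `width` of each column value, and the round-robin interleaving pattern are all hypothetical choices made for illustration, and the record does not specify how the paper orders bits when allocations are unequal.

```python
from typing import List, Sequence, Tuple


def z_value(row: Sequence[int], bits: Sequence[int], width: int = 8) -> int:
    """Build a Z-value from the most significant bits of each column.

    Assumptions (not taken from the paper): every column value is an
    unsigned integer that fits in `width` bits, and bits are interleaved
    round-robin, most significant first. A column stops contributing
    once its allocation bits[i] is used up.
    """
    if len(row) != len(bits):
        raise ValueError("row and bits must have the same length")
    if any(b < 0 or b > width for b in bits):
        raise ValueError("each allocation must be between 0 and width")

    taken = [0] * len(row)  # bits already consumed from each column
    z = 0
    while any(t < b for t, b in zip(taken, bits)):
        for i, value in enumerate(row):
            if taken[i] < bits[i]:
                # Next unused bit of column i, counting down from the top.
                bit = (value >> (width - 1 - taken[i])) & 1
                z = (z << 1) | bit
                taken[i] += 1
    return z


def z_order(rows: Sequence[Tuple[int, ...]], bits: Sequence[int],
            width: int = 8) -> List[Tuple[int, ...]]:
    """Sort rows by their Z-values under the given bit allocation."""
    return sorted(rows, key=lambda r: z_value(r, bits, width))


if __name__ == "__main__":
    row = (0b11000000, 0b01000000)
    print(z_value(row, (2, 2)))  # equal allocation: two bits per column
    print(z_value(row, (3, 1)))  # unequal allocation favoring column 0
```

With an equal allocation such as `(2, 2)`, both columns influence the sort order to the same degree; an unequal allocation such as `(3, 1)` lets the first column dominate, which is the degree of freedom the abstract says is tuned by Bayesian optimization. The search over allocations itself is not sketched here.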

