Show simple item record

dc.contributor.advisor: Kraska, Tim
dc.contributor.author: Cen, Lujing
dc.date.accessioned: 2022-01-14T14:55:23Z
dc.date.available: 2022-01-14T14:55:23Z
dc.date.issued: 2021-06
dc.date.submitted: 2021-06-17T20:12:59.977Z
dc.identifier.uri: https://hdl.handle.net/1721.1/139184
dc.description.abstract: As the demand for data outpaces diminishing improvements in the hardware used to store and query them, we must find intelligent ways to increase database performance on existing systems. This project is focused on integrating learned encodings into SageDB, a database capable of accelerating queries by analyzing and adapting to different workloads. Encodings improve query performance through lossless compression, thereby reducing I/O time during scans. Different encoding types exhibit different characteristics depending on properties of the underlying data and the hardware on which queries are executed. We implement a variety of common encodings in SageDB and propose a learning-based approach to select the optimal encoding for a given data block by combining block-level statistics with sampling. In addition, we demonstrate how to leverage properties of encoded data along with vectorized processing units in modern CPUs to more efficiently execute queries without the need to decode every value.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Learned Encodings in SageDB
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science
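
The abstract mentions lossless encodings and answering queries without decoding every value. As a rough illustration only, the sketch below uses run-length encoding, one common encoding type, and counts matching rows directly on the encoded runs. This is not code from the thesis or from SageDB; the function names (`rle_encode`, `rle_decode`, `count_equal`) and the sample block are hypothetical, and the thesis's learned encoding selection and vectorized execution are not shown.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original values."""
    decoded = []
    for value, length in runs:
        decoded.extend([value] * length)
    return decoded


def count_equal(runs, target):
    """Count rows equal to target by summing run lengths.

    Each run is inspected once, so no individual value is decoded.
    """
    return sum(length for value, length in runs if value == target)


# Hypothetical data block with long runs, where run-length encoding helps.
block = [7, 7, 7, 3, 3, 7, 7, 7, 7]
runs = rle_encode(block)
```

With long runs, `count_equal` touches one pair per run instead of one value per row, which is the kind of saving the abstract refers to when it describes executing queries on encoded data.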

