DSpace@MIT (MIT Libraries)

Enhancing Cloud Database Performance: General-Purpose Compression and Workload-Driven Layout

Author(s)
Piszczek, Miloslawa
Download: Thesis PDF (2.519Mb)
Advisor
Kraska, Tim
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Cloud-based disaggregated database systems, which divide data across a data layer and a storage layer connected by network calls, are popular for analytical query workloads. This thesis explores two topics critical to building performant systems of this type: space optimization and latency minimization. First, I propose ColumnConstruct, a general-purpose machine learning compression method that uses a novel information-maximizing approach to building input features. ColumnConstruct is competitive with existing ML compression methods for categorical data, but it cannot perform lossless compression on arbitrary tabular data. This limitation, together with the additional compression and decompression latency, makes it insufficient to improve query latency within a database management system. Next, I investigate whether workload-aware data layout combined with caching can improve query times without the need for ML-based compression or storage-layer computation pushdown. I show that for small cache sizes and homogeneous query sets, a workload-aware layout combined with existing compression methods can be more effective than computation pushdown, without relying on particular features of the data storage layer.
Date issued
2024-02
URI
https://hdl.handle.net/1721.1/153856
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
