Elastic database systems
Author(s)
Taft, Rebecca (Rebecca Yale)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Michael R. Stonebraker.
Abstract
Distributed on-line transaction processing (OLTP) database management systems (DBMSs) are a critical part of the operation of large enterprises. These systems often serve time-varying workloads due to daily, weekly or seasonal fluctuations in load, or because of rapid growth in demand due to a company's business success. In addition, many OLTP workloads are heavily skewed to "hot" tuples or ranges of tuples. For example, the majority of NYSE volume involves only 40 stocks. To manage such fluctuations, many companies currently provision database servers for peak demand. This approach is wasteful and not resilient to extreme skew or large workload spikes. To be both efficient and resilient, a distributed OLTP DBMS must be elastic; that is, it must be able to expand and contract its cluster of servers as demand fluctuates, and dynamically balance load as hot tuples vary over time. This thesis presents two elastic OLTP DBMSs, called E-Store and P-Store, which demonstrate the benefits of elasticity for distributed OLTP DBMSs on different types of workloads. E-Store automatically scales the database cluster in response to demand spikes, periodic events, and gradual changes in an application's workload, but it is particularly well-suited for managing hot spots. In contrast to traditional single-tier hash and range partitioning strategies, E-Store manages hot spots through a two-tier data placement strategy: cold data is distributed in large chunks, while smaller ranges of hot tuples are assigned explicitly to individual nodes. P-Store is an elastic OLTP DBMS that is designed for a subset of OLTP applications in which load varies predictably. For these applications, P-Store performs better than reactive systems like E-Store, because P-Store uses predictive modeling to reconfigure the system in advance of predicted load changes. The experimental evaluation shows the efficacy of the two systems under variations in load across a cluster of machines. 
Compared to single-tier approaches, E-Store improves throughput by up to 130% while reducing latency by 80%. On a predictable workload, P-Store outperforms a purely reactive system by causing 72% fewer latency violations, and achieves performance comparable to static allocation for peak demand while using 50% fewer servers.
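The two-tier placement idea described in the abstract can be illustrated with a minimal sketch. This is not E-Store's implementation: the class name `TwoTierPlacement`, the integer keys, the fixed `block_size`, and the node names are all hypothetical, chosen only to show a lookup that checks explicitly assigned hot ranges first and falls back to coarse cold blocks. It assumes the hot ranges do not overlap.

```python
from bisect import bisect_right


class TwoTierPlacement:
    """Illustrative two-tier lookup: small hot ranges override large cold blocks."""

    def __init__(self, block_size, cold_block_owner, hot_ranges=None):
        # Cold tier: keys are grouped into fixed-size blocks, one owner per block.
        self.block_size = block_size
        self.cold_block_owner = cold_block_owner  # block id -> node name
        # Hot tier: (start, end_exclusive, node) ranges, assumed non-overlapping.
        self.hot = sorted(hot_ranges or [])
        self._starts = [r[0] for r in self.hot]

    def node_for(self, key):
        # Find the last hot range whose start is <= key, if any.
        i = bisect_right(self._starts, key) - 1
        if i >= 0:
            start, end, node = self.hot[i]
            if start <= key < end:
                return node  # key is hot: explicit per-range assignment
        # Otherwise the key is cold: route by its coarse block.
        return self.cold_block_owner[key // self.block_size]


placement = TwoTierPlacement(
    block_size=1000,
    cold_block_owner={0: "node-a", 1: "node-b"},
    hot_ranges=[(40, 43, "node-c")],  # a small hot range moved off node-a
)
print(placement.node_for(41))    # hot key
print(placement.node_for(43))    # just outside the hot range
print(placement.node_for(1500))  # cold key in block 1
```

In this sketch, relieving a hot spot only requires adding or moving a small entry in the hot tier, while the bulk of the data stays where its cold block places it.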
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. Cataloged from PDF version of thesis. Includes bibliographical references (pages 131-139).
Date issued
2017
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.