DSpace@MIT (MIT Libraries)


Look-up tables : the benefit of enabling fine-grained routing and load balancing

Author(s)
Tatarowicz, Aubrey Lynn
Alternative title
Benefit of enabling fine-grained routing and load balancing
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Samuel R. Madden.
Terms of use
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
Data volumes are exploding, and storing such large amounts of data requires multiple machines. To address this explosion, storage systems such as databases need to be distributed across many machines. Transactions that access only a few tuples, often seen in web workloads such as Twitter, do not run optimally under traditional partitioning schemes [25]. Hence, adding machines often creates a bottleneck for workloads in which each transaction accesses just a few tuples. Fine-grained partitioning can fix the scale-out problem introduced by simplistic partitioning schemes. In this thesis, I introduce a design for a distributed query execution system that handles fine-grained partitioning using look-up tables. A look-up table is a mapping from a tuple attribute to the back-end location of that tuple, which makes fine-grained partitioning possible. I show through both synthetic and real data that fine-grained partitioning enabled by look-up tables can increase the throughput of a distributed database system. My goal is for throughput to scale out with the number of machines used in the distributed database. My experiments show that scale-out can be reached if an ideal partitioning can be created. I test my implementation on a Wikipedia data set, where it achieves three times better performance than the optimal hash partitioning scheme with eight back-ends and shows signs of continued scale-out with more machines. By using large data sets and projecting my results onto even larger ones, I show that look-up tables can represent complex partitioning schemes for databases containing billions of tuples.
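The routing idea in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the class name `LookupTableRouter`, the helper `hash_route`, the integer keys, and the fallback to modulo hashing for unmapped keys are all assumptions made for illustration. It shows only that an explicit key-to-back-end mapping can co-locate tuples that a transaction accesses together, whereas hash partitioning may scatter them.

```python
from typing import Dict, Iterable, Optional, Set


def hash_route(key: int, num_backends: int) -> int:
    """Hash partitioning stand-in (assumed): place a key by modulo."""
    return key % num_backends


class LookupTableRouter:
    """Hypothetical router: maps a tuple attribute (key) to a back-end id."""

    def __init__(self, num_backends: int,
                 table: Optional[Dict[int, int]] = None) -> None:
        self.num_backends = num_backends
        self.table: Dict[int, int] = dict(table or {})

    def assign(self, key: int, backend: int) -> None:
        """Record an explicit, fine-grained placement for one key."""
        if not 0 <= backend < self.num_backends:
            raise ValueError("backend id out of range")
        self.table[key] = backend

    def route(self, key: int) -> int:
        """Return the back-end for a key; unmapped keys fall back to hashing
        (an assumption of this sketch)."""
        return self.table.get(key, hash_route(key, self.num_backends))

    def backends_for(self, keys: Iterable[int]) -> Set[int]:
        """Set of back-ends a transaction touching these keys must contact."""
        return {self.route(k) for k in keys}


# Tuples 1, 6 and 11 are accessed together by one transaction.
router = LookupTableRouter(num_backends=4)
for k in (1, 6, 11):
    router.assign(k, 0)  # co-locate them on back-end 0

hashed = {hash_route(k, 4) for k in (1, 6, 11)}
print(len(hashed), len(router.backends_for([1, 6, 11])))
```

In this toy case the hash scheme spreads the three tuples over three back-ends, while the look-up table lets the transaction run against a single back-end. How a good assignment is computed, and how the table is stored at scale, are outside the scope of this sketch.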
Description
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.
 
Cataloged from PDF version of thesis.
 
Includes bibliographical references (p. 61-63).
 
Date issued
2011
URI
http://hdl.handle.net/1721.1/66813
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.