dc.contributor.advisor Samuel R. Madden. en_US
dc.contributor.author Tatarowicz, Aubrey Lynn en_US
dc.contributor.other Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. en_US
dc.date.accessioned 2011-11-01T19:47:58Z
dc.date.available 2011-11-01T19:47:58Z
dc.date.copyright 2011 en_US
dc.date.issued 2011 en_US
dc.identifier.uri http://hdl.handle.net/1721.1/66813
dc.description Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. en_US
dc.description Cataloged from PDF version of thesis. en_US
dc.description Includes bibliographical references (p. 61-63). en_US
dc.description.abstract Data volumes are exploding, and storing such large amounts of data requires multiple machines. To address this explosion, storage systems like databases need to be distributed across many machines. Transactions that access a few tuples, often seen in web workloads such as Twitter, do not run optimally using traditional partitioning schemes [25]. Hence, increasing the number of machines often presents a bottleneck for workloads where each transaction accesses just a few tuples. Fine-grained partitioning can fix the scale-out problem introduced by simplistic partitioning schemes. In this thesis, I introduce a design of a distributed query execution system that handles fine-grained partitioning using look-up tables. A look-up table is a mapping from a tuple attribute to the tuple's back-end location, which allows fine-grained partitioning to be supported. I show through both synthetic and real data that fine-grained partitioning enabled by look-up tables can increase the throughput of a distributed database system. My goal is scale-out with the number of machines used in the distributed database. I show in my experiments that scale-out can be reached if an ideal partitioning can be created. I test my implementation on a Wikipedia data set. In this example, I show three times better performance than the optimal hash partitioning scheme with eight back-ends, and signs of continued scale-out with more machines. Through the use of large data sets and by projecting my results onto even larger data sets, I show that look-up tables can be used to represent complex partitioning schemes for databases containing billions of tuples. en_US
dc.description.statementofresponsibility by Aubrey Lynn Tatarowicz. en_US
dc.format.extent 63 p. en_US
dc.language.iso eng en_US
dc.publisher Massachusetts Institute of Technology en_US
dc.rights M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. en_US
dc.rights.uri http://dspace.mit.edu/handle/1721.1/7582 en_US
dc.subject Electrical Engineering and Computer Science. en_US
dc.title Look-up tables : the benefit of enabling fine-grained routing and load balancing en_US
dc.title.alternative Benefit of enabling fine-grained routing and load balancing en_US
dc.type Thesis en_US
dc.description.degree M.Eng. en_US
dc.contributor.department Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc 757169991 en_US