BlendDB : blending table layouts to support efficient browsing of relational databases

Marcus, Adam, Ph. D. Massachusetts Institute of Technology

dc.contributor.advisor	Samuel R. Madden and David R. Karger.	en_US
dc.contributor.author	Marcus, Adam, Ph. D. Massachusetts Institute of Technology	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2009-06-30T16:33:11Z
dc.date.available	2009-06-30T16:33:11Z
dc.date.copyright	2008	en_US
dc.date.issued	2008	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/45890
dc.description	Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.	en_US
dc.description	Includes bibliographical references (p. 63-65).	en_US
dc.description.abstract	The physical implementation of most relational databases follows their logical description, where each relation is stored in its own file or collection of files on disk. Such an implementation is good for queries that filter or aggregate large portions of a single table, and provides reasonable performance for queries that join many records from one table to another. It is much less ideal, however, for join queries that follow paths from a small number of tuples in one table to small collections of tuples in other tables to accumulate facts about a related collection of objects (e.g., co-authors of a particular author in a publications database), since answering such queries involves one or more random I/Os per table involved in the path. If the primary workload of a database consists of many such path queries, as is likely to be the case when supporting browsing-oriented applications, performance will be quite poor. This thesis focuses on optimizing the performance of these kinds of path queries in a system called BlendDB, a relational database that supports on-disk co-location of tuples from different relations. To make BlendDB efficient, the thesis will propose a clustering algorithm that, given knowledge of the database workload, co-locates the tuples of multiple relations if they join along common paths. To support the claim of improved performance, the thesis will include experiments in which BlendDB provides better performance than traditional relational databases on queries against the IMDB movie dataset. Additionally, this thesis will show that BlendDB provides commensurate performance to materialized views while using less disk space, and can achieve better performance than materialized views in exchange for more disk space when users navigate between related items in the database.	en_US
dc.description.statementofresponsibility	by Adam Marcus.	en_US
dc.format.extent	65 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	BlendDB : blending table layouts to support efficient browsing of relational databases	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	320446978	en_US

Files in this item

Name:: 320446978-MIT.pdf
Size:: 7.214Mb
Format:: PDF
Description:: Full printable version

View/Open

This item appears in the following Collection(s)

Graduate Theses

Show simple item record