dc.contributor.advisor: Samuel Madden. (en_US)
dc.contributor.author: Rawlani, Praynaa (en_US)
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. (en_US)
dc.date.accessioned: 2016-01-04T20:51:56Z
dc.date.available: 2016-01-04T20:51:56Z
dc.date.copyright: 2014 (en_US)
dc.date.issued: 2014 (en_US)
dc.identifier.uri: http://hdl.handle.net/1721.1/100670
dc.description: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. (en_US)
dc.description: Cataloged from PDF version of thesis. (en_US)
dc.description: Includes bibliographical references (pages 99-100). (en_US)
dc.description.abstract: Graph analytics has become increasingly popular in recent years. Conventionally, data is stored in relational databases that have been refined over decades, resulting in highly optimized data processing engines. However, the awkwardness of expressing iterative queries in SQL makes the relational query-processing model inadequate for graph analytics, leading to many alternative solutions. Our research explores the possibility of combining a more natural query model with relational databases for graph analytics. In particular, we bring a graph-natural, vertex-centric query interface to highly optimized column-oriented relational databases, thus providing the efficiency of relational engines and the ease of use of new graph systems. Throughout the thesis, we used stochastic gradient descent, a loss-minimization algorithm applied in many machine learning and graph analytics queries, as the example iterative algorithm. We implemented two different approaches for emulating a vertex-centric interface on a leading column-oriented database, Vertica: disk-based and main-memory-based. The disk-based solution stores data for each iteration in relational tables and allows for interleaving SQL queries with graph algorithms. The main-memory approach stores data in memory, allowing faster updates. We applied optimizations to both implementations, which included refining logical and physical query plans, applying algorithm-level improvements, and performing system-specific optimizations. The experiments and results show that the two implementations provide reasonable performance in comparison with popular graph processing systems. We present a detailed cost analysis of the two implementations and study the effect of each individual optimization on query performance. (en_US)
dc.description.statementofresponsibility: by Praynaa Rawlani. (en_US)
dc.format.extent: 100 pages (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Massachusetts Institute of Technology (en_US)
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. (en_US)
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 (en_US)
dc.subject: Electrical Engineering and Computer Science. (en_US)
dc.title: Graph analytics on relational databases (en_US)
dc.type: Thesis (en_US)
dc.description.degree: M. Eng. (en_US)
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 932127708 (en_US)

