HARBOR : an integrated approach to recovery and high availability in an updatable, distributed data warehouse

Lau, Edmond, M. Eng. Massachusetts Institute of Technology

dc.contributor.advisor	Samuel Madden.	en_US
dc.contributor.author	Lau, Edmond, M. Eng. Massachusetts Institute of Technology	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2007-05-16T15:22:43Z
dc.date.available	2007-05-16T15:22:43Z
dc.date.copyright	2006	en_US
dc.date.issued	2006	en_US
dc.identifier.uri	http://dspace.mit.edu/handle/1721.1/36800	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/36800
dc.description	Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.	en_US
dc.description	This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.	en_US
dc.description	Includes bibliographical references (p. 99-105).	en_US
dc.description.abstract	Any highly available data warehouse will use some form of data replication to ensure that it can continue to service queries despite machine failures. In this thesis, I demonstrate that it is possible to leverage the data replication available in these environments to build a simple yet efficient crash recovery mechanism that revives a crashed site by querying remote replicas for missing updates. My new integrated approach to recovery and high availability, called HARBOR (High Availability and Replication-Based Online Recovery), targets updatable data warehouses and offers an attractive alternative to the widely used log-based crash recovery algorithms found in existing database systems. Aside from its simplicity over log-based approaches, HARBOR also avoids the runtime overhead of maintaining an on-disk log, accomplishes recovery without quiescing the system, allows replicated data to be stored in non-identical formats, and supports the parallel recovery of multiple sites and database objects. To evaluate HARBOR's feasibility, I compare HARBOR's runtime overhead and recovery performance with those of two-phase commit and ARIES, the gold standard for log-based recovery, on a four-node distributed database system that I have implemented.	en_US
dc.description.abstract	(cont.) My experiments show that HARBOR incurs lower runtime overhead because it does not require log writes to be forced to disk during transaction commit. Furthermore, they indicate that HARBOR's recovery performance is comparable to ARIES's performance on many workloads and even surpasses it on characteristic warehouse workloads with few updates to historical data. The results are highly encouraging and suggest that my integrated approach is quite tenable.	en_US
dc.description.statementofresponsibility	by Edmond Lau.	en_US
dc.format.extent	105 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/36800	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	HARBOR : an integrated approach to recovery and high availability in an updatable, distributed data warehouse	en_US
dc.title.alternative	Integrated approach to recovery and high availability in an updatable, distributed data warehouse	en_US
dc.type	Thesis	en_US
dc.description.degree	M.Eng.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	79653535	en_US

Files in this item

Name:: 79653535-MIT.pdf
Size:: 537.1Kb
Format:: PDF
Description:: Full printable version

View/Open

This item appears in the following Collection(s)

Graduate Theses

Show simple item record