MIT Libraries logoDSpace@MIT

CheckSync: Transparent Primary-Backup Replication for Go Applications Using Checkpoints

Author(s)
Kaashoek, Nicolaas M.
Advisor
Morris, Robert Tappan
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Many distributed systems have singular, mission-critical components, such as the MapReduce coordinator or a lock server. Due to their importance, these components require high availability and fault tolerance. The most common way to achieve this is through replicated state machines, an approach in which the application is replicated across multiple machines. There could be as few as two machines in a primary/backup arrangement, or more to reduce the risk of downtime. Each instance starts in the same state and then advances to new states in the same order. This allows for easy failover to one of the replicas if the primary machine fails. The use of replicated state machines, however, requires an application to expose the correct stream of operations to ensure that each machine ends up in the same final state. This abstraction is not well-suited to all applications, as it can’t support multithreading and can add extra complexity for application developers. This thesis proposes CheckSync, a protocol for achieving high availability and fault tolerance via the use of checkpoints. CheckSync is designed with transparency as a primary goal: applications require little to no modification to use it. It achieves this by checkpointing the memory of an application and replicating that state from a primary to a backup. Upon failure, the backup resumes from the checkpoint and continues running. CheckSync’s transparency sets it apart. Unlike the operation stream required for replicated state machines, CheckSync doesn’t place constraints on the design of the application. It can suspend and capture the memory of Go applications without knowledge of the specifics of the application, as well as restore them on the backup. This is accomplished through careful analysis and recreation of the application’s memory space, as well as efficient transmission of the checkpoint files to minimize performance overhead.
CheckSync is evaluated with three different applications, and supports all three without any changes to their code.
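
The checkpoint-and-resume flow described in the abstract can be illustrated with a small Go sketch. This is not CheckSync's mechanism: per the abstract, CheckSync captures the application's memory transparently, whereas this sketch assumes an explicit, hypothetical `State` type serialized with the standard `encoding/gob` package. The names `State`, `Checkpoint`, and `Restore` are illustrative assumptions, not taken from the thesis.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// State is a hypothetical application state used only for illustration.
// CheckSync itself captures process memory and needs no such type.
type State struct {
	Counter int
	Locks   map[string]string // lock name -> holder
}

// Checkpoint serializes the primary's current state into a byte slice.
func Checkpoint(s *State) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// Restore rebuilds a state on the backup from a checkpoint.
func Restore(data []byte) (*State, error) {
	s := &State{}
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(s); err != nil {
		return nil, err
	}
	return s, nil
}

func main() {
	primary := &State{Counter: 1, Locks: map[string]string{"a": "client-1"}}

	// The primary takes a checkpoint; in a real system the bytes would be
	// sent over the network to the backup.
	ckpt, err := Checkpoint(primary)
	if err != nil {
		panic(err)
	}

	// In this sketch, changes made after the checkpoint are not captured.
	primary.Counter++

	// After a primary failure, the backup resumes from the last checkpoint.
	backup, err := Restore(ckpt)
	if err != nil {
		panic(err)
	}
	fmt.Println(backup.Counter, backup.Locks["a"])
}
```

Unlike a replicated state machine, nothing here requires the application to expose an ordered stream of operations; the backup only needs the most recent snapshot of the state.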
Date issued
2021-06
URI
https://hdl.handle.net/1721.1/139242
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
