DSpace@MIT (MIT Libraries)


Coding approaches for maintaining data in unreliable network systems

Author(s)
Abdrashitov, Vitaly
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Muriel Médard.
Terms of use
MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
In recent years, the explosive growth in demand for data storage has made storage cost a critically important factor in the design of distributed storage systems (DSS). At the same time, the extent to which storage cost can be reduced is constrained by reliability requirements. The goal of this thesis is to further study the fundamental limits of maintaining data fault tolerance in a DSS spread across a communication network. In particular, we focus on efficient repair of storage nodes in redundant erasure-coded storage with low storage overhead. We consider two operating scenarios of the DSS.

First, we consider a clustered scenario, in which individual nodes are grouped into clusters representing data centers, storage clouds of different service providers, racks, etc. Network bandwidth within a cluster is assumed to be cheap relative to bandwidth between nodes in different clusters. We extend the regenerating codes framework of Dimakis et al. [1] to clustered topologies and introduce generalized regenerating codes (GRC), which perform node repair using helper data from both the local cluster and other clusters. We show the optimal trade-off between storage overhead and inter-cluster repair bandwidth, along with optimal code constructions. In addition, we find the minimum amount of intra-cluster repair bandwidth required to achieve a given point on the trade-off.

Second, we consider a scenario in which the underlying network has a highly varying topology. Such behavior is characteristic of peer-to-peer, content delivery, and ad hoc mobile networks. Because connectivity is limited and time-varying, the sources for node repair are scarce. We consider a stochastic model of failures in the storage, which also describes the random and opportunistic nature of selecting sources for node repair. We show that, even though repair opportunities are scarce, the data can be maintained, with practically high probability, through a large number of failures and repairs and for time periods far exceeding the typical lifespan of the data. The thesis also analyzes a random linear network coded (RLNC) approach to operating in such variable networks and demonstrates its high achievable rates, which outperform those of regenerating codes, and its robustness across a wide range of model and implementation assumptions and parameters, such as code rate, field size, repair bandwidth, and node distributions.
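
As a rough illustration of the storage versus repair-bandwidth trade-off mentioned in the abstract, the sketch below computes the two well-known extreme points of the classic (non-clustered) regenerating codes framework: the minimum-storage (MSR) and minimum-bandwidth (MBR) points. It does not reproduce the thesis's generalized regenerating codes or its clustered trade-off. The function names and the example parameters (file size M, reconstruction degree k, repair degree d) are assumptions chosen for illustration only.

```python
from fractions import Fraction


def _check(M, k, d):
    # Classic setting: any k nodes recover the file, d >= k helpers repair a node.
    if M <= 0 or k <= 0 or d < k:
        raise ValueError("need M > 0, k > 0 and d >= k")


def msr_point(M, k, d):
    """Minimum-storage regenerating point: returns (alpha, gamma).

    alpha is the data stored per node, gamma the total data downloaded
    to repair one failed node from d helpers.
    """
    _check(M, k, d)
    alpha = Fraction(M, k)
    gamma = Fraction(M * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(M, k, d):
    """Minimum-bandwidth regenerating point: returns (alpha, gamma).

    At this point a node stores exactly as much as it downloads on repair.
    """
    _check(M, k, d)
    alpha = gamma = Fraction(2 * M * d, k * (2 * d - k + 1))
    return alpha, gamma


if __name__ == "__main__":
    # Hypothetical example: a file of 12 units, k = 4, d = 6.
    M, k, d = 12, 4, 6
    for name, point in (("MSR", msr_point), ("MBR", mbr_point)):
        alpha, gamma = point(M, k, d)
        print(f"{name}: store {alpha} per node, download {gamma} per repair")
    # Naive repair of an erasure-coded node downloads the whole file.
    print(f"Naive repair: download {M} per repair")
```

With these example parameters, the MSR point stores 3 units per node and downloads 6 per repair, while the MBR point stores and downloads 4. Both are well below the 12 units needed to rebuild a node by first reconstructing the whole file.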
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.
 
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
 
Cataloged from student-submitted PDF version of thesis.
 
Includes bibliographical references (pages 113-117).
 
Date issued
2018
URI
http://hdl.handle.net/1721.1/117805
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Doctoral Theses
