Toward sustainable networking : coded storage and high-traffic networks
Author(s): Ferner, Ulric John
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Current projections indicate that the worldwide data center (DC) industry will require a quadrupling of capacity by the year 2020. Modern DCs can manage up to hundreds of thousands of servers, storing petabytes of data and concurrently serving thousands of users. Coded storage is a strategy proposed in the last decade that may be able to improve the performance of these growing systems. Broadly, we classify storage systems into those that can be modeled as static, or traffic-agnostic, and those that are modeled as high-traffic networks, which experience significant traffic congestion. Prior work on coded storage has focused on code construction and traffic-agnostic performance analysis.
• Traffic-agnostic coded storage: Significant work exists on code construction and on traffic-agnostic performance analysis of coded storage at the network level. Intuitive examples, problem frameworks, performance bounds, and a plethora of practical codes exist in this setting. The vast majority of work in this area considers coded storage for single data repair events, in which drives permanently fail, as per the seminal work of Dimakis et al. [2, 3], and recent work considers probabilistic drive failures, where failures are independent of incoming traffic.
• Coded storage in high-traffic networks: This thesis proposes coded storage as a traffic congestion mitigation strategy, and seeks to explore the potential benefits of doing so. To the author's knowledge, this is the first thesis on coded storage in high-traffic networks, by which we mean networks with significant traffic congestion. When this thesis began, this sub-area had no prior work. In particular, this area had no intuitive examples, problem formulations, models, or results.
We will argue that some kind of queueing model can be applied to high-traffic networks that use coded storage.
Strictly speaking, such queueing models could also be applied to networks without significant traffic congestion, but they may not be necessary for analytical purposes. As of May 2014, a handful of papers on coded storage to reduce traffic congestion have been published by various authors. Each paper is referenced and discussed in the most closely relevant technical chapter. This thesis combines academic literature from queueing theory, coding theory, and storage systems engineering. Broadly, in each technical chapter, contributions are threefold:
• We develop models and tools for high-traffic networks that allow tractable analysis of coded storage.
• Using various queueing models, we explore mechanisms by which coded storage could improve user management, and quantify this potential. The coded storage schemes we propose code across chunks as they are commonly laid out on drives today, and are generally application-agnostic. Coding structures are not optimized (with the exception of Chapter 5).
• In addition, in Chapter 5, we explore coded storage structures that are application-specific, and propose a multi-resolution coding approach for video streaming.
Metrics considered include blocking probability, saturation probability, average delay, and maximum stable throughput. No claim is made about the best or most general metric. Keeping results DC-centric and relatively application-agnostic also allows us to make fewer application-specific assumptions, which may make it easier for our insights to be applied to a larger set of DC types, such as single- and multi-tenant DCs, or those with and without outsourced server operations. Further, this thesis focuses on understanding the mechanisms for coded storage gains and quantifying them; i.e., it is a first cut at answering the question, "Is it worth doing coded storage in high-traffic networks at all?"
This is in contrast to a different but related problem, which is to optimize coding structures independent of absolute performance. To guide the reader through the thesis, Appendix A provides a summary roadmap of the technical assumptions made in each chapter. Specific contributions divided by chapter follow.
In Chapter 1 we introduce the reader to coded storage and high-traffic networks. We review the current storage technological landscape, detail related work, and present the intuition behind this thesis.
In Chapter 2 we introduce the reader to queueing analysis for coded storage. We present a blocking drive model, which can provide simple and exact analytical solutions to blocking performance in various drive networks. We analyze the blocking performance of uncoded and coded storage systems, and demonstrate how to apply such analysis to striped file systems. This analysis holds when incoming read requests are for individual chunks and when the performance metric is system blocking probability. Blocking probability savings of up to an order of magnitude are observed.
Building upon and extending Chapter 2, in Chapter 3 we present a system model for bulk, as opposed to individual, read requests. In this model, individual drives have the blocking characteristics of Chapter 2. Because arrivals come in bulk, we consider the average scheduling delay as our performance metric, and for analytical tractability we consider only one chunk per drive. We analyze a block-based code and a stochastic scheduling algorithm that is beneficial in the case of continuous chunk read patterns. In particular, we demonstrate that in systems with continuous chunk reads, when drive blocking is either independent of traffic or caused by traffic congestion, block coded storage can reduce average download time by 10-66%, given modern system parameters. However, a distinction should be made between systems with continuous and those with interrupted chunk read patterns.
For interrupted chunk read systems, given an allocation algorithm that performs well for continuous reads, block coded storage performance can be worse than replication; numerical illustrations show relative losses over 66%.
In Chapter 4 we extend Chapter 3 to consider optimal scheduling and rate region (RR) analysis. We develop a queued cross-bar network (QCN) model that can be used to map a storage network with arbitrary file layouts across drives into a moded queueing network. All results in this chapter assume drives with deterministic read times. Our QCN model also considers drive-to-user transmission traffic patterns, which prior chapters do not. Chapter 4 presents an offline scheduling algorithm that is rate optimal for uncoded storage. For coded storage, we develop an RR upper bound, for which we provide an intuitive interpretation, and we show that this bound is achievable for particular code structures. Numerical results illustrate that the RR of coded storage can subsume that of uncoded storage, with increases in volume averaging 50% across traffic patterns.
Chapter 5 departs from prior chapters in considering application-specific structures. We explore alternative coding structures and propose a multi-resolution (MR) coded storage system for video streaming. These results hold for drives with general service distributions. Due to inherent asymmetry in MR video traffic statistics, in this chapter we optimize chunk layout strategies as a function of incoming traffic statistics. Our performance metric is saturation probability, a variation on blocking probability that accounts for asymmetry in MR demand statistics. Numerical results illustrate an order of magnitude reduction in saturation probability.
Chapter 6 wraps up the thesis by taking a systems engineering view of the potential energy and operating-cost savings that coded storage may offer enterprise DCs owing to reductions in blocking probability.
Finally, Chapter 7 discusses potential future research extensions, and concludes the thesis.
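As a rough illustration of why coding can lower the blocking probability discussed for Chapter 2, the sketch below compares a single-copy chunk, a replicated chunk, and a chunk protected by a systematic (n, k) MDS code. It is a toy model and not the thesis's blocking drive model: it assumes each drive is busy independently with the same probability p, whereas the thesis derives blocking from queueing under traffic. The function names and the parameter values are hypothetical.

```python
from math import comb


def blocking_uncoded(p: float) -> float:
    """One copy on one drive: the read is blocked whenever that drive is busy."""
    return p


def blocking_replicated(p: float, copies: int) -> float:
    """Replication: the read is blocked only if every replica's drive is busy."""
    return p ** copies


def blocking_coded(p: float, n: int, k: int) -> float:
    """Systematic (n, k) MDS code with one coded chunk per drive.

    ASSUMPTION (toy model): drives are busy independently with probability p.
    The requested chunk is read directly if its own drive is free; otherwise
    it can be rebuilt from any k of the other n - 1 drives. The read is
    therefore blocked when its own drive is busy AND fewer than k of the
    remaining n - 1 drives are free.
    """
    others = n - 1
    fewer_than_k_free = sum(
        comb(others, j) * (1 - p) ** j * p ** (others - j) for j in range(k)
    )
    return p * fewer_than_k_free


if __name__ == "__main__":
    p = 0.1  # hypothetical per-drive busy probability
    print("uncoded        :", blocking_uncoded(p))
    print("2x replication :", blocking_replicated(p, 2))
    print("(4, 2) MDS code:", blocking_coded(p, 4, 2))
```

With p = 0.1, two-fold replication and the (4, 2) code both use twice the raw storage, yet under this toy model the coded layout blocks with probability of about 0.0028 compared with 0.01 for replication. The figures only illustrate the mechanism and should not be read as results from the thesis.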
Thesis: Ph.D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. Cataloged from PDF version of thesis. Includes bibliographical references (pages 167-175).
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.