DSpace@MIT, MIT Libraries


A hardware and software architecture for efficient datacenters

Author(s)
Kasture, Harshad
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Daniel Sanchez.
Terms of use
MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
Datacenters host an increasing amount of the world's compute, powering a diverse set of applications that range from scientific computing and business analytics to massive online services such as social media and online maps. Despite their growing importance, however, datacenters suffer from low resource and energy efficiency, using only 10-30% of their compute capacity on average. This overprovisioning adds billions of dollars annually to datacenter equipment costs, and wastes significant energy.

This low efficiency stems from two sources. First, latency-critical applications, which form the backbone of user-facing, interactive services, need guaranteed low response times, often a few tens of milliseconds or less. By contrast, current systems are architected to maximize long-term, average performance (e.g., throughput over a period of seconds), and cannot provide the short-term performance guarantees needed by these applications. The stringent performance requirements of latency-critical applications make power management challenging, and make it hard to colocate them with other applications, as interference in shared resources hurts their responsiveness. Second, throughput-oriented batch applications, while easier to colocate, experience performance degradation as multiple colocated applications compete for shared resources on servers.

This thesis presents novel hardware and software techniques that improve resource and energy efficiency for both classes of applications. First, Ubik is a dynamic cache partitioning technique that allows latency-critical and batch applications to safely share the last-level cache, maximizing batch throughput while providing latency guarantees for latency-critical applications. Ubik accurately predicts the transients that result when caches are reconfigured, and can thus mitigate latency degradation due to performance inertia, i.e., the loss of performance as an application transitions between steady states.

Second, Rubik is a fine-grain voltage and frequency scaling scheme that quickly and accurately adapts to short-term load variations in latency-critical applications to minimize dynamic power consumption without hurting latency. Rubik uses a novel, lightweight statistical model that accurately predicts queued work, and accounts for variations in per-request compute requirements as well as queuing delays. Further, Rubik improves system utilization by allowing latency-critical and batch applications to safely share cores, using frequency scaling to mitigate performance degradation due to interference in per-core resources such as private caches.

Third, Shepherd is a cluster scheduler that uses per-node cache-partitioning decisions to drive application placement across machines. Shepherd uses detailed application profiling data to partition the last-level cache on each machine and to predict the performance of colocated applications, and uses randomized search to find a schedule that maximizes throughput.

A common theme across these techniques is the use of lightweight, general-purpose architectural support to provide performance isolation and fast state transitions, coupled with intelligent software runtimes that configure the hardware to meet application performance requirements. Unlike prior work, which often relies on heuristics, these techniques use accurate analytical modeling to guide resource allocation, boosting efficiency while satisfying applications' disparate performance goals.

Ubik allows latency-critical and batch applications to be safely and efficiently colocated, improving batch throughput by an average of 17% over a static partitioning scheme while guaranteeing tail latency. Rubik further allows these two classes of applications to share cores, reducing datacenter power consumption by up to 31% while using 41% fewer machines over a scheme that segregates these applications. Shepherd improves batch throughput by 39% over a randomly scheduled, unpartitioned baseline, and significantly outperforms scheduling-only and partitioning-only approaches.
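
To make the Shepherd idea in the abstract more concrete, the following is a minimal Python sketch of combining per-node cache partitioning with a randomized search over placements. It is an illustration only and not the thesis's implementation: the function names, the profile format (a per-application list of throughput values indexed by number of cache ways), the exhaustive per-node partitioning, and the shuffle-based search are all assumptions made for this example.

```python
import itertools
import random


def best_partition(curves, total_ways):
    """Split total_ways cache ways among the apps on one node.

    curves[i][w] is the (hypothetical) profiled throughput of app i when it
    is given w ways; index 0 is unused. Every app gets at least one way.
    Returns (throughput, split); (-inf, None) if no valid split exists.
    """
    best = (float("-inf"), None)
    for split in itertools.product(range(1, total_ways + 1), repeat=len(curves)):
        if sum(split) != total_ways:
            continue  # only consider splits that use exactly all ways
        score = sum(curve[w] for curve, w in zip(curves, split))
        if score > best[0]:
            best = (score, split)
    return best


def schedule_throughput(schedule, curves, total_ways):
    """Predicted cluster throughput: sum of each node's best-partition throughput."""
    return sum(
        best_partition([curves[app] for app in node], total_ways)[0]
        for node in schedule
    )


def randomized_search(curves, apps_per_node, total_ways, trials, seed=0):
    """Try random placements and keep the one with the highest predicted throughput.

    curves maps an app name to its throughput-vs-ways profile.
    """
    rng = random.Random(seed)
    apps = list(curves)
    best_score, best_schedule = float("-inf"), None
    for _ in range(trials):
        rng.shuffle(apps)
        # Chunk the shuffled app list into nodes of apps_per_node apps each.
        schedule = [
            tuple(apps[i:i + apps_per_node])
            for i in range(0, len(apps), apps_per_node)
        ]
        score = schedule_throughput(schedule, curves, total_ways)
        if score > best_score:
            best_score, best_schedule = score, schedule
    return best_score, best_schedule


if __name__ == "__main__":
    # Made-up profiles: "a" and "c" benefit from more cache, "b" and "d" do not.
    demo_curves = {
        "a": [0, 1, 2, 3, 4],
        "b": [0, 5, 5, 5, 5],
        "c": [0, 2, 4, 6, 8],
        "d": [0, 3, 3, 3, 3],
    }
    print(randomized_search(demo_curves, apps_per_node=2, total_ways=4, trials=20))
```

The design point the sketch tries to capture is that a placement is scored by its partitioned performance rather than by an unpartitioned estimate, so the search can favor pairing cache-sensitive applications with cache-insensitive ones.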
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
 
Cataloged from PDF version of thesis.
 
Includes bibliographical references (pages 121-131).
 
Date issued
2017
URI
http://hdl.handle.net/1721.1/109005
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Doctoral Theses
