Show simple item record

dc.contributor.advisor	Daniel Sanchez.	en_US
dc.contributor.author	Kasture, Harshad	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2017-05-11T20:00:13Z
dc.date.available	2017-05-11T20:00:13Z
dc.date.copyright	2017	en_US
dc.date.issued	2017	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/109005
dc.description	Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 121-131).	en_US
dc.description.abstract	Datacenters host an increasing amount of the world's compute, powering a diverse set of applications that range from scientific computing and business analytics to massive online services such as social media and online maps. Despite their growing importance, however, datacenters suffer from low resource and energy efficiency, using only 10-30% of their compute capacity on average. This overprovisioning adds billions of dollars annually to datacenter equipment costs, and wastes significant energy. This low efficiency stems from two sources. First, latency-critical applications, which form the backbone of user-facing, interactive services, need guaranteed low response times, often a few tens of milliseconds or less. By contrast, current systems are architected to maximize long-term, average performance (e.g., throughput over a period of seconds), and cannot provide the short-term performance guarantees needed by these applications. The stringent performance requirements of latency-critical applications make power management challenging, and make it hard to colocate them with other applications, as interference in shared resources hurts their responsiveness. Second, throughput-oriented batch applications, while easier to colocate, experience performance degradation as multiple colocated applications compete for shared resources on servers. This thesis presents novel hardware and software techniques that improve resource and energy efficiency for both classes of applications. First, Ubik is a dynamic cache partitioning technique that allows latency-critical and batch applications to safely share the last-level cache, maximizing batch throughput while providing latency guarantees for latency-critical applications. Ubik accurately predicts the transients that result when caches are reconfigured, and can thus mitigate latency degradation due to performance inertia, i.e., the loss of performance as an application transitions between steady states. Second, Rubik is a fine-grain voltage and frequency scaling scheme that quickly and accurately adapts to short-term load variations in latency-critical applications to minimize dynamic power consumption without hurting latency. Rubik uses a novel, lightweight statistical model that accurately predicts queued work, and accounts for variations in per-request compute requirements as well as queuing delays. Further, Rubik improves system utilization by allowing latency-critical and batch applications to safely share cores, using frequency scaling to mitigate performance degradation due to interference in per-core resources such as private caches. Third, Shepherd is a cluster scheduler that uses per-node cache-partitioning decisions to drive application placement across machines. Shepherd uses detailed application profiling data to partition the last-level cache on each machine and to predict the performance of colocated applications, and uses randomized search to find a schedule that maximizes throughput. A common theme across these techniques is the use of lightweight, general-purpose architectural support to provide performance isolation and fast state transitions, coupled with intelligent software runtimes that configure the hardware to meet application performance requirements. Unlike prior work, which often relies on heuristics, these techniques use accurate analytical modeling to guide resource allocation, boosting efficiency while satisfying applications' disparate performance goals. Ubik allows latency-critical and batch applications to be safely and efficiently colocated, improving batch throughput by an average of 17% over a static partitioning scheme while guaranteeing tail latency. Rubik further allows these two classes of applications to share cores, reducing datacenter power consumption by up to 31% while using 41% fewer machines than a scheme that segregates these applications. Shepherd improves batch throughput by 39% over a randomly scheduled, unpartitioned baseline, and significantly outperforms scheduling-only and partitioning-only approaches.	en_US
dc.description.statementofresponsibility	by Harshad Kasture.	en_US
dc.format.extent	131 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	A hardware and software architecture for efficient datacenters	en_US
dc.type	Thesis	en_US
dc.description.degree	Ph. D.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	986529222	en_US

