Show simple item record

dc.contributor.author: Kasture, Harshad
dc.contributor.author: Sanchez, Daniel
dc.date.accessioned: 2014-10-09T18:20:42Z
dc.date.available: 2014-10-09T18:20:42Z
dc.date.issued: 2014-03
dc.identifier.isbn: 9781450323055
dc.identifier.uri: http://hdl.handle.net/1721.1/90846
dc.description.abstract: Chip-multiprocessors (CMPs) must often execute workload mixes with different performance requirements. On one hand, user-facing, latency-critical applications (e.g., web search) need low tail (i.e., worst-case) latencies, often in the millisecond range, and have inherently low utilization. On the other hand, compute-intensive batch applications (e.g., MapReduce) only need high long-term average performance. In current CMPs, latency-critical and batch applications cannot run concurrently due to interference on shared resources. Unfortunately, prior work on quality of service (QoS) in CMPs has focused on guaranteeing average performance, not tail latency. In this work, we analyze several latency-critical workloads, and show that guaranteeing average performance is insufficient to maintain low tail latency, because microarchitectural resources with state, such as caches or cores, exert inertia on instantaneous workload performance. Last-level caches impart the highest inertia, as workloads take tens of milliseconds to warm them up. When left unmanaged, or when managed with conventional QoS frameworks, shared last-level caches degrade tail latency significantly. Instead, we propose Ubik, a dynamic partitioning technique that predicts and exploits the transient behavior of latency-critical workloads to maintain their tail latency while maximizing the cache space available to batch applications. Using extensive simulations, we show that, while conventional QoS frameworks degrade tail latency by up to 2.3x, Ubik simultaneously maintains the tail latency of latency-critical workloads and significantly improves the performance of batch applications.
dc.description.sponsorship: United States. Defense Advanced Research Projects Agency (Power Efficiency Revolution For Embedded Computing Technologies Contract HR0011-13-2-0005)
dc.description.sponsorship: National Science Foundation (U.S.) (Grant CCF-1318384)
dc.language.iso: en_US
dc.publisher: Association for Computing Machinery (ACM)
dc.relation.isversionof: http://dx.doi.org/10.1145/2541940.2541944
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source: MIT web domain
dc.title: Ubik: efficient cache sharing with strict QoS for latency-critical workloads
dc.type: Article
dc.identifier.citation: Harshad Kasture and Daniel Sanchez. 2014. Ubik: efficient cache sharing with strict QoS for latency-critical workloads. In Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '14). ACM, New York, NY, USA, 729-742.
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.mitauthor: Kasture, Harshad
dc.contributor.mitauthor: Sanchez, Daniel
dc.relation.journal: Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '14)
dc.eprint.version: Author's final manuscript
dc.type.uri: http://purl.org/eprint/type/ConferencePaper
eprint.status: http://purl.org/eprint/status/NonPeerReviewed
dspace.orderedauthors: Kasture, Harshad; Sanchez, Daniel
dc.identifier.orcid: https://orcid.org/0000-0002-2453-2904
dc.identifier.orcid: https://orcid.org/0000-0002-3964-9064
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Complete
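
The abstract above describes dynamic last-level-cache partitioning that tracks the transient behavior of latency-critical workloads. As a rough illustration only, the Python sketch below shows a generic way-partitioning policy that grows a latency-critical application's cache allocation when its observed tail latency nears a target and cedes ways to batch applications otherwise. It is not the Ubik algorithm from the paper; all names, thresholds, and the latency trace are hypothetical.

```python
# Hypothetical sketch of tail-latency-driven LLC way partitioning.
# NOT the Ubik algorithm; it only illustrates the general idea of trading
# cache ways between a latency-critical (LC) app and batch apps.

TOTAL_WAYS = 16          # assumed LLC associativity
MIN_LC_WAYS = 2          # never starve the latency-critical partition
GUARD_BAND = 0.9         # start growing the LC partition at 90% of the target


def repartition(lc_ways: int, observed_tail_ms: float, target_tail_ms: float) -> tuple[int, int]:
    """Return (lc_ways, batch_ways) for the next interval.

    Grow the LC partition one way at a time when measured tail latency nears
    its target (e.g., while the cache is still warming up after a shrink);
    otherwise shrink it gradually so batch applications get the spare space.
    """
    if observed_tail_ms > GUARD_BAND * target_tail_ms:
        lc_ways = min(TOTAL_WAYS - 1, lc_ways + 1)   # give the LC app more cache
    elif observed_tail_ms < 0.5 * target_tail_ms:
        lc_ways = max(MIN_LC_WAYS, lc_ways - 1)      # reclaim cache for batch apps
    return lc_ways, TOTAL_WAYS - lc_ways


if __name__ == "__main__":
    # Toy latency trace: tail latency spikes mid-run, then recovers.
    lc_ways = 4
    for tail_ms in [0.6, 0.7, 1.1, 1.3, 0.9, 0.5, 0.4]:
        lc_ways, batch_ways = repartition(lc_ways, tail_ms, target_tail_ms=1.0)
        print(f"tail={tail_ms:.1f} ms -> LC ways={lc_ways}, batch ways={batch_ways}")
```

A reactive policy like this one ignores the cache inertia the abstract emphasizes: after an allocation shrinks, the workload takes tens of milliseconds to re-warm its working set, so a realistic controller would also have to predict that transient rather than only react to measured latency.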

