Shenango: Achieving high CPU efficiency for latency-sensitive datacenter workloads

Ousterhout, Amy Elizabeth; Fried, Joshua; Behrens, Jonathan (Jonathan Kyle); Belay, Adam M; Balakrishnan, Hari

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/

Abstract

Datacenter applications demand microsecond-scale tail latencies and high request rates from operating systems, and most applications handle loads that have high variance over multiple timescales. Achieving these goals in a CPU-efficient way is an open problem. Because of the high overheads of today's kernels, the best available solution to achieve microsecond-scale latencies is kernel-bypass networking, which dedicates CPU cores to applications for spin-polling the network card. But this approach wastes CPU: even at modest average loads, one must dedicate enough cores for the peak expected load. Shenango achieves comparable latencies but at far greater CPU efficiency. It reallocates cores across applications at very fine granularity (every 5 µs), enabling cycles unused by latency-sensitive applications to be used productively by batch processing applications. It achieves such fast reallocation rates with (1) an efficient algorithm that detects when applications would benefit from more cores, and (2) a privileged component called the IOKernel that runs on a dedicated core, steering packets from the NIC and orchestrating core reallocations. When handling latency-sensitive applications, such as memcached, we found that Shenango achieves tail latency and throughput comparable to ZygOS, a state-of-the-art, kernel-bypass network stack, but can linearly trade latency-sensitive application throughput for batch processing application throughput, vastly increasing CPU efficiency.
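The abstract does not spell out how the detection algorithm decides that an application would benefit from more cores. As a rough illustration only, the sketch below assumes one plausible signal: work that was already pending at the previous 5 µs check and is still pending now. The names `App`, `needs_more_cores`, and `reallocate`, and the policy of returning a core to batch work once the queue is empty, are hypothetical and are not taken from the abstract or from Shenango's code.

```python
from collections import deque
from dataclasses import dataclass, field

INTERVAL_US = 5  # reallocation interval stated in the abstract


@dataclass
class App:
    """A latency-sensitive application (hypothetical model, not Shenango's API)."""
    name: str
    cores: int = 1
    # Enqueue timestamps (in microseconds) of pending work, oldest first.
    queue: deque = field(default_factory=deque)


def needs_more_cores(app: App, last_check_us: int) -> bool:
    """Assumed congestion signal: the oldest pending item was already
    queued at the previous check, so it has waited a full interval."""
    return bool(app.queue) and app.queue[0] <= last_check_us


def reallocate(app: App, now_us: int, batch_cores: int) -> int:
    """Run one check; return the number of cores left for batch work."""
    last_check_us = now_us - INTERVAL_US
    if needs_more_cores(app, last_check_us) and batch_cores > 0:
        # Take a core from batch processing and grant it to the app.
        app.cores += 1
        return batch_cores - 1
    if not app.queue and app.cores > 1:
        # Assumed policy: with no pending work, hand a core back to batch.
        app.cores -= 1
        return batch_cores + 1
    return batch_cores


if __name__ == "__main__":
    app = App("memcached")
    batch = 3
    app.queue.append(0)                       # work arrives at t = 0 µs
    batch = reallocate(app, now_us=5, batch_cores=batch)
    print(app.cores, batch)
```

In this toy model the latency-sensitive application holds extra cores only while work is backing up, which is the property the abstract credits for letting batch applications use otherwise idle cycles.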

Date issued

2019-02

URI

https://hdl.handle.net/1721.1/131018

Department

Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory

Journal

Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation

Publisher

Association for Computing Machinery (ACM) / USENIX Association

Citation

Ousterhout, Amy et al. "Shenango: Achieving high CPU efficiency for latency-sensitive datacenter workloads." Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, February 2019, Boston, MA, Association for Computing Machinery / USENIX Association. © 2019 The USENIX Association

Version: Author's final manuscript

Collections

MIT Open Access Articles