Improving performance and security of indirect memory references on speculative execution machines

Kiriansky, Vladimir L.(Vladimir Lubenov),1979-

Author(s)

Kiriansky, Vladimir L.(Vladimir Lubenov),1979-

Download1122780409-MIT.pdf (17.14Mb)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Saman Amarasinghe.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Metadata

Show full item record

Abstract

Indirect memory references hobble efficient and secure execution on current processor architectures. Traditional hardware techniques such as caches and speculative execution are ineffective on demanding workloads, such as in-memory databases, machine learning, and graph analytics. While terabytes of DRAM are now available in public cloud machines, indirect memory references in large working sets often incur the full penalty of a random DRAM access. Furthermore, caches and speculative execution enable the recently discovered Spectre family of side-channel attacks, which allow untrusted neighbors in a public cloud to steal secrets. In this thesis, we introduce complementary software and hardware techniques to improve the performance of caches and speculative execution, and to block the largest attack class with low overhead. MILK is our C++ extension to improve data cache locality.

Milk's programming model preserves parallel program semantics and maps well to the Bulk-Synchronous Parallel (BSP) theoretical model. Within a BSP superstep, which may encompass billions of memory references, Milk captures the temporal and spatial locality of ideal infinite caches on real hardware and provides up to 4x speedup. CIMPLE is our domain specific language (DSL) to improve the effectiveness of speculative execution in discovering instruction level parallelism and memory level parallelism. Improving memory parallelism on current CPUs allows up to ten memory references in parallel to reduce the effective DRAM latency. Speculative execution is constrained by branch predictor effectiveness and can only uncover independent accesses within the hardware limits of instruction windows (up to 100 instructions). With Cimple, interleaved co-routines expose instruction and memory level parallelism close to ideal hardware with unlimited instruction windows and perfect predictors.

On in-memory database index data structures, Cimple achieves up to 6x speedup. DAWG is our secure cache architecture that prevents leaks via measuring the cache effects of speculative indirect memory references. Unlike performance isolation mechanisms such as Intel's Cache Allocation Technology (CAT), DAWG blocks both speculative and non-speculative side-channels by isolating cache protection domains. DAWG incurs no overhead over CAT for isolation in public clouds. DAWG also enables OS isolation with efficient sharing and communication via caches, e.g., in system calls.

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 123-139).

Date issued

2019

URI

https://hdl.handle.net/1721.1/122556

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses