Implementing a Persistent Offline Cache Improving Time to
First Execution (TTFX) of GPU Code in Julia

Warner, Collin

Author(s)

Warner, Collin

Advisor

Edelman, Alan

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

GPUs let users run highly data-parallel code efficiently on specialized hardware. GPUCompiler.jl provides a GPU compilation process for Julia, allowing users to write the highly efficient vector operations common in scientific computing. However, GPUCompiler.jl does not support the same level of persistent offline caching that is available in the core Julia compiler. This increases the time to first execution (TTFX), because programs must recompile GPU code on every package reload regardless of whether any code has changed. In this thesis, we implement a persistent offline cache capable of storing both type-inferred and native code, drastically reducing the TTFX of precompiled GPU code. We demonstrate that caching native code speeds up execution by 2-3x while reducing compilation storage costs by 3-40x compared to the current GPU compilation process.

Date issued

2023-06

URI

https://hdl.handle.net/1721.1/151406

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses