Implementing a Persistent Offline Cache Improving Time to First Execution (TTFX) of GPU Code in Julia
Author(s)
Warner, Collin
Advisor
Edelman, Alan
Abstract
GPUs allow users to run highly data-parallel code efficiently on specialized hardware. GPUCompiler.jl provides a GPU compilation pipeline for Julia, allowing users to write the highly efficient vector operations common in scientific computing. However, GPUCompiler.jl does not support the same level of persistent offline caching that is available in the core Julia compiler. This increases the time to first execution (TTFX), because programs must recompile GPU code on every package reload, regardless of whether any code has changed. In this thesis, we implement a persistent offline cache capable of storing both type-inferred and native code, drastically reducing the TTFX of precompiled GPU code. We demonstrate that, by caching native code, execution can be sped up 2-3x while compilation storage costs are reduced by 3-40x compared to the current GPU compilation process.
Date issued
2023-06
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology