Optimizing RAM-latency dominated applications
Author(s)
Mao, Yandong; Cutler, Cody; Morris, Robert Tappan
Download: Morris_Optimizing RAM.pdf (102.6 kB)
Terms of use
Open Access Policy: Creative Commons Attribution-Noncommercial-Share Alike
Abstract
Many apparently CPU-limited programs are actually bottlenecked by RAM fetch latency, often because they follow pointer chains in working sets that are much bigger than the CPU's on-chip cache. For example, garbage collectors that identify live objects by tracing inter-object pointers can spend much of their time stalling due to RAM fetches. We observe that for such workloads, programmers should view RAM much as they view disk. The two situations share not just high access latency, but also a common set of approaches to coping with that latency. Relatively general-purpose techniques such as batching, sorting, and "I/O" concurrency work to hide RAM latency much as they do for disk. This paper studies several RAM-latency dominated programs and shows how we apply general-purpose approaches to hide RAM latency. The evaluation shows that these optimizations improve performance by a factor of 1.3. Counter-intuitively, even though these programs are not limited by CPU cycles, we found that adding more cores can yield better performance.
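
The "I/O concurrency" idea the abstract alludes to can be illustrated with a small C sketch (an illustration of the general technique, not code taken from the paper; the node layout, function names, and use of __builtin_prefetch are my own assumptions): instead of walking a single pointer chain and stalling on every cache miss, walk a batch of independent chains in round-robin and issue software prefetches so several RAM fetches are outstanding at once.

/* Illustrative sketch: overlapping RAM fetches during pointer chasing. */
#include <stddef.h>

struct node {
    struct node *next;
    long value;
};

/* Naive traversal: each iteration waits on one RAM fetch at a time. */
long sum_chain(struct node *n)
{
    long sum = 0;
    for (; n != NULL; n = n->next)
        sum += n->value;
    return sum;
}

/* Batched traversal of several independent chains: prefetch each chain's
 * next node, then process the chains round-robin so the memory system
 * overlaps the fetches ("I/O concurrency" applied to RAM). */
long sum_chains_batched(struct node *heads[], int nchains)
{
    long sum = 0;
    int live = nchains;

    /* Warm up: start a fetch for the first node of every chain. */
    for (int i = 0; i < nchains; i++)
        if (heads[i] != NULL)
            __builtin_prefetch(heads[i]);

    while (live > 0) {
        live = 0;
        for (int i = 0; i < nchains; i++) {
            struct node *n = heads[i];
            if (n == NULL)
                continue;
            sum += n->value;      /* likely a cache hit by now */
            heads[i] = n->next;   /* advance this chain */
            if (heads[i] != NULL) {
                __builtin_prefetch(heads[i]);  /* overlap the next fetch */
                live++;
            }
        }
    }
    return sum;
}

The prefetch issued for one chain is hidden behind the work done on the other chains in the same round; with enough independent chains, the traversal is limited by memory bandwidth rather than by a single fetch's latency.
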
Date issued
2013-07
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Proceedings of the 4th Asia-Pacific Workshop on Systems (APSys '13)
Publisher
Association for Computing Machinery (ACM)
Citation
Yandong Mao, Cody Cutler, and Robert Morris. 2013. Optimizing RAM-latency dominated applications. In Proceedings of the 4th Asia-Pacific Workshop on Systems (APSys '13). ACM, New York, NY, USA, Article 12, 5 pages.
Version: Author's final manuscript
ISBN
9781450323161