Tarcil: reconciling scheduling speed and quality in large shared clusters
Author(s)
Delimitrou, Christina; Sanchez, Daniel; Kozyrakis, Christos
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Abstract
Scheduling diverse applications in large, shared clusters is particularly challenging. Recent research on cluster scheduling focuses either on scheduling speed, using sampling to quickly assign resources to tasks, or on scheduling quality, using centralized algorithms that search for the resources that improve both task performance and cluster utilization.
We present Tarcil, a distributed scheduler that targets both scheduling speed and quality. Tarcil uses an analytically derived sampling framework that adjusts the sample size based on load, and provides statistical guarantees on the quality of allocated resources. It also implements admission control when sampling is unlikely to find suitable resources. This makes it appropriate for large, shared clusters hosting short- and long-running jobs. We evaluate Tarcil on clusters with hundreds of servers on EC2. For highly loaded clusters running short jobs, Tarcil improves task execution time by 41% over a distributed, sampling-based scheduler. For more general scenarios, Tarcil achieves near-optimal performance for 4× and 2× more jobs than sampling-based and centralized schedulers, respectively.
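The interplay between load, sample size, and admission control described in the abstract can be illustrated with a small sketch. This is not Tarcil's analytical framework, which is derived in the paper itself. It assumes a much simpler model: servers are sampled uniformly and independently, and a known fraction of them is suitable for the task. Under that assumption, n samples contain at least one suitable server with probability 1 - (1 - f)^n. The names `required_sample_size`, `schedule_decision`, `suitable_fraction`, and `max_sample` are hypothetical and chosen only for illustration.

```python
import math


def required_sample_size(suitable_fraction, confidence):
    """Smallest n such that n independent uniform samples contain at least
    one suitable server with probability >= confidence.

    Assumed model (not taken from the paper): each sampled server is
    suitable with probability `suitable_fraction`, independently.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if suitable_fraction <= 0.0:
        return math.inf  # no amount of sampling can succeed
    if suitable_fraction >= 1.0:
        return 1  # every server is suitable
    # Solve 1 - (1 - f)^n >= confidence for n.
    n = math.log(1.0 - confidence) / math.log(1.0 - suitable_fraction)
    return max(1, math.ceil(n))


def schedule_decision(suitable_fraction, confidence=0.95, max_sample=64):
    """Toy admission control: sample if the required sample size is
    affordable, otherwise queue the task instead of sampling."""
    n = required_sample_size(suitable_fraction, confidence)
    if n > max_sample:
        return ("queue", None)
    return ("sample", n)


if __name__ == "__main__":
    # As load rises, fewer servers are suitable and the sample must grow.
    for fraction in (0.5, 0.2, 0.05, 0.02):
        print(fraction, schedule_decision(fraction))
```

In this toy model a lightly loaded cluster, where half the servers are suitable, needs only a handful of samples, while a heavily loaded one needs far more. Once the required sample exceeds the hypothetical `max_sample` cap, the task is queued rather than placed on a poor resource.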
Date issued
2015-08
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Proceedings of the Sixth ACM Symposium on Cloud Computing (SoCC '15)
Publisher
Association for Computing Machinery (ACM)
Citation
Delimitrou, Christina, et al. “Tarcil: reconciling scheduling speed and quality in large shared clusters.” Proceedings of the Sixth ACM Symposium on Cloud Computing (SoCC '15), August 27-29, 2015, Kohala Coast, Hawaii. Association for Computing Machinery (ACM), August 2015. © 2015 Association for Computing Machinery (ACM)
Version: Author's final manuscript
ISBN
978-1-4503-3651-2