MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
  • DSpace@MIT Home
  • MIT Open Access Articles
  • MIT Open Access Articles
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Tarcil: reconciling scheduling speed and quality in large shared clusters

Author(s)
Delimitrou, Christina; Sanchez, Daniel; Kozyrakis, Christos
Thumbnail
DownloadSanchez.tarcil.pdf (1.937Mb)
OPEN_ACCESS_POLICY

Open Access Policy

Creative Commons Attribution-Noncommercial-Share Alike

Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Metadata
Show full item record
Abstract
Scheduling diverse applications in large, shared clusters is particularly challenging. Recent research on cluster scheduling focuses either on scheduling speed, using sampling to quickly assign resources to tasks, or on scheduling quality, using centralized algorithms that search for the resources that improve both task performance and cluster utilization. We present Tarcil, a distributed scheduler that targets both scheduling speed and quality. Tarcil uses an analytically derived sampling framework that adjusts the sample size based on load, and provides statistical guarantees on the quality of allocated resources. It also implements admission control when sampling is unlikely to find suitable resources. This makes it appropriate for large, shared clusters hosting short- and long-running jobs. We evaluate Tarcil on clusters with hundreds of servers on EC2. For highly-loaded clusters running short jobs, Tarcil improves task execution time by 41% over a distributed, sampling-based scheduler. For more general scenarios, Tarcil achieves near-optimal performance for 4× and 2× more jobs than sampling-based and centralized schedulers respectively.
Date issued
2015-08
URI
http://hdl.handle.net/1721.1/111979
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Proceedings of the Sixth ACM Symposium on Cloud Computing (SoCC '15)
Publisher
Association for Computing Machinery (ACM)
Citation
Delimitrou, Christina et al. “Tarcil: reconciling scheduling speed and quality in large shared clusters” Proceedings of the Sixth ACM Symposium on Cloud Computing (SoCC ’15), August 27-29 2015, Kohala Coast, Hawaii, Association for Computing Machinery (ACM), August 2015 © 2015 Association for Computing Machinery (ACM)
Version: Author's final manuscript
ISBN
978-1-4503-3651-2

Collections
  • MIT Open Access Articles

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.