SparkSim: A Counterfactual Approach for Spark Cluster Scheduling

Author(s)
Rodríguez Garnica, Sol Estrella
Advisor
Nasr-Esfahany, Arash
Madden, Samuel
Alizadeh, Mohammad
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Simulating and testing scheduling policies can be immensely time- and resource-intensive. In this work, we explore a novel approach, SparkSim, to scheduling policy training that is faster and more efficient than traditional scheduling policy testing. Our approach extends CausalSim's existing trace-driven approach [3], which we apply in place of the simulation currently used to test Spark cluster scheduling policies. To simulate the runtime under a new scheduling policy, our method trains a neural model to learn the unseen and unbiased computation elements of the cluster, extracts them, and uses them as latents to predict the duration of a workload from an existing trace. We implement this using a counterfactual approach, which takes a trace that was actually executed and predicts a new one as if it had taken place under the same cluster conditions. This thesis focuses on evaluating and investigating the performance of SparkSim. We evaluate SparkSim against two baselines that do not require training. Our results show that SparkSim underperforms these baselines on easier prediction tasks (such as copying from source) but outperforms them as the prediction tasks get harder. Future work could improve substantially on these results.
Date issued
2023-06
URI
https://hdl.handle.net/1721.1/151471
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
