Monkey: A Distributed Orchestrator for a Virtual Pseudo-Homogenous Computational Cluster Consisting of Heterogeneous Sources
Author(s)
Stallone, Matthew J.
Advisor
Agrawal, Pulkit
Abstract
As machine learning research becomes increasingly ubiquitous, novel algorithms and state-of-the-art models rely on considerably more complex and involved procedures. To achieve groundbreaking results in such a climate, a researcher increasingly depends on immense computational resources to develop, train, and evaluate these algorithms. As a result, research labs face the challenge of providing ample computational resources, and researchers are distracted from their core research by the need to design, code, and configure experiments for the disparate computational resources provided.
The framework proposed herein therefore strives to bridge the gaps between research labs, researchers, and computational resources by abstracting and automating the standard process of designing, training, and evaluating an algorithm. Built upon the preexisting Monkey framework, it provides a fault-tolerant, decentralized system capable of scheduling and reproducing research training jobs. The framework maintains a virtual pseudo-homogenous cluster built on top of existing heterogeneous computational clusters. Designed to be flexible and cost-effective, it also prioritizes user accessibility by providing an integrated machine learning toolkit with hyperparameter optimizers and a visualization dashboard.
Date issued
2022-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology