dc.contributor.author: Murray, Riley
dc.contributor.author: Khuller, Samir
dc.contributor.author: Chao, Megan C.
dc.date.accessioned: 2018-06-21T17:00:22Z
dc.date.available: 2018-06-21T17:00:22Z
dc.date.issued: 2017-07
dc.date.submitted: 2016-09
dc.identifier.issn: 0178-4617
dc.identifier.issn: 1432-0541
dc.identifier.uri: http://hdl.handle.net/1721.1/116477
dc.description.abstract: The Map-Reduce computing framework rose to prominence with datasets of such size that dozens of machines on a single cluster were needed for individual jobs. As datasets approach the exabyte scale, a single job may need distributed processing not only on multiple machines, but on multiple clusters. We consider a scheduling problem to minimize weighted average completion time of n jobs on m distributed clusters of parallel machines. In keeping with the scale of the problems motivating this work, we assume that (1) each job is divided into m “subjobs” and (2) distinct subjobs of a given job may be processed concurrently. When each cluster is a single machine, this is the NP-Hard concurrent open shop problem. A clear limitation of such a model is that a serial processing assumption sidesteps the issue of how different tasks of a given subjob might be processed in parallel. Our algorithms explicitly model clusters as pools of resources and effectively overcome this issue. Under a variety of parameter settings, we develop two constant factor approximation algorithms for this problem. The first algorithm uses an LP relaxation tailored to this problem from prior work. This LP-based algorithm provides strong performance guarantees. Our second algorithm exploits a surprisingly simple mapping to the special case of one machine per cluster. This mapping-based algorithm is combinatorial and extremely fast. These are the first constant factor approximations for this problem. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) [en_US]
dc.description.sponsorship: National Science Foundation (U.S.). Research Experience for Undergraduates (Program) (Grant CCF 1262805) [en_US]
dc.description.sponsorship: Winkler Foundation [en_US]
dc.publisher: Springer-Verlag [en_US]
dc.relation.isversionof: https://doi.org/10.1007/s00453-017-0345-x [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Springer US [en_US]
dc.title: Scheduling Distributed Clusters of Parallel Machines : Primal-Dual and LP-based Approximation Algorithms [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Murray, Riley, Samir Khuller, and Megan Chao. “Scheduling Distributed Clusters of Parallel Machines : Primal-Dual and LP-Based Approximation Algorithms.” Algorithmica 80, no. 10 (July 19, 2017): 2777–2798. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.mitauthor: Chao, Megan C.
dc.relation.journal: Algorithmica [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2018-05-31T05:10:36Z
dc.language.rfc3066: en
dc.rights.holder: Springer Science+Business Media, LLC
dspace.orderedauthors: Murray, Riley; Khuller, Samir; Chao, Megan [en_US]
dspace.embargo.terms: N [en]
mit.license: PUBLISHER_POLICY [en_US]

