
dc.contributor.author: Li, Chenning
dc.contributor.author: Nasr-Esfahany, Arash
dc.contributor.author: Zhao, Kevin
dc.contributor.author: Noorbakhsh, Kimia
dc.contributor.author: Goyal, Prateesh
dc.contributor.author: Alizadeh, Mohammad
dc.contributor.author: Anderson, Thomas
dc.date.accessioned: 2024-09-05T13:38:53Z
dc.date.available: 2024-09-05T13:38:53Z
dc.date.issued: 2024-08-04
dc.identifier.isbn: 979-8-4007-0614-1
dc.identifier.uri: https://hdl.handle.net/1721.1/156674
dc.description: ACM SIGCOMM ’24, August 4–8, 2024, Sydney, NSW, Australia [en_US]
dc.description.abstract: Data center network operators often need accurate estimates of aggregate network performance. Unfortunately, existing methods for estimating aggregate network statistics are either inaccurate or too slow to be practical at the data center scale. In this paper, we develop and evaluate a scale-free, fast, and accurate model for estimating data center network tail latency performance for a given workload, topology, and network configuration. First, we show that path-level simulations (simulations of traffic that intersects a given path) produce almost the same aggregate statistics as full network-wide packet-level simulations. We use a simple and fast flow-level fluid simulation in a novel way to capture and summarize essential elements of the path workload, including the effect of cross-traffic on flows on that path. We use this coarse simulation as input to a machine-learning model to predict path-level behavior, and run it on a sample of paths to produce accurate network-wide estimates. Our model generalizes over the choice of congestion control (CC) protocol, CC protocol parameters, and routing. Relative to Parsimon, a state-of-the-art system for rapidly estimating aggregate network tail latency, our approach is significantly faster (5.7×), more accurate (45.9% less error), and more robust. [en_US]
dc.publisher: ACM|ACM SIGCOMM 2024 Conference [en_US]
dc.relation.isversionof: https://doi.org/10.1145/3651890.3672243 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: m3: Accurate Flow-Level Performance Estimation using Machine Learning [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Chenning Li, Arash Nasr-Esfahany, Kevin Zhao, Kimia Noorbakhsh, Prateesh Goyal, Mohammad Alizadeh, and Thomas E. Anderson. 2024. M3: Accurate Flow-Level Performance Estimation using Machine Learning. In Proceedings of the ACM SIGCOMM 2024 Conference (ACM SIGCOMM '24). Association for Computing Machinery, New York, NY, USA, 813–827. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2024-09-01T07:47:13Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2024-09-01T07:47:13Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]


