Centralized performance control for datacenter networks

Research and Teaching Output of the MIT Community

dc.contributor.advisor Hari Balakrishnan and Devavrat Shah. en_US
dc.contributor.author Perry, Jonathan, Ph. D. Massachusetts Institute of Technology en_US
dc.contributor.other Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. en_US
dc.date.accessioned 2017-10-18T15:09:30Z
dc.date.available 2017-10-18T15:09:30Z
dc.date.copyright 2017 en_US
dc.date.issued 2017 en_US
dc.description Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. en_US
dc.description Cataloged from PDF version of thesis. en_US
dc.description Includes bibliographical references (pages 99-104). en_US
dc.description.abstract An ideal datacenter network should allow operators to specify policy for resource allocation between users or applications, while providing several properties: low median and tail latency, high utilization (throughput), and congestion (loss) avoidance. Current datacenter networks inherit the principles that went into the design of the Internet, where packet transmission and path selection decisions are distributed among the endpoints and routers; this distribution impedes obtaining the desired properties. Instead, we propose that a centralized controller should tightly regulate senders' use of the network according to operator policy, and evaluate two architectures: Fastpass and Flowtune. In Fastpass, the controller decides when each packet should be transmitted and what path it should follow. Fastpass incorporates two fast algorithms: the first determines the time at which each packet should be transmitted, while the second determines the path to use for that packet. We deployed and evaluated Fastpass in a portion of Facebook's datacenter network. Our results show that Fastpass achieves throughput comparable to current networks with a 240× reduction in queue lengths, achieves much fairer and more consistent flow throughputs than the baseline TCP, scales to schedule 2.21 Terabits/s of traffic in software on eight cores, and achieves a 2.5× reduction in the number of TCP retransmissions in a latency-sensitive service at Facebook. In Flowtune, congestion control decisions are made at the granularity of a flowlet, not a packet, so allocations change only when flowlets arrive or leave. The centralized allocator receives flowlet start and end notifications from endpoints and computes optimal rates using a new, fast method for network utility maximization. A normalization algorithm ensures allocations do not exceed link capacities. Flowtune updates rate allocations for 4600 servers in 31 μs regardless of link capacities. Experiments show that Flowtune outperforms DCTCP, pFabric, sfqCoDel, and XCP on tail packet delays in various settings, and converges to optimal rates within a few packets rather than over several RTTs. EC2 benchmarks show a fairer rate allocation than Linux's Cubic. en_US
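The abstract mentions that Flowtune's allocator applies a normalization step so that per-flow rate allocations never exceed link capacities. A minimal sketch of that idea, assuming a proportional scale-down rule and illustrative flow, link, and capacity names (the thesis's actual normalization algorithm may differ):

```python
def normalize(rates, flow_links, capacity):
    """Scale per-flow rates so no link's total allocation exceeds its capacity.

    rates: {flow: allocated rate}; flow_links: {flow: list of links it
    traverses}; capacity: {link: capacity}. All rates/capacities in the
    same unit (e.g. Gbit/s).
    """
    # Total allocated load on each link.
    load = {link: 0.0 for link in capacity}
    for flow, rate in rates.items():
        for link in flow_links[flow]:
            load[link] += rate
    # Scale each flow by the worst oversubscription along its path,
    # never scaling up.
    scaled = {}
    for flow, rate in rates.items():
        factor = min(capacity[l] / load[l] for l in flow_links[flow])
        scaled[flow] = rate * min(1.0, factor)
    return scaled

# Two hypothetical flows sharing one 10 Gbit/s link, over-allocated to 14:
rates = {"a": 8.0, "b": 6.0}
flow_links = {"a": ["s1-s2"], "b": ["s1-s2"]}
capacity = {"s1-s2": 10.0}
print(normalize(rates, flow_links, capacity))  # both flows scaled by 10/14
```

After scaling, the shared link carries exactly its capacity, and each flow keeps its share in proportion to its original allocation.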
dc.description.statementofresponsibility by Jonathan Perry. en_US
dc.format.extent 104 pages en_US
dc.language.iso eng en_US
dc.publisher Massachusetts Institute of Technology en_US
dc.rights MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. en_US
dc.rights.uri en_US
dc.subject Electrical Engineering and Computer Science. en_US
dc.title Centralized performance control for datacenter networks en_US
dc.type Thesis en_US
dc.description.degree Ph. D. en_US
dc.contributor.department Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. en_US
dc.identifier.oclc 1005140050 en_US

Files in this item

Name Size Format Description
1005140050-MIT.pdf 12.53Mb PDF Full printable version

This item appears in the following Collection(s)