Optimal Reissue Policies for Reducing Tail Latency

Kaler, Tim; He, Yuxiong; Elnikety, Sameh

dc.contributor.author	Elnikety, Sameh
dc.contributor.author	Kaler, Timothy
dc.contributor.author	He, Yuxiong
dc.date.accessioned	2018-07-02T13:49:18Z
dc.date.available	2018-07-02T13:49:18Z
dc.date.issued	2017-07
dc.identifier.isbn	9781450345934
dc.identifier.uri	http://hdl.handle.net/1721.1/116708
dc.description.abstract	Interactive services send redundant requests to multiple different replicas to meet stringent tail latency requirements. These additional (reissue) requests mitigate the impact of non-deterministic delays within the system and thus increase the probability of receiving an on-time response. There are two existing approaches to using reissue requests to reduce tail latency. (1) Reissue requests immediately to one or more replicas, which multiplies the load and runs the risk of overloading the system. (2) Reissue requests if not completed after a fixed delay. The delay helps to bound the number of extra reissue requests, but it also reduces the chance for those requests to respond before a tail latency target. We introduce a new family of reissue policies, Single-Time / Random (SingleR), that reissue requests after a delay d with probability q. SingleR employs randomness to bound the reissue rate, while allowing requests to be reissued early enough that they have sufficient time to respond, exploiting the benefits of both the immediate and the delayed reissue of prior work. We formally prove, within a simplified analytical model, that SingleR is optimal even when compared to more complex policies that reissue multiple times. To use SingleR for interactive services, we provide efficient algorithms for calculating the optimal reissue delay and probability from response time logs through a data-driven approach. We apply iterative adaptation for systems with load-dependent queuing delays. The key advantage of this data-driven approach is its wide applicability and effectiveness for systems with various design choices and workload properties. We evaluated SingleR policies thoroughly. We use simulation to illustrate its internals and demonstrate its robustness to a wide range of workloads. We conduct system experiments on the Redis key-value store and the Lucene search server.
The results show that for utilizations ranging from 40-60%, SingleR reduces the 99th-percentile latency of Redis by 30-70% by reissuing only 2% of requests, and the 99th-percentile latency of Lucene by 15-25% by reissuing only 1%.	en_US
dc.language.iso	en_US
dc.relation.isversionof	https://doi.org/10.1145/3087556.3087566	en_US
dc.rights	Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.	en_US
dc.source	Kaler	en_US
dc.title	Optimal Reissue Policies for Reducing Tail Latency	en_US
dc.type	Article	en_US
dc.identifier.citation	Kaler, Tim, Yuxiong He, and Sameh Elnikety. “Optimal Reissue Policies for Reducing Tail Latency.” Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures - SPAA ’17 (2017).	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Mathematics	en_US
dc.contributor.approver	Kaler, Tim	en_US
dc.contributor.mitauthor	Kaler, Timothy
dc.contributor.mitauthor	He, Yuxiong
dc.relation.journal	Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures - SPAA '17	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dspace.orderedauthors	Kaler, Tim; He, Yuxiong; Elnikety, Sameh	en_US
dspace.embargo.terms	N	en_US
dc.identifier.orcid	https://orcid.org/0000-0002-3831-8255
mit.license	PUBLISHER_POLICY	en_US
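
The abstract describes SingleR as reissuing a request once, after a delay d, with probability q. The following is a minimal sketch of that rule in a toy simulation, not the authors' implementation. The function names (`single_r_latency`, `simulate`) are hypothetical, and the simulation assumes that a reissued request's response time is drawn independently from the same empirical samples as the primary request, which ignores the load-dependent queuing delays mentioned in the abstract.

```python
import random


def single_r_latency(primary, reissue, d, q, rng):
    """Return (latency, was_reissued) for one request under a SingleR-style rule.

    primary: response time of the original request
    reissue: response time the reissued copy would take (assumed independent)
    d:       reissue delay
    q:       reissue probability
    """
    # The primary request finished before the reissue delay: nothing is reissued.
    if primary <= d:
        return primary, False
    # Still outstanding at time d: reissue once with probability q.
    if rng.random() < q:
        # The request completes when either copy responds first.
        return min(primary, d + reissue), True
    return primary, False


def simulate(samples, d, q, seed=0):
    """Estimate the 99th-percentile latency and reissue rate on logged samples."""
    rng = random.Random(seed)
    latencies = []
    reissued = 0
    for primary in samples:
        # Assumption: the reissue's response time is an independent draw
        # from the same empirical distribution.
        reissue = rng.choice(samples)
        latency, did_reissue = single_r_latency(primary, reissue, d, q, rng)
        latencies.append(latency)
        reissued += did_reissue
    latencies.sort()
    p99 = latencies[min(len(latencies) - 1, int(0.99 * len(latencies)))]
    return p99, reissued / len(samples)
```

In this sketch the reissue rate is q times the fraction of requests still outstanding at time d, which is how the randomness bounds the extra load while still allowing an early d. Choosing d and q optimally from response time logs is the subject of the paper and is not attempted here.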

Files in this item

Name: reissuepolicy_preprint.pdf
Size: 2.601Mb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles
