Abstract
Distributed key-value stores employ replication for high availability. Yet they do not always take efficient advantage of the multiple replicas available for each value, and read operations often exhibit high tail latencies. Various replica selection strategies have been proposed to address this problem, together with local request scheduling policies. It is difficult, however, to determine the absolute performance gain each of these strategies can achieve. We present a formal framework allowing the systematic study of request scheduling strategies in key-value stores. We contribute a definition of the optimization problem related to reducing tail latency in a replicated key-value store as a minimization problem with respect to the maximum weighted flow criterion. By using scheduling theory, we show the difficulty of this problem, and therefore the need to develop performance guarantees. We also study the behavior of heuristic methods using simulations, which highlight which properties are useful for limiting tail latency: for instance, the EFT strategy, which uses the earliest available time of servers, exhibits a tail latency that is less than half that of state-of-the-art strategies, often matching the lower bound. Our study also emphasizes the importance of metrics such as the stretch to properly evaluate replica selection and local execution policies.
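The quantities named in the abstract can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes an EFT-like rule that sends each request, in arrival order, to the replica that becomes available earliest, with each server processing its requests first-come first-served and without preemption. It also assumes that the flow of a request is its completion time minus its release time, and that the stretch is the flow weighted by the inverse of the processing time. The names `Request`, `eft_schedule`, `max_weighted_flow`, `max_flow` and `max_stretch` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    release: float   # arrival time of the request
    size: float      # processing time of the request
    replicas: tuple  # indices of the servers holding the requested value


def eft_schedule(requests, n_servers):
    """Assumed EFT-like rule: dispatch each request, in arrival order,
    to the replica that can start serving it the earliest."""
    free_at = [0.0] * n_servers  # time at which each server becomes idle
    completions = []
    for req in sorted(requests, key=lambda r: r.release):
        # Earliest possible start time on each candidate replica.
        server = min(req.replicas, key=lambda s: max(free_at[s], req.release))
        start = max(free_at[server], req.release)
        free_at[server] = start + req.size
        completions.append((req, free_at[server]))
    return completions


def max_weighted_flow(completions, weight):
    """Maximum over all requests of weight * (completion - release)."""
    return max(weight(req) * (done - req.release) for req, done in completions)


def max_flow(completions):
    # Unit weights: plain maximum response time.
    return max_weighted_flow(completions, lambda req: 1.0)


def max_stretch(completions):
    # Weight 1/size: response time relative to the request's own size.
    return max_weighted_flow(completions, lambda req: 1.0 / req.size)


# Hypothetical workload: 3 servers, 4 read requests.
requests = [
    Request(release=0.0, size=4.0, replicas=(0, 1)),
    Request(release=0.0, size=2.0, replicas=(0,)),
    Request(release=1.0, size=1.0, replicas=(0, 1)),
    Request(release=1.0, size=2.0, replicas=(1, 2)),
]
completions = eft_schedule(requests, n_servers=3)
print(max_flow(completions), max_stretch(completions))
```

On this toy workload the second request waits behind a long request on its only replica, so it determines both metrics, but with different values: the stretch normalizes the delay by the request size, which is why the two criteria can rank policies differently.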
Notes
- 1.
- 2. We express non-migratory preemption as \(pmtn^*\) in the \(\beta\)-part, not to be confused with the classic \(pmtn\) constraint.
References
Atikoglu, B., Xu, Y., Frachtenberg, E., Jiang, S., Paleczny, M.: Workload analysis of a large-scale key-value store. In: ACM SIGMETRICS Performance Evaluation Review, vol. 40, pp. 53–64. ACM (2012)
Bansal, N., Pruhs, K.: Server scheduling in the \(l_p\) norm: a rising tide lifts all boat. In: ACM STOC (2003)
Ben Mokhtar, S., Canon, L.C., Dugois, A., Marchal, L., Rivière, E.: Taming tail latency in key-value stores: a scheduling perspective (extended version). Tech. rep. (2021). https://hal.inria.fr/hal-03144818
Bender, M.A., Chakrabarti, S., Muthukrishnan, S.: Flow and stretch metrics for scheduling continuous job streams. In: ACM-SIAM Symposium on Discrete Algorithms (1998)
Brucker, P., Jurisch, B., Krämer, A.: Complexity of scheduling problems with multi-purpose machines. Ann. Oper. Res. 70, 57–73 (1997)
Brutlag, J.: Speed matters for Google web search (2009)
Canon, L.C., Dugois, A., Marchal, L.: Artifact and instructions to generate experimental results for the Euro-Par 2021 paper: “Taming tail latency in key-value stores: a scheduling perspective”. https://doi.org/10.6084/m9.figshare.14755359
Dean, J., Barroso, L.A.: The tail at scale. Commun. ACM 56(2), 74–80 (2013)
DeCandia, G., et al.: Dynamo: Amazon’s highly available key-value store. ACM SIGOPS Oper. Syst. Rev. 41, 205–220 (2007)
Didona, D., Zwaenepoel, W.: Size-aware sharding for improving tail latencies in in-memory key-value stores. In: NSDI, pp. 79–94 (2019)
Feitelson, D.G.: Workload Modeling for Computer Systems Performance Evaluation. Cambridge University Press (2015)
Graham, R.L.: Bounds for certain multiprocessing anomalies. Bell Syst. Tech. J. 45(9), 1563–1581 (1966)
Jaiman, V., Ben Mokhtar, S., Quéma, V., Chen, L.Y., Rivière, E.: Héron: taming tail latencies in key-value stores under heterogeneous workloads. In: SRDS. IEEE (2018)
Jaiman, V., Ben Mokhtar, S., Rivière, E.: TailX: Scheduling Heterogeneous Multiget Queries to Improve Tail Latencies in Key-Value Stores. In: Remke, A., Schiavoni, V. (eds.) DAIS 2020. LNCS, vol. 12135, pp. 73–92. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50323-9_5
Jiang, W., Xie, H., Zhou, X., Fang, L., Wang, J.: Haste makes waste: the on-off algorithm for replica selection in key-value stores. JPDC 130, 80–90 (2019)
Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. ACM SIGOPS Oper. Syst. Rev. 44(2), 35–40 (2010)
Lawler, E.L., Labetoulle, J.: On preemptive scheduling of unrelated parallel processors by linear programming. J. ACM (JACM) 25(4), 612–619 (1978)
Legrand, A., Su, A., Vivien, F.: Minimizing the stretch when scheduling flows of divisible requests. J. Sched. 11(5), 381–404 (2008)
Lenstra, J.K., Rinnooy Kan, A.H.G., Brucker, P.: Complexity of machine scheduling problems. Stud. Integer Program. 1, 343–362 (1977)
Leung, J.Y.T., Li, C.L.: Scheduling with processing set restrictions: a survey. Int. J. Prod. Econ. 116(2), 251–262 (2008)
Li, J., Sharma, N.K., Ports, D.R., Gribble, S.D.: Tales of the tail: Hardware, OS, and application-level sources of tail latency. In: ACM Symposium Cloud Computing (2014)
Reda, W., Canini, M., Suresh, L., Kostić, D., Braithwaite, S.: Rein: taming tail latency in key-value stores via multiget scheduling. In: EuroSys (2017)
Suresh, L., Canini, M., Schmid, S., Feldmann, A.: C3: cutting tail latency in cloud data stores via adaptive replica selection. In: NSDI (2015)
Acknowledgements and Data Availability Statement
The datasets and code generated during and/or analyzed during the current study are available in the Figshare repository: https://doi.org/10.6084/m9.figshare.14755359 [7].
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Mokhtar, S.B., Canon, LC., Dugois, A., Marchal, L., Rivière, E. (2021). Taming Tail Latency in Key-Value Stores: A Scheduling Perspective. In: Sousa, L., Roma, N., Tomás, P. (eds) Euro-Par 2021: Parallel Processing. Euro-Par 2021. Lecture Notes in Computer Science, vol 12820. Springer, Cham. https://doi.org/10.1007/978-3-030-85665-6_9
DOI: https://doi.org/10.1007/978-3-030-85665-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85664-9
Online ISBN: 978-3-030-85665-6
eBook Packages: Computer Science, Computer Science (R0)