Large-scale parallel server system with multi-component jobs

Queueing Systems

Abstract

A broad class of parallel server systems is considered, for which we prove the steady-state asymptotic independence of server workloads, as the number of servers goes to infinity, while the system load remains sub-critical. Arriving jobs consist of multiple components. There are multiple job classes, and each class may be of one of two types, which determines the rule according to which the job components add workloads to the servers. The model is broad enough to include as special cases some popular queueing models with redundancy, such as cancel-on-start and cancel-on-completion redundancy. Our analysis uses mean-field process representation and the corresponding mean-field limits. In essence, our approach relies almost exclusively on three fundamental properties of the model: (a) monotonicity, (b) work conservation and (c) the property that, on average, “new arriving workload prefers to go to servers with lower workloads.”
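
To make the workload-based view concrete, here is a minimal simulation sketch of one special case mentioned above, cancel-on-start redundancy, modelled here as join-shortest-work(d): each arriving job samples d servers and adds its whole workload to the one with the least unfinished work. The function name `simulate_jsw_d`, the Poisson arrivals, the exponential job sizes with mean 1, and the unit service rates are all illustrative assumptions and are not taken from the paper.

```python
import random


def simulate_jsw_d(n=50, d=2, load=0.7, horizon=200.0, seed=0):
    """Hypothetical sketch: n unit-rate servers under join-shortest-work(d).

    Returns the list of server workloads (unfinished work) at time `horizon`.
    """
    rng = random.Random(seed)
    workloads = [0.0] * n
    t = 0.0
    while True:
        # Assumption: jobs arrive as a Poisson process of total rate load * n,
        # so with mean job size 1 the load per server is `load` (< 1 is sub-critical).
        dt = rng.expovariate(load * n)
        if t + dt > horizon:
            # No further arrivals before the horizon: drain the remaining time and stop.
            dt = horizon - t
            return [max(0.0, w - dt) for w in workloads]
        t += dt
        # Each server drains its workload at unit rate between arrivals.
        workloads = [max(0.0, w - dt) for w in workloads]
        # Sample d distinct servers uniformly at random and route the job
        # to the sampled server with the smallest current workload.
        chosen = rng.sample(range(n), d)
        target = min(chosen, key=lambda i: workloads[i])
        # Assumption: job sizes are exponential with mean 1.
        workloads[target] += rng.expovariate(1.0)


if __name__ == "__main__":
    final = simulate_jsw_d(n=100, d=2, load=0.7, horizon=100.0, seed=1)
    print(sum(final) / len(final))  # empirical mean workload per server
```

The routing step is a simple instance of property (c): new workload is steered toward the less loaded of the sampled servers. Running the sketch for increasing n and comparing workloads of a fixed pair of servers across seeds is one informal way to explore the asymptotic independence that the paper proves.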

Author information

Corresponding author

Correspondence to Alexander L. Stolyar.

About this article

Cite this article

Shneer, S., Stolyar, A.L. Large-scale parallel server system with multi-component jobs. Queueing Syst 98, 21–48 (2021). https://doi.org/10.1007/s11134-021-09686-y

  • DOI: https://doi.org/10.1007/s11134-021-09686-y
