Network SLO-aware container scheduling in Kubernetes

The Journal of Supercomputing

Abstract

In clouds, services run in their own containers and have service-level objectives (SLOs) that significantly affect service quality. However, Kubernetes, a widely used container orchestration platform, does not take network SLOs into account when scheduling containers. This paper proposes a new container scheduling technique consisting of a cloud-level scheduler and a node-level scheduler. The cloud-level scheduler selects the node best suited to satisfying the network SLO, and the node-level scheduler adjusts the container's CPU allocation so that the SLO is satisfied on the selected node. We implement the cloud-level scheduler in Kubernetes and the node-level scheduler as a Linux kernel module, and evaluate them using simulation and actual deployment. The evaluation results show that the cloud-level scheduler reduces scheduling overhead by 22\(\times\) compared to DRF, a representative multi-resource scheduling technique. In addition, the node-level scheduler increases the number of containers that satisfy their SLOs by 2.5\(\times\) compared to native Kubernetes, which will significantly enhance the service quality of user-facing services.
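The two-level split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their prototype is a Kubernetes scheduler plus a Linux kernel module), and the abstract does not state the selection or adjustment policies. Everything here is an assumption made for illustration: the names `Node`, `select_node`, and `adjust_cpu_quota`, expressing the SLO as a bandwidth target in Mbps, the "most spare bandwidth" placement rule, and the fixed-step CPU quota adjustment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node model; the fields are illustrative assumptions.
    name: str
    link_capacity_mbps: float    # total bandwidth of the node's network link
    reserved_mbps: float = 0.0   # bandwidth already promised to placed containers

    def free_mbps(self) -> float:
        return self.link_capacity_mbps - self.reserved_mbps


def select_node(nodes: List[Node], slo_mbps: float) -> Optional[Node]:
    """Cloud level (assumed policy): pick the feasible node with the most
    spare bandwidth and reserve the SLO amount on it."""
    feasible = [n for n in nodes if n.free_mbps() >= slo_mbps]
    if not feasible:
        return None  # no node can currently satisfy this network SLO
    best = max(feasible, key=Node.free_mbps)
    best.reserved_mbps += slo_mbps
    return best


def adjust_cpu_quota(quota_us: int, measured_mbps: float, slo_mbps: float,
                     step_us: int = 5_000, min_us: int = 10_000,
                     max_us: int = 100_000) -> int:
    """Node level (assumed policy): one feedback step on a CPU quota,
    expressed in microseconds of CPU time per scheduling period."""
    if measured_mbps < slo_mbps:
        quota_us += step_us   # below the SLO: give the container more CPU
    elif measured_mbps > 1.1 * slo_mbps:
        quota_us -= step_us   # more than 10% above the SLO: reclaim CPU
    # Clamp the result to the allowed quota range.
    return max(min_us, min(max_us, quota_us))
```

In this sketch, `select_node` would run once per placement decision, while `adjust_cpu_quota` would be called periodically with a fresh throughput measurement for each container on the chosen node.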

Data Availability

Data and materials are available on request from the authors.

Notes

  1. The source code of the prototype implementation can be found at https://github.com/kiiimes/DepCon.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT, MSIT) (No. RS-2022-00166222 and No. 2023R1A2C3004145) and Basic Science Research Program funded by the Ministry of Education (NRF-2021R1A6A1A13044830).

Author information

Contributions

EK and KL wrote the main manuscript text and EK prepared the figures. CY reviewed the manuscript.

Corresponding authors

Correspondence to Kyungwoon Lee or Chuck Yoo.

Ethics declarations

Conflict of interest

The authors declare that there are no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Ethics approval

This declaration is not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kim, E., Lee, K. & Yoo, C. Network SLO-aware container scheduling in Kubernetes. J Supercomput 79, 11478–11494 (2023). https://doi.org/10.1007/s11227-023-05122-5

  • DOI: https://doi.org/10.1007/s11227-023-05122-5
