Abstract
In clouds, services run in their own containers and have service-level objectives (SLOs) that significantly affect service quality. However, Kubernetes, a widely used container orchestration platform, does not take network SLOs into account when scheduling containers. This paper proposes a new container scheduling technique consisting of a cloud-level scheduler and a node-level scheduler. The cloud-level scheduler selects the node best suited to satisfying the network SLO, and the node-level scheduler adjusts the container's CPU allocation so that the SLO is met on the selected node. We implement the cloud-level scheduler in Kubernetes and the node-level scheduler as a Linux kernel module, and evaluate them through simulation and an actual deployment. The evaluation results show that the cloud-level scheduler reduces scheduling overhead by 22\(\times\) compared to DRF, a representative multi-resource scheduling technique. In addition, the node-level scheduler increases the number of containers that satisfy their SLOs by 2.5\(\times\) compared to native Kubernetes, which will significantly enhance the quality of user-facing services.
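The two-level idea can be illustrated with a toy sketch. This is not the authors' algorithm, which the abstract does not spell out. It assumes a network SLO expressed as a throughput target in Mbps, a hypothetical `Node` record tracking bandwidth already promised to other containers, and a simple proportional rule for the CPU quota. All names and the clamping bounds are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    link_capacity_mbps: float  # assumed: total NIC bandwidth of the node
    reserved_mbps: float       # assumed: sum of SLOs already placed on it


def select_node(nodes, slo_mbps):
    """Cloud level (toy rule): among nodes that can still fit the SLO,
    pick the one with the most spare bandwidth; None if no node fits."""
    feasible = [n for n in nodes
                if n.link_capacity_mbps - n.reserved_mbps >= slo_mbps]
    if not feasible:
        return None
    return max(feasible, key=lambda n: n.link_capacity_mbps - n.reserved_mbps)


def adjust_cpu_quota(quota_us, measured_mbps, slo_mbps,
                     min_us=1_000, max_us=100_000):
    """Node level (toy rule): scale the CPU quota by the ratio of the SLO
    to the measured throughput, clamped to [min_us, max_us]."""
    if measured_mbps <= 0:
        return max_us  # no traffic observed yet: grant the maximum quota
    scaled = int(quota_us * slo_mbps / measured_mbps)
    return max(min_us, min(max_us, scaled))
```

In this sketch a container that reaches only half of its target throughput would have its quota doubled on the next adjustment, while one that exceeds its target would give CPU time back to co-located containers.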
Data Availability
Data and materials are available on request from the authors.
Notes
The source code of the prototype implementation can be found at https://github.com/kiiimes/DepCon.
Funding
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT, MSIT) (No. RS-2022-00166222 and No. 2023R1A2C3004145) and Basic Science Research Program funded by the Ministry of Education (NRF-2021R1A6A1A13044830).
Author information
Contributions
EK and KL wrote the main manuscript text and EK prepared the figures. CY reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that there are no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.
Ethics approval
This declaration is not applicable.
About this article
Cite this article
Kim, E., Lee, K. & Yoo, C. Network SLO-aware container scheduling in Kubernetes. J Supercomput 79, 11478–11494 (2023). https://doi.org/10.1007/s11227-023-05122-5
DOI: https://doi.org/10.1007/s11227-023-05122-5