Abstract
With the growth of the cloud computing market, the number and scale of cloud data centers are expanding rapidly. While cloud data centers provide a large amount of computing power, generating tremendous energy consumption has become a fundamental issue in the financial and environmental fields. Improving quality of service and reducing energy costs are fundamental challenges for next-generation cloud data centers. Task scheduling in cloud data centers grows increasingly complex due to the heterogeneity of computing resources, intricate dependencies of jobs and rising expenses resulting from high energy consumption. Efficiently utilizing computing resources is crucial, so it is necessary to develop optimal strategies for job scheduling. This paper proposes a reinforcement learning-based task scheduler (E2DSched) for online scheduling of randomly arriving directed acyclic graph jobs in cloud data centers. E2DSched divides the scheduling process into three layers: task selection layer, server selection layer and frequency control layer. It achieves joint optimization of energy consumption and quality of service through three-layer cooperation. Finally, we compare E2DSched with various other algorithms, and the results show that E2DSched can provide excellent service with less energy consumption.
Similar content being viewed by others
Data availability
All of the material is owned by the authors and/or no permissions are required.
References
Council NRD (2014) Scaling up energy efficiency across the data center industry: evaluating key drivers and barriers. In: Issue Paper
Wang Q, Mei X, Liu H et al (2022) Energy-aware non-preemptive task scheduling with deadline constraint in dvfs-enabled heterogeneous clusters. IEEE Trans Parallel Distrib Syst 33(12):4083–4099
Yang Y, Shen H (2021) Deep reinforcement learning enhanced greedy optimization for online scheduling of batched tasks in cloud HPC systems. IEEE Trans Parallel Distrib Syst 33(11):3003–3014
Bohrer, P., Elnozahy, E.N., Keller, T., et al: The case for power management in web servers. In: Power Aware Computing, pp. 261–289 (2002)
Liu Y, Wei X, Xiao J et al (2020) Energy consumption and emission mitigation prediction based on data center traffic and PUE for global data centers. Glob. Energy Interconnect. 3(3):272–282
Fan X, Weber W-D, Barroso LA (2007) Power provisioning for a warehouse-sized computer. ACM SIGARCH Comput. Archit. News 35(2):13–23
Tian H, Zheng Y, Wang W (2019) Characterizing and synthesizing task dependencies of data-parallel jobs in Alibaba cloud. In: Proceedings of the ACM Symposium on Cloud Computing, pp. 139–151
Khallouli W, Huang J (2022) Cluster resource scheduling in cloud computing: literature review and research challenges. J Supercomput 1–46
Zhang D, Dai D, He Y, et al (2020) Rlscheduler: an automated HPC batch job scheduler using reinforcement learning. In: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, pp 1–15
Fan Y, Lan Z, Childers T et al (2021) Deep reinforcement agent for scheduling in HPC. In: IEEE International Parallel and Distributed Processing Symposium. IEEE, pp 807–816
Topcuoglu H, Hariri S, Wu M-Y (2002) Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans Parallel Distrib Syst 13(3):260–274
Djigal H, Feng J, Lu J, Ge J (2020) IPPTS: an efficient algorithm for scientific workflow scheduling in heterogeneous computing systems. IEEE Trans Parallel Distrib Syst 32(5):1057–1071
Sulaiman M, Halim Z, Waqas M et al (2021) A hybrid list-based task scheduling scheme for heterogeneous computing. J Supercomput 77:10252–10288
Liu J, Ren J, Dai W et al (2019) Online multi-workflow scheduling under uncertain task execution time in IaaS clouds. IEEE Trans Cloud Comput 9(3):1180–1194
Ueter N, Günzel M, von der Brüggen G, Chen J-J (2023) Parallel path progression DAG scheduling. IEEE Trans Comput
Guan F, Peng L, Qiao J (2023) A new federated scheduling algorithm for arbitrary-deadline DAG tasks. IEEE Trans Comput
Senapati D, Rajesh K, Karfa C, Sarkar A (2023) TMDS: Temperature-aware makespan minimizing DAG scheduler for heterogeneous distributed systems. ACM Trans Des Autom Electron Syst 28(6):1–22
Shao S, Gu S, Sun B, Sha EH-M, Zhuge Q (2023) Fairness scheduling for tasks with different real-time level on heterogeneous systems. In: 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS). IEEE, pp 625–632
Wu Q, Wu Z, Zhuang Y et al (2018) Adaptive DAG tasks scheduling with deep reinforcement learning. In: International Conference on Algorithms and Architectures for Parallel Processing. Springer, pp 477–490
Mao H, Schwarzkopf M, Venkatakrishnan SB et al (2019) Learning scheduling algorithms for data processing clusters. In: Proceedings of the ACM Special Interest Group on Data Communication, pp 270–288
Lin C-C, Syu Y-C, Chang C-J et al (2015) Energy-efficient task scheduling for multi-core platforms with per-core DVFS. J Parallel Distrib Comput 86:71–81
Jin P, Hao X, Wang X et al (2018) Energy-efficient task scheduling for CPU-intensive streaming jobs on Hadoop. IEEE Trans Parallel Distrib Syst 30(6):1298–1311
Cheng D, Zhou X, Lama P et al (2017) Energy efficiency aware task assignment with DVFS in heterogeneous Hadoop clusters. IEEE Trans Parallel Distrib Syst 29(1):70–82
Chen L, Li J, Ma R et al (2020) Balancing power and performance in HPC clouds. Comput J 63(1):880–899
Li J, Zhang X, Wei Z et al (2021) Energy-aware task scheduling optimization with deep reinforcement learning for large-scale heterogeneous systems. CCF Trans High Perform Comput 3:383–392
Yi D, Zhou X, Wen Y et al (2019) Toward efficient compute-intensive job allocation for green data centers: A deep reinforcement learning approach. In: International Conference on Distributed Computing Systems. IEEE, pp 634–644
Liu N, Li Z, Xu J et al (2017) A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning. In: International Conference on Distributed Computing Systems. IEEE, pp 372–382
Liu D, Yang S-G, He Z et al (2021) CARTAD: Compiler-assisted reinforcement learning for thermal-aware task scheduling and dvfs on multicores. IEEE Trans Comput Aided Des Integr Circuits Syst 41(6):1813–1826
Huang J, Li R, Jiao X et al (2020) Dynamic DAG scheduling on multiprocessor systems: reliability, energy, and makespan. IEEE Trans Comput Aided Des Integr Circuits Syst 39(11):3336–3347
Safari M, Khorsand R (2018) PL-DVFS: combining power-aware list-based scheduling algorithm with DVFS technique for real-time tasks in cloud computing. J Supercomput 74:5578–5600
Chen R, Chen X, Yang C (2022) Using a task dependency job-scheduling method to make energy savings in a cloud computing environment. J Supercomput 78(3):4550–4573
Hosseinioun P, Kheirabadi M, Tabbakh SRK et al (2020) A new energy-aware tasks scheduling approach in fog computing using hybrid meta-heuristic algorithm. J Parallel Distrib Comput 143:88–96
Zhu Z, Zhang W, Chaturvedi V et al (2019) Energy minimization for multicore platforms through DVFS and VR phase scaling with comprehensive convex model. IEEE Trans Comput Aided Des Integr Circuits Syst 39(3):686–699
Huang H, Lin M, Yang LT et al (2019) Autonomous power management with double-q reinforcement learning method. IEEE Trans Industr Inf 16(3):1938–1946
Wang Y, Zhang W, Hao M et al (2021) Online power management for multi-cores: a reinforcement learning based approach. IEEE Trans Parallel Distrib Syst 33(4):751–764
Hu B, Yang X, Zhao M (2023) Online energy-efficient scheduling of DAG tasks on heterogeneous embedded platforms. J Syst Architect 140:102894
Bhuiyan A, Pivezhandi M, Guo Z, Li J, Modekurthy VP, Saifullah A (2023) Precise scheduling of dag tasks with dynamic power management. In: 35th Euromicro Conference on Real-Time Systems (ECRTS 2023). Schloss Dagstuhl-Leibniz-Zentrum für Informatik
Sun Z, Huang H, Li Z, Gu C, Xie R, Qian B (2023) Efficient, economical and energy-saving multi-workflow scheduling in hybrid cloud. Expert Syst Appl 228:120401
Swarup S, Shakshuki EM, Yasar A (2021) Task scheduling in cloud using deep reinforcement learning. Procedia Comput Sci 184:42–51
Zhong Z, He J, Rodriguez MA et al (2020) Heterogeneous task co-location in containerized cloud computing environments. In: 2020 IEEE 23rd International Symposium on Real-Time Distributed Computing. IEEE, pp 79–88
Shen S, Van Beek V, Iosup A (2015) Statistical characterization of business-critical workloads hosted in cloud datacenters. In: IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE, pp 465–474
Synthetic Workflow Generators. https://github.com/pegasus-isi/WorkflowGenerator. Accessed 14 January 2024
Standard Performance Evaluation Corporation. https://www.spec.org/power/. Accessed 21 January 2023
Palladi ASV, Starikovskiy A (2001) The ondemand governor: past, present and future. In: Proceedings of Linux Symposium, vol 2, p 3
Grandl R, Ananthanarayanan G, Kandula S et al (2014) Multi-resource packing for cluster schedulers. ACM SIGCOMM Comput Commun Rev 44(4):455–466
Koslovski GP, Pereira K, Albuquerque PR (2024) DAG-based workflows scheduling using actor-critic deep reinforcement learning. Futur Gener Comput Syst 150:354–363
Funding
This work was supported by the Natural Science Foundation of China (62172327).
Author information
Authors and Affiliations
Contributions
WY and MZ conceived the design and realized the work, and JL and XZ checked and perfected the work.
Corresponding author
Ethics declarations
Ethical approval
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yang, W., Zhao, M., Li, J. et al. Energy-efficient DAG scheduling with DVFS for cloud data centers. J Supercomput (2024). https://doi.org/10.1007/s11227-024-06035-7
Accepted:
Published:
DOI: https://doi.org/10.1007/s11227-024-06035-7