Energy-efficient DAG scheduling with DVFS for cloud data centers

The Journal of Supercomputing

Abstract

With the growth of the cloud computing market, the number and scale of cloud data centers are expanding rapidly. Cloud data centers provide a large amount of computing power, but their tremendous energy consumption has become a fundamental financial and environmental issue. Improving quality of service while reducing energy costs is therefore a central challenge for next-generation cloud data centers. Task scheduling in these data centers is growing increasingly complex because of heterogeneous computing resources, intricate dependencies among jobs, and the rising expense of high energy consumption. Because efficient use of computing resources is crucial, optimal job scheduling strategies are needed. This paper proposes E2DSched, a reinforcement learning-based task scheduler for online scheduling of randomly arriving directed acyclic graph (DAG) jobs in cloud data centers. E2DSched divides the scheduling process into three layers: a task selection layer, a server selection layer, and a frequency control layer. The three layers cooperate to jointly optimize energy consumption and quality of service. Finally, we compare E2DSched with various other algorithms, and the results show that E2DSched can provide excellent service with less energy consumption.
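
The three-layer split described in the abstract can be illustrated with a minimal sketch. It is not E2DSched itself: the abstract does not give the learned policies, state encoding, or power model, so each layer below is a simple hand-written placeholder heuristic rather than a reinforcement learning agent. The names `Task`, `Server`, `select_task`, `select_server`, `select_frequency`, and `schedule` are hypothetical, and the cubic dynamic-power term is an assumption borrowed from common DVFS modelling practice, not from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    work: float                                   # abstract work units (hypothetical)
    parents: list = field(default_factory=list)   # names of predecessor tasks


@dataclass
class Server:
    name: str
    freqs: tuple                 # available normalized CPU frequencies
    static_power: float = 1.0    # assumed frequency-independent power draw
    busy_until: float = 0.0      # time at which the server becomes free


def power(server, f):
    # Assumed model: static power plus a cubic dynamic term in frequency.
    return server.static_power + f ** 3


def select_task(ready):
    # Layer 1 placeholder: pick the ready task with the most work.
    return max(ready, key=lambda t: t.work)


def select_server(servers):
    # Layer 2 placeholder: pick the server that becomes free first.
    return min(servers, key=lambda s: s.busy_until)


def select_frequency(task, server):
    # Layer 3 placeholder: pick the frequency minimizing power * runtime.
    return min(server.freqs, key=lambda f: power(server, f) * task.work / f)


def schedule(tasks, servers):
    """Schedule one DAG job; returns (plan, makespan, energy)."""
    finish, plan, energy = {}, [], 0.0
    pending = list(tasks)
    while pending:
        # A task is ready once all of its parents have been scheduled.
        # Assumes an acyclic graph, so `ready` is never empty here.
        ready = [t for t in pending if all(p in finish for p in t.parents)]
        task = select_task(ready)
        server = select_server(servers)
        f = select_frequency(task, server)
        start = max([server.busy_until] + [finish[p] for p in task.parents])
        runtime = task.work / f
        finish[task.name] = start + runtime
        server.busy_until = finish[task.name]
        energy += power(server, f) * runtime
        plan.append((task.name, server.name, f))
        pending.remove(task)
    return plan, max(finish.values()), energy
```

In this sketch, running at the highest frequency does not minimize energy, because the static power term penalizes long runtimes while the cubic term penalizes high frequencies. A learned frequency control layer would have to balance the same trade-off against quality-of-service targets, which the placeholder above ignores.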


Data availability

All of the material is owned by the authors and/or no permissions are required.

Funding

This work was supported by the Natural Science Foundation of China (62172327).

Author information

Contributions

WY and MZ conceived the design and carried out the work; JL and XZ reviewed and refined it.

Corresponding author

Correspondence to Xingjun Zhang.

Ethics declarations

Ethical approval

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, W., Zhao, M., Li, J. et al. Energy-efficient DAG scheduling with DVFS for cloud data centers. J Supercomput (2024). https://doi.org/10.1007/s11227-024-06035-7

  • DOI: https://doi.org/10.1007/s11227-024-06035-7
