Machine learning-driven energy-efficient load balancing for real-time heterogeneous systems

Cluster Computing

Abstract

Load balancing plays a critical role in ensuring system stability and optimal performance, and it has therefore been studied extensively across diverse computing domains, particularly in heterogeneous systems. Such systems integrate computing devices with distinct architectures and computational power, each designed to execute a particular type of workload with diverse and immediate requirements. This research introduces a novel load balancing approach that uses machine learning to monitor the load distribution of heterogeneous systems in real time. It estimates the load of the system’s devices according to our proposed definition of load. This definition provides an accurate measurement of the actual load, enabling our approach to detect imbalances and locate underloaded devices. Our approach then assigns applications to these underloaded devices to achieve load balance. We compared our proposed approach with various established load balancing methods from diverse computing domains. The evaluation covered three criteria: load balancing, workload execution time, and energy consumption. Additionally, we introduced a novel metric called “IMBALANCE” that quantifies the difference in load distribution among the system devices. We implemented each load balancing approach in a real-time heterogeneous system consisting of a master host and three worker servers, each equipped with a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU). To validate our approach comprehensively, we conducted experiments in three distinct scenarios involving different types of workloads with varying levels of computing intensity. The experimental results consistently demonstrate significant performance improvements for our approach across all scenarios. Our approach excels at minimizing system imbalance, which in turn improves load balancing. This leads to reduced workload execution times and shorter device idle periods, ultimately improving energy efficiency.
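
The detect-and-assign loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the proposed load definition nor the formula behind IMBALANCE, so the sketch assumes that each device's load is already available as a single number (in the paper it is estimated by a machine learning model) and that imbalance is the spread between the most and least loaded device. The names `Device`, `imbalance`, and `assign` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """A CPU or GPU on a worker server (hypothetical structure)."""
    name: str
    load: float = 0.0  # assumed: an already-estimated load value in [0, 1]


def imbalance(devices):
    """Spread between the most and least loaded device.

    Assumed stand-in for the paper's IMBALANCE metric, whose exact
    formula is not given in the abstract.
    """
    loads = [d.load for d in devices]
    return max(loads) - min(loads)


def assign(devices, app_cost):
    """Place one application on the least loaded device and return that device.

    app_cost is an assumed estimate of the load the application adds.
    """
    target = min(devices, key=lambda d: d.load)
    target.load += app_cost
    return target


devices = [Device("cpu0", 0.75), Device("gpu0", 0.25), Device("cpu1", 0.5)]
before = imbalance(devices)      # spread before the new application arrives
chosen = assign(devices, 0.25)   # the application goes to the underloaded device
after = imbalance(devices)       # spread after the assignment
print(chosen.name, before, after)
```

In this toy run the application lands on `gpu0`, and the spread between devices shrinks, which is the effect the abstract attributes to assigning work to underloaded devices.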

Data availability

Enquiries about data availability should be directed to the authors.

Funding

The authors have not disclosed any funding.

Author information

Contributions

RTA is the primary author of this study, which was conducted under the guidance of BG and SAM. RTA undertook the research, implemented the proposed solution, conducted the experiments, and wrote the initial draft. BG and SAM participated actively throughout the research process, offering guidance and insights in regular meetings. BG contributed expertise in cluster/cloud computing, while SAM facilitated the use of machine learning tools, contributing to both the methodology and the validation of the learning process. The comprehensive review and refinement of the first and second drafts were overseen by ORBM, who restructured the initial draft and validated the final draft.

Corresponding author

Correspondence to Taha Abdelazziz Rahmani.

Ethics declarations

Competing interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rahmani, T.A., Belalem, G., Mahmoudi, S.A. et al. Machine learning-driven energy-efficient load balancing for real-time heterogeneous systems. Cluster Comput (2024). https://doi.org/10.1007/s10586-023-04215-3

  • DOI: https://doi.org/10.1007/s10586-023-04215-3
