Reinforcement learning based energy efficient resource allocation strategy of MapReduce jobs with deadline constraint

Cluster Computing

Abstract

Big Data applications consume substantial energy when processing massive volumes of data in a heterogeneous environment, which makes reducing that consumption an important research topic. Conserving energy while meeting a deadline constraint in such an environment is particularly challenging. In this paper, we formulate the scheduling of MapReduce jobs as a minimization problem whose decision variables are subject to a user-specified deadline constraint. We then propose a Learning Automata-based MapReduce Scheduling (LA-MRS) algorithm that determines the resource allocation and reduces the energy consumption of MapReduce tasks in a heterogeneous environment. We evaluate LA-MRS on HiBench benchmark workloads: Enhanced DFSIO, Nutch Indexing, k-means Clustering, and Hive Join. The experiments show that LA-MRS schedules MapReduce tasks while consuming around 25% less energy than the existing algorithms it is compared against.
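
To make the learning-automata idea concrete, the following is a minimal sketch of a generic linear reward-inaction automaton that learns which node to prefer for a task under a deadline. It is not the LA-MRS algorithm itself, whose details are not given in this abstract. The function names, the (runtime, power) node model, the energy-as-power-times-runtime estimate, and the reward rule are all assumptions chosen for illustration.

```python
import random


def lri_update(probs, chosen, rewarded, rate=0.1):
    """Linear reward-inaction step (a standard learning-automata scheme).

    On reward, probability mass moves toward the chosen action; on penalty,
    the vector is returned unchanged ("inaction").
    """
    if not rewarded:
        return list(probs)
    return [p + rate * (1.0 - p) if i == chosen else p * (1.0 - rate)
            for i, p in enumerate(probs)]


def choose_node(nodes, deadline, rounds=5000, rate=0.02, seed=0):
    """Learn a preferred node from hypothetical (runtime_s, power_w) pairs.

    Assumed reward rule: a node that misses the deadline is never rewarded;
    a feasible node is rewarded with probability best_energy / its_energy,
    so the lowest-energy feasible node is rewarded most often.
    """
    energies = [t * p for t, p in nodes]  # assumed model: energy = power * time
    feasible = [e for (t, _), e in zip(nodes, energies) if t <= deadline]
    if not feasible:
        raise ValueError("no node can meet the deadline")
    best_energy = min(feasible)

    rng = random.Random(seed)
    probs = [1.0 / len(nodes)] * len(nodes)
    for _ in range(rounds):
        action = rng.choices(range(len(nodes)), weights=probs)[0]
        runtime, _ = nodes[action]
        meets_deadline = runtime <= deadline
        rewarded = meets_deadline and rng.random() < best_energy / energies[action]
        probs = lri_update(probs, action, rewarded, rate)

    best = max(range(len(nodes)), key=lambda i: probs[i])
    return best, probs


if __name__ == "__main__":
    # Hypothetical cluster: fast but power-hungry, balanced, slow but frugal.
    cluster = [(10.0, 300.0), (15.0, 100.0), (40.0, 30.0)]
    node, distribution = choose_node(cluster, deadline=20.0)
    print("preferred node index:", node)
    print("action probabilities:", [round(p, 3) for p in distribution])
```

In this sketch the third node has the lowest energy but misses the 20-second deadline, so it is never reinforced. A small learning rate trades slower convergence for a lower chance of settling on a suboptimal node.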

Data availability

None

Funding

Not applicable

Author information

Contributions

The author contributed to all aspects of this work.

Corresponding author

Correspondence to Greeshma Lingam.

Ethics declarations

Conflict of interest

This manuscript has not been submitted to, nor is under review at, another journal or other publishing venue.

Ethical approval

Yes

Consent for publication

Yes

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lingam, G. Reinforcement learning based energy efficient resource allocation strategy of MapReduce jobs with deadline constraint. Cluster Comput 26, 2719–2735 (2023). https://doi.org/10.1007/s10586-022-03761-6

  • DOI: https://doi.org/10.1007/s10586-022-03761-6
