Energy-Aware Heuristic Scheduling Using Bin Packing MapReduce Scheduler for Heterogeneous Workloads Performance in Big Data

  • Research Article-Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

Big data refers to large, diverse data types from heterogeneous sources such as mobile devices, the web, social media, and the Internet of Things. The cloud offers a wide variety of tools to handle big data on demand, on a pay-per-service basis, through clusters of virtual machines hosted on heterogeneous physical machines across cloud datacenters. The primary goal is to analyze big data at the point of creation and at scale through data-intensive computing. Hadoop MapReduce helps address scalability and complexity by running more jobs in a virtual cluster spread across the racks of a distributed cloud datacenter. By default, however, MapReduce schedulers do not account for heterogeneity when running computational jobs: virtual machines execute equal numbers of blocks regardless of their capacity, which degrades performance. The virtual machines in a virtual cluster are also not energy-aware, which matters greatly in a heterogeneous environment. Hence, we propose the dynamic performance heuristic-based bin packing (DP-HBP) MapReduce scheduler, which increases resource utilization in heterogeneous virtual machines. In our experiments, the proposed DP-HBP scheduler improves makespan and latency by 49% and 39% over the roulette wheel scheme and heuristic-based MapReduce job schedulers. With DP-HBP, the average share of data non-local executions is 27%, which is lower than with the existing schedulers. Resource utilization, measured by the average number of unused vCPUs and memory, improves by 34% and 41%, which enhances workload performance when handling big data in a heterogeneous environment.
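
To make the bin packing idea concrete, the following is a minimal sketch of a generic best-fit-decreasing heuristic that places tasks on virtual machines with unequal vCPU and memory capacities. It is not the DP-HBP algorithm itself, whose details are not given in this preview. The names `VM` and `best_fit_decreasing`, the `(task_id, vcpu, mem_gb)` task tuples, and the slack measure are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    """Hypothetical virtual machine with its remaining free capacity."""
    name: str
    vcpu: int      # free vCPUs
    mem_gb: int    # free memory in GB
    tasks: list = field(default_factory=list)


def best_fit_decreasing(tasks, vms):
    """Place (task_id, vcpu, mem_gb) tuples on VMs in place.

    Returns the ids of tasks that fit on no VM.
    """
    unplaced = []
    # Consider the most demanding tasks first (the "decreasing" part).
    ordered = sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True)
    for task_id, vcpu, mem_gb in ordered:
        # Only VMs with enough free vCPU and memory are candidates.
        candidates = [vm for vm in vms if vm.vcpu >= vcpu and vm.mem_gb >= mem_gb]
        if not candidates:
            unplaced.append(task_id)
            continue
        # Best fit: pick the VM left with the least slack after placement.
        # Adding vCPUs and GB is a simplification assumed for this sketch.
        best = min(candidates, key=lambda vm: (vm.vcpu - vcpu) + (vm.mem_gb - mem_gb))
        best.vcpu -= vcpu
        best.mem_gb -= mem_gb
        best.tasks.append(task_id)
    return unplaced
```

Unlike a scheme that gives every virtual machine the same number of blocks, a capacity-aware placement of this kind lets larger machines take more work and leaves fewer vCPUs and less memory idle, which is the kind of utilization the abstract reports on.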



Author information

Corresponding author

Correspondence to S. Aarthee.

Cite this article

Aarthee, S., Prabakaran, R. Energy-Aware Heuristic Scheduling Using Bin Packing MapReduce Scheduler for Heterogeneous Workloads Performance in Big Data. Arab J Sci Eng 48, 1891–1905 (2023). https://doi.org/10.1007/s13369-022-06963-7

  • DOI: https://doi.org/10.1007/s13369-022-06963-7
