Energy- and locality-efficient multi-job scheduling based on MapReduce for heterogeneous datacenter

  • Special Issue Paper
Service Oriented Computing and Applications

Abstract

Job scheduling for MapReduce is a research hot spot, especially in heterogeneous datacenters, where huge energy consumption and operating costs are key challenges. Most previous work considers the scheduling optimization of only a single job. In this paper, we take multiple MapReduce jobs as the research object and focus on the goal of “jointly optimizing the scheduling time, job costs and energy consumption.” To that end, an energy- and locality-efficient MapReduce multi-job scheduling algorithm is developed for the heterogeneous datacenter. Firstly, we use the rack as the basic unit of resource in job scheduling to reduce data communication between jobs and to facilitate energy savings. Secondly, according to the capacity of the heterogeneous racks, we design a multi-job pre-mapping method to optimize the execution order of jobs and jointly optimize the scheduling time, job costs and energy consumption. Based on this pre-mapping method, we can assign each job to virtual machines on the same rack, so as to minimize the number of online racks. This centralized mapping strategy helps to save energy and reduce the data transmission of jobs. Thirdly, the map and reduce tasks of a job are divided into multiple task groups for parallel execution, thereby further reducing data communication and energy consumption. Finally, extensive experimental results demonstrate the advantages of our algorithm.
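
The abstract does not spell out the pre-mapping method itself, so the following Python sketch is only a rough illustration of the general idea of placing each job wholly on one rack while keeping the number of online racks small. It is not the authors' algorithm: it substitutes a simple greedy heuristic (largest job first, prefer racks that are already online, then the tightest fit, then the lowest power draw). The names `Rack`, `pre_map`, `vm_slots` and `idle_power_w`, and all numbers used, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rack:
    name: str
    vm_slots: int        # free VM slots on the rack (hypothetical capacity measure)
    idle_power_w: float  # power drawn once the rack is online (hypothetical)
    jobs: list = field(default_factory=list)

    @property
    def online(self) -> bool:
        # A rack counts as online as soon as it hosts at least one job.
        return bool(self.jobs)


def pre_map(jobs: dict, racks: list) -> dict:
    """Greedily place each job on a single rack.

    jobs maps a job name to the number of VMs it needs.
    Returns a mapping from job name to rack name.
    """
    placement = {}
    # Consider the largest jobs first so that they are not left without room.
    for job_name, demand in sorted(jobs.items(), key=lambda kv: -kv[1]):
        candidates = [r for r in racks if r.vm_slots >= demand]
        if not candidates:
            raise ValueError(f"no single rack can host job {job_name!r}")
        # Prefer online racks (False sorts before True), then the tightest
        # fit, then the rack with the lowest power draw.
        best = min(
            candidates,
            key=lambda r: (not r.online, r.vm_slots - demand, r.idle_power_w),
        )
        best.vm_slots -= demand
        best.jobs.append(job_name)
        placement[job_name] = best.name
    return placement


if __name__ == "__main__":
    racks = [Rack("r1", 8, 300.0), Rack("r2", 4, 150.0), Rack("r3", 8, 400.0)]
    placement = pre_map({"a": 5, "b": 3, "c": 4}, racks)
    print(placement)
    print("online racks:", [r.name for r in racks if r.online])
```

In this toy input, jobs `a` and `b` share rack `r1` and job `c` fills `r2`, so `r3` never needs to be powered on. A real scheduler would also have to weigh scheduling time and job cost, which this sketch ignores.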

Acknowledgements

This work was supported by the Science Research Project of Education Department of Hunan Province (18C0296); the Open Project of State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body (31715010); Hunan Provincial Natural Science Foundation of China (2018JJ2134); Hunan Provincial Young Talents Project (2018RS3095); and Ph.D. research startup foundation of Hunan University of Science and Technology (E51863).

Author information

Corresponding author

Correspondence to Lei Chen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, L., Liu, ZH. Energy- and locality-efficient multi-job scheduling based on MapReduce for heterogeneous datacenter. SOCA 13, 297–308 (2019). https://doi.org/10.1007/s11761-019-00273-x

  • DOI: https://doi.org/10.1007/s11761-019-00273-x
