Big Data Hadoop MapReduce Job Scheduling: A Short Survey

  • N. Deshai
  • B. V. D. S. Sekhar
  • S. Venkataramana
  • K. Srinivas
  • G. P. S. Varma
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 862)


The digital world is moving from the petabyte to the zettabyte era: complex information is collected continuously from device-to-device communication, social sites, and other sources, and is referred to as big data. Storing and processing this data is difficult without scalable and efficient schedulers. Data in the digital world keeps doubling, and database sizes are growing from terabytes toward zettabytes. Apache Hadoop, an open-source framework, handles such volumes through its two core components, the Hadoop Distributed File System (HDFS) and MapReduce, which store and process very large collections of text, image, audio, and video data and serve different services on them. Designing and selecting a well-organized scheduler is a key factor in choosing nodes, optimizing execution, and achieving high performance on complex data. This paper surveys Hadoop scheduling algorithms and reviews their uses and limitations.


Big data, Hadoop, HDFS, MapReduce, Scheduling



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • N. Deshai
    • 1
  • B. V. D. S. Sekhar
    • 1
  • S. Venkataramana
    • 1
  • K. Srinivas
    • 1
  • G. P. S. Varma
    • 1
  1. Department of Information Technology, S.R.K.R. Engineering College, Bhimavaram, India
