Optimized Capacity Scheduler for MapReduce Applications in Cloud Environments

  • Adepu Sree Lakshmi
  • N. Subhash Chandra
  • M. BalRaju
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 808)


Most current applications are data-centric and involve a large amount of data processing. Technologies such as Hadoop enable data processing with automatic parallelism. Applications that are both data-intensive and compute-intensive can take advantage of this automatic parallelism and of the methodology of moving computation to the data. In addition, cloud computing enables users to establish clusters with the required number of nodes instantly, and it has made it easy for users to execute large data applications without having to establish or maintain the infrastructure. Because the cloud provides ready-installed infrastructure, running Hadoop on the cloud has become common. The existing schedulers are very effective in static cluster environments but perform less well in virtualized environments. The purpose of this work is to design an effective capacity scheduler for MapReduce applications in virtualized environments such as public clouds, making scheduling decisions more intelligent by using the characteristics of jobs and virtual machines.


Big data, Cloud computing, CloudSim, Hadoop, MapReduce, Virtual machine



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Adepu Sree Lakshmi (1)
  • N. Subhash Chandra (2)
  • M. BalRaju (3)
  1. Geethanjali College of Engineering and Technology, Hyderabad, India
  2. CSE Department, CVR College of Engineering, Hyderabad, India
  3. CSE Department, Swami Vivekanandha Institute of Technology, Hyderabad, India
