Optimized Approach (SPCA) for Load Balancing in Distributed HDFS Cluster

  • Original Research
SN Computer Science

A Publisher Correction to this article was published on 28 September 2023

This article has been updated

Abstract

Hadoop is an open-source framework that lets users supply massive data sets as input and carry out computation on them. Hadoop plays a major role in load balancing, allowing the user to configure a cluster of master and slave nodes. Hadoop's default configuration assumes that the machines are homogeneous, but many real-time applications and clusters of nodes have heterogeneous configurations. This paper therefore takes the heterogeneity of the nodes in a cluster into account and builds an algorithm that balances load more efficiently than Hadoop's default balancer, which works well only on homogeneous nodes.
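
The contrast between homogeneous and heterogeneous clusters can be illustrated with a minimal sketch. It is not the paper's SPCA algorithm, whose details are not given in this abstract. It assumes each node has a relative capacity weight and compares an even block distribution with one proportional to those weights. The node names, weights, and function names are hypothetical.

```python
from typing import Dict


def proportional_targets(total_blocks: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Split total_blocks across nodes in proportion to each node's weight.

    The weights are an assumed stand-in for node capability (CPU, disk, etc.).
    """
    total_weight = sum(weights.values())
    # Exact (possibly fractional) share of blocks for every node.
    exact = {node: total_blocks * w / total_weight for node, w in weights.items()}
    # Round each share down, then hand out the leftover blocks to the
    # nodes with the largest fractional parts so the total is preserved.
    targets = {node: int(share) for node, share in exact.items()}
    leftover = total_blocks - sum(targets.values())
    by_fraction = sorted(weights, key=lambda n: exact[n] - targets[n], reverse=True)
    for node in by_fraction[:leftover]:
        targets[node] += 1
    return targets


def moves_needed(current: Dict[str, int], targets: Dict[str, int]) -> Dict[str, int]:
    """Blocks each node should gain (positive) or shed (negative)."""
    return {node: targets[node] - current.get(node, 0) for node in targets}


# Hypothetical cluster: node-a is twice as capable as the other two nodes.
weights = {"node-a": 4.0, "node-b": 2.0, "node-c": 2.0}
# An even split, which is only a good fit when all nodes are alike.
current = {"node-a": 40, "node-b": 40, "node-c": 40}

targets = proportional_targets(sum(current.values()), weights)
moves = moves_needed(current, targets)
print(targets)
print(moves)
```

In this sketch the more capable node ends up with a larger share of the blocks, whereas an even split treats all three nodes as identical.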

Author information

Corresponding author

Correspondence to K. Manjula.

Additional information

This article is part of the topical collection “Advances in Computational Intelligence Paradigms and Applications” guest edited by Young Lee and S. Meenakshi Sundaram.

About this article

Cite this article

Manjula, K., Meenakshi Sundaram, S. Optimized Approach (SPCA) for Load Balancing in Distributed HDFS Cluster. SN COMPUT. SCI. 1, 102 (2020). https://doi.org/10.1007/s42979-020-0107-8

  • DOI: https://doi.org/10.1007/s42979-020-0107-8
