Abstract
Big data, a term that has recently gained worldwide attention, refers to large-scale data characterized by huge volume, wide variety, and complex structure. Over the last few years, cloud computing has grown rapidly in popularity across internet technology and computing in general, and cloud applications now continually generate big data. Research on big data faces several pressing problems, such as how to store, analyze, and visualize the data to produce further results. This paper first reviews recently developed information technologies in the field of big data. It then outlines the key problems concerning big data: proper load balancing, the storage and processing of small files, and de-duplication.
Keywords
- Big data
- Key technologies
- Hadoop
- Load balancing
- Storage
Copyright information
© 2016 Springer India
About this paper
Cite this paper
Dev, D., Patgiri, R. (2016). A Survey of Different Technologies and Recent Challenges of Big Data. In: Nagar, A., Mohapatra, D., Chaki, N. (eds) Proceedings of 3rd International Conference on Advanced Computing, Networking and Informatics. Smart Innovation, Systems and Technologies, vol 44. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2529-4_56
DOI: https://doi.org/10.1007/978-81-322-2529-4_56
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2528-7
Online ISBN: 978-81-322-2529-4
eBook Packages: Engineering, Engineering (R0)