A Survey of Different Technologies and Recent Challenges of Big Data

Dev, Dipayan; Patgiri, Ripon

doi:10.1007/978-81-322-2529-4_56

Dipayan Dev & Ripon Patgiri

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 44)

Abstract

Big Data, a term that has drawn worldwide attention in recent years, refers to large-scale data characterized by huge volume, wide variety, and genuinely complex structure. Over the last few years, internet technology and the computing world have seen rapid growth in the popularity of cloud computing, and cloud applications are, as a consequence, continually generating big data. Several pressing research problems are associated with big data, such as how to store, analyze, and visualize it in order to generate further outcomes. This paper first surveys recently developed information technologies in the field of big data. It then outlines the major open problems concerning big data: proper load balancing, storage and processing of small files, and de-duplication.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology Silchar, Assam, 788010, India
Dipayan Dev & Ripon Patgiri

Corresponding author

Correspondence to Dipayan Dev.

Editor information

Editors and Affiliations

Department of Computer Science, Liverpool Hope University, Liverpool, United Kingdom
Atulya Nagar
Department of Computer Science and Engineering, National Institute of Technology Rourkela, Rourkela, India
Durga Prasad Mohapatra
Computer Science & Engineering, University of Calcutta, Kolkata, West Bengal, India
Nabendu Chaki

Cite this paper

Dev, D., Patgiri, R. (2016). A Survey of Different Technologies and Recent Challenges of Big Data. In: Nagar, A., Mohapatra, D., Chaki, N. (eds) Proceedings of 3rd International Conference on Advanced Computing, Networking and Informatics. Smart Innovation, Systems and Technologies, vol 44. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2529-4_56

DOI: https://doi.org/10.1007/978-81-322-2529-4_56
Published: 03 September 2015
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2528-7
Online ISBN: 978-81-322-2529-4
eBook Packages: Engineering, Engineering (R0)
