Big Data in Cloud Computing: A Review of Key Technologies and Open Issues

  • Conference paper
Advances in Internet, Data & Web Technologies (EIDWT 2018)

Abstract

Academia, industry, and government are all involved in big data projects, and research on big data applications and technologies is actively under way. This paper presents a literature review of recent research on key technologies and open issues for big data management via cloud computing. Its goal is to identify and evaluate the main technology components and their impact on cloud-based big data implementations. To this end, we review 40 publications from the four years 2014–2017 and classify the results by their main technical aspects: frameworks, databases and data processing techniques, and programming languages. The paper also serves as a reference source for researchers and developers seeking to determine the best emerging technologies for big data project implementation.

Author information

Corresponding author

Correspondence to Elena Canaj.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Canaj, E., Xhuvani, A. (2018). Big Data in Cloud Computing: A Review of Key Technologies and Open Issues. In: Barolli, L., Xhafa, F., Javaid, N., Spaho, E., Kolici, V. (eds) Advances in Internet, Data & Web Technologies. EIDWT 2018. Lecture Notes on Data Engineering and Communications Technologies, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-319-75928-9_45

  • DOI: https://doi.org/10.1007/978-3-319-75928-9_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75927-2

  • Online ISBN: 978-3-319-75928-9

  • eBook Packages: Engineering, Engineering (R0)
