Big Data Knowledge Discovery as a Service: Recent Trends and Challenges

Wireless Personal Communications

Abstract

The current era is witnessing an explosion of data generated by a wide range of sources, including RFID (radio-frequency identification), sensors, web logs, social media, and IoT (Internet of Things) devices. The pace at which data is generated in routine tasks has overwhelmed the capacity of the existing infrastructure and analytical solutions. Data has become a driving force of the economy and is treated as an organizational asset. It contains facts that can be interpreted and manipulated to gain insight for knowledge discovery. To stay ahead of the competition, enterprises are scaling up their big data projects for knowledge discovery. These projects require scalable architectures for storage and data processing. Data-centric technologies that can be provisioned as a service to organizations are gaining momentum. Cloud computing is an effective and promising solution for sophisticated analytical applications, and the cloud computing model supports provisioning resources as a service. In this paper, we examine the requirements for provisioning Big Data Knowledge Discovery as a service. In addition, we explore the prevalent big data frameworks that are available and provisioned as a service via the cloud. We also explore state-of-the-art progress in this area, along with open challenges and research prospects.

Data Availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Author information

Corresponding author

Correspondence to Neelam Singh.

Ethics declarations

Conflict of interest

The authors declare no competing financial interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Singh, N., Singh, D.P. & Pant, B. Big Data Knowledge Discovery as a Service: Recent Trends and Challenges. Wireless Pers Commun 123, 1789–1807 (2022). https://doi.org/10.1007/s11277-021-09213-5

  • DOI: https://doi.org/10.1007/s11277-021-09213-5
