An efficient architecture for processing real-time traffic data streams using apache flink

Multimedia Tools and Applications

Abstract

Big data technologies are emerging day by day and are driving drastic changes in various real-world applications. Traditional data mining tools can process large volumes of data, but the rapid growth in data over the past decades has made such processing difficult. Because data flows continuously, data streams require more computational processing than traditional data. Big data stream processing must account for several features of data streams: heterogeneity, scalability, fault tolerance, and query optimization. Implementing these features efficiently in real-world applications using big data analytics is challenging during the data storage, processing, and analysis phases. The proposed model, FRTSPS, is therefore a generic architecture influenced by the popular Lambda architecture for big data processing and built on a distributed computing platform. The architecture uses the open-source platform Apache Flink for data processing. Flink is a popular platform for processing historical and streaming data flows in parallel at the same time. Its stateful streaming can achieve greater scalability and flexibility, along with higher throughput and lower latency, than other stream processing programming models.
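
The Lambda-style split between historical and streaming data that the abstract refers to can be illustrated with a minimal sketch in plain Python. This is not the FRTSPS implementation and it does not use the Apache Flink API; the `Reading` record, the sensor identifiers, and the functions `batch_view`, `SpeedLayer`, and `serve` are hypothetical names chosen only for illustration. The sketch assumes a batch layer that recomputes per-sensor vehicle totals from a historical log, a speed layer that keeps keyed running state updated per event, and a serving step that merges the two views at query time.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical traffic reading; the field names are assumptions."""
    sensor_id: str   # road-sensor identifier
    timestamp: int   # seconds since an arbitrary epoch
    vehicles: int    # vehicles counted in this reading


def batch_view(history):
    """Batch layer: recompute per-sensor totals from the full historical log."""
    totals = defaultdict(int)
    for reading in history:
        totals[reading.sensor_id] += reading.vehicles
    return dict(totals)


class SpeedLayer:
    """Speed layer: keyed running state, updated one event at a time."""

    def __init__(self):
        self._state = defaultdict(int)

    def process(self, reading):
        # Update the state for this reading's key and return the new total.
        self._state[reading.sensor_id] += reading.vehicles
        return self._state[reading.sensor_id]

    def view(self):
        return dict(self._state)


def serve(batch, realtime, sensor_id):
    """Serving step: merge the batch and real-time views at query time."""
    return batch.get(sensor_id, 0) + realtime.get(sensor_id, 0)


# Historical data already stored, and events arriving after the batch run.
history = [Reading("s1", 0, 10), Reading("s2", 0, 4), Reading("s1", 60, 7)]
stream = [Reading("s1", 120, 3), Reading("s3", 120, 5)]

batch = batch_view(history)
speed = SpeedLayer()
for event in stream:
    speed.process(event)

totals = {s: serve(batch, speed.view(), s) for s in ("s1", "s2", "s3")}
print(totals)  # {'s1': 20, 's2': 4, 's3': 5}
```

In a distributed engine the per-key state held by `SpeedLayer` would be partitioned across workers and checkpointed for fault tolerance; the single-process dictionary here only stands in for that idea.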

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Funding

The authors have not disclosed any funding.

Author information

Corresponding author

Correspondence to P. Venkata Krishna.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest for this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Deepthi, B.G., Rani, K.S., Krishna, P.V. et al. An efficient architecture for processing real-time traffic data streams using apache flink. Multimed Tools Appl 83, 37369–37385 (2024). https://doi.org/10.1007/s11042-023-17151-6

  • DOI: https://doi.org/10.1007/s11042-023-17151-6
