A threat monitoring system for intelligent data analytics of network traffic

Annals of Telecommunications

Abstract

Security attacks have become increasingly common and cause great harm to people and organizations. Late detection of such attacks increases the possibility of irreparable damage, and high financial losses are a common outcome. This article proposes TeMIA-NT (ThrEat Monitoring and Intelligent data Analytics of Network Traffic), a real-time flow analysis system that uses parallel flow processing. The main contributions of TeMIA-NT are (i) an architecture for real-time detection of network intrusions that supports high traffic rates, (ii) the use of the structured streaming library, and (iii) two modes of operation: offline and online. The offline mode allows evaluating the performance of multiple machine learning algorithms over a given dataset, using metrics such as accuracy and F1-score. In online mode, the proposed system uses dataframes and the structured streaming engine, which allows real-time threat detection and a quick reaction to attacks. TeMIA-NT achieves flow-processing rates of up to 50 GB/s, which helps prevent or minimize the damage caused by security attacks.
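
The abstract names accuracy and F1-score as offline evaluation metrics but does not spell them out. The following is a minimal sketch of the standard binary-classification definitions, not the TeMIA-NT implementation; the function name `binary_metrics` and the toy labels (1 = attack flow, 0 = benign flow) are assumptions made for illustration.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the attack class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four confusion-matrix cells.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = counts["tp"], counts["fp"], counts["fn"], counts["tn"]
    total = tp + fp + fn + tn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example: 8 flows, 4 of them attacks (label 1).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

F1 is the harmonic mean of precision and recall, so it stays informative when attack flows are rare and accuracy alone would look deceptively high.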


Notes

  1. https://github.com/elastic/elasticsearch, accessed in April 2021.

  2. https://github.com/elastic/kibana, accessed in April 2021.

  3. The code, documentation, and license are available at: https://www.gta.ufrj.br/TeMIA-NT/.

Funding

This work was financed by CNPq, CAPES, FAPERJ, and FAPESP (2018/23292-0, 15/24485-9, 14/50937-1).

Author information

Corresponding author

Correspondence to Lucas C. B. Guimarães.

About this article

Cite this article

Guimarães, L.C.B., Rebello, G.A.F., Camilo, G.F. et al. A threat monitoring system for intelligent data analytics of network traffic. Ann. Telecommun. 77, 539–554 (2022). https://doi.org/10.1007/s12243-021-00893-5

  • DOI: https://doi.org/10.1007/s12243-021-00893-5
