Skip to main content
Log in

Cyberattacks detection and analysis in a network log system using XGBoost with ELK stack

  • Data analytics and machine learning
  • Published:
Soft Computing Aims and scope Submit manuscript

Abstract

The usage of artificial intelligence and machine learning methods on cyberattacks increasing significantly recently. For the defense method of cyberattacks, it is possible to detect and identify the attack event by observing the log data and analyzing whether it has abnormal behavior or not. This paper implemented the ELK Stack network log system (NetFlow Log) to visually analyze log data and present several network attack behavior characteristics for further analysis. Additionally, this system evaluated the extreme gradient enhancement (XGBoost), Recurrent Neural Network (RNN), and Deep Neural Network (DNN) model for machine learning methods. Keras was used as a deep learning framework for building a model to detect the attack event. From the experiments, it can be confirmed that the XGBoost model has an accuracy rate of 96.01% for potential threats. The full attack dataset can achieve 96.26% accuracy, which is better than RNN and DNN models.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19

Similar content being viewed by others

Data availability

All data generated or analyzed during this study are included in this published article (and its supplementary information files). The codes generated during the current study are not publicly available due to will use at my future study but are available from the corresponding author on reasonable request.

References

  • Ahad N, Qadir J, Ahsan N (2016) Neural networks in wireless networks: techniques, applications and guidelines. J Netw Comput Appl 68:1–27

    Article  Google Scholar 

  • Al-Qurishi M, Alrubaian M, Rahman SMM, Alamri A, Hassan MM (2018) A prediction system of sybil attack in social network using deep-regression model. Futur Gener Comput Syst 87:743–753. https://doi.org/10.1016/j.future.2017.08.030

    Article  Google Scholar 

  • Bagnasco S, Berzano D, Guarise A, Lusso S, Masera M, Vallero S (2015) Monitoring of IaaS and scientific applications on the cloud using the elasticsearch ecosystem. J Phys: Conf Ser 608:012016. https://doi.org/10.1088/1742-6596/608/1/012016

    Article  Google Scholar 

  • Bajer M (2017) Building an iot data hub with elasticsearch, logstash and kibana. In: 2017 5th international conference on future internet of things and cloud workshops (FiCloudW), pp 63–68. IEEE

  • Chen T, Guestrin C (2016) Xgboost: A scalable tree boosting system. In: Proceedings of the 22Nd ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’16, pp. 785–794. ACM, New York, NY, USA. https://doi.org/10.1145/2939672.2939785

  • Chen Z, Jiang F, Cheng Y, Gu X, Liu W, Peng J (2018) Xgboost classifier for ddos attack detection and analysis in sdn-based cloud. In: 2018 IEEE international conference on big data and smart computing (BigComp), pp 251–256. IEEE

  • Chen S, Xue M, Fan L, Hao S, Xu L, Zhu H (2017) Hardening malware detection systems against cyber maneuvers: an adversarial machine learning approach. CoRR arXiv:1706.04146

  • Diro AA, Chilamkurti N (2018) Distributed attack detection scheme using deep learning approach for internet of things. Futur Gener Comput Syst 82:761–768. https://doi.org/10.1016/j.future.2017.08.043

    Article  Google Scholar 

  • Eighty two percent of security professionals fear artificial intelligence attacks against their organization (2018) https://www.home.neustar/about-us/news-room/press-releases/2018/NISCOctober

  • Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 1189–1232

  • Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38(4):367–378

    Article  MathSciNet  Google Scholar 

  • Ghafir I, Hammoudeh M, Prenosil V, Han L, Hegarty R, Rabie K, Aparicio-Navarro FJ (2018) Detection of advanced persistent threat using machine-learning correlation analysis. Futur Gener Comput Syst 89:349–359. https://doi.org/10.1016/j.future.2018.06.055

    Article  Google Scholar 

  • How to detect http parameter pollution attacks (2021) https://www.acunetix.com/blog/whitepaper-http-parameter-pollution/

  • Kozik R, Choraś M, Ficco M, Palmieri F (2018) A scalable distributed machine learning approach for attack detection in edge computing environments. J Parall Distributed Comput 119:18–26. https://doi.org/10.1016/j.jpdc.2018.03.006

    Article  Google Scholar 

  • Kristiani E, Yang CT, Huang CY, Ko PC, Fathoni H (2020) On construction of sensors, edge, and cloud (isec) framework for smart system integration and applications. IEEE Internet Things J 8(1):309–319

    Article  Google Scholar 

  • Lai CH, Yang CT, Kristiani E, Liu JC, Chan YW (2019) Using xgboost for cyberattack detection and analysis in a network log system with elk stack. In: International conference on frontier computing, pp 302–311. Springer

  • Langi PPI, Najib W, Aji TB (2015) An evaluation of twitter river and logstash performances as elasticsearch inputs for social media analysis of twitter. In: 2015 international conference on information communication technology and systems (ICTS), pp 181–186. https://doi.org/10.1109/ICTS.2015.7379895

  • Liu JC, Yang CT, Chan YW, Kristiani E, Jiang WJ (2021) Cyberattack detection model using deep learning in a network log system with data visualization. J Supercomput 77(10):10984–11003

    Article  Google Scholar 

  • Liu H, Lang B, Liu M, Yan H (2019) Cnn and rnn based payload classification methods for attack detection. Knowl-Based Syst 163:332–341. https://doi.org/10.1016/j.knosys.2018.08.036

    Article  Google Scholar 

  • Peterson P (2018) Unmasking deceptive attacks with machine learning. Comput Fraud Secur 2018(11):15–17. https://doi.org/10.1016/S1361-3723(18)30110-6

    Article  Google Scholar 

  • Prakash TR, Kakkar M, Patel K (2016) Geo-identification of web users through logs using elk stack. In: 2016 6th international conference - cloud system and big data engineering (Confluence) pp 606–610

  • Rattan A, Kaur N, Bhushan S (2019) Standardization of intelligent information of specific attack trends. In: Progress in Advanced Computing and Intelligent Engineering, pp 75–86. Springer

  • Safavian SR, Landgrebe D (1991) A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern 21(3):660–674

    Article  MathSciNet  Google Scholar 

  • Sahingoz OK, Buber E, Demir O, Diri B (2019) Machine learning based phishing detection from urls. Expert Syst Appl 117:345–357. https://doi.org/10.1016/j.eswa.2018.09.029

    Article  Google Scholar 

  • Sharafaldin I, Lashkari AH, Ghorbani AA (2019) An evaluation framework for network security visualizations. Comput Secur 84:70–92. https://doi.org/10.1016/j.cose.2019.03.005

    Article  Google Scholar 

  • Sun P, Li J, Bhuiyan MZA, Wang L, Li B (2019) Modeling and clustering attacker activities in iot through machine learning techniques. Inf Sci 479:456–471. https://doi.org/10.1016/j.ins.2018.04.065

    Article  Google Scholar 

  • Yang CT, Kristiani E, Wang YT, Min G, Lai CH, Jiang WJ (2020) On construction of a network log management system using elk stack with ceph. J Supercomput 76(8):6344–6360

    Article  Google Scholar 

  • Yang CT, Liu JC, Kristiani E, Liu ML, You I, Pau G (2020) Netflow monitoring and cyberattack detection using deep learning with ceph. IEEE Access 8:7842–7850

    Article  Google Scholar 

  • Yang C, Shi Z, Zhang H, Wu J, Shi X (2019) Multiple attacks detection in cyber-physical systems using random finite set theory. IEEE Trans Cybern 50(9):4066–4075

    Article  Google Scholar 

  • Yuan X, Li C, Li X (2017) Deepdefense: Identifying ddos attack via deep learning. In: 2017 IEEE international conference on smart computing (SMARTCOMP), pp 1–8 https://doi.org/10.1109/SMARTCOMP.2017.7946998

  • Zhang D, Liu L, Feng G (2018) Consensus of heterogeneous linear multiagent systems subject to aperiodic sampled-data and dos attack. IEEE Trans Cybern 49(4):1501–1511

  • Zhang J, Gardner R, Vukotic I (2019) Anomaly detection in wide area network meshes using two machine learning algorithms. Futur Gener Comput Syst 93:418–426. https://doi.org/10.1016/j.future.2018.07.023

    Article  Google Scholar 

Download references

Acknowledgements

The part of data has been presented previously in a conference proceeding at: https://doi.org/10.1007/978-981-15-3250-4_36.

Funding

This research was supported in part by the Ministry of Science and Technology (MOST), Taiwan R.O.C. (No. 110-2221-E-029-020-MY3, 110-2811-E-029-003, 110-2621-M-029-003, and 110-2622-E-029-003.).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Chao-Tung Yang.

Ethics declarations

Conflict of Interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Yang, CT., Chan, YW., Liu, JC. et al. Cyberattacks detection and analysis in a network log system using XGBoost with ELK stack. Soft Comput 26, 5143–5157 (2022). https://doi.org/10.1007/s00500-022-06954-8

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-022-06954-8

Keywords

Navigation