Abstract
The use of artificial intelligence and machine learning methods in cyberattacks has increased significantly in recent years. As a defense, attack events can be detected and identified by observing log data and analyzing whether it shows abnormal behavior. This paper implements a network log system (NetFlow Log) on the ELK Stack to visually analyze log data and to present the behavioral characteristics of several network attacks for further analysis. In addition, the extreme gradient boosting (XGBoost), Recurrent Neural Network (RNN), and Deep Neural Network (DNN) models are evaluated as machine learning methods. Keras is used as the deep learning framework for building a model to detect attack events. The experiments show that the XGBoost model achieves an accuracy of 96.01% on potential threats and 96.26% on the full attack dataset, which is better than the RNN and DNN models.
Data availability
All data generated or analyzed during this study are included in this published article (and its supplementary information files). The code generated during the current study is not publicly available because it will be used in future studies, but it is available from the corresponding author on reasonable request.
Acknowledgements
Part of the data was previously presented in a conference proceeding: https://doi.org/10.1007/978-981-15-3250-4_36.
Funding
This research was supported in part by the Ministry of Science and Technology (MOST), Taiwan, R.O.C. (Grant Nos. 110-2221-E-029-020-MY3, 110-2811-E-029-003, 110-2621-M-029-003, and 110-2622-E-029-003).
Ethics declarations
Conflict of Interest
The authors declare that there is no conflict of interest.
About this article
Cite this article
Yang, CT., Chan, YW., Liu, JC. et al. Cyberattacks detection and analysis in a network log system using XGBoost with ELK stack. Soft Comput 26, 5143–5157 (2022). https://doi.org/10.1007/s00500-022-06954-8
DOI: https://doi.org/10.1007/s00500-022-06954-8