Cyberattacks detection and analysis in a network log system using XGBoost with ELK stack

Yang, Chao-Tung; Chan, Yu-Wei; Liu, Jung-Chun; Kristiani, Endah; Lai, Cing-Han

doi:10.1007/s00500-022-06954-8

Data analytics and machine learning
Published: 31 March 2022

Soft Computing, volume 26, pages 5143–5157 (2022)

Chao-Tung Yang^1,2, Yu-Wei Chan^3, Jung-Chun Liu^1, Endah Kristiani^1,4 & Cing-Han Lai^1

Abstract

The use of artificial intelligence and machine learning methods in cyberattacks has increased significantly in recent years. As a defense, attack events can be detected and identified by observing log data and analyzing whether it shows abnormal behavior. This paper implemented a network log system (NetFlow log) on the ELK Stack to visually analyze log data and present the behavioral characteristics of several network attacks for further analysis. Additionally, the system evaluated extreme gradient boosting (XGBoost), Recurrent Neural Network (RNN), and Deep Neural Network (DNN) models as machine learning methods. Keras was used as the deep learning framework for building a model to detect attack events. The experiments show that the XGBoost model reaches an accuracy of 96.01% for potential threats and 96.26% on the full attack dataset, which is better than the RNN and DNN models.
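To make the boosting idea behind XGBoost concrete, the following is a minimal sketch and not the authors' implementation. It uses plain first-order gradient boosting on one-split decision stumps, written with the Python standard library only; XGBoost itself additionally uses second-order gradient information and regularization. The toy flow records, the three features (packets, bytes, distinct destination ports), the labels, and all function names are assumptions invented for illustration and do not come from the paper's dataset.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Pick the one-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:  # no feature has two distinct values: constant stump
        mean = sum(residuals) / len(residuals)
        return (0, math.inf, mean, mean)
    return best[1:]

def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value

def fit_boosted(X, y, rounds=20, lr=0.5):
    """Each round fits a stump to the log-loss gradient y - sigmoid(score)."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + lr * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return (stumps, lr)

def predict(model, row):
    stumps, lr = model
    score = sum(lr * stump_predict(st, row) for st in stumps)
    return 1 if sigmoid(score) >= 0.5 else 0

# Hypothetical flow records: [packets, bytes, distinct destination ports].
X = [
    [10, 1200, 1], [12, 1500, 2], [8, 900, 1], [15, 2000, 2],  # benign
    [500, 30000, 1], [450, 28000, 1],                          # flood-like
    [9, 1000, 60], [11, 1100, 80],                             # scan-like
]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = attack (invented labels)

model = fit_boosted(X, y)
train_acc = sum(predict(model, row) == label for row, label in zip(X, y)) / len(y)
print("training accuracy:", train_acc)
```

Neither attack pattern in this toy set can be separated from benign traffic by a single threshold, so the example needs at least two stumps (one on packet count, one on port count) whose scores add up. That additive combination of weak trees is the core of boosting. A real pipeline would more likely train a library model such as XGBoost on features exported from the log system and report accuracy on held-out data.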

Data availability

All data generated or analyzed during this study are included in this published article (and its supplementary information files). The code generated during the current study is not publicly available because it will be used in future work, but it is available from the corresponding author on reasonable request.

Acknowledgements

Part of the data was previously presented in a conference proceeding: https://doi.org/10.1007/978-981-15-3250-4_36.

Funding

This research was supported in part by the Ministry of Science and Technology (MOST), Taiwan R.O.C., under grant Nos. 110-2221-E-029-020-MY3, 110-2811-E-029-003, 110-2621-M-029-003, and 110-2622-E-029-003.

Author information

Authors and Affiliations

Department of Computer Science, Tunghai University, Taichung City, 407224, Taiwan, ROC
Chao-Tung Yang, Jung-Chun Liu, Endah Kristiani & Cing-Han Lai
Research Center for Smart Sustainable Circular Economy, Tunghai University, No. 1727, Sec.4, Taiwan Boulevard, Taichung City, 407224, Taiwan, ROC
Chao-Tung Yang
College of Computing and Informatics, Providence University, Taichung City, 43301, Taiwan, ROC
Yu-Wei Chan
Department of Informatics, Krida Wacana Christian University, Jakarta, 11470, Indonesia
Endah Kristiani

Corresponding author

Correspondence to Chao-Tung Yang.

Ethics declarations

Conflict of Interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, CT., Chan, YW., Liu, JC. et al. Cyberattacks detection and analysis in a network log system using XGBoost with ELK stack. Soft Comput 26, 5143–5157 (2022). https://doi.org/10.1007/s00500-022-06954-8

Accepted: 21 January 2022
Published: 31 March 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s00500-022-06954-8
