Abstract
All organizations, be they businesses, governments, infrastructure or utility providers, depend on the availability and functioning of their computers, computer networks and data centers for all or part of their operations. Network intrusion detection systems are the first line of defense protecting computing infrastructure from external attacks. In this study we develop five different machine learning classifiers for a number of attacks. We use the CSE-CIC-IDS2018 dataset, developed in a collaborative effort between the Communications Security Establishment and the Canadian Institute for Cybersecurity. It is an extensive network traffic trace dataset that captures multiple attacks and has become available relatively recently. The previous major dataset used for the development of network intrusion detection systems is the KDD Cup'99 dataset, now roughly 22 years old, which predates mobile computing, Web 2.0/3.0, social media, streaming video and widespread use of SSL. These significant Internet trends of the last two decades demand a reevaluation and redevelopment of intrusion detectors. Prior studies that designed machine learning classifiers on the CSE-CIC-IDS2018 dataset used a large and rich set of features, of which at least one is not dataset-invariant. Almost none have explored whether it is appropriate to use all available features when the dataset contains only a few hundred samples of an attack class. The classifiers developed in this study rely on a justifiable number of features, and their performance is reviewed for stability and generalization by reporting not just the average performance over 10-fold cross-validation but also the degree of variation from one fold to the next.
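The evaluation protocol described above, reporting both the mean and the fold-to-fold variation of a 10-fold cross-validation, can be sketched as follows. This is a minimal illustration and not the authors' code: the synthetic imbalanced dataset, the decision tree classifier, the F1 metric and the helper name `cv_mean_and_spread` are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier


def cv_mean_and_spread(clf, X, y, n_splits=10, seed=0, scoring="f1"):
    """Hypothetical helper: per-fold scores plus their mean and sample std."""
    # Stratified folds keep the rare attack class present in every fold.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_val_score(clf, X, y, cv=cv, scoring=scoring)
    # ddof=1 gives the sample standard deviation across folds.
    return scores.mean(), scores.std(ddof=1), scores


# Synthetic stand-in for flow features (assumption): about 10% "attack" samples.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8,
    weights=[0.9, 0.1], random_state=0,
)

# Any scikit-learn classifier could be used here; a decision tree is a stand-in.
clf = DecisionTreeClassifier(max_depth=6, random_state=0)

mean_f1, std_f1, fold_scores = cv_mean_and_spread(clf, X, y)
print(f"F1 over 10 folds: mean={mean_f1:.3f}, std={std_f1:.3f}")
print("min/max fold score:", fold_scores.min().round(3), fold_scores.max().round(3))
```

A large standard deviation relative to the mean, or a wide gap between the best and worst fold, suggests that the classifier's performance depends heavily on which samples land in the test fold, which is the kind of instability the study checks for.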
This study was funded by grant No. UJ-02-011-DR from the Deanship of Scientific Research (DSR) of the University of Jeddah, Jeddah, Saudi Arabia.
Cite this article
Ilyas, M.U., Alharbi, S.A. Machine learning approaches to network intrusion detection for contemporary internet traffic. Computing 104, 1061–1076 (2022). https://doi.org/10.1007/s00607-021-01050-5
DOI: https://doi.org/10.1007/s00607-021-01050-5