Abstract
Network-based Intrusion Detection Systems (NIDSs) identify malicious activities by analyzing network traffic. NIDSs are trained on samples of benign and intrusive network traffic. Training samples belong to either majority or minority classes, depending on the number of available instances. Majority classes contain abundant samples of normal traffic and of recurrent intrusions, whereas minority classes contain fewer samples of unknown events or infrequent intrusions. NIDSs trained on such imbalanced data tend to give biased predictions against minority attack classes, leaving intrusions undetected or misclassified. Past research handled this class imbalance problem using data-level approaches that either increase the number of minority class samples or decrease the number of majority class samples in the training data set. Although these data-level balancing approaches indirectly improve the performance of NIDSs, they do not address the underlying issue: NIDSs are unable to identify attacks for which only limited training data is available. This paper proposes an algorithm-level approach called Improved Siam-IDS (I-SiamIDS), a two-layer ensemble for handling the class imbalance problem. I-SiamIDS identifies both majority and minority classes at the algorithm level without using any data-level balancing techniques. The first layer of I-SiamIDS uses an ensemble of binary eXtreme Gradient Boosting (b-XGBoost), Siamese Neural Network (Siamese-NN) and Deep Neural Network (DNN) for hierarchical filtration of input samples to identify attacks. These attacks are then sent to the second layer of I-SiamIDS for classification into different attack classes using a multi-class eXtreme Gradient Boosting classifier (m-XGBoost). Compared to its counterparts, I-SiamIDS showed significant improvement in terms of Accuracy, Recall, Precision, F1-score and Area Under the Curve (AUC) values on both the NSL-KDD and CIDDS-001 datasets. To further strengthen the results, a computational cost analysis was also performed to study the acceptability of the proposed I-SiamIDS.
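The two-layer flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how samples are routed between the three first-layer models, so the sketch assumes that a sample not flagged by one model is passed to the next, and that any sample flagged as an attack goes to the second layer. The function `two_layer_predict`, the toy threshold detectors, and the labels `"dos"`, `"probe"` and `"normal"` are all hypothetical placeholders standing in for trained b-XGBoost, Siamese-NN, DNN and m-XGBoost models.

```python
from typing import Callable, List, Sequence

Sample = Sequence[float]
BinaryDetector = Callable[[Sample], bool]   # True means "flagged as attack"
AttackClassifier = Callable[[Sample], str]  # returns an attack class label


def two_layer_predict(
    samples: List[Sample],
    detectors: List[BinaryDetector],
    attack_classifier: AttackClassifier,
    benign_label: str = "normal",
) -> List[str]:
    """Assumed routing: layer 1 filters samples through the detectors in order;
    layer 2 assigns an attack class to every sample that was flagged."""
    labels = []
    for x in samples:
        # Layer 1: any() stops at the first detector that flags the sample,
        # so later detectors only see samples the earlier ones let through.
        is_attack = any(detect(x) for detect in detectors)
        # Layer 2: only flagged samples reach the multi-class classifier.
        labels.append(attack_classifier(x) if is_attack else benign_label)
    return labels


# Toy stand-ins for the trained models (hypothetical threshold rules).
def toy_b_xgboost(x: Sample) -> bool:
    return x[0] > 0.9


def toy_siamese_nn(x: Sample) -> bool:
    return x[1] > 0.8


def toy_dnn(x: Sample) -> bool:
    return x[0] + x[1] > 1.5


def toy_m_xgboost(x: Sample) -> str:
    return "dos" if x[0] >= x[1] else "probe"


samples = [(0.95, 0.1), (0.2, 0.85), (0.75, 0.78), (0.1, 0.2)]
predictions = two_layer_predict(
    samples, [toy_b_xgboost, toy_siamese_nn, toy_dnn], toy_m_xgboost
)
print(predictions)  # ['dos', 'probe', 'probe', 'normal']
```

In this toy run, the third sample passes the first two detectors and is caught only by the last one, which is the kind of case a hierarchical filter is meant to recover without rebalancing the training data.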
Acknowledgements
The second author would like to acknowledge the University Grants Commission for partially funding this work via Junior Research Fellowship Ref. No. 3505/(NET-NOV-2017).
About this article
Cite this article
Bedi, P., Gupta, N. & Jindal, V. I-SiamIDS: an improved Siam-IDS for handling class imbalance in network-based intrusion detection systems. Appl Intell 51, 1133–1151 (2021). https://doi.org/10.1007/s10489-020-01886-y
DOI: https://doi.org/10.1007/s10489-020-01886-y