I-SiamIDS: an improved Siam-IDS for handling class imbalance in network-based intrusion detection systems

Applied Intelligence

Abstract

Network-based Intrusion Detection Systems (NIDSs) identify malicious activities by analyzing network traffic. NIDSs are trained on samples of benign and intrusive network traffic. Training samples belong to either majority or minority classes, depending on the number of available instances. Majority classes contain abundant samples of normal traffic and of recurrent intrusions, whereas minority classes contain fewer samples of unknown events or infrequent intrusions. NIDSs trained on such imbalanced data tend to give predictions biased against minority attack classes, leaving intrusions undetected or misclassified. Past research has handled this class imbalance problem with data-level approaches that either increase the number of minority class samples or decrease the number of majority class samples in the training data set. Although these data-level balancing approaches indirectly improve the performance of NIDSs, they do not address the underlying issue, i.e., the inability of NIDSs to identify attacks for which only limited training data is available. This paper proposes an algorithm-level approach called Improved Siam-IDS (I-SiamIDS), a two-layer ensemble for handling the class imbalance problem. I-SiamIDS identifies both majority and minority classes at the algorithm level without using any data-level balancing techniques. The first layer of I-SiamIDS uses an ensemble of binary eXtreme Gradient Boosting (b-XGBoost), Siamese Neural Network (Siamese-NN) and Deep Neural Network (DNN) for hierarchical filtration of input samples to identify attacks. These attacks are then sent to the second layer of I-SiamIDS, where a multi-class eXtreme Gradient Boosting classifier (m-XGBoost) assigns them to different attack classes. Compared with its counterparts, I-SiamIDS showed significant improvement in Accuracy, Recall, Precision, F1-score and Area Under the Curve (AUC) values on both the NSL-KDD and CIDDS-001 datasets. To further strengthen the results, a computational cost analysis was also performed to study the acceptability of the proposed I-SiamIDS.
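The two-layer control flow described in the abstract can be illustrated with a minimal Python sketch. It shows one plausible reading of "hierarchical filtration": each Layer 1 stage re-checks only the samples that earlier stages judged normal, and every sample flagged as an attack is handed to Layer 2 for naming. The abstract does not state the stage order or the hand-off rule, so both are assumptions here. The threshold rules, feature names (`syn_rate`, `failed_logins`, `bytes_out`) and class labels are hypothetical stand-ins for the trained b-XGBoost, Siamese-NN, DNN and m-XGBoost models, not the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence

Sample = Dict[str, float]
BinaryClassifier = Callable[[Sample], bool]  # True means "attack"
MultiClassifier = Callable[[Sample], str]    # returns an attack class name


def two_layer_predict(
    samples: Sequence[Sample],
    layer1: Sequence[BinaryClassifier],
    layer2: MultiClassifier,
) -> List[str]:
    """Label each sample "normal" or with the attack class chosen by layer2."""
    labels = ["normal"] * len(samples)
    pending = list(range(len(samples)))  # indices not yet flagged as attacks
    for is_attack in layer1:
        still_normal = []
        for i in pending:
            if is_attack(samples[i]):
                labels[i] = layer2(samples[i])  # Layer 2 names the attack class
            else:
                still_normal.append(i)  # re-checked by the next Layer 1 stage
        pending = still_normal
    return labels


# Hypothetical rule-based stand-ins for the three trained Layer 1 models.
def stage_a(s: Sample) -> bool:
    return s["syn_rate"] > 0.8


def stage_b(s: Sample) -> bool:
    return s["failed_logins"] >= 3


def stage_c(s: Sample) -> bool:
    return s["bytes_out"] > 1e6


# Hypothetical stand-in for the multi-class Layer 2 model.
def attack_type(s: Sample) -> str:
    return "dos" if s["syn_rate"] > 0.8 else "brute_force"


demo_samples = [
    {"syn_rate": 0.9, "failed_logins": 0, "bytes_out": 100.0},
    {"syn_rate": 0.1, "failed_logins": 5, "bytes_out": 100.0},
    {"syn_rate": 0.1, "failed_logins": 0, "bytes_out": 100.0},
]
demo_labels = two_layer_predict(
    demo_samples, [stage_a, stage_b, stage_c], attack_type
)
print(demo_labels)
```

In this sketch the second sample passes the first stage unflagged and is caught only by the second stage, which is the behaviour that motivates chaining several detectors: an attack missed by one model still has a chance of being flagged by a later one.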


Acknowledgements

The second author would like to acknowledge University Grants Commission for partially funding this work via Junior Research Fellowship Ref. No. 3505/(NET-NOV-2017).

Author information

Corresponding author

Correspondence to Neha Gupta.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Fig. 13 Permutations of Layer 1 classifiers for NSL-KDD dataset

Fig. 14 Permutations of Layer 1 classifiers for CIDDS-001 dataset

About this article

Cite this article

Bedi, P., Gupta, N. & Jindal, V. I-SiamIDS: an improved Siam-IDS for handling class imbalance in network-based intrusion detection systems. Appl Intell 51, 1133–1151 (2021). https://doi.org/10.1007/s10489-020-01886-y

  • DOI: https://doi.org/10.1007/s10489-020-01886-y
