A tree-based stacking ensemble technique with feature selection for network intrusion detection

Rashid, Mamunur; Kamruzzaman, Joarder; Imam, Tasadduq; Wibowo, Santoso; Gordon, Steven

doi:10.1007/s10489-021-02968-1

Published: 08 January 2022

Applied Intelligence, Volume 52, pages 9768–9781 (2022)

Mamunur Rashid ORCID: orcid.org/0000-0002-3929-7361¹,
Joarder Kamruzzaman²,
Tasadduq Imam³,
Santoso Wibowo¹ &
…
Steven Gordon¹

Abstract

Several studies have used machine learning algorithms to develop intrusion detection systems (IDS), which differentiate anomalous behaviours from the normal activities of network systems. Because automated data collection is easy and the volume of collected network traffic and activity data has grown accordingly, the complexity of intrusion analysis is increasing exponentially. A particular issue is that, owing to statistical and computational limitations, a single classifier may not perform well on the large-scale data found in modern IDS contexts. Ensemble methods have been explored in the literature for such big data contexts. Although they are more complicated and require additional computation, the literature notes that ensemble methods can achieve better accuracy than single classifiers in various large-scale data classification contexts, and it is worth exploring how ensemble approaches perform in IDS. In this research, we introduce a tree-based stacking ensemble technique (SET) and test the effectiveness of the proposed model on two intrusion datasets (NSL-KDD and UNSW-NB15). We further incorporate feature selection techniques to select the most relevant features for the proposed SET. A comprehensive performance analysis shows that our proposed model can better identify normal and anomalous traffic in a network than other existing IDS models. This implies the potential of our proposed system for cybersecurity in the Internet of Things (IoT) and large-scale networks.
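
The general idea of stacking tree-based learners after a feature selection step can be illustrated with a small, dependency-free sketch. It is not the authors' model: the abstract does not name the base learners, the meta-learner, or the selection method, so everything here is an assumption made for illustration. The data are synthetic (not NSL-KDD or UNSW-NB15), the base learners are depth-1 decision trees (stumps), feature selection is a simple filter that ranks features by stump accuracy, and the meta-learner is a majority-vote lookup table over the base predictions. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def make_data(n, seed):
    """Synthetic records (assumption): label 1 stands for anomalous traffic."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label, 1.0),   # informative feature
                  rng.gauss(1.5 * label, 1.0),   # informative feature
                  rng.gauss(0.0, 1.0),           # noise feature
                  rng.gauss(0.0, 1.0)])          # noise feature
        y.append(label)
    return X, y


def fit_stump(X, y, feature):
    """Depth-1 tree on one feature: (feature, threshold, polarity, train accuracy)."""
    best = (feature, 0.0, 0, -1.0)
    for t in sorted({row[feature] for row in X}):
        for polarity in (0, 1):
            hits = sum((int(row[feature] > t) ^ polarity) == lab
                       for row, lab in zip(X, y))
            acc = hits / len(y)
            if acc > best[3]:
                best = (feature, t, polarity, acc)
    return best


def stump_predict(stump, row):
    feature, t, polarity, _ = stump
    return int(row[feature] > t) ^ polarity


def select_features(X, y, k):
    """Filter-style selection: keep the k features whose stumps score best."""
    scored = [fit_stump(X, y, f) for f in range(len(X[0]))]
    scored.sort(key=lambda s: s[3], reverse=True)
    return sorted(s[0] for s in scored[:k])


def fit_stack(X, y, features, folds=5):
    # Level 0: out-of-fold base predictions become the meta-features, so the
    # meta-learner never sees predictions made on a base learner's own training rows.
    meta_rows = [None] * len(X)
    for fold in range(folds):
        train_idx = [i for i in range(len(X)) if i % folds != fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        stumps = [fit_stump(X_tr, y_tr, f) for f in features]
        for i in range(fold, len(X), folds):
            meta_rows[i] = tuple(stump_predict(s, X[i]) for s in stumps)
    # Level 1: majority label for each tuple of base predictions.
    votes = defaultdict(Counter)
    for key, lab in zip(meta_rows, y):
        votes[key][lab] += 1
    table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    default = Counter(y).most_common(1)[0][0]
    # Refit the base learners on all training rows for use at prediction time.
    base = [fit_stump(X, y, f) for f in features]
    return base, table, default


def stack_predict(model, row):
    base, table, default = model
    key = tuple(stump_predict(s, row) for s in base)
    return table.get(key, default)


X_train, y_train = make_data(300, seed=0)
X_test, y_test = make_data(200, seed=1)
selected = select_features(X_train, y_train, k=2)
model = fit_stack(X_train, y_train, selected)
accuracy = sum(stack_predict(model, r) == t
               for r, t in zip(X_test, y_test)) / len(y_test)
print("selected features:", selected)
print("test accuracy:", round(accuracy, 3))
```

The point of the out-of-fold step is that the meta-learner is trained on predictions the base learners made for rows they were not fitted on, which is what distinguishes stacking from simply voting over models trained on the same data. In practice one would likely swap the stumps for full tree learners from an established library.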

Author information

Authors and Affiliations

School of Engineering and Technology, CQUniversity, Rockhampton, Australia
Mamunur Rashid, Santoso Wibowo & Steven Gordon
School of Engineering and Information Technology, Federation University, Ballarat, Australia
Joarder Kamruzzaman
School of Business and Law, CQUniversity, Melbourne, Australia
Tasadduq Imam

Corresponding author

Correspondence to Mamunur Rashid.

About this article

Cite this article

Rashid, M., Kamruzzaman, J., Imam, T. et al. A tree-based stacking ensemble technique with feature selection for network intrusion detection. Appl Intell 52, 9768–9781 (2022). https://doi.org/10.1007/s10489-021-02968-1

Accepted: 01 November 2021
Published: 08 January 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s10489-021-02968-1
