Successful intrusion detection with a single deep autoencoder: theory and practice

Catillo, Marta; Pecchia, Antonio; Villano, Umberto

doi:10.1007/s11219-023-09636-2

Research
Published: 25 May 2023

Volume 32, pages 95–123 (2024)
Software Quality Journal

Abstract

Intrusion detection is a key topic in computer security. Due to the ever-increasing number of network attacks, several accurate anomaly-based techniques have been proposed for intrusion detection, typically relying on pattern recognition through machine learning. Many proposals use autoencoders because of their capability to analyze complex, high-dimensional, and large-scale data. They capitalize on composite architectures and accurate learning approaches, possibly in combination with sophisticated feature selection techniques. However, because of their high complexity and the poor transferability of their impressive detection results, they are hardly ever used in production environments. This paper builds on the intuition that such complexity is not necessarily justified, because a single autoencoder is enough to obtain similar, if not better, intrusion detection results than related proposals. The extensive study presented here addresses the effect of the seed, an in-depth investigation of the training loss, and feature selection, across different hardware platforms. The best practices presented, regarding set-up and training, threshold setting, and possible use of feature selection techniques for performance improvement, can be valuable for any future work on the use of autoencoders for successful intrusion detection.
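
As an illustration of the threshold setting mentioned in the abstract, the following is a minimal sketch of how an autoencoder's reconstruction error is commonly turned into an intrusion decision. It is not the procedure of the paper, which is not given in this excerpt: the function names, the mean squared error metric, and the choice of the 99th percentile of benign errors are all assumptions made for the example.

```python
from statistics import quantiles


def reconstruction_error(x, x_hat):
    """Mean squared error between a record and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(benign_errors, percentile=99):
    """Pick the threshold as a percentile (1..99) of the errors
    measured on benign traffic only (assumed strategy)."""
    cuts = quantiles(benign_errors, n=100, method="inclusive")
    return cuts[percentile - 1]


def is_intrusion(error, threshold):
    """Flag a record whose reconstruction error exceeds the threshold."""
    return error > threshold


# Hypothetical errors from benign validation records.
benign_errors = [i / 1000 for i in range(1, 101)]
threshold = fit_threshold(benign_errors)
print(is_intrusion(0.5, threshold), is_intrusion(0.01, threshold))
```

A higher percentile lowers the false positive rate on benign traffic at the cost of missing attacks whose reconstruction error is only moderately high.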

Data availability

Publicly available datasets were analyzed in this study. The datasets can be found at the URLs mentioned in the paper.

Notes

The seed is the initial point of the sequence of values generated by the pseudorandom number generator (PRNG).
https://github.com/ahlashkari/CICFlowMeter
https://downloads.distrinet-research.be/WTMC2021/tools_datasets.html
http://idsdata.ding.unisannio.it/tools.html
This can be obtained by calling tf.config.threading.set_inter_op_parallelism_threads(1) and tf.config.threading.set_intra_op_parallelism_threads(1), but doing so is detrimental to learning times.
https://github.com/NVIDIA/framework-determinism
http://idsdata.ding.unisannio.it/
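
The notes on the PRNG seed and on the TensorFlow threading calls can be combined in a short sketch. The helper names `seed_everything` and `draw` are hypothetical and do not come from the paper. The two `tf.config.threading` calls are the ones quoted in the note above, while the call to `tf.random.set_seed` is an added assumption. The TensorFlow part is skipped when the library is not installed.

```python
import random


def seed_everything(seed: int) -> None:
    """Hypothetical helper: fix the PRNG seed and, if TensorFlow is present,
    restrict it to one thread per pool as described in the note."""
    random.seed(seed)  # Python's own PRNG
    try:
        import tensorflow as tf
    except ImportError:
        return  # TensorFlow not installed: only the Python PRNG is seeded
    # Thread settings must be applied before TensorFlow executes any op.
    tf.config.threading.set_inter_op_parallelism_threads(1)
    tf.config.threading.set_intra_op_parallelism_threads(1)
    tf.random.set_seed(seed)  # assumption: not mentioned in the note


def draw(seed: int, n: int = 3) -> list:
    """Return the first n values the Python PRNG yields for a given seed."""
    seed_everything(seed)
    return [random.random() for _ in range(n)]


# The same seed reproduces the same sequence; a different seed does not.
print(draw(7) == draw(7), draw(7) == draw(8))
```

As the note states, forcing single-threaded execution trades training speed for repeatability, so it is best reserved for experiments where run-to-run comparability matters.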

Author information

Authors and Affiliations

Dipartimento di Ingegneria, Università degli Studi del Sannio, Pal. Bosco Lucarelli C.so Garibaldi 107, Benevento, 82100, Italy
Marta Catillo, Antonio Pecchia & Umberto Villano

Contributions

Marta Catillo, Antonio Pecchia, and Umberto Villano contributed equally to this work.

Corresponding author

Correspondence to Marta Catillo.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Catillo, M., Pecchia, A. & Villano, U. Successful intrusion detection with a single deep autoencoder: theory and practice. Software Qual J 32, 95–123 (2024). https://doi.org/10.1007/s11219-023-09636-2

Accepted: 19 April 2023
Published: 25 May 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s11219-023-09636-2
