Improving Cyber-Threat Detection by Moving the Boundary Around the Normal Samples

Andresini, Giuseppina; Appice, Annalisa; Caforio, Francesco Paolo; Malerba, Donato

doi:10.1007/978-3-030-57024-8_5

Part of the book series: Studies in Computational Intelligence (SCI, volume 919)

Abstract

Recent research recognises deep learning as an important approach in cybersecurity, as it allows accurate threat detection models to be learned in various scenarios. However, it often over-fits the training data. In this paper, we propose a supervised machine learning method for cyber-threat detection that modifies the training set to reduce over-fitting when training a deep neural network. This is done by re-positioning the decision boundary that separates the normal training samples from the threats. In particular, the method re-assigns the normal training samples that are close to the boundary to the opposite class and trains a competitive deep neural network on the modified training set. In this way, it learns a classification model that can detect unseen threats that behave similarly to normal samples. Experiments on three benchmark datasets show the effectiveness of the proposed method and provide encouraging results, also in comparison with several prominent competitors.
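The re-assignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how closeness to the boundary is measured, so the sketch assumes that each training sample already comes with an estimated certainty of being normal; the function name `move_boundary`, the label encoding and the threshold value of 0.75 are hypothetical choices made for illustration and are not taken from the chapter.

```python
import numpy as np

NORMAL, THREAT = 0, 1  # assumed label encoding, for illustration only


def move_boundary(y, p_normal, threshold=0.75):
    """Relabel normal samples whose certainty of being normal is below threshold.

    y         : array of class labels (NORMAL or THREAT)
    p_normal  : array with the estimated certainty that each sample is normal
    threshold : hypothetical cut-off; samples below it count as close to the boundary
    """
    y = np.asarray(y)
    p_normal = np.asarray(p_normal, dtype=float)
    # Only samples currently labelled normal can be moved to the opposite class.
    near_boundary = (y == NORMAL) & (p_normal < threshold)
    y_moved = y.copy()  # leave the original labels untouched
    y_moved[near_boundary] = THREAT
    return y_moved, near_boundary


# Toy labels and certainties (made up for this example).
y = np.array([0, 0, 0, 1, 1])
p_normal = np.array([0.98, 0.60, 0.80, 0.10, 0.70])
y_moved, flipped = move_boundary(y, p_normal)
print(y_moved)  # [0 1 0 1 1]
```

In this toy run only the second sample is re-assigned: it is labelled normal but its certainty (0.60) falls below the assumed threshold. A deep neural network would then be trained on `y_moved` instead of `y`.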

Notes

1. https://scikit-learn.org/stable/index.html.
2. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html.
3. https://github.com/gsndr/THEODORA.
4. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html.
5. https://github.com/gsndr/MINDFUL.
6. http://kdd.ics.uci.edu//databases//kddcup99//kddcup99.html.
7. 10%KDDCUP99Train and KDDCUP99Test are populated with the data stored in kddcup.data_10_percent.gz and corrected.gz at http://kdd.ics.uci.edu//databases//kddcup99//kddcup99.html.
8. https://www.unb.ca/cic/datasets/ids-2017.html.
9. https://www.unb.ca/cic/datasets/android-adware.html.
10. In principle, any traditional supervised algorithm, that is able to estimate the classification certainty, can be used in place of SVM. We consider SVM as several studies [27, 34, 40] have repeatedly proved that it outperforms competitors based on Linear SVM, RBF SVM, Random Forest, K-NN and Naive Bayes in various cybersecurity applications.
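Note 10 refers to a classifier that can estimate classification certainty, and Notes 2 and 4 link to scikit-learn's `MinMaxScaler` and `SVC`. As a minimal sketch of how such a certainty could be obtained with those two classes, the example below fits an SVM on synthetic two-cluster data and reads the per-sample probability of the normal class. The data, the RBF kernel, the label encoding (0 for normal, 1 for threat) and the use of `probability=True` (Platt-style calibration built into `SVC`) are assumptions made for illustration; the chapter's actual configuration is not given on this page.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data: 60 "normal" and 60 "threat" samples in 2-D.
X_normal = rng.normal(0.0, 1.0, size=(60, 2))
X_threat = rng.normal(3.0, 1.0, size=(60, 2))
X = np.vstack([X_normal, X_threat])
y = np.array([0] * 60 + [1] * 60)  # assumed encoding: 0 = normal, 1 = threat

# Scale every feature to [0, 1] before fitting the SVM.
X_scaled = MinMaxScaler().fit_transform(X)

# probability=True makes SVC fit an internal calibration so that
# predict_proba is available after training.
svm = SVC(kernel="rbf", probability=True, random_state=0).fit(X_scaled, y)

# Column of predict_proba that corresponds to the normal class.
normal_col = list(svm.classes_).index(0)
p_normal = svm.predict_proba(X_scaled)[:, normal_col]
```

Here `p_normal` holds one value in [0, 1] per training sample; low values on samples labelled normal indicate samples that the SVM places near, or on the wrong side of, its decision boundary.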

References

Abdulhammed Alani R, Musafer H, Alessa A, Faezipour M, Abuzneid A (2019) Features dimensionality reduction approaches for machine learning based network intrusion detection. Electronics 8:322
Abri F, Siami-Namini S, Khanghah MA, Soltani FM, Namin AS (2019) Can machine/deep learning classifiers detect zero-day malware with high accuracy? In: 2019 IEEE international conference on big data (Big Data), pp 3252–3259
Al-Qatf M, Lasheng Y, Al-Habib M, Al-Sabahi K (2018) Deep learning approach combining sparse autoencoder with svm for network intrusion detection. IEEE Access 6:52843–52856
Aldweesh A, Derhab A, Emam AZ (2020) Deep learning approaches for anomaly-based intrusion detection systems: a survey, taxonomy, and open issues. Knowl-Based Syst 189:105124
AlEroud A, Karabatis G (2020) Sdn-gan: generative adversarial deep nns for synthesizing cyber attacks on software defined networks. In: Debruyne C, Panetto H, Guédria W, Bollen P, Ciuciu I, Karabatis G, Meersman R (eds) On the move to meaningful internet systems: OTM 2019 workshops. Springer International Publishing, Cham, pp 211–220
Althubiti SA, Jones EM, Roy K (2018) Lstm for anomaly-based network intrusion detection. In: 2018 28th International telecommunication networks and applications conference (ITNAC). IEEE Computer Society, pp 1–3
Amigó E, Gonzalo J, Artiles J, Verdejo M (2009) A comparison of extrinsic clustering evaluation metrics based on formal constraints. Inf Retrieval 12:461–486
Andresini G, Appice A, Malerba D (2020) Dealing with class imbalance in android malware detection by cascading clustering and classification. In: Complex pattern mining—new challenges, methods and applications, Studies in Computational Intelligence, vol 880. Springer, pp 173–187. https://doi.org/10.1007/978-3-030-36617-9_11
Andresini G, Appice A, Mauro ND, Loglisci C, Malerba D (2019) Exploiting the auto-encoder residual error for intrusion detection. In: 2019 IEEE European symposium on security and privacy workshops, EuroS&P workshops 2019, Stockholm, Sweden, 17–19 June 2019. IEEE, pp 281–290
Andresini G, Appice A, Mauro ND, Loglisci C, Malerba D (2020) Multi-channel deep feature learning for intrusion detection. IEEE Access 8:53346–53359
Angelo P, Costa Drummond A (2018) Adaptive anomaly-based intrusion detection system using genetic algorithm and profiling. Secur Priv 1(4):e36
Appice A, Andresini G, Malerba D (2020) Clustering-aided multi-view classification: a case study on android malware detection. J Intell Inf Syst. https://doi.org/10.1007/s10844-020-00598-6
Appice A, Guccione P, Malerba D (2017) A novel spectral-spatial co-training algorithm for the transductive classification of hyperspectral imagery data. Pattern Recognit 63:229–245
Appice A, Malerba D (2019) Segmentation-aided classification of hyperspectral data using spatial dependency of spectral bands. ISPRS J Photogrammetry Remote Sens 147:215–231
Berman DS, Buczak AL, Chavis JS, Corbett CL (2019) A survey of deep learning methods for cyber security. Information 10(4):1–35
Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Kluwer Academic Publishers, USA
Chang CC, Lin CJ (2011) Libsvm: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27
Cheng F, Yang K, Zhang L (2015) A structural svm based approach for binary classification under class imbalance. Math Probl Eng 2015:1–10
Chun M, Wei D, Qing W (2020) Speech analysis for wilson’s disease using genetic algorithm and support vector machine. In: Abawajy JH, Choo KKR, Islam R, Xu Z, Atiquzzaman M (eds) International conference on applications and techniques in cyber intelligence ATCI 2019. Springer International Publishing, Cham, pp 1286–1295
Comar PM, Liu L, Saha S, Tan P, Nucci A (2013) Combining supervised and unsupervised learning for zero-day malware detection. In: 2013 Proceedings IEEE INFOCOM, pp 2022–2030
Dan L, Dacheng C, Baihong J, Lei S, Jonathan G, See-Kiong N (2019) Mad-gan: Multivariate anomaly detection for time series data with generative adversarial networks. In: Artificial neural networks and machine learning, pp 703–716
Dunn JC (1973) A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters. J Cybern 3(3):32–57
Gandotra E, Bansal D, Sofat S (2016) Zero-day malware detection. In: 2016 Sixth international symposium on embedded computing and system design (ISED), pp 171–175
Goh KS, Chang E, Cheng KT (2001) Svm binary classifier ensembles for image classification. In: Proceedings of the tenth international conference on information and knowledge management, CIKM ’01. Association for Computing Machinery, New York, NY, USA, pp 395–402
Goodfellow I, McDaniel P, Papernot N (2018) Making machine learning robust against adversarial inputs. Commun ACM 61(7):56–66
Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville AC, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems 27, Annual conference on neural information processing systems 2014, 8–13 December 2014, Montreal, Quebec, Canada, pp 2672–2680
Halimaa A, Sundarakantham K (2019) Machine learning based intrusion detection system. In: 2019 3rd International conference on trends in electronics and informatics (ICOEI), pp 916–920
Hao M, Tianhao Y, Fei Y (2019) The svm based on smo optimization for speech emotion recognition. In: 2019 Chinese control conference (CCC), pp 7884–7888
Hao Y, Sheng Y, Wang J (2019) Variant gated recurrent units with encoders to preprocess packets for payload-aware intrusion detection. IEEE Access 7:49985–49998
Hu Z, Chen P, Zhu M, Liu P (2019) Reinforcement learning for adaptive cyber defense against zero-day attacks. Springer International Publishing, Cham, pp 54–93
Ingre B, Yadav A, Soni AK (2018) Decision tree based intrusion detection system for nsl-kdd dataset. In: Satapathy SC, Joshi A (eds) Information and communication technology for intelligent systems (ICTIS 2017), vol 2. Springer International Publishing, Cham, pp 207–218
Jang-Jaccard J, Nepal S (2014) A survey of emerging threats in cybersecurity. J Comput Syst Sci 80(5):973–993 Special Issue on Dependable and Secure Computing
Jiang F, Fu Y, Gupta BB, Lou F, Rho S, Meng F, Tian Z (2018) Deep learning based multi-channel intelligent attack detection for data security. IEEE Trans Sustain Comput pp 1–1
Kedziora M, Gawin P, Szczepanik M, Jozwiak I (2019) Malware detection using machine learning algorithms and reverse engineering of android java code. SSRN Electron J. https://doi.org/10.2139/ssrn.3328497
Khan RU, Zhang X, Alazab M, Kumar R (2019) An improved convolutional neural network model for intrusion detection in networks. In: 2019 Cybersecurity and cyberforensics conference (CCC), pp 74–77
Kim JY, Bu SJ, Cho SB (2018) Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders. Inf Sci 460–461:83–102
Kim JY, Cho SB (2018) Detecting intrusive malware with a hybrid generative deep learning model. In: Yin H, Camacho D, Novais P, Tallón-Ballesteros AJ (eds) Intelligent data engineering and automated learning—IDEAL 2018. Springer International Publishing, Cham, pp 499–507
Kim T, Suh SC, Kim H, Kim J, Kim J (2018) An encoding technique for cnn-based network anomaly detection. In: International conference on big data, pp 2960–2965
Kremer J, Steenstrup Pedersen K, Igel C (2014) Active learning with support vector machines. WIREs Data Min Knowl Discov 4(4):313–326
Krishnaveni S, Vigneshwar P, Kishore S, Jothi B, Sivamohan S (2020) Anomaly-based intrusion detection system using support vector machine. In: Dash SS, Lakshmi C, Das S, Panigrahi BK (eds) Artificial intelligence and evolutionary computations in engineering systems. Springer Singapore, Singapore, pp 723–731
Labonne M, Olivereau A, Polve B, Zeghlache D (2019) A cascade-structured meta-specialists approach for neural network-based intrusion detection. In: 16th Annual consumer communications & networking conference, pp 1–6
Lashkari AH, Kadir AFA, Gonzalez H, Mbah KF, Ghorbani AA (2017) Towards a network-based framework for android malware detection and characterization. In: PST. IEEE Computer Society, pp 233–234
Le T, Kang H, Kim H (2019) The impact of pca-scale improving gru performance for intrusion detection. In: 2019 International conference on platform technology and service (PlatCon), pp 1–6
Lewis DD, Gale WA (1994) A sequential algorithm for training text classifiers. In: Croft BW, van Rijsbergen CJ (eds) SIGIR ’94. Springer London, London, pp 3–12
Li D, Chen D, Jin B, Shi L, Goh J, Ng SK (2019) Mad-gan: multivariate anomaly detection for time series data with generative adversarial networks. In: Tetko IV, Kůrková V, Karpov P, Theis F (eds) Artificial neural networks and machine learning—ICANN 2019: text and time series. Springer International Publishing, Cham, pp 703–716
Li Y, Ma R, Jiao R (2015) A hybrid malicious code detection method based on deep learning. Int J Softw Eng Appl 9:205–216
Lin WC, Ke SW, Tsai CF (2015) Cann: an intrusion detection system based on combining cluster centers and nearest neighbors. Knowl-Based Syst 78:13–21
Liu J, Tian Z, Zheng R, Liu L (2019) A distance-based method for building an encrypted malware traffic identification framework. IEEE Access 7:100014–100028
Liu J, Zhang W, Tang Z, Xie Y, Ma T, Zhang J, Zhang G, Niyoyita JP (2020) Adaptive intrusion detection via ga-gogmm-based pattern learning with fuzzy rough set-based attribute selection. Expert Syst Appl 139:112845
Liu W, Ci L, Liu L (2020) A new method of fuzzy support vector machine algorithm for intrusion detection. Appl Sci 10(3):1065
Malerba D, Ceci M, Appice A (2009) A relational approach to probabilistic classification in a transductive setting. Eng Appl Artif Intell 22(1):109–116. https://doi.org/10.1016/j.engappai.2008.04.005
Malik AJ, Khan FA (2017) A hybrid technique using binary particle swarm optimization and decision tree pruning for network intrusion detection. Cluster Comput pp 1–14
Moti Z, Hashemi S, Namavar A (2019) Discovering future malware variants by generating new malware samples using generative adversarial network. In: 2019 9th International conference on computer and knowledge engineering (ICCKE), pp 319–324
Naseer S, Saleem Y, Khalid S, Bashir MK, Han J, Iqbal MM, Han K (2018) Enhanced network anomaly detection based on deep neural networks. IEEE Access 6:48231–48246
Pang Y, Chen Z, Peng L, Ma K, Zhao C, Ji K (2019) A signature-based assistant random oversampling method for malware detection. In: 2019 18th IEEE International conference on trust, security and privacy in computing and communications/13th IEEE international conference on big data science and engineering (TrustCom/BigDataSE), pp 256–263
Papernot N, McDaniel P, Wu X, Jha S, Swami A (2016) Distillation as a defense to adversarial perturbations against deep neural networks. In: 2016 IEEE symposium on security and privacy (SP), pp 582–597
Platt JC (1999) Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in large margin classifiers. MIT Press, pp 61–74
Powers D (2007) Evaluation: from precision, recall and fmeasure to roc, informedness, markedness and correlation. J Mach Learn Technol 2:37–63
Qu X, Yang L, Guo K, Ma L, Feng T, Ren S, Sun M (2019) Statistics-enhanced direct batch growth self-organizing mapping for efficient dos attack detection. IEEE Access 7:78434–78441
Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, Langs G (2017) Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: Niethammer M, Styner M, Aylward S, Zhu H, Oguz I, Yap PT, Shen D (eds) Information processing in medical imaging. Springer International Publishing, Cham, pp 146–157
Shapoorifard H, Shamsinjead Babaki P (2017) Intrusion detection using a novel hybrid method incorporating an improved knn. Int J Comput Appl 173:5–9. https://doi.org/10.5120/ijca2017914340
Stellios I, Kotzanikolaou P, Psarakis M (2019) Advanced persistent threats and zero-day exploits in industrial internet of things. Springer International Publishing, Cham, pp 47–68
Stokes JW, Seifert C, Li J, Hejazi N (2019) Detection of prevalent malware families with deep learning. In: MILCOM 2019—2019 IEEE military communications conference (MILCOM), pp 1–8
Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the kdd cup 99 data set. In: Symposium on computational intelligence for security and defense applications, pp 1–6
Vapnik VN (1998) Statistical learning theory. Wiley-Interscience
Vigneswaran RK, Vinayakumar R, Soman KP, Poornachandran P (2018) Evaluating shallow and deep neural networks for network intrusion detection systems in cyber security. In: 2018 9th International conference on computing, communication and networking technologies (ICCCNT), pp 1–6. https://doi.org/10.1109/ICCCNT.2018.8494096
Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Al-Nemrat A, Venkatraman S (2019) Deep learning approach for intelligent intrusion detection system. IEEE Access 7:41525–41550
Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Venkatraman S (2019) Robust intelligent malware detection using deep learning. IEEE Access 7:46717–46738
Virmani C, Choudhary T, Pillai A, Rani M (2020) Applications of machine learning in cyber security. In: Handbook of research on machine and deep learning applications for cyber security
Wadkar M, Troia FD, Stamp M (2020) Detecting malware evolution using support vector machines. Expert Syst Appl 143:113022
Wang Q, Guo W, Zhang K, Ororbia AG, Xing X, Liu X, Giles CL (2017) Adversary resistant deep neural networks with an application to malware detection. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’17. Association for Computing Machinery, New York, NY, USA, pp 1145–1153
Wang W, Zhu M, Zeng X, Ye X, Sheng Y (2017) Malware traffic classification using convolutional neural network for representation learning. In: 2017 International conference on information networking (ICOIN). IEEE, pp 712–717
Yin C, Zhu Y, Fei J, He X (2017) A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 5:21954–21961
Yin Z, Liu W, Chawla S (2019) Adversarial attack, defense, and applications with deep learning frameworks. Springer International Publishing, Berlin, pp 1–25
Yin Z, Wang F, Liu W, Chawla S (2018) Sparse feature attacks in adversarial learning. IEEE Trans Knowl Data Eng 30(6):1164–1177
Zenati H, Foo CS, Lecouat B, Manek G, Chandrasekhar VR (2018) Efficient gan-based anomaly detection. ArXiv abs/1802.06222
Zenati H, Romain M, Foo CS, Lecouat B, Chandrasekhar VR (2018) Adversarially learned anomaly detection. In: 2018 IEEE International conference on data mining (ICDM), pp 727–736
Zhang Y, Chen X, Jin L, Wang X, Guo D (2019) Network intrusion detection: Based on deep hierarchical network and original flow data. IEEE Access 7:37004–37016
Zhang Z, Pan P (2019) A hybrid intrusion detection method based on improved fuzzy c-means and support vector machine. In: 2019 International conference on communications, information system and computer engineering (CISCE), pp 210–214

Acknowledgements

We acknowledge the support of the MIUR-Ministero dell’Istruzione dell’Università e della Ricerca through the project “TALIsMan—Tecnologie di Assistenza personALizzata per il Miglioramento della quAlità della vitA” (Grant ID: ARS01_01116) funded by PON RI 2014–2020 and the ATENEO 2017/18 “Modelli e tecniche di data science per la analisi di dati strutturati” funded by the University of Bari “Aldo Moro”. The authors wish to thank Lynn Rudd for her help in reading the manuscript.

Author information

Authors and Affiliations

Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro, via Orabona, 4 - 70126, Bari, Italy
Giuseppina Andresini, Annalisa Appice, Francesco Paolo Caforio & Donato Malerba
Consorzio Interuniversitario Nazionale per l’Informatica—CINI, Bari, Italy
Annalisa Appice & Donato Malerba

Corresponding author

Correspondence to Giuseppina Andresini.

Editor information

Editors and Affiliations

Sultan Moulay Slimane University, Beni Mellal, Morocco
Yassine Maleh
Institute for Communication Systems, University of Surrey, Guildford, UK
Mohammad Shojafar
Charles Darwin University, Darwin, NT, Australia
Mamoun Alazab
Chouaib Doukkali University El Jadida, El Jadida, Morocco
Youssef Baddi

About this chapter

Cite this chapter

Andresini, G., Appice, A., Paolo Caforio, F., Malerba, D. (2021). Improving Cyber-Threat Detection by Moving the Boundary Around the Normal Samples. In: Maleh, Y., Shojafar, M., Alazab, M., Baddi, Y. (eds) Machine Intelligence and Big Data Analytics for Cybersecurity Applications. Studies in Computational Intelligence, vol 919. Springer, Cham. https://doi.org/10.1007/978-3-030-57024-8_5

DOI: https://doi.org/10.1007/978-3-030-57024-8_5
Published: 15 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57023-1
Online ISBN: 978-3-030-57024-8
eBook Packages: Intelligent Technologies and Robotics (R0)