Abstract
Recent research recognises deep learning as an important approach in cybersecurity. Deep learning can learn accurate threat detection models in a variety of scenarios, but it often over-fits the training data. In this paper, we propose a supervised machine learning method for cyber-threat detection that modifies the training set to reduce over-fitting when training a deep neural network. It does so by re-positioning the decision boundary that separates the normal training samples from the threats. In particular, it re-assigns the normal training samples that lie close to the boundary to the opposite class and trains a competitive deep neural network on the modified training set. In this way, it learns a classification model that can detect unseen threats that behave similarly to normal samples. Experiments on three benchmark datasets support the effectiveness of the proposed method, with encouraging results also in comparison with several prominent competitors.
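The relabelling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the centroid-based certainty score, the function names and the 0.75 threshold are all assumptions made for illustration. The chapter's notes indicate that the classification certainty comes from an SVM (or any supervised algorithm able to estimate it), so the toy score below only stands in for that estimate.

```python
import math

NORMAL, THREAT = 0, 1


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def normal_certainty(x, c_normal, c_threat):
    """Toy certainty that x is normal: logistic of the distance gap.

    Hypothetical stand-in for the class-probability output of a real
    classifier (for instance an SVM); not the chapter's estimator.
    """
    gap = math.dist(x, c_threat) - math.dist(x, c_normal)
    return 1.0 / (1.0 + math.exp(-gap))


def move_boundary(X, y, threshold=0.75):
    """Return a copy of y in which low-certainty normal samples become threats.

    The threshold value is an assumption chosen for this toy example.
    """
    c_normal = centroid([x for x, label in zip(X, y) if label == NORMAL])
    c_threat = centroid([x for x, label in zip(X, y) if label == THREAT])
    new_y = []
    for x, label in zip(X, y):
        near_boundary = normal_certainty(x, c_normal, c_threat) < threshold
        new_y.append(THREAT if label == NORMAL and near_boundary else label)
    return new_y


# Toy data: the third normal sample sits between the two groups.
X = [[0.0, 0.0], [0.2, 0.1], [2.2, 2.2], [4.0, 4.0], [4.2, 3.9]]
y = [NORMAL, NORMAL, NORMAL, THREAT, THREAT]

y_modified = move_boundary(X, y)
print(y_modified)  # [0, 0, 1, 1, 1]
```

Only normal samples are ever relabelled; threat labels are left untouched. The deep neural network would then be trained on X paired with the modified labels, a step this sketch leaves out.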
Notes
- 7. 10%KDDCUP99Train and KDDCUP99Test are populated with the data stored in kddcup.data_10_percent.gz and corrected.gz at http://kdd.ics.uci.edu//databases//kddcup99//kddcup99.html.
- 10. In principle, any traditional supervised algorithm that can estimate the classification certainty can be used in place of SVM. We consider SVM because several studies [27, 34, 40] have repeatedly reported that it outperforms competitors based on Linear SVM, RBF SVM, Random Forest, K-NN and Naive Bayes in various cybersecurity applications.
References
Abdulhammed Alani R, Musafer H, Alessa A, Faezipour M, Abuzneid A (2019) Features dimensionality reduction approaches for machine learning based network intrusion detection. Electronics 8:322
Abri F, Siami-Namini S, Khanghah MA, Soltani FM, Namin AS (2019) Can machine/deep learning classifiers detect zero-day malware with high accuracy? In: 2019 IEEE international conference on big data (Big Data), pp 3252–3259
Al-Qatf M, Lasheng Y, Al-Habib M, Al-Sabahi K (2018) Deep learning approach combining sparse autoencoder with svm for network intrusion detection. IEEE Access 6:52843–52856
Aldweesh A, Derhab A, Emam AZ (2020) Deep learning approaches for anomaly-based intrusion detection systems: a survey, taxonomy, and open issues. Knowl-Based Syst 189:105124
AlEroud A, Karabatis G (2020) Sdn-gan: generative adversarial deep nns for synthesizing cyber attacks on software defined networks. In: Debruyne C, Panetto H, Guédria W, Bollen P, Ciuciu I, Karabatis G, Meersman R (eds) On the move to meaningful internet systems: OTM 2019 workshops. Springer International Publishing, Cham, pp 211–220
Althubiti SA, Jones EM, Roy K (2018) Lstm for anomaly-based network intrusion detection. In: 2018 28th International telecommunication networks and applications conference (ITNAC). IEEE Computer Society, pp 1–3
Amigó E, Gonzalo J, Artiles J, Verdejo M (2009) A comparison of extrinsic clustering evaluation metrics based on formal constraints. Inf Retrieval 12:461–486
Andresini G, Appice A, Malerba D (2020) Dealing with class imbalance in android malware detection by cascading clustering and classification. In: Complex pattern mining—new challenges, methods and applications, Studies in Computational Intelligence, vol 880. Springer, pp 173–187. https://doi.org/10.1007/978-3-030-36617-9_11
Andresini G, Appice A, Mauro ND, Loglisci C, Malerba D (2019) Exploiting the auto-encoder residual error for intrusion detection. In: 2019 IEEE European symposium on security and privacy workshops, EuroS&P workshops 2019, Stockholm, Sweden, 17–19 June 2019. IEEE, pp 281–290
Andresini G, Appice A, Mauro ND, Loglisci C, Malerba D (2020) Multi-channel deep feature learning for intrusion detection. IEEE Access 8:53346–53359
Angelo P, Costa Drummond A (2018) Adaptive anomaly-based intrusion detection system using genetic algorithm and profiling. Secur Priv 1(4):e36
Appice A, Andresini G, Malerba D (2020) Clustering-aided multi-view classification: a case study on android malware detection. J Intell Inf Syst. https://doi.org/10.1007/s10844-020-00598-6
Appice A, Guccione P, Malerba D (2017) A novel spectral-spatial co-training algorithm for the transductive classification of hyperspectral imagery data. Pattern Recognit 63:229–245
Appice A, Malerba D (2019) Segmentation-aided classification of hyperspectral data using spatial dependency of spectral bands. ISPRS J Photogrammetry Remote Sens 147:215–231
Berman DS, Buczak AL, Chavis JS, Corbett CL (2019) A survey of deep learning methods for cyber security. Information 10(4):1–35
Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Kluwer Academic Publishers, USA
Chang CC, Lin CJ (2011) Libsvm: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27
Cheng F, Yang K, Zhang L (2015) A structural svm based approach for binary classification under class imbalance. Math Probl Eng 2015:1–10
Chun M, Wei D, Qing W (2020) Speech analysis for wilson’s disease using genetic algorithm and support vector machine. In: Abawajy JH, Choo KKR, Islam R, Xu Z, Atiquzzaman M (eds) International conference on applications and techniques in cyber intelligence ATCI 2019. Springer International Publishing, Cham, pp 1286–1295
Comar PM, Liu L, Saha S, Tan P, Nucci A (2013) Combining supervised and unsupervised learning for zero-day malware detection. In: 2013 Proceedings IEEE INFOCOM, pp 2022–2030
Dan L, Dacheng C, Baihong J, Lei S, Jonathan G, See-Kiong N (2019) Mad-gan: Multivariate anomaly detection for time series data with generative adversarial networks. In: Artificial neural networks and machine learning, pp 703–716
Dunn JC (1973) A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters. J Cybern 3(3):32–57
Gandotra E, Bansal D, Sofat S (2016) Zero-day malware detection. In: 2016 Sixth international symposium on embedded computing and system design (ISED), pp 171–175
Goh KS, Chang E, Cheng KT (2001) Svm binary classifier ensembles for image classification. In: Proceedings of the tenth international conference on information and knowledge management, CIKM ’01. Association for Computing Machinery, New York, NY, USA, pp 395–402
Goodfellow I, McDaniel P, Papernot N (2018) Making machine learning robust against adversarial inputs. Commun ACM 61(7):56–66
Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville AC, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems 27, Annual conference on neural information processing systems 2014, 8–13 December 2014, Montreal, Quebec, Canada, pp 2672–2680
Halimaa A, Sundarakantham K (2019) Machine learning based intrusion detection system. In: 2019 3rd International conference on trends in electronics and informatics (ICOEI), pp 916–920
Hao M, Tianhao Y, Fei Y (2019) The svm based on smo optimization for speech emotion recognition. In: 2019 Chinese control conference (CCC), pp 7884–7888
Hao Y, Sheng Y, Wang J (2019) Variant gated recurrent units with encoders to preprocess packets for payload-aware intrusion detection. IEEE Access 7:49985–49998
Hu Z, Chen P, Zhu M, Liu P (2019) Reinforcement learning for adaptive cyber defense against zero-day attacks. Springer International Publishing, Cham, pp 54–93
Ingre B, Yadav A, Soni AK (2018) Decision tree based intrusion detection system for nsl-kdd dataset. In: Satapathy SC, Joshi A (eds) Information and communication technology for intelligent systems (ICTIS 2017), vol 2. Springer International Publishing, Cham, pp 207–218
Jang-Jaccard J, Nepal S (2014) A survey of emerging threats in cybersecurity. J Comput Syst Sci 80(5):973–993. Special issue on dependable and secure computing
Jiang F, Fu Y, Gupta BB, Lou F, Rho S, Meng F, Tian Z (2018) Deep learning based multi-channel intelligent attack detection for data security. IEEE Trans Sustain Comput pp 1–1
Kedziora M, Gawin P, Szczepanik M, Jozwiak I (2019) Malware detection using machine learning algorithms and reverse engineering of android java code. SSRN Electron J. https://doi.org/10.2139/ssrn.3328497
Khan RU, Zhang X, Alazab M, Kumar R (2019) An improved convolutional neural network model for intrusion detection in networks. In: 2019 Cybersecurity and cyberforensics conference (CCC), pp 74–77
Kim JY, Bu SJ, Cho SB (2018) Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders. Inf Sci 460–461:83–102
Kim JY, Cho SB (2018) Detecting intrusive malware with a hybrid generative deep learning model. In: Yin H, Camacho D, Novais P, Tallón-Ballesteros AJ (eds) Intelligent data engineering and automated learning—IDEAL 2018. Springer International Publishing, Cham, pp 499–507
Kim T, Suh SC, Kim H, Kim J, Kim J (2018) An encoding technique for cnn-based network anomaly detection. In: International conference on big data, pp 2960–2965
Kremer J, Steenstrup Pedersen K, Igel C (2014) Active learning with support vector machines. WIREs Data Min Knowl Discov 4(4):313–326
Krishnaveni S, Vigneshwar P, Kishore S, Jothi B, Sivamohan S (2020) Anomaly-based intrusion detection system using support vector machine. In: Dash SS, Lakshmi C, Das S, Panigrahi BK (eds) Artificial intelligence and evolutionary computations in engineering systems. Springer Singapore, Singapore, pp 723–731
Labonne M, Olivereau A, Polve B, Zeghlache D (2019) A cascade-structured meta-specialists approach for neural network-based intrusion detection. In: 16th Annual consumer communications & networking conference, pp 1–6
Lashkari AH, Kadir AFA, Gonzalez H, Mbah KF, Ghorbani AA (2017) Towards a network-based framework for android malware detection and characterization. In: PST. IEEE Computer Society, pp 233–234
Le T, Kang H, Kim H (2019) The impact of pca-scale improving gru performance for intrusion detection. In: 2019 International conference on platform technology and service (PlatCon), pp 1–6
Lewis DD, Gale WA (1994) A sequential algorithm for training text classifiers. In: Croft BW, van Rijsbergen CJ (eds) SIGIR ’94. Springer London, London, pp 3–12
Li D, Chen D, Jin B, Shi L, Goh J, Ng SK (2019) Mad-gan: multivariate anomaly detection for time series data with generative adversarial networks. In: Tetko IV, Kůrková V, Karpov P, Theis F (eds) Artificial neural networks and machine learning—ICANN 2019: text and time series. Springer International Publishing, Cham, pp 703–716
Li Y, Ma R, Jiao R (2015) A hybrid malicious code detection method based on deep learning. Int J Softw Eng Appl 9:205–216
Lin WC, Ke SW, Tsai CF (2015) Cann: an intrusion detection system based on combining cluster centers and nearest neighbors. Knowl-Based Syst 78:13–21
Liu J, Tian Z, Zheng R, Liu L (2019) A distance-based method for building an encrypted malware traffic identification framework. IEEE Access 7:100014–100028
Liu J, Zhang W, Tang Z, Xie Y, Ma T, Zhang J, Zhang G, Niyoyita JP (2020) Adaptive intrusion detection via ga-gogmm-based pattern learning with fuzzy rough set-based attribute selection. Expert Syst Appl 139:112845
Liu W, Ci L, Liu L (2020) A new method of fuzzy support vector machine algorithm for intrusion detection. Appl Sci 10(3):1065
Malerba D, Ceci M, Appice A (2009) A relational approach to probabilistic classification in a transductive setting. Eng Appl Artif Intell 22(1):109–116. https://doi.org/10.1016/j.engappai.2008.04.005
Malik AJ, Khan FA (2017) A hybrid technique using binary particle swarm optimization and decision tree pruning for network intrusion detection. Cluster Comput pp 1–14
Moti Z, Hashemi S, Namavar A (2019) Discovering future malware variants by generating new malware samples using generative adversarial network. In: 2019 9th International conference on computer and knowledge engineering (ICCKE), pp 319–324
Naseer S, Saleem Y, Khalid S, Bashir MK, Han J, Iqbal MM, Han K (2018) Enhanced network anomaly detection based on deep neural networks. IEEE Access 6:48231–48246
Pang Y, Chen Z, Peng L, Ma K, Zhao C, Ji K (2019) A signature-based assistant random oversampling method for malware detection. In: 2019 18th IEEE international conference on trust, security and privacy in computing and communications/13th IEEE international conference on big data science and engineering (TrustCom/BigDataSE), pp 256–263
Papernot N, McDaniel P, Wu X, Jha S, Swami A (2016) Distillation as a defense to adversarial perturbations against deep neural networks. In: 2016 IEEE symposium on security and privacy (SP), pp 582–597
Platt JC (1999) Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in large margin classifiers. MIT Press, pp 61–74
Powers D (2007) Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation. J Mach Learn Technol 2:37–63
Qu X, Yang L, Guo K, Ma L, Feng T, Ren S, Sun M (2019) Statistics-enhanced direct batch growth self-organizing mapping for efficient dos attack detection. IEEE Access 7:78434–78441
Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, Langs G (2017) Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: Niethammer M, Styner M, Aylward S, Zhu H, Oguz I, Yap PT, Shen D (eds) Information processing in medical imaging. Springer International Publishing, Cham, pp 146–157
Shapoorifard H, Shamsinjead Babaki P (2017) Intrusion detection using a novel hybrid method incorporating an improved knn. Int J Comput Appl 173:5–9. https://doi.org/10.5120/ijca2017914340
Stellios I, Kotzanikolaou P, Psarakis M (2019) Advanced persistent threats and zero-day exploits in industrial internet of things. Springer International Publishing, Cham, pp 47–68
Stokes JW, Seifert C, Li J, Hejazi N (2019) Detection of prevalent malware families with deep learning. In: MILCOM 2019—2019 IEEE military communications conference (MILCOM), pp 1–8
Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the kdd cup 99 data set. In: Symposium on computational intelligence for security and defense applications, pp 1–6
Vapnik VN (1998) Statistical learning theory. Wiley-Interscience
Vigneswaran RK, Vinayakumar R, Soman KP, Poornachandran P (2018) Evaluating shallow and deep neural networks for network intrusion detection systems in cyber security. In: 2018 9th International conference on computing, communication and networking technologies (ICCCNT), pp 1–6. https://doi.org/10.1109/ICCCNT.2018.8494096
Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Al-Nemrat A, Venkatraman S (2019) Deep learning approach for intelligent intrusion detection system. IEEE Access 7:41525–41550
Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Venkatraman S (2019) Robust intelligent malware detection using deep learning. IEEE Access 7:46717–46738
Virmani C, Choudhary T, Pillai A, Rani M (2020) Applications of machine learning in cyber security. In: Handbook of research on machine and deep learning applications for cyber security
Wadkar M, Troia FD, Stamp M (2020) Detecting malware evolution using support vector machines. Expert Syst Appl 143:113022
Wang Q, Guo W, Zhang K, Ororbia AG, Xing X, Liu X, Giles CL (2017) Adversary resistant deep neural networks with an application to malware detection. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’17. Association for Computing Machinery, New York, NY, USA, pp 1145–1153
Wang W, Zhu M, Zeng X, Ye X, Sheng Y (2017) Malware traffic classification using convolutional neural network for representation learning. In: 2017 International conference on information networking (ICOIN). IEEE, pp 712–717
Yin C, Zhu Y, Fei J, He X (2017) A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 5:21954–21961
Yin Z, Liu W, Chawla S (2019) Adversarial attack, defense, and applications with deep learning frameworks. Springer International Publishing, Berlin, pp 1–25
Yin Z, Wang F, Liu W, Chawla S (2018) Sparse feature attacks in adversarial learning. IEEE Trans Knowl Data Eng 30(6):1164–1177
Zenati H, Foo CS, Lecouat B, Manek G, Chandrasekhar VR (2018) Efficient gan-based anomaly detection. ArXiv abs/1802.06222
Zenati H, Romain M, Foo CS, Lecouat B, Chandrasekhar VR (2018) Adversarially learned anomaly detection. In: 2018 IEEE International conference on data mining (ICDM), pp 727–736
Zhang Y, Chen X, Jin L, Wang X, Guo D (2019) Network intrusion detection: Based on deep hierarchical network and original flow data. IEEE Access 7:37004–37016
Zhang Z, Pan P (2019) A hybrid intrusion detection method based on improved fuzzy c-means and support vector machine. In: 2019 International conference on communications, information system and computer engineering (CISCE), pp 210–214
Acknowledgements
We acknowledge the support of the MIUR-Ministero dell’Istruzione dell’Università e della Ricerca through the project “TALIsMan—Tecnologie di Assistenza personALizzata per il Miglioramento della quAlità della vitA” (Grant ID: ARS01_01116) funded by PON RI 2014–2020 and the ATENEO 2017/18 “Modelli e tecniche di data science per la analisi di dati strutturati” funded by the University of Bari “Aldo Moro”. The authors wish to thank Lynn Rudd for her help in reading the manuscript.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Andresini, G., Appice, A., Paolo Caforio, F., Malerba, D. (2021). Improving Cyber-Threat Detection by Moving the Boundary Around the Normal Samples. In: Maleh, Y., Shojafar, M., Alazab, M., Baddi, Y. (eds) Machine Intelligence and Big Data Analytics for Cybersecurity Applications. Studies in Computational Intelligence, vol 919. Springer, Cham. https://doi.org/10.1007/978-3-030-57024-8_5
DOI: https://doi.org/10.1007/978-3-030-57024-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57023-1
Online ISBN: 978-3-030-57024-8
eBook Packages: Intelligent Technologies and Robotics (R0)