Improving Cyber-Threat Detection by Moving the Boundary Around the Normal Samples

Machine Intelligence and Big Data Analytics for Cybersecurity Applications

Part of the book series: Studies in Computational Intelligence (SCI, volume 919)

Abstract

Recent research trends recognise deep learning as an important approach in cybersecurity. Deep learning allows us to learn accurate threat detection models in various scenarios, but it often over-fits the training data. In this paper, we propose a supervised machine learning method for cyber-threat detection that modifies the training set to reduce over-fitting when training a deep neural network. This is done by re-positioning the decision boundary that separates the normal training samples from the threats. In particular, the method re-assigns the normal training samples that are close to the boundary to the opposite class and trains a competitive deep neural network from the modified training set. In this way, it learns a classification model that can detect unseen threats that behave similarly to normal samples. Experiments on three benchmark datasets show the effectiveness of the proposed method and provide encouraging results, also in comparison with several prominent competitors.
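The re-assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that an auxiliary classifier (the notes mention an SVM that estimates classification certainty) has already produced, for each training sample, an estimated probability of being normal. The function name `move_boundary`, the label encoding and the `threshold` value are hypothetical choices made only for this illustration.

```python
from typing import List, Sequence

# Hypothetical label encoding used only in this sketch.
NORMAL, THREAT = 0, 1


def move_boundary(labels: Sequence[int],
                  normal_certainty: Sequence[float],
                  threshold: float = 0.6) -> List[int]:
    """Return a copy of `labels` where low-certainty normal samples become threats.

    normal_certainty[i] is the estimated probability that sample i is normal,
    as produced by some auxiliary classifier (an assumption of this sketch).
    Samples already labelled THREAT are never changed.
    """
    if len(labels) != len(normal_certainty):
        raise ValueError("labels and normal_certainty must have equal length")
    return [
        THREAT if (y == NORMAL and p < threshold) else y
        for y, p in zip(labels, normal_certainty)
    ]


# Toy data: three normal samples and two threats.
labels = [NORMAL, NORMAL, NORMAL, THREAT, THREAT]
certainty = [0.95, 0.55, 0.80, 0.10, 0.70]

new_labels = move_boundary(labels, certainty, threshold=0.6)
print(new_labels)  # [0, 1, 0, 1, 1]
```

Only the second sample changes class, because it is labelled normal but its certainty falls below the threshold. A deep neural network would then be trained on `new_labels` instead of `labels`; how close to the boundary a sample must be before it is re-assigned is a design choice that this sketch reduces to a single threshold.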

Notes

  1. https://scikit-learn.org/stable/index.html

  2. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html

  3. https://github.com/gsndr/THEODORA

  4. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

  5. https://github.com/gsndr/MINDFUL

  6. http://kdd.ics.uci.edu//databases//kddcup99//kddcup99.html

  7. 10%KDDCUP99Train and KDDCUP99Test are populated with the data stored in kddcup.data_10_percent.gz and corrected.gz at http://kdd.ics.uci.edu//databases//kddcup99//kddcup99.html

  8. https://www.unb.ca/cic/datasets/ids-2017.html

  9. https://www.unb.ca/cic/datasets/android-adware.html

  10. In principle, any traditional supervised algorithm that is able to estimate the classification certainty can be used in place of SVM. We consider SVM because several studies [27, 34, 40] have repeatedly shown that it outperforms competitors based on Linear SVM, RBF SVM, Random Forest, K-NN and Naive Bayes in various cybersecurity applications.

References

  1. Abdulhammed Alani R, Musafer H, Alessa A, Faezipour M, Abuzneid A (2019) Features dimensionality reduction approaches for machine learning based network intrusion detection. Electronics 8:322

  2. Abri F, Siami-Namini S, Khanghah MA, Soltani FM, Namin AS (2019) Can machine/deep learning classifiers detect zero-day malware with high accuracy? In: 2019 IEEE international conference on big data (Big Data), pp 3252–3259

  3. Al-Qatf M, Lasheng Y, Al-Habib M, Al-Sabahi K (2018) Deep learning approach combining sparse autoencoder with svm for network intrusion detection. IEEE Access 6:52843–52856

  4. Aldweesh A, Derhab A, Emam AZ (2020) Deep learning approaches for anomaly-based intrusion detection systems: a survey, taxonomy, and open issues. Knowl-Based Syst 189:105124

  5. AlEroud A, Karabatis G (2020) Sdn-gan: generative adversarial deep nns for synthesizing cyber attacks on software defined networks. In: Debruyne C, Panetto H, Guédria W, Bollen P, Ciuciu I, Karabatis G, Meersman R (eds) On the move to meaningful internet systems: OTM 2019 workshops. Springer International Publishing, Cham, pp 211–220

  6. Althubiti SA, Jones EM, Roy K (2018) Lstm for anomaly-based network intrusion detection. In: 2018 28th International telecommunication networks and applications conference (ITNAC). IEEE Computer Society, pp 1–3

  7. Amigó E, Gonzalo J, Artiles J, Verdejo M (2009) A comparison of extrinsic clustering evaluation metrics based on formal constraints. Inf Retrieval 12:461–486

  8. Andresini G, Appice A, Malerba D (2020) Dealing with class imbalance in android malware detection by cascading clustering and classification. In: Complex pattern mining—new challenges, methods and applications, Studies in Computational Intelligence, vol 880. Springer, pp 173–187. https://doi.org/10.1007/978-3-030-36617-9_11

  9. Andresini G, Appice A, Mauro ND, Loglisci C, Malerba D (2019) Exploiting the auto-encoder residual error for intrusion detection. In: 2019 IEEE European symposium on security and privacy workshops, EuroS&P workshops 2019, Stockholm, Sweden, 17–19 June 2019. IEEE, pp 281–290

  10. Andresini G, Appice A, Mauro ND, Loglisci C, Malerba D (2020) Multi-channel deep feature learning for intrusion detection. IEEE Access 8:53346–53359

  11. Angelo P, Costa Drummond A (2018) Adaptive anomaly-based intrusion detection system using genetic algorithm and profiling. Secur Priv 1(4):e36

  12. Appice A, Andresini G, Malerba D (2020) Clustering-aided multi-view classification: a case study on android malware detection. J Intell Inf Syst. https://doi.org/10.1007/s10844-020-00598-6

  13. Appice A, Guccione P, Malerba D (2017) A novel spectral-spatial co-training algorithm for the transductive classification of hyperspectral imagery data. Pattern Recognit 63:229–245

  14. Appice A, Malerba D (2019) Segmentation-aided classification of hyperspectral data using spatial dependency of spectral bands. ISPRS J Photogrammetry Remote Sens 147:215–231

  15. Berman DS, Buczak AL, Chavis JS, Corbett CL (2019) A survey of deep learning methods for cyber security. Information 10(4):1–35

  16. Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Kluwer Academic Publishers, USA

  17. Chang CC, Lin CJ (2011) Libsvm: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27

  18. Cheng F, Yang K, Zhang L (2015) A structural svm based approach for binary classification under class imbalance. Math Probl Eng 2015:1–10

  19. Chun M, Wei D, Qing W (2020) Speech analysis for wilson’s disease using genetic algorithm and support vector machine. In: Abawajy JH, Choo KKR, Islam R, Xu Z, Atiquzzaman M (eds) International conference on applications and techniques in cyber intelligence ATCI 2019. Springer International Publishing, Cham, pp 1286–1295

  20. Comar PM, Liu L, Saha S, Tan P, Nucci A (2013) Combining supervised and unsupervised learning for zero-day malware detection. In: 2013 Proceedings IEEE INFOCOM, pp 2022–2030

  21. Dan L, Dacheng C, Baihong J, Lei S, Jonathan G, See-Kiong N (2019) Mad-gan: Multivariate anomaly detection for time series data with generative adversarial networks. In: Artificial neural networks and machine learning, pp 703–716

  22. Dunn JC (1973) A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters. J Cybern 3(3):32–57

  23. Gandotra E, Bansal D, Sofat S (2016) Zero-day malware detection. In: 2016 Sixth international symposium on embedded computing and system design (ISED), pp 171–175

  24. Goh KS, Chang E, Cheng KT (2001) Svm binary classifier ensembles for image classification. In: Proceedings of the tenth international conference on information and knowledge management, CIKM ’01. Association for Computing Machinery, New York, NY, USA, pp 395–402

  25. Goodfellow I, McDaniel P, Papernot N (2018) Making machine learning robust against adversarial inputs. Commun ACM 61(7):56–66

  26. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville AC, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems 27, Annual conference on neural information processing systems 2014, 8–13 December 2014, Montreal, Quebec, Canada, pp 2672–2680

  27. Halimaa A, Sundarakantham K (2019) Machine learning based intrusion detection system. In: 2019 3rd International conference on trends in electronics and informatics (ICOEI), pp 916–920

  28. Hao M, Tianhao Y, Fei Y (2019) The svm based on smo optimization for speech emotion recognition. In: 2019 Chinese control conference (CCC), pp 7884–7888

  29. Hao Y, Sheng Y, Wang J (2019) Variant gated recurrent units with encoders to preprocess packets for payload-aware intrusion detection. IEEE Access 7:49985–49998

  30. Hu Z, Chen P, Zhu M, Liu P (2019) Reinforcement learning for adaptive cyber defense against zero-day attacks. Springer International Publishing, Cham, pp 54–93

  31. Ingre B, Yadav A, Soni AK (2018) Decision tree based intrusion detection system for nsl-kdd dataset. In: Satapathy SC, Joshi A (eds) Information and communication technology for intelligent systems (ICTIS 2017), vol 2. Springer International Publishing, Cham, pp 207–218

  32. Jang-Jaccard J, Nepal S (2014) A survey of emerging threats in cybersecurity. J Comput Syst Sci 80(5):973–993. Special issue on dependable and secure computing

  33. Jiang F, Fu Y, Gupta BB, Lou F, Rho S, Meng F, Tian Z (2018) Deep learning based multi-channel intelligent attack detection for data security. IEEE Trans Sustain Comput pp 1–1

  34. Kedziora M, Gawin P, Szczepanik M, Jozwiak I (2019) Malware detection using machine learning algorithms and reverse engineering of android java code. SSRN Electron J. https://doi.org/10.2139/ssrn.3328497

  35. Khan RU, Zhang X, Alazab M, Kumar R (2019) An improved convolutional neural network model for intrusion detection in networks. In: 2019 Cybersecurity and cyberforensics conference (CCC), pp 74–77

  36. Kim JY, Bu SJ, Cho SB (2018) Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders. Inf Sci 460–461:83–102

  37. Kim JY, Cho SB (2018) Detecting intrusive malware with a hybrid generative deep learning model. In: Yin H, Camacho D, Novais P, Tallón-Ballesteros AJ (eds) Intelligent data engineering and automated learning—IDEAL 2018. Springer International Publishing, Cham, pp 499–507

  38. Kim T, Suh SC, Kim H, Kim J, Kim J (2018) An encoding technique for cnn-based network anomaly detection. In: International conference on big data, pp 2960–2965

  39. Kremer J, Steenstrup Pedersen K, Igel C (2014) Active learning with support vector machines. WIREs Data Min Knowl Discov 4(4):313–326

  40. Krishnaveni S, Vigneshwar P, Kishore S, Jothi B, Sivamohan S (2020) Anomaly-based intrusion detection system using support vector machine. In: Dash SS, Lakshmi C, Das S, Panigrahi BK (eds) Artificial intelligence and evolutionary computations in engineering systems. Springer Singapore, Singapore, pp 723–731

  41. Labonne M, Olivereau A, Polve B, Zeghlache D (2019) A cascade-structured meta-specialists approach for neural network-based intrusion detection. In: 16th Annual consumer communications & networking conference, pp 1–6

  42. Lashkari AH, Kadir AFA, Gonzalez H, Mbah KF, Ghorbani AA (2017) Towards a network-based framework for android malware detection and characterization. In: PST. IEEE Computer Society, pp 233–234

  43. Le T, Kang H, Kim H (2019) The impact of pca-scale improving gru performance for intrusion detection. In: 2019 International conference on platform technology and service (PlatCon), pp 1–6

  44. Lewis DD, Gale WA (1994) A sequential algorithm for training text classifiers. In: Croft BW, van Rijsbergen CJ (eds) SIGIR ’94. Springer, London, pp 3–12

  45. Li D, Chen D, Jin B, Shi L, Goh J, Ng SK (2019) Mad-gan: multivariate anomaly detection for time series data with generative adversarial networks. In: Tetko IV, Kůrková V, Karpov P, Theis F (eds) Artificial neural networks and machine learning—ICANN 2019: text and time series. Springer International Publishing, Cham, pp 703–716

  46. Li Y, Ma R, Jiao R (2015) A hybrid malicious code detection method based on deep learning. Int J Softw Eng Appl 9:205–216

  47. Lin WC, Ke SW, Tsai CF (2015) Cann: an intrusion detection system based on combining cluster centers and nearest neighbors. Knowl-Based Syst 78:13–21

  48. Liu J, Tian Z, Zheng R, Liu L (2019) A distance-based method for building an encrypted malware traffic identification framework. IEEE Access 7:100014–100028

  49. Liu J, Zhang W, Tang Z, Xie Y, Ma T, Zhang J, Zhang G, Niyoyita JP (2020) Adaptive intrusion detection via ga-gogmm-based pattern learning with fuzzy rough set-based attribute selection. Expert Syst Appl 139:112845

  50. Liu W, Ci L, Liu L (2020) A new method of fuzzy support vector machine algorithm for intrusion detection. Appl Sci 10(3):1065

  51. Malerba D, Ceci M, Appice A (2009) A relational approach to probabilistic classification in a transductive setting. Eng Appl Artif Intell 22(1):109–116. https://doi.org/10.1016/j.engappai.2008.04.005

  52. Malik AJ, Khan FA (2017) A hybrid technique using binary particle swarm optimization and decision tree pruning for network intrusion detection. Cluster Comput pp 1–14

  53. Moti Z, Hashemi S, Namavar A (2019) Discovering future malware variants by generating new malware samples using generative adversarial network. In: 2019 9th International conference on computer and knowledge engineering (ICCKE), pp 319–324

  54. Naseer S, Saleem Y, Khalid S, Bashir MK, Han J, Iqbal MM, Han K (2018) Enhanced network anomaly detection based on deep neural networks. IEEE Access 6:48231–48246

  55. Pang Y, Chen Z, Peng L, Ma K, Zhao C, Ji K (2019) A signature-based assistant random oversampling method for malware detection. In: 2019 18th IEEE International conference on trust, security and privacy in computing and communications/13th IEEE international conference on big data science and engineering (TrustCom/BigDataSE), pp 256–263

  56. Papernot N, McDaniel P, Wu X, Jha S, Swami A (2016) Distillation as a defense to adversarial perturbations against deep neural networks. In: 2016 IEEE symposium on security and privacy (SP), pp 582–597

  57. Platt JC (1999) Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in large margin classifiers. MIT Press, pp 61–74

  58. Powers D (2007) Evaluation: from precision, recall and fmeasure to roc, informedness, markedness and correlation. J Mach Learn Technol 2:37–63

  59. Qu X, Yang L, Guo K, Ma L, Feng T, Ren S, Sun M (2019) Statistics-enhanced direct batch growth self-organizing mapping for efficient dos attack detection. IEEE Access 7:78434–78441

  60. Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, Langs G (2017) Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: Niethammer M, Styner M, Aylward S, Zhu H, Oguz I, Yap PT, Shen D (eds) Information processing in medical imaging. Springer International Publishing, Cham, pp 146–157

  61. Shapoorifard H, Shamsinjead Babaki P (2017) Intrusion detection using a novel hybrid method incorporating an improved knn. Int J Comput Appl 173:5–9. https://doi.org/10.5120/ijca2017914340

  62. Stellios I, Kotzanikolaou P, Psarakis M (2019) Advanced persistent threats and zero-day exploits in industrial internet of things. Springer International Publishing, Cham, pp 47–68

  63. Stokes JW, Seifert C, Li J, Hejazi N (2019) Detection of prevalent malware families with deep learning. In: MILCOM 2019—2019 IEEE military communications conference (MILCOM), pp 1–8

  64. Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the kdd cup 99 data set. In: Symposium on computational intelligence for security and defense applications, pp 1–6

  65. Vapnik VN (1998) Statistical learning theory. Wiley-Interscience

  66. Vigneswaran RK, Vinayakumar R, Soman KP, Poornachandran P (2018) Evaluating shallow and deep neural networks for network intrusion detection systems in cyber security. In: 2018 9th International conference on computing, communication and networking technologies (ICCCNT), pp 1–6. https://doi.org/10.1109/ICCCNT.2018.8494096

  67. Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Al-Nemrat A, Venkatraman S (2019) Deep learning approach for intelligent intrusion detection system. IEEE Access 7:41525–41550

  68. Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Venkatraman S (2019) Robust intelligent malware detection using deep learning. IEEE Access 7:46717–46738

  69. Virmani C, Choudhary T, Pillai A, Rani M (2020) Applications of machine learning in cyber security. In: Handbook of research on machine and deep learning applications for cyber security

  70. Wadkar M, Troia FD, Stamp M (2020) Detecting malware evolution using support vector machines. Expert Syst Appl 143:113022

  71. Wang Q, Guo W, Zhang K, Ororbia AG, Xing X, Liu X, Giles CL (2017) Adversary resistant deep neural networks with an application to malware detection. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’17. Association for Computing Machinery, New York, NY, USA, pp 1145–1153

  72. Wang W, Zhu M, Zeng X, Ye X, Sheng Y (2017) Malware traffic classification using convolutional neural network for representation learning. In: 2017 International conference on information networking (ICOIN). IEEE, pp 712–717

  73. Yin C, Zhu Y, Fei J, He X (2017) A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 5:21954–21961

  74. Yin Z, Liu W, Chawla S (2019) Adversarial attack, defense, and applications with deep learning frameworks. Springer International Publishing, Berlin, pp 1–25

  75. Yin Z, Wang F, Liu W, Chawla S (2018) Sparse feature attacks in adversarial learning. IEEE Trans Knowl Data Eng 30(6):1164–1177

  76. Zenati H, Foo CS, Lecouat B, Manek G, Chandrasekhar VR (2018) Efficient gan-based anomaly detection. ArXiv abs/1802.06222

  77. Zenati H, Romain M, Foo CS, Lecouat B, Chandrasekhar VR (2018) Adversarially learned anomaly detection. In: 2018 IEEE International conference on data mining (ICDM), pp 727–736

  78. Zhang Y, Chen X, Jin L, Wang X, Guo D (2019) Network intrusion detection: Based on deep hierarchical network and original flow data. IEEE Access 7:37004–37016

  79. Zhang Z, Pan P (2019) A hybrid intrusion detection method based on improved fuzzy c-means and support vector machine. In: 2019 International conference on communications, information system and computer engineering (CISCE), pp 210–214

Acknowledgements

We acknowledge the support of the MIUR-Ministero dell’Istruzione dell’Università e della Ricerca through the project “TALIsMan—Tecnologie di Assistenza personALizzata per il Miglioramento della quAlità della vitA” (Grant ID: ARS01_01116) funded by PON RI 2014–2020 and the ATENEO 2017/18 “Modelli e tecniche di data science per la analisi di dati strutturati” funded by the University of Bari “Aldo Moro”. The authors wish to thank Lynn Rudd for her help in reading the manuscript.

Author information

Corresponding author

Correspondence to Giuseppina Andresini.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Andresini, G., Appice, A., Paolo Caforio, F., Malerba, D. (2021). Improving Cyber-Threat Detection by Moving the Boundary Around the Normal Samples. In: Maleh, Y., Shojafar, M., Alazab, M., Baddi, Y. (eds) Machine Intelligence and Big Data Analytics for Cybersecurity Applications. Studies in Computational Intelligence, vol 919. Springer, Cham. https://doi.org/10.1007/978-3-030-57024-8_5
