Abstract
Outlier detection is often considered challenging because datasets are inherently class-imbalanced: the few available outliers are insufficient to describe the overall outlier distribution. This makes it difficult for classifiers to learn the demarcation (boundary) between normal samples and outliers, which is key to accurate detection. In this paper, we propose a novel discriminative boundary generation framework called BoG. The framework extracts the border samples in the dataset and expands them to form the initial boundary outliers. Through adversarial training in a GAN, the boundary outliers are further augmented and, together with the boundary normal data, provide valuable demarcation information for the classifier. Two method variants are proposed under the BoG framework to balance detection efficiency and effectiveness. Extensive experiments show that the proposed framework achieves significant improvements over existing outlier detection methods.
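The first two steps named in the abstract (extracting border samples and expanding them into initial boundary outliers) can be illustrated with a minimal sketch. The abstract does not specify how either step is carried out, so everything here is an assumption made for illustration: the border score (distance of a point from the centroid of its k nearest neighbours), the parameters `k`, `frac`, and `step`, and all function names are hypothetical and are not the authors' BoG procedure. The GAN-based augmentation step is omitted.

```python
import numpy as np

def knn_indices(X, k):
    """Return indices of the k nearest neighbours of each row (brute force)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # exclude each point from its own neighbour list
    return np.argsort(d, axis=1)[:, :k]

def border_samples(X, k=10, frac=0.1):
    """Assumed border score: how far a point sits from its neighbours' centroid.

    Points inside a dense region are surrounded on all sides, so the offset is
    small; points on the edge have neighbours mostly on one side, so it is large.
    """
    nn = knn_indices(X, k)
    offset = X - X[nn].mean(axis=1)          # vector from local centroid to point
    score = np.linalg.norm(offset, axis=1)
    n_border = max(1, int(frac * len(X)))
    idx = np.argsort(score)[-n_border:]      # highest-scoring points
    return idx, offset[idx]

def expand_outward(X, idx, offset, step=1.0):
    """Assumed expansion: push each border point `step` units along its offset."""
    direction = offset / (np.linalg.norm(offset, axis=1, keepdims=True) + 1e-12)
    return X[idx] + step * direction

# Toy data standing in for the normal class (hypothetical, not from the paper).
rng = np.random.default_rng(0)
normal = rng.normal(size=(500, 2))

idx, offset = border_samples(normal, k=10, frac=0.1)
seeds = expand_outward(normal, idx, offset, step=1.0)
print(seeds.shape)
```

In a pipeline of the kind the abstract describes, seed points like these would be handed to a generator for further augmentation before a classifier is trained on them together with the border normal samples; how BoG actually does this is described in the full paper, not here.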
Author information
Contributions
QL, JZ, and MJB wrote the main manuscript text; all authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Zhang, J., Liang, Q., Bah, M.J. et al. Discriminative boundary generation for effective outlier detection. Knowl Inf Syst 66, 2987–3004 (2024). https://doi.org/10.1007/s10115-023-02012-3
DOI: https://doi.org/10.1007/s10115-023-02012-3