A fast and noise resilient cluster-based anomaly detection

Abstract

Clustering is widely applied in anomaly detection, and the choice of clustering method has a direct impact on detection accuracy. Existing cluster-based anomaly detection methods are mainly based on spherical shape clustering. In this paper, we focus on arbitrary shape clustering methods to increase the accuracy of anomaly detection. Because the main drawback of arbitrary shape clustering is its high memory complexity, we propose to summarize the clusters first. For this, we design an algorithm, called Summarization based on Gaussian Mixture Model (SGMM), which summarizes clusters and represents them as Gaussian Mixture Models (GMMs). Once the GMMs are constructed, incoming samples are presented to them and their membership values are calculated; based on these values, the new samples are labeled as “normal” or “anomaly.” Additionally, to address noise in the data, samples are not labeled individually: they are clustered first, and each cluster is then labeled collectively. For this, we present a new approach, called Collective Probabilistic Anomaly Detection (CPAD), in which the distance between the cluster of incoming samples and each existing SGMM is calculated, and the new cluster receives the label of the closest existing cluster. To measure the distance between two GMM-based clusters, we propose a modified version of the Kullback–Leibler measure. We run several experiments to evaluate the performance of the proposed SGMM and CPAD methods and compare them against well-known algorithms, including ABACUS, local outlier factor (LOF), and one-class support vector machine (SVM). SGMM is compared with ABACUS using the Dunn and Davies–Bouldin (DB) metrics, and the results indicate that SGMM summarizes clusters better. CPAD is compared with LOF and one-class SVM on three performance criteria: (a) false alarm rate, (b) detection rate, and (c) memory efficiency. The experimental results show that CPAD is noise resilient and memory efficient, and that its accuracy is higher than that of the other methods.
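The collective labeling step can be illustrated with a minimal Python sketch. It is not the paper's modified Kullback–Leibler measure, which the abstract does not specify. As a stand-in, it assumes diagonal covariances and uses the matching-based approximation of the KL divergence between two GMMs described by Goldberger et al. (2003), made symmetric by summing both directions. All function names and the toy mixtures are hypothetical.

```python
import math

def kl_diag_gauss(mu0, var0, mu1, var1):
    """Closed-form KL(N0 || N1) for Gaussians with diagonal covariance."""
    return 0.5 * sum(
        math.log(v1 / v0) + (v0 + (m0 - m1) ** 2) / v1 - 1.0
        for m0, v0, m1, v1 in zip(mu0, var0, mu1, var1)
    )

def kl_gmm_matched(f, g):
    """Matching-based approximation of KL(f || g) between two GMMs.

    Each GMM is a list of (weight, mean, variance) components. Every
    component of f is matched to the component of g that minimizes
    KL(f_i || g_j) - log(weight_j).
    """
    total = 0.0
    for wf, mf, vf in f:
        best = min(kl_diag_gauss(mf, vf, mg, vg) - math.log(wg)
                   for wg, mg, vg in g)
        total += wf * (best + math.log(wf))
    return total

def symmetric_distance(f, g):
    """Symmetrized distance: sum of the two directed approximations."""
    return kl_gmm_matched(f, g) + kl_gmm_matched(g, f)

def label_new_cluster(new_gmm, labeled_gmms):
    """Return the label of the summarized cluster closest to new_gmm.

    labeled_gmms is a list of (label, gmm) pairs.
    """
    return min(labeled_gmms,
               key=lambda item: symmetric_distance(new_gmm, item[1]))[0]

# Toy summaries (assumed values, for illustration only).
normal = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 0.0], [1.0, 1.0])]
attack = [(1.0, [10.0, 10.0], [0.5, 0.5])]
new = [(0.6, [0.2, -0.1], [1.1, 0.9]), (0.4, [2.8, 0.3], [1.0, 1.0])]

label = label_new_cluster(new, [("normal", normal), ("anomaly", attack)])
print(label)  # prints: normal
```

Because the whole incoming cluster is compared to each summary at once, a few noisy samples shift its mixture parameters only slightly and are unlikely to change the assigned label. Note that the matched quantity is an approximation and can be slightly negative for nearly identical mixtures.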





Corresponding author

Correspondence to Elnaz Bigdeli.


Cite this article

Bigdeli, E., Mohammadi, M., Raahemi, B. et al. A fast and noise resilient cluster-based anomaly detection. Pattern Anal Applic 20, 183–199 (2017). https://doi.org/10.1007/s10044-015-0484-0


Keywords

  • Anomaly detection
  • Arbitrary shape clustering
  • Gaussian Mixture Model
  • Distribution distance