Spectral ranking and unsupervised feature selection for point, collective, and contextual anomaly detection

  • Regular Paper
International Journal of Data Science and Analytics

Abstract

An unsupervised anomaly detection algorithm is typically suited to only one type of anomaly: point, collective, or contextual. A mismatch between the anomaly type an algorithm targets and the type actually present in the data can lead to poor performance. In this paper, using the Hilbert–Schmidt independence criterion (HSIC), we propose BAHSIC-AD, an unsupervised backward elimination feature selection algorithm that identifies a subset of features with the strongest interdependence for anomaly detection. Using BAHSIC-AD, we compare the effectiveness of the recent Spectral Ranking for Anomalies (SRA) algorithm with other popular anomaly detection methods on a few synthetic and real-world datasets. Furthermore, we demonstrate that SRA, combined with BAHSIC-AD, can be a generally applicable method for detecting point, collective, and contextual anomalies.
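
As a rough illustration of the dependence measure named in the abstract, the sketch below computes the standard biased empirical HSIC estimate, tr(KHLH)/(n-1)^2, with Gaussian kernels, and then uses it in a simple greedy backward elimination. The kernel choice, the median-heuristic bandwidth, the function names (gaussian_kernel, hsic, backward_eliminate), and the elimination score (HSIC between one feature and all remaining features) are assumptions made for illustration only. This is not the paper's BAHSIC-AD procedure, whose details are not given in this preview.

```python
import numpy as np


def gaussian_kernel(x, sigma=None):
    """Gaussian (RBF) kernel matrix; samples are the rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq_norms = np.sum(x * x, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off
    if sigma is None:
        # Assumption: median-heuristic bandwidth (one common choice).
        positive = sq_dists[sq_dists > 0]
        sigma = np.sqrt(0.5 * np.median(positive)) if positive.size else 1.0
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(x, y):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    K = gaussian_kernel(x)
    L = gaussian_kernel(y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


def backward_eliminate(X, n_keep):
    """Hypothetical greedy elimination, NOT the paper's BAHSIC-AD.

    Repeatedly drops the feature whose HSIC with the remaining
    features is smallest, until n_keep feature indices are left.
    """
    X = np.asarray(X, dtype=float)
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        scores = []
        for j in remaining:
            others = [k for k in remaining if k != j]
            scores.append(hsic(X[:, j], X[:, others]))
        remaining.pop(int(np.argmin(scores)))
    return remaining


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    # Two mutually dependent features followed by two pure-noise features.
    X = np.column_stack([
        x,
        x + 0.1 * rng.normal(size=200),
        rng.normal(size=200),
        rng.normal(size=200),
    ])
    print("kept feature indices:", backward_eliminate(X, n_keep=2))
```

Each elimination round builds dense n-by-n kernel matrices for every candidate feature, so this sketch is only practical for small samples; a real implementation would likely cache or approximate the kernels.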



Acknowledgements

All authors acknowledge funding from the Natural Sciences and Engineering Research Council of Canada. The third author acknowledges funding from the Ophelia Lazaridis University Research Chair. The views expressed herein are solely those of the authors. The authors acknowledge comments from anonymous referees that have improved the presentation of the paper.

Author information

Corresponding author

Correspondence to Yuying Li.

About this article

Cite this article

Zhang, H., Nian, K., Coleman, T.F. et al. Spectral ranking and unsupervised feature selection for point, collective, and contextual anomaly detection. Int J Data Sci Anal 9, 57–75 (2020). https://doi.org/10.1007/s41060-018-0161-7

  • DOI: https://doi.org/10.1007/s41060-018-0161-7
