Spectral ranking and unsupervised feature selection for point, collective, and contextual anomaly detection

Zhang, Haofan; Nian, Ke; Coleman, Thomas F.; Li, Yuying

doi:10.1007/s41060-018-0161-7

Regular Paper
Published: 04 December 2018

International Journal of Data Science and Analytics
Volume 9, pages 57–75 (2020)

Haofan Zhang¹,
Ke Nian¹,
Thomas F. Coleman² &
…
Yuying Li ORCID: orcid.org/0000-0001-9423-7313¹

Abstract

An unsupervised anomaly detection algorithm is typically suited to only one type of anomaly: point, collective, or contextual. A mismatch between the anomaly type an algorithm targets and the type actually present in the data can lead to poor performance. In this paper, using the Hilbert–Schmidt independence criterion (HSIC), we propose an unsupervised backward elimination feature selection algorithm, BAHSIC-AD, which identifies a subset of features with the strongest interdependence for anomaly detection. Using BAHSIC-AD, we compare the effectiveness of a recent Spectral Ranking for Anomalies (SRA) algorithm with other popular anomaly detection methods on a few synthetic and real-world datasets. Furthermore, we demonstrate that SRA, combined with BAHSIC-AD, can be a generally applicable method for detecting point, collective, and contextual anomalies.
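
To make the idea of HSIC-driven backward elimination concrete, the following is a minimal sketch and not the BAHSIC-AD algorithm itself, whose details are not given in this abstract. It uses the standard biased empirical HSIC estimator, trace(KHLH)/(n-1)^2, with Gaussian kernels. The elimination rule (repeatedly drop the feature whose HSIC with the remaining features is smallest), the function names `rbf_kernel`, `hsic`, and `backward_eliminate`, and the kernel width `sigma` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, sigma=1.0):
    """Gaussian kernel matrix over the rows of X (n samples, d features)."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2."""
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


def backward_eliminate(X, n_keep, sigma=1.0):
    """Hypothetical elimination rule (an assumption, not the paper's):
    repeatedly drop the feature least dependent on the remaining ones.
    Assumes no feature is constant, since columns are standardized."""
    X = np.asarray(X, dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        scores = []
        for j in active:
            rest = [k for k in active if k != j]
            scores.append(hsic(X[:, [j]], X[:, rest], sigma))
        # Remove the feature with the weakest dependence on the others.
        active.pop(int(np.argmin(scores)))
    return active
```

Calling `backward_eliminate(X, n_keep=2)` on an array with two strongly related columns and two independent noise columns should return the indices of the related pair. Each round costs one HSIC evaluation per active feature, and each evaluation multiplies n-by-n matrices, so this sketch is practical only for modest sample sizes.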

Acknowledgements

All authors acknowledge funding from the Natural Sciences and Engineering Research Council of Canada. The third author acknowledges funding from the Ophelia Lazaridis University Research Chair. The views expressed herein are solely those of the authors. The authors acknowledge comments from anonymous referees, which have improved the presentation of the paper.

Author information

Authors and Affiliations

Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Haofan Zhang, Ke Nian & Yuying Li
Combinatorics and Optimization, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Thomas F. Coleman

Corresponding author

Correspondence to Yuying Li.

About this article

Cite this article

Zhang, H., Nian, K., Coleman, T.F. et al. Spectral ranking and unsupervised feature selection for point, collective, and contextual anomaly detection. Int J Data Sci Anal 9, 57–75 (2020). https://doi.org/10.1007/s41060-018-0161-7

Received: 09 April 2018
Accepted: 23 November 2018
Published: 04 December 2018
Issue Date: February 2020
DOI: https://doi.org/10.1007/s41060-018-0161-7