Abstract
Classification tasks usually assume that all possible classes are present during the training phase. This assumption is restrictive when an algorithm is deployed over a long period and may encounter samples from previously unseen classes. It is therefore essential to develop algorithms that can distinguish between normal and abnormal test data. In recent years, extreme value theory has become an important tool in multivariate statistics and machine learning. The recently introduced extreme value machine, a classifier motivated by extreme value theory, addresses this problem and achieves competitive performance in specific cases. We show that this algorithm has theoretical and practical drawbacks and can fail even when the recognition task is fairly simple. To overcome these limitations, we propose two new algorithms for anomaly detection that rely on approximations from extreme value theory and are more robust in such cases. We exploit the intuition that test points lying extremely far from the training classes are more likely to be abnormal, and we derive asymptotic results motivated by univariate extreme value theory that make this intuition precise. We demonstrate the effectiveness of our classifiers in simulations and on real data sets.
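The core intuition above — that a test point far from the training class is likely abnormal, with the tail of the distance distribution modelled via extreme value theory — can be illustrated with a minimal peaks-over-threshold sketch. This is not the paper's algorithm, only an illustration of the general idea: the class center, the 90% threshold, and the Euclidean distance are all arbitrary choices made for the example, and `scipy.stats.genpareto` supplies the generalized Pareto fit.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)

# Hypothetical training class: points clustered around the origin.
train = rng.normal(0.0, 1.0, size=(1000, 2))
center = train.mean(axis=0)

# Distances of training points from the class center.
dist = np.linalg.norm(train - center, axis=1)

# Peaks-over-threshold: fit a GPD to exceedances above a high
# empirical quantile (the threshold choice is a modelling decision).
u = np.quantile(dist, 0.9)
exc = dist[dist > u] - u
shape, _, scale = genpareto.fit(exc, floc=0.0)

def tail_probability(x):
    """Estimated probability that a training-class point lies farther
    from the center than x; small values flag likely anomalies."""
    d = np.linalg.norm(x - center)
    if d <= u:
        return np.mean(dist > d)          # empirical estimate below the threshold
    p_u = np.mean(dist > u)               # probability of exceeding the threshold
    return p_u * genpareto.sf(d - u, shape, loc=0.0, scale=scale)

# A point near the class gets a much larger tail probability
# than a point extremely far from it.
print(tail_probability(np.array([0.5, -0.5])) >
      tail_probability(np.array([8.0, 8.0])))  # True
```

The GPD fit lets the score extrapolate beyond the largest observed training distance, where a purely empirical estimate would be identically zero and could not rank abnormal points.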
Acknowledgments
Edoardo Vignotto acknowledges funding from the Swiss National Science Foundation (Doc.Mobility Grant 188229). We gratefully acknowledge helpful comments by two anonymous referees and the editorial board. Sebastian Engelke was supported by the Swiss National Science Foundation; the paper was completed while he was a visitor at the Department of Statistical Sciences, University of Toronto.
Funding
Open access funding provided by University of Geneva.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Vignotto, E., Engelke, S. Extreme value theory for anomaly detection – the GPD classifier. Extremes 23, 501–520 (2020). https://doi.org/10.1007/s10687-020-00393-0