
Quantifying the Confidence of Anomaly Detectors in Their Example-Wise Predictions

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Abstract

Anomaly detection focuses on identifying examples in the data that somehow deviate from what is expected or typical. Algorithms for this task usually assign a score to each example that represents how anomalous the example is. Then, a threshold on the scores turns them into concrete predictions. However, each algorithm uses a different approach to assign the scores, which makes them difficult to interpret and can quickly erode a user’s trust in the predictions. This paper introduces an approach for assessing the reliability of any anomaly detector’s example-wise predictions. To do so, we propose a Bayesian approach for converting anomaly scores to probability estimates. This enables the anomaly detector to assign a confidence score to each prediction which captures its uncertainty in that prediction. We theoretically analyze the convergence behaviour of our confidence estimate. Empirically, we demonstrate the effectiveness of the framework in quantifying a detector’s confidence in its predictions on a large benchmark of datasets.
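To make the idea concrete, the following is a minimal, illustrative sketch of turning raw anomaly scores into probability estimates and per-example confidences. It is not the paper's exact method: the base detector (scikit-learn's IsolationForest), the contamination factor `gamma`, and the uniform Beta prior are all assumptions made for this example.

```python
# Illustrative sketch only (not the paper's exact method): convert raw anomaly
# scores into Bayesian probability estimates and a per-example confidence.
import numpy as np
from scipy.stats import beta
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 2))                    # assumed-normal training data
X_test = np.vstack([rng.normal(size=(5, 2)), [[6.0, 6.0]]])

gamma = 0.05                                           # expected anomaly fraction (assumption)
det = IsolationForest(random_state=0).fit(X_train)

# Higher score = more anomalous (score_samples uses the opposite convention).
train_scores = -det.score_samples(X_train)
test_scores = -det.score_samples(X_test)
threshold = np.quantile(train_scores, 1.0 - gamma)

n = len(train_scores)
for s, pred in zip(test_scores, (test_scores > threshold).astype(int)):
    # k training scores fall at or below s; with a uniform Beta(1, 1) prior,
    # the posterior of the exceedance level F(s) is Beta(k + 1, n - k + 1).
    k = int(np.sum(train_scores <= s))
    # Posterior probability that the example lies above the (1 - gamma) level,
    # i.e. that "anomaly" is the right call; confidence = agreement with pred.
    p_anomaly = beta.sf(1.0 - gamma, k + 1, n - k + 1)
    confidence = p_anomaly if pred == 1 else 1.0 - p_anomaly
    print(f"score={s:.3f} pred={'anomaly' if pred else 'normal'} confidence={confidence:.3f}")
```

In this sketch, scores near the threshold yield confidences close to 0.5 while scores far from it approach 1, mirroring the intuition that borderline predictions are the least trustworthy.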


Notes

  1. We assume that \(k \in \mathbb{N}\), taking the floor function when needed.

  2. A failure corresponds to a training example having a higher anomaly score than the chosen threshold. Given the assumption that all training examples are normal, each such failure is a false positive (a minimal sketch of this count follows these notes).

  3. Implementation available at: https://github.com/Lorenzo-Perini/Confidence_AD.
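As referenced in Note 2, here is a minimal, self-contained sketch of the failure count: with the threshold set at the (1 - gamma) quantile of the training scores, any training example scoring above it is a failure, i.e. a false positive if all training data are truly normal. The scores and `gamma` below are hypothetical placeholders.

```python
# Count "failures": training examples whose anomaly score exceeds the threshold.
import numpy as np

rng = np.random.default_rng(1)
train_scores = rng.normal(size=500)            # hypothetical anomaly scores
gamma = 0.05                                   # assumed contamination factor
threshold = np.quantile(train_scores, 1.0 - gamma)

failures = int(np.sum(train_scores > threshold))
print(f"{failures} failures out of {len(train_scores)} training examples")
```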


Acknowledgements

This work is supported by the Flemish government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme (JD, LP, VV), FWO (G0D8819N to JD), and KU Leuven Research Fund (C14/17/07 to JD).

Author information


Correspondence to Lorenzo Perini, Vincent Vercruyssen, or Jesse Davis.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Perini, L., Vercruyssen, V., Davis, J. (2021). Quantifying the Confidence of Anomaly Detectors in Their Example-Wise Predictions. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol. 12459. Springer, Cham. https://doi.org/10.1007/978-3-030-67664-3_14


  • DOI: https://doi.org/10.1007/978-3-030-67664-3_14

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67663-6

  • Online ISBN: 978-3-030-67664-3

  • eBook Packages: Computer Science, Computer Science (R0)
