
Functional anomaly detection: a benchmark study

  • Review
International Journal of Data Science and Analytics

Abstract

Increasing automation in many areas of industry calls for efficient machine learning solutions for the detection of abnormal events. With the ubiquitous deployment of sensors that monitor the health of complex infrastructures nearly continuously, anomaly detection can now rely on measurements sampled at very high frequency, which provide a very rich representation of the phenomenon under surveillance. To fully exploit the information thus collected, the observations can no longer be treated as multivariate data, and a functional analysis approach is required. The purpose of this paper is to investigate the performance of recent techniques for anomaly detection in the functional setup on real datasets. After an overview of the state of the art and a visual-descriptive study, a variety of anomaly detection methods are compared. While taxonomies of abnormalities (e.g., shape, location) in the functional setup are documented in the literature, assigning a specific type to the identified anomalies appears to be a challenging task. The strengths and weaknesses of the existing approaches are therefore benchmarked against these types in a simulation study. The anomaly detection methods are then evaluated on two datasets, related respectively to the monitoring of helicopters in flight and to the spectrometry of construction materials. The benchmark analysis concludes with recommendations for practitioners.
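To make the notions of location and shape anomalies concrete, the following minimal sketch simulates a small sample of curves, adds one anomaly of each type, and scores every curve with a simple integrated pointwise depth. It is an illustration only: the sinusoidal model, the function names `simulate_curves` and `integrated_depth`, and the choice of depth are assumptions made here, not the benchmark protocol or any of the methods compared in the paper.

```python
import math
import random


def simulate_curves(n_normal=50, n_points=100, seed=0):
    """Return (grid, curves, labels); the last two curves are anomalies.

    Hypothetical toy model: noisy-amplitude sine curves on [0, 1].
    """
    rng = random.Random(seed)
    grid = [j / (n_points - 1) for j in range(n_points)]
    curves, labels = [], []
    for _ in range(n_normal):
        a = rng.gauss(1.0, 0.1)  # random amplitude
        b = rng.gauss(0.0, 0.1)  # random vertical offset
        curves.append([a * math.sin(2 * math.pi * t) + b for t in grid])
        labels.append("normal")
    # Location anomaly: usual shape, shifted far above the bulk of the data.
    curves.append([math.sin(2 * math.pi * t) + 5.0 for t in grid])
    labels.append("location")
    # Shape anomaly: stays within the usual range but oscillates faster.
    curves.append([math.sin(6 * math.pi * t) for t in grid])
    labels.append("shape")
    return grid, curves, labels


def integrated_depth(curves):
    """Average over the grid of 1 - |1/2 - F_t(x(t))|.

    F_t is the empirical distribution function of the sample at grid point t.
    Low values indicate curves that are extreme at most grid points.
    """
    n, m = len(curves), len(curves[0])
    depths = [0.0] * n
    for j in range(m):
        column = [c[j] for c in curves]
        for i in range(n):
            f = sum(1 for v in column if v <= column[i]) / n
            depths[i] += (1.0 - abs(0.5 - f)) / m
    return depths


grid, curves, labels = simulate_curves()
depths = integrated_depth(curves)
lowest = min(range(len(curves)), key=depths.__getitem__)
print("least deep curve:", labels[lowest])
for name in ("location", "shape"):
    print(name, "anomaly depth:", round(depths[labels.index(name)], 3))
```

By construction the shifted curve lies above all others at every grid point, so it attains the smallest possible value of this depth (0.5). The shape anomaly stays inside the range of the normal curves, so a pointwise score of this kind need not rank it as clearly, which is one way to see why the type of anomaly matters when comparing detectors.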

Data Availability

The datasets are available at the following link: https://drive.google.com/drive/folders/1p1k5eRwSPDH_BP6E8j_iLMCaUtEfLOkN?usp=sharing.

Code Availability

The code is available at the following link: https://drive.google.com/drive/folders/1p1k5eRwSPDH_BP6E8j_iLMCaUtEfLOkN?usp=sharing.

Notes

  1. https://github.com/GAA-UAM/scikit-fda.

  2. https://github.com/GuillaumeStaermanML.

  3. https://github.com/danieltan07/dagmm.

  4. https://github.com/GuansongPang/deep-outlier-detection.

  5. https://github.com/billhhh/RDP.

Funding

This work has been funded by BPI France in the context of the PSPC Project Expresso (2017–2021). This project also received financial support from the initiative “Forschungspartnerschaften Mineralrohstoffe - ein strategischer Forschungsschwerpunkt der Geologischen Bundesanstalt”. The spectroscopic data of sedimentary material were provided by the Geological Survey of Austria.

Author information

Corresponding author

Correspondence to Guillaume Staerman.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Benchmarked datasets

Fig. 8 displays the aeronautics and rocks datasets.

Appendix B: Additional experiments on simulated anomalies

This part presents experiments complementary to those of Sect. 3. They follow the same methodology but vary the proportion of anomalies: 1% in Table 5, 2% in Table 6, 3% in Table 7 and 4% in Table 8.
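As a rough illustration of what varying the proportion of anomalies involves, the sketch below draws anomaly labels at the four contamination levels and evaluates a set of scores at each level. Several elements are assumptions, since this section does not state them: the sample size of 500 curves, ROC AUC as the evaluation metric, and the Gaussian placeholder scores, which stand in for the output of an actual detector. The helper names `roc_auc` and `contaminated_labels` are hypothetical.

```python
import random


def roc_auc(scores, is_anomaly):
    """Probability that a random anomaly scores above a random normal curve.

    Ties count for one half. Higher scores are taken to mean "more abnormal".
    """
    pos = [s for s, y in zip(scores, is_anomaly) if y]
    neg = [s for s, y in zip(scores, is_anomaly) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def contaminated_labels(n_curves, proportion, rng):
    """Mark round(proportion * n_curves) randomly chosen curves as anomalies."""
    n_anom = max(1, round(proportion * n_curves))
    anomalous = set(rng.sample(range(n_curves), n_anom))
    return [i in anomalous for i in range(n_curves)]


rng = random.Random(0)
n_curves = 500  # assumed sample size, for illustration only
for proportion in (0.01, 0.02, 0.03, 0.04):
    labels = contaminated_labels(n_curves, proportion, rng)
    # Placeholder scores: anomalies receive a higher mean score than normal
    # curves. A real experiment would use a detector's scores instead.
    scores = [rng.gauss(2.0 if y else 0.0, 1.0) for y in labels]
    auc = roc_auc(scores, labels)
    print(f"{proportion:.0%} anomalies: n_anom={sum(labels)}, AUC={auc:.3f}")
```

With few anomalies the AUC estimate rests on a handful of positive examples, so repeating the draw over several seeds and averaging would give a more stable comparison across contamination levels.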

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Staerman, G., Adjakossa, E., Mozharovskyi, P. et al. Functional anomaly detection: a benchmark study. Int J Data Sci Anal 16, 101–117 (2023). https://doi.org/10.1007/s41060-022-00366-5

  • DOI: https://doi.org/10.1007/s41060-022-00366-5
