
On Kernel Search Based Gaussian Process Anomaly Detection

  • Conference paper
Innovative Intelligent Industrial Production and Logistics (IN4PL 2020, IN4PL 2021)

Abstract

Anomaly detection is becoming more important as automation increases. For time series data in particular, which is prevalent in industry, numerous well-researched methods exist. In this work we provide a proof of concept for a novel approach that exploits the interpretability of Gaussian processes. To detect an abnormal section, the data is split into equally sized segments, each of which is then interpreted individually by a separate kernel search. The resulting kernels can then be compared and clustered using one of several presented methods. Segments that contain an anomaly end up in their own cluster.

To test all possible configurations of our proposed approach, we applied each of them multiple times to a subset of the SIGKDD 2021 anomaly dataset and evaluated the results. Almost all configurations were able to succeed, although not yet in a reliably reproducible manner. The results of our performance evaluation indicate that kernel searches are in principle applicable to anomaly detection in univariate time series data.
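The segment-wise procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the three base kernels, and their fixed hyperparameters are assumptions made for illustration. A real kernel search composes kernels and optimizes their hyperparameters, and the paper compares and clusters the resulting kernels with dedicated methods. Here each segment simply receives the base kernel with the highest Gaussian process log marginal likelihood, and segments whose kernel differs from the majority are flagged.

```python
import math
from collections import Counter


def split_segments(series, size):
    """Split a series into equally sized segments, dropping any remainder."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]


# Hypothetical base kernels with fixed, hand-picked hyperparameters.
# A real kernel search would compose kernels and optimize these values.
KERNELS = {
    "SE": lambda a, b: math.exp(-0.5 * (a - b) ** 2),
    "PER": lambda a, b: math.exp(-2.0 * math.sin(math.pi * (a - b) / 0.25) ** 2),
    "LIN": lambda a, b: a * b,
}


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def log_marginal_likelihood(xs, ys, kernel, noise=0.1):
    """Log marginal likelihood of a zero-mean GP with noise variance `noise`."""
    n = len(xs)
    K = [[kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    # Forward substitution solves L z = y, so that y^T K^{-1} y = z^T z.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return (-0.5 * sum(v * v for v in z) - 0.5 * log_det
            - 0.5 * n * math.log(2.0 * math.pi))


def select_kernel(segment):
    """Name of the base kernel that best explains the mean-centered segment."""
    n = len(segment)
    xs = [i / n for i in range(n)]
    mean = sum(segment) / n
    ys = [v - mean for v in segment]
    return max(KERNELS,
               key=lambda name: log_marginal_likelihood(xs, ys, KERNELS[name]))


def minority_segments(labels):
    """Indices of segments whose kernel differs from the most common one."""
    majority = Counter(labels).most_common(1)[0][0]
    return [i for i, label in enumerate(labels) if label != majority]


if __name__ == "__main__":
    series = [math.sin(2 * math.pi * t / 10) for t in range(40)]
    labels = [select_kernel(seg) for seg in split_segments(series, 10)]
    print(labels, minority_segments(labels))
```

Grouping by kernel name is the crudest possible stand-in for the clustering step. It only separates segments whose best-fitting structure changes, and it would miss anomalies that alter hyperparameters while leaving the kernel type unchanged.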

This research was supported by the research training group “Dataninja” (Trustworthy AI for Seamless Problem Solving: Next Generation Intelligence Joins Robust Data Analysis) funded by the German federal state of North Rhine-Westphalia.

J. D. Hüwel and A. Besginow contributed equally to this work.


Notes

  1. Everything can be found in the repository https://github.com/JanHuewel/KernelSearchAnomalyDetection.



Corresponding author

Correspondence to Jan David Hüwel.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Hüwel, J.D., Besginow, A., Berns, F., Lange-Hegermann, M., Beecks, C. (2023). On Kernel Search Based Gaussian Process Anomaly Detection. In: Smirnov, A., Panetto, H., Madani, K. (eds) Innovative Intelligent Industrial Production and Logistics. IN4PL 2020, IN4PL 2021. Communications in Computer and Information Science, vol 1855. Springer, Cham. https://doi.org/10.1007/978-3-031-37228-5_1


  • DOI: https://doi.org/10.1007/978-3-031-37228-5_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37227-8

  • Online ISBN: 978-3-031-37228-5

  • eBook Packages: Computer Science, Computer Science (R0)
