Abstract
Anomaly detection becomes more important with increasing automation. Especially for time series data, prevalent in industry, there are numerous methods that have been well researched. In this work we provide a proof of concept for a novel approach using the interpretability of Gaussian processes. To detect an abnormal section, the data is split into equally sized segments which are then interpreted individually using separate kernel searches. The resulting kernels can then be compared and clustered by one of multiple presented methods. The segments that contain an anomaly end up in their own cluster.
To test all possible configurations of our proposed approach, we applied them to a subset of the SIGKDD 2021 anomaly dataset multiple times and evaluated the results. Almost all configurations were able to succeed, although their results are not yet reliably reproducible. The results of our performance evaluation indicate that kernel searches are in principle applicable to anomaly detection in univariate time series data.
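The pipeline described in the abstract (segment, search a kernel per segment, cluster the resulting kernels) can be illustrated with a minimal sketch. This is not the authors' implementation. As a stand-in for a full compositional kernel search, it picks the best of three base kernels by Gaussian process log marginal likelihood, with fixed, hand-picked hyperparameters and no optimization. Segments are then clustered simply by the name of the selected kernel. All function names, the kernel set, the noise level, and the per-segment centering are assumptions made for illustration.

```python
import math
from collections import Counter

# Candidate base kernels with fixed, hand-picked hyperparameters (assumptions).
KERNELS = {
    "SE": lambda a, b: math.exp(-0.5 * (a - b) ** 2 / 4.0),
    "PER": lambda a, b: math.exp(-2.0 * math.sin(math.pi * (a - b) / 10.0) ** 2),
    "LIN": lambda a, b: a * b,
}


def split_into_segments(series, segment_length):
    """Split a series into consecutive, equally sized segments (remainder dropped)."""
    count = len(series) // segment_length
    return [series[i * segment_length:(i + 1) * segment_length] for i in range(count)]


def log_marginal_likelihood(kernel, xs, ys, noise=0.1):
    """Zero-mean GP log marginal likelihood, computed via a Cholesky factorization."""
    n = len(xs)
    cov = [[kernel(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(cov[i][i] - partial)
            else:
                chol[i][j] = (cov[i][j] - partial) / chol[j][j]
    # Forward substitution: solve chol @ z = ys, so that z.z = ys^T cov^-1 ys.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(chol[i][k] * z[k] for k in range(i))) / chol[i][i])
    half_log_det = sum(math.log(chol[i][i]) for i in range(n))
    return -0.5 * sum(v * v for v in z) - half_log_det - 0.5 * n * math.log(2.0 * math.pi)


def select_kernel(segment):
    """Stand-in for a kernel search: name of the best-scoring base kernel."""
    xs = list(range(len(segment)))  # local time index within the segment
    mean = sum(segment) / len(segment)
    ys = [v - mean for v in segment]  # center the data (assumption)
    return max(KERNELS, key=lambda name: log_marginal_likelihood(KERNELS[name], xs, ys))


def minority_segments(labels):
    """Indices of segments in the least frequent cluster; empty if only one cluster."""
    counts = Counter(labels)
    if len(counts) < 2:
        return []
    rare = min(counts, key=counts.get)
    return [i for i, label in enumerate(labels) if label == rare]


# Toy series: three periodic segments and one ramp segment at position 2.
wave = [math.sin(2.0 * math.pi * t / 10.0) for t in range(20)]
ramp = [0.2 * t for t in range(20)]
series = wave + wave + ramp + wave

labels = [select_kernel(segment) for segment in split_into_segments(series, 20)]
print(labels, minority_segments(labels))
```

Grouping by kernel name is the crudest possible comparison. It only separates segments whose best-fitting structure differs, which is why a real kernel search with optimized hyperparameters and a proper distance between kernels would be needed in practice.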
This research was supported by the research training group “Dataninja” (Trustworthy AI for Seamless Problem Solving: Next Generation Intelligence Joins Robust Data Analysis) funded by the German federal state of North Rhine-Westphalia.
J. D. Hüwel and A. Besginow—Equal contribution to this work.
Notes
1. Everything can be found in the repository https://github.com/JanHuewel/KernelSearchAnomalyDetection.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hüwel, J.D., Besginow, A., Berns, F., Lange-Hegermann, M., Beecks, C. (2023). On Kernel Search Based Gaussian Process Anomaly Detection. In: Smirnov, A., Panetto, H., Madani, K. (eds) Innovative Intelligent Industrial Production and Logistics. IN4PL IN4PL 2020 2021. Communications in Computer and Information Science, vol 1855. Springer, Cham. https://doi.org/10.1007/978-3-031-37228-5_1
DOI: https://doi.org/10.1007/978-3-031-37228-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37227-8
Online ISBN: 978-3-031-37228-5
eBook Packages: Computer Science, Computer Science (R0)