Abstract
Unsupervised anomaly detection with autoencoders has gained importance because anomalies behave differently from normal data when reconstructed from a well-regularized latent space. Existing research shows that retaining valuable properties of the input data in the latent space helps reconstruct unseen data more accurately. Moreover, real-world sensor data is skewed and non-Gaussian, which renders mean-based estimators unreliable. Reconstruction-based anomaly detection methods rely on the Euclidean distance as the reconstruction error, which ignores useful correlation information in the latent space. In this work, we address some of the limitations of the Euclidean distance when it is used as a reconstruction error to detect anomalies, especially near anomalies, whose distribution in the feature space is similar to that of the normal data. We propose a latent-dimension-regularized autoencoder that uses a robust form of the Mahalanobis distance (MD) to measure latent space correlation and thereby detect both near and far anomalies. We show that incorporating correlation information in the latent space, in the form of a robust MD, helps separate both near and far anomalies in the reconstructed space.
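To make the contrast between the Euclidean distance and a robust MD concrete, the following is a minimal sketch rather than the authors' method, since the abstract does not specify the estimator. It assumes a coordinate-wise median as the robust centre and a covariance re-estimated after trimming the most distant training codes; the function name `robust_mahalanobis`, the `trim` parameter, and the synthetic 2-D latent codes are all hypothetical.

```python
import numpy as np


def robust_mahalanobis(latent, train_latent, trim=0.1):
    """Distance of each row of `latent` from the bulk of `train_latent`.

    Assumption (not from the paper): the centre is the coordinate-wise
    median and the scatter is a covariance re-estimated after discarding
    the `trim` fraction of training codes that lie farthest away.
    """
    center = np.median(train_latent, axis=0)
    diffs = train_latent - center
    cov = diffs.T @ diffs / len(train_latent)
    d2 = np.einsum("ij,jk,ik->i", diffs, np.linalg.pinv(cov), diffs)

    # Keep the closest (1 - trim) fraction and re-estimate centre and scatter.
    kept = train_latent[d2 <= np.quantile(d2, 1.0 - trim)]
    center = np.median(kept, axis=0)
    diffs = kept - center
    inv_cov = np.linalg.pinv(diffs.T @ diffs / len(kept))

    delta = latent - center
    d2 = np.einsum("ij,jk,ik->i", delta, inv_cov, delta)
    return np.sqrt(np.maximum(d2, 0.0))


# Synthetic, skewed, strongly correlated 2-D "latent codes" (illustrative only).
rng = np.random.default_rng(0)
z1 = rng.lognormal(mean=0.0, sigma=0.5, size=2000)
z2 = z1 + 0.1 * rng.normal(size=2000)
train = np.column_stack([z1, z2])

# Two test codes at the same Euclidean distance from the training median:
# one follows the correlation direction, the other violates it.
origin = np.median(train, axis=0)
test = origin + np.array([[2.0, 2.0], [2.0, -2.0]])

euclid = np.linalg.norm(test - origin, axis=1)
md = robust_mahalanobis(test, train)
print("Euclidean:", euclid)
print("Robust MD:", md)
```

In this toy setting both test codes are equally far from the median in Euclidean terms, yet the code that breaks the learned correlation receives a much larger robust MD, which is the kind of separation the abstract attributes to correlation-aware scoring.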
Acknowledgment
We would like to thank the Virginia Tech National Security Institute (VTNSI) for supporting our work and Dr. Lamine Mili (ECE, Virginia Tech) for the introductory course on Robust Statistics.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Roy, P., Singhal, H., O’Shea, T.J., Jin, M. (2024). Latent Space Correlation-Aware Autoencoder for Anomaly Detection in Skewed Data. In: Yang, DN., Xie, X., Tseng, V.S., Pei, J., Huang, JW., Lin, J.CW. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science, vol 14645. Springer, Singapore. https://doi.org/10.1007/978-981-97-2242-6_6
DOI: https://doi.org/10.1007/978-981-97-2242-6_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2241-9
Online ISBN: 978-981-97-2242-6
eBook Packages: Computer Science, Computer Science (R0)