Abstract
Accurately detecting anomalies in microservice systems is crucial to avoid system failures and economic losses for users. Existing approaches detect anomalies by extracting sequential information from single-modal data, such as metrics or logs. However, they do not explicitly model the temporal and spatial dependencies within each data type, and they ignore the correlation between different data types, which can lead to a significant number of false positives. In this paper, we propose a novel Microservice system Anomaly Detection method via Multi-modal data and Multi-feature extraction (MADMM), which jointly analyzes metrics and logs from both spatial and temporal perspectives. Specifically, we first construct separate feature graphs for metrics and logs in each time window. A graph convolution network is then used to capture the spatial correlation among different metrics, while a graph attention network is employed to analyze the contextual relationships among different log events. Next, we design a Cross-Modal Attention-based Gated Recurrent Unit (CMA-GRU) to capture the intricate temporal dependencies of each modality and to enable cross-modal interaction. Finally, we introduce multi-grained contrastive learning methods to learn robust cross-modal features from both inter- and intra-modality aspects. Experimental results on real-world datasets demonstrate that MADMM outperforms existing baseline methods and exhibits better robustness.
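To make the spatial step concrete, the following is a minimal sketch of one graph-convolution pass over a per-window metric graph. It is not the MADMM implementation: the abstract does not state how the feature graph is built, so the correlation-threshold rule in the hypothetical helper `build_metric_graph`, the `threshold` value, the layer sizes, and the synthetic data are all assumptions made for illustration. The propagation rule is the standard symmetric-normalized form of a graph convolution layer.

```python
import numpy as np

def build_metric_graph(window, threshold=0.5):
    """Build an (M, M) binary adjacency matrix from a (T, M) metric window.

    Assumption (not from the paper): two metrics are connected when the
    absolute Pearson correlation over the window exceeds `threshold`.
    """
    corr = np.corrcoef(window, rowvar=False)
    corr = np.nan_to_num(corr)  # constant metrics produce NaN correlations
    adj = (np.abs(corr) > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added inside the layer
    return adj

def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)                  # degree is >= 1 for every node
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weight, 0.0)

# Synthetic window: 20 time steps, 4 metrics; metric 1 closely tracks metric 0.
rng = np.random.default_rng(0)
window = rng.normal(size=(20, 4))
window[:, 1] = 2.0 * window[:, 0] + 0.01 * rng.normal(size=20)

adj = build_metric_graph(window)
node_features = window.T                     # one row per metric: shape (4, 20)
weight = rng.normal(size=(20, 8))            # untrained, illustrative weights
embeddings = gcn_layer(adj, node_features, weight)
print(embeddings.shape)
```

Each row of `embeddings` mixes a metric's own window with those of its correlated neighbors, which is the sense in which a graph convolution captures spatial correlation among metrics. In a trained model the weights would be learned, and the resulting per-window embeddings would feed the temporal stage.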
Data Availability
The data that support the findings of this study are available from the corresponding author, Xiuguo Zhang, upon reasonable request.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant No. 52231014) and the Liaoning Province Applied Basic Research Program Project (Grant No. 2023JH2/101300195).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, P., Zhang, X., Cao, Z. et al. MADMM: microservice system anomaly detection via multi-modal data and multi-feature extraction. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09918-1
DOI: https://doi.org/10.1007/s00521-024-09918-1