MADMM: microservice system anomaly detection via multi-modal data and multi-feature extraction

  • Original Article
  • Neural Computing and Applications

Abstract

Accurately detecting anomalies in microservice systems is crucial for avoiding system failures and economic losses for users. Existing approaches detect anomalies by extracting sequential information from single-modal data, such as metrics or logs. However, they do not explicitly analyze the temporal and spatial dependencies within each data type, and they ignore the correlation between different data types, which can lead to a large number of false positives. In this paper, we propose a novel Microservice system Anomaly Detection method via Multi-modal data and Multi-feature extraction (MADMM), which jointly analyzes metrics and logs from both spatial and temporal perspectives. Specifically, we first construct separate feature graphs for metrics and logs in each time window. A graph convolutional network is then used to capture the spatial correlation among different metrics, while a graph attention network is employed to analyze the contextual relationships among different log events. Next, we design a Cross-Modal Attention-based Gated Recurrent Unit (CMA-GRU) to capture the intricate temporal dependencies of each modality and to enable cross-modal interactions. Finally, we introduce multi-grained contrastive learning methods to learn robust cross-modal features from both inter- and intra-modality aspects. Experimental results on real-world datasets demonstrate that MADMM outperforms existing baseline methods and exhibits better robustness.
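To make the first stage of the pipeline more concrete, the following is a minimal sketch of building a metric feature graph for one time window and applying a single graph convolution step to it. The abstract does not state how the graphs are constructed, so the correlation-threshold rule, the window-mean node feature, and the names `pearson`, `build_metric_graph`, and `gcn_layer` are illustrative assumptions rather than the authors' implementation. The propagation rule is the standard symmetric-normalized form from Kipf and Welling's graph convolutional network.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def build_metric_graph(window, threshold=0.5):
    """Adjacency matrix over metrics for one time window.

    Assumed rule (not from the paper): connect two metrics when the
    absolute Pearson correlation of their series reaches `threshold`.
    `window` holds one list of samples per metric.
    """
    m = len(window)
    adj = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            if abs(pearson(window[i], window[j])) >= threshold:
                adj[i][j] = adj[j][i] = 1.0
    return adj

def gcn_layer(adj, feats, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    m = len(adj)
    # Add self-loops, then normalize symmetrically by node degree.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(m)]
             for i in range(m)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(m)]
            for i in range(m)]
    d_in, d_out = len(weight), len(weight[0])
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(m))
            for f in range(d_in)] for i in range(m)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(d_in)))
             for o in range(d_out)] for i in range(m)]

# Hypothetical window with three metrics and four samples each.
window = [
    [1.0, 2.0, 3.0, 4.0],  # e.g. CPU usage
    [2.0, 4.0, 6.0, 8.0],  # e.g. request rate, moves with CPU usage
    [5.0, 1.0, 4.0, 2.0],  # e.g. disk latency, weakly related
]
adj = build_metric_graph(window)
feats = [[sum(series) / len(series)] for series in window]  # window mean
weight = [[1.0]]  # identity weight, chosen only to keep the arithmetic visible
embeddings = gcn_layer(adj, feats, weight)
```

In this toy window only the first two metrics are linked, so their embeddings become the average of their two means while the third metric keeps its own value. A trained model would learn `weight` and would stack such layers before the temporal stage.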

Data Availability

The data that support the findings of this study are available from the corresponding author, Xiuguo Zhang, upon reasonable request.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant No. 52231014) and the Liaoning Province Applied Basic Research Program Project (Grant No. 2023JH2/101300195).

Author information

Corresponding author

Correspondence to Xiuguo Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, P., Zhang, X., Cao, Z. et al. MADMM: microservice system anomaly detection via multi-modal data and multi-feature extraction. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09918-1

  • DOI: https://doi.org/10.1007/s00521-024-09918-1
