On Anomaly Detection and Root Cause Analysis of Microservice Systems
In this demonstration, we design and implement a proof-of-concept prototype for causal graph building, anomaly detection, and root cause analysis of microservice systems. The system comprises two core functionalities: (i) monitoring of systems and services; (ii) application anomaly detection and root cause analysis. In the first part, the key health metrics of the system and its applications are collected by the backend and plotted as dynamic charts in the frontend, which helps operators assess the overall system status. In the second part, the system automatically builds a causal graph of the microservice application, indicating the dependencies between modules, without instrumenting any source code. When an anomaly is detected in a service instance, that instance is highlighted in the graph. A root cause inference function then analyzes the anomaly and returns a ranked list of root cause candidates to operators.
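The second part of the pipeline (detect anomalous services, walk the dependency graph, rank candidates) can be illustrated with a minimal Python sketch. This is not the algorithm used by the demonstrated system, which the abstract does not specify. It assumes a hand-written dependency graph rather than one built automatically, a simple z-score threshold for anomaly detection, and Pearson correlation with the entry service's latency as the ranking score. All service names, latency values, and function names are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(latencies, threshold=3.0):
    """Flag services whose latest sample is more than `threshold`
    standard deviations away from the mean of their earlier samples."""
    anomalous = set()
    for service, samples in latencies.items():
        history, latest = samples[:-1], samples[-1]
        mu, sigma = mean(history), pstdev(history)
        if sigma > 0 and abs(latest - mu) / sigma > threshold:
            anomalous.add(service)
    return anomalous


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def rank_root_causes(graph, latencies, entry, anomalous):
    """Follow dependency edges from `entry` through anomalous services.
    A candidate is an anomalous service with no anomalous dependency.
    Candidates are ranked by |correlation| with the entry service."""
    candidates, seen, queue = [], {entry}, deque([entry])
    while queue:
        node = queue.popleft()
        bad_children = [c for c in graph.get(node, []) if c in anomalous]
        if node in anomalous and not bad_children:
            candidates.append(node)
        for child in bad_children:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    scored = [(c, abs(pearson(latencies[c], latencies[entry]))) for c in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical dependency graph: caller -> list of callees.
graph = {
    "front-end": ["orders", "catalogue", "user"],
    "orders": ["orders-db"],
    "catalogue": [],
    "user": [],
    "orders-db": [],
}

# Hypothetical latency samples in milliseconds; the last value is the newest.
latencies = {
    "front-end": [100, 102, 98, 101, 99, 180],
    "orders": [40, 41, 39, 40, 42, 115],
    "orders-db": [10, 11, 9, 10, 10, 80],
    "catalogue": [20, 21, 19, 20, 22, 21],
    "user": [30, 31, 29, 30, 30, 36],
}

anomalous = detect_anomalies(latencies)
ranked = rank_root_causes(graph, latencies, "front-end", anomalous)
for service, score in ranked:
    print(f"{service}: {score:.3f}")
```

On this toy data, `catalogue` is not flagged, `orders` is flagged but excluded as a candidate because its own dependency `orders-db` is also anomalous, and `orders-db` ranks above `user`. A production system would replace the z-score test and the correlation score with whatever detector and similarity measure suit its metrics.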
Keywords: Microservice, Root cause analysis, Monitoring system, Kubernetes
The work described in this paper was supported by the National Key R&D Program of China (2018YFB1004804), the National Natural Science Foundation of China (61802448) and the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme 2016.