Intelligent Detection of Large-Scale KPI Streams Anomaly Based on Transfer Learning

  • XiaoYan Duan
  • NingJiang Chen (corresponding author)
  • YongSheng Xie
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1120)

Abstract

In the complex and variable SDDC (Software Defined Data Center) environment, keeping applications and services undisturbed requires close monitoring of the various KPI (Key Performance Indicator, such as CPU utilization, number of online users, and request response delay) streams of resources and services. However, the frequent launch and update of applications produces new KPIs or changes the data characteristics of monitored KPIs, so the original anomaly detection model may become unusable. We therefore propose ADT-SHL (Anomaly Detection through Transferring the Shared-hidden-layers model). It first clusters historical KPIs by similarity, and then uses the Shared-hidden-layers method to train a VAE (Variational Auto-Encoder) anomaly detection model on all KPIs in each cluster, so that the model can reconstruct KPIs with the corresponding characteristics. When a new KPI is generated, ADT-SHL classifies it into the most similar cluster, and the model of that cluster is then transferred and fine-tuned to detect anomalies in the new KPI. This process is rapid and accurate, and requires no manual tuning or labeling for the new KPI. Its F-scores range from 0.69 to 0.959 for the studied KPIs from two cloud vendors and other sources. ADT-SHL is only 3.05% lower than a state-of-the-art supervised method that uses all labels, and it outperforms a state-of-the-art unsupervised method by 67.46%. Compared with both, ADT-SHL reduces model training time by 94% on average.
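
The step of classifying a new KPI into a similar cluster can be illustrated with a minimal sketch. The abstract does not state which similarity measure ADT-SHL uses, so this example assumes z-normalization followed by Euclidean distance to one centroid per cluster. The function names, the centroids, and the sample data are all hypothetical, and the later transfer and fine-tuning of the cluster's VAE model is not shown.

```python
import math
from statistics import mean, pstdev


def z_normalize(series):
    """Scale a KPI series to zero mean and unit variance so shapes are comparable."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series has no shape; map it to all zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_cluster(new_kpi, centroids):
    """Return the name of the centroid closest to the normalized new KPI."""
    normalized = z_normalize(new_kpi)
    return min(centroids, key=lambda name: euclidean(normalized, centroids[name]))


# Hypothetical centroids, one per cluster of historical KPIs.
centroids = {
    "rising": z_normalize([1, 2, 3, 4, 5, 6]),
    "periodic": z_normalize([0, 1, 0, 1, 0, 1]),
}

# A new KPI on a different scale but with a rising shape.
new_kpi = [100, 210, 290, 405, 500, 610]
chosen = assign_cluster(new_kpi, centroids)
print(chosen)
```

Normalizing before comparison means a new KPI is matched by shape rather than by absolute magnitude, which is one plausible way to group KPIs that would share a reconstruction model.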

Keywords

KPI anomaly detection, Transfer learning, Shared-hidden-layers, KPI similarity, Data center

Notes

Acknowledgment

This work was supported by the National Natural Science Foundation of China (61762008), the Natural Science Foundation Project of Guangxi (2017GXNSFAA198141), and the Key R&D Project of Guangxi (No. Guike AB17195014).

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Guangxi University, Nanning, China
