Leveraging AI in Service Automation Modeling: From Classical AI Through Deep Learning to Combination Models

  • Qing Wang
  • Larisa Shwartz
  • Genady Ya. Grabarnik
  • Michael Nidd
  • Jinho Hwang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11895)


With the advent of cloud computing, new generations of digital services are being conceived to respond to the ever-growing demands and expectations of the marketplace. In parallel, automation is becoming an essential enabler for the successful management of these services. Given the importance placed on digital services, their automated management, and automated incident resolution in particular, becomes a key issue for both providers and users. The challenge facing automation providers lies in the variability and frequently changing nature of the monitoring tickets that provide the primary input to automation. Although the correct automation at the correct time is valuable, triggering an incorrect automation may damage the smooth operation of the business. In this paper, we discuss AI modeling for automation recommendations. We describe a wide range of experiments that allowed us to identify a method that is optimal with respect to the accuracy and speed acceptable to service providers.


Classical and deep learning models; Combination models; Multiclass text classification; AI for service automation



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. IBM T.J. Watson Research Center, Yorktown Heights, USA
  2. Department of Math and Computer Science, St. John’s University, Queens, USA
  3. IBM Research–Zurich, Rueschlikon, Switzerland
