Online Detection of Operator Errors in Cloud Computing Using Anti-patterns

  • Arthur Vetter
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 340)

Abstract

IT services are subject to various maintenance operations such as upgrades, reconfigurations, and redeployments. Monitoring these changes is crucial for detecting operator errors, which are a major source of service failures. A further challenge that exacerbates operator errors is the increasing frequency of changes, e.g. due to continuous deployment as commonly practised in cloud computing. In this paper, we propose a monitoring approach that detects operator errors online, in real time, using complex event processing and anti-patterns. The approach is based on a novel business process modelling method that combines TOSCA and Petri nets. The resulting model is used to derive pattern instances, which serve as input to a complex event processing engine that checks them against the events generated by the monitored applications.
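As a rough illustration of checking an anti-pattern instance against a stream of application events, consider the minimal Python sketch below. It is not the method of the paper: the event shape, the `MissingPredecessor` anti-pattern, and the activity names are all hypothetical, and a plain generator stands in for a real complex event processing engine.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Set


@dataclass(frozen=True)
class Event:
    """One event emitted by a monitored application (hypothetical shape)."""
    instance_id: str  # maintenance operation the event belongs to
    activity: str     # name of the executed activity


@dataclass(frozen=True)
class MissingPredecessor:
    """Hypothetical anti-pattern: `activity` occurs before `required_before`."""
    required_before: str
    activity: str


def detect(events: Iterable[Event], pattern: MissingPredecessor) -> Iterator[Event]:
    """Yield each event that matches the anti-pattern as soon as it arrives."""
    seen: Dict[str, Set[str]] = {}  # activities observed so far, per instance
    for event in events:
        done = seen.setdefault(event.instance_id, set())
        if event.activity == pattern.activity and pattern.required_before not in done:
            yield event  # the required predecessor has not been observed yet
        done.add(event.activity)


# Assumed example stream: "op-2" upgrades a package without stopping the service first.
stream = [
    Event("op-1", "stop_service"),
    Event("op-1", "upgrade_package"),
    Event("op-2", "upgrade_package"),
    Event("op-2", "stop_service"),
]
pattern = MissingPredecessor(required_before="stop_service", activity="upgrade_package")
violations = list(detect(stream, pattern))
```

Because `detect` consumes events one at a time and keeps only per-instance state, a violation is reported at the moment the offending event arrives, which is the sense of "online" used here.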

Keywords

Complex event processing, Anti-pattern, TOSCA, IT service management, Anomaly detection

Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  1. Horus software GmbH, Ettlingen, Germany
