Case-Based Reasoning for Autonomous Service Failure Diagnosis and Remediation in Software Systems

  • Stefania Montani
  • Cosimo Anglano
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4106)


Self-healing, one of the four key properties characterizing Autonomic Systems, aims to enable large-scale software systems that deliver complex services on a 24/7 basis to meet their goals without any human intervention. Achieving self-healing requires eliciting and maintaining domain knowledge in the form of 〈service failure diagnosis, remediation strategy〉 patterns, a task that can be overwhelming. Case-Based Reasoning (CBR) is a lazy learning paradigm that largely reduces this kind of knowledge acquisition bottleneck. Moreover, CBR appears well suited to failure diagnosis and remediation in software systems, since in this domain most errors are recurrences of known problems. In this paper, we describe a CBR approach for providing large-scale, distributed software systems with self-healing capabilities, and demonstrate the practical applicability of our methodology by means of experimental results on a real-world application.


Case Base Reasoning · Service Failure · Remediation Strategy · Autonomic Computing · Autonomic Manager
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Stefania Montani (1)
  • Cosimo Anglano (1)
  1. Dipartimento di Informatica, Università del Piemonte Orientale, Alessandria, Italy
