What Evaluation Criteria Are Right for CCBR? Considering Rank Quality

  • Steven Bogaerts
  • David Leake
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4106)


Evaluation criteria for conversational CBR (CCBR) systems are important to guide development and tuning of new methods, and to enable practitioners to make informed decisions about which methods to use. Traditional criteria for evaluating CCBR performance by precision and efficiency provide useful information, but are limited by their focus on the single point at which a case is selected at the end of the system dialogue, and by their dependence on a model of the user’s case selection criteria. This paper begins by revisiting issues in the evaluation of CCBR systems, arguing for the value of assessing the quality of the intermediate dialogue before case selection. It then proposes an evaluation approach based on rank quality to provide a fuller picture of system performance, and illustrates with an empirical study the use of rank quality to illuminate characteristics of similarity assessment strategies for partially-specified cases.


User Model Case Selection Candidate List Target Problem Candidate Case 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Watson, I.: Applying Case-Based Reasoning: Techniques for Enterprise Systems. Morgan Kaufmann, San Mateo (1997)MATHGoogle Scholar
  2. 2.
    Aha, D., Breslow, L.: Refining conversational case libraries. In: Leake, D.B., Plaza, E. (eds.) ICCBR 1997. LNCS, vol. 1266, pp. 267–278. Springer, Heidelberg (1997)CrossRefGoogle Scholar
  3. 3.
    Bogaerts, S., Leake, D.: Facilitating CBR for incompletely-described cases: Distance metrics for partial problem descriptions. In: Funk, P., González Calero, P.A. (eds.) ECCBR 2004. LNCS (LNAI), vol. 3155, pp. 62–76. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  4. 4.
    Newman, D., Hettich, S., Blake, C., Merz, C.: UCI repository of machine learning databases (1998)Google Scholar
  5. 5.
    McSherry, D.: Minimizing dialog length in interactive case-based reasoning. In: Proceedings of the seventeenth International Joint Conference on Artificial Intelligence (IJCAI 2001), pp. 993–998. Morgan Kaufmann, San Francisco (2001)Google Scholar
  6. 6.
    Buchanan, B., Shortliffe, E.: Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Addison-Wesley, Reading (1984)Google Scholar
  7. 7.
    Cheetham, W., Price, J.: Measures of solution accuracy in case-based reasoning systems. In: Funk, P., González Calero, P.A. (eds.) ECCBR 2004. LNCS (LNAI), vol. 3155, pp. 106–118. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  8. 8.
    Miller, R., Pople, H., Meyers, J.: Internist-i, an experimental computer-based diagnostic consultant for general internal medicine. New England Journal of Medicine 307(8), 468–476 (1982)CrossRefGoogle Scholar
  9. 9.
    McSherry, D.: Incremental relaxation of unsuccessful queries. In: Funk, P., González Calero, P.A. (eds.) ECCBR 2004. LNCS (LNAI), vol. 3155, pp. 331–345. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  10. 10.
    McCarthy, K., Reilly, J., McGinty, L., Smyth, B.: Experiments in dynamic critiquing. In: IUI 2005: Proceedings of the 10th international conference on Intelligent user interfaces, pp. 175–182. ACM Press, New York (2005)CrossRefGoogle Scholar
  11. 11.
    Bogaerts, S., Leake, D.: IUCBRF: A framework for rapid and modular CBR system development. Technical Report TR 617, Computer Science Department, Indiana University, Bloomington, IN (2005)Google Scholar
  12. 12.
    McSherry, D.: Precision and recall in interactive case-based reasoning. In: Aha, D.W., Watson, I. (eds.) ICCBR 2001. LNCS (LNAI), vol. 2080, pp. 392–406. Springer, Heidelberg (2001)CrossRefGoogle Scholar
  13. 13.
    Gupta, K.M., Aha, D.W., Sandhu, N.: Exploiting taxonomic and causal relations in conversational case retrieval. In: Craw, S., Preece, A.D. (eds.) ECCBR 2002. LNCS (LNAI), vol. 2416, pp. 133–147. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  14. 14.
    Kohlmaier, A., Schmitt, S., Bergmann, R.: Evaluation of a similarity-based approach to customer-adaptive elect ronic sales dialogs (2001)Google Scholar
  15. 15.
    Voorhees, E.M.: Evaluation by highly relevant documents. In: SIGIR 2001: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 74–82. ACM Press, New York (2001)CrossRefGoogle Scholar
  16. 16.
    Boyan, J., Freitag, D., Joachims, T.: A machine learning architecture for optimizing web search engines. In: Proceedings of the AAAI Workshop on Internet-Based Information Systems, Portland, Oregon (1996)Google Scholar
  17. 17.
    Borlund, P., Ingwersen, P.: Measures of relative relevance and ranked half-life: performance indicators for interactive ir. In: SIGIR 1998: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 324–331. ACM Press, New York (1998)CrossRefGoogle Scholar
  18. 18.
    Cooper, W.S.: Expected search length: A single measure of retrieval effectiveness based on the weak ordering action of retrieval systems. American Documentation 19(1), 30–42 (1968)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Steven Bogaerts
    • 1
  • David Leake
    • 1
  1. 1.Computer Science DepartmentIndiana UniversityBloomingtonU.S.A.

Personalised recommendations