What Evaluation Criteria Are Right for CCBR? Considering Rank Quality
Evaluation criteria for conversational CBR (CCBR) systems are important to guide development and tuning of new methods, and to enable practitioners to make informed decisions about which methods to use. Traditional criteria for evaluating CCBR performance by precision and efficiency provide useful information, but are limited by their focus on the single point at which a case is selected at the end of the system dialogue, and by their dependence on a model of the user’s case selection criteria. This paper begins by revisiting issues in the evaluation of CCBR systems, arguing for the value of assessing the quality of the intermediate dialogue before case selection. It then proposes an evaluation approach based on rank quality to provide a fuller picture of system performance, and illustrates with an empirical study the use of rank quality to illuminate characteristics of similarity assessment strategies for partially-specified cases.
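The abstract does not specify the exact rank-quality measure, but the idea of scoring the intermediate dialogue can be sketched as follows. This is a hypothetical illustration, not the paper's actual metric: it assumes each dialogue trace records the target case's rank in the candidate list after each question, and averages a normalized score over the whole dialogue rather than checking only the final selection point.

```python
def dialogue_rank_quality(rank_history, case_base_size):
    """Average normalized rank quality over one dialogue (illustrative sketch).

    rank_history: 1-based rank of the target case in the candidate list
    after each question asked. Each rank r maps to (N - r) / (N - 1),
    so rank 1 scores 1.0 and rank N scores 0.0; averaging rewards
    systems that keep the target near the top throughout the dialogue,
    not just at the moment of case selection.
    """
    n = case_base_size
    scores = [(n - r) / (n - 1) for r in rank_history]
    return sum(scores) / len(scores)

# Example: the target case's rank after each of four questions,
# in a case base of 20 cases.
history = [10, 5, 2, 1]
print(round(dialogue_rank_quality(history, 20), 3))  # → 0.816
```

A measure of this shape distinguishes two systems that both end with the target ranked first but differ in how quickly they surface it during the dialogue, which point-of-selection precision alone cannot capture.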
Keywords: User Model · Case Selection · Candidate List · Target Problem · Candidate Case