Information-based evaluation criterion for classifier's performance
In the past few years, many systems for learning decision rules from examples have been developed. Because different systems allow different types of answers when classifying new instances, it is difficult to evaluate a system's classification power fairly, whether against other classification systems or against human experts. Classification accuracy is usually used as a measure of classification performance. This measure is, however, known to have several defects. A fair evaluation criterion should exclude the influence of the class probabilities, which may enable a completely uninformed classifier to trivially achieve high classification accuracy. In this paper, a method for evaluating the information score of a classifier's answers is proposed. It excludes the influence of prior probabilities, deals with various types of imperfect or probabilistic answers, and can also be used for comparing performance across different domains.
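The contrast between accuracy and an information-based score can be illustrated with a small sketch. The abstract does not state the formula, so the per-answer score below is an assumption: it follows the form commonly associated with this method, in which an answer earns log2 P'(C) - log2 P(C) bits when the classifier's probability P'(C) for the true class C is at least the prior P(C), and is otherwise penalized by log2(1 - P(C)) - log2(1 - P'(C)) bits. The function names, the class labels, and the toy data are hypothetical.

```python
import math


def information_score(prior, posterior):
    """Information (in bits) earned by one answer about the TRUE class.

    Assumed form, not quoted from the abstract. Requires 0 < prior < 1.
    """
    if posterior >= prior:
        # The answer moved probability toward the true class: useful information.
        return -math.log2(prior) + math.log2(posterior)
    # The answer moved probability away from the true class: misleading information.
    return -(-math.log2(1 - prior) + math.log2(1 - posterior))


def average_information_score(priors, true_labels, predicted_dists):
    """Mean per-answer information score over a test set."""
    total = 0.0
    for label, dist in zip(true_labels, predicted_dists):
        total += information_score(priors[label], dist[label])
    return total / len(true_labels)


def accuracy(true_labels, predicted_dists):
    """Fraction of instances whose most probable predicted class is correct."""
    hits = sum(1 for label, dist in zip(true_labels, predicted_dists)
               if max(dist, key=dist.get) == label)
    return hits / len(true_labels)


# Hypothetical two-class domain with a skewed prior.
priors = {"healthy": 0.9, "sick": 0.1}
labels = ["healthy"] * 9 + ["sick"]

# Uninformed classifier: always answers with the prior distribution.
uninformed = [dict(priors) for _ in labels]

# Perfect classifier: puts all probability on the true class.
perfect = [{c: (1.0 if c == y else 0.0) for c in priors} for y in labels]

for name, answers in [("uninformed", uninformed), ("perfect", perfect)]:
    print(name,
          "accuracy:", accuracy(labels, answers),
          "bits/answer:", average_information_score(priors, labels, answers))
```

In this toy domain the uninformed classifier reaches 90% accuracy yet earns 0 bits per answer, because its answers never differ from the prior. The perfect classifier earns, on average, the entropy of the class distribution, which is what makes scores comparable across domains with different priors.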
Volume 6, Issue 1, pp. 67-80
- Kluwer Academic Publishers
- evaluation criteria
- machine learning
- information theory