In the past few years, many systems for learning decision rules from examples have been developed. Because different systems allow different types of answers when classifying new instances, it is difficult to evaluate a system's classification power fairly against other classification systems or against human experts. Classification accuracy is the usual measure of classification performance, but it is known to have several defects. A fair evaluation criterion should exclude the influence of the class prior probabilities, which may enable a completely uninformed classifier to achieve high classification accuracy trivially. This paper proposes a method for evaluating the information score of a classifier's answers. The method excludes the influence of prior probabilities, handles various types of imperfect or probabilistic answers, and can also be used to compare performance across different domains.
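The abstract does not give the formula, so the following is only a minimal sketch under the commonly cited form of the information score: an answer earns -log2 P(C) + log2 P'(C) bits when the classifier's probability P'(C) for the correct class C is at least the prior P(C), and a negative amount, log2(1 - P(C)) - log2(1 - P'(C)), otherwise. The function names, the two-class "healthy"/"sick" domain and the 90/10 class split are hypothetical, chosen to show how an uninformed classifier can look accurate yet convey no information.

```python
import math

def information_score(prior, posterior):
    """Bits credited to one answer, given the prior P(C) and the classifier's
    probability P'(C) for the correct class C. Assumes 0 < prior < 1."""
    if posterior >= prior:
        # Answer moved toward the correct class: information gained.
        return -math.log2(prior) + math.log2(posterior)
    # Answer moved away from the correct class: misleading information.
    return -(-math.log2(1 - prior) + math.log2(1 - posterior))

def average_information_score(priors, answers, truth):
    """Mean information score over a test set (answers are distributions)."""
    total = sum(information_score(priors[c], dist[c])
                for dist, c in zip(answers, truth))
    return total / len(truth)

def accuracy(answers, truth):
    """Fraction of cases where the most probable class is the correct one."""
    hits = sum(max(dist, key=dist.get) == c for dist, c in zip(answers, truth))
    return hits / len(truth)

# Hypothetical domain: 90% "healthy", 10% "sick".
priors = {"healthy": 0.9, "sick": 0.1}
truth = ["healthy"] * 9 + ["sick"]

# Uninformed classifier: always answers with the prior distribution.
uninformed = [dict(priors) for _ in truth]
# Perfect classifier: probability 1 for the correct class.
perfect = [{k: (1.0 if k == c else 0.0) for k in priors} for c in truth]

uninformed_accuracy = accuracy(uninformed, truth)                        # 0.9
uninformed_score = average_information_score(priors, uninformed, truth)  # 0.0
perfect_score = average_information_score(priors, perfect, truth)

print(uninformed_accuracy, uninformed_score, perfect_score)
```

In this sketch the uninformed classifier reaches 90% accuracy but an average information score of zero, while the perfect classifier's average score equals the entropy of the class distribution for this test set.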
Kononenko, I., Bratko, I. Information-Based Evaluation Criterion for Classifier's Performance. Machine Learning 6, 67–80 (1991). https://doi.org/10.1023/A:1022642017308
- evaluation criteria
- machine learning
- information theory