Machine Learning, Volume 6, Issue 1, pp 67–80

Information-Based Evaluation Criterion for Classifier's Performance

  • Igor Kononenko
  • Ivan Bratko


In the past few years many systems for learning decision rules from examples have been developed. Because different systems allow different types of answers when classifying new instances, it is difficult to evaluate a system's classification power fairly against other classification systems or against human experts. Classification accuracy is the usual measure of classification performance, but it is known to have several defects. A fair evaluation criterion should exclude the influence of the class prior probabilities, which may enable a completely uninformed classifier to achieve high classification accuracy trivially. This paper proposes a method for evaluating the information score of a classifier's answers. The score excludes the influence of prior probabilities, handles various types of imperfect or probabilistic answers, and can also be used to compare performance across different domains.
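The point about uninformed classifiers can be illustrated with a small sketch. The abstract does not state the scoring formula, so the `information_score` function here is an assumption: it follows the form in which the score is usually quoted (bits gained on the true class when the answer raises its probability above the prior, bits lost when it lowers it). The two-class data, the class names, and the `evaluate` helper are hypothetical.

```python
import math


def information_score(prior, posterior):
    """Bits gained (positive) or lost (negative) on the true class.

    Assumed form of the score, not quoted from the abstract.
    Requires 0 < prior < 1 and 0 <= posterior <= 1.
    """
    if posterior >= prior:
        # The answer moved probability toward the true class.
        return math.log2(posterior) - math.log2(prior)
    # The answer moved probability away from the true class.
    return math.log2(1.0 - prior) - math.log2(1.0 - posterior)


def evaluate(true_labels, priors, answers):
    """Return (accuracy, average information score in bits per instance)."""
    correct = 0
    total_bits = 0.0
    for label, answer in zip(true_labels, answers):
        predicted = max(answer, key=answer.get)  # most probable class
        correct += predicted == label
        total_bits += information_score(priors[label], answer[label])
    n = len(true_labels)
    return correct / n, total_bits / n


# Hypothetical domain: 90% of instances are "healthy", 10% are "sick".
labels = ["healthy"] * 90 + ["sick"] * 10
priors = {"healthy": 0.9, "sick": 0.1}

# An uninformed classifier that always answers with the prior distribution.
uninformed = [dict(priors) for _ in labels]

accuracy, avg_bits = evaluate(labels, priors, uninformed)
print(f"accuracy = {accuracy:.2f}, information score = {avg_bits:.3f} bits")
```

Under these assumptions the uninformed classifier reaches 90% accuracy purely from the class distribution, while its average information score is zero because its answers never depart from the priors.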

Classifier, evaluation criteria, machine learning, information theory



Copyright information

© Kluwer Academic Publishers 1991

Authors and Affiliations

  • Igor Kononenko (1)
  • Ivan Bratko (2)
  1. Faculty of Electrical and Computer Engineering, Ljubljana, Yugoslavia
  2. Faculty of Electrical and Computer Engineering, Jozef Stefan Institute, Ljubljana, Yugoslavia
