Conditions for Occam's razor applicability and noise elimination

  • Dragan Gamberger
  • Nada Lavrač
Part II: Regular Papers
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1224)


The Occam's razor principle suggests that among all hypotheses consistent with the training data, the simplest one best captures the structure of the problem domain and yields the highest prediction accuracy on new instances. The principle is also implicitly applied when dealing with noise: rule truncation and decision-tree pruning both trade consistency for simplicity in order to avoid overfitting a noisy training set. This work gives a theoretical framework for the applicability of Occam's razor and develops it into a procedure for eliminating noise from a training set. The results of an empirical evaluation show the usefulness of the presented approach to noise elimination.
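The core intuition can be sketched in code. The snippet below is not the paper's actual procedure (which rests on a compression-based complexity measure); it is a minimal illustration, under simplifying assumptions, of the idea that a noisy example is one whose removal makes the simplest consistent hypothesis strictly simpler. Here the "hypothesis" for one-dimensional data is a set of decision thresholds, so complexity is just the number of class changes in sorted order; the names `complexity` and `eliminate_noise` are hypothetical, chosen for this sketch.

```python
def complexity(examples):
    """Complexity of the simplest consistent hypothesis for 1-D data:
    the number of decision boundaries (class changes in sorted order)."""
    xs = sorted(examples)
    return sum(1 for a, b in zip(xs, xs[1:]) if a[1] != b[1])

def eliminate_noise(examples, min_gain=2):
    """Greedily remove the example whose removal most reduces
    hypothesis complexity, while the reduction is at least min_gain."""
    data, removed = list(examples), []
    while True:
        base = complexity(data)
        best_i, best_gain = None, 0
        for i in range(len(data)):
            gain = base - complexity(data[:i] + data[i + 1:])
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is None or best_gain < min_gain:
            return data, removed
        removed.append(data.pop(best_i))

# Toy training set: class 0 below x=5, class 1 above, with one
# mislabeled example (2, 1) injected as noise.
train = [(1, 0), (2, 1), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1)]
cleaned, noisy = eliminate_noise(train)
print("flagged as noisy:", noisy)  # flags (2, 1), the mislabeled point
```

Removing the mislabeled point drops the boundary count from 3 to 1, while removing any correct point yields little or no reduction, so only the noisy example is eliminated; this mirrors, in miniature, why a complexity-reduction criterion can separate noise from genuine structure.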


Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Dragan Gamberger, Rudjer Bošković Institute, Zagreb, Croatia
  • Nada Lavrač, Jožef Stefan Institute, Ljubljana, Slovenia
