The more we learn the less we know? On inductive learning from examples

  • Piotr Ejdys
  • Grzegorz Góra
Communications 3A: Learning and Knowledge Discovery
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1609)


We consider the average classification error rate as a function of the number of training examples. We investigate upper and lower bounds on this error for a class of commonly used algorithms based on inductive learning from examples. We arrive at the astonishing conclusion that, contrary to what one might expect, the error rate of some algorithms does not decrease monotonically with the number of training examples; rather, it initially increases up to a certain point and only then starts to decrease. Furthermore, the classification quality of some algorithms is as poor as that of a naive algorithm. We show that for simple monomials, even with an exponentially large training set, the classification quality of some methods is no better than with just one or a few training examples.
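To make the quantity studied above concrete, the following minimal sketch estimates the average error rate as a function of the number of training examples for a monomial target. Everything in it is an assumption made for illustration and is not taken from the paper: the uniform distribution over Boolean inputs, the Find-S style learner (`find_s`), and the parameters `k`, `m`, `trials`, and `seed`. It does not implement the algorithms analysed by the authors and is not meant to reproduce their bounds.

```python
import itertools
import random


def target(x, m):
    """Hypothetical target concept: the monomial x[0] AND ... AND x[m-1]."""
    return all(x[:m])


def find_s(train):
    """Most specific conjunction of positive literals covering all positives.

    Returns the set of attribute indices kept in the conjunction, or None
    when no positive example was seen (the learner then always answers False).
    """
    positives = [x for x, label in train if label]
    if not positives:
        return None
    k = len(positives[0])
    return {i for i in range(k) if all(x[i] for x in positives)}


def predict(hypothesis, x):
    """Classify x with the learned conjunction (None means 'always False')."""
    if hypothesis is None:
        return False
    return all(x[i] for i in hypothesis)


def true_error(hypothesis, k, m):
    """Exact error under the uniform distribution, enumerating all 2**k inputs."""
    points = list(itertools.product((0, 1), repeat=k))
    wrong = sum(predict(hypothesis, x) != target(x, m) for x in points)
    return wrong / len(points)


def average_error(n, k=6, m=2, trials=200, seed=0):
    """Average true error over `trials` random training sets of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        train = []
        for _ in range(n):
            x = tuple(rng.randint(0, 1) for _ in range(k))
            train.append((x, target(x, m)))
        total += true_error(find_s(train), k, m)
    return total / trials


if __name__ == "__main__":
    # Print the estimated learning curve for a few training-set sizes.
    for n in (1, 2, 4, 8, 16, 32, 64):
        print(n, round(average_error(n), 4))
```

Plotting `average_error(n)` against `n` gives an empirical learning curve for this particular learner; substituting another learner for `find_s` allows the same measurement for a different method.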





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Piotr Ejdys (1)
  • Grzegorz Góra (1)
  1. Institute of Mathematics, Warsaw University, Warsaw, Poland
