Algebraic specification of empirical inductive learning methods based on rough sets and matroid theory

  • Shusaku Tsumoto
  • Hiroshi Tanaka
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 958)


Several inductive learning methods, such as the ID3 family and the AQ family, have been proposed for acquiring knowledge from databases. These methods have been applied to discover meaningful knowledge in large databases, and their usefulness has been demonstrated. However, because no formal approach to treating these methods has been proposed, their efficiency has been compared only empirically. In this paper, we introduce matroid theory and rough sets to construct a common framework for empirical machine learning methods that induce combinations of attribute-value pairs from databases. Combining the concepts of rough sets and matroid theory yields a framework in which the differences and similarities between these methods can be understood clearly. We compare three classical methods: AQ, Pawlak's Consistent Rules, and ID3. The results show that the algebraic structure of the former two differs from that of the latter, and that this difference accounts for the differences between AQ and ID3.
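As a rough illustration of the rough-set side of this framework, the sketch below partitions a table into equivalence classes induced by attribute-value pairs and computes the standard lower and upper approximations of a target class. It is not the authors' formalization: the toy table, the attribute names (`headache`, `nausea`, `flu`), and the helper names `equivalence_classes` and `approximations` are all hypothetical, and the matroid-theoretic part of the paper is not represented.

```python
from collections import defaultdict


def equivalence_classes(rows, attributes):
    """Partition row indices by their values on the given attributes."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attributes)
        classes[key].add(i)
    return dict(classes)


def approximations(rows, attributes, target):
    """Return (lower, upper) rough-set approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(rows, attributes).values():
        if block <= target:   # block lies entirely inside the target class
            lower |= block
        if block & target:    # block overlaps the target class
            upper |= block
    return lower, upper


# Hypothetical toy table, invented for illustration only.
rows = [
    {"headache": "yes", "nausea": "no",  "flu": "yes"},
    {"headache": "yes", "nausea": "yes", "flu": "yes"},
    {"headache": "no",  "nausea": "yes", "flu": "no"},
    {"headache": "yes", "nausea": "no",  "flu": "no"},
]

target = {i for i, r in enumerate(rows) if r["flu"] == "yes"}
lower, upper = approximations(rows, ["headache", "nausea"], target)
print("lower:", sorted(lower))
print("upper:", sorted(upper))
```

In this toy table, rows 0 and 3 share the same attribute values but differ in class, so only the block containing row 1 supports a consistent rule, while the upper approximation also takes in the inconsistent block.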


Keywords (machine-generated, not supplied by the authors): Equivalence Relation; Training Sample; Greedy Algorithm; Pruning Method; Matroid Theory





Copyright information

© Springer-Verlag Berlin Heidelberg 1995

Authors and Affiliations

  • Shusaku Tsumoto (1)
  • Hiroshi Tanaka (1)
  1. Department of Information Medicine, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
