An empirical study on the incompetence of attribute selection criteria

  • Ibrahim F. Imam
Communications Session 5B: Learning and Discovery Systems
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1079)


One of the main tasks in most supervised learning systems is evaluating the relevancy of the attributes in a given database. This relevancy concerns the relationship between the available attributes and the decision classes. Attributes relevant to the decision classes are used to represent the learned knowledge, while irrelevant attributes are removed or ignored during the learning process. This paper investigates the relationship between an attribute's relevancy to the decision classes and its relevancy to the learning system. Experimental results show that some attributes that are relevant to the decision classes may be irrelevant to the learning system. The experiments were performed on eight different databases using the C4.5 system for learning decision trees from examples.
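The distinction drawn in the abstract can be illustrated with a minimal sketch. It is not the paper's experimental setup and it is not C4.5 (which uses gain ratio and pruning); it assumes plain information gain and a greedy ID3-style tree. The dataset, the attribute names `a` and `b`, and the helper `attributes_used` are hypothetical, chosen so that `b` is correlated with the class yet never tested by the learned tree.

```python
from collections import Counter
from math import log2


def entropy(rows, target):
    """Shannon entropy (in bits) of the class labels in rows."""
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return -sum(c / n * log2(c / n) for c in counts.values())


def info_gain(rows, attr, target):
    """Reduction in class entropy obtained by splitting rows on attr."""
    n = len(rows)
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / n * entropy(subset, target)
    return entropy(rows, target) - remainder


def attributes_used(rows, attrs, target):
    """Grow a greedy ID3-style tree; return the set of attributes it tests."""
    if len({r[target] for r in rows}) <= 1 or not attrs:
        return set()  # pure node, or nothing left to split on
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    if info_gain(rows, best, target) <= 0:
        return set()  # no attribute reduces entropy here
    used = {best}
    rest = [a for a in attrs if a != best]
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        used |= attributes_used(subset, rest, target)
    return used


# Hypothetical data: the class equals a; b agrees with a in 6 of 8 rows.
ROWS = [
    {"a": 0, "b": 0, "cls": 0}, {"a": 0, "b": 0, "cls": 0},
    {"a": 0, "b": 0, "cls": 0}, {"a": 0, "b": 1, "cls": 0},
    {"a": 1, "b": 1, "cls": 1}, {"a": 1, "b": 1, "cls": 1},
    {"a": 1, "b": 1, "cls": 1}, {"a": 1, "b": 0, "cls": 1},
]
ATTRS = ["a", "b"]

gains = {a: info_gain(ROWS, a, "cls") for a in ATTRS}
used = attributes_used(ROWS, ATTRS, "cls")
print("information gain per attribute:", gains)
print("attributes tested by the tree:", used)
```

In this toy case `b` has positive information gain, so it is relevant to the decision classes, but the tree is complete after one split on `a` and never tests `b`. That is one way an attribute can be relevant to the classes while irrelevant to the learner.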

Key words

relevancy, machine learning, decision trees





Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Ibrahim F. Imam
    Machine Learning and Inference Laboratory, George Mason University, Fairfax
