Applied Intelligence, Volume 7, Issue 1, pp 39–55

Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF

  • Igor Kononenko
  • Edvard Šimec
  • Marko Robnik-Šikonja

Abstract

Current inductive machine learning algorithms typically use greedy search with limited lookahead. This prevents them from detecting significant conditional dependencies among the attributes that describe training objects. Instead of myopic impurity functions and lookahead, we propose to use RELIEFF, an extension of RELIEF developed by Kira and Rendell [10, 11], for heuristic guidance of inductive learning algorithms. We have reimplemented Assistant, a system for top-down induction of decision trees, using RELIEFF to estimate the quality of attributes at each selection step. The algorithm is tested on several artificial and several real-world problems, and the results are compared with those of other well-known machine learning algorithms. Excellent results on the artificial data sets and on two real-world problems show the advantage of the presented approach to inductive learning.

Keywords: learning from examples, estimating attributes, impurity function, RELIEFF, empirical evaluation
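
The full RELIEFF algorithm is given in the paper itself; as a rough illustration of the idea the abstract describes, the sketch below estimates per-attribute quality weights from the k nearest hits and the prior-weighted k nearest misses of randomly sampled training instances. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name relieff, the Manhattan distance, the [0, 1] attribute scaling, and all parameter defaults are illustrative choices.

    import numpy as np

    def relieff(X, y, n_iter=100, k=10, rng=None):
        # Illustrative sketch, not the paper's code.
        # X: (n, m) attribute matrix scaled to [0, 1]; y: (n,) class labels.
        rng = np.random.default_rng(rng)
        n, m = X.shape
        classes, counts = np.unique(y, return_counts=True)
        priors = counts / n
        w = np.zeros(m)
        for _ in range(n_iter):
            i = rng.integers(n)                  # random reference instance
            dist = np.abs(X - X[i]).sum(axis=1)  # Manhattan distance to all others
            dist[i] = np.inf                     # never select the instance itself
            # k nearest hits (same class): their differences lower the weights
            hits = np.where(y == y[i])[0]
            hits = hits[np.argsort(dist[hits])][:k]
            w -= np.abs(X[hits] - X[i]).mean(axis=0) / n_iter
            # k nearest misses from every other class, weighted by class prior
            p_same = priors[np.searchsorted(classes, y[i])]
            for c, p in zip(classes, priors):
                if c == y[i]:
                    continue
                misses = np.where(y == c)[0]
                misses = misses[np.argsort(dist[misses])][:k]
                w += (p / (1 - p_same)) * np.abs(X[misses] - X[i]).mean(axis=0) / n_iter
        return w

Because the weight updates are driven by nearest-neighbour differences rather than by the impurity of a single attribute in isolation, attributes that matter only jointly (as in parity-like problems) can still receive high estimates, which is the property exploited when RELIEFF replaces the impurity function in Assistant.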

References

  1. L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone, Classification and Regression Trees, Wadsworth International Group, 1984.
  2. B. Cestnik, “Estimating probabilities: A crucial task in machine learning,” Proc. European Conference on Artificial Intelligence, Stockholm, Aug. 1990, pp. 147–149.
  3. B. Cestnik and I. Bratko, “On estimating probabilities in tree pruning,” Proc. European Working Session on Learning, edited by Y. Kodratoff, Springer-Verlag: Porto, March 1991, pp. 138–150.
  4. B. Cestnik, I. Kononenko, and I. Bratko, “ASSISTANT 86: A knowledge elicitation tool for sophisticated users,” in Progress in Machine Learning, edited by I. Bratko and N. Lavrač, Sigma Press: Wilmslow, England, 1987.
  5. W. Chase and F. Brown, General Statistics, John Wiley & Sons, 1986.
  6. B. Dolšak and S. Muggleton, “The application of inductive logic programming to finite element mesh design,” in Inductive Logic Programming, edited by S. Muggleton, Academic Press, 1992.
  7. S. Džeroski, “Handling noise in inductive logic programming,” M.Sc. Thesis, University of Ljubljana, Faculty of Electrical Engineering & Computer Science, Ljubljana, Slovenia, 1991.
  8. S.J. Hong, “Use of contextual information for feature ranking and discretization,” Technical Report RC19664, IBM, July 1994 (to appear in IEEE Trans. on Knowledge and Data Engineering).
  9. E. Hunt, J. Martin, and P. Stone, Experiments in Induction, Academic Press: New York, 1966.
  10. K. Kira and L. Rendell, “A practical approach to feature selection,” Proc. Intern. Conf. on Machine Learning, edited by D. Sleeman and P. Edwards, Morgan Kaufmann: Aberdeen, July 1992, pp. 249–256.
  11. K. Kira and L. Rendell, “The feature selection problem: Traditional methods and a new algorithm,” Proc. AAAI'92, San Jose, CA, July 1992.
  12. I. Kononenko, “Inductive and Bayesian learning in medical diagnosis,” Applied Artificial Intelligence, vol. 7, pp. 317–337, 1993.
  13. I. Kononenko, “Estimating attributes: Analysis and extensions of RELIEF,” Proc. European Conf. on Machine Learning, edited by L. De Raedt and F. Bergadano, Springer-Verlag: Catania, April 1994, pp. 171–182.
  14. I. Kononenko, “On biases when estimating multivalued attributes,” Proc. IJCAI-95, edited by C. Mellish, Morgan Kaufmann: Montreal, Aug. 1995, pp. 1034–1040.
  15. I. Kononenko and I. Bratko, “Information based evaluation criterion for classifier's performance,” Machine Learning, vol. 6, pp. 67–80, 1991.
  16. R.L. Mantaras, “ID3 revisited: A distance-based criterion for attribute selection,” Proc. Int. Symp. on Methodologies for Intelligent Systems, Charlotte, North Carolina, U.S.A., Oct. 1989.
  17. R.S. Michalski and R.L. Chilausky, “Learning by being told and learning from examples: An experimental comparison of the two methods of knowledge acquisition in the context of developing an expert system for soybean disease diagnosis,” International Journal of Policy Analysis and Information Systems, vol. 4, pp. 125–161, 1980.
  18. D. Michie, D.J. Spiegelhalter, and C.C. Taylor (eds.), Machine Learning, Neural and Statistical Classification, Ellis Horwood Limited, 1994.
  19. D. Mladenič, “Combinatorial optimization in inductive concept learning,” Proc. 10th Intern. Conf. on Machine Learning, Morgan Kaufmann: Amherst, June 1993, pp. 205–211.
  20. S. Muggleton (ed.), Inductive Logic Programming, Academic Press, 1992.
  21. P.M. Murphy and D.W. Aha, UCI Repository of Machine Learning Databases [machine-readable data repository], University of California, Department of Information and Computer Science: Irvine, CA, 1991.
  22. T. Niblett and I. Bratko, “Learning decision rules in noisy domains,” Proc. Expert Systems 86, Brighton, UK, Dec. 1986.
  23. U. Pompe and I. Kononenko, “Linear space induction in first order logic with RELIEFF,” in Mathematical and Statistical Methods in Artificial Intelligence, edited by G. Della Riccia, R. Kruse, and R. Viertl, CISM Lecture Notes, Springer-Verlag, 1995.
  24. U. Pompe, M. Kovačič, and I. Kononenko, “SFOIL: Stochastic approach to inductive logic programming,” Proc. Slovenian Conf. on Electrical Engineering and Computer Science, Portorož, Slovenia, Sept. 1993, pp. 189–192.
  25. J.R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, pp. 81–106, 1986.
  26. J.R. Quinlan, “The minimum description length principle and categorical theories,” Proc. 11th Int. Conf. on Machine Learning, edited by W. Cohen and H. Hirsh, Morgan Kaufmann: Rutgers University, New Brunswick, July 1994, pp. 233–241.
  27. H. Ragavan and L. Rendell, “Lookahead feature construction for learning hard concepts,” Proc. 10th Intern. Conf. on Machine Learning, Morgan Kaufmann: Amherst, June 1993, pp. 252–259.
  28. H. Ragavan, L. Rendell, M. Shaw, and A. Tessmer, “Learning complex real-world concepts through feature construction,” Technical Report UIUC-BI-AI-93-03, The Beckman Institute, University of Illinois, 1993.
  29. M. Robnik, “Constructive induction with decision trees,” B.Sc. Thesis (in Slovene), University of Ljubljana, Faculty of Electrical Engineering & Computer Science, Ljubljana, Slovenia, 1993.
  30. P. Smyth and R.M. Goodman, “Rule induction using information theory,” in Knowledge Discovery in Databases, edited by G. Piatetsky-Shapiro and W. Frawley, MIT Press, 1990.
  31. P. Smyth, R.M. Goodman, and C. Higgins, “A hybrid rule-based Bayesian classifier,” Proc. European Conf. on Artificial Intelligence, Stockholm, Aug. 1990, pp. 610–615.

Copyright information

© Kluwer Academic Publishers 1997

Authors and Affiliations

  • Igor Kononenko (1)
  • Edvard Šimec (1)
  • Marko Robnik-Šikonja (1)

  1. University of Ljubljana, Ljubljana, Slovenia
