Theoretical and Empirical Analysis of ReliefF and RReliefF

Abstract

Relief algorithms are general and successful attribute estimators. They are able to detect conditional dependencies between attributes and provide a unified view on attribute estimation in regression and classification. In addition, their quality estimates have a natural interpretation. While they have commonly been viewed as feature subset selection methods that are applied in a preprocessing step before a model is learned, they have actually been used successfully in a variety of settings, e.g., to select splits or to guide constructive induction in the building phase of decision or regression tree learning, as an attribute weighting method, and also in inductive logic programming.

A broad spectrum of successful uses calls for an especially careful investigation of the various features Relief algorithms have. In this paper we theoretically and empirically investigate and discuss how and why they work, their theoretical and practical properties, their parameters, what kind of dependencies they detect, how they scale up to a large number of examples and features, how to sample data for them, how robust they are to noise, how irrelevant and redundant attributes influence their output, and how different metrics influence them.
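To make the notion of detecting conditional dependencies concrete, the following is a minimal sketch of the basic two-class Relief procedure (nearest hit and nearest miss), not the ReliefF or RReliefF variants analyzed in the paper. The function names `diff`, `distance`, and `relief`, the toy XOR data, and the choice to use every instance once as the reference instance (instead of random sampling) are assumptions made for illustration only.

```python
from itertools import product


def diff(a, x, y):
    """Difference of nominal attribute `a` between instances x and y (0 or 1)."""
    return 0.0 if x[a] == y[a] else 1.0


def distance(x, y):
    """Sum of per-attribute differences (Hamming distance for nominal data)."""
    return sum(diff(a, x, y) for a in range(len(x)))


def relief(instances, labels):
    """Basic two-class Relief; each instance serves once as the reference.

    Assumption: exhaustive iteration replaces the usual random sampling
    of m reference instances, so the result is deterministic.
    """
    n_attrs = len(instances[0])
    m = len(instances)
    weights = [0.0] * n_attrs
    for i, (r, cls) in enumerate(zip(instances, labels)):
        # All instances except the reference itself.
        others = [(x, y) for j, (x, y) in enumerate(zip(instances, labels)) if j != i]
        # Nearest instance of the same class (hit) and of the other class (miss).
        hit = min((x for x, y in others if y == cls), key=lambda x: distance(r, x))
        miss = min((x for x, y in others if y != cls), key=lambda x: distance(r, x))
        for a in range(n_attrs):
            # Reward attributes that separate classes, penalize those that
            # differ between near instances of the same class.
            weights[a] += (diff(a, r, miss) - diff(a, r, hit)) / m
    return weights


# Toy data: class = a0 XOR a1; attribute a2 is irrelevant.
X = list(product([0, 1], repeat=3))
y = [x[0] ^ x[1] for x in X]
weights = relief(X, y)
print(weights)
```

On this toy problem each of the two XOR attributes is, taken alone, independent of the class, so an estimator that scores attributes one at a time cannot separate them from the irrelevant one. Because Relief compares near neighbours in the full attribute space, the sketch should assign both XOR attributes a higher weight than the irrelevant attribute.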



Cite this article

Robnik-Šikonja, M., Kononenko, I. Theoretical and Empirical Analysis of ReliefF and RReliefF. Machine Learning 53, 23–69 (2003). https://doi.org/10.1023/A:1025667309714

Keywords

  • attribute evaluation
  • feature selection
  • Relief algorithm
  • classification
  • regression