A Statistical Method for Determining Importance of Variables in an Information System

  • Witold R. Rudnicki
  • Marcin Kierczak
  • Jacek Koronacki
  • Jan Komorowski
Conference paper

DOI: 10.1007/11908029_58

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4259)
Cite this paper as:
Rudnicki W.R., Kierczak M., Koronacki J., Komorowski J. (2006) A Statistical Method for Determining Importance of Variables in an Information System. In: Greco S. et al. (eds) Rough Sets and Current Trends in Computing. RSCTC 2006. Lecture Notes in Computer Science, vol 4259. Springer, Berlin, Heidelberg

Abstract

A new method for estimation of attributes’ importance for supervised classification, based on the random forest approach, is presented. Essentially, an iterative scheme is applied, with each step consisting of several runs of the random forest program. Each run is performed on a suitably modified data set: values of each attribute found unimportant at earlier steps are randomly permuted between objects. At each step, apparent importance of an attribute is calculated and the attribute is declared unimportant if its importance is not uniformly better than that of the attributes earlier found unimportant. The procedure is repeated until only attributes scoring better than the randomized ones are retained. Statistical significance of the results so obtained is verified. This method has been applied to 12 data sets of biological origin. The method was shown to be more reliable than that based on standard application of a random forest to assess attributes’ importance.
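The iterative scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `select_attributes`, `importances`, and `abs_correlation`, and the parameters `runs` and `seed`, are hypothetical. The importance score is a plain absolute correlation with the class label, swapped in for the random forest importance so that the sketch needs only the standard library. Two details are not given in the abstract and are assumed here: the first step flags the lowest-scoring attribute to start the process, and "uniformly better" is read as "the attribute's lowest score over all runs exceeds the highest score of any permuted attribute".

```python
import random
from statistics import mean


def abs_correlation(xs, ys):
    """Stand-in importance: |Pearson correlation| of one column with the labels."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def importances(columns, labels):
    """One importance score per attribute (a random forest would go here)."""
    return [abs_correlation(col, labels) for col in columns]


def select_attributes(columns, labels, importance_fn=importances, runs=5, seed=0):
    """Return indices of attributes that outscore the permuted ones."""
    rng = random.Random(seed)
    unimportant = set()
    while True:
        kept = [j for j in range(len(columns)) if j not in unimportant]
        if not kept:
            return []
        # One step = several runs, each on a freshly modified data set.
        scores = [[] for _ in columns]
        for _ in range(runs):
            data = []
            for j, col in enumerate(columns):
                col = list(col)
                if j in unimportant:
                    rng.shuffle(col)  # permute values between objects
                data.append(col)
            for j, score in enumerate(importance_fn(data, labels)):
                scores[j].append(score)
        if not unimportant:
            # Assumption: seed the process with the lowest-scoring attribute.
            unimportant.add(min(kept, key=lambda j: mean(scores[j])))
            continue
        # Assumed reading of "uniformly better": beat every permuted score.
        threshold = max(max(scores[j]) for j in unimportant)
        rejected = [j for j in kept if min(scores[j]) <= threshold]
        if not rejected:
            return kept
        unimportant.update(rejected)


if __name__ == "__main__":
    labels = [0, 1] * 20
    columns = [
        [float(y) for y in labels],        # informative attribute
        [i // 2 for i in range(40)],       # unrelated to the label
        [19 - i // 2 for i in range(40)],  # unrelated to the label
    ]
    print(select_attributes(columns, labels))  # prints [0]
```

Passing a different `importance_fn` (for example, a wrapper around a random forest library) leaves the loop unchanged, which keeps the permutation and rejection logic separate from the choice of scoring method. Note that the assumed seeding rule always flags one attribute, so the sketch cannot retain every attribute of a data set.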

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Witold R. Rudnicki (1)
  • Marcin Kierczak (2)
  • Jacek Koronacki (3)
  • Jan Komorowski (1, 2)

  1. ICM, Warsaw University, Warsaw, Poland
  2. The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, Sweden
  3. Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
