Feature Selection Based on Run Covering

  • Su Yang
  • Jianning Liang
  • Yuanyuan Wang
  • Adam Winstanley
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4319)


This paper proposes a new feature selection algorithm. First, the data are sorted along every attribute, and consecutive instances that share a class label are grouped into runs. Runs whose length exceeds a given threshold are selected as “valid” runs; these enclose the instances that are separable from the other classes. Second, we count how many valid runs cover each instance and measure how this covering number changes when a feature is eliminated. The feature with the least impact on the covering of all instances is then deleted. We compare our method with ReliefF and a mutual-information-based method on 3 image databases. Experimental results show that the proposed method outperformed the other two.
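To make the procedure concrete, below is a minimal Python sketch of the run-covering idea as the abstract describes it. It is not the authors' implementation: the function names, the default threshold min_run_length=3, and the impact measure (the number of instances left with no covering run when a feature is deleted) are illustrative assumptions.

```python
import numpy as np

def valid_run_cover(values, labels, min_run_length):
    """Mark the instances covered by "valid" runs on one feature:
    sort the feature's values, group consecutive instances sharing a
    class label into runs, and flag instances inside runs of length
    at least min_run_length."""
    order = np.argsort(values, kind="stable")
    sorted_labels = labels[order]
    covered = np.zeros(len(values), dtype=int)
    start = 0
    for end in range(1, len(values) + 1):
        # close the current run when the label changes or the data end
        if end == len(values) or sorted_labels[end] != sorted_labels[start]:
            if end - start >= min_run_length:  # a "valid" run
                covered[order[start:end]] = 1
            start = end
    return covered

def run_covering_elimination(X, y, min_run_length=3):
    """Backward elimination: repeatedly delete the feature whose removal
    has the least impact on the instances' covering numbers.  Here the
    impact of deleting feature j is taken to be the number of instances
    that would be left with no covering run at all (an assumed measure,
    not necessarily the paper's exact criterion)."""
    covers = {j: valid_run_cover(X[:, j], y, min_run_length)
              for j in range(X.shape[1])}
    remaining = list(covers)
    elimination_order = []
    while len(remaining) > 1:
        total = np.sum([covers[j] for j in remaining], axis=0)
        impact = {j: int(np.sum((total - covers[j]) == 0) - np.sum(total == 0))
                  for j in remaining}
        worst = min(remaining, key=impact.get)
        elimination_order.append(worst)
        remaining.remove(worst)
    elimination_order.extend(remaining)
    return elimination_order  # later entries = more important features

# Toy usage: two informative features and one noise feature
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1], 50)
    X = np.column_stack([y + 0.1 * rng.standard_normal(100),  # informative
                         y + 0.2 * rng.standard_normal(100),  # informative
                         rng.standard_normal(100)])           # noise
    print(run_covering_elimination(X, y))  # index 2 should be deleted first
```

On a noise feature the sorted class labels are interleaved, so few runs reach the threshold and removing that feature barely changes the covering counts, which is why it is eliminated first in the toy example.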


Feature Selection · Mutual Information · Class Label · Support Vector Machine Classifier · Feature Selection Method




References

  1. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. Journal of Machine Learning Research 3, 1157–1182 (2003)
  2. Liu, H., Yu, L.: Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering 17, 491–502 (2005)
  3. Jain, A., Zongker, D.: Feature selection: Evaluation, application, and small sample performance. IEEE Transactions on Pattern Analysis and Machine Intelligence 19, 153–158 (1997)
  4. Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial Intelligence 97, 273–324 (1997)
  5. Robnik-Sikonja, M., Kononenko, I.: Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning 53, 23–69 (2003)
  6. Kira, K., Rendell, L.: A practical approach to feature selection. In: Proc. Int. Conf. Machine Learning, pp. 249–256 (1992)
  7. Peng, H., Long, F., Ding, C.: Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1226–1238 (2005)
  8. Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proc. Int. Joint Conf. Artificial Intelligence, pp. 1137–1145 (1995)
  9. Rui, Y., Huang, T.S., Chang, S.: Image retrieval: Current techniques, promising directions and open issues. Journal of Visual Communication and Image Representation 10(1), 39–62 (1999)
  10. Ho, T.K., Baird, H.S.: Pattern classification with compact distribution maps. Computer Vision and Image Understanding 70, 101–110 (1998)
  11. Cover, T.M.: The best two independent measurements are not the two best. IEEE Transactions on Systems, Man, and Cybernetics 4, 116–117 (1974)
  12. Narendra, P.M., Fukunaga, K.: A branch and bound algorithm for feature subset selection. IEEE Transactions on Computers 26, 917–922 (1977)
  13. Somol, P., Pudil, P., Kittler, J.: Fast branch & bound algorithms for optimal feature selection. IEEE Transactions on Pattern Analysis and Machine Intelligence 26, 900–912 (2004)
  14. Kwak, N., Choi, C.H.: Input feature selection by mutual information based on Parzen windows. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(12), 1667–1671 (2002)
  15. Trappenberg, T., Ouyang, J., Back, A.: Input variable selection: Mutual information and linear mixing measures. IEEE Transactions on Knowledge and Data Engineering 18, 37–46 (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Su Yang (1)
  • Jianning Liang (1)
  • Yuanyuan Wang (2)
  • Adam Winstanley (3)
  1. Shanghai Key Laboratory of Intelligent Information Processing, Dept. of Computer Science and Engineering, Fudan University, Shanghai, China
  2. Dept. of Electronic Engineering, Fudan University, Shanghai, China
  3. National Centre for Geocomputation, Dept. of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland
