Feature Selection Based on Run Covering
This paper proposes a new feature selection algorithm. First, the data for every attribute are sorted, and consecutive instances that share the same class label are grouped into runs. Runs whose length is greater than a given threshold are selected as “valid” runs, which enclose the instances that are separable from the other classes. Second, we count how many valid runs cover each instance and check how this covering number changes when a feature is eliminated. We then delete the feature whose removal has the least impact on the covering of all instances. We compare our method with ReliefF and with a method based on mutual information. The evaluation was performed on three image databases. Experimental results show that the proposed method outperformed the other two.
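The two steps described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the handling of tied attribute values, and in particular the `impact` measure (how many instances would lose their last covering run, with ties broken by how many instances the feature covers) are hypothetical choices, because the abstract does not define them.

```python
from itertools import groupby


def covered_by_feature(values, labels, threshold):
    """Return the indices of instances that lie inside a valid run of one feature.

    A run is a maximal block of consecutive instances (after sorting by the
    feature value) that share one class label. A run is "valid" when its
    length is greater than `threshold`.
    Assumption: tied values keep their original order (stable sort).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    covered = set()
    for _, group in groupby(order, key=lambda i: labels[i]):
        run = list(group)
        if len(run) > threshold:
            covered.update(run)
    return covered


def rank_features_by_elimination(X, y, threshold):
    """Backward elimination driven by run covering.

    Returns feature indices ordered from first eliminated (least useful)
    to last remaining (most useful).
    """
    n_features = len(X[0])
    cover = {
        f: covered_by_feature([row[f] for row in X], y, threshold)
        for f in range(n_features)
    }
    remaining = list(range(n_features))
    eliminated = []
    while len(remaining) > 1:
        # Covering number: how many remaining features cover each instance.
        counts = [sum(i in cover[f] for f in remaining) for i in range(len(y))]

        def impact(f):
            # Hypothetical impact measure (not from the paper): instances that
            # would become uncovered, then total instances covered by f.
            lost = sum(1 for i in cover[f] if counts[i] == 1)
            return (lost, len(cover[f]))

        weakest = min(remaining, key=impact)
        remaining.remove(weakest)
        eliminated.append(weakest)
    return eliminated + remaining
```

For example, with `X = [[1, 5], [2, 1], [3, 6], [4, 2], [5, 7], [6, 3]]`, `y = [0, 0, 0, 1, 1, 1]` and `threshold = 2`, the first feature yields two valid runs covering every instance while the second yields none, so `rank_features_by_elimination(X, y, 2)` eliminates the second feature first and returns `[1, 0]`.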
Keywords: Feature Selection, Mutual Information, Class Label, Support Vector Machine Classifier, Feature Selection Method
- Kira, K., Rendell, L.: A practical approach to feature selection. In: Proc. Int. Conf. Machine Learning, pp. 249–256 (1992)
- Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proc. Int. Joint Conf. Artificial Intelligence, pp. 1137–1145 (1995)