Abstract
Feature selection is a data preprocessing step for classification and data mining tasks. Traditionally, feature selection is done by selecting the fewest features that determine the class label, i.e., by the horizontal compactness of data. In this chapter, we propose a new selection criterion that aims at the vertical compactness of data. In particular, we select a subset of features that yields the fewest projected instances while still determining the class label. Limitations of directly adopting the standard depth-first search (DFS) and breadth-first search (BFS) are analyzed, and a hybrid approach that is partially DFS and partially BFS is described. To assess the effectiveness of the new criterion on the classification task, we compare the results induced by C4.5 before and after feature selection.
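The criterion described above can be sketched in code. The following is a minimal illustration, not the chapter's hybrid DFS/BFS algorithm: it exhaustively enumerates feature subsets, keeps only those whose projection still determines the class label (consistency), and among those picks the subset with the fewest distinct projected instances (vertical compactness). All function names are hypothetical, and noise-free data is assumed.

```python
from itertools import combinations

def projects_consistently(rows, labels, subset):
    """True if projecting the data onto `subset` still determines the class label,
    i.e., no two instances with the same projection carry different labels."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        if seen.setdefault(key, label) != label:
            return False
    return True

def vertical_compactness(rows, subset):
    """Number of distinct projected instances; fewer means more vertically compact."""
    return len({tuple(row[i] for i in subset) for row in rows})

def select_features(rows, labels):
    """Exhaustive sketch of the vertical-compactness criterion: return the
    consistent feature subset with the fewest projected instances.
    Assumes the full feature set is itself consistent (noise-free data)."""
    n = len(rows[0])
    best = tuple(range(n))
    best_count = vertical_compactness(rows, best)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if projects_consistently(rows, labels, subset):
                count = vertical_compactness(rows, subset)
                if count < best_count:
                    best, best_count = subset, count
    return best
```

For example, on four instances over three binary features whose label equals the second feature, the sketch selects that single feature: projecting onto it yields only two distinct instances, versus four for the full feature set. The exhaustive loop is exponential in the number of features; the chapter's hybrid search exists precisely to avoid this cost.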
References
Almuallim, H., & Dietterich, T. G. (1994). Learning boolean concepts in the presence of many irrelevant features. Artificial Intelligence, 69(1–2), 279–305.
John, G. H., Kohavi, R., & Pfleger, K. (1994). Irrelevant features and the subset selection problem. In Proceedings of the Eleventh International Conference on Machine Learning, pp. 121–129. Morgan Kaufmann Publishers.
Kira, K., & Rendell, L. A. (1992). The feature selection problem: Traditional methods and a new algorithm. In Proceedings of Ninth National Conference on AI, pp. 129–134. AAAI Press/MIT Press.
Koller, D., & Sahami, M. (1996). Toward optimal feature selection. In Machine Learning: Proceedings of the Thirteenth International Conference. Morgan Kaufmann Publishers.
Murphy, P., & Aha, D. UCI Repository of Machine Learning Databases. http://www.ics.uci.edu/~mlearn/MLRepository.html
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo California.
Schlimmer, J. C. (1993). Efficiently inducing determinations: A complete and systematic search algorithm that uses optimal pruning. In Proceedings of the Tenth International Conference on Machine Learning, pp. 284–290.
Copyright information
© 1998 Springer Science+Business Media New York
Cite this chapter
Wang, K., Sundaresh, S. (1998). Selecting Features by Vertical Compactness of Data. In: Liu, H., Motoda, H. (eds) Feature Extraction, Construction and Selection. The Springer International Series in Engineering and Computer Science, vol 453. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5725-8_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7622-4
Online ISBN: 978-1-4615-5725-8