Cluster-Dependent Feature Selection through a Weighted Learning Paradigm
This paper addresses the problem of selecting a subset of the most relevant features from a dataset through a weighted learning paradigm. We propose two automated feature selection algorithms for unlabeled data. In contrast to the supervised setting, automated feature selection and feature weighting in unsupervised learning is challenging because label information is either unavailable or not used to guide the selection. The algorithms combine two steps: they introduce unsupervised local feature weights, which identify the relevant features of the data, and they suppress the irrelevant features through unsupervised selection. Both algorithms provide topographic clustering in which each cluster is associated with a prototype and a weight vector reflecting the relevance of each feature. The proposed methods require only simple computational techniques and are based on the self-organizing map (SOM) model. Empirical results on both synthetic and real datasets from the UCI repository are given and discussed.
Keywords: Topographic Clustering, Self-organizing Map, Unsupervised Features Selection, Cluster Characterization, Weighted Learning
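
The abstract does not state the update rules, so the following is only a minimal sketch of the general idea: a self-organizing map in which every unit carries a prototype and its own feature-weight vector. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `train_weighted_som` and `select_features`, the inverse-dispersion weight update (borrowed in spirit from variable-weighting k-means), the linear learning-rate and neighborhood schedules, and the "keep weights at or above the row mean" selection rule.

```python
import numpy as np


def train_weighted_som(X, grid=(3, 3), epochs=30, lr=0.5, sigma0=1.5, seed=0):
    """Toy SOM with one feature-weight vector per map unit (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    rows, cols = grid
    m = rows * cols

    # Grid coordinates of the units and their pairwise squared grid distances.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    grid_d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)

    # Prototypes start at randomly chosen observations; weights start uniform.
    protos = X[rng.choice(n, size=m, replace=n < m)].astype(float)
    weights = np.full((m, d), 1.0 / d)

    for epoch in range(epochs):
        frac = epoch / max(epochs - 1, 1)
        sigma = sigma0 * (1.0 - frac) + 0.3 * frac  # shrinking neighborhood
        eta = lr * (1.0 - frac) + 0.01 * frac       # decaying learning rate

        # Online prototype updates; the winner is chosen with the
        # locally weighted squared distance of each unit.
        for i in rng.permutation(n):
            x = X[i]
            bmu = np.argmin((weights * (x - protos) ** 2).sum(axis=1))
            h = np.exp(-grid_d2[bmu] / (2.0 * sigma ** 2))
            protos += eta * h[:, None] * (x - protos)

        # Assumed weight update: neighborhood-smoothed dispersion of each
        # feature around each prototype, inverted and normalized per unit.
        D2 = (X[:, None, :] - protos[None, :, :]) ** 2          # (n, m, d)
        bmus = np.argmin((weights[None, :, :] * D2).sum(axis=2), axis=1)
        H = np.exp(-grid_d2[bmus] / (2.0 * sigma ** 2))         # (n, m)
        disp = np.einsum("nm,nmd->md", H, D2) / (H.sum(axis=0)[:, None] + 1e-12)
        inv = 1.0 / (disp + 1e-12)
        weights = inv / inv.sum(axis=1, keepdims=True)

    return protos, weights


def select_features(weights):
    """Per-unit boolean mask of features whose weight reaches the row mean."""
    return weights >= weights.mean(axis=1, keepdims=True)
```

Under this assumed rule, a feature whose values are tightly grouped around a unit's prototype receives a larger weight in that unit, and the resulting mask differs from unit to unit, which is what makes the selection cluster-dependent. Calling `train_weighted_som(X, grid=(2, 2))` on an `(n, d)` array returns a `(4, d)` prototype array and a `(4, d)` weight array whose rows sum to one.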