Unsupervised Feature Selection in High Dimensional Spaces and Uncertainty
Developing models and methods to manage vague data is an active research field. Some work has addressed supervised problems, but unsupervised problems under uncertainty have not yet been studied. This work outlines an extension of the Fuzzy Mutual Information Feature Selection algorithm to unsupervised problems. The proposal is a two-stage procedure. The first stage uses the fuzzy mutual information measure, Battiti's feature selection algorithm, and a genetic algorithm to analyze the relationships between feature subspaces in a high-dimensional space. The second stage uses a simple ad hoc heuristic to extract the most relevant relationships. The results of the experiments carried out in this preliminary work indicate that it is possible to apply frequent pattern mining or similar methods in the second stage to reduce the dimensionality of the data set.
Keywords: Unsupervised feature selection, genetic algorithms, data uncertainty, frequent pattern mining
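The first stage described in the abstract builds on Battiti's mutual-information feature selection (MIFS). As a rough illustration only, the greedy MIFS criterion can be sketched as below: at each step, pick the feature with the highest relevance to a target minus `beta` times its summed mutual information with the features already selected. This sketch makes several assumptions that are not taken from the paper: it uses crisp (not fuzzy) mutual information on discrete data, it assumes a designated target column is available, and it omits the genetic algorithm and the unsupervised extension entirely. The function names `mutual_information` and `mifs` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Crisp mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # I(X;Y) = sum_{x,y} p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mifs(features, target, k, beta=0.5):
    """Greedy MIFS-style selection (illustrative sketch, crisp data only).

    features: dict mapping a feature name to a list of discrete values
    target:   list of discrete values (assumed available for this sketch)
    k:        number of features to select
    beta:     weight of the redundancy penalty
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], target)
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            )
            return relevance - beta * redundancy

        best = max(remaining, key=score)  # ties go to the earliest feature
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the target depends on "a" and "c"; "b" is an exact copy of "a".
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
target = [0, 1, 2, 3, 0, 1, 2, 3]

print(mifs(features, target, k=2, beta=1.0))  # ['a', 'c']
```

In this toy run all three features are equally relevant to the target, so the redundancy term is what keeps the duplicate `b` out of the selection once `a` has been chosen.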