Global Feature Subset Selection on High-Dimensional Datasets Using Re-ranking-based EDAs

  • Pablo Bermejo
  • Luis de La Ossa
  • Jose M. Puerta
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7023)

Abstract

The relatively recent appearance of high-dimensional databases has made traditional search algorithms too expensive in terms of time and memory. Several modifications and enhancements to local search algorithms have therefore been proposed in the literature to deal with this problem. However, non-deterministic global search, which is expected to perform better than local search, still lacks appropriate adaptations or new developments for high-dimensional databases. We present a new non-deterministic iterative method that performs a global search, easily handles datasets with high cardinality, and outperforms a wide variety of local search algorithms.

Keywords

Feature Selection; Local Search; Global Search; Local Search Algorithm; Hill Climbing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Pablo Bermejo (1)
  • Luis de La Ossa (1)
  • Jose M. Puerta (1)
  1. Edificio I3A, Albacete, Castilla-La Mancha University, Spain
