Memetic Algorithms for Feature Selection on Microarray Data

  • Zexuan Zhu
  • Yew-Soon Ong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4491)

Abstract

In this paper, we present two novel memetic algorithms (MAs) for gene selection. Both combine a genetic algorithm (GA), acting as the wrapper method, with local search based on filter methods under a memetic framework. The first MA, the Wrapper-Filter Feature Selection Algorithm (WFFSA), fine-tunes the population of GA solutions by adding or deleting features according to a univariate filter ranking method. The second MA, the Markov Blanket-Embedded Genetic Algorithm (MBEGA), fine-tunes the population of solutions by adding relevant features and removing redundant and/or irrelevant features using the Markov blanket. Our empirical studies on synthetic and real-world microarray datasets suggest that both memetic approaches select more suitable gene subsets than the basic GA and, at the same time, outperform the GA in classification accuracy. While the classification accuracies of WFFSA and MBEGA do not differ with statistical significance on most of the datasets considered, MBEGA is observed to converge to more compact gene subsets than WFFSA.
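The add/delete fine-tuning step described for WFFSA can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `local_search_add_del`, the `max_ops` parameter, the greedy acceptance rule, and the toy fitness function are all assumptions made for illustration. In the paper's setting, the fitness would come from a wrapper classifier and the ranking scores from a univariate filter.

```python
from typing import Callable, List, Sequence, Tuple


def local_search_add_del(
    mask: Sequence[int],
    ranking_scores: Sequence[float],
    fitness: Callable[[List[int]], float],
    max_ops: int = 2,
) -> Tuple[List[int], float]:
    """Hypothetical add/delete local search guided by a filter ranking.

    mask           -- 0/1 chromosome; 1 means the feature is selected
    ranking_scores -- univariate filter score per feature (higher = better)
    fitness        -- wrapper-style evaluation of a mask (higher = better)
    max_ops        -- assumed cap on the number of refinement rounds
    """
    best = list(mask)
    best_fit = fitness(best)
    for _ in range(max_ops):
        selected = [i for i, bit in enumerate(best) if bit]
        excluded = [i for i, bit in enumerate(best) if not bit]
        candidates = []
        if excluded:
            # Add move: include the highest-ranked feature not yet selected.
            add_idx = max(excluded, key=lambda i: ranking_scores[i])
            cand = list(best)
            cand[add_idx] = 1
            candidates.append(cand)
        if len(selected) > 1:
            # Delete move: drop the lowest-ranked feature currently selected.
            del_idx = min(selected, key=lambda i: ranking_scores[i])
            cand = list(best)
            cand[del_idx] = 0
            candidates.append(cand)
        improved = False
        for cand in candidates:
            cand_fit = fitness(cand)
            if cand_fit > best_fit:  # accept only strict improvements
                best, best_fit, improved = cand, cand_fit, True
        if not improved:
            break
    return best, best_fit


def toy_fitness(mask: List[int]) -> float:
    """Toy stand-in for classifier accuracy: features 0 and 1 are
    relevant, and every selected feature carries a small size penalty."""
    relevant = {0, 1}
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in relevant)
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.1, 0.2]  # made-up filter scores
    refined, fit = local_search_add_del([0, 1, 1, 0], scores, toy_fitness)
    print(refined, fit)
```

In a memetic GA, a step like this would be applied to individuals in each generation before or after the usual crossover and mutation. How the refined individual is written back into the population is a design choice that this sketch leaves open.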

Keywords

Feature Selection, Local Search, Gene Selection, Feature Subset, Memetic Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Zexuan Zhu (1, 2)
  • Yew-Soon Ong (1)
  1. Division of Information Systems, School of Computer Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798
  2. Bioinformatics Research Centre, Nanyang Technological University, Research TechnoPlaza, 50 Nanyang Drive, Singapore 637553
