Fuzzy Patterns and GCS Networks to Clustering Gene Expression Data

  • Daniel Glez-Peña
  • Fernando Díaz
  • Florentino Fdez-Riverola
  • José R. Méndez
  • Juan M. Corchado

Summary

The advent of DNA microarray technology has supplied a large volume of data to fields such as machine learning and data mining. A gene expression profile records the expression levels of thousands of genes simultaneously and reflects complex relationships among them. In this context, intelligent support is essential for managing and interpreting this large amount of information. A well-known constraint of microarray data is the large number of genes compared with the small number of available experiments. In this situation, the ability to design methods that overcome the current limitations of state-of-the-art algorithms is crucial to the development of successful applications. In this chapter we present a flexible framework for feature selection and classification of microarray data. Dimensionality reduction is achieved by applying a supervised fuzzy pattern algorithm that reduces and discretizes existing gene expression profiles. An informed growing cell structures network is proposed for clustering biologically homogeneous experiments starting from the simplified microarray data. Experimental results on different data sets containing acute myeloid leukemia profiles show the effectiveness of the proposed method.
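The summary does not give the details of the fuzzy pattern algorithm, so the following Python sketch is only an illustration of the general idea of discretizing expression values into fuzzy labels and keeping the genes whose labels separate two classes. The triangular membership functions, the per-gene reference points (lo, mid, hi), the thresholds theta and pi, and all function names are assumptions made for this example and are not taken from the chapter.

```python
from collections import Counter

LABELS = ("LOW", "MEDIUM", "HIGH")


def memberships(x, lo, mid, hi):
    """Triangular-style memberships of x in LOW/MEDIUM/HIGH.

    lo < mid < hi are assumed per-gene reference points (hypothetical).
    """
    if x <= lo:
        return {"LOW": 1.0, "MEDIUM": 0.0, "HIGH": 0.0}
    if x >= hi:
        return {"LOW": 0.0, "MEDIUM": 0.0, "HIGH": 1.0}
    if x <= mid:
        m = (x - lo) / (mid - lo)
        return {"LOW": 1.0 - m, "MEDIUM": m, "HIGH": 0.0}
    m = (hi - x) / (hi - mid)
    return {"LOW": 0.0, "MEDIUM": m, "HIGH": 1.0 - m}


def discretize(x, lo, mid, hi, theta=0.7):
    """Return the label whose membership reaches theta, or None if ambiguous."""
    mu = memberships(x, lo, mid, hi)
    label = max(LABELS, key=lambda k: mu[k])
    return label if mu[label] >= theta else None


def fuzzy_pattern(samples, refs, theta=0.7, pi=0.8):
    """Build a pattern {gene_index: label} for the samples of ONE class.

    A gene enters the pattern when at least a fraction pi of the samples
    share the same unambiguous label (theta and pi are assumed thresholds).
    """
    pattern = {}
    for g, (lo, mid, hi) in enumerate(refs):
        counts = Counter(discretize(s[g], lo, mid, hi, theta) for s in samples)
        counts.pop(None, None)  # ignore ambiguous values
        if counts:
            label, n = counts.most_common(1)[0]
            if n / len(samples) >= pi:
                pattern[g] = label
    return pattern


def discriminant_genes(pattern_a, pattern_b):
    """Genes present in both class patterns but with different labels."""
    shared = pattern_a.keys() & pattern_b.keys()
    return sorted(g for g in shared if pattern_a[g] != pattern_b[g])


# Toy data: 3 genes, 3 samples per class (values invented for illustration).
refs = [(0.0, 5.0, 10.0)] * 3
class_a = [[1.0, 9.0, 5.0], [0.5, 9.5, 4.8], [1.2, 8.9, 5.1]]
class_b = [[9.0, 9.2, 5.0], [9.4, 8.8, 1.0], [8.9, 9.1, 9.0]]

pa = fuzzy_pattern(class_a, refs)
pb = fuzzy_pattern(class_b, refs)
print(pa, pb, discriminant_genes(pa, pb))
```

In this toy case only gene 0 is kept: it is consistently LOW in one class and HIGH in the other, while gene 1 is HIGH in both and gene 2 has no consistent label in the second class. The reduced, discretized profiles would then be the input to a clustering stage such as the growing cell structures network mentioned in the summary.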

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Daniel Glez-Peña (1)
  • Fernando Díaz (2)
  • Florentino Fdez-Riverola (1)
  • José R. Méndez (1)
  • Juan M. Corchado (3)

  1. Department of Computer Science, University of Vigo, Ourense, Spain
  2. Department of Computer Science, University of Valladolid, Segovia, Spain
  3. Department of Computer Science, University of Salamanca, Salamanca, Spain
