Using Fuzzy Patterns for Gene Selection and Data Reduction on Microarray Data

  • Fernando Díaz
  • Florentino Fdez-Riverola
  • Daniel Glez-Peña
  • Juan M. Corchado
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4224)


The advent of DNA microarray technology has supplied a large volume of data to fields such as machine learning and data mining. Intelligent support is essential for managing and interpreting this information. A well-known constraint specific to microarray data is the large number of genes compared with the small number of available experiments. In this context, the ability to design methods that overcome the limitations of state-of-the-art algorithms is crucial to the development of successful applications. In this paper we demonstrate how a supervised fuzzy pattern algorithm can be used to perform DNA microarray data reduction on real data. The method can be used to identify biologically meaningful genes and thereby improve on previously successful techniques. Experimental results on acute myeloid leukemia diagnosis show the effectiveness of the proposed approach.


  • Support Vector Machine
  • Acute Myeloid Leukemia
  • Microarray Data
  • Acute Promyelocytic Leukemia
  • Gene Selection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Fernando Díaz (1)
  • Florentino Fdez-Riverola (2)
  • Daniel Glez-Peña (2)
  • Juan M. Corchado (3)
  1. University of Valladolid, Segovia, Spain
  2. University of Vigo, Ourense, Spain
  3. University of Salamanca, Salamanca, Spain
