An Evolutionary Approach for Sample-Based Clustering on Microarray Data

  • Daniel Glez-Peña
  • Fernando Díaz
  • José R. Méndez
  • Juan M. Corchado
  • Florentino Fdez-Riverola
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5518)


Sample-based clustering is one of the most common methods for discovering disease subtypes and previously unknown taxonomies. By revealing hidden structure in microarray data, cluster analysis can potentially lead to more tailored therapies for patients and to better diagnostic procedures. In this work, we present a novel method for automatically discovering clusters of samples that are coherent from a genetic point of view. Each candidate cluster is characterized by a fuzzy pattern that maintains a fuzzy discretization of the relevant gene expression values. Noise genes are identified and removed from the fuzzy pattern based on their probability of appearance. Candidate clusters are constructed randomly and then refined iteratively by a probabilistic search and optimization scheme. Experimental results on publicly available microarray data show the effectiveness of the proposed method.
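The general idea in the abstract (discretize expression values, keep only genes whose label appears often enough in a cluster, then refine randomly built clusters by probabilistic search) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a crisp Low/Medium/High discretization as a stand-in for the fuzzy one, a frequency threshold `min_freq` as a stand-in for the probability of appearance, a fixed number of clusters `k`, and a plain simulated annealing loop. All function names, parameters, and default values are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(value, low_cut=-0.5, high_cut=0.5):
    """Crisp stand-in for a fuzzy discretization (cut points are assumptions)."""
    if value < low_cut:
        return "L"
    if value > high_cut:
        return "H"
    return "M"


def cluster_pattern(labelled, members, min_freq=0.8):
    """Return {gene_index: label} for genes whose majority label reaches min_freq.

    Genes below the threshold are treated as noise and left out of the pattern.
    """
    pattern = {}
    for g in range(len(labelled[0])):
        label, count = Counter(labelled[s][g] for s in members).most_common(1)[0]
        if count / len(members) >= min_freq:
            pattern[g] = label
    return pattern


def energy(labelled, assignment, k, min_freq=0.8):
    """Negative coherence: each cluster contributes size * pattern length."""
    total = 0
    for c in range(k):
        members = [s for s, a in enumerate(assignment) if a == c]
        if members:
            total += len(members) * len(cluster_pattern(labelled, members, min_freq))
    return -total


def anneal(data, k=2, steps=2000, t0=2.0, cooling=0.995, seed=0):
    """Simulated annealing over sample-to-cluster assignments (needs len(data) >= k >= 2)."""
    rng = random.Random(seed)
    labelled = [[discretize(v) for v in sample] for sample in data]
    n = len(labelled)
    # Random initial partition in which every cluster is non-empty.
    assignment = [i % k for i in range(n)]
    rng.shuffle(assignment)
    current = energy(labelled, assignment, k)
    best, best_energy = assignment[:], current
    t = t0
    for _ in range(steps):
        s = rng.randrange(n)
        old = assignment[s]
        if assignment.count(old) > 1:  # never empty a cluster
            assignment[s] = rng.choice([c for c in range(k) if c != old])
            candidate = energy(labelled, assignment, k)
            delta = candidate - current
            # Accept improvements always, worse moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = candidate
                if current < best_energy:
                    best, best_energy = assignment[:], current
            else:
                assignment[s] = old  # reject the move
        t *= cooling  # geometric cooling schedule
    return best, best_energy
```

Calling `anneal(data)` on a list of samples, each a list of expression values, returns the best assignment visited and its energy. Fixing `k` keeps the toy objective from rewarding singleton clusters; a scoring function that penalizes small clusters would be needed to search over the number of clusters as well.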


simulated annealing, sample-based clustering, discriminant fuzzy pattern, microarray data





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Daniel Glez-Peña (1)
  • Fernando Díaz (2)
  • José R. Méndez (1)
  • Juan M. Corchado (3)
  • Florentino Fdez-Riverola (1)

  1. ESEI: Escuela Superior de Ingeniería Informática, University of Vigo, Edificio Politécnico, Ourense, Spain
  2. Dept. Informática, University of Valladolid, Escuela Universitaria de Informática, Segovia, Spain
  3. Dept. Informática y Automática, University of Salamanca, Salamanca, Spain
