An Experimental Evaluation of a Novel Stochastic Method for Iterative Class Discovery on Real Microarray Datasets

  • Héctor Gómez
  • Daniel Glez-Peña
  • Miguel Reboiro-Jato
  • Reyes Pavón
  • Fernando Díaz
  • Florentino Fdez-Riverola
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 74)

Abstract

A gene expression matrix usually contains samples from several macroscopic phenotypes related to diseases or drug effects, such as diseased, normal, or drug-treated samples. The goal of sample-based clustering is to find the phenotype structure of these samples. This work evaluates, on publicly available datasets, a novel method for automatically discovering clusters of samples that are coherent from a genetic point of view. Each candidate cluster is characterized by a fuzzy pattern, which maintains a fuzzy discretization of the relevant gene expression values. Candidate clusters are constructed at random and then iteratively refined by a probabilistic search and an optimization scheme.
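The idea of fuzzy patterns refined by a stochastic search can be illustrated with a small sketch. The abstract does not give the membership functions, the objective function, or the search parameters, so everything below is an assumption made for illustration: triangular low/medium/high memberships, a pattern that keeps a gene when at least 80% of a cluster's samples share its label, a size-weighted coherence score, and a basic simulated annealing loop (the technique named in the keywords). All function names are hypothetical and this is not the authors' implementation.

```python
import math
import random

LABELS = ("low", "medium", "high")

def fuzzy_label(value, lo, hi):
    """Label with the highest triangular membership (assumed membership shapes)."""
    if hi == lo:
        return "medium"
    x = (value - lo) / (hi - lo)  # rescale the expression value to [0, 1]
    memberships = {
        "low": max(0.0, 1.0 - 2.0 * x),
        "medium": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
        "high": max(0.0, 2.0 * x - 1.0),
    }
    return max(LABELS, key=lambda lab: memberships[lab])

def discretize(samples):
    """Turn a samples-by-genes matrix of numbers into a matrix of fuzzy labels."""
    bounds = [(min(col), max(col)) for col in zip(*samples)]
    return [[fuzzy_label(v, lo, hi) for v, (lo, hi) in zip(row, bounds)]
            for row in samples]

def fuzzy_pattern(labelled, members, threshold=0.8):
    """Genes whose label is shared by at least `threshold` of the member samples."""
    pattern = {}
    if not members:
        return pattern
    for g in range(len(labelled[0])):
        column = [labelled[i][g] for i in members]
        best = max(LABELS, key=column.count)
        if column.count(best) / len(members) >= threshold:
            pattern[g] = best
    return pattern

def coherence(labelled, assign, k):
    """Assumed objective: cluster size times pattern length, summed over clusters."""
    total = 0
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        total += len(members) * len(fuzzy_pattern(labelled, members))
    return total

def anneal(labelled, k=2, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over sample-to-cluster assignments (maximizes coherence)."""
    rng = random.Random(seed)
    n = len(labelled)
    assign = [rng.randrange(k) for _ in range(n)]  # random initial clusters
    current = coherence(labelled, assign, k)
    best, best_score = assign[:], current
    t = t0
    for _ in range(steps):
        i = rng.randrange(n)
        old = assign[i]
        assign[i] = rng.randrange(k)  # propose moving one sample
        new = coherence(labelled, assign, k)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if new >= current or rng.random() < math.exp((new - current) / t):
            current = new
            if current > best_score:
                best, best_score = assign[:], current
        else:
            assign[i] = old  # reject the move
        t = max(t * cooling, 1e-6)  # geometric cooling with a floor
    return best, best_score

if __name__ == "__main__":
    # Toy matrix: 6 samples x 4 genes, two made-up phenotypes.
    toy = [
        [1.0, 1.2, 9.0, 8.8], [0.8, 1.0, 9.5, 9.1], [1.1, 0.9, 8.7, 9.4],
        [9.2, 8.9, 1.0, 0.7], [8.8, 9.3, 1.3, 1.1], [9.5, 9.0, 0.9, 1.0],
    ]
    print(anneal(discretize(toy), k=2))
```

The score used here rewards only within-cluster coherence; a realistic objective would presumably also reward patterns that discriminate between clusters, which the sketch leaves out for brevity.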

Keywords

microarray data; fuzzy discretization; gene selection; fuzzy pattern; class discovery; simulated annealing

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Héctor Gómez (1)
  • Daniel Glez-Peña (1)
  • Miguel Reboiro-Jato (1)
  • Reyes Pavón (1)
  • Fernando Díaz (2)
  • Florentino Fdez-Riverola (1)

  1. ESEI: Escuela Superior de Ingeniería Informática, University of Vigo, Edificio Politécnico, Ourense, Spain
  2. EUI: Escuela Universitaria de Informática, University of Valladolid, Segovia, Spain
