Iterative Clustering Analysis for Grouping Missing Data in Gene Expression Profiles

  • Dae-Won Kim
  • Bo-Yeong Kang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)


Clustering is a popular technique for finding groups of genes that show similar expression patterns under multiple experimental conditions. Because a clustering method requires a complete data matrix as input, missing values must be estimated with an imputation method in the preprocessing step. A common limitation of this conventional approach is that once the estimates of the missing values are fixed in preprocessing, they are not changed during the subsequent clustering process. Badly estimated missing values obtained in preprocessing are likely to degrade the quality and reliability of the clustering results. Thus, a new clustering method is required that improves the estimates of the missing values during the iterative clustering process.
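The general idea of revising imputed values while clustering, rather than fixing them beforehand, can be sketched as follows. This is only an illustrative sketch and not the authors' algorithm, which the abstract does not specify: it assumes a plain k-means loop, a distance computed over observed coordinates only, and re-imputation from the assigned cluster centroid. The function name `cluster_and_reimpute` and all its parameters are hypothetical, and the sketch assumes every row and every column contains at least one observed value.

```python
import numpy as np

def cluster_and_reimpute(X, k, n_iter=20, seed=0):
    """Toy k-means that refreshes missing entries (NaN) on every iteration.

    Hypothetical illustration only; not the method proposed in the paper.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    n_rows, n_cols = X.shape

    # Preprocessing-style first guess: fill each gap with its column mean.
    filled = np.where(observed, X, np.nanmean(X, axis=0))

    # Start from k distinct rows of the filled matrix.
    rng = np.random.default_rng(seed)
    centroids = filled[rng.choice(n_rows, size=k, replace=False)].copy()
    labels = np.zeros(n_rows, dtype=int)

    for _ in range(n_iter):
        # Squared distance over observed coordinates only, rescaled by the
        # fraction of coordinates that were observed in each row.
        diff = np.where(observed[:, None, :],
                        filled[:, None, :] - centroids[None, :, :], 0.0)
        dists = (diff ** 2).sum(axis=2) * (n_cols / observed.sum(axis=1))[:, None]
        labels = dists.argmin(axis=1)

        # Update each centroid from the observed values of its members;
        # a coordinate with no observed member keeps its previous value.
        for j in range(k):
            in_cluster = (labels == j)[:, None] & observed
            counts = in_cluster.sum(axis=0)
            sums = np.where(in_cluster, X, 0.0).sum(axis=0)
            has_data = counts > 0
            centroids[j, has_data] = sums[has_data] / counts[has_data]

        # Key step: replace every missing entry with the value of the
        # centroid its row is currently assigned to.
        filled = np.where(observed, X, centroids[labels])

    return filled, labels, centroids
```

In this sketch the column-mean estimate plays the role of the fixed preprocessing imputation, and the last line of the loop is what lets a poor initial estimate be corrected as the cluster structure becomes clearer.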


Keywords: Cluster Method, Cluster Result, Cluster Performance, Imputation Method, Cluster Centroid
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Dae-Won Kim (1)
  • Bo-Yeong Kang (2)
  1. School of Computer Science and Engineering, Chung-Ang University, Seoul, Korea
  2. Center of Healthcare Ontology R&D, Seoul National University, Seoul, Korea
