Iterative Clustering Analysis for Grouping Missing Data in Gene Expression Profiles

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3918)

Abstract

Clustering is a popular technique for finding groups of genes that show similar expression patterns under multiple experimental conditions. Because a clustering method requires a complete data matrix as input, missing values must first be estimated with an imputation method in a preprocessing step. A common limitation of this conventional approach is that once the estimates of the missing values are fixed during preprocessing, they are not changed during the subsequent clustering process. Badly estimated missing values obtained in preprocessing are therefore likely to degrade the quality and reliability of the clustering results. Thus, a new clustering method is required that improves the estimates of the missing values during the iterative clustering process.
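The idea of refining missing-value estimates inside the clustering loop can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it is a plain k-means variant, assumed here for illustration, in which each missing entry is re-estimated from the centroid of the cluster its gene currently belongs to. The function name `iterative_cluster_impute` and its parameters are hypothetical.

```python
import numpy as np

def iterative_cluster_impute(X, k, n_iter=20, seed=0):
    """Illustrative k-means variant that re-estimates missing entries
    (NaN) on every iteration instead of fixing them in preprocessing.
    Assumption: rows are genes, columns are experimental conditions."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initial estimate: per-condition mean over the observed entries.
    col_mean = np.nanmean(X, axis=0)
    filled = np.where(missing, col_mean, X)

    # Initialise centroids from k distinct rows of the filled matrix.
    centroids = filled[rng.choice(len(filled), size=k, replace=False)].copy()
    labels = np.zeros(len(filled), dtype=int)

    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        dists = ((filled[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)

        # Update step: recompute each non-empty cluster's centroid.
        for j in range(k):
            members = filled[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)

        # Re-imputation step: missing entries take the value of the
        # gene's current cluster centroid; observed entries are kept.
        filled = np.where(missing, centroids[labels], X)

    return labels, filled
```

Calling `iterative_cluster_impute(X, k=2)` on a matrix with `NaN` entries returns the cluster labels together with the completed matrix. In contrast to a one-off imputation, a poor initial estimate can be corrected in later iterations once the gene settles into a cluster.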

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, DW., Kang, BY. (2006). Iterative Clustering Analysis for Grouping Missing Data in Gene Expression Profiles. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_17

  • DOI: https://doi.org/10.1007/11731139_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33206-0

  • Online ISBN: 978-3-540-33207-7

  • eBook Packages: Computer Science, Computer Science (R0)