Abstract
Clustering is a popular technique for finding groups of genes that show similar expression patterns under multiple experimental conditions. Because a clustering method requires a complete data matrix as input, missing values must first be estimated with an imputation method in a preprocessing step. A common limitation of this conventional approach is that once the estimates of the missing values are fixed during preprocessing, they are not changed during the subsequent clustering process. Badly estimated missing values obtained in preprocessing are likely to degrade the quality and reliability of the clustering results. Thus, a new clustering method is required that improves the estimates of the missing values during the iterative clustering process.
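To make the idea of refining missing values during clustering concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a plain k-means-style loop in which missing entries (marked `None`) start at the column mean, as a conventional preprocessing step would leave them, and are then re-estimated from the centroid of the gene's current cluster on every iteration. The function name `cluster_with_imputation`, the `init` parameter, and the toy data are hypothetical choices made for illustration.

```python
import math


def cluster_with_imputation(rows, init, n_iter=20):
    """k-means-style clustering that re-estimates missing entries (None).

    rows: list of equal-length lists; None marks a missing value.
    init: indices of the rows used as initial centroids (hypothetical
          parameter, chosen so the sketch is deterministic).
    """
    n_cols = len(rows[0])
    k = len(init)
    missing = [[v is None for v in row] for row in rows]

    # Preprocessing-style first guess: fill each gap with its column mean.
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    data = [[col_means[j] if row[j] is None else float(row[j])
             for j in range(n_cols)] for row in rows]

    centroids = [list(data[i]) for i in init]
    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assignment step: nearest centroid, using the current estimates.
        labels = [min(range(k), key=lambda c: math.dist(row, centroids[c]))
                  for row in data]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [data[i] for i in range(len(data)) if labels[i] == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
        # Re-imputation step: overwrite only the originally missing entries
        # with the value of the centroid the row is currently assigned to.
        for i, row in enumerate(data):
            for j in range(n_cols):
                if missing[i][j]:
                    row[j] = centroids[labels[i]][j]
    return labels, data


# Toy expression matrix: two well-separated groups, one gap in each.
rows = [[1.0, 1.0], [1.2, None], [0.9, 1.1],
        [9.0, 9.0], [9.1, None], [8.8, 9.2]]
labels, filled = cluster_with_imputation(rows, init=[0, 3])
```

In this toy case both gaps start at the same column mean, which lies between the two groups, and each is then pulled toward the values observed in its own cluster as the iterations proceed. A fixed preprocessing estimate would have left both at the column mean. Note that this simple scheme can reinforce an early wrong assignment, so the choice of initial centroids matters.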
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, DW., Kang, BY. (2006). Iterative Clustering Analysis for Grouping Missing Data in Gene Expression Profiles. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_17
DOI: https://doi.org/10.1007/11731139_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33206-0
Online ISBN: 978-3-540-33207-7
eBook Packages: Computer Science, Computer Science (R0)