Abstract
As gene sequencing technology has matured, the genomes and genetic sequences of more and more organisms have become available in international repositories. However, the biological functions of most of these genes remain unknown. Microarray technology opens a door to discovering information about the underlying genes and is widely used in basic research, target discovery, biomarker determination, pharmacology, target selectivity, development of prognostic tests, and disease-subclass determination. Clustering is one of the typical methods for analyzing microarray data. By clustering gene microarray expression data into categories with similar profiles, attention can be focused on genes with similar functions. Many clustering methods have been used to analyze gene microarray data. However, they usually suffer from shortcomings such as sensitivity to the initial input, inappropriate grouping, and difficulty in discovering natural or near-optimal clusters. In this paper, we propose a novel clustering method that discovers the optimal clusters by searching for PPVs (Pairs of Prototype Vectors). The experimental results show that our method works very well.
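The sensitivity to initial input mentioned above can be illustrated with a minimal sketch. It uses plain k-means, not the PPV method proposed in the paper, which the abstract does not describe. The four two-condition "expression profiles" and both sets of starting centers are made-up toy values, and the names `kmeans` and `wcss` are hypothetical helpers chosen for this illustration.

```python
from typing import List, Sequence, Tuple

Profile = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(members: List[Profile]) -> Profile:
    """Component-wise mean of a non-empty list of profiles."""
    return tuple(sum(col) / len(members) for col in zip(*members))


def kmeans(profiles: List[Profile], init_centers: List[Profile],
           max_iter: int = 100) -> List[int]:
    """Plain k-means started from the given centers; returns cluster labels."""
    centers = list(init_centers)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assign each profile to its nearest current center.
        new_labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in profiles
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Move each center to the mean of the profiles assigned to it.
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centers[k] = centroid(members)
    return labels


def wcss(profiles: List[Profile], labels: List[int]) -> float:
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    total = 0.0
    for k in set(labels):
        members = [p for p, lab in zip(profiles, labels) if lab == k]
        c = centroid(members)
        total += sum(sq_dist(p, c) for p in members)
    return total


# Toy data (assumed): four genes, each measured under two conditions.
genes = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Two different starting points for the same algorithm and the same data.
good = kmeans(genes, [(0.0, 0.5), (4.0, 0.5)])
poor = kmeans(genes, [(2.0, 0.0), (2.0, 1.0)])

print(good, wcss(genes, good))  # [0, 0, 1, 1] 1.0
print(poor, wcss(genes, poor))  # [0, 1, 0, 1] 16.0
```

Both runs converge, but the second start settles on a grouping whose within-cluster scatter is sixteen times larger, which is the kind of non-optimal clustering the abstract refers to.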
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luo, F., Liu, J. (2006). A Novel Clustering Method for Analysis of Gene Microarray Expression Data. In: Li, J., Yang, Q., Tan, AH. (eds) Data Mining for Biomedical Applications. BioDM 2006. Lecture Notes in Computer Science, vol 3916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11691730_8
DOI: https://doi.org/10.1007/11691730_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33104-9
Online ISBN: 978-3-540-33105-6
eBook Packages: Computer Science, Computer Science (R0)