A Novel Clustering Method for Analysis of Gene Microarray Expression Data

  • Fei Luo
  • Juan Liu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3916)


As gene sequencing technology has matured, the genomes and genetic sequences of more and more organisms have become available in international repositories. However, the biological functions of most of these genes remain unknown. Microarray technology opens a door to discovering information about the underlying genes and is widely used in basic research and target discovery, biomarker determination, pharmacology, target selectivity, the development of prognostic tests, and disease-subclass determination. Clustering is one of the typical methods for analyzing microarray data. By clustering gene microarray expression data into categories with similar profiles, attention can be focused on genes with similar function. Many clustering methods have been used to analyze gene microarray data. However, they usually suffer from shortcomings such as sensitivity to initial input, inappropriate grouping, and difficulty in discovering natural or near-optimal clusters. In this paper, we propose a novel clustering method that discovers the optimal clusters by searching for PPVs (Pairs of Prototype Vectors). The experimental results show that our method works very well.


Keywords: Optimal Cluster; Curvature Distance; Natural Cluster; Gene Expression Data Analysis; Prototype Vector
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Fei Luo (1)
  • Juan Liu (1)
  1. School of Computer Science, Wuhan University, Wuhan, China
