Gene Selection by Cooperative Competition Clustering

  • Shun Pei
  • De-Shuang Huang
  • Kang Li
  • George W. Irwin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4115)


Clustering analysis of data from DNA microarray hybridization studies is an essential task for identifying biologically relevant groups of genes. The attribute clustering algorithm (ACA) provides an attractive way to group and select meaningful genes. However, ACA needs considerable prior knowledge about the genes in order to set the number of clusters. In practical applications, if the number of clusters is misspecified, the performance of ACA deteriorates rapidly, and specifying it correctly is very demanding because so little is known about the genes in advance. In this paper we propose the Cooperative Competition Cluster Algorithm (CCCA). The algorithm assumes that cooperation and competition exist simultaneously between clusters during the clustering process. By using this principle of cooperative competition, the number of clusters can be determined during clustering itself. Experimental results on a synthetic data set and on gene expression data are presented. The results show that CCCA can choose the number of clusters automatically and performs well compared with competing methods.
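As a rough illustration of the kind of gene-to-gene similarity that attribute clustering relies on, the sketch below computes mutual information normalized by joint entropy for two discretized expression profiles. This is an assumption about the general family of measure, not the authors' CCCA implementation; the function names and the toy profiles are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(counts, n):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def interdependence_redundancy(x, y):
    """Mutual information I(X;Y) divided by joint entropy H(X,Y).

    x and y are equal-length sequences of discretized expression levels.
    The result lies in [0, 1]: 0 for independent profiles, 1 when each
    profile fully determines the other.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    h_x = entropy(Counter(x).values(), n)
    h_y = entropy(Counter(y).values(), n)
    h_xy = entropy(Counter(zip(x, y)).values(), n)
    if h_xy == 0.0:
        # Both profiles are constant, so there is no information to share.
        return 0.0
    mutual_information = h_x + h_y - h_xy
    return mutual_information / h_xy


# Hypothetical discretized profiles (0 = low, 1 = medium, 2 = high).
gene_a = [0, 0, 1, 1, 2, 2, 0, 1]
gene_b = [1, 1, 2, 2, 0, 0, 1, 2]  # a relabeling of gene_a
gene_c = [0, 0, 1, 1]
gene_d = [0, 1, 0, 1]              # statistically independent of gene_c

print(interdependence_redundancy(gene_a, gene_b))
print(interdependence_redundancy(gene_c, gene_d))
```

A clustering procedure of this type would group genes whose pairwise scores are high; how CCCA uses cooperation and competition between clusters to settle on their number is described in the paper itself and is not reproduced here.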


Keywords: Mutual Information; Gene Expression Data; Gene Selection; Cluster Quality; True Cluster
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Shun Pei (1, 2)
  • De-Shuang Huang (1)
  • Kang Li (3)
  • George W. Irwin (3)
  1. Intelligent Computing Lab, Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui, China
  2. Department of Automation, University of Science and Technology of China
  3. School of Electrical & Electronic Engineering, Queen’s University Belfast
