Ant-MST: An Ant-Based Minimum Spanning Tree for Gene Expression Data Clustering

  • Deyu Zhou
  • Yulan He
  • Chee Keong Kwoh
  • Hao Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4774)


We previously proposed an ant-based clustering algorithm for document clustering based on the travelling salesperson scenario. In this paper, we present an approach called Ant-MST for gene expression data clustering that combines ant-based clustering with minimum spanning trees (MST). The ant-based clustering algorithm is first used to construct a fully connected network of nodes. Each node represents one gene, and every edge carries a pheromone intensity that describes the co-expression level between the two genes it connects. An MST is then used to break linkages in order to generate clusters. Compared with other MST-based clustering approaches, our method measures the similarity between two genes by pheromone intensity instead of Euclidean distance or correlation distance. The pheromone intensities on the edges of the fully connected network record the collective memory of the ants, so self-organizing behavior can be readily discovered from them. Experimental results on three gene expression datasets show that our approach in general outperforms classical clustering methods such as K-means and agglomerative hierarchical clustering.
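The tree-cutting step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the pheromone intensities are already available as a symmetric matrix (how the ants deposit pheromone is not described here), it builds the spanning tree that keeps the strongest pheromone links, which is the same as an MST over negated pheromone, and it assumes clusters are formed by cutting the `k - 1` weakest tree edges. The function name `mst_clusters`, the parameter `k`, and the toy matrix are all hypothetical.

```python
from typing import Dict, List, Tuple


def mst_clusters(pheromone: List[List[float]], k: int) -> List[int]:
    """Cut a pheromone-based spanning tree into k clusters (illustrative only)."""
    n = len(pheromone)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of genes")

    # Prim's algorithm, always attaching the gene with the strongest
    # pheromone link to the tree (an MST over cost = -pheromone).
    in_tree = [False] * n
    best = [float("-inf")] * n      # strongest link from each gene into the tree
    parent = [-1] * n
    best[0] = float("inf")          # start from gene 0
    edges: List[Tuple[float, int, int]] = []
    for _ in range(n):
        u = max((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((pheromone[u][parent[u]], u, parent[u]))
        for v in range(n):
            if not in_tree[v] and pheromone[u][v] > best[v]:
                best[v] = pheromone[u][v]
                parent[v] = u

    # Assumed cutting rule: drop the k - 1 weakest tree edges.
    edges.sort(reverse=True)
    kept = edges[: n - k]

    # Label the remaining connected components with union-find.
    root = list(range(n))

    def find(x: int) -> int:
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)

    labels: Dict[int, int] = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Toy pheromone matrix (made up): genes 0-2 and genes 3-4 are strongly linked.
toy = [
    [0.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 0.0, 0.7, 0.1, 0.1],
    [0.8, 0.7, 0.0, 0.3, 0.1],
    [0.1, 0.1, 0.3, 0.0, 0.95],
    [0.2, 0.1, 0.1, 0.95, 0.0],
]
print(mst_clusters(toy, 2))  # [0, 0, 0, 1, 1]
```

In this toy case the only weak tree edge is the one joining gene 2 to gene 3, so cutting it separates the two groups. Other cutting criteria, such as a pheromone threshold, would fit the same structure.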


gene expression data clustering; ant-based clustering; minimum spanning tree



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Deyu Zhou (1)
  • Yulan He (1)
  • Chee Keong Kwoh (1)
  • Hao Wang (1)
  1. School of Computer Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798
