A Novel Ant-Based Clustering Approach for Document Clustering

  • Yulan He
  • Siu Cheung Hui
  • Yongxiang Sim
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)


Recently, many approaches have been proposed that use nature-inspired algorithms to perform complex machine learning tasks. Ant Colony Optimization (ACO) is one such algorithm: it is based on swarm intelligence and derived from a model of the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper proposes a novel document clustering approach based on ACO. Other ACO-based clustering approaches share the same scenario, in which ants move around a 2D grid and carry or drop objects to perform categorization; the proposed ant-based clustering approach does not rely on a 2D grid structure. In addition, it can generate an optimal number of clusters without incorporating any other algorithm such as K-means or Agglomerative Hierarchical Clustering (AHC). Experimental results on subsets of the 20 Newsgroups data show that the ant-based clustering approach outperforms classical document clustering methods such as K-means and AHC. It also achieves better results than those obtained using the Artificial Immune Network algorithm when tested on the same datasets.
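The abstract does not spell out the algorithm, so the following is only a minimal sketch of the generic Ant System ingredients it alludes to (a pheromone trail, a tabu list of visited items, and a probabilistic transition rule), applied to documents rather than cities. It is not the authors' method. The use of cosine similarity over term counts as the heuristic, the parameters `alpha` and `beta`, and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def transition_probs(current, tabu, pheromone, similarity, alpha=1.0, beta=2.0):
    """Ant System style rule (assumed here, not taken from the paper):
    p(j) is proportional to pheromone[current][j]**alpha * similarity[current][j]**beta
    over the documents j that are not yet in the tabu set."""
    weights = {
        j: (pheromone[current][j] ** alpha) * (similarity[current][j] ** beta)
        for j in range(len(similarity))
        if j not in tabu
    }
    total = sum(weights.values())
    if total == 0:
        # No unvisited document shares a term with the current one: choose uniformly.
        return {j: 1.0 / len(weights) for j in weights}
    return {j: w / total for j, w in weights.items()}


def build_tour(start, pheromone, similarity, rng):
    """One ant visits every document once; the tabu list records the visiting order."""
    tabu = [start]
    while len(tabu) < len(similarity):
        probs = transition_probs(tabu[-1], set(tabu), pheromone, similarity)
        candidates = list(probs)
        chosen = rng.choices(candidates, weights=[probs[j] for j in candidates])[0]
        tabu.append(chosen)
    return tabu


# Hypothetical toy corpus: two ant-related and two market-related documents.
docs = [
    "ant colony pheromone trail",
    "ant pheromone foraging",
    "stock market price",
    "market price trading",
]
vectors = [Counter(d.split()) for d in docs]
similarity = [[cosine(u, v) for v in vectors] for u in vectors]
pheromone = [[1.0] * len(docs) for _ in docs]  # uniform initial trail

tour = build_tour(0, pheromone, similarity, random.Random(0))
```

In this sketch an ant tends to step between similar documents, so consecutive entries in `tour` are candidates for the same cluster. How the trail is reinforced and how cluster boundaries and the number of clusters are derived is left open here, since the abstract gives no details.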


Travel Salesman Problem, Tabu List, Agglomerate Hierarchical Cluster, Document Cluster, Pheromone Trail





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yulan He (1)
  • Siu Cheung Hui (1)
  • Yongxiang Sim (1)
  1. School of Computer Engineering, Nanyang Technological University, Singapore
