RATC: A Robust Automated Tag Clustering Technique

  • Ludovico Boratto
  • Salvatore Carta
  • Eloisa Vargiu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5692)


Today, the dominant web information sources are developed according to the collaborative-web paradigm, also known as Web 2.0, which changes the way users interact with the web. Users (also called prosumers) are no longer passive consumers of published content; they become involved, implicitly and explicitly, cooperating by providing their own resources in an “architecture of participation”. In this scenario, collaborative tagging, i.e., the process of classifying shared resources by means of keywords, is becoming increasingly popular. The main problem in this task stems from well-known linguistic phenomena, such as polysemy and synonymy, which make effective content retrieval harder. This paper presents an approach that monitors users' activity in a tagging system and dynamically quantifies associations among tags. The associations are then used to create tag clusters. Experiments compare the proposed approach with a state-of-the-art tag clustering technique. Results, given in terms of classical precision and recall, show that the approach is quite effective in the presence of strongly related tags in a cluster.
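The abstract does not specify how associations are quantified or how clusters are formed, so the following is only a minimal sketch of the general idea under stated assumptions, not the RATC algorithm itself. It assumes that an association between two tags is a simple count of the user sessions in which both appear, and that clusters are the connected components of the graph obtained by keeping associations at or above a threshold. The function names, the `min_count` parameter, and the sample sessions are all hypothetical; only precision and recall are taken directly from the text.

```python
from collections import Counter
from itertools import combinations


def tag_associations(sessions):
    """Count how often each unordered pair of tags appears in the same session.

    Assumption: co-occurrence counting stands in for the paper's
    (unspecified) association measure.
    """
    counts = Counter()
    for tags in sessions:
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts


def cluster_tags(counts, min_count):
    """Group tags linked by associations of at least min_count.

    Assumption: clusters are connected components of the thresholded
    association graph; the paper may use a different clustering step.
    """
    neighbours = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            tag = stack.pop()
            if tag not in component:
                component.add(tag)
                stack.extend(neighbours[tag] - component)
        seen |= component
        clusters.append(component)
    return clusters


def precision_recall(cluster, relevant):
    """Classical precision and recall of one cluster against a reference set."""
    hits = len(cluster & relevant)
    precision = hits / len(cluster) if cluster else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sessions: each list holds the tags one user employed together.
sessions = [
    ["nyc", "newyork", "manhattan"],
    ["nyc", "newyork"],
    ["python", "programming"],
    ["python", "programming", "nyc"],
]
counts = tag_associations(sessions)
clusters = cluster_tags(counts, min_count=2)
```

With these sample sessions, only the pairs that co-occur at least twice survive the threshold, so weakly associated tags such as "manhattan" are left out of every cluster. Evaluating a cluster against a hand-built reference set then shows the usual trade-off: a strict threshold tends to favour precision at the cost of recall.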





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ludovico Boratto (1)
  • Salvatore Carta (1)
  • Eloisa Vargiu (2)
  1. di Matematica e Informatica, Università di Cagliari, Italy
  2. di Ingegneria Elettrica ed Elettronica, Università di Cagliari, Italy
