RATC: A Robust Automated Tag Clustering Technique

Boratto, Ludovico; Carta, Salvatore; Vargiu, Eloisa

doi:10.1007/978-3-642-03964-5_30

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5692)

Included in the following conference series:

International Conference on Electronic Commerce and Web Technologies

Abstract

Nowadays, the most dominant and noteworthy web information sources are developed according to the collaborative-web paradigm, also known as Web 2.0, which represents a novel paradigm in the way users interact with the web. Users (also called prosumers) are no longer passive consumers of published content, but become involved, implicitly and explicitly, as they cooperate by providing their own resources in an “architecture of participation”. In this scenario, collaborative tagging, i.e., the process of classifying shared resources by using keywords, is becoming more and more popular. The main problem in such a task is related to well-known linguistic phenomena, such as polysemy and synonymy, which make effective content retrieval harder. In this paper, an approach is presented that monitors users' activity in a tagging system and dynamically quantifies associations among tags. The associations are then used to create tag clusters. Experiments are performed comparing the proposed approach with a state-of-the-art tag clustering technique. Results, given in terms of classical precision and recall, show that the approach is quite effective in the presence of strongly related tags in a cluster.
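
To make the pipeline described in the abstract more concrete, the following is a minimal Python sketch of the general idea: quantify associations among tags, group strongly associated tags into clusters, and score a cluster with precision and recall. It is not the RATC algorithm itself, whose details are not given in the abstract. The input format (one list of tags per observed user action), the Jaccard-style association weight, the threshold-based connected-components clustering, and all function names are assumptions chosen purely for illustration.

```python
from collections import defaultdict
from itertools import combinations


def tag_associations(events):
    """Weight each tag pair by a Jaccard-style co-occurrence score.

    `events` is a hypothetical input: one list of tags per observed
    user action (for example, the tags used together in one search).
    """
    tag_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for tags in events:
        unique = sorted(set(tags))
        for tag in unique:
            tag_counts[tag] += 1
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    # Assumed weighting: co-occurrences divided by the number of events
    # that contain at least one of the two tags.
    return {
        (a, b): count / (tag_counts[a] + tag_counts[b] - count)
        for (a, b), count in pair_counts.items()
    }


def cluster_tags(assoc, threshold):
    """Group tags joined by associations at or above `threshold`.

    Uses union-find to compute connected components; this is a simple
    stand-in for a real clustering technique. Tags that never co-occur
    with another tag do not appear in `assoc` and are therefore omitted.
    """
    parent = {}

    def find(tag):
        parent.setdefault(tag, tag)
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]  # path compression
            tag = parent[tag]
        return tag

    for (a, b), weight in assoc.items():
        root_a, root_b = find(a), find(b)
        if weight >= threshold:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for tag in list(parent):
        groups[find(tag)].add(tag)
    return sorted(sorted(group) for group in groups.values())


def precision_recall(cluster, relevant):
    """Classical precision and recall of a cluster against a reference set."""
    cluster, relevant = set(cluster), set(relevant)
    hits = len(cluster & relevant)
    return hits / len(cluster), hits / len(relevant)


if __name__ == "__main__":
    events = [
        ["nyc", "newyork"],
        ["nyc", "newyork", "travel"],
        ["python", "code"],
        ["python", "code"],
        ["travel", "python"],
    ]
    clusters = cluster_tags(tag_associations(events), threshold=0.5)
    print(clusters)
    print(precision_recall(clusters[1], ["nyc", "newyork", "manhattan"]))
```

In this toy data, synonymous tags such as `nyc` and `newyork` end up in the same cluster because they are almost always used together, while a weakly associated tag such as `travel` stays on its own. The threshold is a free parameter in this sketch; a real system would need to tune it or replace the clustering step entirely.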

Author information

Authors and Affiliations

Dip.to di Matematica e Informatica, Università di Cagliari, Italy
Ludovico Boratto & Salvatore Carta
Dip.to di Ingegneria Elettrica ed Elettronica, Università di Cagliari, Italy
Eloisa Vargiu

Editor information

Editors and Affiliations

Electrical and Electronics Engineering Department, Technical University of Bari, Via E. Orabona 4, 70125, Bari, Italy
Tommaso Di Noia
Department DIMET, University of Reggio Calabria, Via Graziella, loc. Feo di Vito, 89122, Reggio Calabria, Italy
Francesco Buccafurri

About this paper

Cite this paper

Boratto, L., Carta, S., Vargiu, E. (2009). RATC: A Robust Automated Tag Clustering Technique. In: Di Noia, T., Buccafurri, F. (eds) E-Commerce and Web Technologies. EC-Web 2009. Lecture Notes in Computer Science, vol 5692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03964-5_30

DOI: https://doi.org/10.1007/978-3-642-03964-5_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03963-8
Online ISBN: 978-3-642-03964-5
eBook Packages: Computer Science, Computer Science (R0)
