Semantic Grounding of Tag Relatedness in Social Bookmarking Systems

  • Ciro Cattuto
  • Dominik Benz
  • Andreas Hotho
  • Gerd Stumme
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5318)

Abstract

Collaborative tagging systems have become important data sources for populating semantic web applications. For tasks like synonym detection and discovery of concept hierarchies, many researchers have introduced measures of tag similarity. Even though most of these measures appear very natural, their design often seems rather ad hoc, and their underlying assumptions about the notion of similarity are not made explicit. A more systematic characterization and validation of tag similarity in terms of formal representations of knowledge is still lacking. Here we address this issue and analyze several measures of tag similarity. Each measure is computed on data from the social bookmarking system del.icio.us, and a semantic grounding is provided by mapping pairs of similar tags in the folksonomy to pairs of synsets in WordNet, where we use validated measures of semantic distance to characterize the semantic relation between the mapped tags. This exposes important features of the investigated similarity measures and indicates which ones are better suited to a given semantic application.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Ciro Cattuto (1)
  • Dominik Benz (2)
  • Andreas Hotho (2)
  • Gerd Stumme (2)

  1. Complex Networks Lagrange Laboratory, Institute for Scientific Interchange (ISI) Foundation, Torino, Italy
  2. Knowledge & Data Engineering Group, University of Kassel, Kassel, Germany
