Improving the Exploration of Tag Spaces Using Automated Tag Clustering

  • Joni Radelaar
  • Aart-Jan Boor
  • Damir Vandic
  • Jan-Willem van Dam
  • Frederik Hogenboom
  • Flavius Frasincar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6757)


Due to the increasing popularity of tagging, it is important to overcome the challenges that result from its free nature, such as the use of synonyms, homonyms, and syntactic variations. The Semantic Tag Clustering Search (STCS) framework addresses these challenges by detecting syntactic variations of tags and by clustering semantically related tags. We evaluate the framework on Flickr data from 2009 and compare it to two previously introduced tag clustering techniques. We conclude that our framework achieves significantly better cluster precision than one method and better average precision than the other.


Keywords: Tagging, syntactic clustering, semantic clustering, tag disambiguation



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Joni Radelaar
    • 1
  • Aart-Jan Boor
    • 1
  • Damir Vandic
    • 1
  • Jan-Willem van Dam
    • 1
  • Frederik Hogenboom
    • 1
  • Flavius Frasincar
    • 1
  1. Erasmus University Rotterdam, Rotterdam, The Netherlands
