Journal of Intelligent Information Systems

Volume 38, Issue 3, pp 685–708

In & out zooming on time-aware user/tag clusters

  • Eirini Giannakidou
  • Vassiliki Koutsonikola
  • Athena Vakali
  • Ioannis Kompatsiaris


The common ground behind most approaches that analyze social tagging systems is addressing the information challenge that emerges from the massive activity of millions of users who interact and share resources and/or metadata online. However, the lack of any time-related data in the analysis process implicitly ignores much of the dynamic nature of social tagging activity. In this paper we claim that incorporating a temporal dimension allows for tracking users’ interests at macroscopic and microscopic levels, detecting emerging trends, and recognizing events. To this end, we propose a time-aware co-clustering approach for extracting semantic and temporal patterns from the tagging activity. The resulting clusters contain both users and tags with similar patterns over time, and reveal non-obvious or “hidden” relations among users and topics of their common interest. Zoom in & out views serve as visualization methods for different aspects of the clusters’ structure, in order to evaluate the efficiency of the approach.
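As a rough illustration of what "similar patterns over time" could mean, the sketch below bins tagging events by time and compares user and tag activity profiles with cosine similarity. It is not the paper's co-clustering algorithm: it covers only the temporal side (no semantic similarity between tags), and every name in it (`temporal_profiles`, `cosine`, `closest_tag`) as well as the toy events are assumptions made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def temporal_profiles(events, n_bins):
    """Count tagging events per time bin for every user and every tag.

    `events` is an iterable of (user, tag, time_bin) tuples, where
    time_bin is an integer in range(n_bins). This layout is hypothetical.
    """
    users = defaultdict(lambda: [0] * n_bins)
    tags = defaultdict(lambda: [0] * n_bins)
    for user, tag, time_bin in events:
        users[user][time_bin] += 1
        tags[tag][time_bin] += 1
    return dict(users), dict(tags)


def cosine(a, b):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_tag(user_profile, tag_profiles):
    """Return the tag whose activity over time is most similar to the user's.

    Tags are visited in sorted order, so ties resolve alphabetically.
    """
    return max(sorted(tag_profiles),
               key=lambda t: cosine(user_profile, tag_profiles[t]))


# Toy data (invented): two users active early, one user active late.
events = [
    ("alice", "olympics", 0), ("alice", "olympics", 1),
    ("bob", "olympics", 0), ("bob", "olympics", 1),
    ("carol", "elections", 2), ("carol", "elections", 3),
]
users, tags = temporal_profiles(events, n_bins=4)
for name in sorted(users):
    print(name, users[name], closest_tag(users[name], tags))
```

A full time-aware co-clustering would group users and tags jointly and would also account for the semantic relatedness of tags; this sketch only shows how binned activity vectors make temporal similarity measurable.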


Keywords

  • Time-aware clustering
  • Social tagging systems
  • Users’ interests over time
  • Events



This work was supported by the FP7 project WeKnowIt, partially funded by the EC under contract number 215453.



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Eirini Giannakidou (1), corresponding author
  • Vassiliki Koutsonikola (1)
  • Athena Vakali (1)
  • Ioannis Kompatsiaris (2)

  1. Department of Informatics, Aristotle University, Thessaloniki, Greece
  2. Informatics and Telematics Institute, CERTH, Thermi-Thessaloniki, Greece
