
Content-based and collaborative techniques for tag recommendation: an empirical evaluation

Journal of Intelligent Information Systems

Abstract

The rapid growth of the so-called Web 2.0 has changed how people use the Web. A new democratic vision has emerged, in which users actively contribute to the evolution of the Web by producing new content or enriching existing content with user-generated metadata. In this context, tags (keywords freely chosen by users to describe and organize resources) have spread as a model for browsing and retrieving web content. The success of this collaborative model rests on two factors: first, information is organized in a way that closely reflects the users’ mental model; second, the absence of a controlled vocabulary shortens the users’ learning curve and allows vocabularies to evolve. However, since tags are handled in a purely syntactic way, user annotations generate a very sparse and noisy tag space, which limits its effectiveness for complex tasks. Consequently, tag recommenders, which suggest the most suitable tags for the resource being annotated, have recently emerged as a way of speeding up tag convergence. The contribution of this work is a tag recommender system that implements both a collaborative and a content-based recommendation technique. The former exploits user and community tagging behavior to produce recommendations, while the latter uses heuristics to extract tags directly from the textual content of resources. Experiments on a dataset gathered from Bibsonomy show that hybrid recommendation strategies can outperform single ones, and that the way the strategies are combined matters for obtaining more accurate results.
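
To make the idea of a hybrid strategy concrete, the following is a minimal Python sketch, not the system evaluated in the article. The abstract does not specify the heuristics or the combination scheme, so everything here is an assumption for illustration: the content-based side is reduced to normalized term frequency over the resource text, the collaborative side to normalized tag frequency over previously assigned tags, and the two are merged by a weighted sum. The names `content_based_scores`, `collaborative_scores`, `hybrid_recommend`, the weight `alpha`, and the stop-word list are all hypothetical.

```python
from collections import Counter
import re

# Hypothetical stop-word list, for illustration only.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def content_based_scores(text):
    """Score candidate tags by normalized term frequency in the resource text.

    This stands in for the (unspecified) content-based heuristics.
    """
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]
    counts = Counter(tokens)
    top = max(counts.values(), default=1)
    return {term: n / top for term, n in counts.items()}


def collaborative_scores(past_tags):
    """Score candidate tags by how often they were assigned in the past
    (by the user or the community), normalized to [0, 1]."""
    counts = Counter(past_tags)
    top = max(counts.values(), default=1)
    return {tag: n / top for tag, n in counts.items()}


def hybrid_recommend(text, past_tags, alpha=0.5, k=3):
    """Combine both score sets linearly; alpha weights the collaborative side."""
    cb = content_based_scores(text)
    cf = collaborative_scores(past_tags)
    combined = {
        tag: alpha * cf.get(tag, 0.0) + (1 - alpha) * cb.get(tag, 0.0)
        for tag in set(cb) | set(cf)
    }
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


if __name__ == "__main__":
    title = "Tag recommendation for folksonomies: tag prediction with collaborative filtering"
    history = ["folksonomy", "tagging", "recommendation", "recommendation"]
    print(hybrid_recommend(title, history, alpha=0.5, k=2))
```

Setting `alpha` to 0 or 1 recovers the purely content-based or purely collaborative ranking, which is one simple way to compare single strategies against a combined one on the same input.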


Author information

Corresponding author

Correspondence to Pasquale Lops.

About this article

Cite this article

Lops, P., de Gemmis, M., Semeraro, G. et al. Content-based and collaborative techniques for tag recommendation: an empirical evaluation. J Intell Inf Syst 40, 41–61 (2013). https://doi.org/10.1007/s10844-012-0215-6

  • DOI: https://doi.org/10.1007/s10844-012-0215-6
