Evaluation of Tag Clusterings for User Profiling in Movie Recommendation

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 11731)


In the Web 2.0 era, tags provide an effective mechanism for rapidly annotating and categorizing items. However, tags suffer from many problems typically linked to language, such as synonymy, polysemy, and ambiguity in general. To overcome these limitations, tag clustering can be used to group tags that represent similar concepts. One domain where tag clustering has proven particularly useful is movie recommendation, where tags are used to represent users’ preferences and affinities. In this context, no gold standard is yet available to assess the quality of a clustering technique, especially considering that the final aim is user satisfaction rather than an accuracy-like score. To this end, we propose an evaluation criterion for the quality of the resulting clusters based on human judgments.


  • Movie recommendation
  • Tag clustering
  • Human judgment
  • Word embeddings

  • DOI: 10.1007/978-3-030-30493-5_45
  • Chapter length: 13 pages

Author information

Corresponding author

Correspondence to Guglielmo Faggioli.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Faggioli, G., Polato, M., Lauriola, I., Aiolli, F. (2019). Evaluation of Tag Clusterings for User Profiling in Movie Recommendation. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science, vol 11731. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30492-8

  • Online ISBN: 978-3-030-30493-5

  • eBook Packages: Computer Science, Computer Science (R0)