Abstract
In the Web 2.0 era, tags provide an effective mechanism for rapidly annotating and categorizing items. However, tags suffer from many problems typical of natural language, such as synonymy, polysemy, and ambiguity in general. To overcome these limitations, tag clustering can be used to group tags that represent similar concepts. One domain where tag clustering has proven particularly useful is movie recommendation, where tags are used to represent users’ preferences and affinities. In this context, no gold standard is yet available for assessing the quality of a clustering technique, especially considering that the final aim is user satisfaction rather than an accuracy-like score. To this end, we propose an evaluation criterion for the quality of the resulting clusters based on human judgments.
Keywords
- Movie recommendation
- Tag clustering
- Human judgment
- Word embeddings
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Faggioli, G., Polato, M., Lauriola, I., Aiolli, F. (2019). Evaluation of Tag Clusterings for User Profiling in Movie Recommendation. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science, vol 11731. Springer, Cham. https://doi.org/10.1007/978-3-030-30493-5_45
DOI: https://doi.org/10.1007/978-3-030-30493-5_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30492-8
Online ISBN: 978-3-030-30493-5
eBook Packages: Computer Science, Computer Science (R0)