User Profiling in Text-Based Recommender Systems Based on Distributed Word Representations

  • Conference paper
Analysis of Images, Social Networks and Texts (AIST 2016)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 661)

Abstract

We introduce a novel approach to constructing user profiles for recommender systems that combines full-text items, such as posts in a social network, with the implicit ratings (likes) that users give them. The profiles measure a user’s interest in various topics mined from the full texts of the items. The resulting profile can be used for cold start item recommendations, targeted advertising, and other purposes. Our experiments show that the method performs comparably to classical collaborative filtering algorithms while remaining a cold start approach, i.e., it does not use the likes of the item being recommended.
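
To make the idea of a topic-based profile concrete, the following is a minimal sketch and not the authors' implementation. It assumes that topics are centroids in a word-embedding space, that an item is described by the share of its words falling nearest to each topic, and that a user profile is the average of those shares over liked items. The toy 2-D vectors, the two topic centroids, and the names `item_topics`, `user_profile`, and `score` are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical 2-D stand-ins for pretrained word vectors (e.g. word2vec).
EMBEDDINGS = {
    "goal": np.array([1.0, 0.1]),
    "match": np.array([0.9, 0.2]),
    "team": np.array([0.95, 0.0]),
    "recipe": np.array([0.1, 1.0]),
    "soup": np.array([0.0, 0.9]),
    "bake": np.array([0.2, 0.95]),
}

# Hypothetical topic centroids; in practice they might come from
# clustering the word vectors (an assumption, not stated in the abstract).
TOPICS = np.array([[1.0, 0.0], [0.0, 1.0]])


def item_topics(text):
    """Share of known words in `text` whose nearest topic (by cosine) is each topic."""
    counts = np.zeros(len(TOPICS))
    for word in text.lower().split():
        vec = EMBEDDINGS.get(word)
        if vec is None:
            continue  # skip out-of-vocabulary words
        sims = TOPICS @ vec / (np.linalg.norm(TOPICS, axis=1) * np.linalg.norm(vec))
        counts[int(np.argmax(sims))] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def user_profile(liked_texts):
    """Average topic distribution over the items a user liked."""
    if not liked_texts:
        return np.zeros(len(TOPICS))
    return np.mean([item_topics(t) for t in liked_texts], axis=0)


def score(profile, text):
    """Cold start score: uses only the candidate's text, never its likes."""
    return float(profile @ item_topics(text))


if __name__ == "__main__":
    profile = user_profile(["goal match team", "team goal soup"])
    for candidate in ["match goal", "bake soup recipe"]:
        print(candidate, round(score(profile, candidate), 3))
```

Because `score` depends only on the text of the candidate item, a brand-new post with no likes can still be ranked for a user, which is the cold start property the abstract describes.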

Acknowledgements

This work was supported by the “Recommendation Systems with Automated User Profiling” project sponsored by Samsung and the Government of the Russian Federation grant 14.Z50.31.0030. We thank Dmitry Bugaichenko and the “Odnoklassniki” social network for providing us with the social network dataset with texts of posts and user likes and Alexander Panchenko and Nikolay Arefyev for the trained word2vec model along with its Russian-language training data.

Author information

Corresponding author

Correspondence to Sergey Nikolenko.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Alekseev, A., Nikolenko, S. (2017). User Profiling in Text-Based Recommender Systems Based on Distributed Word Representations. In: Ignatov, D., et al. Analysis of Images, Social Networks and Texts. AIST 2016. Communications in Computer and Information Science, vol 661. Springer, Cham. https://doi.org/10.1007/978-3-319-52920-2_19

  • DOI: https://doi.org/10.1007/978-3-319-52920-2_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-52919-6

  • Online ISBN: 978-3-319-52920-2

  • eBook Packages: Computer Science (R0)
