Employing Document Embeddings to Solve the “New Catalog” Problem in User Targeting, and Provide Explanations to the Users

  • Ludovico Boratto
  • Salvatore Carta
  • Gianni Fenu
  • Luca Piras
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10772)


In the current digital era, items that were once consumed in physical form are available on online platforms that allow users to stream or buy them. However, not all items are available in digital form. When the companies that run these platforms acquire the rights to add a new catalog of items, the problem that arises is to identify which customers should be targeted with an advertisement for the new addition. Indeed, although the items may have existed for a long time, the users' preferences for them are not available. In this paper, we propose an approach that selects a set of users to target when advertising a new catalog. To do so, we consider the textual descriptions of the items and employ document embeddings (i.e., vector representations of documents) to model both the new catalog and the users. We also propose an approach to generate an explanation list for a user, represented by the top-n artists she evaluated that are most similar to the one of the new catalog. Experimental results show the effectiveness of both our targeting approach and the explanation lists.
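The targeting and explanation steps described above can be illustrated with a minimal sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the embeddings are taken as precomputed (for example by a document embedding model trained on the textual descriptions), the new catalog and each user are each summarized by the mean of the relevant vectors, and similarity is measured with the cosine. The names `select_target_users` and `explanation_list` and the toy 3-dimensional vectors are hypothetical.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def centroid(vectors):
    """Mean vector; an assumed way to summarize a set of embeddings."""
    return np.mean(np.asarray(vectors, dtype=float), axis=0)


def select_target_users(user_profiles, catalog_vec, k):
    """Return the k users whose profile is most similar to the catalog."""
    scores = {u: cosine(v, catalog_vec) for u, v in user_profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def explanation_list(evaluated_artists, artist_vecs, catalog_vec, n):
    """Return the n artists the user evaluated that are closest to the catalog."""
    ranked = sorted(
        evaluated_artists,
        key=lambda a: cosine(artist_vecs[a], catalog_vec),
        reverse=True,
    )
    return ranked[:n]


# Toy embeddings (hypothetical data, standing in for real document embeddings).
artist_vecs = {
    "artist_a": np.array([1.0, 0.0, 0.0]),
    "artist_b": np.array([0.9, 0.1, 0.0]),
    "artist_c": np.array([0.0, 1.0, 0.0]),
    "artist_d": np.array([0.0, 0.0, 1.0]),
}
user_history = {
    "user_1": ["artist_a", "artist_b"],
    "user_2": ["artist_c", "artist_d"],
    "user_3": ["artist_a", "artist_c"],
}
new_catalog_vecs = [np.array([1.0, 0.1, 0.0]), np.array([0.8, 0.0, 0.1])]

# Assumption: one vector for the catalog, one vector per user.
catalog_vec = centroid(new_catalog_vecs)
user_profiles = {
    user: centroid([artist_vecs[a] for a in artists])
    for user, artists in user_history.items()
}

targets = select_target_users(user_profiles, catalog_vec, k=2)
explanation = explanation_list(user_history["user_3"], artist_vecs, catalog_vec, n=1)
print(targets)      # ['user_1', 'user_3']
print(explanation)  # ['artist_a']
```

Averaging is only one possible way to build the catalog and user vectors; it is used here because it keeps the ranking and the explanation list in the same embedding space with very little code.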


User targeting · Document embeddings · Explanation



This work is partially funded by Regione Sardegna under project NOMAD (Next generation Open Mobile Apps Development), through PIA - Pacchetti Integrati di Agevolazione “Industria Artigianato e Servizi” (annualità 2013).


Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Ludovico Boratto (1)
  • Salvatore Carta (2)
  • Gianni Fenu (2)
  • Luca Piras (2)
  1. Data Science and Big Data Analytics, EURECAT, Barcelona, Spain
  2. Dipartimento di Matematica e Informatica, Università di Cagliari, Cagliari, Italy
