Combining content with user preferences for non-fiction multimedia recommendation: a study on TED lectures

Multimedia Tools and Applications

Abstract

This paper introduces a new dataset and compares several methods for the recommendation of non-fiction audiovisual material, namely lectures from the TED website. The TED dataset contains 1,149 talks and 69,023 profiles of users, who have made more than 100,000 ratings and 200,000 comments. The corresponding metadata, which we make available, can be used for training and testing generic or personalized recommender systems. We define content-based, collaborative, and combined recommendation methods for TED lectures and use cross-validation to select the best parameters of keyword-based (TF-IDF) and semantic vector space-based methods (LSI, LDA, RP, and ESA). We compare these methods on a personalized recommendation task in two settings, a cold-start and a non-cold-start one. In the cold-start setting, semantic vector spaces perform better than keywords. In the non-cold-start setting, where collaborative information can be exploited, content-based methods are outperformed by collaborative filtering ones, but the proposed combined method shows acceptable performance and can be used in both settings. For the generic recommendation task, LSI and RP again outperform TF-IDF.
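
As a rough illustration of the keyword-based (TF-IDF) content-based setting mentioned above, the following minimal sketch ranks unseen talks by cosine similarity to a profile built from the talks a user has liked. It is not the implementation evaluated in the paper: the toy `talks` data, the function names (`tfidf_vectors`, `cosine`, `recommend`), and the choice of a summed profile vector are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each talk is a list of keywords (not TED data).
talks = [
    "climate change ocean energy".split(),
    "ocean climate policy energy".split(),
    "music brain creativity".split(),
    "brain memory creativity music".split(),
]

def tfidf_vectors(docs):
    """Map each tokenized document to a sparse {term: tf-idf weight} dict."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(vectors, liked, top_n=2):
    """Return indices of the top_n unseen items closest to the user's profile.

    The profile is the sum of the liked items' vectors (an assumption of
    this sketch); ties are broken by item index for determinism.
    """
    profile = {}
    for i in liked:
        for term, weight in vectors[i].items():
            profile[term] = profile.get(term, 0.0) + weight
    seen = set(liked)
    scored = [(cosine(profile, vec), i)
              for i, vec in enumerate(vectors) if i not in seen]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_n]]

vectors = tfidf_vectors(talks)
print(recommend(vectors, liked=[0], top_n=1))
```

Because the ranking uses only item content, a talk with no ratings at all can still be scored, which is what makes this family of methods usable in a cold-start setting. Replacing the TF-IDF vectors with vectors from a semantic space would leave `cosine` and `recommend` unchanged.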

Notes

  1. https://www.idiap.ch/dataset/ted/.

  2. http://www.khanacademy.org/.

  3. http://www.videolectures.net/.

  4. http://www.youtube.com/education/.

  5. http://www.dailymotion.com/.

  6. The scenario of this task does not presuppose that the user is currently viewing a talk, but considers only the user’s past history. As a consequence, if a user is interested in several different topics, each topic is likely to appear in the resulting recommendations in proportion to its frequency in the user’s past history. By contrast, in a contextual recommendation task as mentioned above, the topic of the talk that is currently viewed should be considerably boosted with respect to the others in the resulting recommendations.

  7. A multilingual lexical database where English and Italian senses are aligned.

References

  1. Adomavicius G, Tuzhilin A (2011) Context-aware recommender systems. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 217–253

  2. Anderson C (2006) The long tail: why the future of business is selling less of more. Hyperion, New York

  3. Antulov-Fantulin N, Bošnjak M, Žnidaršič M, Grčar M, Morzy M, Šmuc T (2011) ECML/PKDD 2011 discovery challenge overview. In: Proceedings of the ECML/PKDD 2011 discovery challenge workshop, Athens

  4. Arapakis I, Moshfeghi Y, Joho H, Ren R, Hannah D, Jose JM (2009) Enriching user profiling with affective features for the improvement of a multimodal recommender system. In: Proceedings of the ACM international conference on image and video retrieval, Santorini, CIVR ’09, pp 29:1–29:8

  5. Arapakis I, Moshfeghi Y, Joho H, Ren R, Hannah D, Jose JM (2009) Integrating facial expressions into user profiling for the improvement of a multimodal recommender system. In: Proceedings of the 2009 IEEE international conference on multimedia and expo, New York, ICME’09, pp 1440–1443

  6. Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3(30):993–1022

  7. Cremonesi P, Koren Y, Turrin R (2010) Performance of recommender algorithms on top-n recommendation tasks. In: Proceedings of the 4th ACM conference on recommender systems, Barcelona, RecSys ’10

  8. Dasiopoulou S, Tzouvaras V, Kompatsiaris I, Strintzis M (2010) Enquiring MPEG-7 based multimedia ontologies. Multimed Tools Appl 46(2–3):331–370

  9. Degemmis M, Lops P, Semeraro G (2007) A content-collaborative recommender that exploits WordNet-based user profiles for neighborhood formation. User Model User-Adap Inter 17(3):217–255

  10. Deshpande M, Karypis G (2004) Item-based top-N recommendation algorithms. ACM Trans Inf Syst 22(1):143–177

  11. Di Massa R, Montagnuolo M, Messina A (2010) Implicit news recommendation based on user interest models and multimodal content analysis. In: Proceedings of the 3rd international workshop on automated information extraction in media production, Firenze, AIEMPro ’10, pp 33–38

  12. Federico M, Cettolo M, Bentivogli L, Paul M, Stüker S (2012) Overview of the IWSLT 2012 evaluation campaign. In: Proceedings of the international workshop on spoken language translation, Hong Kong, IWSLT ’12

  13. Gabrilovich E, Markovitch S (2007) Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: Proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, IJCAI’07, pp 1606–1611

  14. Herlocker JL, Konstan JA, Terveen LG, Riedl JT (2004) Evaluating collaborative filtering recommender systems. ACM Trans Inf Syst 22(1):5–53

  15. Hofmann T (1999) Probabilistic latent semantic indexing. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, Berkeley, SIGIR ’99

  16. Hu Y, Koren Y, Volinsky C (2008) Collaborative filtering for implicit feedback datasets. In: Proceedings of the 8th IEEE international conference on data mining, ICDM ’08, pp 263–272

  17. Joachims T (2006) Training linear SVMs in linear time. In: Proceedings of the ACM conference on knowledge discovery and data mining, Philadelphia, KDD ’06, pp 217–226

  18. Johansson P (2003) Madfilm—a multimodal approach to handle search and organization in a movie recommendation system. In: Proceedings of the 1st nordic symposium on multimodal communication, Copenhagen, pp 53–65

  19. Koren Y, Bell R (2011) Advances in collaborative filtering. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 145–186

  20. Lees-Miller J, Anderson F, Hoehn B, Greiner R (2008) Does Wikipedia information help Netflix predictions? In: Proceedings of the 7th international conference on machine learning and applications, San Diego, ICMLA ’08, pp 337–343

  21. Li Y, Hu J, Zhai C, Chen Y (2010) Improving one-class collaborative filtering by incorporating rich user information. In: Proceedings of the 19th ACM international conference on information and knowledge management, Toronto, CIKM ’10, pp 959–968

  22. Lops P, Gemmis M, Semeraro G (2011) Content-based recommender systems: state of the art and trends. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 73–105

  23. Magnini B, Strapparava C (2001) Improving user modelling with content-based techniques. In: Bauer M, Gmytrasiewicz P, Vassileva J (eds) User modeling 2001. Springer, New York, pp 74–83

  24. Mahmood T, Ricci F (2009) Improving recommender systems with adaptive conversational strategies. In: Proceedings of the 20th ACM conference on hypertext and hypermedia, Torino, HT ’09, pp 73–82

  25. Martinez J (2002) Standards—MPEG-7 overview of MPEG-7 description tools, part 2. IEEE Multimed 9(3):83–93

  26. Mei T, Yang B, Hua XS, Li S (2011) Contextual video recommendation by multimodal relevance and user feedback. ACM Trans Inf Syst 29(2):10:1–10:24

  27. Middleton SE, Shadbolt NR, De Roure DC (2004) Ontological user profiling in recommender systems. ACM Trans Inf Syst 22(1):54–88

  28. Ning X, Karypis G (2011) SLIM: sparse linear methods for top-N recommender systems. In: Proceedings of the 11th IEEE international conference on data mining, Vancouver, ICDM ’11, pp 497–506

  29. Pan R, Scholz M (2009) Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, Paris, KDD ’09, pp 667–676

  30. Pan R, Zhou Y, Cao B, Liu N, Lukose R, Scholz M, Yang Q (2008) One-class collaborative filtering. In: Proceedings of the 8th IEEE international conference on data mining, Pisa, ICDM ’08, pp 502–511

  31. Papagelis M, Plexousakis D (2005) Qualitative analysis of user-based and item-based prediction algorithms for recommendation agents. In: Engineering applications of artificial intelligence, Pergamon, pp 152–166

  32. Pappas N, Popescu-Belis A (2013) Combining content with user preferences for TED lecture recommendation. In: Proceedings of the 11th international workshop on content based multimedia indexing, Veszprém, Hungary, CBMI ’13, pp 47–52

  33. Pappas N, Popescu-Belis A (2013) Sentiment analysis of user comments for one-class collaborative filtering over TED talks. In: Proceedings of the 36th ACM SIGIR conference on research and development in information retrieval, Short papers, Dublin, SIGIR ’13, pp 773–776

  34. Řehůřek R, Sojka P (2010) Software framework for topic modeling with large corpora. In: Proceedings of the LREC 2010 workshop on new challenges for NLP frameworks, Valletta, pp 45–50

  35. Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L (2009) BPR: Bayesian personalized ranking from implicit feedback. In: Proceedings of the 25th conference on uncertainty in artificial intelligence, Montreal, UAI ’09, pp 452–461

  36. Sahlgren M (2005) An introduction to random indexing. In: Proceedings of the 7th international conference on terminology and knowledge engineering, methods and applications of semantic indexing workshop, vol 5. Copenhagen

  37. Sahlgren M (2006) The word-space model: using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. PhD thesis, Stockholm University, Stockholm

  38. Salton G, Buckley C (1988) Term-weighting approaches in automatic text retrieval. Inf Process Manag 24(5):513–523

  39. Sarwar B, Karypis G, Konstan J, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th international conference on World Wide Web, Hong Kong, WWW ’01, pp 285–295

  40. Semeraro G, Degemmis M, Lops P, Basile P (2007) Combining learning and word sense disambiguation for intelligent user profiling. In: Proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, IJCAI ’07, pp 2856–2861

  41. Semeraro G, Basile P, De Gemmis M, Lops P (2009) User profiles for personalizing digital libraries. In: Theng Y, D G, Foo S, Na J (eds) Handbook of research on digital libraries design development and impact. Information Science Reference, pp 149–158

  42. Semeraro G, Lops P, Basile P, de Gemmis M (2009) Knowledge infusion into content-based recommender systems. In: Proceedings of the 3rd ACM conference on recommender systems, New York, RecSys ’09, pp 301–304

  43. Shani G, Gunawardana A (2011) Evaluating recommendation systems. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 257–297

  44. Shi Y, Karatzoglou A, Baltrunas L, Larson M, Oliver N, Hanjalic A (2012) CLiMF: learning to maximize reciprocal rank with collaborative less-is-more filtering. In: Proceedings of the 6th ACM conference on recommender systems, Dublin, RecSys ’12, pp 139–146

  45. Shin H, Lee M, Kim E (2009) Personalized digital TV content recommendation with integration of user behavior profiling and multimodal content rating. IEEE Trans Consum Electron 55(3):1417–1423

  46. Sindhwani V, Bucak SS, Hu J, Mojsilovic A (2009) A family of non-negative matrix factorizations for one-class collaborative filtering problems. In: Proceedings of the 3rd ACM conference on recommender systems, recommender based industrial applications workshop, New York, RecSys ’09

  47. Smirnov AV, Krizhanovsky A (2008) Information filtering based on Wiki index database. CoRR arXiv:0804.2354

  48. Tsinaraki C, Christodoulakis S (2006) A multimedia user preference model that supports semantics and its application to MPEG 7/21. In: Proceedings of the 12th international conference on multi-media modelling, Beijing, p 8

  49. Yang B, Mei T, Hua XS, Yang L, Yang SQ, Li M (2007) Online video recommendation based on multimodal fusion and relevance feedback. In: Proceedings of the 6th ACM international conference on image and video retrieval, Amsterdam, CIVR ’07, pp 73–80

Acknowledgments

We would like to thank the managers of the TED website for their support in accessing and distributing the TED metadata, as well as the anonymous MTAP reviewers for their insightful suggestions. This paper includes material presented at the CBMI 2013 workshop [32], and we are grateful to the organizers for this special issue of the MTAP journal related to CBMI.

Author information

Corresponding author

Correspondence to Nikolaos Pappas.

Additional information

We are grateful for the support received for this work from the European Union through the inEvent project FP7-ICT n. 287872 (see http://www.inevent-project.eu).

About this article

Cite this article

Pappas, N., Popescu-Belis, A. Combining content with user preferences for non-fiction multimedia recommendation: a study on TED lectures. Multimed Tools Appl 74, 1175–1197 (2015). https://doi.org/10.1007/s11042-013-1840-y

  • DOI: https://doi.org/10.1007/s11042-013-1840-y
