Abstract
This paper introduces a new dataset and compares several methods for the recommendation of non-fiction audio-visual material, namely lectures from the TED website. The TED dataset contains 1,149 talks and 69,023 profiles of users, who have made more than 100,000 ratings and 200,000 comments. The corresponding metadata, which we make available, can be used for training and testing generic or personalized recommender systems. We define content-based, collaborative, and combined recommendation methods for TED lectures and use cross-validation to select the best parameters of keyword-based (TF-IDF) and semantic vector space-based methods (LSI, LDA, RP, and ESA). We compare these methods on a personalized recommendation task in two settings: cold-start and non-cold-start. In the cold-start setting, semantic vector spaces perform better than keywords. In the non-cold-start setting, where collaborative information can be exploited, content-based methods are outperformed by collaborative filtering ones, but the proposed combined method shows acceptable performance and can be used in both settings. For the generic recommendation task, LSI and RP again outperform TF-IDF.
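As an illustration of the keyword-based (TF-IDF) content-based approach named above, the following is a minimal sketch and not the authors' implementation. It assumes a user profile built by summing the TF-IDF vectors of talks the user liked, and ranks unseen talks by cosine similarity to that profile. The function names (`tfidf_vectors`, `cosine`, `recommend`), the weighting variant, and the four toy talk descriptions are all hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document id to a sparse TF-IDF vector (term -> weight).

    Assumption: raw term frequency times log(N / df); the paper may use
    a different TF-IDF variant.
    """
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for d, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[d] = {t: c * math.log(n_docs / df[t]) for t, c in tf.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(vectors, liked, k=2):
    """Rank talks the user has not liked yet by similarity to the profile."""
    profile = Counter()
    for d in liked:
        for t, w in vectors[d].items():
            profile[t] += w  # hypothetical profile: sum of liked-talk vectors
    scores = {d: cosine(profile, v) for d, v in vectors.items() if d not in liked}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data invented for illustration; not taken from the TED dataset.
talks = {
    "t1": "brain neurons memory science",
    "t2": "memory learning brain education",
    "t3": "climate energy solar policy",
    "t4": "solar panels energy storage",
}
vectors = tfidf_vectors(talks)
top = recommend(vectors, liked={"t1"}, k=2)
print(top)
```

Because this sketch relies only on talk text and one user's own history, it also works for talks that have no ratings yet, which corresponds to the cold-start setting; a collaborative method would instead need ratings from other users.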
Notes
The scenario of this task does not presuppose that the user is currently viewing a talk, but considers only the user’s past history. As a consequence, if a user is interested in several different topics, each topic is likely to appear in the resulting recommendations in proportion to its frequency in the user’s past history. By contrast, in a contextual recommendation task as mentioned above, the topic of the talk that is currently viewed should be considerably boosted with respect to the others in the resulting recommendations.
A multilingual lexical database where English and Italian senses are aligned.
References
Adomavicius G, Tuzhilin A (2011) Context-aware recommender systems. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 217–253
Anderson C (2006) The long tail: why the future of business is selling less of more. Hyperion, New York
Antulov-Fantulin N, Bošnjak M, Žnidaršič M, Grčar M, Morzy M, Šmuc T (2011) ECML/PKDD 2011 discovery challenge overview. In: Proceedings of the ECML/PKDD 2011 discovery challenge workshop, Athens
Arapakis I, Moshfeghi Y, Joho H, Ren R, Hannah D, Jose JM (2009) Enriching user profiling with affective features for the improvement of a multimodal recommender system. In: Proceedings of the ACM international conference on image and video retrieval, Santorini, CIVR ’09, pp 29:1–29:8
Arapakis I, Moshfeghi Y, Joho H, Ren R, Hannah D, Jose JM (2009) Integrating facial expressions into user profiling for the improvement of a multimodal recommender system. In: Proceedings of the 2009 IEEE international conference on multimedia and expo, New York, ICME ’09, pp 1440–1443
Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3(30):993–1022
Cremonesi P, Koren Y, Turrin R (2010) Performance of recommender algorithms on top-n recommendation tasks. In: Proceedings of the 4th ACM conference on recommender systems, Barcelona, RecSys ’10
Dasiopoulou S, Tzouvaras V, Kompatsiaris I, Strintzis M (2010) Enquiring MPEG-7 based multimedia ontologies. Multimed Tools Appl 46(2–3):331–370
Degemmis M, Lops P, Semeraro G (2007) A content-collaborative recommender that exploits WordNet-based user profiles for neighborhood formation. User Model User-Adap Inter 17(3):217–255
Deshpande M, Karypis G (2004) Item-based top-N recommendation algorithms. ACM Trans Inf Syst 22(1):143–177
Di Massa R, Montagnuolo M, Messina A (2010) Implicit news recommendation based on user interest models and multimodal content analysis. In: Proceedings of the 3rd international workshop on automated information extraction in media production, Firenze, AIEMPro ’10, pp 33–38
Federico M, Cettolo M, Bentivogli L, Paul M, Stüker S (2012) Overview of the IWSLT 2012 evaluation campaign. In: Proceedings of the international workshop on spoken language translation, Hong Kong, IWSLT ’12
Gabrilovich E, Markovitch S (2007) Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: Proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, IJCAI ’07, pp 1606–1611
Herlocker JL, Konstan JA, Terveen LG, Riedl JT (2004) Evaluating collaborative filtering recommender systems. ACM Trans Inf Syst 22(1):5–53
Hofmann T (1999) Probabilistic latent semantic indexing. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, Berkeley, SIGIR ’99
Hu Y, Koren Y, Volinsky C (2008) Collaborative filtering for implicit feedback datasets. In: Proceedings of the 8th IEEE international conference on data mining, ICDM ’08, pp 263–272
Joachims T (2006) Training linear SVMs in linear time. In: Proceedings of the ACM conference on knowledge discovery and data mining, Philadelphia, KDD ’06, pp 217–226
Johansson P (2003) Madfilm—a multimodal approach to handle search and organization in a movie recommendation system. In: Proceedings of the 1st nordic symposium on multimodal communication, Copenhagen, pp 53–65
Koren Y, Bell R (2011) Advances in collaborative filtering. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 145–186
Lees-Miller J, Anderson F, Hoehn B, Greiner R (2008) Does Wikipedia information help Netflix predictions? In: Proceedings of the 7th international conference on machine learning and applications, San Diego, ICMLA ’08, pp 337–343
Li Y, Hu J, Zhai C, Chen Y (2010) Improving one-class collaborative filtering by incorporating rich user information. In: Proceedings of the 19th ACM international conference on information and knowledge management, Toronto, CIKM ’10, pp 959–968
Lops P, Gemmis M, Semeraro G (2011) Content-based recommender systems: state of the art and trends. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 73–105
Magnini B, Strapparava C (2001) Improving user modelling with content-based techniques. In: Bauer M, Gmytrasiewicz P, Vassileva J (eds) User modeling 2001. Springer, New York, pp 74–83
Mahmood T, Ricci F (2009) Improving recommender systems with adaptive conversational strategies. In: Proceedings of the 20th ACM conference on hypertext and hypermedia, Torino, HT ’09, pp 73–82
Martinez J (2002) Standards—MPEG-7 overview of MPEG-7 description tools, part 2. IEEE Multimed 9(3):83–93
Mei T, Yang B, Hua XS, Li S (2011) Contextual video recommendation by multimodal relevance and user feedback. ACM Trans Inf Syst 29(2):10:1–10:24
Middleton SE, Shadbolt NR, De Roure DC (2004) Ontological user profiling in recommender systems. ACM Trans Inf Syst 22(1):54–88
Ning X, Karypis G (2011) SLIM: sparse linear methods for top-N recommender systems. In: Proceedings of the 11th IEEE international conference on data mining, Vancouver, ICDM ’11, pp 497–506
Pan R, Scholz M (2009) Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, Paris, KDD ’09, pp 667–676
Pan R, Zhou Y, Cao B, Liu N, Lukose R, Scholz M, Yang Q (2008) One-class collaborative filtering. In: Proceedings of the 8th IEEE international conference on data mining, Pisa, ICDM ’08, pp 502–511
Papagelis M, Plexousakis D (2005) Qualitative analysis of user-based and item-based prediction algorithms for recommendation agents. In: Engineering applications of artificial intelligence, Pergamon, pp 152–166
Pappas N, Popescu-Belis A (2013) Combining content with user preferences for TED lecture recommendation. In: Proceedings of the 11th international workshop on content based multimedia indexing, Veszprém, Hungary, CBMI ’13, pp 47–52
Pappas N, Popescu-Belis A (2013) Sentiment analysis of user comments for one-class collaborative filtering over TED talks. In: Proceedings of the 36th ACM SIGIR conference on research and development in information retrieval, Short papers, Dublin, SIGIR ’13, pp 773–776
Řehůřek R, Sojka P (2010) Software framework for topic modeling with large corpora. In: Proceedings of the LREC 2010 workshop on new challenges for NLP frameworks, Valletta, pp 45–50
Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L (2009) BPR: Bayesian personalized ranking from implicit feedback. In: Proceedings of the 25th conference on uncertainty in artificial intelligence, Montreal, UAI ’09, pp 452–461
Sahlgren M (2005) An introduction to random indexing. In: Proceedings of the 7th international conference on terminology and knowledge engineering, methods and applications of semantic indexing workshop, vol 5. Copenhagen
Sahlgren M (2006) The word-space model: using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. PhD thesis, Stockholm University, Stockholm
Salton G, Buckley C (1988) Term-weighting approaches in automatic text retrieval. Inf Process Manag 24(5):513–523
Sarwar B, Karypis G, Konstan J, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th international conference on World Wide Web, Hong Kong, WWW ’01, pp 285–295
Semeraro G, Degemmis M, Lops P, Basile P (2007) Combining learning and word sense disambiguation for intelligent user profiling. In: Proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, IJCAI ’07, pp 2856–2861
Semeraro G, Basile P, De Gemmis M, Lops P (2009) User profiles for personalizing digital libraries. In: Theng Y, D G, Foo S, Na J (eds) Handbook of research on digital libraries design development and impact. Information Science Reference, pp 149–158
Semeraro G, Lops P, Basile P, de Gemmis M (2009) Knowledge infusion into content-based recommender systems. In: Proceedings of the 3rd ACM conference on recommender systems, New York, RecSys ’09, pp 301–304
Shani G, Gunawardana A (2011) Evaluating recommendation systems. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 257–297
Shi Y, Karatzoglou A, Baltrunas L, Larson M, Oliver N, Hanjalic A (2012) CLiMF: learning to maximize reciprocal rank with collaborative less-is-more filtering. In: Proceedings of the 6th ACM conference on recommender systems, Dublin, RecSys ’12, pp 139–146
Shin H, Lee M, Kim E (2009) Personalized digital TV content recommendation with integration of user behavior profiling and multimodal content rating. IEEE Trans Consum Electron 55(3):1417–1423
Sindhwani V, Bucak SS, Hu J, Mojsilovic A (2009) A family of non-negative matrix factorizations for one-class collaborative filtering problems. In: Proceedings of the 3rd ACM conference on recommender systems, recommender based industrial applications workshop, New York, RecSys ’09
Smirnov AV, Krizhanovsky A (2008) Information filtering based on Wiki index database. CoRR arXiv:0804.2354
Tsinaraki C, Christodoulakis S (2006) A multimedia user preference model that supports semantics and its application to MPEG 7/21. In: Proceedings of the 12th international conference on multi-media modelling, Beijing, p 8
Yang B, Mei T, Hua XS, Yang L, Yang SQ, Li M (2007) Online video recommendation based on multimodal fusion and relevance feedback. In: Proceedings of the 6th ACM international conference on image and video retrieval, Amsterdam, CIVR ’07, pp 73–80
Acknowledgments
We would like to thank the managers of the TED website for their support in accessing and distributing the TED metadata, as well as the anonymous MTAP reviewers for their insightful suggestions. This paper includes material presented at the CBMI 2013 workshop [32], and we are grateful to the organizers for this special issue of the MTAP journal related to CBMI.
Additional information
We are grateful for the support received for this work from the European Union through the inEvent project FP7-ICT n. 287872 (see http://www.inevent-project.eu).
About this article
Cite this article
Pappas, N., Popescu-Belis, A. Combining content with user preferences for non-fiction multimedia recommendation: a study on TED lectures. Multimed Tools Appl 74, 1175–1197 (2015). https://doi.org/10.1007/s11042-013-1840-y
DOI: https://doi.org/10.1007/s11042-013-1840-y