Abstract
Item features play an important role in movie recommender systems, where recommendations can be generated by using explicit or implicit preferences of users on traditional features (attributes) such as tag, genre, and cast. Typically, movie features are human-generated, either editorially (e.g., genre and cast) or by leveraging the wisdom of the crowd (e.g., tag), and as such, they are prone to noise and are expensive to collect. Moreover, these features are often rare or absent for new items, making it difficult or even impossible to provide good quality recommendations. In this paper, we show that users’ preferences on movies can be well or even better described in terms of the mise-en-scène features, i.e., the visual aspects of a movie that characterize design, aesthetics and style (e.g., colors, textures). We use both MPEG-7 visual descriptors and Deep Learning hidden layers as examples of mise-en-scène features that can visually describe movies. These features can be computed automatically from any video file, offering the flexibility in handling new items, avoiding the need for costly and error-prone human-based tagging, and providing good scalability. We have conducted a set of experiments on a large catalog of 4K movies. Results show that recommendations based on mise-en-scène features consistently outperform traditional metadata attributes (e.g., genre and tag).
Similar content being viewed by others
References
Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749
Bao X, Fan S, Varshavsky A, Li K, Roy Choudhury R (2013) Your reactions suggest you liked the movie: automatic content rating via reaction sensing. In: Proceedings of the 2013 ACM international joint conference on Pervasive and ubiquitous computing. ACM, pp 197–206
Bastan M, Cam H, Gudukbay U, Ulusoy O (2010) Bilvideo-7: an mpeg-7-compatible video indexing and retrieval system. IEEE MultiMed 17(3):62–73
Bogdanov D, Serrà J, Wack N, Herrera P, Serra X (2011) Unifying low-level and high-level music similarity measures. IEEE Trans Multimed 13(4):687–701
Braunhofer M, Elahi M, Ricci F (2014) Techniques for cold-starting context-aware mobile recommender systems for tourism. Intelligenza Artificiale 8(2):129–143
Brezeale D, Cook DJ (2008) Automatic video classification: a survey of the literature. IEEE Trans Syst Man Cybern Part C Appl Rev 38(3):416–430
Buckland W (2008) What does the statistical style analysis of film involve? A review of moving into pictures. More on film history, style, and analysis. Lit Linguist Comput 23(2):219–230
Cantador I, Szomszor M, Alani H, Fernández M, Castells P (2008) Enriching ontological user profiles with tagging history for multi-domain recommendations. In: 1st International workshop on collective semantics: collective intelligence & the semantic web (CISWeb 2008), Tenerife, Spain
Cremonesi P, Elahi M, Garzotto F (2015) Interaction design patterns in recommender systems. In: Proceedings of the 11th biannual conference on Italian SIGCHI chapter. ACM, pp 66–73
Cremonesi P, Elahi M, Garzotto F (2017) User interface patterns in recommendation-empowered content intensive multimedia applications. Multimed Tools Appl 76(4):5275–5309
Cremonesi P, Garzotto F, Negro S, Papadopoulos AV, Turrin R (2011) Looking for good recommendations: a comparative evaluation of recommender systems. In: Human–computer interaction–INTERACT 2011. Springer, pp 152–168
Cremonesi P, Koren Y, Turrin R (2010) Performance of recommender algorithms on top-n recommendation tasks. In: Proceedings of the 2010 ACM conference on recommender systems, RecSys 2010, Barcelona, Spain, September 26–30, 2010, pp 39–46
Deldjoo Y, Atani RE (2016) A low-cost infrared-optical head tracking solution for virtual 3d audio environment using the nintendo wii-remote. Entertain Comput 12:9–27
Deldjoo Y, Constantin MG, Schedl M, Ionescu B, Cremonesi P (2018) Mmtf-14k: a multifaceted movie trailer feature dataset for recommendation and retrieval. In: Proceedings of the 9th ACM multimedia systems conference. ACM
Deldjoo Y, Cremonesi P, Schedl M, Quadrana M (2017) The effect of different video summarization models on the quality of video recommendation based on low-level visual features. In: Proceedings of the 15th international workshop on content-based multimedia indexing. ACM, p 20
Deldjoo Y, Elahi M, Cremonesi P, Garzotto F, Piazzolla P (2016) Recommending movies based on mise-en-scene design. In: Proceedings of the 2016 CHI conference extended abstracts on human factors in computing systems. ACM, pp 1540–1547
Deldjoo Y, Elahi M, Cremonesi P, Garzotto F, Piazzolla P, Quadrana M (2016) Content-based video recommendation system based on stylistic visual features. J Data Semant 5:1–15
Deldjoo Y, Elahi M, Quadrana M, Cremonesi P, Garzotto F (2015) Toward effective movie recommendations based on mise-en-scène film styles. In: Proceedings of the 11th biannual conference on Italian SIGCHI chapter. ACM, pp 162–165
Deldjoo Y, Elahi Y, Cremonesi P, Moghaddam FB, Caielli ALE (2017) How to combine visual features with tags to improve movie recommendation accuracy? In: E-commerce and web technologies: 17th international conference, EC-Web 2016, Porto, Portugal, September 5–8, 2016, Revised Selected Papers, vol. 278. Springer, p 34
Deldjoo Y, Frà C, Valla M, Cremonesi P (2017) Letting users assist what to watch: an interactive query-by-example movie recommendation system. In: Proceedings of the 8th Italian information retrieval workshop, Lugano, Switzerland, June 05–07, 2017, pp 63–66. http://ceur-ws.org/Vol-1911/10.pdf. Accessed 15 Dec 2017
Dorai C, Venkatesh S (2001) Computational media aesthetics: finding meaning beautiful. IEEE MultiMed 8(4):10–12
Elahi M, Braunhofer M, Ricci F, Tkalcic M (2013) Personality-based active learning for collaborative filtering recommender systems. In: Congress of the Italian association for artificial intelligence. Springer, pp 360–371
Elahi M, Deldjoo Y, Bakhshandegan Moghaddam F, Cella L, Cereda S, Cremonesi P (2017) Exploring the semantic gap for movie recommendations. In: Proceedings of the eleventh ACM conference on recommender systems. ACM, pp 326–330
Elahi M, Ricci F, Repsys V (2011) System-wide effectiveness of active learning in collaborative filtering. In: Proceedings of the international workshop on social web mining, co-located with IJCAI, Barcelona, Spain
Elahi M, Ricci F, Rubens N (2013) Active learning strategies for rating elicitation in collaborative filtering: a system-wide perspective. ACM Trans Intell Syst Technol (TIST) 5(1):13
Elahi M, Ricci F, Rubens N (2016) A survey of active learning in collaborative filtering recommender systems. Comput Sci Rev 20:29–50
Fleischman M, Hovy E (2003) Recommendations without user preferences: a natural language processing approach. In: Proceedings of the 8th international conference on Intelligent user interfaces. ACM, pp 242–244
Gedikli F, Jannach D, Ge M (2014) How should i explain? A comparison of different explanation types for recommender systems. Int J Hum Comput Stud 72(4):367–382
Haghighat M, Abdel-Mottaleb M, Alhalabi W (2016) Fully automatic face normalization and single sample face recognition in unconstrained environments. Expert Syst Appl 47:23–34
Hardoon DR, Szedmak S, Shawe-Taylor J (2004) Canonical correlation analysis: an overview with application to learning methods. Neural Comput 16(12):2639–2664
Harper FM, Konstan JA (2015) The movielens datasets: history and context. ACM Trans Interact Intell Syst (TiiS) 5(4):19
He R, McAuley J (2015) Vbpr: visual bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1510.01784
Hu W, Xie N, Li L, Zeng X, Maybank S (2011) A survey on visual content-based video indexing and retrieval. IEEE Trans Syst Man Cybern Part C Appl Rev 41(6):797–819
Jakob N, Weber SH, Müller MC, Gurevych I (2009) Beyond the stars: exploiting free-text user reviews to improve the accuracy of movie recommendations. In: Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion. ACM, pp 57–64
Lika B, Kolomvatsos K, Hadjiefthymiades S (2014) Facing the cold start problem in recommender systems. Expert Syst Appl 41(4):2065–2073
Manjunath BS, Ohm JR, Vasudevan VV, Yamada A (2001) Color and texture descriptors. IEEE Trans Circuits Syst Video Technol 11(6):703–715
Manjunath BS, Salembier P, Sikora T (2002) Introduction to MPEG-7: multimedia content description interface, vol 1. Wiley, Chichester
Melville P, Sindhwani V (2011) Recommender systems. In: Encyclopedia of machine learning. Springer, pp 829–838
Musto C, Narducci F, Lops P, Semeraro G, de Gemmis M, Barbieri M, Korst J, Pronk V, Clout R (2012) Enhanced semantic tv-show representation for personalized electronic program guides. In: User modeling, adaptation, and personalization. Springer, pp 188–199
Nasery M, Elahi M, Cremonesi P (2015) Polimovie: a feature-based dataset for recommender systems. In: ACM RecSys workshop on crowdsourcing and human computation for recommender systems (CrawdRec), vol 3, pp 25–30
Ning X, Karypis G (2012) Sparse linear methods with side information for top-n recommendations. In: Proceedings of the sixth ACM conference on Recommender systems. ACM, pp 155–162
Rasheed Z, Shah M (2003) Video categorization using semantics and semiotics. In: Video mining. Springer, pp 185–217
Rasheed Z, Sheikh Y, Shah M (2005) On the use of computable features for film classification. IEEE Trans Circuits Syst Video Technol 15(1):52–64
Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L (2009) Bpr: Bayesian personalized ranking from implicit feedback. In: Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, pp 452–461
Rubens N, Elahi M, Sugiyama M, Kaplan D (2015) Active learning in recommender systems. In: Recommender systems handbook. Springer, pp 809–846
Saveski M, Mantrach A (2014) Item cold-start recommendations: learning local collective embeddings. In: Proceedings of the 8th ACM conference on recommender systems. ACM, pp 89–96
Schedl M, Zamani H, Chen CW, Deldjoo Y, Elahi M (2018) Current challenges and visions in music recommender systems research. Int J Multimed Inf Retr. https://doi.org/10.1007/s13735-018-0154-2
Schein AI, Popescul A, Ungar LH, Pennock DM (2002) Methods and metrics for cold-start recommendations. In: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, pp 253–260
Shi Y, Larson M, Hanjalic A (2014) Collaborative filtering beyond the user-item matrix: a survey of the state of the art and future challenges. ACM Comput Surv (CSUR) 47(1):3
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
Szomszor M, Cattuto C, Alani H, O’Hara K, Baldassarri A, Loreto V, Servedio VDP (2007) Folksonomies, the semantic web, and movie recommendation. In: 4th European Semantic Web Conference, Bridging the Gap between Semantic Web and Web 2.0, Innsbruck, Austria
Tubularinsights: 500 hours of video uploaded to youtube every minute [forecast]. http://tubularinsights.com/hours-minute-uploaded-youtube/. Accessed 19 Jan 2018
Vig J, Sen S, Riedl J (2009) Tagsplanations: explaining recommendations using tags. In: Proceedings of the 14th international conference on intelligent user interfaces. ACM, pp 47–56
Wang XY, Zhang BB, Yang HY (2014) Content-based image retrieval by integrating color and texture features. Multimed Tools Appl 68(3):545–569
Wang Y, Xing C, Zhou L (2006) Video semantic models: survey and evaluation. Int J Comput Sci Netw Secur 6:10–20
Xu S, Jiang H, Lau F (2008) Personalized online document, image and video recommendation via commodity eye-tracking. In: Proceedings of the 2008 ACM conference on recommender systems. ACM, pp 83–90
Yang B, Mei T, Hua XS, Yang L, Yang SQ, Li M (2007) Online video recommendation based on multimodal fusion and relevance feedback. In: Proceedings of the 6th ACM international conference on image and video retrieval. ACM, pp 73–80
Zettl H (2002) Essentials of applied media aesthetics. In: Dorai C, Venkatesh S (eds) Media computing. The Springer international series in video computing, vol 4. Springer, Berlin, pp 11–38
Zettl H (2013) Sight, sound, motion: applied media aesthetics. Cengage Learning, Boston
Zhang ZK, Liu C, Zhang YC, Zhou T (2010) Solving the cold-start problem in recommender systems with social tags. EPL (Europhys Lett 92(2):28,002
Zhao X, Li G, Wang M, Yuan J, Zha ZJ, Li Z, Chua TS (2011) Integrating rich information for video recommendation with multi-task rank aggregation. In: Proceedings of the 19th ACM international conference on Multimedia. ACM, pp 1521–1524
Zhou H, Hermans T, Karandikar AV, Rehg JM (2010) Movie genre classification via scene categorization. In: Proceedings of the international conference on multimedia. ACM, pp 747–750
Acknowledgements
This work is supported by Telecom Italia S.p.A., Open Innovation Department, Joint Open Lab S-Cube, Milan. The work has been also supported by the Amazon AWS Cloud Credits for Research program.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Deldjoo, Y., Elahi, M., Quadrana, M. et al. Using visual features based on MPEG-7 and deep learning for movie recommendation. Int J Multimed Info Retr 7, 207–219 (2018). https://doi.org/10.1007/s13735-018-0155-1
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13735-018-0155-1