Using visual features based on MPEG-7 and deep learning for movie recommendation

Deldjoo, Yashar; Elahi, Mehdi; Quadrana, Massimo; Cremonesi, Paolo

doi:10.1007/s13735-018-0155-1

Regular Paper
Published: 14 June 2018

Volume 7, pages 207–219 (2018)

International Journal of Multimedia Information Retrieval

Yashar Deldjoo¹,
Mehdi Elahi² (ORCID: orcid.org/0000-0003-2203-9195),
Massimo Quadrana¹ &
Paolo Cremonesi¹

Abstract

Item features play an important role in movie recommender systems, where recommendations can be generated by using explicit or implicit preferences of users on traditional features (attributes) such as tags, genres, and cast. Typically, movie features are human-generated, either editorially (e.g., genre and cast) or by leveraging the wisdom of the crowd (e.g., tags), and as such, they are prone to noise and are expensive to collect. Moreover, these features are often rare or absent for new items, making it difficult or even impossible to provide good-quality recommendations. In this paper, we show that users’ preferences on movies can be described as well as, or even better than, with traditional features by using mise-en-scène features, i.e., the visual aspects of a movie that characterize design, aesthetics and style (e.g., colors, textures). We use both MPEG-7 visual descriptors and Deep Learning hidden layers as examples of mise-en-scène features that can visually describe movies. These features can be computed automatically from any video file, offering flexibility in handling new items, avoiding the need for costly and error-prone human-based tagging, and providing good scalability. We have conducted a set of experiments on a large catalog of 4K movies. Results show that recommendations based on mise-en-scène features consistently outperform those based on traditional metadata attributes (e.g., genre and tag).
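
The abstract does not state which recommendation algorithm consumes the visual features, so the following is only a generic illustration of the content-based idea, not the authors' method. It assumes each movie is already represented by a numeric feature vector (for example, one derived from visual descriptors), builds a user profile as the mean of the liked movies' normalized vectors, and ranks the remaining movies by cosine similarity. The function name `recommend`, its parameters, and the toy vectors are hypothetical.

```python
import numpy as np

def recommend(features, liked, top_n=2):
    """Rank unseen items by cosine similarity to a user's liked items.

    features: (n_items, n_dims) array-like of item feature vectors
              (assumed to be precomputed, e.g. from visual descriptors).
    liked:    list of item indices the user is known to like.
    """
    features = np.asarray(features, dtype=float)
    # L2-normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    # User profile: mean of the liked items' normalized vectors.
    profile = unit[liked].mean(axis=0)
    scores = unit @ profile
    # Never recommend items the user has already liked.
    scores[liked] = -np.inf
    return np.argsort(-scores)[:top_n].tolist()

# Toy catalog: items 0 and 1 look alike, items 2 and 3 look alike.
catalog = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(recommend(catalog, liked=[0]))
```

Because scoring needs only an item's feature vector, a newly added movie with no ratings or tags can still be ranked, which is the new-item flexibility the abstract refers to.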

Acknowledgements

This work is supported by Telecom Italia S.p.A., Open Innovation Department, Joint Open Lab S-Cube, Milan. The work has also been supported by the Amazon AWS Cloud Credits for Research program.

Author information

Authors and Affiliations

Politecnico di Milano, Via Ponzio 34/5, 20133, Milan, Italy
Yashar Deldjoo, Massimo Quadrana & Paolo Cremonesi
Free University of Bozen-Bolzano, Piazza Domenicani 3, 39100, Bolzano, Italy
Mehdi Elahi

Corresponding author

Correspondence to Mehdi Elahi.

About this article

Cite this article

Deldjoo, Y., Elahi, M., Quadrana, M. et al. Using visual features based on MPEG-7 and deep learning for movie recommendation. Int J Multimed Info Retr 7, 207–219 (2018). https://doi.org/10.1007/s13735-018-0155-1

Received: 19 January 2018
Revised: 24 April 2018
Accepted: 02 June 2018
Published: 14 June 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s13735-018-0155-1
