Predicting Interestingness of Visual Content

Demarty, Claire-Hélène; Sjöberg, Mats; Constantin, Mihai Gabriel; Duong, Ngoc Q. K.; Ionescu, Bogdan; Do, Thanh-Toan; Wang, Hanli

doi:10.1007/978-3-319-57687-9_10

Part of the book series: Multimedia Systems and Applications (MMSA)

Abstract

The ability of multimedia data to attract and keep people’s interest for longer periods of time is gaining more and more importance in the fields of information retrieval and recommendation, especially in the context of the ever-growing market value of social media and advertising. In this chapter, we introduce a benchmarking framework (dataset and evaluation tools) designed specifically for assessing the performance of media interestingness prediction techniques. We release a dataset consisting of excerpts from 78 trailers of Hollywood-like movies. These data are annotated by human assessors according to their degree of interestingness. A real-world use scenario is targeted: interestingness is defined in the context of selecting visual content for illustrating a Video on Demand (VOD) website. We provide an in-depth analysis of the human aspects of this task, i.e., the correlation between perceptual characteristics of the content and the actual data, as well as of the machine aspects, by overviewing the participating systems of the 2016 MediaEval Predicting Media Interestingness campaign. After discussing state-of-the-art achievements, we present valuable insights, current capabilities, and future challenges.

Acknowledgements

We would like to thank Yu-Gang Jiang and Baohan Xu from Fudan University, China, and Hervé Bredin from LIMSI, France, for providing the features that accompany the released data, and Frédéric Lefebvre, Alexey Ozerov and Vincent Demoulin for their valuable input to the task definition. We would also like to thank our anonymous annotators for their contribution to building the ground truth for the datasets. Part of this work was funded under project SPOTTER PN-III-P2-2.1-PED-2016-1065, contract 30PED/2017.

Author information

Authors and Affiliations

Technicolor R&I, Rennes, France
Claire-Hélène Demarty & Ngoc Q. K. Duong
Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
Mats Sjöberg
LAPI, University Politehnica of Bucharest, Bucharest, Romania
Mihai Gabriel Constantin & Bogdan Ionescu
Singapore University of Technology and Design, Singapore, Singapore
Thanh-Toan Do
University of Science, Ho Chi Minh City, Vietnam
Thanh-Toan Do
Department of Computer Science and Technology, Tongji University, Shanghai, China
Hanli Wang

Corresponding author

Correspondence to Claire-Hélène Demarty.

Editor information

Editors and Affiliations

LaBRI UMR 5800, Univ. Bordeaux, CNRS, Bordeaux INP, Univ. Bordeaux, Talence, France
Jenny Benois-Pineau
LS2N, UMR CNRS 6004, Université de Nantes, Nantes Cedex 3, France
Patrick Le Callet

About this chapter

Cite this chapter

Demarty, CH. et al. (2017). Predicting Interestingness of Visual Content. In: Benois-Pineau, J., Le Callet, P. (eds) Visual Content Indexing and Retrieval with Psycho-Visual Models. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-57687-9_10

DOI: https://doi.org/10.1007/978-3-319-57687-9_10
Published: 16 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57686-2
Online ISBN: 978-3-319-57687-9
eBook Packages: Computer Science, Computer Science (R0)
