Evaluation of visual video summaries: user-supplied constructs and descriptions



Evaluating video summarization approaches requires more information on the user-perceived qualities of different types of summaries, and evaluation measures need to be developed further in a user-led manner. This article reports on a user-centered evaluation of visual video summaries. Four types of summaries (fast-forward, user-controlled fast-forward, scene clips, and storyboard) were evaluated with a set of existing performance and satisfaction measures. A repertory grid elicitation was conducted with the participants to gather evaluation constructs related to both video summary content and controls. The results showed a lack of correlation between performance and satisfaction measures. The user-supplied evaluation constructs were shown to span both the performance and satisfaction dimensions of the video summary evaluation space. Most constructs achieved moderate to good inter-rater agreement in a subsequent survey. Free descriptions of the videos and their respective summaries showed that, while users are able to interpret object- and event-related information from short summaries, thematic inference was lacking, leading to worse descriptions than for the full videos.


Video summarization, Video summaries, Evaluation measures, Repertory grid, Video attributes



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Department of Media Technology, Aalto University School of Science, Aalto, Finland
