Video Content Representation Using Recurring Regions Detection

  • Lukas Diem
  • Maia Zaharieva
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9516)


In this work, we present an approach for video content representation based on the detection of recurring visual elements or regions. We hypothesize that such elements play a potentially central role in the underlying video sequence. The approach relies on fundamental intrinsic properties of a video and thus makes no assumptions about the video content itself. Furthermore, it requires no training or prior knowledge of the general setting or video domain. Preliminary experiments with a small, heterogeneous dataset of web videos demonstrate the potential of the approach as a compact summary of the video content that focuses on its central visual elements. Additionally, the resulting representations enable the retrieval of video sequences that share common visual elements.


Video content-based analysis, Recurring regions, Video representation



This work has been partly funded by the Vienna Science and Technology Fund (WWTF) through project ICT12-010.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Multimedia Information Systems Group, University of Vienna, Vienna, Austria
  2. Interactive Media Systems Group, Vienna University of Technology, Vienna, Austria
