Visual Video Analytics for Interactive Video Content Analysis

  • Julius Schöning
  • Gunther Heidemann
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 886)


Abstract

Reasoning is an essential processing step for any data analysis task, yet it requires high-level semantic and contextual understanding, e.g., for the identification of entities. Developing an architecture for visual video analytics (VVA), we integrate human knowledge for highly accurate video content analysis, extracting information through a tight coupling of automatic video analysis algorithms on the one hand and visualization and user interaction on the other. For accurate video content analysis, our semi-automatic VVA architecture effectively understands and identifies regular and irregular behavior in real-world datasets. The VVA architecture is described in terms of both (i) its interactive information extraction and representation and (ii) its content-based reasoning process. We give an overview of existing techniques for information extraction and representation, and propose two interactive applications for reasoning. One application uses 3D object representations to provide adaptive playback based on object parts selected in the 3D viewer. The other allows the formulation of a proposition about the video using all extracted objects and information; if the proposition holds, the corresponding frames of the video are highlighted. Based on a user study, relevant open topics for increasing the performance of video content analysis and VVA are discussed.
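The proposition-based reasoning described above can be pictured as filtering frames by a predicate over extracted annotations. The following is a minimal sketch, not the paper's implementation: the `FrameAnnotation` schema, function names, and object labels are all hypothetical, chosen only to illustrate how a proposition formulated over extracted objects could map to a set of highlighted frames.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class FrameAnnotation:
    """Extracted information for one video frame (hypothetical schema)."""
    frame_idx: int
    objects: List[str]  # labels of objects annotated in this frame

def frames_matching(annotations: List[FrameAnnotation],
                    proposition: Callable[[FrameAnnotation], bool]) -> List[int]:
    """Return indices of all frames for which the proposition holds."""
    return [a.frame_idx for a in annotations if proposition(a)]

# Example: highlight frames where a person and a car co-occur.
annotations = [
    FrameAnnotation(0, ["person"]),
    FrameAnnotation(1, ["person", "car"]),
    FrameAnnotation(2, ["car"]),
]
highlighted = frames_matching(
    annotations,
    lambda a: "person" in a.objects and "car" in a.objects,
)
print(highlighted)  # [1]
```

In an interactive setting, the predicate would be built from the user's proposition over all extracted objects and information, and the returned frame indices would drive the highlighting in the video player.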


Keywords

Visual analytics · Video analysis · 3D reconstruction · Object annotation · Human-machine interaction


References

  1. Pillai, G.: Caught on camera: you are filmed on CCTV 300 times a day in London. International Business Times, September 2017
  2. Wu, S., Zheng, S., Yang, H., Fan, Y., Liang, L., Su, H.: SAGTA: semi-automatic ground truth annotation in crowd scenes. In: International Conference on Multimedia and Expo Workshops (ICMEW). IEEE (2014)
  3. Schöning, J., Faion, P., Heidemann, G.: Pixel-wise ground truth annotation in videos: a semi-automatic approach for pixel-wise and semantic object annotation. In: International Conference on Pattern Recognition Applications and Methods (ICPRAM), pp. 690–697. SCITEPRESS (2016)
  4. Schroeter, R., Hunter, J., Kosovic, D.: Vannotea – a collaborative video indexing, annotation and discussion system for broadband networks. In: Workshop on Knowledge Markup & Semantic Annotation (2003)
  5. Tanisaro, P., Schöning, J., Kurzhals, K., Heidemann, G., Weiskopf, D.: Visual analytics for video applications. IT Inf. Technol. 57, 30–36 (2015)
  6. Keim, D.A., Mansmann, F., Schneidewind, J., Thomas, J., Ziegler, H.: Visual analytics: scope and challenges. In: Lecture Notes in Computer Science, pp. 76–90. Springer, Heidelberg (2008)
  7. Höferlin, M., Höferlin, B., Weiskopf, D., Heidemann, G.: Uncertainty-aware video visual analytics of tracked moving objects. J. Spat. Inf. Sci. 2, 87–117 (2011)
  8. Pintore, G., Gobbetti, E.: Effective mobile mapping of multi-room indoor structures. Vis. Comput. 30(6–8), 707–716 (2014)
  9. Sensopia Inc.: Capture the floor plan of your house with magicplan, September 2017
  10. Kowdle, A., Chang, Y.-J., Gallagher, A., Batra, D., Chen, T.: Putting the user in the loop for image-based modeling. Int. J. Comput. Vis. 108(1–2), 30–48 (2014)
  11. Pan, Q., Reitmayr, G., Drummond, T.: ProFORMA: probabilistic feature-based on-line rapid model acquisition. In: British Machine Vision Conference (BMVC). British Machine Vision Association (2009)
  12. Wu, C.: VisualSFM: a visual structure from motion system, January 2011
  13. Marconi, D.: Enemy of the State. Touchstone Pictures (1998)
  14. Höferlin, B., Höferlin, M., Weiskopf, D., Heidemann, G.: Scalable video visual analytics. Inf. Vis. 14(1), 10–26 (2013)
  15. Russell, D.M., Stefik, M.J., Pirolli, P., Card, S.K.: The cost structure of sensemaking. In: SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 269–276. ACM Press (1993)
  16. Thomas, J.J., Cook, K.A. (eds.): Illuminating the Path: The Research and Development Agenda for Visual Analytics. IEEE Computer Society Press (2005)
  17. Höferlin, B., Netzel, R., Höferlin, M., Weiskopf, D., Heidemann, G.: Inter-active learning of ad-hoc classifiers for video visual analytics. In: Conference on Visual Analytics Science and Technology (VAST), pp. 23–32. IEEE (2012)
  18. Pirolli, P., Card, S.: The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis. In: International Conference on Intelligence Analysis (2005)
  19. Thomas, J.J., Cook, K.A.: A visual analytics agenda. IEEE Comput. Graph. Appl. 26(1), 10–13 (2006)
  20. Dasiopoulou, S., Giannakidou, E., Litos, G., Malasioti, P., Kompatsiaris, Y.: A survey of semantic image and video annotation tools. In: Knowledge-Driven Multimedia Information Extraction and Ontology Evolution, pp. 196–239. Springer, Heidelberg (2011)
  21. Multimedia Knowledge and Social Media Analytics Laboratory: Video Image Annotation Tool, January 2012
  22. Doermann, D., Mihalcik, D.: Tools and techniques for video performance evaluation. In: International Conference on Pattern Recognition (ICPR), pp. 167–170. IEEE Computer Society Press (2000)
  23. Schöning, J., Heidemann, G.: Interactive 3D modeling: a survey-based perspective on interactive 3D reconstruction. In: International Conference on Pattern Recognition Applications and Methods (ICPRAM), pp. 289–294. SCITEPRESS (2015)
  24. Schöning, J., Heidemann, G.: Bio-inspired architecture for deriving 3D models from video sequences. In: Computer Vision – ACCV Workshops, pp. 62–76. Springer, Heidelberg (2016)
  25. Trick, L.M., Enns, J.T.: Lifespan changes in attention: the visual search task. Cogn. Dev. 13(3), 369–386 (1998)
  26. Eriksen, C.W., Schultz, D.W.: Information processing in visual search: a continuous flow conception and experimental results. Percept. Psychophys. 25(4), 249–263 (1979)
  27. Schöning, J., Faion, P., Heidemann, G.: Interactive feature growing for accurate object detection in megapixel images. In: Computer Vision – ECCV Workshops, vol. 9913, pp. 546–556. Springer, Heidelberg (2016)
  28. Schöning, J., Faion, P., Heidemann, G., Krumnack, U.: Providing video annotations in multimedia containers for visualization and research. In: Winter Conference on Applications of Computer Vision (WACV). IEEE (2017)
  29. Schöning, J., Faion, P., Heidemann, G., Krumnack, U.: Eye tracking data in multimedia containers for instantaneous visualizations. In: IEEE VIS Workshop on Eye Tracking and Visualization (ETVIS), pp. 74–78. IEEE (2016)
  30. Schöning, J., Gert, A.L., Açik, A., Kietzmann, T.C., Heidemann, G., König, P.: Exploratory multimodal data analysis with standard multimedia player – multimedia containers: a feasible solution to make multimodal research data accessible to the broad audience. In: Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP), pp. 272–279. SCITEPRESS (2017)
  31. Ogg, September 2017
  32. Matroska: Matroska media container, September 2017

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany