An Alternative Approach to Exploring a Video

  • Fahim A. Salim
  • Fasih Haider
  • Owen Conlan
  • Saturnino Luz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10458)


Exploring the content of a video is typically inefficient due to the linear, streamed nature of the medium. A video may be seen as a combination of features: the visual track, the audio track, a transcription of the spoken words, and so on. These features may be viewed as a set of temporally bounded parallel modalities. It is our contention that these modalities, together with features derived from them, can be presented individually or in discrete combinations to allow deeper and more effective exploration of the content within different parts of a video. This paper presents a novel system for video exploration and reports a recent user study conducted to learn usage patterns by offering video content in an alternative representation. The learned usage patterns may be used to build a template-driven representation engine that draws on these features to offer a multimodal synopsis of a video, potentially enabling more efficient exploration of its content.
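The idea of a video as a set of temporally bounded parallel modalities can be made concrete with a small data-structure sketch. The following is purely illustrative and not the authors' implementation; the class and method names (`Segment`, `Video`, `synopsis`) are assumptions chosen for the example. It models each modality (transcript, visual track, etc.) as a list of time-bounded segments and selects the segments of chosen modalities that overlap a time window, i.e. a minimal multimodal synopsis query.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    start: float   # seconds from the start of the video
    end: float
    content: str   # e.g. a transcript snippet or a keyframe label

@dataclass
class Video:
    duration: float
    # modality name -> list of temporally bounded segments
    tracks: dict = field(default_factory=dict)

    def add_segment(self, modality: str, seg: Segment) -> None:
        self.tracks.setdefault(modality, []).append(seg)

    def synopsis(self, modalities, t0: float, t1: float):
        """Return, per requested modality, the segments overlapping [t0, t1]."""
        return {
            m: [s for s in self.tracks.get(m, [])
                if s.start < t1 and s.end > t0]
            for m in modalities
        }

video = Video(duration=300.0)
video.add_segment("transcript", Segment(0.0, 12.5, "Welcome to the talk"))
video.add_segment("transcript", Segment(12.5, 60.0, "Today we discuss..."))
video.add_segment("visual", Segment(0.0, 30.0, "title slide"))

# A synopsis over the first 15 seconds, combining two modalities.
print(video.synopsis(["transcript", "visual"], 0.0, 15.0))
```

A template-driven representation engine of the kind the paper envisages could then choose which modality combinations to render for each window, guided by observed usage patterns.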


Keywords: Multimedia analysis · Video representation



This research is supported by Science Foundation Ireland through the CNGL Programme (Grant 12/CE/I2267) in the ADAPT Centre at the School of Computer Science and Statistics, Trinity College Dublin, Ireland.


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Fahim A. Salim (1)
  • Fasih Haider (1)
  • Owen Conlan (1)
  • Saturnino Luz (2)
  1. ADAPT Centre, Trinity College Dublin, Dublin, Ireland
  2. IPHSI, University of Edinburgh, Edinburgh, UK
