OVIS: ontology video surveillance indexing and retrieval system

  • Mohammed Yassine Kazi Tani (Email author)
  • Abdelghani Ghomari
  • Adel Lablack
  • Ioan Marius Bilasco
Regular Paper


The diversity and widespread deployment of video recorders produce a large volume of video data, whose effective use requires a video indexing process. This process, however, faces a major problem: the semantic gap between the extracted low-level features and the ground truth. The ontology paradigm offers a promising way to overcome this gap, but no naming syntax convention has been followed in the concept creation step, which constitutes a second problem. In this paper, we address both issues and develop a full video surveillance ontology that follows a formal naming syntax convention and semantics and that addresses queries from both academic research and industrial applications. In addition, we propose an ontology video surveillance indexing and retrieval system (OVIS) that uses a set of Semantic Web Rule Language (SWRL) rules to bridge the semantic gap. Existing indexing systems are essentially based on low-level features and use the ontology paradigm only to support this process by representing the surveillance domain. The OVIS system is instead based on the SWRL rules, and our experiments show that this approach leads to promising results on the top video evaluation benchmarks and suggests new directions for future developments.
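
As an illustration of how a rule can map low-level features to a high-level concept, the following is a minimal Python sketch and not the OVIS implementation. The `TrackedObject` schema, the `Running` concept, the `hasSpeed` property and the speed threshold are all hypothetical and are not taken from the paper. The rule mirrors the shape of a SWRL rule such as `Person(?p) ^ hasSpeed(?p, ?s) ^ swrlb:greaterThan(?s, 5.0) -> Running(?p)`.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedObject:
    """Low-level features for one tracked blob (hypothetical schema)."""
    object_id: str
    kind: str      # detector label, e.g. "Person"
    speed: float   # pixels per frame, an illustrative unit
    labels: set = field(default_factory=set)  # inferred high-level concepts


# Illustrative threshold, not a value from the paper.
RUNNING_SPEED = 5.0


def rule_running(obj):
    """Python analogue of the hypothetical SWRL rule:
    Person(?p) ^ hasSpeed(?p, ?s) ^ swrlb:greaterThan(?s, 5.0) -> Running(?p)
    """
    return obj.kind == "Person" and obj.speed > RUNNING_SPEED


# Maps each high-level concept to the rule that infers it.
RULES = {"Running": rule_running}


def infer(objects):
    """Apply every rule to every object and attach the concepts that fire."""
    for obj in objects:
        for concept, rule in RULES.items():
            if rule(obj):
                obj.labels.add(concept)
    return objects


def retrieve(objects, concept):
    """Return the ids of objects indexed under the given concept."""
    return [o.object_id for o in objects if concept in o.labels]
```

In this sketch, indexing amounts to calling `infer` on the extracted objects, and retrieval is a lookup by concept name with `retrieve`. A real system would express the rules in SWRL and evaluate them with an OWL reasoner rather than with plain Python functions.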


Video surveillance ontology; Video indexing; Crowdsourced events; Semantic gap; Naming syntax convention; OVIS system; SWRL rules



Copyright information

© Springer-Verlag London Ltd. 2017

Authors and Affiliations

  • Mohammed Yassine Kazi Tani (1), Email author
  • Abdelghani Ghomari (1)
  • Adel Lablack (2)
  • Ioan Marius Bilasco (2)
  1. RIIR Laboratory, Computer Science Department, Exact Sciences and Applied Faculty, University of Oran 1 Ahmed Ben Bella, Oran, Algeria
  2. Research Center in Signal Informatics and Automatic of Lille (CRIStAL), University of Lille 1, Lille, France
