Knowledge-based detection of events in video streams from salient regions of activity

  • Theoretical Advances
  • Pattern Analysis and Applications

Abstract

Visual events occurring in video streams (such as human postures or more complex activities) are detected from a robust, generic region-based representation of the visual content and inferred using a spatio-temporal language that integrates domain-specific knowledge. More specifically, salient regions of activity are first extracted from the dynamics of salient points across the scene. These regions are then mapped to a domain vocabulary by a state-of-the-art classifier, so that the visual content is described in terms of semantic facts. Occurrences of events, modelled as assertions in a language that represents spatio-temporal relationships between facts, are inferred from the video descriptions by a forward-reasoning engine. An application to visual event retrieval in videos of meetings is presented as a test case.
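To make the inference step more concrete, the following is a minimal sketch of forward reasoning over time-stamped semantic facts. It is not the authors' engine or language: the `Fact` structure, the labels ("sitting", "standing", "walking"), the two rules, and the `max_gap` threshold are all hypothetical choices made for illustration, and the only temporal relation modelled is "closely before".

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Fact:
    """A semantic fact: a vocabulary label held by a subject over a frame interval."""
    label: str    # vocabulary term (hypothetical, e.g. "sitting")
    subject: str  # hypothetical identifier of the tracked region or person
    start: int    # first frame of the interval
    end: int      # last frame of the interval (inclusive)


# Each rule reads: if `first` is closely followed by `second` for the same
# subject, assert `event`. Labels and rules are illustrative assumptions.
Rule = Tuple[str, str, str]
RULES: List[Rule] = [
    ("sitting", "standing", "stands_up"),
    ("stands_up", "walking", "leaves_seat"),
]


def closely_before(a: Fact, b: Fact, max_gap: int = 5) -> bool:
    """Temporal relation: b starts after a ends, at most max_gap frames later."""
    return 0 < b.start - a.end <= max_gap


def forward_chain(facts: Iterable[Fact], rules: List[Rule] = RULES) -> Set[Fact]:
    """Apply the rules repeatedly until no new fact can be derived (a fixpoint)."""
    known: Set[Fact] = set(facts)
    changed = True
    while changed:
        changed = False
        for first, second, event in rules:
            # Iterate over snapshots so that `known` can grow inside the loops.
            for a in list(known):
                if a.label != first:
                    continue
                for b in list(known):
                    if b.label != second or b.subject != a.subject:
                        continue
                    if closely_before(a, b):
                        # The derived event spans both premise intervals.
                        new_fact = Fact(event, a.subject, a.start, b.end)
                        if new_fact not in known:
                            known.add(new_fact)
                            changed = True
    return known


# Facts as a classifier might emit them for one person in a meeting video.
demo_facts = [
    Fact("sitting", "p1", 0, 40),
    Fact("standing", "p1", 42, 60),
    Fact("walking", "p1", 61, 90),
]
derived = forward_chain(demo_facts)
```

With these inputs the first rule derives a `stands_up` fact for `p1`, which in turn lets the second rule derive `leaves_seat`. This chaining of derived facts into further premises is what distinguishes forward reasoning from a single pass of pattern matching.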

Acknowledgements

This work is funded by the EU-IST project M4 (http://www.m4project.org) and the Swiss NCCR IM2 (Interactive Multimodal Information Management).

Author information

Correspondence to Nicolas Moënne-Loccoz.

About this article

Cite this article

Moënne-Loccoz, N., Bruno, E. & Marchand-Maillet, S. Knowledge-based detection of events in video streams from salient regions of activity. Pattern Anal Applic 7, 422–429 (2004). https://doi.org/10.1007/s10044-004-0235-0

  • DOI: https://doi.org/10.1007/s10044-004-0235-0
