Exploiting Multimodal Interaction Techniques for Video-Surveillance

  • Chapter

Part of the book series: Intelligent Systems Reference Library ((ISRL,volume 48))

Abstract

In this chapter we present an example of a video-surveillance application that exploits Multimodal Interaction (MI) technologies. The main objective of the so-called VID-Hum prototype was to develop a cognitive artificial system for both the detection and description of a particular set of human behaviours arising from real-world events. The procedure described in this chapter entails: (i) adaptation, since the system adapts itself to the most common behaviours (qualitative data) inferred from tracking (quantitative data) and is thus able to recognize abnormal behaviours; (ii) feedback, since an advanced interface based on Natural Language understanding allows end-users to communicate with the prototype by means of conceptual sentences; and (iii) multimodality, since a virtual avatar has been designed to describe what is happening in the scene, based on the textual interpretations generated by the prototype. The MI methodology thus provides an adequate framework for all these cooperating processes.
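The "adaptation" step above can be illustrated with a deliberately minimal sketch. This is not the VID-Hum implementation (whose models are not given here), only an assumed toy scheme: common trajectories from the tracker are memorised as normal patterns, and a new trajectory is flagged abnormal when its distance to every known pattern exceeds a threshold.

```python
import math


def traj_distance(a, b):
    """Mean pointwise Euclidean distance between two equal-length trajectories."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


class BehaviourModel:
    """Toy stand-in for the adaptation step: memorise common trajectories
    (quantitative tracking data) and flag trajectories far from all of them."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.patterns = []  # trajectories observed to be normal

    def adapt(self, trajectory):
        """Absorb one observed normal trajectory into the model."""
        self.patterns.append(trajectory)

    def is_abnormal(self, trajectory):
        """A trajectory is abnormal if no learned pattern is close enough."""
        if not self.patterns:
            return True  # nothing learned yet: everything is novel
        score = min(traj_distance(trajectory, p) for p in self.patterns)
        return score > self.threshold


# Usage: pedestrians normally walk left-to-right along y ~ 0.
model = BehaviourModel(threshold=1.0)
model.adapt([(0, 0.0), (1, 0.0), (2, 0.0), (3, 0.0)])
model.adapt([(0, 0.2), (1, 0.1), (2, 0.0), (3, 0.1)])

print(model.is_abnormal([(0, 0.1), (1, 0.0), (2, 0.1), (3, 0.2)]))  # False: close to learned flow
print(model.is_abnormal([(0, 0.0), (1, 3.0), (2, 6.0), (3, 9.0)]))  # True: cuts across the scene
```

A real system would of course cluster trajectories online rather than store them all, but the decision rule — distance to the nearest learned pattern against a threshold — is the same idea.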




Author information

Correspondence to Marc Castelló.


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Castelló, M. et al. (2013). Exploiting Multimodal Interaction Techniques for Video-Surveillance. In: Multimodal Interaction in Image and Video Applications. Intelligent Systems Reference Library, vol 48. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35932-3_8

  • DOI: https://doi.org/10.1007/978-3-642-35932-3_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35931-6

  • Online ISBN: 978-3-642-35932-3

  • eBook Packages: Engineering (R0)
