Exploiting Multimodal Interaction Techniques for Video-Surveillance

  • Chapter

Part of the book series: Intelligent Systems Reference Library ((ISRL,volume 48))

Abstract

In this chapter we present an example of a video-surveillance application that exploits Multimodal Interaction (MI) technologies. The main objective of the so-called VID-Hum prototype was to develop a cognitive artificial system for both the detection and description of a particular set of human behaviours arising from real-world events. The procedure described in this chapter entails: (i) adaptation, since the system adapts itself to the most common behaviours (qualitative data) inferred from tracking (quantitative data) and is thus able to recognize abnormal behaviours; (ii) feedback, since an advanced interface based on Natural Language understanding allows end-users to communicate with the prototype by means of conceptual sentences; and (iii) multimodality, since a virtual avatar has been designed to describe what is happening in the scene, based on the textual interpretations generated by the prototype. The MI methodology thus provides an adequate framework for all these cooperating processes.
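The "adaptation" step above can be illustrated with a deliberately minimal sketch. This is not the VID-Hum implementation (whose models are not given here), only an assumed toy scheme: common trajectories from the tracker are memorised as normal patterns, and a new trajectory is flagged abnormal when its distance to every known pattern exceeds a threshold.

```python
import math


def traj_distance(a, b):
    """Mean pointwise Euclidean distance between two equal-length trajectories."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


class BehaviourModel:
    """Toy stand-in for the adaptation step: memorise common trajectories
    (quantitative tracking data) and flag trajectories far from all of them."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.patterns = []  # trajectories observed to be normal

    def adapt(self, trajectory):
        """Absorb one observed normal trajectory into the model."""
        self.patterns.append(trajectory)

    def is_abnormal(self, trajectory):
        """A trajectory is abnormal if no learned pattern is close enough."""
        if not self.patterns:
            return True  # nothing learned yet: everything is novel
        score = min(traj_distance(trajectory, p) for p in self.patterns)
        return score > self.threshold


# Usage: pedestrians normally walk left-to-right along y ~ 0.
model = BehaviourModel(threshold=1.0)
model.adapt([(0, 0.0), (1, 0.0), (2, 0.0), (3, 0.0)])
model.adapt([(0, 0.2), (1, 0.1), (2, 0.0), (3, 0.1)])

print(model.is_abnormal([(0, 0.1), (1, 0.0), (2, 0.1), (3, 0.2)]))  # False: close to learned flow
print(model.is_abnormal([(0, 0.0), (1, 3.0), (2, 6.0), (3, 9.0)]))  # True: cuts across the scene
```

A real system would of course cluster trajectories online rather than store them all, but the decision rule — distance to the nearest learned pattern against a threshold — is the same idea.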




Author information

Correspondence to Marc Castelló.


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Castelló, M. et al. (2013). Exploiting Multimodal Interaction Techniques for Video-Surveillance. In: Multimodal Interaction in Image and Video Applications. Intelligent Systems Reference Library, vol 48. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35932-3_8

  • DOI: https://doi.org/10.1007/978-3-642-35932-3_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35931-6

  • Online ISBN: 978-3-642-35932-3

  • eBook Packages: Engineering (R0)
