Abstract
Most approaches to the visual perception of humans do not include high-level activity recognition. This paper presents a system that fuses and interprets the outputs of several computer vision components, as well as speech recognition, to obtain a high-level understanding of the perceived scene. Our laboratory for investigating new ways of human-machine interaction and teamwork support is equipped with an assemblage of cameras, several close-talking microphones, and a videowall as the main interaction device. Here, we develop state-of-the-art real-time computer vision systems to track and identify users, and to estimate their visual focus of attention and gesture activity. We also monitor the users' speech activity in real time. This paper explains our approach to high-level activity recognition based on these perceptual components and a temporal logic engine.
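To illustrate the idea of deriving high-level activities from timed perceptual events, the following is a minimal sketch, not the authors' actual engine: the `Event` class, the `infer_attention` rule, and all event names are hypothetical. It assumes each perceptual component (speech activity detection, visual focus of attention) emits time-stamped intervals, and applies a single interval-temporal-logic-style rule: if person B's visual focus rests on person A while A is speaking, infer the high-level activity "B attends to A".

```python
from dataclasses import dataclass

@dataclass
class Event:
    """A timed observation from one perceptual component (hypothetical schema)."""
    actor: str    # who the observation is about
    kind: str     # e.g. "speaking" or "focus"
    target: str   # for "focus": whom the actor looks at; "" otherwise
    start: float  # interval start time in seconds
    end: float    # interval end time in seconds

def overlaps(a: Event, b: Event) -> bool:
    # Simple temporal-intersection test; a full engine would distinguish
    # Allen's thirteen interval relations (before, meets, overlaps, ...).
    return a.start < b.end and b.start < a.end

def infer_attention(events):
    """Toy rule: focus-on-speaker during speech yields 'attends_to'."""
    inferred = []
    for f in events:
        if f.kind != "focus":
            continue
        for s in events:
            if s.kind == "speaking" and s.actor == f.target and overlaps(f, s):
                inferred.append((f.actor, "attends_to", f.target))
    return inferred

events = [
    Event("A", "speaking", "", 0.0, 5.0),   # A talks from t=0 to t=5
    Event("B", "focus", "A", 2.0, 6.0),     # B looks at A from t=2 to t=6
    Event("C", "focus", "B", 2.0, 6.0),     # C looks at B (B is not speaking)
]
print(infer_attention(events))  # [('B', 'attends_to', 'A')]
```

A real system would chain such rules, so that inferred high-level events (here, `attends_to`) can themselves serve as premises for further rules.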
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Ijsselmuiden, J., Stiefelhagen, R. (2010). Towards High-Level Human Activity Recognition through Computer Vision and Temporal Logic. In: Dillmann, R., Beyerer, J., Hanebeck, U.D., Schultz, T. (eds) KI 2010: Advances in Artificial Intelligence. KI 2010. Lecture Notes in Computer Science(), vol 6359. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16111-7_49
Print ISBN: 978-3-642-16110-0
Online ISBN: 978-3-642-16111-7