Abstract
This work investigates several ways to exploit scene depth information, implicitly available through stereoscopic disparity in 3D videos, in order to improve performance in recognizing complex human activities in natural settings. The standard state-of-the-art activity recognition pipeline consists of the consecutive stages of video description, video representation, and video classification. Multimodal, depth-aware modifications to standard methods are proposed and studied, both at the video description level and at the video representation level, which indirectly incorporate scene geometry information derived from stereo disparity. At the description level, this is achieved by suitably manipulating video interest points based on disparity data. At the representation level, each video is represented by multiple vectors corresponding to different disparity zones, yielding multiple activity descriptions defined by disparity characteristics. In both cases, an implicit scene segmentation is thus performed, based on the distance of each imaged object from the camera during video acquisition. The investigated approaches are flexible and can cooperate with any monocular low-level feature descriptor. They are evaluated on a publicly available activity recognition dataset of unconstrained stereoscopic 3D videos, consisting of excerpts from Hollywood movies, and are compared both against competing depth-aware approaches and against a state-of-the-art monocular algorithm. Quantitative evaluation reveals that some of the examined approaches achieve state-of-the-art performance.
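The representation-level idea described above — splitting a video's local features into disparity zones and building one representation vector per zone — can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the quantile-based zone boundaries, and the hard nearest-word assignment are illustrative assumptions for a generic bag-of-words setup:

```python
import numpy as np

def disparity_zone_bow(descriptors, disparities, codebook, n_zones=3):
    """Illustrative sketch: partition local video descriptors into
    disparity (depth) zones and build one bag-of-words histogram per
    zone, then concatenate the per-zone histograms.

    descriptors : (N, D) array of local feature descriptors
    disparities : (N,) disparity value at each interest point
    codebook    : (K, D) array of visual-word centers
    """
    # Zone boundaries from disparity quantiles (near/mid/far for n_zones=3);
    # the paper's actual zone definition may differ.
    edges = np.quantile(disparities, np.linspace(0.0, 1.0, n_zones + 1))
    zone_of = np.digitize(disparities, edges[1:-1])  # values in 0..n_zones-1

    # Hard-assign each descriptor to its nearest visual word.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    words = d2.argmin(axis=1)

    K = codebook.shape[0]
    hist = np.zeros(n_zones * K)
    for z, w in zip(zone_of, words):
        hist[z * K + w] += 1

    # L1-normalize each zone's sub-histogram independently, so zones with
    # few interest points are not dominated by denser ones.
    for z in range(n_zones):
        s = hist[z * K:(z + 1) * K].sum()
        if s > 0:
            hist[z * K:(z + 1) * K] /= s
    return hist
```

The resulting concatenated vector carries one activity description per disparity zone, so a classifier can weigh near-camera (typically actor) features differently from far-away (typically background) ones.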
Acknowledgment
The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement number 287674 (3DTVS). This publication reflects only the author’s views. The European Union is not liable for any use that may be made of the information contained therein.
Mademlis, I., Iosifidis, A., Tefas, A. et al. Exploiting stereoscopic disparity for augmenting human activity recognition performance. Multimed Tools Appl 75, 11641–11660 (2016). https://doi.org/10.1007/s11042-015-2719-x