Abstract
In this paper, we propose a fast, efficient, and easy-to-implement method for recognizing human actions in depth image sequences. Human body silhouettes are first extracted from the depth images using Gaussian mixture model background subtraction. After the foreground is denoised by a cascade of morphological operations and area filtering, the contour of the human silhouette is extracted with the Moore-neighbor contour-tracing algorithm. From this contour, features describing the human posture are computed using the Histogram of Templates (HoT) descriptor. These features are then used to train a dendrogram-based support vector machine that generates a frame-by-frame posture-variation signal for the action sequence. The histogram of this signal is computed and finally fed as the input vector to a fuzzy k-nearest-neighbor (FkNN) classifier that recognizes the action. The proposed method is evaluated on two publicly available datasets containing various daily actions (bending, sitting, lying, etc.) performed by different human subjects. Extensive experiments are conducted with several values of the number of neighbors k in the FkNN and with different similarity measures, namely the Euclidean, Bhattacharyya, Kullback–Leibler, and histogram-intersection distances. The results show that the proposed method performs better than, or comparably to, other state-of-the-art approaches. Moreover, it processes 18 frames per second, which makes it well suited to applications requiring real-time human action recognition.
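The final classification stage described above — comparing posture-variation histograms under several distance measures and voting with a fuzzy k-NN — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the membership rule follows the standard Keller-style fuzzy k-NN, and all function names and toy histograms are ours.

```python
import numpy as np

def euclidean(p, q):
    """Euclidean distance between two histograms."""
    return float(np.linalg.norm(p - q))

def bhattacharyya(p, q):
    """Bhattacharyya distance between two L1-normalized histograms."""
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient
    return float(-np.log(max(bc, 1e-12)))

def kl_divergence(p, q, eps=1e-10):
    """Kullback-Leibler divergence, smoothed to avoid log(0)."""
    p, q = p + eps, q + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def intersection_distance(p, q):
    """Histogram-intersection similarity turned into a distance."""
    return float(1.0 - np.sum(np.minimum(p, q)))

def fuzzy_knn(train_hists, train_labels, query, k=3, m=2, dist=euclidean):
    """Classify a posture-variation histogram with a fuzzy k-NN.

    Class memberships are weighted by inverse distance raised to
    2/(m-1), as in Keller et al.'s fuzzy k-NN formulation.
    """
    d = np.array([dist(h, query) for h in train_hists])
    idx = np.argsort(d)[:k]                       # k nearest neighbors
    w = 1.0 / np.maximum(d[idx], 1e-12) ** (2.0 / (m - 1))
    classes = sorted(set(train_labels))
    memberships = {c: w[[train_labels[i] == c for i in idx]].sum() / w.sum()
                   for c in classes}
    return max(memberships, key=memberships.get), memberships

# Toy usage: two 'bend' histograms and one 'sit' histogram.
hists = [np.array([0.9, 0.1]), np.array([0.8, 0.2]), np.array([0.1, 0.9])]
labels = ["bend", "bend", "sit"]
pred, mem = fuzzy_knn(hists, labels, np.array([0.85, 0.15]), k=2)
```

Swapping `dist=bhattacharyya` (or any of the other measures) into `fuzzy_knn` reproduces the kind of distance-measure comparison reported in the experiments.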
Data Availability
The datasets used in this research are publicly available: the SDUFall Dataset (http://www.sucro.org/homepage/wanghaibo/SDUFall.html) and the Fall Detection Dataset (https://falldataset.com/).
References
Elharrouss O, Almaadeed N, Al-Maadeed S et al (2021) A combined multiple action recognition and summarization for surveillance video sequences. Appl Intell 51:690–712
Zhang J, Shan Y, Huang K (2015) ISEE Smart Home (ISH): Smart video analysis for home security. Neurocomputing 149:752–766
Yao L, Sheng QZ, Benatallah B et al (2018) WITS: an IoT-endowed computational framework for activity recognition in personalized smart homes. Computing 100:369–385
Gao Y, Xiang X, Xiong N et al (2018) Human Action Monitoring for Healthcare Based on Deep Learning. IEEE Access 6:52277–52285
Mukherjee D, Mondal R, Singh PK et al (2020) EnsemConvNet: a deep learning approach for human activity recognition using smartphone sensors for healthcare applications. Multimedia Tools Appl 79:31663–31690
Olatunji IE (2018) Human Activity Recognition for Mobile Robot. J Phys: Conf Ser 1069:012148
Fan J, Zheng P, Li S (2022) Vision-based holistic scene understanding towards proactive human–robot collaboration. Robot Comput-Integr Manuf 75:102304
Sowmyayani S, Rani PAJ (2022) STHARNet: spatio-temporal human action recognition network in content based video retrieval. Multimed Tools Appl. https://doi.org/10.1007/s11042-022-14056-8
Lu M, Hu Y, Lu X (2020) Driver action recognition using deformable and dilated faster R-CNN with optimized region proposals. Appl Intell 50:1100–1111
Zhou E, Zhang H (2020) Human action recognition toward massive-scale sport sceneries based on deep multi-model feature fusion. Signal Process: Image Commun 84:115802
Host K, Ivašić-Kos M (2022) An overview of Human Action Recognition in sports based on Computer Vision. Heliyon 8:e09633
Minh Dang L, Min K, Wang H et al (2020) Sensor-based and vision-based human activity recognition: A comprehensive survey. Pattern Recogn 108:107561
Gupta P, Dallas T (2014) Feature Selection and Activity Recognition System Using a Single Triaxial Accelerometer. IEEE Trans Biomed Eng 61:1780–1786
Jiang W, Yin Z (2015) Human Activity Recognition Using Wearable Sensors by Deep Convolutional Neural Networks. Proceedings of the International Conference on Multimedia. ACM, pp 1307–1310. https://doi.org/10.1145/2733373.2806333
Ha S, Choi S (2016) Convolutional neural networks for human activity recognition using multiple accelerometer and gyroscope sensors. Proceedings of the International Joint Conference on Neural Networks. IEEE, pp 381–388. https://doi.org/10.1109/IJCNN.2016.7727224
Jalloul N, Porée F, Viardot G et al (2018) Activity Recognition Using Complex Network Analysis. IEEE J Biomed Health Inform 22:989–1000
Quaid MAK, Jalal A (2020) Wearable sensors based human behavioral pattern recognition using statistical features and reweighted genetic algorithm. Multimed Tools Appl 79:6061–6083
ud din Tahir SB, Jalal A, Kim K (2020) Wearable Inertial Sensors for Daily Activity Analysis Based on Adam Optimization and the Maximum Entropy Markov Model. Entropy 22:579
Varshney N, Bakariya B, Kushwaha AKS, Khare M (2022) Human activity recognition by combining external features with accelerometer sensor data using deep learning network model. Multimed Tools Appl 81:34633–34652
Rajamoney J, Ramachandran A (2023) Representative-discriminative dictionary learning algorithm for human action recognition using smartphone sensors. Concurr Comput: Pract Experience 35:e7468
Javeed M, Gochoo M, Jalal A, Kim K (2021) HF-SPHR: Hybrid Features for Sustainable Physical Healthcare Pattern Recognition Using Deep Belief Networks. Sustainability 13:1699
Das Dawn D, Shaikh SH (2016) A comprehensive survey of human action recognition with spatio-temporal interest point (STIP) detector. Vis Comput 32:289–306
Chen E, Zhang S, Liang C (2017) Action Recognition Using Motion History Image and Static History Image-based Local Binary Patterns. Int J Multimed Ubiquit Eng 12:203–214
Kumar SS, John M (2016) Human activity recognition using optical flow based feature set. Proceedings of the International Carnahan Conference on Security Technology. IEEE, pp 1–5. https://doi.org/10.1109/CCST.2016.7815694
Aslan MF, Durdu A, Sabanci K (2020) Human action recognition with bag of visual words using different machine learning methods and hyperparameter optimization. Neural Comput Appl 32:8585–8597
Sun Z, Ke Q, Rahmani H et al (2023) Human Action Recognition From Various Data Modalities: A Review. IEEE Trans Pattern Anal Mach Intell 45:3200–3225
Roh M-C, Shin H-K, Lee S-W (2010) View-independent human action recognition with Volume Motion Template on single stereo camera. Pattern Recogn Lett 31:639–647
Sanchez-Riera J, Čech J, Horaud R (2012) Action Recognition Robust to Background Clutter by Using Stereo Vision. Proceedings of the European Conference on Computer Vision. Springer, pp 332–341. https://doi.org/10.1007/978-3-642-33863-2_33
Murtaza F, Yousaf MH, Velastin SA (2016) Multi-view human action recognition using 2D motion templates based on MHIs and their HOG description. IET Comput Vision 10:758–767
Kushwaha AKS, Srivastava S, Srivastava R (2017) Multi-view human activity recognition based on silhouette and uniform rotation invariant local binary patterns. Multimedia Syst 23:451–467
Singh R, Kushwaha AKS, Srivastava R (2019) Multi-view recognition system for human activity based on multiple features for video surveillance system. Multimed Tools Appl 78:17165–17196
Jalal A, Khalid N, Kim K (2020) Automatic Recognition of Human Interaction via Hybrid Descriptors and Maximum Entropy Markov Model Using Depth Sensors. Entropy 22:817
Akula A, Shah AK, Ghosh R (2018) Deep learning approach for human action recognition in infrared images. Cogn Syst Res 50:146–154
Batchuluun G, Nguyen DT, Pham TD et al (2019) Action Recognition From Thermal Videos. IEEE Access 7:103893–103917
Batchuluun G, Kang JK, Nguyen DT et al (2021) Action Recognition From Thermal Videos Using Joint and Skeleton Information. IEEE Access 9:11716–11733
Malawski F, Kwolek B (2019) Improving multimodal action representation with joint motion history context. J Vis Commun Image Represent 61:198–208
Wang H, Wang L (2018) Learning content and style: Joint action recognition and person identification from human skeletons. Pattern Recogn 81:23–35
Qiao R, Liu L, Shen C, van den Hengel A (2017) Learning discriminative trajectorylet detector sets for accurate skeleton-based action recognition. Pattern Recogn 66:202–212
Carbonera Luvizon D, Tabia H, Picard D (2017) Learning features combination for human action recognition from skeleton sequences. Pattern Recogn Lett 99:13–20
Patrona F, Chatzitofis A, Zarpalas D, Daras P (2018) Motion analysis: Action detection, recognition and evaluation based on motion capture data. Pattern Recogn 76:612–622
Wang H, Wang L (2018) Beyond Joints: Learning Representations From Primitive Geometries for Skeleton-Based Action Recognition and Detection. IEEE Trans Image Process 27:4382–4394
Wang P, Li W, Li C, Hou Y (2018) Action recognition based on joint trajectory maps with convolutional neural networks. Knowl-Based Syst 158:43–53
Sun B, Kong D, Wang S et al (2019) Effective human action recognition using global and local offsets of skeleton joints. Multimed Tools Appl 78:6329–6353
Caetano C, Sena J, Bremond F et al (2019) SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition. Proceedings of the 16th IEEE International Conference on Advanced Video and Signal-based Surveillance (AVSS). IEEE, Taipei, pp 1–8. https://doi.org/10.1109/AVSS.2019.8909840
Fan Y, Weng S, Zhang Y et al (2020) Context-Aware Cross-Attention for Skeleton-Based Human Action Recognition. IEEE Access 8:15280–15290
Song Y-F, Zhang Z, Shan C, Wang L (2023) Constructing Stronger and Faster Baselines for Skeleton-Based Action Recognition. IEEE Trans Pattern Anal Mach Intell 45:1474–1488
Feng L, Zhao Y, Zhao W, Tang J (2022) A comparative review of graph convolutional networks for human skeleton-based action recognition. Artif Intell Rev 55:4275–4305
Duan H, Zhao Y, Chen K et al (2022) Revisiting Skeleton-Based Action Recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp 2969–2978. https://doi.org/10.1109/CVPR52688.2022.00298
Du Y, Fu Y, Wang L (2016) Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition. IEEE Trans Image Process 25:3010–3022
Vemulapalli R, Arrate F, Chellappa R (2016) R3DG features: Relative 3D geometry-based skeletal representations for human action recognition. Comput Vis Image Underst 152:155–166
Shao Z, Li Y, Guo Y et al (2018) A Hierarchical Model for Action Recognition Based on Body Parts. Proceedings of the IEEE International Conference on Robotics and Automation. IEEE, pp 1978–1985. https://doi.org/10.1109/ICRA.2018.8460516
El-Ghaish HA, Shoukry A, Hussein ME (2018) CovP3DJ: Skeleton-parts-based-covariance Descriptor for Human Action Recognition. Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. pp 343–350. https://doi.org/10.5220/0006625703430350
Guo Y, Li Y, Shao Z (2018) DSRF: A flexible trajectory descriptor for articulated human action recognition. Pattern Recogn 76:137–148
Si C, Jing Y, Wang W et al (2020) Skeleton-based action recognition with hierarchical spatial reasoning and temporal stack learning network. Pattern Recogn 107:107511
Qin Y, Mo L, Li C, Luo J (2020) Skeleton-based action recognition by part-aware graph convolutional networks. Vis Comput 36:621–631
Shi L, Zhang Y, Cheng J, Lu H (2022) Action recognition via pose-based graph convolutional networks with intermediate dense supervision. Pattern Recogn 121:108170
Chen T, Zhou D, Wang J et al (2023) Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition. 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG). pp 1–8. https://doi.org/10.1109/FG57933.2023.10042671
Ma X, Wang H, Xue B et al (2014) Depth-Based Human Fall Detection via Shape Features and Improved Extreme Learning Machine. IEEE J Biomed Health Inform 18:1915–1922
Aslan M, Sengur A, Xiao Y et al (2015) Shape feature encoding via Fisher Vector for efficient fall detection in depth-videos. Appl Soft Comput 37:1023–1028
Liu M, Liu H (2016) Depth Context: a new descriptor for human activity recognition by using sole depth sequences. Neurocomputing 175:747–758
Zhang B, Yang Y, Chen C et al (2017) Action Recognition Using 3D Histograms of Texture and A Multi-Class Boosting Classifier. IEEE Trans Image Process 26:4648–4660
Trelinski J, Kwolek B (2019) Ensemble of Classifiers Using CNN and Hand-Crafted Features for Depth-Based Action Recognition. Artificial Intelligence and Soft Computing. Springer International Publishing, Cham, pp 91–103
Dhiman C, Vishwakarma DK (2019) A Robust Framework for Abnormal Human Action Recognition Using R-Transform and Zernike Moments in Depth Videos. IEEE Sens J 19:5195–5203
Li X, Hou Z, Liang J, Chen C (2020) Human action recognition based on 3D body mask and depth spatial-temporal maps. Multimed Tools Appl 79:35761–35778
Chen C, Jafari R, Kehtarnavaz N (2015) Action Recognition from Depth Sequences Using Depth Motion Maps-Based Local Binary Patterns. Proceedings of the IEEE Winter Conference on Applications of Computer Vision. IEEE, pp 1092–1099. https://doi.org/10.1109/WACV.2015.150
Wang P, Li W, Gao Z et al (2016) Action Recognition From Depth Maps Using Deep Convolutional Neural Networks. IEEE Trans Hum-Mach Syst 46:498–509
Chen C, Zhang B, Hou Z et al (2017) Action recognition from depth sequences using weighted fusion of 2D and 3D auto-correlation of gradients features. Multimed Tools Appl 76:4651–4669
Cai L, Liu X, Chen F, Xiang M (2018) Robust human action recognition based on depth motion maps and improved convolutional neural network. JEI 27:051218
Weiyao X, Muqing W, Min Z et al (2019) Human Action Recognition Using Multilevel Depth Motion Maps. IEEE Access 7:41811–41822
Bulbul MF, Ali H (2021) Gradient local auto-correlation features for depth human action recognition. SN Appl Sci 3:535
Ghorbel E, Boutteau R, Boonaert J et al (2015) 3D real-time human action recognition using a spline interpolation approach. Proceedings of the International Conference on Image Processing Theory, Tools and Applications. IEEE, pp 61–66. https://doi.org/10.1109/IPTA.2015.7367097
Goyal K, Singhai J (2018) Review of background subtraction methods using Gaussian mixture model for video surveillance systems. Artif Intell Rev 50:241–259
He L, Ren X, Gao Q et al (2017) The connected-component labeling problem: A review of state-of-the-art algorithms. Pattern Recogn 70:25–43
Asadzadeh S, Daneshvar S, Abedi B et al (2019) Technical report: An advanced algorithm for the description of mice oocyte cytoplasm and polar body. Biomed Signal Process Control 48:171–178
Tang S, Goto S (2010) Histogram of template for human detection. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, pp 2186–2189. https://doi.org/10.1109/ICASSP.2010.5495685
Gattal A, Chibani Y (2015) SVM-Based Segmentation-Verification of Handwritten Connected Digits Using the Oriented Sliding Window. Int J Comput Intell Appl 14:1550005
Sotiropoulos DN, Pournarakis DE, Giaglis GM (2017) SVM-based sentiment classification: a comparative study against state-of-the-art classifiers. Int J Comput Intell Stud 6:52
Shrivastava A, Tripathy AK, Dalal PK (2019) A SVM-based classification approach for obsessive compulsive disorder by oxidative stress biomarkers. J Comput Sci 36:101023
Sidaoui B, Sadouni K (2017) Binary tree multi-class SVM based on OVA approach and variable neighbourhood search algorithm. Int J Comput Appl Technol 55:183–190
Sharan RV, Moir TJ (2015) Noise robust audio surveillance using reduced spectrogram image feature and one-against-all SVM. Neurocomputing 158:90–99
Mansour A, Chenchah F, Lachiri Z (2019) Emotional speaker recognition in real life conditions using multiple descriptors and i-vector speaker modeling technique. Multimed Tools Appl 78:6441–6458
Benabdeslem K, Bennani Y (2006) Dendogram based SVM for multi-class classification. Proceedings of the International Conference on Information Technology Interfaces. IEEE, pp 173–178. https://doi.org/10.1109/ITI.2006.1708473
Tomašev N, Radovanović M, Mladenić D, Ivanović M (2014) Hubness-based fuzzy measures for high-dimensional k-nearest neighbor classification. Int J Mach Learn Cybern 5:445–458
Tan YT, Rosdi BA (2015) FPGA-based hardware accelerator for the prediction of protein secondary class via fuzzy K-nearest neighbors with Lempel-Ziv complexity based distance measure. Neurocomputing 148:409–419
Ben Fredj I, Ouni K (2017) Comparison of crisp and fuzzy kNN in phoneme recognition. 2017 International Conference on Advanced Systems and Electric Technologies. pp 118–122. https://doi.org/10.1109/ASET.2017.7983676
Xu Y, Zhu Q, Fan Z et al (2013) Coarse to fine K nearest neighbor classifier. Pattern Recogn Lett 34:980–986
Gou J, Qiu W, Yi Z et al (2019) A Local Mean Representation-based K -Nearest Neighbor Classifier. ACM Trans Intell Syst Technol 10(3):1–25
Kumar P, Thakur RS (2021) Liver disorder detection using variable-neighbor weighted fuzzy K nearest neighbor approach. Multimed Tools Appl 80:16515–16535
SDUFall Dataset. http://www.sucro.org/homepage/wanghaibo/SDUFall.html. Accessed 28 Jan 2019
Fall Detection Dataset. https://falldataset.com/. Accessed 15 Jun 2023
Ahmed H, Nandi AK (2019) Classification Algorithm Validation. Condition Monitoring with Vibration Signals: Compressive Sampling and Learning Algorithms for Rotating Machines. IEEE, pp 307–319. https://doi.org/10.1002/9781119544678.ch15
Tyagi V (2017) Similarity Measures and Performance Evaluation. Content-Based Image Retrieval: Ideas, Influences, and Current Trends. Springer, Singapore, pp 63–83
Fan K, Wang P, Zhuang S (2019) Human fall detection using slow feature analysis. Multimed Tools Appl 78:9101–9128
Merrouche F, Baha N (2020) Fall detection based on shape deformation. Multimed Tools Appl 79:30489–30508
Adhikari K, Bouchachia H, Nait-Charif H (2017) Activity recognition for indoor fall detection using convolutional neural network. 2017 Fifteenth IAPR International Conference on Machine Vision Applications (MVA). pp 81–84. https://doi.org/10.23919/MVA.2017.7986795
Liu G, Tian G, Li J et al (2018) Human Action Recognition Using a Distributed RGB-Depth Camera Network. IEEE Sens J 18:7570–7576
Liu J, Wang Z, Liu H (2020) HDS-SP: A novel descriptor for skeleton-based human action recognition. Neurocomputing 385:22–32
Chen Y, Wang L, Li C et al (2020) ConvNets-based action recognition from skeleton motion maps. Multimed Tools Appl 79:1707–1725
Author information
Contributions
Conceptualization: Moussa Diaf; Methodology: Merzouk Younsi, Samir Yesli; Writing—original draft preparation: Merzouk Younsi, Samir Yesli; Writing—review and editing: Moussa Diaf, Samir Yesli; Supervision: Moussa Diaf; Software: Merzouk Younsi; Visualization: Samir Yesli; Validation: Moussa Diaf.
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Financial interests
The authors declare they have no financial interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Younsi, M., Yesli, S. & Diaf, M. Depth-based human action recognition using histogram of templates. Multimed Tools Appl 83, 40415–40449 (2024). https://doi.org/10.1007/s11042-023-16989-0