Depth-based human action recognition using histogram of templates

Published in: Multimedia Tools and Applications

Abstract

In this paper, we propose an efficient, fast, and easy-to-implement method for recognizing human actions in depth image sequences. In this method, human body silhouettes are first extracted from the depth image sequences using the Gaussian mixture background subtraction model. After removing noise from the foreground image with a cascade of morphological operations and area filtering, the contour of the human silhouette is extracted by applying the Moore-neighbor contour tracing algorithm. From this contour, features describing the human posture are computed using the Histogram of Templates (HoT) descriptor. These features are then used to train a dendrogram-based support vector machine that generates the frame-by-frame posture variation signal of the action sequence. The histogram of this signal is computed and finally fed as an input vector into a Fuzzy k-Nearest Neighbor (FkNN) classifier for recognizing human actions. The proposed method is evaluated on two publicly available datasets containing various daily actions (bending, sitting, lying, etc.) performed by different human subjects. Extensive experiments are conducted with several values of the number of neighbors k in the FkNN and with different similarity measures, namely the Euclidean distance, the Bhattacharyya distance, the Kullback–Leibler divergence, and a histogram intersection-based distance. The results show that the proposed method performs better than, or comparably to, other state-of-the-art approaches. Moreover, the method processes 18 frames per second, making it well suited for applications requiring real-time human action recognition.
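To make the classification stage concrete, the sketch below implements the four histogram similarity measures named above together with a standard Keller-style fuzzy k-NN vote. This is a minimal illustration under assumptions, not the authors' implementation: the function names, the fuzzifier m = 2, and the crisp-label membership initialization are ours, and the HoT feature extraction and dendrogram-based SVM stages are not reproduced here.

```python
import numpy as np

EPS = 1e-12  # guards against log(0) and division by zero


def euclidean(p, q):
    """Euclidean distance between two histograms."""
    return np.sqrt(np.sum((p - q) ** 2))


def bhattacharyya(p, q):
    """Bhattacharyya distance; 0 when normalized histograms coincide."""
    return -np.log(np.sum(np.sqrt(p * q)) + EPS)


def kullback_leibler(p, q):
    """KL divergence D(p || q); asymmetric, so argument order matters."""
    p, q = p + EPS, q + EPS
    return np.sum(p * np.log(p / q))


def hist_intersection(p, q):
    """Histogram-intersection-based distance: 1 - overlap, in [0, 1]."""
    return 1.0 - np.sum(np.minimum(p, q))


def fuzzy_knn(x, train_X, train_y, n_classes, k=5, m=2.0, dist=euclidean):
    """Fuzzy k-NN vote (Keller et al., 1985) with crisp training labels.

    Returns the class with the largest fuzzy membership for sample x.
    """
    d = np.array([dist(x, t) for t in train_X])
    nn = np.argsort(d)[:k]                         # k nearest neighbors
    w = 1.0 / (d[nn] ** (2.0 / (m - 1.0)) + EPS)   # inverse-distance weights
    memberships = np.zeros(n_classes)
    for idx, weight in zip(nn, w):
        memberships[train_y[idx]] += weight        # crisp label: full membership
    return int(np.argmax(memberships / w.sum()))
```

With L1-normalized histograms as inputs, passing `dist=bhattacharyya`, `dist=kullback_leibler`, or `dist=hist_intersection` instead of the default reproduces the kind of distance-measure comparison described in the experiments.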



Data Availability

The datasets used in this research are publicly available: the SDUFall Dataset (http://www.sucro.org/homepage/wanghaibo/SDUFall.html) and the Fall Detection Dataset (https://falldataset.com/).


Author information

Contributions

Conceptualization: Moussa Diaf; Methodology: Merzouk Younsi, Samir Yesli; Writing—original draft preparation: Merzouk Younsi, Samir Yesli; Writing—review and editing: Moussa Diaf, Samir Yesli; Supervision: Moussa Diaf; Software: Merzouk Younsi; Visualization: Samir Yesli; Validation: Moussa Diaf.

Corresponding author

Correspondence to Merzouk Younsi.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Financial interests

The authors declare they have no financial interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Younsi, M., Yesli, S. & Diaf, M. Depth-based human action recognition using histogram of templates. Multimed Tools Appl 83, 40415–40449 (2024). https://doi.org/10.1007/s11042-023-16989-0

