3D human action analysis and recognition through GLAC descriptor on 2D motion and static posture images

  • Mohammad Farhad Bulbul
  • Saiful Islam
  • Hazrat Ali


In this paper, we present an approach for recognizing human actions in depth action videos. First, we use the 3D Motion Trail Model (3DMTM) to convert each action video into motion history images (MHIs) and static history images (SHIs). We then characterize the action video by extracting Gradient Local Auto-Correlations (GLAC) features from the MHIs and the SHIs. The two feature sets, i.e., the GLAC features from the MHIs and the GLAC features from the SHIs, are concatenated to obtain a single representation vector for the action. Finally, we classify all action samples with the l2-regularized Collaborative Representation Classifier (l2-CRC). We evaluate the proposed method on three action datasets: MSR-Action3D, DHA, and UTD-MHAD. The experimental results show that the proposed method outperforms the approaches it is compared against.
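The last two steps of the pipeline, feature concatenation and l2-CRC classification, can be sketched as follows. This is only a minimal illustration of the generic closed-form l2-regularized collaborative representation scheme, not the authors' implementation: the function names `fuse_features` and `l2_crc_predict`, the default value of `lam`, and the plain identity regularizer are all assumptions made for the example, and the GLAC features are taken as already computed.

```python
import numpy as np


def fuse_features(glac_mhi, glac_shi):
    """Concatenate the GLAC features from MHIs and SHIs into one vector."""
    return np.concatenate([glac_mhi, glac_shi])


def l2_crc_predict(train_feats, train_labels, test_feat, lam=1e-3):
    """Classify one sample with l2-regularized collaborative representation.

    train_feats:  (n_samples, n_dims) array, one fused feature vector per row.
    train_labels: (n_samples,) array of class labels.
    test_feat:    (n_dims,) fused feature vector to classify.
    lam:          regularization weight (illustrative default, an assumption).
    """
    train_labels = np.asarray(train_labels)
    A = np.asarray(train_feats, dtype=float).T  # columns are training samples
    y = np.asarray(test_feat, dtype=float)

    # Closed-form ridge solution: alpha = (A^T A + lam * I)^(-1) A^T y
    gram = A.T @ A + lam * np.eye(A.shape[1])
    alpha = np.linalg.solve(gram, A.T @ y)

    # Pick the class whose training samples reconstruct y with the
    # smallest residual, using only that class's coefficients.
    classes = np.unique(train_labels)
    residuals = []
    for c in classes:
        mask = train_labels == c
        residuals.append(np.linalg.norm(y - A[:, mask] @ alpha[mask]))
    return classes[int(np.argmin(residuals))]
```

Because the coefficient vector is obtained from a single linear solve over all training samples, every class contributes to the representation, and the class label comes only from the per-class reconstruction residuals.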


Keywords: Human action recognition, l2-CRC, Motion history images, Static history images




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Mathematics, Jashore University of Science and Technology, Jashore, Bangladesh
  2. Department of Mathematics, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, Bangladesh
  3. Department of Electrical and Computer Engineering, COMSATS University Islamabad, Abbottabad, Pakistan
