Multimedia Tools and Applications, Volume 77, Issue 7, pp 8531–8549

Local derivative pattern for action recognition in depth images

  • Xuan Son Nguyen
  • Thanh Phuong Nguyen
  • François Charpillet
  • Ngoc-Son Vu


This paper proposes a new local descriptor for action recognition in depth images using second-order directional Local Derivative Patterns (LDPs). LDP relies on variations in local derivative direction to capture the local patterns contained in an image region. The proposed descriptor combines directional LDPs computed from three depth maps, obtained by representing depth sequences in three orthogonal views, and jointly encodes shape and motion cues. Moreover, we suggest using Sparse Coding-based Fisher Vector (SCFVC) to encode local descriptors into a global representation of depth sequences. SCFVC has proven effective for object recognition but has received little attention for action recognition. We perform action recognition using an Extreme Learning Machine (ELM). Experimental results on three public benchmark datasets show the effectiveness of the proposed approach.
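As a rough illustration of what a second-order directional LDP computes, the following minimal sketch follows the general formulation of Zhang et al. (2010) for the 0° direction only. It is not the descriptor pipeline of this paper: the function names, the neighbour ordering, and the border handling are assumptions made for the example, and the input is a plain 2D list standing in for one depth map.

```python
# Minimal sketch of a second-order Local Derivative Pattern (0 degree direction).
# Assumptions (not from the paper): function names, clockwise neighbour order
# starting at the top-left pixel, and dropping border pixels.

def first_order_derivative_0deg(img):
    """First-order derivative along 0 degrees: I'(Z0) = I(Z0) - I(right neighbour).

    The last column has no right neighbour, so the result is one column narrower.
    """
    return [[row[x] - row[x + 1] for x in range(len(row) - 1)] for row in img]


# Offsets (dy, dx) of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def ldp2_0deg(img):
    """Return one 8-bit LDP code per interior pixel of the derivative image.

    A bit is 0 when the derivative at the centre and at the neighbour have the
    same sign (product > 0), and 1 otherwise (product <= 0).
    """
    d = first_order_derivative_0deg(img)
    height, width = len(d), len(d[0])
    codes = []
    for y in range(1, height - 1):
        row_codes = []
        for x in range(1, width - 1):
            centre = d[y][x]
            code = 0
            for dy, dx in NEIGHBOURS:
                bit = 1 if centre * d[y + dy][x + dx] <= 0 else 0
                code = (code << 1) | bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


if __name__ == "__main__":
    # A horizontal ramp has the same derivative everywhere, so every
    # neighbour agrees in sign with the centre and all codes are 0.
    ramp = [[x for x in range(5)] for _ in range(4)]
    print(ldp2_0deg(ramp))
```

In a full descriptor, codes like these would typically be computed for several directions and accumulated into histograms over image regions; those steps are left out here because this section does not specify them.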


Keywords: Action recognition; Local derivative pattern; Sparse coding; Fisher vector; Extreme learning machine



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Université de Caen Basse-Normandie, CNRS, GREYC, UMR 6072, Caen, France
  2. Aix Marseille Université, CNRS, ENSAM, LSIS, UMR 7296, Marseille, France
  3. Université de Toulon, CNRS, LSIS, UMR 7296, La Garde, France
  4. INRIA, Villers-Lès-Nancy, France
  5. CNRS, LORIA, UMR 7503, Villers-Lès-Nancy, France
  6. ETIS UMR 8051, Université Paris Seine, UCP, ENSEA, CNRS, Cergy, France
  7. Université de Lorraine, LORIA, UMR 7503, Villers-Lès-Nancy, France
