Trajectory Based Integrated Features for Action Classification from Depth Data

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 841)


We present an approach to human action recognition based on the fusion of features from depth maps and body-joint data. The integrated feature set combines depth features based on gradient orientation and motion energy with features capturing the statistical details of 3D skeleton data. Feature selection is then applied to retain the features most relevant to action recognition, and the resulting set is evaluated with an SVM classifier. We validate the proposed method on benchmark action recognition datasets, including MSR Daily Activity and UT-Kinect.
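The pipeline described above (feature weighting in the style of ReliefF, selection of the top-weighted features, then SVM classification) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the `relief_weights` helper, the synthetic feature matrix, and the choice of kernel are all assumptions made for the example.

```python
import numpy as np
from sklearn.svm import SVC

def relief_weights(X, y, n_iter=100, seed=None):
    """Basic Relief-style feature weighting (illustrative, binary labels).

    For each sampled instance, the nearest same-class neighbour (hit) and
    nearest other-class neighbour (miss) adjust the per-feature weights:
    W += |x - miss| - |x - hit|.  Informative features end up with
    higher weights than noise features.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for i in rng.integers(0, n, size=n_iter):
        x = X[i]
        dist = np.linalg.norm(X - x, axis=1)
        dist[i] = np.inf                      # exclude the instance itself
        same, diff = (y == y[i]), (y != y[i])
        hit = X[np.where(same)[0][np.argmin(dist[same])]]
        miss = X[np.where(diff)[0][np.argmin(dist[diff])]]
        w += np.abs(x - miss) - np.abs(x - hit)
    return w / n_iter

# Synthetic stand-in for an integrated depth + skeleton feature matrix:
# only the first 3 of 10 dimensions carry class information.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 10))
X[:, :3] += y[:, None] * 2.0

w = relief_weights(X, y, n_iter=200, seed=1)
top = np.argsort(w)[::-1][:3]                 # keep the 3 highest-weight features
clf = SVC(kernel="rbf").fit(X[:, top], y)
print("selected features:", sorted(top.tolist()))
```

In the paper's setting, `X` would hold the concatenated depth and skeleton descriptors per action clip, and the selected columns would feed the SVM; the weighting loop here is the classic Relief update, whereas ReliefF additionally averages over k nearest hits/misses and handles multi-class labels.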


Keywords: Skeleton Data · Depth Map · Activity Recognition · Human Action Recognition · ReliefF



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Indian Institute of Technology Delhi, India
  2. Works Application, Singapore
  3. Bennett University, Greater Noida, India
