
Human action recognition using Lie Group features and convolutional neural networks

  • Linqin Cai
  • Chengpeng Liu
  • Rongdi Yuan
  • Heen Ding
Original paper

Abstract

In recent years, skeleton-based human action recognition has attracted substantial attention. However, owing to the complexity and nonlinearity of human action data, precisely representing skeleton features remains a challenging task. Motivated by the effectiveness of Lie Group skeletal representations in extracting human action features and the powerful capability of deep neural networks in feature learning and high-dimensional data processing, we propose to combine Lie Group features and deep learning for human action recognition. Human skeleton information is first used to overcome the interference of external factors such as changes in lighting conditions and body shape. Lie Group representations are then applied to naturally model the complex and diverse action data. Finally, convolutional neural networks are used to learn and classify the Lie Group features. Experiments were performed on three public datasets, and the results show that our method achieves average recognition accuracies of 93.00% on Florence3D-Action, 93.68% on MSR Action Pairs, and 97.96% on UT Kinect-Action, outperforming many state-of-the-art methods.
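As a concrete illustration of the feature-extraction step, the sketch below computes one common Lie Group skeletal representation: the relative 3D geometry between every ordered pair of bones is encoded as a rigid transform in SE(3) and flattened to the Lie algebra se(3) via the matrix logarithm. This is a minimal sketch under that reading, not the authors' code; the helper names `rigid_transform` and `lie_features` and the pairwise-bone scheme are assumptions based on the abstract's description.

```python
import numpy as np
from scipy.linalg import logm

def rigid_transform(p0, p1, q0, q1):
    """SE(3) transform taking the bone p0->p1 onto the bone q0->q1."""
    u = (p1 - p0) / np.linalg.norm(p1 - p0)
    v = (q1 - q0) / np.linalg.norm(q1 - q0)
    # Rodrigues' formula for the rotation aligning u with v
    # (the degenerate anti-parallel case is ignored in this sketch).
    axis = np.cross(u, v)
    s, c = np.linalg.norm(axis), float(np.dot(u, v))
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    R = np.eye(3) + K + K @ K * ((1.0 - c) / (s ** 2 + 1e-12))
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = q0 - R @ p0
    return T

def lie_features(joints, bones):
    """joints: (J, 3) array for one frame; bones: (start, end) index pairs.
    Returns one 6-D se(3) vector per ordered pair of distinct bones."""
    feats = []
    for i, (a, b) in enumerate(bones):
        for j, (c, d) in enumerate(bones):
            if i == j:
                continue
            T = rigid_transform(joints[a], joints[b], joints[c], joints[d])
            log_T = logm(T).real                 # matrix log: SE(3) -> se(3)
            w = log_T[[2, 0, 1], [1, 2, 0]]      # rotation coords from the skew part
            feats.append(np.concatenate([w, log_T[:3, 3]]))
    return np.stack(feats)

# Stacking lie_features over all frames of a clip yields a 2-D feature map
# (time x features) that an ordinary CNN can learn to classify.
```

The matrix logarithm linearizes the curved SE(3) manifold into a flat vector space, which is what allows standard convolution and pooling layers to operate on the features directly.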

Keywords

Human action recognition · Lie Group features · Convolutional neural networks

Notes

Acknowledgements

This work was funded by the National Key R&D Program of China (2017YFE0123000) and the Key R&D Program of Common Key Technology Innovation for Key Industries in Chongqing (No. CSTC2015zdcy-ztzx60001).

Author Contributions

LC did most of the theoretical analysis, conceived the methodology, and supervised the implementation. CL performed the experimental tests and wrote the manuscript. RY revised and proofread the manuscript. HD helped with data collection and analysis.

Compliance with ethical standards

Conflict of interest

The authors declare no conflict of interest.

Copyright information

© Springer Nature B.V. 2020

Authors and Affiliations

  1. Key Laboratory of Industrial Internet of Things and Networked Control, Ministry of Education, Chongqing University of Posts and Telecommunications, Chongqing, China
