Deep Residual Temporal Convolutional Networks for Skeleton-Based Human Action Recognition

  • R. Khamsehashari
  • K. Gadzicki
  • C. Zetzsche
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11754)

Abstract

Deep residual networks for action recognition based on skeleton data can avoid the degradation problem, and a 56-layer Res-Net has recently achieved good results. Since a much “shallower” 11-layer model (Res-TCN) with a temporal convolution network and a simplified residual unit achieved almost competitive performance, we investigate deep variants of Res-TCN and compare them to Res-Net architectures. Our results outperform the other approaches in this class of residual networks. Our investigation suggests that the resistance of deep residual networks to degradation is not only determined by the architecture but also by data and task properties.

Keywords

Deep residual networks · Action recognition · Degradation · Hyperparameters

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Cognitive Neuroinformatics, University of Bremen, Bremen, Germany
