Abstract
Skeleton-based action recognition has attracted considerable research attention. Most existing methods introduce motion information through multi-stream architectures, which increases the number of parameters and FLOPs. To address this problem, we propose 2s-MAGCN (Two-Stream Multi-Attention Graph Convolutional Network), which introduces motion information through a Motion Excitation attention module. Merging the multiple streams into two streams reduces parameters and FLOPs, and the module also improves the performance of the model. We further propose new pooling strategies for the attention modules, which yield better-performing attention modules. The model also includes Spatial Excitation and Temporal Excitation modules, which enhance its spatio-temporal expressive ability. On the cross-subject and cross-view benchmarks of the NTU RGB+D dataset, the proposed model achieves 88.60% and 97.16% accuracy, respectively, and it achieves 35.62% accuracy on the Kinetics dataset. On both datasets, our method outperforms state-of-the-art methods.
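As a rough illustration of how an excitation-style attention can inject motion information into a single stream, the NumPy sketch below gates skeleton features with a signal derived from frame-to-frame differences. It is not the authors' implementation: the (channels, frames, joints) layout, the mean pooling over joints, the absence of learned layers, and the residual gating form are all assumptions made for illustration, and the function name `motion_excitation` is hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def motion_excitation(x):
    """Gate skeleton features with a motion-derived attention map.

    x: array of shape (C, T, V) = (channels, frames, joints).
       This layout is an assumption, not taken from the paper.
    Returns an array of the same shape.
    """
    c, t, v = x.shape

    # Motion cue: difference between consecutive frames, shape (C, T-1, V).
    diff = x[:, 1:, :] - x[:, :-1, :]

    # Pad a zero frame at the end so the motion map again has T frames.
    pad = np.zeros((c, 1, v), dtype=x.dtype)
    diff = np.concatenate([diff, pad], axis=1)

    # Pool over joints (assumed: plain mean) to get one value per
    # channel and frame, shape (C, T, 1).
    pooled = diff.mean(axis=2, keepdims=True)

    # Squash to (0, 1) and apply as a residual gate (assumed form).
    gate = sigmoid(pooled)
    return x + x * gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((4, 8, 25))  # 4 channels, 8 frames, 25 joints
    out = motion_excitation(features)
    print(out.shape)
```

In this sketch a sequence with no motion produces a constant gate of 0.5, so the features are scaled uniformly; frames where the pooled difference is large are emphasized more strongly. A trained module would normally place learned projections before and after the pooling step.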
Acknowledgements
This work was supported by the National Key Research and Development Program of China under Grant No. 2020AAA0108100.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhou, H., Tian, Z., Du, S. (2024). Two Stream Multi-Attention Graph Convolutional Network for Skeleton-Based Action Recognition. In: Lu, H., Cai, J. (eds) Artificial Intelligence and Robotics. ISAIR 2023. Communications in Computer and Information Science, vol 1998. Springer, Singapore. https://doi.org/10.1007/978-981-99-9109-9_11
DOI: https://doi.org/10.1007/978-981-99-9109-9_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9108-2
Online ISBN: 978-981-99-9109-9
eBook Packages: Computer Science, Computer Science (R0)