Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction

Li, Maosen; Chen, Siheng; Zhang, Zijing; Xie, Lingxi; Tian, Qi; Zhang, Ya

doi:10.1007/978-3-031-20068-7_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13666)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Graph convolutional network based methods that model the relations between body joints have recently shown great promise in 3D skeleton-based human motion prediction. However, these methods have two critical issues: first, deep graph convolutions filter features within only a limited range of the graph spectrum, losing a considerable amount of information in the full band; second, using a single graph to model the whole body underestimates the diverse patterns on different body parts. To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands. To address the second issue, body parts are modeled separately to learn diverse dynamics, which enables finer feature extraction along the spatial dimensions. Integrating these two designs, we propose a novel skeleton-parted graph scattering network (SPGSN). The core of the model is a cascade of multi-part graph scattering blocks (MPGSBs), which build adaptive graph scattering on diverse body parts and fuse the decomposed features based on the inferred spectrum importance and body-part interactions. Extensive experiments show that SPGSN outperforms state-of-the-art methods by margins of 13.8%, 9.3% and 2.7% in terms of 3D mean per joint position error (MPJPE) on the Human3.6M, CMU Mocap and 3DPW datasets, respectively. The code is available at https://github.com/MediaBrain-SJTU/SPGSN.
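
To make two ideas from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes fixed diffusion-wavelet filters built from a lazy random walk, whereas the band-pass filters in SPGSN are trainable and adaptive, and it applies them to a single hypothetical three-joint body part. The function names (mpjpe, lazy_walk, filter_bank), the toy skeleton, and the pose values are all illustrative assumptions.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error: average Euclidean distance over joints."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lazy_walk(adj):
    """Lazy random-walk matrix P = (I + A D^-1) / 2; assumes no isolated joints."""
    n = len(adj)
    deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]  # column sums
    return [[0.5 * ((i == j) + adj[i][j] / deg[j]) for j in range(n)] for i in range(n)]


def filter_bank(adj, levels):
    """Band-pass filters I-P, P-P^2, P^2-P^4, ... plus the low-pass P^(2^levels)."""
    n = len(adj)
    prev = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    cur = lazy_walk(adj)
    bank = []
    for _ in range(levels + 1):
        bank.append([[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(prev, cur)])
        prev, cur = cur, matmul(cur, cur)
    bank.append(prev)  # remaining low-pass component
    return bank


# Hypothetical 3-joint body part (shoulder, elbow, wrist) connected as a chain.
arm_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
arm_pose = [[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.6, 0.1, 0.0]]  # one xyz row per joint

# Decompose the pose features into one output per spectral band.
bands = [matmul(f, arm_pose) for f in filter_bank(arm_adj, levels=2)]

# The filters telescope to the identity, so the bands sum back to the input pose.
recon = [[sum(b[i][c] for b in bands) for c in range(3)] for i in range(3)]
print(len(bands), mpjpe(recon, arm_pose))
```

In this sketch the decomposition is lossless by construction: the filters sum to the identity, so no part of the graph spectrum is discarded. A part-wise variant would run the same filter bank on a separate adjacency matrix for each body part before fusing the band outputs.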

Notes

1. http://mocap.cs.cmu.edu/

Acknowledgements

This work is supported by the National Key Research and Development Program of China (2020YFB1406801), the National Natural Science Foundation of China under Grant 62171276, the 111 Plan (BP0719010), STCSM (18DZ2270700, 21511100900), and the State Key Laboratory of UHD Video and Audio Production and Presentation.

Author information

Authors and Affiliations

Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China
Maosen Li, Siheng Chen & Ya Zhang
Shanghai AI Laboratory, Shanghai, China
Siheng Chen & Ya Zhang
Zhejiang University, Hangzhou, China
Zijing Zhang
Huawei Cloud & AI, Shenzhen, China
Lingxi Xie & Qi Tian

Corresponding authors

Correspondence to Siheng Chen or Ya Zhang.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

Electronic supplementary material

Supplementary material 1 (PDF, 3985 KB)

About this paper

Cite this paper

Li, M., Chen, S., Zhang, Z., Xie, L., Tian, Q., Zhang, Y. (2022). Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13666. Springer, Cham. https://doi.org/10.1007/978-3-031-20068-7_2

DOI: https://doi.org/10.1007/978-3-031-20068-7_2
Published: 11 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20067-0
Online ISBN: 978-3-031-20068-7
eBook Packages: Computer Science, Computer Science (R0)
