A fast non-convex optimization technique for human action recovery from misrepresented 3D motion capture data using trajectory movement and pair-wise hierarchical constraints

Abstract

Motion capture (mocap) data is often corrupted by erroneous entries. Such corruption degrades the structure of the recorded human actions and hinders the use of the data in downstream applications. In this paper, we propose a new four-way optimization model that recovers plausible human actions from misrepresented mocap sequences without requiring any pre-trained model. The proposed model employs joint \(\ell_{1/2}\) regularized low-rank and sparse priors to separate the clean mocap data from the erroneous entries. The key contribution of this work is an efficient formulation of the optimization model that integrates two additional constraints: one based on the movement of individual node trajectories, and one based on the hierarchy of parent-child node pairs. The former ensures the smoothness of the action sequences, while the latter limits the spatial drift of the skeletal nodes. Experimental results show that the developed algorithm recovers human actions accurately and in less time than recent counterparts, even from mocap data with 20–30% of misrepresented joints in 90% of the frames.
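
To make the low-rank plus sparse idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the mocap sequence is stacked into a matrix M (frames by joint coordinates) and splits it as M ≈ L + S by alternately applying the half-thresholding operator that is commonly associated with \(\ell_{1/2}\) regularization: once to the singular values (low-rank part L) and once elementwise (sparse part S). The function names, the objective, the parameters `lam_L` and `lam_S`, and the alternating scheme are all illustrative assumptions; the trajectory-movement and parent-child hierarchy constraints described in the abstract are deliberately left out.

```python
import numpy as np


def half_threshold(x, lam):
    """Elementwise half-thresholding.

    Returns the minimizer over y of (y - x)**2 + lam * |y|**0.5
    (closed form of the l_{1/2} proximal step; assumed here, not
    taken from the abstract).
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    # Entries at or below this magnitude are set exactly to zero.
    thresh = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    keep = np.abs(x) > thresh
    xk = x[keep]
    phi = np.arccos((lam / 8.0) * (np.abs(xk) / 3.0) ** -1.5)
    out[keep] = (2.0 / 3.0) * xk * (1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))
    return out


def svd_half_threshold(M, lam):
    """Apply half-thresholding to the singular values of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * half_threshold(s, lam)) @ Vt


def objective(M, L, S, lam_L, lam_S):
    """Illustrative cost: data fit + l_{1/2} penalties on sigma(L) and S."""
    sv = np.linalg.svd(L, compute_uv=False)
    return (np.linalg.norm(M - L - S) ** 2
            + lam_L * np.sum(np.sqrt(sv))
            + lam_S * np.sum(np.sqrt(np.abs(S))))


def decompose(M, lam_L=1.0, lam_S=1.0, iters=50):
    """Toy alternating scheme: update L with S fixed, then S with L fixed."""
    M = np.asarray(M, dtype=float)
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(iters):
        L = svd_half_threshold(M - S, lam_L)  # low-rank update
        S = half_threshold(M - L, lam_S)      # sparse update
    return L, S
```

Calling `decompose(M)` on a frames-by-coordinates array returns the pair `(L, S)`. In this sketch, larger `lam_L` pushes L toward lower rank and larger `lam_S` pushes more entries of S to zero. Suitable values depend on the scale of the data and would have to be tuned.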

Notes

  1. http://mocap.cs.cmu.edu/.

  2. https://resources.mpi-inf.mpg.de/HDM05/.

Author information

Corresponding author

Correspondence to M. S. Subodh Raj.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Subodh Raj, M.S., George, S.N. A fast non-convex optimization technique for human action recovery from misrepresented 3D motion capture data using trajectory movement and pair-wise hierarchical constraints. J Ambient Intell Human Comput (2022). https://doi.org/10.1007/s12652-022-04349-z

  • DOI: https://doi.org/10.1007/s12652-022-04349-z

Keywords

  • Human action recovery
  • Low rank recovery
  • \(\ell_{1/2}\) regularization
  • Robust principal component analysis