Abstract
We present a novel approach to dynamic human body reconstruction and motion tracking using low-cost depth cameras. Our reconstruction system produces a sequence of dynamic 3D human body models from noisy input depth data. To align the template model accurately with the noisy input, we combine skeleton-driven deformation with mesh deformation, which makes registration more robust to missing depth, occlusions, and severe noise. In addition, we introduce a novel data-driven 3D human body model that efficiently reconstructs bodies with wide shape and pose variations using only a limited number of training databases in a standard standing pose. We evaluate our method quantitatively and qualitatively, comparing it with other body reconstruction methods on both synthetic and real datasets. Experimental results demonstrate the effectiveness of the proposed approach.
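To make the term "skeleton-driven deformation" concrete, the following minimal sketch shows linear blend skinning, one common way to drive a template mesh from a skeleton. The abstract does not state which skinning scheme the paper uses, so this is an illustrative assumption rather than the authors' formulation. The names `rotation_z` and `blend_skinning` and the toy two-bone data are hypothetical, and the subsequent mesh deformation stage is not shown.

```python
import math

def rotation_z(angle, pivot):
    """Return a function that rotates a 3D point by `angle` radians
    about the z-axis passing through `pivot` (hypothetical helper)."""
    c, s = math.cos(angle), math.sin(angle)
    px, py, pz = pivot

    def apply(v):
        x, y, z = v[0] - px, v[1] - py, v[2] - pz
        return (c * x - s * y + px, s * x + c * y + py, z + pz)

    return apply

def blend_skinning(vertices, weights, bone_transforms):
    """Linear blend skinning: each deformed vertex is the weighted sum
    of its rest position transformed by every bone (weights sum to 1)."""
    deformed = []
    for v, w in zip(vertices, weights):
        out = [0.0, 0.0, 0.0]
        for w_j, transform in zip(w, bone_transforms):
            tv = transform(v)
            for k in range(3):
                out[k] += w_j * tv[k]
        deformed.append(tuple(out))
    return deformed

# Toy two-bone "arm" lying along the x-axis, with the joint at x = 1.
vertices = [(0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
# Per-vertex weights for (bone 0, bone 1); the joint vertex is shared.
weights = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]

identity = lambda v: tuple(v)                           # bone 0 stays at rest
bend = rotation_z(math.pi / 2, pivot=(1.0, 0.0, 0.0))   # bone 1 bends 90 degrees

posed = blend_skinning(vertices, weights, [identity, bend])
```

In this toy setup the vertex bound only to the first bone stays in place, the vertex at the joint is unchanged by either transform, and the vertex bound only to the second bone swings around the joint. In a registration pipeline of the kind the abstract describes, the bone transforms would be estimated from the depth data rather than set by hand.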
Funding
This work was partially funded by the Natural Science Foundation of China (No. 61602444) and the Fundamental Research Funds for the Central Universities (No. 2018FZA5011).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 1 (mp4 51948 KB)
About this article
Cite this article
Wang, K., Zhang, G., Yang, J. et al. Dynamic human body reconstruction and motion tracking with low-cost depth cameras. Vis Comput 37, 603–618 (2021). https://doi.org/10.1007/s00371-020-01826-4
DOI: https://doi.org/10.1007/s00371-020-01826-4