Joint Optimization for Multi-person Shape Models from Markerless 3D-Scans

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12363)

Abstract

Training statistical 3D human shape models with minimal supervision is an important problem in computer vision. We propose a markerless end-to-end training framework for parametric 3D human shape models. Contrary to prior work, the training process (i) uses a differentiable shape model surface, (ii) runs end-to-end by jointly optimizing all parameters of a single, self-contained objective that can be solved with slightly modified off-the-shelf non-linear least squares solvers, and (iii) needs only a medium-sized dataset of approximately 1000 low-resolution human body scans to achieve competitive performance on the challenging FAUST surface correspondence benchmark. Training requires only a compact model definition and an off-the-shelf 2D RGB pose estimator; no pre-trained shape models are needed. The training and evaluation code will be made available for research purposes to facilitate end-to-end shape model training on novel datasets with minimal setup cost.
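To make the idea of jointly optimizing all parameters of a single non-linear least squares objective concrete, here is a minimal toy sketch. It is not the authors' objective, model, or solver: the 1D "scans", the shared `template`, the per-scan `scales`, the anchor residual, and the function names are all assumptions chosen for illustration. Each scan is modeled as a scaled copy of one shared template, and a small hand-written Levenberg-Marquardt loop updates the template and all per-scan scales together instead of alternating between them.

```python
"""Toy joint non-linear least squares: one objective, all parameters at once."""


def residuals(params, scans):
    """Stack every residual of the single joint objective into one list."""
    n = len(scans[0])
    template, scales = params[:n], params[n:]
    res = [scales[i] * template[j] - scan[j]
           for i, scan in enumerate(scans) for j in range(n)]
    res.append(scales[0] - 1.0)  # anchor: removes the template/scale ambiguity
    return res


def jacobian(params, scans):
    """Analytic Jacobian of residuals() with respect to all parameters."""
    n, m = len(scans[0]), len(scans)
    template, scales = params[:n], params[n:]
    rows = []
    for i in range(m):
        for j in range(n):
            row = [0.0] * (n + m)
            row[j] = scales[i]        # derivative w.r.t. template[j]
            row[n + i] = template[j]  # derivative w.r.t. scales[i]
            rows.append(row)
    anchor = [0.0] * (n + m)
    anchor[n] = 1.0                   # derivative of the anchor w.r.t. scales[0]
    rows.append(anchor)
    return rows


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    size = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, size))
        x[r] = (aug[r][size] - tail) / aug[r][r]
    return x


def fit_jointly(scans, iterations=100):
    """Levenberg-Marquardt over the template and all per-scan scales jointly."""
    m, n = len(scans), len(scans[0])
    size = n + m
    # Initial guess: average scan as the template, unit scale for every scan.
    params = [sum(scan[j] for scan in scans) / m for j in range(n)] + [1.0] * m
    cost = sum(r * r for r in residuals(params, scans))
    damping = 1e-3
    for _ in range(iterations):
        res, jac = residuals(params, scans), jacobian(params, scans)
        normal = [[sum(row[a] * row[b] for row in jac) for b in range(size)]
                  for a in range(size)]
        gradient = [sum(row[a] * r for row, r in zip(jac, res))
                    for a in range(size)]
        for a in range(size):
            normal[a][a] += damping
        step = solve_linear(normal, [-g for g in gradient])
        trial = [p + d for p, d in zip(params, step)]
        trial_cost = sum(r * r for r in residuals(trial, scans))
        if trial_cost < cost:   # accept the step and trust the model more
            params, cost = trial, trial_cost
            damping = max(damping * 0.1, 1e-12)
        else:                   # reject the step and damp more strongly
            damping *= 10.0
    return params[:n], params[n:], cost


# Three noiseless toy "scans": scaled copies of the template [1, 2, 3].
scans = [[1.0, 2.0, 3.0], [0.5, 1.0, 1.5], [2.0, 4.0, 6.0]]
template, scales, cost = fit_jointly(scans)
print(template, scales, cost)
```

The point of the sketch is the structure rather than the numbers: model parameters and per-scan parameters live in one parameter vector and one residual vector, which is the form that general-purpose non-linear least squares solvers consume. A real shape model would replace the scalar scale with pose and shape parameters and the pointwise difference with a surface distance.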

Notes

  1. https://github.com/Intelligent-Systems-Research-Group/JOMS/

Acknowledgment

We thank A. Bender for the data setup figure. We thank J. Wetzel and N. Link for technical discussion. This work was supported by the German Federal Ministry of Education and Research (BMBF) under Grant 13FH025IX6.

Author information

Corresponding authors

Correspondence to Samuel Zeitvogel, Johannes Dornheim or Astrid Laubenheimer.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 2 (mp4 1352 KB)

Supplementary material 3 (mp4 23276 KB)

Supplementary material 4 (mp4 277 KB)

Supplementary material 5 (mp4 567 KB)

Supplementary material 6 (mp4 475 KB)

Supplementary material 7 (mp4 988 KB)

Supplementary material 8 (mp4 4841 KB)

Supplementary material 1 (pdf 19185 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zeitvogel, S., Dornheim, J., Laubenheimer, A. (2020). Joint Optimization for Multi-person Shape Models from Markerless 3D-Scans. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12363. Springer, Cham. https://doi.org/10.1007/978-3-030-58523-5_3

  • DOI: https://doi.org/10.1007/978-3-030-58523-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58522-8

  • Online ISBN: 978-3-030-58523-5

  • eBook Packages: Computer Science, Computer Science (R0)
