Topology-preserved human reconstruction with details

Lin, Lixiang; Zhu, Jianke

doi:10.1007/s00371-023-02957-0

Original article
Published: 05 July 2023

Volume 39, pages 3609–3619, (2023)

The Visual Computer

Abstract

Because body shapes are highly diverse and complex, it is challenging to estimate human geometry directly from a single image across various clothing styles. Most model-based approaches are limited to predicting the shape and pose of a minimally clothed body and produce over-smoothed surfaces. Model-free methods capture finely detailed geometry but lack a fixed mesh topology. To address these issues, we propose a novel topology-preserved human reconstruction approach that bridges the gap between model-based and model-free human reconstruction. We present an end-to-end neural network that simultaneously predicts a pixel-aligned implicit surface and an explicit mesh model built by a graph convolutional neural network. Experiments on DeepHuman and our collected dataset show that our approach is effective. The code will be made publicly available at https://github.com/l1346792580123/sdfgcn.
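To make the idea of a fixed mesh topology concrete, the following is a minimal sketch of one graph-convolution step that moves the vertices of a template mesh while leaving its face list untouched. It is not the authors' network: the mean-aggregation rule, the residual offset, and the names `build_adjacency`, `graph_conv`, and `deform` are all assumptions chosen for illustration, and the weight matrix is a hand-picked constant rather than a learned parameter.

```python
def build_adjacency(num_vertices, faces):
    """Return, for each vertex, the set of its neighbors plus itself."""
    neighbors = [{i} for i in range(num_vertices)]  # self-loops
    for a, b, c in faces:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    return neighbors


def graph_conv(features, neighbors, weight):
    """Average each vertex's neighborhood features, then apply a linear map."""
    in_dim = len(weight)
    out_dim = len(weight[0])
    out = []
    for nbrs in neighbors:
        mean = [sum(features[j][d] for j in nbrs) / len(nbrs)
                for d in range(in_dim)]
        out.append([sum(mean[d] * weight[d][k] for d in range(in_dim))
                    for k in range(out_dim)])
    return out


def deform(vertices, faces, weight):
    """Predict per-vertex offsets and add them; the faces are returned as-is."""
    neighbors = build_adjacency(len(vertices), faces)
    offsets = graph_conv(vertices, neighbors, weight)
    new_vertices = [[v[d] + o[d] for d in range(3)]
                    for v, o in zip(vertices, offsets)]
    return new_vertices, faces


# Hypothetical template: a tetrahedron standing in for a body mesh.
template_vertices = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
template_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
# Assumed weight: 0.1 times the identity (a real model would learn this).
weight = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]

new_vertices, new_faces = deform(template_vertices, template_faces, weight)
```

Because only vertex positions change, the output has the same vertex count and connectivity as the template. That fixed correspondence is what an implicit surface extracted by an iso-surfacing step does not provide on its own.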

Funding

This work is supported by the National Natural Science Foundation of China under Grant 61831015.

Author information

Authors and Affiliations

College of Computer Science and Technology, Zhejiang University, Hangzhou City, Zhejiang Province, China
Lixiang Lin & Jianke Zhu

Authors

Lixiang Lin
Jianke Zhu

Corresponding author

Correspondence to Jianke Zhu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 12227 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, L., Zhu, J. Topology-preserved human reconstruction with details. Vis Comput 39, 3609–3619 (2023). https://doi.org/10.1007/s00371-023-02957-0

Accepted: 09 June 2023
Published: 05 July 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s00371-023-02957-0
