Abstract
Image-based 3D virtual try-on from a single photograph can offer Internet users an engaging shopping experience and has enormous commercial potential. Existing methods reconstruct a clothed 3D human body from the try-on image by extracting depth information from the input. However, their results are unstable: downsampling during depth prediction loses high-frequency detail relative to the larger spatial context, and the generator's gradients vanish when predicting occluded regions in high-resolution images. To address these problems, we propose a multi-resolution parallel approach that captures low-frequency information while retaining as much high-frequency depth detail as possible during depth prediction; in addition, we employ a multi-scale generator and discriminator to infer the features of occluded regions more accurately and to generate a fine-grained clothed 3D human body. Quantitative and qualitative evaluations show that our method not only adds finer detail to the final 3D mannequin for virtual fitting, but also significantly improves the user's try-on experience over previous studies.
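The core idea behind the multi-resolution parallel branch, keeping a full-resolution path alongside a downsampled path so that detail is not destroyed by pooling, can be illustrated with a toy Laplacian-style band split. This is a minimal NumPy sketch of the general principle, not the paper's actual network architecture; all function names here are hypothetical:

```python
import numpy as np

def avg_pool2(x):
    """Downsample a 2D feature map by 2x average pooling."""
    h, w = x.shape
    return x[: h // 2 * 2, : w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour 2x upsampling."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def multi_resolution_fuse(feat):
    """Run a coarse (low-frequency) branch in parallel with a residual
    high-frequency branch, then fuse them. Unlike a purely sequential
    encoder-decoder, the high-frequency band is carried through intact."""
    low = upsample2(avg_pool2(feat))  # coarse branch: large receptive field
    high = feat - low                 # residual branch: fine detail
    return low + high                 # fusion: detail survives exactly

feat = np.random.rand(8, 8)
fused = multi_resolution_fuse(feat)
# The band split is lossless, so the fused map equals the input,
# whereas pool-then-upsample alone would blur away the high band.
```

In a real network each branch would pass through its own convolutional layers before fusion; the point of the sketch is only that carrying a parallel full-resolution path preserves the high-frequency band that pooling discards.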
Data Availability
The data that support the findings of this study are available from the corresponding author, [author initials], upon reasonable request.
Contributions
All authors disclosed no relevant relationships. XH, CZ, JH, RL, JL, TP.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hu, X., Zheng, C., Huang, J. et al. Cloth texture preserving image-based 3D virtual try-on. Vis Comput 39, 3347–3357 (2023). https://doi.org/10.1007/s00371-023-02999-4