VTNCT: an image-based virtual try-on network by combining feature with pixel transformation

Chang, Yuan; Peng, Tao; Yu, Feng; He, Ruhan; Hu, Xinrong; Liu, Junping; Zhang, Zili; Jiang, Minghua

doi:10.1007/s00371-022-02480-8

Original article
Published: 22 April 2022

Volume 39, pages 2583–2596, (2023)

The Visual Computer

Yuan Chang^1,2,3,
Tao Peng^1,2,3,
Feng Yu^1,2,3,
Ruhan He^1,3,
Xinrong Hu^1,3,
Junping Liu^2,3,
Zili Zhang^2,3 &
Minghua Jiang^1,3

Abstract

Image-based virtual try-on, which aims to transfer a target clothing item onto the corresponding region of a person, has attracted increasing research attention. However, most existing image-based virtual try-on methods fall short in generating and preserving detail. To address this, we propose a novel virtual try-on network that generates photo-realistic try-on images while preserving the details of the clothes and of non-target regions. We introduce two key innovations. The first is the clothing warping module, which uses a warping strategy that combines feature transformation with pixel transformation to obtain warped clothes with realistic texture and robust alignment. The second is the arm generation module, an original module that is highly effective at handling occlusion and generating the details of the arm region. In addition, we use a distillation strategy to counter the degradation caused by incorrect parsing, which further demonstrates the effectiveness of our components. Extensive experiments on a public fashion dataset demonstrate that our system achieves state-of-the-art virtual try-on performance both qualitatively and quantitatively. The code is available at https://github.com/changyuan96/VTNCT.

Acknowledgements

This work is supported in part by the Science Foundation of Hubei under Grant No. 2014CFB764, by the Department of Education of the Hubei Province of China under Grant No. Q20131608, and by the Engineering Research Center of Hubei Province for Clothing Information.

Author information

Authors and Affiliations

Hubei Provincial Engineering Research Center for Intelligent Textile and Fashion, Wuhan, 430200, China
Yuan Chang, Tao Peng, Feng Yu, Ruhan He, Xinrong Hu & Minghua Jiang
Engineering Research Center of Hubei Province for Clothing Information, Wuhan, 430200, China
Yuan Chang, Tao Peng, Feng Yu, Junping Liu & Zili Zhang
School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
Yuan Chang, Tao Peng, Feng Yu, Ruhan He, Xinrong Hu, Junping Liu, Zili Zhang & Minghua Jiang

Corresponding author

Correspondence to Tao Peng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chang, Y., Peng, T., Yu, F. et al. VTNCT: an image-based virtual try-on network by combining feature with pixel transformation. Vis Comput 39, 2583–2596 (2023). https://doi.org/10.1007/s00371-022-02480-8

Accepted: 24 March 2022
Published: 22 April 2022
Issue Date: July 2023
DOI: https://doi.org/10.1007/s00371-022-02480-8
