
Facial-sketch Synthesis: A New Challenge

Abstract

This paper aims to conduct a comprehensive study on facial-sketch synthesis (FSS). However, due to the high cost of obtaining hand-drawn sketch datasets, there is a lack of a complete benchmark for assessing the development of FSS algorithms over the last decade. We first introduce a high-quality dataset for FSS, named FS2K, which consists of 2,104 image-sketch pairs spanning three sketch styles and a variety of image backgrounds, lighting conditions, skin colors, and facial attributes. FS2K differs from previous FSS datasets in difficulty, diversity, and scalability, and should thus facilitate the progress of FSS research. Second, we present the largest-scale FSS investigation to date, reviewing 89 classic methods: 25 handcrafted feature-based facial-sketch synthesis approaches, 29 general translation methods, and 35 image-to-sketch approaches. In addition, we conduct comprehensive experiments on 19 existing cutting-edge models. Third, we present a simple baseline for FSS, named FSGAN. With only two straightforward components, i.e., facial-aware masking and style-vector expansion, FSGAN surpasses all previous state-of-the-art models on the proposed FS2K dataset by a large margin. Finally, we conclude with lessons learned over the past years and point out several unsolved challenges. Our code is available at https://github.com/DengPingFan/FSGAN.
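
The abstract names FSGAN's two components only at a high level. The sketch below is one plausible reading, not the authors' implementation: facial-aware masking is interpreted as a face-region-weighted reconstruction loss, and style-vector expansion as broadcasting a per-image style code over the spatial grid of a feature map before channel-wise fusion. All function names, tensor shapes, and the weighting scheme are illustrative assumptions.

```python
import torch

def facial_aware_masked_l1(pred, target, face_mask, face_weight=2.0):
    """Reconstruction loss that up-weights facial regions.

    face_mask: (B, 1, H, W) tensor in [0, 1], e.g., from an off-the-shelf
    face parser (hypothetical input; the exact masking used by FSGAN is
    not specified in the abstract).
    """
    weights = 1.0 + (face_weight - 1.0) * face_mask
    return (weights * (pred - target).abs()).mean()

def expand_style_vector(features, style_vec):
    """Broadcast a per-image style code (B, C_s) over the spatial grid of a
    feature map (B, C_f, H, W) and concatenate it channel-wise."""
    b, _, h, w = features.shape
    style_map = style_vec[:, :, None, None].expand(b, style_vec.shape[1], h, w)
    return torch.cat([features, style_map], dim=1)

# Toy usage with random tensors.
pred = torch.rand(2, 1, 64, 64)                    # generated sketch
target = torch.rand(2, 1, 64, 64)                  # ground-truth sketch
mask = (torch.rand(2, 1, 64, 64) > 0.5).float()    # stand-in facial mask
loss = facial_aware_masked_l1(pred, target, mask)

feats = torch.rand(2, 32, 16, 16)                  # intermediate generator features
style = torch.rand(2, 3)                           # e.g., a code over the three sketch styles
fused = expand_style_vector(feats, style)          # shape (2, 35, 16, 16)
```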

References

  1. X. G. Wang, X. O. Tang. Face photo-sketch synthesis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 11, pp. 1955–1967, 2009. DOI: https://doi.org/10.1109/TPAMI.2008.222.

    MathSciNet  Article  Google Scholar 

  2. R. Yi, Y. J. Liu, Y. K. Lai, P. L. Rosin. APDrawingGAN: Generating artistic portrait drawings from face photos with hierarchical GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 10735–10744, 2019. DOI: https://doi.org/10.1109/CVPR.2019.01100.

  3. H. Koshimizu, M. Tominaga, T. Fujiwara, K. Murakami. On KANSEI facial image processing for computerized facial caricaturing system PICASSO. In Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, IEEE, Tokyo, Japan, pp. 294–299, 1999. DOI: https://doi.org/10.1109/ICSMC.1999.816567.

  4. N. Kumar, A. C. Berg, P. N. Belhumeur, S. K. Nayar. Attribute and simile classifiers for face verification. In Proceedings of IEEE 12th International Conference on Computer Vision, IEEE, Kyoto, Japan, pp. 365–372, 2009. DOI: https://doi.org/10.1109/ICCV.2009.5459250.

    Google Scholar 

  5. H. S. Du, Q. P. Hu, D. F. Qiao, I. Pitas. Robust face recognition via low-rank sparse representation-based classification. International Journal of Automation and Computing, vol. 12, no. 6, pp. 579–587, 2015. DOI: https://doi.org/10.1007/s11633-015-0901-2.

    Article  Google Scholar 

  6. Y. Z. Lu. A novel face recognition algorithm for distinguishing faces with various angles. International Journal of Automation and Computing, vol. 5, no. 2, pp. 193–197, 2008. DOI: https://doi.org/10.1007/s11633-008-0193-x.

    Article  Google Scholar 

  7. V. Jain, E. Learned-Miller. FDDB: A Benchmark for Face Detection in Unconstrained Settings, Technical Report UM-CS-2010-009, Department of Computer Science, University of Massachusetts Amherst, USA, 2010.

    Google Scholar 

  8. Z. P. Zhang, P. Luo, C. C. Loy, X. O. Tang. Facial landmark detection by deep multi-task learning. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 94–108, 2014. DOI: https://doi.org/10.1007/978-3-319-10599-4_7.

    Google Scholar 

  9. A. Bulat, G. Tzimiropoulos. How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230, 000 3D facial landmarks). In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1021–1030, 2017. DOI: https://doi.org/10.1109/ICCV.2017.116.

    Google Scholar 

  10. J. X. Sun, Q. Li, W. N. Wang, J. Zhao, Z. N. Sun. Multicaption text-to-face synthesis: Dataset and algorithm. In Proceedings of the 29th ACM International Conference on Multimedia, ACM, Chengdu, China, pp. 2290–2298, 2021. DOI: https://doi.org/10.1145/3474085.3475391.

    Google Scholar 

  11. R. Yi, M. F. Xia, Y. J. Liu, Y. K. Lai, P. L. Rosin. Line drawings for face portraits from photos using global and local structure based GANs. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3462–3475, 2021. DOI: https://doi.org/10.1109/TPAMI.2020.2987931.

    Article  Google Scholar 

  12. Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on image Processing, vol. 13, no. 4, pp. 600–612, 2004. DOI: https://doi.org/10.1109/TIP.2003.819861.

    Article  Google Scholar 

  13. J. Y. Zhu, T. Park, P. Isola, A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2242–2251, 2017. DOI: https://doi.org/10.1109/ICCV.2017.244.

    Google Scholar 

  14. M. Y. Liu, T. Breuel, J. Kautz. Unsupervised image-to-image translation networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 700–708, 2017.

    Google Scholar 

  15. T. C. Wang, M. Y. Liu, J. Y. Zhu, A. Tao, J. Kautz, B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8798–8807, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00917.

  16. T. Park, M. Y. Liu, T. C. Wang, J. Y. Zhu. Semantic image synthesis with spatially-adaptive normalization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 2332–2341, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00244.

    Google Scholar 

  17. H. Y. Chang, Z. X. Wang, Y. Y. Chuang. Domain-specific mappings for generative adversarial style transfer. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 573–589, 2020. DOI: https://doi.org/10.1007/978-3-030-58598-3_34.

    Google Scholar 

  18. R. F. Chen, W. B. Huang, B. H. Huang, F. C. Sun, B. Fang. Reusing discriminators for encoding: Towards unsupervised image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8165–8174, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00819.

    Google Scholar 

  19. H. Y. Lee, H. Y. Tseng, Q. Mao, J. B. Huang, Y. D. Lu, M. Singh, M. H. Yang. DRIT++: Diverse image-to-image translation via disentangled representations. International Journal of Computer Vision, vol. 128, no. 10, pp. 2402–2417, 2020. DOI: https://doi.org/10.1007/s11263-019-01284-z.

    Article  Google Scholar 

  20. D. P. Fan, S. C. Zhang, Y. H. Wu, Y. Liu, M. M. Cheng, B. Ren, P. Rosin, R. R. Ji. Scoot: A perceptual metric for facial sketches. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 5611–5621, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00571.

    Google Scholar 

  21. H. S. Bhatt, S. Bharadwaj, R. Singh, M. Vatsa. On matching sketches with digital face images. In Proceedings of the 4th IEEE International Conference on Biometrics: Theory, Applications and Systems, IEEE, Washington DC, USA, 2010. DOI: https://doi.org/10.1109/BTAS.2010.5634507.

    Google Scholar 

  22. W. Zhang, X. G. Wang, X. O. Tang. Coupled information-theoretic encoding for face photo-sketch recognition. In Proceedings of Conference on Computer Vision and Pattern Recognition, IEEE, Colorado Springs, USA, pp. 513–520, 2011. DOI: https://doi.org/10.1109/CVPR.2011.5995324.

    Google Scholar 

  23. X. B. Gao, N. N. Wang, D. C. Tao, X. L. Li. Face sketch-photo synthesis and retrieval using sparse representation. IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 8, pp. 1213–1226, 2012. DOI: https://doi.org/10.1109/TCSVT.2012.2198090.

    Article  Google Scholar 

  24. I. Berger, A. Shamir, M. Mahler, E. Carter, J. Hodgins. Style and abstraction in portrait sketching. ACM Transactions on Graphics, vol. 32, no. 4, Article number 55, 2013. DOI: https://doi.org/10.1145/2461912.2461964.

    Article  Google Scholar 

  25. R. Yi, Y. J. Liu, Y. K. Lai, P. L. Rosin. Unpaired portrait drawing generation via asymmetric cycle mapping. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8214–8222, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00824.

    Google Scholar 

  26. C. L. Peng, X. B. Gao, N. N. Wang, J. Li. Face recognition from multiple stylistic sketches: Scenarios, datasets, and evaluation. Pattern Recognition, vol. 84, no. pp. 262–272, 2018. DOI: https://doi.org/10.1016/j.patcog.2018.07.014.

    Article  Google Scholar 

  27. A. M. Martinez, R. Benavente. The AR Face Database, CVC Technical Report 24, CVC, Spain, 1998.

    Google Scholar 

  28. N. N. Wang, X. B. Gao, D. C. Tao, X. L. Li. Face sketch-photo synthesis under multi-dictionary sparse representation framework. In Proceedings of 6th International Conference on Image and Graphics, IEEE, Hefei, China, pp. 82–87, 2011. DOI: https://doi.org/10.1109/ICIG.2011.112.

    Google Scholar 

  29. S. C. Zhang, R. R. Ji, J. Hu, X. Q. Lu, X. L. Li. Face sketch synthesis by multidomain adversarial learning. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 5, pp. 1419–1428, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2869574.

    Article  Google Scholar 

  30. M. R. Zhu, J. Li, N. N. Wang, X. B. Gao. Knowledge distillation for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 2, pp. 893–906, 2022. DOI: https://doi.org/10.1109/TNNLS.2020.3030536.

    Article  Google Scholar 

  31. Z. W. Liu, P. Luo, X. G. Wang, X. O. Tang. Deep learning face attributes in the wild. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Santiago, Chile, pp. 3730–3738, 2015. DOI: https://doi.org/10.1109/ICCV.2015.425.

    Google Scholar 

  32. J. Kim, M. Kim, H. Kang, K. Lee. U-GAT-IT: Unsupervised generative attentional networks with adaptive layer-Instance normalization for image-to-image translation. In Proceedings of the 8th International Conference on Learning Representations, Ababa, Ethiopia, 2020.

    Google Scholar 

  33. P. Isola, J. Y. Zhu, T. H. Zhou, A. A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5967–5976, 2017. DOI: https://doi.org/10.1109/CVPR.2017.632.

    Google Scholar 

  34. K. Messer, J. Matas, J. Kittler, K. Jonsson, J. Luettin, G. Maitre. XM2VTSDB: The extended M2VTS database. In Proceedings of the 2nd International Conference on Audio and Video-based Biometric Person Authentication, Springer, Washington DC, USA, pp. 965–966, 1999.

    Google Scholar 

  35. P. J. Phillips, H. Moon, S. A. Rizvi, P. J. Rauss. The FERET evaluation methodology for face-recognition algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000. DOI: https://doi.org/10.1109/34.879790.

    Article  Google Scholar 

  36. Á. Serrano, I. M. De Diego, C. Conde, E. Cabello, L. L. Shen, L. Bai. Influence of wavelet frequency and orientation in an SVM-based parallel Gabor PCA face verification system. In Proceedings of the 8th International Conference on Intelligent Data Engineering and Automated Learning, Springer, Birmingham, UK, pp. 219–228, 2007. DOI: https://doi.org/10.1007/978-3-540-77226-2_23.

    Google Scholar 

  37. H. S. Bhatt, S. Bharadwaj, R. Singh, M. Vatsa. Memetically optimized MCWLD for matching sketches with digital face images. IEEE Transactions on Information Forensics and Security, vol. 7, no. 5, pp. 1522–1535, 2012. DOI: https://doi.org/10.1109/TIFS.2012.2204252.

    Article  Google Scholar 

  38. M. Minear, D. C. Park. A lifespan database of adult facial stimuli. Behavior Research Methods, Instruments & Computers, vol. 36, no. 4, pp. 630–633, 2004. DOI: https://doi.org/10.3758/BF03206543.

    Article  Google Scholar 

  39. J. Nishino, T. Kamyama, H. Shira, T. Odaka, H. Ogura. Linguistic knowledge acquisition system on facial caricature drawing system. In Proceedings of IEEE International Fuzzy Systems. IEEE, Seoul, Korea, pp. 1591–1596, 1999. DOI: https://doi.org/10.1109/FUZZY.1999.790142.

    Google Scholar 

  40. S. Iwashita, Y. Takeda, T. Onisawa. Expressive facial caricature drawing. In Proceedings of IEEE International Fuzzy Systems. IEEE, Seoul, Korea, pp. 1597–1602, 1999. DOI: https://doi.org/10.1109/FUZZY.1999.790143.

    Google Scholar 

  41. Y. Z. Li, H. Kobatake. Extraction of facial sketch image based on morphological processing. In Proceedings of International Conference on Image Processing, IEEE, Santa Barbara, USA, pp. 316–319, 1997. DOI: https://doi.org/10.1109/ICIP.1997.632104.

    Google Scholar 

  42. M. Tominaga, S. Fukuoka, K. Murakami, H. Koshimizu. Facial caricaturing with motion caricaturing in PICASSO system. In Proceedings of IEEE/ASME International Conference on Advanced Intelligent Mechatronics, IEEE, Tokyo, Japan, pp. 30, 1997. DOI: https://doi.org/10.1109/AIM.1997.652888.

    Chapter  Google Scholar 

  43. S. E. Brennan. Caricature Generator, Ph. D. dissertation, Massachusetts Institute of Technology, USA, 1982.

    Google Scholar 

  44. N. N. Wang, D. C. Tao, X. B. Gao, X. L. Li, J. Li. A comprehensive survey to face hallucination. International Journal of Computer Vision, vol. 106, no. 1, pp. 9–30, 2014. DOI: https://doi.org/10.1007/s11263-013-0645-9.

    Article  Google Scholar 

  45. H. Chen, Y. Q. Xu, H. Y. Shum, S. C. Zhu, N. N. Zheng. Example-based facial sketch generation with non-parametric sampling. In Proceedings of the 8th IEEE International Conference on Computer Vision, IEEE, Vancouver, Canada, pp. 433–438, 2001. DOI: https://doi.org/10.1109/ICCV.2001.937657.

    Google Scholar 

  46. A. V. Nefian, M. H. Hayes III. Face recognition using an embedded HMM. In Proceedings of IEEE Conference on Audio and Video-based Biometric Person Authentication, IEEE, 1999.

    Google Scholar 

  47. X. B. Gao, J. J. Zhong, J. Li, C. N. Tian. Face sketch synthesis algorithm based on E-HMM and selective ensemble. IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 4, pp. 487–496, 2008. DOI: https://doi.org/10.1109/TCSVT.2008.918770.

    Article  Google Scholar 

  48. M. Eitz, J. Hays, M. Alexa. How do humans sketch objects? ACM Transactions on Graphics, vol. 31, no. 4, Article number 44, 2012. DOI: https://doi.org/10.1145/2185520.2185540.

  49. T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C. L. Zitnick. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 740–755, 2014. DOI: https://doi.org/10.1007/978-3-319-10602-1_48.

    Google Scholar 

  50. O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. H. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, F. F. Li. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015. DOI: https://doi.org/10.1007/11263-015-0816-y.

    MathSciNet  Article  Google Scholar 

  51. M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, A. Vedaldi. Describing textures in the wild. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 3606–3613, 2014. DOI: https://doi.org/10.1109/CVPR.2014.461.

    Google Scholar 

  52. S. Y. Duck. Painter by numbers, wikiart.org, [Online], Available: https://www.kaggle.com/c/painter-by-numbers, 2016.

    Google Scholar 

  53. M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, B. Schiele. The cityscapes dataset for semantic urban scene understanding. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3213–3223, 2016. DOI: https://doi.org/10.1109/CVPR.2016.350.

    Google Scholar 

  54. R. Tyleček, R. Šára. Spatial pattern templates for recognition of objects with regular structure. In Proceedings of the 35th German Conference on Pattern Recognition, Springer, Saarbrücken, Germany, pp. 364–374, 2013. DOI: https://doi.org/10.1007/978-3-642-40602-7_39.

    Google Scholar 

  55. J. Y. Zhu, P. Krähenbühl, E. Shechtman, A. A. Efros. Generative visual manipulation on the natural image manifold. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 597–613, 2016. DOI: https://doi.org/10.1007/978-3-319-46454-1_36.

    Google Scholar 

  56. A. Yu, K. Grauman. Fine-grained visual comparisons with local learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 192–199, 2014. DOI: https://doi.org/10.1109/CV-PR.2014.32.

    Google Scholar 

  57. P. Y. Laffont, Z. Ren, X. F. Tao, C. Qian, J. Hays. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Transactions on Graphics, vol. 33, no. 4, Article number 149, 2014. DOI: https://doi.org/10.1145/2601097.2601101.

    Google Scholar 

  58. Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of IEEE, vol. 86, no. 11, pp. 2278–2324, 1998. DOI: https://doi.org/10.1109/5.726791.

    Article  Google Scholar 

  59. C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie. The caltech-ucsd birds-200-2011 dataset, 2011. [Online], Available: https://authors.library.caltech.edu/27452/1/CUB_200_2011.pdf.

    Google Scholar 

  60. T. Karras, T. Aila, S. Laine, J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

    Google Scholar 

  61. N. Silberman, D. Hoiem, P. Kohli, R. Fergus. Indoor segmentation and support inference from RGBD images. In Proceedings of the 12th European Conference on Computer Vision, Springer, Florence, Italy, pp. 746–760, 2012. DOI: https://doi.org/10.1007/978-3-642-33715-4_54.

    Google Scholar 

  62. B. L. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso, A. Torralba. Scene parsing through ADE20K dataset. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5122–5130, 2017. DOI: https://doi.org/10.1109/CVPR.2017.544.

    Google Scholar 

  63. Q. Yu, Y. Z. Song, T. Xiang, T. M. Hospedales. Sketchx!-shoe/chair fine-grained SBIR dataset, 2017. [Online], Available: https://sketchx.eecs.qmul.ac.uk/downloads/.

    Google Scholar 

  64. D. Ha, D. Eck. A neural representation of sketch drawings. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

    Google Scholar 

  65. Y. H. Jin, J. K. Zhang, M. J. Li, Y. T. Tian, H. C. Zhu, Z. H. Fang. Towards the automatic anime characters creation with generative adversarial networks. [Online], Available: https://arxiv.org/pdf/1708.05509, 2017.

    Google Scholar 

  66. H. Z. Xu, Y. Gao, F. Yu, T. Darrell. End-to-end learning of driving models from large-scale video datasets. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 3530–3538, 2017. DOI: https://doi.org/10.1109/CVPR.2017.376.

    Google Scholar 

  67. G. Ros, L. Sellart, J. Materzynska, D. Vazquez, A. M. Lopez. The SYNTHIA dataset: A large collection of synthetic images for semantic segmentation of urban scenes. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3234–3243, 2016. DOI: https://doi.org/10.1109/CVPR.2016.352.

    Google Scholar 

  68. Z. W. Liu, P. Luo, S. Qiu, X. G. Wang, X. O. Tang. DeepFashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 1096–1104, 2016. DOI: https://doi.org/10.1109/CVPR.2016.124.

    Google Scholar 

  69. T. Karras, S. Laine, T. Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4396–4405, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00453.

    Google Scholar 

  70. E. Agustsson, R. Timofte. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Honolulu, USA, pp. 1122–1131, 2017. DOI: https://doi.org/10.1109/CVPRW.2017.150.

    Google Scholar 

  71. B. Yao, X. Yang, S. C. Zhu. Introduction to a large-scale general purpose ground truth database: Methodology, annotation tool and benchmarks. In Proceedings of the 6th International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, Springer, Ezhou, China, pp. 169–183, 2007, DOI: https://doi.org/10.1007/978-3-540-74198-5_14.

    Chapter  Google Scholar 

  72. J. Krause, M. Stark, J. Deng, F. F. Li. 3D object representations for fine-grained categorization. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Sydney, Australia, pp. 554–561, 2013. DOI: https://doi.org/10.1109/ICCVW.2013.77.

    Google Scholar 

  73. F. Yu, A. Seff, Y. D. Zhang, S. R. Song, T. Funkhouser, J. X. Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. [Online], Available: https://arxiv.org/abs/1506.03365, 2015.

    Google Scholar 

  74. Q. S. Liu, X. O. Tang, H. L. Jin, H. Q. Lu, S. D. Ma. A nonlinear approach for face sketch synthesis and recognition. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, San Diego, USA, pp. 1005–1010, 2005. DOI: https://doi.org/10.1109/CVPR.2005.39.

    Google Scholar 

  75. Z. J. Xu, H. Chen, S. C. Zhu, J. B. Luo. A hierarchical compositional model for face representation and sketching. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 6, pp. 955–969, 2008. DOI: https://doi.org/10.1109/TPAMI.2008.50.

    Article  Google Scholar 

  76. W. Zhang, X. G. Wang, X. O. Tang. Lighting and pose robust face sketch synthesis. In Proceedings of the 11th European Conference on Computer Vision, Springer, Heraklion, Greece, pp. 420–433, 2010. DOI: https://doi.org/10.1007/978-3-642-15567-3_31.

    Google Scholar 

  77. N. Y. Ji, X. J. Chai, S. G. Shan, X. L. Chen. Local regression model for automatic face sketch generation. In Proceedings of the 6th International Conference on Image and Graphics, IEEE, Hefei, China, pp. 412–417, 2011. DOI: https://doi.org/10.1109/ICIG.2011.84.

    Google Scholar 

  78. L. Chang, M. Q. Zhou, X. M. Deng, Z. K. Wu, Y. J. Han. Face sketch synthesis via multivariate output regression. In Proceedings of the 14th International Conference on Human-computer Interaction, Springer, Orlando, USA, pp. 555–561, 2011. DOI: https://doi.org/10.1007/978-3-642-21602-2_60.

    Google Scholar 

  79. J. W. Zhang, N. N. Wang, X. B. Gao, D. C. Tao, X. L. Li. Face sketch-photo synthesis based on support vector regression. In Proceedings of the 18th IEEE International Conference on Image Processing, IEEE, Brussels, Belgium, pp. 1125–1128, 2011. DOI: https://doi.org/10.1109/ICIP.2011.6115625.

    Google Scholar 

  80. S. L. Wang, L. Zhang, Y. Liang, Q. Pan. Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 2216–2223, 2012. DOI: https://doi.org/10.1109/CVPR.2012.6247930.

    Google Scholar 

  81. H. Zhou, Z. H. Kuang, K. Y. K. Wong. Markov weight fields for face sketch synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 1091–1097, 2012. DOI: https://doi.org/10.1109/CVPR.2012.6247788.

    Google Scholar 

  82. T. H. Wang, J. Collomosse, A. Hunter, D. Greig. Learnable stroke models for example-based portrait painting. In Proceedings of British Machine Vision Conference, Bristol, UK, 2013.

    Google Scholar 

  83. N. N. Wang, D. C. Tao, X. B. Gao, X. L. Li, J. Li. Transductive face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 9, pp. 1364–1376, 2013. DOI: https://doi.org/10.1109/TNNLS.2013.2258174.

    Article  Google Scholar 

  84. D. A. Huang, Y. C. F. Wang. Coupled dictionary and feature space learning with applications to cross-domain image synthesis and recognition. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Sydney, Australia, pp. 2496–2503, 2013. DOI: https://doi.org/10.1109/ICCV.2013.310.

    Google Scholar 

  85. Y. B. Song, L. C. Bao, Q. X. Yang, M. H. Yang. Real-time exemplar-based face sketch synthesis. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 800–813, 2014. DOI: https://doi.org/10.1007/978-3-319-10599-4_51.

    Google Scholar 

  86. S. C. Zhang, X. B. Gao, N. N. Wang, J. Li. Robust face sketch style synthesis. IEEE Transactions on Image Processing, vol. 25, no. 1, pp. 220–232, 2016. DOI: https://doi.org/10.1109/TIP.2015.2501755.

    MathSciNet  MATH  Article  Google Scholar 

  87. C. L. Peng, X. B. Gao, N. N. Wang, J. Li. Superpixel-based face sketch-photo synthesis. IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 2, pp. 288–299, 2017. DOI: https://doi.org/10.1109/TCSVT.2015.2502861.

    Article  Google Scholar 

  88. C. L. Peng, X. B. Gao, N. N. Wang, D. C. Tao, X. L. Li, J. Li. Multiple representations-based face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 11, pp. 2201–2215, 2016. DOI: https://doi.org/10.1109/TNNLS.2015.2464681.

    Article  Google Scholar 

  89. Y. Li, Y. Z. Song, T. M. Hospedales, S. G. Gong. Freehand sketch synthesis with deformable stroke models. International Journal of Computer Vision, vol. 122, no. 1, pp. 169–190, 2017. DOI: https://doi.org/10.1007/s11263-016-0963-9.

    MathSciNet  Article  Google Scholar 

  90. J. Li, X. Y. Yu, C. L. Peng, N. N. Wang. Adaptive representation-based face sketch-photo synthesis. Neurocomputing, vol. 269, pp. 152–159, 2017. DOI: https://doi.org/10.1016/j.neucom.2016.10.095.

    Article  Google Scholar 

  91. N. N. Wang, X. B. Gao, J. Li. Random sampling for fast face sketch synthesis. Pattern Recognition, vol. 76, pp. 215–227, 2018. DOI: https://doi.org/10.1016/j.patcog.2017.11.008.

    Article  Google Scholar 

  92. Y. F. Men, Z. H. Lian, Y. M. Tang, J. G. Xiao. A common framework for interactive texture transfer. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6353–6362, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00665.

    Google Scholar 

  93. L. A. Gatys, A. S. Ecker, M. Bethge. A neural algorithm of artistic style. [Online], Available: https://arxiv.org/abs/1508.06576, 2015.

    Google Scholar 

  94. L. A. Gatys, A. S. Ecker, M. Bethge. Image style transfer using convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2414–2423, 2016. DOI: https://doi.org/10.1109/CVPR.2016.265.

    Google Scholar 

  95. J. Johnson, A. Alahi, F. F. Li. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 694–711, 2016. DOI: https://doi.org/10.1007/978-3-319-46475-6_43.

    Google Scholar 

  96. D. Ulyanov, V. Lebedev, A. Vedaldi, V. S. Lempitsky. Texture networks: Feed-forward synthesis of textures and stylized images. In Proceedings of the 33rd International Conference on International Conference on Machine Learning, New York, USA, pp. 1349–1357, 2016.

    Google Scholar 

  97. T. Q. Chen, M. Schmidt. Fast patch-based style transfer of arbitrary style. [Online], Available: https://arxiv.org/pdf/1612.04337, 2016.

    Google Scholar 

  98. V. Dumoulin, J. Shlens, M. Kudlur. A learned representation for artistic style. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.

    Google Scholar 

  99. D. Ulyanov, A. Vedaldi, V. Lempitsky. Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4105–4113, 2017. DOI: https://doi.org/10.1109/CVPR.2017.437.

    Google Scholar 

  100. X. Huang, S. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1510–1519, 2017. DOI: https://doi.org/10.1109/ICCV.2017.167.

    Google Scholar 

  101. Y. J. Li, C. Fang, J. M. Yang, Z. W. Wang, X. Lu, M. H. Yang. Universal style transfer via feature transforms. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 385–395, 2017.

    Google Scholar 

  102. Y. Chen, Y. K. Lai, Y. J. Liu. CartoonGAN: Generative adversarial networks for photo cartoonization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 9465–9474, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00986.

    Google Scholar 

  103. R. Abdal, Y. P. Qin, P. Wonka. Image2StyleGAN: How to embed images into the StyleGAN latent space? In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 4431–4440, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00453.

    Google Scholar 

  104. D. Kotovenko, M. Wright, A. Heimbrecht, B. Ommer. Rethinking style transfer: From pixels to parameterized brushstrokes. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 12191–12200, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.01202.

    Google Scholar 

  105. E. Richardson, Y. Alaluf, O. Patashnik, Y. Nitzan, Y. Azar, S. Shapiro, D. Cohen-Or. Encoding in style: A StyleGAN encoder for image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 2287–2296, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.00232.

    Google Scholar 

  106. Z. L. Yi, H. Zhang, P. Tan, M. L. Gong. DualGAN: Unsupervised dual learning for image-to-image translation. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2868–2876, 2017. DOI: https://doi.org/10.1109/ICCV.2017.310.

    Google Scholar 

  107. T. Kim, M. Cha, H. Kim, J. K. Lee, J. Kim. Learning to discover cross-domain relations with generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 1857–1865, 2017.

    Google Scholar 

  108. J. Y. Zhu, R. Zhang, D. Pathak, T. Darrell, A. A. Efros, O. Wang, E. Shechtman. Toward multimodal image-to-image translation. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 465–476, 2017.

    Google Scholar 

  109. X. Huang, M. Y. Liu, S. Belongie, J. Kautz. Multimodal unsupervised image-to-image translation. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 179–196, 2018. DOI: https://doi.org/10.1007/978-3-030-01219-9_11.

    Google Scholar 

  110. P. Zhang, B. Zhang, D. Chen, L. Yuan, F. Wen. Cross-domain correspondence learning for exemplar-based image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5142–5152, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00519.

    Google Scholar 

  111. L. M. Jiang, C. X. Zhang, M. Y. Huang, C. X. Liu, J. P. Shi, C. C. Loy. TSIT: A simple and versatile framework for image-to-image translation. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 206–222, 2020. DOI: https://doi.org/10.1007/978-3-030-58580-8_13.

    Google Scholar 

  112. Y. H. Zhao, R. H. Wu, H. Dong. Unpaired image-to-image translation using adversarial consistency loss. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 800–815, 2020. DOI: https://doi.org/10.1007/978-3-030-58545-7_46.

    Google Scholar 

  113. X. R. Zhou, B. Zhang, T. Zhang, P. Zhang, J. M. Bao, D. Chen, Z. F. Zhang, F. Wen. CoCosNet v2: Full-resolution correspondence learning for image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 11460–11470, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.01130.

    Google Scholar 

  114. A. P. Chen, R. Y. Liu, L. Xie, Z. Chen, H. Su, J. Y. Yu. SofGAN: A portrait image generator with dynamic styling. ACM Transactions on Graphics, vol. 41, no. 1, Article number 1, 2022. DOI: https://doi.org/10.1145/3470848.

    Google Scholar 

  115. L. L. Zhang, L. Lin, X. Wu, S. Y. Ding, L. Zhang. End-to-end photo-sketch generation via fully convolutional representation learning. In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, ACM, Shanghai, China, pp. 627–634, 2015. DOI: https://doi.org/10.1145/2671188.2749321.

    Chapter  Google Scholar 

  116. M. R. Zhu, N. N. Wang, X. B. Gao, J. Li. Deep graphical feature learning for face sketch synthesis. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, pp. 3574–3580, 2017.

    Google Scholar 

  117. P. Sangkloy, J. W. Lu, C. Fang, F. Yu, J. Hays. Scribbler: Controlling deep image synthesis with sketch and color. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 6836–6845, 2017. DOI: https://doi.org/10.1109/CVPR.2017.723.

    Google Scholar 

  118. M. J. Zhang, N. N. Wang, Y. S. Li, R. X. Wang, X. B. Gao. Face sketch synthesis from coarse to fine. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, California, USA, pp. 7558–7565, 2018. DOI: https://doi.org/10.1609/aaai.v32i1.12224.

    Google Scholar 

  119. W. Q. Xian, P. Sangkloy, V. Agrawal, A. Raj, J. W. Lu, C. Fang, F. Yu, J. Hays. TextureGAN: Controlling deep image synthesis with texture patches. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8456–8465, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00882.

    Google Scholar 

  120. J. F. Song, K. Y. Pang, Y. Z. Song, T. Xiang, T. M. Hospedales. Learning to sketch with shortcut cycle consistency. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 801–810, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00090.

    Google Scholar 

  121. Y. Y. Lu, S. Z. Wu, Y. W. Tai, C. K. Tang. Image generation from sketch constraint using contextual GAN. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 213–228, 2018. DOI: https://doi.org/10.1007/978-3-030-01270-0_13.

    Google Scholar 

  122. S. C. Zhang, R. R. Ji, J. Hu, Y. Gao, C. W. Lin. Robust face sketch synthesis via generative adversarial fusion of priors and parametric sigmoid. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 1163–1169, 2018.

    Google Scholar 

  123. M. J. Zhang, N. Wang, Y. Li, X. Gao. Markov random neural fields for face sketch synthesis. In Proceedings of International Joint Conferences on Artificial Intelligence, Stockholm, Sweden, pp. 7558–7565, 2018.

    Google Scholar 

  124. L. D. Wang, V. Sindagi, V. Patel. High-quality facial photo-sketch synthesis using multi-adversarial networks. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, IEEE, Xi'an, China, pp. 83–90, 2018. DOI: https://doi.org/10.1109/FG.2018.00022.

    Google Scholar 

  125. M. J. Zhang, R. X. Wang, X. B. Gao, J. Li, D. C. Tao. Dual-transfer face sketch-photo synthesis. IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 642–657, 2019. DOI: https://doi.org/10.1109/TIP.2018.2869688.

    MathSciNet  MATH  Article  Google Scholar 

  126. H. Kazemi, M. Iranmanesh, A. Dabouei, S. Soleymani, N. M. Nasrabadi. Facial attributes guided deep sketch-to-photo synthesis. In Proceedings of IEEE Winter Applications of Computer Vision Workshops, IEEE, Lake Tahoe, USA, 2018. DOI: https://doi.org/10.1109/WACVW.2018.00006.

    Google Scholar 

  127. H. Kazemi, F. Taherkhani, N. M. Nasrabadi. Unsupervised facial geometry learning for sketch to photo synthesis. In Proceedings of International Conference of the Biometrics Special Interest Group, IEEE, Darmstadt, Germany, 2018.

    Google Scholar 

  128. S. You, N. You, M. X. Pan. PI-REC: Progressive image reconstruction network with edge and color domain. [Online], Available: https://arxiv.org/abs/1903.10146, 2019.

    Google Scholar 

  129. M. J. Zhang, N. N. Wang, Y. S. Li, X. B. Gao. Deep latent low-rank representation for face sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3109–3123, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2890017.

    Article  Google Scholar 

  130. M. R. Zhu, J. Li, N. N. Wang, X. B. Gao. A deep collaborative framework for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3096–3108, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2890018.

    Article  Google Scholar 

  131. M. J. Zhang, Y. S. Li, N. N. Wang, Y. Chi, X. B. Gao. Cascaded face sketch synthesis under various illuminations. IEEE Transactions on Image Processing, vol. 29, pp. 1507–1521, 2019. DOI: https://doi.org/10.1109/TIP.2019.2942514.

    MathSciNet  Article  Google Scholar 

  132. M. R. Zhu, N. N. Wang, X. B. Gao, J. Li, Z. F. Li. Face photo-sketch synthesis via knowledge transfer. In Proceedings of the 28th International Joint Conference on Artficial Intelligence, Macao, China, pp. 1048–1054, 2019.

    Google Scholar 

  133. Y. J. Li, C. Fang, A. Hertzmann, E. Shechtman, M. H. Yang. Im2Pencil: Controllable pencil illustration from photographs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1525–1534, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00162.

    Google Scholar 

  134. A. Ghosh, R. Zhang, P. Dokania, O. Wang, A. Efros, P. Torr, E. Shechtman. Interactive sketch & fill: Multiclass sketch-to-image translation. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 1171–1180, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00126.

    Google Scholar 

  135. X. R. Wang, J. Z. Yu. Learning to cartoonize using white-box cartoon representations. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8087–8096, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00811.

    Google Scholar 

  136. C. Y. Gao, Q. Liu, Q. Xu, L. M. Wang, J. Z. Liu, C. Q. Zou. SketchyCOCO: Image generation from freehand scene sketches. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5173–5182, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00522.

    Google Scholar 

  137. S. Yang, Z. Y. Wang, J. Y. Liu, Z. M. Guo. Deep plastic surgery: Robust and controllable image editing with human-drawn sketches. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 601–617, 2020. DOI: https://doi.org/10.1007/978-3-030-58555-6_36.

    Google Scholar 

  138. S. Y. Chen, W. C. Su, L. Gao, S. H. Xia, H. B. Fu. DeepFaceDrawing: Deep generation of face images from sketches. ACM Transactions on Graphics, vol. 39, no. 4, Article number 72, 2020. DOI: https://doi.org/10.1145/3386569.3392386.

    Google Scholar 

  139. J. Yu, X. X. Xu, F. Gao, S. J. Shi, M. Wang, D. C. Tao, Q. M. Huang. Toward realistic face photo-sketch synthesis via composition-aided GANs. IEEE Transactions on Cybernetics, vol. 51, no. 9, pp. 4350–4362, 2021. DOI: https://doi.org/10.1109/TCYB.2020.2972944.

    Article  Google Scholar 

  140. Y. K. Fang, W. H. Deng, J. P. Du, J. N. Hu. Identity-aware CycleGAN for face photo-sketch synthesis and recognition. Pattern Recognition, vol. 102, Article number 107249, 2020. DOI: https://doi.org/10.1016/j.patcog.2020.107249.

  141. Y. Lin, S. G. Ling, K. R. Fu, P. Cheng. An identity-preserved model for face sketch-photo synthesis. IEEE Signal Processing Letters, vol. 27, pp. 1095–1099, 2020. DOI: https://doi.org/10.1109/LSP.2020.3005039.

    Article  Google Scholar 

  142. C. L. Peng, N. N. Wang, J. Li, X. B. Gao. Universal face photo-sketch style transfer via multiview domain translation. IEEE Transactions on Image Processing, vol. 29, pp. 8519–8534, 2020. DOI: https://doi.org/10.1109/TIP.2020.3016502.

    Article  Google Scholar 

  143. K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.

    Google Scholar 

  144. S. C. Duan, Z. X. Chen, Q. M. J. Wu, L. Cai, D. Lu. Multi-scale gradients self-attention residual learning for face photo-sketch transformation. IEEE Transactions on Information Forensics and Security, vol. 16, pp. 1218–1230, 2020. DOI: https://doi.org/10.1109/TIFS.2020.3031386.

    Article  Google Scholar 

  145. S. Y. Wang, D. Bau, J. Y. Zhu. Sketch your own GAN. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 14030–14040, 2021. DOI: https://doi.org/10.1109/ICCV48922.2021.01379.

    Google Scholar 

  146. A. K. Bhunia, S. Khan, H. Cholakkal, R. M. Anwer, F. S. Khan, J. Laaksonen, M. Felsberg. DoodleFormer: Creative sketch drawing with transformers. [Online], Available: https://arxiv.org/abs/2112.03258, 2021.

    Google Scholar 

  147. H. Abdi, L. J. Williams. Principal component analysis. WIREs Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010. DOI: https://doi.org/10.1002/wics.101.

    Article  Google Scholar 

  148. X. O. Tang, X. G. Wang. Face photo recognition using sketch. In Proceedings. International Conference on Image Processing, IEEE, Rochester, USA, pp. I–257–I–260, 2002. DOI: https://doi.org/10.1109/ICIP.2002.1038008.

    Google Scholar 

  149. X. O. Tang, X. G. Wang. Face sketch synthesis and recognition. In Proceedings of the 9th IEEE International Conference on Computer Vision, IEEE, Nice, France, pp. 687–694, 2003. DOI: https://doi.org/10.1109/ICCV.2003.1238414.

    Chapter  Google Scholar 

  150. X. O. Tang, X. G. Wang. Face sketch recognition. IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 50–57, 2004. DOI: https://doi.org/10.1109/TCSVT.2003.818353.

    Article  Google Scholar 

  151. S. T. Roweis, L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, vol. 290, no. 5500, pp. 2323–2326, 2000. DOI: https://doi.org/10.1126/science.290.5500.2323.

    Article  Google Scholar 

  152. S. Saxena, M. N. Teli. Comparison and analysis of image-to-image generative adversarial networks: A survey. [Online], Available: https://arxiv.org/abs/2112.12625, 2021.

    Google Scholar 

  153. I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, Canada, pp. 2672–2680, 2014.

    Google Scholar 

  154. M. Mirza, S. Osindero. Conditional generative adversarial nets. [Online], Available: https://arxiv.org/abs/1411.1784, 2014.

    Google Scholar 

  155. O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-assisted Intervention, Springer, Munich, Germany, pp. 234–241, 2015. DOI: https://doi.org/10.1007/978-3-319-24574-4_28.

    Google Scholar 

  156. Y. C. Jing, Y. Z. Yang, Z. L. Feng, J. W. Ye, Y. Z. Yu, M. L. Song. Neural style transfer: A review. IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 11, pp. 3365–3385, 2020. DOI: https://doi.org/10.1109/TVCG.2019.2921336.

    Article  Google Scholar 

  157. Y. H. Song, C. Yang, Y. J. Shen, P. Wang, Q. Huang, C. C. J. Kuo. SPG-Net: Segmentation prediction and guidance network for image inpainting. In Proceedings of British Machine Vision Conference, Newcastle, UK, 2018.

    Google Scholar 

  158. D. Yi, Z. Lei, S. C. Liao, S. Z. Li. Learning face representation from scratch. [Online], Available: https://arxiv.org/abs/1411.7923, 2014.

    Google Scholar 

  159. L. Wang, R. F. Li, K. Wang, J. Chen. Feature representation for facial expression recognition based on FACS and LBP. International Journal of Automation and Computing, vol. 11, no. 5, pp. 459–468, 2014. DOI: https://doi.org/10.1007/s11633-014-0835-0.

    Article  Google Scholar 

  160. X. Zheng, Y. Q. Guo, H. B. Huang, Y. Li, R. He. A survey of deep facial attribute analysis. International Journal of Computer Vision, vol. 128, no. 8, pp. 2002–2034, 2020. DOI: https://doi.org/10.1007/s11263-020-01308-z.

    Article  Google Scholar 

  161. G. B. Huang, M. Mattar, T. Berg, E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In Proceedings of Workshop on Faces In “Real-Life” Images: Detection, Alignment, and Recognition, Marseille, France, Article number inria-321923, 2008.

    Google Scholar 

  162. R. Ranjan, V. M. Patel, R. Chellappa. Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 1, pp. 121–135, 2019. DOI: https://doi.org/10.1109/TPAMI.2017.2781233.

    Article  Google Scholar 

  163. E. M. Hand, R. Chellappa. Attributes for improved attributes: A multi-task network utilizing implicit and explicit relationships for facial attribute classification. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA, pp. 4068–4074, 2017.

    Google Scholar 

  164. H. Han, A. K. Jain, F. Wang, S. G. Shan, X. L. Chen. Heterogeneous face attribute estimation: A deep multi-task learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 11, pp. 2597–2609, 2018. DOI: https://doi.org/10.1109/TPAMI.2017.2738004.

    Article  Google Scholar 

  165. Y. Jang, H. Gunes, I. Patras. SmileNet: Registration-free smiling face detection in the wild. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Venice, Italy, pp. 1581–1589, 2017. DOI: https://doi.org/10.1109/ICCVW.2017.186.

    Google Scholar 

  166. R. Ranjan, S. Sankaranarayanan, C. D. Castillo, R. Chellappa. An all-in-one convolutional neural network for face analysis. In Proceedings of the 12th IEEE International Conference on Automatic Face & Gesture Recognition, IEEE, Washington DC, USA, pp. 17–24, 2017. DOI: https://doi.org/10.1109/FG.2017.137.

    Google Scholar 

  167. S. Li, W. H. Deng. Deep facial expression recognition: A survey. IEEE Transactions on Affective Computing, 2020, to be published. DOI: https://doi.org/10.1109/TAFFC.2020.2981446.

    Google Scholar 

  168. N. Zhang, M. Paluri, M. Ranzato, T. Darrell, L. Bourdev. PANDA: Pose aligned networks for deep attribute modeling. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 1637–1644, 2014. DOI: https://doi.org/10.1109/CVPR.2014.212.

    Google Scholar 

  169. M. N. Kan, S. G. Shan, H. Chang, X. L. Chen. Stacked progressive auto-encoders (SPAE) for face recognition across poses. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 1883–1890, 2014. DOI: https://doi.org/10.1109/CVPR.2014.243.

    Google Scholar 

  170. Y. Wu, Z. G. Wang, Q. Ji. Facial feature tracking under varying facial expressions and face poses based on restricted Boltzmann machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Portland, USA, pp. 3452–3459, 2013. DOI: https://doi.org/10.1109/CVPR.2013.443.

    Google Scholar 

  171. L. Tran, X. Yin, X. M. Liu. Disentangled representation learning GAN for pose-invariant face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 1283–1292, 2017. DOI: https://doi.org/10.1109/CVPR.2017.141.

    Google Scholar 

  172. U. Toseeb, D. R. T. Keeble, E. J. Bryant. The significance of hair for face recognition.. PLoS One, vol. 7, no. 3, Article number e34144, 2012. DOI: https://doi.org/10.1371/journal.pone.0034144.

    Google Scholar 

  173. S. J. Bartel, K. Toews, L. Gronhovd, S. L. Prime. “Do I Know You?” altering hairstyle affects facial recognition. Visual Cognition, vol. 26, no. 3, pp. 149–155, 2018. DOI: https://doi.org/10.1080/13506285.2017.1394412.

    Article  Google Scholar 

  174. N. Kumar, P. Belhumeur, S. Nayar. FaceTracer: A search engine for large collections of images with faces. In Proceedings of the 10th European Conference on Computer Vision, Springer, Marseille, France, pp. 340–353, 2008. DOI: https://doi.org/10.1007/978-3-540-88693-8_25.

    Google Scholar 

  175. H. Y. Li, W. M. Dong, B. G. Hu. Facial image attributes transformation via conditional recycle generative adversarial networks. Journal of Computer Science and Technology, vol. 33, no. 3, pp. 511–521, 2018. DOI: https://doi.org/10.1007/s11390-018-1835-2.

    Article  Google Scholar 

  176. J. S. Pierrard, T. Vetter. Skin detail analysis for face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Minneapolis, USA, 2007. DOI: https://doi.org/10.1109/CVPR.2007.383264.

    Google Scholar 

  177. S. Z. Li. Encyclopedia of Biometrics: I-Z, New York, USA: Springer, 2009.

    Book  Google Scholar 

  178. K. P. Zhang, Z. P. Zhang, Z. F. Li, Y. Qiao. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499–1503, 2016. DOI: https://doi.org/10.1109/LSP.2016.2603342.

    Article  Google Scholar 

  179. K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 770–778, 2016. DOI: https://doi.org/10.1109/CVPR.2016.90.

    Google Scholar 

  180. Y. Choi, M. Choi, M. Kim, J. W. Ha, S. Kim, J. Choo. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8789–8797, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00916.

    Google Scholar 

  181. B. Zhao, B. Chang, Z. Q. Jie, L. Sigal. Modular generative adversarial networks. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 157–173, 2018. DOI: https://doi.org/10.1007/978-3-030-01264-9_10.

    Google Scholar 

  182. A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. M. Lin, A. Desmaison, L. Antiga, A. Lerer. Automatic differentiation In PyTorch. In Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, USA, 2017.

    Google Scholar 

  183. D. P. Kingma, J. Ba. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2014.

    Google Scholar 

  184. Q. Yu, F. Liu, Y. Z. Song, T. Xiang, T. M. Hospedales, C. C. Loy. Sketch me that shoe. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 799–807, 2016. DOI: https://doi.org/10.1109/CVPR.2016.93.

    Google Scholar 

  185. C. Shorten, T. M. Khoshgoftaar. A survey on image data augmentation for deep learning. Journal of Big Data, vol. 6, no. 1, Article number 60, 2019. DOI: https://doi.org/10.1186/s40537-019-0197-0.

    Google Scholar 

  186. Y. X. Wang, C. C. Wu, L. Herranz, J. Van De Weijer, A. Gonzalez-Garcia, B. Raducanu. Transferring GANs: Generating images from limited data. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 220–236, 2018. DOI: https://doi.org/10.1007/978-3-030-01231-1_14.

    Google Scholar 

  187. Y. X. Wang, L. Yu, J. Van De Weijer. DeepI2I: Enabling deep hierarchical image-to-image translation by transferring from GANs. In Proceedings of the 34th in Neural Information Processing Systems, 2020.

    Google Scholar 

  188. A. Shocher, Y. Gandelsman, I. Mosseri, M. Yarom, M. Irani, W. T. Freeman, T. Dekel. Semantic pyramid for image generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 7455–7464, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00748.

    Google Scholar 

  189. S. Ravi, H. Larochelle. Optimization as a model for few-shot learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.

    Google Scholar 

  190. O. Chapelle, B. Scholkopf, A. Zien. Semi-supervised learning. IEEE Transactions on Neural Networks, vol. 20, no. 3, Article number 542, 2009. DOI: https://doi.org/10.1109/TNN.2009.2015974.

    Google Scholar 

  191. M. Oquab, L. Bottou, I. Laptev, J. Sivic. Is object localization for free? — Weakly-supervised learning with convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 685–694, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7298668.

    Google Scholar 

  192. X. L. Wang, K. M. He, A. Gupta. Transitive Invariance for self-supervised visual representation learning. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1338–1347, 2017. DOI: https://doi.org/10.1109/ICCV.2017.149.

    Google Scholar 

  193. R. Pinto, T. Mettler, M. Taisch. Managing supplier delivery reliability risk under limited information: Foundations for a human-in-the-loop DSS. Decision Support Systems, vol. 54, no. 2, pp. 1076–1084, 2013. DOI: https://doi.org/10.1016/j.dss.2012.10.033.

    Article  Google Scholar 

  194. Y. LeCun. Generalization and network design strategies. Connectionism in Perspective, vol. 19, no. 143–155, Article number 18, 1989.

    Google Scholar 

  195. I. O. Tolstikhin, N. Houlsby, A. Kolesnikov, L. Beyer, X. H. Zhai, T. Unterthiner, J. Yung, A. Steiner, D. Keysers, J. Uszkoreit, M. Lucic, A. Dosovitskiy. MLP-mixer: An all-MLP architecture for vision. In Proceedings of the 34th in Neural Information Processing Systems, pp. 24261–24272, 2021.

    Google Scholar 

  196. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 6000–6010, 2017.

    Google Scholar 

  197. K. Lee, H. W. Chang, L. Jiang, H. Zhang, Z. W. Tu, C. Liu. ViTGAN: Training GANs with vision transformers. [Online], Available: https://arxiv.org/abs/2107.04589,2022.

  198. L. Zhang, L. Zhang, X. Q. Mou, D. Zhang. FSIM: A feature similarity index for image quality assessment. IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2378–2386, 2011. DOI: https://doi.org/10.1109/TIP.2011.2109730.

  199. S. Avidan, A. Shamir. Seam carving for content-aware image resizing. ACM Transactions on Graphics, vol. 26, no. 3, Article number 10, 2007. DOI: https://doi.org/10.1145/1276377.1276390.

  200. C. Dong, C. C. Loy, K. M. He, X. O. Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016. DOI: https://doi.org/10.1109/TPAMI.2015.2439281.

  201. Y. Y. Hu, S. Yang, W. H. Yang, L. Y. Duan, J. Y. Liu. Towards coding for human and machine vision: A scalable image coding approach. In Proceedings of IEEE International Conference on Multimedia and Expo, IEEE, London, UK, 2020. DOI: https://doi.org/10.1109/ICME46284.2020.9102750.

  202. E. Wood, T. Baltrušaitis, C. Hewitt, S. Dziadzio, T. J. Cashman, J. Shotton. Fake it till you make it: Face analysis in the wild using synthetic data alone. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 3661–3671, 2021. DOI: https://doi.org/10.1109/ICCV48922.2021.00366.

Acknowledgements

This work was supported by the Grant-in-Aid for Japan Society for the Promotion of Science Fellows, Japan (No. 21F50377). The authors would like to thank the anonymous reviewers and editors for their helpful comments on this manuscript. We would also like to thank Ning Li from NEPU, China, for his help in collecting data, and Professor Paul L. Rosin from Cardiff University, UK, for his insightful feedback.

Author information

Corresponding authors

Correspondence to Hong Liu or Xuebin Qin.

Additional information

Conflicts of interest

The authors declare that they have no conflicts of interest related to this work, and no commercial or associative interests that represent a conflict of interest in connection with the submitted work.

Colored figures are available in the online version at https://link.springer.com/journal/11633

Deng-Ping Fan received the Ph.D. degree from Nankai University, China in 2019. He joined the Inception Institute of Artificial Intelligence (IIAI), UAE in 2019. He is a postdoctoral researcher working with Prof. Luc Van Gool at the Computer Vision Laboratory, ETH Zürich, Switzerland. He has published approximately 50 papers in top journals and conferences such as TPAMI, CVPR, ICCV and ECCV. He won the Best Paper Finalist Award at IEEE CVPR 2019 and was a Best Paper Award Nominee at IEEE CVPR 2020. He was recognized as an outstanding reviewer with a special mention at CVPR 2019, an outstanding reviewer at CVPR 2020 and CVPR 2021, and a high-quality reviewer at ECCV 2020. He has served as a program committee board (PCB) member of IJCAI 2022–2024, a senior program committee (SPC) member of IJCAI 2021, a committee member of the China Society of Image and Graphics (CSIG), an area chair of the NeurIPS 2021 Datasets and Benchmarks Track, an area chair of the MICCAI 2020 Workshop (OMIA7), and an editorial board member of Computer Vision & AI.

His research interests include computer vision, deep learning, and visual attention, especially human visual attention for co-salient object detection, RGB salient object detection, RGB-D salient object detection, and video salient object detection.

Ziling Huang received the B.Sc. degree in electrical engineering from North China Electric Power University, China in 2015, and the M.Sc. degree in electrical engineering from Taiwan Tsing Hua University, Taiwan, China in 2020. She is currently a Ph.D. candidate at the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, University of Tokyo, Japan. She was an intern student at the National Institute of Informatics, Japan in 2019, and at ByteDance, China from 2019 to 2020.

Her research interests include computer vision and machine learning.

Peng Zheng is a master student in the visual computing and communication program at Aalto University, Finland and the University of Trento, Italy. He was a research intern at the Inception Institute of Artificial Intelligence (IIAI), UAE from March 2021 to October 2021. He has been a research assistant at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), UAE since January 2022. He serves as a reviewer for IEEE TPAMI.

His research interests include computer vision and machine learning, especially on common information mining and person search.

Hong Liu received the Ph.D. degree from Xiamen University, China in 2020. He is now a Japan Society for the Promotion of Science (JSPS) fellowship researcher at the National Institute of Informatics, Japan. He has published more than 20 papers in top journals and conferences such as TPAMI, IJCV, TIP, CVPR, ICCV, ECCV and ICLR. He was awarded the Outstanding Doctoral Dissertation Award of the China Society of Image and Graphics and the JSPS International Fellowship, and was named among the Top-100 Chinese New Stars in Artificial Intelligence by Baidu Scholar.

His research interests include large-scale image retrieval, Riemannian-based machine learning, and adversarial learning.

Xuebin Qin received the Ph.D. degree from the University of Alberta, Canada in 2020. Since March 2020, he has been a research fellow at the Department of Computer Vision, MBZUAI, UAE. He has published about 10 papers in vision and robotics conferences such as CVPR, ECCV, BMVC, ICPR, WACV and IROS.

His research interests include highly accurate image segmentation, salient object detection, image labeling, detection, and visual tracking.

Luc Van Gool received the Ph.D. degree in electromechanical engineering from Katholieke Universiteit Leuven, Belgium in 1981. He is currently a professor at Katholieke Universiteit Leuven, Belgium and at ETH Zürich, Switzerland, where he leads computer vision research and also teaches at both institutions. He has been a program committee member of several major computer vision conferences. He has received several best paper awards, won a David Marr Prize and a Koenderink Award, and was nominated Distinguished Researcher by the IEEE Computer Science committee. He is a co-founder of 10 spin-off companies.

His research interests include 3D reconstruction and modelling, object recognition, tracking, and gesture analysis, as well as the combination of those topics.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Fan, DP., Huang, Z., Zheng, P. et al. Facial-sketch Synthesis: A New Challenge. Mach. Intell. Res. 19, 257–287 (2022). https://doi.org/10.1007/s11633-022-1349-9

Keywords

  • Facial sketch synthesis (FSS)
  • facial sketch dataset
  • benchmark
  • attribute
  • style transfer