Facial-sketch Synthesis: A New Challenge

Fan, Deng-Ping; Huang, Ziling; Zheng, Peng; Liu, Hong; Qin, Xuebin; Van Gool, Luc

doi:10.1007/s11633-022-1349-9

Review
Open access
Published: 30 July 2022

Volume 19, pages 257–287 (2022)

Machine Intelligence Research

This article has been updated

Abstract

This paper presents a comprehensive study of facial-sketch synthesis (FSS). Because hand-drawn sketch datasets are costly to obtain, there has been no complete benchmark for assessing the development of FSS algorithms over the last decade. We first introduce a high-quality dataset for FSS, named FS2K, which consists of 2,104 image-sketch pairs spanning three types of sketch styles, image backgrounds, lighting conditions, skin colors, and facial attributes. FS2K differs from previous FSS datasets in difficulty, diversity, and scalability and should thus facilitate the progress of FSS research. Second, we present the largest-scale FSS investigation by reviewing 89 classic methods, including 25 handcrafted feature-based facial-sketch synthesis approaches, 29 general translation methods, and 35 image-to-sketch approaches. In addition, we conduct comprehensive experiments on 19 existing cutting-edge models. Third, we present a simple baseline for FSS, named FSGAN. With only two straightforward components, i.e., facial-aware masking and style-vector expansion, FSGAN surpasses all previous state-of-the-art models on the proposed FS2K dataset by a large margin. Finally, we conclude with lessons learned over the past years and point out several unsolved challenges. Our code is available at https://github.com/DengPingFan/FSGAN.

Change history

29 November 2022
The figures in this article have been updated.

References

X. G. Wang, X. O. Tang. Face photo-sketch synthesis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 11, pp. 1955–1967, 2009. DOI: https://doi.org/10.1109/TPAMI.2008.222.
Article MathSciNet Google Scholar
R. Yi, Y. J. Liu, Y. K. Lai, P. L. Rosin. APDrawingGAN: Generating artistic portrait drawings from face photos with hierarchical GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 10735–10744, 2019. DOI: https://doi.org/10.1109/CVPR.2019.01100.
H. Koshimizu, M. Tominaga, T. Fujiwara, K. Murakami. On KANSEI facial image processing for computerized facial caricaturing system PICASSO. In Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, IEEE, Tokyo, Japan, pp. 294–299, 1999. DOI: https://doi.org/10.1109/ICSMC.1999.816567.
N. Kumar, A. C. Berg, P. N. Belhumeur, S. K. Nayar. Attribute and simile classifiers for face verification. In Proceedings of IEEE 12th International Conference on Computer Vision, IEEE, Kyoto, Japan, pp. 365–372, 2009. DOI: https://doi.org/10.1109/ICCV.2009.5459250.
Google Scholar
H. S. Du, Q. P. Hu, D. F. Qiao, I. Pitas. Robust face recognition via low-rank sparse representation-based classification. International Journal of Automation and Computing, vol. 12, no. 6, pp. 579–587, 2015. DOI: https://doi.org/10.1007/s11633-015-0901-2.
Article Google Scholar
Y. Z. Lu. A novel face recognition algorithm for distinguishing faces with various angles. International Journal of Automation and Computing, vol. 5, no. 2, pp. 193–197, 2008. DOI: https://doi.org/10.1007/s11633-008-0193-x.
Article Google Scholar
V. Jain, E. Learned-Miller. FDDB: A Benchmark for Face Detection in Unconstrained Settings, Technical Report UM-CS-2010-009, Department of Computer Science, University of Massachusetts Amherst, USA, 2010.
Google Scholar
Z. P. Zhang, P. Luo, C. C. Loy, X. O. Tang. Facial landmark detection by deep multi-task learning. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 94–108, 2014. DOI: https://doi.org/10.1007/978-3-319-10599-4_7.
Google Scholar
A. Bulat, G. Tzimiropoulos. How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230, 000 3D facial landmarks). In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1021–1030, 2017. DOI: https://doi.org/10.1109/ICCV.2017.116.
Google Scholar
J. X. Sun, Q. Li, W. N. Wang, J. Zhao, Z. N. Sun. Multicaption text-to-face synthesis: Dataset and algorithm. In Proceedings of the 29th ACM International Conference on Multimedia, ACM, Chengdu, China, pp. 2290–2298, 2021. DOI: https://doi.org/10.1145/3474085.3475391.
Google Scholar
R. Yi, M. F. Xia, Y. J. Liu, Y. K. Lai, P. L. Rosin. Line drawings for face portraits from photos using global and local structure based GANs. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3462–3475, 2021. DOI: https://doi.org/10.1109/TPAMI.2020.2987931.
Article Google Scholar
Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on image Processing, vol. 13, no. 4, pp. 600–612, 2004. DOI: https://doi.org/10.1109/TIP.2003.819861.
Article Google Scholar
J. Y. Zhu, T. Park, P. Isola, A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2242–2251, 2017. DOI: https://doi.org/10.1109/ICCV.2017.244.
Google Scholar
M. Y. Liu, T. Breuel, J. Kautz. Unsupervised image-to-image translation networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 700–708, 2017.
Google Scholar
T. C. Wang, M. Y. Liu, J. Y. Zhu, A. Tao, J. Kautz, B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8798–8807, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00917.
T. Park, M. Y. Liu, T. C. Wang, J. Y. Zhu. Semantic image synthesis with spatially-adaptive normalization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 2332–2341, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00244.
Google Scholar
H. Y. Chang, Z. X. Wang, Y. Y. Chuang. Domain-specific mappings for generative adversarial style transfer. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 573–589, 2020. DOI: https://doi.org/10.1007/978-3-030-58598-3_34.
Google Scholar
R. F. Chen, W. B. Huang, B. H. Huang, F. C. Sun, B. Fang. Reusing discriminators for encoding: Towards unsupervised image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8165–8174, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00819.
Google Scholar
H. Y. Lee, H. Y. Tseng, Q. Mao, J. B. Huang, Y. D. Lu, M. Singh, M. H. Yang. DRIT⁺⁺: Diverse image-to-image translation via disentangled representations. International Journal of Computer Vision, vol. 128, no. 10, pp. 2402–2417, 2020. DOI: https://doi.org/10.1007/s11263-019-01284-z.
Article Google Scholar
D. P. Fan, S. C. Zhang, Y. H. Wu, Y. Liu, M. M. Cheng, B. Ren, P. Rosin, R. R. Ji. Scoot: A perceptual metric for facial sketches. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 5611–5621, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00571.
Google Scholar
H. S. Bhatt, S. Bharadwaj, R. Singh, M. Vatsa. On matching sketches with digital face images. In Proceedings of the 4th IEEE International Conference on Biometrics: Theory, Applications and Systems, IEEE, Washington DC, USA, 2010. DOI: https://doi.org/10.1109/BTAS.2010.5634507.
Google Scholar
W. Zhang, X. G. Wang, X. O. Tang. Coupled information-theoretic encoding for face photo-sketch recognition. In Proceedings of Conference on Computer Vision and Pattern Recognition, IEEE, Colorado Springs, USA, pp. 513–520, 2011. DOI: https://doi.org/10.1109/CVPR.2011.5995324.
Google Scholar
X. B. Gao, N. N. Wang, D. C. Tao, X. L. Li. Face sketch-photo synthesis and retrieval using sparse representation. IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 8, pp. 1213–1226, 2012. DOI: https://doi.org/10.1109/TCSVT.2012.2198090.
Article Google Scholar
I. Berger, A. Shamir, M. Mahler, E. Carter, J. Hodgins. Style and abstraction in portrait sketching. ACM Transactions on Graphics, vol. 32, no. 4, Article number 55, 2013. DOI: https://doi.org/10.1145/2461912.2461964.
Article Google Scholar
R. Yi, Y. J. Liu, Y. K. Lai, P. L. Rosin. Unpaired portrait drawing generation via asymmetric cycle mapping. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8214–8222, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00824.
Google Scholar
C. L. Peng, X. B. Gao, N. N. Wang, J. Li. Face recognition from multiple stylistic sketches: Scenarios, datasets, and evaluation. Pattern Recognition, vol. 84, no. pp. 262–272, 2018. DOI: https://doi.org/10.1016/j.patcog.2018.07.014.
Article Google Scholar
A. M. Martinez, R. Benavente. The AR Face Database, CVC Technical Report 24, CVC, Spain, 1998.
Google Scholar
N. N. Wang, X. B. Gao, D. C. Tao, X. L. Li. Face sketch-photo synthesis under multi-dictionary sparse representation framework. In Proceedings of 6th International Conference on Image and Graphics, IEEE, Hefei, China, pp. 82–87, 2011. DOI: https://doi.org/10.1109/ICIG.2011.112.
Google Scholar
S. C. Zhang, R. R. Ji, J. Hu, X. Q. Lu, X. L. Li. Face sketch synthesis by multidomain adversarial learning. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 5, pp. 1419–1428, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2869574.
Article Google Scholar
M. R. Zhu, J. Li, N. N. Wang, X. B. Gao. Knowledge distillation for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 2, pp. 893–906, 2022. DOI: https://doi.org/10.1109/TNNLS.2020.3030536.
Article Google Scholar
Z. W. Liu, P. Luo, X. G. Wang, X. O. Tang. Deep learning face attributes in the wild. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Santiago, Chile, pp. 3730–3738, 2015. DOI: https://doi.org/10.1109/ICCV.2015.425.
Google Scholar
J. Kim, M. Kim, H. Kang, K. Lee. U-GAT-IT: Unsupervised generative attentional networks with adaptive layer-Instance normalization for image-to-image translation. In Proceedings of the 8th International Conference on Learning Representations, Ababa, Ethiopia, 2020.
Google Scholar
P. Isola, J. Y. Zhu, T. H. Zhou, A. A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5967–5976, 2017. DOI: https://doi.org/10.1109/CVPR.2017.632.
Google Scholar
K. Messer, J. Matas, J. Kittler, K. Jonsson, J. Luettin, G. Maitre. XM2VTSDB: The extended M2VTS database. In Proceedings of the 2nd International Conference on Audio and Video-based Biometric Person Authentication, Springer, Washington DC, USA, pp. 965–966, 1999.
Google Scholar
P. J. Phillips, H. Moon, S. A. Rizvi, P. J. Rauss. The FERET evaluation methodology for face-recognition algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000. DOI: https://doi.org/10.1109/34.879790.
Article Google Scholar
Á. Serrano, I. M. De Diego, C. Conde, E. Cabello, L. L. Shen, L. Bai. Influence of wavelet frequency and orientation in an SVM-based parallel Gabor PCA face verification system. In Proceedings of the 8th International Conference on Intelligent Data Engineering and Automated Learning, Springer, Birmingham, UK, pp. 219–228, 2007. DOI: https://doi.org/10.1007/978-3-540-77226-2_23.
Google Scholar
H. S. Bhatt, S. Bharadwaj, R. Singh, M. Vatsa. Memetically optimized MCWLD for matching sketches with digital face images. IEEE Transactions on Information Forensics and Security, vol. 7, no. 5, pp. 1522–1535, 2012. DOI: https://doi.org/10.1109/TIFS.2012.2204252.
Article Google Scholar
M. Minear, D. C. Park. A lifespan database of adult facial stimuli. Behavior Research Methods, Instruments & Computers, vol. 36, no. 4, pp. 630–633, 2004. DOI: https://doi.org/10.3758/BF03206543.
Article Google Scholar
J. Nishino, T. Kamyama, H. Shira, T. Odaka, H. Ogura. Linguistic knowledge acquisition system on facial caricature drawing system. In Proceedings of IEEE International Fuzzy Systems. IEEE, Seoul, Korea, pp. 1591–1596, 1999. DOI: https://doi.org/10.1109/FUZZY.1999.790142.
Google Scholar
S. Iwashita, Y. Takeda, T. Onisawa. Expressive facial caricature drawing. In Proceedings of IEEE International Fuzzy Systems. IEEE, Seoul, Korea, pp. 1597–1602, 1999. DOI: https://doi.org/10.1109/FUZZY.1999.790143.
Google Scholar
Y. Z. Li, H. Kobatake. Extraction of facial sketch image based on morphological processing. In Proceedings of International Conference on Image Processing, IEEE, Santa Barbara, USA, pp. 316–319, 1997. DOI: https://doi.org/10.1109/ICIP.1997.632104.
Google Scholar
M. Tominaga, S. Fukuoka, K. Murakami, H. Koshimizu. Facial caricaturing with motion caricaturing in PICASSO system. In Proceedings of IEEE/ASME International Conference on Advanced Intelligent Mechatronics, IEEE, Tokyo, Japan, pp. 30, 1997. DOI: https://doi.org/10.1109/AIM.1997.652888.
Chapter Google Scholar
S. E. Brennan. Caricature Generator, Ph. D. dissertation, Massachusetts Institute of Technology, USA, 1982.
Google Scholar
N. N. Wang, D. C. Tao, X. B. Gao, X. L. Li, J. Li. A comprehensive survey to face hallucination. International Journal of Computer Vision, vol. 106, no. 1, pp. 9–30, 2014. DOI: https://doi.org/10.1007/s11263-013-0645-9.
Article Google Scholar
H. Chen, Y. Q. Xu, H. Y. Shum, S. C. Zhu, N. N. Zheng. Example-based facial sketch generation with non-parametric sampling. In Proceedings of the 8th IEEE International Conference on Computer Vision, IEEE, Vancouver, Canada, pp. 433–438, 2001. DOI: https://doi.org/10.1109/ICCV.2001.937657.
Google Scholar
A. V. Nefian, M. H. Hayes III. Face recognition using an embedded HMM. In Proceedings of IEEE Conference on Audio and Video-based Biometric Person Authentication, IEEE, 1999.
Google Scholar
X. B. Gao, J. J. Zhong, J. Li, C. N. Tian. Face sketch synthesis algorithm based on E-HMM and selective ensemble. IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 4, pp. 487–496, 2008. DOI: https://doi.org/10.1109/TCSVT.2008.918770.
Article Google Scholar
M. Eitz, J. Hays, M. Alexa. How do humans sketch objects? ACM Transactions on Graphics, vol. 31, no. 4, Article number 44, 2012. DOI: https://doi.org/10.1145/2185520.2185540.
T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C. L. Zitnick. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 740–755, 2014. DOI: https://doi.org/10.1007/978-3-319-10602-1_48.
Google Scholar
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. H. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, F. F. Li. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015. DOI: https://doi.org/10.1007/11263-015-0816-y.
Article MathSciNet Google Scholar
M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, A. Vedaldi. Describing textures in the wild. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 3606–3613, 2014. DOI: https://doi.org/10.1109/CVPR.2014.461.
Google Scholar
S. Y. Duck. Painter by numbers, wikiart.org, [Online], Available: https://www.kaggle.com/c/painter-by-numbers, 2016.
Google Scholar
M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, B. Schiele. The cityscapes dataset for semantic urban scene understanding. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3213–3223, 2016. DOI: https://doi.org/10.1109/CVPR.2016.350.
Google Scholar
R. Tyleček, R. Šára. Spatial pattern templates for recognition of objects with regular structure. In Proceedings of the 35th German Conference on Pattern Recognition, Springer, Saarbrücken, Germany, pp. 364–374, 2013. DOI: https://doi.org/10.1007/978-3-642-40602-7_39.
Google Scholar
J. Y. Zhu, P. Krähenbühl, E. Shechtman, A. A. Efros. Generative visual manipulation on the natural image manifold. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 597–613, 2016. DOI: https://doi.org/10.1007/978-3-319-46454-1_36.
Google Scholar
A. Yu, K. Grauman. Fine-grained visual comparisons with local learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 192–199, 2014. DOI: https://doi.org/10.1109/CV-PR.2014.32.
Google Scholar
P. Y. Laffont, Z. Ren, X. F. Tao, C. Qian, J. Hays. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Transactions on Graphics, vol. 33, no. 4, Article number 149, 2014. DOI: https://doi.org/10.1145/2601097.2601101.
Google Scholar
Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of IEEE, vol. 86, no. 11, pp. 2278–2324, 1998. DOI: https://doi.org/10.1109/5.726791.
Article Google Scholar
C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie. The caltech-ucsd birds-200-2011 dataset, 2011. [Online], Available: https://authors.library.caltech.edu/27452/1/CUB_200_2011.pdf.
Google Scholar
T. Karras, T. Aila, S. Laine, J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
Google Scholar
N. Silberman, D. Hoiem, P. Kohli, R. Fergus. Indoor segmentation and support inference from RGBD images. In Proceedings of the 12th European Conference on Computer Vision, Springer, Florence, Italy, pp. 746–760, 2012. DOI: https://doi.org/10.1007/978-3-642-33715-4_54.
Google Scholar
B. L. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso, A. Torralba. Scene parsing through ADE20K dataset. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5122–5130, 2017. DOI: https://doi.org/10.1109/CVPR.2017.544.
Google Scholar
Q. Yu, Y. Z. Song, T. Xiang, T. M. Hospedales. Sketchx!-shoe/chair fine-grained SBIR dataset, 2017. [Online], Available: https://sketchx.eecs.qmul.ac.uk/downloads/.
Google Scholar
D. Ha, D. Eck. A neural representation of sketch drawings. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
Google Scholar
Y. H. Jin, J. K. Zhang, M. J. Li, Y. T. Tian, H. C. Zhu, Z. H. Fang. Towards the automatic anime characters creation with generative adversarial networks. [Online], Available: https://arxiv.org/pdf/1708.05509, 2017.
Google Scholar
H. Z. Xu, Y. Gao, F. Yu, T. Darrell. End-to-end learning of driving models from large-scale video datasets. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 3530–3538, 2017. DOI: https://doi.org/10.1109/CVPR.2017.376.
Google Scholar
G. Ros, L. Sellart, J. Materzynska, D. Vazquez, A. M. Lopez. The SYNTHIA dataset: A large collection of synthetic images for semantic segmentation of urban scenes. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3234–3243, 2016. DOI: https://doi.org/10.1109/CVPR.2016.352.
Google Scholar
Z. W. Liu, P. Luo, S. Qiu, X. G. Wang, X. O. Tang. DeepFashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 1096–1104, 2016. DOI: https://doi.org/10.1109/CVPR.2016.124.
Google Scholar
T. Karras, S. Laine, T. Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4396–4405, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00453.
Google Scholar
E. Agustsson, R. Timofte. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Honolulu, USA, pp. 1122–1131, 2017. DOI: https://doi.org/10.1109/CVPRW.2017.150.
Google Scholar
B. Yao, X. Yang, S. C. Zhu. Introduction to a large-scale general purpose ground truth database: Methodology, annotation tool and benchmarks. In Proceedings of the 6th International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, Springer, Ezhou, China, pp. 169–183, 2007, DOI: https://doi.org/10.1007/978-3-540-74198-5_14.
Chapter Google Scholar
J. Krause, M. Stark, J. Deng, F. F. Li. 3D object representations for fine-grained categorization. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Sydney, Australia, pp. 554–561, 2013. DOI: https://doi.org/10.1109/ICCVW.2013.77.
Google Scholar
F. Yu, A. Seff, Y. D. Zhang, S. R. Song, T. Funkhouser, J. X. Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. [Online], Available: https://arxiv.org/abs/1506.03365, 2015.
Google Scholar
Q. S. Liu, X. O. Tang, H. L. Jin, H. Q. Lu, S. D. Ma. A nonlinear approach for face sketch synthesis and recognition. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, San Diego, USA, pp. 1005–1010, 2005. DOI: https://doi.org/10.1109/CVPR.2005.39.
Google Scholar
Z. J. Xu, H. Chen, S. C. Zhu, J. B. Luo. A hierarchical compositional model for face representation and sketching. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 6, pp. 955–969, 2008. DOI: https://doi.org/10.1109/TPAMI.2008.50.
Article Google Scholar
W. Zhang, X. G. Wang, X. O. Tang. Lighting and pose robust face sketch synthesis. In Proceedings of the 11th European Conference on Computer Vision, Springer, Heraklion, Greece, pp. 420–433, 2010. DOI: https://doi.org/10.1007/978-3-642-15567-3_31.
Google Scholar
N. Y. Ji, X. J. Chai, S. G. Shan, X. L. Chen. Local regression model for automatic face sketch generation. In Proceedings of the 6th International Conference on Image and Graphics, IEEE, Hefei, China, pp. 412–417, 2011. DOI: https://doi.org/10.1109/ICIG.2011.84.
Google Scholar
L. Chang, M. Q. Zhou, X. M. Deng, Z. K. Wu, Y. J. Han. Face sketch synthesis via multivariate output regression. In Proceedings of the 14th International Conference on Human-computer Interaction, Springer, Orlando, USA, pp. 555–561, 2011. DOI: https://doi.org/10.1007/978-3-642-21602-2_60.
Google Scholar
J. W. Zhang, N. N. Wang, X. B. Gao, D. C. Tao, X. L. Li. Face sketch-photo synthesis based on support vector regression. In Proceedings of the 18th IEEE International Conference on Image Processing, IEEE, Brussels, Belgium, pp. 1125–1128, 2011. DOI: https://doi.org/10.1109/ICIP.2011.6115625.
Google Scholar
S. L. Wang, L. Zhang, Y. Liang, Q. Pan. Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 2216–2223, 2012. DOI: https://doi.org/10.1109/CVPR.2012.6247930.
Google Scholar
H. Zhou, Z. H. Kuang, K. Y. K. Wong. Markov weight fields for face sketch synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 1091–1097, 2012. DOI: https://doi.org/10.1109/CVPR.2012.6247788.
Google Scholar
T. H. Wang, J. Collomosse, A. Hunter, D. Greig. Learnable stroke models for example-based portrait painting. In Proceedings of British Machine Vision Conference, Bristol, UK, 2013.
Google Scholar
N. N. Wang, D. C. Tao, X. B. Gao, X. L. Li, J. Li. Transductive face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 9, pp. 1364–1376, 2013. DOI: https://doi.org/10.1109/TNNLS.2013.2258174.
Article Google Scholar
D. A. Huang, Y. C. F. Wang. Coupled dictionary and feature space learning with applications to cross-domain image synthesis and recognition. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Sydney, Australia, pp. 2496–2503, 2013. DOI: https://doi.org/10.1109/ICCV.2013.310.
Google Scholar
Y. B. Song, L. C. Bao, Q. X. Yang, M. H. Yang. Real-time exemplar-based face sketch synthesis. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 800–813, 2014. DOI: https://doi.org/10.1007/978-3-319-10599-4_51.
Google Scholar
S. C. Zhang, X. B. Gao, N. N. Wang, J. Li. Robust face sketch style synthesis. IEEE Transactions on Image Processing, vol. 25, no. 1, pp. 220–232, 2016. DOI: https://doi.org/10.1109/TIP.2015.2501755.
Article MathSciNet MATH Google Scholar
C. L. Peng, X. B. Gao, N. N. Wang, J. Li. Superpixel-based face sketch-photo synthesis. IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 2, pp. 288–299, 2017. DOI: https://doi.org/10.1109/TCSVT.2015.2502861.
Article Google Scholar
C. L. Peng, X. B. Gao, N. N. Wang, D. C. Tao, X. L. Li, J. Li. Multiple representations-based face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 11, pp. 2201–2215, 2016. DOI: https://doi.org/10.1109/TNNLS.2015.2464681.
Article Google Scholar
Y. Li, Y. Z. Song, T. M. Hospedales, S. G. Gong. Freehand sketch synthesis with deformable stroke models. International Journal of Computer Vision, vol. 122, no. 1, pp. 169–190, 2017. DOI: https://doi.org/10.1007/s11263-016-0963-9.
Article MathSciNet Google Scholar
J. Li, X. Y. Yu, C. L. Peng, N. N. Wang. Adaptive representation-based face sketch-photo synthesis. Neurocomputing, vol. 269, pp. 152–159, 2017. DOI: https://doi.org/10.1016/j.neucom.2016.10.095.
Article Google Scholar
N. N. Wang, X. B. Gao, J. Li. Random sampling for fast face sketch synthesis. Pattern Recognition, vol. 76, pp. 215–227, 2018. DOI: https://doi.org/10.1016/j.patcog.2017.11.008.
Article Google Scholar
Y. F. Men, Z. H. Lian, Y. M. Tang, J. G. Xiao. A common framework for interactive texture transfer. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6353–6362, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00665.
Google Scholar
L. A. Gatys, A. S. Ecker, M. Bethge. A neural algorithm of artistic style. [Online], Available: https://arxiv.org/abs/1508.06576, 2015.
Google Scholar
L. A. Gatys, A. S. Ecker, M. Bethge. Image style transfer using convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2414–2423, 2016. DOI: https://doi.org/10.1109/CVPR.2016.265.
Google Scholar
J. Johnson, A. Alahi, F. F. Li. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 694–711, 2016. DOI: https://doi.org/10.1007/978-3-319-46475-6_43.
Google Scholar
D. Ulyanov, V. Lebedev, A. Vedaldi, V. S. Lempitsky. Texture networks: Feed-forward synthesis of textures and stylized images. In Proceedings of the 33rd International Conference on International Conference on Machine Learning, New York, USA, pp. 1349–1357, 2016.
Google Scholar
T. Q. Chen, M. Schmidt. Fast patch-based style transfer of arbitrary style. [Online], Available: https://arxiv.org/pdf/1612.04337, 2016.
Google Scholar
V. Dumoulin, J. Shlens, M. Kudlur. A learned representation for artistic style. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
Google Scholar
D. Ulyanov, A. Vedaldi, V. Lempitsky. Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4105–4113, 2017. DOI: https://doi.org/10.1109/CVPR.2017.437.
Google Scholar
X. Huang, S. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1510–1519, 2017. DOI: https://doi.org/10.1109/ICCV.2017.167.
Google Scholar
Y. J. Li, C. Fang, J. M. Yang, Z. W. Wang, X. Lu, M. H. Yang. Universal style transfer via feature transforms. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 385–395, 2017.
Google Scholar
Y. Chen, Y. K. Lai, Y. J. Liu. CartoonGAN: Generative adversarial networks for photo cartoonization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 9465–9474, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00986.
Google Scholar
R. Abdal, Y. P. Qin, P. Wonka. Image2StyleGAN: How to embed images into the StyleGAN latent space? In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 4431–4440, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00453.
Google Scholar
D. Kotovenko, M. Wright, A. Heimbrecht, B. Ommer. Rethinking style transfer: From pixels to parameterized brushstrokes. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 12191–12200, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.01202.
Google Scholar
E. Richardson, Y. Alaluf, O. Patashnik, Y. Nitzan, Y. Azar, S. Shapiro, D. Cohen-Or. Encoding in style: A StyleGAN encoder for image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 2287–2296, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.00232.
Google Scholar
Z. L. Yi, H. Zhang, P. Tan, M. L. Gong. DualGAN: Unsupervised dual learning for image-to-image translation. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2868–2876, 2017. DOI: https://doi.org/10.1109/ICCV.2017.310.
Google Scholar
T. Kim, M. Cha, H. Kim, J. K. Lee, J. Kim. Learning to discover cross-domain relations with generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 1857–1865, 2017.
Google Scholar
J. Y. Zhu, R. Zhang, D. Pathak, T. Darrell, A. A. Efros, O. Wang, E. Shechtman. Toward multimodal image-to-image translation. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 465–476, 2017.
Google Scholar
X. Huang, M. Y. Liu, S. Belongie, J. Kautz. Multimodal unsupervised image-to-image translation. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 179–196, 2018. DOI: https://doi.org/10.1007/978-3-030-01219-9_11.
Google Scholar
P. Zhang, B. Zhang, D. Chen, L. Yuan, F. Wen. Cross-domain correspondence learning for exemplar-based image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5142–5152, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00519.
L. M. Jiang, C. X. Zhang, M. Y. Huang, C. X. Liu, J. P. Shi, C. C. Loy. TSIT: A simple and versatile framework for image-to-image translation. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 206–222, 2020. DOI: https://doi.org/10.1007/978-3-030-58580-8_13.
Y. H. Zhao, R. H. Wu, H. Dong. Unpaired image-to-image translation using adversarial consistency loss. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 800–815, 2020. DOI: https://doi.org/10.1007/978-3-030-58545-7_46.
X. R. Zhou, B. Zhang, T. Zhang, P. Zhang, J. M. Bao, D. Chen, Z. F. Zhang, F. Wen. CoCosNet v2: Full-resolution correspondence learning for image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 11460–11470, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.01130.
A. P. Chen, R. Y. Liu, L. Xie, Z. Chen, H. Su, J. Y. Yu. SofGAN: A portrait image generator with dynamic styling. ACM Transactions on Graphics, vol. 41, no. 1, Article number 1, 2022. DOI: https://doi.org/10.1145/3470848.
L. L. Zhang, L. Lin, X. Wu, S. Y. Ding, L. Zhang. End-to-end photo-sketch generation via fully convolutional representation learning. In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, ACM, Shanghai, China, pp. 627–634, 2015. DOI: https://doi.org/10.1145/2671188.2749321.
M. R. Zhu, N. N. Wang, X. B. Gao, J. Li. Deep graphical feature learning for face sketch synthesis. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, pp. 3574–3580, 2017.
P. Sangkloy, J. W. Lu, C. Fang, F. Yu, J. Hays. Scribbler: Controlling deep image synthesis with sketch and color. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 6836–6845, 2017. DOI: https://doi.org/10.1109/CVPR.2017.723.
M. J. Zhang, N. N. Wang, Y. S. Li, R. X. Wang, X. B. Gao. Face sketch synthesis from coarse to fine. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, California, USA, pp. 7558–7565, 2018. DOI: https://doi.org/10.1609/aaai.v32i1.12224.
W. Q. Xian, P. Sangkloy, V. Agrawal, A. Raj, J. W. Lu, C. Fang, F. Yu, J. Hays. TextureGAN: Controlling deep image synthesis with texture patches. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8456–8465, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00882.
J. F. Song, K. Y. Pang, Y. Z. Song, T. Xiang, T. M. Hospedales. Learning to sketch with shortcut cycle consistency. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 801–810, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00090.
Y. Y. Lu, S. Z. Wu, Y. W. Tai, C. K. Tang. Image generation from sketch constraint using contextual GAN. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 213–228, 2018. DOI: https://doi.org/10.1007/978-3-030-01270-0_13.
S. C. Zhang, R. R. Ji, J. Hu, Y. Gao, C. W. Lin. Robust face sketch synthesis via generative adversarial fusion of priors and parametric sigmoid. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 1163–1169, 2018.
M. J. Zhang, N. Wang, Y. Li, X. Gao. Markov random neural fields for face sketch synthesis. In Proceedings of International Joint Conferences on Artificial Intelligence, Stockholm, Sweden, pp. 7558–7565, 2018.
L. D. Wang, V. Sindagi, V. Patel. High-quality facial photo-sketch synthesis using multi-adversarial networks. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, IEEE, Xi'an, China, pp. 83–90, 2018. DOI: https://doi.org/10.1109/FG.2018.00022.
M. J. Zhang, R. X. Wang, X. B. Gao, J. Li, D. C. Tao. Dual-transfer face sketch-photo synthesis. IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 642–657, 2019. DOI: https://doi.org/10.1109/TIP.2018.2869688.
H. Kazemi, M. Iranmanesh, A. Dabouei, S. Soleymani, N. M. Nasrabadi. Facial attributes guided deep sketch-to-photo synthesis. In Proceedings of IEEE Winter Applications of Computer Vision Workshops, IEEE, Lake Tahoe, USA, 2018. DOI: https://doi.org/10.1109/WACVW.2018.00006.
H. Kazemi, F. Taherkhani, N. M. Nasrabadi. Unsupervised facial geometry learning for sketch to photo synthesis. In Proceedings of International Conference of the Biometrics Special Interest Group, IEEE, Darmstadt, Germany, 2018.
S. You, N. You, M. X. Pan. PI-REC: Progressive image reconstruction network with edge and color domain. [Online], Available: https://arxiv.org/abs/1903.10146, 2019.
M. J. Zhang, N. N. Wang, Y. S. Li, X. B. Gao. Deep latent low-rank representation for face sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3109–3123, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2890017.
M. R. Zhu, J. Li, N. N. Wang, X. B. Gao. A deep collaborative framework for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3096–3108, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2890018.
M. J. Zhang, Y. S. Li, N. N. Wang, Y. Chi, X. B. Gao. Cascaded face sketch synthesis under various illuminations. IEEE Transactions on Image Processing, vol. 29, pp. 1507–1521, 2019. DOI: https://doi.org/10.1109/TIP.2019.2942514.
M. R. Zhu, N. N. Wang, X. B. Gao, J. Li, Z. F. Li. Face photo-sketch synthesis via knowledge transfer. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, pp. 1048–1054, 2019.
Y. J. Li, C. Fang, A. Hertzmann, E. Shechtman, M. H. Yang. Im2Pencil: Controllable pencil illustration from photographs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1525–1534, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00162.
A. Ghosh, R. Zhang, P. Dokania, O. Wang, A. Efros, P. Torr, E. Shechtman. Interactive sketch & fill: Multiclass sketch-to-image translation. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 1171–1180, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00126.
X. R. Wang, J. Z. Yu. Learning to cartoonize using white-box cartoon representations. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8087–8096, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00811.
C. Y. Gao, Q. Liu, Q. Xu, L. M. Wang, J. Z. Liu, C. Q. Zou. SketchyCOCO: Image generation from freehand scene sketches. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5173–5182, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00522.
S. Yang, Z. Y. Wang, J. Y. Liu, Z. M. Guo. Deep plastic surgery: Robust and controllable image editing with human-drawn sketches. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 601–617, 2020. DOI: https://doi.org/10.1007/978-3-030-58555-6_36.
S. Y. Chen, W. C. Su, L. Gao, S. H. Xia, H. B. Fu. DeepFaceDrawing: Deep generation of face images from sketches. ACM Transactions on Graphics, vol. 39, no. 4, Article number 72, 2020. DOI: https://doi.org/10.1145/3386569.3392386.
J. Yu, X. X. Xu, F. Gao, S. J. Shi, M. Wang, D. C. Tao, Q. M. Huang. Toward realistic face photo-sketch synthesis via composition-aided GANs. IEEE Transactions on Cybernetics, vol. 51, no. 9, pp. 4350–4362, 2021. DOI: https://doi.org/10.1109/TCYB.2020.2972944.
Y. K. Fang, W. H. Deng, J. P. Du, J. N. Hu. Identity-aware CycleGAN for face photo-sketch synthesis and recognition. Pattern Recognition, vol. 102, Article number 107249, 2020. DOI: https://doi.org/10.1016/j.patcog.2020.107249.
Y. Lin, S. G. Ling, K. R. Fu, P. Cheng. An identity-preserved model for face sketch-photo synthesis. IEEE Signal Processing Letters, vol. 27, pp. 1095–1099, 2020. DOI: https://doi.org/10.1109/LSP.2020.3005039.
C. L. Peng, N. N. Wang, J. Li, X. B. Gao. Universal face photo-sketch style transfer via multiview domain translation. IEEE Transactions on Image Processing, vol. 29, pp. 8519–8534, 2020. DOI: https://doi.org/10.1109/TIP.2020.3016502.
K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
S. C. Duan, Z. X. Chen, Q. M. J. Wu, L. Cai, D. Lu. Multi-scale gradients self-attention residual learning for face photo-sketch transformation. IEEE Transactions on Information Forensics and Security, vol. 16, pp. 1218–1230, 2020. DOI: https://doi.org/10.1109/TIFS.2020.3031386.
S. Y. Wang, D. Bau, J. Y. Zhu. Sketch your own GAN. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 14030–14040, 2021. DOI: https://doi.org/10.1109/ICCV48922.2021.01379.
A. K. Bhunia, S. Khan, H. Cholakkal, R. M. Anwer, F. S. Khan, J. Laaksonen, M. Felsberg. DoodleFormer: Creative sketch drawing with transformers. [Online], Available: https://arxiv.org/abs/2112.03258, 2021.
H. Abdi, L. J. Williams. Principal component analysis. WIREs Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010. DOI: https://doi.org/10.1002/wics.101.
X. O. Tang, X. G. Wang. Face photo recognition using sketch. In Proceedings. International Conference on Image Processing, IEEE, Rochester, USA, pp. I–257–I–260, 2002. DOI: https://doi.org/10.1109/ICIP.2002.1038008.
X. O. Tang, X. G. Wang. Face sketch synthesis and recognition. In Proceedings of the 9th IEEE International Conference on Computer Vision, IEEE, Nice, France, pp. 687–694, 2003. DOI: https://doi.org/10.1109/ICCV.2003.1238414.
X. O. Tang, X. G. Wang. Face sketch recognition. IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 50–57, 2004. DOI: https://doi.org/10.1109/TCSVT.2003.818353.
S. T. Roweis, L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, vol. 290, no. 5500, pp. 2323–2326, 2000. DOI: https://doi.org/10.1126/science.290.5500.2323.
S. Saxena, M. N. Teli. Comparison and analysis of image-to-image generative adversarial networks: A survey. [Online], Available: https://arxiv.org/abs/2112.12625, 2021.
I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, Canada, pp. 2672–2680, 2014.
M. Mirza, S. Osindero. Conditional generative adversarial nets. [Online], Available: https://arxiv.org/abs/1411.1784, 2014.
O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-assisted Intervention, Springer, Munich, Germany, pp. 234–241, 2015. DOI: https://doi.org/10.1007/978-3-319-24574-4_28.
Y. C. Jing, Y. Z. Yang, Z. L. Feng, J. W. Ye, Y. Z. Yu, M. L. Song. Neural style transfer: A review. IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 11, pp. 3365–3385, 2020. DOI: https://doi.org/10.1109/TVCG.2019.2921336.
Y. H. Song, C. Yang, Y. J. Shen, P. Wang, Q. Huang, C. C. J. Kuo. SPG-Net: Segmentation prediction and guidance network for image inpainting. In Proceedings of British Machine Vision Conference, Newcastle, UK, 2018.
D. Yi, Z. Lei, S. C. Liao, S. Z. Li. Learning face representation from scratch. [Online], Available: https://arxiv.org/abs/1411.7923, 2014.
L. Wang, R. F. Li, K. Wang, J. Chen. Feature representation for facial expression recognition based on FACS and LBP. International Journal of Automation and Computing, vol. 11, no. 5, pp. 459–468, 2014. DOI: https://doi.org/10.1007/s11633-014-0835-0.
X. Zheng, Y. Q. Guo, H. B. Huang, Y. Li, R. He. A survey of deep facial attribute analysis. International Journal of Computer Vision, vol. 128, no. 8, pp. 2002–2034, 2020. DOI: https://doi.org/10.1007/s11263-020-01308-z.
G. B. Huang, M. Mattar, T. Berg, E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In Proceedings of Workshop on Faces In “Real-Life” Images: Detection, Alignment, and Recognition, Marseille, France, Article number inria-321923, 2008.
R. Ranjan, V. M. Patel, R. Chellappa. Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 1, pp. 121–135, 2019. DOI: https://doi.org/10.1109/TPAMI.2017.2781233.
E. M. Hand, R. Chellappa. Attributes for improved attributes: A multi-task network utilizing implicit and explicit relationships for facial attribute classification. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA, pp. 4068–4074, 2017.
H. Han, A. K. Jain, F. Wang, S. G. Shan, X. L. Chen. Heterogeneous face attribute estimation: A deep multi-task learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 11, pp. 2597–2609, 2018. DOI: https://doi.org/10.1109/TPAMI.2017.2738004.
Y. Jang, H. Gunes, I. Patras. SmileNet: Registration-free smiling face detection in the wild. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Venice, Italy, pp. 1581–1589, 2017. DOI: https://doi.org/10.1109/ICCVW.2017.186.
R. Ranjan, S. Sankaranarayanan, C. D. Castillo, R. Chellappa. An all-in-one convolutional neural network for face analysis. In Proceedings of the 12th IEEE International Conference on Automatic Face & Gesture Recognition, IEEE, Washington DC, USA, pp. 17–24, 2017. DOI: https://doi.org/10.1109/FG.2017.137.
S. Li, W. H. Deng. Deep facial expression recognition: A survey. IEEE Transactions on Affective Computing, 2020, to be published. DOI: https://doi.org/10.1109/TAFFC.2020.2981446.
N. Zhang, M. Paluri, M. Ranzato, T. Darrell, L. Bourdev. PANDA: Pose aligned networks for deep attribute modeling. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 1637–1644, 2014. DOI: https://doi.org/10.1109/CVPR.2014.212.
M. N. Kan, S. G. Shan, H. Chang, X. L. Chen. Stacked progressive auto-encoders (SPAE) for face recognition across poses. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 1883–1890, 2014. DOI: https://doi.org/10.1109/CVPR.2014.243.
Y. Wu, Z. G. Wang, Q. Ji. Facial feature tracking under varying facial expressions and face poses based on restricted Boltzmann machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Portland, USA, pp. 3452–3459, 2013. DOI: https://doi.org/10.1109/CVPR.2013.443.
L. Tran, X. Yin, X. M. Liu. Disentangled representation learning GAN for pose-invariant face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 1283–1292, 2017. DOI: https://doi.org/10.1109/CVPR.2017.141.
U. Toseeb, D. R. T. Keeble, E. J. Bryant. The significance of hair for face recognition. PLoS One, vol. 7, no. 3, Article number e34144, 2012. DOI: https://doi.org/10.1371/journal.pone.0034144.
S. J. Bartel, K. Toews, L. Gronhovd, S. L. Prime. “Do I Know You?” altering hairstyle affects facial recognition. Visual Cognition, vol. 26, no. 3, pp. 149–155, 2018. DOI: https://doi.org/10.1080/13506285.2017.1394412.
N. Kumar, P. Belhumeur, S. Nayar. FaceTracer: A search engine for large collections of images with faces. In Proceedings of the 10th European Conference on Computer Vision, Springer, Marseille, France, pp. 340–353, 2008. DOI: https://doi.org/10.1007/978-3-540-88693-8_25.
H. Y. Li, W. M. Dong, B. G. Hu. Facial image attributes transformation via conditional recycle generative adversarial networks. Journal of Computer Science and Technology, vol. 33, no. 3, pp. 511–521, 2018. DOI: https://doi.org/10.1007/s11390-018-1835-2.
J. S. Pierrard, T. Vetter. Skin detail analysis for face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Minneapolis, USA, 2007. DOI: https://doi.org/10.1109/CVPR.2007.383264.
S. Z. Li. Encyclopedia of Biometrics: I-Z, New York, USA: Springer, 2009.
K. P. Zhang, Z. P. Zhang, Z. F. Li, Y. Qiao. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499–1503, 2016. DOI: https://doi.org/10.1109/LSP.2016.2603342.
K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 770–778, 2016. DOI: https://doi.org/10.1109/CVPR.2016.90.
Y. Choi, M. Choi, M. Kim, J. W. Ha, S. Kim, J. Choo. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8789–8797, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00916.
B. Zhao, B. Chang, Z. Q. Jie, L. Sigal. Modular generative adversarial networks. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 157–173, 2018. DOI: https://doi.org/10.1007/978-3-030-01264-9_10.
A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. M. Lin, A. Desmaison, L. Antiga, A. Lerer. Automatic differentiation in PyTorch. In Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, USA, 2017.
D. P. Kingma, J. Ba. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
Q. Yu, F. Liu, Y. Z. Song, T. Xiang, T. M. Hospedales, C. C. Loy. Sketch me that shoe. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 799–807, 2016. DOI: https://doi.org/10.1109/CVPR.2016.93.
C. Shorten, T. M. Khoshgoftaar. A survey on image data augmentation for deep learning. Journal of Big Data, vol. 6, no. 1, Article number 60, 2019. DOI: https://doi.org/10.1186/s40537-019-0197-0.
Y. X. Wang, C. C. Wu, L. Herranz, J. Van De Weijer, A. Gonzalez-Garcia, B. Raducanu. Transferring GANs: Generating images from limited data. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 220–236, 2018. DOI: https://doi.org/10.1007/978-3-030-01231-1_14.
Y. X. Wang, L. Yu, J. Van De Weijer. DeepI2I: Enabling deep hierarchical image-to-image translation by transferring from GANs. In Proceedings of the 34th International Conference on Neural Information Processing Systems, 2020.
A. Shocher, Y. Gandelsman, I. Mosseri, M. Yarom, M. Irani, W. T. Freeman, T. Dekel. Semantic pyramid for image generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 7455–7464, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00748.
S. Ravi, H. Larochelle. Optimization as a model for few-shot learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
O. Chapelle, B. Scholkopf, A. Zien. Semi-supervised learning. IEEE Transactions on Neural Networks, vol. 20, no. 3, Article number 542, 2009. DOI: https://doi.org/10.1109/TNN.2009.2015974.
M. Oquab, L. Bottou, I. Laptev, J. Sivic. Is object localization for free? — Weakly-supervised learning with convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 685–694, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7298668.
X. L. Wang, K. M. He, A. Gupta. Transitive invariance for self-supervised visual representation learning. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1338–1347, 2017. DOI: https://doi.org/10.1109/ICCV.2017.149.
R. Pinto, T. Mettler, M. Taisch. Managing supplier delivery reliability risk under limited information: Foundations for a human-in-the-loop DSS. Decision Support Systems, vol. 54, no. 2, pp. 1076–1084, 2013. DOI: https://doi.org/10.1016/j.dss.2012.10.033.
Y. LeCun. Generalization and network design strategies. Connectionism in Perspective, vol. 19, no. 143–155, Article number 18, 1989.
I. O. Tolstikhin, N. Houlsby, A. Kolesnikov, L. Beyer, X. H. Zhai, T. Unterthiner, J. Yung, A. Steiner, D. Keysers, J. Uszkoreit, M. Lucic, A. Dosovitskiy. MLP-mixer: An all-MLP architecture for vision. In Proceedings of the 35th International Conference on Neural Information Processing Systems, pp. 24261–24272, 2021.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 6000–6010, 2017.
K. Lee, H. W. Chang, L. Jiang, H. Zhang, Z. W. Tu, C. Liu. ViTGAN: Training GANs with vision transformers. [Online], Available: https://arxiv.org/abs/2107.04589, 2022.
L. Zhang, L. Zhang, X. Q. Mou, D. Zhang. FSIM: A feature similarity index for image quality assessment. IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2378–2386, 2011. DOI: https://doi.org/10.1109/TIP.2011.2109730.
S. Avidan, A. Shamir. Seam carving for content-aware image resizing. ACM Transactions on Graphics, vol. 26, no. 3, pp. 10–1–10–9, 2007. DOI: https://doi.org/10.1145/1276377.1276390.
C. Dong, C. C. Loy, K. M. He, X. O. Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016. DOI: https://doi.org/10.1109/TPAMI.2015.2439281.
Y. Y. Hu, S. Yang, W. H. Yang, L. Y. Duan, J. Y. Liu. Towards coding for human and machine vision: A scalable image coding approach. In Proceedings of IEEE International Conference on Multimedia and Expo, IEEE, London, UK, 2020. DOI: https://doi.org/10.1109/ICME46284.2020.9102750.
E. Wood, T. Baltrušaitis, C. Hewitt, S. Dziadzio, T. J. Cashman, J. Shotton. Fake it till you make it: Face analysis in the wild using synthetic data alone. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 3661–3671, 2021. DOI: https://doi.org/10.1109/ICCV48922.2021.00366.

Acknowledgements

This work was supported by the Grant-in-Aid for Japan Society for the Promotion of Science Fellows, Japan (No. 21F50377). The authors would like to thank the anonymous reviewers and editors for their helpful comments on this manuscript. We would like to thank Ning Li from NEPU, China for his help in collecting data and Professor Paul L. Rosin from Cardiff University, UK for the insightful feedback.

Author information

These authors contribute equally to this work

Authors and Affiliations

Computer Vision Laboratory, ETH Zürich, Zürich, 8092, Switzerland
Deng-Ping Fan & Luc Van Gool
Information and Communication Engineering, University of Tokyo, Tokyo, 113-8654, Japan
Ziling Huang
Computer Vision, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Peng Zheng & Xuebin Qin
Digital Content and Media Sciences Research Division, National Institute of Informatics, Tokyo, 101-8430, Japan
Hong Liu

Authors

Deng-Ping Fan
Ziling Huang
Peng Zheng
Hong Liu
Xuebin Qin
Luc Van Gool

Corresponding authors

Correspondence to Hong Liu or Xuebin Qin.

Additional information

Conflicts of interests

The authors declare that they have no conflicts of interest to this work. We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Colored figures are available in the online version at https://link.springer.com/journal/11633

Deng-Ping Fan received the Ph.D. degree from Nankai University, China in 2019. He joined the Inception Institute of Artificial Intelligence (IIAI), UAE in 2019. He is a Postdoctoral Researcher, working with Prof. Luc Van Gool in the Computer Vision Laboratory, ETH Zürich, Switzerland. He has published approximately 50 papers in top journals and conferences such as TPAMI, CVPR, ICCV, and ECCV. He won the Best Paper Finalist Award at IEEE CVPR 2019 and was a Best Paper Award Nominee at IEEE CVPR 2020. He was recognized as the CVPR 2019 outstanding reviewer with a special mention award, the CVPR 2020 outstanding reviewer, the ECCV 2020 high-quality reviewer, and the CVPR 2021 outstanding reviewer. He served as a program committee board (PCB) member of IJCAI 2022–2024, a senior program committee (SPC) member of IJCAI 2021, a committee member of China Society of Image and Graphics (CSIG), an area chair in NeurIPS 2021 Datasets and Benchmarks Track, an area chair in MICCAI 2020 Workshop (OMIA7), and an editorial board member of Computer Vision & AI.

His research interests include computer vision, deep learning, and visual attention, especially the human vision on co-salient object detection, RGB salient object detection, RGB-D salient object detection, and video salient object detection.

Ziling Huang received the B. Sc. degree in electrical engineering from North China Electric Power University, China in 2015, and the M. Sc. degree in electrical engineering from Taiwan Tsing Hua University, Taiwan, China in 2020. She is currently a Ph. D. degree candidate at the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, University of Tokyo, Japan. She was an intern student at the National Institute of Informatics, Japan in 2019, and at ByteDance, China from 2019 to 2020.

Her research interests include computer vision and machine learning.

Peng Zheng is a master's student in the visual computing and communication program at Aalto University, Finland and University of Trento, Italy. He was a research intern at Inception Institute of Artificial Intelligence (IIAI), UAE from March 2021 to October 2021. He has been a research assistant at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), UAE since January 2022. He serves as a reviewer for IEEE TPAMI.

His research interests include computer vision and machine learning, especially on common information mining and person search.

Hong Liu received the Ph. D. degree from Xiamen University, China in 2020. He is now a Japan Society for the Promotion of Science Fellowship researcher at the National Institute of Informatics, Japan. He has published more than 20 papers in top journals and conferences such as TPAMI, IJCV, TIP, CVPR, ICCV, ECCV, and ICLR. He was awarded the Outstanding Doctoral Dissertation Award of the China Society of Image and Graphics, JSPS International Fellowship, and Top-100 Chinese New Stars in Artificial Intelligence by Baidu Scholar.

His research interests include large-scale image retrieval, Riemannian-based machine learning, and adversarial learning.

Xuebin Qin received the Ph. D. degree from University of Alberta, Canada in 2020. Since March 2020, he has been a research fellow at the Department of Computer Vision, MBZUAI, UAE. He has published about 10 papers in vision and robotics conferences such as CVPR, ECCV, BMVC, ICPR, WACV, and IROS.

His research interests include highly accurate image segmentation, salient object detection, image labeling, detection and vision tracking.

Luc Van Gool received the Ph. D. degree in electromechanical engineering from Katholieke Universiteit Leuven, Belgium in 1981. Currently, he is a professor at Katholieke Universiteit Leuven in Belgium and the ETH in Switzerland. He leads computer vision research and teaches at both institutions. He has been a program committee member of several major computer vision conferences. He received several Best Paper awards, won a David Marr Prize and a Koenderink Award, and was nominated Distinguished Researcher by the IEEE Computer Science Committee. He is a co-founder of 10 spin-off companies.

His interests include 3D reconstruction and modelling, object recognition, tracking, and gesture analysis, and the combination of those.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Fan, DP., Huang, Z., Zheng, P. et al. Facial-sketch Synthesis: A New Challenge. Mach. Intell. Res. 19, 257–287 (2022). https://doi.org/10.1007/s11633-022-1349-9

Received: 30 March 2022
Accepted: 14 June 2022
Published: 30 July 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s11633-022-1349-9

Change history

29 November 2022
