Abstract
This paper presents a comprehensive study of facial-sketch synthesis (FSS). Because hand-drawn sketch datasets are costly to obtain, the field lacks a complete benchmark for assessing the development of FSS algorithms over the last decade. We first introduce a high-quality dataset for FSS, named FS2K, which consists of 2,104 image-sketch pairs spanning three types of sketch styles, image backgrounds, lighting conditions, skin colors, and facial attributes. FS2K differs from previous FSS datasets in difficulty, diversity, and scalability and should thus facilitate the progress of FSS research. Second, we present the largest-scale FSS investigation by reviewing 89 classic methods, including 25 handcrafted feature-based facial-sketch synthesis approaches, 29 general translation methods, and 35 image-to-sketch approaches. In addition, we conduct comprehensive experiments on 19 existing cutting-edge models. Third, we present a simple baseline for FSS, named FSGAN. With only two straightforward components, i.e., facial-aware masking and style-vector expansion, FSGAN surpasses all previous state-of-the-art models on the proposed FS2K dataset by a large margin. Finally, we conclude with lessons learned over the past years and point out several unsolved challenges. Our code is available at https://github.com/DengPingFan/FSGAN.
Change history
29 November 2022
In this article figures have been updated.
References
X. G. Wang, X. O. Tang. Face photo-sketch synthesis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 11, pp. 1955–1967, 2009. DOI: https://doi.org/10.1109/TPAMI.2008.222.
R. Yi, Y. J. Liu, Y. K. Lai, P. L. Rosin. APDrawingGAN: Generating artistic portrait drawings from face photos with hierarchical GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 10735–10744, 2019. DOI: https://doi.org/10.1109/CVPR.2019.01100.
H. Koshimizu, M. Tominaga, T. Fujiwara, K. Murakami. On KANSEI facial image processing for computerized facial caricaturing system PICASSO. In Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, IEEE, Tokyo, Japan, pp. 294–299, 1999. DOI: https://doi.org/10.1109/ICSMC.1999.816567.
N. Kumar, A. C. Berg, P. N. Belhumeur, S. K. Nayar. Attribute and simile classifiers for face verification. In Proceedings of IEEE 12th International Conference on Computer Vision, IEEE, Kyoto, Japan, pp. 365–372, 2009. DOI: https://doi.org/10.1109/ICCV.2009.5459250.
H. S. Du, Q. P. Hu, D. F. Qiao, I. Pitas. Robust face recognition via low-rank sparse representation-based classification. International Journal of Automation and Computing, vol. 12, no. 6, pp. 579–587, 2015. DOI: https://doi.org/10.1007/s11633-015-0901-2.
Y. Z. Lu. A novel face recognition algorithm for distinguishing faces with various angles. International Journal of Automation and Computing, vol. 5, no. 2, pp. 193–197, 2008. DOI: https://doi.org/10.1007/s11633-008-0193-x.
V. Jain, E. Learned-Miller. FDDB: A Benchmark for Face Detection in Unconstrained Settings, Technical Report UM-CS-2010-009, Department of Computer Science, University of Massachusetts Amherst, USA, 2010.
Z. P. Zhang, P. Luo, C. C. Loy, X. O. Tang. Facial landmark detection by deep multi-task learning. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 94–108, 2014. DOI: https://doi.org/10.1007/978-3-319-10599-4_7.
A. Bulat, G. Tzimiropoulos. How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230, 000 3D facial landmarks). In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1021–1030, 2017. DOI: https://doi.org/10.1109/ICCV.2017.116.
J. X. Sun, Q. Li, W. N. Wang, J. Zhao, Z. N. Sun. Multicaption text-to-face synthesis: Dataset and algorithm. In Proceedings of the 29th ACM International Conference on Multimedia, ACM, Chengdu, China, pp. 2290–2298, 2021. DOI: https://doi.org/10.1145/3474085.3475391.
R. Yi, M. F. Xia, Y. J. Liu, Y. K. Lai, P. L. Rosin. Line drawings for face portraits from photos using global and local structure based GANs. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3462–3475, 2021. DOI: https://doi.org/10.1109/TPAMI.2020.2987931.
Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on image Processing, vol. 13, no. 4, pp. 600–612, 2004. DOI: https://doi.org/10.1109/TIP.2003.819861.
J. Y. Zhu, T. Park, P. Isola, A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2242–2251, 2017. DOI: https://doi.org/10.1109/ICCV.2017.244.
M. Y. Liu, T. Breuel, J. Kautz. Unsupervised image-to-image translation networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 700–708, 2017.
T. C. Wang, M. Y. Liu, J. Y. Zhu, A. Tao, J. Kautz, B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8798–8807, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00917.
T. Park, M. Y. Liu, T. C. Wang, J. Y. Zhu. Semantic image synthesis with spatially-adaptive normalization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 2332–2341, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00244.
H. Y. Chang, Z. X. Wang, Y. Y. Chuang. Domain-specific mappings for generative adversarial style transfer. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 573–589, 2020. DOI: https://doi.org/10.1007/978-3-030-58598-3_34.
R. F. Chen, W. B. Huang, B. H. Huang, F. C. Sun, B. Fang. Reusing discriminators for encoding: Towards unsupervised image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8165–8174, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00819.
H. Y. Lee, H. Y. Tseng, Q. Mao, J. B. Huang, Y. D. Lu, M. Singh, M. H. Yang. DRIT++: Diverse image-to-image translation via disentangled representations. International Journal of Computer Vision, vol. 128, no. 10, pp. 2402–2417, 2020. DOI: https://doi.org/10.1007/s11263-019-01284-z.
D. P. Fan, S. C. Zhang, Y. H. Wu, Y. Liu, M. M. Cheng, B. Ren, P. Rosin, R. R. Ji. Scoot: A perceptual metric for facial sketches. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 5611–5621, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00571.
H. S. Bhatt, S. Bharadwaj, R. Singh, M. Vatsa. On matching sketches with digital face images. In Proceedings of the 4th IEEE International Conference on Biometrics: Theory, Applications and Systems, IEEE, Washington DC, USA, 2010. DOI: https://doi.org/10.1109/BTAS.2010.5634507.
W. Zhang, X. G. Wang, X. O. Tang. Coupled information-theoretic encoding for face photo-sketch recognition. In Proceedings of Conference on Computer Vision and Pattern Recognition, IEEE, Colorado Springs, USA, pp. 513–520, 2011. DOI: https://doi.org/10.1109/CVPR.2011.5995324.
X. B. Gao, N. N. Wang, D. C. Tao, X. L. Li. Face sketch-photo synthesis and retrieval using sparse representation. IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 8, pp. 1213–1226, 2012. DOI: https://doi.org/10.1109/TCSVT.2012.2198090.
I. Berger, A. Shamir, M. Mahler, E. Carter, J. Hodgins. Style and abstraction in portrait sketching. ACM Transactions on Graphics, vol. 32, no. 4, Article number 55, 2013. DOI: https://doi.org/10.1145/2461912.2461964.
R. Yi, Y. J. Liu, Y. K. Lai, P. L. Rosin. Unpaired portrait drawing generation via asymmetric cycle mapping. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8214–8222, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00824.
C. L. Peng, X. B. Gao, N. N. Wang, J. Li. Face recognition from multiple stylistic sketches: Scenarios, datasets, and evaluation. Pattern Recognition, vol. 84, no. pp. 262–272, 2018. DOI: https://doi.org/10.1016/j.patcog.2018.07.014.
A. M. Martinez, R. Benavente. The AR Face Database, CVC Technical Report 24, CVC, Spain, 1998.
N. N. Wang, X. B. Gao, D. C. Tao, X. L. Li. Face sketch-photo synthesis under multi-dictionary sparse representation framework. In Proceedings of 6th International Conference on Image and Graphics, IEEE, Hefei, China, pp. 82–87, 2011. DOI: https://doi.org/10.1109/ICIG.2011.112.
S. C. Zhang, R. R. Ji, J. Hu, X. Q. Lu, X. L. Li. Face sketch synthesis by multidomain adversarial learning. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 5, pp. 1419–1428, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2869574.
M. R. Zhu, J. Li, N. N. Wang, X. B. Gao. Knowledge distillation for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 2, pp. 893–906, 2022. DOI: https://doi.org/10.1109/TNNLS.2020.3030536.
Z. W. Liu, P. Luo, X. G. Wang, X. O. Tang. Deep learning face attributes in the wild. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Santiago, Chile, pp. 3730–3738, 2015. DOI: https://doi.org/10.1109/ICCV.2015.425.
J. Kim, M. Kim, H. Kang, K. Lee. U-GAT-IT: Unsupervised generative attentional networks with adaptive layer-Instance normalization for image-to-image translation. In Proceedings of the 8th International Conference on Learning Representations, Ababa, Ethiopia, 2020.
P. Isola, J. Y. Zhu, T. H. Zhou, A. A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5967–5976, 2017. DOI: https://doi.org/10.1109/CVPR.2017.632.
K. Messer, J. Matas, J. Kittler, K. Jonsson, J. Luettin, G. Maitre. XM2VTSDB: The extended M2VTS database. In Proceedings of the 2nd International Conference on Audio and Video-based Biometric Person Authentication, Springer, Washington DC, USA, pp. 965–966, 1999.
P. J. Phillips, H. Moon, S. A. Rizvi, P. J. Rauss. The FERET evaluation methodology for face-recognition algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000. DOI: https://doi.org/10.1109/34.879790.
Á. Serrano, I. M. De Diego, C. Conde, E. Cabello, L. L. Shen, L. Bai. Influence of wavelet frequency and orientation in an SVM-based parallel Gabor PCA face verification system. In Proceedings of the 8th International Conference on Intelligent Data Engineering and Automated Learning, Springer, Birmingham, UK, pp. 219–228, 2007. DOI: https://doi.org/10.1007/978-3-540-77226-2_23.
H. S. Bhatt, S. Bharadwaj, R. Singh, M. Vatsa. Memetically optimized MCWLD for matching sketches with digital face images. IEEE Transactions on Information Forensics and Security, vol. 7, no. 5, pp. 1522–1535, 2012. DOI: https://doi.org/10.1109/TIFS.2012.2204252.
M. Minear, D. C. Park. A lifespan database of adult facial stimuli. Behavior Research Methods, Instruments & Computers, vol. 36, no. 4, pp. 630–633, 2004. DOI: https://doi.org/10.3758/BF03206543.
J. Nishino, T. Kamyama, H. Shira, T. Odaka, H. Ogura. Linguistic knowledge acquisition system on facial caricature drawing system. In Proceedings of IEEE International Fuzzy Systems. IEEE, Seoul, Korea, pp. 1591–1596, 1999. DOI: https://doi.org/10.1109/FUZZY.1999.790142.
S. Iwashita, Y. Takeda, T. Onisawa. Expressive facial caricature drawing. In Proceedings of IEEE International Fuzzy Systems. IEEE, Seoul, Korea, pp. 1597–1602, 1999. DOI: https://doi.org/10.1109/FUZZY.1999.790143.
Y. Z. Li, H. Kobatake. Extraction of facial sketch image based on morphological processing. In Proceedings of International Conference on Image Processing, IEEE, Santa Barbara, USA, pp. 316–319, 1997. DOI: https://doi.org/10.1109/ICIP.1997.632104.
M. Tominaga, S. Fukuoka, K. Murakami, H. Koshimizu. Facial caricaturing with motion caricaturing in PICASSO system. In Proceedings of IEEE/ASME International Conference on Advanced Intelligent Mechatronics, IEEE, Tokyo, Japan, pp. 30, 1997. DOI: https://doi.org/10.1109/AIM.1997.652888.
S. E. Brennan. Caricature Generator, Ph. D. dissertation, Massachusetts Institute of Technology, USA, 1982.
N. N. Wang, D. C. Tao, X. B. Gao, X. L. Li, J. Li. A comprehensive survey to face hallucination. International Journal of Computer Vision, vol. 106, no. 1, pp. 9–30, 2014. DOI: https://doi.org/10.1007/s11263-013-0645-9.
H. Chen, Y. Q. Xu, H. Y. Shum, S. C. Zhu, N. N. Zheng. Example-based facial sketch generation with non-parametric sampling. In Proceedings of the 8th IEEE International Conference on Computer Vision, IEEE, Vancouver, Canada, pp. 433–438, 2001. DOI: https://doi.org/10.1109/ICCV.2001.937657.
A. V. Nefian, M. H. Hayes III. Face recognition using an embedded HMM. In Proceedings of IEEE Conference on Audio and Video-based Biometric Person Authentication, IEEE, 1999.
X. B. Gao, J. J. Zhong, J. Li, C. N. Tian. Face sketch synthesis algorithm based on E-HMM and selective ensemble. IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 4, pp. 487–496, 2008. DOI: https://doi.org/10.1109/TCSVT.2008.918770.
M. Eitz, J. Hays, M. Alexa. How do humans sketch objects? ACM Transactions on Graphics, vol. 31, no. 4, Article number 44, 2012. DOI: https://doi.org/10.1145/2185520.2185540.
T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C. L. Zitnick. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 740–755, 2014. DOI: https://doi.org/10.1007/978-3-319-10602-1_48.
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. H. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, F. F. Li. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015. DOI: https://doi.org/10.1007/11263-015-0816-y.
M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, A. Vedaldi. Describing textures in the wild. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 3606–3613, 2014. DOI: https://doi.org/10.1109/CVPR.2014.461.
S. Y. Duck. Painter by numbers, wikiart.org, [Online], Available: https://www.kaggle.com/c/painter-by-numbers, 2016.
M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, B. Schiele. The cityscapes dataset for semantic urban scene understanding. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3213–3223, 2016. DOI: https://doi.org/10.1109/CVPR.2016.350.
R. Tyleček, R. Šára. Spatial pattern templates for recognition of objects with regular structure. In Proceedings of the 35th German Conference on Pattern Recognition, Springer, Saarbrücken, Germany, pp. 364–374, 2013. DOI: https://doi.org/10.1007/978-3-642-40602-7_39.
J. Y. Zhu, P. Krähenbühl, E. Shechtman, A. A. Efros. Generative visual manipulation on the natural image manifold. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 597–613, 2016. DOI: https://doi.org/10.1007/978-3-319-46454-1_36.
A. Yu, K. Grauman. Fine-grained visual comparisons with local learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 192–199, 2014. DOI: https://doi.org/10.1109/CV-PR.2014.32.
P. Y. Laffont, Z. Ren, X. F. Tao, C. Qian, J. Hays. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Transactions on Graphics, vol. 33, no. 4, Article number 149, 2014. DOI: https://doi.org/10.1145/2601097.2601101.
Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of IEEE, vol. 86, no. 11, pp. 2278–2324, 1998. DOI: https://doi.org/10.1109/5.726791.
C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie. The caltech-ucsd birds-200-2011 dataset, 2011. [Online], Available: https://authors.library.caltech.edu/27452/1/CUB_200_2011.pdf.
T. Karras, T. Aila, S. Laine, J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
N. Silberman, D. Hoiem, P. Kohli, R. Fergus. Indoor segmentation and support inference from RGBD images. In Proceedings of the 12th European Conference on Computer Vision, Springer, Florence, Italy, pp. 746–760, 2012. DOI: https://doi.org/10.1007/978-3-642-33715-4_54.
B. L. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso, A. Torralba. Scene parsing through ADE20K dataset. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5122–5130, 2017. DOI: https://doi.org/10.1109/CVPR.2017.544.
Q. Yu, Y. Z. Song, T. Xiang, T. M. Hospedales. Sketchx!-shoe/chair fine-grained SBIR dataset, 2017. [Online], Available: https://sketchx.eecs.qmul.ac.uk/downloads/.
D. Ha, D. Eck. A neural representation of sketch drawings. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
Y. H. Jin, J. K. Zhang, M. J. Li, Y. T. Tian, H. C. Zhu, Z. H. Fang. Towards the automatic anime characters creation with generative adversarial networks. [Online], Available: https://arxiv.org/pdf/1708.05509, 2017.
H. Z. Xu, Y. Gao, F. Yu, T. Darrell. End-to-end learning of driving models from large-scale video datasets. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 3530–3538, 2017. DOI: https://doi.org/10.1109/CVPR.2017.376.
G. Ros, L. Sellart, J. Materzynska, D. Vazquez, A. M. Lopez. The SYNTHIA dataset: A large collection of synthetic images for semantic segmentation of urban scenes. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3234–3243, 2016. DOI: https://doi.org/10.1109/CVPR.2016.352.
Z. W. Liu, P. Luo, S. Qiu, X. G. Wang, X. O. Tang. DeepFashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 1096–1104, 2016. DOI: https://doi.org/10.1109/CVPR.2016.124.
T. Karras, S. Laine, T. Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4396–4405, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00453.
E. Agustsson, R. Timofte. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Honolulu, USA, pp. 1122–1131, 2017. DOI: https://doi.org/10.1109/CVPRW.2017.150.
B. Yao, X. Yang, S. C. Zhu. Introduction to a large-scale general purpose ground truth database: Methodology, annotation tool and benchmarks. In Proceedings of the 6th International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, Springer, Ezhou, China, pp. 169–183, 2007, DOI: https://doi.org/10.1007/978-3-540-74198-5_14.
J. Krause, M. Stark, J. Deng, F. F. Li. 3D object representations for fine-grained categorization. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Sydney, Australia, pp. 554–561, 2013. DOI: https://doi.org/10.1109/ICCVW.2013.77.
F. Yu, A. Seff, Y. D. Zhang, S. R. Song, T. Funkhouser, J. X. Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. [Online], Available: https://arxiv.org/abs/1506.03365, 2015.
Q. S. Liu, X. O. Tang, H. L. Jin, H. Q. Lu, S. D. Ma. A nonlinear approach for face sketch synthesis and recognition. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, San Diego, USA, pp. 1005–1010, 2005. DOI: https://doi.org/10.1109/CVPR.2005.39.
Z. J. Xu, H. Chen, S. C. Zhu, J. B. Luo. A hierarchical compositional model for face representation and sketching. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 6, pp. 955–969, 2008. DOI: https://doi.org/10.1109/TPAMI.2008.50.
W. Zhang, X. G. Wang, X. O. Tang. Lighting and pose robust face sketch synthesis. In Proceedings of the 11th European Conference on Computer Vision, Springer, Heraklion, Greece, pp. 420–433, 2010. DOI: https://doi.org/10.1007/978-3-642-15567-3_31.
N. Y. Ji, X. J. Chai, S. G. Shan, X. L. Chen. Local regression model for automatic face sketch generation. In Proceedings of the 6th International Conference on Image and Graphics, IEEE, Hefei, China, pp. 412–417, 2011. DOI: https://doi.org/10.1109/ICIG.2011.84.
L. Chang, M. Q. Zhou, X. M. Deng, Z. K. Wu, Y. J. Han. Face sketch synthesis via multivariate output regression. In Proceedings of the 14th International Conference on Human-computer Interaction, Springer, Orlando, USA, pp. 555–561, 2011. DOI: https://doi.org/10.1007/978-3-642-21602-2_60.
J. W. Zhang, N. N. Wang, X. B. Gao, D. C. Tao, X. L. Li. Face sketch-photo synthesis based on support vector regression. In Proceedings of the 18th IEEE International Conference on Image Processing, IEEE, Brussels, Belgium, pp. 1125–1128, 2011. DOI: https://doi.org/10.1109/ICIP.2011.6115625.
S. L. Wang, L. Zhang, Y. Liang, Q. Pan. Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 2216–2223, 2012. DOI: https://doi.org/10.1109/CVPR.2012.6247930.
H. Zhou, Z. H. Kuang, K. Y. K. Wong. Markov weight fields for face sketch synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 1091–1097, 2012. DOI: https://doi.org/10.1109/CVPR.2012.6247788.
T. H. Wang, J. Collomosse, A. Hunter, D. Greig. Learnable stroke models for example-based portrait painting. In Proceedings of British Machine Vision Conference, Bristol, UK, 2013.
N. N. Wang, D. C. Tao, X. B. Gao, X. L. Li, J. Li. Transductive face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 9, pp. 1364–1376, 2013. DOI: https://doi.org/10.1109/TNNLS.2013.2258174.
D. A. Huang, Y. C. F. Wang. Coupled dictionary and feature space learning with applications to cross-domain image synthesis and recognition. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Sydney, Australia, pp. 2496–2503, 2013. DOI: https://doi.org/10.1109/ICCV.2013.310.
Y. B. Song, L. C. Bao, Q. X. Yang, M. H. Yang. Real-time exemplar-based face sketch synthesis. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 800–813, 2014. DOI: https://doi.org/10.1007/978-3-319-10599-4_51.
S. C. Zhang, X. B. Gao, N. N. Wang, J. Li. Robust face sketch style synthesis. IEEE Transactions on Image Processing, vol. 25, no. 1, pp. 220–232, 2016. DOI: https://doi.org/10.1109/TIP.2015.2501755.
C. L. Peng, X. B. Gao, N. N. Wang, J. Li. Superpixel-based face sketch-photo synthesis. IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 2, pp. 288–299, 2017. DOI: https://doi.org/10.1109/TCSVT.2015.2502861.
C. L. Peng, X. B. Gao, N. N. Wang, D. C. Tao, X. L. Li, J. Li. Multiple representations-based face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 11, pp. 2201–2215, 2016. DOI: https://doi.org/10.1109/TNNLS.2015.2464681.
Y. Li, Y. Z. Song, T. M. Hospedales, S. G. Gong. Freehand sketch synthesis with deformable stroke models. International Journal of Computer Vision, vol. 122, no. 1, pp. 169–190, 2017. DOI: https://doi.org/10.1007/s11263-016-0963-9.
J. Li, X. Y. Yu, C. L. Peng, N. N. Wang. Adaptive representation-based face sketch-photo synthesis. Neurocomputing, vol. 269, pp. 152–159, 2017. DOI: https://doi.org/10.1016/j.neucom.2016.10.095.
N. N. Wang, X. B. Gao, J. Li. Random sampling for fast face sketch synthesis. Pattern Recognition, vol. 76, pp. 215–227, 2018. DOI: https://doi.org/10.1016/j.patcog.2017.11.008.
Y. F. Men, Z. H. Lian, Y. M. Tang, J. G. Xiao. A common framework for interactive texture transfer. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6353–6362, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00665.
L. A. Gatys, A. S. Ecker, M. Bethge. A neural algorithm of artistic style. [Online], Available: https://arxiv.org/abs/1508.06576, 2015.
L. A. Gatys, A. S. Ecker, M. Bethge. Image style transfer using convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2414–2423, 2016. DOI: https://doi.org/10.1109/CVPR.2016.265.
J. Johnson, A. Alahi, F. F. Li. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 694–711, 2016. DOI: https://doi.org/10.1007/978-3-319-46475-6_43.
D. Ulyanov, V. Lebedev, A. Vedaldi, V. S. Lempitsky. Texture networks: Feed-forward synthesis of textures and stylized images. In Proceedings of the 33rd International Conference on International Conference on Machine Learning, New York, USA, pp. 1349–1357, 2016.
T. Q. Chen, M. Schmidt. Fast patch-based style transfer of arbitrary style. [Online], Available: https://arxiv.org/pdf/1612.04337, 2016.
V. Dumoulin, J. Shlens, M. Kudlur. A learned representation for artistic style. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
D. Ulyanov, A. Vedaldi, V. Lempitsky. Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4105–4113, 2017. DOI: https://doi.org/10.1109/CVPR.2017.437.
X. Huang, S. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1510–1519, 2017. DOI: https://doi.org/10.1109/ICCV.2017.167.
Y. J. Li, C. Fang, J. M. Yang, Z. W. Wang, X. Lu, M. H. Yang. Universal style transfer via feature transforms. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 385–395, 2017.
Y. Chen, Y. K. Lai, Y. J. Liu. CartoonGAN: Generative adversarial networks for photo cartoonization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 9465–9474, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00986.
R. Abdal, Y. P. Qin, P. Wonka. Image2StyleGAN: How to embed images into the StyleGAN latent space? In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 4431–4440, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00453.
D. Kotovenko, M. Wright, A. Heimbrecht, B. Ommer. Rethinking style transfer: From pixels to parameterized brushstrokes. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 12191–12200, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.01202.
E. Richardson, Y. Alaluf, O. Patashnik, Y. Nitzan, Y. Azar, S. Shapiro, D. Cohen-Or. Encoding in style: A StyleGAN encoder for image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 2287–2296, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.00232.
Z. L. Yi, H. Zhang, P. Tan, M. L. Gong. DualGAN: Unsupervised dual learning for image-to-image translation. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2868–2876, 2017. DOI: https://doi.org/10.1109/ICCV.2017.310.
T. Kim, M. Cha, H. Kim, J. K. Lee, J. Kim. Learning to discover cross-domain relations with generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 1857–1865, 2017.
J. Y. Zhu, R. Zhang, D. Pathak, T. Darrell, A. A. Efros, O. Wang, E. Shechtman. Toward multimodal image-to-image translation. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 465–476, 2017.
X. Huang, M. Y. Liu, S. Belongie, J. Kautz. Multimodal unsupervised image-to-image translation. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 179–196, 2018. DOI: https://doi.org/10.1007/978-3-030-01219-9_11.
P. Zhang, B. Zhang, D. Chen, L. Yuan, F. Wen. Cross-domain correspondence learning for exemplar-based image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5142–5152, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00519.
L. M. Jiang, C. X. Zhang, M. Y. Huang, C. X. Liu, J. P. Shi, C. C. Loy. TSIT: A simple and versatile framework for image-to-image translation. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 206–222, 2020. DOI: https://doi.org/10.1007/978-3-030-58580-8_13.
Y. H. Zhao, R. H. Wu, H. Dong. Unpaired image-to-image translation using adversarial consistency loss. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 800–815, 2020. DOI: https://doi.org/10.1007/978-3-030-58545-7_46.
X. R. Zhou, B. Zhang, T. Zhang, P. Zhang, J. M. Bao, D. Chen, Z. F. Zhang, F. Wen. CoCosNet v2: Full-resolution correspondence learning for image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 11460–11470, 2021. DOI: https://doi.org/10.1109/CVPR46437.2021.01130.
A. P. Chen, R. Y. Liu, L. Xie, Z. Chen, H. Su, J. Y. Yu. SofGAN: A portrait image generator with dynamic styling. ACM Transactions on Graphics, vol. 41, no. 1, Article number 1, 2022. DOI: https://doi.org/10.1145/3470848.
L. L. Zhang, L. Lin, X. Wu, S. Y. Ding, L. Zhang. End-to-end photo-sketch generation via fully convolutional representation learning. In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, ACM, Shanghai, China, pp. 627–634, 2015. DOI: https://doi.org/10.1145/2671188.2749321.
M. R. Zhu, N. N. Wang, X. B. Gao, J. Li. Deep graphical feature learning for face sketch synthesis. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, pp. 3574–3580, 2017.
P. Sangkloy, J. W. Lu, C. Fang, F. Yu, J. Hays. Scribbler: Controlling deep image synthesis with sketch and color. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 6836–6845, 2017. DOI: https://doi.org/10.1109/CVPR.2017.723.
M. J. Zhang, N. N. Wang, Y. S. Li, R. X. Wang, X. B. Gao. Face sketch synthesis from coarse to fine. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, California, USA, pp. 7558–7565, 2018. DOI: https://doi.org/10.1609/aaai.v32i1.12224.
W. Q. Xian, P. Sangkloy, V. Agrawal, A. Raj, J. W. Lu, C. Fang, F. Yu, J. Hays. TextureGAN: Controlling deep image synthesis with texture patches. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8456–8465, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00882.
J. F. Song, K. Y. Pang, Y. Z. Song, T. Xiang, T. M. Hospedales. Learning to sketch with shortcut cycle consistency. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 801–810, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00090.
Y. Y. Lu, S. Z. Wu, Y. W. Tai, C. K. Tang. Image generation from sketch constraint using contextual GAN. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 213–228, 2018. DOI: https://doi.org/10.1007/978-3-030-01270-0_13.
S. C. Zhang, R. R. Ji, J. Hu, Y. Gao, C. W. Lin. Robust face sketch synthesis via generative adversarial fusion of priors and parametric sigmoid. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 1163–1169, 2018.
M. J. Zhang, N. N. Wang, Y. S. Li, X. B. Gao. Markov random neural fields for face sketch synthesis. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 7558–7565, 2018.
L. D. Wang, V. Sindagi, V. Patel. High-quality facial photo-sketch synthesis using multi-adversarial networks. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, IEEE, Xi'an, China, pp. 83–90, 2018. DOI: https://doi.org/10.1109/FG.2018.00022.
M. J. Zhang, R. X. Wang, X. B. Gao, J. Li, D. C. Tao. Dual-transfer face sketch-photo synthesis. IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 642–657, 2019. DOI: https://doi.org/10.1109/TIP.2018.2869688.
H. Kazemi, M. Iranmanesh, A. Dabouei, S. Soleymani, N. M. Nasrabadi. Facial attributes guided deep sketch-to-photo synthesis. In Proceedings of IEEE Winter Applications of Computer Vision Workshops, IEEE, Lake Tahoe, USA, 2018. DOI: https://doi.org/10.1109/WACVW.2018.00006.
H. Kazemi, F. Taherkhani, N. M. Nasrabadi. Unsupervised facial geometry learning for sketch to photo synthesis. In Proceedings of International Conference of the Biometrics Special Interest Group, IEEE, Darmstadt, Germany, 2018.
S. You, N. You, M. X. Pan. PI-REC: Progressive image reconstruction network with edge and color domain. [Online], Available: https://arxiv.org/abs/1903.10146, 2019.
M. J. Zhang, N. N. Wang, Y. S. Li, X. B. Gao. Deep latent low-rank representation for face sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3109–3123, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2890017.
M. R. Zhu, J. Li, N. N. Wang, X. B. Gao. A deep collaborative framework for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3096–3108, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2890018.
M. J. Zhang, Y. S. Li, N. N. Wang, Y. Chi, X. B. Gao. Cascaded face sketch synthesis under various illuminations. IEEE Transactions on Image Processing, vol. 29, pp. 1507–1521, 2019. DOI: https://doi.org/10.1109/TIP.2019.2942514.
M. R. Zhu, N. N. Wang, X. B. Gao, J. Li, Z. F. Li. Face photo-sketch synthesis via knowledge transfer. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, pp. 1048–1054, 2019.
Y. J. Li, C. Fang, A. Hertzmann, E. Shechtman, M. H. Yang. Im2Pencil: Controllable pencil illustration from photographs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1525–1534, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00162.
A. Ghosh, R. Zhang, P. Dokania, O. Wang, A. Efros, P. Torr, E. Shechtman. Interactive sketch & fill: Multiclass sketch-to-image translation. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 1171–1180, 2019. DOI: https://doi.org/10.1109/ICCV.2019.00126.
X. R. Wang, J. Z. Yu. Learning to cartoonize using white-box cartoon representations. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8087–8096, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00811.
C. Y. Gao, Q. Liu, Q. Xu, L. M. Wang, J. Z. Liu, C. Q. Zou. SketchyCOCO: Image generation from freehand scene sketches. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5173–5182, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00522.
S. Yang, Z. Y. Wang, J. Y. Liu, Z. M. Guo. Deep plastic surgery: Robust and controllable image editing with human-drawn sketches. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 601–617, 2020. DOI: https://doi.org/10.1007/978-3-030-58555-6_36.
S. Y. Chen, W. C. Su, L. Gao, S. H. Xia, H. B. Fu. DeepFaceDrawing: Deep generation of face images from sketches. ACM Transactions on Graphics, vol. 39, no. 4, Article number 72, 2020. DOI: https://doi.org/10.1145/3386569.3392386.
J. Yu, X. X. Xu, F. Gao, S. J. Shi, M. Wang, D. C. Tao, Q. M. Huang. Toward realistic face photo-sketch synthesis via composition-aided GANs. IEEE Transactions on Cybernetics, vol. 51, no. 9, pp. 4350–4362, 2021. DOI: https://doi.org/10.1109/TCYB.2020.2972944.
Y. K. Fang, W. H. Deng, J. P. Du, J. N. Hu. Identity-aware CycleGAN for face photo-sketch synthesis and recognition. Pattern Recognition, vol. 102, Article number 107249, 2020. DOI: https://doi.org/10.1016/j.patcog.2020.107249.
Y. Lin, S. G. Ling, K. R. Fu, P. Cheng. An identity-preserved model for face sketch-photo synthesis. IEEE Signal Processing Letters, vol. 27, pp. 1095–1099, 2020. DOI: https://doi.org/10.1109/LSP.2020.3005039.
C. L. Peng, N. N. Wang, J. Li, X. B. Gao. Universal face photo-sketch style transfer via multiview domain translation. IEEE Transactions on Image Processing, vol. 29, pp. 8519–8534, 2020. DOI: https://doi.org/10.1109/TIP.2020.3016502.
K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
S. C. Duan, Z. X. Chen, Q. M. J. Wu, L. Cai, D. Lu. Multi-scale gradients self-attention residual learning for face photo-sketch transformation. IEEE Transactions on Information Forensics and Security, vol. 16, pp. 1218–1230, 2020. DOI: https://doi.org/10.1109/TIFS.2020.3031386.
S. Y. Wang, D. Bau, J. Y. Zhu. Sketch your own GAN. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 14030–14040, 2021. DOI: https://doi.org/10.1109/ICCV48922.2021.01379.
A. K. Bhunia, S. Khan, H. Cholakkal, R. M. Anwer, F. S. Khan, J. Laaksonen, M. Felsberg. DoodleFormer: Creative sketch drawing with transformers. [Online], Available: https://arxiv.org/abs/2112.03258, 2021.
H. Abdi, L. J. Williams. Principal component analysis. WIREs Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010. DOI: https://doi.org/10.1002/wics.101.
X. O. Tang, X. G. Wang. Face photo recognition using sketch. In Proceedings of International Conference on Image Processing, IEEE, Rochester, USA, pp. I–257–I–260, 2002. DOI: https://doi.org/10.1109/ICIP.2002.1038008.
X. O. Tang, X. G. Wang. Face sketch synthesis and recognition. In Proceedings of the 9th IEEE International Conference on Computer Vision, IEEE, Nice, France, pp. 687–694, 2003. DOI: https://doi.org/10.1109/ICCV.2003.1238414.
X. O. Tang, X. G. Wang. Face sketch recognition. IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 50–57, 2004. DOI: https://doi.org/10.1109/TCSVT.2003.818353.
S. T. Roweis, L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, vol. 290, no. 5500, pp. 2323–2326, 2000. DOI: https://doi.org/10.1126/science.290.5500.2323.
S. Saxena, M. N. Teli. Comparison and analysis of image-to-image generative adversarial networks: A survey. [Online], Available: https://arxiv.org/abs/2112.12625, 2021.
I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, Canada, pp. 2672–2680, 2014.
M. Mirza, S. Osindero. Conditional generative adversarial nets. [Online], Available: https://arxiv.org/abs/1411.1784, 2014.
O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-assisted Intervention, Springer, Munich, Germany, pp. 234–241, 2015. DOI: https://doi.org/10.1007/978-3-319-24574-4_28.
Y. C. Jing, Y. Z. Yang, Z. L. Feng, J. W. Ye, Y. Z. Yu, M. L. Song. Neural style transfer: A review. IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 11, pp. 3365–3385, 2020. DOI: https://doi.org/10.1109/TVCG.2019.2921336.
Y. H. Song, C. Yang, Y. J. Shen, P. Wang, Q. Huang, C. C. J. Kuo. SPG-Net: Segmentation prediction and guidance network for image inpainting. In Proceedings of British Machine Vision Conference, Newcastle, UK, 2018.
D. Yi, Z. Lei, S. C. Liao, S. Z. Li. Learning face representation from scratch. [Online], Available: https://arxiv.org/abs/1411.7923, 2014.
L. Wang, R. F. Li, K. Wang, J. Chen. Feature representation for facial expression recognition based on FACS and LBP. International Journal of Automation and Computing, vol. 11, no. 5, pp. 459–468, 2014. DOI: https://doi.org/10.1007/s11633-014-0835-0.
X. Zheng, Y. Q. Guo, H. B. Huang, Y. Li, R. He. A survey of deep facial attribute analysis. International Journal of Computer Vision, vol. 128, no. 8, pp. 2002–2034, 2020. DOI: https://doi.org/10.1007/s11263-020-01308-z.
G. B. Huang, M. Mattar, T. Berg, E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In Proceedings of Workshop on Faces In “Real-Life” Images: Detection, Alignment, and Recognition, Marseille, France, Article number inria-321923, 2008.
R. Ranjan, V. M. Patel, R. Chellappa. Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 1, pp. 121–135, 2019. DOI: https://doi.org/10.1109/TPAMI.2017.2781233.
E. M. Hand, R. Chellappa. Attributes for improved attributes: A multi-task network utilizing implicit and explicit relationships for facial attribute classification. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA, pp. 4068–4074, 2017.
H. Han, A. K. Jain, F. Wang, S. G. Shan, X. L. Chen. Heterogeneous face attribute estimation: A deep multi-task learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 11, pp. 2597–2609, 2018. DOI: https://doi.org/10.1109/TPAMI.2017.2738004.
Y. Jang, H. Gunes, I. Patras. SmileNet: Registration-free smiling face detection in the wild. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Venice, Italy, pp. 1581–1589, 2017. DOI: https://doi.org/10.1109/ICCVW.2017.186.
R. Ranjan, S. Sankaranarayanan, C. D. Castillo, R. Chellappa. An all-in-one convolutional neural network for face analysis. In Proceedings of the 12th IEEE International Conference on Automatic Face & Gesture Recognition, IEEE, Washington DC, USA, pp. 17–24, 2017. DOI: https://doi.org/10.1109/FG.2017.137.
S. Li, W. H. Deng. Deep facial expression recognition: A survey. IEEE Transactions on Affective Computing, 2020, to be published. DOI: https://doi.org/10.1109/TAFFC.2020.2981446.
N. Zhang, M. Paluri, M. Ranzato, T. Darrell, L. Bourdev. PANDA: Pose aligned networks for deep attribute modeling. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 1637–1644, 2014. DOI: https://doi.org/10.1109/CVPR.2014.212.
M. N. Kan, S. G. Shan, H. Chang, X. L. Chen. Stacked progressive auto-encoders (SPAE) for face recognition across poses. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 1883–1890, 2014. DOI: https://doi.org/10.1109/CVPR.2014.243.
Y. Wu, Z. G. Wang, Q. Ji. Facial feature tracking under varying facial expressions and face poses based on restricted Boltzmann machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Portland, USA, pp. 3452–3459, 2013. DOI: https://doi.org/10.1109/CVPR.2013.443.
L. Tran, X. Yin, X. M. Liu. Disentangled representation learning GAN for pose-invariant face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 1283–1292, 2017. DOI: https://doi.org/10.1109/CVPR.2017.141.
U. Toseeb, D. R. T. Keeble, E. J. Bryant. The significance of hair for face recognition. PLoS One, vol. 7, no. 3, Article number e34144, 2012. DOI: https://doi.org/10.1371/journal.pone.0034144.
S. J. Bartel, K. Toews, L. Gronhovd, S. L. Prime. “Do I Know You?” altering hairstyle affects facial recognition. Visual Cognition, vol. 26, no. 3, pp. 149–155, 2018. DOI: https://doi.org/10.1080/13506285.2017.1394412.
N. Kumar, P. Belhumeur, S. Nayar. FaceTracer: A search engine for large collections of images with faces. In Proceedings of the 10th European Conference on Computer Vision, Springer, Marseille, France, pp. 340–353, 2008. DOI: https://doi.org/10.1007/978-3-540-88693-8_25.
H. Y. Li, W. M. Dong, B. G. Hu. Facial image attributes transformation via conditional recycle generative adversarial networks. Journal of Computer Science and Technology, vol. 33, no. 3, pp. 511–521, 2018. DOI: https://doi.org/10.1007/s11390-018-1835-2.
J. S. Pierrard, T. Vetter. Skin detail analysis for face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Minneapolis, USA, 2007. DOI: https://doi.org/10.1109/CVPR.2007.383264.
S. Z. Li. Encyclopedia of Biometrics: I-Z, New York, USA: Springer, 2009.
K. P. Zhang, Z. P. Zhang, Z. F. Li, Y. Qiao. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499–1503, 2016. DOI: https://doi.org/10.1109/LSP.2016.2603342.
K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 770–778, 2016. DOI: https://doi.org/10.1109/CVPR.2016.90.
Y. Choi, M. Choi, M. Kim, J. W. Ha, S. Kim, J. Choo. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8789–8797, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00916.
B. Zhao, B. Chang, Z. Q. Jie, L. Sigal. Modular generative adversarial networks. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 157–173, 2018. DOI: https://doi.org/10.1007/978-3-030-01264-9_10.
A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. M. Lin, A. Desmaison, L. Antiga, A. Lerer. Automatic differentiation in PyTorch. In Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, USA, 2017.
D. P. Kingma, J. Ba. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
Q. Yu, F. Liu, Y. Z. Song, T. Xiang, T. M. Hospedales, C. C. Loy. Sketch me that shoe. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 799–807, 2016. DOI: https://doi.org/10.1109/CVPR.2016.93.
C. Shorten, T. M. Khoshgoftaar. A survey on image data augmentation for deep learning. Journal of Big Data, vol. 6, no. 1, Article number 60, 2019. DOI: https://doi.org/10.1186/s40537-019-0197-0.
Y. X. Wang, C. C. Wu, L. Herranz, J. Van De Weijer, A. Gonzalez-Garcia, B. Raducanu. Transferring GANs: Generating images from limited data. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 220–236, 2018. DOI: https://doi.org/10.1007/978-3-030-01231-1_14.
Y. X. Wang, L. Yu, J. Van De Weijer. DeepI2I: Enabling deep hierarchical image-to-image translation by transferring from GANs. In Proceedings of the 34th Conference on Neural Information Processing Systems, 2020.
A. Shocher, Y. Gandelsman, I. Mosseri, M. Yarom, M. Irani, W. T. Freeman, T. Dekel. Semantic pyramid for image generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 7455–7464, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.00748.
S. Ravi, H. Larochelle. Optimization as a model for few-shot learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
O. Chapelle, B. Scholkopf, A. Zien. Semi-supervised learning. IEEE Transactions on Neural Networks, vol. 20, no. 3, Article number 542, 2009. DOI: https://doi.org/10.1109/TNN.2009.2015974.
M. Oquab, L. Bottou, I. Laptev, J. Sivic. Is object localization for free? — Weakly-supervised learning with convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 685–694, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7298668.
X. L. Wang, K. M. He, A. Gupta. Transitive invariance for self-supervised visual representation learning. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1338–1347, 2017. DOI: https://doi.org/10.1109/ICCV.2017.149.
R. Pinto, T. Mettler, M. Taisch. Managing supplier delivery reliability risk under limited information: Foundations for a human-in-the-loop DSS. Decision Support Systems, vol. 54, no. 2, pp. 1076–1084, 2013. DOI: https://doi.org/10.1016/j.dss.2012.10.033.
Y. LeCun. Generalization and network design strategies. Connectionism in Perspective, vol. 19, no. 143–155, Article number 18, 1989.
I. O. Tolstikhin, N. Houlsby, A. Kolesnikov, L. Beyer, X. H. Zhai, T. Unterthiner, J. Yung, A. Steiner, D. Keysers, J. Uszkoreit, M. Lucic, A. Dosovitskiy. MLP-mixer: An all-MLP architecture for vision. In Proceedings of the 35th Conference on Neural Information Processing Systems, pp. 24261–24272, 2021.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 6000–6010, 2017.
K. Lee, H. W. Chang, L. Jiang, H. Zhang, Z. W. Tu, C. Liu. ViTGAN: Training GANs with vision transformers. [Online], Available: https://arxiv.org/abs/2107.04589, 2022.
L. Zhang, L. Zhang, X. Q. Mou, D. Zhang. FSIM: A feature similarity index for image quality assessment. IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2378–2386, 2011. DOI: https://doi.org/10.1109/TIP.2011.2109730.
S. Avidan, A. Shamir. Seam carving for content-aware image resizing. ACM Transactions on Graphics, vol. 26, no. 3, pp. 10–1–10–9, 2007. DOI: https://doi.org/10.1145/1276377.1276390.
C. Dong, C. C. Loy, K. M. He, X. O. Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016. DOI: https://doi.org/10.1109/TPAMI.2015.2439281.
Y. Y. Hu, S. Yang, W. H. Yang, L. Y. Duan, J. Y. Liu. Towards coding for human and machine vision: A scalable image coding approach. In Proceedings of IEEE International Conference on Multimedia and Expo, IEEE, London, UK, 2020. DOI: https://doi.org/10.1109/ICME46284.2020.9102750.
E. Wood, T. Baltrušaitis, C. Hewitt, S. Dziadzio, T. J. Cashman, J. Shotton. Fake it till you make it: Face analysis in the wild using synthetic data alone. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 3661–3671, 2021. DOI: https://doi.org/10.1109/ICCV48922.2021.00366.
Acknowledgements
This work was supported by the Grant-in-Aid for Japan Society for the Promotion of Science Fellows, Japan (No. 21F50377). The authors would like to thank the anonymous reviewers and editors for their helpful comments on this manuscript. We would like to thank Ning Li from NEPU, China for his help in collecting data and Professor Paul L. Rosin from Cardiff University, UK for the insightful feedback.
Additional information
Conflicts of interests
The authors declare that they have no conflicts of interest to this work. We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Colored figures are available in the online version at https://link.springer.com/journal/11633
Deng-Ping Fan received the Ph. D. degree from Nankai University, China in 2019. He joined the Inception Institute of Artificial Intelligence (IIAI), UAE in 2019. He is a postdoctoral researcher, working with Prof. Luc Van Gool in the Computer Vision Laboratory, ETH Zürich, Switzerland. He has published approximately 50 papers in top journals and conferences such as TPAMI, CVPR, ICCV, and ECCV. He won the Best Paper Finalist Award at IEEE CVPR 2019, and the Best Paper Award Nominee at IEEE CVPR 2020. He was recognized as the CVPR 2019 outstanding reviewer with a special mention award, the CVPR 2020 outstanding reviewer, the ECCV 2020 high-quality reviewer, and the CVPR 2021 outstanding reviewer. He served as a program committee board (PCB) member of IJCAI 2022–2024, a senior program committee (SPC) member of IJCAI 2021, a committee member of China Society of Image and Graphics (CSIG), an area chair of the NeurIPS 2021 Datasets and Benchmarks Track, an area chair of the MICCAI 2020 Workshop (OMIA7), and an editorial board member of Computer Vision & AI.
His research interests include computer vision, deep learning, and visual attention, especially the human vision on co-salient object detection, RGB salient object detection, RGB-D salient object detection, and video salient object detection.
Ziling Huang received the B. Sc. degree in electrical engineering from North China Electric Power University, China in 2015, and the M. Sc. degree in electrical engineering from Taiwan Tsing Hua University, Taiwan, China in 2020. She is currently a Ph. D. candidate at the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, University of Tokyo, Japan. She was an intern at the National Institute of Informatics, Japan in 2019, and at ByteDance, China from 2019 to 2020.
Her research interests include computer vision and machine learning.
Peng Zheng is a master's student in the visual computing and communication program at Aalto University, Finland and University of Trento, Italy. He was a research intern at the Inception Institute of Artificial Intelligence (IIAI), UAE from March 2021 to October 2021. He has been a research assistant at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), UAE since January 2022. He serves as a reviewer for IEEE TPAMI.
His research interests include computer vision and machine learning, especially on common information mining and person search.
Hong Liu received the Ph. D. degree from Xiamen University, China in 2020. He is now a Japan Society for the Promotion of Science Fellowship researcher at the National Institute of Informatics, Japan. He has published more than 20 papers in top journals and conferences such as TPAMI, IJCV, TIP, CVPR, ICCV, ECCV, and ICLR. He was awarded the Outstanding Doctoral Dissertation Award of the China Society of Image and Graphics, the JSPS International Fellowship, and Top-100 Chinese New Stars in Artificial Intelligence by Baidu Scholar.
His research interests include large-scale image retrieval, Riemannian-based machine learning, and adversarial learning.
Xuebin Qin received the Ph. D. degree from University of Alberta, Canada in 2020. Since March 2020, he has been a research fellow at the Department of Computer Vision, MBZUAI, UAE. He has published about 10 papers in vision and robotics conferences such as CVPR, ECCV, BMVC, ICPR, WACV, and IROS.
His research interests include highly accurate image segmentation, salient object detection, image labeling, detection and vision tracking.
Luc Van Gool received the Ph. D. degree in electromechanical engineering at Katholieke Universiteit Leuven, Belgium in 1981. Currently, he is a professor at Katholieke Universiteit Leuven in Belgium and the ETH in Switzerland. He leads computer vision research at both places, and also teaches at both. He has been a program committee member of several major computer vision conferences. He received several Best Paper awards, won a David Marr Prize and a Koenderink Award, and was nominated Distinguished Researcher by the IEEE Computer Science Committee. He is a co-founder of 10 spin-off companies.
His interests include 3D reconstruction and modelling, object recognition, tracking, and gesture analysis, and the combination of those.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Fan, DP., Huang, Z., Zheng, P. et al. Facial-sketch Synthesis: A New Challenge. Mach. Intell. Res. 19, 257–287 (2022). https://doi.org/10.1007/s11633-022-1349-9
DOI: https://doi.org/10.1007/s11633-022-1349-9