
Reference-guided structure-aware deep sketch colorization for cartoons

  • Research Article
  • Open access
  • Published: 27 October 2021
  • Volume 8, pages 135–148, (2022)
Xueting Liu1, Wenliang Wu2, Chengze Li1, Yifan Li2 & Huisi Wu2
Abstract

Digital cartoon production requires extensive manual labor to colorize sketches with visually pleasant color composition and color shading. During colorization, the artist usually takes an existing cartoon image as color guidance, particularly when colorizing related characters or an animation sequence. Reference-guided colorization is more intuitive than colorization with other hints, such as color points or scribbles, or text-based hints. Unfortunately, reference-guided colorization is challenging since the style of the colorized image should match the style of the reference image in terms of both global color composition and local color shading. In this paper, we propose a novel learning-based framework which colorizes a sketch based on a color style feature extracted from a reference color image. Our framework contains a color style extractor to extract the color feature from a color image, a colorization network to generate multi-scale output images by combining a sketch and a color feature, and a multi-scale discriminator to improve the reality of the output image. Extensive qualitative and quantitative evaluations show that our method outperforms existing methods, providing both superior visual quality and style reference consistency in the task of reference-based colorization.
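The abstract describes combining a sketch feature with a color style feature extracted from a reference image. One common operator for this kind of style injection, shown here purely as an illustrative sketch and not necessarily the exact mechanism used in the paper, is adaptive instance normalization (AdaIN, Huang & Belongie, 2017), which re-normalizes each content feature channel to match the per-channel statistics of the style features:

```python
import numpy as np

def adain(content: np.ndarray, style: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Adaptive instance normalization over (C, H, W) feature maps.

    Each content channel is whitened to zero mean / unit variance and
    then rescaled to the corresponding style channel's mean and std,
    transferring the style's first- and second-order statistics.
    """
    c_mu = content.mean(axis=(1, 2), keepdims=True)
    c_sd = content.std(axis=(1, 2), keepdims=True)
    s_mu = style.mean(axis=(1, 2), keepdims=True)
    s_sd = style.std(axis=(1, 2), keepdims=True)
    return s_sd * (content - c_mu) / (c_sd + eps) + s_mu
```

After this operation, the combined features carry the sketch's spatial structure but the reference image's channel-wise color statistics; a decoder can then render them into a colorized output.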



Acknowledgements

This work was supported in part by a CIHE Institutional Development Grant No. IDG200107, the National Natural Science Foundation of China under Grant No. 61973221, and the Natural Science Foundation of Guangdong Province of China under Grant Nos. 2018A030313381 and 2019A1515011165.

Author information

Authors and Affiliations

  1. Caritas Institute of Higher Education, Hong Kong SAR, China

    Xueting Liu & Chengze Li

  2. Shenzhen University, Shenzhen, 518060, China

    Wenliang Wu, Yifan Li & Huisi Wu


Corresponding author

Correspondence to Chengze Li.

Additional information

Xueting Liu received her B.Eng. degree in computer science and technology from Tsinghua University and her Ph.D. degree in computer science and engineering from the Chinese University of Hong Kong in 2009 and 2014, respectively. She is currently an assistant professor in the School of Computing and Information Sciences, Caritas Institute of Higher Education. Her research interests include computational art, intelligent art, computer vision, and computer graphics.

Wenliang Wu received his B.Sc. degree from Guangdong Ocean University of Science and Technology in 2019. He is currently a graduate student in the College of Computer Science and Software Engineering, Shenzhen University. His research interests include computer vision and computer graphics.

Chengze Li received her B.Eng. degree from the University of Science and Technology of China in 2013, and her Ph.D. degree in computer science and engineering from the Chinese University of Hong Kong in 2020. She is currently an assistant professor in the School of Computing and Information Sciences, Caritas Institute of Higher Education, with research interests in 2D non-photorealistic media analysis and processing, computational photography, and computer graphics.

Yifan Li received his B.Sc. degree from Jiangxi University of Science and Technology in 2018 and is now a graduate student in the College of Computer Science and Software Engineering, Shenzhen University. His research interests include computer graphics, computer vision, machine learning, and deep learning.

Huisi Wu received his B.E. and M.E. degrees in computer science from Xi’an Jiaotong University in 2004 and 2007, respectively. He obtained his Ph.D. degree in computer science from the Chinese University of Hong Kong in 2011. He is currently an associate professor in the College of Computer Science and Software Engineering, Shenzhen University. His research interests include computer graphics, image processing, and medical imaging.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.


About this article


Cite this article

Liu, X., Wu, W., Li, C. et al. Reference-guided structure-aware deep sketch colorization for cartoons. Comp. Visual Media 8, 135–148 (2022). https://doi.org/10.1007/s41095-021-0228-6


  • Received: 21 January 2021

  • Accepted: 18 March 2021

  • Published: 27 October 2021

  • Issue Date: March 2022

  • DOI: https://doi.org/10.1007/s41095-021-0228-6


Keywords

  • sketch colorization
  • image style editing
  • deep feature understanding
  • reference-based image colorization
