Anime Sketch Coloring Based on Self-attention Gate and Progressive PatchGAN

  • Conference paper
  • Published in: Pattern Recognition and Computer Vision (PRCV 2023)

Abstract

Traditional manual coloring requires artists to hand-pick colors to create visually pleasing combinations, which is both time-consuming and laborious. Reference-based line-art coloring is therefore an attractive but challenging task in computer vision: because sketch images lack texture and paired training data are scarce, existing reference-based methods often struggle to generate visually appealing colorings. To address this, we propose a new sketch coloring network based on the PatchGAN architecture. First, we introduce a self-attention gate (SAG) that effectively and correctly identifies line semantic information from shallow to deep layers of the CNN. Second, we propose a Progressive PatchGAN (PPGAN) that helps the discriminator learn to better distinguish real anime images. Experiments show that, compared with existing methods, our approach achieves significant improvements on several benchmarks, improving Fréchet Inception Distance (FID) by up to 24.195% and Structural Similarity Index Measure (SSIM) by up to 14.30% over the previous best values.
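The abstract names two components, a self-attention gate (SAG) and a Progressive PatchGAN (PPGAN), without giving implementation details here. The PyTorch sketch below is only one plausible reading of those ideas: the gating formula, the channel-reduction factor, and the plain (non-progressive) PatchGAN discriminator are all assumptions for illustration, not the authors' code.

```python
# Minimal sketch of the two ideas named in the abstract. Every module and
# hyperparameter below is an assumption, not the paper's implementation.

import torch
import torch.nn as nn


class SelfAttentionGate(nn.Module):
    """Hypothetical SAG: gates an encoder feature map with attended context
    computed from the feature itself, so line-art semantics can be
    emphasised at each depth of the CNN."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C//r)
        k = self.key(x).flatten(2)                     # (B, C//r, HW)
        attn = torch.softmax(q @ k / (k.shape[1] ** 0.5), dim=-1)
        v = self.value(x).flatten(2).transpose(1, 2)   # (B, HW, C)
        ctx = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        # Sigmoid gate decides, per pixel and channel, how much attended
        # context to mix back into the original feature map.
        g = torch.sigmoid(self.gate(ctx))
        return x * g + ctx * (1 - g)


class PatchDiscriminator(nn.Module):
    """Standard PatchGAN discriminator (Isola et al., 2017), which classifies
    local patches as real or fake rather than the whole image."""

    def __init__(self, in_channels: int = 3, base: int = 64):
        super().__init__()
        layers, ch = [], in_channels
        for out in (base, base * 2, base * 4):
            layers += [nn.Conv2d(ch, out, 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
            ch = out
        layers += [nn.Conv2d(ch, 1, 4, stride=1, padding=1)]  # patch logits
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # (B, 1, H', W') map of real/fake patch scores
```

A progressive scheme in the spirit of PPGAN might train such discriminators over patches of increasing resolution as the generator improves; the actual schedule and losses are defined in the paper itself.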



Acknowledgment

This work was supported by NSFC (Grant Nos. 62366047 and 62061042).

Author information

Corresponding author

Correspondence to Nianyi Wang.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Li, H., Wang, N., Fang, J., Jia, Y., Ji, L., Chen, X. (2024). Anime Sketch Coloring Based on Self-attention Gate and Progressive PatchGAN. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14435. Springer, Singapore. https://doi.org/10.1007/978-981-99-8552-4_19


  • DOI: https://doi.org/10.1007/978-981-99-8552-4_19

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8551-7

  • Online ISBN: 978-981-99-8552-4

  • eBook Packages: Computer Science, Computer Science (R0)
