An IBC Reference Block Enhancement Model Based on GAN for Screen Content Video Coding

Yang, Pengjian; Wang, Jun; Zhong, Guangyu; Zhang, Pengyuan; Zhang, Lai; Liang, Fan; Yang, Jianxin

doi:10.1007/978-3-030-98355-0_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13142)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

As a special kind of video coding, screen content coding (SCC) has received widespread attention because of the popularity of online classes and conferences. However, few works have used neural networks to improve the compression efficiency of SCC. Intra block copy (IBC), one of the most important coding tools in SCC, can save half of the bitrate. Because IBC copies the content of a reference block, its performance largely depends on the quality of that block. In the standard encoding process of Versatile Video Coding (VVC), the IBC reference block is not filtered and still contains serious compression artifacts, which reduces IBC search accuracy and SCC compression efficiency. Inspired by in-loop filtering, we propose an IBC reference block enhancement network based on GAN (IREGAN) that filters the reference blocks before IBC estimation, improving both the quality of the IBC reference blocks and the accuracy of IBC matching. In addition to the generator used for image enhancement, our model includes a variance-based classifier and a discriminator obtained from adversarial training. The classifier effectively improves the efficiency of the model, and the discriminator improves the robustness of the entire system. Experimental results with VTM10.0 demonstrate the performance gains of IREGAN: about 6.98% BDBR reduction and 0.71 dB BDPSNR gain on average (luminance). SSIM increases by 0.0113, and the number of blocks using IBC mode increases by 1.42%.
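The abstract does not say how the variance-based classifier decides which reference blocks to process. As a rough illustration only, the sketch below assumes that a block is passed on for enhancement when the variance of its luma samples exceeds a threshold. The function names, the threshold value, and the direction of the test are all hypothetical and are not taken from the paper.

```python
from statistics import pvariance


def block_variance(block):
    """Population variance of all luma samples in a 2-D block (list of rows)."""
    samples = [sample for row in block for sample in row]
    return pvariance(samples)


def select_blocks_for_enhancement(blocks, threshold=100.0):
    """Return indices of blocks whose variance exceeds the threshold.

    Assumption (not from the paper): textured blocks have high variance and
    are worth enhancing, while flat blocks are skipped to save computation.
    """
    return [i for i, block in enumerate(blocks)
            if block_variance(block) > threshold]


# Two toy 4x4 luma blocks: a flat region and a sharp-edged, text-like one.
flat = [[128] * 4 for _ in range(4)]
text_like = [[0, 255, 0, 255] for _ in range(4)]

selected = select_blocks_for_enhancement([flat, text_like])
print(selected)  # indices of blocks that would be sent to the generator
```

Under this assumption, gating on variance lets the cheap test run on every block while the costlier network runs on only a subset, which is one plausible reading of how a classifier could improve the efficiency of the model.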

This work was supported by Key-Area R&D Program of Guangdong Province under Grant 2019B010135002, and Innovative & Enterprising Team of Zhuhai under Grant 2019ZHCDGY07.

Author information

Authors and Affiliations

Electronics and Information Technology, Sun Yat-sen University, Guangzhou, China
Pengjian Yang, Guangyu Zhong, Lai Zhang & Fan Liang
Zhuhai Jieli Technology Co., Ltd., Zhuhai, China
Jun Wang & Jianxin Yang
Wuhan Research Institute of Posts and Telecommunications, Wuhan, China
Pengyuan Zhang

Corresponding author

Correspondence to Jun Wang.

Editor information

Editors and Affiliations

IT University of Copenhagen, Copenhagen, Denmark
Björn Þór Jónsson
Dublin City University, Dublin, Ireland
Cathal Gurrin
University of Science, VNU-HCM, Ho Chi Minh City, Vietnam
Minh-Triet Tran
University of Bergen, Bergen, Norway
Duc-Tien Dang-Nguyen
National Tsing Hua University, Hsinchu, Taiwan
Anita Min-Chun Hu
Hanoi University of Science and Technology, Hanoi, Vietnam
Binh Huynh Thi Thanh
Median Technologies, Valbonne, France
Benoit Huet

About this paper

Cite this paper

Yang, P. et al. (2022). An IBC Reference Block Enhancement Model Based on GAN for Screen Content Video Coding. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_2

DOI: https://doi.org/10.1007/978-3-030-98355-0_2
Published: 15 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98354-3
Online ISBN: 978-3-030-98355-0
eBook Packages: Computer Science, Computer Science (R0)
