Deep Neural Network Based Frame Reconstruction for Optimized Video Coding

  • Conference paper
Artificial Intelligence and Mobile Services – AIMS 2018 (AIMS 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10970)

Abstract

Video coding is a key enabling technology behind the explosion in online video sharing and consumption. Applications include live video streaming, online video sharing, video conferencing, video surveillance, remote medicine, online education, online gaming, video broadcasting, cloud video services, and many others. AV1, the recently released open-source, royalty-free video coding standard designed and developed by the Alliance for Open Media (AOM), achieves a 30%–40% data rate reduction relative to previous-generation video coding standards, including VP9 and HEVC. This paper outlines paradigms that may provide further coding performance gains over AV1. Image restoration has already proven effective at enhancing coding performance in AV1. In the same vein, this paper describes techniques that optimize frame reconstruction using deep neural networks (DNNs) to further improve coding performance. Initial explorations of the proposed approach have shown promising results.
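The abstract does not specify the network or the evaluation procedure, so the following is only a minimal sketch of how a restoration step applied to a decoded frame might be scored. It assumes PSNR against the source frame as the quality measure, and it uses a 3x3 mean filter purely as a placeholder for a trained network; the function names `psnr`, `mean_filter_3x3`, and `restoration_gain` are hypothetical and do not come from the paper.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D frames."""
    total = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def mean_filter_3x3(frame):
    """Placeholder restoration step (assumption): 3x3 mean filter with
    edge replication. This is NOT a DNN; it only stands in for one."""
    height, width = len(frame), len(frame[0])
    restored = []
    for y in range(height):
        row = []
        for x in range(width):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    acc += frame[yy][xx]
            # Clip to the valid 8-bit sample range.
            row.append(min(255.0, max(0.0, acc / 9.0)))
        restored.append(row)
    return restored


def restoration_gain(source, decoded, restore):
    """PSNR change in dB from applying `restore` to the decoded frame."""
    return psnr(source, restore(decoded)) - psnr(source, decoded)


# Toy data (assumption): a flat source frame and a decoded frame that
# carries a checkerboard error of +/-4 sample levels.
source = [[100.0] * 4 for _ in range(4)]
decoded = [[100.0 + (4.0 if (y + x) % 2 == 0 else -4.0) for x in range(4)]
           for y in range(4)]
gain_db = restoration_gain(source, decoded, mean_filter_3x3)
```

A positive `gain_db` means the restored frame is closer to the source than the decoded frame was. In a codec, a gain of this kind would still have to be weighed against any side information and computation the restoration step costs.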



Author information

Corresponding author

Correspondence to Dandan Ding.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Ding, D., Liu, P., Chen, Y., Zhu, Z., Liu, Z., Bankoski, J. (2018). Deep Neural Network Based Frame Reconstruction for Optimized Video Coding. In: Aiello, M., Yang, Y., Zou, Y., Zhang, LJ. (eds) Artificial Intelligence and Mobile Services – AIMS 2018. AIMS 2018. Lecture Notes in Computer Science, vol 10970. Springer, Cham. https://doi.org/10.1007/978-3-319-94361-9_18

  • DOI: https://doi.org/10.1007/978-3-319-94361-9_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94360-2

  • Online ISBN: 978-3-319-94361-9

  • eBook Packages: Computer Science, Computer Science (R0)
