Overview of Research in the field of Video Compression using Deep Neural Networks

Birman, Raz; Segal, Yoram; Hadar, Ofer

doi:10.1007/s11042-019-08572-3

Published: 07 January 2020

Volume 79, pages 11699–11722, (2020)

Multimedia Tools and Applications

Abstract

Deep Neural Networks (DNNs) have emerged in recent years as a best-of-breed alternative for performing various classification, prediction and identification tasks in images and other fields of study. In the last few years, various research groups have been exploring the option of harnessing them to improve video coding, with the primary purpose of reducing the compressed bit rate while retaining the same video quality. Evolving neural-network-based video coding research efforts are focused on two different directions: (1) improving existing video codecs by performing better predictions that are incorporated within the same codec framework, and (2) holistic methods of end-to-end image/video compression schemes. While some of the results are promising and the prospects are good, no breakthrough has been reported as of yet. This paper provides an overview of state-of-the-art research work, providing examples of a few prominent publications that illustrate and further explain the different highlighted topics in the field of using DNNs for video compression. Our conclusion is that the benefits have not been fully explored yet and additional work is expected to accomplish the next generation of neural-network-based codecs.

Notes

A heat map image that reflects the movement magnitude and direction of individual pixels between consecutive video frames.
A basic processing unit of HEVC, equivalent to a block in previous standards (such as H.264).

Author information

Authors and Affiliations

Department of Communication Systems Engineering, School of Electrical and Computer Engineering, Ben-Gurion University, Be’er Sheva, Israel
Raz Birman, Yoram Segal & Ofer Hadar

Corresponding author

Correspondence to Raz Birman.

About this article

Cite this article

Birman, R., Segal, Y. & Hadar, O. Overview of Research in the field of Video Compression using Deep Neural Networks. Multimed Tools Appl 79, 11699–11722 (2020). https://doi.org/10.1007/s11042-019-08572-3

Received: 21 March 2019
Revised: 27 November 2019
Accepted: 06 December 2019
Published: 07 January 2020
Issue Date: May 2020
DOI: https://doi.org/10.1007/s11042-019-08572-3
