Content adaptive downsampling for low bitrate video coding

Multimedia Tools and Applications

Abstract

Video compression plays an essential role in many multimedia applications, and compression efficiency depends strongly on video content. In this paper, we propose a content adaptive downsampling scheme that exploits video content characteristics to improve coding efficiency at low bitrates. Specifically, we extract content-aware spatial-temporal features, namely normalized Spatial Information (SI), normalized Temporal Information (TI), and spatial masking, as perceptual features. A Support Vector Machine (SVM) is then used to predict the optimal coding configuration for each video, i.e., direct encoding or downsample encoding. Experimental results show that the proposed method achieves considerable bitrate savings for video sequences at different resolutions with low computational complexity.
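To make the SI and TI features concrete, the following is a minimal sketch of their unnormalized forms as defined in ITU-T Recommendation P.910: SI is the maximum over frames of the spatial standard deviation of the Sobel-filtered luma, and TI is the maximum over frames of the spatial standard deviation of the frame difference. The normalization, the spatial masking feature, and the SVM classifier used in the paper are not described in the abstract and are not reproduced here. The function names and the list-of-rows frame layout are assumptions made for illustration only.

```python
import math
from statistics import pstdev


def sobel_magnitude(frame):
    """Sobel gradient magnitude for each interior pixel of a 2-D luma frame.

    `frame` is a list of equal-length rows (hypothetical layout); border
    pixels are skipped, so the frame must be at least 3x3.
    """
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            out.append(math.hypot(gx, gy))
    return out


def spatial_information(frames):
    """SI: maximum over frames of the std. dev. of the Sobel magnitude."""
    return max(pstdev(sobel_magnitude(f)) for f in frames)


def temporal_information(frames):
    """TI: maximum over frame pairs of the std. dev. of the pixel difference."""
    per_pair = []
    for prev, cur in zip(frames, frames[1:]):
        diff = [c - p for row_p, row_c in zip(prev, cur)
                for p, c in zip(row_p, row_c)]
        per_pair.append(pstdev(diff))
    return max(per_pair) if per_pair else 0.0


# Tiny synthetic frames: a flat frame and a frame with one vertical edge.
flat = [[10] * 5 for _ in range(5)]
edge = [[0, 0, 0, 255, 255] for _ in range(5)]

si_flat = spatial_information([flat])
si_edge = spatial_information([edge])
ti_static = temporal_information([flat, flat])
ti_change = temporal_information([flat, edge])
```

In a pipeline like the one the abstract outlines, values such as these would be normalized and combined with a spatial masking measure into a feature vector for the classifier. A flat frame yields zero SI and a static pair yields zero TI, while edges and scene changes raise the respective values.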

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported in part by the NSFC under Grants 61901252, 62171002, 62071287, and 62020106011, and by the Science and Technology Commission of Shanghai Municipality under Grants 22ZR1424300 and 20DZ2290100.

Author information

Corresponding author

Correspondence to Chao Yang.

Ethics declarations

Conflicts of interest

All authors declare that there are no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qin, S., Yang, C. & An, P. Content adaptive downsampling for low bitrate video coding. Multimed Tools Appl 83, 26547–26563 (2024). https://doi.org/10.1007/s11042-023-16532-1

  • DOI: https://doi.org/10.1007/s11042-023-16532-1
