
Video Codec Using Machine Learning Based on Parametric Orthogonal Filters

Optical Memory and Neural Networks

Abstract

The research deals with video encoding using a machine learning-based videoframe approximator. The use of neural networks and hierarchical classifiers is considered in the context of this approximator. Guided by a machine learning-based hierarchical classifier, the approximator switches at each point of a videoframe between elementary approximators drawn from a predefined set. Convolutional filters with parametric orthogonal kernels serve as the elementary approximators. An algorithm for optimizing the hierarchical classifier is considered; it is based on recursive recalculation of an entropy quality index, which provides a good approximation of the encoded-data size. The videoframe approximator is intended for a video codec that uses nested representations of videoframes. Computational experiments on real video sequences indicate that the approximator with a hierarchical classifier employing parametric orthogonal kernels yields a noticeable reduction in the size of the encoded-data array.
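
The pipeline described in the abstract can be illustrated with a short sketch. The Python fragment below is a hypothetical, simplified illustration and not the codec described in the paper: it uses 2x2 Givens-rotation matrices as parametric orthogonal kernels, a toy median-threshold splitter in place of the trained hierarchical classifier, and the first-order entropy of the quantized residual as the entropy quality index approximating the encoded-data size. All function names, parameters, and thresholds are assumptions introduced for illustration.

import numpy as np
from scipy.ndimage import convolve


def orthogonal_kernel(theta):
    """2x2 Givens rotation used as a parametric orthogonal kernel."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])


def entropy_bits(residual, step=4.0):
    """Entropy quality index: first-order entropy (in bits) of the quantized
    residual, a rough proxy for the size of the encoded data."""
    q = np.round(residual / step).astype(np.int64)
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum() * q.size)


def hierarchical_select(residuals, feature, mask, depth=4):
    """Toy hierarchical classifier: assign one approximator to the node, then
    try splitting the node by a local-feature threshold and keep the split
    only if the recursively recomputed entropy index becomes smaller."""
    costs = [entropy_bits(r[mask]) for r in residuals]
    k = int(np.argmin(costs))
    labels, cost = np.where(mask, k, -1), costs[k]
    if depth == 0 or mask.sum() < 64:
        return labels, cost
    thr = np.median(feature[mask])
    left, right = mask & (feature <= thr), mask & (feature > thr)
    if not left.any() or not right.any():
        return labels, cost
    l_lab, l_cost = hierarchical_select(residuals, feature, left, depth - 1)
    r_lab, r_cost = hierarchical_select(residuals, feature, right, depth - 1)
    if l_cost + r_cost < cost:          # split pays off in estimated bits
        labels = np.where(left, l_lab, np.where(right, r_lab, -1))
        cost = l_cost + r_cost
    return labels, cost


def approximate_frame(frame, thetas=(0.0, np.pi / 8, np.pi / 4)):
    """Per-pixel switching between elementary approximators, each of which is
    a convolution of the frame with a parametric orthogonal kernel."""
    frame = frame.astype(np.float64)
    residuals = [frame - convolve(frame, orthogonal_kernel(t), mode="nearest")
                 for t in thetas]
    gy, gx = np.gradient(frame)         # simple local feature for splitting
    feature = np.hypot(gx, gy)
    mask = np.ones(frame.shape, dtype=bool)
    return hierarchical_select(residuals, feature, mask)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    test_frame = rng.integers(0, 256, size=(128, 128)).astype(np.float64)
    labels, bits = approximate_frame(test_frame)
    print(f"estimated residual coding cost: {bits / 8:.0f} bytes")

In this sketch a split is retained only when the children's recomputed entropy indices sum to fewer estimated bits than coding the whole node with a single approximator, mirroring the recursive recalculation of the quality index mentioned above; the codec in the paper optimizes a trained hierarchical classifier rather than this fixed median split.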




Funding

The work was supported by the Russian Science Foundation, project no. 22-21-00662.

Author information

Corresponding author

Correspondence to M. V. Gashnikov.

Ethics declarations

The author of this work declares that he has no conflicts of interest.

Additional information

Publisher’s Note.

Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Gashnikov, M.V. Video Codec Using Machine Learning Based on Parametric Orthogonal Filters. Opt. Mem. Neural Networks 32, 226–232 (2023). https://doi.org/10.3103/S1060992X23040021

