Abstract
This paper addresses video encoding with a machine learning-based videoframe approximator. Neural networks and hierarchical classifiers are considered as the basis for such an approximator. Using a machine learning-based hierarchical classifier, the approximator switches at each point of a videoframe between elementary approximators from a predefined set. Convolutional filters with parametric orthogonal kernels serve as the elementary approximators. An algorithm for optimizing the hierarchical classifier is described. It is based on recursive recalculation of an entropy quality index, which provides a good approximation of the encoded-data size. The approximator is intended for a video codec that uses nested representations of videoframes. Computational experiments on real video sequences indicate that the videoframe approximator with a hierarchical classifier built on parametric orthogonal kernels noticeably reduces the size of the encoded-data array.
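To make the role of an entropy quality index concrete, the following minimal sketch estimates the encoded size of prediction residuals as N times their empirical Shannon entropy and uses that estimate to choose between two candidate approximators. Everything in it is an illustrative assumption rather than the paper's method: the names `entropy_bits`, `predict_left`, and `predict_up` are hypothetical, the two one-pixel predictors merely stand in for the convolutional filters with parametric orthogonal kernels, and the choice is made once per frame instead of per point by a hierarchical classifier.

```python
import math
from collections import Counter


def entropy_bits(residuals):
    """Estimated size in bits: N * H, with H the empirical Shannon entropy."""
    n = len(residuals)
    if n == 0:
        return 0.0
    counts = Counter(residuals)
    return -sum(c * math.log2(c / n) for c in counts.values())


# Hypothetical stand-ins for elementary approximators (not the paper's filters).
def predict_left(frame, r, c):
    return frame[r][c - 1]


def predict_up(frame, r, c):
    return frame[r - 1][c]


def residuals(frame, predictor):
    """Approximation errors over all pixels that have both neighbours."""
    h, w = len(frame), len(frame[0])
    return [frame[r][c] - predictor(frame, r, c)
            for r in range(1, h) for c in range(1, w)]


# Toy frame: every row is identical, so the vertical predictor is exact.
frame = [[3, 7, 7, 12, 5, 9] for _ in range(6)]

candidates = {"left": predict_left, "up": predict_up}
scores = {name: entropy_bits(residuals(frame, p))
          for name, p in candidates.items()}
best = min(scores, key=scores.get)  # candidate with the smallest estimated size
print(scores, best)
```

On this toy frame the vertical predictor leaves all-zero residuals, so its estimated size is zero bits and it is selected. A per-point scheme would apply the same comparison within each class of pixels that the classifier distinguishes.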
Funding
The work was supported by the Russian Science Foundation, project no. 22-21-00662.
Ethics declarations
The author of this work declares that he has no conflicts of interest.
Additional information
Publisher’s Note.
Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Gashnikov, M.V. Video Codec Using Machine Learning Based on Parametric Orthogonal Filters. Opt. Mem. Neural Networks 32, 226–232 (2023). https://doi.org/10.3103/S1060992X23040021