
Image compression with learned lifting-based DWT and learned tree-based entropy models

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

This paper explores learned image compression based on traditional and learned discrete wavelet transform (DWT) architectures and learned entropy models for coding DWT subband coefficients. A learned DWT is obtained through the lifting scheme with learned nonlinear predict and update filters. Several learned entropy models of varying computational complexity are explored to exploit inter- and intra-subband coefficient dependencies, akin to the traditional EZW, SPIHT, and EBCOT algorithms. Experimental results show that when the explored learned entropy models are combined with traditional wavelet filters, such as the CDF 9/7 filters, the compression performance far exceeds that of JPEG2000. When the learned entropy models are combined with the learned DWT, compression performance increases further. The computations in the learned DWT and in all entropy models except one can be easily parallelized, so the systems provide practical encoding and decoding times on GPUs, unlike other DWT-based learned compression systems in the literature.
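The transform idea in the abstract can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation (their code is at [49]); it assumes PyTorch, and the module and filter names (LiftingStep, predict, update) are illustrative. It shows the key property of the lifting scheme: perfect reconstruction holds even when the predict and update filters are nonlinear neural networks, because each lifting step is inverted exactly by reversing the order and signs of its operations.

```python
import torch
import torch.nn as nn

class LiftingStep(nn.Module):
    """One 1D lifting step with learned nonlinear predict/update filters."""

    def __init__(self, channels=1, hidden=16, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # Predict filter P(.): estimates odd samples from even samples.
        self.predict = nn.Sequential(
            nn.Conv1d(channels, hidden, kernel_size, padding=pad), nn.Tanh(),
            nn.Conv1d(hidden, channels, kernel_size, padding=pad))
        # Update filter U(.): adjusts even samples using the detail signal.
        self.update = nn.Sequential(
            nn.Conv1d(channels, hidden, kernel_size, padding=pad), nn.Tanh(),
            nn.Conv1d(hidden, channels, kernel_size, padding=pad))

    def forward(self, x):
        even, odd = x[..., 0::2], x[..., 1::2]   # lazy wavelet split
        detail = odd - self.predict(even)        # high-pass (detail) band
        approx = even + self.update(detail)      # low-pass (approx) band
        return approx, detail

    def inverse(self, approx, detail):
        # Exact inversion: reverse the two steps and flip the signs.
        even = approx - self.update(detail)
        odd = detail + self.predict(even)
        x = even.new_zeros(*even.shape[:-1], 2 * even.shape[-1])
        x[..., 0::2], x[..., 1::2] = even, odd
        return x

step = LiftingStep()
x = torch.randn(1, 1, 64)                        # even-length signal
approx, detail = step(x)
assert torch.allclose(step.inverse(approx, detail), x, atol=1e-5)
```

A 2D DWT would apply such steps along rows and columns to produce the LL, LH, HL, and HH subbands; the sketch keeps one dimension for brevity.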



Data availability

The datasets analyzed during the current study are available from [48] (training) and [39] (testing).

Notes

  1. Available at https://github.com/uberkk/ImageCompressionLearnedLiftingandLearnedTreeBasedModels

  2. Encoding/decoding times for JPEG2000 and the two systems with IISCEM are obtained on a CPU (due to their sequential encoding/decoding requirement), while all others are obtained on a GPU.
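Footnote 2 reflects a general property of autoregressive (context-based) entropy models: each coefficient's probability parameters depend on previously decoded neighbours, so decoding is inherently serial and cannot be batched on a GPU. A minimal sketch of that serial loop follows; `context_model` and `arith_decoder.decode_symbol` are hypothetical placeholders, not the paper's IISCEM interface.

```python
# Why a spatial-context (autoregressive) entropy model decodes sequentially:
# each coefficient's distribution parameters depend on already-decoded
# causal neighbours, so positions cannot be processed in parallel.
def decode_subband(arith_decoder, context_model, height, width):
    coeffs = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Hypothetical call: parameters from the decoded causal context...
            mean, scale = context_model(coeffs, y, x)
            # ...then one symbol is arithmetic-decoded with those parameters.
            coeffs[y][x] = arith_decoder.decode_symbol(mean, scale)
    return coeffs
```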

References

  1. Jiao, L., Zhao, J.: A survey on the new generation of deep learning in image processing. IEEE Access 7, 172231–172263 (2019)

  2. Steinmetz, R.: Data compression in multimedia computing-standards and systems. Multimed. Syst. 1(5), 187–204 (1994)

  3. Pennebaker, W.B., Mitchell, J.L.: JPEG: Still Image Data Compression Standard. Springer (1992)

  4. Rabbani, M., Joshi, R.: An overview of the JPEG 2000 still image compression standard. Signal Process. Image Commun. 17(1), 3–48 (2002)

  5. Christopoulos, C., Skodras, A., Ebrahimi, T.: The JPEG2000 still image coding system: an overview. IEEE Trans. Consum. Electron. 46(4), 1103–1127 (2000). https://doi.org/10.1109/30.920468

  6. Lainema, J., Hannuksela, M.M., Vadakital, V.K., Aksu, E.B.: HEVC still image coding and high efficiency image file format. In: 2016 IEEE International Conference on Image Processing (ICIP), pp. 71–75 (2016). https://doi.org/10.1109/ICIP.2016.7532321

  7. C.C. (Netflix): AV1 Image File Format (AVIF). Last accessed 26 February 2023. http://www.aomediacodec.github.io

  8. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016). http://www.deeplearningbook.org

  9. Goyal, V.K.: Theoretical foundations of transform coding. IEEE Signal Process. Mag. 18(5), 9–21 (2001)

  10. Ahmed, N., Natarajan, T., Rao, K.R.: Discrete cosine transform. IEEE Trans. Comput. 100(1), 90–93 (1974)

  11. Han, J., Saxena, A., Melkote, V., Rose, K.: Jointly optimized spatial prediction and block transform for video and image coding. IEEE Trans. Image Process. 21(4), 1874–1884 (2011)

  12. Kamisli, F.: Block-based spatial prediction and transforms based on 2D Markov processes for image and video compression. IEEE Trans. Image Process. 24(4), 1247–1260 (2015)

  13. Ballé, J., Laparra, V., Simoncelli, E.P.: End-to-end optimized image compression. arXiv preprint arXiv:1611.01704 (2016)

  14. Ballé, J., Minnen, D., Singh, S., Hwang, S.J., Johnston, N.: Variational image compression with a scale hyperprior. arXiv preprint arXiv:1802.01436 (2018)

  15. Hilton, M.L., Jawerth, B.D., Sengupta, A.: Compressing still and moving images with wavelets. Multimed. Syst. 2, 218–227 (1994)

  16. Geetha, V., Anbumani, V., Murugesan, G., Gomathi, S.: Hybrid optimal algorithm-based 2D discrete wavelet transform for image compression using fractional KCA. Multimed. Syst. 26, 687–702 (2020)

  17. Buccigrossi, R.W., Simoncelli, E.P.: Image compression via joint statistical characterization in the wavelet domain. IEEE Trans. Image Process. 8(12), 1688–1701 (1999)

  18. Liu, Z., Karam, L.J.: Quantifying the intra and inter subband correlations in the zerotree-based wavelet image coders. In: Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1730–1734 (2002). https://doi.org/10.1109/ACSSC.2002.1197071

  19. Shapiro, J.M.: Embedded image coding using zerotrees of wavelet coefficients. IEEE Trans. Signal Process. 41(12), 3445–3462 (1993)

  20. Said, A., Pearlman, W.A.: A new, fast, and efficient image codec based on set partitioning in hierarchical trees. IEEE Trans. Circuits Syst. Video Technol. 6(3), 243–250 (1996)

  21. Taubman, D.: High performance scalable image compression with EBCOT. IEEE Trans. Image Process. 9(7), 1158–1170 (2000)

  22. Ma, H., Liu, D., Yan, N., Li, H., Wu, F.: End-to-end optimized versatile image compression with wavelet-like transform. IEEE Trans. Pattern Anal. Mach. Intell. 44, 1247 (2020)

  23. Minnen, D., Ballé, J., Toderici, G.D.: Joint autoregressive and hierarchical priors for learned image compression. Adv. Neural Inf. Process. Syst. (2018). https://doi.org/10.48550/arXiv.1809.02736

  24. Sweldens, W.: The lifting scheme: A construction of second generation wavelets. SIAM J. Math. Anal. 29(2), 511–546 (1998). https://doi.org/10.1137/S0036141095289051

  25. Daubechies, I., Sweldens, W.: Factoring wavelet transforms into lifting steps. J. Fourier Anal. Appl. 4(3), 247–269 (1998)

  26. Cohen, A., Daubechies, I., Feauveau, J.-C.: Biorthogonal bases of compactly supported wavelets. Commun. Pure Appl. Math. 45(5), 485–560 (1992)

  27. Dragotti, P.L., Vetterli, M.: Wavelet footprints: theory, algorithms, and applications. IEEE Trans. Signal Process. 51(5), 1306–1323 (2003)

  28. Dragotti, P.L., Vetterli, M.: Footprints and edgeprints for image denoising and compression. In: Proceedings 2001 International Conference on Image Processing (Cat. No. 01CH37205), vol. 2, pp. 237–240 (2001). IEEE

  29. Dragotti, P.L., Vetterli, M.: Deconvolution with wavelet footprints for ill-posed inverse problems. IEEE Int. Conf. Acoust. Speech Signal Process. 2, 1257 (2002)

  30. Zhao, X., Huang, P., Shu, X.: Wavelet-attention CNN for image classification. Multimed. Syst. 28(3), 915–924 (2022)

  31. Brahimi, T., Khelifi, F., Laouir, F., Kacha, A.: A new, enhanced EZW image codec with subband classification. Multimed. Syst. 28(1), 1–19 (2022)

  32. Cheng, Z., Sun, H., Takeuchi, M., Katto, J.: Learned image compression with discretized Gaussian mixture likelihoods and attention modules. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7939–7948 (2020)

  33. Yılmaz, M.A., Keleş, O., Güven, H., Tekalp, A.M., Malik, J., Kıranyaz, S.: Self-organized variational autoencoders (self-VAE) for learned image compression. In: 2021 IEEE International Conference on Image Processing (ICIP), pp. 3732–3736 (2021). IEEE

  34. Lu, M., Guo, P., Shi, H., Cao, C., Ma, Z.: Transformer-based image compression. arXiv preprint arXiv:2111.06707 (2021)

  35. Minnen, D., Singh, S.: Channel-wise autoregressive entropy models for learned image compression. In: 2020 IEEE International Conference on Image Processing (ICIP), pp. 3339–3343 (2020). IEEE

  36. He, D., Yang, Z., Peng, W., Ma, R., Qin, H., Wang, Y.: ELIC: Efficient learned image compression with unevenly grouped space-channel contextual adaptive coding. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5718–5727 (2022)

  37. Kim, J.-H., Heo, B., Lee, J.-S.: Joint global and local hierarchical priors for learned image compression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5992–6001 (2022)

  38. Ma, H., Liu, D., Xiong, R., Wu, F.: iWave: CNN-based wavelet-like transform for image compression. IEEE Trans. Multimed. 22(7), 1667–1679 (2019)

  39. Eastman Kodak: Kodak Lossless True Color Image Suite (PhotoCD PCD0992). Last accessed 2 February 2023. http://r0k.us/graphics/kodak

  40. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  41. Ballé, J.: Efficient nonlinear transforms for lossy image compression. In: 2018 Picture Coding Symposium (PCS), pp. 248–252 (2018). IEEE

  42. Marcellin, M.W., Lepley, M.A., Bilgin, A., Flohr, T.J., Chinen, T.T., Kasner, J.H.: An overview of quantization in JPEG 2000. Signal Process. Image Commun. 17(1), 73–84 (2002)

  43. Ballé, J., Laparra, V., Simoncelli, E.P.: Density modeling of images using a generalized normalization transformation. arXiv preprint arXiv:1511.06281 (2015)

  44. Bégaint, J., Racapé, F., Feltman, S., Pushparaja, A.: CompressAI: a PyTorch library and evaluation platform for end-to-end compression research. arXiv preprint arXiv:2011.03029 (2020)

  45. Chilinski, P., Silva, R.: Neural likelihoods via cumulative distribution functions. In: Conference on Uncertainty in Artificial Intelligence, pp. 420–429 (2020). PMLR

  46. Van den Oord, A., Kalchbrenner, N., Espeholt, L., Vinyals, O., Graves, A., et al.: Conditional image generation with PixelCNN decoders. Adv. Neural Inf. Process. Syst. 29 (2016)

  47. Salimans, T., Karpathy, A., Chen, X., Kingma, D.P.: PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications. arXiv preprint arXiv:1701.05517 (2017)

  48. Xue, T., Chen, B., Wu, J., Wei, D., Freeman, W.T.: Video enhancement with task-oriented flow. arXiv preprint arXiv:1711.09078 (2017)

  49. Sahin, U.B., Kamisli, F.: Learned-DWT-and-Tree-based-Entropy-Models. Last accessed 26 February 2023. https://github.com/uberkk/ImageCompressionLearnedLiftingandLearnedTreeBasedModels

  50. Ballé, J., Laparra, V., Simoncelli, E.P.: End-to-end optimization of nonlinear transform codes for perceptual quality. In: 2016 Picture Coding Symposium (PCS), pp. 1–5 (2016). IEEE

  51. Pakdaman, F., Gabbouj, M.: Comprehensive complexity assessment of emerging learned image compression on CPU and GPU. In: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5 (2023). IEEE

  52. Sovrasov, V.: ptflops: a FLOPs counting tool for neural networks in PyTorch framework. https://github.com/sovrasov/flops-counter.pytorch


Funding

No funding was received for conducting this study.

Author information

Corresponding authors

Correspondence to Ugur Berk Sahin or Fatih Kamisli.

Ethics declarations

Code availability

The code to reproduce the results in this paper is available from the authors on GitHub [49].

Additional information

Communicated by Q. Shen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

The following parameters are used in the experimental results. In Fig. 7(a), \(ch=32\) is used for processing the LL subband and \(ch=96\) for processing the LH, HL, and HH subbands together. In Fig. 7(b), \(ch=32\) is used for each subband. In Fig. 8, \(ch=243\) is used for jointly processing the LH, HL, and HH subbands, and the output gives the mean and scale for the corresponding three channels. In Fig. 9, \(ch=243\) is used on the right-hand side for jointly processing the LH, HL, and HH subbands, and \(ch/3\) is used on the left-hand side for processing each of the LH, HL, and HH subbands separately (a total of 243 channels). In Fig. 10, \(ch=162\) is used for processing each of the LH, HL, and HH subbands separately. In Fig. 11, \(ch=81\) is used for processing each of the LL, LH, HL, and HH subbands. In Fig. 13, \(ch=32\) is used. Our code is available on GitHub [49].
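For readability, the channel counts above can be collected into a single structure. This is a hypothetical recap only; the dictionary layout and key names are illustrative and not taken from the authors' repository [49].

```python
# Hypothetical summary of the channel counts (ch) listed in Appendix A;
# layout and key names are illustrative, not from the code in [49].
ENTROPY_MODEL_CHANNELS = {
    "Fig. 7(a)": {"LL": 32, "LH,HL,HH (joint)": 96},
    "Fig. 7(b)": {"each subband": 32},
    "Fig. 8":    {"LH,HL,HH (joint)": 243},  # outputs mean and scale per band
    "Fig. 9":    {"joint branch": 243, "per-subband branch": 243 // 3},
    "Fig. 10":   {"each of LH, HL, HH": 162},
    "Fig. 11":   {"each of LL, LH, HL, HH": 81},
    "Fig. 13":   {"ch": 32},
}
```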

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sahin, U.B., Kamisli, F. Image compression with learned lifting-based DWT and learned tree-based entropy models. Multimedia Systems 29, 3369–3384 (2023). https://doi.org/10.1007/s00530-023-01192-w

