Semantic-oriented learning-based image compression by Only-Train-Once quantized autoencoders

Sebai, D.; Shah, A. Ulah

doi:10.1007/s11760-022-02231-1

Original Paper
Published: 29 April 2022

Signal, Image and Video Processing, volume 17, pages 285–293 (2023)

Abstract

Access to large training datasets, together with recent advances in computing power, has spurred interest in leveraging deep learning for image compression. Such methods typically require training and deploying a separate network for each target rate, which is impractical and expensive in terms of memory cost and power consumption, especially for broad bitrate ranges. To address this limitation, variable-rate compression methods use the Lagrange multiplier to control the rate/distortion trade-off, so that the neural network need not be retrained for each rate. However, they do not optimize bit allocation for eye-catching foreground details, nor do they account for the different degrees of attention that the human eye pays to each area of the image. Other deep learning-based image compression approaches, which can outperform the above ones, therefore rely on additional information. In this paper, we present a loss-conditional autoencoder tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy variable-rate compression. Our framework is a neural network-based scheme that automatically optimizes coding parameters with a multi-term perceptual loss function based on a semantic-important structural similarity (SSIM) index. To ensure rate adaptation, we propose modulating the compression network on the bitwidth of its activations by quantizing them according to several bitwidth values. Experiments on the JPEG AI dataset show that our method achieves competitive or higher visual quality for the same compressed size when compared with conventional codecs and related work.
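The idea of controlling rate through the bitwidth of the activations can be illustrated with a toy example. The sketch below is not the authors' network or quantizer: it assumes a plain uniform quantizer on activations already scaled to [0, 1], and the names `quantize_activations` and `max_abs_error` are hypothetical. It only shows that fewer bits give a coarser representation (lower rate, higher distortion), which is the trade-off a single bitwidth-conditioned network would move along.

```python
def quantize_activations(values, bitwidth):
    """Uniformly quantize values (assumed to lie in [0, 1]) to 2**bitwidth levels."""
    if bitwidth < 1:
        raise ValueError("bitwidth must be at least 1")
    steps = (1 << bitwidth) - 1  # number of intervals between quantization levels
    quantized = []
    for v in values:
        clipped = min(max(v, 0.0), 1.0)  # assumption: activations are bounded
        quantized.append(round(clipped * steps) / steps)
    return quantized


def max_abs_error(original, quantized):
    """Largest absolute reconstruction error, used here as a crude distortion measure."""
    return max(abs(o - q) for o, q in zip(original, quantized))


# Hypothetical activation values, for illustration only.
activations = [0.03, 0.21, 0.48, 0.50, 0.77, 0.99]

errors = {}
for bitwidth in (2, 4, 8):
    q = quantize_activations(activations, bitwidth)
    errors[bitwidth] = max_abs_error(activations, q)
    print(f"{bitwidth}-bit activations: max abs error = {errors[bitwidth]:.4f}")
```

In this toy setting the error of a b-bit quantizer is bounded by half a quantization step, 0.5 / (2^b - 1), so each additional bit roughly halves the worst-case distortion while increasing the number of bits spent per activation.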

Notes

A learning-based image coding ad hoc group created by the JPEG Committee to assess and develop image coding solutions following an end-to-end learning-based coding approach.

Author information

Authors and Affiliations

Cristal Laboratory, National School of Computer Sciences, University of Manouba, Manouba, Tunisia
D. Sebai
Faculty of Computer Science and Information Technology, University Tun Hussein Onn, Parit Raja, Johor, Malaysia
A. Ulah Shah

Corresponding author

Correspondence to D. Sebai.

About this article

Cite this article

Sebai, D., Shah, A.U. Semantic-oriented learning-based image compression by Only-Train-Once quantized autoencoders. SIViP 17, 285–293 (2023). https://doi.org/10.1007/s11760-022-02231-1

Received: 25 December 2021
Revised: 27 March 2022
Accepted: 29 March 2022
Published: 29 April 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s11760-022-02231-1
