Semantic-oriented learning-based image compression by Only-Train-Once quantized autoencoders

  • Original Paper
Signal, Image and Video Processing

Abstract

Access to large training datasets, together with recent advances in computing power, has spurred interest in applying deep learning to image compression. Such approaches typically need to train and deploy a separate network for each target rate, which is impractical and expensive in terms of memory cost and power consumption, especially for broad bitrate ranges. To address this limitation, variable-rate compression methods use the Lagrange multiplier to control the rate/distortion trade-off so that the neural network does not have to be retrained for each rate. However, they do not optimize bit allocation for eye-catching foreground details, nor do they account for the different degrees of attention that the human eye pays to each area of the image. Other deep learning-based image compression approaches, which could outperform the above ones, therefore rely on additional information. In this paper, we present a loss-conditional autoencoder tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy variable-rate compression. Our framework is a neural network-based scheme that automatically optimizes coding parameters with a multi-term perceptual loss function based on a semantic-importance structural similarity (SSIM) index. To ensure rate adaptation, we propose modulating the compression network according to the bitwidth of its activations by quantizing them at several bitwidth values. Experiments on the JPEG AI dataset show that our method achieves competitive or higher visual quality at the same compressed size compared with conventional codecs and related work.
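As a rough illustration of the rate-adaptation idea mentioned in the abstract (quantizing activations at several bitwidths), the following minimal sketch applies a plain uniform quantizer to a handful of values. It is not the authors' implementation: the function name `quantize_activations`, the assumption that activations are normalized to [0, 1], and the choice of a uniform quantizer are all hypothetical, since the abstract does not specify the quantizer or how the network is conditioned on the bitwidth.

```python
def quantize_activations(values, bitwidth):
    """Uniformly quantize values to 2**bitwidth levels.

    Assumption (not from the paper): activations are normalized to [0, 1];
    anything outside that range is clipped before quantization.
    """
    if bitwidth < 1:
        raise ValueError("bitwidth must be at least 1")
    levels = 2 ** bitwidth - 1  # number of quantization steps
    quantized = []
    for v in values:
        v = min(max(v, 0.0), 1.0)                   # clip to [0, 1]
        quantized.append(round(v * levels) / levels)  # snap to nearest level
    return quantized


if __name__ == "__main__":
    # Hypothetical activations; a smaller bitwidth costs fewer bits per
    # value (lower rate) but gives a coarser reconstruction.
    activations = [0.0, 0.12, 0.5, 0.87, 1.0]
    for bits in (2, 4, 8):
        print(bits, quantize_activations(activations, bits))
```

Each activation costs `bitwidth` bits before entropy coding, so sweeping the bitwidth gives a simple handle on the rate/distortion trade-off without changing the network weights.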

Notes

  1. A learning-based image coding ad hoc group created by the JPEG Committee to assess and develop image coding solutions following an end-to-end learning-based coding approach.

Author information

Corresponding author

Correspondence to D. Sebai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sebai, D., Shah, A.U. Semantic-oriented learning-based image compression by Only-Train-Once quantized autoencoders. SIViP 17, 285–293 (2023). https://doi.org/10.1007/s11760-022-02231-1

  • DOI: https://doi.org/10.1007/s11760-022-02231-1
