
Task-Aware Quantization Network for JPEG Image Compression

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12365)

Abstract

We propose to learn a deep neural network for JPEG image compression that predicts image-specific, optimized quantization tables fully compatible with the standard JPEG encoder and decoder. Moreover, our approach can learn task-specific quantization tables in a principled way by adjusting the objective function of the network. The main challenge in realizing this idea is that the encoder contains non-differentiable components, such as run-length encoding and Huffman coding, and that it is not straightforward to predict the probability distribution of the quantized image representations. We address these issues by learning a differentiable loss function that approximates bitrates using simple network blocks: two MLPs and an LSTM. We evaluate the proposed algorithm using multiple task-specific losses, two for semantic image understanding and two for conventional image compression, and demonstrate the effectiveness of our approach on the individual tasks.
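To make the role of a quantization table concrete, the following is a minimal pure-Python sketch of the step in a JPEG-style encoder where the table acts: an 8x8 block is transformed with a 2D DCT and each coefficient is divided by its table entry and rounded. It is not the network or the learned bitrate approximator described in the abstract. The names `dct2`, `quantize`, `count_nonzero`, the test block `ramp`, and the two flat tables are hypothetical choices for illustration, and the count of nonzero coefficients is only a crude stand-in for bitrate.

```python
import math

N = 8  # JPEG operates on 8x8 blocks


def dct2(block):
    """Orthonormal 2D DCT-II of an 8x8 block given as a list of lists."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, table):
    """Divide each DCT coefficient by its table entry and round.

    Rounding is piecewise constant, so this step passes no useful
    gradient back to the table entries.
    """
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]


def count_nonzero(quantized):
    """Crude proxy for coding cost (an assumption, not an actual bitrate)."""
    return sum(1 for row in quantized for c in row if c != 0)


# Hypothetical inputs: a level-shifted intensity ramp and two flat tables.
ramp = [[16 * x + 8 * y - 128 for y in range(N)] for x in range(N)]
fine_table = [[4] * N for _ in range(N)]     # small steps, more detail kept
coarse_table = [[32] * N for _ in range(N)]  # large steps, more zeros

coeffs = dct2(ramp)
print("nonzero (fine):  ", count_nonzero(quantize(coeffs, fine_table)))
print("nonzero (coarse):", count_nonzero(quantize(coeffs, coarse_table)))
```

Because a standard decoder only needs the table to invert this step, any table produced by a predictor stays compatible with existing JPEG software. A coarser table can only turn more coefficients into zeros, which is the trade-off between size and fidelity that a learned, image-specific table would be tuning.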

Keywords

JPEG image compression, Adaptive quantization, Bitrate approximation

Notes

Acknowledgments

This work was partly supported by Kakao and Kakao Brain Corporation, and IITP grant funded by the Korea government (MSIT) (2016-0-00563, 2017-0-01779). We also thank Hyeonwoo Noh for fruitful discussions.

Supplementary material

Supplementary material 1: 504476_1_En_19_MOESM1_ESM.pdf (PDF, 1222 KB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of ECE and ASRI, Seoul National University, Seoul, Korea
