Textual Image Compression for Maintaining or Improving the Recognition Performance

Circuits, Systems, and Signal Processing

Abstract

The present study investigates the compression of textual images at high compression ratios while preserving or improving the general quality, readability, and optical character recognition (OCR) performance of the compressed images. A novel textual image compression/decompression approach is proposed. The compression path consists of dynamic range reduction, a wavelet transform, and set partitioning in hierarchical trees (SPIHT) encoding. The decompression path consists of SPIHT decoding, the inverse wavelet transform, and the proposed image enhancement technique. The compression and recognition performance of the proposed approach is evaluated using quantitative and qualitative measures and compared with that of the JPEG2000, DjVu, and multi-dimensional multi-scale parser approaches. In addition to the conventional rate-distortion curve, the mean opinion score (MOS) is used, and the novel measures of “breakdown point” and “downfall slope” are defined. In terms of peak signal-to-noise ratio, the proposed approach achieves results similar to those of the compared approaches, but it considerably outperforms the other two approaches in average MOS, average recognition rate, and the newly defined measures.
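The order of stages in the two paths can be illustrated with a short sketch. This is not the method of the paper: the abstract does not specify the dynamic range reduction, the wavelet, or the enhancement technique, so the version below assumes uniform gray-level quantization, a single-level Haar transform, and a simple contrast stretch, and it replaces SPIHT coding with plain magnitude thresholding of the wavelet coefficients. All function names and parameters (`levels`, `fraction`) are hypothetical.

```python
import numpy as np

def reduce_dynamic_range(img, levels=16):
    # Assumed form: uniform quantization of 8-bit gray values to `levels` levels.
    step = 256 // levels
    return (img // step).astype(np.float64)

def haar2d(x):
    # One level of an orthonormal 2-D Haar transform (even-sized input assumed).
    s = np.sqrt(2.0)
    x = np.hstack([(x[:, 0::2] + x[:, 1::2]) / s, (x[:, 0::2] - x[:, 1::2]) / s])
    return np.vstack([(x[0::2, :] + x[1::2, :]) / s, (x[0::2, :] - x[1::2, :]) / s])

def ihaar2d(c):
    # Inverse of haar2d: undo the row step first, then the column step.
    s = np.sqrt(2.0)
    h, w = c.shape[0] // 2, c.shape[1] // 2
    x = np.empty_like(c)
    x[0::2, :] = (c[:h, :] + c[h:, :]) / s
    x[1::2, :] = (c[:h, :] - c[h:, :]) / s
    out = np.empty_like(c)
    out[:, 0::2] = (x[:, :w] + x[:, w:]) / s
    out[:, 1::2] = (x[:, :w] - x[:, w:]) / s
    return out

def keep_largest(c, fraction):
    # Stand-in for SPIHT (NOT SPIHT): keep roughly the largest-magnitude
    # `fraction` of the coefficients and zero out the rest.
    k = max(1, int(fraction * c.size))
    thresh = np.sort(np.abs(c).ravel())[-k]
    return np.where(np.abs(c) >= thresh, c, 0.0)

def stretch(x):
    # Assumed enhancement step: linear contrast stretch back to [0, 255].
    lo, hi = x.min(), x.max()
    if hi == lo:
        return np.zeros_like(x)
    return (x - lo) / (hi - lo) * 255.0

def compress_decompress(img, fraction=0.1, levels=16):
    # Compression path: range reduction -> wavelet -> coefficient selection.
    coeffs = keep_largest(haar2d(reduce_dynamic_range(img, levels)), fraction)
    # Decompression path: inverse wavelet -> enhancement.
    return stretch(ihaar2d(coeffs))

def psnr(ref, test, peak=255.0):
    # Peak signal-to-noise ratio in dB between a reference and a test image.
    mse = np.mean((ref.astype(np.float64) - test) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Synthetic "text-like" image: white page with dark horizontal strokes.
img = np.full((64, 64), 255, dtype=np.uint8)
img[8:56:8, 4:60] = 0

restored = compress_decompress(img, fraction=0.1)
print("PSNR (dB):", psnr(img, restored))
```

Sweeping `fraction` and recording the PSNR at each setting gives a rough analogue of a rate-distortion curve. The recognition-oriented measures named in the abstract (MOS, breakdown point, downfall slope) are not defined there and are not modeled here.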

Author information

Corresponding author

Correspondence to Hadi Grailu.

About this article

Cite this article

Grailu, H. Textual Image Compression for Maintaining or Improving the Recognition Performance. Circuits Syst Signal Process 36, 658–674 (2017). https://doi.org/10.1007/s00034-016-0317-4

  • DOI: https://doi.org/10.1007/s00034-016-0317-4
