Methods of Natural Image Preprocessing Supporting the Automatic Text Recognition Using the OCR Algorithms

  • Piotr Lech
  • Krzysztof Okarma
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 389)


Reading text from natural images is much more difficult than reading it from scanned documents, since the text may appear in any color, in different sizes and typefaces, and often with distorted geometry or applied textures. The paper presents high-speed image preprocessing algorithms that use quasi-local histogram-based methods, such as binarization, ROI filtering, and line and corner detection, which can be helpful for this task. Their low computational cost results from reducing the amount of processed data by means of simple random sampling. The presented approach helps to mitigate some of the problems of implementing OCR algorithms for natural images on devices with low computing power (e.g. mobile or embedded devices). Because the computational effort is relatively small, multiple hypotheses can be tested, e.g. concerning the possible location of the text in the image, and their verification can be based on the analysis of images in various color spaces. An additional advantage of the discussed algorithms is that their structure allows an efficient parallel implementation, which further reduces the computation time.
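
The idea of estimating a histogram from randomly sampled pixels and deriving a binarization threshold from it can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the thresholding rule, the sample count, or how the quasi-local windows are chosen, so the sketch assumes a global between-class-variance (Otsu-style) threshold, an 8-bit grayscale image stored as a list of rows, and hypothetical helper names (`sampled_histogram`, `otsu_threshold`, `binarize_sampled`).

```python
import random


def otsu_threshold(hist):
    """Return the gray level that maximizes between-class variance
    for a 256-bin histogram (assumed thresholding rule)."""
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of samples with level <= t
    sum0 = 0  # sum of gray levels of those samples
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no samples in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break     # all samples are already in the lower class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def sampled_histogram(image, n_samples, rng):
    """Estimate the histogram from n_samples pixels drawn uniformly
    at random (with replacement) instead of scanning every pixel."""
    height, width = len(image), len(image[0])
    hist = [0] * 256
    for _ in range(n_samples):
        y = rng.randrange(height)
        x = rng.randrange(width)
        hist[image[y][x]] += 1
    return hist


def binarize_sampled(image, n_samples=500, seed=0):
    """Binarize an 8-bit grayscale image using a threshold estimated
    from a random sample of its pixels."""
    rng = random.Random(seed)
    t = otsu_threshold(sampled_histogram(image, n_samples, rng))
    binary = [[1 if pixel > t else 0 for pixel in row] for row in image]
    return t, binary
```

In this sketch the cost of estimating the threshold depends on `n_samples` rather than on the image size, which is the property the abstract attributes to random sampling. The same routine could be applied to sub-windows of the image or to individual color channels to test several hypotheses about the text location, although those choices are assumptions as well.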


Keywords: Image binarization, Natural images, OCR



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. West Pomeranian University of Technology, Szczecin, Poland
  2. Faculty of Electrical Engineering, Department of Signal Processing and Multimedia Engineering, Szczecin, Poland
