Text Detection from Scene and Born Images: How Good is Tesseract?

  • Conference paper
Recent Trends in Communication and Intelligent Systems

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

Text detection in scene images has been an active research area for the last couple of decades. The problem is challenging because of environmental clutter such as complex backgrounds, poor resolution, arbitrary text orientation, and the presence of text in multiple languages. Tesseract is a well-known OCR engine for document-level image analysis. However, to the best of our knowledge, the use of Tesseract for text detection has not yet been reported. This paper therefore presents a fair assessment of Tesseract's performance in text detection. The assessment is carried out on multiple benchmark datasets, viz. ICDAR 2013 (born images), ICDAR 2013 (focused scene text), and ICDAR 2019-MLT, to validate its performance and efficiency.
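
As a rough illustration of what assessing a detector on a benchmark can involve, the sketch below scores a set of detected boxes against ground-truth boxes using precision, recall, and F-measure. The abstract does not state the evaluation protocol used in the paper, so everything here is an assumption: the greedy one-to-one matching, the intersection-over-union (IoU) threshold of 0.5, the axis-aligned box format, and the function names `iou` and `detection_scores` are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    detected: List[Box],
    ground_truth: List[Box],
    threshold: float = 0.5,  # assumed IoU threshold, not taken from the paper
) -> Tuple[float, float, float]:
    """Return (precision, recall, f_measure) under greedy one-to-one matching."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    for det in detected:
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(det, gt)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx >= 0 and best_iou >= threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure
```

In such a setup, the detected boxes would come from the engine under test and the ground-truth boxes from the dataset annotations; official benchmark protocols may match boxes differently, so scores from this sketch are not comparable to published results.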

Author information

Corresponding author

Correspondence to Nadeem Anwar.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Anwar, N., Khan, T., Mollah, A.F. (2022). Text Detection from Scene and Born Images: How Good is Tesseract? In: Pundir, A.K.S., Yadav, N., Sharma, H., Das, S. (eds) Recent Trends in Communication and Intelligent Systems. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-1324-2_13
