Text Detection from Scene and Born Images: How Good is Tesseract?

Anwar, Nadeem; Khan, Tauseef; Mollah, Ayatullah Faruk

doi:10.1007/978-981-19-1324-2_13

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

Text detection in scene images has been an active research area for the last couple of decades. The problem is challenging because of background complexity, poor resolution, arbitrary text orientation, and the presence of multilingual text. Tesseract is a well-known OCR engine for document-level image analysis. However, to the best of our knowledge, the use of Tesseract for text detection has not yet been reported. This paper therefore presents a fair assessment of Tesseract's text detection performance. The work is evaluated on multiple benchmark datasets, viz. ICDAR 2013 (born-digital images), ICDAR 2013 (focused scene text), and ICDAR 2019-MLT, to validate its performance and efficiency.
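
As a rough illustration of what such an assessment involves, the following Python sketch parses word boxes from Tesseract's TSV output (as written by, for example, `tesseract image.png out tsv`) and scores them against ground-truth boxes. It is not the authors' procedure: the abstract does not state the evaluation protocol, so the IoU threshold of 0.5, the greedy one-to-one matching, the function names, and the sample data are all assumptions made for illustration.

```python
import csv
import io


def parse_tesseract_tsv(tsv_text, min_conf=0.0):
    """Return word boxes as (left, top, right, bottom) from Tesseract TSV text."""
    boxes = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    for row in reader:
        text = (row.get("text") or "").strip()
        # Level 5 rows are words; page/block/line rows have no text.
        if row["level"] != "5" or not text:
            continue
        if float(row["conf"]) < min_conf:
            continue
        left, top = int(row["left"]), int(row["top"])
        boxes.append((left, top,
                      left + int(row["width"]), top + int(row["height"])))
    return boxes


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def evaluate(detected, ground_truth, threshold=0.5):
    """Greedy one-to-one matching (an assumption, not the paper's protocol)."""
    unmatched = list(ground_truth)
    true_positives = 0
    for det in detected:
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Made-up sample in Tesseract's TSV column layout (not real output).
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t200\t100\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t10\t10\t50\t20\t96.5\tEXIT\n"
    "5\t1\t1\t1\t1\t2\t100\t10\t40\t20\t40.0\tx7\n"
)
# Hypothetical ground-truth word boxes for the same image.
SAMPLE_GROUND_TRUTH = [(10, 10, 60, 30), (120, 60, 180, 80)]

if __name__ == "__main__":
    detections = parse_tesseract_tsv(SAMPLE_TSV)
    print(evaluate(detections, SAMPLE_GROUND_TRUTH))
```

Raising `min_conf` discards low-confidence words before scoring, which usually trades recall for precision. The ICDAR benchmarks define their own matching rules, so scores from this sketch are not comparable with published results.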

Author information

Authors and Affiliations

The Adabi Society High Madrasah, Angus, Hooghly, 712221, India
Nadeem Anwar
Department of Information Technology, Haldia Institute of Technology, ICARE Complex, Haldia, 721657, India
Tauseef Khan
Department of Computer Science and Engineering, Aliah University, Newtown Campus, Kolkata, 700160, India
Tauseef Khan & Ayatullah Faruk Mollah

Corresponding author

Correspondence to Nadeem Anwar.

Editor information

Editors and Affiliations

Arya College of Engineering & IT, Jaipur, Rajasthan, India
Aditya Kumar Singh Pundir
Department of Mathematics, National Institute of Technology, Hamirpur, Himachal Pradesh, India
Neha Yadav
Department of Computer Science and Engineering, Rajasthan Technical University, Kota, Rajasthan, India
Harish Sharma
Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata, West Bengal, India
Swagatam Das

About this paper

Cite this paper

Anwar, N., Khan, T., Mollah, A.F. (2022). Text Detection from Scene and Born Images: How Good is Tesseract?. In: Pundir, A.K.S., Yadav, N., Sharma, H., Das, S. (eds) Recent Trends in Communication and Intelligent Systems. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-1324-2_13

DOI: https://doi.org/10.1007/978-981-19-1324-2_13
Published: 25 May 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1323-5
Online ISBN: 978-981-19-1324-2
eBook Packages: Intelligent Technologies and Robotics (R0)
