Scene Text Recognition: A Preliminary Investigation on Various Techniques and Implementation Using Deep Learning Classifiers

  • N. Bhavesh Shri Kumar
  • Dasi Naga Brahma Krishna Sumanth Reddy
  • K. Sairam
  • J. Naren
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1087)


Recognizing text in scene images is vital for applications that interact with their environment, because the textual regions in a scene are a rich source of information about it. The task is complicated, however, by the clutter and distortion that scene images unavoidably contain. Fonts in scene images are also unconstrained, so adjacent characters frequently touch one another. Before the text can be recognized, the correct textual regions must be identified and their textual edges extracted, both of which are tedious tasks; including unwanted edge features degrades the accuracy of the model. This work reviews methodologies that have been proposed for identifying textual regions, extracting textual edges, and recognizing text in scenes. It also presents a simple implementation of these steps using deep learning classifiers.


Scene text recognition, Text localization, Edge detection, Machine learning, Deep learning



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • N. Bhavesh Shri Kumar (1)
  • Dasi Naga Brahma Krishna Sumanth Reddy (1)
  • K. Sairam (1)
  • J. Naren (1, corresponding author)
  1. SASTRA Deemed University, Thanjavur, India
