Scene text detection and recognition: a survey

Multimedia Tools and Applications

Abstract

Scene text detection and recognition have received considerable attention in recent years and are used in many vision-based applications. The field faces several challenges, including wavy text, rotated and arbitrarily oriented text, variation in scale and font, noisy images, and cluttered backgrounds in the wild, all of which make detecting and recognizing text in images more difficult. In this article, we first present a comprehensive review of recent advances in text detection and recognition and describe their advantages and disadvantages. We then introduce the common datasets, compare recent methods, and analyze text detection and recognition systems. According to studies from the past decade, one of the most important challenges in this field is the detection of curved and vertical text. We outline approaches for developing detection and recognition systems and describe methods that are robust to curved and vertical text. Finally, we present some approaches for developing text detection and recognition systems as future work.

Author information

Corresponding author

Correspondence to Vahid Ghods.

Appendix (Abbreviations)

AN: attention network
AON: arbitrarily-oriented text recognition
CA-FCN: character attention fully convolutional network
CCA: connected component analysis
CNN: convolutional neural network
CRF: conditional random field
CRAFT: character region awareness for text detection
CRNN: combination of CNN with RNN
CTPN: connectionist text proposal network
DR: direct regressor
end-to-end recognition: text detection and recognition system
EAST: efficient and accurate scene text detector
ECT: edge color transform
ESIR: end-to-end trainable scene text recognition network by iterative rectification
ER: extremal region
FAN: focusing attention network
FCN: fully convolutional network
FN: focusing network
FTPN: feature pyramid-based text proposal network
FTSN: fused text segmentation networks
GRCNN: gated recurrent convolution neural network
GVF: gradient vector flow
i.inception: improved inception layer
i.ReLU: improved ReLU layer
IRM: iterative refinement module
IIIT5K: IIIT 5K-Words
LOMO: look more than once
LWDP: local word directional pattern
MIL: multiple instance learning
MLP: multi-layer perceptron
MOSTL: an accurate multi-oriented scene text localization
MSER: maximally stable extremal region
MSP-Net: multiscale spatial partition network
new.i.inception layers: new improved inception layer
new.i.ReLU: new improved ReLU layer
NMS: non-maximum suppression
PLN: part localization network
P-CNN: part-based convolutional neural network
PSENet: progressive scale expansion network
R2AM: recursive recurrent neural networks with attention modeling
RARE: robust text recognizer with automatic rectification
RICNN: rotation-invariant CNN
RPN: region proposal network
RRPN: rotation region proposal networks
RRD: rotation-sensitive regression for oriented scene text detection
SE: squeeze-and-excitation
SEM: shape expression module
SRN: sequence recognition network
SPL: self-paced learning
STRHOG: scale and translation robust HOG
STN: spatial transformer network
SVT: street view text
SWT: stroke width transform
wDTW: weighted dynamic time warping

About this article

Cite this article

Naiemi, F., Ghods, V. & Khalesi, H. Scene text detection and recognition: a survey. Multimed Tools Appl 81, 20255–20290 (2022). https://doi.org/10.1007/s11042-022-12693-7

  • DOI: https://doi.org/10.1007/s11042-022-12693-7
