Abstract
Text is our primary medium for conveying precise information. Once recognized, it can be a major source of information for understanding scene imagery or video. Although identifying and understanding textual information is fairly simple for humans, it can be an extremely complex task for machines. Variations in color, orientation, scale, flow, lighting, noise, occlusion, language features and font make this computer vision task challenging. Detecting the presence of text and precisely locating text regions are vital for fast and accurate recognition. Because of the complexity of the task, most of today's popular techniques require an intensive training phase and powerful computing infrastructure. The proposed method aims to minimize the amount of training required to achieve a decent text localization result. We observe that morphological gradient analysis enhances textual regions and that contour feature analysis helps to eliminate non-textual components. Combining these techniques produces promising results with a small dataset, minimal training and limited computational resources. The proposed detector can also detect text across multiple languages and is fairly robust against variations such as orientation and scale. The proposed method achieves an F-measure of 0.77 on MSRA-TD500 after training with 300 images.
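To make the first of the two ingredients concrete, the following is a minimal sketch of a morphological gradient (dilation minus erosion) on a toy grayscale image. It is not the authors' implementation: the function name `morphological_gradient`, the 3 x 3 square window, the clamped border handling and the toy image are all assumptions chosen for illustration. The sketch only shows why the operation highlights stroke-like regions: flat background yields zero, while pixels near an intensity transition yield a large response.

```python
def morphological_gradient(img, k=3):
    """Dilation minus erosion over a k x k square window.

    Hypothetical helper for illustration; windows are clipped at the
    image border rather than padded.
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the neighbourhood, clipped to the image bounds.
            window = [
                img[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            ]
            # Local maximum is the dilation, local minimum the erosion.
            out[y][x] = max(window) - min(window)
    return out


# Toy image (an assumption): a bright 2 x 2 "stroke" on a dark background.
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10, 10],
    [10, 10, 200, 200, 10, 10],
    [10, 10, 200, 200, 10, 10],
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10, 10],
]

grad = morphological_gradient(image)
for row in grad:
    print(row)
```

In practice the same operation is available in OpenCV as `cv2.morphologyEx` with the `cv2.MORPH_GRADIENT` flag, although its border handling may differ from the clipping used here. The contour feature analysis stage that the abstract describes for rejecting non-text components is not sketched, since the abstract does not specify which features are used.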
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Shekar, B.H., Raveeshwara, S. (2022). Morphological Gradient Analysis and Contour Feature Learning for Locating Text in Natural Scene Images. In: Raman, B., Murala, S., Chowdhury, A., Dhall, A., Goyal, P. (eds) Computer Vision and Image Processing. CVIP 2021. Communications in Computer and Information Science, vol 1568. Springer, Cham. https://doi.org/10.1007/978-3-031-11349-9_22
DOI: https://doi.org/10.1007/978-3-031-11349-9_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11348-2
Online ISBN: 978-3-031-11349-9
eBook Packages: Computer Science, Computer Science (R0)