Abstract
In natural scene and document images captured with camera devices, text regions are often the most informative content. Extracting these regions is the first and fundamental step in obtaining the textual content of an image. Classifying foreground objects as text or non-text elements is one of the significant modules in scene text localization, and stroke width is an important discriminating feature of text blocks. In this paper, a distance transform-based stroke feature descriptor is reported for component-level classification of the foreground components obtained from input images. Potential stroke pixels are identified from the distance map of a component using a strict staircase method, and the distribution of distance values at these pixels is used to design the feature descriptor. Finally, the components are classified using a neural network-based classifier. Experimental results show a component classification accuracy of more than 88%, which is impressive for practical scenarios.
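The idea of building a descriptor from the distribution of distance values at stroke pixels can be illustrated with a minimal sketch. The abstract does not spell out the strict staircase method, so the code below substitutes a simple local-maximum (ridge) test on a city-block distance map as a stand-in. The function names, the 4-neighbour ridge rule, the bin count, and the treatment of pixels outside the component as background are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List

Grid = List[List[int]]


def cityblock_distance_map(mask: Grid) -> Grid:
    """Two-pass city-block distance from each foreground pixel (1) to the
    nearest background pixel (0). Pixels outside the grid count as background
    (an assumption suited to a component cropped to its bounding box)."""
    h, w = len(mask), len(mask[0])
    big = h + w + 2  # larger than any reachable distance
    dist = [[big if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                up = dist[y - 1][x] if y > 0 else 0
                left = dist[y][x - 1] if x > 0 else 0
                dist[y][x] = min(dist[y][x], up + 1, left + 1)
    # Backward pass: propagate distances from the bottom and right.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if dist[y][x]:
                down = dist[y + 1][x] if y < h - 1 else 0
                right = dist[y][x + 1] if x < w - 1 else 0
                dist[y][x] = min(dist[y][x], down + 1, right + 1)
    return dist


def ridge_values(dist: Grid) -> List[int]:
    """Distance values at pixels that are not smaller than any 4-neighbour.
    This ridge test is a hypothetical stand-in for the staircase method."""
    h, w = len(dist), len(dist[0])
    values = []
    for y in range(h):
        for x in range(w):
            d = dist[y][x]
            if d == 0:
                continue  # background pixel
            neighbours = [
                dist[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w
            ]
            if all(d >= n for n in neighbours):
                values.append(d)
    return values


def stroke_descriptor(mask: Grid, bins: int = 5) -> List[float]:
    """Normalised histogram of ridge distances, scaled by their maximum so
    the descriptor does not depend on the absolute stroke thickness."""
    values = ridge_values(cityblock_distance_map(mask))
    if not values:
        return [0.0] * bins
    top = max(values)
    hist = [0] * bins
    for v in values:
        hist[min(int(v / top * bins), bins - 1)] += 1
    return [count / len(values) for count in hist]


if __name__ == "__main__":
    # A uniform horizontal bar: most ridge pixels share the same distance.
    bar = [[1] * 7 for _ in range(3)]
    print(stroke_descriptor(bar))
```

For a component with near-uniform stroke width, most of the histogram mass is expected to fall in the highest bin, whereas irregular non-text blobs tend to spread across bins. A vector of this kind could then be passed to any classifier, such as the neural network mentioned in the abstract.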
Acknowledgements
This work was carried out in the research lab of the Computer Science & Engineering Department of Aliah University. The first author is grateful to the Maulana Azad National Fellowship (MANF) for financial support.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Khan, T., Mollah, A.F. (2019). Distance Transform-Based Stroke Feature Descriptor for Text Non-text Classification. In: Kalita, J., Balas, V., Borah, S., Pradhan, R. (eds) Recent Developments in Machine Learning and Data Analytics. Advances in Intelligent Systems and Computing, vol 740. Springer, Singapore. https://doi.org/10.1007/978-981-13-1280-9_19
DOI: https://doi.org/10.1007/978-981-13-1280-9_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1279-3
Online ISBN: 978-981-13-1280-9
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)