Abstract
Text detection is a challenging task in computer vision. In this paper, we focus on English text detection in natural scene images. We propose a hierarchical approach that unifies word-level detection, character-level detection, and the spatial layout of text. First, we use the stroke width transform (SWT) to filter the image at the word level. Second, we employ a random forest to select discriminative character features and compute confidence values for candidate characters. Finally, we use a conditional random field to integrate this discriminative information with the text spatial layout, which separates the text from the background. The proposed approach is evaluated on the ICDAR dataset, a challenging benchmark for text detection, and the experimental results demonstrate that it is efficient and effective and that it outperforms state-of-the-art methods on comprehensive criteria.
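The three stages described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the abstract does not give the features, the forest configuration, or the CRF structure. Here the word-level filter is a simple stroke-width consistency check, the `confidence` field stands in for the output of a character classifier such as a random forest, and the layout model is a left-to-right chain with a height-similarity pairwise term, solved exactly by Viterbi-style dynamic programming. The names `Candidate`, `passes_word_filter`, `unary_cost`, `pairwise_cost`, and `label_chain` are hypothetical.

```python
import math
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Candidate:
    """A hypothetical character candidate inside a word-level region."""
    x: float              # horizontal centre, in pixels
    height: float         # bounding-box height, in pixels
    stroke_widths: list   # per-pixel stroke widths, as an SWT would provide
    confidence: float     # assumed text probability from a character classifier


def passes_word_filter(stroke_widths, max_cv=0.5):
    """Stage 1 (word level): keep regions whose stroke width is nearly constant.

    The coefficient-of-variation threshold is an illustrative assumption.
    """
    m = mean(stroke_widths)
    return m > 0 and pstdev(stroke_widths) / m <= max_cv


def unary_cost(conf, label, eps=1e-6):
    """Stage 2 output as a cost: negative log-probability of the label."""
    p = min(max(conf, eps), 1 - eps)
    return -math.log(p if label == 1 else 1 - p)


def pairwise_cost(a, b, label_a, label_b, weight=1.0):
    """Stage 3 layout term: neighbours of similar height prefer equal labels."""
    if label_a == label_b:
        return 0.0
    similarity = min(a.height, b.height) / max(a.height, b.height)
    return weight * similarity


def label_chain(cands, weight=1.0):
    """Exact minimum-cost labelling (1 = text, 0 = background) on a chain.

    Candidates are sorted by x; labels are returned in that sorted order.
    """
    if not cands:
        return []
    cands = sorted(cands, key=lambda c: c.x)
    cost = [unary_cost(cands[0].confidence, l) for l in (0, 1)]
    back = []
    for prev, cur in zip(cands, cands[1:]):
        new_cost, ptr = [], []
        for l in (0, 1):
            options = [cost[k] + pairwise_cost(prev, cur, k, l, weight)
                       for k in (0, 1)]
            best = 0 if options[0] <= options[1] else 1
            new_cost.append(options[best] + unary_cost(cur.confidence, l))
            ptr.append(best)
        cost = new_cost
        back.append(ptr)
    # Trace the best path backwards from the cheapest final label.
    label = 0 if cost[0] <= cost[1] else 1
    labels = [label]
    for ptr in reversed(back):
        label = ptr[label]
        labels.append(label)
    return labels[::-1]
```

In this sketch, a weakly scored candidate that sits between two confident, similarly sized neighbours can be pulled to the text label by the pairwise term, which is the kind of effect that combining character confidence with spatial layout is meant to achieve. Setting `weight=0` reduces the model to independent thresholding of each confidence value.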
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qu, Y., Liao, W., Lu, S., Wu, S. (2013). Hierarchical Text Detection: From Word Level to Character Level. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_3
DOI: https://doi.org/10.1007/978-3-642-35728-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35727-5
Online ISBN: 978-3-642-35728-2
eBook Packages: Computer Science (R0)