Hierarchical Text Detection: From Word Level to Character Level

Qu, Yanyun; Liao, Weimin; Lu, Shen; Wu, Shaojie

doi:10.1007/978-3-642-35728-2_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7733)

Abstract

Text detection is a challenging task in computer vision. In this paper, we focus on detecting English text in natural scene images. We propose a hierarchical approach that unifies word-level detection, character-level detection, and the spatial layout of text. First, we apply the stroke width transform (SWT) to filter the image at the word level. Second, we employ a random forest to select discriminative character features and to compute confidence values for characters. Finally, we use a conditional random field (CRF) to integrate this discriminative information with the spatial layout of the text, separating text from background. We evaluate the proposed approach on the ICDAR dataset, a challenging benchmark for text detection; the experimental results show that the approach is efficient and effective, and that it outperforms state-of-the-art methods on comprehensive criteria.
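
The final stage described in the abstract, combining per-character confidence with spatial layout through a conditional random field, can be illustrated with a small sketch. The abstract does not give the paper's actual energy function or graph structure, so everything below is an assumption made for illustration: the `Candidate` fields, the negative-log unary cost, the Potts-style pairwise penalty based on height ratio and horizontal gap, the `weight` value, and the simplification to a left-to-right chain (which permits exact inference by dynamic programming). The confidences stand in for scores that a classifier such as a random forest might produce.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate character region (hypothetical structure, not from the paper)."""
    x: float           # left edge of the bounding box
    height: float      # bounding-box height
    confidence: float  # classifier score in (0, 1), e.g. a random-forest vote fraction


def unary_cost(conf: float, label: int) -> float:
    """Negative log-likelihood of a label (1 = text, 0 = background)."""
    eps = 1e-6
    p = min(max(conf, eps), 1.0 - eps)
    return -math.log(p if label == 1 else 1.0 - p)


def pairwise_cost(a: Candidate, b: Candidate, la: int, lb: int,
                  weight: float = 2.0) -> float:
    """Assumed Potts-style penalty: disagreeing labels cost more when the two
    neighbours have similar heights and sit close together."""
    if la == lb:
        return 0.0
    tallest = max(a.height, b.height)
    height_ratio = min(a.height, b.height) / tallest
    gap = abs(b.x - a.x) / tallest
    return weight * height_ratio * math.exp(-gap / 2.0)


def label_chain(cands: list[Candidate]) -> list[int]:
    """Exact minimum-energy labelling of a left-to-right chain (Viterbi-style)."""
    if not cands:
        return []
    cost = [unary_cost(cands[0].confidence, label) for label in (0, 1)]
    back = []  # back[i][label] = best previous label for candidate i + 1
    for prev, cur in zip(cands, cands[1:]):
        new_cost, pointers = [], []
        for label in (0, 1):
            options = [cost[p] + pairwise_cost(prev, cur, p, label) for p in (0, 1)]
            best = 0 if options[0] <= options[1] else 1
            pointers.append(best)
            new_cost.append(options[best] + unary_cost(cur.confidence, label))
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path back from the last candidate.
    label = 0 if cost[0] <= cost[1] else 1
    labels = [label]
    for pointers in reversed(back):
        label = pointers[label]
        labels.append(label)
    return labels[::-1]


if __name__ == "__main__":
    row = [
        Candidate(x=0, height=20, confidence=0.90),
        Candidate(x=18, height=20, confidence=0.40),   # weak on its own
        Candidate(x=36, height=20, confidence=0.85),
        Candidate(x=300, height=60, confidence=0.20),  # distant, different size
    ]
    print(label_chain(row))
```

In this toy input the second candidate scores below 0.5 and would be rejected in isolation, but its two confident, equally sized neighbours pull it to the text label, while the distant, larger fourth region stays background. That interaction between classifier confidence and layout is the behaviour the sketch is meant to show; it is not a reproduction of the authors' model.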

Author information

Authors and Affiliations

Computer Science Department, Xiamen University, 361005, P.R. China
Yanyun Qu, Weimin Liao, Shen Lu & Shaojie Wu

Editor information

Editors and Affiliations

Microsoft Research Asia, 5 Danling Street, 100080, Beijing, China
Shipeng Li & Tao Mei
School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward, K1N 6N5, Ottawa, ON, Canada
Abdulmotaleb El Saddik
School of Computer and Information, Hefei University of Technology, Road Tunxi 193#, 230009, Hefei, Anhui, China
Meng Wang & Richang Hong
Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 14, 38100, Trento, Italy
Nicu Sebe
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117583, Singapore, Singapore
Shuicheng Yan
School of Computing, CLARITY: Centre for Sensor Web Technologies, Dublin City University, Glasnevin, 9, Dublin, Ireland
Cathal Gurrin

About this paper

Cite this paper

Qu, Y., Liao, W., Lu, S., Wu, S. (2013). Hierarchical Text Detection: From Word Level to Character Level. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_3

DOI: https://doi.org/10.1007/978-3-642-35728-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35727-5
Online ISBN: 978-3-642-35728-2
eBook Packages: Computer Science, Computer Science (R0)
