Hierarchical Text Detection: From Word Level to Character Level

  • Conference paper
Advances in Multimedia Modeling

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7733)

Abstract

Text detection is a challenging task in computer vision. In this paper, we focus on detecting English text in natural scene images. We propose a hierarchical approach that unifies word-level text detection, character-level detection, and the spatial layout of text. First, we use the stroke width transform (SWT) to filter the image at the word level. Second, we employ a random forest to select discriminative character features and to compute a confidence value for each character. Finally, we use a conditional random field to integrate this discriminative information with the spatial layout of the text, which separates the text from the background. The proposed approach is evaluated on the ICDAR dataset, a challenging benchmark for text detection. The experimental results demonstrate that our approach is efficient and effective, and that it outperforms state-of-the-art methods on comprehensive evaluation criteria.
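As a rough illustration of the final stage only, the sketch below combines per-component character confidences with a simple spatial-layout term to label candidate components as text or background. It is not the authors' implementation. The `Component` type, the `are_neighbors` layout rule, the energy weights, and the use of iterated conditional modes (ICM) in place of proper conditional random field inference are all assumptions made for this example. The `conf` field stands in for whatever confidence value a trained random forest would produce.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical candidate region, e.g. one surviving a word-level SWT filter."""
    x: float     # left edge of the bounding box
    y: float     # top edge of the bounding box
    w: float     # box width
    h: float     # box height (assumed > 0)
    conf: float  # assumed character confidence in [0, 1]


def are_neighbors(a, b, max_gap_ratio=2.0, max_height_ratio=1.5):
    """Assumed layout rule: similar height, small horizontal gap, vertical overlap."""
    tall, short = max(a.h, b.h), min(a.h, b.h)
    if tall / short > max_height_ratio:
        return False
    # Horizontal gap between the two boxes (negative when they overlap).
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    if gap > max_gap_ratio * tall:
        return False
    overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return overlap > 0.5 * short


def unary(c, label, eps=1e-6):
    """Negative log-likelihood of a label given the character confidence."""
    p = min(max(c.conf, eps), 1.0 - eps)
    return -math.log(p if label == 1 else 1.0 - p)


def label_components(comps, pairwise_weight=1.0, max_iters=20):
    """Return 1 (text) or 0 (background) per component using ICM.

    Energy = sum of unary costs + pairwise_weight * number of neighboring
    pairs that disagree. ICM only finds a local minimum; it is a stand-in
    for real CRF inference.
    """
    n = len(comps)
    nbrs = [
        [j for j in range(n) if j != i and are_neighbors(comps[i], comps[j])]
        for i in range(n)
    ]
    # Start from the labels suggested by the confidences alone.
    labels = [1 if c.conf >= 0.5 else 0 for c in comps]
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            costs = []
            for lab in (0, 1):
                disagree = sum(1 for j in nbrs[i] if labels[j] != lab)
                costs.append(unary(comps[i], lab) + pairwise_weight * disagree)
            best = 0 if costs[0] < costs[1] else 1
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


if __name__ == "__main__":
    # Three aligned boxes forming a word-like row, plus one isolated box.
    comps = [
        Component(0, 0, 10, 20, 0.9),
        Component(12, 0, 10, 20, 0.4),
        Component(24, 0, 10, 20, 0.8),
        Component(300, 200, 10, 20, 0.3),
    ]
    print(label_components(comps))
```

In this toy input, the middle box has a weak confidence on its own, but its two confident, well-aligned neighbors pull it toward the text label, while the isolated low-confidence box stays background. Setting `pairwise_weight=0` removes the layout term and reduces the result to thresholding the confidences.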

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qu, Y., Liao, W., Lu, S., Wu, S. (2013). Hierarchical Text Detection: From Word Level to Character Level. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_3

  • DOI: https://doi.org/10.1007/978-3-642-35728-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35727-5

  • Online ISBN: 978-3-642-35728-2

  • eBook Packages: Computer Science, Computer Science (R0)
