An Efficient Method for Text Detection in Video Based on Stroke Width Similarity

  • Viet Cuong Dinh
  • Seong Soo Chun
  • Seungwook Cha
  • Hanjin Ryu
  • Sanghoon Sull
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4843)


Text appearing in video provides semantic knowledge and significant information for video indexing and retrieval systems. This paper proposes an effective method for text detection in video based on the similarity in stroke width of text, where stroke width is defined as the distance between the two edges of a stroke. From the observation that text regions can be characterized by a dominant fixed stroke width, edge detection with local adaptive thresholds is first devised to keep text regions while reducing background regions. Second, a morphological dilation operator with an adaptive structuring element size determined by the stroke width value is exploited to roughly localize text regions. Finally, to reduce false alarms and refine text location, a new multi-frame refinement method is applied. Experimental results show that the proposed method is not only robust to different levels of background complexity, but also effective for different fonts (size, color) and languages of text.
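
The stroke-width idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the horizontal-only scan, the `max_width` cutoff, and the choice of a horizontal structuring element whose half-width equals the dominant stroke width are all assumptions made for illustration. The sketch measures distances between consecutive edge pixels in each row of a binary edge map, takes the most frequent distance as the dominant stroke width, and dilates the edge map with an element sized from that value.

```python
from collections import Counter


def row_stroke_widths(edge_row):
    """Distances between consecutive edge pixels in one row of a binary edge map."""
    cols = [x for x, v in enumerate(edge_row) if v]
    return [b - a for a, b in zip(cols, cols[1:])]


def dominant_stroke_width(edge_map, max_width=20):
    """Most frequent edge-to-edge distance over all rows (hypothetical helper).

    Distances above max_width are ignored on the assumption that they span
    background rather than a single stroke. Ties go to the smaller width.
    """
    counts = Counter()
    for row in edge_map:
        counts.update(w for w in row_stroke_widths(row) if 1 < w <= max_width)
    if not counts:
        return None
    return min(counts, key=lambda w: (-counts[w], w))


def dilate_rows(mask, half_width):
    """Horizontal binary dilation with a 1 x (2 * half_width + 1) element."""
    out = []
    for row in mask:
        n = len(row)
        new_row = [0] * n
        for x, v in enumerate(row):
            if v:
                lo = max(0, x - half_width)
                hi = min(n, x + half_width + 1)
                for j in range(lo, hi):
                    new_row[j] = 1
        out.append(new_row)
    return out


# Toy edge map: two strokes of width 3 per row, separated by a gap of 5.
edge_map = [
    [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0],
]
width = dominant_stroke_width(edge_map)
rough_text_mask = dilate_rows(edge_map, half_width=width)
```

Sizing the dilation from the measured stroke width lets neighboring character edges merge into one candidate region without a hand-tuned constant, which is the motivation the abstract gives for an adaptive structuring element.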


Edge Detection, Edge Pixel, Text Region, Dilation Operator, Text Detection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Viet Cuong Dinh (1)
  • Seong Soo Chun (1)
  • Seungwook Cha (1)
  • Hanjin Ryu (1)
  • Sanghoon Sull (1)
  1. Department of Electronics and Computer Engineering, Korea University, 5-1 Anam-dong, Seongbuk-gu, Seoul, 136-701, Korea
