Redefining the DCT-based feature for scene text detection

Analysis and comparison of spatial frequency-based features

  • Original Paper
International Journal of Document Analysis and Recognition (IJDAR)

Abstract

We analyze several spatial frequency-based features used for text region detection in natural scene images and redefine the DCT-based feature. We employ Fisher’s discriminant analysis to improve the DCT-based feature and achieve higher accuracy. We also introduce and test an unsupervised thresholding method for discriminating text from non-text regions. Experimental results show that a wide high-frequency band that also covers some lower-middle frequency components is generally more suitable for scene text detection, contrary to the original definition of the DCT-based feature.
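To make the idea in the abstract concrete, the following is a minimal sketch, not the author's implementation. It computes a block-wise DCT band-energy feature and separates blocks with an unsupervised threshold. Several things are assumptions made for illustration only: the 8×8 block size, the band definition `lo <= u + v <= hi`, the function names `dct_matrix`, `band_feature`, and `otsu_threshold`, and the use of Otsu's histogram threshold as a stand-in for the unsupervised thresholding method, which the abstract does not specify. The Fisher discriminant weighting of coefficients is omitted.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    x = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row has its own normalization
    return c

def band_feature(gray, lo=2, hi=14, n=8):
    """Per-block sum of |DCT coefficients| whose indices satisfy lo <= u+v <= hi.

    The band limits are hypothetical defaults, not values from the paper.
    """
    h, w = gray.shape
    h, w = h - h % n, w - w % n       # crop to a whole number of blocks
    blocks = (gray[:h, :w].astype(float)
              .reshape(h // n, n, w // n, n)
              .swapaxes(1, 2))        # shape: (rows, cols, n, n)
    c = dct_matrix(n)
    coeff = c @ blocks @ c.T          # 2-D DCT of every block at once
    u, v = np.indices((n, n))
    band = (u + v >= lo) & (u + v <= hi)
    return np.abs(coeff[..., band]).sum(axis=-1)

def otsu_threshold(values, bins=64):
    """Histogram threshold that maximizes the between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                 # weight of the low class
    m = np.cumsum(p * centers)        # cumulative mean of the low class
    w1 = 1.0 - w0                     # weight of the high class
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (m[-1] * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return edges[1:][np.argmax(between)]
```

Given a grayscale array `gray`, `feat = band_feature(gray)` followed by `feat >= otsu_threshold(feat)` yields a per-block boolean map of candidate text blocks. Widening or narrowing the band through `lo` and `hi` is the kind of choice the abstract's experimental result addresses.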

References

  1. Crandall, D., Antani, S., Kasturi, R.: Extraction of special effects caption text events from digital video. Int. J. Document Anal. Recogn. (IJDAR) 5(2–3), 138–157 (2003)

  2. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188 (1936)

  3. Gllavata, J., Ewerth, R., Freisleben, B.: Text detection in images based on unsupervised classification of high-frequency wavelet coefficients. In: Proceedings of 17th International Conference on Pattern Recognition, vol. 1, pp. 425–428 (2004)

  4. Jung, K., Kim, K.I., Jain, A.K.: Text information extraction in images and video: a survey. Pattern Recognit. 37, 977–997 (2004)

  5. Kim, K.C., Byun, H.R., Song, Y.J., Choi, Y.W., Chi, S.Y., Kim, K.K., Chung, Y.K.: Scene text extraction in natural scene images using hierarchical feature combining and verification. In: Proceedings of 17th International Conference on Pattern Recognition, vol. 2, pp. 679–682 (2004)

  6. Liang, J., Doermann, D., Li, H.: Camera-based analysis of text and documents: a survey. Int. J. Document Anal. Recogn. (IJDAR) 7(2–3), 84–104 (2005)

  7. Lim, Y.K., Choi, S.H., Lee, S.W.: Text extraction in MPEG compressed video for content-based indexing. In: Proceedings of 15th International Conference on Pattern Recognition, vol. 4, pp. 409–412 (2000)

  8. Lucas, S.M., Panaretos, A., Sosa, L., Tang, A., Wong, S., Young, R.: ICDAR 2003 robust reading competitions. In: Proceedings of 7th International Conference on Document Analysis and Recognition (ICDAR 2003), vol. II, pp. 682–687 (2003)

  9. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. SMC-9(1), 62–66 (1979)

  10. Zhong, Y., Zhang, H., Jain, A.K.: Automatic caption localization in compressed video. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-22(4), 385–392 (2000)

Author information

Corresponding author

Correspondence to Hideaki Goto.

About this article

Cite this article

Goto, H. Redefining the DCT-based feature for scene text detection. IJDAR 11, 1–8 (2008). https://doi.org/10.1007/s10032-008-0061-9

  • DOI: https://doi.org/10.1007/s10032-008-0061-9
