
A learning-based method to detect and segment text from scene images

Journal of Zhejiang University-SCIENCE A

Abstract

This paper proposes a learning-based method for detecting and segmenting text in natural scene images. First, the input image is decomposed into multiple connected components (CCs) by the Niblack clustering algorithm. All CCs, text and non-text alike, are then checked against text features by a two-stage classification module: an attentional cascade classifier discards most non-text CCs, and an SVM further verifies the remaining ones. The accepted CCs are output as a text-only binary image. Experiments on many images of different scenes showed satisfactory performance of the proposed method.
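The pipeline in the abstract (binarize, extract CCs, reject cheaply, verify, emit a binary image) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the standard grayscale Niblack local threshold, T = mean + k * std, in place of the paper's clustering step. The hand-written rules in `CASCADE` stand in for the trained attentional cascade, and `final_classifier` is any callable standing in for the SVM. All function names, feature names, and threshold values are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def niblack_binarize(img, window=3, k=-0.2):
    """Mark a pixel as foreground (1) when it is darker than the local
    Niblack threshold T = mean + k * std of its window x window patch."""
    h, w = len(img), len(img[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            patch = [img[j][i]
                     for j in range(max(0, y - r), min(h, y + r + 1))
                     for i in range(max(0, x - r), min(w, x + r + 1))]
            threshold = mean(patch) + k * pstdev(patch)
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out


def connected_components(mask):
    """Return 8-connected foreground components as lists of (y, x) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def features(pixels):
    """Simple geometric features of one CC (hypothetical feature set)."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {"area": len(pixels),
            "aspect": width / height,
            "fill": len(pixels) / (width * height)}


# Stage 1 stand-in: cheap rejection rules applied in order.
# A trained cascade would learn these; the values here are illustrative.
CASCADE = [
    lambda f: f["area"] >= 3,
    lambda f: 0.1 <= f["aspect"] <= 10,
]


def extract_text_mask(img, final_classifier, window=3, k=-0.2):
    """Binarize, extract CCs, filter them in two stages, and draw the
    accepted CCs into a text-only binary image."""
    mask = niblack_binarize(img, window, k)
    out = [[0] * len(img[0]) for _ in img]
    for pixels in connected_components(mask):
        f = features(pixels)
        if not all(stage(f) for stage in CASCADE):
            continue  # rejected early by the cheap stage
        if not final_classifier(f):
            continue  # rejected by the stage-2 verifier (SVM stand-in)
        for y, x in pixels:
            out[y][x] = 1
    return out
```

The ordering reflects the design choice described in the abstract: the cheap stage runs first so that the more expensive verifier only sees the CCs that survive it. Note that plain Niblack thresholding leaves the interior of strokes wider than the window unmarked, so the window size matters for real images.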




Additional information

Project supported by the OMRON and SJTU Collaborative Foundation under PVS project (2005.03–2005.10)


About this article

Cite this article

Jiang, Rj., Qi, Fh., Xu, L. et al. A learning-based method to detect and segment text from scene images. J. Zhejiang Univ. - Sci. A 8, 568–574 (2007). https://doi.org/10.1631/jzus.2007.A0568


  • DOI: https://doi.org/10.1631/jzus.2007.A0568
