This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected components (CCs) by the Niblack clustering algorithm. Then all the CCs, both text and non-text, are verified on their text features by a two-stage classification module: an attentional cascade classifier discards most non-text CCs, and an SVM further verifies the remaining ones. All accepted CCs are output to form a text-only binary image. Experiments on many images from different scenes showed satisfactory performance of the proposed method.
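The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly cited form of Niblack's local threshold (window mean plus k times window standard deviation), 4-connected component labelling, and hypothetical stand-ins for both classifiers: two cheap hand-written cascade stages and a linear score in place of the trained SVM. The feature names, thresholds, and weights are invented for illustration only.

```python
from collections import deque
from statistics import mean, pstdev


def niblack_binarize(img, radius=1, k=-0.2):
    """Foreground (1) where a pixel is darker than mean + k * std of its window.
    Assumption: the standard Niblack local threshold, not the paper's exact variant."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            threshold = mean(window) + k * pstdev(window)
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out


def connected_components(mask):
    """Group foreground pixels into 4-connected components via breadth-first search."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, pixels = deque([(y, x)]), []
                while queue:
                    cy, cx = queue.popleft()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components


def describe(pixels):
    """Hypothetical CC features: pixel area, bounding-box aspect ratio, fill ratio."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {"area": len(pixels), "aspect": width / height,
            "fill": len(pixels) / (width * height)}


# Stage 1 (hypothetical): cheap tests in order; a CC is rejected at the first failure.
CASCADE = [
    lambda f: f["area"] >= 3,
    lambda f: 0.1 <= f["aspect"] <= 10,
]


def final_score(f):
    """Stage 2 stand-in for an SVM decision function; the weights are made up."""
    return 2.0 * f["fill"] - 0.5


def filter_text_components(components):
    """Keep CCs that pass every cascade stage and score positive in the final stage."""
    kept = []
    for pixels in components:
        f = describe(pixels)
        # all() over a generator stops at the first failing stage (early rejection).
        if all(stage(f) for stage in CASCADE) and final_score(f) > 0:
            kept.append(pixels)
    return kept
```

Chaining the three steps as `filter_text_components(connected_components(niblack_binarize(img)))` yields the accepted components, whose pixels can then be painted into a binary output image. The cascade runs first so that the more expensive final classifier only sees the candidates that survive the cheap tests.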
Project supported by the OMRON and SJTU Collaborative Foundation under PVS project (2005.03–2005.10)
Jiang, Rj., Qi, Fh., Xu, L. et al. A learning-based method to detect and segment text from scene images. J. Zhejiang Univ. - Sci. A 8, 568–574 (2007). https://doi.org/10.1631/jzus.2007.A0568
- Text detection
- Text segmentation
- Text feature
- Attentional cascade