Learning to Discriminate Text from Synthetic Data

  • José Antonio Álvarez Ruiz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7416)


Service robots could use textual information to perform important tasks, such as product identification. However, natural scene text, such as that found in household environments, varies widely in size, color, font, layout, symbol repertoire, language, etc. This variability makes robust text information extraction extremely difficult. Our approach to text information extraction from gray-scale still images combines adaptive binarization, connected component classification with a support vector machine, and filtering based on the proximity of connected components to their neighbours. The contribution of our approach is the use of a partially synthetic dataset for training, which reduces the burden of ground-truth labelling at the connected component level. Our experiments show that a classifier trained with synthetic data can generalize to real instances. We present our results on the ICDAR dataset.
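
As a rough illustration of the three stages named in the abstract (adaptive binarization, connected component classification, and proximity filtering), the following Python sketch processes a tiny gray-scale image stored as nested lists. It is not the author's implementation: the local-mean threshold is a simple stand-in for the adaptive binarization, the `is_text` callback is a hypothetical placeholder for the trained support vector machine, and the function names, the centroid-distance rule, and all parameter values are assumptions chosen for illustration. It also assumes dark text on a lighter background.

```python
from collections import deque

def adaptive_binarize(img, radius=1, offset=10):
    """Mark a pixel as foreground (1) if it is darker than its local mean minus offset."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            mean = sum(window) / len(window)
            out[y][x] = 1 if img[y][x] < mean - offset else 0
    return out

def connected_components(binary):
    """Group foreground pixels into 8-connected components (lists of (y, x))."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(pixels)
    return comps

def centroid(pixels):
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n)

def proximity_filter(comps, max_dist):
    """Keep components whose centroid lies within max_dist of another component's centroid."""
    cents = [centroid(c) for c in comps]
    kept = []
    for i, (yi, xi) in enumerate(cents):
        for j, (yj, xj) in enumerate(cents):
            if i != j and ((yi - yj) ** 2 + (xi - xj) ** 2) ** 0.5 <= max_dist:
                kept.append(comps[i])
                break
    return kept

def extract_text_components(img, is_text, max_dist):
    """Binarize, label components, classify each one, then filter by proximity."""
    comps = connected_components(adaptive_binarize(img))
    candidates = [c for c in comps if is_text(c)]  # is_text: placeholder for the SVM
    return proximity_filter(candidates, max_dist)

# Toy image: light background, two neighbouring dark strokes, one isolated dark dot.
img = [[200] * 9 for _ in range(5)]
for y in (1, 2, 3):
    img[y][1] = img[y][3] = 20
img[2][7] = 20

# The classifier is stubbed out here; a trained model would replace the lambda.
result = extract_text_components(img, is_text=lambda comp: True, max_dist=3.0)
print(len(result))  # the two strokes survive, the isolated dot is dropped
```

In this toy case the two strokes support each other in the proximity stage, while the isolated dot has no neighbour within the distance limit and is discarded even though the stubbed classifier accepted it.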


Natural scene text, object identification, adaptive binarization, support vector machine, synthetic dataset



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • José Antonio Álvarez Ruiz
    Computer Science Department, Bonn-Rhine-Sieg University of Applied Sciences, Sankt Augustin, Germany
