
Multi-orientation Scene Text Detection Leveraging Background Suppression

  • Xihan Wang (email author)
  • Xiaoyi Feng
  • Zhaoqiang Xia
  • Jinye Peng
  • Eric Granger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10666)

Abstract

Most state-of-the-art text detection methods target horizontal text and perform poorly on blurred, multi-oriented, low-resolution, or small text. In this paper, we propose to localize text by suppressing non-text background, using a coarse-to-fine strategy to remove non-text pixels from images. First, a fully convolutional network (FCN) produces a coarse prediction of text labels. Second, an efficient saliency measure based on background priors further suppresses non-text pixels and generates fine character candidate regions. The remaining character candidates are then grouped into text lines, which allows the method to handle multi-oriented text in natural scene images. We evaluate the method on two public datasets, MSRA-TD500 and ICDAR2013. Experimental results show that it achieves a high recall rate and competitive performance.
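
The coarse-to-fine suppression described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the coarse text-probability map is already available (a stand-in for the FCN output), and it assumes one common form of background prior: saliency measured as the geodesic (shortest-path) distance from each pixel to the image border. It works on a plain pixel grid. The function names `geodesic_saliency` and `suppress_background` and both thresholds are hypothetical.

```python
import heapq

def geodesic_saliency(gray):
    """Shortest-path cost from every pixel to the image border.

    gray: list of equal-length rows of numbers (a grayscale image).
    The edge weight between 4-connected neighbours is the absolute
    intensity difference, so pixels that resemble the border score low.
    """
    h, w = len(gray), len(gray[0])
    dist = [[float("inf")] * w for _ in range(h)]
    heap = []
    # Background prior (assumption): every border pixel is a zero-cost source.
    for y in range(h):
        for x in range(w):
            if y in (0, h - 1) or x in (0, w - 1):
                dist[y][x] = 0.0
                heap.append((0.0, y, x))
    heapq.heapify(heap)
    # Multi-source Dijkstra over the pixel grid.
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y][x]:
            continue  # stale queue entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                nd = d + abs(gray[ny][nx] - gray[y][x])
                if nd < dist[ny][nx]:
                    dist[ny][nx] = nd
                    heapq.heappush(heap, (nd, ny, nx))
    return dist

def suppress_background(coarse_prob, gray, prob_thresh=0.5, sal_thresh=0.5):
    """Keep pixels that pass both the coarse map and the saliency test.

    coarse_prob: hypothetical per-pixel text probability, same shape as gray.
    Thresholds are illustrative values, not taken from the paper.
    """
    sal = geodesic_saliency(gray)
    peak = max(max(row) for row in sal) or 1.0  # avoid dividing by zero
    return [
        [coarse_prob[y][x] >= prob_thresh and sal[y][x] / peak >= sal_thresh
         for x in range(len(gray[0]))]
        for y in range(len(gray))
    ]
```

In this sketch a pixel survives only if the coarse map marks it as likely text and it is hard to reach from the border through similar-looking pixels. Grouping the surviving pixels into character candidates and text lines is left out.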

Keywords

Scene text detection · Fully Convolutional Network · Background suppression · Multi-orientation texts

Notes

Acknowledgment

This work is supported by the H3C Foundation of the Ministry of Education of China (No. 2017A19050), the National Aerospace Science and Technology Foundation, and the National Natural Science Foundation of China (No. 61702419).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Xihan Wang (1), email author
  • Xiaoyi Feng (1)
  • Zhaoqiang Xia (1)
  • Jinye Peng (2)
  • Eric Granger (3)

  1. School of Electronics and Information, Northwestern Polytechnical University, Xi’an, China
  2. School of Information Science and Technology, Northwest University, Xi’an, China
  3. Laboratoire d’imagerie, de vision et d’intelligence artificielle (LIVIA), École de technologie supérieure, Université du Québec, Montréal, Canada
