Feature Pooling in Scene Character Recognition: A Comprehensive Study

  • Zhong Zhang
  • Hong Wang
  • Shuang Liu
  • Yunxue Shao
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 463)

Abstract

In this paper, we focus on feature pooling methods for scene character recognition. We study three kinds of pooling methods: average (sum) pooling, max pooling, and weight-based pooling. Specifically, we introduce various feature pooling methods, examine their merits and demerits, and discuss open problems. Finally, we compare the methods on the ICDAR2003 and Chars74k databases.
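
To make the three pooling families concrete, the following is a minimal Python sketch rather than the authors' implementation. It assumes each local descriptor of a character image has already been encoded as a vector of codeword responses; the function names, the toy `codes` matrix, and the `weights` vector are all hypothetical, and the specific weighting schemes surveyed in the paper are not reproduced here.

```python
from typing import List, Sequence

# One row of coding coefficients per local descriptor (hypothetical layout).
Codes = Sequence[Sequence[float]]


def sum_pool(codes: Codes) -> List[float]:
    """Add up each codeword's responses over all descriptors."""
    return [sum(column) for column in zip(*codes)]


def average_pool(codes: Codes) -> List[float]:
    """Sum pooling divided by the number of descriptors."""
    return [total / len(codes) for total in sum_pool(codes)]


def max_pool(codes: Codes) -> List[float]:
    """Keep only the strongest response to each codeword."""
    return [max(column) for column in zip(*codes)]


def weighted_pool(codes: Codes, weights: Sequence[float]) -> List[float]:
    """Weighted sum of responses, with one weight per descriptor."""
    if len(weights) != len(codes):
        raise ValueError("need exactly one weight per descriptor")
    return [sum(w * v for w, v in zip(weights, column)) for column in zip(*codes)]


# Toy example: 4 descriptors encoded over a 3-word codebook.
codes = [
    [1.0, 0.0, 0.5],
    [0.0, 0.5, 0.5],
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 0.0],
]
weights = [0.5, 0.25, 0.25, 0.0]  # assumed values, for illustration only

print(sum_pool(codes))                # [2.0, 1.0, 1.0]
print(average_pool(codes))            # [0.5, 0.25, 0.25]
print(max_pool(codes))                # [1.0, 0.5, 0.5]
print(weighted_pool(codes, weights))  # [0.5, 0.25, 0.375]
```

In this sketch, uniform weights of 1/N reduce weighted pooling to average pooling, so the weighted variant can be read as a generalization of the averaging case.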

Keywords

Scene character recognition, Feature pooling, Feature representation

Notes

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant Nos. 61501327, 61711530240, and 61401309; the Natural Science Foundation of Tianjin under Grant Nos. 17JCZDJC30600 and 15JCQNJC01700; the Open Projects Program of the National Laboratory of Pattern Recognition under Grant No. 201700001; and the Doctoral Fund of Tianjin Normal University under Grant Nos. 5RL134 and 52XB1405.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Zhong Zhang (1)
  • Hong Wang (1)
  • Shuang Liu (1)
  • Yunxue Shao (2)

  1. Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin Normal University, Tianjin, China
  2. College of Computer Science, Inner Mongolia University, Inner Mongolia, China
