Abstract
This paper studies feature pooling methods for scene character recognition. We examine three kinds of pooling: average (sum) pooling, max pooling, and weight-based pooling. Specifically, we introduce the various feature pooling methods, analyze their merits and demerits, and discuss the problems that remain. Finally, we compare the methods on the ICDAR2003 and Chars74k databases.
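The three pooling families named above can be illustrated with a minimal sketch. It assumes the common bag-of-features setting, in which each local descriptor has already been encoded as a vector of responses over a codebook; the abstract does not give the paper's exact formulations, so the function names, the toy values, and in particular the per-descriptor `weights` are hypothetical and serve only to show how the pooled vectors differ.

```python
def sum_pooling(codes):
    """Add up each codeword's responses over all local descriptors."""
    return [sum(column) for column in zip(*codes)]


def average_pooling(codes):
    """Sum pooling divided by the number of local descriptors."""
    n = len(codes)
    return [total / n for total in sum_pooling(codes)]


def max_pooling(codes):
    """Keep only the strongest response for each codeword."""
    return [max(column) for column in zip(*codes)]


def weighted_pooling(codes, weights):
    """Weighted sum per codeword; one (hypothetical) weight per descriptor."""
    if len(weights) != len(codes):
        raise ValueError("need exactly one weight per descriptor")
    return [
        sum(w * value for w, value in zip(weights, column))
        for column in zip(*codes)
    ]


# Three local descriptors encoded over a four-word codebook (made-up values).
codes = [
    [1.0, 0.0, 2.0, 0.0],
    [3.0, 0.0, 0.0, 1.0],
    [2.0, 4.0, 1.0, 0.0],
]
# Assumed weights, e.g. reflecting how informative each descriptor is.
weights = [0.5, 0.25, 0.25]

pooled_sum = sum_pooling(codes)
pooled_avg = average_pooling(codes)
pooled_max = max_pooling(codes)
pooled_weighted = weighted_pooling(codes, weights)
```

In this toy case, average pooling spreads the single large response of the second codeword across all descriptors, max pooling preserves it, and weighted pooling lets the chosen weights decide how much each descriptor contributes.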
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant Nos. 61501327, 61711530240, and 61401309; the Natural Science Foundation of Tianjin under Grant Nos. 17JCZDJC30600 and 15JCQNJC01700; the Open Projects Program of the National Laboratory of Pattern Recognition under Grant No. 201700001; and the Doctoral Fund of Tianjin Normal University under Grant Nos. 5RL134 and 52XB1405.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, Z., Wang, H., Liu, S., Shao, Y. (2019). Feature Pooling in Scene Character Recognition: A Comprehensive Study. In: Liang, Q., Mu, J., Jia, M., Wang, W., Feng, X., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2017. Lecture Notes in Electrical Engineering, vol 463. Springer, Singapore. https://doi.org/10.1007/978-981-10-6571-2_262
DOI: https://doi.org/10.1007/978-981-10-6571-2_262
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6570-5
Online ISBN: 978-981-10-6571-2
eBook Packages: Engineering, Engineering (R0)