Abstract
Text in images carries precise semantic information, and this information can be exploited in many image cognition and understanding applications. Human reading habits provide clues about text line structure that can guide text line extraction. In this paper, we propose a novel text line extraction method that is inspired by human reading knowledge and based on k-shortest paths global optimization. First, candidate characters are extracted by applying the Maximally Stable Extremal Region (MSER) algorithm to the gray, red, green, and blue channels of the target image, and the extracted MSERs are fed into a Convolutional Neural Network (CNN) to remove noise components. Then, a directed graph is built on the character component nodes, with edges inspired by the human reading sense. The directed graph automatically establishes the relationships among the candidate text components and thereby eliminates their disorder. The text line path optimization is inspired by the human ability to plan a text line path sequentially while reading. The text line extraction problem can therefore be solved with the k-shortest paths optimization algorithm, which takes advantage of the reading-sense structure of the directed graph. The algorithm extracts text lines iteratively, which avoids exhaustive search and yields a globally optimized number of text lines. The proposed method achieves f-measures of 0.820 and 0.812 on the public ICDAR2011 and ICDAR2013 datasets, respectively. The experimental results demonstrate the effectiveness of the proposed method in comparison with state-of-the-art methods. This paper thus presents a text line extraction method inspired by human reading knowledge and shows that such knowledge can benefit text line extraction and image text discovery.
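To make the graph-based grouping step more concrete, the following minimal Python sketch links candidate character boxes with left-to-right edges and then peels off text lines one path at a time. It is an illustration under stated assumptions, not the authors' implementation: the box format, the function names `build_reading_graph` and `extract_lines`, the thresholds, and the cost model (a fixed reward per character plus a penalty for the normalized horizontal gap) are all hypothetical. It also replaces the k-shortest paths optimization with a simpler greedy successive shortest-path search, and it assumes the MSER and CNN stages have already produced the boxes.

```python
from typing import Dict, List, Tuple

# Hypothetical box format: (x, y, width, height) of a candidate character.
Box = Tuple[float, float, float, float]


def build_reading_graph(
    boxes: List[Box], max_gap: float = 1.5
) -> Dict[int, List[Tuple[int, float]]]:
    """Link each box to boxes on its right that lie on roughly the same line."""
    edges: Dict[int, List[Tuple[int, float]]] = {i: [] for i in range(len(boxes))}
    for i, (xi, yi, wi, hi) in enumerate(boxes):
        for j, (xj, yj, wj, hj) in enumerate(boxes):
            if i == j or xj <= xi:
                continue  # edges only point in reading direction (left to right)
            gap = (xj - (xi + wi)) / max(hi, hj)  # horizontal gap in box heights
            overlap = min(yi + hi, yj + hj) - max(yi, yj)  # vertical overlap
            if -0.5 <= gap <= max_gap and overlap >= 0.5 * min(hi, hj):
                edges[i].append((j, max(gap, 0.0)))
    return edges


def extract_lines(
    boxes: List[Box], node_reward: float = 1.0, min_len: int = 2
) -> List[List[int]]:
    """Greedily extract node-disjoint minimum-cost paths as text lines."""
    edges = build_reading_graph(boxes)
    # Sorting by x gives a topological order because every edge increases x.
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0])
    free = set(order)
    lines: List[List[int]] = []
    while free:
        cost = {i: -node_reward for i in free}  # best cost of a path ending at i
        prev: Dict[int, int] = {}
        for i in order:
            if i not in free:
                continue
            for j, gap in edges[i]:
                if j in free and cost[i] + gap - node_reward < cost[j]:
                    cost[j] = cost[i] + gap - node_reward
                    prev[j] = i
        end = min(sorted(free), key=lambda i: cost[i])
        path = [end]
        while path[-1] in prev:
            path.append(prev[path[-1]])
        path.reverse()
        if len(path) < min_len:
            break  # no remaining path is long enough to count as a text line
        lines.append(path)
        free -= set(path)
    return lines


if __name__ == "__main__":
    demo = [
        (0, 0, 10, 10), (12, 0, 10, 10), (24, 1, 10, 10),  # first line
        (0, 30, 10, 10), (13, 30, 10, 10),                 # second line
        (100, 100, 5, 5),                                  # isolated noise box
    ]
    print(extract_lines(demo))  # [[0, 1, 2], [3, 4]]
```

In this sketch the number of lines is not fixed in advance: extraction stops once the cheapest remaining path is shorter than `min_len`. That only loosely mirrors how the abstract describes the method arriving at the number of text lines, since a greedy search does not guarantee a globally optimal set of paths.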
Ethics declarations
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Wang, L., Uchida, S., Zhu, A. et al. Human Reading Knowledge Inspired Text Line Extraction. Cogn Comput 10, 84–93 (2018). https://doi.org/10.1007/s12559-017-9490-4
DOI: https://doi.org/10.1007/s12559-017-9490-4