TextCatcher: a method to detect curved and challenging text in natural scenes

Fabrizio, Jonathan; Robert-Seidowsky, Myriam; Dubuisson, Séverine; Calarasanu, Stefania; Boissel, Raphaël

doi:10.1007/s10032-016-0264-4

Original Paper
Published: 11 March 2016

Volume 19, pages 99–117, (2016)
International Journal on Document Analysis and Recognition (IJDAR)

Jonathan Fabrizio¹, Myriam Robert-Seidowsky¹, Séverine Dubuisson², Stefania Calarasanu¹ & Raphaël Boissel¹

Abstract

In this paper, we propose a hybrid, multi-scale text detection algorithm. First, it relies on a connected component-based approach: after the image is segmented, a classification step using a new wavelet descriptor spots the letters. A new graph model and its traversal procedure then form candidate text areas. Second, a texture-based approach discards the false positives. Finally, the detected text areas are precisely cut out and a new binarization step is introduced. The main advantage of our method is that it makes few assumptions, so “challenging texts” such as multi-sized, multi-colored, multi-oriented or curved text can be localized. The efficiency of TextCatcher has been validated on three different datasets: two come from the ICDAR competition, and the third contains photographs we have taken of various everyday texts. We present both qualitative and quantitative results.
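
As an illustration only of the connected-component and graph-traversal idea summarized in the abstract, the following Python sketch labels components in a binary image, links components that look like neighboring letters, and traverses the resulting graph to form candidate groups. It is not the authors' implementation: the wavelet-based letter classification and the texture-based filtering are omitted, and the function names, the linking rule and its thresholds (`max_gap_ratio`, `max_height_ratio`) are assumptions chosen for the example.

```python
from collections import deque


def label_components(binary):
    """Label 4-connected foreground pixels.

    Returns one bounding box (x0, y0, x1, y1) per component,
    in raster-scan order of each component's first pixel.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            y0, x0, y1, x1 = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                y0, x0 = min(y0, cy), min(x0, cx)
                y1, x1 = max(y1, cy), max(x1, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes


def are_neighbors(a, b, max_gap_ratio=1.5, max_height_ratio=2.0):
    """Hypothetical linking rule: similar heights and nearby centres.

    The Euclidean distance between centres is used so that the rule
    does not assume a horizontal baseline.
    """
    ha, hb = a[3] - a[1] + 1, b[3] - b[1] + 1
    if max(ha, hb) > max_height_ratio * min(ha, hb):
        return False
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    dist = ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
    return dist <= max_gap_ratio * max(ha, hb)


def group_components(boxes):
    """Build the neighbor graph and return its connected groups of box indices."""
    n = len(boxes)
    adjacency = [
        [j for j in range(n) if j != i and are_neighbors(boxes[i], boxes[j])]
        for i in range(n)
    ]
    visited = [False] * n
    groups = []
    for start in range(n):
        if visited[start]:
            continue
        visited[start] = True
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in adjacency[i]:
                if not visited[j]:
                    visited[j] = True
                    stack.append(j)
        groups.append(sorted(group))
    return groups
```

Because the groups come from a graph traversal rather than from a line fit, a chain of letters can bend from one link to the next, which is one plausible reason a graph formulation suits curved text.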

Notes

http://www.itowns.fr/.
The scores of the participating methods are freely available [22].
The dataset is available at https://www.lrde.epita.fr/~jonathan/.
https://github.com/mop/LTPTextDetector.
https://github.com/Itseez/opencv_contrib/blob/master/modules/text/samples/end_to_end_recognition.cpp.

References

Abrash, M.: Michael Abrash’s Graphics Programming Black Book, 10th edn. Coriolis Group Books, Scottsdale (1997)
Anthimopoulos, M., Gatos, B., Pratikakis, I.: Detection of artificial and scene text in images and video frames. Pattern Anal. Appl. 16(3), 431–446 (2013)
Arth, C., Limberger, F., Bischof, H.: Real-time license plate recognition on an embedded DSP-platform. In: Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)
Bai, B., Yin, F., Liu, C.L.: Scene text localization using gradient local correlation. In: International Conference on Document Analysis and Recognition, pp. 1380–1384 (2013)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. Conf. Comput. Vis. Pattern Recognit. 1, 886–893 (2005)
Daubechies, I.: Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia (1992)
Daubechies, I., Sweldens, W.: Factoring wavelet transforms into lifting steps. J. Fourier Anal. Appl. 4(3), 245–267 (1998)
Epshtein, B., Ofek, E., Wexler, Y.: Detecting text in natural scenes with stroke width transform. In: Conference on Computer Vision and Pattern Recognition, pp. 2963–2970 (2010). doi:10.1109/CVPR.2010.5540041
Fabrizio, J., Marcotegui, B., Cord, M.: Text segmentation in natural scenes using toggle-mapping. In: International Conference on Image Processing, pp. 2349–2352 (2009)
Fabrizio, J., Marcotegui, B., Cord, M.: Text detection in street level image. Pattern Anal. Appl. 16(4), 519–533 (2013)
Gao, S., Wang, C., Xiao, B., Shi, C., Zhang, Y., Lv, Z., Shi, Y.: Adaptive scene text detection based on transferring adaboost. In: International Conference on Document Analysis and Recognition, pp. 388–392 (2013)
Gatos, B., Ntirogiannis, K., Pratikakis, I.: ICDAR document image binarization contest. In: International Conference on Document Analysis and Recognition (2009)
Gomez, L., Karatzas, D.: Multi-script text extraction from natural scenes. In: International Conference on Document Analysis and Recognition, pp. 467–471 (2013)
Haralick, R., Shanmugam, K., Dinstein, I.: Textural features for image classification. IEEE Trans. Syst. Man Cybern. 3(6), 610–621 (1973)
Huang, W., Qiao, Y., Tang, X.: Robust scene text detection with convolution neural network induced MSER trees. In: European Conference on Computer Vision, pp. 497–511 (2014)
Jaderberg, M., Vedaldi, A., Zisserman, A.: Deep features for text spotting. In: European Conference on Computer Vision (2014)
Joachims, T.: Text categorization with support vector machines: learning with many relevant features. In: Proceedings of ECML, pp. 137–142 (1998)
Jung, K., Kim, I.K., Jain, A.K.: Text information extraction in images and video: a survey. Pattern Recogn. 37(5), 977–997 (2004)
Kan, C., Srinath, M.D.: Scene text localization and recognition with oriented stroke detection. In: International Conference on Computer Vision, pp. 97–104 (2013)
Karaoglu, S., Fernando, B., Tremeau, A.: A novel algorithm for text detection and localization in natural scene images. In: Proceedings of DICTA, pp. 635–642 (2010)
Karaoglu, S., Gemert, J., Gevers, T.: Object reading: text recognition for object recognition. In: Proceedings of ECCVW-IFCVCR, pp. 456–465 (2012)
Karatzas, D., Shafait, F., Uchida, S., Iwamura, M., Bigorda, L.G., Mestre, S.R., Mas, J., Mota, D.F., Almazan, J.A., de las Heras, L.P.: ICDAR 2013 robust reading competition. In: International Conference on Document Analysis and Recognition, pp. 1484–1493 (2013)
Kasar, T., Agarai, G.: Multi-script and multi-oriented text localization from scene images. In: International Workshop on Camera-Based Document Analysis and Recognition, pp. 1–14 (2012)
Li, R., Wang, S., Shi, Z.: A two level algorithm for text detection in natural scene images. In: International Workshop on Document Analysis Systems (2014)
Li, Y., Shen, C., Jia, W., van den Hengel, A.: Leveraging surrounding context for scene text detection. In: International Conference on Image Processing, pp. 2264–2268 (2013)
Mao, J., Li, H., Zhou, W., Yan, S., Tian, Q.: Scale based region growing for scene text detection. In: International conference on MultiMedia, pp. 1007–1016 (2013)
Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide baseline stereo from maximally stable extremal regions. In: British Machine Vision Conference, pp. 384–393 (2002)
Meng, Q., Song, Y.: Text detection in natural scenes with salient region. In: International Workshop on Document Analysis Systems, pp. 384–388 (2012)
Merino-Gracia, C., Lenc, K., Mirmehdi, M.: A head-mounted device for recognizing text in natural scenes. In: International Workshop on Camera-Based Document Analysis and Recognition, pp. 29–41 (2011)
Minetto, R., Thome, N., Cord, M., Leite, N.J., Stolfi, J.: T-hog: an effective gradient-based descriptor for single line text regions. Pattern Recogn. 46(3), 1078–1090 (2013)
Neumann, L., Matas, J.: A method for text localization and recognition in real-world images. In: Asian Conference on Computer Vision, pp. 770–783 (2011)
Neumann, L., Matas, J.: Real-time scene text localization and recognition. In: Conference on Computer Vision and Pattern Recognition, pp. 3538–3545 (2012)
Neumann, L., Matas, J.: On combining multiple segmentations in scene text recognition. In: International Conference on Document Analysis and Recognition, pp. 523–527 (2013)
Ojala, T., Pietikäinen, M., Harwood, D.: A comparative study of texture measures with classification based on featured distributions. Pattern Recogn. 29(1), 51–59 (1996)
Olena Team: Milena, generic C++ library for image processing and pattern recognition. https://www.lrde.epita.fr/wiki/Olena/Milena
Opitz, M., Diem, M., Fiel, S., Kleber, F., Sablatnig, R.: End-to-end text recognition with local ternary patterns, MSER and deep convolutional nets. In: International Workshop on Document Analysis Systems (2014)
Phan, T.Q., Shivakumara, P., Tan, C.L.: Detecting text in the real world. In: International conference on MultiMedia, pp. 765–768 (2012)
Prakash, S., Ravishankar, M.: Multi-oriented video text detection and extraction using DCT feature extraction and projection based rotation calculation. In: Proceedings of ICACCI, pp. 714–718 (2013)
Serra, J.: Toggle mappings. In: Simon, J.C. (ed.) From pixels to features, pp. 61–72. Elsevier, North-Holland (1989)
Shi, C., Wang, C., Xiao, B., Zhang, Y., Gao, S.: Scene text detection using graph model built upon maximally stable extremal regions. Pattern Recogn. Lett. 34(2), 107–116 (2013)
Shi, C., Wang, C., Xiao, B., Zhang, Y., Gao, S., Zhang, Z.: Scene text recognition using part-based tree-structured character detection. In: Conference on Computer Vision and Pattern Recognition, pp. 2961–2968 (2013)
Shivakumara, P., Basavaraju, H.T., Guru, D.S., Tan, C.L.: Detection of curved text in video: Quad tree based method. In: International Conference on Document Analysis and Recognition, pp. 594–598 (2013)
Sumathi, C.P., Santhanam, T., Gayathri, G.: A survey on various approaches of text extraction in images. Int. J. Comput. Sci. Eng. Surv. 3(4), 27–42 (2012)
Tomer, P., Goyal, A.: Ant clustering based text detection in natural scene images. In: Proceedings of ICCCNT, pp. 1–7 (2013)
Usevitch, B.E.: A tutorial on modern lossy wavelet image compression: foundations of JPEG 2000. IEEE Signal Process. Mag. 18(5), 22–35 (2001)
Wang, K., Babenko, B., Belongie, S.: End-to-end scene text recognition. In: International Conference on Computer Vision, pp. 1457–1464 (2011)
Wang, T., Wu, D.J., Coates, A., Ng, A.Y.: End-to-end text recognition with convolutional neural networks. In: International Conference on Pattern Recognition, pp. 3304–3308 (2012)
Wang, X., Song, Y., Zhang, Y.: Natural scene text detection with multi-channel connected component segmentation. In: International Conference on Document Analysis and Recognition, pp. 1375–1379 (2013)
Wolf, C., Jolion, J.M.: Object count/area graphs for the evaluation of object detection and segmentation algorithms. Int. J. Doc. Anal. Recognit. 8(4), 280–296 (2006)
Xu-Cheng, Y., Xuwang, Y., Kaizhu, H., Hong-Wei, H.: Robust text detection in natural scene images. Pattern Anal. Mach. Intell. 36(5), 970–983 (2013)
Yang, H., Quehl, B., Sack, H.: A framework for improved video text detection and recognition. Multimed. Tools Appl. 69(1), 217–245 (2014)
Yao, C., Xiang, B., Wenyu, L., Yi, M., Zhuowan, T.: Detecting texts of arbitrary orientations in natural images. In: International Conference on Computer Vision, pp. 1083–1090 (2012)
Yi, C., Tian, Y.: Assistive text reading from complex background for blind persons. In: International Workshop on Camera-Based Document Analysis and Recognition, pp. 15–28 (2011)
Zagoris, K., Pratikakis, I.: Text detection in natural images using bio-inspired models. In: International Conference on Document Analysis and Recognition, pp. 1370–1374 (2013)
Zhang, J., Chong, Y.: Text localization based on the discrete shearlet transform. In: ICSESS, pp. 262–266 (2013)
Zhang, J., Kasturi, R.: Extraction of text objects in video documents: recent progress. In: International Workshop on Document Analysis Systems, pp. 5–17 (2008)
Zhang, Y., Huang, K., Liu, C.: Fast and robust graph-based transductive learning via minimum tree cut. In: 11th IEEE International Conference on Data Mining, ICDM 2011, Vancouver, BC, Canada, December 11–14, 2011, pp. 952–961 (2011)

Author information

Authors and Affiliations

EPITA Research and Development Laboratory (LRDE), 14-16, rue Voltaire, 94276, Le Kremlin Bicêtre, France
Jonathan Fabrizio, Myriam Robert-Seidowsky, Stefania Calarasanu & Raphaël Boissel
Sorbonne Universités, UPMC Univ Paris 06 CNRS, UMR 7222, ISIR, 75005, Paris, France
Séverine Dubuisson

Corresponding author

Correspondence to Jonathan Fabrizio.

About this article

Cite this article

Fabrizio, J., Robert-Seidowsky, M., Dubuisson, S. et al. TextCatcher: a method to detect curved and challenging text in natural scenes. IJDAR 19, 99–117 (2016). https://doi.org/10.1007/s10032-016-0264-4

Received: 26 May 2015
Revised: 06 January 2016
Accepted: 13 February 2016
Published: 11 March 2016
Issue Date: June 2016
DOI: https://doi.org/10.1007/s10032-016-0264-4
