Exploiting colour information for better scene text detection and recognition

Fraz, Muhammad; Sarfraz, M. Saquib; Edirisinghe, Eran A.

doi:10.1007/s10032-015-0239-x

Special Issue Paper
Published: 19 February 2015

Volume 18, pages 153–167 (2015)

International Journal on Document Analysis and Recognition (IJDAR)

Abstract

This paper presents an approach for text detection and recognition in scene images. Its main contribution is to demonstrate that the colour information within an image, if efficiently exploited, is sufficient to separate text regions from the surrounding noise. In the same way, the colour information present in character and word images can be used to achieve a significant improvement in character and word recognition. The proposed pipeline uses colour information and low-level image processing operations to enhance text information, which improves the overall performance of text detection and recognition in the wild. The method offers two main advantages. First, it enhances text regions to a level of clarity at which a simple off-the-shelf feature representation and classification method achieves state-of-the-art recognition performance. Second, the framework is computationally fast compared with other text detection and recognition techniques that offer good accuracy at the cost of significantly higher latency. We performed extensive experiments to evaluate our method on challenging benchmark datasets (Chars74K, ICDAR03, ICDAR11 and SVT), and the results show a considerable performance improvement.
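
The abstract does not spell out the individual processing steps, so the following is only an illustrative sketch of the general idea that colour alone can separate text from its background. It is not the authors' pipeline. It assumes a two-colour image, uses a plain two-cluster k-means on RGB values, and treats the smaller cluster as text; the function name `minority_colour_mask` and the synthetic image are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def _dist2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(pixels: List[Pixel]) -> Tuple[float, ...]:
    """Channel-wise mean of a non-empty list of pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def _nearest(p: Sequence[float], centres: Sequence[Sequence[float]]) -> int:
    """Index (0 or 1) of the closer of two cluster centres."""
    return 0 if _dist2(p, centres[0]) <= _dist2(p, centres[1]) else 1


def minority_colour_mask(image: List[List[Pixel]], iterations: int = 10) -> List[List[int]]:
    """Binary mask (1 = assumed text) from a two-cluster colour split.

    Assumptions for this sketch: the image is non-empty, has roughly two
    dominant colours, and text covers fewer pixels than the background.
    """
    pixels = [p for row in image for p in row]
    # Deterministic initialisation: first pixel and the pixel farthest from it.
    first = pixels[0]
    second = max(pixels, key=lambda p: _dist2(p, first))
    centres: List[Sequence[float]] = [first, second]
    for _ in range(iterations):
        groups: List[List[Pixel]] = [[], []]
        for p in pixels:
            groups[_nearest(p, centres)].append(p)
        # Keep the old centre if a cluster ends up empty.
        centres = [_mean(g) if g else c for g, c in zip(groups, centres)]
    labels = [[_nearest(p, centres) for p in row] for row in image]
    ones = sum(sum(row) for row in labels)
    zeros = len(pixels) - ones
    minority = 1 if ones <= zeros else 0
    return [[1 if lab == minority else 0 for lab in row] for row in labels]


if __name__ == "__main__":
    B, b = (200, 30, 30), (190, 40, 35)   # reddish background with slight variation
    T, t = (30, 30, 200), (40, 35, 190)   # bluish "text" pixels
    demo = [
        [B, T, b, B, B],
        [b, t, B, T, b],
        [B, T, B, b, B],
    ]
    for row in minority_colour_mask(demo):
        print(row)
```

On real scene images a split like this would be only a first step; illumination changes, more than two colours, and text that dominates the frame all break the assumptions stated above.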

Author information

Authors and Affiliations

Department of Computer Science, Loughborough University, Loughborough, UK
Muhammad Fraz & Eran A. Edirisinghe
Institute of Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany
M. Saquib Sarfraz

Corresponding author

Correspondence to Muhammad Fraz.

About this article

Cite this article

Fraz, M., Sarfraz, M.S. & Edirisinghe, E.A. Exploiting colour information for better scene text detection and recognition. IJDAR 18, 153–167 (2015). https://doi.org/10.1007/s10032-015-0239-x

Received: 01 June 2014
Revised: 28 December 2014
Accepted: 10 January 2015
Published: 19 February 2015
Issue Date: June 2015
DOI: https://doi.org/10.1007/s10032-015-0239-x
