Detection of artificial and scene text in images and video frames

Anthimopoulos, Marios; Gatos, Basilis; Pratikakis, Ioannis

doi:10.1007/s10044-011-0237-7

Industrial and Commercial Application
Published: 23 September 2011

Volume 16, pages 431–446, (2013)
Pattern Analysis and Applications

Marios Anthimopoulos¹,
Basilis Gatos¹ &
Ioannis Pratikakis²

808 Accesses
24 Citations

Abstract

Textual information in images and video frames constitutes a valuable source of high-level semantics for multimedia indexing and retrieval systems. Text detection is the most crucial step in a multimedia text extraction system, and although it has been studied extensively over the past decade, there is still no generic architecture that works for both artificial and scene text in multimedia content. In this paper we propose a system for detecting both artificial and scene text in images and video frames. The system is based on a machine learning stage that uses a Random Forest classifier and a highly discriminative feature set produced by a new texture operator called Multilevel Adaptive Color edge Local Binary Pattern (MACeLBP). MACeLBP describes the spatial distribution of color edges at multiple adaptive levels of contrast. A gradient-based algorithm is then applied to separate text lines from one another and to refine their localization. The whole algorithm is embedded in a multiresolution framework so that text line detection is invariant to scale. Finally, an optional connected-component step segments text lines into words based on the distances between the resulting components. Experimental results, obtained with a concise evaluation methodology, demonstrate the superior performance of the proposed system for artificial and scene text in images and video frames.
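The abstract does not specify how MACeLBP is computed, so the following is only a minimal sketch of the classic 8-neighbour local binary pattern on which operators of this family build. It is not the authors' operator. The function names and the `level` parameter (a fixed contrast offset standing in, very loosely, for one contrast level) are assumptions made for illustration.

```python
from typing import List

# Offsets of the 8 neighbours, clockwise, starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: List[List[int]], r: int, c: int, level: int = 0) -> int:
    """Return the 8-bit LBP code of the interior pixel (r, c).

    Bit i is set when neighbour i is at least `level` brighter than the
    centre. level=0 gives the classic LBP; other values are a hypothetical
    stand-in for a contrast level, not the rule used by MACeLBP.
    """
    centre = img[r][c]
    code = 0
    for i, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] - centre >= level:
            code |= 1 << i
    return code


def lbp_histogram(img: List[List[int]], level: int = 0) -> List[int]:
    """Return the 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c, level)] += 1
    return hist
```

A histogram of this kind, computed over a candidate window, is the sort of texture feature vector that could be passed to a classifier such as a Random Forest. The features actually used by the proposed system are described in the full article.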


Author information

Authors and Affiliations

Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, 153 10, Athens, Greece
Marios Anthimopoulos & Basilis Gatos
Department of Electrical and Computer Engineering, Democritus University of Thrace, 671 00, Xanthi, Greece
Ioannis Pratikakis

Corresponding author

Correspondence to Marios Anthimopoulos.

About this article

Cite this article

Anthimopoulos, M., Gatos, B. & Pratikakis, I. Detection of artificial and scene text in images and video frames. Pattern Anal Applic 16, 431–446 (2013). https://doi.org/10.1007/s10044-011-0237-7

Received: 06 August 2010
Accepted: 02 September 2011
Published: 23 September 2011
Issue Date: August 2013
DOI: https://doi.org/10.1007/s10044-011-0237-7
