Fast RT-LoG operator for scene text detection

Nguyen Dinh, Cong; Delalandre, Mathieu; Conte, Donatello; Pham, The Anh

doi:10.1007/s11554-020-00942-7

Original Research Paper
Published: 24 January 2020

Journal of Real-Time Image Processing, Volume 18, pages 19–36 (2021)

Cong Nguyen Dinh¹,² (ORCID: orcid.org/0000-0003-0798-5511),
Mathieu Delalandre²,
Donatello Conte² &
The Anh Pham¹

Abstract

This paper proposes a new real-time Laplacian of Gaussian (RT-LoG) operator for scene text detection. The method exploits the distribution of the Gaussian kernel in the spatial and scale-space domains, together with kernel decomposition by box filtering. Two levels of optimization are given. The first, within the spatial domain, is obtained by box mutualization. The second, within the spatial/scale-space domains, uses a mixed method for box selection. The proposed RT-LoG operator is evaluated on the ICDAR2017 RRC-MLT dataset in terms of robustness and processing time, and the results are compared with state-of-the-art real-time operators for scene text detection. The proposed operator ranks first, with the best trade-off between robustness and processing time. It can support approximately 30 frames per second (FPS) up to Quad-HD resolution on a regular CPU architecture with low latency. In addition, the proposed operator can support the full pipeline for scene text detection. Our system is competitive with the most accurate systems in the literature while requiring about two orders of magnitude fewer processing resources.
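
The box filtering mentioned in the abstract can be illustrated with a minimal sketch. It is not the RT-LoG operator itself: it only shows the generic building block, a summed-area table that returns the sum over any box in four lookups, followed by a crude band-pass response taken as the difference of two box means. The function names and the choice of a difference of boxes are assumptions made for illustration; box mutualization and the mixed box selection described in the paper are not reproduced here.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column at the top/left."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 using four table lookups."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]


def box_mean(sat, cy, cx, r):
    """Mean over the (2r+1) x (2r+1) box centred on (cy, cx), clipped to the image."""
    h, w = len(sat) - 1, len(sat[0]) - 1
    y0, y1 = max(cy - r, 0), min(cy + r + 1, h)
    x0, x1 = max(cx - r, 0), min(cx + r + 1, w)
    return box_sum(sat, y0, x0, y1, x1) / ((y1 - y0) * (x1 - x0))


def difference_of_boxes(img, r_small, r_large):
    """Illustrative band-pass response: small-box mean minus large-box mean."""
    sat = integral_image(img)
    h, w = len(img), len(img[0])
    return [
        [box_mean(sat, y, x, r_small) - box_mean(sat, y, x, r_large)
         for x in range(w)]
        for y in range(h)
    ]


# Hypothetical 5x5 test image: a single bright pixel on a dark background.
img = [[0] * 5 for _ in range(5)]
img[2][2] = 9
resp = difference_of_boxes(img, 0, 1)
print(resp[2][2])  # 8.0: strong response at the blob centre
```

Each box sum costs the same four lookups whatever the box size, which is why box-based approximations of Gaussian kernels are attractive when the kernel scale grows.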

Notes

In practice, \(k \in ]1, \sqrt{2}]\).
For simplicity, we consider the 1D case.
Single precision.

Author information

Authors and Affiliations

Hong Duc University, Thanh Hoa City, Vietnam
Cong Nguyen Dinh & The Anh Pham
Tours University, Tours City, France
Cong Nguyen Dinh, Mathieu Delalandre & Donatello Conte

Corresponding author

Correspondence to Cong Nguyen Dinh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nguyen Dinh, C., Delalandre, M., Conte, D. et al. Fast RT-LoG operator for scene text detection. J Real-Time Image Proc 18, 19–36 (2021). https://doi.org/10.1007/s11554-020-00942-7

Received: 24 June 2019
Accepted: 07 January 2020
Published: 24 January 2020
Issue Date: February 2021
DOI: https://doi.org/10.1007/s11554-020-00942-7
