Irregular scene text detection via attention guided border labeling

Chen, Jie; Lian, Zhouhui; Wang, Yizhi; Tang, Yingmin; Xiao, Jianguo

doi:10.1007/s11432-019-2673-8

Research Paper
Published: 08 November 2019

Volume 62, article number 220103, (2019)
Science China Information Sciences

Jie Chen^1,2,
Zhouhui Lian^1,2,
Yizhi Wang^1,2,
Yingmin Tang^1,2 &
Jianguo Xiao^1,2

Abstract

Scene text detection plays an important role in many computer vision applications. With the help of recent deep learning techniques, multi-oriented text detection, once considered quite challenging, has been solved to some extent. However, most existing methods still perform poorly on curved text, mainly because of the limitations of their text representations (e.g., horizontal boxes, rotated rectangles, or quadrangles). To solve this problem, we propose a novel method for detecting irregular scene text based on instance-aware segmentation. The key idea is to design an attention guided semantic segmentation model that precisely labels the weighted borders of text regions. Experiments conducted on several widely used benchmarks demonstrate that our method achieves superior results on curved text datasets (F-scores of 80.1% and 78.8% on CTW1500 and Total-Text, respectively) and performs comparably to state-of-the-art approaches on multi-oriented text datasets.
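
The F-scores quoted in the abstract are, by the usual convention on text detection benchmarks, the harmonic mean of precision and recall. The abstract does not report the underlying precision and recall values, so the following is only a minimal sketch of that standard formula; the function name `f_score` and the input values are hypothetical and are not taken from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 measure)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision/recall values, chosen only for illustration.
example = f_score(0.82, 0.78)
print(f"F-score: {example:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a detector cannot reach a high F-score by trading most of its recall for precision, or the reverse.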

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61672056, 61672043) and the Key Laboratory of Science, Technology and Standard in Press Industry (Key Laboratory of Intelligent Press Media Technology).

Author information

Authors and Affiliations

Wangxuan Institute of Computer Technology, Peking University, Beijing, 100080, China
Jie Chen, Zhouhui Lian, Yizhi Wang, Yingmin Tang & Jianguo Xiao
Center For Chinese Font Design and Research, Peking University, Beijing, 100080, China
Jie Chen, Zhouhui Lian, Yizhi Wang, Yingmin Tang & Jianguo Xiao

Corresponding author

Correspondence to Zhouhui Lian.

About this article

Cite this article

Chen, J., Lian, Z., Wang, Y. et al. Irregular scene text detection via attention guided border labeling. Sci. China Inf. Sci. 62, 220103 (2019). https://doi.org/10.1007/s11432-019-2673-8

Received: 20 June 2019
Revised: 08 August 2019
Accepted: 25 September 2019
Published: 08 November 2019
DOI: https://doi.org/10.1007/s11432-019-2673-8
