K-NN Based Text Segmentation from Digital Images Using a New Binarization Scheme

Ghoshal, Ranjit; Das, Sayan; Saha, Aditya

doi:10.1007/978-981-10-6430-2_17

Part of the book series: Communications in Computer and Information Science (CCIS, volume 776)

Included in the following conference series:

International Conference on Computational Intelligence, Communications, and Business Analytics

Abstract

Text segmentation in digital images is a prerequisite for many image analysis and interpretation tasks. In this article, we propose an effective binarization method for text segmentation in digital images. The method produces a number of connected components, comprising both text and non-text. The possible text components must then be identified among them. To distinguish text from non-text components, a set of features is identified. For training, we use two feature files that we prepared, one for text and one for non-text. A K-Nearest Neighbour (K-NN) classifier is used for this two-class classification problem. The experiments are based on the ICDAR 2011 Born Digital Dataset. We have succeeded both in binarization and in separating text from non-text components.
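
The pipeline described in the abstract (binarize, extract connected components, compute features, classify with K-NN) can be illustrated with a minimal sketch. The abstract does not specify the binarization scheme or the feature set, so the code below substitutes a fixed global threshold for the proposed binarization and uses two hypothetical features (bounding-box aspect ratio and fill ratio). The function names, the toy image, and the training vectors are all assumptions made for illustration, not the authors' method or data.

```python
from collections import Counter, deque
from math import dist

def binarize(gray, threshold=128):
    """Fixed global threshold: a stand-in (assumption) for the paper's scheme.
    Dark pixels become foreground (1), light pixels background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(binary):
    """Return the 4-connected foreground regions as lists of (row, col)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components

def features(pixels):
    """Hypothetical features: (bounding-box aspect ratio, fill ratio)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return (width / height, len(pixels) / (width * height))

def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors nearest to the query."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy grayscale image: a tall stroke (upper left) and a long bar (bottom row).
W, B = 255, 0
image = [
    [W, B, W, W, W, W, W],
    [W, B, W, W, W, W, W],
    [W, B, W, W, W, W, W],
    [W, W, W, W, W, W, W],
    [W, W, B, B, B, B, B],
]

# Made-up training vectors standing in for the two prepared feature files.
train = [
    ((0.3, 0.90), "text"), ((0.5, 0.80), "text"), ((0.6, 0.70), "text"),
    ((5.0, 1.00), "non-text"), ((6.0, 0.90), "non-text"), ((4.0, 0.95), "non-text"),
]

components = connected_components(binarize(image))
labels = [knn_predict(train, features(px)) for px in components]
print(labels)
```

In practice the two features would be replaced by the feature set chosen in the paper, and the training vectors by the text and non-text feature files; the distance metric and the value of K are likewise left open by the abstract.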

Author information

Authors and Affiliations

St. Thomas’ College of Engineering and Technology, Kolkata, 700023, India
Ranjit Ghoshal, Sayan Das & Aditya Saha

Corresponding author

Correspondence to Ranjit Ghoshal.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India
J. K. Mandal
Department of Computer and System Sciences, Visva Bharati University, Bolpur Santiniketan, West Bengal, India
Paramartha Dutta
Department of Information Technology, Calcutta Business School, Kolkata, India
Somnath Mukhopadhyay

About this paper

Cite this paper

Ghoshal, R., Das, S., Saha, A. (2017). K-NN Based Text Segmentation from Digital Images Using a New Binarization Scheme. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 776. Springer, Singapore. https://doi.org/10.1007/978-981-10-6430-2_17

DOI: https://doi.org/10.1007/978-981-10-6430-2_17
Published: 26 September 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6429-6
Online ISBN: 978-981-10-6430-2
eBook Packages: Computer Science, Computer Science (R0)
