A Framework for Multi-lingual Scene Text Detection Using K-means++ and Memetic Algorithms

Chakraborty, Neelotpal; Ray, Averi; Mollah, Ayatullah Faruk; Basu, Subhadip; Sarkar, Ram

doi:10.1007/978-981-15-9492-2_9


Part of the book series: Studies in Big Data (SBD, volume 82)

Abstract

Recent years have seen a surge of interest in detecting and analysing scene text in natural images. However, because of the complexities introduced by various factors, existing techniques can still fail to detect text components. This paper presents a system that takes an image as input and first extracts its color components. The intensity values from each color channel are then grouped using the K-means++ clustering algorithm. A memetic algorithm is applied to obtain an optimal set of candidate components from the color maps while eliminating the background. Spurious components are removed on the basis of their dimensions and entropy. The system is evaluated experimentally on two standard datasets, MLe2e and KAIST, and on an in-house dataset of 400 images, all containing multi-lingual text. The results are comparable with those of some state-of-the-art methods.
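
The clustering and entropy steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the number of clusters, the entropy definition, or any thresholds, so the function names (kmeans_pp_seeds, kmeans_1d, shannon_entropy), the use of Shannon entropy over a pixel histogram, and all parameter values below are assumptions made for illustration. The memetic step is left out because its fitness function is not described here.

```python
import math
import random
from collections import Counter


def kmeans_pp_seeds(values, k, rng):
    """Pick up to k initial centres using K-means++ (D^2-weighted) seeding."""
    centres = [rng.choice(values)]
    while len(centres) < k:
        # Squared distance from each value to its nearest chosen centre.
        d2 = [min((v - c) ** 2 for c in centres) for v in values]
        total = sum(d2)
        if total == 0:  # fewer distinct values than k: stop early
            break
        # Draw the next centre with probability proportional to d2.
        r = rng.uniform(0, total)
        acc = 0.0
        for v, w in zip(values, d2):
            acc += w
            if w > 0 and acc >= r:
                centres.append(v)
                break
    return centres


def nearest(v, centres):
    """Index of the centre closest to v."""
    return min(range(len(centres)), key=lambda j: (v - centres[j]) ** 2)


def kmeans_1d(values, k, iters=20, seed=0):
    """Cluster the intensities of one color channel (assumed 1-D input)."""
    rng = random.Random(seed)
    centres = [float(c) for c in kmeans_pp_seeds(values, k, rng)]
    for _ in range(iters):
        labels = [nearest(v, centres) for v in values]
        for j in range(len(centres)):
            members = [v for v, l in zip(values, labels) if l == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = sum(members) / len(members)
    # Final assignment so that labels agree with the returned centres.
    labels = [nearest(v, centres) for v in values]
    return centres, labels


def shannon_entropy(pixels):
    """Shannon entropy (bits) of the intensity histogram of a component."""
    n = len(pixels)
    counts = Counter(pixels)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


# Toy channel with two well-separated intensity groups (made-up data).
channel = [0, 1, 2, 100, 101, 102]
centres, labels = kmeans_1d(channel, k=2)

# A flat patch has zero entropy; a two-level patch has one bit.
flat_entropy = shannon_entropy([7] * 5)
two_level_entropy = shannon_entropy([0, 0, 255, 255])
```

In a full pipeline, kmeans_1d would be run once per color channel on the flattened pixel intensities, and a component would be discarded when its entropy or its bounding-box dimensions fall outside limits chosen on validation data. Those limits are not given in the abstract and would have to be taken from the chapter itself.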

Acknowledgements

This work is partially supported by the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, and by the PURSE-II and UPE-II projects. SB is partially funded by a DBT grant (BT/PR16356/BID/7/596/2016). RS, SB and AFM are partially funded by a DST grant (EMR/2016/007213).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Jadavpur University, Kolkata, 700032, India
Neelotpal Chakraborty, Averi Ray, Subhadip Basu & Ram Sarkar
Department of Computer Science and Engineering, Aliah University, Kolkata, 700160, India
Ayatullah Faruk Mollah

Corresponding author

Correspondence to Neelotpal Chakraborty.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering and Information Technology, Jaypee University of Information Technology, Solan, Himachal Pradesh, India
Pardeep Kumar
Department of Computer Science and Engineering, National Institute of Technology, Patna, Bihar, India
Amit Kumar Singh

About this chapter

Cite this chapter

Chakraborty, N., Ray, A., Mollah, A.F., Basu, S., Sarkar, R. (2021). A Framework for Multi-lingual Scene Text Detection Using K-means++ and Memetic Algorithms. In: Kumar, P., Singh, A.K. (eds) Machine Learning for Intelligent Multimedia Analytics. Studies in Big Data, vol 82. Springer, Singapore. https://doi.org/10.1007/978-981-15-9492-2_9

DOI: https://doi.org/10.1007/978-981-15-9492-2_9
Published: 17 January 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9491-5
Online ISBN: 978-981-15-9492-2
eBook Packages: Intelligent Technologies and Robotics (R0)
