A Framework for Multi-lingual Scene Text Detection Using K-means++ and Memetic Algorithms

Machine Learning for Intelligent Multimedia Analytics

Part of the book series: Studies in Big Data (SBD, volume 82)

Abstract

Recent years have seen a sharp rise in interest in detecting and analyzing text in natural scene images. However, owing to complexities arising from various factors, existing techniques may still fail to detect some text components. This paper presents a system that takes an image as input and first extracts its color components. Next, the intensity values of each color channel are grouped using the K-means++ clustering algorithm. A memetic algorithm is then applied to obtain an optimal set of candidate components from the color maps while eliminating the background. Spurious components are removed on the basis of their dimensions and entropy measure. The system is experimentally evaluated on two standard datasets, MLe2e and KAIST, and on an in-house dataset of 400 images, all containing multi-lingual text. The results obtained are comparable with those of some state-of-the-art methods.
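Two of the steps named in the abstract, the K-means++ grouping of channel intensities and the entropy measure, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans_pp_1d` and `shannon_entropy`, the parameters `k`, `iters` and `seed`, and the choice of Shannon entropy over the intensity histogram are all assumptions made for illustration, since the abstract does not specify them. The memetic selection step is omitted because the abstract gives no details of its fitness function or local search.

```python
import math
import random
from collections import Counter


def kmeans_pp_1d(values, k, iters=20, seed=0):
    """Cluster 1-D intensity values: K-means++ seeding, then Lloyd updates.

    Hypothetical helper; `values` stands in for the intensities of one
    color channel, flattened into a list.
    """
    rng = random.Random(seed)

    # K-means++ seeding: the first center is drawn uniformly at random; each
    # later center is drawn with probability proportional to its squared
    # distance from the nearest center chosen so far.
    centers = [rng.choice(values)]
    while len(centers) < k:
        d2 = [min((v - c) ** 2 for c in centers) for v in values]
        if sum(d2) == 0:  # fewer distinct values than k; stop early
            break
        centers.append(rng.choices(values, weights=d2, k=1)[0])

    def assign(cs):
        # Index of the nearest center for every value.
        return [min(range(len(cs)), key=lambda j: (v - cs[j]) ** 2)
                for v in values]

    # Lloyd iterations: reassign values, then recompute each center as the
    # mean of its members (an empty cluster keeps its previous center).
    for _ in range(iters):
        labels = assign(centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else old)
        if new_centers == centers:
            break
        centers = new_centers

    return centers, assign(centers)


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical intensity distribution.

    Assumed form of the "entropy measure"; the abstract does not define it.
    """
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Under these assumptions, `kmeans_pp_1d(channel_values, k)` returns one center per intensity group together with a group label for every pixel, and `shannon_entropy` could then be evaluated on the pixels of each candidate component so that components with implausible entropy are discarded. The threshold for that filter is not given in the abstract and would have to be chosen separately.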



Acknowledgements

This work is partially supported by the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, and by the PURSE-II and UPE-II projects. SB is partially funded by a DBT grant (BT/PR16356/BID/7/596/2016). RS, SB and AFM are partially funded by a DST grant (EMR/2016/007213).

Author information

Corresponding author

Correspondence to Neelotpal Chakraborty.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Chakraborty, N., Ray, A., Mollah, A.F., Basu, S., Sarkar, R. (2021). A Framework for Multi-lingual Scene Text Detection Using K-means++ and Memetic Algorithms. In: Kumar, P., Singh, A.K. (eds) Machine Learning for Intelligent Multimedia Analytics. Studies in Big Data, vol 82. Springer, Singapore. https://doi.org/10.1007/978-981-15-9492-2_9
