Multi-level Fuzzy Based Renyi Entropy for Linguistic Classification of Texts in Natural Scene Images

International Journal of Fuzzy Systems

Abstract

This paper addresses linguistic classification of scene text in natural scene images. Text is localized by multi-level thresholding based on fuzzy Renyi entropy, and complex natural scene images with diverse challenges are considered. A set of heuristic rules comprising geometric filters and the stroke width transform governs the process of locating potential text regions. Scene images may contain more than one language, which makes text recognition by an optical character recognition system challenging. Manual intervention is then needed to specify the language of each text region. To overcome this hurdle, linguistic classification of the text regions is proposed. The method is validated on the publicly available MSRA-TD500 dataset. Results show that fuzzy-based Renyi entropy thresholding is able to segment the foreground text from complex natural scene images, that the geometric filters capture the inherent uniformity of the text, and that the stroke width transform eliminates non-text regions. Precision, recall and F-measure are 78%, 77% and 76%, respectively, which shows the ability of the algorithm to extract text from the scenes. Geometric features such as area and corners show clear variation across languages, which helps discriminate them. The first three Hu moment features also contribute notably to analyzing the shape of the extracted text regions. A support vector machine (SVM) classifier yields a classification accuracy of 85.45% in discriminating English and Chinese text, with an area under the ROC curve (AUC) of 0.851. The proposed methodology is robust against common degradations such as uneven illumination, varying font characteristics and blurring effects. Experimental results show that the method achieves better performance in linguistic classification.
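The abstract does not give the thresholding formulas. As a rough illustration only, the sketch below implements crisp (non-fuzzy), bi-level Renyi entropy thresholding on a gray-level histogram, which is a simpler relative of the multi-level fuzzy variant described above. The function names, the entropy order `alpha = 0.5`, and the exhaustive search over thresholds are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1) of a distribution p."""
    p = p[p > 0]  # zero-probability bins contribute nothing
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)


def renyi_threshold(gray, alpha=0.5, levels=256):
    """Pick the gray level t that maximizes the sum of the Renyi entropies
    of the two classes {0..t} and {t+1..levels-1}.

    `gray` is assumed to be an unsigned-integer image (e.g. uint8).
    """
    hist = np.bincount(gray.ravel(), minlength=levels).astype(float)
    prob = hist / hist.sum()

    best_t, best_score = 0, -np.inf
    for t in range(levels - 1):
        low, high = prob[: t + 1], prob[t + 1:]
        w0, w1 = low.sum(), high.sum()
        if w0 <= 0 or w1 <= 0:
            continue  # one class is empty, so this split is not meaningful
        # Normalize each class to a proper distribution before scoring it.
        score = renyi_entropy(low / w0, alpha) + renyi_entropy(high / w1, alpha)
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical usage: a synthetic image with a dark and a bright cluster.
values = np.concatenate([np.arange(40, 61), np.arange(190, 211)])
image = np.repeat(values, 10).astype(np.uint8).reshape(20, 21)
threshold = renyi_threshold(image)
binary = image > threshold  # True marks the brighter class
```

A multi-level version would search over several thresholds at once, and a fuzzy version would replace the hard class boundaries with membership functions; both are omitted here to keep the sketch short.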



Author information

Corresponding author

Correspondence to Angia Venkatesan Karpagam.


Cite this article

Karpagam, A., Manikandan, M. Multi-level Fuzzy Based Renyi Entropy for Linguistic Classification of Texts in Natural Scene Images. Int. J. Fuzzy Syst. 22, 438–449 (2020). https://doi.org/10.1007/s40815-019-00654-6


  • DOI: https://doi.org/10.1007/s40815-019-00654-6
