Abstract
This paper addresses the linguistic classification of scene text in natural scene images. Text is localized by multi-level thresholding based on fuzzy Renyi entropy, and complex natural scene images with diverse challenges are considered. A set of heuristic rules comprising geometric filters and the stroke width transform governs the location of potential text regions. Scene images may contain more than one language, which makes text recognition by an optical character recognition system challenging, because manual intervention is needed to specify the language of each text. To overcome this hurdle, linguistic classification of text regions is proposed. The method is validated on the publicly available MSRA-TD500 dataset. Results show that fuzzy-based Renyi entropy thresholding is able to segment the foreground text from complex natural scene images. The geometric filters capture the inherent uniformity of the text, and the stroke width transform eliminates non-text regions. Precision, recall and F-measure are 78%, 77% and 76%, respectively, which shows the ability of the algorithm to extract text from the scenes. Geometric features such as area and corner show good variation for discriminating between the languages. Further, the first three Hu moment features also play a notable role in analyzing the shape of the extracted text regions. A classifier based on a support vector machine (SVM) yields a classification accuracy of 85.45% in discriminating English and Chinese characters, with an area under the ROC curve (AUC) of 0.851. The proposed methodology proves robust against common degradations such as uneven illumination, varying font characteristics and blurring effects. Experimental results show that the method achieves better performance in linguistic classification.
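To illustrate the thresholding idea named in the abstract, the following is a minimal sketch of crisp, bi-level Renyi entropy thresholding on a grey-level histogram. It is not the paper's fuzzy multi-level formulation: the function names `renyi_entropy` and `renyi_threshold`, the order `alpha=0.5`, and the toy histogram are all assumptions chosen for illustration. The sketch picks the grey level that maximizes the sum of the Renyi entropies of the two resulting classes.

```python
import math


def renyi_entropy(probs, alpha=0.5):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1), natural log."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    power_sum = sum(p ** alpha for p in probs if p > 0)
    return math.log(power_sum) / (1.0 - alpha)


def renyi_threshold(histogram, alpha=0.5):
    """Return the grey level t that maximizes the summed Renyi entropy
    of the classes [0..t] and [t+1..L-1] (hypothetical helper)."""
    total = sum(histogram)
    probs = [count / total for count in histogram]
    best_t, best_score = None, -math.inf
    for t in range(len(probs) - 1):
        background, foreground = probs[: t + 1], probs[t + 1:]
        w0, w1 = sum(background), sum(foreground)
        if w0 <= 0 or w1 <= 0:
            continue  # one class is empty, so the split is meaningless
        # Normalize each class to a proper distribution before scoring it.
        score = (renyi_entropy([p / w0 for p in background], alpha)
                 + renyi_entropy([p / w1 for p in foreground], alpha))
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Toy 8-level histogram with two modes separated by an empty gap.
toy_histogram = [10, 10, 10, 0, 0, 10, 10, 10]
print(renyi_threshold(toy_histogram))
```

On the toy histogram the selected threshold falls in the empty gap between the two modes (levels 2 to 4). A multi-level variant would search over several thresholds at once, and a fuzzy variant would replace the hard class split with membership functions; neither is attempted here.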
Cite this article
Karpagam, A., Manikandan, M. Multi-level Fuzzy Based Renyi Entropy for Linguistic Classification of Texts in Natural Scene Images. Int. J. Fuzzy Syst. 22, 438–449 (2020). https://doi.org/10.1007/s40815-019-00654-6
DOI: https://doi.org/10.1007/s40815-019-00654-6