An enhanced text detection technique for the visually impaired to read text

Joan, S. P. Faustina; Valli, S.

doi:10.1007/s10796-016-9699-x

Published: 20 September 2016

Information Systems Frontiers, Volume 19, pages 1039–1056 (2017)

S. P. Faustina Joan¹ & S. Valli¹

Abstract

An enhanced text detection technique (ETDT) is proposed to help visually impaired people overcome reading challenges. This work enhances the edge-preserving maximally stable extremal regions (eMSER) algorithm using the pyramid histogram of oriented gradients (PHOG). In the ETDT approach, histograms of oriented gradients (HOG) derived from different pyramid levels are important when detecting maximally stable extremal regions (MSER), because they give more spatial information than HOG information from a single level. A new four-line text-grouping method is designed to group text. A new text feature, the Shapeness Score, is also proposed; it contributes significantly to identifying text regions when combined with other features based on morphology and stroke widths. Using a 10-dimensional feature vector, the J48 decision tree and AdaBoost machine learning algorithms identify the text regions in the images. The algorithm yields better results than the existing benchmark algorithms on the ICDAR 2011 born-digital dataset, but still needs improvement on the scene text dataset.
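
To illustrate why HOG taken from several pyramid levels carries more spatial information than HOG from a single level, here is a minimal NumPy sketch of a PHOG-style descriptor. It is not the authors' implementation: the function names `hog_histogram` and `phog`, the choice of 8 orientation bins, the 3 pyramid levels, and the use of raw gradients (without a prior edge-detection step) are all assumptions made for illustration.

```python
import numpy as np


def hog_histogram(magnitude, orientation, n_bins=8):
    """Magnitude-weighted histogram of unsigned orientations in [0, pi]."""
    # Map each orientation to a bin; clamp the upper edge into the last bin.
    bin_idx = np.minimum((orientation / np.pi * n_bins).astype(int), n_bins - 1)
    return np.bincount(bin_idx.ravel(), weights=magnitude.ravel(), minlength=n_bins)


def phog(image, levels=2, n_bins=8):
    """Concatenate per-cell HOG histograms over pyramid levels 0..levels.

    Level l splits the image into a 2**l x 2**l grid, so finer levels
    record where in the image each gradient orientation occurs.
    """
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)                      # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    orientation = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation

    height, width = image.shape
    histograms = []
    for level in range(levels + 1):
        cells = 2 ** level
        row_edges = np.linspace(0, height, cells + 1).astype(int)
        col_edges = np.linspace(0, width, cells + 1).astype(int)
        for r in range(cells):
            for c in range(cells):
                rows = slice(row_edges[r], row_edges[r + 1])
                cols = slice(col_edges[c], col_edges[c + 1])
                histograms.append(
                    hog_histogram(magnitude[rows, cols], orientation[rows, cols], n_bins)
                )

    descriptor = np.concatenate(histograms)
    total = descriptor.sum()
    # Normalise to unit sum so descriptors of different regions are comparable.
    return descriptor / total if total > 0 else descriptor
```

With `levels=2` and `n_bins=8`, the descriptor has 8 × (1 + 4 + 16) = 168 entries. The level-0 block alone is an ordinary global orientation histogram; the level-1 and level-2 blocks add the spatial layout that a single-level HOG cannot express.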

Author information

Authors and Affiliations

Department of Computer Science and Engineering, College of Engineering, Guindy, Anna University, Chennai, India
S. P. Faustina Joan & S. Valli

Corresponding author

Correspondence to S. P. Faustina Joan.

About this article

Cite this article

Joan, S.P.F., Valli, S. An enhanced text detection technique for the visually impaired to read text. Inf Syst Front 19, 1039–1056 (2017). https://doi.org/10.1007/s10796-016-9699-x

Issue Date: October 2017
DOI: https://doi.org/10.1007/s10796-016-9699-x
