An enhanced text detection technique for the visually impaired to read text

Information Systems Frontiers

Abstract

An enhanced text detection technique (ETDT) is proposed to help visually impaired people overcome their reading challenges. This work enhances the edge-preserving maximally stable extremal regions (eMSER) algorithm using the pyramid histogram of oriented gradients (PHOG). In the ETDT approach, histograms of oriented gradients (HOG) derived from different pyramid levels matter when detecting maximally stable extremal regions (MSER), because they give more spatial information than HOG from a single level. To group text, a new four-line text-grouping method is designed. A new text feature, the Shapeness Score, is also proposed; combined with other features based on morphology and stroke widths, it contributes significantly to identifying text regions. Using a 10-dimensional feature vector, the J48 decision tree and AdaBoost machine learning algorithms identify the text regions in the images. The algorithm yields better results than existing benchmark algorithms on the ICDAR 2011 born-digital dataset, but needs improvement on the scene text dataset.
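
As a rough illustration of why pyramid levels add spatial information, the sketch below computes a PHOG-style descriptor with NumPy. It is not the authors' implementation: the function name `phog`, the defaults of 3 levels and 8 unsigned orientation bins, and the L1 normalisation are assumptions made for this example only.

```python
import numpy as np


def phog(image, levels=3, bins=8):
    """PHOG-style descriptor for a 2-D grayscale array (illustrative sketch).

    Assumed design: level l splits the image into a 2**l x 2**l grid, and
    each cell contributes a magnitude-weighted orientation histogram.
    """
    img = np.asarray(image, dtype=float)

    # np.gradient returns derivatives along axis 0 (rows) then axis 1 (columns).
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)

    # Unsigned orientation folded into [0, pi), then quantised into bins.
    orientation = np.mod(np.arctan2(gy, gx), np.pi)
    bin_index = np.minimum((orientation / np.pi * bins).astype(int), bins - 1)

    height, width = img.shape
    features = []
    for level in range(levels):
        cells = 2 ** level  # level 0 covers the whole image
        row_edges = np.linspace(0, height, cells + 1).astype(int)
        col_edges = np.linspace(0, width, cells + 1).astype(int)
        for r in range(cells):
            for c in range(cells):
                rows = slice(row_edges[r], row_edges[r + 1])
                cols = slice(col_edges[c], col_edges[c + 1])
                hist = np.bincount(
                    bin_index[rows, cols].ravel(),
                    weights=magnitude[rows, cols].ravel(),
                    minlength=bins,
                )
                features.append(hist)

    descriptor = np.concatenate(features)
    total = descriptor.sum()
    # L1-normalise unless the image has no gradient at all.
    return descriptor / total if total > 0 else descriptor
```

With these defaults the descriptor has (1 + 4 + 16) × 8 = 168 entries. The first 8 entries are the single-level HOG of the whole image; the remaining entries record where in the image each orientation occurs, which is the extra spatial information the abstract refers to.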

Author information

Corresponding author

Correspondence to S. P. Faustina Joan.

About this article

Cite this article

Joan, S.P.F., Valli, S. An enhanced text detection technique for the visually impaired to read text. Inf Syst Front 19, 1039–1056 (2017). https://doi.org/10.1007/s10796-016-9699-x

  • DOI: https://doi.org/10.1007/s10796-016-9699-x
