Object Detection to Assist Visually Impaired People: A Deep Neural Network Adventure

  • Fereshteh S. Bashiri
  • Eric LaRose
  • Jonathan C. Badger
  • Roshan M. D’Souza
  • Zeyun Yu
  • Peggy Peissig
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11241)

Abstract

Blindness or vision impairment, one of the top ten disabilities among men and women, affects more than 7 million Americans of all ages. Accessible visual information is of paramount importance for improving the independence and safety of blind and visually impaired people, and there is a pressing need to develop smart automated systems to assist their navigation, specifically in unfamiliar healthcare environments such as clinics, hospitals, and urgent care centers. This contribution focuses on developing computer vision algorithms combined with a deep neural network to assist the mobility of visually impaired individuals in clinical environments by accurately detecting doors, stairs, and signage, the most notable landmarks. Quantitative experiments demonstrate that, given a sufficient number of training samples, the network recognizes the objects of interest with an accuracy of over 98% within a fraction of a second.

Keywords

Machine learning and predictive modeling; Mobile health and wearable devices; Data mining and knowledge discovery; Assistive technology for visually impaired people

Notes

Acknowledgements

The authors greatly appreciate and acknowledge the contributions of Dr. Ahmad Pahlavan Tafti to the study design, data collection, and drafting of the manuscript. Our special thanks go to Daniel Wall and Anne Nikolai at Marshfield Clinic Research Institute (MCRI) for their help in collecting the dataset and preparing the current paper. F.S. Bashiri would like to thank the Summer Research Internship Program (SRIP) at MCRI for financial support. Furthermore, we gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro M5000 GPU used for this research.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Fereshteh S. Bashiri (1, 2; corresponding author)
  • Eric LaRose (1)
  • Jonathan C. Badger (1)
  • Roshan M. D’Souza (2)
  • Zeyun Yu (2)
  • Peggy Peissig (1)

  1. Marshfield Clinic Research Institute, Marshfield, USA
  2. University of Wisconsin-Milwaukee, Milwaukee, USA
