Support System Using Microsoft Kinect and Mobile Phone for Daily Activity of Visually Impaired

  • Mohammad M. Rahman
  • Bruce Poon
  • Md. Ashraful Amin
  • Hong Yan


The aim of this paper is to outline a system based on Microsoft Kinect and mobile devices that provides assistance to visually impaired people. Our primary goal is a navigation aid that helps the visually impaired find their way, including the detection and identification of faces, text and chairs. These features are implemented on Microsoft Kinect using machine learning methods, since the task requires only rough identification of objects. For data acquisition and processing, OpenCV, OpenKinect, Tesseract and eSpeak are used. The features incorporated in this aiding tool are object detection and recognition, face detection and recognition, object location determination, optical character recognition and audio feedback. The face recognition system achieved an accuracy of 90%, text recognition yielded an accuracy of 65%, and chairs were recognized with more than 74% accuracy. Identifying bank note denominations requires more accurate recognition, so a mobile phone is used for this task. The proposed system recognizes Bangladeshi paper currency notes with 89.4% accuracy on a plain paper background and with 78.4% accuracy on a complex background.


Keywords: 3D camera · Human-computer interaction (HCI) · Kinect · Mobile computing · Navigational aid · ORB · SIFT · SURF · Visual impairment



This work is jointly supported by Independent University, Bangladesh and the University Grants Commission of Bangladesh under Higher Education Quality Enhancement Project (HEQEP) No. CP-3359.



Copyright information

© Springer Science+Business Media Dordrecht 2015

Authors and Affiliations

  • Mohammad M. Rahman (1)
  • Bruce Poon (2)
  • Md. Ashraful Amin (1)
  • Hong Yan (3)

  1. Computer Vision and Cybernetics Group, Computer Science and Engineering, Independent University, Bangladesh, Dhaka, Bangladesh
  2. School of Electrical and Information Engineering, University of Sydney, Sydney, Australia
  3. Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China
