Abstract
In the last two years, the authors' research has focused on designing a smart solution that compensates for hearing or visual deficiencies using the Google Glass hardware device and the authors' own software architecture. This paper presents a solution aimed at deaf people and people with hearing impairment. The paper opens with a brief explanation of the architecture of the designed solution and a description of the user interface on Google Glass. Related work on face detection, visual activity detection, and speech recognition is then presented, along with many possible approaches to these research areas. The principle of the solution lies in combining face detection with the subsequent assignment of the recognized speech to the mouth of the correct face in the image. The aim is to digitally capture ambient sound and, based on its evaluation using neural networks, to present the detected speech as text assigned to the detected face from the camera stream. Testing has shown that the solution is beneficial and works as expected: Machine Learning Kit provides good results in face detection, and communication with the Google Cloud Speech API is fast enough for a smooth user experience.
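The assignment of recognized speech to the correct face can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the speaking face is chosen, so the sketch assumes a simple visual activity heuristic in which the face whose mouth opening varied most over recent frames is treated as the speaker. The names FaceTrack, pick_speaker, caption_for, and the min_activity threshold are all hypothetical, and the per-frame mouth openness values are assumed to come from some face detector.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import Optional, Sequence, Tuple


@dataclass
class FaceTrack:
    """Hypothetical per-face record built from a face detector's output."""
    face_id: int
    mouth_xy: Tuple[int, int]        # mouth position in image pixels
    mouth_openness: Sequence[float]  # one measurement per recent frame


def pick_speaker(tracks: Sequence[FaceTrack],
                 min_activity: float = 1e-3) -> Optional[FaceTrack]:
    """Return the face whose mouth opening varied most.

    Assumption: mouth movement is a proxy for speaking. Returns None
    when no face exceeds the (hypothetical) min_activity threshold.
    """
    best = None
    best_score = min_activity
    for track in tracks:
        if len(track.mouth_openness) < 2:
            continue  # not enough frames to measure movement
        score = pvariance(track.mouth_openness)
        if score > best_score:
            best, best_score = track, score
    return best


def caption_for(tracks: Sequence[FaceTrack], transcript: str) -> Optional[dict]:
    """Attach a recognized transcript to the mouth of the likely speaker."""
    speaker = pick_speaker(tracks)
    if speaker is None:
        return None
    return {
        "face_id": speaker.face_id,
        "anchor": speaker.mouth_xy,  # where the text would be drawn
        "text": transcript,
    }
```

In a real pipeline the transcript would come from a speech recognition service and the tracks from the camera stream; here both are plain inputs so the assignment step can be read in isolation.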
Acknowledgment
This work was supported by the project of the Students Grant Agency – FIM, University of Hradec Kralove, Czech Republic.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Berger, A., Kostak, M., Maly, F. (2019). Mobile AR Solution for Deaf People. In: Awan, I., Younas, M., Ünal, P., Aleksy, M. (eds) Mobile Web and Intelligent Information Systems. MobiWIS 2019. Lecture Notes in Computer Science, vol 11673. Springer, Cham. https://doi.org/10.1007/978-3-030-27192-3_19
DOI: https://doi.org/10.1007/978-3-030-27192-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27191-6
Online ISBN: 978-3-030-27192-3
eBook Packages: Computer Science, Computer Science (R0)