Abstract
Mexican Sign Language (MSL) is the primary form of communication for the deaf community in Mexico. MSL has a grammatical structure different from that of Spanish; furthermore, facial expression plays a determining role in complementing context-based meaning. This makes it difficult for a hearing person without prior knowledge of the language to understand what is being communicated, representing an important communication barrier for deaf people. To address this, we present the first architecture to consider facial features as indicators of grammatical tense in developing a real-time interpreter from MSL to written Spanish. Our model uses the open-source MediaPipe library to extract landmarks from the face, body pose, and hands. Three 2D convolutional neural networks encode each modality individually and extract patterns; the networks converge into a multilayer perceptron for classification. Finally, a Hidden Markov Model morphosyntactically predicts the most probable sequence of words based on a preloaded knowledge base. From the experiments carried out, a precision of 94.9% with \(\sigma = 0.07\) was obtained for the recognition of 75 isolated words, and 94.1% with \(\sigma = 0.09\) for the interpretation of 20 sentences in MSL in a medical context. Since our approach is based on camera input, and since adequate generalization is achieved even with few samples, it would be feasible to scale the architecture to other sign languages and offer efficient communication to millions of people with hearing disabilities.
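The abstract's final stage — a Hidden Markov Model that predicts the most probable word sequence — is typically decoded with the Viterbi algorithm. The following is a minimal, self-contained sketch of that decoding step; the states, transition/emission probabilities, and the two-word medical-context example are illustrative assumptions, not the authors' actual knowledge base.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state (word) sequence for the observations."""
    # V[t][s] = (best probability of any path ending in state s at time t, predecessor)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p][0] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = (prob, prev)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path))

# Toy example: map two recognized signs to a Spanish word sequence.
states = ("DOCTOR", "DOLOR")
start_p = {"DOCTOR": 0.6, "DOLOR": 0.4}
trans_p = {"DOCTOR": {"DOCTOR": 0.3, "DOLOR": 0.7},
           "DOLOR":  {"DOCTOR": 0.6, "DOLOR": 0.4}}
emit_p = {"DOCTOR": {"sign_a": 0.8, "sign_b": 0.2},
          "DOLOR":  {"sign_a": 0.1, "sign_b": 0.9}}

print(viterbi(["sign_a", "sign_b"], states, start_p, trans_p, emit_p))
# → ['DOCTOR', 'DOLOR']
```

In the paper's pipeline, the emission side would come from the MLP classifier's per-sign outputs and the transition side from the preloaded morphosyntactic knowledge base.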
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Ramírez Sánchez, J.E., Rodríguez, A.A., Mendoza, M.G. (2021). Real-Time Mexican Sign Language Interpretation Using CNN and HMM. In: Batyrshin, I., Gelbukh, A., Sidorov, G. (eds) Advances in Computational Intelligence. MICAI 2021. Lecture Notes in Computer Science(), vol 13067. Springer, Cham. https://doi.org/10.1007/978-3-030-89817-5_4
Print ISBN: 978-3-030-89816-8
Online ISBN: 978-3-030-89817-5