Towards Automatic Recognition of Sign Language Gestures Using Kinect 2.0

  • Dmitry Ryumin
  • Alexey A. Karpov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10278)


We present a prototype of a new computer system for recognizing manual gestures using Kinect 2.0 for Windows. The sensor provides a stream of optical images at Full HD resolution and 30 frames per second (fps), together with a depth map of the scene. At present, the system is able to recognize continuous fingerspelling gestures and sequences of digits in Russian and Kazakh sign languages (SL). Our gesture vocabulary contains 52 fingerspelling gestures. We have collected a visual database of SL gestures, which consists of Kinect-based recordings of 2 persons (a man and a woman) demonstrating manual gestures. Five samples of each gesture were used for training the models, and the remaining data were used for tuning and testing the developed recognition system. The model of each gesture is represented as a vector of informative visual features calculated for the palm and all fingers. Feature vectors are extracted from both training and test samples of gestures, and reference patterns (models) are then compared with sequences of test vectors using the Euclidean distance. Sequences of vectors are compared using the dynamic time warping method (dynamic programming), and the reference pattern with the minimal distance is selected as the recognition result. In our experiments in the signer-dependent mode with the 2 demonstrators from the visual database, the average gesture recognition accuracy is 87% for 52 manual signs.
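The matching step described in the abstract (Euclidean distance between feature vectors, dynamic time warping over sequences, and selection of the reference pattern with the minimal distance) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `euclidean`, `dtw_distance` and `recognize`, the step pattern, the absence of path normalization, and the toy two-dimensional feature vectors are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a: Sequence[Vector], seq_b: Sequence[Vector]) -> float:
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: the classic symmetric step pattern (insertion, deletion,
    match) with no path-length normalization; the paper may use a variant.
    """
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance only in seq_a
                cost[i][j - 1],      # advance only in seq_b
                cost[i - 1][j - 1],  # advance in both sequences
            )
    return cost[n][m]


def recognize(
    test_seq: Sequence[Vector],
    references: Dict[str, List[Sequence[Vector]]],
) -> Tuple[Optional[str], float]:
    """Return the label of the reference pattern closest to test_seq."""
    best_label: Optional[str] = None
    best_dist = math.inf
    for label, templates in references.items():
        for template in templates:
            dist = dtw_distance(test_seq, template)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label, best_dist


# Toy data (hypothetical): two gesture labels with 2-D feature vectors.
references = {
    "A": [[(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]],
    "B": [[(5.0, 5.0), (5.0, 4.0), (5.0, 3.0)]],
}
# The test sequence is a time-stretched copy of the "A" template.
test_seq = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
label, dist = recognize(test_seq, references)
```

In this toy case the test sequence repeats the first frame of the "A" template, so the warping path absorbs the repetition and the accumulated distance to "A" is zero. With several training samples per gesture, as in the described setup, each label would simply hold several templates.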


Keywords: Sign language · Assistive technology · Automatic gesture recognition · Image processing · Kinect sensor



This research is partially supported by the Russian Foundation for Basic Research (project No. 16-37-60100), by the Council for Grants of the President of the Russian Federation (project No. MD-254.2017.8), by the state research (No. 0073-2014-0005), and by the Government of the Russian Federation (grant No. 074-U01).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAS), St. Petersburg, Russian Federation
  2. ITMO University, St. Petersburg, Russian Federation
